
Artificial Intelligence (AI)

Artificial Analysis

Hardware > Infrastructure > Models > Applications

Google continues to stand out as the most vertically integrated AI company, from TPU accelerators to Gemini models

Reasoning models, longer contexts, and agents are multiplying compute demand per user query

Increasing the size of both scale-up domains (a single coherent system, e.g. NVL72 connected with NVLink) and scale-out domains (nodes networked with Ethernet-based technologies) allows the delivery of greater training compute
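As a rough illustration of why both domains matter, peak cluster compute is the product of the scale-up domain size, the number of scale-out domains, and per-GPU throughput. The sketch below is illustrative only; the domain sizes and the ~1 PFLOP/s per-GPU figure are assumptions, not vendor specs.

```python
# Illustrative sketch only: peak training compute scales with both the
# scale-up domain size and the number of scale-out domains.
# All numbers below are assumptions for illustration, not vendor specs.

def cluster_peak_flops(gpus_per_scaleup_domain: int,
                       num_scaleout_domains: int,
                       peak_flops_per_gpu: float) -> float:
    """Peak FLOP/s = GPUs per coherent domain x networked domains x per-GPU peak."""
    return gpus_per_scaleup_domain * num_scaleout_domains * peak_flops_per_gpu

# e.g. 72-GPU coherent domains (NVL72-style) across 100 networked racks,
# assuming roughly 1 PFLOP/s per GPU (illustrative):
print(f"{cluster_peak_flops(72, 100, 1e15):.1e} FLOP/s")
```

Growing either axis (a bigger coherent domain, or more networked domains) multiplies the total, which is why both are being pushed at once.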

Inference techniques until recently confined to the frontier labs are becoming widely available, driven by DeepSeek's open-sourcing, NVIDIA Dynamo, and upcoming work from open-source projects including SGLang.

Key techniques include prefill/decode disaggregation and expert parallelism across dozens or hundreds of GPUs, along with novel load-balancing strategies such as scaling the number of expert replicas with activation frequency.
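A toy sketch of the replica-scaling idea (my own illustrative Python, not the actual API of DeepSeek, Dynamo, or SGLang): hand out a fixed replica budget in proportion to how often each expert is activated, while guaranteeing every expert at least one replica so none becomes unreachable.

```python
# Illustrative sketch of expert-replica scaling by activation frequency.
# Not any framework's real API; a largest-remainder proportional allocation.

def scale_expert_replicas(activation_counts: dict[str, int],
                          total_replicas: int) -> dict[str, int]:
    """Allocate `total_replicas` across experts proportional to activation
    frequency, keeping at least one replica per expert.
    Assumes total_replicas >= number of experts."""
    experts = list(activation_counts)
    total_activations = sum(activation_counts.values())
    # Every expert keeps one replica so none is evicted entirely.
    replicas = {e: 1 for e in experts}
    remaining = total_replicas - len(experts)
    # Ideal (fractional) share of the remaining budget for each expert.
    quotas = {e: remaining * activation_counts[e] / total_activations
              for e in experts}
    for e in experts:
        replicas[e] += int(quotas[e])            # integer parts first
    leftover = remaining - sum(int(q) for q in quotas.values())
    # Give leftover replicas to the largest fractional remainders.
    by_remainder = sorted(experts, key=lambda e: quotas[e] - int(quotas[e]),
                          reverse=True)
    for e in by_remainder[:leftover]:
        replicas[e] += 1
    return replicas

# A "hot" expert activated 80% of the time ends up with most of the replicas:
print(scale_expert_replicas({"hot": 80, "warm": 15, "cold": 5}, 10))
```

Real systems layer this on top of expert-parallel sharding and rebalance as activation statistics drift, but the core proportional-allocation intuition is the same.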

Huawei is emerging as China's chip leader, designing chips and systems, manufactured on a mix of TSMC and SMIC nodes, that may approach Hopper-level performance

MLPerf Training

Resource

Knowledge

Others

Google Gemini

Apple Intelligence

SemiAnalysis

AI Playground

karpathy

CUDA

The NVIDIA® CUDA® Toolkit provides a development environment for creating high-performance, GPU-accelerated applications.

The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library.

cuDNN

The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.

cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, attention, matmul, pooling, and normalization.

OpenMPI

The Open MPI Project is an open source Message Passing Interface implementation that is developed and maintained by a consortium of academic, research, and industry partners.

Metal

Metal powers hardware-accelerated graphics on Apple platforms by providing a low-overhead API, rich shading language, tight integration between graphics and compute, and an unparalleled suite of GPU profiling and debugging tools.

GPU Clouds

Google Cloud GPUs

Lambda

On-demand & reserved cloud NVIDIA GPUs for AI training & inference

SambaNova Systems

SambaNova strives to be the most efficient and adaptable AI platform on the planet. Our AI solution is designed to empower enterprises to control the trajectory of their data and AI future.

Cerebras

Today, Cerebras stands alone as the world’s fastest AI inference and training platform. Organizations across fields like medical research, cryptography, energy, and agentic AI use our CS-2 and CS-3 systems to build on-premise supercomputers, while developers and enterprises everywhere can access the power of Cerebras through our pay-as-you-go cloud offerings.

We have come a long way in our first decade. But our journey is just beginning.

Groq

With the seismic shift in AI toward deploying or running models – known as inference – developers and enterprises alike can experience instant intelligence with Groq. We provide fast AI inference in the cloud and in on-prem AI compute centers. We power the speed of iteration, fueling a new wave of innovation, productivity, and discovery. Groq was founded in 2016 to build technology to advance AI because we saw this moment coming.

Fireworks

MiniMax

MiniMax is a global AI foundation model company. Founded in early 2022, we are committed to advancing the frontiers of AI towards AGI via our mission Intelligence with Everyone.

Our proprietary multimodal models, led by MiniMax M1, Hailuo-02, Speech-02 and Music-01, have ultra-long context processing capacity and can understand, generate, and integrate a wide range of modalities, including text, audio, images, video, and music. These models power our major AI-native products — including MiniMax, Hailuo AI, MiniMax Audio, Talkie, and our enterprise and developer-facing Open API Platform — which collectively deliver intelligent, dynamic experiences to enhance productivity and quality of life for users worldwide.

To date, our proprietary models and AI-native products have cumulatively served over 157 million individual users across over 200 countries and regions, and more than 50,000 enterprises and developers across over 90 countries and regions.

AI21 Labs

AI21 is pioneering the development of enterprise AI Systems and Foundation Models. Our mission is to build trustworthy artificial intelligence that powers humanity towards superproductivity.

We offer privately deployed models with unmatched performance and reliability with tailored solutions for every organization.

AI21 Labs was founded in 2017 by pioneers of artificial intelligence, Professor Amnon Shashua (founder and CEO of Mobileye), Professor Yoav Shoham (Professor Emeritus at Stanford University and former Principal Scientist at Google), and Ori Goshen (serial entrepreneur and founder of CrowdX) with the goal of building AI systems that become thought partners for humans.

AI21 Labs enables enterprises to design their own generative AI applications powered by our groundbreaking models at the core.


MidJourney

Moonshot AI

Mistral

We are Mistral AI, a pioneering French artificial intelligence startup founded in April 2023 by three visionary researchers: Arthur Mensch, Guillaume Lample, and Timothée Lacroix.

United by their shared academic roots at École Polytechnique and experiences at Google DeepMind and Meta, they envisioned a different, audacious approach to artificial intelligence: to challenge the opaque-box nature of 'big AI' and to make this cutting-edge technology accessible to all.

This manifested into the company’s mission of democratizing artificial intelligence through open-source, efficient, and innovative AI models, products, and solutions.

upstage

Founded in 2020, we’re a dynamic team of 100+ top AI researchers, engineers, and business leaders with a proven track record of building AI solutions trusted by major enterprises worldwide. Our team collaborates remotely, with hubs in Seoul, San Francisco, and Tokyo.

Google Models

Google Veo 3

Nano Banana