Understanding infrastructure: The role of compute power
Imagine you’re exploring a new city and need to impress your vegan in-laws with a restaurant choice. You use the voice feature on the Meta AI app and say, “Hey Meta, what are the best vegan options around?” Within a few seconds, Meta AI, powered by Muse Spark, provides a list of local vegan restaurants, a brief description of each one’s atmosphere, and a map showing their locations. This quick and seamless exchange feels effortless, but it’s underpinned by complex calculations enabled by compute power.
What is compute power?
Compute power refers to how much work a computer chip can perform and how quickly it can do so, similar to horsepower in a car engine. Its speed is measured in FLOPS, or floating-point operations per second, which indicates how many calculations a chip can execute in one second. Its scale is often described in gigawatts, a unit of electrical power that indicates how many chips can be operated simultaneously.
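To make the two units concrete, here is a minimal Python sketch of the arithmetic. The function names and every number in it (operations per request, chip throughput, site power, watts per chip) are hypothetical values chosen for illustration, not figures for any real chip or data center.

```python
def seconds_to_compute(total_operations: float, flops: float) -> float:
    """Time needed to run `total_operations` on hardware sustaining `flops`."""
    if flops <= 0:
        raise ValueError("flops must be positive")
    return total_operations / flops


def chips_supported(site_power_watts: float, watts_per_chip: float) -> int:
    """Rough count of chips a power budget can run, ignoring cooling and other overhead."""
    if watts_per_chip <= 0:
        raise ValueError("watts_per_chip must be positive")
    return int(site_power_watts // watts_per_chip)


# Hypothetical workload: 2 trillion floating-point operations for one request.
operations = 2e12
# Hypothetical chip sustaining 1 petaFLOPS (1e15 operations per second).
chip_flops = 1e15

print(seconds_to_compute(operations, chip_flops))  # 0.002 seconds

# Hypothetical 1-gigawatt site with chips drawing 1,000 watts each.
print(chips_supported(1e9, 1_000.0))  # 1000000 chips
```

FLOPS answers "how fast is one chip?", while the power budget answers "how many chips can run at once?". Real deployments also spend power on cooling, networking, and storage, which this sketch leaves out.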
When you ask Meta AI for vegan restaurant recommendations, billions of calculations are performed within seconds. Your voice is captured, converted from sound waves into text, and processed by servers in a data center. A large language model (LLM) analyzes your query, and the results are delivered straight to you.
Even straightforward actions, like searching for a local barbershop on Instagram, require multiple computational layers: understanding language, processing the query, scanning an index, generating results, and delivering them before your thumb leaves the screen. All this processing power is harnessed through chips within servers at our data centers.
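The layers named above can be pictured with a toy example. The Python sketch below is purely illustrative and is not how Instagram search works: the listings, the function names, and the word-overlap scoring are all assumptions made to show the general shape of "understand the query, scan an index, rank results."

```python
def tokenize(text: str) -> list[str]:
    """Understanding language (toy version): lowercase words without punctuation."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def build_index(listings: dict[str, str]) -> dict[str, set[str]]:
    """Build an inverted index mapping each word to the listings that mention it."""
    index: dict[str, set[str]] = {}
    for name, description in listings.items():
        for word in tokenize(description):
            index.setdefault(word, set()).add(name)
    return index


def search(query: str, index: dict[str, set[str]]) -> list[str]:
    """Scan the index and rank listings by how many query words they match."""
    scores: dict[str, int] = {}
    for word in tokenize(query):
        for name in index.get(word, ()):
            scores[name] = scores.get(name, 0) + 1
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical listings, invented for this example.
listings = {
    "Clip Joint": "local barbershop offering fades and beard trims",
    "Green Fork": "vegan restaurant with a quiet patio",
    "Shear Luck": "barbershop and hair salon",
}

index = build_index(listings)
print(search("local barbershop", index))  # ['Clip Joint', 'Shear Luck']
```

Even this tiny version performs several passes over the data for a single query. A production system does the same kind of work over vastly larger indexes and with learned models instead of word counts, which is where the demand for compute comes from.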
The building blocks of compute
Despite being an abstract concept, compute power relies on tangible chips. Different chips are designed for various computations and workloads.
- Central Processing Units (CPUs): These processors facilitate AI training and inference. Originally designed to handle tasks sequentially, CPUs are adept at managing network traffic, running application logic, and coordinating workflows across systems.
- Graphics Processing Units (GPUs): Initially designed for rendering graphics, GPUs excel at performing thousands of calculations simultaneously, which is essential for powering AI. Training models to understand languages, recognize images, or engage in conversation requires vast numbers of calculations running in parallel for extended periods.
- Custom chips: These processors are tailored for specific tasks like ranking, recommendations, and generative AI. Meta has developed the Meta Training and Inference Accelerator (MTIA), a series of custom chips designed specifically for AI workloads. While mainstream GPUs are typically used for large-scale AI training, MTIA is optimized for inference workloads, offering unmatched flexibility and efficiency.
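The difference between sequential and parallel work can be sketched with matrix multiplication, the kind of arithmetic at the core of many AI models. The Python example below is an illustration under stated assumptions: the function names are hypothetical, and the thread pool only demonstrates that each output row can be computed independently. Python threads will not speed up this arithmetic, whereas a GPU or custom accelerator runs such independent pieces on dedicated hardware.

```python
from concurrent.futures import ThreadPoolExecutor


def row_times_matrix(row: list[float], matrix: list[list[float]]) -> list[float]:
    """Multiply one row vector by a matrix; needs no other row's result."""
    cols = len(matrix[0])
    return [sum(row[k] * matrix[k][j] for k in range(len(row))) for j in range(cols)]


def matmul_sequential(a: list[list[float]], b: list[list[float]]) -> list[list[float]]:
    """Compute output rows one after another, the way a single core would."""
    return [row_times_matrix(row, b) for row in a]


def matmul_parallel_rows(a: list[list[float]], b: list[list[float]], workers: int = 4) -> list[list[float]]:
    """Hand each output row to a worker; rows are independent, so order of execution is free."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: row_times_matrix(row, b), a))


def matmul_flops(m: int, k: int, n: int) -> int:
    """Common convention: an (m x k) by (k x n) product costs about 2*m*k*n operations."""
    return 2 * m * k * n


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

print(matmul_sequential(a, b))     # [[19, 22], [43, 50]]
print(matmul_parallel_rows(a, b))  # [[19, 22], [43, 50]]
print(matmul_flops(2, 2, 2))       # 16
```

Both functions produce the same answer; what changes is how many pieces of the work can be in flight at once. Because the operation count grows with the product of all three dimensions, large models quickly reach the scale where parallel hardware becomes necessary.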
How compute power supports Meta’s AI
At Meta, we are building a network of AI-optimized data centers designed to support both AI and other essential workloads. This requires a diverse approach to infrastructure, sourcing silicon from various partners to ensure the right chips match the right tasks. Our custom MTIA silicon plays a crucial role in these efforts. Over the next two years, we plan to develop and deploy four new generations of chips to support ranking, recommendations, and generative AI workloads.
In April, we expanded our partnership with Broadcom to co-develop multiple generations of MTIA chips. Earlier this year, we partnered with Arm to develop the Arm AGI CPU, the first data center processor designed specifically to handle the massive data movement demanded by AI workloads. We also announced partnerships with AWS, AMD, and NVIDIA to supply chips for our compute portfolio. These collaborations will enable ongoing innovation and development of AI tools for the future.
We recently introduced Muse Spark, our most advanced AI model to date, built by Meta Superintelligence Labs. This model processes voice, text, and images together, enabled by compute power at every level. From training models across thousands of GPUs to supporting billions of inferences daily on custom MTIA chips, it all runs on efficient networks of servers around the world. As AI becomes more integrated into daily life, the demand for more powerful and efficient compute will continue to grow, and we are committed to building the infrastructure needed to support it.