Building GEM at Meta: Lessons from Building Our Largest Ads Foundation Model
Our Generative Ads Recommendation Model (GEM) at Meta brings LLM-scale modeling to ads. GEM’s architecture enables efficient, cost-effective scaling with data and compute. Advanced post-training techniques boost knowledge transfer across the ads model fleet, while an optimized training stack harnesses thousands of GPUs for parallel processing.
The result? Stronger ad relevance, higher conversions and an ad recommendation system (RecSys) that learns faster as we add data and compute.
To get there, the GEM team overcame multiple engineering hurdles, including:
1. Diverse Data and Massive Compute Demand a New Training Stack
GEM presented challenges beyond sheer scale. It unified data from Facebook, Instagram and Business Messaging, each with unique formats and objectives. This required new architectures with massive compute to capture nuanced, multimodal user–ad interactions.
Working with this level of complexity pushed the team to expand its skills in new, sometimes unconventional ways.
“GEM’s data is incredibly complex. We train across enormous survey and graph datasets from Facebook, Instagram and others — each with distinct patterns. The real challenge was teaching GEM to learn from cross-surface interactions while still optimizing for each surface’s objectives. It definitely stretched our creativity.” — Chunzhi Yang, Software Engineer, Ranking AI Foundational Engineering Team
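One way to picture "learning across surfaces while still optimizing for each surface's objectives" is a combined loss built from per-surface terms. The sketch below is only an illustration of that general idea, not GEM's actual objective: the surface names, the weights, the use of binary cross-entropy and the function `multi_surface_loss` are all assumptions made for the example.

```python
import math


def binary_cross_entropy(prob, label):
    """Log loss for one prediction; prob is clamped to avoid log(0)."""
    eps = 1e-12
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def multi_surface_loss(examples, surface_weights):
    """Combine per-surface mean losses into one weighted training loss.

    examples: iterable of (surface, predicted_prob, label) tuples.
    surface_weights: dict mapping surface name to a hypothetical weight.
    Returns (combined_loss, per_surface_mean_loss).
    """
    totals, counts = {}, {}
    for surface, prob, label in examples:
        totals[surface] = totals.get(surface, 0.0) + binary_cross_entropy(prob, label)
        counts[surface] = counts.get(surface, 0) + 1
    # Averaging within each surface first keeps a high-volume surface
    # from drowning out a smaller one.
    per_surface = {s: totals[s] / counts[s] for s in totals}
    combined = sum(surface_weights[s] * per_surface[s] for s in per_surface)
    return combined, per_surface


# Made-up predictions and labels, purely for illustration.
examples = [
    ("facebook", 0.9, 1),
    ("facebook", 0.2, 0),
    ("instagram", 0.6, 1),
    ("business_messaging", 0.3, 0),
]
weights = {"facebook": 1.0, "instagram": 1.0, "business_messaging": 0.5}
combined, per_surface = multi_surface_loss(examples, weights)
```

Keeping the per-surface losses visible alongside the combined value makes it easier to see when a gain on one surface comes at the expense of another.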
2. Scaling Data Requires a Co-Designed Architecture
Adding data is only part of the puzzle. GEM’s performance gains came from a co-designed architecture that incorporates scaling laws, long-sequence modeling, domain-specific optimization and multi-dimensional parallelism, supported by custom GPU kernels and model-system co-design. The payoff: near-linear scaling and peak throughput on thousands of GPUs.
“One of the most fascinating parts of working on GEM is how frontier our research is — we essentially created our own scaling law for how to increase data while improving model performance. The Meta work culture encourages bold innovation, which makes it easy to share new ideas freely. I’m also learning from an incredibly talented group of people.” — Huayu Li, Research Scientist, Ranking AI Foundational Engineering Team
Chunzhi agrees, explaining, “It’s never one person coming up with these ideas — it’s collaboration across multiple teams. We move fast, but we also make sure both modeling and infrastructure perspectives guide every decision.”
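For readers unfamiliar with scaling laws, a common starting point is to fit a power law, loss ≈ a · compute^(−b), to a handful of training runs and use it to extrapolate. The sketch below shows that generic fit and is not GEM's scaling law, which the article does not specify. The data points are synthetic, generated from a known curve rather than measured, and `fit_power_law` and `predict_loss` are hypothetical helper names.

```python
import math


def fit_power_law(compute, loss):
    """Fit loss ~= a * compute**(-b) by least squares in log-log space."""
    if len(compute) != len(loss) or len(compute) < 2:
        raise ValueError("need at least two (compute, loss) pairs")
    xs = [math.log(c) for c in compute]
    ys = [math.log(v) for v in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("compute values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(loss) = log(a) - b * log(compute), so a = exp(intercept), b = -slope.
    return math.exp(intercept), -slope


def predict_loss(a, b, compute):
    """Evaluate the fitted power law at a given compute budget."""
    return a * compute ** (-b)


# Synthetic points drawn from loss = 2.0 * compute**-0.05 (not GEM results).
compute = [1e18, 1e19, 1e20, 1e21]
loss = [2.0 * c ** -0.05 for c in compute]
a, b = fit_power_law(compute, loss)
extrapolated = predict_loss(a, b, 1e22)
```

In practice the fit is only trustworthy near the range of the runs it was fitted on, which is one reason teams re-measure as they scale up.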
3. Knowledge Transfer Turns Frontier Research into Real-World Impact
GEM’s true breakthrough is maximizing downstream impact through direct and hierarchical knowledge transfer — using techniques like distillation, representation learning and parameter sharing. This resulted in 2x the effectiveness of standard distillation and measurable gains across key surfaces.
“We’re applying GEM to advertising today, but its potential goes far beyond that — from organic recommendations to future RecSys research. Building GEM helped advance our foundational understanding of RecSys and uncover new ways to balance user interests with advertiser goals. It’s motivated me to build better benchmarks and share insights across the industry so we can all move the field forward.” — Ellie Wen, Research Scientist, Ranking AI Foundational Engineering Team
Ellie believes Meta is one of the few places capable of doing this level of frontier RecSys work, thanks to the richness of our data and breadth of user interactions.
“GEM lets us move beyond traditional RecSys limits. We were able to build much larger models and try new strategies like parameter sharing, leading to better efficiency, improved knowledge transfer and new possibilities in ad recommendations.”
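As background for the comparison with "standard distillation" above, the sketch below shows the textbook form of that baseline: a student model is trained to match a teacher's temperature-softened output distribution while also fitting the true label. This is the generic technique, not GEM's direct or hierarchical transfer method, which the article does not detail. The `temperature` and `alpha` defaults and the example logits are arbitrary assumptions.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Standard knowledge-distillation loss for one example.

    Soft term: KL(teacher || student) on temperature-softened outputs,
    scaled by temperature**2 so its gradients stay comparable in size.
    Hard term: cross-entropy of the student against the true label.
    """
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))
    hard = -log_softmax(student_logits)[label]
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * hard


# Made-up logits for a two-class (click / no click) example.
loss_far = distillation_loss([0.0, 0.0], [2.0, -2.0], label=0)
loss_matched = distillation_loss([2.0, -2.0], [2.0, -2.0], label=0)
```

When the student already reproduces the teacher, the soft term vanishes and only the label term remains, so `loss_matched` is smaller than `loss_far`.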
Why It Matters:
GEM exemplifies the engineering journey at Meta. Teams tackle problems only possible at global scale, pioneer breakthroughs in core research and make a tangible difference for billions. The work behind GEM isn’t just about pushing recommendation systems forward; it’s about setting the standard for AI innovation in real-world production.
👉 Dive deeper into the making of GEM in this recent blog from the Engineering at Meta team.
👉 Ready to join us? Explore Meta Careers