Google Gemini Dethrones GPT-4

PLUS: NVIDIA Has Competition

Hello readers,

Welcome to another edition of This Week in the Future! Competition is heating up in the AI space. Google has unveiled Gemini, its answer to GPT-4. Gemini is natively multimodal and outperforms GPT-4 on a number of key metrics. Plus, Meta and IBM have formed the AI Alliance, and AMD has unveiled its MI300 chips to compete with the NVIDIA H100.

As always, thanks for being a subscriber! We hope you enjoy this week’s content — for a video breakdown, check out the episode on YouTube.

Let’s get into it!

The Gemini Era

Google has recently unveiled its new AI model named Gemini, which is described as the most capable, flexible, and general AI model the company has ever built. What flew under the radar was AlphaCode 2, which uses Gemini to excel at coding.

Let’s take a closer look at Gemini:

Multimodal Capabilities: Gemini is built from the ground up to be multimodal, meaning it can generalize and seamlessly understand, operate across, and combine different types of information, including text, images, audio, video, and code. This feature sets it apart from previous models and enables it to handle a wide range of tasks more efficiently.

Different Versions for Varied Applications: Gemini is available in three sizes: Ultra, Pro, and Nano. Each version is tailored for specific applications: Gemini Ultra is designed for highly complex tasks, Gemini Pro for scaling across a wide range of tasks, and Gemini Nano for on-device tasks, such as those on the Pixel 8 Pro.

Advanced Performance: According to Google, Gemini Ultra is the first model to outperform human experts on the MMLU (massive multitask language understanding) benchmark, which tests knowledge and problem-solving abilities across 57 subjects like math, physics, history, law, medicine, and ethics.

Integration with Google Products: Gemini Pro is already integrated with Google's chatbot Bard, and a paid tier, Bard Advanced, coming next year will run Gemini Ultra. Google also plans to incorporate Gemini into most of its products and services, such as Search, Ads, Chrome, YouTube, and Duet AI.

Developer and Enterprise Access: Starting December 13, developers and enterprise customers can access Gemini via the Gemini API in Google AI Studio or Google Cloud Vertex AI. This opens up a wide range of applications across domains and offers an alternative to Microsoft's offerings with OpenAI.

State-of-the-Art Capabilities: Gemini Ultra has demonstrated a state-of-the-art score of 59.4 percent on the new MMMU benchmark, which consists of multimodal tasks requiring deliberate reasoning. Gemini also reportedly outperforms OpenAI's ChatGPT in general reasoning, math, and code tasks.
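For readers who want to try the developer access described above, here is a minimal sketch using Google's google-generativeai Python SDK (pip install google-generativeai). The model name "gemini-pro", the GOOGLE_API_KEY environment variable, and the ask_gemini helper are illustrative assumptions; check the Google AI Studio documentation for current names and details.

```python
import os


def ask_gemini(prompt: str) -> str:
    """Send a text prompt to the Gemini API and return the response text."""
    # Imported lazily so the rest of the script works without the SDK installed.
    import google.generativeai as genai

    # API keys are issued through Google AI Studio.
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-pro")
    response = model.generate_content(prompt)
    return response.text


if __name__ == "__main__" and "GOOGLE_API_KEY" in os.environ:
    print(ask_gemini("Summarize this week's AI news in one sentence."))
```

Enterprise customers on Google Cloud would go through Vertex AI instead, which uses its own client library and authentication flow.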

Our Take

There is now a model competitive with GPT-4, which means Bard and Bard Advanced offer a credible alternative not only to Microsoft's AI offerings but to ChatGPT itself. Gemini's native multimodality is impressive and considered by some to be a step toward AGI. However, OpenAI still has Q* (maybe) and GPT-5 in the works. We will need to see for ourselves what Gemini Ultra is capable of when it comes out early next year (since Google's demo was apparently faked). Regardless, Google's Gemini represents another significant development in the field of AI, with DeepMind already planning to use it to advance robotics.

NVIDIA Has Competition

AMD has taken its first step in challenging NVIDIA’s AI dominance with the release of the MI300 series of chips. The AMD Instinct MI300 and NVIDIA H100 are both high-performance chips, but they have distinct features and capabilities that differentiate them. Here's a detailed comparison:

AMD Instinct MI300

Architecture: The MI300 utilizes the CDNA 3 architecture and is manufactured using TSMC's 5 nm process. It features a multi-chip and multi-IP design, integrating both next-gen CDNA 3 GPU cores and Zen 4 CPU cores.

Performance: AMD claims an 8x boost in AI performance (TFLOPS) and a 5x boost in AI performance per watt (TFLOPS/watt) over its predecessor, the Instinct MI250X.

Memory and Bandwidth: 128 GB of HBM3 memory and a bandwidth of 3,277 GB/s.

Power and Interface: Power draw of 600 W and uses a PCIe 4.0 x16 interface.

NVIDIA H100

Architecture: The H100 is based on the Hopper architecture, manufactured using a 4 nm process by TSMC. It represents the successor to NVIDIA's Ampere architecture.

Performance: NVIDIA touts the H100 as delivering up to 30x higher AI inference performance on large models and up to 7x higher performance for HPC applications compared to its predecessors.

Memory and Bandwidth: The PCIe version of the H100 features 80 GB of HBM2e memory with a bandwidth of 2,039 GB/s, while the SXM version uses HBM3 high-bandwidth memory, offering up to 3 TB/s of memory bandwidth.

Power and Interface: Max power draw of 350 W (PCIe version) and a PCIe 5.0 x16 interface.

Key Differences

Architecture: The MI300's integration of both GPU and CPU cores in a single package is a notable distinction, potentially offering more versatile computing capabilities.

Memory Type: The MI300 uses HBM3 memory, the latest generation, which is typically faster than the HBM2e used in the PCIe version of the H100.

Power Consumption: The MI300 has a higher power consumption (600 W) compared to the H100 (350 W), which could be a consideration in data centers where power efficiency is crucial.

Connectivity: The H100's support for PCIe Gen5 offers higher throughput compared to the PCIe 4.0 interface of the MI300.
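As a rough illustration of the power-consumption tradeoff, here is a back-of-the-envelope calculation of memory bandwidth per watt using only the figures quoted above. This is not an official benchmark of either chip; it simply divides the article's headline numbers.

```python
# Headline specs quoted in this newsletter (not independently verified):
# MI300: 3,277 GB/s at 600 W; H100 (PCIe): 2,039 GB/s at 350 W.
chips = {
    "AMD Instinct MI300": {"bandwidth_gbps": 3277, "power_w": 600},
    "NVIDIA H100":        {"bandwidth_gbps": 2039, "power_w": 350},
}

for name, spec in chips.items():
    per_watt = spec["bandwidth_gbps"] / spec["power_w"]
    print(f"{name}: {per_watt:.2f} GB/s per watt")
    # MI300 ≈ 5.46 GB/s per watt; H100 ≈ 5.83 GB/s per watt
```

On these quoted numbers, the MI300's higher raw bandwidth comes at a higher power draw, so the two chips land in a similar range per watt, which is why power efficiency matters for data-center buyers.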

Why This Matters

NVIDIA now has real competition in the chip market. Companies like Meta and Oracle have already committed to using AMD’s chips. Increased competition should accelerate innovation and lower prices, making AI compute more attainable for a greater number of businesses. We’re holding out for when you can run a frontier model in your basement.

🔥 Rapid Fire

🎙️ The AI For All Podcast

This week’s episode featured Juan Sanchez, CIO of Inteleos, who discussed the impacts of AI on companies and practical tips on deploying AI within organizations. He covered integrating AI into decision-making, the importance of having a good data foundation, and the influence of AI hype cycles in organizations.

📖 What We’re Reading

This week’s handpicked content includes two articles from Deepbrain AI, an AI video generation platform, about how AI is changing the media landscape. Plus, Andreessen Horowitz has shared the biggest ideas in tech for 2024.

Generative AI Limited in The Writing Room, But Still Making the Big Screen (link)

“Today, many news outlets identify as entertainment, at least to some degree. Media outlets prosper off creative thought but are stressed by quick timelines and a hedonistic cycle of content to maintain high engagement. Generative AI offers an opportunity to alleviate these pain points so journalists can reallocate their time to better serve their newsroom.”

Source: AI For All

Three Ways AI Humans are Here to Help in your Daily Life (link)

“The concept of an AI human may still seem futuristic to many in mainstream society, but its use cases are beginning to take hold in a very public way. Those who can recall their first encounters with an AI or robot-human might remember feeling uneasy – experiencing what’s dubbed as the uncanny valley. Today, developers are triumphantly finding themselves outside of the valley and creating AI humans that truly look human-like.”

Source: AI For All

Big Ideas in Tech for 2024 (link)

“Smart energy grids. Voice-first companion apps. Programmable medicines. AI tools for kids.
We asked over 40 partners across a16z to preview one big idea they believe will drive innovation in 2024.”

Source: Andreessen Horowitz

💻️ AI Tools and Platforms

  • Deepbrain AI → All-in-one AI video generator

  • Respell → Automate work with AI

  • Asato → AI-powered copilot for enterprise CIOs

  • Replicate → Run and deploy AI models easily

  • Strut → All-in-one AI workspace for writers