OpenAI Drops The Best Text-to-Video Model

PLUS: Gemini 1.5 Is Already Here

AI For All
February 16, 2024

Hello readers,

Welcome to another edition of This Week in the Future! Out of nowhere, OpenAI revealed a text-to-video model called Sora, and it’s the best we’ve ever seen. Plus, Google has already released Gemini 1.5, which has an absurd 1M token context window. It’s an update to Gemini 1.0 but not the same as Gemini Ultra 1.0 (it’s confusing).

As always, thanks for being a subscriber! We hope you enjoy this week’s content — for a video breakdown, check out the episode on YouTube.

Let’s get into it!

Sora Stuns in Debut

Frame from video generated by Sora

OpenAI just disrupted the AI industry (again) with the surprise drop of a text-to-video model named Sora. It’s the best AI-generated video we’ve ever seen, surpassing the likes of Runway and Pika Labs — both dedicated AI video companies.

Sora is not publicly available yet, as it is currently being red teamed for safety reasons. Compared to other models, the videos are remarkably stable, the camera movements are fluid, and the humans are very realistic. Sora can create multiple shots within a single generated video that accurately persist characters and visual style.

Why This Matters

We don’t think many people were expecting this much progress in video generation so soon, much less for it to come out of nowhere from OpenAI. Sora is already good enough to get the gears turning in the minds of Hollywood executives and advertisers. A few AI-generated shots here and there, wherever they save the most money, will become the norm. However, the misinformation apocalypse and mass confusion that could ensue are serious risks, and OpenAI is clearly aware of that.

Gemini 1.5

Google released Gemini 1.5, and it’s a worthy upgrade. This new model uses a Mixture of Experts (MoE) approach, which routes each input to a small subset of specialized “expert” neural networks instead of running the entire model every time. According to Google, that makes the model more efficient to train and serve. Developers can sign up for the Private Preview of Gemini 1.5 Pro inside Google AI Studio.
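For the curious, here is a minimal sketch of what MoE routing looks like in general. Google has not published how Gemini 1.5 routes its inputs, so everything here (the `moe_forward` function, the toy experts, the gate weights, and the choice of top-2 routing) is a hypothetical illustration of the generic technique, not Gemini's design.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and blend their outputs.

    Generic top-k gating for illustration only; not Gemini's actual router.
    """
    # One gate score per expert: dot product of that expert's gate vector with x.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Keep only the top_k experts; the others are never run.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the chosen experts' scores into mixing weights.
    mix = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for weight, idx in zip(mix, chosen):
        y = experts[idx](x)
        out = [o + weight * y_j for o, y_j in zip(out, y)]
    return out, chosen

# Four toy "experts" that just scale the input by a fixed factor (hypothetical).
experts = [lambda v, s=s: [s * t for t in v] for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical gate vectors, one per expert.
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

out, chosen = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print("experts used:", chosen)
print("blended output:", out)
```

Only two of the four experts run for this input, which is where the efficiency comes from: the model can hold a lot of total capacity while spending compute on just a slice of it per input.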

Context Window

The headline feature of Gemini 1.5 is its 1 million token context window 🤯. Before Gemini 1.5, the largest publicly available context window for a language model was 200,000 tokens (Anthropic’s Claude 2.1). Google demoed Gemini’s large context understanding by having it process a 44-minute Buster Keaton film, a 100,000-line codebase, and a 402-page transcript.
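To put those numbers side by side, here is a rough back-of-the-envelope sketch. The two window sizes come from the figures above; the 4-characters-per-token ratio is only a common rule of thumb for English text (real tokenizers vary), and the dictionary labels and helper names are made up for illustration.

```python
import math

# Window sizes quoted above; the labels are hypothetical.
CONTEXT_WINDOWS = {
    "gemini_1_5_preview": 1_000_000,
    "previous_largest": 200_000,
}

# Assumption: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4

def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def fits(text, window):
    """True if the estimated token count fits inside the given window."""
    return estimate_tokens(text) <= window

# A stand-in document of 2,000,000 characters (about 500,000 estimated tokens).
document = "x" * 2_000_000

ratio = CONTEXT_WINDOWS["gemini_1_5_preview"] / CONTEXT_WINDOWS["previous_largest"]
print("window size ratio:", ratio)
for name, window in CONTEXT_WINDOWS.items():
    print(name, "fits:", fits(document, window))
```

Under these assumptions, a document of that size would fit in the new window in one pass but would have to be split up to fit in a 200,000-token one.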

Our Take

Seeing a significant update like this so soon after Gemini was first unveiled tells us that Google is likely to do just fine in the AI race. They are determined not to be left behind and are putting everything into this. While large context windows don’t necessarily mean better performance, and OpenAI still has a head start, this update shows just how fast the dynamics can change. After all, OpenAI just leaped ahead of dedicated AI video startups, and it’s not even primarily a video company.

🔥 Rapid Fire

OpenAI adds memory for ChatGPT
NVIDIA releases Chat with RTX, a PC-native chatbot
Cohere launches Aya, a new multilingual LLM
Meta introduces new architecture for machine learning
Brilliant Labs launches AI-powered AR glasses
Stability AI introduces Stable Cascade
Salesforce adds generative AI to Slack
Azure OpenAI Service announces multiple updates
Anaconda partners with IBM watsonx to scale enterprise AI
VAST Data and Run:ai partner on full-stack AI solution
Cisco and NVIDIA partner on enterprise AI infrastructure
Vercel introduces multiple AI integrations
GitHub adds Copilot to GitHub Support
Amazon’s new LLM has emergent properties
White House plans to tag authentic content
ElevenLabs offers payouts to voice actors
UPenn launches artificial intelligence degree

🎙️ The AI For All Podcast

This week’s episode featured Hans-Martin Will, Chief AI Officer at LegalMation, who discussed the impact of AI on traditional industries like legal and insurance. We covered the role of a Chief AI Officer and its growing demand, the concept of hybrid intelligence, and the potential risks and challenges involved in adopting AI technologies, focusing on transparency and explainability.

YouTube • Spotify • Apple • Google • Amazon

📖 What We’re Reading

The Rise of AI Ethics and Responsible Technology (link)

“As AI applications become more prevalent in our daily lives, the conversation around AI ethics has gained significant momentum. The implementation of AI technology raises a multitude of ethical considerations, such as privacy, bias, accountability, and transparency.”

Source: AI For All

From Google DeepMind’s AI Gallery

AI-Powered KPIs Measure Success Better. And Redefine It. (link)

“Companies that use the same old KPIs to measure success are missing out on opportunities to better align people and processes, prioritize resources—and generate value. By contrast, forward-looking organizations are benefiting from using AI to generate KPIs that are more intelligent, adaptive, accurate, and predictive than legacy performance indicators.”

Source: Boston Consulting Group

💻️ AI Tools and Platforms

Dify → Easily build and operate generative AI apps
Kolena → ML testing platform for model validation
Baseten → ML infrastructure that just works
Anyword → Advanced AI writing built for marketing
Plus AI → AI for Google Slides and Docs
