OpenAI Beats Google to Multimodal AI

PLUS: Meta Connect 2023 and Amazon's $4 Billion Investment

Hello readers,

Welcome to another edition of This Week in the Future! Huge news this week: OpenAI beat Google to multimodal AI, Meta announced a slate of new AI features at Meta Connect, the Hollywood writers' strike ended with a tentative agreement that covers AI, and Amazon committed up to $4 billion to Anthropic.

As always, thanks for being a subscriber! We hope you enjoy this week’s content — for a video breakdown, check out the episode on YouTube.

Let’s get into it!

OpenAI Beats Google

OpenAI's announcement of their GPT-4 model promised a future of enhanced multimodal capabilities. However, the wait for these features was a tad longer than anticipated. Now, OpenAI has updated ChatGPT to see, hear, and speak with a new GPT-4 Vision model, beating Google to a publicly available multimodal AI (Google intends Gemini to be multimodal). Let’s take a closer look:

The Multimodality of GPT-4

The updated GPT-4 (also called GPT-4V) can now handle the image recognition and voice input that were promised earlier, processing several kinds of input rather than text alone. Users can share images and speak to ChatGPT directly through the Android and iOS apps. ChatGPT can also output in formats like images and voice, with the door open for video in the future.

A Real-Life J.A.R.V.I.S.

If you’re having issues with your bicycle, you can snap a picture, and ChatGPT can guide you through the repair process, answering questions related to specific bike parts. This is similar to the original example OpenAI gave when they announced GPT-4, where ChatGPT could give you a recipe based on a picture of the food in your pantry.

One company has already made use of GPT-4V. OpenAI collaborated with Be My Eyes to develop a tool known as Be My AI, which aims to provide visually impaired users with descriptions of their surroundings.

How It Works

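For those interested in the technical side, NExT-GPT is an open-source multimodal LLM that offers one view of how a system like GPT-4V might work. OpenAI has not published GPT-4V's architecture, so treat the comparison as an educated guess.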

The Road Ahead

The new ChatGPT is a major step forward for OpenAI, bringing society closer to the oft-touted future of personal AI assistants. The initial results are promising, but the full capabilities and limits of this technology will only become clear as more users engage with it and test its boundaries. If it matures as expected, it could change how people handle everyday tasks, from repairs to navigating the world around them.

Meta Connect 2023

Not to be left out of the conversation, Meta recently showcased several AI developments with plans to integrate AI into their social media platforms. Here are the highlights:

Emu: Following the animal-naming trend started with Llama, the Emu model will be responsible for new image editing and generation features on Instagram.

Meta AI: Meta AI is an AI assistant like ChatGPT that is tailored for Meta platforms including WhatsApp, Messenger, and Instagram.

AI Characters: An interesting, if gimmicky, development from Meta is a set of AI characters, some of which are based on actual celebrities. For instance, users will soon be able to converse with a photorealistic version of Tom Brady. These AI characters are part of an attempt by Meta to draw in more Gen Z users, given the popularity of character.ai with that demographic.

AI Studio: AI Studio will empower users to craft their own AI entities, and there will be a version for businesses. The platform will be developer-friendly with a dedicated API, but it will also feature a sandbox environment for regular users.

Our Take

The brand shift from Facebook to Meta is interesting. It's similar to Twitter's rebranding as X, suggesting a trend toward consolidating multiple products under a single umbrella. There could be a practical reason for this trend: bringing various platforms under one brand can help the development of AI models. A unified data pool, consisting of text from Facebook and multimedia from Instagram, for example, can lead to more capable AI tools. It also simplifies deploying these models across platforms without the need for platform-specific solutions.

Amazon’s $4 Billion Investment

Amazon agreed to invest up to $4 billion in Anthropic, the AI company responsible for Claude 2. As a result, Amazon will hold a minority stake in Anthropic, and Anthropic will use AWS as its primary cloud provider for its AI infrastructure.

Our Take

While Anthropic clearly benefits from a hardware and computing standpoint, what does Amazon get for $4 billion? Collaborating with a top-tier generative AI company boosts the prominence and credibility of AWS as a platform for AI development. Amazon might be seeking a prominent AI partner in the same vein as Google with DeepMind and Microsoft with OpenAI. Anthropic's success on AWS can serve as a testimonial for other enterprises or individuals weighing cloud-based solutions. The alliance also positions Amazon more strongly within the AI ecosystem, potentially driving more AI-focused initiatives and projects through its infrastructure.

🔥 Rapid Fire

🎙️ The AI For All Podcast

This week's episode featured Sateesh Seetharamiah, CEO of EdgeVerve, who discussed how AI is driving digital transformation and the value that enterprise data holds for AI. We also discussed how enterprises can be more agile and the challenges they will face with AI.

📖 What We’re Reading

This week's handpicked insights include a practical dive into how your development team can use AI to accelerate time-to-market, among other benefits, plus new survey data on the divided response to generative AI in the C-suite.

Harnessing Automation in DevOps for Successful AI Models (link)

“Automation is a key pillar of DevOps, enabling teams to automate various tasks and processes throughout the software development lifecycle. Via automation, DevOps takes advantage of AI to reduce human error, accelerate time-to-market, and improve overall software quality.”

Source: IoT For All

What's Dividing the C-Suite on Generative AI? (link)

“According to a BCG survey of 2,000 global executives, more than 50% still discourage generative AI adoption.”

Source: Boston Consulting Group

💻️ AI Tools and Platforms

  • Seamless.AI → AI sales and business leads software

  • Frame → All-in-one collaboration OS with AI assistant

  • Stack AI → No-code AI automation platform

  • ScreenshotAI → All your screenshots organized by AI

  • Spice AI → Data and AI infrastructure for Web3