Apple Enters the AI Realm
PLUS: OpenAI's Voice Engine Clones Any Voice
Hello readers,
Welcome to another edition of This Week in the Future! Apple researchers introduced ReALM, an on-device language model that competes with GPT-4. Plus, OpenAI unveiled Voice Engine to clone any voice with a single 15-second sample.
Let’s get into it!
Apple Enters the AI Realm
In a new research paper, Apple engineers introduced ReALM (Reference Resolution As Language Modeling). Simply put, ReALM sees what’s on your screen and can perform actions autonomously. It works by converting everything it sees and hears into text. This makes it much more efficient: it needs less compute power, which matters because it’s intended to run on-device and will likely be integrated with Siri.
Furthermore, Apple claims that this new system can match and even outperform GPT-4, especially in understanding complex requests or instructions based on what's happening on your screen. For example, you could be browsing a website and tell Siri to "call the business" you just saw without needing to be more specific. Siri will understand the context and know which number to call just based on your current screen.
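To make the "everything becomes text" idea concrete, here is a minimal sketch of how on-screen items might be flattened into a text prompt for a language model. This is our own illustration, not Apple's implementation: the `ScreenEntity` fields, the reading-order sort, and the prompt wording are all assumptions, and the step where a model actually answers the prompt is left out.

```python
from dataclasses import dataclass


@dataclass
class ScreenEntity:
    """A hypothetical on-screen element; the field names are illustrative."""
    kind: str    # e.g. "phone_number" or "address"
    text: str    # the text visible on screen
    top: float   # vertical position, 0.0 = top of the screen
    left: float  # horizontal position, 0.0 = left edge


def screen_to_text(entities):
    """Flatten the screen into numbered lines in top-to-bottom, left-to-right order."""
    ordered = sorted(entities, key=lambda e: (e.top, e.left))
    return "\n".join(
        f"[{i}] {e.kind}: {e.text}" for i, e in enumerate(ordered, start=1)
    )


def build_prompt(entities, request):
    """Combine the textual screen and the user's request into one prompt string."""
    return (
        "On-screen entities:\n"
        f"{screen_to_text(entities)}\n"
        f"User request: {request}\n"
        "Which entity number does the request refer to?"
    )


# Made-up screen contents for the "call the business" example.
screen = [
    ScreenEntity("address", "12 Main St", top=0.6, left=0.1),
    ScreenEntity("business_name", "Luigi's Pizza", top=0.1, left=0.1),
    ScreenEntity("phone_number", "555-0134", top=0.4, left=0.1),
]

print(build_prompt(screen, "call the business"))
```

In this sketch, a language model reading the prompt would be expected to pick the phone number entry, which the assistant could then dial. Because the model only ever sees text, no separate image-understanding step is involved.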
Why This Matters
Apple has been waiting patiently in the shadows for the opportune moment to strike, and with a robust AI strategy expected to be announced at WWDC, ReALM is our best glimpse yet at Apple’s AI future. Expect a long-overdue update to Siri and at least one “Apple special” we haven’t seen before in the consumer AI space.
OpenAI’s Voice Engine
OpenAI has released a preview of Voice Engine, which can clone a voice from a single 15-second audio sample. OpenAI is treading carefully and has yet to decide whether and how it will deploy the technology at scale. Positive applications highlighted include reading assistance and translation. Interestingly, OpenAI issued recommendations for how society should adapt to the consequences of widely available voice cloning technology (while being the originator of said consequences). They include:
Phasing out voice-based authentication for security
Making the public aware that everything they hear might be fake
Our Take
Translation is the most promising use case. Then again, subtitles never hurt anyone, right? After all, it could be argued that voice cloning has few worthwhile applications and plenty of dangerous ones, which is why OpenAI has been keeping this under wraps since late 2022. That being said, the demos are impressive.
🔥 Rapid Fire Inferno
OpenAI expands custom models program and API
ChatGPT is now available without an account
OpenAI enables image editing in DALL·E 3
Cohere introduces Command R+ for enterprises
AI search engine Perplexity plans to sell ads
Google wants to charge for AI search features
Big tech launches consortium to address AI job loss
Anthropic jailbreaks large context models
DeepMind’s Mixture-of-Depths improves AI processing speed
Mistral Large is now available on AWS
Stability AI introduces Stable Audio 2.0
S&P Global releases AI benchmark for business and finance
US and UK partner to safety test AI models
Air Force looks for contractors in the AI space
Oracle and Palantir partner on mission-critical AI solutions
Cloudflare enables one-click AI deployment on Hugging Face
Copilot for Microsoft 365 upgrades to GPT-4 Turbo
Opera browser now lets you download local LLMs
Brave’s Leo AI assistant now available on iOS
Replit announces Replit Teams with Code Repair LLM
OctoAI introduces private AI production stack for enterprises
MultiOn enables AI agents in devices and apps with API
AssemblyAI introduces best-in-class speech-to-text model
CodiumAI announces Devin for enterprises
Princeton releases an open-source alternative to Devin
Stanford develops new drugs with AI and on-device agents
Yahoo acquires struggling AI news app Artifact
📖 What We’re Reading
Generative AI for the Public Sector: The Journey to Scale
“Generative artificial intelligence has the potential to make governments much more efficient and effective. The impact of GenAI on the public sector will be significant. For instance, in our first article in this series, we revealed that the potential productivity improvements from GenAI could be worth $1.75 trillion per year by 2033 globally across all levels of government.”
💻️ AI Tools and Platforms
Retell AI → Conversational voice API for LLMs
CodeRabbit → AI-driven code reviews for teams
Ellipsis → AI dev tool for pull requests and comments
Keywords AI → DevOps platform for AI applications
Hailo → The world’s best edge AI processors
MaxAI.me - Outsmart Most People with 1-Click AI
MaxAI.me best AI features:
Chat with GPT-4, Claude 3, Gemini 1.5.
Perfect your writing anywhere.
Save 90% of your reading & watching time with AI summary.
Reply 10x faster on email & social media.