Can LLMs Reason?
PLUS: GPT-4o mini
Hello readers,
Welcome to another edition of This Week in the Future! Strawberry is the fruit of OpenAI’s labor, a secretive project that represents a supposed breakthrough in the reasoning capability of LLMs. Plus, OpenAI finally has a mini version of GPT-4o.
The Last Strawberry
OpenAI has shared new research on what it calls Prover-Verifier Games, a training technique intended to keep the output of LLMs legible to humans. The researchers found that optimizing strong models solely for getting the correct answer led to solutions that were harder for humans to understand. Training strong models to produce text that weaker models could easily verify also made that text easier for humans to evaluate. The models in question were solving grade-school math problems.
The focus on math is interesting because, according to an exclusive report from Reuters, OpenAI is working on a secretive project called ‘Strawberry’ that supposedly represents a breakthrough in the reasoning capability of LLMs, including the ability to solve complex math problems. Strawberry appears to be Q*, last year’s AI mystery, under a new name, and is reportedly meant to autonomously navigate the internet to perform ‘deep research.’
Why This Matters
LLMs desperately need a breakthrough to prove that they’re not plateauing. If Strawberry fails to amount to anything (or GPT-5 disappoints), it could be the last straw for generative AI. If OpenAI delivers, then we might have LLMs that approach human-level reasoning. This should make them more reliable at carrying out multi-step tasks.
🔥 Rapid Fire
OpenAI releases GPT-4o mini to challenge Haiku and Flash
OpenAI adds compliance and admin tools for ChatGPT Enterprise
Mistral AI releases Codestral Mamba, Mathstral, and NeMo
Hugging Face releases SmolLM, a family of small language models
ProtonMail introduces Proton Scribe, a private AI writing assistant
Tech giants form open source coalition on secure AI
Tech giants used thousands of YouTube videos to train AI
Anthropic and Menlo Ventures launch $100M Anthology Fund
Cohere and Fujitsu partner on Japanese enterprise AI services
SoftBank Group acquires British AI chip maker Graphcore
Salesforce launches yet another Einstein AI product
Researchers propose TTT models to replace transformers
Microsoft unveils SpreadsheetLLM in new research paper
C3 AI and Google Cloud launch generative AI for governments
US Marine Corps releases AI strategy and DARPA prepares for AI
Amazon says it’s time to align on global responsible AI policies
New Senate bill seeks to protect artists and journalists from AI
Your Brilliant Business Idea Just Got a New Best Friend
Got a business idea? Any idea? We're not picky. Big, small, "I thought of this in the shower" type stuff, we want it all. Whether you're dreaming of building an empire or just figuring out how to stop shuffling spreadsheets, we're here for it.
Our AI Ideas Generator asks you 3 questions and emails you a custom-built report of AI-powered solutions unique to your business.
Imagine having a hyper-intelligent, never-sleeps, doesn't-need-coffee AI solutions machine at your beck and call. That's our AI Ideas Generator. It takes your business conundrum, shakes it up with some LLM magic and, voilà, emails you a bespoke report of AI-powered solutions.
Outsmart, Outpace, Outdo: Whether you're aiming to leapfrog the competition or just be best-in-class in your industry, our custom AI solutions have you covered.
Ready to turn your business into the talk of the town (or at least the water cooler)? Let's get cracking! (And yes, it’s free!)
📖 What We’re Reading
“Despite challenging overall market conditions in 2023, continuing investments in frontier technologies promise substantial future growth in enterprise adoption. Generative AI has been a standout trend since 2022, with the extraordinary uptick in interest and investment in this technology unlocking innovative possibilities across interconnected trends such as robotics and immersive reality.”
💻️ AI Tools and Platforms
Atlan → Find, trust, and govern AI-ready data
Enkrypt AI → Control layer for enterprise AI
RunPod → GPU cloud for AI workloads
LanceDB → The database for multimodal AI
Decagon → AI support agents for enterprise