Falcon 180B Beats Llama 2

PLUS: Training Cluster as a Service

Hello readers,

Welcome to another edition of This Week in the Future, where we’ll be diving into Falcon 180B, a new open-source LLM out of Abu Dhabi. Between this new frontier model, the emergence of RLAIF as a possible alternative to the labor-intensive RLHF process, and the introduction of Training Cluster as a Service by Hugging Face, generative AI is becoming increasingly accessible to smaller organizations and businesses.

As always, thanks for being a subscriber! We hope you enjoy this week’s content — for a video breakdown, check out the episode on YouTube.

Let’s dive in!

Falcon 180B Takes Flight

The Technology Innovation Institute (TII) of Abu Dhabi, recognized for its pioneering work in AI and quantum computing, recently introduced its newest achievement, Falcon 180B. The model has quickly established itself as the leading open-source LLM: it surpasses Llama 2 in performance and stands toe-to-toe with models from industry titans such as Google and OpenAI. In terms of capability, Falcon 180B tracks closely with PaLM 2 and, depending on the benchmark, lands somewhere between GPT-3.5 and GPT-4.

It’s Not Cheap

While Falcon 180B boasts a large parameter count, harnessing its power comes at a price. Running the model demands around 640GB of video memory, or eight 80GB A100 GPUs, translating to a hardware cost surpassing $100,000.
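For a sense of what that looks like in practice, here is a minimal sketch of loading the model with Hugging Face Transformers. It assumes you have been granted access to the gated tiiuae/falcon-180B checkpoint and are on a multi-GPU node with enough memory; exact memory use and performance will depend on your setup.

```python
# Minimal sketch: loading Falcon 180B with Hugging Face Transformers.
# Assumes access to the gated "tiiuae/falcon-180B" checkpoint and a
# multi-GPU node; requires `accelerate` for device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-180B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision still needs roughly 360GB of VRAM
    device_map="auto",           # shard the layers across all visible GPUs
)

inputs = tokenizer("The capital of the UAE is", return_tensors="pt").to("cuda:0")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Quantized variants can bring the memory bill down considerably, at some cost in quality, but even then this is firmly data-center territory rather than a workstation project.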

A General Purpose Model

Interestingly, Falcon 180B isn't fine-tuned for a specific application, yet it exhibits stellar performance out of the box. We're keen to see the strides it makes once fine-tuned for diverse use cases.

The Trend

Falcon 180B hits the scene at the same time as Hugging Face's Training Cluster as a Service. Fresh off a notable partnership with NVIDIA and a significant funding round, Hugging Face is stepping up as a go-to solution for demanding computational workloads. The service is aimed at those keen to train, fine-tune, or host AI models without the requisite in-house resources, continuing the growing trend of making AI more accessible.

Our Take

High-performing models, often labeled 'frontier models,' are usually proprietary and carry hefty price tags, hindering comprehensive research. Falcon 180B, despite its costs, offers a more affordable alternative to these models. This release holds immense promise for researchers and companies prioritizing top-tier self-hosted model performance.

The AI community awaits the wealth of research Falcon 180B will enable, deepening our understanding of ultra-high-performance models.

Reinforcement Learning From AI Feedback

Google Research released a paper that sheds light on the potential of Reinforcement Learning From AI Feedback (RLAIF) as an alternative or supplement to Reinforcement Learning From Human Feedback (RLHF). RLHF refines supervised models by incorporating extensive human feedback, optimizing the model's output to align more closely with human preferences. It's the magic that elevated GPT-3 to ChatGPT's prowess.

However, the RLHF process is both resource-intensive and time-consuming. Enter RLAIF, an approach that uses a pre-trained LLM as the feedback source instead of humans. The study employed PaLM 2-L as the AI labeler and PaLM 2 Extra Small (XS) as the base model, a choice likely made for cost efficiency.
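To make the idea concrete, here is a conceptual sketch of the RLAIF labeling step: an off-the-shelf LLM, rather than a human annotator, picks the preferred response for each pair of candidates. This is not the paper's actual prompt or pipeline; the prompt wording and the query_labeler_llm function are hypothetical placeholders for whatever labeler LLM you would call (PaLM 2-L in the study).

```python
# Conceptual RLAIF labeling sketch. The prompt and `query_labeler_llm`
# are hypothetical stand-ins, not Google's actual setup.

LABEL_PROMPT = """A good summary is concise and faithful to the text.

Text: {text}

Summary 1: {summary_a}
Summary 2: {summary_b}

Which summary is better? Answer with "1" or "2"."""

def query_labeler_llm(prompt: str) -> str:
    """Hypothetical call to the labeler LLM; returns its text completion."""
    raise NotImplementedError("plug in your LLM client here")

def ai_preference(text: str, summary_a: str, summary_b: str) -> int:
    """Return 0 if the AI labeler prefers summary_a, 1 if it prefers summary_b."""
    answer = query_labeler_llm(
        LABEL_PROMPT.format(text=text, summary_a=summary_a, summary_b=summary_b)
    )
    return 0 if answer.strip().startswith("1") else 1
```

The resulting preference pairs are then used to train a reward model, which drives the usual RL fine-tuning loop, exactly as human-labeled preferences would in RLHF.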

The Results

Human evaluators preferred both the RLHF- and RLAIF-enhanced summaries over those from the baseline supervised model, and there was no statistically significant difference in preference between the two methods. However, the base model was PaLM 2 XS, which has a modest parameter count, so it remains an open question how the results would hold up with larger models.

The research also highlighted a limitation: the evaluation was based on a relatively compact dataset. Google acknowledges this, indicating the necessity for broader tests in future research.

The Preliminary Conclusion

There's promise in RLAIF, though its full capability remains under exploration. If validated, RLAIF could catalyze the development of top-tier LLMs, making AI advancements more accessible and efficient for developers.

🔥 Rapid Fire

🎙️ The AI For All Podcast

This week’s episode featured Adnan Masood, Chief AI Architect at UST, who discussed how AI robots are transforming industries and the impact robots will have on society. We uncovered the efficiency gains and safety improvements that robots can bring to businesses, plus some fun (or frightening) speculation about the future of AI robots.

📖 What We’re Reading

This week’s handpicked article is in keeping with the theme of AI accessibility. On-device AI will require smaller, more efficient models, the existence of which will create a new, accessible ecosystem for application development.

Democratizing on-device generative AI with sub-10 billion parameter models (link)

“Previous series posts established the prohibitive costs and AI privacy issues inherent in running generative artificial intelligence models solely in the cloud. As such, the only viable solution for driving widespread AI adoption and explosive innovation is through on-device generative AI.”

Source: Qualcomm OnQ

💻️ AI Tools and Platforms

  • Intenseye → AI-powered workplace safety

  • Clearbit → B2B marketing intelligence

  • Octocom → AI chatbot for eCommerce stores

  • Claude Pro → Anthropic’s answer to ChatGPT Plus

  • Anecdote → CX and feedback analysis with AI