## 🎯 Conference Overview

👥 Session Details

  • Time: 14:01
  • Type: Panel
  • Speaker(s):
    • Dev Rishi, CEO and Co-founder, Predibase
    • Margaret, Head of Product, Mistral AI
    • Pablo, Distinguished Scientist and Research Manager, NVIDIA
    • Luna, Lead of the Small Language Model team, Hugging Face
    • Diego, Head of Generative AI Partnerships, Meta
  • Session Goal: Discuss the future of generative AI, focusing on the training and serving of small language models (SLMs)

💡 Key Technical Insights

Defining SLMs:

  • Models running on laptops/mobile devices with low latency
  • Typically fewer than 3 to 4 billion parameters
  • Optimized through quantization and compression
  • Suited for specific tasks not requiring extensive world knowledge:
    • Rephrasing
    • Summarization
    • Dialogue generation
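
To make the quantization point concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is an illustration only, not a method described by the panel: the function names and the sample weights are hypothetical, and production toolchains typically quantize per channel or per group with calibration data.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor; guard against an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per weight is bounded by about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of four (fp32) or two (fp16), at the cost of a rounding error of at most roughly half the scale.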

Implementation Strategies:

  • Hybrid deployment combining small and large models
  • Task-based model selection
  • Fine-tuning with synthetic data from larger models
  • Focus on agentic workflows
  • Development of reasoning engines
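
Hybrid deployment with task-based model selection can be sketched as a small routing function. This is a hedged illustration under stated assumptions: the model labels `small-on-device` and `large-hosted`, the `Route` type, and the `needs_world_knowledge` flag are all hypothetical names, and the task set simply mirrors the tasks these notes list as suited to small models.

```python
from dataclasses import dataclass

# Tasks the notes describe as a good fit for small models.
SMALL_MODEL_TASKS = {"rephrasing", "summarization", "dialogue generation"}


@dataclass
class Route:
    model: str   # hypothetical deployment label, not a real model ID
    reason: str


def choose_model(task: str, needs_world_knowledge: bool = False) -> Route:
    """Send narrow tasks to the small model and everything else to the large one."""
    if task.lower() in SMALL_MODEL_TASKS and not needs_world_knowledge:
        return Route("small-on-device", "narrow task, low latency")
    return Route("large-hosted", "broad knowledge or unlisted task")


print(choose_model("summarization"))
print(choose_model("open-domain QA"))
```

A real router would likely add a fallback path, for example escalating to the large model when the small model's output fails a quality check.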

🤖 Featured Technology

Hymba Language Model:

  • 1.5 billion parameters
  • MMLU score: 50
  • Designed for:
    • On-device deployment
    • Rephrasing
    • Summarization
    • Dialogue generation
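
As a rough check on why a 1.5-billion-parameter model is a candidate for on-device deployment, the arithmetic below estimates weight storage at several precisions. It is a back-of-the-envelope sketch: the bytes-per-parameter table is an assumption about common storage formats, and the estimate covers weights only, excluding activations, KV cache, and runtime overhead.

```python
# Assumed storage cost per parameter for common numeric formats.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gb(1.5e9, precision):.2f} GB")
```

Under these assumptions the weights take about 3 GB at fp16 and under 1 GB at int4, which is in the range of current phone and laptop memory.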

📈 Future Outlook

2025 Predictions:

  • Advanced generative AI capabilities
  • Sophisticated agentic workflows
  • Improved reasoning engines
  • Enhanced deployment strategies

Industry Direction:

  • Investment in open-source development
  • Focus on device-deployable models
  • Growing agentic workflow adoption
  • Widespread industry implementation

The panel defined SLMs and their core characteristics and highlighted their role in future AI deployment. It emphasized the practical advantages of small models and how they complement larger systems.

Author: Jason Walsh

j@wal.sh

Last Updated: 2026-04-19 15:33:47
