-   Name: SmallCon
-   Date: December 11, 2024
-   Focus: Small Language Models (SLMs) and Enterprise AI Implementation
-   Format: Virtual conference with mixed session types


# 👥 Session Details

Fireside Chat with Paul Beswick [~13:20-13:37]

-   Type: Fireside Chat
-   Speaker: Paul Beswick, Global CIO, Marsh McLennan
-   Role: Manages 5000+ technologists globally
-   Session Goal: Share enterprise Gen AI implementation insights and evolution of their approach


# 💡 Key Technical Insights

Architecture Evolution:

-   Initial Approach (2023) and Current Usage:
    -   Started with API-based access (April 2023)
    -   Secured APIs by June 2023
    -   Launched organization-wide LLM assistant in August/September 2023
    -   Current scale: ~25 million requests annually
    -   85% organizational adoption rate

Infrastructure Strategy:

-   Rent models by API call instead of self-hosting
-   Uses fine-tuned small models for specific tasks
-   Current volume: ~500,000 requests/week through fine-tuned model
-   Training costs: ~$20 per training cycle
-   Reported accuracy exceeding GPT-4, with faster response times
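
To put the volume and training-cost figures above in proportion, here is a minimal arithmetic sketch in Python. The request volume and the ~$20 training cost come from the session; the cadence of one training cycle per week is an assumption made only for illustration, and inference (per-call) costs are not included because the session did not quote them.

```python
# Figures quoted in the session notes.
WEEKLY_FINE_TUNED_REQUESTS = 500_000
TRAINING_COST_USD = 20.0

# Assumption (NOT from the session): one training cycle per week.
ASSUMED_CYCLES_PER_WEEK = 1


def annual_requests(weekly_requests: int, weeks_per_year: int = 52) -> int:
    """Scale a weekly request count to a yearly one."""
    return weekly_requests * weeks_per_year


def training_cost_per_request(
    cost_per_cycle: float, cycles_per_week: int, weekly_requests: int
) -> float:
    """Amortize weekly training spend over the requests served that week."""
    return cost_per_cycle * cycles_per_week / weekly_requests


if __name__ == "__main__":
    yearly = annual_requests(WEEKLY_FINE_TUNED_REQUESTS)
    per_request = training_cost_per_request(
        TRAINING_COST_USD, ASSUMED_CYCLES_PER_WEEK, WEEKLY_FINE_TUNED_REQUESTS
    )
    print(f"Fine-tuned model requests per year: {yearly:,}")
    print(f"Amortized training cost per request: ${per_request:.6f}")
```

Under that assumed cadence, training spend is a tiny fraction of a cent per request, which is consistent with the session's point that training cost is not the limiting factor.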

Technical Evolution:

1.  Initial Phase:
    -   Focus on prompting and RAG
    -   API-based implementation
    -   Minimal infrastructure complexity

2.  Current Phase:
    -   Implementation of fine-tuned models
    -   Shared infrastructure approach
    -   Low-cost training cycles
    -   Specialized model targeting


# 🤖 Technical Implementation Details

Infrastructure Management:

-   Avoided self-hosting large language models
-   Implemented pay-per-call model architecture
-   Security managed through API access controls
-   Conservative estimate: Over 1 million hours saved through implementation

Cost Economics:

-   Training cost: ~$20 per cycle
-   Infrastructure sharing across use cases
-   Focus on ROI for specific task automation
-   Economy of scale through shared resources


# 📈 Industry Trends

Evolution of Enterprise AI:

-   Movement from general-purpose to task-specific models
-   Shift toward automated fine-tuning processes
-   Focus on fragmenting models for specialized subtasks
-   Trend toward job augmentation over replacement
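
The move from one general-purpose model to several task-specific ones can be pictured as a simple router. The sketch below is illustrative only: `TaskRouter`, the task names, and the stand-in model functions are all hypothetical and do not describe the speaker's actual system. It shows one way to send a known subtask to a specialized model and fall back to a general model otherwise.

```python
from typing import Callable, Dict

# A "model" here is any callable that maps input text to output text.
Model = Callable[[str], str]


class TaskRouter:
    """Dispatch requests to a task-specific model, else to a general one."""

    def __init__(self, general_model: Model) -> None:
        self._general = general_model
        self._specialists: Dict[str, Model] = {}

    def register(self, task: str, model: Model) -> None:
        """Attach a specialized (e.g. fine-tuned) model to a named task."""
        self._specialists[task] = model

    def run(self, task: str, text: str) -> str:
        """Use the specialist for this task if one exists, else the general model."""
        model = self._specialists.get(task, self._general)
        return model(text)


# Stand-in models (hypothetical); real ones would wrap API calls.
def general_model(text: str) -> str:
    return f"general:{text}"


def classifier_model(text: str) -> str:
    return f"classifier:{text}"


if __name__ == "__main__":
    router = TaskRouter(general_model)
    router.register("classify_document", classifier_model)
    print(router.run("classify_document", "invoice.pdf"))
    print(router.run("summarize", "meeting notes"))
```

Keeping the fallback in place means new subtasks can start on the general model and move to a specialist later without changing calling code.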


# 📋 Follow-up Actions

Technical Focus Areas:

-   [ ] Investigation of automated fine-tuning pipelines
-   [ ] Research on model specialization approaches
-   [ ] Review of infrastructure sharing strategies
-   [ ] Analysis of automation vs. augmentation use cases

Future Development (2025):

1.  Continued office suite integration
2.  Enhanced AI-powered helper applications
3.  Direct efficiency improvements through automation
4.  Increased focus on specialized, task-specific models
5.  Implementation of staged approach: LLM prompting → data collection → fine-tuning
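
The staged approach in item 5 (LLM prompting → data collection → fine-tuning) can be sketched as follows. This is a minimal illustration under stated assumptions, not the speaker's implementation: the function names, the JSONL log format, and the `min_examples` threshold are all hypothetical, and `llm` stands in for whatever API-backed model is being prompted.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def answer_and_log(prompt: str, llm: Callable[[str], str], log_path: Path) -> str:
    """Stages 1 and 2: answer with a prompted LLM and record the pair."""
    response = llm(prompt)
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"prompt": prompt, "response": response}) + "\n")
    return response


def load_examples(log_path: Path) -> List[Dict[str, str]]:
    """Read back the logged prompt/response pairs as training examples."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def ready_for_fine_tuning(log_path: Path, min_examples: int = 1000) -> bool:
    """Stage 3 gate: enough data collected? The threshold is an arbitrary placeholder."""
    return len(load_examples(log_path)) >= min_examples


if __name__ == "__main__":
    # Hypothetical stand-in for an API-backed model.
    def fake_llm(prompt: str) -> str:
        return prompt.upper()

    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "examples.jsonl"
        for p in ["first request", "second request"]:
            answer_and_log(p, fake_llm, log)
        print(len(load_examples(log)), "examples collected")
        print("ready:", ready_for_fine_tuning(log, min_examples=2))
```

In practice the logged pairs would usually be reviewed or filtered before being used as training data; that step is omitted here for brevity.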

The session traced the team's path from initial skepticism about fine-tuning to large-scale deployment of fine-tuned small models, made possible by shared, pay-per-call infrastructure and low training costs.