Liquid AI has released LFM2.5-350M, a 350-million-parameter open-weight language model built to run AI agents on devices as constrained as smartphones. The model needs as little as 81MB of memory on mobile GPUs and 169MB on NPUs, making it one of the smallest production-ready agent models available.
What Happened
LFM2.5-350M is the newest and smallest addition to Liquid AI's LFM2.5 family, which launched in January with 1.2B+ parameter models. The 350M variant was trained on 28 trillion tokens with scaled reinforcement learning, giving it strong instruction-following capabilities (IFEval score: 76.96) despite its compact size.
The model is optimized specifically for data extraction, structured outputs, and tool use on edge devices. It supports a 32k context window and generates up to 40,400 output tokens per second on a single H100. The weights are available on HuggingFace under an open-weight license.
Why It Matters
Running AI agents locally eliminates API costs, network latency, and the privacy risks of sending data to the cloud. At 81MB, LFM2.5-350M fits on hardware that cannot run even the smallest Llama or Mistral models. For creators building AI-powered tools, apps, or workflows, this opens agent capabilities on phones, tablets, and lightweight laptops without cloud dependencies.
The model joins a growing wave of small models designed for edge deployment, alongside Qwen 3.5 Small on the language side and SD3.5 Flash for on-device image generation. The trend is clear: capable AI is moving from data centers to personal devices.
Key Details
- 350M parameters, trained on 28T tokens with reinforcement learning
- 81MB on mobile GPUs, 169MB on NPUs via specialized inference engines
- 32k context window for processing longer documents and conversations
- Supports llama.cpp, MLX, and vLLM inference frameworks out of the box (see the sketch after this list)
- Optimized for tool use and structured outputs, not general knowledge tasks
- Open weights available on HuggingFace today
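
To make the framework support concrete, here is a minimal sketch of running the model through llama-cpp-python, one of the llama.cpp bindings. The GGUF filename is a hypothetical placeholder; substitute whichever quantized file you download from the model's HuggingFace page.

```python
# Minimal sketch: running a local GGUF build of the model via llama-cpp-python.
# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="lfm2.5-350m-q4_k_m.gguf",  # hypothetical filename; use your downloaded file
    n_ctx=32768,                           # matches the advertised 32k context window
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Extract the requested fields as JSON."},
        {"role": "user", "content": "Invoice #4821, due 2025-03-01, total $314.15."},
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```

The same GGUF file works from the llama.cpp CLI; MLX and vLLM follow their own model-loading conventions.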
What to Do Next
Developers building mobile or embedded AI applications can download the model from HuggingFace and run it via llama.cpp or MLX. The model works best for agentic tasks like API calling, form filling, and data extraction rather than knowledge-heavy conversations or code generation. For creative workflows that need local AI processing without cloud costs, LFM2.5-350M is worth testing as a lightweight agent backbone.
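
Since structured output is the model's headline strength, one natural pattern is to constrain generation to a JSON schema so form-filling and extraction results parse reliably. The sketch below uses llama-cpp-python's schema-constrained response_format; the filename and form fields are illustrative assumptions, not anything defined by LFM2.5-350M itself.

```python
# Hedged sketch: schema-constrained extraction with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(model_path="lfm2.5-350m-q4_k_m.gguf", n_ctx=32768)  # hypothetical filename

# Illustrative form schema; decoding is constrained to match it.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "date": {"type": "string"},
    },
    "required": ["name", "email", "date"],
}

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Fill the form fields from the user's message."},
        {"role": "user", "content": "I'm Ada Lovelace, ada@example.com, meeting on 2025-02-14."},
    ],
    response_format={"type": "json_object", "schema": schema},
)
print(result["choices"][0]["message"]["content"])  # JSON matching the schema
```

Constraining output at decode time sidesteps retry loops on malformed JSON, which matters more on battery-powered hardware than in a data center.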