What Is NeoSmith?
NeoSmith is an autodistillation platform that gives every agent in your agentic workflow its own dedicated Small Language Model (SLM). Instead of routing all agents through expensive frontier LLMs like GPT-4 or Claude, NeoSmith automatically distills one expert SLM per agent - each purpose-built for its specific task: tool calling, multi-step reasoning, structured data extraction, or autonomous task execution.
Traditional model distillation requires deep ML expertise, weeks of experimentation, and expensive GPU infrastructure. NeoSmith reduces this to a single API call per agent. Point NeoSmith at your agentic workflow, and it creates one specialized SLM for each agent, handling architecture selection, training, evaluation, and deployment. The result: every agent runs on a model that's an expert in your custom workflow.
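As a rough sketch of what that one-call-per-agent flow could look like, the snippet below builds a distillation request body for a single agent. NeoSmith's actual SDK, endpoint, and field names are not shown in this document, so everything here (the function, the `agent`/`task`/`teacher_model` fields, the `auto` stage flags) is an illustrative assumption, not the real API:

```python
import json

# Hypothetical request builder: field names are assumptions, not
# NeoSmith's documented schema.
def build_distillation_request(agent_name, task, teacher_model):
    """Assemble one distillation request for one agent in the workflow."""
    return {
        "agent": agent_name,
        "task": task,                    # e.g. "tool_calling", "extraction"
        "teacher_model": teacher_model,  # frontier model to distill from
        "auto": {                        # stages the platform handles for you
            "architecture_selection": True,
            "training": True,
            "evaluation": True,
            "deployment": True,
        },
    }

req = build_distillation_request("order-lookup", "tool_calling", "gpt-4")
print(json.dumps(req, indent=2))
```

In a multi-agent workflow you would issue one such request per agent, each naming that agent's specific task.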
Key Features
Automated Distillation Pipelines
Define a task specification and NeoSmith automatically selects the optimal student architecture, generates synthetic training data from your teacher model, runs distillation, and validates the output against your accuracy benchmarks. No ML infrastructure management required.
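The pipeline's contract can be sketched as a task specification plus a validation gate: the distilled model ships only if its eval results clear the spec's accuracy thresholds. The spec fields and metric names below are illustrative assumptions, not NeoSmith's actual schema:

```python
# Hypothetical task specification (field names are illustrative).
task_spec = {
    "task": "structured_data_extraction",
    "teacher_model": "gpt-4",
    "synthetic_examples": 5000,            # generated from the teacher
    "benchmarks": {"exact_match": 0.92},   # minimum acceptable accuracy
}

def passes_benchmarks(eval_results, spec):
    """Gate: every benchmark metric must meet or beat its threshold."""
    return all(eval_results.get(metric, 0.0) >= threshold
               for metric, threshold in spec["benchmarks"].items())

print(passes_benchmarks({"exact_match": 0.94}, task_spec))  # True
print(passes_benchmarks({"exact_match": 0.88}, task_spec))  # False
```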
Agentic Workflow Optimization
SLMs distilled by NeoSmith are specifically tuned for agent behaviors - tool calling with structured JSON output, multi-step reasoning chains, context window management, and graceful error recovery. Your agents get faster, cheaper, and more reliable.
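The "tool calling with structured JSON output" behavior can be illustrated with a minimal harness that parses an SLM's emitted JSON and validates it against a tool registry before execution. The registry and JSON shape here are assumptions for illustration, not NeoSmith's wire format:

```python
import json

# Illustrative tool registry: tool name -> required argument names.
TOOLS = {"get_weather": {"city"}}

def parse_tool_call(raw):
    """Parse an SLM's tool-call JSON and check required arguments."""
    call = json.loads(raw)               # raises on malformed JSON
    required = TOOLS[call["tool"]]       # raises KeyError on unknown tools
    missing = required - set(call.get("arguments", {}))
    if missing:
        raise ValueError(f"missing arguments: {sorted(missing)}")
    return call

call = parse_tool_call('{"tool": "get_weather", "arguments": {"city": "Oslo"}}')
print(call["tool"])  # get_weather
```

The point of distilling for this behavior is that the model reliably emits JSON that passes such a validator, so the error-handling paths fire rarely.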
Sub-1B Parameter Models
Deploy models small enough to run on edge devices, in serverless functions, or alongside your application code. NeoSmith targets the sweet spot between capability and efficiency - typically 100M to 800M parameter models that outperform generic models 10x their size on your specific task.
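Some back-of-envelope arithmetic shows why the sub-1B range fits edge and serverless targets: weight memory is roughly parameters times bytes per parameter. This is generic sizing math, not NeoSmith-specific:

```python
# Weight memory estimate: params * bytes_per_param, in gigabytes.
def weight_memory_gb(params, bytes_per_param):
    return params * bytes_per_param / 1e9

# An 800M-parameter model in fp16 (2 bytes/param) vs 4-bit quantized (0.5):
print(round(weight_memory_gb(800e6, 2), 2))    # 1.6 GB
print(round(weight_memory_gb(800e6, 0.5), 2))  # 0.4 GB
```

At 100M parameters the fp16 figure drops to about 0.2 GB, comfortably within a serverless function's memory budget.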
Production-Ready Deployment
Export distilled models as ONNX, GGUF, or serve them directly via NeoSmith's inference API. Built-in A/B testing lets you compare your SLM against the teacher model before cutting over. Monitor accuracy drift with integrated eval pipelines.
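The A/B step before cutover can be sketched as routing the same eval prompts to both models and measuring how often the SLM's answers match the teacher's. The agreement metric and cutover threshold below are illustrative assumptions, not NeoSmith's built-in criteria:

```python
# Illustrative A/B comparison: fraction of eval prompts where the
# distilled SLM's output matches the teacher's.
def agreement_rate(teacher_outputs, slm_outputs):
    matches = sum(t == s for t, s in zip(teacher_outputs, slm_outputs))
    return matches / len(teacher_outputs)

teacher = ["refund", "lookup", "refund", "escalate"]
slm     = ["refund", "lookup", "refund", "lookup"]

rate = agreement_rate(teacher, slm)
print(rate)  # 0.75

CUTOVER_THRESHOLD = 0.95  # example bar; pick one suited to your task
print(rate >= CUTOVER_THRESHOLD)  # False -> keep the teacher in the loop
```

The same comparison re-run on production traffic is one way to watch for the accuracy drift the integrated eval pipelines monitor.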