HuggingChat: Multi-Model Chat with Omni Router + MCP Tools

·903 words·5 mins
Alejandro AO

Software Engineer and Educator. Developer Advocate at Hugging Face 🤗

I help you build AI Apps that just work.

Introduction
#

HuggingChat is Hugging Face’s free, open-source chat interface that gives you access to multiple AI models. Unlike proprietary alternatives, HuggingChat lets you choose from various open-source models and even route requests automatically to the best model for your task.

What makes HuggingChat stand out:

  • Multiple models - Switch between Qwen, Llama, and other open models
  • Omni router - Automatically selects the best model per request
  • MCP integration - Connect external tools and services
  • Spaces as tools - Access thousands of HF Spaces as MCP tools

Multi-Model Access
#

HuggingChat provides access to several open-source models. You can browse the full list at huggingface.co/chat/models.

Available models include:

  • Qwen3-Coder - Optimized for coding tasks
  • Llama-3.3-70B-Instruct - General-purpose conversations
  • Llama-3.2-90B-Vision - Image understanding
  • Mistral, DeepSeek, and more

To switch models, click the model name in the chat interface and select from the dropdown. Each model has different strengths—coding models excel at programming, vision models can analyze images, and general models handle everyday questions.

Omni - The Model Router
#

Instead of manually picking a model, you can select Omni and let HuggingChat route your request to the best model automatically.

Omni uses katanemo/Arch-Router-1.5B to analyze your message and select the appropriate model:

Request Type         Routed To
Coding, debugging    Qwen3-Coder
Image analysis       Llama-3.2-90B-Vision
Tool usage           Llama-3.3-70B
General chat         Llama-3.3-70B

When Omni routes your request, the UI shows which model was selected. This transparency lets you see the routing in action and understand why certain models were chosen.

The router runs on every message with a 10-second timeout. It analyzes your conversation context and matches it against predefined routes.
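The routing behavior above can be sketched as a simple classifier. This is only a conceptual illustration, not HuggingChat's actual implementation (the real router is the katanemo/Arch-Router-1.5B model); the keyword heuristics below are assumptions standing in for the model's decision:

```python
# Conceptual sketch of per-message routing. NOT HuggingChat's real router,
# which uses the katanemo/Arch-Router-1.5B model with a 10-second timeout.
ROUTES = {
    "coding": "Qwen3-Coder",
    "vision": "Llama-3.2-90B-Vision",
    "tools": "Llama-3.3-70B",
    "general": "Llama-3.3-70B",  # fallback route
}

# Hypothetical keyword heuristics standing in for the router model's output.
KEYWORDS = {
    "coding": ("code", "debug", "function", "python"),
    "vision": ("image", "photo", "picture"),
    "tools": ("search", "tool", "fetch"),
}

def route(message: str) -> str:
    """Return the model a message would be routed to."""
    text = message.lower()
    for route_name, words in KEYWORDS.items():
        if any(word in text for word in words):
            return ROUTES[route_name]
    return ROUTES["general"]
```

For example, `route("Write a Python function to sort a list")` matches the coding route, while a plain greeting falls through to the general-chat model.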

MCP Integration
#

MCP (Model Context Protocol) is a standard for connecting AI assistants to external tools. HuggingChat supports MCP servers natively, letting you extend its capabilities.

Two types of MCP servers:

  1. Base servers - Pre-configured by administrators, available to all users
  2. User servers - Add your own from the settings UI

To add an MCP server:

  1. Open the chat input and click the + button
  2. Select MCP Servers
  3. Click Add Server
  4. Enter the server name and URL
  5. Run Health Check to verify

When MCP servers are enabled and you use Omni, the router automatically selects a tools-capable model.
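Under the hood, MCP clients and servers exchange JSON-RPC 2.0 messages, and a health check amounts to a successful handshake with the server. As a sketch, this is the shape of the request a client sends to discover a server's tools (`tools/list` is the standard MCP method; the transport details vary by server):

```python
import json

# Sketch of an MCP tool-discovery request. MCP messages are JSON-RPC 2.0;
# "tools/list" is the standard method for listing a server's tools.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
    "params": {},
}

payload = json.dumps(request)
print(payload)
```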

The HF MCP Server
#

Hugging Face provides an official MCP server at hf.co/mcp that connects your assistant to the entire Hub.

Built-in capabilities:

  • Search models - Find models by name, task, or description
  • Search datasets - Discover datasets for training
  • Search Spaces - Find community apps and demos
  • Search papers - Browse ML research papers
  • Run tools - Execute MCP-compatible Gradio Spaces

Setup
#

  1. Visit huggingface.co/settings/mcp while logged in
  2. Select your client (HuggingChat, Claude Desktop, VS Code, etc.)
  3. Copy the configuration snippet
  4. Paste into your client settings
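Most clients store the pasted snippet as JSON. The sketch below shows a typical shape for a remote MCP server entry; the exact key names vary by client, so use the snippet generated at huggingface.co/settings/mcp rather than this assumption:

```python
import json

# Typical shape of an MCP client configuration entry. Key names vary by
# client -- always prefer the snippet from huggingface.co/settings/mcp.
config = {
    "mcpServers": {
        "hugging-face": {
            "url": "https://huggingface.co/mcp",
        }
    }
}

print(json.dumps(config, indent=2))
```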

For authenticated access, the server supports OAuth. This lets MCP servers access user-specific resources on your behalf.

Example prompts after connecting:

  • “Search Hugging Face models for Qwen 3 quantizations”
  • “Find a Space that can transcribe audio files”
  • “Show datasets about weather time-series”
  • “Create a 1024x1024 image of a cat in Ghibli style”

Spaces as MCP Tools
#

Any Gradio Space with MCP support can become a tool in your chat. This opens access to thousands of community-built AI applications.

Browse MCP-enabled Spaces at huggingface.co/spaces?filter=mcp-server.

Popular examples:

  • Image generation - FLUX.1 and other diffusion models
  • Audio transcription - Whisper-based Spaces
  • Document search - RAG and embedding Spaces
  • Video generation - Text-to-video models

To add a Space as a tool:

  1. Go to huggingface.co/settings/mcp
  2. Browse or search for MCP-enabled Spaces
  3. Click to add the Space
  4. Restart your client

Gradio MCP apps expose their functions as tools with arguments and descriptions, so your assistant can call them directly and return results to your conversation.
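Conceptually, a tool description is derived from the underlying function's name, docstring, and parameters. The sketch below uses Python's `inspect` module to show the idea; it is a simplified stand-in, not Gradio's actual MCP implementation, which builds richer schemas from each component's configuration:

```python
import inspect

def transcribe(audio_path: str, language: str = "en") -> str:
    """Transcribe an audio file and return the text."""
    return f"transcript of {audio_path} ({language})"  # placeholder body

def tool_schema(fn) -> dict:
    """Build a minimal MCP-style tool description from a function.

    Simplified sketch -- Gradio's real MCP integration derives
    schemas from the app's components, not bare functions.
    """
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "arguments": {
            name: {
                "type": getattr(param.annotation, "__name__", "any"),
                "required": param.default is inspect.Parameter.empty,
            }
            for name, param in signature.parameters.items()
        },
    }

schema = tool_schema(transcribe)
```

An assistant receiving this schema knows the tool's name, what it does, and that `audio_path` is required while `language` is optional, so it can fill the arguments from your conversation and call the tool directly.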

HuggingChat on Docs Pages
#

As of December 12, 2025, all Hugging Face documentation pages feature a built-in assistant in the bottom-right corner.

This assistant is powered by:

  • HuggingChat for the chat interface
  • HF MCP server for documentation search

Use cases:

  • Ask questions about Transformers, Diffusers, or any HF library
  • Get code examples and explanations
  • Navigate complex documentation quickly

The assistant has context about HF libraries and can search documentation to find answers. No setup required—just visit any docs page and click the assistant icon.

Demo Walkthrough
#

Here’s how to use these features together:

1. Omni Routing
#

  1. Go to huggingface.co/chat
  2. Select Omni as your model
  3. Ask a coding question: “Write a Python function to sort a list”
  4. Watch the UI show which model was selected (likely Qwen3-Coder)

2. Enable HF MCP Server
#

  1. Click + in the chat input
  2. Select MCP Servers
  3. Enable the Hugging Face server

3. Model Discovery
#

With HF MCP enabled, ask:

  • “Find the most downloaded text-to-image models”
  • “Search for datasets about sentiment analysis”

4. Image Generation
#

  1. Add a FLUX Space from MCP settings
  2. Ask: “Generate an image of a sunset over mountains”
  3. The model calls the Space and returns the image

5. Documentation Search
#

  1. Visit any HF docs page (e.g., Transformers docs)
  2. Click the assistant icon in the bottom-right
  3. Ask: “How do I load a model with 4-bit quantization?”

Conclusion
#

HuggingChat combines multiple open-source models with intelligent routing and tool integration:

  • Omni routes requests to the best model automatically
  • MCP support connects external tools and services
  • HF MCP server gives access to the entire Hub
  • Spaces as tools extends capabilities with community apps
  • Docs assistant helps navigate HF documentation

Get started at huggingface.co/chat.