This written version of the video tutorial was generated by an LLM from the video transcript, and supervised by me, Alejandro.
You do not need an expensive cloud machine to run an always-on AI agent. In this tutorial, you’ll see how to host Hermes Agent on a small VPS and keep the total monthly cost around $15/month: roughly $5–10 for the server and $10 for the model budget.
The setup uses a regular VPS provider such as Hetzner or OVH, a Minimax token plan for predictable model usage, and a Pi Agent skill that can help configure the server for you.
What We Are Building
The goal is to run Hermes Agent on your own server so it can stay online all the time.
That gives you:
- Persistent Hermes sessions
- Model configuration in one place
- Cron jobs and skills
- A dashboard hosted on your own domain
- A server that keeps running when your laptop is closed
This tutorial is a shorter companion to my full Hermes VPS setup walkthrough. If you want the deeper, step-by-step version, start with the longer video linked in the references.
The $15/Month Setup
The setup has two main costs:
- VPS hosting — around $5–10/month
- Model access — around $10/month with a Minimax token plan
The VPS does not need a GPU. It only runs Hermes, the dashboard, and the supporting services. The actual model calls go to a hosted model provider.
That is what keeps the server cheap.
Choose a VPS Provider
In the video, I use Hetzner as the main example, but the same setup works with other VPS providers.
Good options include:
- Hetzner Cloud
- OVHcloud VPS
- DigitalOcean Droplets
A small cost-optimized server is enough for most Hermes setups. You are not running the language model locally on the VPS, so you mainly need enough CPU and RAM for the agent process, Node runtime, dashboard, logs, and background jobs.
When creating the server, use:
- Ubuntu
- SSH key login
- A non-root user if possible
- Basic cloud-init configuration for packages and firewall setup
If Hetzner’s smallest instance is unavailable, OVH can be a good alternative. In the video, I compare a few options and show that the exact VPS provider matters less than having a reliable server with SSH access.
Use Minimax for Predictable Model Costs
For the model budget, I recommend looking at Minimax.
The key idea is to use the token plan, not pay-as-you-go, if you want predictable monthly costs. The lowest token plan is around $10/month and is enough for one or two concurrent agents in many personal workflows.
That makes it a good fit for Hermes Agent because you can cap your spend instead of worrying about every token.
Minimax gives you access to capable models with large context windows. For an always-on assistant, that is useful because Hermes can handle longer context, tool outputs, and ongoing sessions.
Alternative: Hugging Face Inference Providers
If you do not want a monthly token plan, you can also use Hugging Face Inference Providers.
This lets you call supported models through the Hugging Face router with a Hugging Face access token. It is a nice option if you want to experiment with multiple models or only use the agent occasionally.
The flow is:
- Go to Hugging Face models.
- Filter for models available through Inference Providers.
- Create a Hugging Face access token.
- Enable permission to make calls to Inference Providers.
- Configure Hermes to use the Hugging Face router endpoint.
This is more flexible, but the Minimax token plan is easier if your main goal is a predictable monthly cap.
Let an AI Agent Configure the VPS
The most useful part of the workflow is that you do not have to configure everything manually.
In the demo, I connect to the VPS over SSH and use Pi Agent with a Hermes VPS setup skill. The skill guides the agent through the server setup process and asks questions when it needs more information.
The setup skill handles common VPS tasks such as:
- Updating the system
- Installing runtime dependencies
- Configuring a firewall
- Enabling Fail2ban
- Disabling root SSH login
- Setting up Hermes
- Preparing DNS and Caddy configuration
- Creating a GitHub remote backup workflow
You still need to supervise the agent, especially for security-related steps, but it removes a lot of tedious manual work.
Configure Hermes with Your Model Provider
Once the VPS is ready, Hermes needs a model provider.
For the $15/month setup, use the Minimax API key from your token plan. Be careful to use the key connected to the subscription plan, not a pay-as-you-go configuration, if your goal is to cap monthly expenses.
You can also configure Hermes with Hugging Face Inference Providers if you prefer a more flexible model-router setup.
The important point is that the VPS is only the host. The model itself runs elsewhere, so a cheap VPS is enough.
Host Hermes on Your Own Domain
The setup skill can also guide you through DNS and Caddy configuration so the Hermes dashboard is available on your own domain.
That usually means:
- Creating DNS records that point to your VPS.
- Configuring Caddy as the reverse proxy.
- Securing the dashboard.
- Verifying that the service is reachable over HTTPS.
Once that is done, Hermes can live on your own server and domain, while the dashboard gives you access to sessions, models, cron jobs, and skills.
When This Setup Makes Sense
This setup is a good fit if you want:
- A personal AI assistant that is always online
- Predictable monthly model costs
- A self-hosted dashboard
- A server you control
- A lightweight environment for agent skills and cron jobs
It is not meant for running large open models locally. For that, you would need GPU infrastructure. This approach keeps costs low by using hosted model providers and running only the agent infrastructure on the VPS.
