SaladCloud Blog

INSIDE SALAD

Building an Automated News Bot with OpenClaw and SaladCloud: A Real-World Cost Breakdown

Maksim Gorkii

We wanted to build something practical with OpenClaw not a demo, but an actual workflow we can use every day. The idea: an agent that continuously pulls news about topics we are interested in, summarizes them using a Salad-hosted LLM, and delivers the summaries straight to a Telegram chat. Every five minutes, around the clock.

Here’s how we set it up, what it actually costs, and why the numbers make running your own model on SaladCloud an easy choise.

The Setup

The architecture has three parts:

SaladCloud deployment. We deployed an Ollama container running gpt-oss:20b model on SaladCloud using an RTX 3090 on the lowest priority tier, which costs $0.09 per hour. This model handles all the summarization work.

OpenClaw running locally. We installed OpenClaw on a local machine and configured it with two model providers: our SaladCloud-hosted model as the primary model, and Claude Opus 4.5 available for any tasks that need heavier reasoning.

Telegram integration. OpenClaw connects directly to Telegram. The agent sends news summaries to a designated chat, so we do not miss anything of our interest. We can also pull new posts from our channels on Telegram and have the agent summarize those too but we will keep this of this project for now.

The Workflow

Every five minutes, the agent:

  1. Pulls the latest news on several topics we defined.
  2. Sends the content to our gpt-oss:20b model on SaladCloud for summarization.
  3. Posts the summary to our Telegram chat.

The Numbers

After running the workflow, we measured what a typical summarization request actually costs in tokens. Each request uses roughly 8,000 input tokens (the raw news content being summarized) and 500 output tokens (the summary itself). With a request every five minutes, that gets to:

  • 102,000 tokens per hour (96K input + 6K output)
  • ~2.4 million tokens over 24 hours

Now here’s where the cost comparison gets interesting.

If we ran this on Claude Opus 4.5 at $5 per million input tokens and $25 per million output tokens, the daily bill would be roughly $15 per day – or about $450 per month just for automated news summaries.

On SaladCloud, we’re running an RTX 3090 at $0.09/hour on the lowest priority tier. Running it 24 hours a day, that’s $2.16 per day – or about $65 per month. That’s an 86% cost reduction.

We are also only using the model once every five minutes. The GPU sits idle between requests. We could run far more summarization tasks, add more topics, summarize Telegram channels, or run entirely different workloads on the same deployment – all without spending a single dollar more. On SaladCloud, you pay per hour of compute, not per token. Whether you send 100 requests or 10,000, the cost is the same.

Why Smaller Models Work Here

A 20B parameter model is more than capable of producing high-quality summaries. Summarization is a well-understood task that doesn’t require frontier-model reasoning since the model needs to read content, identify what matters, and condense it clearly. Modern open-source models at 14B–20B parameters do this extremely well. Same model can be used for other purposes as well, translation for example, or key points extraction.

Where you might still want a bigger model is for the initial setup. Defining the workflow, writing the prompts, configuring the agent behavior, and debugging edge cases might be much quicker and easy on the big models. For that, having Opus 4.5 available as an option in the same OpenClaw config is useful. However once the workflow is running the self-hosted model handles the repetitive summarization work without any quality issues.

The Config

Here’s the relevant portion of the OpenClaw configuration we used:

{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "<https://your-salad-deployment-url.salad.cloud/v1>",
        "apiKey": "ollama-local",
        "api": "openai-completions",
        "models": [
          {
            "id": "gpt-oss:20b",
            "name": "gpt-oss:20b",
            "reasoning": false,
            "input": ["text"],
            "cost": {
              "input": 0,
              "output": 0,
              "cacheRead": 0,
              "cacheWrite": 0
            },
            "contextWindow": 128000,
            "maxTokens": 8192
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/gpt-oss:20b",
        "fallbacks": [
          "anthropic/claude-opus-4-5"
        ]
      },
      "models": {
        "ollama/gpt-oss:20b": { "alias": "gpt-oss" },
        "anthropic/claude-opus-4-5": { "alias": "opus" }
      },
      "heartbeat": {
        "every": "5m",
        "model": "ollama/gpt-oss:20b",
        "target": "last"
      }
    }
  }
}

The SaladCloud-hosted model is the primary for all automated work. Opus is there when we need it – switch with /model opus – but the daily grind of summarization runs entirely on our $0.09/hour GPU.

Cost Summary

Claude Opus 4.5SaladCloud (RTX 3090)
Billing modelPer tokenPer hour
Hourly cost (this workload)~$0.63$0.09
Daily cost (24h)~$15$2.16
Monthly cost~$453~$65
Additional usageCosts scale linearlyAlready included

The SaladCloud cost stays flat regardless of how much more we use the model. The Opus cost scales with every additional token.

Takeaway

For repetitive, well-defined tasks like news summarization, a smaller model hosted on SaladCloud is dramatically cheaper than using a model API and the quality is more than sufficient. In addition hourly compute model means you can keep adding workloads without extra cost. Our news bot runs every five minutes, but the same deployment could simultaneously handle Telegram channel digests, document summaries, emails, or any other summarization or other task we give it.

Summarization is just one example. Smaller self-hosted models are capable of handling a wide range of everyday agent tasks:

  • Email drafting. Have the agent scan your inbox on a schedule, flag what’s urgent, and draft replies for routine messages. The input/output pattern is similar to news summarization – mostly reading, with short structured output.
  • Code review. Point OpenClaw at a git diff and ask for a review. Models at 14B–20B can catch bugs, suggest improvements, and flag style issues reliably, especially coding-focused models like Qwen Coder.
  • Meeting prep and follow-ups. Pull calendar events and related documents, generate briefing notes before meetings, and draft follow-up action items from notes afterward.
  • Content repurposing. Take a blog post and have the agent generate social media posts, email newsletter, or internal summaries – all different output formats from the same source material.
  • And many more. The range of options is huge. Most agents need far less “thinking” than LLM’s you talk to directly, if you provide well-defined instructions. Also new open-source models get released daily and improve quickly.

The common thing is that these are all well-defined, repeatable tasks where the model’s job is to read, process, and produce structured output – not to reason about novel problems. That’s the sweet spot for self-hosted models on affordable hardware.

The frontier model might still be needed for building the workflow, handling complex reasoning, and tackling tasks that need it. But for the 90% of agent work that’s routine, a 20B model on a $0.09/hour GPU gets the job done.

Have questions about enterprise pricing for SaladCloud?

Book a 15 min call with our team.

Related Blog Posts

Salad will become a Render Subnet, Salad and Render Partnership

RNP-023 Approved: Salad Is Joining the Render Network

It's official. RNP-023 has passed the community vote, and Salad will now become an exclusive subnet on the Render Network. A few weeks ago we shared our proposal to fully...
Read More

Use Cline with SaladCloud: Building Real Apps for Under $0.01

At SaladCloud, we've been working on easy-to-deploy recipes designed to cover most agentic use cases out of the box. When you run LLMs on Salad, you're not worried about token...
Read More

Salad Proposes Integration with the Render Network

I’m excited to share that Salad has submitted a formal proposal alongside the Render Network Foundation to become a subnet on the Render Network. This would involve fully transitioning our...
Read More

Don’t miss anything!

Subscribe To SaladCloud Newsletter & Stay Updated.