Introducing SSAP: Migrate to Salad GPU Cloud easily & save up to 80%

Prashanth Shankara

AI companies are overpaying for compute today

Affordable, accessible compute is the defining challenge for many AI startups today. Recently, we’ve seen many news stories of innovative AI companies struggling with profitability or running out of cash. The reason? Exorbitant cloud bills and a race to secure high-end, AI-focused GPUs. (A couple of startups even spent almost 50% of their fundraising on GPUs.) Thankfully, the incumbent cloud and chip monopolies are being challenged by new GPU clouds that bring the power of crowdsourced consumer GPUs to AI/ML workloads.

There are three reasons AI companies end up overpaying:

Wrong GPU choice: The mighty marketing machines at chipmakers have convinced the market that everything needs to be run on high-end, AI-focused GPUs that are hard to secure.

  • The reality? Most AI inference use cases deliver better cost-performance on consumer GPUs.

A market made by the monopolies: The cloud monopolies have secured most of these high-end GPUs, creating scarcity that drives up prices. As Chris Zeoli from Wing Venture Capital explains, the rest of the market is just acting as a pipeline of capital to the chipmakers.

  • The reality? Alternative cloud providers (Salad, RunPod, etc.) are using underutilized consumer GPUs and a distributed model to cut costs by 50-80%.

Big margins for APIs: Managed service providers are much easier to integrate with at first and need little ongoing maintenance, but that convenience comes at a very high price. You are paying their margins! For example, a transcription service is priced anywhere from $0.30/hr (an API provider) to $1.40/hr (a popular big cloud).

  • The reality? Small teams can drive massive cost improvements by moving down the cloud stack and using serverless products.

For these reasons, many AI companies are massively overpaying for compute, especially for serving inference, even though cost-effective options are available today.

The urgent need for cloud migration in today’s AI landscape

With profitability top of mind, many AI startups and enterprises alike have taken a multi-cloud approach over the last year, moving production workloads to alternative clouds and consumer GPUs with lower prices and similar performance.

But infrastructure migration can be a huge challenge, especially for startups with minimal resources. There’s also the not-so-minor issue of benchmarking on a new GPU provider.

To tackle these challenges and help resource-strapped companies migrate seamlessly to Salad, we are introducing a new initiative – the Salad Solutions Architect Program (SSAP). We know the name is a mouthful (no thanks to our marketing team here). But the service has already helped 20+ AI companies migrate from another cloud provider or API service to Salad, saving thousands of dollars per month in cloud costs.

“Over time, inference will increasingly be price-performance oriented and older hardware will run some AI workloads — though inference demand will rise exponentially”

– Chris Zeoli, Partner at Wing Venture Capital (https://www.linkedin.com/pulse/great-gpu-shortage-richpoor-chris-zeoli-5cs5c/)

What is Salad GPU Cloud?

SaladCloud is a distributed GPU cloud powered by a secure network of thousands of individual consumer GPUs. Thanks to our marketplace model, our GPU prices are the lowest in the market. Salad’s fleet of RTX GPUs is powering inference at scale for many AI companies today, including some of the Top 50 most visited AI websites in the world.

Our benchmarks show similar or better cost-performance with consumer GPUs for many popular AI use cases like Text to Image, Speech to Text, Text to Speech, Computer Vision and more. All of this comes at a cost at least 50% lower than serving inference with high-end GPUs on big clouds.

Performance comparison for Stable Diffusion v1.4 Inference across different GPU classes

Here is feedback from a Generative AI startup founder wondering why they are paying 5x more for a V100 to get half the performance of an RTX 4090.
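To make the arithmetic in that feedback concrete: paying 5x the price for half the throughput works out to 10x the cost per unit of work. Here is a minimal sketch of that calculation. The function name and the normalized values are illustrative assumptions that encode only the ratios from the feedback, not actual prices or benchmark results.

```python
def cost_per_unit_work(price_per_hour: float, throughput_per_hour: float) -> float:
    """Cost to produce one unit of output (for example, one generated image)."""
    if throughput_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return price_per_hour / throughput_per_hour


# Normalized, illustrative values: the RTX 4090 is the baseline (1.0, 1.0);
# the V100 costs 5x as much and delivers half the throughput.
rtx_4090 = cost_per_unit_work(price_per_hour=1.0, throughput_per_hour=1.0)
v100 = cost_per_unit_work(price_per_hour=5.0, throughput_per_hour=0.5)

ratio = v100 / rtx_4090
print(f"The V100 costs {ratio:.0f}x more per unit of work")
```

Swap in your own hourly prices and measured throughput to get the same ratio for any pair of GPUs.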

What is the Salad Solutions Architect Program (SSAP)?

“As much as we want to migrate to Salad (the pricing makes perfect business sense), we are very busy with a new product launch and frankly don’t have the resources or time to migrate. If your team can help on that end, we’d be on board right away”

– Founder of a Top 50 Generative AI image generation platform

The Salad Solutions Architect Program (SSAP) was born from repeated feedback like the quote above. We heard from AI startups that they were keen to migrate away from the big clouds to cut cloud costs but were hampered by two main challenges:

  • Resource constraints to architect migration to a distributed cloud model
  • Time constraints to benchmark their models on Salad’s consumer GPUs

With this feedback in mind, we designed SSAP to help companies migrate production workloads to SaladCloud easily. SSAP essentially acts as an extension of your team, helping with benchmarking and migration. As part of the program, our team can assist with building, migrating, testing and benchmarking your workload on our highly scalable, cost-performant cloud.

Qualified teams will gain access to a dedicated Solutions Engineer from Salad who will assist with coding, adjusting backend architecture, configuring SaladCloud container groups, and benchmarking results across our diverse range of consumer GPUs. Once you are onboarded, our managed container service allows you to run stateless Docker containers seamlessly across our network.

How do I qualify for SSAP?

Why should a company join the Salad Solutions Architect Program?

For AI companies struggling with enormous cloud costs but strapped for resources, SSAP offers a way to seamlessly move production workloads to Salad’s distributed infrastructure. Some of the program benefits include:

Free credits worth $5k-10k

Qualifying companies get up to $10,000 in credits over a two-month period to test their use case on SaladCloud. SSAP allows new customers to test and integrate with SaladCloud risk-free and realize the cost benefits immediately.

Coding done for you

Developers and companies access SaladCloud via Salad Container Engine (SCE), a massively scalable orchestration engine, purpose-built to simplify container development. As long as you can containerize your model, switching to SaladCloud is a simple process. Our Solutions Architect will handle all the coding to get your containers up and running on Salad.

GPU benchmarking

While we have published numerous benchmarks comparing the performance of popular AI models on various consumer GPUs, we understand the need to benchmark GPU performance on your own models. More importantly, the right GPU choice for a use case could save thousands of dollars. As part of SSAP, our Solutions Architect will benchmark your use case against multiple GPU classes on SaladCloud. This ensures you find the right balance of performance and cost.
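As a rough illustration of how benchmark results can feed that choice, the sketch below picks the GPU class with the lowest cost per output, optionally subject to a minimum throughput. Everything in it is a hypothetical placeholder (the `BenchmarkResult` type, the `pick_gpu` helper, the GPU class names, and the prices and throughputs). It is not Salad's benchmarking tooling or real pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    gpu_class: str           # placeholder label, not a real GPU model
    price_per_hour: float    # USD per node-hour (placeholder value)
    outputs_per_hour: float  # measured throughput, e.g. images per hour

    @property
    def cost_per_output(self) -> float:
        return self.price_per_hour / self.outputs_per_hour


def pick_gpu(results, min_outputs_per_hour: float = 0.0) -> BenchmarkResult:
    """Return the cheapest-per-output result that meets the throughput floor."""
    eligible = [r for r in results if r.outputs_per_hour >= min_outputs_per_hour]
    if not eligible:
        raise ValueError("no GPU class meets the throughput requirement")
    return min(eligible, key=lambda r: r.cost_per_output)


# Placeholder numbers purely for illustration.
results = [
    BenchmarkResult("gpu_a", price_per_hour=0.10, outputs_per_hour=400.0),
    BenchmarkResult("gpu_b", price_per_hour=0.30, outputs_per_hour=1500.0),
    BenchmarkResult("gpu_c", price_per_hour=1.00, outputs_per_hour=2000.0),
]

best = pick_gpu(results)                                     # cheapest per output
fastest_ok = pick_gpu(results, min_outputs_per_hour=1800.0)  # with a throughput floor
print(best.gpu_class, fastest_ok.gpu_class)
```

The throughput floor matters when a workload has latency or volume requirements: the cheapest GPU per output is not always one that can keep up.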

Custom documentation

We understand every use case and container has specific needs. Our team will create custom documentation specific to your use case and containers.

How does SSAP work in practice?

If you are interested in cutting your AI inference costs by 50-80% with Salad, you can enroll in SSAP after a chat with our team. You can book a call here.

SSAP is a 5-step program:

  1. Sign mutual NDAs (mNDAs) to ensure your data remains confidential.
  2. Join an exclusive Slack Connect group chat for easy communication and collaboration.
  3. Provide access to your code repos and architecture diagrams for our solutions engineers.
  4. Draft a statement of work (SOW) outlining our deliverables and the expected timeframe for the architecture buildout.
  5. Continue with your work while SSAP gets your containers up and running on SaladCloud.

Sign up for SSAP. Stop overpaying for inference and start saving!

SSAP has already helped some of the Top 50 AI companies (by traffic) migrate entire production workloads to SaladCloud within weeks. Our consumer GPU prices are the lowest in the market, enabling you to serve inference at the lowest market cost. With over 1 million PCs on the network and 10,000+ GPUs available at any time, our infrastructure is also built to help you scale easily as you grow. No more GPU shortages or signing expensive, long-term contracts just to secure compute power.

Sign up for a call with our team today and we’ll evaluate your use case for SSAP right away.

