Blend cuts AI inference cost by 85% on Salad while running 3X more scale

Published: June 5, 2024

Prashanth Shankara

Key takeaways:
– The team at Blend was facing high inference costs & scalability challenges on major cloud providers & local vendors
– Switching to SaladCloud for image generation helped them serve 3X more inference at 1/6th the cost they were paying on GCP & Azure
– They also switched from limited A100s to readily available consumer GPUs on Salad
– Their P95 latency also improved from 13.5 seconds to under 10 seconds on Salad

Startups often have a unique origin story. Blend is no different. Their origins lie in a WhatsApp channel and a local fireworks seller who printed 2000 physical posters for a peculiar reason. Blend is an AI copilot for e-commerce that helps sellers create professional product photos and designs in two clicks without hiring an agency. Their mission is to help entrepreneurs & small sellers grow sales online with compelling social graphics, product photos and SEO-optimized copy.

Today, Blend serves thousands of sellers generating around 6000 images every hour on Salad’s distributed network.

In this chat with Jamsheed Kamardeen, Chief Technology Officer (CTO) at Blend, we discuss their growth, the switch to a distributed cloud, inference cost & more.

How did the idea for Blend come about?

It was during Covid-19. We were in many WhatsApp groups looking for common problems faced by e-commerce sellers and we found a peculiar thing. There were many coaching sessions on how to use photo and design apps. Turns out, many of the sellers didn’t have a design team of their own but they needed to promote their products on social media, create posters, ads and the like.

In fact, one of my cousins had a fireworks shop in a small village in India and 70% of his sales came from posters on WhatsApp & Instagram. He’d go to a local printing shop and use the designer there to create & print out 2000 posters just so he could get a soft copy of the design for ads/promotions. He didn’t even bother distributing the physical posters. So looking at the challenges of local sellers in promoting their products in a digital world – that’s where the idea came from.

Photo editing and design apps have been out there for a long time, right? Weren’t they sufficient?

Yes. There were a lot of apps & horizontal design tools which were good for designers. But the sellers aren’t designers. Plus there is the paradox of choice. Most tools had hundreds of templates, colors, etc. and needed a significant time commitment. Plus, the sellers often ended up with a design that looked terrible because they tweaked too much or too little. So we decided to create Blend to offload the design decision making. Just upload a picture of the product and tell us what you want the offer to be. We’ll remove the background, put the product in appropriate settings and deliver a design with text. Our goal was always to get them the final design in the fewest clicks possible.

A selection of design capabilities in Blend

Today, you have millions of downloads for your app. How crucial was the arrival of Generative AI in your user growth?

Our initial version included background removal and adding in an appropriate background with some other features. But generative AI completely changed the game for us.

For example, if a shoe store wants to do a 25% off Diwali promotion, all they have to do is upload the product photo and describe the offer and event. With Generative AI & Stable Diffusion models, we can identify that it’s a shoe, have LLMs decide what to paint, create an aesthetically pleasing urban background, automatically create appropriate text with the right color scheme and deliver the copy. All it takes is a couple of clicks. This is what led to our massive user growth.

Today, 40% of our users are individual sellers, so we are introducing a separate web app for them as well.

With big growth comes big cloud bills. That must have been the case for Blend as well. What infrastructure challenges did you face here?

Right. Since we are an AI-first design company, inference became our biggest cost factor. Plus, we needed powerful GPUs to run diffusion models. Sourcing GPUs to keep up with the surge in demand quickly became a nightmare.

The existing providers didn’t have the right options for a company like us. AWS only had multi cluster A100s but there was no single cluster A100 option. GCP or Azure had them but they were expensive. So we started looking for alternatives.

We found a local provider who offered A100s for a cheaper price. But that came with reliability & scalability issues. We didn’t always have enough GPUs during times of higher traffic. I started losing a lot of sleep over this GPU shortage.

We’re a small team, so when the machines go down, my sleep goes away.

So again, we were looking for an alternative. That’s when we found Salad.

How has switching to SaladCloud impacted your cost and scaling?

When we switched from the hyperscalers to A100s with a local provider, we didn’t really think the cost could go any lower. But switching to Salad was eye-opening.

On Salad’s consumer GPUs, we are running 3X more scale at half the cost of A100s on our local provider and almost 85% less cost than the two major hyperscalers we were using before.

Plus Salad is much more reliable. We’ve migrated all current and new workloads to Salad.

I’m not losing sleep over scaling issues anymore.

As a CTO, making the switch to a distributed cloud is a huge decision. What was the decision making process?

That’s a good question. I was very skeptical initially about the reliability of Salad. From a technical standpoint, my major question was this: Compared to data centers with reliable internet, how am I going to have reliable workloads on random people’s computers on a distributed cloud? We needed to implement some solutions to strengthen reliability, but it wasn’t as difficult as I initially perceived it to be.

One thing that helped us was the engineering support offered by Salad which made our system a lot more fault tolerant – initial health checks, reliability checks, active monitoring, etc.
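To make the fault-tolerance idea concrete, here is a minimal Python sketch of one common pattern: filter nodes with a health check, then retry a job on another node if the first one drops out. It is an illustration only, not Blend's or Salad's actual implementation. The names `NodeUnavailable`, `run_with_failover`, `run_on_node` and `is_healthy` are hypothetical and do not come from any Salad API.

```python
from typing import Callable, Optional, Sequence


class NodeUnavailable(Exception):
    """Hypothetical error: a worker node dropped out before finishing a job."""


def run_with_failover(
    job: str,
    nodes: Sequence[str],
    run_on_node: Callable[[str, str], str],
    is_healthy: Callable[[str], bool],
    max_attempts: int = 3,
) -> str:
    """Run the job on the first healthy node that completes it."""
    # Initial health check: skip nodes that fail it before sending any work.
    candidates = [node for node in nodes if is_healthy(node)]
    last_error: Optional[Exception] = None
    for node in candidates[:max_attempts]:
        try:
            return run_on_node(node, job)
        except NodeUnavailable as err:
            last_error = err  # node vanished mid-job; move on to the next one
    raise RuntimeError("job failed on all attempted nodes") from last_error
```

In this sketch the caller supplies both the health check and the function that dispatches work, so the retry logic stays independent of any particular provider.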

You were also switching from A100s to consumer GPUs like the RTX 4090. We find that many AI companies run inference on the same high-end GPUs they ran training on initially. How comfortable were you with this switch?

We already had some RTX 3090s for local testing and training. We ran inference loads on them and found the latency acceptable. Salad also had benchmarks showing the cost-performance on consumer GPUs was on par with, if not better than, high-end GPUs.

What was the operational shift in terms of mindset when it comes to reliability & latency?

We had an active monitoring setup with Datadog. Once we set up some monitoring and auto-remediation steps there, reliability wasn’t an issue at all.

And because there is an unlimited supply of GPUs with Salad, every time the system becomes latency-sensitive, we can detect it and remediate it by making an API call to Salad. We can easily over-provision machines and still be at half the cost of our earlier cloud providers.

With larger cloud providers, we were very conservative with our auto-scaling. With Salad, we can be very liberal and assign more GPUs than needed, which helps reliability. Our P95 latency was 13.5 seconds before, but moving to Salad brought it down to under 10 seconds.

It all seemed like a herculean task in the beginning. But we tested & migrated all workloads to Salad within a week.
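The detect-and-remediate loop described in this answer can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Blend's actual setup. The 10-second target mirrors the P95 figure quoted above, while the `headroom` factor, the `max_replicas` cap and both function names are hypothetical. The call to the provider's API is deliberately left out, so the function only returns the replica count one might request.

```python
import math
from typing import Sequence


def p95(latencies_s: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_s)
    # Integer ceiling of 0.95 * n, which avoids floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def desired_replicas(
    current: int,
    observed_p95_s: float,
    target_p95_s: float = 10.0,   # assumption: target taken from the figure above
    headroom: float = 1.5,        # assumption: deliberate over-provisioning factor
    max_replicas: int = 100,      # assumption: safety cap on the request
) -> int:
    """Hold steady when P95 meets the target; otherwise scale out generously."""
    if observed_p95_s <= target_p95_s:
        return current
    scaled = math.ceil(current * (observed_p95_s / target_p95_s) * headroom)
    return min(max(scaled, current + 1), max_replicas)


# Example: feed a recent window of request latencies into the decision.
window = [8.2, 9.1, 7.5, 12.8, 14.0, 9.9, 8.7, 13.5, 10.4, 9.0]
replicas_to_request = desired_replicas(current=10, observed_p95_s=p95(window))
```

The headroom factor reflects the liberal auto-scaling described above: when GPUs are cheap and plentiful, erring toward too many replicas costs little and protects tail latency.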

Can you explain a bit more about the migration process? As more companies look to alternate clouds like Salad, this is a topic of interest for many technical teams.

We were already deploying in Kubernetes before, so we already had things set up as Docker containers. I just needed to go to my Docker repository, create credentials, put my Docker link and credentials on Salad and within 30 minutes, I was up and running. I didn’t need to do anything extra to test on Salad. The barrier to entry was so low and we could test out the network pretty easily.

Because of how easy the migration was, we diverted 20% of traffic to Salad right on day one.
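A partial cutover like the 20% split mentioned above is often done with a deterministic, hash-based router. The Python sketch below shows one generic way to do that. How Blend actually split its traffic is not stated in the interview, so the function name, the backend labels and the request-ID keying are all assumptions.

```python
import hashlib


def route_backend(request_id: str, new_backend_percent: int = 20) -> str:
    """Send a fixed share of requests to the new backend, keyed by request ID."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the hash to a stable bucket in 0..99 so a given ID always routes the same way.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "new" if bucket < new_backend_percent else "existing"
```

Keying on a stable ID means retries of the same request land on the same backend, and raising `new_backend_percent` gradually moves more traffic without reshuffling the requests already routed to the new backend.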

Now that Blend is a power user, what do you like the most about Salad today?

What I like the most about Salad is the cost. It’s just amazing that Salad’s cost is so low compared to others and we don’t struggle with resource constraints.

“With every other cloud provider, even the larger ones, we have to apply for a quota and wait for days to get approved. I’m talking 48 hours just to get approved for two A100s. With Salad, we practically have unlimited GPUs at our disposal at the lowest cost, so we don’t need to worry about scaling.”
Jamsheed Kamardeen, Chief Technology Officer (CTO) at Blend

What’s next for Blend?

So far, we’ve grown by leaning into influencer marketing on TikTok and educational content. Our biggest use case today is running diffusion models on Salad to create aesthetic photo backgrounds for products. Every hour, around 6000 images get generated for this.

Next, we are launching a web app this month due to popular request. Every day, close to 10,000 new products are added to Blend right now. We expect the number of remixes (different perspectives, different zooms, etc.) to increase by at least 2X, simply by increasing the number of outputs for the same product.

We will also introduce a new mobile app for personal use cases which will increase the workloads by at least 3X from where they are today. We’ve got excellent feedback from our beta testers already.

Prashanth Shankara

Prashanth Shankara is an Aerospace Engineer turned Marketer with a decade of experience in Product, Content, Customer & Developer Marketing. A firm believer in the saying “It’s not rocket science. It’s just marketing”, Prashanth loves combining systematic, iterative, data-driven approaches with creative tactics in marketing. As Senior Manager, Marketing at Salad, Prashanth leads the Marketing function with a singular goal – To break the cloud monopoly and help companies fuel AI/ML innovation on the cloud at more affordable prices.
