Tag 309K Images/$ with Recognize Anything Model++ (RAM++) On Consumer GPUs

What is the Recognize Anything Model++?

The Recognize Anything Model++ (RAM++) is a state-of-the-art image tagging foundation model released last year, with pre-trained model weights available on the Hugging Face Hub. It significantly outperforms other open models like CLIP and BLIP in both the scope of recognized categories and accuracy. But how much does […]

Segment Anything Model (SAM) Benchmark: 50K Images/$ on Consumer GPUs

What is the Segment Anything Model (SAM)?

The Segment Anything Model (SAM) is a foundation model for image segmentation released by Meta AI Research last year, with pre-trained model weights available through the GitHub repository. It can be prompted with a point or a bounding box, and performs well on a variety of segmentation tasks. More […]

Stable Diffusion v1.5 Benchmark On Consumer GPUs

Benchmarking Stable Diffusion v1.5 across 23 consumer GPUs

What’s the best way to run inference at scale for Stable Diffusion? It depends on many factors. In this Stable Diffusion (SD) benchmark, we used SD v1.5 with a ControlNet to generate over 460,000 fancy QR codes. The benchmark was run across 23 different consumer GPUs on […]

Comparing Price-Performance of 22 GPUs for AI Image Tagging (GTX vs RTX)

Older Consumer GPUs: A Perfect Fit for AI Image Tagging

In the current AI boom, there’s palpable excitement around sophisticated image generation models like Stable Diffusion XL (SDXL) and the cutting-edge GPUs that power them. These models often require more powerful GPUs with larger amounts of VRAM. However, while the industry is abuzz with these […]

Bark Benchmark: Reading 144K Recipes with Text-to-Speech on SaladCloud

Speech Synthesis with suno-ai/bark

When you think of speech synthesis, you might think of a very robotic-sounding voice, like this one from 1979. Maybe you think of more modern voice assistants, like Siri or the Google Assistant. While these are certainly improvements over what we had in the 1970s, they still wouldn’t be mistaken […]

The AI GPU Shortage: How Gaming PCs Offer a Solution and a Challenge

Reliability in Times of AI GPU Shortage

In the world of cloud computing, leading providers have traditionally utilized expansive, state-of-the-art data centers to ensure top-tier reliability. These data centers, boasting redundant power supplies, cooling systems, and vast network infrastructures, often promise uptime figures ranging from 99.9% to 99.9999% – terms you might have heard as […]

Stable Diffusion XL (SDXL) Benchmark – 769 Images Per Dollar on Salad

Stable Diffusion XL (SDXL) Benchmark

A couple of months back, we showed you how to get almost 5,000 images per dollar with Stable Diffusion 1.5. Now, with the release of Stable Diffusion XL, we’re fielding a lot of questions regarding the potential of consumer GPUs for serving SDXL inference at scale. The answer from our Stable […]

Whisper Large Inference Benchmark: 137 Days of Audio Transcribed in 15 Hours for Just $117

Save Over 99% On Audio Transcription Using Whisper-Large-v2 and Consumer GPUs

Harnessing the power of OpenAI’s Whisper Large V2, an automatic speech recognition model, we’ve dramatically reduced audio transcription costs and time. Here’s a deep dive into our benchmark against the substantial English CommonVoice dataset and how we achieved a 99.1% cost reduction. A Costly […]

Exploring AI Bias by Turning Faces into Salads – An Experiment

What is AI Bias in Image Generation?

Type ‘An engineer smiling at the camera’ as the prompt into a few AI image generators. What do you see? A collection of men in most cases. In a recent experiment, 298 out of the 300 Stable Diffusion-generated images were of perceived men for the prompt […]