SaladCloud Blog

Announcing Salad’s SOC 2 Type 1 Certification


We are pleased to announce that Salad Technologies is now SOC 2 Type I compliant, reinforcing our commitment to democratize the cloud while ensuring the highest standards of security and compliance. This certification is in accordance with American Institute of Certified Public Accountants (AICPA) standards for SOC for Service Organizations, also known as SSAE 18.

What is the SOC 2 Type I Certification?

The SOC 2 Type I certification is the first level of certification within the SOC 2 framework. It provides a snapshot assessment of an organization's security controls as of a specific point in time. A SOC 2 Type I report describes the security and compliance controls provided by the organization and attests that those controls are suitably designed and implemented. SOC 2 Type I certification is based on five trust service principles, or criteria: security, availability, processing integrity, confidentiality, and privacy. These principles define the standards for managing customer data and ensuring its protection.

Why does this matter to Salad – and to our users?

As the world's largest distributed cloud, we have securely distributed business applications and workloads to anonymous hosts since 2018. With the introduction of SaladCloud to businesses in 2023, the number of B2B and B2C companies running production workloads across our network is growing tremendously. As of last count, our network welcomes more than 1 million consumer GPUs and hundreds of B2B/B2C businesses running workloads on our cloud at affordable prices. Large-scale AI image generation, voice AI, large language models (LLMs), data collection, residential IP addresses, computer vision – the applications running on SaladCloud are growing, and so is our network of community-sourced GPUs.

To protect the data and workloads of our contributors, customers, and users, our team has implemented and assiduously maintains redundant security layers across our network, organization, and distributed machine environments. This certification confirms that our information security practices, policies, procedures, and operations meet the SOC 2 standards for security.

Security at the heart of our mission to democratize cloud computing

Secure Personnel: We take the security of our data and that of our clients and customers seriously, and we ensure that only vetted personnel are given access to those resources.

Secure Development: Our software development is conducted in line with OWASP Top 10 recommendations for web application security.

Secure Testing: We deploy third-party penetration testing and vulnerability scanning of all production and internet-facing systems on a regular basis. In addition, we perform static and dynamic software application security testing of all code, including open source libraries.

Secure Cloud: We provide maximum security on SaladCloud with complete customer isolation in a modern, multi-tenant cloud architecture. We leverage the native physical and network security features of the cloud service and rely on the providers to maintain the infrastructure, services, and physical access policies and procedures. All customer cloud environments and data are isolated using our patented isolation approach. Each customer environment is stored within a dedicated trust zone to prevent any accidental or malicious co-mingling. All data is encrypted at rest and in transit to prevent unauthorized access and data breaches. Our entire platform is also continuously monitored by dedicated, highly trained experts.
We separate each customer's data and our own, utilizing unique encryption keys to ensure data is protected and isolated. Our data protection complies with SOC 2 standards to encrypt data in transit and at rest, ensuring customer and company data and sensitive information are protected at all times. We implement role-based access controls and the principle of least-privileged access, and we review and revoke access as needed.

Salad's SOC 2 Type I Certification

Salad Technologies, Inc. was audited by Prescient Assurance, a leader in security and compliance certifications for B2B and SaaS companies worldwide. Prescient Assurance is a registered public accounting firm in the US and Canada that provides risk management and assurance services, including but not limited to SOC 2, PCI, ISO, NIST, GDPR, CCPA, HIPAA, and CSA STAR. For more information about Prescient Assurance, you may reach out to them at [email protected]. To learn about trust, security, and compliance on Salad, visit our Trust Center.

Prashanth Shankara is an Aerospace Engineer turned Marketer with a decade of experience in Product, Content, Customer & Developer Marketing. A firm believer in the saying "It's not rocket science. It's just marketing", Prashanth loves combining systematic, iterative, data-driven approaches with creative tactics in marketing. As Senior Manager, Marketing at Salad, Prashanth leads the Marketing function with a singular goal – to break the cloud monopoly and help companies fuel AI/ML innovation on the cloud at more affordable prices.

Comparing Price-Performance of 22 GPUs for AI Image Tagging (GTX vs RTX)


Older Consumer GPUs: A Perfect Fit for AI Image Tagging

In the current AI boom, there's a palpable excitement around sophisticated image generation models like Stable Diffusion XL (SDXL) and the cutting-edge GPUs that power them. These models often require more powerful GPUs with larger amounts of vRAM. However, while the industry is abuzz with these advancements, we shouldn't overlook the potential of older GPUs, especially for tasks like image tagging and search embedding generation. These processes, employed by image generation platforms like Civit.ai and Midjourney, play a crucial role in enhancing search capabilities and overall user experience. We leveraged Salad's distributed GPU cloud to evaluate the cost-performance of this task across a wide range of hardware configurations.

What is AI Image Tagging?

AI image tagging is a technology that can automatically identify and label the content of images, such as objects, people, places, colors, and more. This helps users organize, search, and discover their images more easily and efficiently, and it can be applied to a wide variety of purposes and applications.

Benchmarking 22 Consumer-Grade GPUs for AI Image Tagging

In designing the benchmark, our primary objective was to ensure a comprehensive and unbiased evaluation. We selected a range of GPUs on SaladCloud, starting from the GTX 1050 and extending up to the RTX 4090, to capture a broad spectrum of performance capabilities. Each node in our setup was equipped with 16 vCPUs and 7 GB of RAM, ensuring a standardized environment for all tests. For the datasets, we chose two prominent collections from Kaggle: the AVA Aesthetic Visual Assessment and the COCO 2017 Dataset. These datasets offer a mix of aesthetic visuals and diverse object categories, providing a robust testbed for our image tagging and search embedding generation tasks.

We used ConvNextV2 Tagger V2 to generate tags and ratings for images, and CLIP to generate embedding vectors. The tagger model used the ONNX runtime, while CLIP used Transformers with PyTorch. ONNX's GPU capabilities are not a great fit for Salad because of inconsistent Nvidia driver versions across the network, so we chose the CPU runtime and allocated 16 vCPUs to each node. PyTorch with Transformers works well across a large range of GPUs and driver versions with no additional configuration, so CLIP was run on the GPU.

Benchmark Results: GTX 1650 is the Surprising Winner

As expected, our nodes with higher-end GPUs took less time per image, with the flagship RTX 4090 offering the best performance. What is interesting, though, is that the median time per image is actually very similar for the GTX 1650 and the RTX 4090: 1 second. The best-case and worst-case performance of the 4090 is notably better. Weighting our findings by cost, we can confirm our intuition that the 1650 is a much better value at $0.02/hr than the 4090 at $0.30/hr. While older GPUs like the GTX 1650 have worse absolute performance than the RTX 4090, the large difference in price makes the older GPUs the better value, as long as your use case can tolerate the additional latency. In fact, we see all GTX-series GPUs outperforming all RTX GPUs in terms of images tagged per dollar.
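The metric behind that comparison is straightforward: images tagged per dollar follows directly from the measured time per image and the hourly node price. Below is a minimal Python sketch of the calculation; the values in the example calls are purely illustrative and are not the measured benchmark figures.

```python
def images_per_dollar(seconds_per_image: float, price_per_hour_usd: float) -> float:
    """Images processed per dollar for a node, given its per-image latency
    and its hourly price (3600 seconds per hour)."""
    images_per_hour = 3600.0 / seconds_per_image
    return images_per_hour / price_per_hour_usd


# Purely illustrative numbers, not the measured benchmark results:
# a $0.02/hr node averaging 2.0 s/image vs. a $0.30/hr node averaging 0.6 s/image.
print(images_per_dollar(2.0, 0.02))  # 90000.0 images per dollar
print(images_per_dollar(0.6, 0.30))  # 20000.0 images per dollar
```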
GTX Series: The Cost-Effective Option for AI Image Tagging with 3x More Images Tagged per Dollar

In the ever-advancing realm of AI and GPU technology, the allure of the latest hardware often overshadows the nuanced decisions that drive optimal performance. Our analysis not only emphasizes the balance between raw performance and cost-effectiveness but also resonates with broader cloud best practices. Just as it's pivotal not to oversubscribe to compute resources in cloud environments, it's equally essential to avoid overcommitting to high-end GPUs when more cost-effective options can meet the requirements. The GTX 1650's value proposition, especially for tasks with flexible latency needs, serves as a testament to this principle, delivering 3x as many images tagged per dollar as the RTX 4090. As we navigate the expanding landscape of AI applications, making judicious hardware choices based on comprehensive, real-world benchmarks becomes paramount. It's a reminder that the goal isn't always to harness the most powerful tools, but rather the most appropriate ones for the task and budget at hand.

Run Your Image Tagging on Salad's Distributed Cloud

If you are running AI image tagging or any AI inference at scale, Salad's distributed cloud has 10,000+ GPUs at the lowest prices in the market. Sign up for a demo with our team to discuss your specific use case.

Shawn Rushefsky is a passionate technologist and systems thinker with deep experience across a number of stacks. As Generative AI Solutions Architect at Salad, Shawn designs resilient and scalable generative AI systems to run on our distributed GPU cloud. He is also the founder of Dreamup.ai, an AI image generation tool that donates 30% of its proceeds to artists.

The AI GPU Shortage: How Gaming PCs Offer a Solution and a Challenge


Reliability in Times of AI GPU Shortage

In the world of cloud computing, leading providers have traditionally utilized expansive, state-of-the-art data centers to ensure top-tier reliability. These data centers, boasting redundant power supplies, cooling systems, and vast network infrastructures, often promise uptime figures ranging from 99.9% to 99.9999% – terms you might have heard as "three nines" to "six nines." For those who have engaged with prominent cloud providers, these figures are seen as a gold standard of reliability.

However, the cloud computing horizon is expanding. Harnessing the untapped potential of idle gaming PCs is not only a revolutionary departure from conventional models but also a timely response to the massive compute demands of burgeoning AI businesses. The "AI GPU shortage" is everywhere today as GPU-hungry businesses fight for affordable, scalable computational power. Leveraging gaming PCs, which are often equipped with high-performance GPUs, provides an innovative solution to meet these growing demands. While this fresh approach offers unparalleled GPU inference rates and wider accessibility, it also presents a unique set of reliability factors to consider.

The decentralized nature of a system built on individual gaming PCs does introduce variability. A single gaming PC might typically offer reliability figures between 90-95% (1 to 1.5 nines). At first glance, this might seem significantly different from the high "nines" many are familiar with. However, it's crucial to recognize that we're comparing two different models. While an individual gaming PC might occasionally face challenges, from software issues to local power outages, the collective strength of the distributed system ensures redundancy and robustness on a larger scale.

When exploring our cloud solutions, it's essential to view reliability from a broader perspective. Instead of concentrating solely on the performance of individual nodes, we highlight the overall resilience of our distributed system. This approach offers a deeper insight into our next-generation cloud infrastructure, blending cost-efficiency with reliability in a transformative way that is perfectly suited to the computational needs of modern AI-driven businesses and to solving the ongoing AI GPU shortage.

Exploring the New Cloud Landscape

Embracing Distributed Systems: Unlike traditional centralized systems, distributed clouds, particularly those harnessing the power of gaming PCs, operate on a unique paradigm. Each node in this setup is a personal computer, potentially scattered across the globe, rather than being clustered in a single data center.

Navigating Reliability Differences: Nodes based on gaming PCs might individually present a reliability range of 90-95%, and various elements influence this.

Unpacking the Benefits of Distributed Systems

Global Redundancy Amidst Climate Change: The diverse geographical distribution of nodes (geo-redundancy) offers an inherent safeguard against the increasing unpredictability of climate change. As extreme weather events, natural disasters, and environmental challenges become more frequent, centralized data centers in vulnerable regions are at heightened risk of disruptions. With nodes spread across various parts of the world, however, the distributed cloud system ensures that if one region faces climate-induced challenges or outages, the remaining global network can compensate, maintaining continuous availability.
This decentralized approach not only ensures business continuity in the face of environmental uncertainties but also underscores the importance of forward-thinking infrastructure planning in our changing world.

Seamless Scalability: Distributed systems are designed for effortless horizontal scaling. Integrating more nodes into a group is a straightforward process.

Fortifying Against Localized Disruptions: Understanding the resilience against localized disruptions is pivotal when appreciating the strengths of distributed systems. This is especially evident when juxtaposed against the potential vulnerabilities of a centralized model, like relying solely on a specific AWS region such as US-East-1.

Catering to AI's Growing Demands: Harnessing idle gaming PCs is not just innovative but also a strategic response to the escalating computational needs of emerging AI enterprises. As AI technologies advance, the quest for affordable, scalable computational power intensifies. Gaming PCs, often equipped with high-end GPUs, present an ingenious solution to this challenge.

Achieving Lower Latency: The vast geographic distribution of nodes means data can be processed or stored closer to end users, potentially offering reduced latency for specific applications.

Cost-Effective Solutions: Tapping into the dormant resources of idle gaming PCs can lead to substantial cost savings compared to the expenses of maintaining dedicated data centers.

The Collective Reliability Factor

While individual nodes might have a reliability rate of 90-95%, the combined reliability of the entire system can be significantly higher, thanks to redundancy and the sheer number of nodes. Consider this analogy: flipping a coin has a 50% chance of landing tails, but flipping two coins simultaneously reduces the probability of both landing tails to 25%. For three coins, it's 12.5%, and so on. Applying this to our nodes, if each node has a 10% chance of being offline, the probability of two nodes being offline simultaneously is just 1%. As the number of nodes increases, the likelihood of all of them being offline simultaneously diminishes exponentially. Thus, as the size of a network grows, the chances of the entire system experiencing downtime decrease dramatically. Even if individual nodes occasionally falter, the distributed nature of the system ensures its overall availability remains impressively high.

Here is a real example: 24 hours sampled from a production AI image generation workload with 100 requested nodes. As we would expect, it's fairly uncommon for all 100 to be running at the same time, but 100% of the time we had at least 82 live nodes. For this customer, 82 simultaneous nodes offered plenty of throughput to keep up with their internal SLOs and provided a zero-downtime experience.

Gaming PCs as a Robust, High-Availability Solution for the AI GPU Shortage

While gaming PC nodes might seem to offer modest reliability compared to enterprise servers, when viewed as part of a distributed system they present a robust, high-availability solution. This system, with its inherent benefits of redundancy, scalability, and resilience, can be expertly managed to provide a formidable alternative to traditional centralized systems. By leveraging the untapped potential of gaming PCs, we not only address the growing computational demands of industries like AI but also pave the way for a more resilient, cost-effective, and globally distributed cloud.
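To make the coin-flip analogy concrete, the sketch below models node availability with a simple binomial assumption (treating nodes as independent, which real-world failures only approximate) and computes the chance that at least a given number of 100 requested nodes are live when each node is online 90% of the time.

```python
from math import comb

def prob_at_least(k: int, n: int = 100, p_up: float = 0.90) -> float:
    """Probability that at least k of n nodes are live, assuming each node is
    independently online with probability p_up (a simple binomial model)."""
    return sum(comb(n, i) * (p_up ** i) * ((1 - p_up) ** (n - i)) for i in range(k, n + 1))

# With 100 requested nodes and 90% per-node uptime:
print(f"P(at least 82 live) = {prob_at_least(82):.4f}")   # very close to 1
print(f"P(all 100 live)     = {prob_at_least(100):.6f}")  # vanishingly small
```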

Optimizing AI Image Generation: Streamlining Stable Diffusion with ControlNet in Containerized Environments


Implementing Stable Diffusion with ControlNet in any containerized environment comes with a plethora of challenges, mainly due to the sizable amount of additional model weights that must be incorporated into the image. In this blog, we discuss these challenges and how they can be mitigated.

What is ControlNet for Stable Diffusion?

ControlNet is a network structure that empowers users to manage diffusion models by setting extra conditions. It gives users immense control over the images they generate with Stable Diffusion, using a single reference image, without noticeably inflating VRAM requirements. ControlNet has revolutionized AI image generation, demonstrating a key advantage of open models like Stable Diffusion over proprietary competitors such as Midjourney.

ControlNet for Stable Diffusion: reference image (left), Depth-Midas (middle), output image (right). Note how the image composition remains the same between the reference image and the final image. This is accomplished by using the depth map as a control image. Images from Dreamup.ai using the Dreamshaper model, MiDaS depth estimation, and the depth ControlNet.

Challenges in Implementing ControlNet for Stable Diffusion

However, this remarkable feature presents challenges. There are (at the time of this writing) 14 distinct ControlNets compatible with Stable Diffusion, each offering unique control over the output and each necessitating a different "control image." All of these models, freely accessible on Huggingface in separate repositories, amount to roughly 4.3GB each, totaling an additional storage need of up to 60.8GB. This represents a near tenfold increase compared to a minimal Stable Diffusion image. Furthermore, each ControlNet model comes with one or more image preprocessors used to derive the "control image." For instance, generating a depth map from an input image is a prerequisite for one of the ControlNets. These additional model weights further bloat the total VRAM requirement, exceeding the capacity of commonly used graphics cards like the RTX 3060 12GB.

The basic approach to implementing ControlNet in a containerized environment.

Optimizing ControlNet Implementation in a Containerized Environment on a Distributed Cloud

Building a container image for continuous ControlNet Stable Diffusion inference, if approached without optimization in mind, can rapidly inflate in size. This results in prohibitively long cold-start times as the image is downloaded onto the server running it. This problem is prominent in data centers and becomes even more pronounced on a distributed cloud such as Salad, which depends on residential internet connections of varying speed and reliability. A two-pronged strategy can effectively address this issue.

First, isolate ControlNet annotation as a separate service, or leverage a pre-built service like this one from Dreamup.ai. This division of labor not only reduces VRAM requirements and model storage in the Stable Diffusion service but also enhances efficiency when creating numerous output images from a single input image and prompt.

Second, rather than cloning entire repositories for each model, perform a shallow clone of the model repository without git lfs, then use wget to selectively download only the necessary weights, as exemplified by this Dockerfile from the Dreamup.ai Stable Diffusion Service. This tactic alone can save more than 40GB of storage.

The split-service approach to Stable Diffusion with ControlNet.
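As a rough illustration of the selective-download idea: the Dreamup.ai Dockerfile referenced above uses a shallow git clone plus wget, but a similar effect can be sketched in Python with huggingface_hub, fetching only the weight files a service actually needs. The repository and file names below are illustrative assumptions, not necessarily the ones used in production.

```python
# Sketch: download only the specific weight files a service needs, rather than
# cloning each full model repository. Repo and file names are illustrative only.
from huggingface_hub import hf_hub_download

NEEDED_FILES = [
    ("lllyasviel/control_v11f1p_sd15_depth", "diffusion_pytorch_model.safetensors"),
    ("lllyasviel/control_v11f1p_sd15_depth", "config.json"),
]

for repo_id, filename in NEEDED_FILES:
    path = hf_hub_download(repo_id=repo_id, filename=filename)
    print(f"Downloaded {repo_id}/{filename} -> {path}")
```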
The end result? Two services, both with manageable container image sizes. The ControlNet Preprocessor Service, inclusive of all annotator models, comes to 7.5GB and operates seamlessly on an RTX 3060 12GB. The Stable Diffusion Service, even with every ControlNet packaged in, comes to 21.1GB and also runs smoothly on an RTX 3060 12GB. Further reductions can be achieved by tailoring which ControlNets to support. For instance, Dreamup.ai excludes the MLSD, Shuffle, and Segmentation ControlNets from its production image, thereby saving about 4GB of storage.

Shawn Rushefsky is a passionate technologist and systems thinker with deep experience across a number of stacks. As Generative AI Solutions Architect at Salad, Shawn designs resilient and scalable generative AI systems to run on our distributed GPU cloud. He is also the founder of Dreamup.ai, an AI image generation tool that donates 30% of its proceeds to artists.