Government Science on Citizen Machines – Why Distributed Cloud is the Way
Every year, private industry overlooks hundreds of potentially groundbreaking scientific computing projects, due to either the upfront cost or uncertain prospects for profitability. One of the most affected areas is what’s known as “fundamental research”: exactly the kind of rigorous theory testing required to better understand natural phenomena, but which rarely translates into short-term returns. That means the burden of advancing human progress often falls on taxpayers through federal research grants. Government science already runs on citizen dollars. If we also ran it on citizen machines in a distributed cloud, we could conduct life-changing research and return some of taxpayers’ hard-earned income.

The Compute-Industrial Complex

Whether they know it or not, taxpayers have always been instrumental in pushing the boundaries of science. Much of today’s most ambitious computer modeling is possible only through public funds. Without these resources, scientists could never study climate change on a digital twin of the Earth, or simulate the young universe to grok the Big Bang.

Humanity owes a large part of its technological legacy to the continuous give-and-take between public sector research and private sector development. Entire multinational industries derive their lifeblood from scientific discoveries funded by the world’s governments. And though we may benefit from the resulting products and services, everyday people like you and me are footing the bill at both ends.

In government-administered computing projects, the burden on taxpayers is unnecessarily compounded by infrastructure; the bulk of public compute-assisted research funding goes to building the kinds of supercomputers necessary to actually do the work. Supercomputing clusters are complex, individualized systems where the speed of data transfer is just as vital as the total processing power.
Colocating processing hardware and networking it with high-throughput interconnects helps maintain the rate of data flow, but this introduces additional, often expensive considerations like cooling and redundant energy supplies. Traditional data centers also need to contend with the hardware upgrade lifecycle; without regular upgrades, many supercomputers become obsolete within five years.

Only a handful of companies (such as HPE or IBM) have both the expertise and the relationships with hardware suppliers required to design and build massive supercomputing infrastructure. Other renowned private enterprises may specialize in building these types of systems under contract, but neither their services nor the underlying hardware come cheap.

Citizen Science Isn’t New

Given the exponential pace of computer development, it makes sense that the majority of today’s fundamental research should incorporate sophisticated technology. But did you know that distributed networks of heterogeneous consumer hardware can act as cloud layers that are just as performant as, and more cost-effective than, traditional data centers?

Consumer-supported computing allows individual users to connect their PCs to a distributed network in order to process discrete parts of a larger workload. With a well-architected system in place, anyone with a reasonably powerful computer¹ and access to the Internet could support cutting-edge research like protein folding or climate modeling right from home. There are also good reasons to think that private citizens would be more than willing to do so.

Do you recall seeing monitors in your school computer lab displaying colorful charts as a screensaver? That was none other than SETI@home, a crowdsourced research application created by Search for Extraterrestrial Intelligence (SETI) researchers at UC Berkeley in hopes of identifying anomalous events that may originate from our interstellar neighbors.
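
To make the idea of processing “discrete parts of a larger workload” concrete, here is a minimal Python sketch. It is not SETI@home or BOINC code: the `WorkUnit` class, the function names, and the volunteer names are all hypothetical, and the “science” is just finding the strongest value in each slice. A real network would ship each unit to a volunteer’s PC over the Internet rather than call a local function.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class WorkUnit:
    """A discrete slice of the larger workload (hypothetical structure)."""
    unit_id: int
    samples: List[float]


def split_workload(samples: List[float], unit_size: int) -> List[WorkUnit]:
    """Cut one large dataset into independent work units of at most unit_size."""
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    return [
        WorkUnit(unit_id=start // unit_size, samples=samples[start:start + unit_size])
        for start in range(0, len(samples), unit_size)
    ]


def process_unit(unit: WorkUnit) -> Tuple[int, float]:
    """Stand-in for the real analysis: report the strongest sample in a slice."""
    return unit.unit_id, max(unit.samples)


def run_on_volunteers(
    units: List[WorkUnit], volunteers: List[str]
) -> Dict[int, Tuple[str, float]]:
    """Assign units to volunteers round-robin and collect their results."""
    results = {}
    for index, unit in enumerate(units):
        volunteer = volunteers[index % len(volunteers)]
        # Assumption: in a real system this call would run on the volunteer's PC.
        unit_id, peak = process_unit(unit)
        results[unit_id] = (volunteer, peak)
    return results


samples = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.7, 0.6, 0.8, 0.05]
units = split_workload(samples, unit_size=4)
results = run_on_volunteers(units, ["alice-pc", "bob-pc"])

# The coordinator only has to combine small per-unit answers.
overall_peak = max(peak for _, peak in results.values())
```

Because each unit can be processed without reference to any other, the volunteers never need the fast interconnects that make supercomputers expensive. That independence is what makes this kind of workload a natural fit for home PCs.
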
With thousands of connected individuals lending processing power to SETI@home’s workloads, researchers were able to analyze a massive collection of radio wave emissions originally detected by the Arecibo and Green Bank radio telescopes. SETI@home was one of the first successful implementations of consumer-supported distributed computing at scale. In March 2020, the team shut down the project after 21 years of operation because it was too successful; network participants had processed such an overwhelming volume of data that astronomers would need years to cross-reference the results for signs of extraterrestrial life.

Building on that experience, researchers at Berkeley developed the Berkeley Open Infrastructure for Network Computing (BOINC) platform. BOINC permits academic researchers to upload programs and data as workloads for distributed processing on a consumer-supported network, where participants (private individuals running the BOINC desktop client on their personal computers) can voluntarily share resources with research that interests them.² Since its founding, the BOINC platform has gone on to support SETI@home and a multitude of other fundamental research projects in various scientific fields.

These projects have proven that consumer-supported, distributed networks can be leveraged for scientific applications. With the right incentives to participate, I believe we can reduce the typical fundamental research budget to a trivial expense.

A Problem of Incentive

The world’s latent processing reserves are perfect for the types of government-administered scientific research projects that usually require expensive supercomputers.³ Since the hardware has already been purchased by private consumers, there are fewer concerns about budgeting for expensive upgrades simply to keep pace with the current hardware generation or to speed up 3D-accelerated processing.
Distributed networks like BOINC and SETI@home have demonstrated just how readily millions might choose to aid a scientific effort out of enthusiasm alone (an exciting data point for those managing research grants at the NSF, to say the least). All that’s left is networking those devices in a way that allows dynamic allocation of their shared resources, but accomplishing that requires solving a problem endemic to contemporary academic science.

How do you motivate enough participants to conduct the research? Of the hundreds of vital research projects listed on the BOINC platform at any given time, only a fortunate few garner sufficient interest or processing supply to achieve their desired outcomes, while sexier projects, such as curing COVID-19 or dialing up the nearest spacefaring civilization, attract supporters in droves. Far too often, high-profile workloads overshadow equally important research in niche fields that doesn’t make for a good headline. When forced to compete for mindshare, scientists must effectively moonlight as marketers simply to conduct their research.

But what if you could engage a whole nation of users with a modest incentive? At Salad Technologies, we’ve built a distributed cloud network based on a mutual reward model we call computesharing.

The Computesharing Model

Most fundamental research projects require public funds and private computing infrastructure, but the actual computations are done on processors identical to those