The secret life of the home computer
By J. D. Biersdorfer
Thursday 15 June 2000
Idle hands may be the devil's workshop, but an idle computer has the potential to seek out extraterrestrial life, crack encryption codes and maybe make a little extra cash for its owner.
Distributed computing, in which a large problem or daunting amount of data is parcelled out to many - even millions - of computers to work on at once, is becoming a viable substitute for the once-almighty supercomputer.
Even the humblest home machine can become a cog in a larger computational wheel, receiving data, churning through it at its own speed and returning the results. It's a digital-age illustration of the old adage about how to eat an elephant: one bite at a time.
By far the best-known distributed computing project is SETI@Home, a screen saver program (designed for home computers) which analyses radio signals from the cosmos for patterns or other signs of alien life. Other projects have found million-digit prime numbers, deciphered secret codes and helped in the design of storage containers for nuclear waste.
Like many successful inventions originally created by scientists to share information and promote the greater good (including the World Wide Web), distributed computing is being examined for its commercial potential: computer owners may be able to, in effect, rent out unused processing time.
SETI@Home (SETI stands for Search for Extraterrestrial Intelligence) was developed at the University of California at Berkeley. It went into full distribution about a year ago and recently signed up its two millionth user.
After a participant downloads the free screen saver program, packets of data recorded at a radio telescope in Puerto Rico are sent to the user's computer via the Internet. The participant's computer analyses a packet when it is not handling other tasks, then sends the data back to the SETI@Home server and receives another packet.
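The fetch-analyse-return cycle described above can be sketched in a few lines of Python. The function names and toy "packets" here are illustrative stand-ins, not the actual SETI@Home client:

```python
def fetch_work_unit(queue):
    """Hypothetical stand-in for downloading one packet of telescope data."""
    return queue.pop(0) if queue else None

def analyse(packet):
    """Toy analysis: sum the samples. The real client searches the
    signal for patterns that background noise can't explain."""
    return sum(packet)

# Simulated server-side queue of work units and a results outbox.
server_queue = [[1, 2, 3], [4, 5, 6]]
outbox = []

# The client's life: fetch a packet, analyse it during idle time,
# send the result back, repeat until no work remains.
while (packet := fetch_work_unit(server_queue)) is not None:
    outbox.append(analyse(packet))

print(outbox)  # one result per work unit, in order
```

The essential property is that each packet is independent, so thousands of clients can run this loop without ever talking to one another.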
Most of SETI@Home's users have probably never even noticed the computing time that they have donated to the project because the screen saver program makes use of the computer's idle time. The program can be set to do its analyses when the computer has been idle for a while (for example, when the user takes a break). But it can also be set to function in the much smaller intervals of idle time that occur even when a machine appears to be in constant use.
"Let's say you're typing a story. Each time you hit a key, your computer works for about a hundredth of a second," said Dr David Anderson, a computer scientist who has done extensive research in distributed computing and is also the project director of SETI@Home.
"And then it sits there for another second, waiting for you to hit the next key. So even while you think you are using your computer, about 99.99 percent of the time it's not actually doing anything useful." That is, unless it happens to be doing a little distributed work on the side.
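Dr Anderson's keystroke figures translate into a quick duty-cycle calculation. The arithmetic below is mine, not from the article; his 99.99 percent figure presumably also counts pauses much longer than one second:

```python
# Back-of-envelope duty cycle from the keystroke example:
# ~0.01 s of CPU work per keystroke, then ~1 s waiting for the next key.
busy_per_keystroke = 0.01   # seconds of useful work
gap_between_keys = 1.0      # seconds spent waiting

idle_fraction = gap_between_keys / (busy_per_keystroke + gap_between_keys)
print(f"{idle_fraction:.1%} idle")  # about 99% idle even while typing steadily
```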
Not all supersized projects are suitable for the distributed approach. Large computing chores such as weather simulation work poorly because their calculations depend on one another and intermediate results must be exchanged quickly, which scattered home machines cannot manage.
One project in which a cluster proved effective was the RC5-56 Secret Key Challenge, in which thousands of volunteers donated computer time to crack a 56-bit encryption code.
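A brute-force key search splits almost perfectly across volunteers: each client tests a disjoint range of keys and never needs to talk to any other. A toy sketch, with a 16-bit keyspace standing in for RC5-56's 2**56 keys and all names invented for illustration:

```python
def key_ranges(keyspace_size, n_clients):
    """Carve the keyspace into contiguous, non-overlapping ranges,
    one per participating client."""
    step = keyspace_size // n_clients
    for i in range(n_clients):
        end = keyspace_size if i == n_clients - 1 else (i + 1) * step
        yield range(i * step, end)

def try_keys(keys, is_correct):
    """One client's whole job: test every key in its assigned range."""
    for k in keys:
        if is_correct(k):
            return k
    return None

# Toy example: a 16-bit 'keyspace' instead of RC5-56's 2**56 keys.
secret = 40961
found = None
for chunk in key_ranges(2**16, 8):   # 8 simulated volunteers
    found = try_keys(chunk, lambda k: k == secret)
    if found is not None:
        break
print(found)  # 40961
```

In the real contest each "client" was a volunteer's PC testing actual decryptions rather than a simple comparison, but the partitioning idea is the same.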
The project was one of a number run by Distributed.net, a non-profit organisation that has signed up about 60,000 participants to take part in a variety of code-cracking contests.
All told, SETI@Home's users are currently averaging about 12 teraflops (12 trillion calculations per second) over the course of a day, said Dr Dan Werthimer, an astronomer at Berkeley and the chief scientist for the project.
"It's the largest computation that's ever been done - on this planet, anyway," Dr Werthimer said.
He calculated that home users had donated 283,000 years of computing time to the project so far. "They're donating time, about 1000 years every day," he said.
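Dr Werthimer's two figures fit neatly with the project's age; a quick sanity check (my arithmetic, not from the article):

```python
total_cpu_years = 283_000   # cumulative donated computing time
years_per_day = 1_000       # current donation rate, per Dr Werthimer

# At the current rate, the cumulative total corresponds to roughly
# 283 days of operation -- consistent with a project that went into
# full distribution "about a year ago".
days_at_current_rate = total_cpu_years / years_per_day
print(days_at_current_rate)  # 283.0

# 12 teraflops spread over 2 million registered users averages out to
# about 6 megaflops each -- a part-time contribution from a 2000-era PC.
flops_per_user = 12e12 / 2_000_000
print(flops_per_user)
```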
Supercomputers are usually used for large computing problems, like nuclear weapons research and weather forecasting, that require a staggering number of calculations, but they are "super" in both processing power and price tag. Computers capable of such power can cost millions of dollars, a price that is beyond the reach of many institutions. And a supercomputer, like any computer, can quickly become obsolete.
SETI researchers ran into the obsolescence problem before turning to distributed computing. They began by building specialised supercomputers to handle the data from the radio telescopes, but rapid advances in technology meant that a SETI-specific supercomputer had a short lifespan.
"After five years, your machine is kind of a dinosaur," Dr Werthimer explained, "and it's time to build something new."
Distributed computing has been around for several decades but it has been greatly aided by two developments: the increasing speed of personal computers and the advent of the Internet, which provides an easy and almost instantaneous means of exchanging data between machines.
"The speed of the common PC has increased at such a dramatic rate," Dr Anderson said, "that the high-end PC today is as fast as a Cray supercomputer was about 10 years ago."
The real rise of distributed computing can be traced to 1994, when Dr Donald Becker and Dr Thomas Sterling, working for the Center of Excellence in Space Data and Information Sciences at the Goddard Space Flight Center in Maryland, created a cluster computer from 16 processors networked together and dubbed it Beowulf.
Dr Becker recalled that Dr Sterling picked the name for several reasons, including a line loosely translated from Beowulf, the epic Anglo-Saxon poem, that described the heroic Beowulf as having the strength of many.
The Beowulf Project proved that an incredibly powerful computer could be created by harnessing together many smaller computers like Pentium IIs, running open-source software like Linux. The idea took off and clusters began to spring up at academic institutions and NASA research centres.
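The scatter-gather pattern a Beowulf cluster uses can be mimicked on a single machine. In the sketch below a thread pool stands in for 16 networked nodes; real clusters run message-passing software across separate boxes, so this is an analogy rather than cluster code:

```python
from concurrent.futures import ThreadPoolExecutor

def node_job(chunk):
    """The work one 'node' performs: here, just a partial sum."""
    return sum(chunk)

data = list(range(1_000_000))

# Scatter: split the data into 16 pieces, echoing the original
# 16-processor Beowulf machine.
chunks = [data[i::16] for i in range(16)]

# Each simulated node works on its piece independently...
with ThreadPoolExecutor(max_workers=16) as pool:
    partials = list(pool.map(node_job, chunks))

# ...and the partial results are gathered into one answer.
total = sum(partials)
print(total == sum(data))  # True: same answer, computed piecewise
```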
"We know there are thousands - there might be as many as 10,000 clusters out there," Dr Becker said.
The Beowulf Project website contains a wealth of information about the project, the recipe for making a Beowulf cluster, plus links to mailing lists and other Beowulf-related forums.
"An army of underused computers stands ready to crunch data distributed over the Internet," Dr Becker said.
He expects Beowulf cluster computers to take over many, but not all, of the complicated problems that were handled only by supercomputers.
"There are jobs where only the largest machines are appropriate to use, so there's still a very solid role for traditional supercomputers," he said.
"But for most other applications, a cluster like this is much more cost-effective."
Most distributed computing systems designed for public participation are on the philanthropic side, but some sites want to bring money into the picture.
"Isn't it time your computer started paying for itself?" asks the homepage of the ProcessTree Network, an enterprise that would like to pay users money in exchange for their computers' idle time. The ProcessTree Network and DCypher.Net, another distributed computing venture, recently merged as Distributed Science.
Although the service hasn't opened yet, its operations are expected to be similar to SETI@Home's. After a participant installs the ProcessTree program on a home computer, the user connects to the Internet to download a work unit, or batch of data, from one of the company's servers.
Once the work unit is on the home computer, the ProcessTree software will munch and crunch the information during the PC's idle time, send the results back to the ProcessTree server, then collect a new unit. In return for those spare processor cycles, the company will mail the user a cheque or credit the user's online account with small payments that can be spent online like electronic cash.
Armin Lenz, a software developer at Distributed Science, said that more than 25,000 people had signed up to work on the venture. He predicted that distributed computing would become an "industry unto itself".
"Think in terms of going from 'railroads' of supercomputing to the 'individual mass traffic' of distributed processing.
"There are thousands of projects that cannot be tackled because of a lack of computing power that is at the same time sitting idle on everyone's desk." ProcessTree members won't know exactly what their computers are working on, but the jobs could range from video animation and cryptography to the analysis of scientific and corporate research data and weather models.
Dr Anderson is less sure about whether such payments will be worthwhile for home computer users. "The value of the CPU time is probably small enough that it wouldn't amount to a significant quantity of money for most people," he said.
The ProcessTree venture will be ready to start as soon as its founders are satisfied with the home-user software. Lenz said they were negotiating with several well-known companies that might need its heavy-duty data analyses.
"For DCypher.Net, we have more projects waiting in line than we have people to power them," he said. "There's certainly the need for massive amounts of distributed computing."
Whether for pay or play, distributed computing can be found in computer labs or family dens. And in the case of SETI@Home, which registers the users of its program and tracks the work done by each participating computer, it might just mean that a volunteer somewhere will be the lucky one to have the computer that first detects signals from extraterrestrial beings.