Cloud computing model developed at Medical College enhances protein research

May 15, 2009 12:00 am

Researchers at the Medical College of Wisconsin (MCW) in Wauwatosa have developed a cloud computing model that makes the expensive process of protein research more affordable to fellow scientists around the world.

Established by the National Institutes of Health (NIH) as one of 10 such sites throughout the country, the Medical College’s Biotechnology and Bioengineering Center was charged with developing new technologies to support the effort. The cloud computing model created there has dramatically reduced the cost of what was previously a very expensive process.

MCW has developed a set of free tools called ViPDAC (Virtual Proteomics Data Analysis Cluster), which is used in combination with Amazon’s inexpensive cloud computing service, said researcher Brian Halligan, an assistant professor of physiology. This gives researchers the option to rent processing time on Amazon’s powerful servers.

“What we have is a system that combines distributed, on-demand cloud computing and open-source software that allows labs to set up virtual, scalable proteomics analysis without a huge investment in computing hardware or software licensing fees,” Halligan said. “The tools we have produced allow anyone with a credit card anywhere in the world to analyze proteomics data in the cloud, while reaping the benefits of having significant computing resources to speed up their data analysis.”

Proteomics is the large-scale study of all the proteins in an organism, and generally involves identifying proteins in both normal and disease states.

Five researchers at the Medical College are applying the cloud computing model developed there last year to study cardiovascular disease and the effects of radiation damage. They also work in collaboration with the University of Wisconsin-Madison’s stem cell research group.

One of the major challenges in running a proteomics research program has been setting up the computing infrastructure required for the vast amount of data generated by mass spectrometry, said Simon Twigger, an assistant professor of physiology at MCW.

Researchers can build an image of one of these computers, including the data on its hard drive, and Amazon stores that image in the cloud. When researchers want to use one, they can sign up for an Amazon account with a credit card, launch the image and run their analysis. The costs are relatively small: each computer node costs about 20 cents an hour to run, and researchers can have as many nodes as they want, Halligan said.

“It’s hard for us to even compute the cost (savings), because you buy hardware, and it depreciates and you have to house it and support it,” Halligan said. “For the cloud, it costs a dollar or two per run. It’s at least a tenfold savings; it’s more efficient than owning your own cluster. Owning your own cluster means you sacrifice speed. We can make a virtual cluster as big as we want. And using this (virtual) system, going fast costs as much as going slow.”
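
Halligan’s point that going fast costs as much as going slow follows from paying per node per hour. The short Python sketch below works through that arithmetic. It uses the roughly 20-cents-per-node-hour rate quoted above, but the 10 node-hours of work is a hypothetical figure chosen for illustration, and the function name run_cost is made up for this example. It also assumes the job splits evenly across nodes, with no startup overhead and no rounding of partial hours.

```python
# Hypothetical illustration of per-node-hour pricing. The workload size is
# an assumption for this sketch, not a figure reported by the researchers.
RATE_PER_NODE_HOUR = 0.20  # about 20 cents per node per hour, as quoted above


def run_cost(total_node_hours: float, nodes: int) -> tuple[float, float]:
    """Return (wall-clock hours, dollar cost) for a job split evenly across nodes.

    Assumes the work divides perfectly, with no startup overhead and no
    rounding of partial hours.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    hours = total_node_hours / nodes            # more nodes, less waiting
    cost = nodes * hours * RATE_PER_NODE_HOUR   # total node-hours times rate
    return hours, cost


if __name__ == "__main__":
    # Same hypothetical 10 node-hours of work on clusters of different sizes.
    for n in (1, 5, 10):
        hours, cost = run_cost(10.0, n)
        print(f"{n:>2} nodes: {hours:4.1f} h wall-clock, ${cost:.2f}")
```

Under these assumptions the total cost is the same for every cluster size, and only the waiting time changes. Actual billing rules, such as rounding up to whole hours, would shift the numbers somewhat.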

Another nice feature is that with a virtual cluster, researchers have the ability to customize their own systems. This means investigators can analyze data in greater depth, making it possible to learn more about the genetic systems they are studying. Researchers without access to computer resources can now undertake complex analysis and try different approaches that were not feasible before, Halligan said.

“It helps speed up the process because you can put more computing resources on it,” Halligan said. “We can do more complex things without having to make a huge investment for the occasional time you want to do it.”

Until recently, standard software programs used for proteomics data analysis were almost exclusively proprietary and expensive, with the fees equaling or exceeding the cost of the hardware necessary to run them.

In 2004, a group from the NIH developed an open-source proteomics search program called the Open Mass Spectrometry Search Algorithm (OMSSA). A second open-source proteomics database search program, X!Tandem, developed by the Beavis Laboratory at the University of Manitoba, is also now available.

A link on MCW’s Proteomics Center website provides step-by-step instructions on how to implement the proteomics analysis. The system has been publicly available on Amazon since November.

“We put a user-friendly interface on it, but also developed the underlying framework of how these searches are done,” Halligan said. “We have a lot of under-the-hood stuff about how the data runs and passes messages between itself.”
