Welcome to the BioHPC Home Page
One of the challenges of High Performance Computing (HPC) for
biology is resource accessibility. Most biologists focus on the
experimental aspects of their research and are not familiar with HPC
environments. They often know the algorithms and programs required
to analyze their data, but lack the expertise to use them
efficiently in an HPC environment. Very often the analysis is
computationally intensive and cannot be carried out locally or using
free web-based Internet tools. This forces biological research
groups to acquire their own computer resources. Even then, using
these resources often proves to be a challenge, since there are few
user-friendly interfaces available for HPC bioinformatics, and the
existing ones are expensive. These problems have become even more
pressing with the increasing flow of next-generation sequencing
data, which requires integration of specialized computational infrastructure.
At the Cornell University Computational Biology Service Unit we have
developed a suite of computational biology applications for HPC
(BioHPC) that allows researchers from biological laboratories to
submit jobs to a parallel cluster through an easy-to-use web
interface, and to manage their jobs and data. Users do not need to
deal with parallel job submission, queues, or clusters: knowing the
application, its parameters, and the input is all that is required.
The newest additions to BioHPC are support for next-generation
sequencing and a web service layer. BioHPC can now interact with
sequencing centers and has components for data distribution and
management. Sequencing data managed at BioHPC can be used for further
analysis without the need for data transfers. The web service layer
allows job submission and control through other clients, such as
MS Excel or Perl scripts. Through web services, BioHPC can be
integrated with various software platforms.
BioHPC is a suite of applications modified and adapted for
efficient use in an HPC environment, with an ASP.NET interface that
provides easy access and use as well as easy administration of users,
applications, data and other features. For more information about
the interface, administration and usage please see our
info pages or
documentation.
BioHPC uses Microsoft Windows HPC Server 2008 or Microsoft Windows
Compute Cluster Server 2003 as a cluster scheduler. It also works
with remote clusters via HPC Profile/JSDL.
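To give a sense of what a JSDL (Job Submission Description Language) job description looks like, here is a minimal sketch in Python that builds one with the standard library. It is an illustration only and is not taken from the BioHPC code: the function name `build_jsdl`, the job name, the executable, and the file names are all hypothetical. The sketch uses the POSIX application extension from the base JSDL specification; the HPC Profile defines its own application extension, so a real submission to an HPC Profile endpoint would likely need a different application element.

```python
import xml.etree.ElementTree as ET

# Namespaces from the JSDL 1.0 specification.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(job_name, executable, arguments, output_file):
    """Return a minimal JSDL job definition as an XML string.

    Hypothetical helper for illustration; not part of BioHPC.
    """
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("jsdl-posix", POSIX_NS)

    job_def = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    desc = ET.SubElement(job_def, f"{{{JSDL_NS}}}JobDescription")

    # Human-readable job name.
    ident = ET.SubElement(desc, f"{{{JSDL_NS}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL_NS}}}JobName").text = job_name

    # What to run: executable, its arguments, and where stdout goes.
    app = ET.SubElement(desc, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = arg
    ET.SubElement(posix, f"{{{POSIX_NS}}}Output").text = output_file

    return ET.tostring(job_def, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values; replace with a real program and inputs.
    print(build_jsdl("demo-job", "my_program", ["--input", "reads.fa"], "result.txt"))
```

The point of such a document is that the job is described independently of any particular scheduler, which is what makes it usable with remote clusters.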
The suite is freely available for download and use. We only
request that each installation be registered with us. Any new modules,
applications, and improvements that you wish to share should be
submitted to us so that they can be incorporated into the BioHPC
release and made available to all. See the license for more details.
BioHPC is quite complex, but installation is not difficult, and you
can always contact us with any questions. Working BioHPC
installations that have been made available to guests are listed in
the implementations section. You may want to register with BioHPC
to receive notifications about new releases and bug fixes.
Development of BioHPC is currently supported in part by
Microsoft Research.
An example of a BioHPC installation used for local and remote
access is the BioHPC installation at Cornell University.
For more information, please consult other parts of this website,
or some recent posters and presentations about BioHPC: BioHPC
general poster, BioHPC next generation support poster, and BioHPC
overview at the MSR External Research Symposium (April 2010).