BioHPC
Computational Biology Application Suite for High Performance Computing
 

BioHPC: Applications

BioHPC provides users with popular bioinformatics tools covering various aspects of computational biology:
  • Data mining / sequence analysis
    BLAST, HMMER, InterProScan, RepeatFinder, GIMSAN

  • Protein structure prediction and modeling
    LOOPP, Modeller

  • Population genetics
    BEAST, BEST, CLUMPP, IM, IMa, InStruct, MDIV, Migrate, MKPRF, MSVAR, OmegaMap, Parentage, SFS_CODE, Structurama, Structure, TESS

  • Phylogenetics
    MrBayes, ClustalW, T-COFFEE

  • Association analysis / statistics
    PLINK, R


The system is flexible and can easily be customized to include other software; the number of applications available in BioHPC is growing quickly. The interface to each application is standardized: users can choose the cluster and the number of nodes, or let the interface choose them based on load balance and node availability. The system is also scalable: the installation on our servers currently processes approximately 15,000 job submissions per year, many of them requiring long-running, massively parallel computations. It integrates different cluster technologies (MS CCS, MS HPC Server 2008, JSDL). Both parallel and serial applications are available through the interface. LOOPP and MrBayes are examples of genuinely parallel applications, whereas P-BLAST, P-HMMER and P-IPRSCAN are parallelized by distributing the input sequences across nodes (trivial parallelization). MPI is used for communication.
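The idea behind trivial parallelization by input sequence distribution can be illustrated with a short Python sketch. This is not BioHPC's own code: the function names `parse_fasta` and `distribute` are hypothetical, the input is assumed to be FASTA text, and sequence length is assumed to be a reasonable proxy for per-sequence work. Each resulting chunk would then be processed independently on its own node by the serial program.

```python
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    parts: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def distribute(records: List[Record], n_nodes: int) -> List[List[Record]]:
    """Split records into n_nodes chunks with roughly equal total length.

    Assumption (not from the source): work per sequence scales with its
    length, so a greedy longest-first assignment to the least-loaded
    node gives a reasonable balance.
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    chunks: List[List[Record]] = [[] for _ in range(n_nodes)]
    loads = [0] * n_nodes
    for rec in sorted(records, key=lambda r: len(r[1]), reverse=True):
        target = loads.index(min(loads))  # least-loaded node so far
        chunks[target].append(rec)
        loads[target] += len(rec[1])
    return chunks
```

For example, `distribute(list(parse_fasta(text)), 4)` returns four lists of records, one per node. Because the chunks share no state, the only communication needed is handing out the input and collecting the results.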

The applications accessible via BioHPC are third-party programs governed by their respective licenses. Only the part developed at CBSU is covered by the BioHPC license. It is the sole responsibility of the administrators/owners of a particular BioHPC server to ensure that use of these applications complies with their respective licenses.

The BioHPC @ CBSU implementation currently processes around 15,000 jobs a year. The most popular applications, by number of jobs submitted from 6/13/2003 to 11/14/2008, are:

Application        Jobs     Category                          Typical run
MDIV               18,144   population genetics               1 node, from a few hours to two weeks (average: 2-5 days)
LOOPP              17,250   protein structure prediction      5-20 nodes for 3-10 hours
MrBayes             8,825   phylogenetics                     8-20 nodes, from a few hours to two weeks (average: 5 days)
P-BLAST             3,579   sequence analysis / data mining   10-100 nodes, from a few days to a week (average: 3 days)
IM / IMa           12,327   population genetics               1 node for 2-5 days
Structure           4,850   population genetics               1 node for 2-5 days

All applications   79,406   (average: 14,888 per year)
  since 1/1/2008   26,573


