DBs/Tools

The databases and tools listed below were developed by bioinformatics researchers at the University of Toronto. This list was last updated on 8 November 2017 and is reasonably comprehensive. Please contact nicholas.provart@utoronto.ca to add other tools or DBs as they are published.

Name of Tool | Reference | Description
Analysis
CisRegTest | Moses, BMC Evolutionary Biology (2009) | Statistical tests for natural selection on regulatory regions based on the strength of transcription factor binding sites.
MONKEY | Moses~Eisen, Genome Biology (2004) | MONKEY identifies conserved transcription-factor binding sites in multispecies alignments.
MLHKA | Wright and Charlesworth, Genetics (2004) | MLHKA is a program for testing for positive or balancing selection using polymorphism and divergence data from two species.
WASP | Barash~Frey, Nature (2010) | A web application that predicts whether or not an exon is alternatively spliced and, if so, how its splicing depends on cellular conditions such as tissue type. The application also maps putative regulatory elements in the primary transcript sequence near regulated exons.
Subseqer | He and Parkinson, Bioinformatics (2008) | Graph-based web tool for uncovering meaningful sequence motifs from low-complexity sequences.
ISOLATE | Quon~Morris, Bioinformatics (2009) | Separates heterogeneous tumor gene expression profiles into their constituent purified tumor and healthy tissue gene expression profiles.
OrthoNets | Hao~Wodak, Bioinformatics (2011) | This Cytoscape plugin enables the simultaneous visualization of interaction and domain co-occurrence networks in multiple organisms, using information aggregated in DAnCER and iRefWeb/iRefIndex.
GenePro | Vlasblom~Wodak, Bioinformatics (2006) | GenePro is a Cytoscape plug-in for the visualization and analysis of protein and gene interaction networks at multiple levels of resolution.
Cytoscape | Cline~Bader, Nature Protocols (2007) | Cytoscape is a bioinformatics software platform for visualizing molecular interaction networks and integrating these interactions with gene expression profiles and other state data. Additional features are available as plugins.
Cytoscape Web | Lopes~Bader, Bioinformatics (2010) | Cytoscape Web is a simplified web-based version of Cytoscape.
NetMatch Cytoscape plugin | Ferro~Shasha, Bioinformatics (2007) | NetMatch is a Cytoscape plugin that finds user-defined network motifs in any Cytoscape network. Node and edge attributes of any type and paths of unknown length can be specified in the search.
MCODE Cytoscape plugin | Bader~Hogue, BMC Bioinformatics (2003) | MCODE is a Cytoscape plugin that finds clusters (highly interconnected regions) in a network.
Enrichment Map Cytoscape plugin | Merico~Bader, PLOS ONE (2010) | Enrichment Map is a Cytoscape plugin for functional enrichment visualization. Enrichment results have to be generated outside Enrichment Map, using any of the available methods.
WordCloud Cytoscape plugin | | The WordCloud plugin is a Cytoscape plugin that generates a visual summary of a network. It displays string attributes associated with nodes in the network as a tag cloud, where more frequent words are displayed using a larger font size.
GeneMANIA Cytoscape plugin | Montojo~Bader, Bioinformatics (2010) | The GeneMANIA Cytoscape plugin brings fast gene function prediction capabilities to the desktop. GeneMANIA identifies the genes most related to a query gene set using a guilt-by-association approach.
NAViGaTOR (Network Analysis, Visualization and Graphing, Toronto) | Brown~Jurisica, Bioinformatics (2009) and other refs. | Visualization and analysis of large networks.
MoDIL | Lee~Brudno, Nature Methods (2009) | MoDIL, or Mixture of Distributions Indel Locator, is a novel method for finding medium-sized insertions or deletions from high-throughput sequencing datasets. Our method can take advantage of the high clone coverage of these datasets to identify progressively shorter indel variants, even if the individual clone sizes are unreliable.
VARiD | Dalca~Brudno, Bioinformatics (2010) | VARiD is a Hidden Markov Model for SNP and indel identification with AB-SOLiD color-space as well as regular letter-space reads. VARiD combines both types of data in a single framework, which allows for accurate predictions. VARiD was developed at the University of Toronto Computational Biology Lab.
Savant Genome Browser | Fiume~Brudno, Bioinformatics (2010) | The Savant Genome Browser is a desktop visualization tool for genomic data. It was primarily developed for visualizing high-throughput (aka next-generation) sequencing data, although it can be used to visualize virtually any genome-based sequence, point, interval, or continuous dataset.
SHRiMP | Rumble~Brudno, PLoS Computational Biology (2009) | SHRiMP, or SHort Read Mapping Program, is a software package for aligning genomic reads against a target genome. It was primarily developed with the multitudinous short reads of next-generation sequencing machines in mind, as well as Applied Biosystems' colourspace genomic representation.
SHRiMP2 | David~Brudno, Bioinformatics (2011) | A major update of the original SHort Read Mapping Program (SHRiMP). SHRiMP2 primarily targets mapping sensitivity, and is able to achieve high accuracy at a very reasonable speed.
CNVer | Medvedev~Brudno, Genome Research (2010) | CNVer is a method for CNV detection that supplements the depth-of-coverage with paired-end mapping information, where matepairs mapping discordantly to the reference serve to indicate the presence of variation.
SCPSRM | Moses~Durbin, Genome Biology (2007) | Scripts to identify spatial clustering of phosphorylation site recognition motifs for predicting the targets of cyclin-dependent kinase.
SGRP Blast Server | Liti~Louis, Nature (2009) | BLAST server for the Saccharomyces Genome Resequencing Project.
Density Estimation Tool for Enzyme Classification (DETECT) | Hung~Parkinson, Bioinformatics (2010) | DETECT is a probabilistic method and standalone tool for enzyme prediction that accounts for the sequence diversity across enzyme families.
NLStradamus | Nguyen~Moses, BMC Bioinformatics (2009) | Webserver using hidden Markov models (HMMs) to predict novel Nuclear Localization Signals in proteins.
RNAcontext | Kazan~Morris, PLoS Computational Biology (2010) | RNAcontext is a motif-finding algorithm to infer sequence and structure preferences of RNA binding proteins (RBPs) from experimental affinity data. The input to RNAcontext consists of a set of sequences, their associated structure annotation profiles and affinity estimates (binary or continuous) for the given RBP.
Restricted Neighborhood Search Clustering Algorithm (RNSC) | King~Jurisica, Bioinformatics (2004) | Protein complex prediction via cost-based clustering.
Modular Subnetwork Biomarker Identification | Fortney~Jurisica, Genome Biology (2010) | A method for biomarker identification that combines networks of genes selected based on phenotype-dependent activity and a graph-theoretic property called modularity.
kmerHMM | Wong~Zhang, Nucleic Acids Research (2013) | De novo motif discovery method for Protein Binding Microarray (PBM) data. (Similar tools: MEME and Gibbs Sampler.)
SNPdryad | Wong~Zhang, Bioinformatics (2014) | Deleterious non-synonymous SNP predictions for human. (Similar tools: PolyPhen-2 and SIFT.)
Segway | Hoffman~Noble, Nature Methods (2012) | Segway performs semi-automated genome annotation using multiple tracks of genome-wide data such as that from ChIP-seq or DNase-seq experiments. It produces annotations that can be used to visualize complex multivariate data in a simple way and interpret the effects of noncoding variation.
Genomedata | Hoffman~Noble, Bioinformatics (2010) | Genomedata is a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data, while retaining a small disk space footprint. We have also developed utilities to load data into this format, along with a reference implementation in Python and C.
Data Visualization
ePlant | Waese~Provart, The Plant Cell (2017) | Web tool for exploring large data sets from Arabidopsis, from the km to the nm scale.
NAViGaTOR (Network Analysis, Visualization and Graphing, Toronto) | Brown~Jurisica, Bioinformatics (2009) and other refs. | Visualization and analysis of large networks.
Cytoscape | Cline~Bader, Nature Protocols (2007) | Cytoscape is a bioinformatics software platform for visualizing molecular interaction networks and integrating these interactions with gene expression profiles and other state data. Additional features are available as plugins (see Analysis section above).
Segtools | Buske, Hoffman~Noble, BMC Bioinformatics (2011) | Segtools is a Python package for analyzing genomic segmentations. The software efficiently calculates a variety of summary statistics and produces corresponding publication-quality visualizations. The overall goal of Segtools is to provide a bird's-eye view of complex genomic data sets, allowing researchers to easily generate and confirm hypotheses.
Databases
Yeast KID | Sharifpoor & Nguyen~Moses & Andrews, Genome Biology (2011) | The Yeast Kinase Interaction Database contains curated data relevant to phosphorylation events in budding yeast.
CDIP, Cancer Data Integration Portal | | Database of significantly deregulated genes in lung, ovarian, prostate, and head and neck cancers, and in sarcoma.
DAnCER | Turinsky~Wodak, Bioinformatics (2011) | DAnCER permits the exploration of chromatin modification (CM)-related genes in the full context of protein complexes, gene-expression regulation and pathways.
PhyloPro | Xiong~Parkinson, Bioinformatics (2011) | A web-based tool for the generation and visualization of phylogenetic profiles across Eukarya.
PartiGeneDB | Peregrin-Alvarez~Parkinson, Nucleic Acids Research (2005) | Database of Partial Genomes based on Expressed Sequence Tags.
eNet | Hu, Janga, Babu, Díaz-Mejía, Butland~Moreno-Hagelsieb & Emili, PLoS Biology (2009) | eNet is a database of gene function predictions in Escherichia coli K12.
GeneCards mirror | | Comprehensive gene annotation portal.
iRefWeb | Turinsky~Wodak, Bioinformatics (2010) | iRefWeb is a web interface to protein interaction data consolidated from 10 public databases: BIND, BioGRID, CORUM, DIP, IntAct, HPRD, MINT, MPact, MPPI and OPHID.
Yeast Interactome Database | Krogan & Cagney~Emili & Greenblatt, Nature (2006) | The Yeast TAP Project aims to identify protein-protein interactions in the yeast Saccharomyces cerevisiae.
Bacteriome.org | Su~Parkinson, Nucleic Acids Research (2008) | Database of high-quality E. coli interactions.
BIND Translation | Isserlin~Bader, Database (2011) | Conversion of the BIND molecular interaction database to PSI-MI 2.5.
I2D, Interologous Interaction Database | Brown~Jurisica, Bioinformatics (2005) and other refs. | Integrated database of physical protein-protein interactions for human, mouse, rat, fly, worm and yeast. Integrates curated, high-throughput (HTP) and predicted interactions.
mirDIP, microRNA:target prediction Data Integration Portal | Shirdel~Jurisica, PLoS ONE (2011) | Integrated portal for microRNA target predictions from 11 databases.
Bio-Analytic Resource | Toufighi~Provart, The Plant Journal (2005) | Suite of web-based tools for exploring and analyzing gene expression and other data from several plant species.
eFP Browser | Winter~Provart, PLoS ONE (2007) | Web tool for exploring gene expression data from several plant species in an intuitive manner.
Mouse Proteome Project | Kislinger & Kanna~Rossant, Hughes, Frey & Emili, Cell (2006) | A mouse proteome collection of abundance profiles obtained for proteins of special interest, permitting complete access to database results, altered tissue and organelle spectral counts, and high-confidence subcellular assignments.
ElastoDB | He~Parkinson, Matrix Biology (2007) | Database of elastic-like sequences.
Function Prediction
The GeneMANIA prediction server | Warde-Farley, Donaldson, Comes & Zuberi (joint first authors)~Bader & Morris (joint senior authors), Nucleic Acids Research (2010) | State-of-the-art, query-customized in silico gene function prediction with multiple data types, live over the web.
Modeling
Cell++ | Sanford~Parkinson, Bioinformatics (2006) | Stochastic cell simulation environment for modelling dynamic biochemical systems within a spatial context.
Pathways
cPath | Cerami~Sander, BMC Bioinformatics (2006) | cPath is an open source database and web application for collecting, storing, browsing and querying biological pathway data.
Pathway Commons | Cerami~Sander, Nucleic Acids Research (2011) | Pathway Commons is a convenient point of access to biological pathway information collected from public pathway databases, which you can browse or search. Pathways include biochemical reactions, complex assembly, transport and catalysis events, and physical interactions involving proteins, DNA, RNA, small molecules and complexes.
The Cancer Cell Map | | The Cancer Cell Map contains selected cancer-related signaling pathways. Biologists can browse and search the pathways and view gene expression data on any of them. Computational biologists can download all pathways in BioPAX format for global analysis. Software developers can build software on top of the Cancer Cell Map using the web service API.
BioPAX | Demir~Rajasimha, Nature Biotechnology (2010) | BioPAX (Biological Pathway Exchange) is a collaborative effort to create a data exchange format for biological pathway data. BioPAX covers metabolic pathways, molecular interactions and protein post-translational modifications.
Pathguide | Bader~Sander, Nucleic Acids Research (2006) | Pathguide, the Pathway Resource List, contains information about hundreds of online biological pathway resources. Databases that are free and those supporting BioPAX, CellML, PSI-MI or SBML standards are highlighted.
Structure
Structural Genomics of Histone Tail Recognition | Wang~Schapira, Bioinformatics (2010) | Histone tails are subject to various post-translational modifications, which regulate gene expression and differentiation. This website highlights the structural mechanisms underlying recognition of histone tails by the readers, writers and erasers of methyl and acetyl marks.
LigAlign | Heifets~Lilien, Journal of Molecular Graphics and Modelling (2010) | Automated ligand-based active site alignment.