[A Human Experimental/FunctionAL MaPper]
A functional map is a way of exploring information from thousands of experimental results, focused on a specific query of interest. The query might be a single gene or protein, a group of related (or unrelated) genes, a pathway, a process, or a set of disease-related genes. Functional maps rely on data integration to summarize genomic data as functional relationship networks. Each network encodes how likely it is for every pair of genes in the genome to interact functionally, whether through a direct interaction, like protein binding, or an indirect functional relationship, like participating in the same cellular process. Functional mapping analyzes the portions of these networks related to user-specified groups of genes and biological processes and displays the results as probabilities (for individual genes), functional association p-values (for groups of genes), or graphically (as an interaction network). HEFalMp contains information from roughly 15,000 microarray conditions, over 15,000 publications on genetic and physical protein interactions, and several types of DNA and protein sequence analyses. It allows the exploration of over 200 H. sapiens process-specific functional relationship networks, including a global, process-independent network capturing the most general functional relationships.
| Year | Title | Authors |
|---|---|---|
| 2009 | Exploring the human genome with functional maps | C. Huttenhower ; E.M. Haley ; M.A. Hibbs ; V. Dumeaux ; D.R. Barrett ; H.A. Coller ; O.G. Troyanskaya |
[The Sleipnir Library for Computational Functional Genomics]
Sleipnir is a C++ library enabling efficient analysis, integration, mining, and machine learning over genomic data. This includes a particular focus on microarrays, since they make up the bulk of available data for many organisms, but Sleipnir can also integrate a wide variety of other data types, from pairwise physical interactions to sequence similarity or shared transcription factor binding sites. All analysis is done with attention to speed and memory usage, enabling the integration of hundreds of datasets covering tens of thousands of genes. In addition to the core library, Sleipnir comes with a variety of pre-made tools, providing solutions to common data processing tasks and examples to help you use Sleipnir in your own programs. Sleipnir is free, open source, fully documented, and ready to be used by itself or as a component in your computational biology analyses.
| Year | Title | Authors |
|---|---|---|
| 2008 | The Sleipnir library for computational functional genomics | Curtis Huttenhower; Mark Schroeder; Maria D. Chikina; Olga G. Troyanskaya |
[Serial Pattern of Expression Levels Locator]
SPELL is a query-driven search engine for large gene expression microarray compendia. Given a small set of query genes, SPELL identifies which datasets are most informative for these genes, then identifies, within those datasets, additional genes whose expression profiles are most similar to the query set.
| Year | Title | Authors |
|---|---|---|
| 2007 | Exploring the functional landscape of gene expression: directed search of large microarray compendia | Matthew A. Hibbs; David C. Hess; Chad L. Myers; Curtis Huttenhower; Kai Li; Olga G. Troyanskaya |
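The two-step search described above (weight datasets by how informative they are for the query, then rank other genes by similarity within those datasets) can be illustrated with a minimal sketch. This is not SPELL's actual implementation: the function names, the use of Pearson correlation, and the choice to weight each dataset by the mean pairwise correlation among query genes are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def dataset_weight(dataset, query):
    """Assumed weighting: mean pairwise correlation among query genes, floored at 0."""
    present = [g for g in query if g in dataset]
    pairs = list(combinations(present, 2))
    if not pairs:
        return 0.0
    return max(0.0, mean(pearson(dataset[a], dataset[b]) for a, b in pairs))


def spell_like_search(datasets, query):
    """Return (dataset weights, non-query genes ranked by weighted similarity)."""
    weights = [dataset_weight(d, query) for d in datasets]
    score_sum, weight_sum = {}, {}
    for dataset, w in zip(datasets, weights):
        if w == 0:
            continue  # uninformative dataset for this query
        present = [g for g in query if g in dataset]
        for gene, profile in dataset.items():
            if gene in query:
                continue
            similarity = mean(pearson(profile, dataset[q]) for q in present)
            score_sum[gene] = score_sum.get(gene, 0.0) + w * similarity
            weight_sum[gene] = weight_sum.get(gene, 0.0) + w
    ranked = sorted(
        ((score_sum[g] / weight_sum[g], g) for g in score_sum), reverse=True
    )
    return weights, [(g, s) for s, g in ranked]


# Hypothetical toy compendium: two datasets, each mapping gene -> expression profile.
d1 = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, 2, 3, 5], "D": [4, 3, 2, 1]}
d2 = {"A": [1, 2, 3, 4], "B": [4, 3, 2, 1], "C": [4, 3, 2, 1], "D": [1, 2, 3, 4]}
weights, ranked = spell_like_search([d1, d2], ["A", "B"])
```

In this toy example the query genes A and B are co-expressed only in the first dataset, so the second dataset receives zero weight and does not influence the ranking.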
[Horizontally Integrated Dataset Relationship Analysis]
HIDRA is a visualization and analysis framework for exploring multiple microarray datasets simultaneously. HIDRA allows users to quickly identify patterns common across many datasets as well as patterns unique to individual datasets. HIDRA is currently in beta testing and is still under development.
| Year | Title | Authors |
|---|---|---|
| 2007 | Viewing the Larger Context of Genomic Data through Horizontal Integration | Matthew A. Hibbs; Grant Wallace; Maitreya Dunham; Kai Li; Olga G. Troyanskaya |
[Nearest Neighbor Networks clustering]
NNN is a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods.
| Year | Title | Authors |
|---|---|---|
| 2007 | Nearest Neighbor Networks: Clustering Expression Data Based on Gene Neighborhoods | Curtis Huttenhower; Avi I. Flamholz; Jessica N. Landis; Sauhard Sahi; Chad L. Myers; Kellen L. Olszewski; Matthew A. Hibbs; Nathan O. Siemers; Olga G. Troyanskaya; Hilary A. Coller |
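The core idea described above (connect genes only when each is among the other's nearest neighbors, then look for cliques in that graph) can be sketched as follows. This is a simplified illustration rather than the published NNN algorithm: it reports maximal cliques directly and omits the merging of overlapping cliques, and the function names, the Euclidean distance, and the parameters `k` and `min_size` are assumptions.

```python
def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def mutual_knn_graph(profiles, k):
    """Undirected graph with an edge g-h only if each is in the other's k nearest."""
    nearest = {}
    for g, p in profiles.items():
        others = sorted(
            (h for h in profiles if h != g),
            key=lambda h: (euclidean(p, profiles[h]), h),
        )
        nearest[g] = set(others[:k])
    return {g: {h for h in nearest[g] if g in nearest[h]} for g in profiles}


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)
            return
        for v in list(candidates):
            expand(current | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(graph), set())
    return cliques


def nnn_like_clusters(profiles, k=2, min_size=3):
    """Cliques of at least min_size genes; genes in no such clique stay unclustered."""
    graph = mutual_knn_graph(profiles, k)
    return sorted(sorted(c) for c in maximal_cliques(graph) if len(c) >= min_size)


# Hypothetical toy data: two tight groups and one outlier (g7).
profiles = {
    "g1": [0, 0], "g2": [0, 1], "g3": [1, 0],
    "g4": [10, 10], "g5": [10, 11], "g6": [11, 10],
    "g7": [5, 50],
}
clusters = nnn_like_clusters(profiles, k=2, min_size=3)
```

Because g7's nearest neighbors do not list g7 among their own nearest neighbors, it has no edges in the mutual graph and remains unclustered, which mirrors the behavior described in the paragraph above.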
[Microarray Experiment Functional Integration Technology]
MEFIT is a system for microarray integration. As a framework, MEFIT uses the results of many microarray experiments in combination with known biological process annotations (drawn from the Gene Ontology, KEGG, MIPS, or a biologist's own pathways of interest) to predict new gene pair functional relationships within the given biological functions. In other words, MEFIT takes microarray results and known functional annotations as inputs and produces predicted gene pair functional relationships as output.
| Year | Title | Authors |
|---|---|---|
| 2006 | A scalable method for integration and functional analysis of multiple microarray datasets | Curtis Huttenhower; Matt Hibbs; Chad Myers; Olga G. Troyanskaya |
[Gene Relationship Identification in Functional data]
GRIFn is a system for evaluating datasets and methods using a functional genomics gold standard based on curation by expert biologists. It allows users to assess the ability of their datasets or methods to recapitulate known biology both in a global sense and in the context of specific biological processes. GRIFn also enables fair comparisons between various data types and methods.
| Year | Title | Authors |
|---|---|---|
| 2006 | Finding function: evaluation methods for functional genomic data | Chad L Myers; Daniel R Barrett; Matthew A Hibbs; Curtis Huttenhower; Olga G Troyanskaya |
[Gene Ontology Local Exploration Map]
GOLEM is a tool for viewing, navigating, and analyzing the hierarchical structure and annotations of the Gene Ontology (GO). The visualization component allows a user to see the local graph structure around a GO term of interest and navigate to nearby nodes. GOLEM also provides the ability to look for statistical enrichment of GO terms in lists of genes and then observe the relationships between those terms. GOLEM is available both as an applet for use online and as a standalone download.
| Year | Title | Authors |
|---|---|---|
| 2006 | GOLEM: an interactive graph-based gene-ontology navigation and analysis tool | Rachel SG Sealfon; Matthew A Hibbs; Curtis Huttenhower; Chad L Myers; Olga G Troyanskaya |
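As an illustration of what testing a gene list for statistical enrichment of GO terms can look like, here is a minimal sketch. It assumes a one-sided hypergeometric test, which is a common choice for this task; the description above does not say which test GOLEM uses, and the function names and toy annotations are hypothetical. No multiple-testing correction is applied in this sketch.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) when drawing `sample` genes from `population`,
    of which `annotated` carry the term (upper-tail hypergeometric)."""
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(population, sample)


def enriched_terms(gene_list, term_to_genes, background):
    """Return (p-value, term, overlap) tuples sorted by p-value."""
    background = set(background)
    genes = set(gene_list) & background
    results = []
    for term, members in term_to_genes.items():
        members = set(members) & background
        overlap = len(genes & members)
        p = hypergeom_enrichment_p(
            len(background), len(members), len(genes), overlap
        )
        results.append((p, term, overlap))
    return sorted(results)


# Hypothetical toy annotations over a background of ten genes.
background = [f"g{i}" for i in range(10)]
term_to_genes = {
    "term_a": ["g0", "g1", "g2"],
    "term_b": ["g5", "g6", "g7", "g8", "g9"],
}
results = enriched_terms(["g0", "g1", "g2"], term_to_genes, background)
```

Here all three genes in the list carry `term_a`, so it ranks first with a p-value of 1/120, while `term_b` has no overlap with the list and gets a p-value of 1.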
[Biological Pathway Inference from eXperimental Interaction Evidence]
bioPIXIE is a novel system for biological data integration and visualization for S. cerevisiae. It allows the user to discover interaction networks and pathways in which the user's gene(s) of interest participate. The system uses a Bayesian algorithm to identify biological networks from diverse, integrated genomic data.
| Year | Title | Authors |
|---|---|---|
| 2007 | Context-sensitive data integration and prediction of biological networks | Chad L. Myers; Olga G. Troyanskaya |
| 2005 | Discovery of biological networks from diverse functional genomic data | Chad L Myers; Drew Robson; Adam Wible; Matthew A Hibbs; Camelia Chiriac; Chandra L Theesfeld; Kara Dolinski; Olga G Troyanskaya |
[Chromosomal Aberration Region Miner and Viewer]
ChARMView is a visualization and analysis system for guided discovery of chromosomal abnormalities from microarray data. Our system facilitates manual or automated discovery of aneuploidies through dynamic visualization and integrated statistical analysis. ChARMView can be used with array CGH and gene expression microarray data, and multiple experiments can be viewed and analyzed simultaneously.
| Year | Title | Authors |
|---|---|---|
| 2005 | Visualization-based discovery and analysis of genomic aberrations in microarray data. | Chad L Myers; Xing Chen; Olga G Troyanskaya |
| 2004 | Accurate detection of aneuploidies in array CGH and gene expression microarray data | Chad L. Myers; Maitreya Dunham; S. Y. Kung; Olga G. Troyanskaya |
[Genomic Visualization and Analysis of Datasets]
geneVAnD implements several visualization techniques that incorporate meaningful, noise-robust statistics for analyzing the results of clustering algorithms on microarray data. These include a rank-based visualization method that is more robust to noise, a difference display method to aid assessment of cluster quality and detection of outliers, and a projection of high-dimensional data into three-dimensional space to examine relationships between clusters. Our methods are interactive and are dynamically linked together for comprehensive analysis. Further, our approach applies to both protein and gene expression microarrays, and our architecture is scalable for use on both desktop/laptop screens and large-scale display devices.
| Year | Title | Authors |
|---|---|---|
| 2005 | Visualization methods for statistical analysis of microarray clusters | Matthew A Hibbs; Nathaniel C Dirksen; Kai Li; Olga G Troyanskaya |
[K-Nearest Neighbors Imputation]
KNNimpute is an implementation of the k-nearest neighbors algorithm for estimation of missing values in microarray data. In our comparative study of several methods for missing value estimation, we determined that KNNimpute provides superior performance in a variety of situations.
| Year | Title | Authors |
|---|---|---|
| 2001 | Missing value estimation methods for DNA microarrays | Olga G. Troyanskaya; Michael Cantor; Gavin Sherlock; Pat Brown; Trevor Hastie; Robert Tibshirani; David Botstein; Russ B. Altman |
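The general idea of k-nearest-neighbor imputation can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the KNNimpute code itself: rows are genes, missing values are represented as `None`, distance is the root mean squared difference over columns observed in both genes, and neighbors are combined by inverse-distance weighting. The function name and toy matrices are hypothetical.

```python
def knn_impute(matrix, k=2):
    """Fill None entries using the k nearest rows that have the missing column.

    Returns a new matrix; the input is not modified. Entries with no usable
    neighbor are left as None.
    """

    def distance(row_a, row_b):
        # Compare only columns observed in both rows.
        shared = [
            (a, b) for a, b in zip(row_a, row_b) if a is not None and b is not None
        ]
        if not shared:
            return None
        return (sum((a - b) ** 2 for a, b in shared) / len(shared)) ** 0.5

    filled = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for h, other in enumerate(matrix):
                if h == i or other[j] is None:
                    continue
                d = distance(row, other)
                if d is not None:
                    candidates.append((d, other[j]))
            neighbors = sorted(candidates)[:k]
            if not neighbors:
                continue
            exact = [v for d, v in neighbors if d == 0]
            if exact:
                # Identical neighbors: average them to avoid dividing by zero.
                filled[i][j] = sum(exact) / len(exact)
            else:
                weights = [1 / d for d, _ in neighbors]
                filled[i][j] = sum(
                    w * v for w, (_, v) in zip(weights, neighbors)
                ) / sum(weights)
    return filled


# Hypothetical toy matrices (rows are genes, columns are conditions).
expression = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 5.0],
    [10.0, 20.0, 100.0],
]
imputed = knn_impute(expression, k=2)

weighted = knn_impute([[0.0, None], [1.0, 10.0], [2.0, 20.0]], k=2)
```

In the first example the two nearest genes match the incomplete gene exactly on the observed columns, so the missing value is their plain average; in the second, the closer neighbor contributes twice the weight of the farther one.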