U.S. Department of Energy Office of Science
Genomic Science Program
Systems Biology for Energy and Environment
Highlights of Research Progress
Synechococcus Encyclopedia: Integrating Heterogeneous Databases and Tools for High-Throughput Microbial Analysis
http://modpod.csm.ornl.gov/gtl/
High-throughput experimental data are extremely diverse in format and source and are distributed across many internet sites, making integrated access to the information difficult. To address this challenge, GTL researchers at Sandia National Laboratories and Oak Ridge National Laboratory developed the Synechococcus Encyclopedia. This new computational infrastructure provides integrated access to 23 genomic and proteomic databases through an advanced query language for browsing across multiple data sources. Sources include databases for sequence annotations, protein structure, protein interactions, and pathways, as well as raw mass spectrometry and microarray data. Integrative analysis will yield major insights into the behavior of Synechococcus, an abundant group of marine cyanobacteria, and its importance to global carbon fixation. Also available are web-based tools for exploring and analyzing information on Synechococcus species.
These resources enable biologists to combine knowledge and see relationships that were previously obscured because biological data are spread across many databases and stored in diverse formats. GTL researchers are also using the tools to create knowledgebases for other organisms (e.g., R. palustris and Shewanella). [Grant Heffelfinger, Sandia National Laboratories]

