DOE Genomes

Table 7. Computing Roadmap: Facility for Whole Proteome Analysis

Topic

Research, Design, and Development

Demonstration: Pilots and Modular Deployment

Integration and Production Deployment

LIMS and Workflow Management

Participate in GTL cross-facility LIMS working group

Develop technologies and methods to:

  • Manage massive dataflow
  • Process and integrate data
  • Manage workflow
  • Conduct QA/QC
  • Deploy collaborative tools for shared access to data and processes

Archival storage systems

Prototype bulk data capture and retrieval systems

Prototype inter- and intralab limited LIMS

Shared LIMS and workflow technology for each analytical capability

LIMS for each analytical pipeline

Data archives for each analytical pipeline

Inter- and intralab LIMS

Establish LIMS for each analytical pipeline workflow

Products:

  • Output data products
  • Cross-facility access and tracking
  • Information management systems and automation
  • Efficient, analytically rigorous pipelines
  • Components and integration to GTL process

Bioinformatics

Participate in GTL cross-facility working group for data representation and standards

Provide user environments, community access, database development

Integrate data-analysis methods

Develop large-scale integrated experiment designs, analysis pipelines

Workflow processes and database needs

Evaluation of technical solutions

Large-scale storage and retrieval solutions

Entire workflow processes and methods for experimentation and analysis

Algorithms

Quality control and assessment measures

Statistically designed experiments

Multidimensional data-analysis and integration tools for large-scale experimentation

Multilevel databases for bulk and derived data for each profiling method

Analysis pipeline for derived data

Community-access systems

Cross-facility data-sharing processes and analysis methods

Archival, computing, and network capacity to match demand

Bulk data archives for key data sets

Process to link archives to production activities

Local facility data archive

Cross-facility data-sharing processes and analysis methods

Mature bulk data archives and analysis pipelines

Scaleup of archival activities, computing, and network capacity to match demand

Products:

  • Whole proteome analysis for each GTL organism
  • Experiment templates and data sets for modeling and simulation
  • Defined experiment archive integrated with data and analysis from each analytical pipeline
  • Molecular profiling context-dependent database

Computing Infrastructure

Participate in GTL cross-cutting working group for computing infrastructure

Establish scientific computing with massive data reduction, archival storage, and application development

Develop infrastructure: hardware, software, code control, libraries, environments

Use ultrahigh-speed internet connection to GTL facilities

Operations process

Computational architecture

Large-scale data mining

Access and security plans and processes

Performance and quality metrics of service

Capacity planning

Backup and recovery strategy

Testing plans

Workflow

Development-test-production (Dev-Test-Prod) strategy for implementations

Test network

Development environment

Validation methods

Data archive

Access methods

Storage and retrieval methods

Application integration and implementation

Production infrastructure

Cross-facility data sharing

Infrastructure: hardware, software, and network

Production environment and data archive

Bulk data archives for key data sets

Process to link production activities to local facility data archive

Cross-facility data-sharing processes and analysis methods

Mature bulk data archives and analysis pipelines