| Tasks for Base 
    Knowledge Databases | 
    Description | 
  
  
    | Download Genomic/Proteomic/Clinical Data | 
    Available information that directly supports the interpretation of
    experimental results and data mining is downloaded. Scripts
    and custom applications automate the extraction. | 
  
  
    | Mine Journals & Publications | 
    Data that is not contained in directly accessible databases needs to 
    be manually extracted and put into electronic form for data entry. | 
  
  
    | Define Controlled Vocabulary | 
    Each type of data needs to have a standard set of names, 
    tags and/or units to facilitate complex queries. | 
  
  
    | Add Calculated & Reference Data | 
    Depending on the type of experiments and the data mining 
    needs, the knowledge databases may need to be augmented with specific types 
    of calculated and/or reference data such as disease types, translational 
    modifications, DNA-binding proteins, SNPs, ... | 
  
  
    | Filter/Curate Data | 
    Extensive checks ensure that data is consistent, regularized, 
    and uses the controlled vocabulary before it is added to the database. | 
  
  
    | Design Database | 
    First, the database must be designed to capture scientific 
    relationships: for example, a gene is part of a chromosome, has a 
    location, has an expression value under certain conditions, and codes 
    for a protein. Second, the database must be designed for the intended 
    type of integration and data mining. The design of a database that 
    combines gene data with microarray data differs from one 
    without microarray data. | 
  
  
    | Load Database | 
    Once the database is designed and built, scripts or 
    applications routinely load the information into the 
    knowledge database. | 
  
  
    | Tasks for Extending 
    Knowledge Databases | 
    Description | 
  
  
    | Add Links to Internal 
    Experimental Data | 
    This represents the seamless integration of the 
    Microarray Process Flow Laboratory Information Management System 
    (LIMS) data that are used in the analysis and 
    interpretation of results and in report generation. | 
  
  
    | Add Expression Data | 
    Expression data from DNA, SNP, protein, antibody, 
    and other arrays can be added to the knowledge database directly or accessed 
    via links to BxArray(tm) databases. | 
  
  
    | Add Links to External Data | 
    Data that is not explicitly incorporated into 
    the local Knowledge databases can be accessed via links to the external 
    databases. An effective way to link disparate data sources is to 
    create metadata using OLAP (On-Line Analytical Processing). | 
  
  
    | Data Mining Applications | 
    The results of data mining applications, such as 
    reference disease biomarkers, can be added to the Knowledge databases. | 
  
  
    | Add Related Experimental Data | 
    Different types of experiments, such as 2D gels, 
    can be used to validate microarray experiments. This type of data can be 
    added to the Knowledge databases. | 
  
  
    | Update/Annotate | 
    A general-purpose utility for manually or 
    automatically updating and 
    annotating any data stored in the Knowledge databases. |
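
The relationships named under "Design Database" (a gene is part of a chromosome, has a location, has an expression value under certain conditions, and codes for a protein) can be illustrated with a small relational schema. The following is only a minimal sketch using Python's built-in `sqlite3` module. Every table name, column name, and sample value here (`chromosome`, `gene`, `expression`, `GENE_X`, and so on) is a hypothetical placeholder chosen for illustration, not the schema of any actual knowledge database.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE chromosome (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE protein (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- A gene is part of a chromosome, has a location, and codes for a protein.
CREATE TABLE gene (
    id            INTEGER PRIMARY KEY,
    symbol        TEXT NOT NULL UNIQUE,
    chromosome_id INTEGER NOT NULL REFERENCES chromosome(id),
    start_pos     INTEGER NOT NULL,
    end_pos       INTEGER NOT NULL,
    protein_id    INTEGER REFERENCES protein(id)
);
-- A gene has an expression value under a given condition.
CREATE TABLE expression (
    gene_id        INTEGER NOT NULL REFERENCES gene(id),
    condition_name TEXT NOT NULL,
    expr_value     REAL NOT NULL,
    PRIMARY KEY (gene_id, condition_name)
);
"""


def build_demo_db():
    """Create an in-memory database and load a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce REFERENCES constraints
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO chromosome (id, name) VALUES (1, 'chrA')")
    conn.execute("INSERT INTO protein (id, name) VALUES (1, 'PROTEIN_X')")
    conn.execute(
        "INSERT INTO gene (id, symbol, chromosome_id, start_pos, end_pos, protein_id) "
        "VALUES (1, 'GENE_X', 1, 100, 900, 1)"
    )
    conn.executemany(
        "INSERT INTO expression (gene_id, condition_name, expr_value) VALUES (?, ?, ?)",
        [(1, "control", 1.0), (1, "treated", 2.5)],
    )
    conn.commit()
    return conn


def gene_report(conn, symbol):
    """Return one row per condition, joining a gene to its related records."""
    query = """
        SELECT g.symbol, c.name, g.start_pos, g.end_pos,
               p.name, e.condition_name, e.expr_value
        FROM gene g
        JOIN chromosome c ON c.id = g.chromosome_id
        LEFT JOIN protein p ON p.id = g.protein_id
        LEFT JOIN expression e ON e.gene_id = g.id
        WHERE g.symbol = ?
        ORDER BY e.condition_name
    """
    return conn.execute(query, (symbol,)).fetchall()
```

Calling `gene_report(build_demo_db(), "GENE_X")` returns one tuple per experimental condition. Keeping expression values in their own table is one way a design that includes microarray data can differ from one without it: a design without such data could simply omit the `expression` table.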