Systems biology aims to gain insight into complex biological systems by integrating disparate pieces of data from various sources and from different levels (such as genome, transcriptome, proteome, metabolome, interactome or reactome), and by formulating models that describe how the systems work. The explosive growth in biological and biochemical data is beneficial for systems biology research, and it has driven the development of diverse types of biological databases, such as GenBank, UniProt, SGD, HMDB, BioGRID, KEGG, ArrayExpress and GEO. However, only 20% of the millions of data deposited in GEO have been referred to in other work, indicating a bottleneck in the utilization of large-scale data. Even though these public repositories ensure easy access to data and hence represent a platform for systems biology research, they were in many cases implemented by isolated groups with a particular purpose in mind. Furthermore, these databases often have distinct data models, different file formats, varied semantic concepts and specific data access techniques, and they often contain incomplete data. Taken together, these factors make data management and data integration extremely challenging and error-prone.
Attempts have been made to resolve these key issues through the development of numerous data standards (e.g. SBML, CellML, PSI-MI, BioPAX, GO and SBO), the implementation of centralized and federated databases (e.g. cPath, PathCase and Pathway Commons) and the proposal of design methodologies for software and databases (e.g. I-cubed  and ). Although there are still no established best practices or complete solutions to this problem, research and development are underway, making use of current computational technologies, standards and frameworks (see  for a review). Here we describe the development of a dedicated database system for handling multi-level data. It represents an ongoing endeavor to serve researchers in systems biology and to provide alternative solutions for vital issues in data handling, data access and the integration of data in a single database. The database system was designed and developed by taking into account: 1) the ability to integrate multi-level data; 2) the complex, heterogeneous and dynamic nature of biological data; 3) the diversity of resources in terms of data model, semantic heterogeneity, data completeness and data correctness; 4) the reusability, extensibility and interoperability of the system; and 5) the integrity, consistency and reliability of data in the database. The design of the database schema is adapted from BioPAX and implemented using an object-oriented approach, which represents each piece of information as an object with related attributes and a variety of relationships. This approach is well suited to biological information, which is heterogeneous and sophisticated. The database API was developed in C++ and includes a library providing the functions needed to manage and interact with the system.
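The object-oriented idea described above, in which each piece of information is an object carrying attributes and typed relationships to other objects, can be illustrated with a minimal C++ sketch. This is not the API of the database system described here; the names `Entity`, `Relationship`, `make_entity`, `relate` and `related_ids` are hypothetical and chosen only for illustration, and the string-valued attribute map is a simplifying assumption.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: none of these names come from the database API
// described in the text. They only illustrate "object + attributes +
// relationships" as a data structure.
struct Entity;

// A typed, directed link from one entity to another
// (for example, a gene "encodes" a protein).
struct Relationship {
    std::string type;
    std::weak_ptr<Entity> target;  // weak reference avoids ownership cycles
};

struct Entity {
    std::string id;                                 // identifier within the repository
    std::string kind;                               // e.g. "Gene", "Protein", "Reaction"
    std::map<std::string, std::string> attributes;  // assumed: simple key/value pairs
    std::vector<Relationship> relationships;
};

using EntityPtr = std::shared_ptr<Entity>;

// Create an entity with the given identifier and kind.
inline EntityPtr make_entity(std::string id, std::string kind) {
    auto entity = std::make_shared<Entity>();
    entity->id = std::move(id);
    entity->kind = std::move(kind);
    return entity;
}

// Record a directed relationship of the given type from source to target.
inline void relate(const EntityPtr& source, const std::string& type,
                   const EntityPtr& target) {
    assert(source && target);
    source->relationships.push_back(Relationship{type, target});
}

// Collect the ids of all still-existing entities that `source` points to
// through relationships of the given type.
inline std::vector<std::string> related_ids(const Entity& source,
                                            const std::string& type) {
    std::vector<std::string> ids;
    for (const auto& rel : source.relationships) {
        if (rel.type != type) continue;
        if (auto target = rel.target.lock()) ids.push_back(target->id);
    }
    return ids;
}
```

In this sketch, data from different levels (a gene from the genome, a protein from the proteome) become objects of the same general shape, and integration amounts to adding relationships between them, for instance `relate(gene, "encodes", protein)` followed by `related_ids(*gene, "encodes")`.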
To illustrate the integration of multi-level data within a single database environment, a yeast data repository was developed. The database contains multi-level data for the yeast Saccharomyces cerevisiae (e.g. genome, annotation data, interactome and metabolic model) from different resources. Data population, data management and data access are handled by the database system. A simple query interface is provided to access the data and related information. Furthermore, two research cases are presented to demonstrate the extensibility and efficiency of the database and the underlying database system in facilitating data integration tasks to fulfil specific requests.