Biomedical informatics: Development of a comprehensive data warehouse for clinical and genomic breast cancer research

Hai Hu*, Henry Brzeski, Joe Hutchins, Mohan Ramaraj, Long Qu, Richard Xiong, Surendran Kalathil, Rand Kato, Santhosh Tenkillaya, Jerry Carney, Rosann Redd, Sheshkumar Arkalgudvenkata, Kashif Shahzad, Richard Scott, Hui Cheng, Stephen Meadow, John McMichael, Shwu Lin Sheu, David Rosendale, Leonid Kvecher, Stephen Ahern, Song Yang, Yonghong Zhang, Rick Jordan, Stella B. Somiari, Jeffrey Hooke, Craig D. Shriver, Richard I. Somiari, Michael N. Liebman

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

24 Scopus citations


The Windber Research Institute is an integrated high-throughput research center employing clinical, genomic and proteomic platforms to produce terabyte levels of data. We use biomedical informatics technologies to integrate all of these operations. This report includes information on a multi-year, multi-phase hybrid data warehouse project currently under development in the Institute. The purpose of the warehouse is to host the terabyte-level of internal experimentally generated data as well as data from public sources. We have previously reported on the phase I development, which integrated limited internal data sources and selected public databases. Currently, we are completing phase II development, which integrates our internal automated data sources and develops visualization tools to query across these data types. This paper summarizes our clinical and experimental operations, the data warehouse development, and the challenges we have faced. In phase III we plan to federate additional manual internal and public data sources and then to develop and adapt more data analysis and mining tools. We expect that the final implementation of the data warehouse will greatly facilitate biomedical informatics research.

Original language: English
Pages (from-to): 933-941
Number of pages: 9
Issue number: 7
State: Published - Oct 2004
Externally published: Yes


  • Clinical information
  • Data warehouse
  • Federated
  • Genomics
  • Hybrid
  • Integrated
  • Proteomics


