Cancer Res Treat.  2007 Jun;39(2):74-81.

An Attempt for Combining Microarray Data Sets by Adjusting Gene Expressions

  • 1 Oral Cancer Research Institute, Yonsei University College of Dentistry, Korea.
  • 2 Cancer Metastasis Research Center, Yonsei University College of Medicine, Seoul, Korea.
  • 3 National Biochip Research Center, Yonsei University College of Medicine, Seoul, Korea.
  • 4 Yonsei Cancer Center, Yonsei University College of Medicine, Seoul, Korea.
  • 5 Brain Korea 21 Project for Medical Science, Yonsei University College of Medicine, Seoul, Korea.
  • 6 Department of Internal Medicine, Yonsei University College of Medicine, Seoul, Korea.


PURPOSE: The diverse experimental environments in microarray technology, such as different platforms or different RNA sources, can bias the analysis of multiple microarrays. These systematic effects are a substantial obstacle to the analysis of microarray data, and the resulting information may be inconsistent and unreliable. Therefore, we introduce a simple integration method for combining microarray data sets derived from different experimental conditions, with the expectation that more reliable information can be detected from the combined data set than from the separate data sets.
MATERIALS AND METHODS: The method is based on the distributions of the gene expression ratios among the different microarray data sets, and it transforms the gene expression ratios, gene by gene, into the form of the reference data set. The efficiency of the proposed integration method was evaluated using two microarray data sets derived from different RNA sources and a newly defined measure, the mixture score.
RESULTS: The proposed integration method intermixed the two data sets obtained from different RNA sources, which reduced the experimental bias between them, and the mixture score increased by 24.2%. The data set combined by the proposed method preserved the inter-group relationships of the separate data sets.
CONCLUSION: The proposed method worked well in adjusting for systematic biases, including the source effect. An effectively integrated microarray data set yields more reliable results because of the larger sample size, which also decreases the chance of false negatives.
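
The gene-by-gene transformation toward a reference data set can be illustrated with a minimal sketch. The abstract does not state the exact transformation, so the code below assumes a simple location-scale adjustment: for each shared gene, the target values are standardized and then rescaled to the mean and standard deviation of the reference values. The function name `adjust_to_reference`, the dictionary layout, and the gene labels are hypothetical and chosen only for illustration; the mixture score is not sketched because its definition is not given here.

```python
from statistics import mean, stdev


def adjust_to_reference(reference, target):
    """Gene-wise location-scale adjustment (an assumed stand-in, not the
    authors' published formula).

    reference, target: dict mapping gene name -> list of log expression
    ratios (at least two values per gene, as required by stdev).
    Returns a dict holding the target values re-expressed on the reference
    scale, restricted to genes present in both data sets.
    """
    adjusted = {}
    for gene in reference.keys() & target.keys():
        ref_vals, tgt_vals = reference[gene], target[gene]
        ref_mu, ref_sd = mean(ref_vals), stdev(ref_vals)
        tgt_mu, tgt_sd = mean(tgt_vals), stdev(tgt_vals)
        if tgt_sd == 0:
            # A constant gene carries no spread to rescale; place every
            # value at the reference mean.
            adjusted[gene] = [ref_mu for _ in tgt_vals]
        else:
            # Standardize within the target set, then map onto the
            # reference mean and standard deviation.
            adjusted[gene] = [
                (v - tgt_mu) / tgt_sd * ref_sd + ref_mu for v in tgt_vals
            ]
    return adjusted


# Toy values, invented for illustration only.
reference_set = {"GENE_A": [0.0, 1.0, 2.0]}
target_set = {"GENE_A": [10.0, 12.0, 14.0], "GENE_B": [0.5, 0.7, 0.9]}
combined_target = adjust_to_reference(reference_set, target_set)
```

After such an adjustment, each shared gene in the target set has the same mean and standard deviation as in the reference set, so the two sets can be concatenated sample-wise. Genes missing from either set are dropped in this sketch.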


Keywords

Microarray; Gene expression; Integration method; Different platforms; Different RNA sources; Systematic effects

MeSH Terms

Bias (Epidemiology)
Gene Expression*
Sample Size