TY - JOUR
T1 - Large-scale automated analysis of location patterns in randomly tagged 3T3 cells
AU - Osuna, Elvira García
AU - Hua, Juchang
AU - Bateman, Nicholas W.
AU - Zhao, Ting
AU - Berget, Peter B.
AU - Murphy, Robert F.
N1 - Funding Information: We would like to thank Dr. Jonathan Jarvik for helpful discussions and Yehuda Creeger for technical assistance. This work was supported by Commonwealth of Pennsylvania Tobacco Settlement Fund grant 017393, NIH grant GM068845-01, and NSF grant EF-0331657.
PY - 2007/6
Y1 - 2007/6
N2 - Location proteomics is concerned with the systematic analysis of the subcellular location of proteins. To perform high-resolution, high-throughput analysis of all protein location patterns, automated methods are needed. Here we describe the use of such methods on a large collection of images obtained by automated microscopy to perform high-throughput analysis of endogenous proteins randomly tagged with a fluorescent protein in NIH 3T3 cells. Cluster analysis was performed to identify the statistically significant location patterns in these images. This allowed us to assign a location pattern to each tagged protein without specifying what patterns are possible. To choose the best feature set for this clustering, we have used a novel method that determines which features do not artificially discriminate between control wells on different plates and uses Stepwise Discriminant Analysis (SDA) to determine which features do discriminate as much as possible among the randomly tagged wells. Combining this feature set with consensus clustering methods resulted in 35 clusters among the first 188 clones we obtained. This approach represents a powerful automated solution to the problem of identifying subcellular locations on a proteome-wide basis for many different cell types.
AB - Location proteomics is concerned with the systematic analysis of the subcellular location of proteins. To perform high-resolution, high-throughput analysis of all protein location patterns, automated methods are needed. Here we describe the use of such methods on a large collection of images obtained by automated microscopy to perform high-throughput analysis of endogenous proteins randomly tagged with a fluorescent protein in NIH 3T3 cells. Cluster analysis was performed to identify the statistically significant location patterns in these images. This allowed us to assign a location pattern to each tagged protein without specifying what patterns are possible. To choose the best feature set for this clustering, we have used a novel method that determines which features do not artificially discriminate between control wells on different plates and uses Stepwise Discriminant Analysis (SDA) to determine which features do discriminate as much as possible among the randomly tagged wells. Combining this feature set with consensus clustering methods resulted in 35 clusters among the first 188 clones we obtained. This approach represents a powerful automated solution to the problem of identifying subcellular locations on a proteome-wide basis for many different cell types.
KW - CD-tagging
KW - Cluster analysis
KW - Fluorescence microscopy
KW - Location proteomics
KW - Protein subcellular location
KW - Subcellular location features
KW - Subcellular location trees
UR - http://www.scopus.com/inward/record.url?scp=34249666037&partnerID=8YFLogxK
U2 - 10.1007/s10439-007-9254-5
DO - 10.1007/s10439-007-9254-5
M3 - Article
C2 - 17285363
AN - SCOPUS:34249666037
SN - 0090-6964
VL - 35
SP - 1081
EP - 1087
JO - Annals of Biomedical Engineering
JF - Annals of Biomedical Engineering
IS - 6
ER -