DNA microarrays provide a new technique for measuring gene expression that has attracted considerable research interest in recent years. Gene expression data from microarrays (biochips) have been suggested for use in many biomedical areas, e.g., in cancer classification. Although several classification methods, both new and existing, have been tested, selecting a proper (optimal) set of genes whose expression levels can be used for classification remains an open problem. We have recently proposed a new recursive feature replacement (RFR) algorithm for choosing a suboptimal set of genes. The algorithm uses the support vector machine (SVM) technique. In this paper we use the RFR method to find suboptimal gene subsets for tumor/normal colon tissue classification. The results are compared with those of other methods recently proposed in the literature. The comparison shows that the RFR method finds the smallest gene subset (only six genes) that gives no misclassifications in leave-one-out cross-validation for a tumor/normal colon data set. In this sense the RFR algorithm outperforms all the other investigated methods.
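The evaluation criterion named in the abstract, the number of misclassifications in leave-one-out cross-validation for a given gene subset, can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: a simple nearest-centroid rule stands in for the SVM classifier, the expression values are made up, and the function names `nearest_centroid_predict` and `loocv_errors` are hypothetical. It does not implement the RFR algorithm itself; it only shows how a candidate gene subset might be scored.

```python
# Leave-one-out cross-validation (LOOCV) error count for a chosen gene subset.
# Assumption: a nearest-centroid rule stands in for the SVM used in the paper.


def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to `sample`."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        # Column-wise mean of the training rows belonging to this class.
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_errors(data, labels, gene_subset):
    """Count misclassifications when each sample is held out exactly once."""
    # Keep only the columns (genes) in the candidate subset.
    reduced = [[row[g] for g in gene_subset] for row in data]
    errors = 0
    for i in range(len(reduced)):
        train_x = reduced[:i] + reduced[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, reduced[i]) != labels[i]:
            errors += 1
    return errors


# Toy expression matrix (rows: tissue samples, columns: genes); values are made up.
data = [
    [5.0, 1.0, 0.2],
    [5.2, 0.9, 0.8],
    [4.8, 1.1, 0.5],
    [1.0, 1.0, 0.3],
    [1.2, 0.9, 0.7],
    [0.9, 1.1, 0.6],
]
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]

print(loocv_errors(data, labels, gene_subset=[0]))  # prints 0
```

In this toy data the first gene separates the two classes cleanly, so the single-gene subset `[0]` already gives zero leave-one-out errors. A gene selection method would compare such error counts across candidate subsets and prefer the smallest subset that reaches zero.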
@article{bwmeta1.element.bwnjournal-article-amcv13i3p327bwm,
  author = {Fujarewicz, Krzysztof and Wiench, Ma\l gorzata},
  title = {Selecting differentially expressed genes for colon tumor classification},
  journal = {International Journal of Applied Mathematics and Computer Science},
  volume = {13},
  year = {2003},
  pages = {327-335},
  zbl = {1035.92018},
  language = {en},
  url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-amcv13i3p327bwm}
}
Fujarewicz, Krzysztof; Wiench, Małgorzata. Selecting differentially expressed genes for colon tumor classification. International Journal of Applied Mathematics and Computer Science, Vol. 13 (2003), pp. 327-335. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-amcv13i3p327bwm/