{"title":"Protein Graph Partitioning by Mutually Maximization of cycle-distributions","authors":"Frank Emmert Streib","volume":8,"journal":"International Journal of Biomedical and Biological Engineering","pagesStart":516,"pagesEnd":521,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/11557","abstract":"The classification of protein structure is commonly performed not for the whole protein but for its structural domains, i.e., compact functional units preserved during evolution. Hence, a first step toward protein structure classification is the separation of the protein into its domains. We approach the problem of protein domain identification by proposing a novel graph-theoretical algorithm. We represent the protein structure as an undirected, unweighted and unlabeled graph whose nodes correspond to the secondary structure elements of the protein. This graph is called the protein graph. The domains are then identified as partitions of the graph corresponding to vertex sets obtained by maximizing an objective function that mutually maximizes the cycle distributions found in the partitions of the graph. Our algorithm uses no information other than the cycle distribution to find the partitions. If a partition is found, the algorithm is applied iteratively to each of the resulting subgraphs. As a stopping criterion, we numerically calculate a significance level that indicates the stability of the predicted partition against a random rewiring of the protein graph. Hence, our algorithm automatically terminates its iterative application. We present results for one- and two-domain proteins, compare them with the domains manually assigned in the SCOP database, and discuss the differences.","references":"[1] H. Kopka and P. W. Daly, A Guide to LaTeX, 3rd ed. Harlow, England: Addison-Wesley, 1999.\r\n[2] A. Andreeva, D. Howorth, S.E. Brenner, T.J. Hubbard, C. 
Chothia and A.G. Murzin, SCOP database in 2004: refinements integrate structure and sequence family data, Nucleic Acids Res., 32:D226-229, 2004.\r\n[3] F.C. Bernstein, T.F. Koetzle, G.J.B. Williams, E.F. Meyer Jr, M.D. Brice, J.R. Rodgers, O. Kennard, T. Shimanouchi, M. Tasumi, The Protein Data Bank: A computer-based archival file for macromolecular structures, J. Mol. Biol., 112:535-542, 1977.\r\n[4] R.F. Doolittle, The multiplicity of domains in proteins, Annu. Rev. Biochem., 64:287-314, 1995.\r\n[5] J-t. Guo, D. Xu, D. Kim and Y. Xu, Improving the performance of DomainParser for structural domain partition using neural network, Nucleic Acids Res., 31(3):944-952, 2003.\r\n[6] L. Holm and C. Sander, Dictionary of recurrent domains in protein structures, Proteins: Structure, Function and Genetics, 33:88-96, 1998.\r\n[7] P.J. Kraulis, Molscript: A program to produce both detailed and schematic plots of protein structures, J. Appl. Crystallogr., 24:946-950, 1991.\r\n[8] N.J. Mulder et al., InterPro, progress and status in 2005, Nucleic Acids Res., 33:D201-205, 2005.\r\n[9] A.G. Murzin, S.E. Brenner, T. Hubbard and C. Chothia, SCOP: a structural classification of proteins database for the investigation of sequences and structures, J. Mol. Biol., 247:536-540, 1995.\r\n[10] D.C. Phillips, The three-dimensional structure of an enzyme molecule, Sci. Am., 215:78-90, 1966.\r\n[11] A.S. Siddiqui and G.J. Barton, Continuous and discontinuous domains: An algorithm for the automatic generation of reliable protein domain definitions, Protein Science, 4:872-884, 1995.\r\n[12] G.D. Rose, Hierarchic organization of domains in globular proteins, J. Mol. Biol., 134(3):447-470, 1979.\r\n[13] M.G. Rossmann and A. Liljas, Recognition of structural domains in globular proteins, J. Mol. Biol., 85:177-181, 1974.\r\n[14] W.R. Taylor, Protein structural domain identification, Protein Eng., 12(3):203-216, 1999.\r\n[15] S. Veretnik, P.E. 
Bourne, N.N. Alexandrov and I.N. Shindyalov, Toward consistent assignment of structural domains in proteins, J. Mol. Biol., 339:647-678, 2004.\r\n[16] D. Vitkup, E. Melamud, J. Moult and C. Sander, Completeness in structural genomics, Nat. Struct. Biol., 8(6):559-566, 2001.\r\n[17] D. Wetlaufer, Nucleation, rapid folding, and globular intrachain regions in proteins, Proc. Natl. Acad. Sci., 70:697-701, 1973.\r\n[18] Y. Xu, D. Xu and H.N. Gabow, Protein domain decomposition using a graph-theoretic approach, Bioinformatics, 16(12):1091-1104, 2000.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 8, 2007"}