Most non-infectious diseases are associated with dysfunction of proteins or protein complexes. The association between sequence and structure has long been studied, and the organization of sequences into domains and motifs remains an active research area. Here we propose a mathematical method to identify the hierarchical organization of protein sequences. The method treats the pentapeptide as the basic unit of a protein sequence. It was applied to a non-homologous dataset of protein sequences. The analysis revealed 11 hierarchical levels of protein sequence organization and showed how the sequence fragments at these levels relate to one another. Using several examples, we illustrate how fragments of the spatial structure of a protein correspond to elements of the hierarchical structure of its sequence. We conclude that protein sequences have a hierarchical structure. This methodology offers a basis for a mathematically grounded classification of the elements of the spatial organization of proteins. Elements from different levels of the hierarchy can be applied to biotechnological and medical problems.
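The abstract names the pentapeptide as the unit of analysis but does not say how pentapeptides are extracted from a sequence. As a minimal sketch only, assuming overlapping windows of five residues over a one-letter amino acid string, the decomposition could look like the following. The function names `pentapeptides` and `pentapeptide_counts` are hypothetical and are not taken from the paper, and the hierarchical analysis itself is not reproduced here.

```python
from collections import Counter


def pentapeptides(sequence, k=5):
    """Return all overlapping windows of length k from a one-letter sequence.

    Assumption: windows overlap and advance one residue at a time.
    The paper's actual extraction scheme is not stated in the abstract.
    """
    seq = sequence.strip().upper()
    # A sequence shorter than k yields an empty list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def pentapeptide_counts(sequences, k=5):
    """Count how often each window of length k occurs across a set of sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(pentapeptides(seq, k))
    return counts
```

For example, `pentapeptides("MKTAYIAK")` yields the four windows `MKTAY`, `KTAYI`, `TAYIA`, and `AYIAK`. Counting such windows across a dataset is one plausible starting point for grouping sequence fragments, but it is not a statement of the authors' procedure.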
Subject: Biology and Life Sciences - Biochemistry and Molecular Biology
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.