Amino acid substitution matrices based on 4-body Delaunay contact profiles

Sequence similarity search of proteins is one of the most basic and common steps in bioinformatics research and is used in making evolutionary, structural, and functional inferences. The quality of the search and of the resulting protein sequence alignment depends crucially on the underlying amino acid substitution matrix. We present a method for deriving amino acid substitution matrices from 4-body contact propensities of amino acids in 3D protein structures. Unlike current popular methods, our method does not rely on mutational analysis, evolutionary arguments, or alignment of protein sequences or structures. The alignment accuracy of our derived matrices is evaluated using the BAliBASE reference alignment set and is found to be comparable to that of popular matrices from the literature. Notably, the metric subset of our matrices outperforms other available metric matrices. Our matrices will be especially useful in the development of empirical potential energy functions and in distance-based sequence indexing.
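
To make the term "metric" concrete: a substitution matrix expressed as distances is metric when it is zero on the diagonal, positive off the diagonal, symmetric, and satisfies the triangle inequality, which is the property distance-based indexing relies on. The following is a minimal sketch of such a check, not the procedure used in this work; the function name `is_metric` and the toy three-residue distance values are hypothetical and chosen only for illustration.

```python
from itertools import product


def is_metric(dist, tol=1e-9):
    """Return True if the square matrix `dist` satisfies the metric axioms.

    `dist` is a list of equal-length rows; `tol` absorbs floating-point noise.
    """
    n = len(dist)
    for i in range(n):
        # Identity: every residue is at distance zero from itself.
        if abs(dist[i][i]) > tol:
            return False
        for j in range(n):
            # Positivity: distinct residues must be a positive distance apart.
            if i != j and dist[i][j] <= tol:
                return False
            # Symmetry: d(a, b) must equal d(b, a).
            if abs(dist[i][j] - dist[j][i]) > tol:
                return False
    # Triangle inequality: d(a, c) <= d(a, b) + d(b, c) for all triples.
    return all(
        dist[i][k] <= dist[i][j] + dist[j][k] + tol
        for i, j, k in product(range(n), repeat=3)
    )


# Hypothetical toy distances for three residues (illustrative values only).
toy_ok = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.5],
    [2.0, 1.5, 0.0],
]
# Same layout, but d(0, 2) = 5 exceeds d(0, 1) + d(1, 2) = 2.
toy_bad = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 1.0],
    [5.0, 1.0, 0.0],
]

print(is_metric(toy_ok))
print(is_metric(toy_bad))
```

A real check would run over the full 20 x 20 amino acid matrix. The cubic loop over triples is inexpensive at that size.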