
2 Advanced Concepts Laboratory, Georgia Tech Research Institute, Fairborn, OH, United States.
3 Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States.
4 Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States.
5 Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States.

Proteins by and large carry out their molecular functions in a folded state, when residues that are distant in sequence assemble in 3D space to bind a ligand, catalyze a reaction, form a channel, or exert another concerted macromolecular interaction. It has long been recognized that covariance of amino acids between distant positions within a protein sequence allows for the inference of long-range contacts to facilitate 3D structure modeling. In this work, we investigated whether covariance analysis may also reveal residues involved in the same molecular function. Building upon our previous work, CoeViz, we conducted a large-scale covariance analysis among 7,595 non-redundant proteins with resolved 3D structures to assess 1) whether residues with the same function coevolve, 2) which covariance metric captures such couplings better, and 3) how different molecular functions compare in this context. We found that the chi-squared metric is the most informative for the identification of coevolving functional sites, followed by the Pearson correlation-based metric, whereas mutual information is the least informative. Of the seven categories of the most common natural ligands (coenzyme A, dinucleotide, DNA/RNA, heme, metal, nucleoside, and sugar), the trace metal-binding residues display the most prominent coupling, followed by the sugar-binding sites.
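To make the three covariance metrics named in the abstract concrete, the following minimal Python sketch scores a single pair of alignment columns. It is an illustration under stated assumptions, not the CoeViz implementation: the function names are hypothetical, any sequence weighting or other corrections a production tool might apply are omitted, and the Pearson variant assumes a simple identity-based similarity between sequences rather than a substitution matrix.

```python
from collections import Counter
from itertools import combinations
import math


def chi_squared(col_i, col_j):
    """Chi-squared statistic comparing observed and expected residue-pair counts."""
    n = len(col_i)
    pair_counts = Counter(zip(col_i, col_j))
    count_i, count_j = Counter(col_i), Counter(col_j)
    total = 0.0
    for a in count_i:
        for b in count_j:
            # Expected count if the two columns varied independently.
            expected = count_i[a] * count_j[b] / n
            observed = pair_counts.get((a, b), 0)
            total += (observed - expected) ** 2 / expected
    return total


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    pair_counts = Counter(zip(col_i, col_j))
    count_i, count_j = Counter(col_i), Counter(col_j)
    mi = 0.0
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        # Ratio p(a,b) / (p(a) * p(b)) written with raw counts.
        mi += p_ab * math.log2(c * n / (count_i[a] * count_j[b]))
    return mi


def pearson_identity(col_i, col_j):
    """Pearson correlation of per-sequence-pair similarities.

    Assumption: similarity is 1.0 when two sequences share the same
    residue at a position and 0.0 otherwise.
    """
    pairs = list(combinations(range(len(col_i)), 2))
    s_i = [1.0 if col_i[k] == col_i[l] else 0.0 for k, l in pairs]
    s_j = [1.0 if col_j[k] == col_j[l] else 0.0 for k, l in pairs]
    mean_i = sum(s_i) / len(pairs)
    mean_j = sum(s_j) / len(pairs)
    cov = sum((x - mean_i) * (y - mean_j) for x, y in zip(s_i, s_j))
    var_i = sum((x - mean_i) ** 2 for x in s_i)
    var_j = sum((y - mean_j) ** 2 for y in s_j)
    if var_i == 0.0 or var_j == 0.0:
        return 0.0  # A fully conserved column carries no covariance signal.
    return cov / math.sqrt(var_i * var_j)


# Toy columns (hypothetical data): each string is one alignment position,
# with one character per sequence.
coupled = ("AAAACCCC", "DDDDEEEE")
independent = ("AACCAACC", "DEDEDEDE")
for name, (ci, cj) in [("coupled", coupled), ("independent", independent)]:
    print(name, chi_squared(ci, cj), mutual_information(ci, cj), pearson_identity(ci, cj))
```

In this toy setting, perfectly coupled columns score high on all three metrics and independent columns score near zero. Differences between the metrics of the kind reported in the abstract only emerge on real alignments, where residue frequencies are uneven.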


Daniel Corcoran 1, Nicholas Maltbie 1, Shivchander Sudalairaj 1, Frazier N.
