Nat Commun. 2022 Mar 10;13(1):1265.
doi: 10.1038/s41467-022-28865-w.

Improved prediction of protein-protein interactions using AlphaFold2

Patrick Bryant et al. Nat Commun.

Erratum in

Abstract

Predicting the structure of interacting protein chains is a fundamental step towards understanding protein function. Unfortunately, no computational method can produce accurate structures of protein complexes. AlphaFold2 has shown unprecedented levels of accuracy in modelling single-chain protein structures. Here, we apply AlphaFold2 to the prediction of heterodimeric protein complexes. We find that the AlphaFold2 protocol, together with optimised multiple sequence alignments, generates models of acceptable quality (DockQ ≥ 0.23) for 63% of the dimers. From the predicted interfaces, we create a simple function to predict the DockQ score, which distinguishes acceptable from incorrect models, as well as interacting from non-interacting proteins, with state-of-the-art accuracy. We find that, using the predicted DockQ scores, we can identify 51% of all interacting pairs at a 1% false positive rate (FPR).
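The last figure in the abstract, the share of interacting pairs recovered at a fixed FPR, can be illustrated with a small sketch. This is not the authors' evaluation code: the function name `tpr_at_fpr` and the toy scores below are hypothetical, and the only assumption is that a higher score means "more likely to interact".

```python
def tpr_at_fpr(pos_scores, neg_scores, max_fpr=0.01):
    """Return (threshold, tpr, fpr) for the lowest threshold with FPR <= max_fpr.

    A pair is predicted as interacting when its score is >= threshold.
    pos_scores: scores of truly interacting pairs (hypothetical input).
    neg_scores: scores of non-interacting pairs (hypothetical input).
    """
    # Candidate thresholds are the observed scores, highest first.
    candidates = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    best = (float("inf"), 0.0, 0.0)  # fallback: nothing predicted as interacting
    for t in candidates:
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        if fpr > max_fpr:
            break  # lowering the threshold further can only raise the FPR
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        best = (t, tpr, fpr)
    return best


# Toy scores, invented for illustration only.
pos = [0.9, 0.8, 0.7, 0.3, 0.2]
neg = [0.75, 0.4, 0.1, 0.05]
threshold, tpr, fpr = tpr_at_fpr(pos, neg, max_fpr=0.25)
print(threshold, tpr, fpr)
```

With real data, `pos_scores` and `neg_scores` would hold the predicted DockQ scores of the interacting and non-interacting pairs, and `max_fpr=0.01` would correspond to the 1% FPR operating point quoted above.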

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. DockQ scores for the test set (n = 1481 for all but RF, n = 1455).
Distribution of DockQ scores as boxplots for different modelling strategies on the test set. Boxes encompass data quartiles, horizontal lines mark the medians, and the upper and lower whiskers indicate the maximum and minimum values of each distribution, respectively. All AF2 models were run with the same neural network configuration (m1-10-1). Outlier points are not displayed. “AF2” refers to running AF2 with the default AF2 MSAs, “Paired” refers to MSAs paired using species information, and “Block” refers to block diagonalization MSAs.
Fig. 2. Model quality metrics and multiple model ranking.
a ROC curve as a function of different metrics for the test dataset (n = 1481, first run). Cβs from different chains within 8 Å of each other are used to define the interface. IF_plDDT is the average plDDT of the interface residues, min plDDT per chain is the lower of the two chains' average plDDT, average plDDT is the average over the entire complex, and IF_contacts and IF_residues are the number of interface contacts and residues, respectively. pDockQ is a sigmoidal fit to the combined metric IF_plDDT⋅log(IF_contacts), fitted to predict DockQ as the target score (see c). b Average interface plDDT vs the logarithm of the interface contacts, coloured by DockQ score, on the test set (n = 1481). Increasing both the number of interface contacts and the average interface plDDT results in higher DockQ scores. c Using the combined metric IF_plDDT⋅log(IF_contacts), we fit a sigmoidal curve to the DockQ scores on the test set (n = 1481), which allows the DockQ score to be predicted in a continuous manner (pDockQ). The overall average error is 0.14 DockQ score. d Impact of different initialisations on the modelling outcome in terms of DockQ score on the test dataset (n = 1481). The maximal and minimal scores are plotted against the top-ranked models using the pDockQ scores for AF2 + paired MSAs, m1-10-1.
Fig. 3. DockQ distributions for test dataset (n = 1481) tertiles.
a Distribution of DockQ scores for three sets of interfaces with the majority of Helix, Sheet and Coil secondary structures. b Distribution of DockQ scores for tertiles derived from the distribution of contact counts in docking model interfaces. c Distribution of DockQ scores for tertiles derived from the distribution of Paired MSAs Neff scores. d Distribution of DockQ scores for the top three organisms H. sapiens, S. cerevisiae and E. coli.
Fig. 4. Predicted and native structures from the set of novel proteins without templates.
The native structures are represented as grey ribbons. a Docking of 7EIV chains A (blue) and C (green) (DockQ = 0.76). b Docking of 7MEZ chains A (blue) and B (green) (DockQ = 0.53). c Prediction of structure 7EL1 chains A (blue) and E (green) (DockQ = 0.01). The DNA going through chain A is coloured in orange. d Docking of 7LF7 chains A (blue) and M (magenta) (DockQ = 0.02) and chains B (green) and M (magenta) (DockQ = 0.02).
Fig. 5. Discrimination of interacting (n = 1481) and non-interacting (n = 5694) proteins.
a The ROC curve as a function of different metrics for discriminating between interacting and non-interacting proteins. IF_plDDT is the average plDDT in the interface, min plDDT per chain is the lower of the two chains' average plDDT, average plDDT is the average over the entire complex, and IF_contacts and IF_residues are the number of interface contacts and residues, respectively. pDockQ is a sigmoidal fit to the combined metric IF_plDDT⋅log(IF_contacts) with DockQ as the target score, as described above. b–d Distribution of the top discriminating features: the average interface plDDT (b), the number of interface contacts (c), and the combination of these (IF_plDDT⋅log(IF_contacts)) together with the pDockQ (d), for interacting (non-grey) and non-interacting (grey) proteins.
Fig. 6. Comparison of different MSAs.
a Depiction of MSAs generated by AF2 and the paired version matched using organism information. Both the AF2 and paired representations are sections containing 10% of the sequences aligned in the original MSA. Concatenated chains are separated by a vertical line (magenta). The visualisations were made using Jalview version 2.11.1.4. b Docking visualisations for PDB ID 5D1M with the model/native chains A in blue/grey and B in green/magenta using the three different MSAs in (a). The DockQ scores are 0.01, 0.02 and 0.90 for AF2, paired, and AF2 + paired MSAs, respectively.
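The interface definition and combined metric described in the Fig. 2 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the logarithm base is not stated in the captions (base 10 is assumed here), and the four-parameter logistic is one common sigmoidal form whose parameters `L`, `x0`, `k` and `b` are placeholders that would need to be fitted against DockQ.

```python
import math


def interface_metrics(cb_a, cb_b, plddt_a, plddt_b, cutoff=8.0):
    """Return (IF_contacts, IF_residues, IF_plDDT) for a two-chain model.

    cb_a, cb_b: lists of (x, y, z) C-beta coordinates per chain, in Angstrom.
    plddt_a, plddt_b: per-residue plDDT values in the same residue order.
    """
    contacts = 0
    if_a, if_b = set(), set()
    for i, p in enumerate(cb_a):
        for j, q in enumerate(cb_b):
            # A contact is an inter-chain C-beta pair within the cutoff.
            if math.dist(p, q) <= cutoff:
                contacts += 1
                if_a.add(i)
                if_b.add(j)
    if_values = [plddt_a[i] for i in if_a] + [plddt_b[j] for j in if_b]
    if_plddt = sum(if_values) / len(if_values) if if_values else 0.0
    return contacts, len(if_a) + len(if_b), if_plddt


def combined_metric(if_plddt, contacts):
    """IF_plDDT * log(IF_contacts); log base 10 is an assumption."""
    return if_plddt * math.log10(contacts) if contacts > 0 else 0.0


def sigmoid(x, L, x0, k, b):
    """Four-parameter logistic curve (assumed form; parameters must be fitted)."""
    return L / (1.0 + math.exp(-k * (x - x0))) + b


# Toy coordinates and plDDT values, invented for illustration only.
cb_a = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
cb_b = [(5.0, 0.0, 0.0), (0.0, 6.0, 0.0), (50.0, 0.0, 0.0)]
contacts, residues, if_plddt = interface_metrics(
    cb_a, cb_b, [90.0, 50.0], [80.0, 70.0, 30.0]
)
x = combined_metric(if_plddt, contacts)
print(contacts, residues, if_plddt, x)
```

A pDockQ-style score would then be `sigmoid(x, L, x0, k, b)` with parameters obtained by fitting the curve to DockQ scores on a set of models with known native structures, as panel c of Fig. 2 describes.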
