Introducing a Clustering Step in a Consensus Approach for the Scoring of Protein-Protein Docking Models

Edrisse Chermak; Renato de Donato; Marc Lensink; Andrea Petta; Luigi Serra; Vittorio Scarano; Luigi Cavallo; Romina Oliva

doi:10.1371/journal.pone.0166460

Journal article, PLoS ONE, Year: 2016


Edrisse Chermak

Role: Author

King Abdullah University of Science and Technology [Saudi Arabia]

Renato de Donato

Role: Author

King Abdullah University of Science and Technology [Saudi Arabia]

Marc Lensink

Role: Author
PersonId: 180132
IdHAL: marc-lensink
ORCID: 0000-0003-3957-9470
IdRef: 223604917

Unité de Glycobiologie Structurale et Fonctionnelle - UMR 8576

Andrea Petta

Role: Author

Università degli Studi di Salerno = University of Salerno

Luigi Serra

Role: Author

Università degli Studi di Salerno = University of Salerno

Vittorio Scarano

Role: Author

Università degli Studi di Salerno = University of Salerno

Luigi Cavallo

Role: Author
PersonId: 1244103
ORCID: 0000-0002-1398-338X

King Abdullah University of Science and Technology [Saudi Arabia]

Romina Oliva

Role: Author

Università degli Studi di Napoli “Parthenope” = University of Naples

Abstract

Correctly scoring protein-protein docking models to single out native-like ones is an open challenge. It is also an object of assessment in CAPRI (Critical Assessment of PRedicted Interactions), the community-wide blind docking experiment. We previously introduced the first pure consensus method in the field, CONSRANK, which ranks models by their ability to match the most conserved contacts in the ensemble they belong to. In CAPRI, scorers are asked to evaluate a set of available models and select the top ten, using their own scoring approach. Scorers' performance is ranked by the number of targets/interfaces for which they provide at least one correct solution. By this measure, blind testing in CAPRI Round 30 (a joint prediction round with CASP11) showed that the critical cases for CONSRANK are targets with multiple interfaces and targets for which only a very small number of correct solutions are available. To address these challenging cases, CONSRANK has now been modified to include a contact-based clustering of the models as a preliminary step of the scoring process. We used an agglomerative hierarchical clustering based on the number of inter-residue contacts that models have in common. Two criteria, each with different thresholds, were explored for cluster generation: fixing either the number of common contacts or the total number of clusters. For each clustering approach, the ten most populated clusters were selected, CONSRANK was run on each of them, and the top-ranked model of each cluster was retained, up to a limit of ten models per target. We applied the modified scoring approach, Clust-CONSRANK, to SCORE_SET, a set of CAPRI scoring models recently made available by the CAPRI assessors, and to the subset of homodimeric targets in CAPRI Round 30 for which CONSRANK failed to include a correct solution among the ten selected models.
Results show that, for the challenging cases, the clustering step typically enriches the ten top-ranked models in native-like solutions. The best-performing clustering approaches we tested more than double the number of cases for which at least one correct solution is included among the ten top-ranked models.
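The cluster-then-rank procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Clust-CONSRANK implementation, and several details are assumptions made only for illustration: each model is reduced to a set of inter-residue contacts, the consensus score is taken to be the mean ensemble frequency of a model's contacts, clusters are merged by complete linkage on the number of shared contacts, and the names `consensus_scores`, `agglomerate`, `select_models`, `min_common` and `n_clusters` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def consensus_scores(models):
    """Score each model by the mean ensemble frequency of its contacts (assumed formula)."""
    n = len(models)
    freq = Counter(c for contacts in models.values() for c in contacts)
    return {
        name: sum(freq[c] for c in contacts) / (n * len(contacts)) if contacts else 0.0
        for name, contacts in models.items()
    }


def cluster_similarity(a, b, models):
    """Complete linkage (an assumption): fewest shared contacts over all cross-cluster pairs."""
    return min(len(models[x] & models[y]) for x in a for y in b)


def agglomerate(models, min_common=None, n_clusters=None):
    """Merge the two most similar clusters until one of the stopping criteria is met."""
    if min_common is None and n_clusters is None:
        raise ValueError("set min_common or n_clusters")
    clusters = [[name] for name in models]
    while len(clusters) > 1:
        if n_clusters is not None and len(clusters) <= n_clusters:
            break  # criterion 2: target number of clusters reached
        (i, j), best = max(
            (((i, j), cluster_similarity(clusters[i], clusters[j], models))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if min_common is not None and best < min_common:
            break  # criterion 1: no pair of clusters shares enough contacts
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def select_models(models, top=10, **criterion):
    """Keep the `top` most populated clusters and return the best-scoring model of each."""
    clusters = sorted(agglomerate(models, **criterion), key=len, reverse=True)
    selected = []
    for members in clusters[:top]:
        scores = consensus_scores({m: models[m] for m in members})
        selected.append(max(members, key=lambda m: scores[m]))
    return selected


# Toy ensemble: contacts are (receptor residue, ligand residue) pairs, invented for the example.
models = {
    "m1": {("A10", "B5"), ("A11", "B5"), ("A12", "B7")},
    "m2": {("A10", "B5"), ("A11", "B5"), ("A13", "B8"), ("A14", "B9")},
    "m3": {("A10", "B5"), ("A11", "B5"), ("A12", "B7"), ("A13", "B8")},
    "m4": {("A40", "B60"), ("A41", "B61")},
    "m5": {("A40", "B60"), ("A41", "B61"), ("A42", "B62")},
    "m6": {("A70", "B90")},
}
selected = select_models(models, top=10, min_common=2)
print(selected)
```

Ranking within each cluster rather than over the whole ensemble is what lets a sparsely populated interface contribute a candidate: its models are no longer compared against contacts that are conserved only in a larger, unrelated group of models.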

Keywords

Protein Structure, Secondary; Consensus; Protein Interaction Mapping; Proteins; Algorithms; Databases, Protein; Protein Binding; Molecular Docking Simulation; Software; Protein Interaction Domains and Motifs; Binding Sites; Research Design; Cluster Analysis

Domains

Structural biology [q-bio.BM]

Main file

pone.0166460.pdf (3.04 MB)

Origin: Publisher files allowed on an open archive

Contributor: LillOA Université de Lille

https://hal.univ-lille.fr/hal-03172949

Submitted on: Thursday, March 18, 2021, 10:11:50

Last modified on: Friday, April 19, 2024, 09:58:04

Long-term archiving on: Monday, June 21, 2021, 08:56:04

Dates and versions

hal-03172949, version 1 (18-03-2021)

License

Attribution

Identifiers

HAL Id: hal-03172949, version 1
DOI: 10.1371/journal.pone.0166460
PUBMED: 27846259

Cite

Edrisse Chermak, Renato de Donato, Marc Lensink, Andrea Petta, Luigi Serra, et al. Introducing a Clustering Step in a Consensus Approach for the Scoring of Protein-Protein Docking Models. PLoS ONE, 2016, 11 (11), e0166460. ⟨10.1371/journal.pone.0166460⟩. ⟨hal-03172949⟩


Collections

CNRS UNIV-LILLE
