Background: Docking makes it possible to predict ligand binding to proteins, provided that the 3D structure of the target is available. Several docking studies have been carried out to identify potential ligands for drug targets. Many of these studies yielded leads that were later developed into drugs.
Objective: Our goal here is to describe the development of an integrated computational tool to assess docking accuracy and build new scoring functions to predict ligand-binding affinity.
Method: We carried out docking simulations using the MVD program for a data set from the CSAR 2014 database (coagulation factor Xa) for which ligand-binding information and structures are available. These docking results were analyzed using SAnDReS, available at www.sandres.net. Machine learning methods were applied to build new scoring functions, and our results were compared with previously published benchmarks.
Results: Our integrated docking strategy generated poses with docking accuracy higher than previously published benchmarks. In addition, the new scoring function developed using SAnDReS shows better performance than well-established scoring functions such as those available in AutoDock, AutoDock Vina, GOLD, Glide, and MVD.
Conclusion: The large volume of data generated during docking has lacked an integrated computational tool for statistical analysis of the influence of structural parameters on docking and scoring function performance. Here we describe methods to evaluate docking results using SAnDReS, a computational environment for statistical analysis of docking results and development of scoring functions. We believe that SAnDReS is a computational tool with the potential to improve accuracy in docking projects.
Keywords: Dock, protein, target, drug, machine learning.