Preparing the ligands

Download the ligand structures from PubChem (http://pubchem.ncbi.nlm.nih.gov/) or the DUD (http://blaster.docking.org/dud/).

Next, we need to correct and optimize the ligands. The docking program needs ligands with the right molecular mechanics parameters and atom types, or the results will be distorted, if the docking job runs at all.

We use Schrodinger Glide here, but these steps are common to most ligand preparation workflows.

1.) If you have access to Glide (or any other docking program), bring the structures into your project from an SDF file. In fact, just get used to importing and exporting all your ligand files in SDF format.

2.) Once you have imported your files, run LigPrep or another ligand preparation utility. The basic idea is to generate clean 3D structures for all of your ligands.

3.) Generally, it will generate all of the tautomers; make sure that all of your ionizable groups are properly protonated.

4.) Most ligand prep programs will run a minimization step. This is a good idea: when you start with a clean ligand, you are less likely to end up with distorted aromatic rings, improper torsions, and other artifacts that will look obviously wrong to you after you’ve docked. You’ll have to redock, and you may not immediately realize that the fault lies in the ligand preparation step. The same goes for preparation of the receptor: if the program you’re using walks you through what looks like a long list of tedious receptor preparation steps, do it, at least the first time.

5.) After you have generated your ligands, check them with painstaking attention to detail. In particular, check the bond orders of nitrogens; they’re problematic in every ligand prep program I have used. Some programs will protonate aromatic nitrogens, for example the N of pyridine. You need to correct this before you go forward, and you may also have to manually correct the bond orders of connected atoms. Make sure the formal charges are correct as well. If the minimized structures still look “off”, re-minimize them. (A minimal open-source sketch of this cleanup pipeline follows.)
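If you don’t have a Glide license, the basic cleanup (read an SDF, add hydrogens, generate a 3D conformer, minimize) can be sketched with the open-source RDKit toolkit. This is only a minimal stand-in for LigPrep, not a replacement: the file names are placeholders, and it does not enumerate tautomers or protonation states.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Placeholder file names for the raw and prepared ligand sets.
supplier = Chem.SDMolSupplier("ligands.sdf")
writer = Chem.SDWriter("ligands_prepped.sdf")

for mol in supplier:
    if mol is None:                      # skip records RDKit could not parse
        continue
    mol = Chem.AddHs(mol)                # explicit hydrogens before embedding
    if AllChem.EmbedMolecule(mol, randomSeed=42) < 0:
        continue                         # 3D embedding failed for this ligand
    AllChem.MMFFOptimizeMolecule(mol)    # quick MMFF force-field minimization
    writer.write(mol)

writer.close()
```

After running something like this, the same visual checks apply: inspect nitrogen protonation, bond orders, and formal charges before docking.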

Requirements for Controls (Ligands and Decoys)

Ligand enrichment among top-ranking hits is a key metric of molecular docking. To avoid bias, decoys should resemble ligands physically, so that enrichment is not simply a separation of gross features, yet be chemically distinct from them, so that they are unlikely to be binders. We have assembled a Directory of Useful Decoys (DUD). Every ligand has 36 decoy molecules that are physically similar but topologically distinct. DUD is freely available online as a benchmarking set for docking at http://blaster.docking.org/dud/.

The success of a docking screen is evaluated by its capacity to enrich the small number of known active compounds in the top ranks from among a much greater number of decoy molecules in the database.

Virtual screening is benchmarked using two criteria:

(1) enrichment of annotated ligands among top-scoring docked molecules from a database of decoys, and (2) the geometric fidelity of the docked poses compared to those of the experimental structures.
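Criterion (2) is usually quantified as the heavy-atom RMSD between the docked pose and the experimental pose. A minimal sketch with RDKit, assuming both poses sit in the receptor’s coordinate frame (file names are hypothetical):

```python
from rdkit import Chem
from rdkit.Chem import rdMolAlign

# Hypothetical files: the crystallographic pose and a docked pose of the same ligand.
ref = Chem.MolFromMolFile("crystal_pose.mol")
probe = Chem.MolFromMolFile("docked_pose.mol")

# CalcRMS accounts for symmetry-equivalent atoms but does NOT re-align the
# poses, which is what we want: both are already in the receptor frame.
rmsd = rdMolAlign.CalcRMS(probe, ref)
print(f"Heavy-atom RMSD: {rmsd:.2f} A")   # <= 2.0 A is a common success cutoff
```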

The docking enrichment factor (EF) reflects the ability of the docking calculations to find true positives throughout the background database compared to random selection. This enrichment factor is calculated as

EF_subset = (ligands_selected / N_subset) / (ligands_total / N_total)
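In plain Python, with docking scores where lower is better (as in Glide or Vina), the calculation looks like this; the function name and arguments are our own:

```python
def enrichment_factor(scores, labels, fraction=0.10):
    """EF at a given fraction of the ranked database.

    scores: docking scores, lower = better
    labels: 1 for known actives, 0 for decoys
    """
    ranked = sorted(zip(scores, labels))          # best (lowest) scores first
    n_total = len(ranked)
    n_subset = max(1, round(fraction * n_total))  # size of the top slice
    ligands_selected = sum(lab for _, lab in ranked[:n_subset])
    ligands_total = sum(labels)
    return (ligands_selected / n_subset) / (ligands_total / n_total)
```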


Results on the DHFR test data

[Figure: enrichment plot for the DHFR test set, all methods]

Method     BEDROC (docking score)
GLIDE      0.6428304
VINA       0.4358998
AUTODOCK   0.09409243
DOCK       0.08583455
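For reference, RDKit ships a scoring module that implements BEDROC (Truchon and Bayly’s early-recognition metric). A sketch on toy data; the numbers here are made up and unrelated to the table above:

```python
from rdkit.ML.Scoring import Scoring

# Toy (score, is_active) pairs; lower docking score = better.
results = [(-11.2, 1), (-10.8, 0), (-10.5, 1), (-9.9, 0), (-9.1, 0),
           (-8.7, 1), (-8.2, 0), (-7.9, 0), (-7.4, 0), (-6.8, 0)]

ranked = sorted(results)                      # best-scoring molecules first
bedroc = Scoring.CalcBEDROC(ranked, 1, 20.0)  # column 1 holds the 0/1 label
print(f"BEDROC(alpha=20): {bedroc:.3f}")
```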

Enrichment Factor

The EF metric measures how many more actives we find within a defined “early recognition” fraction of the ordered list than expected from a random distribution.

An EF of 1 corresponds to random selection; the minimum value is 0. The maximum is not fixed: it depends on the cutoff and on the proportion of actives in the database, reaching N_total/ligands_total when every molecule in the selected subset is active.

Disadvantage of the EF metric:

It weights all actives within the cutoff equally, so it cannot distinguish a better ranking algorithm, where all the actives are ranked at the very beginning of the ordered list, from a worse one, where all the actives are ranked just before the cutoff (e.g., 10%), as the toy calculation below shows.
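Both hypothetical rankings below score EF = 10 at a 10% cutoff, even though ranking A is clearly better:

```python
# 100 compounds, 5 actives, cutoff at the top 10% of the list.
# Ranking A: actives at ranks 1-5 (ideal). Ranking B: actives at ranks 6-10,
# i.e. just inside the cutoff.
n, cutoff = 100, 10
ranking_a = [1] * 5 + [0] * 95
ranking_b = [0] * 5 + [1] * 5 + [0] * 90

for name, labels in (("A", ranking_a), ("B", ranking_b)):
    ef = (sum(labels[:cutoff]) / cutoff) / (sum(labels) / n)
    print(name, ef)   # both print 10.0 -- EF at 10% cannot tell them apart
```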

ROC

A receiver operating characteristic (ROC), or simply ROC curve, is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold is varied. It is created by plotting the fraction of true positives out of the total actual positives (TPR = true positive rate) vs. the fraction of false positives out of the total actual negatives (FPR = false positive rate), at various threshold settings.

TPR is also known as sensitivity or recall in machine learning. The FPR is also known as the fall-out and can be calculated as one minus the better-known specificity. The ROC curve is then the sensitivity as a function of fall-out. In general, if the probability distributions for both detection and false alarm are known, the ROC curve can be generated by plotting the cumulative distribution function (the area under the probability density from -inf up to the discrimination threshold) of the detection probability on the y-axis versus the cumulative distribution function of the false-alarm probability on the x-axis.
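In practice the curve and its area are rarely computed by hand; scikit-learn’s roc_curve and roc_auc_score do the bookkeeping. The arrays below are made-up examples; note that docking scores are negated because roc_curve expects higher values to indicate positives:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical labels (1 = active, 0 = decoy) and docking scores (lower = better).
y_true = np.array([1, 1, 0, 1, 0, 0, 0, 0])
scores = np.array([-10.2, -9.8, -9.5, -8.9, -8.1, -7.7, -7.0, -6.2])

fpr, tpr, thresholds = roc_curve(y_true, -scores)   # negate: higher = more active
auc = roc_auc_score(y_true, -scores)
print(f"ROC AUC: {auc:.3f}")
```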

Presentation of Results

Many metrics are currently used to evaluate the performance of ranking methods in virtual screening (VS): for instance, the area under the receiver operating characteristic (ROC) curve, the area under the accumulation curve (AUAC), the average rank of actives, the enrichment factor (EF), and the robust initial enhancement (RIE) proposed by Sheridan et al.


The area under the receiver operating characteristic curve is used by influential groups to measure VS performance, in part because it possesses desirable statistical behaviours.


Area under the Accumulation Curve: Accumulation curves are widely used to display ranking performance. The corresponding area under the curve, the AUAC, is used less often, partly because it is believed to depend strongly on the ratio of actives in the set.


Area under the Receiver Operating Characteristic curve:

The ROC metric is widely used across many disciplines. It has its roots in signal detection theory and has been widely applied in the medical community to evaluate the discriminatory power of tests classifying patients as normal or abnormal.