CSARdock.org - Home

Welcome to CSAR -- A Resource for Docking and Scoring Development

2014 Benchmark Exercise - click here for information

2013 Benchmark Exercise - click here

2012 Datasets – Full Release click here

Computational chemists need reliable experimental data. The Community Structure-Activity Resource (CSAR) provides experimental datasets of crystal structures and binding affinities for diverse protein-ligand complexes. Some datasets will be generated in-house at Michigan, while others will be collected from the literature or deposited by academic labs, national centers, and the pharmaceutical industry.

We aim to provide the highest quality data for a diverse collection of proteins and small molecule ligands. We need input from the community in developing our target priorities. Ideal targets will have many high-quality crystal structures (apo and 10-20 bound to diverse ligands) and affinity data for ≥25 compounds that range in size, scaffold, and logP. It is best if the ligand set has several congeneric series that span a broad range of affinity, with low nanomolar to mid-micromolar being most desirable. We prefer Kd data over Ki data over IC50 data (no % activity data). We will determine solubility, pKa, logP/logD data for the ligands whenever possible. We have augmented some donated IC50 data by determining Kon/Koff and ITC data.
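As a rough illustration of the target criteria described above, the following Python sketch screens a candidate target against them. The `CandidateTarget` record, its field names, and the helper functions are hypothetical and not part of any CSAR tool; the thresholds (an apo structure, at least 10 bound structures, at least 25 compounds with affinity data) are one reading of the text, and the sketch does not attempt to check ligand diversity or affinity range.

```python
from dataclasses import dataclass, field

# Preference order stated in the text: Kd over Ki over IC50.
# Percent-activity data is not accepted, so it is absent from this list.
AFFINITY_PREFERENCE = ["Kd", "Ki", "IC50"]


@dataclass
class CandidateTarget:
    """Hypothetical summary of a proposed target (illustrative only)."""
    name: str
    has_apo_structure: bool
    n_bound_structures: int
    n_compounds_with_affinity: int
    affinity_types: list = field(default_factory=list)


def best_affinity_type(types):
    """Return the most preferred accepted affinity type, or None."""
    for preferred in AFFINITY_PREFERENCE:
        if preferred in types:
            return preferred
    return None


def meets_ideal_criteria(target):
    """Check the countable criteria; 10 is assumed as the lower bound of '10-20 bound'."""
    return (
        target.has_apo_structure
        and target.n_bound_structures >= 10
        and target.n_compounds_with_affinity >= 25
        and best_affinity_type(target.affinity_types) is not None
    )


example = CandidateTarget(
    name="example_target",  # made-up name, not a CSAR dataset
    has_apo_structure=True,
    n_bound_structures=12,
    n_compounds_with_affinity=30,
    affinity_types=["IC50", "Ki"],
)
print(meets_ideal_criteria(example), best_affinity_type(example.affinity_types))
```

A screen like this only captures the numeric thresholds; whether the ligands span several congeneric series and a broad affinity range would still need expert review.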

CSAR is funded by a U01 grant from the National Institute of General Medical Sciences. The original RFA can be found at http://grants.nih.gov/grants/guide/rfa-files/RFA-GM-08-008.html. Press releases about CSAR can be found at:

Why should my company donate proprietary data to the public domain?

Computational techniques are very successful at enriching hit rates when identifying sets of compounds for experimental testing. However, it is not yet possible to reliably rank nanomolar-level compounds over those with micromolar affinities. First, by donating data, a company outsources the development of better tools. Pharma has the data, but not the time, to develop improved tools. Second, you have nothing to lose because we are asking for “old” data. Abandoned projects have the kind of data we need, and some could be donated without compromising a company’s competitive advantage on current projects. Third, participation in CSAR can provide visibility in the field. In particular, the donated data could be used to conduct a community-wide blind evaluation of docking and scoring methods. Lastly, there may be a financial benefit. Data has value, and it might be possible for the company to declare a charitable donation (of course, this requires consultation with the company’s legal and accounting teams).

Our first dataset has been contributed by Abbott (urokinase), and we have reached a legal agreement with GSK to obtain data. We are working with scientists at BMS, Vertex, Pfizer, Merck, Genentech, and Eli Lilly to identify possible depositions. For the community to improve its approaches, we need exceptional datasets to train scoring functions and develop new docking algorithms. That is the goal of the CSAR project.



Phase 1 Answer Key

01_FXA_gtc101 Answer: 146
02_FXA_gtc398 Answer: 190
03_FXA_gtc401 Answer: 37

04_SYK_gtc224 Answer: 116
05_SYK_gtc225 Answer: 51
06_SYK_gtc233 Answer: 71
07_SYK_gtc249 Answer: 48
08_SYK_gtc250 Answer: 141

09_TRMD_gtc445 Answer: 126
10_TRMD_gtc446 Answer: 76
11_TRMD_gtc447 Answer: 18
12_TRMD_gtc448 Answer: 3
13_TRMD_gtc451 Answer: 5
14_TRMD_gtc452 Answer: 79
15_TRMD_gtc453 Answer: 32
16_TRMD_gtc456 Answer: 185
17_TRMD_gtc457 Answer: 88
18_TRMD_gtc458 Answer: 116
19_TRMD_gtc459 Answer: 195
20_TRMD_gtc460 Answer: 27
21_TRMD_gtc464 Answer: 118
22_TRMD_gtc465 Answer: 154
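For participants who want to compare their Phase 1 submissions against the key above programmatically, a minimal parsing sketch follows. It assumes each entry has the form `NN_TARGET_ligand Answer: N` as listed, and tolerates a missing space after the colon; the function name `parse_answer_key` is hypothetical, and the integer is stored as given without interpreting what it refers to.

```python
import re
from collections import defaultdict

# One entry per line: index, target name, ligand id, then the integer answer.
LINE_RE = re.compile(r"^(\d+)_([A-Za-z0-9]+)_(\S+)\s+Answer:\s*(\d+)$")


def parse_answer_key(text):
    """Group answers by target: {target: {ligand_id: answer}}."""
    by_target = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything that is not an entry
        _index, target, ligand, answer = match.groups()
        by_target[target][ligand] = int(answer)
    return dict(by_target)


sample = """01_FXA_gtc101 Answer: 146

04_SYK_gtc224 Answer: 116
10_TRMD_gtc446 Answer:76
"""
key = parse_answer_key(sample)
print(key)
```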

The CSAR 2014 Benchmark Exercise is in progress. Phase 2 (docking and scoring) has begun and will end on 11 July 2014. Please go to the 2014 Exercise area to download the data for Phase 2.

The CSAR 2014 Benchmark Exercise has begun. Go to the 2014 Benchmark page to download the files. Phase 1 will end on Friday, May 2nd. We have also added 123 new PDB entries to the HiQ set. To get the set, go to “Download Datasets” in the Datasets block in the left panel, or go to “Browse Datasets” to see what it contains.

We have updated the HiQ set with 123 new structures deposited in the PDB from 2009 to 2011. These have been set up in a similar fashion to the original HiQ set. See the Download Datasets area.

A special issue of JCIM devoted to the 2011-2012 CSAR Benchmark Exercise is due out shortly.