Small molecule databases like CAS, Beilstein, PubChem, KEGG and many others are discussed here.
Title | Large-Scale Annotation of Small-Molecule Libraries Using Public Databases. Yingyao Zhou,* Bin Zhou, Kaisheng Chen, S. Frank Yan, Frederick J. King, Shumei Jiang, and Elizabeth A. Winzeler |
Source | J. Chem. Inf. Model.; 2007, ASAP |
DOI | http://dx.doi.org/10.1021/ci700092v |
Short Review |
Targeted at biomedical and chemical (drug) research, this article calculates the overlap between closed-source databases (CAS, WDI, IDDB3) and open chemical databases (PubChem, KEGG, MeSH). It presents a pipeline for annotating and merging compound data from more than 15 databases. This is a serious issue, and the CAS Advisory Committee needs to do something about it; otherwise CAS will lose its innovative momentum after 100 years. It is also clear what to do: "It would be preferable if all 26 million structures in the CAS system were readily accessible via PubChem or other informatic services, in order to prescan in-house compound collections quickly and identify those that have patent, reaction, and/or other literature data in the CAS database." CAS has developed an important database, but it has failed to innovate in recent years and has missed the digital revolution of the 21st century. Among the things required of 21st-century databases are web services, programming APIs, and new program interfaces or rich clients. That is also shown in the article. This explains some comments on why the CAS database is an important tool but not a platform. CAS should take the PubChem data (which is freely available) and merge it into the CAS database, and it should also provide CAS data to PubChem (as many other companies do) to allow links back from PubChem to the CAS database. "Since the structures in PubChem were not collected with any obviously biased filters, we hypothesize that the chemical space covered by PubChem and the 26 million structures collected by the CAS database to be similar." This is important because the Chemical Structure Lookup Service (CSLS) already contains 39 million compounds (29 million unique), and IBM is working on a chemical patent search engine for (Markush) structures from patent data. A link to the search service developed by the Genomics Institute of the Novartis Research Foundation is provided here: Batch Compound Annotation Service |
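The overlap calculation mentioned in the review can be illustrated with a minimal sketch. It assumes each database has already been reduced to a set of canonical structure keys (for example InChIKeys); the database names, the keys, and the helper functions `pairwise_overlap` and `coverage` are hypothetical and are not taken from the article, whose actual pipeline is not reproduced here.

```python
from itertools import combinations


def pairwise_overlap(databases):
    """Return {(name_a, name_b): shared_count} for every pair of databases."""
    return {
        (a, b): len(databases[a] & databases[b])
        for a, b in combinations(sorted(databases), 2)
    }


def coverage(query, reference):
    """Fraction of the query collection that also appears in the reference."""
    return len(query & reference) / len(query) if query else 0.0


# Hypothetical toy data: each set holds canonical structure keys.
databases = {
    "closed_db": {"K1", "K2", "K3", "K4"},
    "open_db": {"K3", "K4", "K5"},
    "in_house": {"K2", "K4", "K6"},
}

overlaps = pairwise_overlap(databases)
for (name_a, name_b), shared in overlaps.items():
    print(f"{name_a} / {name_b}: {shared} shared structures")
```

Calling `coverage(databases["in_house"], databases["closed_db"])` would then estimate how much of an in-house collection could be annotated from a given source, which is the kind of prescan the quoted passage asks for.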
Title | |
Source | |
DOI | doi:10.1016/j.jmr.2004.11.028 |
Short Review |
Title | |
Source | |
DOI | doi:10.1002/mrc.1517 |
Short Review |