The molecular formula space describes the number of molecular formulas (elemental compositions) in a given mass range. The molecular isomer space can be derived from the molecular formula space. The metabolome (the set of molecules derived from living organisms) is only a subset of the molecular isomer space.
One of the accomplishments of the Seven Golden Rules is the calculation of the approximate number of molecular formulas which may exist with a high probability based on existing molecular formulas (PubChem, Wiley, DNP, NIST, Beilstein) and heuristic rules.
For the elements C, H, N, S, O and P, around 597 million most probable elemental compositions exist up to 2000 Da. Keep in mind that each molecular formula can expand to billions of isomers. As an example, the enhanced brute force formula calculator HR2 can calculate the number of highly probable formulas (30 million, <1000 Da) in 22 seconds on an AMD Opteron 254. You can download the brute force formula generator HR2 as part of the Seven Golden Rules ZIP package.
|Formula Space||Number of formulas (CHNSOP)|
|CHNSOP ALL formulas < 2000 Da||8,000,000,000|
|CHNSOP 7GR formulas < 2000 Da||600,000,000|
|Natural, Drugs, Toxicants||50,000|
|Natural compound formulae||30,000|
The valence values assumed for the elements are vN=3, vS=2,4,6 and vP=3,5. In organic compounds nitrogen usually has a maximum valence of 3, although halogen-nitrogen compounds with vN=5 exist. For sulfur (thiols, sulfoxides and sulfones) and phosphorus (phosphines and phosphonates), it is important to note that the higher valences usually occur only in S-O or S-halogen compounds. The numbers presented here are therefore approximations. If somebody comes up with eight platinum rules or nine rhodium rules, the numbers might change slightly :-)
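The role of the assumed valences can be illustrated with a small valence check. The following is a minimal sketch, not code from HR2 or the Seven Golden Rules package: it applies one common statement of Senior's theorem (the valence sum is even, is at least twice the maximum valence, and is at least twice the number of atoms minus one). The function name `passes_senior` and the default valence table are assumptions made for illustration; the table uses the lowest valence listed above for S and P, and a different state can be passed in.

```python
from typing import Dict, Mapping

# Assumed default valences (hypothetical table for this sketch).
# S and P use their lowest listed valence; override to test vS=4,6 or vP=5.
DEFAULT_VALENCE: Dict[str, int] = {"C": 4, "H": 1, "N": 3, "O": 2, "P": 3, "S": 2}


def passes_senior(counts: Mapping[str, int],
                  valence: Mapping[str, int] = DEFAULT_VALENCE) -> bool:
    """Return True if the composition satisfies the three Senior conditions."""
    atoms = sum(counts.values())
    if atoms == 0:
        return False  # an empty formula is not a molecule

    total_valence = sum(valence[el] * n for el, n in counts.items())
    max_valence = max(valence[el] for el, n in counts.items() if n > 0)

    even_sum = total_valence % 2 == 0                 # condition (i)
    enough_for_max = total_valence >= 2 * max_valence  # condition (ii)
    connected = total_valence >= 2 * (atoms - 1)       # condition (iii)
    return even_sum and enough_for_max and connected


if __name__ == "__main__":
    print(passes_senior({"C": 1, "H": 4}))  # methane
    print(passes_senior({"C": 1, "H": 5}))  # odd valence sum
    # DMSO, evaluated with sulfur in valence state 4 instead of the default 2
    print(passes_senior({"C": 2, "H": 6, "O": 1, "S": 1},
                        {**DEFAULT_VALENCE, "S": 4}))
```

Because the check depends on which valence state is chosen for sulfur and phosphorus, the count of accepted formulas shifts with that choice, which is one reason the totals on this page are approximations.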
Certain compounds are not covered by the Seven Golden Rules. These include salt forms of compounds (with a broken molecular formula or ionic bonds) and odd-electron compounds such as known nitroso compounds. The number of halogen-containing compounds is also hard to approximate, because halogens are very reactive and many exceptions would occur.
The collection of most probable elemental compositions (elemental formulas) in the ranges up to 500 Da, 1000 Da, 2000 Da and 3000 Da can easily be obtained with HR2. The file with all molecular formulae up to 500 Da is around 33 MByte in size. The calculation takes 5 seconds on an AMD Opteron 2.8 GHz with a QSOFT Ramdisk Enterprise. Generating all 29 million molecular formulas up to 1000 Da (elements CHNSOP) produces 1.5 GByte of data in 275 seconds. You can download HR2 in the software section.
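To make the idea of brute force formula generation concrete, here is a minimal sketch of an exhaustive enumerator for CHNSOP compositions below a mass limit. It is not the HR2 algorithm and applies none of the Seven Golden Rules, so it corresponds to the unfiltered formula space rather than the most probable formulas. The function name `enumerate_formulas` is hypothetical, and the monoisotopic masses are rounded standard values.

```python
from typing import Dict, Iterator, Tuple

# Monoisotopic masses in Da (rounded standard values).
MONOISOTOPIC_MASS: Dict[str, float] = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376163,
    "S": 31.97207100,
}
ELEMENTS: Tuple[str, ...] = ("C", "H", "N", "O", "P", "S")


def enumerate_formulas(max_mass: float) -> Iterator[Tuple[Dict[str, int], float]]:
    """Yield (element counts, monoisotopic mass) for every non-empty
    CHNOPS composition whose mass does not exceed max_mass.
    No chemical plausibility filter is applied."""

    def recurse(index: int, mass: float, counts: Tuple[int, ...]):
        if index == len(ELEMENTS):
            if any(counts):  # skip the empty composition
                yield dict(zip(ELEMENTS, counts)), mass
            return
        atom_mass = MONOISOTOPIC_MASS[ELEMENTS[index]]
        n = 0
        # Try 0, 1, 2, ... atoms of this element while staying under the limit.
        while mass + n * atom_mass <= max_mass:
            yield from recurse(index + 1, mass + n * atom_mass, counts + (n,))
            n += 1

    yield from recurse(0, 0.0, ())


if __name__ == "__main__":
    # A small limit keeps the pure-Python run short.
    total = sum(1 for _ in enumerate_formulas(100.0))
    print(f"unfiltered CHNOPS compositions up to 100 Da: {total}")
```

The number of compositions grows steeply with the mass limit, which is why a generator is used here instead of building a list, and why the output files described above reach the GByte range at 1000 Da.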
|Elements||Number of molecular formulas (<2000 Da, Seven Golden Rules)|
All molecular formulas for CHO up to 1000 Da and C100 (not checked by the Seven Golden Rules, only by the Senior and Lewis rules) are available in a zipped TXT file. [Free download as ZIP]