MongoChem PubChem Import

From wiki.openchemistry.org

Latest revision as of 13:13, 10 April 2013

Importing Data

PubChem data in SDF format can be imported with MongoChem's SDF importer. Under the File->Import menu, select SDF to bring up the import dialog, then navigate to the SDF file and click "Import". The data is loaded into the MongoDB database automatically.

The following SDF data fields are extracted and inserted into the database (SDF field -> database key):

  • PUBCHEM_IUPAC_TRADITIONAL_NAME -> name
  • PUBCHEM_IUPAC_INCHI -> inchi
  • PUBCHEM_IUPAC_INCHIKEY -> inchikey
  • PUBCHEM_MOLECULAR_WEIGHT -> mass, descriptors.mass
  • PUBCHEM_CACTVS_TPSA -> descriptors.tpsa
  • PUBCHEM_XLOGP3_AA -> descriptors.xlogp3
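
The mapping above can be illustrated with a short Python sketch. This is not MongoChem's importer code; it is a minimal approximation that assumes the standard SDF data item layout (a `> <TAG>` header line, one or more value lines, then a blank line). The names `FIELD_MAP`, `NUMERIC_TAGS`, `parse_data_fields`, `set_nested`, and `build_document` are hypothetical, and both the conversion of numeric values to floats and the reading of dotted keys such as `descriptors.tpsa` as nested sub-documents are assumptions.

```python
# Mapping from PubChem SDF tags to database keys, as listed above.
FIELD_MAP = {
    "PUBCHEM_IUPAC_TRADITIONAL_NAME": ["name"],
    "PUBCHEM_IUPAC_INCHI": ["inchi"],
    "PUBCHEM_IUPAC_INCHIKEY": ["inchikey"],
    "PUBCHEM_MOLECULAR_WEIGHT": ["mass", "descriptors.mass"],
    "PUBCHEM_CACTVS_TPSA": ["descriptors.tpsa"],
    "PUBCHEM_XLOGP3_AA": ["descriptors.xlogp3"],
}

# Assumption: these values are stored as numbers rather than strings.
NUMERIC_TAGS = {
    "PUBCHEM_MOLECULAR_WEIGHT",
    "PUBCHEM_CACTVS_TPSA",
    "PUBCHEM_XLOGP3_AA",
}


def parse_data_fields(record):
    """Collect the '> <TAG>' data items of one SDF record into a dict."""
    fields = {}
    lines = record.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(">") and "<" in line:
            tag = line[line.index("<") + 1 : line.rindex(">")]
            i += 1
            value_lines = []
            # The value runs until the next blank line (or end of record).
            while i < len(lines) and lines[i].strip():
                value_lines.append(lines[i])
                i += 1
            fields[tag] = "\n".join(value_lines)
        i += 1
    return fields


def set_nested(doc, dotted_key, value):
    """Store value under a dotted key, creating sub-documents as needed."""
    parts = dotted_key.split(".")
    target = doc
    for part in parts[:-1]:
        target = target.setdefault(part, {})
    target[parts[-1]] = value


def build_document(fields):
    """Apply FIELD_MAP to parsed SDF fields; absent tags are skipped."""
    doc = {}
    for tag, keys in FIELD_MAP.items():
        if tag not in fields:
            continue
        value = fields[tag]
        if tag in NUMERIC_TAGS:
            value = float(value)
        for key in keys:
            set_nested(doc, key, value)
    return doc
```

Calling `build_document(parse_data_fields(record))` on the text of a single record returns a plain dictionary that could be handed to a MongoDB driver; the actual document layout MongoChem uses may differ.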

The following fields will be calculated from the molecular structure:

  • formula
  • atomCount
  • heavyAtomCount
  • vabc
  • mass (if PUBCHEM_MOLECULAR_WEIGHT is not present)
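
As a rough illustration of what the structure-derived fields contain, the sketch below computes `formula`, `atomCount`, and `heavyAtomCount` from the atom block of a record. It does not reproduce how MongoChem calculates them. It assumes a V2000 molfile block in which hydrogens are listed explicitly, and it assumes the formula is written in Hill order; the function names are hypothetical. The `vabc` and `mass` fields are left out because they need per-element data that this page does not give.

```python
from collections import Counter


def parse_atom_symbols(molblock):
    """Return the element symbols from a V2000 atom block."""
    lines = molblock.splitlines()
    # Line 4 is the counts line; its first three columns hold the atom count.
    atom_count = int(lines[3][:3])
    # Each atom line is "x y z symbol ...", so the symbol is the 4th token.
    return [line.split()[3] for line in lines[4 : 4 + atom_count]]


def hill_formula(symbols):
    """Build a formula in Hill order (C, then H, then the rest A-Z)."""
    counts = Counter(symbols)
    if "C" in counts:
        order = ["C"]
        if "H" in counts:
            order.append("H")
        order += sorted(e for e in counts if e not in ("C", "H"))
    else:
        # Without carbon, all elements (including H) are alphabetical.
        order = sorted(counts)
    return "".join(
        e + (str(counts[e]) if counts[e] > 1 else "") for e in order
    )


def calculated_fields(molblock):
    """Compute a subset of the structure-derived fields."""
    symbols = parse_atom_symbols(molblock)
    return {
        "formula": hill_formula(symbols),
        "atomCount": len(symbols),
        "heavyAtomCount": sum(1 for s in symbols if s != "H"),
    }
```

If a file stores hydrogens implicitly, the counts from this sketch will be too low, which is one reason a real importer relies on a cheminformatics library rather than text parsing.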

Sample Data

A sample dataset containing the first 2,500 molecules in PubChem is available here: pubchem2500.sdf.gz.
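
To inspect a file such as the sample dataset outside of MongoChem, a short Python sketch like the following can split it into individual records. It relies only on the SDF convention that each record ends with a line containing `$$$$`; the function name `iter_sdf_records` is hypothetical, and the file path is assumed to be wherever the download was saved.

```python
import gzip


def iter_sdf_records(path):
    """Yield each record of a plain or gzip-compressed SDF file as text."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        record_lines = []
        for line in handle:
            if line.strip() == "$$$$":
                # "$$$$" terminates a record and is not part of it.
                yield "".join(record_lines)
                record_lines = []
            else:
                record_lines.append(line)
        # Any trailing text without a closing "$$$$" is ignored.
```

For example, `sum(1 for _ in iter_sdf_records("pubchem2500.sdf.gz"))` counts the records in the file without holding the whole file in memory.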