MongoChem PubChem Import
PubChem data in SDF format can be imported with MongoChem's SDF importer. From the File->Import menu, select SDF to open the import dialog, then navigate to the SDF file and click "Import". The data is then loaded into the MongoDB database automatically.
The following SDF data fields are extracted and inserted into the database under the keys shown:
- PUBCHEM_IUPAC_TRADITIONAL_NAME -> name
- PUBCHEM_IUPAC_INCHI -> inchi
- PUBCHEM_IUPAC_INCHIKEY -> inchikey
- PUBCHEM_MOLECULAR_WEIGHT -> mass, descriptors.mass
- PUBCHEM_CACTVS_TPSA -> descriptors.tpsa
- PUBCHEM_XLOGP3_AA -> descriptors.xlogp3
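The mapping above can be illustrated with a minimal Python sketch. This is not MongoChem's importer code; the function names (parse_data_fields, build_document, set_nested) are hypothetical, and storing the numeric fields as floats is an assumption. It only shows how "> <NAME>" data items in one SDF record might be turned into a document with the keys listed above, treating a dotted key such as descriptors.tpsa as a nested sub-document.

```python
# Field mapping taken from the list above; everything else is illustrative.
FIELD_MAP = {
    "PUBCHEM_IUPAC_TRADITIONAL_NAME": ["name"],
    "PUBCHEM_IUPAC_INCHI": ["inchi"],
    "PUBCHEM_IUPAC_INCHIKEY": ["inchikey"],
    "PUBCHEM_MOLECULAR_WEIGHT": ["mass", "descriptors.mass"],
    "PUBCHEM_CACTVS_TPSA": ["descriptors.tpsa"],
    "PUBCHEM_XLOGP3_AA": ["descriptors.xlogp3"],
}

# Assumption: numeric fields are stored as floats rather than strings.
NUMERIC_FIELDS = {
    "PUBCHEM_MOLECULAR_WEIGHT",
    "PUBCHEM_CACTVS_TPSA",
    "PUBCHEM_XLOGP3_AA",
}


def parse_data_fields(record):
    """Return {field name: value} for the '> <NAME>' data items of one SDF record."""
    fields = {}
    lines = record.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(">") and "<" in line:
            name = line[line.index("<") + 1:line.rindex(">")]
            i += 1
            value_lines = []
            # A data item's value runs until the next blank line (or record end).
            while i < len(lines) and lines[i].strip() and lines[i] != "$$$$":
                value_lines.append(lines[i])
                i += 1
            fields[name] = "\n".join(value_lines)
        i += 1
    return fields


def set_nested(doc, dotted_key, value):
    """Store value under a dotted key, creating sub-documents as needed."""
    parts = dotted_key.split(".")
    target = doc
    for part in parts[:-1]:
        target = target.setdefault(part, {})
    target[parts[-1]] = value


def build_document(fields):
    """Apply FIELD_MAP to the parsed SDF fields; absent fields are skipped."""
    doc = {}
    for sdf_name, keys in FIELD_MAP.items():
        if sdf_name not in fields:
            continue
        value = fields[sdf_name]
        if sdf_name in NUMERIC_FIELDS:
            value = float(value)
        for key in keys:
            set_nested(doc, key, value)
    return doc
```

Calling build_document(parse_data_fields(record)) on a single record's text yields a plain dict; fields missing from the record simply do not appear in the result.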
The following fields will be calculated from the molecular structure:
- mass (if PUBCHEM_MOLECULAR_WEIGHT is not present)
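The fallback described above can be sketched as follows. How MongoChem actually computes the mass is not stated here, so this is only one plausible approach: sum standard atomic weights over the atom block of a V2000 molfile. It assumes hydrogens are explicit in the structure and uses a deliberately partial weight table; ATOMIC_WEIGHTS, mass_from_molblock, and resolve_mass are hypothetical names.

```python
# Partial table of standard atomic weights (g/mol); extend as needed.
ATOMIC_WEIGHTS = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
    "F": 18.998, "P": 30.974, "S": 32.06, "Cl": 35.45,
    "Br": 79.904, "I": 126.904,
}


def mass_from_molblock(molblock):
    """Sum atomic weights over the atom block of an unstripped V2000 molfile.

    Assumes all hydrogens are listed explicitly as atoms.
    """
    lines = molblock.splitlines()
    # Line 4 is the counts line; its first three columns hold the atom count.
    atom_count = int(lines[3][0:3])
    total = 0.0
    for atom_line in lines[4:4 + atom_count]:
        # The element symbol occupies columns 32-34 of each atom line.
        symbol = atom_line[31:34].strip()
        total += ATOMIC_WEIGHTS[symbol]  # KeyError for elements not in the table
    return total


def resolve_mass(fields, molblock):
    """Prefer PUBCHEM_MOLECULAR_WEIGHT; otherwise compute from the structure."""
    if "PUBCHEM_MOLECULAR_WEIGHT" in fields:
        return float(fields["PUBCHEM_MOLECULAR_WEIGHT"])
    return mass_from_molblock(molblock)
```

A value computed this way may differ slightly from the weight PubChem reports, since it depends on the atomic weight table used.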
A sample dataset containing the first 2,500 molecules in PubChem is available here: pubchem2500.sdf.gz.