Improve documentation
mcs07 committed Mar 7, 2015
1 parent ceccfb9 commit 9ceab6b
Showing 6 changed files with 173 additions and 39 deletions.
12 changes: 10 additions & 2 deletions docs/source/api.rst
@@ -16,13 +16,21 @@ Search functions
.. autofunction:: get_assays
.. autofunction:: get_properties

Compound, Atom and Bond
-----------------------
Compound
--------

.. autoclass:: pubchempy.Compound
   :members:

Atom
----

.. autoclass:: pubchempy.Atom
   :members:

Bond
----

.. autoclass:: pubchempy.Bond
   :members:

4 changes: 4 additions & 0 deletions docs/source/conf.py
@@ -240,4 +240,8 @@
'pandas': ('http://pandas.pydata.org/pandas-docs/stable/', None),
}

# Sort autodoc members by the order they appear in the source code
autodoc_member_order = 'bysource'

# Concatenate the class and __init__ docstrings together
autoclass_content = 'both'
51 changes: 37 additions & 14 deletions docs/source/guide/compound.rst
@@ -3,23 +3,46 @@
Compound
========

The ``get_compounds`` function returns a list of ``Compound`` objects. You can also instantiate a `Compound` object from
a CID::
The :func:`~pubchempy.get_compounds` function returns a list of :class:`~pubchempy.Compound` objects. You can also
instantiate a :class:`~pubchempy.Compound` object directly if you know its CID::

    c = pcp.Compound.from_cid(6819)

Each ``Compound`` has a ``record`` property, which is a dictionary that contains all the information about the
compound. All other properties are derived from this record.

Compounds with regular 2D coordinates have the following properties: cid, record, atoms, bonds, elements, synonyms,
sids, aids, coordinate_type, charge, molecular_formula, molecular_weight, canonical_smiles, isomeric_smiles, inchi,
inchikey, iupac_name, xlogp, exact_mass, monoisotopic_mass, tpsa, complexity, h_bond_donor_count, h_bond_acceptor_count,
rotatable_bond_count, fingerprint, heavy_atom_count, isotope_atom_count, atom_stereo_count, defined_atom_stereo_count,
undefined_atom_stereo_count, bond_stereo_count, defined_bond_stereo_count, undefined_bond_stereo_count,
covalent_unit_count.
Dictionary representation
-------------------------

Many of the above properties are missing from 3D records, however they do have the following additional properties:
volume_3d, multipoles_3d, conformer_rmsd_3d, effective_rotor_count_3d, pharmacophore_features_3d,
mmff94_partial_charges_3d, mmff94_energy_3d, conformer_id_3d, shape_selfoverlap_3d, feature_selfoverlap_3d,
shape_fingerprint_3d.
Each :class:`~pubchempy.Compound` has a ``record`` property, a dictionary that contains all the information about the
compound, taken directly from the JSON response of the PubChem API. All other properties are derived from this
record.

Additionally, each :class:`~pubchempy.Compound` provides a ``to_dict()`` method that returns PubChemPy's own dictionary
representation of the Compound data. As well as being more concisely formatted than the raw ``record``, this method
takes an optional ``properties`` parameter that restricts the output to the listed properties::


    >>> c = pcp.Compound.from_cid(962)
    >>> c.to_dict(properties=['atoms', 'bonds', 'inchi'])
    {'atoms': [{'aid': 1, 'element': 'o', 'x': 2.5369, 'y': -0.155},
               {'aid': 2, 'element': 'h', 'x': 3.0739, 'y': 0.155},
               {'aid': 3, 'element': 'h', 'x': 2, 'y': 0.155}],
     'bonds': [{'aid1': 1, 'aid2': 2, 'order': 'single'},
               {'aid1': 1, 'aid2': 3, 'order': 'single'}],
     'inchi': u'InChI=1S/H2O/h1H2'}
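
The filtering behaviour of ``to_dict()`` can be pictured with a small stand-alone sketch. This is not PubChemPy's
implementation: the ``Record`` class and its ``default_properties`` tuple are hypothetical names introduced only to
illustrate how an optional ``properties`` argument might select a subset of attributes.

```python
class Record(object):
    """Hypothetical stand-in for an object that exposes its data as attributes."""

    # Property names exported when no filter is given (an assumption of this sketch).
    default_properties = ('cid', 'inchi', 'molecular_formula')

    def __init__(self, cid, inchi, molecular_formula):
        self.cid = cid
        self.inchi = inchi
        self.molecular_formula = molecular_formula

    def to_dict(self, properties=None):
        """Return a dict of the requested properties, or of the defaults."""
        if properties is None:
            properties = self.default_properties
        # Look up each requested name as an attribute on the object.
        return {name: getattr(self, name) for name in properties}


water = Record(cid=962, inchi='InChI=1S/H2O/h1H2', molecular_formula='H2O')
subset = water.to_dict(properties=['inchi'])
```

Calling ``to_dict()`` with no arguments returns every default property, while passing a list returns only the named
keys.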

3D Compounds
------------

Many properties are missing from 3D records, and the following properties are *only* available on 3D records:

- ``volume_3d``
- ``multipoles_3d``
- ``conformer_rmsd_3d``
- ``effective_rotor_count_3d``
- ``pharmacophore_features_3d``
- ``mmff94_partial_charges_3d``
- ``mmff94_energy_3d``
- ``conformer_id_3d``
- ``shape_selfoverlap_3d``
- ``feature_selfoverlap_3d``
- ``shape_fingerprint_3d``
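
Because these properties exist only on 3D records, it can be useful to check which of them a given object actually
provides. The helper below is a rough sketch rather than part of PubChemPy: ``available_3d_properties`` and
``FakeCompound`` are hypothetical names, and the sketch assumes that an unavailable property is either absent or
``None``.

```python
# The 3D-only property names listed above.
PROPERTIES_3D = (
    'volume_3d',
    'multipoles_3d',
    'conformer_rmsd_3d',
    'effective_rotor_count_3d',
    'pharmacophore_features_3d',
    'mmff94_partial_charges_3d',
    'mmff94_energy_3d',
    'conformer_id_3d',
    'shape_selfoverlap_3d',
    'feature_selfoverlap_3d',
    'shape_fingerprint_3d',
)


def available_3d_properties(compound):
    """Return the 3D property names that are present and not None on ``compound``."""
    return [name for name in PROPERTIES_3D
            if getattr(compound, name, None) is not None]


class FakeCompound(object):
    """Hypothetical stand-in so the sketch runs without a network request."""
    volume_3d = 1.0                  # placeholder value, not real PubChem data
    conformer_id_3d = 'placeholder'  # placeholder value, not real PubChem data


found = available_3d_properties(FakeCompound())
```

With a real 3D :class:`~pubchempy.Compound` in place of ``FakeCompound``, the same helper would report whichever of
the listed properties that record carries, provided the assumption about missing values holds.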
30 changes: 29 additions & 1 deletion docs/source/guide/substance.rst
@@ -3,4 +3,32 @@
Substance
=========

TODO
The PubChem Substance database contains all chemical records deposited in PubChem in their most raw form, before any
significant processing is applied. As a result, it contains duplicates, mixtures, and some records that don't make
chemical sense. Substance records therefore contain fewer calculated properties, but they do carry additional
information about the original source that deposited the record.

The PubChem Compound database is constructed from the Substance database using a standardization and deduplication
process. Hence each Compound may be derived from a number of different Substances.

Retrieving substances
---------------------

Retrieve Substances using the :func:`~pubchempy.get_substances` function::

    >>> results = pcp.get_substances('Coumarin 343', 'name')
    >>> print(results)
    [Substance(24864499), Substance(85084977), Substance(126686397), Substance(143491255), Substance(152243230), Substance(162092514), Substance(162189467), Substance(186021999), Substance(206257050)]


You can also instantiate a Substance directly from its SID::

    >>> substance = pcp.Substance.from_sid(223766453)
    >>> print(substance.synonyms)
    ['2-(Acetyloxy)-benzoic acid', '2-(acetyloxy)benzoic acid', '2-acetoxy benzoic acid', '2-acetoxy-benzoic acid', '2-acetoxybenzoic acid', '2-acetyloxybenzoic acid', 'BSYNRYMUTXBXSQ-UHFFFAOYSA-N', 'acetoxybenzoic acid', 'acetyl salicylic acid', 'acetyl-salicylic acid', 'acetylsalicylic acid', 'aspirin', 'o-acetoxybenzoic acid']
    >>> print(substance.source_id)
    BSYNRYMUTXBXSQ-UHFFFAOYSA-N
    >>> print(substance.standardized_cid)
    2244
    >>> print(substance.standardized_compound)
    Compound(2244)
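
Since several Substances can standardize to the same Compound, one way to see the relationship is to group SIDs by
their ``standardized_cid``. The following is only a sketch: ``group_by_standardized_cid`` and ``FakeSubstance`` are
hypothetical names, the SIDs and CIDs are made-up placeholders, and it assumes that a Substance with no standardized
Compound reports ``None``.

```python
from collections import defaultdict


def group_by_standardized_cid(substances):
    """Map each standardized CID to the list of SIDs that standardize to it."""
    groups = defaultdict(list)
    for substance in substances:
        groups[substance.standardized_cid].append(substance.sid)
    return dict(groups)


class FakeSubstance(object):
    """Hypothetical stand-in exposing only ``sid`` and ``standardized_cid``."""

    def __init__(self, sid, standardized_cid):
        self.sid = sid
        self.standardized_cid = standardized_cid


# Placeholder identifiers, not real PubChem records.
demo = [FakeSubstance(1, 100), FakeSubstance(2, 100), FakeSubstance(3, None)]
grouped = group_by_standardized_cid(demo)
```

The same function could be applied to the list returned by :func:`~pubchempy.get_substances`, at the cost of the
requests needed to resolve each Substance.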