Releases: opentargets/OnToma
v1.1.2: Changed pandas and Python versions
build: fix dependencies versions
What's Changed
- build: fix dependencies versions by @ireneisdoomed in #25
Full Changelog: v1.1.0...v1.1.1
v1.1.0: Version aware EFO caching
The primary change is the EFO version-aware cache update by @DSuveges in #24. It fixes a behaviour where the OnToma cache was not updated even when the requested EFO version changed.
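The idea of a version-aware cache can be sketched as follows. This is a minimal illustration and not OnToma's actual implementation: the function names, the one-file-per-version layout, and the `loader` callable are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def cache_path(cache_dir: Path, efo_version: str) -> Path:
    """Hypothetical layout: one cache file per requested EFO version."""
    return cache_dir / f"efo_terms_{efo_version}.json"


def load_terms(cache_dir: Path, efo_version: str, loader) -> dict:
    """Return cached terms for this version, calling `loader` only on a cache miss."""
    path = cache_path(cache_dir, efo_version)
    if path.exists():
        return json.loads(path.read_text())
    terms = loader(efo_version)  # stands in for downloading and parsing the OWL file
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(terms))
    return terms


if __name__ == "__main__":
    calls = []

    def fake_loader(version):
        calls.append(version)
        return {"version": version}

    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp)
        load_terms(cache, "v3.31.0", fake_loader)        # miss: loader runs
        load_terms(cache, "v3.31.0", fake_loader)        # hit: loader is skipped
        load_terms(cache, "other-version", fake_loader)  # different version: loader runs again
    print(calls)  # ['v3.31.0', 'other-version']
```

Because the version is part of the cache key, changing the requested version can never return data cached for a different one.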
In addition, OnToma no longer relies on the retry library (and consequently on the legacy py library), so using it will no longer generate security advisories.
Full Changelog: v1.0.3...v1.1.0
v1.0.3: Minor technical improvements
What's Changed
- Minor improvements from issue 1808 by @tskir in #23
- Add --version flag to print version and exit
- Sort URI_MAPPING and add entry for CHEBI
- Expand documentation on local installation and testing
- Update tests to reflect changes in EFO
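For readers unfamiliar with how such a flag is usually wired up, here is a minimal sketch using the standard library's argparse. It is not taken from the OnToma code base; the program name and version string are placeholders.

```python
import argparse

__version__ = "0.0.0"  # placeholder, not a real OnToma version


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="example-tool")  # placeholder program name
    # The built-in "version" action prints the version string and exits with status 0.
    parser.add_argument(
        "--version", action="version", version=f"%(prog)s {__version__}"
    )
    return parser


if __name__ == "__main__":
    build_parser().parse_args()
```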
Full Changelog: v1.0.2...v1.0.3
v1.0.2
Recently, the results of all manual curation efforts within Open Targets were consolidated into a single repository. As OnToma uses these files, the corresponding references needed to be updated too.
What's Changed
- Update URLs after changes in curation repo (PR #2) by @ireneisdoomed in #22
Full Changelog: v1.0.1...v1.0.2
v1.0.1: Unpin EFO version
Previously, the EFO version was pinned to v3.31.0 because later releases were missing the efo_otar_slim.owl file, which is essential for OnToma operation: EBISPOT/efo#1180. This is now resolved, so the latest available EFO version will again be used from now on.
Full Changelog: v1.0.0...v1.0.1
v1.0.0: OnToma rewrite
OnToma has been rewritten with a focus on simplicity and mapping reliability. As a new major version, this release introduces some breaking changes to the CLI and Python interfaces, as well as major updates to the processing logic. Most importantly, the mapping results can be expected to change a lot.
Please read these release notes carefully before you consider upgrading. Bug reports and feedback on this release are especially highly appreciated. Please direct them to [email protected].
Mapping approach changes
OnToma has two operation modes, which are now clearly separated based on input type. For ontology input (e.g. OMIM:102900), OnToma attempts the following steps to map to EFO:
1. Exact identifier match from EFO;
2. Match terms by cross-references (hasDbXref);
3. Mapping from the manual cross-reference database;
4. Request through OxO with a distance of 2.
For string input (e.g. asthma), the following steps are attempted:
5. Exact name match from EFO;
6. Exact synonym (hasExactSynonym);
7. Mapping from the manual string-to-ontology database;
8. High confidence mapping from ZOOMA with default parameters.
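The step-by-step fallback described above can be sketched as a simple cascade. This is an illustration under assumptions, not OnToma's code: it assumes that steps are tried in order and that the first step yielding any hits wins, and it replaces the real data sources with toy dictionaries. All names below are hypothetical.

```python
from typing import Callable, Dict, List

# Toy lookup tables standing in for the real data sources. Only the
# asthma / EFO_0000270 pair appears in these release notes; the rest is made up.
EFO_NAMES: Dict[str, List[str]] = {"asthma": ["EFO_0000270"]}
EFO_EXACT_SYNONYMS: Dict[str, List[str]] = {
    "toy synonym": ["TOY_0000001", "TOY_0000002"]
}
MANUAL_STRING_MAP: Dict[str, List[str]] = {}

Step = Callable[[str], List[str]]

STRING_STEPS: List[Step] = [
    lambda q: EFO_NAMES.get(q, []),           # exact name match
    lambda q: EFO_EXACT_SYNONYMS.get(q, []),  # exact synonym match
    lambda q: MANUAL_STRING_MAP.get(q, []),   # manually curated mapping
    # A ZOOMA request would be the last step; omitted to keep the sketch offline.
]


def map_query(query: str, steps: List[Step]) -> List[str]:
    """Try each step in order and return all hits from the first step that finds any."""
    for step in steps:
        hits = step(query)
        if hits:
            return list(hits)
    return []


if __name__ == "__main__":
    print(map_query("asthma", STRING_STEPS))
    print(map_query("toy synonym", STRING_STEPS))
    print(map_query("no such term", STRING_STEPS))
```

The ontology-input mode would follow the same pattern with a different list of steps.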
Expected changes in the mapping results
All of the approaches listed in the previous section generate mappings which we consider to be of high quality, and they can be used in automated workflows straight out of OnToma. However, this is achieved at a cost of removing some low confidence approaches, such as fuzzy OLS lookup.
Our preliminary benchmarks, comparing the previous OnToma version (v0.0.18) to this release (v1.0.0), demonstrated the following approximate pattern:
- Sensitivity (the percentage of valid input mappings which are discovered) dropped from 96% to 61%.
- At the same time, precision (the percentage of the mappings in OnToma output which are actually correct) rose from 75% to 97%.
Hence, after upgrading, a significant drop in the number of results is expected; however, the remaining results will be of significantly higher quality, which we believe is much more important in nearly all applications. We intend to work on increasing sensitivity in further releases.
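To make the two metrics concrete, the sketch below computes them for a toy gold standard. The mappings are invented purely for illustration and are unrelated to the benchmark figures quoted above; the function names are hypothetical.

```python
from typing import Set, Tuple

Mapping = Tuple[str, str]  # (query, mapped term identifier)


def sensitivity(gold: Set[Mapping], predicted: Set[Mapping]) -> float:
    """Fraction of valid (gold standard) mappings which were discovered."""
    return len(gold & predicted) / len(gold)


def precision(gold: Set[Mapping], predicted: Set[Mapping]) -> float:
    """Fraction of returned mappings which are actually correct."""
    return len(gold & predicted) / len(predicted)


if __name__ == "__main__":
    # Toy data: four valid mappings, three returned, two of them correct.
    gold = {("q1", "T1"), ("q2", "T2"), ("q3", "T3"), ("q4", "T4")}
    predicted = {("q1", "T1"), ("q2", "T2"), ("q3", "T9")}
    print(sensitivity(gold, predicted))  # 0.5
    print(precision(gold, predicted))
```

Removing a low confidence step shrinks the predicted set: sensitivity can only fall or stay the same, while precision rises if the removed mappings were mostly wrong.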
Other operation changes
The CLI and Python interfaces have been simplified. The verbose and suggest flags have been removed (they might be reimplemented in a more consistent way in future releases).
Importantly, where multiple EFO terms match equally well within a single processing step, OnToma will now return multiple hits per query. (Previously, only one hit was selected, in a mostly random fashion.)
Each OnToma result consists of multiple fields.
- In the Python API they are accessed as result object attributes: OnToma().find_term('asthma').id_ot_schema will contain EFO_0000270.
- In the CLI the list of fields to output can be configured via the --columns flag.
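The relationship between result attributes and selectable output columns can be sketched as below. This is a hypothetical illustration, not OnToma's result class: only the id_ot_schema field name comes from these notes, while the other fields, the class name, and the helper are assumptions.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ExampleResult:
    query: str         # hypothetical field
    id_ot_schema: str  # field name taken from the release notes
    label: str         # hypothetical field


def to_row(result: ExampleResult, columns: List[str]) -> str:
    """Render only the requested fields as one tab-separated line."""
    known = {f.name for f in fields(result)}
    unknown = [c for c in columns if c not in known]
    if unknown:
        raise ValueError(f"Unknown columns: {unknown}")
    return "\t".join(str(getattr(result, c)) for c in columns)


if __name__ == "__main__":
    hit = ExampleResult(query="asthma", id_ot_schema="EFO_0000270", label="asthma")
    print(hit.id_ot_schema)
    print(to_row(hit, ["query", "id_ot_schema"]))
```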
Manually curated mapping sources
Two central resources are currently being set up to store all manually curated ontology to EFO (step 3) and string to EFO (step 7) mappings. External OnToma users are encouraged to contribute to these resources as well. (More information about that will come in future releases.)
Changes to ontology handling
A new module, ontoma.ontology, was implemented to facilitate conversion between different ways to represent ontology identifiers. For example, ORDO_140162, ORPHA:140162, Orphanet:140162, and http://www.orpha.net/ORDO/Orphanet_140162 all represent the same term. The module implements an algorithm which converts all possible representations into the stable internal normalised representation to make direct comparisons possible.
The output of OnToma always follows the format specified in the Open Targets JSON schema, for example, Orphanet_140162. This means that you can plug the output of OnToma directly into the evidence strings.
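A rough sketch of this kind of normalisation is shown below. It is not the algorithm in ontoma.ontology: the alias table, the regular expressions, and the function name are assumptions chosen only to reproduce the Orphanet example above.

```python
import re

# Prefix spellings which collapse to one canonical prefix. Only the Orphanet
# variants come from the release notes; a real table would be much larger.
PREFIX_ALIASES = {"ordo": "Orphanet", "orpha": "Orphanet", "orphanet": "Orphanet"}


def normalise(identifier: str) -> str:
    """Convert a CURIE-like or URI identifier to the PREFIX_LOCALID form."""
    # For URIs, keep only the last path or fragment segment.
    local = re.split(r"[/#]", identifier.strip())[-1]
    match = re.fullmatch(r"([A-Za-z]+)[_:](\w+)", local)
    if match is None:
        raise ValueError(f"Unrecognised identifier: {identifier!r}")
    prefix, local_id = match.groups()
    prefix = PREFIX_ALIASES.get(prefix.lower(), prefix)
    return f"{prefix}_{local_id}"


if __name__ == "__main__":
    variants = [
        "ORDO_140162",
        "ORPHA:140162",
        "Orphanet:140162",
        "http://www.orpha.net/ORDO/Orphanet_140162",
    ]
    print({normalise(v) for v in variants})  # {'Orphanet_140162'}
```

Once every identifier is in the same form, two terms can be compared with plain string equality.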
EFO OT slim is now loaded and parsed more consistently from the OWL file. There is a new option to cache this data to speed up OnToma initialisation in subsequent runs.
Additionally, you can now specify a particular EFO version to use. The version which is used by default in this release is pinned to v3.31.0.
Technical changes
The documentation has been migrated to ReadTheDocs and rewritten. RST build and configuration files have been updated and simplified.
Python 3.7+ is now required and consistently used throughout the code base. Installation has been simplified and now uses plain pip. The tests and CircleCI configuration have been updated to reflect all of the changes.