ontoCAT R package
Download
We provide two versions of the ontoCAT R package:
- The lightweight version is available in Bioconductor starting from release 2.7. It includes all single-ontology functionality but omits the methods for working with multiple ontologies and for searching OLS and BioPortal.
- The full version includes the batch methods and, because of package-size limitations, is available only from the ontoCAT project SourceForge page.
Description
The ontoCAT R package was created to support basic operations on ontologies: traversal and search, uniform access to ontologies in OWL and OBO formats, and R access to the major ontology repositories OLS and BioPortal.
Several hundred public ontologies and numerous private ontologies for describing biological data exist today. Using ontologies in R is difficult because there is no uniform package support, while numerous Java-based ontology projects are available. ontoCAT therefore builds on a standard Java library of the same name, "ontoCAT", to implement its functionality. The Java sources used for the R package are available here: Java sources.
The ontoCAT package:
- gives unified, format-independent access to ontology terms and the ontology hierarchy represented in OWL and OBO formats;
- provides basic methods for ontology traversal, such as searching for terms, listing a specific term's relations, showing paths to the term from the root element of the ontology, showing flattened-tree representations of the ontology hierarchy;
- supports working with groups of ontologies and with major public ontology repositories: searching for terms across ontologies, listing available ontologies and loading ontologies for further analysis as necessary.
No other package with similar functionality exists at the moment in the R environment.
The integration of the above functionality into R allows combining and automating ontology-related tasks.
Examples of ontology-related tasks that can be accomplished with the ontoCAT package are given on the examples page of the ontoCAT website: a gene enrichment test with grouping of results, and search and re-annotation of free text to ontology terms.
ontoCAT has been included in Bioconductor, the main open-source R project in bioinformatics.
Reasoning
Reasoning over ontologies and extracting relationships is supported through the HermiT reasoner. OBO ontologies are translated by the OWL API into valid OWL that can be reasoned over. More info about reasoning
Relationships
In ontoCAT, the subsumption relationship ("subclass/superclass") is exposed in the user-friendly form of a "child-parent" relationship.
For instance, the ontology term "myocardium" is a parent of the term "atrial myocardium", since "atrial myocardium" is a subclass of "myocardium".
No distinction is made between universals (classes) and particulars (instances) as they are both treated as ontology terms.
The advantage of using a reasoner in ontoCAT is the ability to work with different relationships in addition to subsumption.
An example of operations with relationships is available here: R Example 3.
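As a small illustration of working with relationships other than subsumption, the sketch below gathers, for one term, every relation it takes part in together with the related terms. It uses only getTermRelationNames() and getTermRelations() from the method list on this page. The helper name collect_relations and the file path are illustrative and not part of ontoCAT, and the sketch assumes that getTermRelationNames() returns something as.character() can turn into a character vector.

```r
# Collect the related terms for every relation the given term has.
# Assumption: getTermRelationNames() yields a value coercible to a character vector.
collect_relations <- function(ontology, term) {
  relation_names <- as.character(getTermRelationNames(ontology, term))
  related <- lapply(relation_names, function(name) {
    getTermRelations(ontology, term, name)
  })
  stats::setNames(related, relation_names)
}

# Usage: runs only if ontoCAT is installed and the placeholder file exists.
efo_path <- "path/to/efo.owl"  # placeholder, replace with a real ontology file
if (requireNamespace("ontoCAT", quietly = TRUE) && file.exists(efo_path)) {
  library(ontoCAT)
  efo <- getOntology(efo_path)                 # loaded with reasoning
  term <- getTermById(efo, "EFO_0000343")      # accession reused from this page
  print(collect_relations(efo, term))
}
```

The ontology is loaded with getOntology() rather than getOntologyNoReasoning(), because this page attributes the extra relationships to the reasoner.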
Java Heap Size
The Java heap size needed to reason over the GO ontology (more than 20 MB in size) is 512 MB. The following instructions show how to increase the Java heap size in R:
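library(rJava)
# Request a 512 MB maximum heap. The "m" suffix is required: a bare number is read as bytes.
# The option must be set before the JVM is initialised by .jinit().
options(java.parameters = "-Xmx512m")
.jinit()
# To check the result (maximum heap size in bytes):
.jcall(.jnew("java/lang/Runtime"), "J", "maxMemory")
# Now we can load the ontoCAT library and work with large ontologies such as GO
library(ontoCAT)
go <- getOntology("http://www.geneontology.org/ontology/obo_format_1_2/gene_ontology_ext.obo")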
Methods
Single Ontology Traversal Methods
- To load an ontology, use the getOntology() method of the Ontology class. It takes a single argument specifying the local filesystem path, the full URI of the ontology file, or its OLS/BioPortal accession number. If reasoning over the ontology is not needed, use getOntologyNoReasoning() instead.
- getAllTermChildren(Ontology, term) returns a list of all children of the term
- getAllTermChildrenById(Ontology, 'EFO_0000343') returns a list of all children of the term with the given accession
- getAllTermIds(Ontology) returns a list of all term accessions
- getAllTermParents(Ontology, term) returns a list of all parents of the term
- getAllTermParentsById(Ontology, 'EFO_0000343') returns a list of all parents of the term with the given accession
- getAllTerms(Ontology) returns a list of all terms
- getEFOBranchRootIds(Ontology) returns the set of branch root accessions. This method is specific to the EFO ontology
- getOntologyAccession(Ontology) returns the parsed ontology accession
- getOntologyDescription(Ontology) returns the parsed ontology description
- getRootIds(Ontology) returns a list of root term accessions, if there are any
- getRoots(Ontology) returns a list of root terms, if there are any
- getTermAndAllChildren(Ontology, term) returns a list of the accessions of the term itself and, recursively, all its children
- getTermAndAllChildrenById(Ontology, 'EFO_0000343') returns a list of the accessions of the term itself and, recursively, all its children
- getTermById(Ontology, 'EFO_0000343') fetches a term by accession. It returns the external term representation if the term is found in the ontology, and null otherwise
- getAccession(term) returns the term's accession
- getLabel(term) returns the term's label
- getTermNameById(Ontology, 'EFO_0000343') returns the term's label by accession
- getTermChildren(Ontology, term) returns a list of the term's direct children
- getTermChildrenById(Ontology, 'EFO_0000343') returns a list of the term's direct children
- getTermDefinitions(Ontology, term) returns the set of the term's definitions, if there are any
- getTermParents(Ontology, term) returns a list of the term's direct parents
- getTermParentsById(Ontology, 'EFO_0000343') returns a list of the term's direct parents
- getTermSynonyms(Ontology, term) returns the set of the term's synonyms, if there are any
- hasTerm(Ontology, 'EFO_0000343') checks whether a term with the specified accession exists in the ontology
- isEFOBranchRoot(Ontology, term) returns true if the term is a branch root of EFO. This method is specific to the EFO ontology
- isEFOBranchRootById(Ontology, 'EFO_0000343') returns true if the term is a branch root of EFO. This method is specific to the EFO ontology
- isRoot(Ontology, term) returns true if the term is a root of the ontology
- isRootById(Ontology, 'EFO_0000343') returns true if the term is a root of the ontology
- searchTerm(Ontology, 'thymus') searches the ontology for a term by name
- searchTermPrefix(Ontology, 'thym') searches the ontology for terms by name prefix
- showHierarchyDownToTerm(Ontology, term) returns the set of terms that represent the ontology "opened" down to the specified term: all its parents first, then the tree level containing the specified term
- showHierarchyDownToTermById(Ontology, 'EFO_0000343') returns the set of terms that represent the ontology "opened" down to the specified term: all its parents first, then the tree level containing the specified term
- showPathsToTerm(Ontology, term) returns the paths to the specified term from the ontology's root term
- showPathsToTermById(Ontology, 'EFO_0000343') returns the paths to the specified term from the ontology's root term
- getOntologyRelationNames(Ontology) returns a list of the relations used in the ontology
- getTermRelationNames(Ontology, term) returns a list of the relations that the term has
- getTermRelationNamesById(Ontology, 'EFO_0000343') returns a list of the relations that the term with the given accession has
- getTermRelations(Ontology, 'EFO_0000343', 'has_part') returns a list of the terms that are in the given relation with the term of interest, identified here by accession
- getTermRelations(Ontology, term, 'has_part') returns a list of the terms that are in the given relation with the term of interest
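As a rough illustration of how the single-ontology methods above combine, the following sketch loads an ontology with or without reasoning and collects basic facts about one term. The helper names load_ontology and summarise_term, the reasoning argument and the file path are illustrative and not part of ontoCAT. The sketch also assumes that hasTerm() returns an R logical.

```r
# Pick the reasoning or non-reasoning loader described in the list above.
# `location` may be a local path, a full URI or an OLS/BioPortal accession.
load_ontology <- function(location, reasoning = TRUE) {
  if (reasoning) getOntology(location) else getOntologyNoReasoning(location)
}

# Gather accession, label, direct parents/children and root paths for one term.
summarise_term <- function(ontology, accession) {
  # Assumption: hasTerm() returns an R logical (TRUE/FALSE).
  if (!isTRUE(hasTerm(ontology, accession))) {
    return(NULL)
  }
  term <- getTermById(ontology, accession)
  list(
    accession = getAccession(term),
    label     = getLabel(term),
    parents   = getTermParents(ontology, term),
    children  = getTermChildren(ontology, term),
    paths     = showPathsToTerm(ontology, term)
  )
}

# Usage: runs only if ontoCAT is installed and the placeholder file exists.
efo_path <- "path/to/efo.owl"  # placeholder, replace with a real ontology file
if (requireNamespace("ontoCAT", quietly = TRUE) && file.exists(efo_path)) {
  library(ontoCAT)
  efo <- load_ontology(efo_path, reasoning = FALSE)
  print(summarise_term(efo, "EFO_0000343"))
}
```

Checking hasTerm() first keeps the helper from passing a missing term on to the other methods, since getTermById() is described as returning null when the accession is not found.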
Operations on Multiple Ontologies
- To create a local batch of ontologies, the getOntologyFromBatch() method of the batch class is provided. It takes a single argument: the path to the local directory containing the ontology files. Called without arguments, getOntologyFromBatch() loads the EFO ontology by default.
- Ontologies can be added to an existing batch as needed via the addOntology() method.
- searchTerm(batch, 'thymus') searches for a term in all ontologies in the batch.
- searchTermInOLS('thymus') searches for a term in OLS.
- searchTermInBioPortal('thymus') searches for a term in BioPortal.
- searchTermInAll(batch,'thymus') searches for a term in all ontologies in the batch as well as in the OLS and BioPortal repositories.
- listLoadedOntologies(batch) returns a list of ontologies from the batch.
- listOLSOntologies(batch) returns a list of ontologies available from OLS.
- listBioportalOntologies(batch) returns a list of ontologies from BioPortal.
- Once the sought terms are found and term-specific operations (parent/child retrieval, etc.) are needed, getOntologyFromBatch(batch,accession) returns the ontology parser for the specific ontology, which supports all the single-ontology methods described above.
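The last step in the list above, moving from a batch to single-ontology operations, could look roughly like the sketch below. It relies only on function names given on this page and needs the full ontoCAT package, since the lightweight version has no batch methods. The helper name lookup_in_batch and its argument names are illustrative, and the directory path in the usage comments is a placeholder.

```r
# Fetch one ontology from a batch and inspect a term with single-ontology methods.
lookup_in_batch <- function(batch, ontology_accession, term_accession) {
  ontology <- getOntologyFromBatch(batch, ontology_accession)
  term <- getTermById(ontology, term_accession)
  list(
    label    = getLabel(term),
    parents  = getTermParents(ontology, term),
    children = getTermChildren(ontology, term)
  )
}

# Usage sketch (not run here; requires the full ontoCAT package):
# batch <- getOntologyFromBatch("path/to/ontology/dir")  # placeholder directory
# listLoadedOntologies(batch)     # shows which ontologies the batch holds
# searchTerm(batch, "thymus")     # search across the whole batch
# lookup_in_batch(batch, "<ontology accession>", "EFO_0000343")
```

The ontology accession is deliberately left as a placeholder; listLoadedOntologies(batch) is the documented way to see what the batch contains.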