| 1 | == ontoCAT R package == |
| 2 | |
| 3 | === Download === |
| 4 | |
| 5 | We provide two versions of '''''ontoCAT R package''''': |
| 6 | * [http://bioconductor.org/help/bioc-views/2.8/bioc/html/ontoCAT.html Light-weight ontoCAT R package version] is available in [http://bioconductor.org Bioconductor] starting from release 2.7, and includes all single-ontology functionality except for methods to work with multiple ontologies and search in OLS and !BioPortal. |
| 7 | * [https://sourceforge.net/projects/ontocat/files/ontoCAT/ontoCAT_R/ontoCAT_1.2.1.tar.gz Full ontoCAT R package version] includes batch methods and, due to package-size limitations, is available only from the ontoCAT project [https://sourceforge.net/projects/ontocat SourceForge] page. |
| 8 | |
| 9 | |
| 10 | === Description === |
| 11 | |
| 12 | The R package '''''ontoCAT''''' was created to support basic operations on ontologies (traversal and search), |
| 13 | to give uniform access to ontologies in OWL and OBO formats, and to provide R access to the major ontology repositories OLS and !BioPortal. |
| 14 | |
| 15 | Several hundred public ontologies and numerous private ontologies for describing biological data exist today. |
| 16 | Using ontologies in R is difficult due to the lack of uniform package support. |
| 17 | At the same time, numerous Java-based ontology projects are available. '''''ontoCAT''''' takes advantage of a standard Java library of the same name, "ontoCAT", to implement its functionality. |
| 18 | The Java sources used by the R package are available here: [http://www.ontocat.org/browser/trunk/ontoCAT/src/uk/ac/ebi/ontocat/utils Java sources]. |
| 19 | |
| 20 | The '''''ontoCAT''''' package: |
| 21 | * gives unified, format-independent access to ontology terms and the ontology hierarchy represented in OWL and OBO formats; |
| 22 | * provides basic methods for ontology traversal, such as searching for terms, listing a specific term's relations, showing paths to the term from the root element of the ontology, showing flattened-tree representations of the ontology hierarchy; |
| 23 | * supports working with groups of ontologies and with major public ontology repositories: searching for terms across ontologies, listing available ontologies and loading ontologies for further analysis as necessary. |
| 24 | |
| 25 | No other package with similar functionality exists at the moment in the R environment. |
| 26 | |
| 27 | The integration of the above functionality into R allows combining and automating ontology-related tasks.\\ |
| 28 | Examples of ontology-related tasks that can be accomplished with the '''''ontoCAT''''' package, such as gene enrichment testing with grouping of results, and searching and re-annotating free text to ontology terms, are given on the [http://www.ontocat.org/wiki/OntocatGuide examples page] of the ontoCAT website. |
| 29 | |
| 30 | '''''ontoCAT''''' is included in Bioconductor, the main open-source R project in bioinformatics. |
| 31 | ==== Reasoning ==== |
| 32 | Reasoning over ontologies and extracting relationships is supported using the HermiT reasoner. |
| 33 | OBO ontologies are translated by the OWL API into a valid OWL format that can be reasoned over. |
| 34 | [wiki:Reasoning More info about reasoning] |
| 35 | |
| 36 | ==== Relationships ==== |
| 37 | In '''''ontoCAT''''' the subsumption relationship "subclass/superclass" is supported in the user-friendly form of a "child -- parent" relationship. \\ |
| 38 | For instance, the ontology term "myocardium" is a parent of the term "atrial myocardium", since "atrial myocardium" is a subclass of "myocardium". |
| 39 | No distinction is made between universals (classes) and particulars (instances) as they are both treated as ontology terms. |
| 40 | |
| 41 | The advantage of using a reasoner in '''''ontoCAT''''' is the ability to work with different relationships in addition to subsumption. |
| 42 | |
| 43 | An example of operations with relationships is available here: [http://www.ontocat.org/browser/trunk/ontoCAT/src/uk/ac/ebi/ontocat/examples/R/Example3.R R Example 3]. |
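
As a complement to the linked example, here is a minimal sketch of looking up the terms related to a given term. It assumes the ontoCAT package is installed. The helper name `relatedTerms`, its default relation name, and the file path in the usage comment are hypothetical and chosen only for illustration; the ontoCAT calls are the ones named in the Methods section of this page, and their return types are not verified here.

```r
library(ontoCAT)

# Hypothetical helper (not part of ontoCAT): return the terms that stand in
# the given relation to the term with the given accession.
relatedTerms <- function(ontology, accession, relation = "has_part") {
  # Assumption: hasTerm() returns an R logical value
  if (!hasTerm(ontology, accession)) {
    return(NULL)
  }
  term <- getTermById(ontology, accession)
  getTermRelations(ontology, term, relation)
}

# Usage (placeholder path; relations other than subsumption need the
# reasoning-enabled loader):
# efo <- getOntology("path/to/efo.owl")
# relatedTerms(efo, "EFO_0000343", "has_part")
```
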
| 44 | ==== Java Heap Size ==== |
| 45 | |
| 46 | Reasoning over the GO ontology (more than 20 MB in size) requires a Java heap size of 512 MB. |
| 47 | To increase the Java heap size in '''R''': |
| 48 | |
| 49 | > library(rJava) |
| 50 | |
| 51 | > options(java.parameters="-Xmx512m") |
| 52 | |
| 53 | > .jinit() |
| 54 | |
| 55 | #To check the result: |
| 56 | > .jcall(.jnew("java/lang/Runtime"), "J", "maxMemory") |
| 57 | |
| 58 | #Now we can load ontoCAT library and start to work with large ontologies like GO |
| 59 | > library(ontoCAT) |
| 60 | |
| 61 | > go <- getOntology("http://www.geneontology.org/ontology/obo_format_1_2/gene_ontology_ext.obo") |
| 62 | === Methods === |
| 63 | |
| 64 | '''Single Ontology Traversal Methods''' |
| 65 | |
| 66 | * To load an ontology, use the ''getOntology()'' method of the ''Ontology'' class. It takes a single argument: the local filesystem path, the full URI of the ontology file, or its OLS/!BioPortal accession number. If reasoning over the ontology is not needed, use ''getOntologyNoReasoning()'' instead. |
| 67 | * ''getAllTermChildren(Ontology, term)'' returns a list of all children of the term |
| 68 | * ''getAllTermChildrenById(Ontology, 'EFO_0000343')'' returns a list of all children of the term with the given accession |
| 69 | * ''getAllTermIds(Ontology)'' returns a list of all term accessions |
| 70 | * ''getAllTermParents(Ontology, term)'' returns a list of all parents of the term |
| 71 | * ''getAllTermParentsById(Ontology, 'EFO_0000343')'' returns a list of all parents of the term with the given accession |
| 72 | * ''getAllTerms(Ontology)'' returns a list of all terms |
| 73 | * ''getEFOBranchRootIds(Ontology)'' returns the set of branch root accessions (specific to the EFO ontology) |
| 74 | * ''getOntologyAccession(Ontology)'' returns the parsed ontology accession |
| 75 | * ''getOntologyDescription(Ontology)'' returns the parsed ontology description |
| 76 | * ''getRootIds(Ontology)'' returns a list of root term accessions, if there are any |
| 77 | * ''getRoots(Ontology)'' returns a list of root terms, if there are any |
| 78 | * ''getTermAndAllChildren(Ontology, term)'' returns a list of accessions of the term itself and, recursively, all its children |
| 79 | * ''getTermAndAllChildrenById(Ontology, 'EFO_0000343')'' returns a list of accessions of the term with the given accession and, recursively, all its children |
| 80 | * ''getTermById(Ontology, 'EFO_0000343')'' fetches a term by accession; returns the external term representation if found in the ontology, null otherwise |
| 81 | * ''getAccession(term)'' returns the term's accession |
| 82 | * ''getLabel(term)'' returns the term's label |
| 83 | * ''getTermNameById(Ontology, 'EFO_0000343')'' returns the term's label by accession |
| 84 | * ''getTermChildren(Ontology, term)'' returns a list of the term's direct children |
| 85 | * ''getTermChildrenById(Ontology, 'EFO_0000343')'' returns a list of the direct children of the term with the given accession |
| 86 | * ''getTermDefinitions(Ontology, term)'' returns the set of the term's definitions, if there are any |
| 87 | * ''getTermParents(Ontology, term)'' returns a list of the term's direct parents |
| 88 | * ''getTermParentsById(Ontology, 'EFO_0000343')'' returns a list of the direct parents of the term with the given accession |
| 89 | * ''getTermSynonyms(Ontology, term)'' returns the set of the term's synonyms, if there are any |
| 90 | * ''hasTerm(Ontology, 'EFO_0000343')'' checks whether a term with the specified accession exists in the ontology |
| 91 | * ''isEFOBranchRoot(Ontology, term)'' returns true if the term is a branch root of EFO (specific to the EFO ontology) |
| 92 | * ''isEFOBranchRootById(Ontology, 'EFO_0000343')'' returns true if the term with the given accession is a branch root of EFO (specific to the EFO ontology) |
| 93 | * ''isRoot(Ontology, term)'' returns true if the term is a root of the ontology |
| 94 | * ''isRootById(Ontology, 'EFO_0000343')'' returns true if the term with the given accession is a root of the ontology |
| 95 | * ''searchTerm(Ontology, 'thymus')'' searches for a term in the ontology by name |
| 96 | * ''searchTermPrefix(Ontology, 'thym')'' searches for terms in the ontology by prefix |
| 97 | * ''showHierarchyDownToTerm(Ontology, term)'' returns the set of terms that represents the ontology "opened" down to the specified term: all of its parents first, then the tree level containing the specified term |
| 98 | * ''showHierarchyDownToTermById(Ontology, 'EFO_0000343')'' returns the same set of terms for the term with the given accession |
| 99 | * ''showPathsToTerm(Ontology, term)'' returns the paths from the ontology's root term to the specified term |
| 100 | * ''showPathsToTermById(Ontology, 'EFO_0000343')'' returns the paths from the ontology's root term to the term with the given accession |
| 101 | * ''getOntologyRelationNames(Ontology)'' returns a list of the relations used in the ontology |
| 102 | * ''getTermRelationNames(Ontology, term)'' returns a list of the relations that the term has |
| 103 | * ''getTermRelationNamesById(Ontology, 'EFO_0000343')'' returns a list of the relations that the term with the given accession has |
| 104 | * ''getTermRelations(Ontology, 'EFO_0000343', 'has_part')'' returns a list of terms that are in the given relation with the term of interest, identified by accession |
| 105 | * ''getTermRelations(Ontology, term, 'has_part')'' returns a list of terms that are in the given relation with the term of interest |
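
The single-ontology methods listed above can be combined as in the following minimal sketch. It assumes the ontoCAT package is installed. The helper name `describeTerm`, the shape of the list it returns, and the file path in the usage comment are hypothetical and serve only as an illustration; the ontoCAT calls and their argument order are taken from the list above, and their return types are not verified here.

```r
library(ontoCAT)

# Hypothetical helper (not part of ontoCAT): collect basic facts about the
# term with the given accession, using only methods from the list above.
describeTerm <- function(ontology, accession) {
  # Assumption: hasTerm() returns an R logical value
  if (!hasTerm(ontology, accession)) {
    return(NULL)
  }
  list(
    label    = getTermNameById(ontology, accession),
    parents  = getTermParentsById(ontology, accession),   # direct parents
    children = getTermChildrenById(ontology, accession),  # direct children
    isRoot   = isRootById(ontology, accession)
  )
}

# Usage (placeholder path to a local OWL or OBO file; no reasoning is needed
# for plain parent/child traversal):
# efo <- getOntologyNoReasoning("path/to/efo.owl")
# describeTerm(efo, "EFO_0000343")
```

Returning `NULL` for an unknown accession keeps the helper safe to call on identifiers taken from free text, where some lookups are expected to miss.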
| 106 | |
| 107 | '''Operations on Multiple Ontologies''' |
| 108 | |
| 109 | * To create a local batch of ontologies, use the ''getOntologyFromBatch()'' method of the ''batch'' class, which takes a single argument: the path to the local directory containing ontology files. By default, a call to ''getOntologyFromBatch()'' without any arguments loads the EFO ontology. |
| 110 | * Ontologies can be added to an existing batch as needed via the ''addOntology()'' method. |
| 111 | * ''searchTerm(batch, 'thymus')'' searches for a term in all ontologies in the batch. |
| 112 | * ''searchTermInOLS('thymus')'' searches for a term in OLS. |
| 113 | * ''searchTermInBioPortal('thymus')'' searches for a term in !BioPortal. |
| 114 | * ''searchTermInAll(batch, 'thymus')'' searches for a term in all ontologies in the batch as well as in the OLS and !BioPortal repositories. |
| 115 | * ''listLoadedOntologies(batch)'' returns a list of the ontologies in the batch. |
| 116 | * ''listOLSOntologies(batch)'' returns a list of the ontologies available from OLS. |
| 117 | * ''listBioportalOntologies(batch)'' returns a list of the ontologies available from !BioPortal. |
| 118 | * Once the sought terms are found and term-specific operations (parent/child retrieval, etc.) are needed, ''getOntologyFromBatch(batch, accession)'' returns the ontology parser for that specific ontology, which supports all the single-ontology methods described above. |
| 119 | |