Changes between Initial Version and Version 1 of RprojectInterface

2010-10-01T23:38:13+02:00



= Using R for data analysis =
This page shows how to connect R scripts to XGAP. See RqtlIntegration for integration with the RqtlPackage.

== Connecting R to XGAP ==
This example shows how to connect to XGAP from within R:

 1. In the XGAP user interface, go to "Programming interfaces".
 2. Click the link at "Access from the R project: source the file at api/R".
 3. Select and copy all the commands.
 4. Open R and paste the commands. If you haven't installed RCurl yet, do so now (see "step 1" in the sourced code).

 [[Image(R1.png)]]
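
As a rough sketch, the copy-and-paste steps above amount to checking that RCurl is available and then sourcing the generated API file. The server address below is a hypothetical placeholder and `xgap_api_url` is an illustrative helper, not part of XGAP. The `source()` call is left commented out because it needs a running XGAP server.

```r
# Illustrative helper (not part of XGAP): build the address of the
# generated R API file, which the user interface describes as "api/R".
xgap_api_url <- function(base_url) {
  # Drop any trailing slashes so we never produce "//api/R".
  paste0(sub("/+$", "", base_url), "/api/R")
}

# Hypothetical server address; replace it with your own XGAP installation.
api_url <- xgap_api_url("http://localhost:8080/xgap/")
print(api_url)

# The sourced code needs RCurl; report whether it is installed.
has_rcurl <- requireNamespace("RCurl", quietly = TRUE)
if (!has_rcurl) {
  message("RCurl is missing; install it with install.packages(\"RCurl\")")
}

# With a running server, this would load the find.* and add.* functions:
# source(api_url)
```

Sourcing the URL directly is only a shortcut for pasting the same commands by hand; both routes should end with the same functions defined in the R session.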
== Retrieving annotation data ==
These examples show how to retrieve annotation data from XGAP (everything except data matrices). We will use 'marker' as the example.

 1. To retrieve annotations, use the find.* functions. For example:
 {{{
allMarkers <- find.marker()
 }}}
 Tip: if you type only 'find.' and then press TAB, you will see all available find.* functions.

 2. Use the R function dim() to see the dimensions of this object. In this example, there are 251 markers with 17 attributes each.

 Result:

 [[Image(R2.png)]]

 3. By selecting the first column, you get to see the chromosome attribute for each marker:
 Use:

 It is even easier to retrieve a particular column using the '$ plus column name' notation:
 Use:

 Result:

 [[Image(R3.png)]]

 4. You can select the first marker by picking the first row:
 Use:

 5. You can get only the name of this marker by combining the two syntaxes:
 Use:

 6. The '$' notation does not work for multiple columns. So, if you wish to see specific attributes for all markers, you have to pass the column names as follows:
 Use:

 Result:

 [[Image(R4.png)]]
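
The indexing steps above can be tried without a database connection. The sketch below uses a small hand-made data frame as a stand-in for the result of find.marker(); its column names (chr, name, cm) and values are assumptions chosen for illustration, not the actual XGAP attributes.

```r
# Stand-in for the result of find.marker(); columns and values are made up.
allMarkers <- data.frame(
  chr  = c("1", "1", "2"),
  name = c("m1", "m2", "m3"),
  cm   = c(0.0, 12.5, 3.1),
  stringsAsFactors = FALSE
)

dim(allMarkers)                 # number of rows (markers) and columns (attributes)
allMarkers[, 1]                 # first column, selected by position
allMarkers$chr                  # the same column, selected by name
allMarkers[1, ]                 # first row: all attributes of the first marker
allMarkers[1, ]$name            # both combined: only the name of the first marker
allMarkers[, c("name", "cm")]   # several columns at once, by passing their names
```

Against a real XGAP database, the same expressions would apply to whatever columns find.marker() returns.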
== Retrieving data matrices ==
Data is stored in XGAP in the form of matrices. The following examples show how to retrieve these data sets into R.

 1. First, get a list of all data matrices available in the database. Passing extra arguments limits the output to only the name and the investigation name.
 Use:

 Result:
 [[Image(R5.png)]]

 2. Select and retrieve the data for one data matrix. In this example, let's pick the metabolite expression matrix, which has id 6. You can download it like this:
 {{{
data <- find.datamatrix(6)
 }}}

 3. As with the annotation data above, you can inspect the size of the downloaded matrix:

 Result:

 [[Image(R6.png)]]

 4. Now make an 'overplotted' plot (points joined by lines) of the first row. (In this case: the first trait, for all individuals.)
 {{{
plot(data[1,], type="o")
 }}}
 Result:

 [[Image(R7.png)]]

 5. In contrast with annotations, data matrices have row headers in addition to column headers. You can check this with functions like colnames() and rownames(). Note that you can also select a subset of the matrix:
 {{{
data[1:5, 1:5]
 }}}
 Result:

 [[Image(R8.png)]]
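
To experiment with these operations offline, the following sketch builds a small numeric matrix as a stand-in for a downloaded data matrix. The trait and individual names and all values are invented for illustration.

```r
# Stand-in for a downloaded data matrix: 3 traits (rows) x 4 individuals (columns).
data <- matrix(
  c(1.2, 0.8, 1.5, 1.1,
    3.4, 3.9, 2.8, 3.1,
    0.2, 0.3, 0.1, 0.4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("trait1", "trait2", "trait3"),
                  c("ind1", "ind2", "ind3", "ind4"))
)

dim(data)        # number of rows and columns
rownames(data)   # row headers
colnames(data)   # column headers

# Select a subset either by position or by header name.
by_position <- data[1:2, 1:2]
by_name <- data[c("trait1", "trait2"), c("ind1", "ind2")]

# The first row holds one trait across all individuals.
first_trait <- data[1, ]
```

Calling plot(first_trait, type="o") on this stand-in draws the same kind of plot as in the steps above.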
== Uploading annotations ==

Get a list of the investigations with the attributes 'id' and 'name'. In this case, we use the 'MetaNetwork' investigation, which has id = 1.

 Use:
 {{{
find.investigation()[,c("id", "name")]
 }}}

Suppose we would like to add a pseudomarker during a QTL investigation. We can add a single marker by using add.marker:

 Use:
 {{{
add.marker(name="loc50.0", cm="50.0", chr="2", investigation_id=1)
 }}}

It is also possible to add a list of markers at once. This can be done by constructing a table with one row per marker (rbind produces a character matrix here). Use 'colnames' to set the attribute names. An example:

 Use:
 {{{
pseudo <- NULL
pseudo <- rbind(pseudo, c("loc0.0","0.0","14", 1))
pseudo <- rbind(pseudo, c("loc10.0","10.0","15", 1))
pseudo <- rbind(pseudo, c("loc22.0","22.0","16", 1))
colnames(pseudo) <- c("name", "cm", "chr", "investigation")
 }}}

You can assign the result to a variable so you can use its properties (e.g. the id assigned in the database) later on.

 Use:
 {{{
myPseudo <- add.marker(pseudo)
myPseudo  # print the list of markers
 }}}
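
For comparison, the sketch below builds the same kind of marker table locally in two ways: with rbind as above, and as a data frame in a single call. Whether add.marker treats a data frame exactly like the rbind matrix is an assumption to verify against your XGAP version; the code only shows the local construction and makes no XGAP calls.

```r
# rbind() on character vectors yields a character matrix, as in the example above.
pseudo <- NULL
pseudo <- rbind(pseudo, c("loc0.0", "0.0", "14", 1))
pseudo <- rbind(pseudo, c("loc10.0", "10.0", "15", 1))
colnames(pseudo) <- c("name", "cm", "chr", "investigation")

# The same table as a data frame, built in one call.
pseudo_df <- data.frame(
  name = c("loc0.0", "loc10.0"),
  cm = c("0.0", "10.0"),
  chr = c("14", "15"),
  investigation = c("1", "1"),
  stringsAsFactors = FALSE
)

is.matrix(pseudo)          # the rbind version is a matrix
is.data.frame(pseudo_df)   # the second version is a data frame
```

Note that mixing strings and numbers in c() converts everything to character, which is why the investigation id ends up as "1" in both versions.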
== Uploading data matrices ==

=== Option A ===

This is the easier way, using a custom function.

Create a matrix and add two rows with, for example, genotyping data:

 Use:
 {{{
data <- NULL
data <- rbind(data, c("A", "B"))
data <- rbind(data, c("B", "A"))
 }}}

If the individuals and markers are not present in the database, add them first.

 Use:
 {{{
marker1 <- add.marker(name="myMarker1", cm="10.0", chr="2", investigation_id=1)
marker2 <- add.marker(name="myMarker2", cm="20.0", chr="2", investigation_id=1)
ind1 <- add.individual(name="myInd1", investigation_id=1)
ind2 <- add.individual(name="myInd2", investigation_id=1)
 }}}

Now add references to the individuals whose genotypes were measured and, of course, to the markers that have been genotyped. Be careful not to switch 'rows' with 'columns'.

 Use:
 {{{
colnames(data) <- c("myInd1", "myInd2")
rownames(data) <- c("myMarker1", "myMarker2")
 }}}
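
One way to reduce the risk of swapping rows and columns is to set both sets of names when the matrix is created. This is a local sketch only; the variable name `genotypes` is chosen for illustration, and the marker and individual names are the ones used above.

```r
# Genotypes for two markers (rows) and two individuals (columns).
genotypes <- matrix(
  c("A", "B",
    "B", "A"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("myMarker1", "myMarker2"),   # row names: markers
                  c("myInd1", "myInd2"))         # column names: individuals
)

# Look up one genotype by marker and individual name to check the orientation.
genotypes["myMarker1", "myInd2"]
```

If the lookup by names returns the value you expect for that marker and individual, the orientation is right.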
Now we add this matrix by using the custom 'add.datamatrix' function. Several attributes need to be entered, for example the name and the row/column types of the matrix. We know that the investigation to add this matrix to has id = 1.

 Use:
 {{{
add.datamatrix(data, name="myResults", investigation_id=1, rowtype="Marker", coltype="Individual", valuetype="Text")
 }}}

When successful, something like this will appear:

 [[Image(R10.png)]]

You can inspect the result in the user interface:

 [[Image(R9.png)]]
=== Option B ===

This is the harder way, performing several manual steps yourself.

First, add a Data object. This is basically the description of a data matrix. We add it under the investigation with id = 1. Say we add genotyping data. In this case, the row type will be 'Marker' and the column type 'Individual'. We add two rows and two columns. The values will be text (e.g. 'A' or 'B').

 Use:
 {{{
# NOTE: the function name was lost in the source; add.data is assumed by analogy with add.marker. Verify it against the sourced API.
data <- add.data(name="myResults", investigation_id=1, rowtype="Marker", coltype="Individual", totalrows=2, totalcols=2, valuetype="Text")
 }}}

Now let's add the elements that we will refer to. These can of course also be existing elements. We add two markers with some information, and two individuals with just names.

 Use:
 {{{
marker1 <- add.marker(name="myMarker1", cm="10.0", chr="2", investigation_id=1)
marker2 <- add.marker(name="myMarker2", cm="20.0", chr="2", investigation_id=1)
ind1 <- add.individual(name="myInd1", investigation_id=1)
ind2 <- add.individual(name="myInd2", investigation_id=1)
 }}}

We can now add the actual values. Here we use 'add.textdataelement' rather than 'add.decimaldataelement' because the value type of the matrix is text. From the existing 'data' object, we get the id of the matrix to add the element to. We do the same for the markers and individuals. Then we indicate the position of the element in the matrix using zero-based indices, and finally the value of the element.

 Use:
 {{{
add.textdataelement(data_id=data$id, row_id=marker1$id, col_id=ind1$id, rowindex=0, colindex=0, value="A")
add.textdataelement(data_id=data$id, row_id=marker1$id, col_id=ind2$id, rowindex=0, colindex=1, value="B")
add.textdataelement(data_id=data$id, row_id=marker2$id, col_id=ind1$id, rowindex=1, colindex=0, value="B")
add.textdataelement(data_id=data$id, row_id=marker2$id, col_id=ind2$id, rowindex=1, colindex=1, value="A")
 }}}
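
For larger matrices, writing one call per cell by hand becomes error-prone. The sketch below shows one way to enumerate the cells of a local matrix together with the zero-based indices used in the calls above. It makes no XGAP calls; the names `genotypes` and `cells` are chosen for illustration.

```r
# Local genotype matrix: markers as rows, individuals as columns.
genotypes <- matrix(
  c("A", "B",
    "B", "A"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("myMarker1", "myMarker2"), c("myInd1", "myInd2"))
)

# One line per matrix cell, using R's 1-based positions.
cells <- expand.grid(row = seq_len(nrow(genotypes)),
                     col = seq_len(ncol(genotypes)))

# Cell values, looked up by (row, col) pairs.
cells$value <- genotypes[cbind(cells$row, cells$col)]

# The calls above use indices that start at 0, so shift by one.
cells$rowindex <- cells$row - 1L
cells$colindex <- cells$col - 1L

# Row and column names, to find the matching marker and individual ids.
cells$rowname <- rownames(genotypes)[cells$row]
cells$colname <- colnames(genotypes)[cells$col]

print(cells[, c("rowname", "colname", "rowindex", "colindex", "value")])
```

Each line of `cells` then corresponds to one add.textdataelement call, with rowindex, colindex and value taken from that line.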