== Feedback on WormQTL, May 2012 ==
This looks very good: a lot of information can be obtained in GFF format, which makes integration with other data much easier.
The format is also recognizable to the worm community, which is always nice.
That there are several of these per chromosome is something we will solve another way. As a plot summary, what we have now is fine, and it gets even better if we leave out the maps that have 400+ informative markers. What you have now is functional, and people can live with that. In addition, WormBase also uses this format, so they may want to include it or link to it.

How to demo:
* I can select a dataset to search in from the dropdown box as an additional filter.
* I can download the data used to create my plots. [unclear]
* I want to download my currently shopped phenotype records as a CSV file.
* I want to be able to calculate the correlations between all shopped phenotypes.
* I want the result to be a Cytoscape file that I can use to visualize networks.
* I have documentation showing how best to search for my favourite gene. Which ID or nomenclature will give the best hits?
* I want a better cis-trans plot, with names indicating which traits have high LOD scores. [unclear]
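The correlation and Cytoscape items above could be sketched roughly as follows. This is a minimal illustration, not WormQTL code: the function names, the trait names, the 0.8 cutoff and the `pos_cor`/`neg_cor` relation labels are all assumptions. It writes Cytoscape's simple interaction format (SIF), in which each line is `source<TAB>relation<TAB>target`.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation, using only positions where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return None  # too few shared measurements to say anything
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # a constant trait has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def correlation_edges(phenotypes, threshold=0.8):
    """Return (trait_a, trait_b, r) for every pair with |r| >= threshold.

    `phenotypes` maps a trait name to a list of values, one per strain,
    with all lists in the same strain order (hypothetical input layout).
    """
    edges = []
    for a, b in itertools.combinations(sorted(phenotypes), 2):
        r = pearson(phenotypes[a], phenotypes[b])
        if r is not None and abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


def to_sif(edges):
    """Serialize edges as SIF text: source, relation and target, tab-separated."""
    return "".join(
        f"{a}\t{'pos_cor' if r > 0 else 'neg_cor'}\t{b}\n" for a, b, r in edges
    )


# Hypothetical shopping cart with three traits measured on four strains.
shopped = {
    "trait_A": [1.0, 2.0, 3.0, 4.0],
    "trait_B": [2.0, 4.1, 5.9, 8.0],
    "trait_C": [4.0, 3.0, 2.0, 1.0],
}
sif_text = to_sif(correlation_edges(shopped))
```

SIF carries no edge weights, so if the correlation values themselves are needed in Cytoscape they would have to go into a separate edge attribute table.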

One suggestion: the public name could be shown after the probe, in brackets or something similar, since public names are easier to interpret.
I have no idea about the difference between the browsers, since WormBase and modENCODE work with the same one. For GFF tracks you can also try modENCODE; WormBase also has an FTP site with older tracks. I believe SNP data can also be converted to GFF, right?
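As a rough illustration of the SNP-to-GFF question, the sketch below writes SNPs as GFF3 feature lines, which have nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes). The input tuple layout, the `WormQTL` source label, the SNP identifier and the lowercase `ref`/`alt` attribute names are assumptions made for this example, and attribute values are not escaped.

```python
def snp_to_gff3(chrom, position, snp_id, ref, alt, source="WormQTL"):
    """Format one SNP as a single GFF3 feature line."""
    attributes = f"ID={snp_id};ref={ref};alt={alt}"
    columns = [
        chrom,          # seqid: chromosome name
        source,         # source: who produced the feature (assumed label)
        "SNP",          # type: Sequence Ontology term
        str(position),  # start (1-based)
        str(position),  # end: same base for a single-nucleotide feature
        ".",            # score: not used here
        "+",            # strand: alleles given on the forward strand
        ".",            # phase: only meaningful for CDS features
        attributes,
    ]
    return "\t".join(columns)


def write_gff3(snps):
    """Return GFF3 text for an iterable of (chrom, position, id, ref, alt)."""
    lines = ["##gff-version 3"]
    lines.extend(snp_to_gff3(*snp) for snp in snps)
    return "\n".join(lines) + "\n"


# Hypothetical SNP on chromosome I.
gff_text = write_gff3([("I", 12345, "snp_0001", "A", "G")])
```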
I assume it is a matter of time before WormBase and other tracks can be combined; in short, I am very excited about it. This is a very good example for the upcoming paper, especially if we can say that our (e)QTL results are combined with public information, which at the moment is quite difficult and takes a lot of manual work.
Furthermore, we received nice comments on the WormQTL poster. Many people saw the need for it and its usefulness. A number of people gave me the tip to combine it with R/qtl, even if we would limit it to one mapping per probe. In terms of visualization that should be easy, as Danny (and co.) have designed a lot of nice figures!
Peak detection is indeed difficult, but it can be solved. There is a peak detection function in a CRAN package that, though not perfect, also works for QTLs. I have also written one myself: for each chromosome the highest peak is taken, and then you look left and right for the marker where the LOD score has dropped by 1.5 (or 2). This is then repeated for the markers outside that QTL region: find a marker with LOD > 3 whose left and right neighbours are equal or lower, then again walk out to the markers at -1.5 LOD, and so on. I can dig up the code if needed. This works, but it may not be the quickest solution. Then again, you do not often have to convert the whole set.
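The procedure described above can be sketched as follows. This is a reconstruction from the description only, not the author's actual code: the function name, the return layout, and the choices to skip shoulder markers and to stop an interval at an already assigned region are assumptions.

```python
def find_qtl_peaks(lod, threshold=3.0, drop=1.5):
    """Find peaks and LOD-drop intervals on one chromosome.

    `lod` holds the LOD scores of the markers in map order. Returns a list
    of (peak_index, left_index, right_index) tuples, highest peak first.
    """
    n = len(lod)
    used = [False] * n  # markers already assigned to a reported region
    peaks = []
    while True:
        candidates = [i for i in range(n) if not used[i] and lod[i] > threshold]
        if not candidates:
            break
        top = max(candidates, key=lambda i: lod[i])
        used[top] = True
        # A peak needs neighbours that are equal or lower; otherwise this
        # marker is only the shoulder of a peak that was already reported.
        left_ok = top == 0 or lod[top - 1] <= lod[top]
        right_ok = top == n - 1 or lod[top + 1] <= lod[top]
        if not (left_ok and right_ok):
            continue
        limit = lod[top] - drop
        # Walk outwards until the first marker at or below the drop limit,
        # the chromosome end, or an earlier region is reached.
        left = top
        while left > 0 and lod[left] > limit and not used[left - 1]:
            left -= 1
        right = top
        while right < n - 1 and lod[right] > limit and not used[right + 1]:
            right += 1
        for i in range(left, right + 1):
            used[i] = True
        peaks.append((top, left, right))
    return peaks


# Hypothetical LOD profile with two QTLs on one chromosome.
profile = [0.5, 1.0, 2.8, 4.0, 6.2, 5.1, 4.4, 2.0, 1.1, 3.5, 3.9, 3.2, 1.0]
regions = find_qtl_peaks(profile)
```

Each pass rescans the whole chromosome, which matches the remark that this may not be the quickest solution; for a few hundred markers per chromosome that is unlikely to matter.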
In short, I certainly see this design succeeding, mainly because it combines well with other data and is recognizable, which makes it much easier for many people to involve QTL data in their research.

It is good that the PANACEA data will be put into WormQTL, as required by deliverable D6.15. However, the focus now is on the release of WormQTL (which means no unpublished data!), so we might want to prioritise that. In any case, I have updated the list that I made for the setup of WormQTL, so we can now see what still needs to be added or updated (see attachment). One thing we could do is look for other published QTL studies (especially phenotypic ones from the Kruglyak lab) and add those. The datasets should be small and can be linked to the Rockman RIAILs, which are already in the database.
For the PANACEA data:
- This is not public data, so it should be protected; it is a separate undertaking from the WormQTL paper.
- The probes of the WUR Agilent array need to be BLASTed against WS220.
- You have the gld-1 data and eQTLs (but we should probably compare the WLS samples and normalisation methods).
- The proteomics data is not finished.
- The RNAi data from Elvin & Snoek et al. is in the public data area.
- I'm working on the MIRIL data, which is not yet ready to be made available (it will be in a few weeks, as far as gene expression and the preliminary QTL mappings go).
- The phenotypic (VI, gonad migration) data and QTLs from the MIRILs are available, but I would rather send those together with all the other MIRIL material.
- The raw data from the large screen of ~200 RNAi on ~50 RILs and ~30 NILs is available but still has to be processed.
- Some apoptosis phenotypes...

If we can make some of this available and present it at the upcoming meeting, that would be great, but it is nowhere near essential.
To extend this even further, we have data from two other projects, GRAPPLE and NEMADAPT, which fit right in but are not public at this moment.
In fact, we are currently working on three things (which luckily overlap ;-):
- The WLS paper
- WormQTL and the upcoming deadline for the paper
- Making the PANACEA data available
The first two are much more urgent than the last, since the official progress report is not due until the end of the project.

* It is slow.
* If one searches for a gene, the QTL profiles of all its probes should be given (at least in one experiment, preferably over all experiments).
* The matrices with QTL effects contain QTL effect sizes and not LOD scores.
* The extra information should only be shown after clicking (a drop-down box?); an overview is missing, and most users will be overwhelmed with details.
* Add other search types, for example: give all genes that have a QTL between … and … bp with a LOD score > … in a certain experiment.
* A plot in which results can be compared across experiments is really needed.
* Is your last session saved?
* Why do I sometimes need to click search twice?
* "All phenotypes" gives an error, but searching for genes works.
* After probes are found, the link that leads to the QTL should be very clear!
* On the QTL page, the probe label at the top should be accompanied by the gene name.
* The investigation should not be PANACEA but something more general.
* The QTL graph should be larger.
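The region-based search type requested above ("all genes that have a QTL between … and … bp with a LOD score > … in a certain experiment") could look roughly like this. The record layout, field names and example values are hypothetical and are not taken from the WormQTL data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtlRecord:
    """One mapped QTL peak for one gene in one experiment (assumed layout)."""
    gene: str
    experiment: str
    chromosome: str
    peak_bp: int
    lod: float


def genes_with_qtl(records, experiment, chromosome, start_bp, end_bp, min_lod):
    """Return the sorted gene names with a QTL peak inside the region.

    A record matches when it belongs to the experiment, lies on the
    chromosome between start_bp and end_bp (inclusive), and has a LOD
    score strictly above min_lod.
    """
    hits = {
        r.gene
        for r in records
        if r.experiment == experiment
        and r.chromosome == chromosome
        and start_bp <= r.peak_bp <= end_bp
        and r.lod > min_lod
    }
    return sorted(hits)


# Hypothetical records; gene and experiment names are for illustration only.
records = [
    QtlRecord("gene-1", "exp_A", "IV", 1_200_000, 5.4),
    QtlRecord("gene-2", "exp_A", "IV", 1_900_000, 2.1),
    QtlRecord("gene-3", "exp_B", "IV", 1_500_000, 7.0),
    QtlRecord("gene-4", "exp_A", "IV", 1_700_000, 3.8),
]
found = genes_with_qtl(records, "exp_A", "IV", 1_000_000, 2_000_000, min_lod=3.0)
```

In a database-backed system the same filter would normally be a single indexed query rather than a scan in memory, which also bears on the first complaint in the list.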

* Storing your data
* Running quick analyses
* Quality control of the data (standards, missing markers)
* With every new collaboration the data sets are slightly different; a standardized solution is needed
* Time invested in matrix cleaning (missing data, wrong columns) and in parsing
* Pull-down button for format parsers; make an inventory of what these data sets look like
* Set-based selection, e.g. select all the trans-regulated probes
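A set-based selection such as "all the trans-regulated probes" could be sketched as below. The cis/trans rule used here (a QTL counts as cis when it lies on the probe's own chromosome within a fixed window) and the 1 Mb default are assumptions for illustration; the notes do not define them.

```python
def select_trans_probes(qtls, probe_positions, cis_window_bp=1_000_000):
    """Return the set of probes that have at least one trans QTL.

    `qtls` is an iterable of (probe, qtl_chromosome, qtl_bp) tuples and
    `probe_positions` maps a probe to its (chromosome, bp) location.
    """
    trans = set()
    for probe, qtl_chrom, qtl_bp in qtls:
        probe_chrom, probe_bp = probe_positions[probe]
        is_cis = probe_chrom == qtl_chrom and abs(probe_bp - qtl_bp) <= cis_window_bp
        if not is_cis:
            trans.add(probe)
    return trans


# Hypothetical probes and QTL peaks.
positions = {"probe_1": ("I", 500_000), "probe_2": ("II", 3_000_000)}
mapped = [
    ("probe_1", "I", 700_000),    # same chromosome, 0.2 Mb away: cis
    ("probe_2", "V", 3_000_000),  # different chromosome: trans
]
trans_probes = select_trans_probes(mapped, positions)
```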

* Being able to do analyses and run them in parallel myself
* Being able to test my code on the machine and get error messages (requirement)
* Expected gains: more R tools for this, for pictures and tables
* Expected gains:
* Standard importers for Agilent, Affymetrix, NimbleGen
* Do you show all annotation of a particular gene: which gene it is, where it is located, whether there is a SNP?
* Can we make a pathway picture? I want to submit all my genes there, and I know the relations. See QTL and pathway plots together, possibly for different conditions or factors.

* It must be open and clear exactly what code is (or will be) executed
* Having quality control (tools) for the data would be great
* There must be helpful error reporting, combined with ways to test scripts
* In general, more tools, visualizations and statistical reports make the system attractive
* We must find out if and how the latest technologies and formats fit into the system
* Idea: put the 'cross' object as a file in the database, with a separate script? This works best with workflows and a 'pull' architecture
* We will soon get example data and scripts from Yang and Frank to test-drive our capabilities and limitations
* Being able to turn raw data formats (machine output) directly into R-ready data matrices would save much time on formatting and checking
* Having a 'universal' importer that supports many specific formats (Agilent, Illumina, Affymetrix, etc.) and does all verification and importing would be ideal
* This importer would be extensible with importers for new formats when needed
* To start with this, make an inventory of the used/popular formats, and estimate how complex each importer would be and how much time it would cost
* Being able to (for example) sort in the matrix viewer and get a trait link-out (either to the database or external) would already be a great help when interpreting results
* For biologists, having results plotted on pathways (such as KEGG) would be an immense help for interpretation
* Having Yang's mislabeled-sample scripts in the system would be great as a quality control step
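One way to read the 'universal', extensible importer idea is a registry that maps a format name to a parser function, so that a new format only needs one new registered function. The sketch below is an illustration under that assumption: the registry, the decorator and the `tab_matrix` format are hypothetical, and no real Agilent, Illumina or Affymetrix file layout is implemented.

```python
import csv
import io

IMPORTERS = {}  # format name -> parser function


def register_importer(format_name):
    """Decorator that adds a parser function to the registry."""
    def decorator(func):
        IMPORTERS[format_name] = func
        return func
    return decorator


@register_importer("tab_matrix")
def import_tab_matrix(text):
    """Parse a tab-separated matrix: header row of samples, one row per probe.

    Verifies the column count of every row and turns empty cells and 'NA'
    into None, so wrong columns are reported instead of silently imported.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        raise ValueError("empty input")
    header, body = rows[0], rows[1:]
    matrix = {}
    for line_no, row in enumerate(body, start=2):
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} columns, got {len(row)}"
            )
        matrix[row[0]] = [None if v in ("", "NA") else float(v) for v in row[1:]]
    return header[1:], matrix


def import_data(format_name, text):
    """Look up the importer for a format and run it on the raw text."""
    try:
        importer = IMPORTERS[format_name]
    except KeyError:
        raise ValueError(f"no importer registered for format {format_name!r}") from None
    return importer(text)


# Hypothetical input: one probe measured on two samples, one value missing.
samples, matrix = import_data("tab_matrix", "probe\tRIL1\tRIL2\nA_12_P1\t1.5\tNA\n")
```

The keys of `IMPORTERS` could also feed the pull-down of format parsers mentioned earlier, and the error messages give the kind of feedback asked for in the error-reporting item.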
[[TOC()]]
This app note can aim for two results:

1. A public version in which we publish our own and our partners' data sets (in due course all of WebQTL). We are well on our way to making a WebQTL killer: think of a myExperiment, Facebook, ArrayAtlas, BioCatalogue or Wikipedia for QTL studies. Maybe something for the name? myqtl.org? qtlatlas.org? qtlpedia.org? qtlcatalogue.org? But better, because unlike with WebQTL I do not have to wait so long for results every time.

2. A downloadable version with which people can run their own xqtl. The nice thing is that they only have to press 'export' to send a data set to us so that it can go into the public catalogue. And we could even implement syndication via the REST API ;-)

* This screen can then immediately indicate whether your set is complete enough for analysis.
* When you click on a resource or data set, you simply get the current overviews (these are fine).
* We can even vary the icons for that once we start tagging the data matrices.
* A helper table must then keep track of the counts per investigation, because it is too expensive to calculate them every time.
* In the standard MOLGENIS forms this must become easier, so change the file menu into 'download' and 'upload'.
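The helper table with counts per investigation, and the completeness check on the overview screen, could be sketched like this. The class, the entity type names and the rule for what counts as "complete enough" are assumptions for illustration; in MOLGENIS this would presumably be a database table updated on insert and delete rather than an in-memory object.

```python
from collections import defaultdict


class InvestigationCounts:
    """Keeps per-investigation counts up to date instead of recounting."""

    def __init__(self):
        self._counts = defaultdict(lambda: defaultdict(int))

    def record_added(self, investigation, entity_type, n=1):
        """Call whenever n records of entity_type are inserted."""
        self._counts[investigation][entity_type] += n

    def record_removed(self, investigation, entity_type, n=1):
        """Call whenever n records of entity_type are deleted."""
        self._counts[investigation][entity_type] -= n

    def overview(self, investigation):
        """Return the stored counts without touching the data itself."""
        return dict(self._counts.get(investigation, {}))

    def is_complete(self, investigation, required=("Marker", "Individual", "Data")):
        """True when every required entity type has at least one record.

        The required types are a hypothetical minimum for a QTL analysis.
        """
        counts = self.overview(investigation)
        return all(counts.get(entity_type, 0) > 0 for entity_type in required)


# Hypothetical investigation that still lacks a data matrix.
counts = InvestigationCounts()
counts.record_added("demo_investigation", "Marker", 400)
counts.record_added("demo_investigation", "Individual", 50)
ready = counts.is_complete("demo_investigation")
```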