Changes between Version 2 and Version 3 of Courses/ComputationalMolecularBiologyResearch2016/P2


Timestamp:
2016-01-29T18:40:08+01:00
Author:
Pieter Neerincx

  • Courses/ComputationalMolecularBiologyResearch2016/P2

    v2 v3  
    77== Introduction ==
    88
    9 To help improve the clinical diagnosis of genetic disease, it is important to measure the activity of all genes of a patient. Gene activity can be quantified using Next Generation Sequencing to measure the level of RNA that is present in the cells. The classical way to do this is by first aligning reads from the RNA sequencing experiment to the genome, and then counting the number of reads that overlap with a gene. Two examples of programs that do this are HTSeqCount [1] and FeatureCounts [2]. Recently, tools have been developed that perform the quantification using pseudo-alignment instead of first aligning to a reference genome; examples are Kallisto [3] and Salmon [4]. The advantage of these alignment-free quantification methods is that they require less computation time.
     9To help improve the clinical diagnosis of genetic disease, it is important to measure the activity of all genes of a patient. Gene activity can be quantified using Next Generation Sequencing to measure the level of RNA that is present in the cells. The classical way to do this is by first aligning reads from the RNA sequencing experiment to the genome, and then counting the number of reads that overlap with a gene. Two examples of programs that do this are HTSeqCount (1) and !FeatureCounts (2). Recently, tools have been developed that perform the quantification using pseudo-alignment instead of first aligning to a reference genome; examples are Kallisto (3) and Salmon (4). The advantage of these alignment-free quantification methods is that they require less computation time.
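The classical counting step described above can be illustrated with a minimal Python sketch. It assumes the reads have already been aligned and reduced to genomic intervals; the function name `count_reads_per_gene`, the toy coordinates, and the rule of discarding reads that overlap more than one gene are illustrative assumptions, not how HTSeq or featureCounts are implemented.

```python
from collections import Counter


def count_reads_per_gene(genes, reads):
    """Count aligned reads that overlap exactly one gene.

    genes: dict mapping gene name -> (chromosome, start, end), inclusive
    reads: iterable of (chromosome, start, end), one tuple per aligned read
    """
    counts = Counter({name: 0 for name in genes})
    for chrom, r_start, r_end in reads:
        # A read overlaps a gene when both lie on the same chromosome
        # and their intervals share at least one position.
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and r_start <= g_end and r_end >= g_start
        ]
        # Assumption: reads hitting several genes are ambiguous and skipped.
        if len(hits) == 1:
            counts[hits[0]] += 1
    return dict(counts)


# Hypothetical toy data: two overlapping genes and four aligned reads.
genes = {"geneA": ("chr1", 100, 500), "geneB": ("chr1", 450, 900)}
reads = [
    ("chr1", 120, 170),  # only geneA
    ("chr1", 480, 530),  # geneA and geneB: ambiguous, not counted
    ("chr1", 600, 650),  # only geneB
    ("chr2", 120, 170),  # no gene on this chromosome
]
counts = count_reads_per_gene(genes, reads)
print(counts)
```

The linear scan over all genes for every read is only suitable for a toy example; with millions of reads an indexed interval structure would be needed, which is part of why the alignment-based route is computationally expensive.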
    1010
    1111Currently, we are analysing over 30,000 public RNA sequencing samples, and we also want to include gene quantification for downstream analyses. Due to the large size of this dataset, we want to use the fastest and most accurate tool available.
    1212
    13 [1] Simon Anders, Paul Theodor Pyl, Wolfgang Huber
    14     HTSeq — A Python framework to work with high-throughput sequencing data
    15     Bioinformatics (2014), in print, online at doi:10.1093/bioinformatics/btu638
    16 [2] Liao, Yang, Gordon K. Smyth, and Wei Shi. "featureCounts: an efficient general purpose program for assigning sequence reads to genomic features." Bioinformatics 30.7 (2014): 923-930.
    17 [3] Bray, Nicolas L., et al. "Near-optimal probabilistic RNA-seq quantification." Nature Biotechnology 34.5 (2016): 525-527.
    18 [4] Patro, Rob, Geet Duggal, and Carl Kingsford. "Salmon: Accurate, Versatile and Ultrafast Quantification from RNA-seq Data using Lightweight-Alignment." bioRxiv (2015): 021592.
     131. Simon Anders, Paul Theodor Pyl, Wolfgang Huber [[BR]]
     14   ''HTSeq — A Python framework to work with high-throughput sequencing data'' [[BR]]
     15   Bioinformatics (2014), in print, online at doi:10.1093/bioinformatics/btu638
     162. Liao, Yang, Gordon K. Smyth, and Wei Shi. [[BR]]
     17   ''featureCounts: an efficient general purpose program for assigning sequence reads to genomic features.'' [[BR]]
     18   Bioinformatics 30.7 (2014): 923-930.
     193. Bray, Nicolas L., et al. [[BR]]
     20   ''Near-optimal probabilistic RNA-seq quantification.'' [[BR]]
     21   Nature Biotechnology 34.5 (2016): 525-527.
     224. Patro, Rob, Geet Duggal, and Carl Kingsford. [[BR]]
     23   ''Salmon: Accurate, Versatile and Ultrafast Quantification from RNA-seq Data using Lightweight-Alignment.'' [[BR]]
     24   bioRxiv (2015): 021592.
    1925
    20 == Project 2 - ==
     26== Project 2 - RNAseq quantification benchmark ==
    2127
    2228* Literature study of gene quantification methods
    2329* Designing a plan for comparing quality of quantification methods
    24 * Comparison of HTSeq count, FeatureCounts, Kallisto and a selection of other available quantification methods identified in literature
     30* Comparison of HTSeq count, !FeatureCounts, Kallisto and a selection of other available quantification methods identified in literature
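As one possible starting point for the comparison plan, the sketch below computes the Spearman rank correlation between the per-gene values reported by two quantification tools. The choice of Spearman correlation as a concordance measure, the function names, and the example values are all assumptions made for illustration; the project itself does not prescribe a metric.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(quant_a, quant_b):
    """Spearman correlation over the genes reported by both tools."""
    shared = sorted(set(quant_a) & set(quant_b))
    x = [quant_a[g] for g in shared]
    y = [quant_b[g] for g in shared]
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene values from two tools (not real measurements).
tool_a = {"g1": 10, "g2": 200, "g3": 35, "g4": 0}
tool_b = {"g1": 12.5, "g2": 180.0, "g3": 40.1, "g4": 0.3}
rho = spearman(tool_a, tool_b)
print(rho)
```

A rank-based measure is used here because alignment-based tools report raw read counts while pseudo-alignment tools typically report estimated abundances, so the values are not directly comparable on the same scale. Correlation between tools only measures agreement; judging accuracy would additionally require a reference such as simulated data with known expression levels.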