Changes between Version 1 and Version 2 of Courses/ComputationalMolecularBiologyResearch2016/P2


Timestamp:
2016-01-29T18:35:50+01:00 (9 years ago)
Author:
Pieter Neerincx
Comment:

--

  • Courses/ComputationalMolecularBiologyResearch2016/P2

    v1 v2  
    1 = T =
     1= Benchmarking RNAseq gene expression quantification tools =
    22
    33== Supervisors ==
    44
     5Niek de Klein and Freerk van Dijk
    56
    67== Introduction ==
    78
     9To help improve clinical diagnosis of genetic disease, it is important to measure the activity of all of a patient's genes. Gene activity can be quantified using Next Generation Sequencing to measure the level of RNA present in the cells. The classical approach is to first align the reads from the RNA sequencing experiment to the genome and then count the number of reads that overlap each gene. Two examples of programs that do this are HTSeqCount [1] and FeatureCounts [2]. Recently, tools have been developed that perform the quantification using pseudo-alignment instead of aligning to a reference genome first; examples are Kallisto [3] and Salmon [4]. The advantage of these alignment-free quantification methods is that they require less computation time.
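The classical counting step described above can be sketched in a few lines of Python. This is a minimal illustration only, not how HTSeqCount or FeatureCounts are implemented: the function name `count_reads_per_gene`, the tuple layout, the half-open coordinates, and the choice to skip reads that overlap more than one gene are all assumptions made for the example. Real tools also handle strand, exon structure, and multi-mapping reads.

```python
def count_reads_per_gene(genes, reads):
    """Count aligned reads per gene (hypothetical helper for illustration).

    genes: dict mapping gene id -> (chrom, start, end), half-open intervals
    reads: iterable of (chrom, start, end) for already-aligned reads
    """
    counts = {gene_id: 0 for gene_id in genes}
    for chrom, r_start, r_end in reads:
        # A read overlaps a gene when both lie on the same chromosome
        # and their intervals intersect.
        hits = [
            gene_id
            for gene_id, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and r_start < g_end and g_start < r_end
        ]
        # Assumption: reads hitting zero or several genes are not counted.
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts


# Toy data: two overlapping genes and four aligned reads.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 150, 300)}
reads = [
    ("chr1", 90, 120),   # overlaps geneA only
    ("chr1", 160, 190),  # overlaps both genes, skipped as ambiguous
    ("chr1", 250, 290),  # overlaps geneB only
    ("chr2", 100, 150),  # no gene on this chromosome
]
counts = count_reads_per_gene(genes, reads)
print(counts)
```

The linear scan over all genes for every read is fine for a toy example; at the scale of a real experiment an interval index would be needed.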
     10
     11Currently, we are analysing over 30,000 public RNA sequencing samples, and we also want to include gene quantification for downstream analyses. Due to the large size of this dataset, we want to use the fastest and most accurate tool available.
     12
     13[1] Simon Anders, Paul Theodor Pyl, Wolfgang Huber
     14    HTSeq — A Python framework to work with high-throughput sequencing data
     15    Bioinformatics (2014), in print, online at doi:10.1093/bioinformatics/btu638
     16[2] Liao, Yang, Gordon K. Smyth, and Wei Shi. "featureCounts: an efficient general purpose program for assigning sequence reads to genomic features." Bioinformatics 30.7 (2014): 923-930.
     17[3] Bray, Nicolas, et al. "Near-optimal RNA-Seq quantification." arXiv preprint arXiv:1505.02710 (2015).
     18[4] Patro, Rob, Geet Duggal, and Carl Kingsford. "Salmon: Accurate, Versatile and Ultrafast Quantification from RNA-seq Data using Lightweight-Alignment." bioRxiv (2015): 021592.
    819
    920== Project 2 - ==
    1021
     22* Literature study of gene quantification methods
     23* Designing a plan for comparing quality of quantification methods
     24* Comparison of HTSeq count, FeatureCounts, Kallisto and a selection of other available quantification methods identified in literature
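One possible ingredient of such a comparison is a rank correlation between the per-gene estimates of two tools on the same sample. The sketch below is an assumption about how this could be done, not the plan of the project: the helper names (`average_ranks`, `pearson`, `spearman`) and the toy numbers are hypothetical, and Spearman correlation is only one of several reasonable agreement measures. It uses plain Python so it runs without extra packages.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; requires non-constant input lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(quant_a, quant_b):
    """Spearman correlation over the genes reported by both tools."""
    shared = sorted(set(quant_a) & set(quant_b))
    ranks_a = average_ranks([quant_a[g] for g in shared])
    ranks_b = average_ranks([quant_b[g] for g in shared])
    return pearson(ranks_a, ranks_b)


# Made-up values for illustration only: raw counts from one tool and
# estimated abundances from another, keyed by gene id.
tool_a = {"g1": 10, "g2": 200, "g3": 35, "g4": 0}
tool_b = {"g1": 12.5, "g2": 180.0, "g3": 40.1, "g4": 0.3}
rho = spearman(tool_a, tool_b)
print(round(rho, 3))
```

Working on ranks rather than raw values sidesteps the fact that different tools report different units (read counts versus estimated abundances), at the cost of ignoring the size of the differences.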