Comparative Analysis of Genetic Screens

Summary

description: Map screened genes to human orthologs
developers: Alexandros Kanterakis
homepage: NA
svn: NA
state: production

Introduction

The study comes from the publication: Tjakko J. van Ham, Rainer Breitling, Morris A. Swertz, Ellen A. A. Nollen. Neurodegenerative diseases: Lessons from genome-wide screens in small model organisms. EMBO Molecular Medicine Volume 1, Issue 8-9, Pages 360-370. DOI: 10.1002/emmm.200900051. http://www.embomolmed.org/view/NDc0NzI5L0pBLzE3NDE5MC9udWxs/journalArticlePdf.html

This study contains a list of screened genes from three small model organisms (Drosophila, C. elegans and S. cerevisiae). The list can be found in the supplementary material of the article: http://www3.interscience.wiley.com/cgi-bin/fulltext/123192451/sm001.xls?PLACEBO=IE.pdf

Step 1

The purpose of the first step is to identify the human orthologues of the published genes. The approach we followed is:

  • Suppose that a study reports the gene dTPR2 from Drosophila. We query Ensembl through the following URL: http://www.ensembl.org/Drosophila_melanogaster/Search/Details?species=Drosophila_melanogaster;idx=;q=dTPR2

We then save the resulting HTML page to a file, parse it, and extract all the identifiers (nomenclatures) it lists. The Ensembl identifiers that matter for this analysis are the associated peptides.
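The query-and-parse step above can be sketched as follows. This is a minimal illustration, not the project's ensembl.py: the function names, the regular expression for peptide identifiers, and the sample HTML fragment are all assumptions made for the example, and the layout of the real Ensembl result page may differ.

```python
import re

# URL pattern taken from the example above.
ENSEMBL_SEARCH = "http://www.ensembl.org/{species}/Search/Details?species={species};idx=;q={query}"

# Assumed identifier shapes: FlyBase peptides such as FBpp0080423 and
# human Ensembl peptides such as ENSP00000313311.
PEPTIDE_ID = re.compile(r"\b(?:FBpp\d{7}|ENSP\d{11})\b")


def build_search_url(species, query):
    """Build the Ensembl search URL for one species and one query term."""
    return ENSEMBL_SEARCH.format(species=species, query=query)


def extract_peptide_ids(html):
    """Return distinct peptide identifiers in order of first appearance."""
    seen = []
    for match in PEPTIDE_ID.findall(html):
        if match not in seen:
            seen.append(match)
    return seen


# Hypothetical fragment standing in for the saved HTML page.
saved_page = (
    "<li>Peptide: <a>FBpp0080423</a></li>"
    "<li>Peptide: <a>FBpp0080424</a></li>"
    "<li>FBpp0080423</li>"
)
ids = extract_peptide_ids(saved_page)
print(ids)  # ['FBpp0080423', 'FBpp0080424']
```

Matching identifiers by their shape rather than by page structure keeps the sketch independent of the exact HTML, at the cost of also picking up identifiers that appear elsewhere on the page.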

  • The associated peptides for dTPR2 in Drosophila are FBpp0080423 and FBpp0080424. From inParanoid we can download the lists of orthologues between many pairs of species: http://inparanoid.sbc.su.se/download/current/sqltables/. For example, the orthologues between Homo sapiens and Drosophila are in the file sqltable.D.melanogaster.fa-H.sapiens.fa. This file has entries like:
2	5453	D.melanogaster.fa	1.000	FBpp0271744	100%
2	5453	H.sapiens.fa	1.000	ENSP00000300671	99%
2	5453	H.sapiens.fa	0.089	ENSP00000262442

The first column is the orthologue group identifier and the second is the bit score of the group's seed orthologue pair; the fourth column is the inParanoid confidence score of each protein (1.000 for the seed orthologues). By parsing this file we can locate the human orthologue peptides of the Drosophila peptides. For example, the human orthologue of the Drosophila peptide FBpp0080423 is ENSP00000313311.
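The lookup described above can be sketched as follows. This is a hedged example rather than the project's inParanoid.py: the function names are hypothetical, and the code assumes tab-separated rows in which column 1 is the group, column 3 the species, column 4 the score and column 5 the peptide, as in the sample entries shown.

```python
import csv
from collections import defaultdict


def load_groups(lines):
    """Group sqltable rows by orthologue group id (first column)."""
    groups = defaultdict(list)
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        group_id, _score, species, confidence, protein = row[:5]
        groups[group_id].append((species, protein, float(confidence)))
    return groups


def orthologues(groups, peptide, target_species):
    """Return (peptide, score) pairs of target_species sharing a group with peptide."""
    hits = []
    for members in groups.values():
        if any(protein == peptide for _, protein, _ in members):
            hits.extend((p, c) for s, p, c in members if s == target_species)
    return hits


# The three sample rows from the text; a real run would pass an open file instead.
sample = [
    "2\t5453\tD.melanogaster.fa\t1.000\tFBpp0271744\t100%",
    "2\t5453\tH.sapiens.fa\t1.000\tENSP00000300671\t99%",
    "2\t5453\tH.sapiens.fa\t0.089\tENSP00000262442",
]
groups = load_groups(sample)
hits = orthologues(groups, "FBpp0271744", "H.sapiens.fa")
print(hits)  # [('ENSP00000300671', 1.0), ('ENSP00000262442', 0.089)]
```

Keeping the score next to each hit makes it possible to print it beside the human peptide, or to filter out low-scoring members of a group later.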

  • The next step is to identify the genes that encode these proteins. We do this with the same script that extracted the identifiers from Ensembl, this time using the following URL: http://www.ensembl.org/Homo_sapiens/Search/Details?species=Homo_sapiens;idx=;q=ENSP00000313311

The final result is the list of gene names and identifiers TTC2 DANJC7 TPR2 DNAJC7 7266 DJ11, which correspond to the human orthologue of the initial Drosophila gene, dTPR2.

  • The applied scripts are: 
    • /Comparative Genomics Analysis/InParanoid/python/ensembl.py. Queries Ensembl and extracts a list of all available identifiers.
    • /Comparative Genomics Analysis/InParanoid/python/inParanoid.py. Looks up the SQL table files of the inParanoid repository and returns the orthologue peptides from one species to another according to inParanoid.
    • /Comparative Genomics Analysis/InParanoid/python/prepare.py. This script runs on top of the other two.
  • Results look like this:
    study: 4 Kazemi-Esfarjani et al., 2000 (Drosophilla)
    Protein folding (PF) 0
    	gene: dTPR2
    		protein: FBpp0080423
    			ENSP00000313311 (1.0)
    				TTC2 DANJC7 TPR2 DNAJC7 7266 DJ11 
    		protein: FBpp0080424
    			None
    	gene: dHDJ1
    		protein: FBpp0076829
    			ENSP00000368026 (1.0)
    				DNAJB5 25822 Hsc40 KIAA1045 
    			ENSP00000359799 (1.0)
    				DjB4 DNAJB4 11080 HLJ1 DNAJW 
    			ENSP00000254322 (0.316)
    				DNAJB1 Sis1 3337 Hsp40 HSPF1 Hdj1 
    

Step 2

Following a discussion with Ellen Nollen, the next step is to perform exactly the same analysis in the opposite direction: for every identified human gene, locate the orthologue in the small model organism and check whether it matches the original gene. Matches are flagged in the output (the lines delimited by #^ and $#). The results look like this:

study: 2  Giorgini et al., 2005 (Yeast)
	(Vesicle) transport/ER-Golgi/trafficking	Vesicular transport, vacuolar protein sorting
		gene: BFR1
			protein: YOR198C
				Orthologue:	None
		gene: CYK3
			protein: YDL117W
				Orthologue:	None
		gene: DEF1
			protein: YKL054C
				Orthologue:	None
		gene: MSO1
			protein: YNR049C
				Orthologue:	None
		gene: SNA2
			protein: YDR525W-A
				Orthologue:	None
		gene: VPS53
			protein: YJL029C
				Orthologue:	ENSP00000373692  (1.0)
					pp13624
					gene: pp13624
						protein: ENSP00000291074
							Orthologue:	None
						protein: ENSP00000373692
							Orthologue:	YJL029C  (1.0)
								853423	VPS53	#^VPS53##YJL029C##ENSP00000373692##pp13624##ENSP00000373692##YJL029C##VPS53$#

						protein: ENSP00000384294
							Orthologue:	None
						protein: ENSP00000394386
							Orthologue:	None
						protein: ENSP00000401435
							Orthologue:	None
					FLJ10979
					gene: FLJ10979
						protein: ENSP00000291074
							Orthologue:	None
						protein: ENSP00000373692
							Orthologue:	YJL029C  (1.0)
								853423	VPS53	#^VPS53##YJL029C##ENSP00000373692##FLJ10979##ENSP00000373692##YJL029C##VPS53$#

						protein: ENSP00000384294
							Orthologue:	None
						protein: ENSP00000394386
							Orthologue:	None
						protein: ENSP00000401435
							Orthologue:	None
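The reciprocal check of Step 2 can be sketched as follows. This is an illustration under assumptions, not the project's prepare.py: the function names and the dictionary-based inputs are hypothetical, and the field order of the marker line is inferred from the VPS53 example in the output above.

```python
def reciprocal_matches(forward, reverse):
    """Return (model_peptide, human_peptide) pairs confirmed in both directions.

    forward maps a model-organism peptide to the set of its human orthologues;
    reverse maps a human peptide to the set of its model-organism orthologues.
    """
    confirmed = []
    for model_pep, human_peps in forward.items():
        for human_pep in sorted(human_peps):
            if model_pep in reverse.get(human_pep, ()):
                confirmed.append((model_pep, human_pep))
    return confirmed


def marker_line(fields):
    """Join the fields of a confirmed match into a '#^...$#' marker."""
    return "#^" + "##".join(fields) + "$#"


def parse_marker(line):
    """Extract the list of fields from an output line containing a marker."""
    start = line.index("#^") + 2
    end = line.index("$#", start)
    return line[start:end].split("##")


# Values taken from the VPS53 example in the output above.
forward = {"YJL029C": {"ENSP00000373692"}}
reverse = {"ENSP00000373692": {"YJL029C"}, "ENSP00000291074": set()}
confirmed = reciprocal_matches(forward, reverse)

marker = marker_line(["VPS53", "YJL029C", "ENSP00000373692", "pp13624",
                      "ENSP00000373692", "YJL029C", "VPS53"])
fields = parse_marker("853423\tVPS53\t" + marker)
```

Because the marker has a fixed start and end delimiter, confirmed matches can later be pulled out of the full output with a simple line scan, without re-parsing the indented tree.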