= Comparative Analysis of Genetic Screens =
== Summary ==
||description:||Map screened genes to human orthologs||
||developers:||AlexandrosKanterakis||
||homepage:||NA||
||svn:||NA||
||state:||production||

== Introduction ==
This analysis is based on the publication: Tjakko J. van Ham, Rainer Breitling, Morris A. Swertz, Ellen A. A. Nollen. Neurodegenerative diseases: Lessons from genome-wide screens in small model organisms. EMBO Molecular Medicine, Volume 1, Issue 8-9, Pages 360-370. DOI: 10.1002/emmm.200900051. http://www.embomolmed.org/view/NDc0NzI5L0pBLzE3NDE5MC9udWxs/journalArticlePdf.html

The publication lists screened genes from three small model organisms (Drosophila, C. elegans and S. cerevisiae). The list is available in the supplementary material of the article: http://www3.interscience.wiley.com/cgi-bin/fulltext/123192451/sm001.xls?PLACEBO=IE.pdf

== Step 1 ==
The purpose of the first step is to identify the human orthologues of the published genes. The approach we followed is:

* Suppose that the gene under study is dTPR2 from Drosophila. We query Ensembl through the following URL: http://www.ensembl.org/Drosophila_melanogaster/Search/Details?species=Drosophila_melanogaster;idx=;q=DTRP2

Then we save the resulting HTML page to a file, parse it, and extract all the identifiers (nomenclatures) it lists. The Ensembl identifiers that matter for this analysis are the associated peptides.
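
The query-and-parse step can be sketched as follows. This is only an illustration, not the contents of ensembl.py: the function names are hypothetical, the URL template simply follows the pattern quoted above (the live Ensembl site may no longer answer to it), and picking out peptide IDs with a regular expression is an assumption about what the saved page contains.

```python
import re
from urllib.parse import quote

# URL pattern taken from the text above; it may not match the current Ensembl site.
ENSEMBL_SEARCH = (
    "http://www.ensembl.org/{species}/Search/Details"
    "?species={species};idx=;q={query}"
)

# Assumed ID shapes: FlyBase peptides (FBpp + 7 digits), Ensembl human peptides (ENSP + 11 digits).
PEPTIDE_ID = re.compile(r"\b(?:FBpp\d{7}|ENSP\d{11})\b")


def ensembl_search_url(species, query):
    """Build the search URL for one gene or peptide name (hypothetical helper)."""
    return ENSEMBL_SEARCH.format(species=species, query=quote(query))


def extract_peptide_ids(html):
    """Return the peptide IDs found in a saved page, in order of first appearance."""
    seen = []
    for peptide in PEPTIDE_ID.findall(html):
        if peptide not in seen:
            seen.append(peptide)
    return seen
```

The saved HTML file would be read into a string and passed to `extract_peptide_ids`; no network access happens in the sketch itself.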

* The associated peptides for dTPR2 in Drosophila are FBpp0080423 and FBpp0080424. From inParanoid (http://inparanoid.sbc.su.se/download/current/sqltables/) we can download the lists of orthologues between pairs of species. For example, the orthologues between Homo sapiens and Drosophila are in the file sqltable.D.melanogaster.fa-H.sapiens.fa. This file has entries like:

{{{
2 5453 D.melanogaster.fa 1.000 FBpp0271744 100%
2 5453 H.sapiens.fa 1.000 ENSP00000300671 99%
2 5453 H.sapiens.fa 0.089 ENSP00000262442
}}}
The first column is the ID of the orthologue group and the second is the alignment score of that group, in bits. The third column is the species, the fourth is the confidence with which the protein belongs to the group, and the fifth is the protein ID. By parsing this file we can find the human orthologue peptides of the Drosophila peptides. For example, the Drosophila peptide FBpp0080423 is the orthologue of the human peptide ENSP00000313311.
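
A minimal sketch of that lookup, using the three sample rows shown above. It is not the actual inParanoid.py: the function names are hypothetical, and splitting on whitespace assumes the columns contain no embedded spaces.

```python
from collections import defaultdict


def parse_inparanoid(lines):
    """Group rows of an inParanoid sqltable file by orthologue group ID."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed rows
        group_id, _score, species, confidence, protein = fields[:5]
        groups[group_id].append((species, protein, float(confidence)))
    return groups


def orthologues(groups, protein, target_species):
    """Return (protein ID, confidence) pairs of target_species in the groups containing protein."""
    hits = []
    for members in groups.values():
        if any(member == protein for _, member, _ in members):
            hits.extend((p, c) for s, p, c in members if s == target_species)
    return hits


sample = [
    "2 5453 D.melanogaster.fa 1.000 FBpp0271744 100%",
    "2 5453 H.sapiens.fa 1.000 ENSP00000300671 99%",
    "2 5453 H.sapiens.fa 0.089 ENSP00000262442",
]
groups = parse_inparanoid(sample)
result = orthologues(groups, "FBpp0271744", "H.sapiens.fa")
```

For the sample rows, `result` holds both human peptides of group 2 together with their confidence values, which is the number printed in parentheses in the results listing.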

* The next step is to identify the genes that encode these proteins. We do this with the same script that extracted the identifiers from Ensembl, this time with the following URL: http://www.ensembl.org/Home_sapiens/Search/Details?species=Homo_sapiens;idx=;q=ENSP0080423

The final result is the list of gene names TTC2 DANJC7 TPR2 DNAJC7 7266 DJ11, which are the human orthologues of the initial Drosophila gene dTPR2.

* The applied scripts are:
  * /Comparative Genomics Analysis/InParanoid/python/ensembl.py. It queries Ensembl and extracts a list of all available identifiers.
  * /Comparative Genomics Analysis/InParanoid/python/inParanoid.py. It looks up the SQL table files of the inParanoid repository and returns the orthologue peptides from one species to another according to inParanoid.
  * /Comparative Genomics Analysis/InParanoid/python/prepare.py. This script runs on top of the other two.
* Results look like this:
{{{
study: 4 Kazemi-Esfarjani et al., 2000 (Drosophilla)
Protein folding (PF) 0
gene: dTPR2
protein: FBpp0080423
ENSP00000313311 (1.0)
TTC2 DANJC7 TPR2 DNAJC7 7266 DJ11
protein: FBpp0080424
None
gene: dHDJ1
protein: FBpp0076829
ENSP00000368026 (1.0)
DNAJB5 25822 Hsc40 KIAA1045
ENSP00000359799 (1.0)
DjB4 DNAJB4 11080 HLJ1 DNAJW
ENSP00000254322 (0.316)
DNAJB1 Sis1 3337 Hsp40 HSPF1 Hdj1
}}}
== Step 2 ==
Following a discussion with Ellen Nollen, the next step is to perform the same analysis in the reverse direction: for every identified human gene, locate the orthologue in the small model organism and check whether it matches the gene we started from. Matches are flagged in the output by the lines delimited with #^ and $#. The results look like this:
{{{
study: 2 Giorgini et al., 2005 (Yeast)
(Vesicle) transport/ER-Golgi/trafficking Vesicular transport, vacuolar protein sorting
gene: BFR1
protein: YOR198C
Orthologue: None
gene: CYK3
protein: YDL117W
Orthologue: None
gene: DEF1
protein: YKL054C
Orthologue: None
gene: MSO1
protein: YNR049C
Orthologue: None
gene: SNA2
protein: YDR525W-A
Orthologue: None
gene: VPS53
protein: YJL029C
Orthologue: ENSP00000373692 (1.0)
pp13624
gene: pp13624
protein: ENSP00000291074
Orthologue: None
protein: ENSP00000373692
Orthologue: YJL029C (1.0)
853423 VPS53 #^VPS53##YJL029C##ENSP00000373692##pp13624##ENSP00000373692##YJL029C##VPS53$#

protein: ENSP00000384294
Orthologue: None
protein: ENSP00000394386
Orthologue: None
protein: ENSP00000401435
Orthologue: None
FLJ10979
gene: FLJ10979
protein: ENSP00000291074
Orthologue: None
protein: ENSP00000373692
Orthologue: YJL029C (1.0)
853423 VPS53 #^VPS53##YJL029C##ENSP00000373692##FLJ10979##ENSP00000373692##YJL029C##VPS53$#

protein: ENSP00000384294
Orthologue: None
protein: ENSP00000394386
Orthologue: None
protein: ENSP00000401435
Orthologue: None
}}}
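
The reverse check can be sketched as a reciprocal lookup over two maps, one per direction. This is an illustration under assumptions, not the code of prepare.py: the function names and the dictionary layout are hypothetical, and the field order of the marker line is inferred from the sample output above.

```python
def reciprocal_hits(forward, backward):
    """Return (model protein, human protein) pairs confirmed in both directions.

    forward:  model-organism protein -> set of human orthologue proteins
    backward: human protein -> set of model-organism orthologue proteins
    """
    matches = []
    for model_protein, human_proteins in forward.items():
        for human_protein in sorted(human_proteins):
            if model_protein in backward.get(human_protein, ()):
                matches.append((model_protein, human_protein))
    return matches


def match_marker(model_gene, model_protein, human_protein, human_gene):
    """Format a match as the round trip gene -> protein -> human -> back (order inferred)."""
    path = [
        model_gene, model_protein, human_protein, human_gene,
        human_protein, model_protein, model_gene,
    ]
    return "#^" + "##".join(path) + "$#"


forward = {"YJL029C": {"ENSP00000373692"}, "YOR198C": set()}
backward = {"ENSP00000373692": {"YJL029C"}, "ENSP00000291074": set()}
hits = reciprocal_hits(forward, backward)
marker = match_marker("VPS53", "YJL029C", "ENSP00000373692", "pp13624")
```

With the VPS53 data from the listing, the only confirmed pair is YJL029C and ENSP00000373692, and `marker` reproduces the flagged line for the human gene name pp13624.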