wiki:StoryConvertPhenoData


As an LL data manager, I want to load data from the publish layer into the EAV (pheno) model


Scrum: ticket:1067

How to demo:

Acceptance criteria:

Instructions:

  • Install Oracle SQL Developer using jorislops@…'s account, or create your own
  • Adjust the date format (Tools -> Preferences -> Database -> NLS) to YYYY-MM-DD and the timestamp format to YYYY-MM-DD HH.MI.SSXFF AM TZR
  • Make an SSH tunnel using the command: ssh -L 2000:orawise01.target.rug.nl:15300 username@…
  • Go to Add Connections, hit the + sign, Hostname = localhost, Port = 2000, Service Name = llpacc / llptest / llp, Username = molgenis, Password = <ask Joris, Robert or Erik>
  • Connect to one of the databases (prod, acc, test), go to Other Users, then to the LLPOPER user, then to Views; select all views except VW_BEP_OMSCHR and VW_DICT_HULP and export them to CSV (separate files)
  • Now you have the CSV files that form the input of the convertor
  • VW_DICT_DATA.csv contains all the protocols and measurements
  • Convertor code is in molgenis_apps/Apps/lifelinespheno, build script is build_lifelines.xml (in project root)
  • Uncomment ImportMapper.java and all four Java files in the 'listeners' package.
  • Change the path to the CSV files (line 46 of ImportMapper.java in package lifelinespheno.org.molgenis.lifelines)
  • Run the convertor as Java application
  • If you encounter errors, find out what the last part of "our" code was where it went wrong. It is probably an issue with an entity that is not saved before being flushed. Go to the offending line and find out which entity is the problem. Scroll to the right in the Console to find out which many-to-one relation of the entity is the cause. Then go to the generated Java class of the entity and uncomment the "cascade" part in the @ManyToOne line just above the line where the relation is defined (see the sketch below this list). Re-run.
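
A minimal sketch of that cascade fix, assuming the generated entity looks roughly like this (the class, field and column names are illustrative, not actual generated MOLGENIS code; Measurement stands for the generated pheno model entity):

import javax.persistence.CascadeType;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.ManyToOne;

// Illustrative stand-in for a generated entity class.
@Entity
public class ObservedValue
{
    @Id
    @GeneratedValue
    private Integer id;

    // In the generated code the cascade attribute is commented out:
    //   @ManyToOne(fetch = FetchType.EAGER /*, cascade = { CascadeType.PERSIST, CascadeType.MERGE } */)
    // Uncommenting it makes JPA persist the referenced Measurement together
    // with this entity, so it is saved before the flush.
    @ManyToOne(fetch = FetchType.EAGER, cascade = { CascadeType.PERSIST, CascadeType.MERGE })
    @JoinColumn(name = "Feature")
    private Measurement feature;

    public Measurement getFeature() { return feature; }
    public void setFeature(Measurement feature) { this.feature = feature; }
}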

Tips from Joris:

  • On Acceptance you have to load the data again!
  • Test has already been loaded!
  • Before loading data, don't forget to change persistence.xml from <property name="hibernate.hbm2ddl.auto" value="none"/> to <property name="hibernate.hbm2ddl.auto" value="create-drop"/> (or something else when a new data set arrives)!

  • A point of attention: before you start the application, don't forget to set this back, otherwise your database will be empty!
  • Adjust <property name="javax.persistence.jdbc.url" value="jdbc:oracle:thin:@//localhost:2000/llptest"/> to the right environment, e.g. jdbc:oracle:thin:@//orawise01.target.rug.nl:15300/llptest or llpacc
  • Test and Acceptance environments: application23 and 24

Short-term to-do's (according to Joris):

  • SPSS exporter
  • Include the filters in the export
  • Clean up the source code (though I am also happy to do this when I am back): take a shovel to it. Everything related to the old importer can go in the bin; only some parts are reused in the new importer, so those need to be adapted.
  • The source code is a mess, but that is because everything was geared towards getting something working; for a list of related to-do's, see the wiki.

If you want to include the matrix filters in the export, you need to activate this by sending along the right variables.

jqGrid.html
// Read the grid's base URL and append the current filter parameters,
// so the same filters are applied to the export request:
var myUrl = jQuery("#list").jqGrid('getGridParam', 'url');
myUrl += "?myextraparam=something";
var postData = jQuery("#list").jqGrid('getGridParam', 'postData');
jQuery.each(postData, function(key, value) {
    myUrl += "&" + key + "=" + encodeURIComponent(value);
});

This piece of code extracts all the parameters from the grid and can therefore be used to apply the filters to the export as well.

To-do's / issues:

  • Dates
  • BEZOEK_ID (PK?) empty in LL_VW_BEZOEK
  • The data in the Test and Acceptance environments are not the same
  • Add logging, so we can see what is going on when it crashes in the production environment, if that ever occurs (a logging sketch follows at the end of this list)
    • Add Thread Monitor
  • How to handle/load/implement descriptive tables like LAB_BEPALING; this table is actually a big list of Measurements with a lot of extra fields.
    • options:
      • Create a new type that extends Measurement and holds the additional fields (see the sketch below this list)
      • Merge data into the label of Category
  • How to handle/load/implement the table that describes which foreign keys are used between the tables.
    • The matrix viewer should know this info as well to build correct queries
  • Refactor the lifelines packages (they are a little bit messy): remove old code that is no longer used and organize the rest into descriptively named packages
  • Remove JPA dependencies
    • Many-to-many relations in JPA do not work properly with labels, for example ov.setTarget_Name("x"). In JDBCMapper this is solved, but it is not yet clear where and how we could best do this for JPA. Setting by label should also be covered in the generated tests.
    • Remove/change the org.molgenis.JpaDatabase interface to prevent this
    • The following is a trick to prevent a compilation problem in Hudson and should be changed:
    • this.em = ((JpaDatabase)db).getEntityManager().getEntityManagerFactory().createEntityManager();
    • JPA reverse relations cause Hudson to give a compile error. They should be added to molgenis for non-JPA entities as well, implemented as @Deprecated or by throwing an UnsupportedOperationException.
  • Update the CSV readers to be multi-threaded? (see the sketch at the end of this list)
  • In the (production) environment it's not a bad idea to put the Java executable in the Oracle VM that is part of the database.
  • Last but not least, test whether the data is loaded correctly (test from Anco).
  • We should make sure that the data is always loaded in the right format (meaning it always ends up the right way in the database).
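
For the logging to-do above, a minimal sketch using the standard java.util.logging API; the logger name, log file name and the conversion step are placeholders, not existing convertor code:

import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

public class ConvertorLogging
{
    private static final Logger LOG = Logger.getLogger("lifelines.convertor");

    public static void main(String[] args) throws Exception
    {
        // Write the log to a file so a crash in production leaves a trace.
        FileHandler handler = new FileHandler("convertor.log");
        handler.setFormatter(new SimpleFormatter());
        LOG.addHandler(handler);

        LOG.info("Starting import of VW_DICT_DATA.csv");
        try
        {
            // Placeholder for an actual conversion step.
            throw new IllegalStateException("entity not saved before flush");
        }
        catch (RuntimeException e)
        {
            // Log the full stack trace instead of letting the run die silently.
            LOG.log(Level.SEVERE, "Import failed", e);
            throw e;
        }
    }
}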
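
For the LAB_BEPALING item, a sketch of the first option (a new type that extends Measurement). The extra fields here are made up for illustration, and this assumes the generated Measurement entity is mapped so it can be subclassed (e.g. JPA's default SINGLE_TABLE inheritance):

import javax.persistence.Column;
import javax.persistence.Entity;

// Hypothetical subtype carrying the extra LAB_BEPALING columns;
// the real LAB_BEPALING fields differ.
@Entity
public class LabMeasurement extends Measurement
{
    @Column
    private String unit;

    @Column
    private String referenceRange;

    public String getUnit() { return unit; }
    public void setUnit(String unit) { this.unit = unit; }

    public String getReferenceRange() { return referenceRange; }
    public void setReferenceRange(String referenceRange) { this.referenceRange = referenceRange; }
}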
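
For the multi-threaded CSV reader question, a rough sketch of one possible approach with a fixed thread pool; parseAndLoad and the file list are placeholders for the actual reader and loading code:

import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelCsvLoad
{
    public static void main(String[] args) throws InterruptedException
    {
        // The exported views are independent files, so they can be parsed in
        // parallel. A pool size of 4 is an arbitrary starting point.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // Placeholder file list; in reality this would be the full set of
        // views exported in the instructions above.
        List<File> csvFiles = Arrays.asList(
                new File("VW_DICT_DATA.csv"),
                new File("LL_VW_BEZOEK.csv"));

        for (final File csv : csvFiles)
        {
            pool.submit(new Runnable()
            {
                public void run()
                {
                    // parseAndLoad stands in for the actual CSV reader plus
                    // EAV loading code. Note: each task must use its own
                    // EntityManager, since EntityManagers are not thread-safe.
                    parseAndLoad(csv);
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }

    private static void parseAndLoad(File csv)
    {
        System.out.println("Loading " + csv.getName()); // placeholder body
    }
}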