wiki:StoryConvertPhenoData

Context Navigation

Version 4 (modified by Erik Roos, 14 years ago) (diff)
--

As a LL data manager, I want to load data from publish layer into EAV (pheno model)

TracNav(MolgenisAppStories)?

Scrum: ticket:1067

How to demo:

Acceptance criteria:

Instructions:

Install Oracle SQL Developer using jorislops@…'s account, or make your own
Adjust date settings (tools -> preferences -> database -> NLS) to YYYY-MM-DD and time settings to YYYY-MM-DD HH.MI.SSXFF AM TZR
Make an SSH tunnel using the command: ssh -L 2000:orawise01.target.rug.nl:15300 username@…
Go to Add Connections, hit the + sign, Hostname = localhost, Port = 2000, Service Name = llpacc / llptest / llp, Username = molgenis, Password = <ask Joris, Robert or Erik>
Connect to one of the databases (prod, acc, test), go to Other Users, go to LLPOPER User, go to Views, select them all and export them to CSV
Now you have the CSV files that form the input of the convertor
VW_DICT_DATA.csv contains all the protocols and measurements
Convertor code is in molgenis_apps/Apps/lifelinespheno, build script is build_lifelines.xml (in project root)
Change the path to the CSV files (first line of ImportMapper.java in package lifelinespheno.org.molgenis.lifelines)
Run the convertor

To-do's:

Add logging (so we can see what going on when it crashes in production environment, if it ever occurs)
- Add Thread Monitor
How to handle/load/implement descriptive tables like LAB_BEPALING, this table is actually big list of Measurements with a lot of extra fields.
- options:
  - Create a new type that extends Measurement and hold the additional fields
  - Merge data into the label of Category
How to handle/load/implement the table that describes which foreign keys are used between the tables.
- The matrix viewer should know this info as well to build correct queries
Re-factor lifelines packages (it a little bit messy), remove old not used code anymore and place in descriptive packages
Remove JPA dependencies
- Many-to-many in JPA are not working properly with Labels, for example ov.setTarget_Name("x"). In JDBCMapper this is solved, but know not where and how we could best do this for JPA. This set by label should also be put into generated test
- Remove change org.molgenis.JpaDatabase interface to prevent this
- trick to prevent compilation problem in Hudson, should be changed!
- this.em = ((JpaDatabase)db).getEntityManager().getEntityManagerFactory().createEntityManager();
- Jpa reverse relation cause hudson to give compile error. Should be added to molgenis for non-jpa entities. And implement as @deprecated of throw unsupportedOperationException.
Update CSV readers to be multi threaded?
In (production) environment it's not a bad idea to put the java executable in the Oracle VM that part of the database.
Last but not least, Test if data is loaded correctly (Test from Anco).
We should make sure that the data is always loaded into the right format (this means that it always end up the right way in database).

Download in other formats:

Plain Text