wiki:DespoinaLog/2010/12/09

Inconsistency with biobank table (pheno & gcc project)

problem:

The bbmri.xml describes the table as follows :
<entity name="Biobank" extends="Panel">
<description>Describes one biobank cohort</description>
<field name="Acronym" nillable="true" description="Acronym for this
biobank" />
<field name="Institutes" nillable="true" type="mref"
xref_entity="Institute" mref_name="BiobankInstitute"
description="Institute that are custodian of this biobank" />
<field name="Publications" type="mref" xref_entity="Publication"
description="Publications for this biobank" />
<field name="Category" type="xref" xref_entity="OntologyTerm" />
<field name="Type" type="enum" enum_options="[core,support,other]" />
<field name="Contacts" type="mref" xref_entity="Contact"
mref_name="BiobankContact" description="Contact for this biobank" />
<field name="Topics" type="mref" xref_entity="OntologyTerm" />
<field name="Materials" type="mref" xref_entity="OntologyTerm" />
<field name="Size" nillable="true" />
<field name="LastUpdate" type="date" auto="true" />
<field name="Description" type="text" nillable="true" />
</entity>

Similarly the BBMRIupdateDatabase tries to create these fields. But
the created table in DB is missing the fields:

Institutes (mref), Publications(mref), Contacts (mref), Topics
(mref), Materials (mref).

Is this how was supposed to work? Because it causes a problem in mysql
input from the previous scheme (in pheno).

Approach 1)Solution

  • create new table biobank from pheno project as follows:

mysql> create table biobank_new select Acronym, Category, Type, Size, LastUpdate , Description, id from biobank ;

(only fields that exist in biobank in gcc project)

  • Dump and insert new table in gcc project.
    • $ mysqldump5 -u root -p bbmri_old biobank_new > biobank_new.mysql
  • Edit biobank_new.mysql and replace biobank_new (table) with biobank
    • mysql5 -u root -p bbmri < biobank_new.mysql

Approach 2)select query retrieving records in molgenis form

We need data inside INNER/LEFT JOIN tables

select count(*) as num_rows FROM Biobank INNER JOIN

Panel ON (Biobank.id = Panel.id) INNER JOIN

ObservationTarget ON (Biobank.id = ObservationTarget.id) INNER JOIN

ObservationElement ON (Biobank.id = ObservationElement.id) LEFT JOIN

Investigation AS xref_Investigation ON xref_Investigation.id = ObservationElement .Investigation LEFT JOIN

OntologyTerm AS xref_ontologyReference ON xref_ontologyReference.id = ObservationElement.ontologyReference LEFT JOIN

OntologyTerm AS xref_Species ON xref_Species.id = Panel.Species LEFT JOIN

OntologyTerm AS xref_PanelType ON xref_PanelType.id = Panel.PanelType LEFT JOIN

OntologyTerm AS xref_Category ON xref_Category.id = Biobank.Category WHERE ObservationElement.__Type = 'Biobank'

Approach 3) Fix the xls data file according to the consistencies between these entities.

xls colums :

Name	Type	Acronym	category_term	topics_term	institutes_name	contacts_lastName	size	materials_term	GWA data n =	GWA platform	GWA comments	description

try to import as it is :

Need to removes characters like , ( ) ; + . Comma and parenthesis was replaced by or ; by some spaces and plus by plus. OntologyTerm? term was replaced by name .

After uploading : error : Column 'PanelType' cannot be null

two new columns added :

__Type PanelType
OntologyTerm PanelType?

still error with Column 'PanelType' cannot be null

`After these changes import worked :

Added new tabs: ObservationElement (column name), investigation (column name), New columns at ObservedValue : investigation_name feature_name target_name, and new columns at ObservedFeature : investigation_name feature_name target_name

name name name, Biobank also needed new columns : paneltype_name category_name investigation_name

less that 50 year less that 50 year name . 

The xls could not be imported with characters like comma (,) , parenthesis ( ()) , + , 

Last modified 13 years ago Last modified on 2010-12-09T15:49:01+01:00