Version 6 (modified by Pieter Neerincx, 10 years ago)

Data Management

Access

  • If your group already collaborates in Droparcon: ask your group leader / PI to request an account by e-mailing Pieter Neerincx.
  • If your group is not yet part of Droparcon: contact the steering group.

Data uploaded to the sftp server can be shared freely within Droparcon. If you want to share data with external partners, you must first obtain permission from all members of the steering group, so that all groups are informed of the plans.

SFTP Server

The block size on the storage is relatively large (~5 MB). Each file, regardless of its real size, occupies at least one block on the file system. The system is therefore optimised for large files: big genomic data sets are best stored in a few large files rather than in many small ones. Collections of small files, such as logs and scripts, should be compressed into one larger file per project for archiving.
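To illustrate why bundling matters, here is a minimal Python sketch. It assumes the "~5 MB" block size is exactly 5 * 1024**2 bytes, which the text does not confirm, and the function names are hypothetical helpers, not part of any Droparcon tooling. It estimates the space allocated for a set of files and packs a directory into a single compressed archive using the standard tarfile module.

```python
import math
import tarfile
from pathlib import Path

# Assumption: the "~5 MB" block size is treated as exactly 5 * 1024**2 bytes.
BLOCK_SIZE = 5 * 1024**2


def blocks_used(size_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks a file occupies; every file takes at least one block."""
    return max(1, math.ceil(size_bytes / block_size))


def allocated_bytes(sizes, block_size: int = BLOCK_SIZE) -> int:
    """Total space allocated for files with the given sizes (in bytes)."""
    return sum(blocks_used(s, block_size) for s in sizes) * block_size


def bundle_directory(src_dir: str, archive_path: str) -> str:
    """Pack src_dir (logs, scripts, ...) into one gzip-compressed tar file."""
    src = Path(src_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the directory name, not the full source path.
        tar.add(src, arcname=src.name)
    return archive_path


if __name__ == "__main__":
    # 1000 log files of 10 kB each versus one file of the same total size.
    small = [10_000] * 1000
    print(allocated_bytes(small) / 1024**2, "MiB for 1000 separate files")
    print(allocated_bytes([sum(small)]) / 1024**2, "MiB for one bundled file")
```

Under the stated assumption, the 1000 separate files are allocated 5000 MiB, while a single file of the same total size is allocated 10 MiB, before any gain from compression.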

General instructions on how to use the sftp.gcc.rug.nl server

Directory structure a.k.a. what is where

Droparcon data located on the Groningen cluster is stored in

/gcc/groups/droparcon/

We'll refer to /gcc/groups/droparcon/ as

${DROPARCON_HOME}

NOTE: When you access the site via sftp, you are chrooted to /gcc/groups/droparcon/. This directory is therefore your root folder and appears as /.

  • ${DROPARCON_HOME}/home/(your_account)/
    • Home dirs of individual users.
  • ${DROPARCON_HOME}/prm02/rawdata/
    • Raw data, for example FastQ files.
  • ${DROPARCON_HOME}/prm02/projects/
    • Processed data, such as ongoing and finished assemblies.
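
Because of the chroot, the same directory has two names: the absolute path on the cluster and the shorter path seen over sftp. The following Python sketch shows one way to translate between the two. The function names are hypothetical and only the /gcc/groups/droparcon prefix comes from the text above. It works purely on path strings and does not resolve ".." components or symlinks.

```python
from pathlib import PurePosixPath

# From the text above: the directory that sftp users see as "/".
DROPARCON_HOME = PurePosixPath("/gcc/groups/droparcon")


def cluster_to_sftp(cluster_path: str) -> str:
    """Translate an absolute cluster path to the path seen inside the sftp chroot."""
    # relative_to raises ValueError if the path lies outside DROPARCON_HOME.
    rel = PurePosixPath(cluster_path).relative_to(DROPARCON_HOME)
    return str(PurePosixPath("/") / rel)


def sftp_to_cluster(sftp_path: str) -> str:
    """Translate an absolute path inside the sftp chroot to the cluster path."""
    # relative_to raises ValueError if sftp_path is not absolute.
    rel = PurePosixPath(sftp_path).relative_to("/")
    return str(DROPARCON_HOME / rel)


if __name__ == "__main__":
    print(cluster_to_sftp("/gcc/groups/droparcon/prm02/rawdata/"))
    print(sftp_to_cluster("/prm02/projects"))
```

For example, the raw data directory ${DROPARCON_HOME}/prm02/rawdata/ on the cluster corresponds to /prm02/rawdata in an sftp session.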