iRODS

Data and metadata are stored in iRODS in a predefined structure that follows the UNLOCK ontology. iRODS is an open-source data management system in which all project-related experimental data is captured, backed up and preserved.

Data management

What kind of data is stored in iRODS

The aim is to capture all data generated during an experiment, for example bioreactor measurements, proteomics data or the output of standard sequencing procedures. Standardisation should leave the data better organised and accompanied by the essential metadata elements, and should prevent loss or corruption of data. Standards are currently in place for genomics, transcriptomics and amplicon sequencing. Other formats will become available once their standardisation has been agreed on.

Accessing the iRODS environment

WebDAV

To access the iRODS environment directly, you can use the WebDAV protocol. Open https://data.m-unlock.nl in any browser and log in with your personal credentials. It is also possible to mount the iRODS instance as a network drive over WebDAV.

On a Mac you can mount a network drive through Finder. Open Finder and press CMD + K; this opens a dialog in which you can paste the iRODS URL. Click Connect and fill in your credentials, and the iRODS instance should be mounted as a network drive in Finder.
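
On the command line, the same WebDAV endpoint could be queried with curl. The following is a minimal sketch, assuming the server answers standard WebDAV PROPFIND requests at its root; the function name webdav_list_command and the username are placeholders, not part of the Unlock setup. The function only prints the command, so you can inspect it before running it yourself.

```shell
# URL taken from the text above.
WEBDAV_URL="https://data.m-unlock.nl"

# Print (do not run) a curl command that lists the top-level WebDAV collection.
# Assumption: the endpoint supports PROPFIND with "Depth: 1" at its root.
webdav_list_command() {
    user="$1"   # placeholder: your personal username
    printf 'curl --user %s --request PROPFIND --header "Depth: 1" %s/\n' "$user" "$WEBDAV_URL"
}

webdav_list_command "your_username"
```

When only a username is given to --user, curl prompts for the password, so it does not end up in your shell history.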

iCommands

To access the iRODS environment using iCommands, you need an irods_environment.json file in your ~/.irods folder. Besides the host, zone and user name, this file holds the settings for the encrypted connection, as can be seen below.

cat ~/.irods/irods_environment.json
{
    "irods_authentication_scheme": "pam_password",
    "irods_client_server_negotiation": "request_server_negotiation",
    "irods_client_server_policy": "CS_NEG_REQUIRE",
    "irods_encryption_algorithm": "AES-256-CBC",
    "irods_encryption_key_size": 32,
    "irods_encryption_num_hash_rounds": 16,
    "irods_encryption_salt_size": 8,
    "irods_host": "data.m-unlock.nl",
    "irods_port": 1247,
    "irods_user_name": "<SRAM username>",
    "irods_zone_name": "unlock"
}


This file needs to be stored in the .irods folder in your home directory (~/.irods).
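
As a sketch, the file shown above could be created from the shell as follows. The variable SRAM_USER and the fallback value your_sram_username are placeholders introduced for this example; replace them with your own SRAM username. The script leaves an existing file untouched, and the JSON check only runs if python3 happens to be installed.

```shell
# Location of the environment file; iCommands also honour IRODS_ENVIRONMENT_FILE.
env_file="${IRODS_ENVIRONMENT_FILE:-$HOME/.irods/irods_environment.json}"
# Placeholder (hypothetical variable): replace with your own SRAM username.
irods_user="${SRAM_USER:-your_sram_username}"

mkdir -p "$(dirname "$env_file")"

if [ -e "$env_file" ]; then
    echo "Keeping existing $env_file"
else
    # Same settings as in the example file above.
    cat > "$env_file" <<EOF
{
    "irods_authentication_scheme": "pam_password",
    "irods_client_server_negotiation": "request_server_negotiation",
    "irods_client_server_policy": "CS_NEG_REQUIRE",
    "irods_encryption_algorithm": "AES-256-CBC",
    "irods_encryption_key_size": 32,
    "irods_encryption_num_hash_rounds": 16,
    "irods_encryption_salt_size": 8,
    "irods_host": "data.m-unlock.nl",
    "irods_port": 1247,
    "irods_user_name": "$irods_user",
    "irods_zone_name": "unlock"
}
EOF
    chmod 600 "$env_file"   # readable by the owner only
fi

# Optional sanity check: confirm the file parses as JSON when python3 is available.
if command -v python3 >/dev/null 2>&1; then
    python3 -m json.tool "$env_file" >/dev/null && echo "Valid JSON: $env_file"
fi
```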

When iCommands are available on your system, you can authenticate by typing iinit and providing your password when prompted. If you are using a system without iCommands, you can use the Docker image that we have created and mount the .irods folder into the container with the following command:

docker run -it --entrypoint /bin/bash -v ~/.irods:/root/.irods docker-registry.wur.nl/m-unlock/docker/irods:latest

This starts an interactive Docker session using the Unlock Docker image and mounts the folder located at ~/.irods to /root/.irods in the container. Once the container has started, you can authenticate by running iinit inside it. If the container has write access to the mounted folder, the authentication is preserved when you log out of the container.
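
The choice between local iCommands and the Docker image could be scripted roughly as below. This is only a sketch: the function names detect_irods_client and start_irods_session are made up for this example, while the image name and mount point are copied from the docker command above.

```shell
# Image name taken from the docker command above.
IRODS_IMAGE="docker-registry.wur.nl/m-unlock/docker/irods:latest"

# Report which client is available: "icommands", "docker" or "none".
detect_irods_client() {
    if command -v iinit >/dev/null 2>&1; then
        echo "icommands"
    elif command -v docker >/dev/null 2>&1; then
        echo "docker"
    else
        echo "none"
    fi
}

# Start a session with whichever client was found (interactive, so not called here).
start_irods_session() {
    case "$(detect_irods_client)" in
        icommands)
            # Authenticate, then list the home collection as a quick check.
            iinit && ils
            ;;
        docker)
            # Run iinit by hand once the container shell has started.
            docker run -it --entrypoint /bin/bash \
                -v "$HOME/.irods:/root/.irods" "$IRODS_IMAGE"
            ;;
        *)
            echo "Neither iCommands nor docker was found" >&2
            return 1
            ;;
    esac
}

echo "Available client: $(detect_irods_client)"
```

Call start_irods_session from an interactive terminal, since both iinit and the container shell expect keyboard input.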

How is the data structured within iRODS

Within iRODS, the data structure closely mirrors the metadata registration structure. Read access is granted to people who are associated with the project and/or the investigation.

The landing directory of iRODS is a zone. In this zone there is a Projects folder in which all your projects are available.

For example:

ils /unlock/Projects/

/unlock/Projects:
  C- /unlock/Projects/PRJ_EXPLODIV
  C- /unlock/Projects/PRJ_FIRM-Broilers
  C- /unlock/Projects/PRJ_MDB-MM
  C- /unlock/Projects/PRJ_TIM2_reproducibility
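
If you only need the project names, the listing can be reduced to the last path component of each sub-collection. A small sketch, assuming ils marks sub-collections with "C-" as in the output above; collection_names is a helper name invented for this example.

```shell
# Read `ils` output on stdin and print the last path component of every
# sub-collection line (lines marked with "C-").
collection_names() {
    sed -n 's|^ *C- .*/||p'
}

# Sample input copied from the listing above.
# With a live connection: ils /unlock/Projects/ | collection_names
collection_names <<'EOF'
/unlock/Projects:
  C- /unlock/Projects/PRJ_EXPLODIV
  C- /unlock/Projects/PRJ_FIRM-Broilers
  C- /unlock/Projects/PRJ_MDB-MM
  C- /unlock/Projects/PRJ_TIM2_reproducibility
EOF
# prints:
# PRJ_EXPLODIV
# PRJ_FIRM-Broilers
# PRJ_MDB-MM
# PRJ_TIM2_reproducibility
```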

Within a project folder you will find the original Excel sheet that was used for the data registration, as well as a database file (.ttl) that you can query for your project information.

ils /unlock/Projects/PRJ_NWO_unlock_test

/unlock/Projects/PRJ_NWO_unlock_test:
  NWO_unlock_test.ttl
  NWO_unlock_test.xlsx
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier
  C- /unlock/Projects/PRJ_NWO_unlock_test/PROVENANCE
  C- /unlock/Projects/PRJ_NWO_unlock_test/References

To download a file you can use the iget command, which places the file in the current directory.

iget /unlock/Projects/PRJ_NWO_unlock_test/NWO_unlock_test.xlsx
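
The download path can also be built from the project name. The sketch below assumes that a folder PRJ_<name> always holds <name>.xlsx and <name>.ttl, which is inferred from the single example above and may not hold for every project. Both function names are hypothetical.

```shell
# Build the iRODS path of a project-level file.
# Assumption: PRJ_<name> contains <name>.xlsx and <name>.ttl, as in the example above.
project_file_path() {
    name="$1"   # project name without the PRJ_ prefix
    ext="$2"    # "xlsx" or "ttl"
    printf '/unlock/Projects/PRJ_%s/%s.%s\n' "$name" "$name" "$ext"
}

# Download a project file into the current directory.
# -K asks iget to verify the checksum after the transfer.
fetch_project_file() {
    iget -K "$(project_file_path "$1" "$2")"
}

project_file_path NWO_unlock_test ttl
# prints /unlock/Projects/PRJ_NWO_unlock_test/NWO_unlock_test.ttl
```

With a live connection, fetch_project_file NWO_unlock_test xlsx would download the registration sheet.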

To list all files and folders recursively, use the ils -r command on a given path.

ils -r /unlock/Projects/PRJ_NWO_unlock_test

/unlock/Projects/PRJ_NWO_unlock_test:
  NWO_unlock_test.ttl
  NWO_unlock_test.xlsx
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier
/unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier:
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier
/unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier:
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1
/unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1:
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon
/unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon:
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon/A_amp1bx
/unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon/A_amp1bx:
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon/A_amp1bx/Unprocessed
/unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon/A_amp1bx/Unprocessed:
  amp1bx.ttl
  G76494_R1_001.fastq.gz
  G76494_R2_001.fastq.gz
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon/A_amp2bx
/unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon/A_amp2bx:
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon/A_amp2bx/Unprocessed
/unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon/A_amp2bx/Unprocessed:
  amp2bx.ttl
  G76494_R0_001.fastq.gz
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier/STU_Study_Identifier/OBS_ObservationUnit_1/Amplicon/A_Mc.1.1.l01
... (and the list continues)
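
Because the recursive listing is long, it can help to turn it into one full path per file, which can then be filtered or passed to iget. A minimal sketch, assuming the output format shown above: collection headers end with ":", sub-collections start with "C-", and all other indented lines are files. The helper name ils_data_objects is invented for this example.

```shell
# Read `ils -r` output on stdin and print the full path of every file.
ils_data_objects() {
    awk '
        /^\/.*:$/  { sub(/:$/, ""); coll = $0; next }      # collection header
        /^ *C- /   { next }                                  # sub-collection entry
        /^ +[^ ]/  { sub(/^ +/, ""); print coll "/" $0 }    # file inside coll
    '
}

# Sample input copied from the start of the listing above.
ils_data_objects <<'EOF'
/unlock/Projects/PRJ_NWO_unlock_test:
  NWO_unlock_test.ttl
  NWO_unlock_test.xlsx
  C- /unlock/Projects/PRJ_NWO_unlock_test/INV_Investigation_Identifier
EOF
# prints:
# /unlock/Projects/PRJ_NWO_unlock_test/NWO_unlock_test.ttl
# /unlock/Projects/PRJ_NWO_unlock_test/NWO_unlock_test.xlsx
```

With a live connection, ils -r /unlock/Projects/PRJ_NWO_unlock_test | ils_data_objects | grep '\.fastq\.gz$' would list only the sequencing files.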