datalad-hirni provides a command that imports the DICOM tarball of an acquisition into a study dataset. Run:

datalad hirni-import-dcm [TARBALL] [ACQUISITION ID]

from within your study dataset. This creates a subdirectory named after the acquisition ID in your study dataset, with a subdataset dicoms beneath it that contains the DICOMs and makes their metadata easily accessible via datalad. In addition, a prefilled specification for each series in the tarball is created and stored in the acquisition's specification file.

If you don't provide an acquisition identifier, a name like xx99_0123 is derived from the DICOM metadata. By default, this is the value of the PatientID field. However, special rules (which are still being developed further) can be applied depending on the scanner the data was acquired with. This mechanism allows for different rules per scanner, institution, or any other category that can be identified from the DICOM metadata.
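
The rule mechanism can be pictured roughly as follows. This is only an illustrative sketch and not hirni's actual implementation: the function names, the rule table, and the station name SCANNER_A are hypothetical. Only the fallback to PatientID is taken from the description above.

```python
# Hypothetical sketch of rule-based acquisition ID derivation.
# None of these names are part of datalad-hirni.

def default_rule(dicom_meta):
    """Fallback described above: use the PatientID field as-is."""
    return dicom_meta["PatientID"]


def scanner_a_rule(dicom_meta):
    """Made-up site rule: trim, lower-case, and replace spaces."""
    return dicom_meta["PatientID"].strip().lower().replace(" ", "_")


# Rules keyed by a value found in the DICOM metadata (here: StationName).
RULES = {"SCANNER_A": scanner_a_rule}


def derive_acquisition_id(dicom_meta):
    """Pick the rule matching the scanner, else fall back to the default."""
    rule = RULES.get(dicom_meta.get("StationName"), default_rule)
    return rule(dicom_meta)


print(derive_acquisition_id({"PatientID": "xx99_0123"}))
print(derive_acquisition_id(
    {"PatientID": " XX99 0123 ", "StationName": "SCANNER_A"}))
```

Keying the table by a metadata value is what would allow a different rule per scanner or institution without touching the default behaviour.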

Use --subject to provide a subject ID. Otherwise the import routine will try to derive the subject ID from the DICOM metadata. A typical case would be the aforementioned acquisition xx99_0123 with a corresponding subject ID xx99. Optionally, you can use --anon-subject to additionally provide an anonymized subject ID. When converting the dataset to BIDS, the switch --anonymize will then determine which subject ID to use for the converted dataset.
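
As a rough illustration of the relationship between acquisition ID, subject ID, and anonymized subject ID, consider the sketch below. It assumes that the subject ID is the part of the acquisition ID before the first underscore, as in the xx99_0123 example. The helper names and the error raised for a missing anonymized ID are hypothetical and do not describe hirni's internals.

```python
# Hypothetical helpers; not part of datalad-hirni.

def guess_subject(acquisition_id):
    """Assume the subject ID is everything before the first underscore."""
    return acquisition_id.split("_", 1)[0]


def subject_for_conversion(subject, anon_subject=None, anonymize=False):
    """Choose which subject ID ends up in the converted dataset."""
    if anonymize:
        if anon_subject is None:
            raise ValueError("no anonymized subject ID was provided")
        return anon_subject
    return subject


subject = guess_subject("xx99_0123")
print(subject)
print(subject_for_conversion(subject, anon_subject="007", anonymize=True))
print(subject_for_conversion(subject, anon_subject="007", anonymize=False))
```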

In general, the --properties option lets you add to and/or overwrite the specification created for this acquisition by passing either a JSON string or a path to a JSON file. For example, you can assign a task label this way if it cannot be derived from the DICOM metadata by the rules in place:

datalad hirni-import-dcm --subject xx99 --anon-subject 007 \
--properties '{"bids_task": "dofancystuff",
               "comment": "something unusual happened during this acquisition"}' \

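Writing JSON by hand on the command line is error-prone, so it can help to generate the value for --properties programmatically. The sketch below shows both variants mentioned above, a JSON string and a JSON file. The file name properties.json is an arbitrary choice for illustration, and the script only prints the resulting option; it does not call datalad.

```python
import json
import shlex
import tempfile
from pathlib import Path

properties = {
    "bids_task": "dofancystuff",
    "comment": "something unusual happened during this acquisition",
}

# Variant 1: a JSON string, quoted so the shell passes it as one argument
json_string = json.dumps(properties)
print("--properties", shlex.quote(json_string))

# Variant 2: a JSON file whose path is passed instead
with tempfile.TemporaryDirectory() as tmp:
    spec_path = Path(tmp) / "properties.json"  # hypothetical file name
    spec_path.write_text(json.dumps(properties, indent=2))
    # read it back to confirm the file holds valid JSON
    roundtrip = json.loads(spec_path.read_text())
    print("--properties", shlex.quote(str(spec_path)))
```
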
Query DICOM metadata

DICOM datasets that have been imported into a study raw dataset can (additionally) be collected in scanner (or institution or lab) specific superdatasets. This allows for convenient record keeping of all relevant MR data acquisitions ever made in a given context. The example script at the bottom of this page shows how to bootstrap such a database.

Such superdatasets are lightweight, as they do not contain actual imaging data, and can be queried using a flexible query language. In the DICOM context, it is often desirable to limit the amount of metadata to whole datasets and their image series. This can be achieved using the following configuration, which only needs to be put into the top-most dataset, not into every DICOM dataset:

% cat .datalad/config
[datalad "dataset"]
        id = 349bb81a-1afe-11e8-959f-a0369f7c647e
[datalad "search"]
        index-autofield-documenttype = datasets
        default-mode = autofield

With this setup the DataLad search command will automatically discover metadata for any contained image series, and build a search index that can be queried for values in one or more individual DICOM fields. This allows for a variety of useful queries.

Example queries

Report scans made on any male patients in a given time span:

% datalad search dicom.Series.AcquisitionDate:'[20130410 TO 20140101]' dicom.Series.PatientSex:'M'
search(ok): lin/7t/xx99_2022/dicoms (dataset)

Report any scans for a particular subject ID:

% datalad search 'xx99*'
[INFO   ] Query completed in 0.019682836998981657 sec. Reporting up to 20 top matches.
search(ok): lin/7t/xx99_2022/dicoms (dataset)
search(ok): lin/7t/xx99_2014/dicoms (dataset)
search(ok): lin/7t/xx99_2015/dicoms (dataset)
search(ok): lin/3t/xx99_0138/dicoms (dataset)
search(ok): lin/3t/xx99_0139/dicoms (dataset)
search(ok): lin/3t/xx99_0140/dicoms (dataset)
action summary:
  search (ok: 6)
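
If the hits need further processing, the result lines can be post-processed. The following sketch groups the hits shown above by scanner. It assumes the path layout institution/scanner/acquisition/dicoms seen in this example, and the function name is made up for illustration. Parsing human-readable output is fragile, so a machine-readable output format, where available, is usually the more robust basis.

```python
from collections import defaultdict

# Output lines copied from the example above
output = """\
[INFO   ] Query completed in 0.019682836998981657 sec. Reporting up to 20 top matches.
search(ok): lin/7t/xx99_2022/dicoms (dataset)
search(ok): lin/7t/xx99_2014/dicoms (dataset)
search(ok): lin/7t/xx99_2015/dicoms (dataset)
search(ok): lin/3t/xx99_0138/dicoms (dataset)
search(ok): lin/3t/xx99_0139/dicoms (dataset)
search(ok): lin/3t/xx99_0140/dicoms (dataset)
action summary:
  search (ok: 6)
"""

PREFIX = "search(ok): "


def group_hits_by_scanner(text):
    """Map scanner name -> list of acquisition IDs found in the hits."""
    hits = defaultdict(list)
    for line in text.splitlines():
        if not line.startswith(PREFIX):
            continue  # skip log and summary lines
        # drop the prefix and the trailing " (dataset)" type marker
        path = line[len(PREFIX):].rsplit(" (", 1)[0]
        # assumed layout: <institution>/<scanner>/<acquisition>/dicoms
        _, scanner, acquisition, _ = path.split("/")
        hits[scanner].append(acquisition)
    return dict(hits)


print(group_hits_by_scanner(output))
```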

For each search hit, all available metadata is returned. This allows for sophisticated output formatting. Here is an example that reports all studies a particular subject has participated in:

% datalad -f '{metadata[dicom][Series][0][StudyDescription]}' search -f 'xx99*' | uniq
[INFO] Query completed in 0.02244874399912078 sec. Reporting up to 20 top matches.
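
The template passed to -f uses the same field and index syntax as Python's str.format. Assuming the template is resolved that way against each result record, the sketch below shows which value it picks. The records and study names are made up for illustration.

```python
# Made-up result records mimicking the structure the template expects
records = [
    {"path": "lin/7t/xx99_2022/dicoms",
     "metadata": {"dicom": {"Series": [{"StudyDescription": "study_a"}]}}},
    {"path": "lin/3t/xx99_0138/dicoms",
     "metadata": {"dicom": {"Series": [{"StudyDescription": "study_b"}]}}},
    {"path": "lin/7t/xx99_2014/dicoms",
     "metadata": {"dicom": {"Series": [{"StudyDescription": "study_a"}]}}},
]

template = "{metadata[dicom][Series][0][StudyDescription]}"

# [dicom] and [Series] are dictionary keys, [0] is a list index
studies = [template.format(**record) for record in records]
print(studies)

# Order-preserving de-duplication of all values
unique_studies = list(dict.fromkeys(studies))
print(unique_studies)
```

Note that uniq only collapses adjacent duplicate lines, so a study that reappears after a different one is reported again unless the output is sorted first.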

Demo script to bootstrap a DICOM database from scan tarballs

The following script shows how a set of DICOM tarballs from two different scanners can be imported into one DataLad superdataset per scanner. Those two scanner datasets are then assembled into a joint superdataset for the acquisition hardware of the institution. Metadata from any acquisition session can then be aggregated into this dataset, both to track all acquisitions made on those devices and to query for individual scan sessions, DICOM series, or individual DICOM images (see above for query examples).

# create a super dataset that will have all acquisitions the 7T ever made
datalad rev-create 7t
cd 7t
datalad run-procedure setup_hirni_dataset
# import a bunch of DICOM tarballs (simulates daily routine)
datalad hirni-import-dcm \
datalad hirni-import-dcm \
datalad hirni-import-dcm \

# done for now
cd ..
# now the same for 3t
datalad rev-create 3t
cd 3t
datalad run-procedure setup_hirni_dataset
# import a bunch of DICOM tarballs
datalad hirni-import-dcm \
datalad hirni-import-dcm \
datalad hirni-import-dcm \

# done
cd ..

# one dataset for all of the institute's scans (could in turn be part of one that also
# includes other modalities/machines)
# this first part only needs to be done once
datalad rev-create lin
cd lin
datalad install -d . -s ../7t
datalad install -d . -s ../3t

# this second part needs to be done every time the metadata DB is to be updated
# get the latest state of the scanner datasets (no heavy stuff is moved around)
datalad update --merge -r
# aggregate from the aggregated metadata
datalad aggregate-metadata -r
# ready to search
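
The repeated import calls in the script could also be generated in a loop. The sketch below only builds the command lines and does not run them. The directory of tarballs and the .tar.gz suffix are assumptions, and the acquisition ID is omitted so that it would be derived from the DICOM metadata as described at the top of this page.

```python
import tempfile
from pathlib import Path


def import_commands(tarball_dir):
    """Return one argument list per tarball; nothing is executed here."""
    return [
        ["datalad", "hirni-import-dcm", str(tarball)]
        for tarball in sorted(Path(tarball_dir).glob("*.tar.gz"))
    ]


# Demonstration with two empty placeholder files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for name in ("scan_b.tar.gz", "scan_a.tar.gz", "notes.txt"):
        (Path(tmp) / name).touch()
    commands = import_commands(tmp)
    for cmd in commands:
        print(" ".join(cmd))
```

Each argument list could then be handed to subprocess.run with the scanner dataset as working directory.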