24 January 2023
Louvain-La-Neuve
Europe/Brussels timezone

Data versioning

24 Jan 2023, 16:00
1h
BST: Pasteur (Louvain-La-Neuve)

Place Louis-Pasteur, 1346 Louvain-La-Neuve

Speaker

Damien François (UCLouvain/CISM)

Description

Everyone is familiar with code versioning, which makes it possible to recall what modification was implemented in the code, by whom, when, and why. The same idea can be applied to data, but it requires a specific set of tools: while Git is the de facto standard for code, it is not really suitable for data. Other options exist, in the form of a Git plugin, a standalone CLI tool, or a full-featured data management website. This session presents the landscape of data versioning tools, with a focus on a CLI tool that is simple to use and simple to install: DataLad.

Contents:

  • Specific aspects of data versioning vs code versioning
  • The landscape of tools for data versioning
  • Tutorial using DataLad

Prerequisites:

  • Being able to use SSH with private keys 
  • Being familiar with a text editor 
  • Mastering the Linux command line and the GNU utilities (mkdir, cp, scp, etc.)
  • Familiarity with code versioning

Type: Hands-on
Target audience: Everyone
Must: This session is of interest to users who must process data and keep track of what was done to which piece of data.
