16 October 2024
Louvain-La-Neuve
Europe/Brussels timezone

Data versioning

16 Oct 2024, 14:00
2h
SUD 07 (Louvain-La-Neuve)

place Croix du Sud, 1348 Louvain-la-Neuve Belgium

Speaker

Damien François (UCLouvain/CISM)

Description

Everyone is familiar with code versioning, which makes it possible to recall what modification was made to the code, by whom, when, and why. The same idea can be transposed to data, but it requires a specific set of tools: while Git is the de facto standard for code, it is not really suitable for data. Other options exist, as a Git plugin, a standalone CLI tool, or a full-featured data management website. This session presents the landscape of data versioning tools, with a focus on a CLI tool that is simple to install and simple to use: Datalad.

Contents:

  • Specific aspects of data versioning vs code versioning
  • The landscape of tools for data versioning
  • Tutorial using Datalad

Prerequisites:

  • Being able to use SSH with private keys 
  • Being familiar with a text editor 
  • Mastering the Linux command line and common utilities (mkdir, cp, scp, etc.)
  • Familiarity with code versioning

Type: Hands-on
Target audience: Everyone
Must: This session is of interest to users who must process data and keep track of what was done to which piece of data.
