Research Data Curator’s Toolbox
A Practical Guide for Data Management
This resource is currently under development. Please note that the French version is pending. For feedback, questions, or suggestions, please contact us at curators@frdr-dfdr.ca.
0.1 Welcome
This guide provides a collection of documented R workflows that support Research Data Management (RDM) from data ingest through final archival reporting. Our goal is to help curators transform raw submissions into FAIR (Findable, Accessible, Interoperable, Reusable) research objects.
🛠️ Triage & Inspect
Automated analysis of file extensions, formats, and basic integrity to identify immediate risks.
🔍 Deep Validation
Specialized inspection for Tabular, Spatial, and Scientific data to ensure internal consistency.
📊 Archival Reporting
Generation of standardized curation reports, metadata summaries, and risk logs for depositors.
0.1.1 How to use this Guide
Each chapter is a notebook that introduces a specific curation task or file format. Within each notebook, you will find:
- Curation Objectives: What we are looking for and why it matters for preservation.
- Automated Checks: R code snippets that perform technical inspection.
- Curation Insights: Guidance on how to interpret flags and risks.
- Non-Interactive Scripts: Standalone R scripts for batch processing on HPC or server environments.
This toolbox prioritizes long-term accessibility. We emphasize detecting proprietary format lock-in, zero-byte files, and privacy risks such as personally identifiable information (PII) before data is published.
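As a rough illustration of what a triage pass might look like, the sketch below lists the files in a submission folder and flags zero-byte files and extensions associated with proprietary formats. It uses base R only. The function name `triage_files()` and the short `proprietary_ext` list are hypothetical, chosen for this example and not taken from the toolbox itself; PII detection is out of scope here.

```r
# Hypothetical helper, not part of the toolbox: a minimal triage sketch.
# `proprietary_ext` is an illustrative list only; a real curation policy
# would maintain its own, more complete list.
triage_files <- function(dir,
                         proprietary_ext = c("sav", "sas7bdat", "mat", "xls")) {
  # Recursively list files (directories are not included by default)
  paths <- list.files(dir, recursive = TRUE, full.names = TRUE)
  info  <- file.info(paths)
  ext   <- tolower(tools::file_ext(paths))

  data.frame(
    path        = paths,
    extension   = ext,
    size_bytes  = info$size,
    zero_byte   = info$size == 0,            # empty files are an integrity risk
    proprietary = ext %in% proprietary_ext,  # possible format lock-in
    stringsAsFactors = FALSE
  )
}

# Build a small throwaway submission folder to try the function on
demo_dir <- tempfile("triage_demo")
dir.create(demo_dir)
writeLines(c("id,value", "1,10"), file.path(demo_dir, "data.csv"))
file.create(file.path(demo_dir, "empty.txt"))                  # zero-byte file
writeLines("placeholder", file.path(demo_dir, "survey.sav"))   # flagged by extension

report <- triage_files(demo_dir)

# Show only the files that raise a flag
print(report[report$zero_byte | report$proprietary, ])
```

The flags here rely on file extensions alone, so they are a first filter rather than a format identification; a renamed or extensionless file would need deeper inspection.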