TreeGen: File tree generator for research data

Author: Research Data Curation Team

Affiliation: Digital Research Alliance of Canada

Published: April 1, 2025

Keywords: File tree, Research Data Management, Open Science, Data sharing

What is TreeGen?

TreeGen is a desktop application designed to help researchers and data curators visualize, describe, and export the structure of folders and files in a research dataset. It offers a simple and dynamic interface to:

  • Browse a directory and display its content as a tree.
  • Add textual descriptions to each file or folder.
  • Filter out hidden files and unwanted file types.
  • Export the documented structure as a Markdown or plain text file.
Note

We designed TreeGen as a desktop application because some datasets are too large to handle online.

Getting Started

Download executable files:

Windows users: Download and run the .exe file located in the dist folder.

Mac (M1) users: Download the .app file located in the dist folder.

Run script

Prerequisites

  • Python 3
  • PyQt5
  • humanize

Optional (for converting Markdown to HTML in the preview, if needed):

  • markdown

Install the dependencies using pip:

pip install PyQt5 humanize
# Optional:
pip install markdown
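
Because the markdown package is optional, a script can fall back to plain text when it is missing. The following is a minimal sketch of that pattern, not TreeGen's actual code; the helper name render_preview is hypothetical.

```python
import html

try:
    import markdown  # optional dependency: pip install markdown
except ImportError:
    markdown = None


def render_preview(md_text: str) -> str:
    """Return HTML for a preview pane (hypothetical helper, for illustration)."""
    if markdown is not None:
        # Convert Markdown to HTML when the optional package is installed.
        return markdown.markdown(md_text)
    # Otherwise show the raw text, escaped so it is safe to embed in HTML.
    return "<pre>" + html.escape(md_text) + "</pre>"
```

With this approach the preview still works without the optional package; it just shows unrendered Markdown.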

Clone the Repository

git clone https://github.com/Alliance-RDM-GDR/RDM_FileTree
cd RDM_FileTree

Run the application from a terminal:

python TreeGen.py

How to use the app

1. Launch TreeGen

Double-click the application executable or run the script. The main window displays the instructions and two panels: the file tree and the preview/export area.

2. Select a directory

Click Select Directory and choose the folder you wish to document. The file tree will populate with its contents.

3. Add descriptions

Double-click a cell in the Description column to annotate any file or folder. Descriptions are automatically saved in a hidden .descriptions.json file.
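
A descriptions file like this can be read and updated with the standard json module. The sketch below assumes the file sits in the selected directory and maps relative paths to description strings; the real schema and location of .descriptions.json are not documented here, and the function names are hypothetical.

```python
import json
from pathlib import Path

DESCRIPTIONS_FILE = ".descriptions.json"  # file name taken from the text above


def load_descriptions(root: Path) -> dict:
    """Load saved descriptions, or return an empty dict if none exist yet."""
    path = root / DESCRIPTIONS_FILE
    if not path.exists():
        return {}
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def save_description(root: Path, rel_path: str, text: str) -> None:
    """Store one description under its relative path (assumed schema)."""
    descriptions = load_descriptions(root)
    descriptions[rel_path] = text
    with (root / DESCRIPTIONS_FILE).open("w", encoding="utf-8") as fh:
        json.dump(descriptions, fh, indent=2, ensure_ascii=False)
```

For example, save_description(Path("MyDataset"), "data.csv", "Contains raw experiment results") would add or overwrite a single entry and leave the others untouched.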

Tip

Provide a short description that summarizes the contents of each file. In addition, use README files or codebooks to provide more specific information, including context, methods, and descriptions of variables.

4. Use filters (optional)

  • Search bar: Find specific files or folders by name.
  • Exclude extensions: Type comma-separated extensions (e.g., .tmp, .pyc) to hide matching files from view.
  • Exclude hidden: Toggle visibility of hidden files and folders.
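
The three filters above can be thought of as a single visibility check applied to each entry. The following is an illustrative sketch, not TreeGen's implementation: it assumes hidden entries are those whose names start with a dot (the Unix convention; Windows hidden attributes are not handled), and the function names are hypothetical.

```python
from pathlib import Path


def parse_extensions(raw: str) -> set:
    """Turn '.tmp, pyc' into {'.tmp', '.pyc'}; a missing leading dot is added."""
    exts = set()
    for token in raw.split(","):
        token = token.strip().lower()
        if token:
            exts.add(token if token.startswith(".") else "." + token)
    return exts


def is_visible(path: Path, excluded_exts: set,
               exclude_hidden: bool = True, search: str = "") -> bool:
    """Return True if the entry passes all three filters."""
    name = path.name
    if exclude_hidden and name.startswith("."):
        return False  # hidden file or folder (dot-prefix assumption)
    if path.suffix.lower() in excluded_exts:
        return False  # extension is on the exclusion list
    if search and search.lower() not in name.lower():
        return False  # name does not match the search text
    return True
```

Matching is case-insensitive here, so ".PYC" files are excluded by ".pyc"; whether TreeGen behaves the same way is an assumption.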

5. Export the file tree

Use the top buttons to export the tree:

  • Markdown (.md): Suitable for GitHub, README files, and research documentation.
  • Plain Text (.txt): Readable in any text editor.

Both formats include:

  • An indented folder/file tree
  • File sizes
  • Inline comments with your descriptions
  • A summary section
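
The totals in a summary section can be gathered with a single directory walk. This is a rough sketch under stated assumptions: the function name summarize is hypothetical, the root folder itself is not counted, and no filters are applied. How TreeGen counts folders and filtered files is not specified in this document.

```python
import os
from pathlib import Path


def summarize(root: Path) -> dict:
    """Count subfolders and files under root and add up file sizes in bytes."""
    folders = files = size = 0
    for dirpath, dirnames, filenames in os.walk(root):
        folders += len(dirnames)  # subfolders only; root is not counted
        for name in filenames:
            files += 1
            size += os.path.getsize(os.path.join(dirpath, name))
    return {"folders": folders, "files": files, "bytes": size}
```

The byte total could then be formatted for display, for instance with humanize.naturalsize from the humanize package listed in the prerequisites.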

Example Output

MyDataset
├── data.csv [ 12.3 MB ]
│   <!-- Contains raw experiment results -->
├── scripts
│   ├── clean.py [ 2.1 KB ]
│   │   <!-- Script for data cleaning -->
│   └── analyze.R [ 3.7 KB ]
└── docs
    └── README.md [ 1.2 KB ]

**Summary:**
- Total folders: 3
- Total files: 4
- Total size: 12.3 MB
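
A tree in this style can be produced with a short recursive walk. The sketch below is an illustration, not TreeGen's export code: render_tree is a hypothetical name, sizes are printed in raw bytes rather than human-readable units, and entries are ordered files first, then folders, each alphabetically, which may differ from the ordering TreeGen uses.

```python
from pathlib import Path


def render_tree(root: Path, descriptions=None) -> str:
    """Render root as an indented tree with sizes and description comments."""
    descriptions = descriptions or {}  # maps relative paths to text (assumed)
    lines = [root.name]

    def walk(folder: Path, prefix: str) -> None:
        entries = sorted(folder.iterdir(),
                         key=lambda p: (p.is_dir(), p.name.lower()))
        for i, entry in enumerate(entries):
            last = i == len(entries) - 1
            label = entry.name
            if entry.is_file():
                label += f" [ {entry.stat().st_size} B ]"
            lines.append(prefix + ("└── " if last else "├── ") + label)
            # Lines below this entry continue the vertical bar unless it is last.
            child_prefix = prefix + ("    " if last else "│   ")
            rel = entry.relative_to(root).as_posix()
            if rel in descriptions:
                lines.append(f"{child_prefix}<!-- {descriptions[rel]} -->")
            if entry.is_dir():
                walk(entry, child_prefix)

    walk(root, "")
    return "\n".join(lines)
```

Calling print(render_tree(Path("MyDataset"), {"data.csv": "Contains raw experiment results"})) would print the tree with the description placed on the line below data.csv.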

About / Support

TreeGen was developed and is maintained by the curation team of the Federated Research Data Repository (FRDR).

For questions or suggestions, contact us at rdm-gdr@alliancecan.ca.


Visit FRDR or Borealis