TRU Borealis: Data Repository

Curation Checklist

Before your data is published, a TRU Borealis Librarian will use the following list to ensure that your Dataset and metadata are ready for publication. Note that the data itself will not be curated; however, we recommend that you make sure your data is organized and usable.

  1. All the required metadata fields are filled out correctly.
  2. File names are internally consistent.
  3. License agreements and Terms of Use for the Dataset agree with each other.
  4. A ReadMe file is included with the dataset that includes the required information.
  5. Other metadata fields that are filled out are consistent and correct.
  6. Restricted and/or embargoed files are intentional and correct.

Curation Checklist: Details

Curation Checklist for Depositors

1. All the required metadata fields are filled out correctly.

Refer to the metadata tab for instructions on metadata.

2. File names are internally consistent.

Generally, all file names should follow the same structure. This helps anyone using your data navigate between files and understand how the files relate to each other. There are many different conventions for naming files; one example is:

2022-12-23_ProjectName_DescriptiveFileName_v.1.1

This is an example of a file naming convention. Dates written in YYYY-MM-DD order, as in the example, sort chronologically. ProjectName indicates the project that the file belongs to. DescriptiveFileName tells the user what information they can find in the file. Version numbers help ensure that the latest version of the file is used.

File naming conventions guidelines: https://datamanagement.hms.harvard.edu/plan-design/file-naming-conventions  
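The example convention above can be checked automatically before deposit. The following is a minimal Python sketch, not a TRU Borealis tool: the pattern, the function names `check_name` and `inconsistent_names`, and the optional file extension are assumptions modelled only on the example name shown above, so adjust them to whatever convention your project actually uses.

```python
import re
from datetime import date

# Hypothetical pattern modelled on the example name above:
# YYYY-MM-DD_ProjectName_DescriptiveFileName_v.MAJOR.MINOR[.ext]
NAME_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"_(?P<project>[A-Za-z0-9]+)"
    r"_(?P<description>[A-Za-z0-9]+)"
    r"_v\.(?P<major>\d+)\.(?P<minor>\d+)"
    r"(?P<ext>\.[A-Za-z0-9]+)?$"
)


def check_name(filename):
    """Return the parsed parts of a file name, or None if it does not match."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    parts = match.groupdict()
    try:
        # Reject strings that look like dates but are not real calendar dates.
        date.fromisoformat(parts["date"])
    except ValueError:
        return None
    return parts


def inconsistent_names(filenames):
    """Return the file names that do not follow the assumed convention."""
    return [name for name in filenames if check_name(name) is None]
```

Calling `inconsistent_names` on the list of file names in a dataset returns the names that need attention; an empty list means every name follows the assumed pattern.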

3. License agreements and Terms of Use for the Dataset agree with each other.

The license assigned to a Dataset should match any technical Terms of Use. For example, if the license assigned to the Dataset is CC0, anyone should be able to access and use the data. The Terms of Use should therefore not require users to register with Borealis.

4. A ReadMe file is included with the dataset that includes the required information.

A “ReadMe” file helps others read and interpret research data correctly. Create a plain text (.txt) file that provides information about the dataset. Required information to include in the ReadMe file:

  • Details about dataset creation (for example, dates of data collection, methodological information)
  • Description of data files (for example, definitions of any abbreviations used in data files)
  • Information about dataset completeness (for example, whether any data has been excluded from the Borealis collection due to its sensitive nature)

Basic ReadMe file template: https://cornell.app.box.com/v/ReadmeTemplate
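As an illustration of the required information listed above, the following Python sketch generates a plain text ReadMe skeleton with one section per required item. It is only a starting point under stated assumptions: the section headings, the function names `build_readme` and `write_readme`, and the output file name `ReadMe.txt` are hypothetical and are not prescribed by TRU Borealis or by the linked template.

```python
from pathlib import Path

# Hypothetical section headings; they mirror the three required items listed above.
REQUIRED_SECTIONS = {
    "DATASET CREATION": "Dates of data collection and methodological information.",
    "DESCRIPTION OF DATA FILES": "What each file contains; definitions of abbreviations used.",
    "DATASET COMPLETENESS": "Any data excluded from the deposit, and why (e.g. sensitive data).",
}


def build_readme(title):
    """Return the text of a ReadMe skeleton with one section per required item."""
    lines = [f"ReadMe for: {title}", ""]
    for heading, hint in REQUIRED_SECTIONS.items():
        # The bracketed hint is placeholder text for the depositor to replace.
        lines += [heading, f"[{hint}]", ""]
    return "\n".join(lines)


def write_readme(title, folder="."):
    """Write the skeleton to ReadMe.txt in the given folder and return its path."""
    path = Path(folder) / "ReadMe.txt"
    path.write_text(build_readme(title), encoding="utf-8")
    return path
```

For example, `write_readme("My Dataset Title")` creates `ReadMe.txt` in the current folder; replace each bracketed placeholder with the details for your dataset before depositing.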

5. Other metadata fields that are filled out are consistent and correct.

Many other metadata fields can be filled out beyond the required ones. These provide additional context to anyone viewing your data and make your data easier to find.

Refer to the Dataverse North Metadata Best Practices Guide for guidelines and examples of all metadata fields: https://zenodo.org/record/5668945#.ZCsTWPbMK5d

6. Restricted and/or embargoed files are intentional and correct.

The librarian will confirm with the depositor that any restricted files are intentionally restricted, and that any embargoed files are intentionally embargoed with the correct embargo end date.