FAIRy focuses on the common reasons datasets get delayed or rejected during submission. We're modeling these checks on patterns from public repositories like GEO and Zenodo.
GEO (Gene Expression Omnibus)
Checks for missing required fields, filenames that lack the accession ID, incomplete platform or sample annotations, and other issues that commonly cause GEO submissions to bounce.
Zenodo
Flags missing descriptive metadata, unclear licensing, and file organization issues that make it hard to publish a clean record.
Validation categories
- Metadata completeness: Required fields, data types, and field value formats
- File organization: Naming conventions, directory structure, and file formats
- Repository-style expectations: Sample/platform annotations, organism/host fields, accession-aware filenames, and other elements commonly required at submission time
- Data integrity: Checksums, file sizes, and file format validation
- Reuse signals: License clarity, contact information, and basic attribution info so a curator (or future user) knows who to reach and how the data can be shared
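As a rough illustration of the first three categories, the sketch below checks a metadata record for blank required fields and flags filenames that carry no GEO-style accession ID. It is not FAIRy's implementation: the `REQUIRED_FIELDS` list and both function names are hypothetical, and a real rulepack would define its own fields. The only outside fact relied on is that GEO accessions are a prefix (GSE, GSM, GPL, or GDS) followed by digits.

```python
import re

# Hypothetical required fields; a real rulepack would supply its own list.
REQUIRED_FIELDS = ("title", "organism", "platform", "contact_email")

# GEO accessions are a prefix (GSE, GSM, GPL, GDS) followed by digits.
# Lookarounds are used instead of \b because "_" counts as a word character,
# so \b would miss names such as "GSM12345_counts.txt".
ACCESSION_RE = re.compile(r"(?<![A-Za-z0-9])(GSE|GSM|GPL|GDS)\d+(?![A-Za-z0-9])")


def missing_fields(metadata: dict) -> list[str]:
    """Return the required fields that are absent or blank."""
    return [
        name
        for name in REQUIRED_FIELDS
        if not str(metadata.get(name) or "").strip()
    ]


def lacks_accession(filenames: list[str]) -> list[str]:
    """Return the filenames that contain no GEO-style accession ID."""
    return [name for name in filenames if not ACCESSION_RE.search(name)]
```

For example, a record with an empty `organism` value and no `contact_email` key would report both fields as missing, and `counts_final.txt` would be flagged while `GSM12345_counts.txt` would not.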
Repository-style expectations
These checks are modeled on common rejection reasons from public repositories like GEO and Zenodo (missing required fields, bad filenames, nonstandard dates). Passing them is not an official submission approval.
| FAIRy check | GEO-style requirement | Zenodo-style requirement | Status |
|---|---|---|---|
| Metadata completeness | GEO submission guide | Zenodo metadata guide | ✓ Passed |
| File naming convention | GEO file naming | Zenodo file naming | ⚠ Warning |
| Date format standardization | GEO date format | Zenodo date format | ⚠ Warning |
| Required fields | GEO required fields | Zenodo required fields | ✗ Failed |
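A status column like the one in the table could be derived roughly as follows. This is a minimal sketch under stated assumptions, not FAIRy's actual logic: the function names, the severity ordering, and the choice to treat a non-ISO date as a warning but a missing or impossible date as a failure are all illustrative.

```python
import re
from datetime import datetime

# Strict YYYY-MM-DD shape; strptime alone would also accept "2024-1-5".
ISO_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")

# Assumed ordering: a single failure outranks any number of warnings.
SEVERITY = {"passed": 0, "warning": 1, "failed": 2}


def check_date(value):
    """Classify a date string as 'passed', 'warning', or 'failed'."""
    text = (value or "").strip()
    if not text:
        return "failed"  # assumed: a missing date is a hard failure
    if ISO_DATE_RE.fullmatch(text):
        try:
            datetime.strptime(text, "%Y-%m-%d")
            return "passed"
        except ValueError:
            return "failed"  # right shape, but not a real calendar date
    return "warning"  # present but not in ISO 8601 form


def overall_status(statuses):
    """Return the most severe status in the list ('passed' if empty)."""
    return max(statuses, key=SEVERITY.__getitem__, default="passed")
```

Under these assumptions, `03/15/2024` yields a warning rather than a failure, on the reasoning that the value is recoverable, while an empty value or an impossible date such as `2024-02-30` fails outright.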
What's the attestation file?
FAIRy generates an attestation file that documents your validation process. You can attach this file to the dataset bundle when you hand it to a curator, a journal, or a program officer. It's your "we actually checked this" receipt.
Why attestation matters
The attestation file provides documented proof that validation was performed, which is valuable for:
- Institutions: Demonstrate that you have records of validation performed before submission, reducing administrative back-and-forth.
- Journals: Show that data quality checks were performed using standardized validation rules and versioned rulepacks, which demonstrates due diligence.
- Grant panels: Prove that your institution has processes in place to streamline data deposition and reduce friction for data publication.
What the attestation file includes
- FAIRy version and rulepack used: Documents which validation rules were applied and in which version.
- Validation timestamp: Records when the validation was performed.
- Summary of checks performed: Lists what was validated (e.g., dates normalized to ISO 8601, IDs validated, units standardized, ORCIDs present and well-formed).
- File hashes and manifest information: Provides SHA-256 checksums for data files to verify integrity.
- Repository dry-run results: Shows whether the dataset passed preflight checks for specific repositories (e.g., GEO, Zenodo).
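To make the list above concrete, here is a minimal sketch of how such a record could be assembled with the Python standard library. The field names (`fairy_version`, `rulepack`, `validated_at`, `checks`, `files`) and both functions are assumptions chosen for illustration; they are not the schema of the real attestation file, and signing is not shown.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute a file's SHA-256 hex digest, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_attestation(files, checks, fairy_version, rulepack):
    """Assemble an attestation-style record (hypothetical field names)."""
    return {
        "fairy_version": fairy_version,
        "rulepack": rulepack,
        # UTC timestamp in ISO 8601 form
        "validated_at": datetime.now(timezone.utc).isoformat(),
        "checks": checks,
        "files": [
            {
                "name": path.name,
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            }
            for path in files
        ],
    }
```

The resulting dictionary can be written out with `json.dumps(record, indent=2)`. Anyone receiving the bundle can then recompute each file's SHA-256 digest and compare it with the recorded value to confirm the files are unchanged.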
Sample attestation file
You can download a sample attestation file to see what it looks like:
Download sample attestation file (FAIRy_attestation_example.json)
Learn more about how attestation helps with compliance and due diligence in our institutions documentation.
Note: The sample is illustrative; production attestation files for institutional deployments use a signed JSON format.