dataloading-curation
Data curation files for IPA (iReceptor Public Archive) - includes sample metadata template file
Statistics
- Stars: 2
- Watchers: 5
- Forks: 1
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
iReceptor Data Curation
This Git repository contains example files and documentation for loading data into iReceptor repositories. It provides examples of metadata files as well as rearrangement files for a number of widely used annotation tools. The README file in each subfolder contains further documentation. The Zenodo link for this release is here:
For more information on Repertoire metadata curation, please refer to:
* The iReceptor Metadata documentation
* The iReceptor metadata example
* The AIRR repertoire metadata example
For more details on Rearrangement data curation, please refer to:
* The test data set documentation
* The AIRR Rearrangement format (including igblast) example
* The MiXCR Rearrangement format example
* The IMGT V-QUEST Rearrangement example
For more details on Clone data curation, please refer to:
* The AIRR Clone format example
* The MiXCR Clone format example
For more details on Cell and Expression (GEX) data curation, please refer to:
* The AIRR Cell format example
The iReceptor Data Curation process
The iReceptor team follows a relatively strict data curation process. This process is documented on the iReceptor Curation page. We do not discuss this process in detail here, but instead suggest simple processes that can make data curation easier to manage.
The iReceptor curation process centers on curating the data for a single study. We therefore recommend storing all data being curated for a given study in a single directory. As an example, we will use one of the IMGT example data sets.
As mentioned, we recommend that all files relevant to the curation of data from a single study be located in a single directory. This includes the Repertoire Metadata file for the study as well as all of the Rearrangement, Clone, Cell, and Expression files for each Repertoire. In the case of the IMGT example, there is a single metadata file (PRJNA248411Palanichamy2018-12-18.csv). We tend to build the metadata file name from the study's Study ID from NCBI, the principal (or contact) author, and the date the file was last modified. In addition, each of the 8 sample repertoires in this study has a single IMGT annotation file. Again, we include the NCBI accession number in the file name to help manage the data. Note that a single repertoire can have more than one file: both the Repertoire Metadata file and the iReceptor Data Loader support multiple files per repertoire sample.
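The naming convention and single-directory layout described above can be sketched in a few lines of Python. This is only an illustration, not part of the iReceptor tooling: the function names `metadata_filename` and `study_inventory` are hypothetical, the empty default separator simply mirrors the example file name as it appears in this document, and treating every `.csv` file in the directory as a metadata file is an assumption that may not hold for your study.

```python
from datetime import date
from pathlib import Path


def metadata_filename(study_id: str, author: str, modified: date, sep: str = "") -> str:
    """Build a metadata file name from study ID, author, and last-modified date.

    Hypothetical helper. The separator is an assumption; the default (none)
    matches the example name shown in this document.
    """
    return sep.join([study_id, author, modified.isoformat()]) + ".csv"


def study_inventory(study_dir: Path) -> dict:
    """Split the files in one study directory into metadata and data files.

    Assumption: the Repertoire Metadata file is the only kind of .csv file in
    the directory; everything else is treated as annotation data.
    """
    files = sorted(p for p in study_dir.iterdir() if p.is_file())
    return {
        "metadata": [p.name for p in files if p.suffix == ".csv"],
        "data": [p.name for p in files if p.suffix != ".csv"],
    }
```

For example, `metadata_filename("PRJNA248411", "Palanichamy", date(2018, 12, 18))` returns `"PRJNA248411Palanichamy2018-12-18.csv"`, and calling `study_inventory` on the study directory gives a quick check that exactly one metadata file sits alongside the per-repertoire annotation files.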
Given the above structure, it is quite simple to use the iReceptor Data Loading code to load AIRR-seq data in such a form. Please refer to the iReceptor Turnkey Documentation for examples on how to load these data sets.
Owner
- Name: iReceptor
- Login: sfu-ireceptor
- Kind: organization
- Location: Vancouver, Canada
- Website: http://www.ireceptor.org
- Repositories: 12
- Profile: https://github.com/sfu-ireceptor