rna3db

A dataset for training and benchmarking deep learning models for RNA structure prediction

https://github.com/marcellszi/rna3db

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.1%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: marcellszi
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 202 KB
Statistics
  • Stars: 47
  • Watchers: 3
  • Forks: 7
  • Open Issues: 3
  • Releases: 3
Created about 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

RNA3DB

A dataset of non-redundant RNA structures from the PDB. RNA3DB contains:

- All RNA chains in the PDB, labelled with non-coding RNA families
- Non-redundant clustering of the above chains, suitable for training and benchmarking deep learning models

Getting started

We provide periodically updated versions of RNA3DB in JSON format, along with several intermediate steps used to generate the files.
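
A quick way to get oriented in one of the released JSON files is to list its top-level keys and how many entries sit under each. The sketch below makes no assumption about RNA3DB's schema beyond the file being a JSON object; the function name `summarize_json` is hypothetical and not part of the RNA3DB package.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return a {top-level key: number of children} summary of a JSON file."""
    with Path(path).open() as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    summary = {}
    for key, value in data.items():
        # Containers report their length; scalars count as a single entry.
        summary[key] = len(value) if isinstance(value, (dict, list)) else 1
    return summary
```

Calling `summarize_json` on a file extracted from the JSON release should give a rough picture of its layout before writing any schema-specific code.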

Additionally, as part of RNA3DB, we release the results of Infernal homology search on all RNA chains found in the PDB. For a short demonstration on how RNA3DB can be used to parse these files, see tabular_demo.
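
For readers who want to look at the homology search results without RNA3DB's own parser, here is a minimal standalone sketch. It assumes the files use Infernal's default `--tblout` layout (18 whitespace-separated columns, with comment lines starting with `#`); this README does not confirm that, so check a file header first. The names `TBLOUT_COLUMNS` and `parse_tblout` are hypothetical.

```python
# Column names follow Infernal's default --tblout layout. This is an
# assumption about the released files, not something the README states.
TBLOUT_COLUMNS = [
    "target_name", "target_accession", "query_name", "query_accession",
    "mdl", "mdl_from", "mdl_to", "seq_from", "seq_to", "strand",
    "trunc", "pass", "gc", "bias", "score", "e_value", "inc", "description",
]


def parse_tblout(lines):
    """Yield one dict per hit from an iterable of tblout lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/footer comments
        # Split at most 17 times so the free-text description stays intact.
        fields = line.split(None, len(TBLOUT_COLUMNS) - 1)
        if len(fields) != len(TBLOUT_COLUMNS):
            raise ValueError(f"unexpected number of columns: {line!r}")
        hit = dict(zip(TBLOUT_COLUMNS, fields))
        # Only the bit score and E-value are converted; the rest stay strings.
        hit["score"] = float(hit["score"])
        hit["e_value"] = float(hit["e_value"])
        yield hit
```

Because the function accepts any iterable of lines, it can be fed an open file handle directly, for example `list(parse_tblout(open(path)))` where `path` points at one of the extracted result files.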

For more general help getting started, see RNA3DB's Wiki.

Download

The latest version of RNA3DB can be found under releases.

We provide the following files:

- rna3db-cmscans.tar.gz [Download]
  - Results of two-step Infernal homology search on all RNA chains in the PDB
  - See tabular_demo...
- rna3db-jsons.tar.gz [Download]
  - All JSON files generated by RNA3DB
- rna3db-mmcifs.v2.tar.xz [Download]
  - Hierarchical folders of the training/testing sets containing single-chain PDBx/mmCIF files
  - Most convenient for getting started with training and testing using RNA3DB
  - This format is currently experimental. If you find any problems, please submit an issue.
  - > Note: rna3db-mmcifs.v2.tar.xz was compressed using LZMA. Most installations of GNU tar can uncompress these files without an issue. If not, you may need to install XZ Utils.
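
If GNU tar or XZ Utils are not available, the `.tar.xz` archive can also be unpacked with Python's standard library. The following is a minimal sketch, not part of RNA3DB; the function name `extract_xz_archive` is hypothetical, and it assumes the archive comes from a source you trust, since `extractall` writes whatever paths the archive contains.

```python
import tarfile
from pathlib import Path


def extract_xz_archive(archive_path, dest_dir):
    """Extract an LZMA-compressed tar archive and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:xz" selects LZMA decompression via the standard-library lzma module.
    with tarfile.open(archive_path, mode="r:xz") as archive:
        names = archive.getnames()
        archive.extractall(path=dest)
    return names
```

For the `.tar.gz` archives, the same pattern should work with `mode="r:gz"`.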

Generating the dataset from scratch

If you wish to build your own dataset from scratch, please see Building RNA3DB from scratch.

Owner

  • Login: marcellszi
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "RNA3DB: A dataset for training and benchmarking deep learning models for RNA structure prediction"
version: 1.1
authors:
  - given-names: "Marcell"
    family-names: "Szikszai"
  - given-names: "Marcin"
    family-names: "Magnus"
  - given-names: "Siddhant"
    family-names: "Sanghi"
  - given-names: "Sachin"
    family-names: "Kadyan"
  - given-names: "Nazim"
    family-names: "Bouatta"
  - given-names: "Elena"
    family-names: "Rivas"
url: "https://github.com/marcellszi/rna3db"
doi: "10.1016/j.jmb.2024.168552"
date-released: 2024-04-26
preferred-citation:
  type: article
  authors:
  - given-names: "Marcell"
    family-names: "Szikszai"
  - given-names: "Marcin"
    family-names: "Magnus"
  - given-names: "Siddhant"
    family-names: "Sanghi"
  - given-names: "Sachin"
    family-names: "Kadyan"
  - given-names: "Nazim"
    family-names: "Bouatta"
  - given-names: "Elena"
    family-names: "Rivas"
  doi: "10.1016/j.jmb.2024.168552"
  journal: "Journal of Molecular Biology"
  title: "RNA3DB: A structurally-dissimilar dataset split for training and benchmarking deep learning models for RNA structure prediction"
  year: 2024

GitHub Events

Total
  • Create event: 1
  • Release event: 1
  • Issues event: 3
  • Watch event: 12
  • Issue comment event: 1
  • Push event: 2
  • Fork event: 3
Last Year
  • Create event: 1
  • Release event: 1
  • Issues event: 3
  • Watch event: 12
  • Issue comment event: 1
  • Push event: 2
  • Fork event: 3

Dependencies

.github/workflows/black.yml actions
  • actions/checkout v3 composite
  • psf/black stable composite
setup.py pypi
  • biopython *