https://github.com/biodataanalysisgroup/godel-numbering
Godel numbering
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
✓Academic publication links
Links to: arxiv.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (12.5%) to scientific vocabulary
Last synced: 10 months ago
·
JSON representation
Repository
Godel numbering
Basic Info
- Host: GitHub
- Owner: BiodataAnalysisGroup
- License: mit
- Language: R
- Default Branch: main
- Size: 15.7 MB
Statistics
- Stars: 0
- Watchers: 4
- Forks: 0
- Open Issues: 0
- Releases: 0
Created over 5 years ago
· Last pushed about 5 years ago
https://github.com/BiodataAnalysisGroup/godel-numbering/blob/main/
[](https://mybinder.org/v2/gh/BiodataAnalysisGroup/godel-numbering/main?filepath=Biological-Sequences-and-Godel-numbers.ipynb)
[](https://opensource.org/licenses/MIT)
# Godel numbers and DNA
This material is made available under the [MIT license](https://opensource.org/licenses/MIT). Please see [LICENSE](LICENSE.md) for more details.
## Citation
- (_arXiv preprint_) Argyris Nicolaidis and Fotis Psomopoulos, "DNA coding and Gdel numbering", arXiv:1909.13574, 2019 (_[link](https://arxiv.org/abs/1909.13574)_)
## Structure
godel-numbering repository consists of four folders:
- `data`: This is where all the input files are stored.
- `R`: This is where all the R scripts are stored.
- `plots`: This is where all the output files are stored. This folder is created automatically while executing the project. There are also some examples of outputs in zip format.
- `lit`: some literature stuff
## Getting started
### Dependencies
Execute the following line to install the required packages:
- from CRAN:
```
install.packages(c("seqinr", "MASS", "ggplot2", "ggpubr", "stringr", "combinat" , "microseq"))
```
or run `install.R` script.
### Setting up
The project can be downloaded using git:
```
git clone https://github.com/BiodataAnalysisGroup/godel-numbering.git
```
### Extra files
- Download **S1_L001** and **S10_L001** samples from [E-MTAB-6962 - RNA-seq dataset](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6962/samples/), which correspond to ERR2681749_1.fastq.gz and ERR2681763_1.fastq.gz files respectively.
- **Store both files in a single directoy.**.
### Running the project
The analysis is implemented in two scripts:
- ```Biological-Sequences-and-Godel-numbers.R```
- **Specifying input**: Line 16. Specify `dataset_type` parameter depending on which dataset to run the analysis for.
- **Outputs**: Plots and statistical stuff array inside `plots` directoy.
- ```e-mtab-data.R```
- **Specifying input**:
1. Line 17. Specify the directory in your computer, in which [E-MTAB-6962 - RNA-seq dataset](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6962/samples/) files are stored.
2. Line 26: Specify whether to run the analysis for S1_L001 or S10_L001 sample.
- **Outputs**: Plots and statistical stuff array inside `plots` directoy.
In order to run the project:
1. Set the main project folder as your working directory.
2. Execute either `Biological-Sequences-and-Godel-numbers.R`or `e-mtab-data.R`, depending on which analysis you would like to run.
## Additional scripts
- `biom_data_150bp_to_fasta.R`: Converts `biom_data_150bp.rda` into .fasta file.
## Input
- Input files are stored inside `data` directory.
## Output
- Outputs are stored inside `plots` directory.
Owner
- Name: Biodata Analysis Group
- Login: BiodataAnalysisGroup
- Kind: organization
- Email: fpsom@certh.gr
- Website: https://biodataanalysisgroup.github.io/
- Repositories: 17
- Profile: https://github.com/BiodataAnalysisGroup