tidy_bib
Tidy_bib automates the preprocessing of .bib files exported from databases like Web of Science and Scopus, preparing the data for analysis in biblioshiny or for enrichment via APIs (Crossref, OpenAlex, ROR).
Repository
Basic Info
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Created 8 months ago
· Last pushed 8 months ago
README.Rmd
---
output: github_document
---
# 📚 Tidy\_bib
> A modular and reproducible pipeline for cleaning and organizing bibliographic data using R.
---
## 🎯 Purpose
`Tidy_bib` automates the preprocessing of `.bib` files exported from databases like **Web of Science** and **Scopus**, preparing the data for analysis in `biblioshiny` or for enrichment via APIs (Crossref, OpenAlex, ROR).
---
## 🗂️ Project Structure
```text
Tidy_bib/
├── R/ # Function scripts (e.g., safe_convert2df.R)
├── data/ # Raw files (.bib)
├── output/ # Cleaned data, spreadsheets, logs
├── tests/ # Unit tests with testthat
├── docs/ # Additional documentation or generated reports
├── run_pipeline.R # Main execution script
├── config.yaml # Configuration file
├── README.Rmd # This file
├── .gitignore # Files to ignore in version control
└── renv/ # Dependency management
```
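The layout above suggests that raw exports live under `data/`. As a minimal sketch of how such files could be discovered, the helper below lists `.bib` files in a directory using base R only. The name `find_bib_files` is hypothetical and is not taken from this repository; the actual pipeline may locate its inputs differently (for example via `config.yaml`).

```r
# Hypothetical helper (not part of Tidy_bib): list .bib files in a directory.
find_bib_files <- function(dir = "data") {
  if (!dir.exists(dir)) {
    stop("Input directory not found: ", dir, call. = FALSE)
  }
  # Match the .bib extension case-insensitively and return sorted full paths.
  sort(list.files(dir, pattern = "\\.bib$", full.names = TRUE,
                  ignore.case = TRUE))
}
```

Failing early on a missing directory keeps a misconfigured path from silently producing an empty run.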
---
## ⚙️ How to Run the Pipeline
### 1. Install required packages:
```r
# Packages listed under Imports in DESCRIPTION
install.packages(c("bibliometrix", "dplyr", "fs", "glue", "here", "stringr", "yaml"))
```
### 2. Run the pipeline:
```r
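# Run from the project root so the relative paths resolve.
# Load the helper functions stored in R/
source("R/convert_bib_files_to_df.R")
source("R/safe_convert2df.R")
# Load the main execution script
source("run_pipeline.R")
# Start the pipeline with the settings in config.yaml
initialize_pipeline("config.yaml")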
```
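The script name `safe_convert2df.R` hints at an error-tolerant wrapper around `bibliometrix::convert2df()`. The sketch below shows one way such a wrapper could work; it is an assumption, not the repository's actual implementation. The names `safe_convert` and `convert_all` and the `converter` argument are hypothetical, introduced so the logic can be tried without real export files.

```r
# Hypothetical sketch: convert one .bib file, returning NULL on failure
# instead of stopping the whole run. The real safe_convert2df.R may differ.
safe_convert <- function(path, dbsource = "wos",
                         converter = bibliometrix::convert2df) {
  tryCatch(
    converter(file = path, dbsource = dbsource, format = "bibtex"),
    error = function(e) {
      # Report the problem file and carry on with the others.
      warning(sprintf("Skipping %s: %s", basename(path), conditionMessage(e)),
              call. = FALSE)
      NULL
    }
  )
}

# Convert several files and keep only the successful results.
convert_all <- function(paths, ...) {
  results <- lapply(paths, safe_convert, ...)
  Filter(Negate(is.null), results)
}
```

For Scopus exports, `dbsource = "scopus"` would be passed instead. Returning `NULL` and filtering afterwards means a single malformed file does not discard the files that parsed correctly.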
---
## 🧪 Testing
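This project uses `testthat` (version 3.0.0 or later). Install `testthat` and `devtools` if they are not already available, then run the tests from the project root: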
```r
devtools::test()
```
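For orientation, a unit test under `tests/` might look like the sketch below. The helper `normalize_doi` is hypothetical and is defined inline only to keep the example self-contained; it is not a function from this repository.

```r
library(testthat)

# Hypothetical helper: lower-case a DOI and strip any resolver prefix.
normalize_doi <- function(x) {
  x <- tolower(trimws(x))
  sub("^https?://(dx\\.)?doi\\.org/", "", x)
}

test_that("normalize_doi strips resolver prefixes and case", {
  expect_equal(normalize_doi(" https://doi.org/10.1000/ABC "), "10.1000/abc")
  expect_equal(normalize_doi("10.1000/abc"), "10.1000/abc")
})
```

Small, pure helpers like this are the easiest parts of a cleaning pipeline to cover, because they need no input files.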
---
## 👥 Contributing
1. Fork the repository
2. Create a branch: `git checkout -b new-feature`
3. Commit your changes: `git commit -m "feat: add new feature"`
4. Push to your branch: `git push origin new-feature`
5. Open a pull request
---
## 🔒 License
This project is licensed under the MIT License. See the `LICENSE` file for details.
---
## 📌 Citation
If you use this project, please cite as:
```
Arraes, D. (2025). Tidy_bib: A modular bibliographic cleaning pipeline in R. https://github.com/danielbrazil303/Tidy_bib
```
Owner
- Login: danielbrazil303
- Kind: user
- Repositories: 1
- Profile: https://github.com/danielbrazil303
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this project, please cite it as below."
title: "Tidy_bib: A modular bibliographic cleaning pipeline in R"
version: "0.1.0"
authors:
  - family-names: Arraes
    given-names: Daniel
    email: dan.arraes@uece.br
    affiliation: Universidade Estadual do Ceará (UECE)
    orcid: "https://orcid.org/0000-0003-0697-2268"
date-released: 2025-07-30
license: MIT
repository-code: https://github.com/danielbrazil303/Tidy_bib
GitHub Events
Total
- Push event: 2
- Create event: 1
Last Year
- Push event: 2
- Create event: 1
Dependencies
DESCRIPTION
cran
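- R (>= 4.2.0): depends
- bibliometrix (any version): imports
- dplyr (any version): imports
- fs (any version): imports
- glue (any version): imports
- here (any version): imports
- stringr (any version): imports
- yaml (any version): imports
- openxlsx (any version): suggests
- renv (any version): suggests
- testthat (>= 3.0.0): suggests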