dataanalysistemplate
A template to perform data analysis in R using #Rmarkdown. It produces a template paper in .md, .tex, .pdf, .doc, .html
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- CITATION.cff file: Found CITATION.cff file
- codemeta.json file: Found codemeta.json file
- .zenodo.json file: Found .zenodo.json file
- DOI references
- Academic publication links
- Academic email domains
- Institutional organization owner
- JOSS paper metadata
- Scientific vocabulary similarity: Low similarity (12.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: ygalanak
- Language: TeX
- Default Branch: main
- Homepage: https://www.yannisgalanakis.com/other/templates/#data-analysis-template-a-reproducibility-universe
- Size: 11.1 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Data Analysis Template: A reproducibility universe
This template is based on Andrew Heiss' Global-Pandoc-Files and Portable Pandoc Magic to convert Markdown-based documents into Word (docx through odt), HTML, and PDF (through xelatex).

Project management
- :file_folder: tex_out: LaTeX support files
- :file_folder: pandoc: (1) portable pandoc filters, (2) templates, (3) necessary fonts and (4) ad hoc scripts
- :file_folder: scripts: R scripts of the analysis; they are called in the preamble of `rmd-paper.Rmd`.
- :file_folder: sections: includes all other (external) `.md` files necessary for the paper (e.g. `sections/introduction.md` or `sections/conclusion.md`).
Log of changes (relative to Andrew's portable version)
- Added `pandoc-crossref.exe` in the project folder.
- `pandoc/templates/xelatex.tex` and `pandoc/templates/xelatex-manuscript.tex` replace the font `InconsolataGo` with `Inconsolata`.
- The GitHub repo includes `tex_out/rmd-paper.pdf`, but not any other supporting output files from `.tex`.
Installation
You'll need to install these things:
- pandoc: Install either with `brew install pandoc` or by downloading it from pandoc.org.
- make: The workhorse behind all the conversion is `make`, which uses this `Makefile` to generate different pandoc incantations. On macOS, open Terminal and run `xcode-select --install` to install a handful of developer tools, including `make`. On Windows, follow these instructions.
- R: Needed to convert R Markdown files to Markdown (if you're using R Markdown). Also needed to get the word count when you run `make count`. Download and install from r-project.org. Ensure you have the following packages installed: tidyverse, knitr, rvest, and stringi (run `install.packages(c("tidyverse", "knitr", "rvest", "stringi"))`).
- Python 3: Install either with `brew install python` or by downloading it from python.org.
- TeX: If you want to do anything with PDFs, install LaTeX. It's easiest to just install the massive MacTeX distribution on macOS (or some Windows distribution if you use Windows).
- pandoc-include: Filter for inserting external Markdown files with syntax like `!include path/to/file.md`. Install with `pip install pandoc-include`.
- pandoc-citeproc: Filter for dealing with bibliographies. Install with `brew install pandoc-citeproc`. It comes with pandoc if you install it from pandoc.org.
- pandoc-crossref: Filter for creating "Figure 1" and "Table 3" cross references. Install with `brew install pandoc-crossref`. Alternatively, place the `.exe` in your project's directory.
- bibtool: Script for parsing and dealing with BibTeX files. Used for extracting cited references into a standalone `.bib` file when you run `make bib`. Install with `brew install bib-tool`.
- gawk: The version of awk that comes with macOS by default doesn't work correctly with the script that inserts git commit information in the footer of PDFs. Install a more recent one with `brew install gawk`.
- LibreOffice: Open source clone of Microsoft Office. Used for converting `.odt` files to `.docx` when you run `make docx`. Install by downloading their installer.
- Fonts: There are a bunch of fonts included in the `pandoc/fonts/` folder. Install these as needed, ideally for all users. If they are not installed for all users, you may need to repeat this step in the future.
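Before running `make`, it can help to confirm that the command-line tools listed above are on your `PATH`. The following is a minimal Python sketch and is not part of the template. The executable names (for example `Rscript`, `xelatex`, `bibtool`, and `soffice` for LibreOffice) are assumptions about how each tool is typically installed and may differ on your system.

```python
import shutil

# Assumed executable names for the tools listed above; adjust for your system.
REQUIRED_TOOLS = [
    "pandoc", "make", "Rscript", "python3", "xelatex",
    "pandoc-crossref", "bibtool", "gawk", "soffice",
]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default `shutil.which` is enough.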
⚠️ This template is not fully portable. Some changes specific to your computer's directories are required.
Usage
STEP 1: Install the items listed under Installation
If you have already completed STEP 1, there is no need to repeat it. Note that if you haven't installed the fonts in `pandoc/fonts/` for all users, you may run into problems when compiling. In that case, reinstall the fonts.
STEP 2:
- Create a Markdown file (or R Markdown file) in some directory. Place `Makefile` and the `pandoc` folder in the same directory. If you're using a bibliography, include a BibTeX file in the same directory. The directory should look like this:

  ```text
  .
  ├── Makefile
  ├── manuscript.Rmd
  ├── references.bib
  ├── sections
  │   ├── introduction.md
  │   └── conclusions.md
  ├── data
  │   ├── derived
  │   ├── manual
  │   └── raw
  ├── scripts
  └── pandoc
      ├── bin
      ├── csl
      ├── fonts
      └── templates
  ```
- Open `Makefile` and change the `SRC` and `BIB_FILE` variables to match your (R) Markdown file and BibTeX file names.
- Change any of the other modifiable variables in `Makefile`; e.g. `ENDFLOAT` or `BLINDED`.
- Use your `scripts/` to perform your empirical analysis in R (or any other software required).
- Save your data frames in `data/derived/` (as `.csv` or `.rds`) or store R objects as `.RData`. For the latter, see here.
- Write stuff in your (R) Markdown file. I recommend you make changes only to your `rmd-paper.Rmd`.
- To convert from Markdown to something else, open a terminal window to your main directory and type `make html` or `make docx`, etc. Here are all the different things you can include after `make`:
- `make md`: Convert R Markdown to regular Markdown
- `make html`: Create HTML file
- `make tex`: Create nice PDF through xelatex in `TEX_DIR` folder
- `make mstex`: Create manuscripty PDF through xelatex in `TEX_DIR` folder
- `make odt`: Create ODT file
- `make docx`: Create Word file (through LibreOffice)
- `make ms`: Create manuscripty ODT file
- `make msdocx`: Create manuscripty Word file (through LibreOffice)
- `make bib`: Extract bibliography references to a standalone `.bib` file
- `make count`: Count the words in the manuscript
- `make clean`: Remove all output files
- `make all`: Creates all the files above
Through the magic of `make`, you can combine any of these, like `make html docx tex` or `make html msdocx mstex`, etc.
- `make tex` may need to be run several times before it produces its output.
- That's it! Write more, run `make SOMETHING` again, write more, run `make SOMETHING` again, and so on until you have a beautiful final document.
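If you prefer to trigger builds from a script rather than typing the targets by hand, the combinations described above can be wrapped in a few lines of Python. This is a sketch under stated assumptions: the helper names `build_make_command` and `run_make` are hypothetical and not part of the template, and the target list is copied from the list in this section.

```python
import subprocess

# Targets listed in the usage section of this README.
KNOWN_TARGETS = {
    "md", "html", "tex", "mstex", "odt", "docx",
    "ms", "msdocx", "bib", "count", "clean", "all",
}


def build_make_command(targets):
    """Return the argv list for `make` with the given targets, in order."""
    unknown = [t for t in targets if t not in KNOWN_TARGETS]
    if unknown:
        raise ValueError(f"Unknown make target(s): {', '.join(unknown)}")
    return ["make", *targets]


def run_make(targets, cwd="."):
    """Run make in `cwd`, the directory that holds the Makefile."""
    return subprocess.run(build_make_command(targets), cwd=cwd, check=True)
```

For example, `run_make(["html", "docx", "tex"])` is equivalent to typing `make html docx tex` in the project directory. Rejecting unknown targets up front gives a clearer error than letting `make` fail on a typo.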
Example
There are two complete minimal examples included in this repository: `md-paper.md` (regular Markdown) and `rmd-paper.Rmd` (R Markdown). Change the `SRC` variable to match one of their names, run `make SOMETHING`, and see what happens.
`rmd-paper.md` is generated from `rmd-paper.Rmd`. Only edit the `.Rmd` file, not the `.md` file.
Miscellaneous
Including external files
You can include other Markdown files (like tables generated from R, for instance) using the following syntax:
```text
!include path/to/file.md
```
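To make the effect of an include line concrete, here is a small Python illustration of what such an expansion amounts to. It is not how the pandoc-include filter is implemented (that filter works on pandoc's document tree); the function name `expand_includes` is hypothetical, and the sketch handles only one level of inclusion.

```python
import re
from pathlib import Path

# A line consisting of "!include" followed by a file path.
INCLUDE_LINE = re.compile(r"^!include\s+(\S.*?)\s*$")


def expand_includes(text, base_dir="."):
    """Replace each `!include path` line with the contents of that file."""
    out = []
    for line in text.splitlines():
        match = INCLUDE_LINE.match(line)
        if match:
            # Paths are resolved relative to base_dir (not recursive).
            included = Path(base_dir, match.group(1)).read_text(encoding="utf-8")
            out.append(included.rstrip("\n"))
        else:
            out.append(line)
    return "\n".join(out)
```

Calling `expand_includes(Path("manuscript.md").read_text(), base_dir=".")` would return the manuscript text with each include line swapped for the referenced file's contents; `manuscript.md` here is only a placeholder name.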
Blinding
If you set BLINDED = TRUE in the Makefile, a Python script named accecare.py will run before compiling the document. Look at the CSV file in pandoc/bin/replacements.csv to see how to blind specific words and phrases. See the documentation for accecare.py here.
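The general idea of blinding by replacement can be sketched in Python as follows. This is not the code of `accecare.py`. The two-column `original,replacement` layout is an assumption made for illustration, so check `pandoc/bin/replacements.csv` for the actual format. The function names and the sample names below are hypothetical.

```python
import csv
import io


def load_replacements(csv_text):
    """Parse rows of `original,replacement` into a list of pairs.

    The two-column layout is assumed for this sketch only.
    """
    rows = csv.reader(io.StringIO(csv_text))
    return [(row[0], row[1]) for row in rows if len(row) >= 2]


def blind(text, replacements):
    """Apply each replacement in order using plain substring substitution."""
    for original, replacement in replacements:
        text = text.replace(original, replacement)
    return text


if __name__ == "__main__":
    pairs = load_replacements("Jane Doe,Author 1\nExample University,Institution A\n")
    print(blind("Jane Doe works at Example University.", pairs))
```

Because replacements are applied in order, longer phrases should come before shorter ones that they contain.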
Version control
If you set VC_ENABLE = TRUE in the Makefile, the current git commit will be included in the footer of your PDF only when running make tex (make mstex doesn't do this). Make sure you have a git-repo entry in your YAML front matter so that it can create a link to that commit at GitHub, GitLab, etc.
"Figure 1 here"
You can move all the figures and tables to the end of the document by setting ENDFLOAT = TRUE in the Makefile. Some journals have this horribly backwards requirement ¯\_(ツ)_/¯. This only happens when running make mstex; if you need all the figures and tables at the end of a Word file, you'll have to do it manually.
PNG conversion
Word and HTML can choke on PDF images, so those targets use a helper script (pandoc/bin/replace_pdfs.py) to replace all references to PDFs with PNGs and, if needed, convert existing PDFs to PNG using sips. However, there are times when it's better to not convert to PNG on the fly, like when using high resolution PNGs exported from R with ggsave() and Cairo. To disable on-the-fly conversion and supply your own PNGs, use PNG_CONVERT = --no-convert. The script will still replace references to PDFs with PNGs, but will not convert the PDFs.
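The reference-replacement half of that step (the part that still runs with `--no-convert`) can be pictured with a short Python sketch. This is an illustration of the idea, not the contents of `replace_pdfs.py`; the function name `pdf_refs_to_png` and the regular expression are assumptions, and no image files are converted.

```python
import re

# Matches Markdown image targets ending in .pdf, e.g. ![cap](output/fig.pdf)
PDF_IMAGE = re.compile(r"(!\[[^\]]*\]\([^)\s]+)\.pdf(\))")


def pdf_refs_to_png(markdown):
    """Point Markdown image links at .png files instead of .pdf files."""
    return PDF_IMAGE.sub(r"\1.png\2", markdown)
```

Only image links (those starting with `!`) are rewritten, so an ordinary hyperlink to a PDF is left alone. Any pandoc-crossref attribute such as `{#fig:myfig}` that follows the link is untouched.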
Cross references and knitr/R Markdown
knitr can embed plots in documents automatically, but this does not play well with pandoc-crossref.
Cross-ref for figures
For pandoc-crossref to work, you have to use this syntax:
```text
Here is some text that refers to @fig:myfig.

![Figure caption](path/to/figure.pdf){#fig:myfig}
```
There's unfortunately no way to get that `{#fig:myfig}` into the correct place in a knitted R Markdown document. A solution is to not use R/knitr to include figures. Instead, create the figures elsewhere (either in a different R script or in a chunk in the document) and save them to disk as PDF or PNG (or both). Then use standard Markdown + pandoc-crossref syntax (`![](){}`) to include them:
````text
```{r create-figure, echo=FALSE, warning=FALSE, error=FALSE}
library(ggplot2)

plot1 <- ggplot(...)

# Save both a PDF (for LaTeX) and a high-resolution PNG (for Word/HTML)
ggsave("output/myfig.pdf", plot1, width = 5, height = 3, device = cairo_pdf)
ggsave("output/myfig.png", plot1, width = 5, height = 3, type = "cairo", dpi = 300)
```

Here is some text that refers to @fig:myfig.

![Figure caption](output/myfig.pdf){#fig:myfig}
````
Cross-ref for tables
You don't need to do this with tables, though. If you use pandoc.table() from the pander library, you can include the correct pandoc-crossref syntax in the table caption:
````text
```{r example-table, echo=FALSE, warning=FALSE, message=FALSE, results="asis"}
library(tibble)
library(magrittr)
library(pander)

# Column names containing spaces must be wrapped in backticks
tribble(
  ~Heading, ~`Other heading`,
  2, 3,
  5, 7,
  9, 1
) %>%
  pandoc.table(caption = "This is a table {#tbl:mytable}")
```
````
Owner
- Name: Yannis Galanakis
- Login: ygalanak
- Kind: user
- Company: King's College London
- Website: https://www.yannisgalanakis.com/
- Twitter: YannisGalanakis
- Repositories: 5
- Profile: https://github.com/ygalanak
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Galanakis"
    given-names: "Yannis"
    orcid: "https://www.orcid.org/0000-0003-3216-7879"
title: "Data Analysis Template: A reproducibility universe"
date-released: 2022-02-15
url: "https://github.com/ygalanak/DataAnalysisTemplate"