sjmisc
sjmisc: Data and Variable Transformation Functions - Published in JOSS (2018)
pydata-wrangler
Wrangle messy numerical, image, and text data into consistent, well-organized formats
r-raster-vector-geospatial
Introduction to Geospatial Raster and Vector Data with R
hypertools
A Python toolbox for gaining geometric insights into high-dimensional data
covid19-italy-integrated-surveillance-data
COVID-19 integrated surveillance data provided by the Italian Institute of Health and processed via UnrollingAverages.jl to deconvolve the weekly moving averages.
https://github.com/avallecam/cdcper
Miscellany of functions customized for analysis tasks at CDC Perú
https://github.com/fgazzelloni/20240930-dwpwr
Data Wrangling Practice with R - 30 September Tutorial for R-Ladies Rome
https://github.com/desbordante/desbordante-core
Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It can also run data-cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.
eno-flex
Repository of custom R code to wrangle the .lift and .flextext output of FLEx into a new SFM database to be re-imported into a new FLEx dictionary/lexicon project. The spinoff of this project is available at https://github.com/engganolang/eno-learner-lift
preparing-your-mainframe-data-for-machine-learning
Mainframe Data Wrangling: Preparing Your Mainframe Data for Machine Learning
2023-10-22-carpentry-social-science
Go to https://dcs-training.github.io/2023-10-22-Carpentry-Social-Science/ to follow along with the material
2024-11-18-cdcs-carpentry-social-sciences
This repo contains the material produced for a course run by the Centre in November 2024
https://github.com/cloud-span/genomics05-data-processing-analysis
Data Processing & Analysis
python-socialsci
Data Analysis and Visualization with Python for Social Scientists
xmap
R package for verifying and transforming data between nomenclature, categories or standards
https://github.com/aariq/li6400-data-wrangling
The LI6400 portable photosynthesis meter exports data as a series of poorly formatted .xls files. If you've used "add remark" to indicate sample ID, the remarks are shown as rows, not as a separate column. This script extracts the relevant data and moves the sample ID information from the remarks into a column.
https://github.com/aariq/cupcakes-vs-muffins
Are cupcakes empirically different than muffins? Let's find out!