GitHub
zntrack
Create, visualize, run & benchmark DVC pipelines in Python & Jupyter notebooks.
https://github.com/biocore/q2-greengenes2
A QIIME 2 plugin for interaction with the Greengenes2 database
pybop
A parameterisation and optimisation package for battery models.
https://github.com/ydataai/ydata-synthetic
Synthetic data generators for tabular and time-series data
gstat
Spatial and spatio-temporal geostatistical modelling, prediction and simulation
https://github.com/alan-turing-institute/pymc3
Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano
dynamicsdm
An R package for species geographical distribution and abundance modelling at high spatiotemporal resolution
osmogrid
OSMoGrid is a tool to generate lifelike electrical grid models based on publicly available data, mainly OpenStreetMap. The tool puts a special focus on low-voltage grids.
lightning-uq-box
Lightning-UQ-Box: Uncertainty Quantification for Neural Networks with PyTorch and Lightning
https://github.com/broadinstitute/str-analysis
Scripts and utilities related to analyzing short tandem repeats (STRs).
@stdlib/random-base-improved-ziggurat
Normally distributed pseudorandom numbers using the improved Ziggurat method.
constrain
Control Strainer (ConStrain) is a data-driven knowledge-integrated framework that automatically verifies that building system controls function as intended.
Potnia
Potnia: A Python library for the conversion of transliterated ancient texts to Unicode - Published in JOSS (2025)
pbxplore
A suite of tools to explore protein structures with Protein Blocks :snake:
grounded-segment-anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
segregation
R package to calculate entropy-based segregation indices, with a focus on the Mutual Information Index (M) and Theil’s Information Index (H)
sumo
sumo: Command-line tools for plotting and analysis of periodic *ab initio* calculations - Published in JOSS (2018)
delly2
DELLY2: Structural variant discovery by integrated paired-end and split-read analysis
fair-test
☑️ A library to build and deploy FAIR metrics tests APIs that can be used by FAIR evaluation services supporting the FAIRMetrics specifications, such as FAIR enough and the FAIR evaluator.
fitdistrplus
Help to Fit of a Parametric Distribution to Non-Censored or Censored Data
mulimgviewer
MulimgViewer is a multi-image viewer that can open multiple images in one interface, which is convenient for image comparison and image stitching.
Rfast
A collection of Rfast functions for data analysis. Note 1: The vast majority of the functions accept matrices only, not data.frames. Note 2: Do not pass matrices or vectors that have missing data (i.e., NAs). We do not check for them, and C++ internally transforms them into zeros (0), so you may get wrong results. Note 3: In general, make sure you give the correct input in order to get the correct output. We do no checks, and this is one of the many reasons we are fast.
stream-learn
stream-learn is an open-source Python library for difficult data stream analysis.
https://github.com/cans-world/cans
A code for fast, massively-parallel direct numerical simulations (DNS) of canonical flows
wiki-entity-summarization-preprocessor
Convert Wikidata and Wikipedia raw files to filterable formats, with a focus on marking Wikidata as summaries based on their Wikipedia abstracts.
comparative_ml_analysis_bioinformatics
A comprehensive analysis of gene expression data using machine learning techniques in Python and R, focusing on predictive modeling and data visualization
timingafs
Framework for projecting the timing of frequency amplifications of extreme sea levels
radsel
Radius selection using kernel density estimation for the computation of nonlinear measures
noisy-sentences-dataset
550K sentences in 5 European languages augmented with noise for training and evaluating spell correction tools or machine learning models.
ccbs-study-3
Does Adolescent and Adult Training Impact Canine Behavior Outcomes?
shellbedcondensator
Shiny app visualizing the effects of sedimentary condensation
template-repo-data-analysis
Reusable, project-oriented data analysis template designed to align with the FAIR principles. The template offers a structured and scalable approach for managing scientific data and code, particularly suited for collaborative and open science environments.
random-streams-mt19937
Create a readable stream for a 32-bit Mersenne Twister pseudorandom number generator.
savannacorridors
Analysis of palaeoecological records across South-East Asia to determine the evidence for regime shifts between open savannas and dense tropical forests that have occurred since the Last Glacial Maximum
superhighwaysspreadmodel
‘Superhighways’ of human movement in Sahul combined with a demographic cellular automaton
https://github.com/coganlab/cross_patient_speech_decoding
Cross-patient speech decoding on neural data aligned to a shared latent space
synthpred
A Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.
storage-ring-gravitational-wave-observatory
Simulating the detection of millihertz (mHz) gravitational waves (GWs) from astrophysical sources by a Storage Ring Gravitational-wave Observatory (SRGO).
pest-risk-decision
Code for our paper "Combining spatio-temporal pest risk prediction and decision theory to improve pest management in smallholder agriculture"
green-value-chains
Research software used for techno-economic analysis of the impact of global heterogeneity of renewable energy supply on heavy industrial production and green value chains
paper-nestedner-icdar23-code
All the material (code, dataset, results) of our Benchmark of Nested NER approaches accepted at ICDAR 2023
competing-representations-shape-evidence-accumulation
Human and simulated behavioral / small-scale neural data for paper: https://www.biorxiv.org/content/10.1101/2022.10.03.510668v2
Integrated hydrologic model development and postprocessing for GSFLOW using pyGSFLOW
Integrated hydrologic model development and postprocessing for GSFLOW using pyGSFLOW - Published in JOSS (2022)
edtools
Collection of tools for automated processing and clustering of electron diffraction data
https://github.com/bluebrain/brayns
Visualizer for large-scale and interactive ray-tracing of neurons
pylith
PyLith is a finite element code for the solution of dynamic and quasi-static tectonic deformation problems.
knn-transformers
PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an implementation of kNN-LM and kNN-MT
iea-15-240-rwt
15MW reference wind turbine repository developed in conjunction with IEA Wind
Cubature
One- and multi-dimensional adaptive integration routines for the Julia language
auto-fox
A library for analyzing potential energy surfaces (PESs) and using the resulting PES descriptors for constructing forcefield parameters.
markovjunior
Probabilistic language based on pattern matching and constraint propagation, 153 examples
Diart
Diart: A Python Library for Real-Time Speaker Diarization - Published in JOSS (2024)