GitHub
Bedtoolsr
Bedtoolsr: An R package for genomic data analysis and manipulation - Published in JOSS (2019)
bamtofastq
Converts BAM or CRAM files to FASTQ format and performs quality control.
microscope
Python library for control of microscope devices, supporting hardware triggers and distribution of devices over the network for performance and flexibility.
dmri-commit
Linear framework to combine tractography and tissue microstructure estimation with diffusion MRI
unlimiformer
Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
pyubx2
Python library for parsing and generating UBX GPS/GNSS protocol messages.
https://github.com/bluebrain/atlas-download-tools
Search, download, and prepare brain atlas data.
simpleitk
SimpleITK: a layer built on top of the Insight Toolkit (ITK), intended to simplify and facilitate ITK's use in rapid prototyping, education and interpreted languages.
opencage
R package for the OpenCage API, supporting both forward and reverse geocoding
mzcore
A Rust library for peptide centric mass spec calculations centered around ProForma and complex theoretical fragmentation
gas-reporting
Reference documentation on every gas price API and all the different formats
finetuning-scheduler
A PyTorch Lightning extension that accelerates and enhances foundation model experimentation with flexible fine-tuning schedules.
DINCAE
DINCAE (Data-Interpolating Convolutional Auto-Encoder) is a neural network to reconstruct missing data in satellite observations.
GFF3toEMBL
GFF3toEMBL: Preparing annotated assemblies for submission to EMBL - Published in JOSS (2016)
etrainee
E-learning course on Time Series Analysis in Remote Sensing for Understanding Human-Environment Interactions
datapackager
An R package to enable reproducible data processing, packaging and sharing.
brushnet
[ECCV 2024] The official implementation of the paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Pyinterpolate
Pyinterpolate: Spatial interpolation in Python for point measurements and aggregated datasets - Published in JOSS (2022)
https://github.com/awslabs/gluonts
Probabilistic time series modeling in Python
whatshap
Read-based phasing of genomic variants, also called haplotype assembly
https://github.com/awslabs/open-data-registry
A registry of publicly available datasets on AWS
rascal
Python package to reconstruct and extend observational climate series through empirical downscaling of large-scale models
ModalDecisionTrees
Julia implementation of Modal Decision Trees & Forests, for interpretable classification of spatial and temporal data. Long live Symbolic Learning!!
commonpy
Collection of common Python utility functions and classes used in other Caltech Library programs.
nnSVG
nnSVG: scalable method to identify spatially variable genes (SVGs) in spatially-resolved transcriptomics data
g2p
Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!
multgee
GEE solver for correlated nominal or ordinal multinomial responses using a local odds ratios parameterization.
academic-observatory-workflows
Telescopes, Workflows and Data Services for the Academic Observatory
ArrayInterface
Designs for new Base array interface primitives, used widely throughout scientific machine learning (SciML) and other organizations
edspdf
EDS-PDF is a generic, pure-Python framework for text extraction from PDF documents. It provides the machinery to use rule-based or machine-learning-based approaches to classify text blocks as body or metadata.
https://github.com/chains-project/bump
A dataset of reproducible breaking dependency updates, SANER 2024 (https://doi.org/10.1109/SANER60148.2024.00024)
python-dts-calibration
A Python package to load raw Distributed Temperature Sensing (DTS) files, perform a calibration, and plot the result.
https://github.com/microsoft/mlos
MLOS is a project to enable autotuning for systems.
how-to-create-a-map-for-print-and-web-using-qgis
QGIS Workshop Material: How to create a basic map for print and web using QGIS
pygac
A Python package to read and calibrate NOAA and Metop AVHRR GAC and LAC data
ugropy
A Python library designed to swiftly and effortlessly obtain the UNIFAC-like groups from molecules by their names and subsequently integrate them into inputs for thermodynamic libraries. UNIFAC, PSRK, Joback, and Abdulelah-Gani models are implemented.
differentiationinterface.jl
An interface to various automatic differentiation backends in Julia.
nat
NeuroAnatomy Toolbox: An R package for the (3D) visualisation and analysis of biological image data, especially tracings of single neurons.
code2seq
Code for the model presented in the paper: "code2seq: Generating Sequences from Structured Representations of Code"
pyprobables
Probabilistic data structures in Python (http://pyprobables.readthedocs.io/en/latest/index.html)
netascore
NetAScore - Network Assessment Score Toolbox for Sustainable Mobility
ctmm
Continuous-Time Movement Modeling. Functions for identifying, fitting, and applying continuous-space, continuous-time stochastic movement models to animal tracking data.
embedding-atlas
Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.
Pippin
Pippin: A pipeline for supernova cosmology - Published in JOSS (2020)
carculator
Prospective environmental and economic life cycle assessment of vehicles made blazing fast.
Lexedata
Lexedata: A toolbox to edit CLDF lexical datasets - Published in JOSS (2022)
pynmeagps
Python library for parsing and generating NMEA 0183 GNSS/GPS protocol messages.
privacy-meter
Privacy Meter: An open-source library to audit data privacy in statistical and machine learning algorithms.
https://github.com/afeinstein20/eleanor
A tool for light curve extraction from the TESS FFIs.
https://github.com/aiidateam/aiida-wannier90-workflows
A collection of advanced automated workflows to compute Wannier functions using AiiDA and the Wannier90 code
mcmicro
An end-to-end processing pipeline that transforms multi-channel whole-slide images into single-cell data.
marlin
[CVPR] MARLIN: Masked Autoencoder for facial video Representation LearnINg
datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
polygnn
PolyGNN: Polyhedron-based graph neural network for 3D building reconstruction from point clouds [ISPRS 2024]
heat-desalination
Simulation and optimisation of hybrid electric and thermal powered desalination systems
playground
An open-source library for GPU-accelerated robot learning and sim-to-real transfer.
qspool
Dependency-free solution to spool jobs into SLURM scheduler without exceeding queue capacity limits