phishing-dataset
Phishing dataset with more than 88,000 instances and 111 features. Web application available at https://gregavrbancic.github.io/Phishing-Dataset/
4dnmetadataschemaxsd2jsonconverter
This is a converter written in Java that translates an XSD microscopy metadata schema into JSON
MsBackendRawFileReader
Spectra MsBackend for Thermo Fisher Scientific's New RawFileReader
dagmc-h5m-file-inspector
Extracts information from DAGMC h5m files, including volume numbers and material tags
https://github.com/bayer-group/tiffslide
TiffSlide - cloud native openslide-python replacement based on tifffile
fft-conv-pytorch
Implementation of 1D, 2D, and 3D FFT convolutions in PyTorch. Much faster than direct convolutions for large kernel sizes.
tmu
Implements the Tsetlin Machine, Coalesced Tsetlin Machine, Convolutional Tsetlin Machine, Regression Tsetlin Machine, and Weighted Tsetlin Machine, with support for continuous features, drop clause, Type III Feedback, focused negative sampling, multi-task classifier, autoencoder, literal budget, and one-vs-one multi-class classifier. TMU is written in Python with wrappers for C and CUDA-based clause evaluation and updating.
sweet
Official repository for Semantic Web for Earth and Environmental Terminology (SWEET) Ontologies
@stdlib/dist-datasets-cmudict
✨ Standard library for JavaScript and Node.js. ✨
inquirer
A collection of common interactive command line user interfaces, based on Inquirer.js (https://github.com/SBoudrias/Inquirer.js/)
https://github.com/cidgoh/dataharmonizer
A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
gridtools
Libraries and utilities to develop performance portable applications for weather and climate.
benchmarl
BenchMARL is a library for benchmarking Multi-Agent Reinforcement Learning (MARL). BenchMARL lets you quickly compare different MARL algorithms, tasks, and models while being systematically grounded in its two core tenets: reproducibility and standardization.
cve-llm_dataset
This is a dataset intended to train an LLM for completely CVE-focused input and output.
minimalmodbus
Easy-to-use Modbus RTU and Modbus ASCII implementation for Python.
https://github.com/elementary-data/elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
mastercurves
Python package for automatically superimposing data sets to create a master curve, using Gaussian process regression and maximum a posteriori estimation.
Bayesian_LSP
A Bayesian hierarchical model that quantifies long-term annual land surface phenology from sparse time series of vegetation indices.
deepfastmlu
Machine learning utilities to help speed up the prototyping process.
samplics
samplics: a Python package for selecting, weighting, and analyzing data from complex sampling designs. Published in JOSS (2021)
youtube-unofficial
Access parts of your account unavailable through normal YouTube API access.
@stdlib/blas-ext-base-sapxsumpw
Adds a constant to each single-precision floating-point strided array element and computes the sum using pairwise summation.
https://github.com/arkworks-rs/snark
Interfaces for Relations and SNARKs for these relations
gnfinder
GNfinder finds scientific names in UTF8 texts, PDF files, MS Word/Excel documents, URLs etc.
https://github.com/afsc-gap-products/akgfmaps
AFSC fishery-independent survey geospatial features and maps
pydiffgame
PyDiffGame is a Python implementation of a Nash Equilibrium solution to Differential Games, based on a reduction of Game Hamilton-Jacobi-Bellman (GHJB) equations to Game Algebraic and Differential Riccati equations, associated with Multi-Objective Dynamical Control Systems
cTRAP
Identification of candidate causal perturbations from differential gene expression data
https://github.com/astropy/pytest-astropy
Metapackage for all the testing machinery used by the Astropy Project
ccn-data-library
The Coastal Carbon Network Data Library: An open-source database featuring carbon data from tidal wetlands around the world
eco2ai
eco2AI is a Python library that accumulates statistics about power consumption and CO2 emissions while code is running.
mverse
R package for multiverse analysis. Extends the R package multiverse with student- and analyst-friendly interfaces.
bio_data_guide
Standardizing Marine Biological Data Working Group - An open community to facilitate the mobilization of biological data to OBIS.
https://github.com/simple-statistics/simple-statistics
simple statistics for node & browser javascript
https://github.com/beehive-lab/mambo
A low-overhead dynamic binary instrumentation and modification tool for ARM (both AArch32 and AArch64 support) and RISC-V (RV64GC).
ARTool
R Package for Aligned Rank Transform for Nonparametric Factorial ANOVAs
https://github.com/aliireza/iommu-bench
Overcoming the IOTLB Wall for Multi-100-Gbps Linux-based Networking
https://github.com/ocbe-uio/trajpy
Trajpy - empowering feature engineering for trajectory analysis across domains.
CommonRLSpaces
A collection of structures to define observation or action spaces of Reinforcement Learning environments. [May be moved into CommonRLInterface once stable]
test-datasets
Test data to be used for automated testing with the nf-core pipelines
dawdreamer
Digital Audio Workstation with Python; VST instruments/effects, parameter automation, FAUST, JAX, Warp Markers, and JUCE processors
survey-live-temperature-map
These scripts create daily survey station temperature and anomaly plots as the ships work their way through the Bering Sea. These ships are conducting NOAA Fisheries' Alaska Fisheries Science Center's fishery-independent surveys in the Eastern Bering Sea. The scripts pull temperatures from Google Drive, entered by FPCs at sea, create daily maps and composite GIFs, and then push the maps to Google Drive for the communications team.
moltemplate
A general cross-platform tool for preparing simulations of molecules and complex molecular assemblies
ClusterAnalysis
Cluster Algorithms from Scratch with Julia Lang. (K-Means and DBSCAN)
eclib
The eclib package includes mwrank (for 2-descent on elliptic curves over Q) and modular symbol code used to create the elliptic curve database.
@stdlib/stats-base-nanstdevch
Calculate the standard deviation of a strided array ignoring NaN values and using a one-pass trial mean algorithm.
https://github.com/arpit-babbar/trixilw.jl
Lax-Wendroff Flux Reconstruction on curvilinear grids
unicore
Universal and efficient structure-based core gene phylogeny with Foldseek and ProstT5
persia
High performance distributed framework for training deep learning recommendation models based on PyTorch.
faststream
FastStream is a powerful and easy-to-use Python framework for building asynchronous services interacting with event streams such as Apache Kafka, RabbitMQ, NATS and Redis.
DASKR
Interface to DASKR, a differential algebraic system solver for the SciML scientific machine learning ecosystem