cromwell
Scientific workflow engine designed for simplicity and scalability. Trivially transition from one-off use cases to massive-scale production environments.
io.joern:c2cpg_2.13
Open-source code analysis platform for C/C++/Java/Binary/Javascript/Python/Kotlin based on code property graphs. Discord https://discord.gg/vv4MH284Hc
com.linkedin.isolation-forest
A distributed Spark/Scala implementation of the isolation forest algorithm for unsupervised outlier detection, featuring support for scalable training and ONNX export for easy cross-platform inference.
mep
Project MEP: Meme Evolution Programme. A terraformed multi-language library for running statistical experiments on Twitter.
https://github.com/awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
https://github.com/bluebrain/nexus
Blue Brain Nexus - A knowledge graph for data-driven science
https://github.com/emergentorder/onnx-scala
An ONNX (Open Neural Network eXchange) API and backend for typeful, functional deep learning and classical machine learning in Scala 3
https://github.com/thoughtworksinc/deeplearning.scala
A simple library for creating complex neural networks
https://github.com/azavea/geotrellis-collections-api-research
A research project to investigate using GeoTrellis as a REST service
https://github.com/lamastex/scadamale
Scalable Data Science and Distributed Machine Learning course book, written by Raazesh Sainudiin and his WASP AI-Track PhD students
https://github.com/SETL-Framework/setl
A simple Spark-powered ETL framework that just works 🍺
https://github.com/databrickslabs/tempo
API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc.), AS OF joins, downsampling, and interpolation
https://github.com/azavea/azavea.g8
A Giter8 template for bootstrapping Scala projects at Azavea.
https://github.com/dataunitylab/jsonoid-discovery
Distributed JSON schema discovery
sustainability-analysis-tool
Implementation of a tool for analyzing the sustainability of business processes
https://github.com/alexeyev/mystem-scala
Wrapper for the Russian-language morphological analyzer `mystem`, for JVM languages