cottoncandy
cottoncandy: scientific python package for easy cloud storage - Published in JOSS (2018)
https://github.com/airbytehq/airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://github.com/awslabs/mountpoint-s3
A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.
https://github.com/salesforce/cantor
Data abstraction, storage, discovery, and serving system
https://github.com/beomi/typora-s3-uploader
CLI file/img S3 uploader for Typora
https://github.com/awslabs/payload-offloading-java-common-lib-for-aws
Shared library between AWS extended messaging clients to manage payloads larger than their limits.
https://github.com/blaylockbk/horels3-archive
Details, scripts, and examples for using the Horel-Group object archive on CHPC's Pando system.
asli-pipeline
This repository contains a pipeline for operational execution of the Amundsen Sea Ice Low calculations, provided in the asli package. The functions in the asli package are described in detail in the package repository amundsen-sea-low-index.
https://github.com/awslabs/amazon-s3-find-and-forget
Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)
https://github.com/rumbledb/rumble
⛈️ RumbleDB 2.0.0 "Lemon Ironwood" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
https://github.com/alan-turing-institute/s3toazure
Move data from an Amazon S3 bucket using parallel connections
https://github.com/awslabs/aws-groundstation-eos-pipeline
This guide helps you create an automated Earth Observation pipeline on AWS.
https://github.com/crowdstrike/kafka-replicator
Kafka replicator is a tool used to mirror and back up Kafka topics across regions
https://github.com/awslabs/aws-crt-s3-benchmarks
Benchmarking for multiple AWS S3 libraries.
https://github.com/awslabs/mountpoint-s3-csi-driver
Built on Mountpoint for Amazon S3, the Mountpoint CSI driver presents an Amazon S3 bucket as a storage volume accessible by containers in your Kubernetes cluster.
https://github.com/awslabs/aws-java-nio-spi-for-s3
A Java NIO.2 service provider for Amazon S3
https://github.com/awslabs/cloudfront-hosting-toolkit
CloudFront Hosting Toolkit offers the convenience of a managed frontend hosting service while retaining full control over the hosting and deployment infrastructure to make it your own.
https://github.com/timmikeladze/rehiver
🐝 Super-charge your S3 Hive-partition-based file operations with intelligent pattern matching, change detection, optimized data-fetching, and out-of-the-box time series support.