https://github.com/bobrenjc93/count

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.4%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: bobrenjc93
  • Language: Rust
  • Default Branch: main
  • Size: 76.2 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 12 months ago · Last pushed 12 months ago
Metadata Files
Readme

README.md

Gorilla-Inspired Time Series Database

A high-performance, fault-tolerant, in-memory time series database built in Rust, inspired by Facebook's Gorilla paper. Designed for fast ingestion and compression of recent data with support for horizontal sharding, cross-region replication, and pluggable object storage backends.

🚀 Features

Core Capabilities

  • High-Performance Compression: Delta-of-delta encoding for timestamps and XOR compression for floating-point values
  • Fast Ingestion: Optimized for high-throughput time series data ingestion
  • Efficient Queries: Range queries, aggregations, and correlation analysis
  • 2-Hour Blocks: Automatic data organization into compressed 2-hour blocks
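
As a rough illustration of the delta-of-delta idea listed above, the sketch below stores the first timestamp, then the first delta, then only the change between consecutive deltas. The function names `encode_delta_of_delta` and `decode_delta_of_delta` are hypothetical and are not taken from this repository; a real Gorilla-style encoder would also bit-pack the results, which this sketch leaves out.

```rust
/// Encode timestamps as: first value, first delta, then delta-of-deltas.
/// Hypothetical helper for illustration; no bit-packing is performed.
pub fn encode_delta_of_delta(timestamps: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(timestamps.len());
    let mut prev_ts = 0i64;
    let mut prev_delta = 0i64;
    for (i, &ts) in timestamps.iter().enumerate() {
        if i == 0 {
            // The first timestamp is stored verbatim.
            out.push(ts);
        } else {
            // prev_delta starts at 0, so the second entry is the plain delta.
            let delta = ts - prev_ts;
            out.push(delta - prev_delta);
            prev_delta = delta;
        }
        prev_ts = ts;
    }
    out
}

/// Invert `encode_delta_of_delta` by accumulating deltas back into timestamps.
pub fn decode_delta_of_delta(encoded: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(encoded.len());
    let mut prev_ts = 0i64;
    let mut prev_delta = 0i64;
    for (i, &value) in encoded.iter().enumerate() {
        let ts = if i == 0 {
            value
        } else {
            prev_delta += value;
            prev_ts + prev_delta
        };
        out.push(ts);
        prev_ts = ts;
    }
    out
}
```

For points arriving at a fixed interval, every entry after the second is zero, which is why regularly spaced timestamps compress so well.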

Scalability & Reliability

  • Horizontal Sharding: Consistent hashing with automatic rebalancing
  • Cross-Region Replication: Write to multiple regions with failover support
  • Fault Tolerance: WAL (Write-Ahead Logging) and checkpointing for crash recovery
  • Object Storage Support: Pluggable backends (Local FS, S3, and custom implementations)
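
To make the consistent hashing idea concrete, here is a minimal sketch of a hash ring with virtual nodes. It is an assumption about how such a ring could look, not the `shard` crate's actual implementation: `HashRing`, `add_node`, `remove_node`, and `node_for` are hypothetical names, and it uses the standard library's `DefaultHasher`, whose output is not guaranteed stable across Rust releases, so a real deployment would pick a fixed hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hash any hashable value to a position on the ring.
fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical consistent-hash ring: ring position -> owning node.
pub struct HashRing {
    ring: BTreeMap<u64, String>,
    vnodes: usize,
}

impl HashRing {
    pub fn new(vnodes: usize) -> Self {
        Self { ring: BTreeMap::new(), vnodes }
    }

    /// Place `vnodes` virtual points for this node on the ring.
    pub fn add_node(&mut self, node: &str) {
        for i in 0..self.vnodes {
            self.ring.insert(hash_of(&(node, i)), node.to_string());
        }
    }

    /// Drop every virtual point owned by this node.
    pub fn remove_node(&mut self, node: &str) {
        self.ring.retain(|_, owner| owner.as_str() != node);
    }

    /// The owner is the first virtual point at or after the key's hash,
    /// wrapping around to the start of the ring if needed.
    pub fn node_for(&self, series_key: &str) -> Option<&str> {
        let h = hash_of(series_key);
        self.ring
            .range(h..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, owner)| owner.as_str())
    }
}
```

The useful property is that removing a node only reassigns the series that node owned; every other series keeps its owner, which limits data movement during rebalancing.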

Advanced Analytics

  • Correlation Analysis: Pearson and Spearman correlation with rolling windows
  • Aggregation Functions: Mean, sum, min, max, percentiles, standard deviation
  • Auto-correlation: Time series pattern analysis
  • Roll-up Jobs: Background aggregation for coarse-grained data
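
For readers unfamiliar with the correlation features listed above, the following is a small standalone sketch of Pearson correlation and a rolling variant over fixed-size windows. The functions `pearson` and `rolling_pearson` are hypothetical and are not the `query` crate's API; they only illustrate the underlying arithmetic.

```rust
/// Pearson correlation coefficient of two equally long series.
/// Returns None when the inputs are mismatched, too short, or constant.
pub fn pearson(x: &[f64], y: &[f64]) -> Option<f64> {
    if x.len() != y.len() || x.len() < 2 {
        return None;
    }
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;

    let (mut cov, mut var_x, mut var_y) = (0.0f64, 0.0f64, 0.0f64);
    for (a, b) in x.iter().zip(y.iter()) {
        let dx = a - mean_x;
        let dy = b - mean_y;
        cov += dx * dy;
        var_x += dx * dx;
        var_y += dy * dy;
    }

    let denom = (var_x * var_y).sqrt();
    if denom == 0.0 {
        None // at least one series has zero variance
    } else {
        Some(cov / denom)
    }
}

/// Pearson correlation over each sliding window of `window` points.
pub fn rolling_pearson(x: &[f64], y: &[f64], window: usize) -> Vec<Option<f64>> {
    if window == 0 || x.len() != y.len() {
        return Vec::new();
    }
    x.windows(window)
        .zip(y.windows(window))
        .map(|(a, b)| pearson(a, b))
        .collect()
}
```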

🏗️ Architecture

The database is built with a modular architecture:

```
tsdb-core/       # Main database interface and coordination
├── compression/ # Delta-of-delta and XOR compression algorithms
├── storage/     # In-memory storage, WAL, checkpointing, object store abstraction
├── query/       # Query engine, aggregations, correlation analysis
├── replication/ # Cross-region replication and failover management
├── shard/       # Consistent hashing and shard management
└── tools/       # Benchmarking and testing utilities
```

🛠️ Quick Start

Installation

```bash
git clone <repository-url>
cd ods
cargo build --release
```

Basic Usage

```rust
use tsdb_core::{TimeSeriesDatabase, DatabaseConfig, DataPoint};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create database with default configuration
    let database = TimeSeriesDatabase::new(DatabaseConfig::default()).await?;

    // Insert data points
    let timestamp = 1640995200000; // 2022-01-01 00:00:00 UTC, in milliseconds
    let point = DataPoint::new(timestamp, 42.0);

    database.insert_point("cpu.usage", point).await?;

    // Query data
    let points = database.query_range(
        "cpu.usage",
        1640995200000,  // start
        1640995260000   // end (1 minute later)
    ).await?;

    println!("Retrieved {} points", points.len());

    // Shut down cleanly before exiting
    database.shutdown().await?;
    Ok(())
}
```

With S3 Backend

```rust
let database = TimeSeriesDatabase::new(DatabaseConfig::default())
    .await?
    .with_s3_store(
        "primary".to_string(),
        "my-timeseries-bucket".to_string(),
        "us-east-1".to_string(),
        None, // Use default S3 endpoint
        "access_key".to_string(),
        "secret_key".to_string(),
    ).await?;
```

🧪 Testing & Benchmarks

Run Tests

```bash
cargo test
```

Run Benchmarks

```bash
cargo run --bin tools benchmark
```

Interactive Demo

```bash
cargo run --bin tools demo
```

Test Compression

```bash
cargo run --bin tools test-compression
```

Test Correlation Analysis

```bash
cargo run --bin tools test-correlation
```

📊 Performance Characteristics

Compression Ratios

  • Timestamps: 10-20x compression for regular intervals
  • Values: 2-8x compression depending on data patterns
  • Overall: Typically 3-10x compression ratio
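
The value compression figures depend on how similar consecutive floating-point values are. The sketch below shows the core of the XOR idea under simple assumptions: XOR each value's bit pattern with the previous one and count how many bits actually need storing. `xor_deltas` and `meaningful_bits` are hypothetical names, and the control bits and bit-packing a full Gorilla-style encoder uses are omitted.

```rust
/// XOR the IEEE 754 bit pattern of each value with its predecessor.
/// Identical neighbours yield 0; similar neighbours yield mostly zero bits.
pub fn xor_deltas(values: &[f64]) -> Vec<u64> {
    values
        .windows(2)
        .map(|pair| pair[0].to_bits() ^ pair[1].to_bits())
        .collect()
}

/// Number of bits between the leading and trailing zero runs of an XOR
/// result, i.e. the payload a bit-packing encoder would have to store.
pub fn meaningful_bits(xor: u64) -> u32 {
    if xor == 0 {
        0
    } else {
        64 - xor.leading_zeros() - xor.trailing_zeros()
    }
}
```

A repeated value costs no payload bits at all, and a small change such as 42.0 to 42.5 differs in a single mantissa bit, which is where slowly changing metrics get most of their compression.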

Throughput (typical hardware)

  • Writes: 100K-500K points/second per shard
  • Reads: 1M+ points/second per shard
  • Queries: Sub-millisecond for recent data blocks

Storage Efficiency

  • Memory: ~26 hours of recent data in memory
  • Disk/Object Storage: Compressed historical blocks
  • Network: Minimal replication overhead with compression

🔧 Configuration

Database Configuration

```rust
use tsdb_core::DatabaseConfig;
use shard::ShardConfig;
use replication::ReplicationConfig;

let config = DatabaseConfig {
    shard_config: ShardConfig {
        shard_count: 256,
        replica_count: 2,
        rebalance_threshold: 0.1,
        ..Default::default()
    },
    replication_config: ReplicationConfig {
        replica_count: 2,
        enable_cross_region: true,
        ..Default::default()
    },
    enable_replication: true,
    enable_sharding: true,
    node_id: "node1".to_string(),
    region: "us-east-1".to_string(),
    ..Default::default()
};
```

Storage Configuration

```rust
use storage::PersistentStorageConfig;

let storage_config = PersistentStorageConfig {
    block_duration_hours: 2,
    checkpoint_interval_hours: 6,
    wal_sync_interval_ms: 1000,
    max_wal_size_mb: 100,
    compression_enabled: true,
    object_store_name: Some("s3".to_string()),
};
```
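
To show what a block duration of 2 hours means for an individual point, the sketch below maps a millisecond timestamp to the start of the block that would contain it. This assumes blocks are aligned to multiples of the block duration since the Unix epoch, which the README does not state explicitly; `block_start_ms` and `same_block` are hypothetical helpers.

```rust
const MS_PER_HOUR: i64 = 3_600_000;

/// Start (in ms since the Unix epoch) of the block containing `timestamp_ms`,
/// assuming blocks are aligned to multiples of the block duration.
pub fn block_start_ms(timestamp_ms: i64, block_duration_hours: i64) -> i64 {
    assert!(block_duration_hours > 0, "block duration must be positive");
    let block_ms = block_duration_hours * MS_PER_HOUR;
    // rem_euclid keeps the result correct for negative timestamps too.
    timestamp_ms - timestamp_ms.rem_euclid(block_ms)
}

/// Whether two timestamps fall into the same block.
pub fn same_block(a_ms: i64, b_ms: i64, block_duration_hours: i64) -> bool {
    block_start_ms(a_ms, block_duration_hours) == block_start_ms(b_ms, block_duration_hours)
}
```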

🌐 Object Storage Abstraction

The database supports pluggable object storage backends:

Local File System

```rust
use storage::{LocalFileStore, ObjectStoreManager};

let mut store_manager = ObjectStoreManager::new();
let local_store = Box::new(LocalFileStore::new("./data")?);
store_manager.add_store("local".to_string(), local_store);
```

S3-Compatible Storage

```rust
use storage::{S3Store, S3Credentials};

let credentials = S3Credentials {
    access_key: "your_access_key".to_string(),
    secret_key: "your_secret_key".to_string(),
    session_token: None,
};

let s3_store = Box::new(S3Store::new(
    "bucket-name".to_string(),
    "us-east-1".to_string(),
    None, // Default endpoint
    credentials,
));
```

Custom Storage Backend

Implement the ObjectStore trait for custom backends:

```rust
use storage::ObjectStore;
use async_trait::async_trait;

struct MyCustomStore;

#[async_trait]
impl ObjectStore for MyCustomStore {
    async fn put_object(&self, key: &str, data: Vec<u8>) -> storage::Result<()> {
        // Your implementation
        Ok(())
    }

    async fn get_object(&self, key: &str) -> storage::Result<Vec<u8>> {
        // Your implementation
        Ok(vec![])
    }

    // ... implement other required methods
}
```

🔍 Query Capabilities

Basic Queries

```rust
// Range query
let points = database.query_range("series_key", start_time, end_time).await?;

// Get latest point
let latest = database.get_latest_point("series_key").await?;

// Series information
let info = database.get_series_info("series_key").await?;
```

Advanced Queries

```rust
use query::{QueryRequest, AggregationType};

let request = QueryRequest {
    series_keys: vec!["cpu.usage".to_string()],
    start_time: 1640995200000,
    end_time: 1640998800000, // 1 hour after start_time
    aggregation: Some(AggregationType::Mean),
    step_ms: Some(60000), // 1-minute buckets
};

let results = database.query(request).await?;
```
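
As an illustration of what a mean aggregation with a step size computes, the sketch below groups `(timestamp_ms, value)` pairs into fixed-width buckets starting at the query start and averages each bucket. It is a standalone approximation under stated assumptions, not the query engine's code; `mean_by_step` is a hypothetical name and the tuple representation of points is an assumption.

```rust
use std::collections::BTreeMap;

/// Average points into buckets of `step_ms` milliseconds, aligned to `start_ms`.
/// Returns (bucket_start_ms, mean) pairs in ascending time order.
pub fn mean_by_step(points: &[(i64, f64)], start_ms: i64, step_ms: i64) -> Vec<(i64, f64)> {
    assert!(step_ms > 0, "step must be positive");
    // bucket start -> (running sum, point count)
    let mut buckets: BTreeMap<i64, (f64, u32)> = BTreeMap::new();
    for &(ts, value) in points {
        if ts < start_ms {
            continue; // outside the queried range
        }
        let bucket_start = start_ms + (ts - start_ms) / step_ms * step_ms;
        let entry = buckets.entry(bucket_start).or_insert((0.0, 0));
        entry.0 += value;
        entry.1 += 1;
    }
    buckets
        .into_iter()
        .map(|(start, (sum, count))| (start, sum / count as f64))
        .collect()
}
```

Buckets with no points are simply absent from the result in this sketch; whether the real engine emits gaps or fills them is not specified in the README.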

Correlation Analysis

```rust
use query::CorrelationEngine;

let engine = CorrelationEngine::new();
let correlation = engine.pearson_correlation(&series_a, &series_b)?;
let rolling_corr = engine.rolling_correlation(&series_a, &series_b, 50)?;
```

🚦 Operational Features

Health Monitoring

```rust
let stats = database.get_database_stats().await?;
println!("Total series: {}", stats.total_series);
println!("Compression ratio: {:.2}x", stats.compression_ratio);
```

Data Lifecycle Management

```rust
// Clean up old data (older than 24 hours)
let cutoff = current_timestamp_ms() - 24 * 3600 * 1000;
let cleaned_points = database.cleanup_old_data(cutoff).await?;
```
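
The cleanup snippet relies on a `current_timestamp_ms()` helper that the README does not define. One plausible way to write it with the standard library, together with a hypothetical `retention_cutoff_ms` helper that makes the cutoff arithmetic testable, is sketched below; both are assumptions rather than the project's own code.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Current wall-clock time in milliseconds since the Unix epoch.
/// Falls back to 0 if the system clock is set before the epoch.
pub fn current_timestamp_ms() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|elapsed| elapsed.as_millis() as i64)
        .unwrap_or(0)
}

/// Timestamp before which data is considered expired.
pub fn retention_cutoff_ms(now_ms: i64, retention_hours: i64) -> i64 {
    now_ms - retention_hours * 3600 * 1000
}
```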

Checkpointing

```rust
// Manual checkpoint creation
let checkpoints = database.create_checkpoint().await?;
```

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Run the test suite: cargo test
  6. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Inspired by Facebook's Gorilla time series database
  • Built with the Rust ecosystem's excellent crates
  • Thanks to the time series database community for insights and best practices

Owner

  • Name: Bob Ren
  • Login: bobrenjc93
  • Kind: user
  • Location: Bay Area
  • Company: Meta

GitHub Events

Total
  • Push event: 3
  • Pull request event: 2
  • Create event: 4
Last Year
  • Push event: 3
  • Pull request event: 2
  • Create event: 4

Dependencies

Cargo.toml cargo
compression/Cargo.toml cargo
query/Cargo.toml cargo
replication/Cargo.toml cargo
shard/Cargo.toml cargo
storage/Cargo.toml cargo
  • tempfile 3.0 development
tools/Cargo.toml cargo
tsdb-core/Cargo.toml cargo