https://github.com/bobrenjc93/count
Repository
Basic Info
- Host: GitHub
- Owner: bobrenjc93
- Language: Rust
- Default Branch: main
- Size: 76.2 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Gorilla-Inspired Time Series Database
A high-performance, fault-tolerant, in-memory time series database built in Rust, inspired by Facebook's Gorilla paper. Designed for fast ingestion and compression of recent data with support for horizontal sharding, cross-region replication, and pluggable object storage backends.
🚀 Features
Core Capabilities
- High-Performance Compression: Delta-of-delta encoding for timestamps and XOR compression for floating-point values
- Fast Ingestion: Optimized for high-throughput time series data ingestion
- Efficient Queries: Range queries, aggregations, and correlation analysis
- 2-Hour Blocks: Automatic data organization into compressed 2-hour blocks
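To make the timestamp compression concrete, here is a minimal sketch of delta-of-delta encoding. It is not taken from this repository: the function names `delta_of_delta` and `decode_delta_of_delta` are hypothetical, and a real Gorilla-style encoder would go on to pack each value into variable-width bit fields instead of keeping full `i64` values.

```rust
/// Delta-of-delta encode millisecond timestamps (illustrative sketch).
/// Output layout: [first timestamp, first delta, delta-of-delta, ...].
pub fn delta_of_delta(timestamps: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(timestamps.len());
    let mut prev_ts = 0i64;
    let mut prev_delta = 0i64;
    for (i, &ts) in timestamps.iter().enumerate() {
        if i == 0 {
            // The first timestamp is stored verbatim.
            out.push(ts);
        } else {
            // Store how much the interval changed, not the interval itself.
            // For i == 1 the previous delta is 0, so this is the raw delta.
            let delta = ts - prev_ts;
            out.push(delta - prev_delta);
            prev_delta = delta;
        }
        prev_ts = ts;
    }
    out
}

/// Inverse of `delta_of_delta`: rebuild the original timestamps.
pub fn decode_delta_of_delta(encoded: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(encoded.len());
    let mut ts = 0i64;
    let mut delta = 0i64;
    for (i, &value) in encoded.iter().enumerate() {
        if i == 0 {
            ts = value;
        } else {
            delta += value; // recover the interval
            ts += delta;    // recover the timestamp
        }
        out.push(ts);
    }
    out
}
```

For samples that arrive at a steady interval, every entry after the first delta is 0, which is why regularly spaced series compress so well.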
Scalability & Reliability
- Horizontal Sharding: Consistent hashing with automatic rebalancing
- Cross-Region Replication: Write to multiple regions with failover support
- Fault Tolerance: WAL (Write-Ahead Logging) and checkpointing for crash recovery
- Object Storage Support: Pluggable backends (Local FS, S3, and custom implementations)
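The sharding bullet above can be illustrated with a small consistent-hash ring. This is a sketch under stated assumptions, not the repository's implementation: `HashRing`, `add_node`, `remove_node`, and `node_for` are hypothetical names, virtual nodes are assumed, and the standard library's `DefaultHasher` stands in for whatever hash function the project actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hash any hashable key to a position on the ring.
fn hash_key<T: Hash + ?Sized>(key: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical consistent-hash ring with virtual nodes.
pub struct HashRing {
    ring: BTreeMap<u64, String>, // ring position -> owning node
    vnodes: usize,               // virtual nodes per physical node
}

impl HashRing {
    pub fn new(vnodes: usize) -> Self {
        HashRing { ring: BTreeMap::new(), vnodes: vnodes }
    }

    /// Place `vnodes` positions for this node on the ring.
    pub fn add_node(&mut self, node: &str) {
        for i in 0..self.vnodes {
            self.ring.insert(hash_key(&(node, i)), node.to_string());
        }
    }

    /// Remove every ring position owned by this node.
    pub fn remove_node(&mut self, node: &str) {
        self.ring.retain(|_, owner| owner.as_str() != node);
    }

    /// Find the owner: first position at or after the key's hash,
    /// wrapping around to the start of the ring if needed.
    pub fn node_for(&self, series_key: &str) -> Option<&str> {
        let h = hash_key(series_key);
        self.ring
            .range(h..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, node)| node.as_str())
    }
}
```

The property that matters for rebalancing is that removing a node only reassigns the series that node owned; every other series keeps its owner.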
Advanced Analytics
- Correlation Analysis: Pearson and Spearman correlation with rolling windows
- Aggregation Functions: Mean, sum, min, max, percentiles, standard deviation
- Auto-correlation: Time series pattern analysis
- Roll-up Jobs: Background aggregation for coarse-grained data
🏗️ Architecture
The database is built with a modular architecture:
tsdb-core/ # Main database interface and coordination
├── compression/ # Delta-of-delta and XOR compression algorithms
├── storage/ # In-memory storage, WAL, checkpointing, object store abstraction
├── query/ # Query engine, aggregations, correlation analysis
├── replication/ # Cross-region replication and failover management
├── shard/ # Consistent hashing and shard management
└── tools/ # Benchmarking and testing utilities
🛠️ Quick Start
Installation
```bash
git clone <repository-url>
cd ods
cargo build --release
```
Basic Usage
```rust
use tsdb_core::{TimeSeriesDatabase, DatabaseConfig, DataPoint};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a database with the default configuration
    let database = TimeSeriesDatabase::new(DatabaseConfig::default()).await?;

    // Insert data points
    let timestamp = 1640995200000; // 2022-01-01 00:00:00 UTC
    let point = DataPoint::new(timestamp, 42.0);
    database.insert_point("cpu.usage", point).await?;

    // Query data
    let points = database.query_range(
        "cpu.usage",
        1640995200000, // start
        1640995260000  // end (1 minute later)
    ).await?;

    println!("Retrieved {} points", points.len());

    database.shutdown().await?;
    Ok(())
}
```
With S3 Backend
```rust
let database = TimeSeriesDatabase::new(DatabaseConfig::default())
    .await?
    .with_s3_store(
        "primary".to_string(),
        "my-timeseries-bucket".to_string(),
        "us-east-1".to_string(),
        None, // Use default S3 endpoint
        "access_key".to_string(),
        "secret_key".to_string(),
    ).await?;
```
🧪 Testing & Benchmarks
Run Tests
```bash
cargo test
```
Run Benchmarks
```bash
cargo run --bin tools benchmark
```
Interactive Demo
```bash
cargo run --bin tools demo
```
Test Compression
```bash
cargo run --bin tools test-compression
```
Test Correlation Analysis
```bash
cargo run --bin tools test-correlation
```
📊 Performance Characteristics
Compression Ratios
- Timestamps: 10-20x compression for regular intervals
- Values: 2-8x compression depending on data patterns
- Overall: Typically 3-10x compression ratio
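Why value compression depends on data patterns is easier to see from the XOR step itself. The sketch below follows the idea described in the Gorilla paper; `xor_window` is a hypothetical helper, not a function from this repository, and it only reports the bit window an encoder would need to store, without performing any bit packing.

```rust
/// XOR a value with its predecessor and describe the bits that differ.
/// Returns `None` when the values are bit-identical (an encoder can then
/// emit a single control bit), otherwise
/// `Some((leading_zero_bits, meaningful_bit_count))`.
pub fn xor_window(prev: f64, curr: f64) -> Option<(u32, u32)> {
    let xor = prev.to_bits() ^ curr.to_bits();
    if xor == 0 {
        return None;
    }
    let leading = xor.leading_zeros();
    let trailing = xor.trailing_zeros();
    // Only the bits between the leading and trailing zero runs carry data.
    Some((leading, 64 - leading - trailing))
}
```

For example, `xor_window(12.0, 24.0)` yields `Some((11, 1))`: the two values differ in a single exponent bit, so one meaningful bit is enough. Noisy values with unrelated mantissas produce wide windows and compress far less.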
Throughput (typical hardware)
- Writes: 100K-500K points/second per shard
- Reads: 1M+ points/second per shard
- Queries: Sub-millisecond for recent data blocks
Storage Efficiency
- Memory: Roughly the most recent 26 hours of data are kept in memory
- Disk/Object Storage: Compressed historical blocks
- Network: Minimal replication overhead with compression
🔧 Configuration
Database Configuration
```rust
use tsdb_core::DatabaseConfig;
use shard::ShardConfig;
use replication::ReplicationConfig;

let config = DatabaseConfig {
    shard_config: ShardConfig {
        shard_count: 256,
        replica_count: 2,
        rebalance_threshold: 0.1,
        ..Default::default()
    },
    replication_config: ReplicationConfig {
        replica_count: 2,
        enable_cross_region: true,
        ..Default::default()
    },
    enable_replication: true,
    enable_sharding: true,
    node_id: "node1".to_string(),
    region: "us-east-1".to_string(),
    ..Default::default()
};
```
Storage Configuration
```rust
use storage::PersistentStorageConfig;

let storage_config = PersistentStorageConfig {
    block_duration_hours: 2,
    checkpoint_interval_hours: 6,
    wal_sync_interval_ms: 1000,
    max_wal_size_mb: 100,
    compression_enabled: true,
    object_store_name: Some("s3".to_string()),
};
```
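As a rough illustration of what a 2-hour block duration means for incoming points, the sketch below maps a timestamp to the block that would contain it. It assumes blocks are aligned to the Unix epoch, which the README does not state; `block_start` and `block_range` are hypothetical helpers, not part of the crate.

```rust
const HOUR_MS: i64 = 3_600_000;

/// Start of the block containing `timestamp_ms`, assuming blocks are
/// aligned to the Unix epoch (an assumption, not confirmed by the source).
/// `block_duration_hours` must be greater than zero.
pub fn block_start(timestamp_ms: i64, block_duration_hours: i64) -> i64 {
    let block_ms = block_duration_hours * HOUR_MS;
    // div_euclid rounds toward negative infinity, so timestamps before
    // the epoch also land in the correct block.
    timestamp_ms.div_euclid(block_ms) * block_ms
}

/// Half-open range [start, end) of the block containing `timestamp_ms`.
pub fn block_range(timestamp_ms: i64, block_duration_hours: i64) -> (i64, i64) {
    let start = block_start(timestamp_ms, block_duration_hours);
    (start, start + block_duration_hours * HOUR_MS)
}
```

With a 2-hour duration, a point at 01:30 UTC and a point at 00:00 UTC on the same day fall into the same block, while a point at 02:00 UTC opens the next one.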
🌐 Object Storage Abstraction
The database supports pluggable object storage backends:
Local File System
```rust
use storage::{LocalFileStore, ObjectStoreManager};

let mut store_manager = ObjectStoreManager::new();
let local_store = Box::new(LocalFileStore::new("./data")?);
store_manager.add_store("local".to_string(), local_store);
```
S3-Compatible Storage
```rust
use storage::{S3Store, S3Credentials};

let credentials = S3Credentials {
    access_key: "your_access_key".to_string(),
    secret_key: "your_secret_key".to_string(),
    session_token: None,
};

let s3_store = Box::new(S3Store::new(
    "bucket-name".to_string(),
    "us-east-1".to_string(),
    None, // Default endpoint
    credentials,
));
```
Custom Storage Backend
Implement the ObjectStore trait for custom backends:
```rust
use storage::ObjectStore;
use async_trait::async_trait;

struct MyCustomStore;

#[async_trait]
impl ObjectStore for MyCustomStore {
    // The source truncates this signature after `Vec`; the `u8` payload and return type are assumed.
    async fn put_object(&self, key: &str, data: Vec<u8>) -> storage::Result<()> {
        // Your implementation
        Ok(())
    }
    async fn get_object(&self, key: &str) -> storage::Result<Vec<u8>> {
        // Your implementation
        Ok(vec![])
    }
    // ... implement other required methods
}
```
🔍 Query Capabilities
Basic Queries
```rust
// Range query
let points = database.query_range("series_key", start_time, end_time).await?;

// Get latest point
let latest = database.get_latest_point("series_key").await?;

// Series information
let info = database.get_series_info("series_key").await?;
```
Advanced Queries
```rust
use query::{QueryRequest, AggregationType};

let request = QueryRequest {
    series_keys: vec!["cpu.usage".to_string()],
    start_time: 1640995200000,
    end_time: 1640998800000,
    aggregation: Some(AggregationType::Mean),
    step_ms: Some(60000), // 1-minute buckets
};

let results = database.query(request).await?;
```
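To show what a mean aggregation over 1-minute buckets computes, here is a standalone sketch that works on plain `(timestamp_ms, value)` pairs. It is an illustration only: `mean_by_bucket` is a hypothetical function, and the crate's query engine may bucket or align points differently.

```rust
use std::collections::BTreeMap;

/// Group points into fixed-width buckets and return the mean of each bucket
/// as `(bucket_start_ms, mean)`, ordered by time. `step_ms` must be > 0.
pub fn mean_by_bucket(points: &[(i64, f64)], step_ms: i64) -> Vec<(i64, f64)> {
    // bucket start -> (running sum, point count)
    let mut buckets: BTreeMap<i64, (f64, u64)> = BTreeMap::new();
    for &(ts, value) in points {
        let start = ts.div_euclid(step_ms) * step_ms;
        let entry = buckets.entry(start).or_insert((0.0, 0));
        entry.0 += value;
        entry.1 += 1;
    }
    buckets
        .into_iter()
        .map(|(start, (sum, count))| (start, sum / count as f64))
        .collect()
}
```

With `step_ms = 60000`, two points at 00:00:00 and 00:00:30 are averaged into one bucket, and a point at 00:01:00 starts the next bucket.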
Correlation Analysis
```rust
use query::CorrelationEngine;

let engine = CorrelationEngine::new();
let correlation = engine.pearson_correlation(&series_a, &series_b)?;
let rolling_corr = engine.rolling_correlation(&series_a, &series_b, 50)?;
```
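For readers unfamiliar with the statistic, the following sketch computes a Pearson coefficient and a rolling variant over plain `f64` slices. The functions `pearson` and `rolling_pearson` are hypothetical stand-ins and make no claim about how `CorrelationEngine` is implemented or what types it accepts.

```rust
/// Pearson correlation of two equally long series.
/// Returns `None` for mismatched lengths, fewer than two points,
/// or a series with zero variance (the coefficient is undefined).
pub fn pearson(a: &[f64], b: &[f64]) -> Option<f64> {
    if a.len() != b.len() || a.len() < 2 {
        return None;
    }
    let n = a.len() as f64;
    let mean_a = a.iter().sum::<f64>() / n;
    let mean_b = b.iter().sum::<f64>() / n;
    let (mut cov, mut var_a, mut var_b) = (0.0, 0.0, 0.0);
    for (x, y) in a.iter().zip(b.iter()) {
        let dx = x - mean_a;
        let dy = y - mean_b;
        cov += dx * dy;
        var_a += dx * dx;
        var_b += dy * dy;
    }
    let denom = (var_a * var_b).sqrt();
    if denom == 0.0 { None } else { Some(cov / denom) }
}

/// Pearson correlation over each sliding window of `window` points.
pub fn rolling_pearson(a: &[f64], b: &[f64], window: usize) -> Vec<Option<f64>> {
    if window == 0 {
        return Vec::new();
    }
    a.windows(window)
        .zip(b.windows(window))
        .map(|(wa, wb)| pearson(wa, wb))
        .collect()
}
```

A rolling window of 50, as in the call above, would produce one coefficient per position once 50 points are available.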
🚦 Operational Features
Health Monitoring
```rust
let stats = database.get_database_stats().await?;
println!("Total series: {}", stats.total_series);
println!("Compression ratio: {:.2}x", stats.compression_ratio);
```
Data Lifecycle Management
```rust
// Clean up old data (older than 24 hours)
let cutoff = current_timestamp_ms() - 24 * 3600 * 1000;
let cleaned_points = database.cleanup_old_data(cutoff).await?;
```
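The snippet above calls `current_timestamp_ms()` without showing where it comes from. If the crate does not export such a helper (the README does not say), one possible definition using only the standard library is sketched below; `retention_cutoff_ms` is likewise a hypothetical convenience function.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Current wall-clock time in milliseconds since the Unix epoch.
pub fn current_timestamp_ms() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set before the Unix epoch")
        .as_millis() as i64
}

/// Timestamp before which data is considered expired.
pub fn retention_cutoff_ms(now_ms: i64, retention_hours: i64) -> i64 {
    now_ms - retention_hours * 3_600_000
}
```

Passing `retention_cutoff_ms(current_timestamp_ms(), 24)` as the cutoff is equivalent to the inline arithmetic used above.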
Checkpointing
```rust
// Manual checkpoint creation
let checkpoints = database.create_checkpoint().await?;
```
🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Run the test suite: cargo test
- Submit a pull request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Inspired by Facebook's Gorilla time series database
- Built with the Rust ecosystem's excellent crates
- Thanks to the time series database community for insights and best practices
Owner
- Name: Bob Ren
- Login: bobrenjc93
- Kind: user
- Location: Bay Area
- Company: Meta
- Repositories: 1
- Profile: https://github.com/bobrenjc93
GitHub Events
Total
- Push event: 3
- Pull request event: 2
- Create event: 4
Last Year
- Push event: 3
- Pull request event: 2
- Create event: 4
Dependencies
- tempfile 3.0 development