measure-traces

Repo for measuring (Internet) traces. Evaluate micro-batching in data stream processing and the impact of a batch loss on Count-Min Sketch estimation error.

https://github.com/dianacohencs/measure-traces

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.4%) to scientific vocabulary

Keywords

batching count-min-sketch recovering-data stream-processing
Last synced: 6 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: DianaCohenCS
  • License: gpl-3.0
  • Language: Go
  • Default Branch: master
  • Homepage:
  • Size: 31.3 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
batching count-min-sketch recovering-data stream-processing
Created over 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

Measure Traces by Diana Cohen

Project definition

Measure the Internet traces in several steps:

* Organize the trace files in the data folder (out of scope).
* Process the traces using Go, creating metadata files in CSV format (the outfiles folder is out of scope as well).
* Generate plots using Python from the metadata files created in the previous step.

Content description

Measure beta, the average frequency per flow within a batch:

* traceshell.sh - defines the trace names along with the corresponding id-length and a set of batch sizes, then runs the Go programs.
  - run the shell file: bash traceshell.sh
* traceall.go - generates the basic metadata for a given trace, i.e., tracks the number of flows (distinct items) and the stream's length.
* tracebatch.go - processes a given trace in batches of a given batch size; for each batch, tracks the number of flows and computes beta, the average frequency.
* generate_plots.py - each plot shows the beta measurements of a given trace for the pre-defined batch sizes; the outputs are provided in our paper.
  - the plots are saved as figures at 600 dpi, resulting in quite large files
  - use the imagemagick command line tool to resize an image file: $ convert -resize 20%
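The per-batch beta computation described above can be sketched as follows. This is a minimal illustration, not the code in tracebatch.go: it assumes beta is the number of items in a batch divided by the number of distinct flows in that batch, and the function name `batchBeta` and the in-memory string slice are hypothetical stand-ins for the real trace reader.

```go
package main

import "fmt"

// batchBeta splits a stream of flow IDs into consecutive batches of batchSize
// items and returns one beta value per batch, where
// beta = (items in the batch) / (distinct flows in the batch).
// The final batch may be shorter than batchSize.
// Hypothetical helper: the name and signature are assumptions of this sketch.
func batchBeta(stream []string, batchSize int) []float64 {
	if batchSize <= 0 {
		return nil
	}
	var betas []float64
	for start := 0; start < len(stream); start += batchSize {
		end := start + batchSize
		if end > len(stream) {
			end = len(stream)
		}
		// Collect the distinct flow IDs seen in this batch.
		flows := make(map[string]struct{})
		for _, id := range stream[start:end] {
			flows[id] = struct{}{}
		}
		betas = append(betas, float64(end-start)/float64(len(flows)))
	}
	return betas
}

func main() {
	stream := []string{"a", "b", "a", "a", "c", "c", "d"}
	// Batch 1 = [a b a a]: 4 items, 2 flows. Batch 2 = [c c d]: 3 items, 2 flows.
	fmt.Println(batchBeta(stream, 4)) // prints [2 1.5]
}
```

A larger batch size generally gives each flow more chances to repeat within a batch, which is presumably why the scripts sweep over a set of batch sizes.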

Measure mean relative error (MRE):

* errorshell.sh - defines the trace names and batch sizes for Go processing
* esterrbatch.go - processes the given trace and batch size:
  - implements Count-Min Sketch in Go
  - emulates a crash at different points of the trace's timeline
  - measures MRE in two aspects:
    - the impact of batch size on the difference in estimation error;
    - the impact of +B upon a query after recovery
* generatebars.py - generates the plots that reflect the MRE measurements in both aspects on each round
  - the plots are saved as figures at 200 dpi, so no resizing is needed
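To make the MRE measurement concrete, here is a minimal Count-Min Sketch in Go with a mean-relative-error helper and a toy emulation of losing one batch. It is a sketch under stated assumptions, not the code in esterrbatch.go. It uses the standard library's FNV-1a hash with double hashing for self-containment (the repository itself lists github.com/cespare/xxhash/v2 as a dependency). The type `CountMin`, the function `meanRelativeError`, and the choice to drop the last batch are all hypothetical. The +B adjustment is not modeled here.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
)

// CountMin is a minimal Count-Min Sketch: depth rows of width counters each.
type CountMin struct {
	width, depth uint32
	rows         [][]uint32
}

func NewCountMin(width, depth uint32) *CountMin {
	rows := make([][]uint32, depth)
	for i := range rows {
		rows[i] = make([]uint32, width)
	}
	return &CountMin{width: width, depth: depth, rows: rows}
}

// index picks the column for key in the given row by double hashing over the
// two halves of one 64-bit FNV-1a hash (an assumption of this sketch).
func (c *CountMin) index(key string, row uint32) uint32 {
	h := fnv.New64a()
	h.Write([]byte(key))
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)
	return (h1 + row*h2) % c.width
}

// Add increments one counter per row for key.
func (c *CountMin) Add(key string) {
	for r := uint32(0); r < c.depth; r++ {
		c.rows[r][c.index(key, r)]++
	}
}

// Estimate returns the minimum counter over all rows for key.
func (c *CountMin) Estimate(key string) uint32 {
	best := uint32(math.MaxUint32)
	for r := uint32(0); r < c.depth; r++ {
		if v := c.rows[r][c.index(key, r)]; v < best {
			best = v
		}
	}
	return best
}

// meanRelativeError averages |estimate - truth| / truth over all flows in truth.
// Every true count is expected to be positive.
func meanRelativeError(c *CountMin, truth map[string]uint32) float64 {
	if len(truth) == 0 {
		return 0
	}
	var sum float64
	for key, f := range truth {
		sum += math.Abs(float64(c.Estimate(key))-float64(f)) / float64(f)
	}
	return sum / float64(len(truth))
}

func main() {
	stream := []string{"a", "b", "a", "c", "a", "b", "d", "a"}
	const batchSize = 4
	truth := make(map[string]uint32)
	for _, k := range stream {
		truth[k]++
	}
	full := NewCountMin(64, 4)      // sees the whole stream
	recovered := NewCountMin(64, 4) // emulated crash: the last batch is lost
	lostFrom := len(stream) - batchSize
	for i, k := range stream {
		full.Add(k)
		if i < lostFrom {
			recovered.Add(k)
		}
	}
	fmt.Printf("MRE without loss: %.3f\n", meanRelativeError(full, truth))
	fmt.Printf("MRE after losing one batch: %.3f\n", meanRelativeError(recovered, truth))
}
```

Without loss, a Count-Min Sketch can only overestimate a flow's count. After a batch is lost, the recovered sketch can also underestimate, so the error is measured as an absolute difference.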

Owner

  • Name: Diana Cohen
  • Login: DianaCohenCS
  • Kind: user
  • Location: Haifa,Israel
  • Company: Technion - Israel Institute of Technology

MSc student in Computer Science at the Technion, Israel

Citation (CITATION.cff)

abstract: This software measures average frequency of batched items in Internet traces. New feature was added to measure mean relative errors, refer to README file for more details.
authors:
  - family-names: Cohen
    given-names: Diana
    orcid: "https://orcid.org/0009-0003-6061-7635"
cff-version: 1.2.0
date-released: 2024-11-05
keywords:
  - "measure traces"
  - research
license: GPL-3.0
message: If you use this software, please cite it using these metadata.
repository-code: "https://github.com/DianaCohenCS/measure-traces"
title: "Measuring average frequency of batched items in Internet traces"
type: software

GitHub Events

Total
  • Push event: 5
  • Fork event: 1
  • Create event: 2
Last Year
  • Push event: 5
  • Fork event: 1
  • Create event: 2

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 0
proxy.golang.org: github.com/dianacohencs/measure-traces
  • Versions: 0
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.0%
Average: 6.2%
Dependent repos count: 6.4%
Last synced: 7 months ago

Dependencies

go.mod go
  • github.com/cespare/xxhash/v2 v2.3.0
go.sum go
  • github.com/cespare/xxhash/v2 v2.3.0