https://github.com/segmentio/encoding

Go package containing implementations of efficient encoding, decoding, and validation APIs.

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 22 committers (4.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.9%) to scientific vocabulary

Keywords

ascii decoding encoding go golang hacktoberfest iso8601 json performance protobuf validation
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: segmentio
  • License: mit
  • Language: Go
  • Default Branch: master
  • Homepage:
  • Size: 13.1 MB
Statistics
  • Stars: 1,022
  • Watchers: 16
  • Forks: 53
  • Open Issues: 24
  • Releases: 15
Topics
ascii decoding encoding go golang hacktoberfest iso8601 json performance protobuf validation
Created about 6 years ago · Last pushed 7 months ago
Metadata Files
Readme License

README.md

encoding

Go package containing implementations of encoders and decoders for various data formats.

Motivation

At Segment, we do a lot of marshaling and unmarshaling of data when sending, queuing, or storing messages. The resources we need to provision on the infrastructure are directly related to the type and amount of data that we are processing. At the scale we operate at, the tools we choose to build programs can have a large impact on the efficiency of our systems. It is important to explore alternative approaches when we reach the limits of the code we use.

This repository includes experiments for Go packages for marshaling and unmarshaling data in various formats. While the focus is on providing a high performance library, we also aim for very low development and maintenance overhead by implementing APIs that can be used as drop-in replacements for the default solutions.

Requirements and Maintenance Schedule

This package has no dependencies outside of the core runtime of Go. It requires a recent version of Go.

This package follows the same maintenance schedule as the Go project, meaning that issues relating to versions of Go which aren't supported by the Go team, or versions of this package which are older than 1 year, are unlikely to be considered.

Additionally, we have fuzz tests. Their dependencies are not required at runtime, but they will be pulled in when running go mod tidy. Please don't include these go.mod updates in change requests.

encoding/json

More details about how this package achieves a lower CPU and memory footprint can be found in the package README.

The json sub-package provides a re-implementation of the functionality offered by the standard library's encoding/json package, with a focus on lowering the CPU and memory footprint of the code.

The exported API of this package mirrors the standard library's encoding/json package. The only change needed to take advantage of the performance improvements is the import path of the json package, from:

```go
import (
    "encoding/json"
)
```

to

```go
import (
    "github.com/segmentio/encoding/json"
)
```
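As a rough illustration of what "drop-in" means in practice, the sketch below is written against the standard library so that it runs without extra dependencies. The `Event` type and the `roundTrip` helper are hypothetical names invented for this example. Under the assumption that the API mirrors the standard library as described above, changing the one import line should be the only edit needed.

```go
package main

import (
	// Swap this line for "github.com/segmentio/encoding/json" to use the
	// replacement package; the rest of the file is meant to stay unchanged.
	"encoding/json"
	"fmt"
)

// Event is a hypothetical payload type used only for this illustration.
type Event struct {
	Type   string `json:"type"`
	UserID string `json:"userId"`
}

// roundTrip marshals an Event to JSON and unmarshals it back, returning
// both the decoded value and the encoded bytes.
func roundTrip(in Event) (Event, []byte, error) {
	b, err := json.Marshal(in)
	if err != nil {
		return Event{}, nil, err
	}
	var out Event
	if err := json.Unmarshal(b, &out); err != nil {
		return Event{}, nil, err
	}
	return out, b, nil
}

func main() {
	out, b, err := roundTrip(Event{Type: "track", UserID: "u1"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b), out.Type)
}
```

Because only the import path differs, the change can be trialled in a single file and reverted just as easily.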

The improvement can be significant for code that heavily relies on serializing and deserializing JSON payloads. The CI pipeline runs benchmarks to compare the performance of the package with the standard library and other popular alternatives; here's an overview of the results:

Comparing to encoding/json (v1.16.2)

```
name                           old time/op    new time/op    delta
Marshal/json.codeResponse2     6.40ms ± 2%    3.82ms ± 1%    -40.29%  (p=0.008 n=5+5)
Unmarshal/json.codeResponse2   28.1ms ± 3%     5.6ms ± 3%    -80.21%  (p=0.008 n=5+5)

name                           old speed      new speed      delta
Marshal/json.codeResponse2     303MB/s ± 2%   507MB/s ± 1%   +67.47%  (p=0.008 n=5+5)
Unmarshal/json.codeResponse2   69.2MB/s ± 3%  349.6MB/s ± 3% +405.42%  (p=0.008 n=5+5)

name                           old alloc/op   new alloc/op   delta
Marshal/json.codeResponse2     0.00B          0.00B          ~        (all equal)
Unmarshal/json.codeResponse2   1.80MB ± 1%    0.02MB ± 0%    -99.14%  (p=0.016 n=5+4)

name                           old allocs/op  new allocs/op  delta
Marshal/json.codeResponse2     0.00           0.00           ~        (all equal)
Unmarshal/json.codeResponse2   76.6k ± 0%     0.1k ± 3%      -99.92%  (p=0.008 n=5+5)
```

Benchmarks were run on a Core i9-8950HK CPU @ 2.90GHz.

Comparing to github.com/json-iterator/go (v1.1.10)

```
name                           old time/op    new time/op    delta
Marshal/json.codeResponse2     6.19ms ± 3%    3.82ms ± 1%    -38.26%  (p=0.008 n=5+5)
Unmarshal/json.codeResponse2   8.52ms ± 3%    5.55ms ± 3%    -34.84%  (p=0.008 n=5+5)

name                           old speed      new speed      delta
Marshal/json.codeResponse2     313MB/s ± 3%   507MB/s ± 1%   +61.91%  (p=0.008 n=5+5)
Unmarshal/json.codeResponse2   228MB/s ± 3%   350MB/s ± 3%   +53.50%  (p=0.008 n=5+5)

name                           old alloc/op   new alloc/op   delta
Marshal/json.codeResponse2     8.00B ± 0%     0.00B          -100.00%  (p=0.008 n=5+5)
Unmarshal/json.codeResponse2   1.05MB ± 0%    0.02MB ± 0%    -98.53%  (p=0.000 n=5+4)

name                           old allocs/op  new allocs/op  delta
Marshal/json.codeResponse2     1.00 ± 0%      0.00           -100.00%  (p=0.008 n=5+5)
Unmarshal/json.codeResponse2   37.2k ± 0%     0.1k ± 3%      -99.83%  (p=0.008 n=5+5)
```

Although this package aims to be a drop-in replacement for encoding/json, it does not guarantee the same error messages. It will return an error in the same cases as the standard library, but the exact error message may differ.
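One practical consequence is that calling code and tests are safer when they check whether an error occurred rather than comparing its text. The sketch below shows that pattern. The `decodes` helper is a hypothetical name, and the example imports the standard library so it runs as-is; the same check would apply after swapping the import path.

```go
package main

import (
	"encoding/json" // or "github.com/segmentio/encoding/json"
	"fmt"
)

// decodes reports whether data can be decoded into a generic JSON object.
// It looks only at whether an error was returned, never at the message,
// so it does not depend on the exact wording chosen by either package.
func decodes(data []byte) bool {
	var v map[string]interface{}
	return json.Unmarshal(data, &v) == nil
}

func main() {
	fmt.Println(decodes([]byte(`{"ok":true}`))) // well-formed object
	fmt.Println(decodes([]byte(`{"ok":`)))      // truncated input
}
```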

encoding/iso8601

The iso8601 sub-package exposes APIs to efficiently deal with string representations of iso8601 dates.

Data formats like JSON have no syntax to represent dates, so dates are usually serialized and represented as string values. In our experience, we often have to check whether a string value looks like a date, and either construct a time.Time by parsing it or simply treat it as a string. This check can be done by attempting to parse the value and, if that fails, falling back to the raw string. Unfortunately, while the happy path for time.Parse is fairly efficient, constructing errors is much slower and has a much bigger memory footprint.

To remedy this problem, we've developed fast iso8601 validation functions that cause no heap allocations. We added a validation step to determine whether the value is a date representation or a simple string. This reduced CPU and memory usage by 5% in some programs that were doing time.Parse calls on very hot code paths.

Owner

  • Name: Segment
  • Login: segmentio
  • Kind: organization
  • Email: friends@segment.com
  • Location: San Francisco, CA

GitHub Events

Total
  • Create event: 6
  • Release event: 2
  • Issues event: 5
  • Watch event: 29
  • Delete event: 4
  • Issue comment event: 1
  • Push event: 30
  • Pull request review comment event: 1
  • Pull request review event: 9
  • Pull request event: 7
  • Fork event: 3
Last Year
  • Create event: 6
  • Release event: 2
  • Issues event: 5
  • Watch event: 29
  • Delete event: 4
  • Issue comment event: 1
  • Push event: 30
  • Pull request review comment event: 1
  • Pull request review event: 9
  • Pull request event: 7
  • Fork event: 3

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 125
  • Total Committers: 22
  • Avg Commits per committer: 5.682
  • Development Distribution Score (DDS): 0.536
Past Year
  • Commits: 6
  • Committers: 4
  • Avg Commits per committer: 1.5
  • Development Distribution Score (DDS): 0.5
Top Committers
Name Email Commits
Achille a****e@s****m 58
Chris O'Hara c****7@g****m 24
Jeremy Jackins j****s@g****m 8
Thomas Pelletier t****r@s****m 6
Kevin Gillette k****e@t****m 5
Jeremy Larkin j****n@s****m 5
Steve van Loben Sels s****e@s****m 2
Tyson Mote t****n@s****m 2
Kevin Burke 9****t 2
Alan Braithwaite a****n@s****m 1
Matthieu Dumont 5****a 1
Maggie Yu 6****t 1
Travis Cole 1****p 1
John Gillott 8****t 1
Joey f****y@g****o 1
Fabrice Vaillant f****b@g****m 1
Dominic Barnes d****c@s****m 1
Benjamin Yolken b****n@s****m 1
Achille a****l@g****m 1
Varun Wachaspati J V****i 1
dferstay d****y 1
Lucas Baier l****6@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 27
  • Total pull requests: 75
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 2 days
  • Total issue authors: 24
  • Total pull request authors: 21
  • Average comments per issue: 1.96
  • Average comments per pull request: 0.63
  • Merged pull requests: 69
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 6
  • Pull requests: 4
  • Average time to close issues: 9 days
  • Average time to close pull requests: about 6 hours
  • Issue authors: 6
  • Pull request authors: 2
  • Average comments per issue: 0.83
  • Average comments per pull request: 0.5
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • alsfranyrjb2-ops (3)
  • pelletier (2)
  • Jerska (2)
  • aaron42net (2)
  • patricksandquist (1)
  • BlasterAlex (1)
  • frioux (1)
  • efepapa (1)
  • ddkwork (1)
  • vtolstov (1)
  • blockerdude (1)
  • qknight (1)
  • lpar (1)
  • jnjackins (1)
  • geoff-ziprecruiter-com (1)
Pull Request Authors
  • achille-roussel (31)
  • chriso (18)
  • kalamay (5)
  • extemporalgenome (4)
  • kevinburkesegment (3)
  • tysonmote (2)
  • pelletier (2)
  • stevevls (2)
  • lab176 (1)
  • timoffex (1)
  • yolken-segment (1)
  • wdvxdr1123 (1)
  • Jerska (1)
  • abraithwaite (1)
  • dominicbarnes (1)
Top Labels
Issue Labels
bug (5) enhancement (2) documentation (1)
Pull Request Labels
enhancement (1)

Dependencies

benchmarks/go.mod go
  • github.com/DataDog/zstd v1.4.1
  • github.com/gogo/protobuf v1.3.1
  • github.com/golang/snappy v0.0.1
  • github.com/google/uuid v1.0.0
  • github.com/json-iterator/go v1.1.7
  • github.com/kr/pretty v0.1.0
  • github.com/mailru/easyjson v0.7.0
  • github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
  • github.com/modern-go/reflect2 v1.0.1
  • github.com/philhofer/fwd v1.0.0
  • github.com/segmentio/encoding v0.1.1
  • github.com/tinylib/msgp v1.1.0
  • github.com/vmihailenco/msgpack v4.0.4+incompatible
  • golang.org/x/net v0.0.0-20190620200207-3b0461eec859
  • golang.org/x/sync v0.0.0-20190423024810-112230192c58
  • google.golang.org/appengine v1.2.0
  • gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127
benchmarks/go.sum go
  • github.com/DataDog/zstd v1.4.1
  • github.com/davecgh/go-spew v1.1.0
  • github.com/davecgh/go-spew v1.1.1
  • github.com/gogo/protobuf v1.3.1
  • github.com/golang/protobuf v1.2.0
  • github.com/golang/snappy v0.0.1
  • github.com/google/gofuzz v1.0.0
  • github.com/google/uuid v1.0.0
  • github.com/json-iterator/go v1.1.7
  • github.com/kisielk/errcheck v1.2.0
  • github.com/kisielk/gotool v1.0.0
  • github.com/kr/pretty v0.1.0
  • github.com/kr/pty v1.1.1
  • github.com/kr/text v0.1.0
  • github.com/mailru/easyjson v0.7.0
  • github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421
  • github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
  • github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742
  • github.com/modern-go/reflect2 v1.0.1
  • github.com/philhofer/fwd v1.0.0
  • github.com/pmezard/go-difflib v1.0.0
  • github.com/segmentio/encoding v0.1.0
  • github.com/segmentio/encoding v0.1.1
  • github.com/stretchr/objx v0.1.0
  • github.com/stretchr/testify v1.3.0
  • github.com/tinylib/msgp v1.1.0
  • github.com/vmihailenco/msgpack v4.0.4+incompatible
  • golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2
  • golang.org/x/net v0.0.0-20180724234803-3673e40ba225
  • golang.org/x/net v0.0.0-20190620200207-3b0461eec859
  • golang.org/x/sync v0.0.0-20190423024810-112230192c58
  • golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a
  • golang.org/x/text v0.3.0
  • golang.org/x/tools v0.0.0-20181030221726-6c7e314b6563
  • google.golang.org/appengine v1.2.0
  • gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127
go.mod go
  • github.com/segmentio/asm v1.1.3
go.sum go
  • github.com/segmentio/asm v1.1.3
  • golang.org/x/sys v0.0.0-20211110154304-99a53858aa08
proto/fixtures/go.mod go
  • github.com/golang/protobuf v1.4.2
  • google.golang.org/protobuf v1.25.0
proto/fixtures/go.sum go
  • cloud.google.com/go v0.26.0
  • github.com/BurntSushi/toml v0.3.1
  • github.com/census-instrumentation/opencensus-proto v0.2.1
  • github.com/client9/misspell v0.3.4
  • github.com/envoyproxy/go-control-plane v0.9.1-0.20191026205805-5f8ba28d4473
  • github.com/envoyproxy/protoc-gen-validate v0.1.0
  • github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b
  • github.com/golang/mock v1.1.1
  • github.com/golang/protobuf v1.2.0
  • github.com/golang/protobuf v1.3.2
  • github.com/golang/protobuf v1.4.0-rc.1
  • github.com/golang/protobuf v1.4.0-rc.1.0.20200221234624-67d41d38c208
  • github.com/golang/protobuf v1.4.0-rc.2
  • github.com/golang/protobuf v1.4.0-rc.4.0.20200313231945-b860323f09d0
  • github.com/golang/protobuf v1.4.0
  • github.com/golang/protobuf v1.4.1
  • github.com/golang/protobuf v1.4.2
  • github.com/google/go-cmp v0.2.0
  • github.com/google/go-cmp v0.3.0
  • github.com/google/go-cmp v0.3.1
  • github.com/google/go-cmp v0.4.0
  • github.com/google/go-cmp v0.5.0
  • github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4
  • golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2
  • golang.org/x/exp v0.0.0-20190121172915-509febef88a4
  • golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3
  • golang.org/x/lint v0.0.0-20190227174305-5b3e6a55c961
  • golang.org/x/lint v0.0.0-20190313153728-d0100b6bd8b3
  • golang.org/x/net v0.0.0-20180724234803-3673e40ba225
  • golang.org/x/net v0.0.0-20180826012351-8a410e7b638d
  • golang.org/x/net v0.0.0-20190213061140-3a22650c66bd
  • golang.org/x/net v0.0.0-20190311183353-d8887717615a
  • golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be
  • golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f
  • golang.org/x/sync v0.0.0-20181108010431-42b317875d0f
  • golang.org/x/sync v0.0.0-20190423024810-112230192c58
  • golang.org/x/sys v0.0.0-20180830151530-49385e6e1522
  • golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a
  • golang.org/x/text v0.3.0
  • golang.org/x/tools v0.0.0-20190114222345-bf090417da8b
  • golang.org/x/tools v0.0.0-20190226205152-f727befe758c
  • golang.org/x/tools v0.0.0-20190311212946-11955173bddd
  • golang.org/x/tools v0.0.0-20190524140312-2c0ae7006135
  • golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543
  • google.golang.org/appengine v1.1.0
  • google.golang.org/appengine v1.4.0
  • google.golang.org/genproto v0.0.0-20180817151627-c66870c02cf8
  • google.golang.org/genproto v0.0.0-20190819201941-24fa4b261c55
  • google.golang.org/genproto v0.0.0-20200526211855-cb27e3aa2013
  • google.golang.org/grpc v1.19.0
  • google.golang.org/grpc v1.23.0
  • google.golang.org/grpc v1.27.0
  • google.golang.org/protobuf v0.0.0-20200109180630-ec00e32a8dfd
  • google.golang.org/protobuf v0.0.0-20200221191635-4d8936d0db64
  • google.golang.org/protobuf v0.0.0-20200228230310-ab0ca4ff8a60
  • google.golang.org/protobuf v1.20.1-0.20200309200217-e05f789c0967
  • google.golang.org/protobuf v1.21.0
  • google.golang.org/protobuf v1.22.0
  • google.golang.org/protobuf v1.23.0
  • google.golang.org/protobuf v1.23.1-0.20200526195155-81db48ad09cc
  • google.golang.org/protobuf v1.25.0
  • honnef.co/go/tools v0.0.0-20190102054323-c2f93a96b099
  • honnef.co/go/tools v0.0.0-20190523083050-ea95bdfd59fc
.github/workflows/benchmark.yml actions
  • actions/checkout v2 composite
  • actions/download-artifact v2 composite
  • actions/setup-go v2 composite
  • actions/upload-artifact v2 composite
.github/workflows/test.yml actions
  • actions/checkout v2 composite
  • actions/setup-go v2 composite