bayes

A simple implementation of a Naive Bayes classifier

https://github.com/gnames/bayes

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.5%) to scientific vocabulary
Last synced: 6 months ago

Repository

A simple implementation of a Naive Bayes classifier

Basic Info
  • Host: GitHub
  • Owner: gnames
  • License: mit
  • Language: Go
  • Default Branch: master
  • Size: 65.4 KB
Statistics
  • Stars: 0
  • Watchers: 3
  • Forks: 0
  • Open Issues: 1
  • Releases: 2
Created over 8 years ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog License Citation

README.md

bayes

An implementation of a Naive Bayes classifier. More details are in the docs.

Usage

This package classifies a new entity into one or another category (class) according to the features of the entity. The algorithm uses known data to calculate the weight of each feature for each category.

```go
func Example() {
	// There are two jars of cookies; they are our training set.
	// Cookies are either round or star-shaped, and either plain or
	// chocolate-chip.
	jar1 := ft.Class("Jar1")
	jar2 := ft.Class("Jar2")

	// Every preclassified feature-set provides data for one cookie: which
	// jar holds the cookie, and what its kind and shape are.
	cookie1 := ft.ClassFeatures{
		Class: jar1,
		Features: []ft.Feature{
			{Name: "kind", Value: "plain"},
			{Name: "shape", Value: "round"},
		},
	}
	cookie2 := ft.ClassFeatures{
		Class: jar1,
		Features: []ft.Feature{
			{Name: "kind", Value: "plain"},
			{Name: "shape", Value: "star"},
		},
	}
	cookie3 := ft.ClassFeatures{
		Class: jar1,
		Features: []ft.Feature{
			{Name: "kind", Value: "chocolate"},
			{Name: "shape", Value: "star"},
		},
	}
	cookie4 := ft.ClassFeatures{
		Class: jar1,
		Features: []ft.Feature{
			{Name: "kind", Value: "plain"},
			{Name: "shape", Value: "round"},
		},
	}
	cookie5 := ft.ClassFeatures{
		Class: jar1,
		Features: []ft.Feature{
			{Name: "kind", Value: "plain"},
			{Name: "shape", Value: "round"},
		},
	}
	cookie6 := ft.ClassFeatures{
		Class: jar2,
		Features: []ft.Feature{
			{Name: "kind", Value: "chocolate"},
			{Name: "shape", Value: "star"},
		},
	}
	cookie7 := ft.ClassFeatures{
		Class: jar2,
		Features: []ft.Feature{
			{Name: "kind", Value: "chocolate"},
			{Name: "shape", Value: "star"},
		},
	}
	cookie8 := ft.ClassFeatures{
		Class: jar2,
		Features: []ft.Feature{
			{Name: "kind", Value: "chocolate"},
			{Name: "shape", Value: "star"},
		},
	}

	lfs := []ft.ClassFeatures{
		cookie1, cookie2, cookie3, cookie4, cookie5, cookie6, cookie7, cookie8,
	}

	nb := bayes.New()
	nb.Train(lfs)
	oddsPrior, err := nb.PriorOdds(jar1)
	if err != nil {
		log.Println(err)
	}

	// If we got a chocolate, star-shaped cookie, which jar did it most
	// likely come from?
	aCookie := []ft.Feature{
		{Name: ft.Name("kind"), Value: ft.Value("chocolate")},
		{Name: ft.Name("shape"), Value: ft.Value("star")},
	}

	res, err := nb.PosteriorOdds(aCookie)
	if err != nil {
		fmt.Println(err)
	}

	// A random cookie is more likely to come from Jar1, but a chocolate,
	// star-shaped cookie is more likely to come from Jar2.
	fmt.Printf("Prior odds for Jar1 are %0.2f\n", oddsPrior)
	fmt.Printf("The cookie came from %s, with odds %0.2f\n", res.MaxClass, res.MaxOdds)
	// Output:
	// Prior odds for Jar1 are 1.67
	// The cookie came from Jar2, with odds 7.50
}
```

Development

Testing

```bash
go test
```

Other implementations:

Go, Java, Python, R, Ruby

Owner

  • Name: gnames
  • Login: gnames
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "Bayes -- a Global Names library for Naive Bayes algorithm."
date-released: 2024-12-02
version: v0.5.2
authors:
  - family-names: "Mozzherin"
    given-names: "Dmitry"
    orcid: "https://orcid.org/0000-0003-1593-1417"
repository-code: "https://github.com/gnames/bayes"
doi: 10.5281/zenodo.14262610
license: MIT

GitHub Events

Total
  • Release event: 1
  • Push event: 4
  • Create event: 1
Last Year
  • Release event: 1
  • Push event: 4
  • Create event: 1

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 30
  • Total Committers: 1
  • Avg Commits per committer: 30.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Dmitry Mozzherin d****n@g****m 30

Issues and Pull Requests

Last synced: over 2 years ago

All Time
  • Total issues: 17
  • Total pull requests: 0
  • Average time to close issues: 1 day
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 0
  • Average comments per issue: 0.06
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: about 1 hour
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • dimus (16)
  • mjy (1)
Pull Request Authors
Top Labels
Issue Labels
duplicate (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 6
  • Total dependent repositories: 9
  • Total versions: 12
proxy.golang.org: github.com/gnames/bayes

Package bayes implements a Naive Bayes trainer and classifier. Code is located at https://github.com/gnames/bayes

Bayes' rule calculates the probability of a hypothesis from prior knowledge about the hypothesis, as well as from evidence that supports or diminishes that probability. Prior knowledge can dramatically influence the posterior probability of a hypothesis. For example, assuming that an adult bird that cannot fly is a penguin is very unlikely in the northern hemisphere, but very likely in Antarctica.

Bayes' theorem is often depicted as

    P(H|E) = P(E|H) * P(H) / P(E)

where H is our hypothesis, E is a new piece of evidence, P(H) is the prior probability of H being true, P(E|H) is the known probability of the evidence when H is true, and P(E) is the known probability of E in all known cases. P(H|E) is the posterior probability of the hypothesis H, adjusted according to the new evidence E.

Finding the probability that a hypothesis is true can be considered a classification event: given prior knowledge and new evidence, we can assign an entity to the hypothesis that has the highest posterior probability.

It is also possible to represent Bayes' theorem using odds. Odds describe how likely a hypothesis is in comparison to all other possible hypotheses. Using odds simplifies the calculation:

    O(H|E) = O(H) * P(E|H) / P(E|H')

where the likelihood ratio P(E|H) / P(E|H') compares the probability of the evidence when H is true against P(E|H'), the known probability of the evidence when H is not true.

If we have several pieces of evidence that are independent of each other, the posterior odds can be calculated as the product of the prior odds and the likelihood ratios of all the given evidence; each subsequent piece of evidence modifies the prior odds. If the evidence is not independent (for example, inability to fly and a propensity for nesting on the ground in birds), it skews the outcome. In reality the given evidence is quite often not completely independent. That is how Naive Bayes got its name: people who apply it "naively" assume that their evidence is completely independent.
In practice the Naive Bayes approach often shows good results in spite of this known fallacy. It is also quite possible that, while the likelihoods of the evidence are representative of the classification data, the prior odds from the training set are not. As in the previous example, the evidence that a bird cannot fly supports the 'penguin' hypothesis much better in Antarctica, because the odds of meeting a penguin there are much higher than in the northern hemisphere. Therefore the package makes it possible to supply a prior probability value at classification time. In natural language processing, pieces of evidence are often called `features`; we follow the same convention in this package. Hypotheses are often called classes. Based on the outcome we classify an entity (in other words, assign a class to it). Every class receives a number of elements or `tokens`, each with a set of features.

  • Versions: 12
  • Dependent Packages: 6
  • Dependent Repositories: 9
Rankings
Dependent repos count: 1.7%
Dependent packages count: 2.2%
Average: 12.0%
Forks count: 18.7%
Stargazers count: 25.2%
Last synced: 7 months ago

Dependencies

go.mod go
  • github.com/davecgh/go-spew v1.1.1
  • github.com/gnames/gnfmt v0.2.0
  • github.com/json-iterator/go v1.1.10
  • github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421
  • github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742
  • github.com/pmezard/go-difflib v1.0.0
  • github.com/stretchr/testify v1.7.0
  • gopkg.in/yaml.v3 v3.0.0-20200605160147-a5ece683394c
go.sum go
  • github.com/davecgh/go-spew v1.1.0
  • github.com/davecgh/go-spew v1.1.1
  • github.com/gnames/gnfmt v0.2.0
  • github.com/google/gofuzz v1.0.0
  • github.com/json-iterator/go v1.1.10
  • github.com/matryer/is v1.4.0
  • github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421
  • github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742
  • github.com/pmezard/go-difflib v1.0.0
  • github.com/stretchr/objx v0.1.0
  • github.com/stretchr/testify v1.3.0
  • github.com/stretchr/testify v1.7.0
  • gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405
  • gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c
  • gopkg.in/yaml.v3 v3.0.0-20200605160147-a5ece683394c