Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.3%) to scientific vocabulary
Repository
venn: set operations with a command line shell script
Basic Info
- Host: GitHub
- Owner: SixArm
- Language: Shell
- Default Branch: main
- Homepage: http://sixarm.com
- Size: 38.1 KB
Statistics
- Stars: 10
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
venn: set operations with a command line shell script

The venn command does set operations on the shell command line,
for example to process text files and do set union, set intersection, etc.
Introduction
Script: venn
Syntax
Syntax:
venn (union|intersection|...) <input> ...
Syntax example:
venn union file-1 file-2
Set operations
Set operations that venn can process:
union: A ∪ B (lines that are in any input stream)
intersection: A ∩ B (lines that are in all input streams)
difference: A ⊕ B (lines that are in exactly one input stream)
except: A - B (lines that are solely in the first input stream)
extra: B - A (lines that are solely in the last input stream)
joint: is any line in more than one input stream?
disjoint: is each line in exactly one input stream?
Options
Options on the command line:
-h --help: show help
-v --version: show version
Examples
Examples use these two example data files:
$ cat a
red
green
$ cat b
red
blue
Union:
$ venn union a b
red
green
blue
Intersection:
$ venn intersection a b
red
Difference:
$ venn difference a b
green
blue
Except:
$ venn except a b
green
Extra:
$ venn extra a b
blue
Disjoint:
$ venn disjoint a b
false
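To reproduce the examples above, the two data files can be created as follows. This is a small sketch; the scratch directory and the variable name `dir` are illustrative assumptions, not part of venn.

```shell
# Create the two example data files in a scratch directory.
# "dir" is an illustrative name; any writable directory works.
dir=$(mktemp -d)
printf 'red\ngreen\n' > "$dir/a"
printf 'red\nblue\n' > "$dir/b"

# Show the contents, matching the listings above.
cat "$dir/a"
cat "$dir/b"
```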
Install
Venn is one shell script, and you install it by putting the script anywhere in your path.
preflight
Verify that you have the awk command, such as:
$ awk --version
We target GNU Awk:
GNU Awk 4.2.1, API: 2.0 (GNU MPFR 4.0.1, GNU MP 6.1.2)
If you have a different Awk, then venn should still work fine.
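Some awk implementations may not accept `--version`, so a more portable preflight check is to ask the shell whether an awk executable exists and then run a trivial program. The following is a minimal sketch; the function name `have_awk` is hypothetical and not part of venn.

```shell
# Hypothetical helper: succeeds if an awk executable is on the PATH
# and can run a trivial program.
have_awk() {
  command -v awk >/dev/null 2>&1 &&
    [ "$(printf 'ok\n' | awk '{ print $1 }')" = "ok" ]
}

if have_awk; then
  echo "awk found at: $(command -v awk)"
else
  echo "awk not found; install an awk before using venn" >&2
fi
```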
git
To install via git:
git clone https://github.com/SixArm/venn.git
cp venn/bin/venn /usr/local/bin/venn
curl
To install via curl:
curl -fsSL https://raw.githubusercontent.com/SixArm/venn/master/bin/venn > /usr/local/bin/venn
chmod +x /usr/local/bin/venn
We want to create typical packages, such as for Debian apt, RedHat yum, macOS brew, etc. If you're a developer, and want to create packages, then we welcome your help.
Set operations details
Union
Set theory operation (A union B).
Print lines that are in any of the input streams.
Also known as "logical or", "logical inclusive disjunction".
Synonyms:
union
u (letter u)
∪ (U+222A union)
∨ (U+2228 logical or)
+ (U+002B plus sign)
& (U+0026 ampersand)
or
Example:
$ venn union a b c
$ venn or a b c
=> print lines that are in any of a, b, c
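The idea behind union can be illustrated with a short awk sketch. This is not necessarily how venn implements it, and the function name `union_sketch` is hypothetical. It prints each distinct line the first time it appears in any input.

```shell
# Hypothetical sketch of a set union over line-oriented files.
# Prints each distinct line once, in order of first appearance.
union_sketch() {
  # seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++
  # is true exactly once per distinct line.
  awk '!seen[$0]++' "$@"
}
```

With the example files a and b from the Examples section, `union_sketch a b` prints red, green, blue, one per line.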
Intersection
Set theory operation (A intersection B).
Print lines that are in all of the input streams.
Also known as "logical and", "logical conjunction".
Synonyms:
intersection
i (letter i)
∩ (U+2229 intersection)
∧ (U+2227 logical and)
| (U+007C vertical line)
and
Example:
$ venn intersection a b c
$ venn and a b c
=> print lines that are in all of a, b, c
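One way to picture intersection is to count how many inputs contain each line. The sketch below is an illustration under a stated assumption (no line is repeated within a single input file), not necessarily venn's own implementation; the name `intersection_sketch` is hypothetical.

```shell
# Hypothetical sketch of a set intersection over line-oriented files.
# Assumption: no line is repeated within a single input file.
# Must be called with file arguments, because n is the argument count.
intersection_sketch() {
  # A line whose count reaches n has appeared in all n inputs;
  # it is printed at the moment it is seen for the n-th time.
  awk -v n="$#" '++count[$0] == n' "$@"
}
```

With the example files a and b, `intersection_sketch a b` prints red.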
Difference
Set theory operation (A symmetric difference B).
Print lines that are in exactly one of the input streams.
Also known as "logical xor", "logical exclusive disjunction".
Synonyms:
difference
d (letter d)
⊕ (U+2295 circled plus)
∆ (U+2206 increment)
Δ (U+0394 delta)
⊻ (U+22BB logical xor)
xor
Examples:
$ venn difference a b c
$ venn xor a b c
=> print lines that are in one of a, b, c
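The difference operation can be sketched by counting occurrences and printing only the lines seen once after all inputs have been read. This is an illustrative sketch that assumes no line is repeated within a single input file; `difference_sketch` is a hypothetical name, and venn itself may work differently.

```shell
# Hypothetical sketch of a symmetric difference over line-oriented files.
# Assumption: no line is repeated within a single input file.
difference_sketch() {
  awk '
    # Remember each distinct line in order of first appearance.
    !($0 in count) { order[++k] = $0 }
    # Count every occurrence across all inputs.
    { count[$0]++ }
    # After all inputs are read, print the lines that were seen exactly once.
    END { for (i = 1; i <= k; i++) if (count[order[i]] == 1) print order[i] }
  ' "$@"
}
```

With the example files a and b, `difference_sketch a b` prints green and blue.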
Except a.k.a. First
Set operation (A except B) a.k.a. (A - B)
Print lines that are solely in the first input.
Synonyms:
except
first
sub
subtract
subtraction
- (U+2212 minus sign)
Examples:
$ venn except a b c
$ venn first a b c
=> print lines that are in a, not b, c
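A possible way to express except in awk is to read the other inputs first and then filter the first input against them. The sketch below relies on awk's assignment operands (a `name=value` argument between file names takes effect when awk reaches it). The function name `except_sketch` and the variable `last` are hypothetical, and this is not claimed to be venn's implementation.

```shell
# Hypothetical sketch of "except": lines in the first file that are in no other file.
# Caveat: a file name containing "=" would be mistaken for an assignment operand.
except_sketch() {
  first=$1
  shift
  # The other files are read first; the operand last=1 then marks the
  # point at which awk starts reading the first file.
  awk '
    !last { other[$0] = 1; next }   # line from one of the other inputs: record it
    !($0 in other)                  # line from the first input: print if not recorded
  ' "$@" last=1 "$first"
}
```

With the example files a and b, `except_sketch a b` prints green.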
Extra a.k.a. Last
Set theory operation (A extra B) a.k.a. (B - A).
Print lines that are solely in the last input.
Synonyms:
extra
last
Examples:
$ venn extra a b c
$ venn last a b c
=> print lines that are in c, not a, b
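The extra operation mirrors except, with the last input playing the special role. One hedged way to sketch it is to let awk rearrange its own argument list in a `BEGIN` block, inserting an assignment operand just before the final file name. The names `extra_sketch`, `last`, and `earlier` are hypothetical, and venn may do this differently.

```shell
# Hypothetical sketch of "extra": lines in the last file that are in no earlier file.
extra_sketch() {
  awk '
    # Insert the operand "last=1" just before the final file name, so the
    # variable last becomes true only while the last input is being read.
    BEGIN {
      if (ARGC > 1) {
        ARGV[ARGC] = ARGV[ARGC - 1]
        ARGV[ARGC - 1] = "last=1"
        ARGC++
      }
    }
    !last { earlier[$0] = 1; next }   # line from an earlier input: record it
    !($0 in earlier)                  # line from the last input: print if not recorded
  ' "$@"
}
```

With the example files a and b, `extra_sketch a b` prints blue.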
Joint
Set operation is (A joint B).
Do any of the input streams have any overlap i.e. any lines in common?
If so, print $TRUE and exit 0, otherwise $FALSE and exit 1.
Synonyms:
joint
codependent
Examples:
$ venn joint a b c
$ venn codependent a b c
=> print "true" if any of a, b, c, have any lines in common
=> print "false" otherwise
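The print-and-exit-status behavior described above can be sketched as follows. This is an illustration only: it assumes no line is repeated within a single input file, it falls back to the words true and false when the `TRUE` and `FALSE` variables are unset, and the function name `joint_sketch` is hypothetical.

```shell
# Hypothetical sketch of "joint": do any inputs share a line?
# Assumption: no line is repeated within a single input file.
joint_sketch() {
  awk -v t="${TRUE:-true}" -v f="${FALSE:-false}" '
    # The second sighting of any line means two inputs overlap; stop reading.
    seen[$0]++ { found = 1; exit }
    # Report the result as text and as the exit status (0 = joint, 1 = not).
    END { if (found) { print t; exit 0 } else { print f; exit 1 } }
  ' "$@"
}
```

With the example files a and b, `joint_sketch a b` prints true and returns exit status 0, because both files contain red.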
Disjoint
Set operation is (A disjoint B).
Do all of the input streams have no overlap i.e. no lines in common?
If so, print $TRUE and exit 0, otherwise $FALSE and exit 1.
Also known as "pairwise disjoint", "mutually disjoint".
Synonyms:
disjoint
independent
Examples:
$ venn disjoint a b c
$ venn independent a b c
=> print "true" if all of a, b, c, have no lines in common
=> print "false" otherwise
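Disjoint is the logical opposite of joint, which the following sketch makes explicit. As before, it is an assumption-laden illustration rather than venn's actual code: inputs are assumed to have no repeated lines within a single file, and `disjoint_sketch` is a hypothetical name.

```shell
# Hypothetical sketch of "disjoint": do the inputs share no lines at all?
# Assumption: no line is repeated within a single input file.
disjoint_sketch() {
  awk -v t="${TRUE:-true}" -v f="${FALSE:-false}" '
    # The second sighting of any line means two inputs overlap; stop reading.
    seen[$0]++ { overlap = 1; exit }
    # Report the result as text and as the exit status (0 = disjoint, 1 = not).
    END { if (overlap) { print f; exit 1 } else { print t; exit 0 } }
  ' "$@"
}
```

With the example files a and b, `disjoint_sketch a b` prints false and returns exit status 1, because both files contain red.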
Customization
Custom output for true or false
The joint operation and the disjoint operation produce output that is either true or false.
Example:
$ venn joint a b
true
$ venn disjoint a b
false
You can customize the output text by using environment variables:
$ TRUE=yes FALSE=no venn joint a b
yes
We like to customize the output text by using environment variables and the Unicode symbols ⊤ (U+22A4 down tack) and ⊥ (U+22A5 up tack) like this:
$ TRUE=⊤ FALSE=⊥ venn joint a b
⊤
Implementation
This command is currently implemented using awk and POSIX.
The goal is to maximize usability on a wide range of Unix systems, including older systems, and pure POSIX systems.
TODO
Ideas to implement:
Add a "--help" option?
Add a way to automatically do unique?
Add exception handling, such as if an input stream is not unique?
Want to help? We welcome help. You can open a GitHub issue, or send a GitHub pull request, or email us at sixarm@sixarm.com.
References
Documentation:
- Benchmarks: Benchmarks of millions of lines of data, such as random unsorted data.
- Comparisons: Comparisons to other implementations, such as Unix/POSIX shell scripts.
Tracking
- Program: venn
- Version: 4.3.0
- Created: 2017-01-30
- Updated: 2018-06-01
- License: GPL
- Contact: Joel Parker Henderson (joel@joelparkerhenderson.com)
Owner
- Name: SixArm
- Login: SixArm
- Kind: organization
- Email: sixarm@sixarm.com
- Location: San Francisco
- Website: http://sixarm.com
- Twitter: sixarm
- Repositories: 580
- Profile: https://github.com/SixArm
SixArm Software
Citation (CITATION.cff)
cff-version: 1.2.0
title: venn: set operations with a command line shell script
message: >-
  If you use this work and you want to cite it,
  then you can use the metadata from this file.
type: software
authors:
  - given-names: Joel Parker
    family-names: Henderson
    email: joel@joelparkerhenderson.com
    affiliation: joelparkerhenderson.com
    orcid: 'https://orcid.org/0009-0000-4681-282X'
identifiers:
  - type: url
    value: 'https://github.com/SixArm/venn/'
    description: venn: set operations with a command line shell script
repository-code: 'https://github.com/SixArm/venn/'
abstract: >-
  venn: set operations with a command line shell script
license: See license file
GitHub Events
Total
- Push event: 1
Last Year
- Push event: 1
Committers
Last synced: 12 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Joel Parker Henderson | j****l@j****m | 35 |
Issues and Pull Requests
Last synced: 12 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0