https://github.com/cfsan-biostatistics/table-ops

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.7%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: CFSAN-Biostatistics
  • Language: Python
  • Default Branch: main
  • Size: 12.7 KB
Statistics
  • Stars: 0
  • Watchers: 8
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed over 1 year ago
Metadata Files
Readme

README.MD

Table Ops

A collection of simple command-line table manipulation tools written in Python. These tools are designed to be efficient and easy to use for common table operations.

Tools

table-union

Merges multiple tabular data files (e.g., CSV, TSV) either by unioning rows with identical columns or by performing a join based on shared key columns.

Key Features:

  • Union Mode (Default): Combines rows from all input files, assuming they have the same columns. Duplicate rows are retained.
  • Join Mode (--no-union or similar): Joins the input files on shared key columns instead of stacking rows. Rows from different files are merged when their key values match.
  • Automatic Key Detection: A column is treated as a join key when it appears in every input file and its values are unique and non-null in each of them.
  • Handles various delimiters: Supports tab-separated (TSV) and comma-separated (CSV) files.
  • Memory Efficient: Optimized to handle large files without loading them entirely into memory (where possible).
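The two modes and the key-detection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code shipped in table-ops: the function names (`read_table`, `union_tables`, `detect_keys`, `join_tables`) are hypothetical, the join is assumed to behave like an outer join, and the whole input is held in memory for simplicity.

```python
import csv
import io


def read_table(text, delimiter="\t"):
    """Parse delimited text into a list of dict rows (header from line 1)."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def union_tables(tables):
    """Union mode: stack the rows of every table, keeping duplicates."""
    return [row for table in tables for row in table]


def detect_keys(tables):
    """Return columns shared by all tables whose values are unique and non-empty in each."""
    column_sets = [set().union(*(row.keys() for row in table)) for table in tables]
    shared = set.intersection(*column_sets)
    keys = []
    for col in sorted(shared):
        qualifies = True
        for table in tables:
            values = [row[col] for row in table]
            has_null = any(v is None or v == "" for v in values)
            if has_null or len(set(values)) != len(values):
                qualifies = False
                break
        if qualifies:
            keys.append(col)
    return keys


def join_tables(tables, keys):
    """Join mode (assumed outer join): merge rows whose key values match."""
    merged = {}
    for table in tables:
        for row in table:
            key = tuple(row[k] for k in keys)
            merged.setdefault(key, {}).update(row)
    return list(merged.values())


# Hypothetical sample data: two tables that share an "id" column.
people = read_table("id\tname\n1\tann\n2\tbob\n")
ages = read_table("id\tage\n1\t30\n2\t41\n")
keys = detect_keys([people, ages])
print(keys)
print(join_tables([people, ages], keys))
```

Checking uniqueness per file (rather than across the concatenated files) is one reading of the description; the actual tool may apply a stricter or looser rule.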

Usage Example:

```bash
table-union file1.tsv file2.tsv file3.tsv > output.tsv
```

```bash
table-summarize data.tsv
```

```bash
table-sort -k Age -k Name data.tsv > sorted_data.tsv
```
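The table-sort call above passes two `-k` options. Assuming each `-k` names a sort column and that earlier keys take priority (an interpretation, since the README does not spell it out), the ordering can be sketched as follows. The function `sort_rows` is hypothetical, and sorting numeric-looking values as numbers is also an assumption about the tool's behavior.

```python
def sort_rows(rows, keys):
    """Sort dict rows by several columns; earlier keys take priority."""

    def sort_key(row):
        parts = []
        for k in keys:
            value = row[k]
            try:
                # Numeric-looking values sort first, by numeric value.
                parts.append((0, float(value), ""))
            except (TypeError, ValueError):
                # Everything else sorts afterwards, as text.
                parts.append((1, 0.0, str(value)))
        return tuple(parts)

    return sorted(rows, key=sort_key)


# Hypothetical sample rows with the columns used in the command above.
rows = [
    {"Name": "Bob", "Age": "30"},
    {"Name": "Ann", "Age": "30"},
    {"Name": "Cy", "Age": "9"},
]
for row in sort_rows(rows, ["Age", "Name"]):
    print(row["Age"], row["Name"])
```

Building one tuple per row keeps the comparison stable across mixed columns: ties on Age fall through to Name without a second pass.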

Run Unit Tests:

```bash
python -m unittest test_table_ops.py
```

Owner

  • Name: CFSAN (Center for Food Safety and Applied Nutrition)
  • Login: CFSAN-Biostatistics
  • Kind: organization

GitHub Events

Total
  • Push event: 1
  • Pull request event: 2
  • Fork event: 2
Last Year
  • Push event: 1
  • Pull request event: 2
  • Fork event: 2

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 8 days
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 8 days
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • baslia (1)

Dependencies

Dockerfile docker
  • python 3.7-alpine build