Crowsetta
Crowsetta: A Python tool to work with any format for annotating animal vocalizations and bioacoustics data. - Published in JOSS (2023)
Tabbed: A Python package for reading variably structured text files at scale
Tabbed: A Python package for reading variably structured text files at scale - Published in JOSS (2025)
https://github.com/alan-turing-institute/clevercsv
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the built-in csv module with improved dialect detection, and comes with a handy command-line application for working with CSV files.
pandas-ai
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
timetracker-csv
Pandas-friendly time tracking from the command line, repo by repo
tensorboard-reducer
Reduce multiple PyTorch TensorBoard runs to new event (or CSV) files.
flowTorch - a Python library for analysis and reduced-order modeling of fluid flows
flowTorch - a Python library for analysis and reduced-order modeling of fluid flows - Published in JOSS (2021)
externdata
📄 Modelica library for data I/O of CSV, INI, JSON, MATLAB MAT, SSV, TIR, Excel XLS/XLSX and XML files
fitz-collection-raw-data
Raw data from the collections database in JSON and CSV formats
csv-metadata-quality
A simple but opinionated metadata quality checker and fixer designed to work with CSVs in the DSpace ecosystem
sixarm_ruby_spreadsheeting
SixArm.com » Ruby » Spreadsheeting has import & export helpers for CSV, TSV, Excel, etc.
odin
Data-structure definition/validation/traversal, mapping and serialisation toolkit for Python
https://github.com/johnkerl/miller
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
https://github.com/cube2222/octosql
OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.
https://github.com/alpstable/gidari
Transport web data to local/remote storage using Gidari
https://github.com/amin2312/acsv
ACsv is an easy-to-use, powerful, multi-platform CSV parsing library with implementations in JS, TS, Haxe, PHP, Java, Python, C#, and Go
https://github.com/baimamboukar/python_data_cleaning
Data cleaning automation for emails in csv and excel files
damagedlogginganalyzer
A Python project that analyzes statistics on damaged logging (wood) in Germany.
https://github.com/rumbledb/rumble
⛈️ RumbleDB 2.0.0 "Lemon Ironwood" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
convert-csv-schwab2pp
Converts a Charles Schwab transaction CSV file to a ready-to-import CSV file for Portfolio Performance.
https://github.com/avitase/ais_seqmaker
Tool suite for parsing AIS data from CSV stream
csv2ical
A CLI tool that converts a CSV file with event details into an iCalendar ICS file. The ICS file can then be imported into apps such as Google Calendar, Microsoft Outlook, Apple macOS Calendar, and others.
https://github.com/alan-turing-institute/csv_wrangling
Repository for reproducibility of the CSV file project
https://github.com/akiomik/scalatest-csv-table
A ScalaTest helper for table-driven testing with CSV.
https://github.com/devidw/migrate-avira-passwords-to-google
Prepare Avira Password Manager CSV export for Google Passwords import
wovensnips
WovenSnips: A Lightweight, Free, and Open-source Implementation of Retrieval-Augmented Generation (RAG) using Straico API
convert-pheno
A software toolkit for the interconversion of standard data models for phenotypic data
https://github.com/cumbof/opengdc
An open-source Java tool to automatically extract and convert all clinical and genomic data from the Genomic Data Commons to BED, GTF, CSV, and JSON format
https://github.com/vincentlaucsb/csv-parser
A high-performance, fully-featured CSV parser and serializer for modern C++.
https://github.com/anselmoo/csv_first_insight
An sklearn-based correlation- and prediction-maker for small *.csv data
https://github.com/cured-plus/csvw-duckdb
Convert a CSVW document (CSV metadata) to a DuckDB query to load a CSV file into a database.
india-isin-data
International Securities Identification Numbers for various Indian Securities
sum-buddy
Generate and save checksums for all (or certain) contents of a given directory.
https://github.com/3mcloud/plotme
plot all the things in all the folders automatically but only if there have been changes
https://github.com/cnag-biomedical-informatics/pheno-ranker
Pheno-Ranker is a tool for comparing phenotypic data structured in JSON/YAML format, such as Beacon v2 Models or Phenopackets v2, as well as CSV.