Recent Releases of Tabbed: A Python package for reading variably structured text files at scale

Tabbed: A Python package for reading variably structured text files at scale - v1.2.0 Release

What Changed

Commit: bdcd654

This release incorporates all suggestions from the JOSS review, including:

- Redefining all types to be generic types
- Support for using a comma as the decimal separator
- Improvements to header and metadata detection for short files
- YY.MM.DD support in datetime parsing
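To make the comma-decimal and YY.MM.DD items concrete, here is a minimal standard-library sketch of both conversions. The helpers `parse_decimal` and `parse_short_date` are hypothetical names chosen for illustration and are not part of Tabbed's API; Tabbed's own implementation may differ.

```python
from datetime import date, datetime


def parse_decimal(text: str, decimal: str = ".") -> float:
    """Convert a numeric string to float, treating `decimal` as the decimal mark.

    Hypothetical helper for illustration only.
    """
    # float() only understands '.', so swap in a period when another mark is used.
    if decimal != ".":
        text = text.replace(decimal, ".")
    return float(text)


def parse_short_date(text: str) -> date:
    """Parse a date written with a two-digit year as YY.MM.DD (hypothetical helper)."""
    return datetime.strptime(text, "%y.%m.%d").date()


print(parse_decimal("3,14", decimal=","))  # 3.14
print(parse_short_date("24.03.09"))        # 2024-03-09
```

Note that `%y` follows the POSIX convention for two-digit years: values 69 to 99 map to 1969 to 1999, and 00 to 68 map to 2000 to 2068.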

- Python
Published by mscaudill 7 months ago

Tabbed: A Python package for reading variably structured text files at scale - Tabbed v1.1.1

What's Changed

Commit 667d1c3

This release contains the following bug fix:

  • The Sniffer's type method now reports a column as having consistent types if every type is one of {int, float, complex}, as these types are subsets of one another. The Sniffer therefore treats them all as a single Numeric type. This change improves detection of type changes at the header row and introduces no backward compatibility conflicts. At some point Tabbed may define this Numeric type formally.
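The idea of treating int, float and complex as one numeric kind can be sketched as follows. This is an illustration under assumptions, not the Sniffer's actual code: `consistent_types` and `NUMERIC` are hypothetical names.

```python
# Types that are collapsed into a single "numeric" kind (assumption for this sketch).
NUMERIC = (int, float, complex)


def consistent_types(values: list) -> bool:
    """Return True if all values are of one kind, counting int, float and
    complex as the same numeric kind (hypothetical helper)."""
    kinds = {
        "numeric" if type(value) in NUMERIC else type(value).__name__
        for value in values
    }
    # An empty column or a single kind counts as consistent.
    return len(kinds) <= 1


print(consistent_types([1, 2.5, 3 + 0j]))  # True
print(consistent_types([1, "a"]))          # False
```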

- Python
Published by mscaudill 9 months ago

Tabbed: A Python package for reading variably structured text files at scale - Tabbed 1.1.0

What's Changed

Commit 562cb13

This release contains the following improvements:

- The Sniffing.types method now supports excluding lines that have missing values via an exclusion parameter. By default, lines containing any of ['', ' ', '-', 'nan', 'NAN', 'NaN'] are ignored during metadata, header and type detection.
- Reader now accepts poll and exclusion parameters, giving clients finer control over which rows of a sniffed sample are used for header, metadata and type detection.
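The effect of an exclusion list can be illustrated with a short standalone sketch. It assumes rows have already been split into string cells; `keep_rows` and `EXCLUDE` are hypothetical names and do not reflect Tabbed's real signatures.

```python
# Default missing-value markers, copied from the release note above.
EXCLUDE = ['', ' ', '-', 'nan', 'NAN', 'NaN']


def keep_rows(rows, exclude=EXCLUDE):
    """Drop any row containing a cell that matches an excluded marker
    (hypothetical helper for illustration)."""
    return [row for row in rows if not any(cell in exclude for cell in row)]


sample = [["1", "2"], ["-", "3"], ["4", "NaN"], ["5", "6"]]
print(keep_rows(sample))  # [['1', '2'], ['5', '6']]
```

Only the rows that survive this filter would then be polled for header, metadata and type detection.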

This release also makes the following bug fixes:

- Text files with a single column reported the delimiter as the empty string or as the delimiter used in the metadata section (if present). When the Sniffer makes a dialect, it now checks for this case and assigns the carriage return '\r' as the delimiter.
- Locating the header and metadata by type differences was buggy because the function `sniffing._type_difference` mixed line length differences and type differences. This has been clarified by creating a new protected length-difference method on the Sniffer class. This method is used exclusively to locate metadata rows by looking for differences in length compared to the data section rows.
- Type difference detection of the header and metadata sections was flawed because `float`, `int`, and `complex` numbers were seen as different types. For example, if a column of the file contained both `int` and `float`, a type difference was detected and the header (or metadata) was erroneously assigned. Tabbed now ignores these numeric-to-numeric type differences when determining the header and metadata sections.
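Locating metadata by length difference can be sketched roughly as below. The sketch assumes that the most common field count in the sample belongs to the data section, which is a simplification; `metadata_rows` is a hypothetical name and not the Sniffer's protected method.

```python
from collections import Counter


def metadata_rows(lines, delimiter=","):
    """Return indices of lines whose field count differs from the data section.

    Assumption: the most frequent field count in the sample is the data
    section's. Hypothetical helper for illustration only.
    """
    if not lines:
        return []
    lengths = [len(line.split(delimiter)) for line in lines]
    data_length = Counter(lengths).most_common(1)[0][0]
    return [index for index, count in enumerate(lengths) if count != data_length]


sample = ["Experiment: 7", "Date: 2024-01-01", "a,b,c", "1,2,3", "4,5,6"]
print(metadata_rows(sample))  # [0, 1]
```

Keeping this check separate from type comparison mirrors the fix described above: row length alone flags the metadata rows, and type differences are left to identify the header.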

A JOSS manuscript for tabbed has been submitted.

Full Changelog: https://github.com/mscaudill/tabbed/commits/1.1.0

- Python
Published by mscaudill 9 months ago

Tabbed: A Python package for reading variably structured text files at scale - Tabbed v1.0.1

What's Changed

This is the initial release of tabbed. The four features of this delimited text file reader are:

- automatic sniffing of the metadata, header and data sections of irregularly structured files
- automatic type casting to int, float, complex, time, date and datetime instances
- conditional reading of rows with equality, membership, rich comparisons, regex and custom callable filters called tabs
- partial and iterative reading for large file support
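The combination of conditional and iterative reading listed above can be approximated with the standard library alone. The sketch below is not Tabbed's Reader: `read_chunks` and its `keep` parameter are hypothetical, and it uses `csv.DictReader` on a file that is assumed to have a header row and no metadata section.

```python
import csv
import io
from itertools import islice


def read_chunks(stream, chunksize, keep=lambda row: True):
    """Lazily yield lists of at most `chunksize` rows that satisfy `keep`
    (hypothetical helper for illustration)."""
    reader = csv.DictReader(stream)
    filtered = (row for row in reader if keep(row))
    # islice pulls rows on demand, so the whole file is never held in memory.
    while chunk := list(islice(filtered, chunksize)):
        yield chunk


sample = io.StringIO("name,score\nann,3\nbob,9\ncy,7\ndee,8\n")
for chunk in read_chunks(sample, 2, keep=lambda row: int(row["score"]) > 5):
    print([row["name"] for row in chunk])
# ['bob', 'cy']
# ['dee']
```

The callable passed as `keep` plays the role of a custom filter; equality, membership or regex conditions could be expressed the same way.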

A JOSS manuscript for tabbed is underway.

Full Changelog: https://github.com/mscaudill/tabbed/commits/1.0.1

- Python
Published by mscaudill 11 months ago

Tabbed: A Python package for reading variably structured text files at scale - Tabbed 1.0.1

What's Changed

This is the initial release of tabbed. The four features of this delimited text file reader are:

- automatic sniffing of the metadata, header and data sections of irregularly structured files
- automatic type casting to int, float, complex, time, date and datetime instances
- conditional reading of rows with equality, membership, rich comparisons, regex and custom callable filters called tabs
- partial and iterative reading for large file support

A JOSS manuscript for tabbed is underway.

Full Changelog: https://github.com/mscaudill/tabbed/commits/1.0.1

- Python
Published by mscaudill 11 months ago