Recent Releases of https://github.com/johnkerl/miller

https://github.com/johnkerl/miller - 6.15.0: Fix double quotes in CSV comments and `mlr -I` mode preservation; `sort -b`

New features

  • mlr sort -b feature by @johnkerl in https://github.com/johnkerl/miller/pull/1833
  • Add scoop install to README.md by @dflock in https://github.com/johnkerl/miller/pull/1842
  • DKVP --incr-key option by @johnkerl in https://github.com/johnkerl/miller/pull/1839

Bugfixes

  • Fix doc typo re empty and multiplication by @johnkerl in https://github.com/johnkerl/miller/pull/1838
  • Force decimal formatting for ints on JSON output by @johnkerl in https://github.com/johnkerl/miller/pull/1840
  • Preserve file mods on mlr -I by @johnkerl in https://github.com/johnkerl/miller/pull/1849
  • Don't parse CSV comments by @johnkerl in https://github.com/johnkerl/miller/pull/1859

Dependency updates

  • Miller 6.15.0 by @johnkerl in https://github.com/johnkerl/miller/pull/1860
  • Use Go 1.24.5 by @johnkerl in https://github.com/johnkerl/miller/pull/1843
  • Bump golang.org/x/sys from 0.33.0 to 0.34.0 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1832
  • Bump golang.org/x/term from 0.32.0 to 0.33.0 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1831
  • Bump golang.org/x/text from 0.26.0 to 0.27.0 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1830
  • Bump github/codeql-action from 3.29.2 to 3.29.3 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1841
  • Bump github.com/klauspost/compress from 1.17.11 to 1.18.0 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1757
  • Bump github.com/klauspost/compress from 1.17.11 to 1.18.0 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1844
  • Bump github/codeql-action from 3.29.3 to 3.29.4 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1845
  • Bump github.com/lestrrat-go/strftime from 1.1.0 to 1.1.1 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1846
  • Bump github/codeql-action from 3.29.4 to 3.29.5 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1847
  • Bump actions/cache from 4.2.3 to 4.2.4 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1854
  • Bump github/codeql-action from 3.29.7 to 3.29.8 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1853
  • Bump golang.org/x/sys from 0.34.0 to 0.35.0 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1852
  • Bump golang.org/x/term from 0.33.0 to 0.34.0 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1851
  • Bump golang.org/x/text from 0.27.0 to 0.28.0 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1850
  • Bump github/codeql-action from 3.29.8 to 3.29.9 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1856
  • Bump actions/checkout from 4.2.2 to 5.0.0 by @dependabot[bot] in https://github.com/johnkerl/miller/pull/1857

New Contributors

  • @dflock made their first contribution in https://github.com/johnkerl/miller/pull/1842

Full Changelog: https://github.com/johnkerl/miller/compare/v6.14.0...v6.15.0

- Go
Published by johnkerl 6 months ago

https://github.com/johnkerl/miller - Miller 6.14.0: survival curve, misc. features, and bugfixes

New features

  • Add surv verb to estimate a survival curve by @cwarden in https://github.com/johnkerl/miller/pull/1788
  • cut: Consider -o flag even when using regexes with -r by @balki in https://github.com/johnkerl/miller/pull/1823
  • Add keystroke savers for same format by @balki in https://github.com/johnkerl/miller/pull/1824
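
The release note does not say which estimator the surv verb uses. As a rough illustration of what "estimating a survival curve" can mean, here is a minimal Python sketch of the Kaplan-Meier estimator, a standard choice for this task. The function name and input layout are hypothetical and are not taken from Miller.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Hypothetical helper, not Miller's code.

    durations: observed time for each subject.
    events: True if the event occurred at that time, False if censored.
    Returns [(time, survival_probability)] at each distinct event time.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every subject leaves the risk set at its time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk subjects surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve

curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Censored observations (the False entries) shrink the risk set without lowering the curve, which is the main difference from a plain empirical survival fraction.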

Bug fixes

  • Fix unflatten with field names like ".", ".x", or "x..y" by @johnkerl in https://github.com/johnkerl/miller/pull/1735
  • Fix section-title typos for docs in #1735 by @johnkerl in https://github.com/johnkerl/miller/pull/1736
  • Fix non-constant format string errors with Go 1.24 by @michel-slm in https://github.com/johnkerl/miller/pull/1745
  • Fix joinv with "" separator by @johnkerl in https://github.com/johnkerl/miller/pull/1794
  • Fix print within begin{}/end{} by @johnkerl in https://github.com/johnkerl/miller/pull/1795
  • Argument parsing is different in mlr -s scripts by @johnkerl in https://github.com/johnkerl/miller/pull/1817

Documentation

  • Docs for new surv verb by @johnkerl in https://github.com/johnkerl/miller/pull/1807
  • Improve help message on non-existent verb by @johnkerl in https://github.com/johnkerl/miller/pull/1798
  • Add help strings for -a/-r in sub/gsub/ssub by @johnkerl in https://github.com/johnkerl/miller/pull/1721
  • Join docs wrong link by @johnkerl in https://github.com/johnkerl/miller/pull/1695
  • Add -c, -t, -j to doc matrix in PR 1824 by @johnkerl in https://github.com/johnkerl/miller/pull/1826
  • Doc copy edits by @johnkerl in https://github.com/johnkerl/miller/pull/1827
  • Typo fix: programmatically by @skitt in https://github.com/johnkerl/miller/pull/1679

Internals

  • Static-check fixes from @lespea #1657, batch 1/n by @johnkerl in https://github.com/johnkerl/miller/pull/1703
  • Static-check fixes from @lespea #1657, batch 2/n by @johnkerl in https://github.com/johnkerl/miller/pull/1704
  • Static-check fixes from @lespea #1657, batch 3/n by @johnkerl in https://github.com/johnkerl/miller/pull/1705
  • Static-check fixes from @lespea #1657, batch 4/n by @johnkerl in https://github.com/johnkerl/miller/pull/1706
  • Static-check fixes from @lespea #1657, batch 5/n by @johnkerl in https://github.com/johnkerl/miller/pull/1707
  • Static-check fixes from @lespea #1657, batch 6/n by @johnkerl in https://github.com/johnkerl/miller/pull/1708
  • Static-check fixes from @lespea #1657, batch 7/n by @johnkerl in https://github.com/johnkerl/miller/pull/1709
  • Static-check fixes from @lespea #1657, batch 8/n by @johnkerl in https://github.com/johnkerl/miller/pull/1710
  • Switch to generics (one PR of several) by @johnkerl in https://github.com/johnkerl/miller/pull/1763
  • Use Go 1.21 in CI by @johnkerl in https://github.com/johnkerl/miller/pull/1768

Dependencies

  • Bump actions/cache from 4.0.2 to 4.1.0 by @dependabot in https://github.com/johnkerl/miller/pull/1683
  • Bump golang.org/x/sys from 0.25.0 to 0.26.0 by @dependabot in https://github.com/johnkerl/miller/pull/1682
  • Bump golang.org/x/text from 0.18.0 to 0.19.0 by @dependabot in https://github.com/johnkerl/miller/pull/1681
  • Bump golang.org/x/term from 0.24.0 to 0.25.0 by @dependabot in https://github.com/johnkerl/miller/pull/1680
  • Bump github/codeql-action from 3.26.11 to 3.26.12 by @dependabot in https://github.com/johnkerl/miller/pull/1687
  • Bump actions/upload-artifact from 4.4.0 to 4.4.1 by @dependabot in https://github.com/johnkerl/miller/pull/1686
  • Bump actions/checkout from 4.2.0 to 4.2.1 by @dependabot in https://github.com/johnkerl/miller/pull/1685
  • Bump actions/cache from 4.1.0 to 4.1.1 by @dependabot in https://github.com/johnkerl/miller/pull/1688
  • Bump actions/upload-artifact from 4.4.1 to 4.4.2 by @dependabot in https://github.com/johnkerl/miller/pull/1689
  • Bump actions/upload-artifact from 4.4.2 to 4.4.3 by @dependabot in https://github.com/johnkerl/miller/pull/1690
  • Bump github/codeql-action from 3.26.12 to 3.26.13 by @dependabot in https://github.com/johnkerl/miller/pull/1692
  • Bump github.com/klauspost/compress from 1.17.10 to 1.17.11 by @dependabot in https://github.com/johnkerl/miller/pull/1691
  • Bump actions/cache from 4.1.1 to 4.1.2 by @dependabot in https://github.com/johnkerl/miller/pull/1698
  • Bump github/codeql-action from 3.26.13 to 3.27.0 by @dependabot in https://github.com/johnkerl/miller/pull/1697
  • Bump actions/checkout from 4.2.1 to 4.2.2 by @dependabot in https://github.com/johnkerl/miller/pull/1699
  • Bump actions/setup-go from 5.0.2 to 5.1.0 by @dependabot in https://github.com/johnkerl/miller/pull/1700
  • Bump golang.org/x/term from 0.25.0 to 0.26.0 by @dependabot in https://github.com/johnkerl/miller/pull/1712
  • Bump goreleaser/goreleaser-action from 6.0.0 to 6.1.0 by @dependabot in https://github.com/johnkerl/miller/pull/1711
  • Bump golang.org/x/text from 0.19.0 to 0.20.0 by @dependabot in https://github.com/johnkerl/miller/pull/1714
  • Bump github/codeql-action from 3.27.0 to 3.27.1 by @dependabot in https://github.com/johnkerl/miller/pull/1715
  • Bump github/codeql-action from 3.27.1 to 3.27.2 by @dependabot in https://github.com/johnkerl/miller/pull/1716
  • Bump github/codeql-action from 3.27.2 to 3.27.3 by @dependabot in https://github.com/johnkerl/miller/pull/1717
  • Bump github/codeql-action from 3.27.3 to 3.27.4 by @dependabot in https://github.com/johnkerl/miller/pull/1718
  • Bump github/codeql-action from 3.27.4 to 3.27.5 by @dependabot in https://github.com/johnkerl/miller/pull/1719
  • Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 by @dependabot in https://github.com/johnkerl/miller/pull/1723
  • Bump github/codeql-action from 3.27.5 to 3.27.6 by @dependabot in https://github.com/johnkerl/miller/pull/1724
  • Bump golang.org/x/text from 0.20.0 to 0.21.0 by @dependabot in https://github.com/johnkerl/miller/pull/1727
  • Bump golang.org/x/term from 0.26.0 to 0.27.0 by @dependabot in https://github.com/johnkerl/miller/pull/1726
  • Bump actions/cache from 4.1.2 to 4.2.0 by @dependabot in https://github.com/johnkerl/miller/pull/1728
  • Bump github/codeql-action from 3.27.6 to 3.27.7 by @dependabot in https://github.com/johnkerl/miller/pull/1730
  • Bump actions/setup-go from 5.1.0 to 5.2.0 by @dependabot in https://github.com/johnkerl/miller/pull/1729
  • Bump github/codeql-action from 3.27.7 to 3.27.9 by @dependabot in https://github.com/johnkerl/miller/pull/1731
  • Bump actions/upload-artifact from 4.4.3 to 4.5.0 by @dependabot in https://github.com/johnkerl/miller/pull/1732
  • Bump github/codeql-action from 3.27.9 to 3.28.0 by @dependabot in https://github.com/johnkerl/miller/pull/1734
  • Bump golang.org/x/sys from 0.28.0 to 0.29.0 by @dependabot in https://github.com/johnkerl/miller/pull/1738
  • Bump golang.org/x/term from 0.27.0 to 0.28.0 by @dependabot in https://github.com/johnkerl/miller/pull/1737
  • Bump actions/upload-artifact from 4.5.0 to 4.6.0 by @dependabot in https://github.com/johnkerl/miller/pull/1739
  • Bump github/codeql-action from 3.28.0 to 3.28.1 by @dependabot in https://github.com/johnkerl/miller/pull/1740
  • Bump actions/setup-go from 5.2.0 to 5.3.0 by @dependabot in https://github.com/johnkerl/miller/pull/1741
  • Bump github/codeql-action from 3.28.1 to 3.28.2 by @dependabot in https://github.com/johnkerl/miller/pull/1742
  • Bump github/codeql-action from 3.28.2 to 3.28.3 by @dependabot in https://github.com/johnkerl/miller/pull/1743
  • Bump github/codeql-action from 3.28.3 to 3.28.4 by @dependabot in https://github.com/johnkerl/miller/pull/1744
  • Bump github/codeql-action from 3.28.4 to 3.28.5 by @dependabot in https://github.com/johnkerl/miller/pull/1746
  • Bump github/codeql-action from 3.28.5 to 3.28.6 by @dependabot in https://github.com/johnkerl/miller/pull/1747
  • Bump github/codeql-action from 3.28.6 to 3.28.8 by @dependabot in https://github.com/johnkerl/miller/pull/1748
  • Bump golang.org/x/text from 0.21.0 to 0.22.0 by @dependabot in https://github.com/johnkerl/miller/pull/1752
  • Bump golang.org/x/term from 0.28.0 to 0.29.0 by @dependabot in https://github.com/johnkerl/miller/pull/1751
  • Bump github/codeql-action from 3.28.8 to 3.28.9 by @dependabot in https://github.com/johnkerl/miller/pull/1753
  • Bump goreleaser/goreleaser-action from 6.1.0 to 6.2.1 by @dependabot in https://github.com/johnkerl/miller/pull/1755
  • Bump actions/cache from 4.2.0 to 4.2.1 by @dependabot in https://github.com/johnkerl/miller/pull/1756
  • Bump actions/upload-artifact from 4.6.0 to 4.6.1 by @dependabot in https://github.com/johnkerl/miller/pull/1760
  • Bump github/codeql-action from 3.28.9 to 3.28.10 by @dependabot in https://github.com/johnkerl/miller/pull/1759
  • Bump actions/cache from 4.2.1 to 4.2.2 by @dependabot in https://github.com/johnkerl/miller/pull/1762
  • Bump github/codeql-action from 3.28.10 to 3.28.11 by @dependabot in https://github.com/johnkerl/miller/pull/1769
  • Bump actions/setup-go from 5.3.0 to 5.4.0 by @dependabot in https://github.com/johnkerl/miller/pull/1771
  • Bump actions/cache from 4.2.2 to 4.2.3 by @dependabot in https://github.com/johnkerl/miller/pull/1774
  • Bump actions/upload-artifact from 4.6.1 to 4.6.2 by @dependabot in https://github.com/johnkerl/miller/pull/1773
  • Bump github/codeql-action from 3.28.11 to 3.28.12 by @dependabot in https://github.com/johnkerl/miller/pull/1772
  • Bump github/codeql-action from 3.28.12 to 3.28.13 by @dependabot in https://github.com/johnkerl/miller/pull/1776
  • Bump goreleaser/goreleaser-action from 6.2.1 to 6.3.0 by @dependabot in https://github.com/johnkerl/miller/pull/1778
  • Bump github/codeql-action from 3.28.13 to 3.28.14 by @dependabot in https://github.com/johnkerl/miller/pull/1779
  • Bump github/codeql-action from 3.28.14 to 3.28.15 by @dependabot in https://github.com/johnkerl/miller/pull/1783
  • Bump github/codeql-action from 3.28.15 to 3.28.16 by @dependabot in https://github.com/johnkerl/miller/pull/1790
  • Bump github/codeql-action from 3.28.16 to 3.28.17 by @dependabot in https://github.com/johnkerl/miller/pull/1796
  • Bump actions/setup-go from 5.4.0 to 5.5.0 by @dependabot in https://github.com/johnkerl/miller/pull/1802
  • Bump golang.org/x/sys from 0.30.0 to 0.33.0 by @dependabot in https://github.com/johnkerl/miller/pull/1801
  • Bump golang.org/x/text from 0.22.0 to 0.25.0 by @dependabot in https://github.com/johnkerl/miller/pull/1800
  • Bump golang.org/x/term from 0.29.0 to 0.32.0 by @dependabot in https://github.com/johnkerl/miller/pull/1799
  • Bump github/codeql-action from 3.28.17 to 3.28.18 by @dependabot in https://github.com/johnkerl/miller/pull/1808
  • Bump github/codeql-action from 3.28.18 to 3.28.19 by @dependabot in https://github.com/johnkerl/miller/pull/1812
  • Bump golang.org/x/text from 0.25.0 to 0.26.0 by @dependabot in https://github.com/johnkerl/miller/pull/1813
  • Bump github/codeql-action from 3.28.19 to 3.29.0 by @dependabot in https://github.com/johnkerl/miller/pull/1814
  • Bump github/codeql-action from 3.29.0 to 3.29.1 by @dependabot in https://github.com/johnkerl/miller/pull/1822
  • Bump github/codeql-action from 3.29.1 to 3.29.2 by @dependabot in https://github.com/johnkerl/miller/pull/1825

New Contributors

  • @michel-slm made their first contribution in https://github.com/johnkerl/miller/pull/1745
  • @cwarden made their first contribution in https://github.com/johnkerl/miller/pull/1788

Full Changelog: https://github.com/johnkerl/miller/compare/v6.13.0...v6.14.0

Published by johnkerl 8 months ago

https://github.com/johnkerl/miller - File-stat DSL function, new stats accumulator, misc. bugfixes

New features

  • Add a stat DSL function by @johnkerl in #1560
  • Add mad accumulator for stats1 DSL function by @johnkerl in #1561
  • Support $NO_COLOR by @johnkerl in #1580
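
For readers unfamiliar with the NO_COLOR convention, the following Python sketch shows the usual rule: when the NO_COLOR environment variable is present and non-empty, a program suppresses ANSI color. How Miller combines this with its own color flags is not stated here, so the helper names and the precedence shown are assumptions.

```python
import os

def color_enabled(environ=None, is_tty=True):
    """Hypothetical helper: decide whether to emit ANSI color."""
    environ = os.environ if environ is None else environ
    # NO_COLOR convention: present and non-empty means "no color".
    if environ.get("NO_COLOR"):
        return False
    # Otherwise fall back to whether output goes to a terminal.
    return is_tty

def colorize(text, code, enabled):
    """Wrap text in an ANSI color escape only when color is enabled."""
    return f"\033[{code}m{text}\033[0m" if enabled else text

plain = colorize("key", 31, color_enabled({"NO_COLOR": "1"}))
```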

Bug fixes

  • Fraction bugfix by @oandrew in #1579
  • Fix local time when TZ is not set by @balki in #1649
  • Bash process substitution not working with put -f by @johnkerl in #1583
  • Be smarter about auto-unflatten by @johnkerl in #1584
  • RS aliases for ASCII top-of-table control characters are misnamed by @johnkerl in #1620
  • Fix binary data in JSON output by @johnkerl in #1626
  • Fix prepipe handling when filenames have whitespace by @johnkerl in #1627
  • Error in splita/splitax when field contains a single non-string value by @johnkerl in #1629

Documentation updates

  • Update reference-verbs.md by @aborruso in #1665
  • Characters to be removed by @aborruso in #1668
  • Fix minor typo by @austinletson in #1673
  • Enable admonition extension by @aborruso in #1636
  • To realize which chapter and section are active by @aborruso in #1631
  • To have edit and copy code in each page by @aborruso in #1632
  • Update extra.css by @aborruso in #1633
  • A note about positional field names by @aborruso in #1634
  • Fix typo in online help for --no-jlistwrap by @johnkerl in #1541
  • Try to build readthedocs .epub and .pdf by @johnkerl in #1548
  • On-line help for mlr summary --transpose by @johnkerl in #1581
  • Note IANA TSV support by @johnkerl in #1582
  • Source-file update for PR 1634 by @johnkerl in #1635
  • Update source material for #1665 by @johnkerl in #1666
  • Fix 1668 error-source by @johnkerl in #1672

Minor changes

  • The package version must match the major tag version by @lespea in #1654
  • Use string version of regexp methods to reduce allocs by @Juneezee in #1614
  • Chore: fix function name in comment by @camcui in #1543
  • Fix mismatched method names in comments by @forcedebug in #1549
  • Compiling on newer go versions doesn't work by @lespea in #1655
  • Misc. codespell findings by @johnkerl in #1628

New Contributors

  • @camcui made their first contribution in #1543
  • @forcedebug made their first contribution in #1549
  • @oandrew made their first contribution in #1579
  • @balki made their first contribution in #1649
  • @lespea made their first contribution in #1654
  • @austinletson made their first contribution in #1673

Dependency updates

  • Bump actions/cache from 4.0.1 to 4.0.2 by @dependabot in #1532
  • Bump golang.org/x/term from 0.18.0 to 0.19.0 by @dependabot in #1536
  • Bump github.com/klauspost/compress from 1.17.7 to 1.17.8 by @dependabot in #1538
  • Bump actions/upload-artifact from 4.3.1 to 4.3.2 by @dependabot in #1547
  • Bump actions/checkout from 4.1.2 to 4.1.3 by @dependabot in #1550
  • Bump actions/upload-artifact from 4.3.2 to 4.3.3 by @dependabot in #1551
  • Bump actions/checkout from 4.1.3 to 4.1.4 by @dependabot in #1552
  • Bump actions/setup-go from 5.0.0 to 5.0.1 by @dependabot in #1553
  • Bump golang.org/x/sys from 0.19.0 to 0.20.0 by @dependabot in #1554
  • Bump golang.org/x/text from 0.14.0 to 0.15.0 by @dependabot in #1556
  • Bump golang.org/x/term from 0.19.0 to 0.20.0 by @dependabot in #1555
  • Bump actions/checkout from 4.1.4 to 4.1.5 by @dependabot in #1557
  • Bump goreleaser/goreleaser-action from 5.0.0 to 5.1.0 by @dependabot in #1563
  • Bump actions/checkout from 4.1.5 to 4.1.6 by @dependabot in #1566
  • Bump github/codeql-action from 2.13.4 to 3.25.5 by @dependabot in #1567
  • Bump github/codeql-action from 3.25.5 to 3.25.6 by @dependabot in #1568
  • Bump github/codeql-action from 3.25.6 to 3.25.7 by @dependabot in #1570
  • Bump goreleaser/goreleaser-action from 5.1.0 to 6.0.0 by @dependabot in #1574
  • Bump github/codeql-action from 3.25.7 to 3.25.8 by @dependabot in #1575
  • Bump golang.org/x/text from 0.15.0 to 0.16.0 by @dependabot in #1576
  • Bump golang.org/x/sys from 0.20.0 to 0.21.0 by @dependabot in #1578
  • Bump golang.org/x/term from 0.20.0 to 0.21.0 by @dependabot in #1577
  • Bump github.com/klauspost/compress from 1.17.8 to 1.17.9 by @dependabot in #1585
  • Bump actions/checkout from 4.1.6 to 4.1.7 by @dependabot in #1586
  • Bump github/codeql-action from 3.25.8 to 3.25.9 by @dependabot in #1587
  • Bump github/codeql-action from 3.25.9 to 3.25.10 by @dependabot in #1588
  • Bump github/codeql-action from 3.25.10 to 3.25.11 by @dependabot in #1593
  • Bump golang.org/x/sys from 0.21.0 to 0.22.0 by @dependabot in #1595
  • Bump golang.org/x/term from 0.21.0 to 0.22.0 by @dependabot in #1594
  • Bump actions/upload-artifact from 4.3.3 to 4.3.4 by @dependabot in #1596
  • Bump actions/setup-go from 5.0.1 to 5.0.2 by @dependabot in #1597
  • Bump github/codeql-action from 3.25.11 to 3.25.12 by @dependabot in #1598
  • Bump github/codeql-action from 3.25.12 to 3.25.13 by @dependabot in #1602
  • Bump github/codeql-action from 3.25.13 to 3.25.14 by @dependabot in #1603
  • Bump github/codeql-action from 3.25.14 to 3.25.15 by @dependabot in #1604
  • Bump golang.org/x/sys from 0.22.0 to 0.23.0 by @dependabot in #1605
  • Bump actions/upload-artifact from 4.3.4 to 4.3.5 by @dependabot in #1606
  • Bump golang.org/x/term from 0.22.0 to 0.23.0 by @dependabot in #1612
  • Bump actions/upload-artifact from 4.3.5 to 4.3.6 by @dependabot in #1609
  • Bump github/codeql-action from 3.25.15 to 3.26.0 by @dependabot in #1610
  • Bump golang.org/x/text from 0.16.0 to 0.17.0 by @dependabot in #1611
  • Bump golang.org/x/sys from 0.23.0 to 0.24.0 by @dependabot in #1613
  • Bump github/codeql-action from 3.26.0 to 3.26.1 by @dependabot in #1615
  • Bump github/codeql-action from 3.26.1 to 3.26.2 by @dependabot in #1617
  • Bump codespell-project/actions-codespell from 2.0 to 2.1 by @dependabot in #1622
  • Bump github/codeql-action from 3.26.2 to 3.26.3 by @dependabot in #1623
  • Bump github/codeql-action from 3.26.3 to 3.26.4 by @dependabot in #1624
  • Bump github/codeql-action from 3.26.4 to 3.26.5 by @dependabot in #1630
  • Bump github.com/lestrrat-go/strftime from 1.0.6 to 1.1.0 by @dependabot in #1637
  • Bump github/codeql-action from 3.26.5 to 3.26.6 by @dependabot in #1638
  • Bump actions/upload-artifact from 4.3.6 to 4.4.0 by @dependabot in #1640
  • Bump golang.org/x/text from 0.17.0 to 0.18.0 by @dependabot in #1641
  • Bump golang.org/x/term from 0.23.0 to 0.24.0 by @dependabot in #1642
  • Bump github/codeql-action from 3.26.6 to 3.26.7 by @dependabot in #1648
  • Bump github/codeql-action from 3.26.7 to 3.26.8 by @dependabot in #1652
  • Bump github.com/klauspost/compress from 1.17.9 to 1.17.10 by @dependabot in #1659
  • Bump github/codeql-action from 3.26.8 to 3.26.9 by @dependabot in #1660
  • Bump actions/checkout from 4.1.7 to 4.2.0 by @dependabot in #1662
  • Bump github/codeql-action from 3.26.9 to 3.26.10 by @dependabot in #1664
  • Bump github/codeql-action from 3.26.10 to 3.26.11 by @dependabot in #1669

Full Changelog: https://github.com/johnkerl/miller/compare/v6.12.0...v6.13.0

Published by johnkerl over 1 year ago

https://github.com/johnkerl/miller - New sparsify verb, wide-table performance improvement, thousands separator for fmtnum function

Features

  • New mlr sparsify verb by @johnkerl in https://github.com/johnkerl/miller/pull/1498
  • Support thousands separator in fmtnum by @johnkerl in https://github.com/johnkerl/miller/pull/1499
  • Add descriptions for put and filter verbs by @johnkerl in https://github.com/johnkerl/miller/pull/1529
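
The two data-facing features above can be pictured with a small Python sketch. It assumes that sparsifying means dropping fields whose value is empty, and that a thousands separator groups digits in threes. The function names are hypothetical, and the sketch does not reproduce Miller's option names or fmtnum format strings.

```python
def sparsify(record, filler=""):
    """Hypothetical helper: drop fields whose value equals the filler."""
    return {k: v for k, v in record.items() if v != filler}

def with_thousands(n):
    """Format an integer with comma thousands separators."""
    return f"{n:,d}"

sparse = sparsify({"a": "1", "b": "", "c": "3"})
pretty = with_thousands(1234567)
```

Sparse records like this suit key-value formats such as JSON. For CSV or TSV output the keys generally need to be uniform across records.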

Bugfixes

  • Miller produces no output on TSV with > 64K characters per line by @johnkerl in https://github.com/johnkerl/miller/pull/1505
  • Enable record-hashing by default by @johnkerl in https://github.com/johnkerl/miller/pull/1507
  • Improved file-not-found handling by @johnkerl in https://github.com/johnkerl/miller/pull/1508
  • Avoid spurious [] on JSON output in some cases by @johnkerl in https://github.com/johnkerl/miller/pull/1528

Internal

  • 6.11.0-dev by @johnkerl in https://github.com/johnkerl/miller/pull/1484
  • Separate out ILineReader abstraction by @johnkerl in https://github.com/johnkerl/miller/pull/1504

Dependency updates

  • Bump actions/upload-artifact from 4.2.0 to 4.3.0 by @dependabot in https://github.com/johnkerl/miller/pull/1483
  • Bump github.com/klauspost/compress from 1.16.7 to 1.17.5 by @dependabot in https://github.com/johnkerl/miller/pull/1486
  • Bump actions/upload-artifact from 4.3.0 to 4.3.1 by @dependabot in https://github.com/johnkerl/miller/pull/1491
  • Bump github.com/klauspost/compress from 1.17.5 to 1.17.6 by @dependabot in https://github.com/johnkerl/miller/pull/1492
  • Bump golang.org/x/term from 0.16.0 to 0.17.0 by @dependabot in https://github.com/johnkerl/miller/pull/1494
  • Bump github.com/klauspost/compress from 1.17.6 to 1.17.7 by @dependabot in https://github.com/johnkerl/miller/pull/1502
  • Bump actions/cache from 4.0.0 to 4.0.1 by @dependabot in https://github.com/johnkerl/miller/pull/1511
  • Bump github.com/stretchr/testify from 1.8.4 to 1.9.0 by @dependabot in https://github.com/johnkerl/miller/pull/1516
  • Bump golang.org/x/sys from 0.17.0 to 0.18.0 by @dependabot in https://github.com/johnkerl/miller/pull/1521
  • Bump golang.org/x/term from 0.17.0 to 0.18.0 by @dependabot in https://github.com/johnkerl/miller/pull/1522
  • Bump actions/checkout from 4.1.1 to 4.1.2 by @dependabot in https://github.com/johnkerl/miller/pull/1526

Full Changelog: https://github.com/johnkerl/miller/compare/v6.11.0...v6.12.0

Published by johnkerl almost 2 years ago

https://github.com/johnkerl/miller - v6.11.0

Features

  • Auto-unsparsify CSV and TSV on output by @johnkerl in https://github.com/johnkerl/miller/pull/1479
  • mlr reorder with regex support by @johnkerl in https://github.com/johnkerl/miller/pull/1473
  • Implement all/by-regex field selection (-a/-r) for mlr sub, gsub, and ssub by @johnkerl in https://github.com/johnkerl/miller/pull/1480
  • Preserve regex captures across stack frames by @johnkerl in https://github.com/johnkerl/miller/pull/1447
  • Document and unit-test regex-capture reset logic by @johnkerl in https://github.com/johnkerl/miller/pull/1451
  • New strmatch/strmatchx DSL functions by @johnkerl in https://github.com/johnkerl/miller/pull/1448
  • Implement mlr uniq -x by @johnkerl in https://github.com/johnkerl/miller/pull/1457
  • On-line help info for mlr join --lk "" by @johnkerl in https://github.com/johnkerl/miller/pull/1458
  • Fix PR 1462: remove limit of 1000 on dedupe field names by @johnkerl in https://github.com/johnkerl/miller/pull/1463
  • Support PPRINT barred input by @johnkerl in https://github.com/johnkerl/miller/pull/1472
  • Support markdown format on input by @johnkerl in https://github.com/johnkerl/miller/pull/1478
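
To make the auto-unsparsify item concrete: CSV and TSV need every row to carry the same columns, so records with missing keys have to be padded before writing. The Python sketch below shows that general idea only. The exact rules Miller applies (for example, when it starts a new header block instead of padding) are not given in these notes, and the function names are hypothetical.

```python
import csv
import io

def unsparsify(records, fill_with=""):
    """Hypothetical helper: give every record the union of all keys."""
    keys = []
    seen = set()
    for record in records:
        for key in record:
            if key not in seen:
                seen.add(key)
                keys.append(key)  # keep first-seen key order
    return keys, [{k: r.get(k, fill_with) for k in keys} for r in records]

def to_csv(records):
    """Write padded records as CSV text with a single header line."""
    keys, rows = unsparsify(records)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=keys, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

text = to_csv([{"a": 1, "b": 2}, {"a": 3, "c": 4}])
```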

Bugfixes

  • mlr --norc was erroring by @johnkerl in https://github.com/johnkerl/miller/pull/1450
  • Have clean_whitespace re-run type inference by @johnkerl in https://github.com/johnkerl/miller/pull/1464

Internals

  • Rename internal regex functions by @johnkerl in https://github.com/johnkerl/miller/pull/1446
  • Replace deprecated io/ioutil functions by @Juneezee in https://github.com/johnkerl/miller/pull/1452
  • Internal name-neatens by @johnkerl in https://github.com/johnkerl/miller/pull/1475
  • Fix typos in tests for PPRINT barred input by @johnkerl in https://github.com/johnkerl/miller/pull/1476
  • Don't run regression tests twice in GitHub CI by @johnkerl in https://github.com/johnkerl/miller/pull/1477
  • Miller 6.11.0 by @johnkerl in https://github.com/johnkerl/miller/pull/1481

Dependencies

  • Bump actions/upload-artifact from 3.1.3 to 4.0.0 by @dependabot in https://github.com/johnkerl/miller/pull/1445
  • Bump golang.org/x/term from 0.15.0 to 0.16.0 by @dependabot in https://github.com/johnkerl/miller/pull/1466
  • Bump actions/cache from 3.3.2 to 3.3.3 by @dependabot in https://github.com/johnkerl/miller/pull/1468
  • Bump actions/upload-artifact from 4.0.0 to 4.1.0 by @dependabot in https://github.com/johnkerl/miller/pull/1469
  • Bump actions/cache from 3.3.3 to 4.0.0 by @dependabot in https://github.com/johnkerl/miller/pull/1470
  • Bump actions/upload-artifact from 4.1.0 to 4.2.0 by @dependabot in https://github.com/johnkerl/miller/pull/1471

Full Changelog: https://github.com/johnkerl/miller/compare/v6.10.0...v6.11.0

Published by johnkerl about 2 years ago

https://github.com/johnkerl/miller - Add --files option; bugfixes; use Go 1.19

Features

  • Add a --files option by @johnkerl in https://github.com/johnkerl/miller/pull/1426

Bugfixes

  • Fix ragged-CSV auto-pad by @johnkerl in https://github.com/johnkerl/miller/pull/1428
  • Absent variable on left side of boolean OR (||) expression makes it absent by @johnkerl in https://github.com/johnkerl/miller/pull/1434
  • Include null in any typemask by @johnkerl in https://github.com/johnkerl/miller/pull/1395
  • transformers/grep: avoid allocations with (*regexp.Regexp).MatchString by @Juneezee in https://github.com/johnkerl/miller/pull/1416
  • JSONL output does not properly handle keys with quotes by @johnkerl in https://github.com/johnkerl/miller/pull/1425
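
The JSONL item concerns keys that themselves contain double quotes. As a general illustration (not Miller's code), the Python sketch below shows why such keys must be escaped when written: without the backslash, the output line would not parse back as JSON. The helper name is hypothetical.

```python
import json

def to_jsonl(records):
    """Hypothetical helper: one JSON object per line, keys and values escaped."""
    return "\n".join(json.dumps(record) for record in records)

records = [{'a"b': 1}, {"plain": 2}]
lines = to_jsonl(records).split("\n")

# Each line must round-trip through a JSON parser.
round_tripped = [json.loads(line) for line in lines]
```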

Minor changes

  • Update to Go 1.19 by @johnkerl in https://github.com/johnkerl/miller/pull/1441
  • miller 6.10.0 by @johnkerl in https://github.com/johnkerl/miller/pull/1442
  • Add winget to README.md by @rursprung in https://github.com/johnkerl/miller/pull/1414
  • Name-neaten for #1392 by @johnkerl in https://github.com/johnkerl/miller/pull/1393

Miller as API

  • Export library code in pkg/ by @johnkerl in https://github.com/johnkerl/miller/pull/1391
  • Better API example by @johnkerl in https://github.com/johnkerl/miller/pull/1392

Dependencies

  • Bump golang.org/x/text from 0.12.0 to 0.13.0 by @dependabot in https://github.com/johnkerl/miller/pull/1382
  • Bump golang.org/x/sys from 0.11.0 to 0.12.0 by @dependabot in https://github.com/johnkerl/miller/pull/1381
  • Bump golang.org/x/term from 0.11.0 to 0.12.0 by @dependabot in https://github.com/johnkerl/miller/pull/1380
  • Bump actions/checkout from 3.6.0 to 4.0.0 by @dependabot in https://github.com/johnkerl/miller/pull/1383
  • Bump goreleaser/goreleaser-action from 4.4.0 to 4.6.0 by @dependabot in https://github.com/johnkerl/miller/pull/1385
  • Bump actions/upload-artifact from 3.1.2 to 3.1.3 by @dependabot in https://github.com/johnkerl/miller/pull/1387
  • Bump actions/cache from 3.3.1 to 3.3.2 by @dependabot in https://github.com/johnkerl/miller/pull/1390
  • Bump goreleaser/goreleaser-action from 4.6.0 to 5.0.0 by @dependabot in https://github.com/johnkerl/miller/pull/1396
  • Bump actions/checkout from 4.0.0 to 4.1.0 by @dependabot in https://github.com/johnkerl/miller/pull/1400
  • Bump golang.org/x/term from 0.12.0 to 0.13.0 by @dependabot in https://github.com/johnkerl/miller/pull/1404
  • Bump github.com/mattn/go-isatty from 0.0.19 to 0.0.20 by @dependabot in https://github.com/johnkerl/miller/pull/1411
  • Bump actions/checkout from 4.1.0 to 4.1.1 by @dependabot in https://github.com/johnkerl/miller/pull/1412
  • Bump golang.org/x/text from 0.13.0 to 0.14.0 by @dependabot in https://github.com/johnkerl/miller/pull/1419
  • Bump golang.org/x/sys from 0.13.0 to 0.14.0 by @dependabot in https://github.com/johnkerl/miller/pull/1420
  • Bump golang.org/x/term from 0.13.0 to 0.14.0 by @dependabot in https://github.com/johnkerl/miller/pull/1423
  • Bump golang.org/x/term from 0.14.0 to 0.15.0 by @dependabot in https://github.com/johnkerl/miller/pull/1432
  • Bump actions/setup-go from 4.1.0 to 5.0.0 by @dependabot in https://github.com/johnkerl/miller/pull/1436

New Contributors

  • @rursprung made their first contribution in https://github.com/johnkerl/miller/pull/1414

Full Changelog: https://github.com/johnkerl/miller/compare/v6.9.0...v6.10.0

Published by johnkerl about 2 years ago

https://github.com/johnkerl/miller - v6.9.0

New features

Support for nanosecond-resolution timestamps:

  • Add DSL functions for integer nanoseconds since the epoch by @johnkerl in https://github.com/johnkerl/miller/pull/1326
  • Add %N and %O for strfntime by @johnkerl in https://github.com/johnkerl/miller/pull/1334
  • Add %s format specifier for strftime by @johnkerl in https://github.com/johnkerl/miller/pull/1335
  • Requested on issue https://github.com/johnkerl/miller/issues/1152
  • See also https://miller.readthedocs.io/en/6.9.0/reference-dsl-builtin-functions/index.html#time-functions
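
One reason to carry nanosecond timestamps as integers is that a 64-bit float cannot hold nanoseconds since the epoch exactly. The Python sketch below illustrates that point and formats an integer nanosecond count with nine fractional digits. The function name and output layout are illustrative assumptions. Miller's own function names and format specifiers are listed in the linked documentation.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000

def format_nanos(ns):
    """Hypothetical helper: integer nanoseconds since the epoch to UTC text."""
    secs, nanos = divmod(ns, NANOS_PER_SECOND)
    dt = datetime.fromtimestamp(secs, tz=timezone.utc)
    # Append the nanosecond remainder as exactly nine digits.
    return dt.strftime("%Y-%m-%dT%H:%M:%S") + f".{nanos:09d}Z"

ns = 1_500_000_000_123_456_789
stamp = format_nanos(ns)

# A float has a 53-bit mantissa, so this value does not survive a round trip.
lossy = int(float(ns)) != ns
```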

Stats from the stats verb may now be computed over arbitrary arrays and maps:

  • New DSL functions for summary stats over arrays / maps by @johnkerl in https://github.com/johnkerl/miller/pull/1364
  • Requested on issue https://github.com/johnkerl/miller/issues/1345
  • See also https://miller.readthedocs.io/en/6.9.0/reference-dsl-builtin-functions/index.html#stats-functions
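
As a loose picture of "summary stats over arrays / maps", the Python sketch below accepts either a list or a dictionary and summarizes the values. It assumes that for a map the statistics are taken over the values rather than the keys. The function names are hypothetical and do not mirror Miller's DSL function names, which are in the linked documentation.

```python
def values_of(collection):
    """Hypothetical helper: list elements, or map values if given a dict."""
    if isinstance(collection, dict):
        return list(collection.values())
    return list(collection)

def summary(collection):
    """Compute a few basic statistics over an array or a map."""
    xs = values_of(collection)
    return {
        "count": len(xs),
        "sum": sum(xs),
        "mean": sum(xs) / len(xs),
        "min": min(xs),
        "max": max(xs),
    }

from_array = summary([1, 2, 3, 4])
from_map = summary({"x": 1, "y": 2, "z": 6})
```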

Additional control over filenames for the split verb:

  • Filename options for split by @sloanlance in https://github.com/johnkerl/miller/pull/1366
  • Requested on issue https://github.com/johnkerl/miller/issues/1365

Support for details of data-computation errors beyond the current (error):

  • Fatal-on-data-error mlr -x option by @johnkerl in https://github.com/johnkerl/miller/pull/1373
  • See also https://miller.readthedocs.io/en/6.9.0/reference-dsl-errors/#handling-for-data-errors
  • Requested on issue https://github.com/johnkerl/miller/issues/1106

New verbs and DSL functions:

  • New sub, gsub, and ssub verbs by @johnkerl in https://github.com/johnkerl/miller/pull/1361. See also:
    • https://miller.readthedocs.io/en/6.9.0/reference-verbs/#sub
    • https://miller.readthedocs.io/en/6.9.0/reference-verbs/#gsub
    • https://miller.readthedocs.io/en/6.9.0/reference-verbs/#ssub
  • New contains DSL function by @johnkerl in https://github.com/johnkerl/miller/pull/1374
    • https://miller.readthedocs.io/en/6.9.0/reference-dsl-builtin-functions/index.html#contains
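
The difference between the three substitution flavors is easiest to see side by side. The Python sketch below models the semantics as described in the linked docs (sub replaces the first regex match, gsub replaces all regex matches, ssub replaces the first literal occurrence with no regex handling, contains is a substring test). The Python functions merely borrow those names for readability; they are not Miller's implementation.

```python
import re


def sub(s, pattern, replacement):
    """Replace only the first regex match."""
    return re.sub(pattern, replacement, s, count=1)


def gsub(s, pattern, replacement):
    """Replace every regex match."""
    return re.sub(pattern, replacement, s)


def ssub(s, literal, replacement):
    """Replace the first occurrence of a literal string; no character is special."""
    return s.replace(literal, replacement, 1)


def contains(s, literal):
    """True when literal occurs in s as a substring."""
    return literal in s


print(sub("a.b.c", r"\.", "-"))   # a-b.c
print(gsub("a.b.c", r"\.", "-"))  # a-b-c
print(sub("a.b.c", ".", "-"))     # -.b.c  (unescaped "." is a regex wildcard)
print(ssub("a.b.c", ".", "-"))    # a-b.c  ("." is taken literally)
print(contains("a.b.c", "b.c"))   # True
```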

Other updates:

  • Support ZSTD compression in-process by @johnkerl in https://github.com/johnkerl/miller/pull/1360
    • See also https://miller.readthedocs.io/en/6.9.0/reference-main-compressed-data/
  • Support comments in mlr -s files by @johnkerl in https://github.com/johnkerl/miller/pull/1359
  • Add empty-key check to mlr check by @johnkerl in https://github.com/johnkerl/miller/pull/1330

Bug fixes

  • Do wildcard globbing on Windows by @johnkerl in https://github.com/johnkerl/miller/pull/1362
  • Treat empty like absent in + - * by @johnkerl in https://github.com/johnkerl/miller/pull/1371
  • Can't use ${field_name} if it contains UTF-8 characters also encodeable as Latin-1 by @johnkerl in https://github.com/johnkerl/miller/pull/1363
  • Typofix in uif/uof percentiles by @johnkerl in https://github.com/johnkerl/miller/pull/1375

Documentation updates

  • Update readthedocs notes in the how-to-release page by @johnkerl in https://github.com/johnkerl/miller/pull/1308
  • Fix mlr grep docs re OFS/OPS by @johnkerl in https://github.com/johnkerl/miller/pull/1309
  • Update Fedora link by @bkmgit in https://github.com/johnkerl/miller/pull/1339
  • Small typos in documentation of mlr nest by @johnkerl in https://github.com/johnkerl/miller/pull/1352

Internal

  • Update 2015-era Python sketch to Python 3 by @johnkerl in https://github.com/johnkerl/miller/pull/1372
  • Remove redundant nil check by @Juneezee in https://github.com/johnkerl/miller/pull/1367
  • Bump actions/checkout from 3.5.2 to 3.5.3 by @dependabot in https://github.com/johnkerl/miller/pull/1319
  • Bump github/codeql-action from 2.3.6 to 2.13.4 by @dependabot in https://github.com/johnkerl/miller/pull/1318
  • Bump golang.org/x/term from 0.8.0 to 0.9.0 by @dependabot in https://github.com/johnkerl/miller/pull/1321
  • Bump goreleaser/goreleaser-action from 4.2.0 to 4.3.0 by @dependabot in https://github.com/johnkerl/miller/pull/1320
  • Bump golang.org/x/text from 0.9.0 to 0.10.0 by @dependabot in https://github.com/johnkerl/miller/pull/1322
  • Bump golang.org/x/text from 0.10.0 to 0.11.0 by @dependabot in https://github.com/johnkerl/miller/pull/1337
  • Bump golang.org/x/sys from 0.9.0 to 0.10.0 by @dependabot in https://github.com/johnkerl/miller/pull/1336
  • Bump golang.org/x/term from 0.9.0 to 0.10.0 by @dependabot in https://github.com/johnkerl/miller/pull/1338
  • Bump golang.org/x/sys from 0.10.0 to 0.11.0 by @dependabot in https://github.com/johnkerl/miller/pull/1347
  • Bump golang.org/x/text from 0.11.0 to 0.12.0 by @dependabot in https://github.com/johnkerl/miller/pull/1349
  • Bump actions/setup-go from 4.0.1 to 4.1.0 by @dependabot in https://github.com/johnkerl/miller/pull/1351
  • Bump goreleaser/goreleaser-action from 4.3.0 to 4.4.0 by @dependabot in https://github.com/johnkerl/miller/pull/1354
  • Bump golang.org/x/term from 0.10.0 to 0.11.0 by @dependabot in https://github.com/johnkerl/miller/pull/1348
  • Bump actions/checkout from 3.5.3 to 3.6.0 by @dependabot in https://github.com/johnkerl/miller/pull/1369

New Contributors

  • @bkmgit made their first contribution in https://github.com/johnkerl/miller/pull/1339
  • @Juneezee made their first contribution in https://github.com/johnkerl/miller/pull/1367
  • @sloanlance made their first contribution in https://github.com/johnkerl/miller/pull/1366

Full Changelog: https://github.com/johnkerl/miller/compare/v6.8.0...v6.9.0

- Go
Published by johnkerl over 2 years ago

https://github.com/johnkerl/miller - Release candidate for 6.9.0

Given the CI failure at https://github.com/johnkerl/miller/pull/1376, caused by a change in the goreleaser GitHub action, I'm tiptoeing on this release by tagging a release candidate before tagging the release per se.

New features

Support for nanosecond-resolution timestamps:

  • Add DSL functions for integer nanoseconds since the epoch by @johnkerl in https://github.com/johnkerl/miller/pull/1326
  • Add %N and %O for strfntime by @johnkerl in https://github.com/johnkerl/miller/pull/1334
  • Add %s format specifier for strftime by @johnkerl in https://github.com/johnkerl/miller/pull/1335
  • Requested on issue https://github.com/johnkerl/miller/issues/1152
  • See also https://miller.readthedocs.io/en/6.9.0/reference-dsl-builtin-functions/index.html#time-functions

Stats from the stats verb may now be computed over arbitrary arrays and maps:

  • New DSL functions for summary stats over arrays / maps by @johnkerl in https://github.com/johnkerl/miller/pull/1364
  • Requested on issue https://github.com/johnkerl/miller/issues/1345
  • See also https://miller.readthedocs.io/en/6.9.0/reference-dsl-builtin-functions/index.html#stats-functions

Additional control over filenames for the split verb:

  • Filename options for split by @sloanlance in https://github.com/johnkerl/miller/pull/1366
  • Requested on issue https://github.com/johnkerl/miller/issues/1365

Support for details of data-computation errors beyond the current (error):

  • Fatal-on-data-error mlr -x option by @johnkerl in https://github.com/johnkerl/miller/pull/1373
  • See also https://miller.readthedocs.io/en/6.9.0/reference-dsl-errors/#handling-for-data-errors
  • Requested on issue https://github.com/johnkerl/miller/issues/1106

New verbs and DSL functions:

  • New sub, gsub, and ssub verbs by @johnkerl in https://github.com/johnkerl/miller/pull/1361
  • New contains DSL function by @johnkerl in https://github.com/johnkerl/miller/pull/1374

Other updates:

  • Support ZSTD compression in-process by @johnkerl in https://github.com/johnkerl/miller/pull/1360
  • Support comments in mlr -s files by @johnkerl in https://github.com/johnkerl/miller/pull/1359
  • Add empty-key check to mlr check by @johnkerl in https://github.com/johnkerl/miller/pull/1330

Bug fixes

  • Do wildcard globbing on Windows by @johnkerl in https://github.com/johnkerl/miller/pull/1362
  • Treat empty like absent in + - * by @johnkerl in https://github.com/johnkerl/miller/pull/1371
  • Can't use ${field_name} if it contains UTF-8 characters also encodeable as Latin-1 by @johnkerl in https://github.com/johnkerl/miller/pull/1363
  • Typofix in uif/uof percentiles by @johnkerl in https://github.com/johnkerl/miller/pull/1375

Documentation updates

  • Update readthedocs notes in the how-to-release page by @johnkerl in https://github.com/johnkerl/miller/pull/1308
  • Fix mlr grep docs re OFS/OPS by @johnkerl in https://github.com/johnkerl/miller/pull/1309
  • Update Fedora link by @bkmgit in https://github.com/johnkerl/miller/pull/1339
  • Small typos in documentation of mlr nest by @johnkerl in https://github.com/johnkerl/miller/pull/1352

Internal

  • Update 2015-era Python sketch to Python 3 by @johnkerl in https://github.com/johnkerl/miller/pull/1372
  • Remove redundant nil check by @Juneezee in https://github.com/johnkerl/miller/pull/1367
  • Bump actions/checkout from 3.5.2 to 3.5.3 by @dependabot in https://github.com/johnkerl/miller/pull/1319
  • Bump github/codeql-action from 2.3.6 to 2.13.4 by @dependabot in https://github.com/johnkerl/miller/pull/1318
  • Bump golang.org/x/term from 0.8.0 to 0.9.0 by @dependabot in https://github.com/johnkerl/miller/pull/1321
  • Bump goreleaser/goreleaser-action from 4.2.0 to 4.3.0 by @dependabot in https://github.com/johnkerl/miller/pull/1320
  • Bump golang.org/x/text from 0.9.0 to 0.10.0 by @dependabot in https://github.com/johnkerl/miller/pull/1322
  • Bump golang.org/x/text from 0.10.0 to 0.11.0 by @dependabot in https://github.com/johnkerl/miller/pull/1337
  • Bump golang.org/x/sys from 0.9.0 to 0.10.0 by @dependabot in https://github.com/johnkerl/miller/pull/1336
  • Bump golang.org/x/term from 0.9.0 to 0.10.0 by @dependabot in https://github.com/johnkerl/miller/pull/1338
  • Bump golang.org/x/sys from 0.10.0 to 0.11.0 by @dependabot in https://github.com/johnkerl/miller/pull/1347
  • Bump golang.org/x/text from 0.11.0 to 0.12.0 by @dependabot in https://github.com/johnkerl/miller/pull/1349
  • Bump actions/setup-go from 4.0.1 to 4.1.0 by @dependabot in https://github.com/johnkerl/miller/pull/1351
  • Bump goreleaser/goreleaser-action from 4.3.0 to 4.4.0 by @dependabot in https://github.com/johnkerl/miller/pull/1354
  • Bump golang.org/x/term from 0.10.0 to 0.11.0 by @dependabot in https://github.com/johnkerl/miller/pull/1348
  • Bump actions/checkout from 3.5.3 to 3.6.0 by @dependabot in https://github.com/johnkerl/miller/pull/1369

New Contributors

  • @bkmgit made their first contribution in https://github.com/johnkerl/miller/pull/1339
  • @Juneezee made their first contribution in https://github.com/johnkerl/miller/pull/1367
  • @sloanlance made their first contribution in https://github.com/johnkerl/miller/pull/1366

Full Changelog: https://github.com/johnkerl/miller/compare/v6.8.0...v6.9.0

- Go
Published by johnkerl over 2 years ago

https://github.com/johnkerl/miller - New case verb, index DSL function, and more

New features

New case verb:

  • Unify the case verb, and add options by @johnkerl in https://github.com/johnkerl/miller/pull/1306
  • Add new upcase and downcase verbs by @johnkerl in https://github.com/johnkerl/miller/pull/1217

New index DSL function:

  • index DSL function by @johnkerl in https://github.com/johnkerl/miller/pull/1247

Enhancements:

  • Add mlr step -a rprod for running products by @johnkerl in https://github.com/johnkerl/miller/pull/1228
  • Add optional second base argument to int DSL function by @johnkerl in https://github.com/johnkerl/miller/pull/1244
  • Implement --csv-trim-leading-space flag by @johnkerl in https://github.com/johnkerl/miller/pull/1272
  • New mlr json-parse -k flag by @johnkerl in https://github.com/johnkerl/miller/pull/1291
  • Let mlr help take pre-flags, such as --always-color by @johnkerl in https://github.com/johnkerl/miller/pull/1292
  • Values-only -a option for mlr grep by @johnkerl in https://github.com/johnkerl/miller/pull/1305
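
For readers unfamiliar with the term, a "running product" (the rprod stepper above) is the cumulative product of a column, record by record. A minimal Python sketch of the idea, with a hypothetical helper name and no claim about Miller's internals:

```python
from itertools import accumulate
from operator import mul


def running_products(values):
    """Return the cumulative product after each element (illustrative helper)."""
    return list(accumulate(values, mul))


print(running_products([1, 2, 3, 4]))
# prints [1, 2, 6, 24]
```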

Bugfixes

  • Fix bug on DSL comment with no final newline by @johnkerl in https://github.com/johnkerl/miller/pull/1216
  • Better error message on unparseable TZ environment variable by @johnkerl in https://github.com/johnkerl/miller/pull/1249
  • Fix typo by @dnicolson in https://github.com/johnkerl/miller/pull/1252
  • Treat data-file numbers with leading + as numeric by @johnkerl in https://github.com/johnkerl/miller/pull/1269
  • Fix precedence of coalesce operators ?? and ??? by @johnkerl in https://github.com/johnkerl/miller/pull/1270

Documentation

  • Docs re tail -f and --records-per-batch 1 by @johnkerl in https://github.com/johnkerl/miller/pull/1218
  • Fix issue links in README-dev.md by @kcwu in https://github.com/johnkerl/miller/pull/1248

Miscellaneous

  • Include tools in the release tarball by @skitt in https://github.com/johnkerl/miller/pull/1221
  • Run go mod tidy by @skitt in https://github.com/johnkerl/miller/pull/1220

Dependencies

Go:

  • Bump minimum compiler version from Go 1.15 to 1.18 by @johnkerl in https://github.com/johnkerl/miller/pull/1246

Others:

  • Bump github/codeql-action from 2.2.5 to 2.2.6 by @dependabot in https://github.com/johnkerl/miller/pull/1230
  • Bump actions/cache from 3.2.6 to 3.3.1 by @dependabot in https://github.com/johnkerl/miller/pull/1229
  • Bump github/codeql-action from 2.2.6 to 2.2.7 by @dependabot in https://github.com/johnkerl/miller/pull/1232
  • Bump actions/setup-go from 3.5.0 to 4.0.0 by @dependabot in https://github.com/johnkerl/miller/pull/1233
  • Bump actions/checkout from 3.3.0 to 3.4.0 by @dependabot in https://github.com/johnkerl/miller/pull/1234
  • Bump github/codeql-action from 2.2.7 to 2.2.8 by @dependabot in https://github.com/johnkerl/miller/pull/1242
  • Bump actions/checkout from 3.4.0 to 3.5.0 by @dependabot in https://github.com/johnkerl/miller/pull/1245
  • Bump golang.org/x/term from 0.0.0-20210927222741-03fcf44c2211 to 0.6.0 by @dependabot in https://github.com/johnkerl/miller/pull/1222
  • Bump github.com/mattn/go-isatty from 0.0.17 to 0.0.18 by @dependabot in https://github.com/johnkerl/miller/pull/1243
  • Bump github/codeql-action from 2.2.8 to 2.2.9 by @dependabot in https://github.com/johnkerl/miller/pull/1250
  • Bump codespell-project/actions-codespell from 9c63fddd79f483308bfaea379a505dcd361b5d1d to 57beb9f38f49d773d641ac555d1565c3b6a59938 by @dependabot in https://github.com/johnkerl/miller/pull/1253
  • Bump golang.org/x/term from 0.6.0 to 0.7.0 by @dependabot in https://github.com/johnkerl/miller/pull/1256
  • Bump github/codeql-action from 2.2.9 to 2.2.10 by @dependabot in https://github.com/johnkerl/miller/pull/1259
  • Bump github/codeql-action from 2.2.10 to 2.2.11 by @dependabot in https://github.com/johnkerl/miller/pull/1261
  • Bump actions/checkout from 3.5.0 to 3.5.1 by @dependabot in https://github.com/johnkerl/miller/pull/1263
  • Bump actions/checkout from 3.5.1 to 3.5.2 by @dependabot in https://github.com/johnkerl/miller/pull/1264
  • Bump github/codeql-action from 2.2.11 to 2.2.12 by @dependabot in https://github.com/johnkerl/miller/pull/1265
  • Bump github/codeql-action from 2.2.12 to 2.3.0 by @dependabot in https://github.com/johnkerl/miller/pull/1274
  • Bump github/codeql-action from 2.3.0 to 2.3.1 by @dependabot in https://github.com/johnkerl/miller/pull/1277
  • Bump github/codeql-action from 2.3.1 to 2.3.2 by @dependabot in https://github.com/johnkerl/miller/pull/1279
  • Bump codespell-project/actions-codespell from 57beb9f38f49d773d641ac555d1565c3b6a59938 to 94259cd8be02ad2903ba34a22d9c13de21a74461 by @dependabot in https://github.com/johnkerl/miller/pull/1282
  • Bump github/codeql-action from 2.3.2 to 2.3.3 by @dependabot in https://github.com/johnkerl/miller/pull/1284
  • Bump golang.org/x/term from 0.7.0 to 0.8.0 by @dependabot in https://github.com/johnkerl/miller/pull/1285
  • Bump actions/setup-go from 4.0.0 to 4.0.1 by @dependabot in https://github.com/johnkerl/miller/pull/1294
  • Bump github.com/stretchr/testify from 1.8.2 to 1.8.3 by @dependabot in https://github.com/johnkerl/miller/pull/1295
  • Bump github.com/mattn/go-isatty from 0.0.18 to 0.0.19 by @dependabot in https://github.com/johnkerl/miller/pull/1296
  • Bump github/codeql-action from 2.3.3 to 2.3.4 by @dependabot in https://github.com/johnkerl/miller/pull/1299
  • Bump github/codeql-action from 2.3.4 to 2.3.5 by @dependabot in https://github.com/johnkerl/miller/pull/1300
  • Bump github.com/stretchr/testify from 1.8.3 to 1.8.4 by @dependabot in https://github.com/johnkerl/miller/pull/1301
  • Bump github/codeql-action from 2.3.5 to 2.3.6 by @dependabot in https://github.com/johnkerl/miller/pull/1303

New Contributors

  • @kcwu made their first contribution in https://github.com/johnkerl/miller/pull/1248
  • @dnicolson made their first contribution in https://github.com/johnkerl/miller/pull/1252

Full Changelog: https://github.com/johnkerl/miller/compare/v6.7.0...v6.8.0

- Go
Published by johnkerl over 2 years ago

https://github.com/johnkerl/miller - New leftpad/rightpad DSL functions, unspace verb, and more

Features

  • New leftpad and rightpad DSL functions by @johnkerl in https://github.com/johnkerl/miller/pull/1205
  • mlr unspace verb by @johnkerl in https://github.com/johnkerl/miller/pull/1167
  • Support more backslashed special characters in DSL strings by @johnkerl in https://github.com/johnkerl/miller/pull/1212
  • Add --ofmte, --ofmtf, --ofmtg command-line flags by @johnkerl in https://github.com/johnkerl/miller/pull/1206
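
The padding functions can be modeled in a few lines. This Python sketch is an approximation under stated assumptions, not Miller's code: it pads up to, but never beyond, the requested length, and for a multi-character pad string it assumes only whole copies of the pad are used.

```python
def _pad_copies(text, length, pad):
    """Number of whole pad copies that fit without exceeding length."""
    if not pad:
        return 0
    return max(0, length - len(text)) // len(pad)


def leftpad(value, length, pad):
    """Left-pad value to at most `length` characters (illustrative model)."""
    text = str(value)
    return pad * _pad_copies(text, length, pad) + text


def rightpad(value, length, pad):
    """Right-pad value to at most `length` characters (illustrative model)."""
    text = str(value)
    return text + pad * _pad_copies(text, length, pad)


print(leftpad(5, 10, "0"))         # 0000000005
print(rightpad("hello", 10, "*"))  # hello*****
print(leftpad("hello", 3, "0"))    # hello  (already longer than 3)
```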

Documentation updates

  • Fixed missing double quote in documentation sample by @Clindbergh in https://github.com/johnkerl/miller/pull/1181
  • Complete #1181 by @johnkerl in https://github.com/johnkerl/miller/pull/1184
  • Add doc info on DSL code-comment syntax by @johnkerl in https://github.com/johnkerl/miller/pull/1165
  • Fix typos by @jwilk in https://github.com/johnkerl/miller/pull/1135
  • Fix typo in mlr put documentation by @johnkerl in https://github.com/johnkerl/miller/pull/1140

Bug fixes

  • Fix #1164: regression on CSV blank-line handling by @johnkerl in https://github.com/johnkerl/miller/pull/1168
  • Fix #1146: bug in lo/hi limits for non-auto histogram by @johnkerl in https://github.com/johnkerl/miller/pull/1157
  • Fix #1102: empty-string field in single-column TSV should not be a schema-restart by @johnkerl in https://github.com/johnkerl/miller/pull/1163

Minor changes

  • Add Go LICENSE file by @skitt in https://github.com/johnkerl/miller/pull/1171

Dependency updates

  • Bump github/codeql-action from 2.1.33 to 2.1.35 by @dependabot in https://github.com/johnkerl/miller/pull/1137
  • Bump actions/setup-go from 3.3.1 to 3.4.0 by @dependabot in https://github.com/johnkerl/miller/pull/1136
  • Bump github/codeql-action from 2.1.35 to 2.1.36 by @dependabot in https://github.com/johnkerl/miller/pull/1143
  • Bump actions/checkout from 3.1.0 to 3.2.0 by @dependabot in https://github.com/johnkerl/miller/pull/1145
  • Bump goreleaser/goreleaser-action from 3.2.0 to 4.1.0 by @dependabot in https://github.com/johnkerl/miller/pull/1147
  • Bump actions/setup-go from 3.4.0 to 3.5.0 by @dependabot in https://github.com/johnkerl/miller/pull/1148
  • Bump github/codeql-action from 2.1.36 to 2.1.37 by @dependabot in https://github.com/johnkerl/miller/pull/1151
  • Bump actions/cache from 3.0.11 to 3.2.0 by @dependabot in https://github.com/johnkerl/miller/pull/1155
  • Bump actions/cache from 3.2.0 to 3.2.1 by @dependabot in https://github.com/johnkerl/miller/pull/1156
  • Bump actions/cache from 3.2.1 to 3.2.2 by @dependabot in https://github.com/johnkerl/miller/pull/1160
  • Bump github.com/mattn/go-isatty from 0.0.16 to 0.0.17 by @dependabot in https://github.com/johnkerl/miller/pull/1162
  • Bump codespell-project/actions-codespell from bcf481f4d5cce7b92b65f05aebe8f552d4f1442c to 9c63fddd79f483308bfaea379a505dcd361b5d1d by @dependabot in https://github.com/johnkerl/miller/pull/1172
  • Bump actions/checkout from 3.2.0 to 3.3.0 by @dependabot in https://github.com/johnkerl/miller/pull/1173
  • Bump actions/cache from 3.2.2 to 3.2.3 by @dependabot in https://github.com/johnkerl/miller/pull/1174
  • Bump actions/upload-artifact from 3.1.1 to 3.1.2 by @dependabot in https://github.com/johnkerl/miller/pull/1175
  • Bump github/codeql-action from 2.1.37 to 2.1.38 by @dependabot in https://github.com/johnkerl/miller/pull/1176
  • Bump github/codeql-action from 2.1.38 to 2.1.39 by @dependabot in https://github.com/johnkerl/miller/pull/1179
  • Bump github/codeql-action from 2.1.39 to 2.2.1 by @dependabot in https://github.com/johnkerl/miller/pull/1183
  • Bump goreleaser/goreleaser-action from 4.1.0 to 4.1.1 by @dependabot in https://github.com/johnkerl/miller/pull/1185
  • Bump goreleaser/goreleaser-action from 4.1.1 to 4.2.0 by @dependabot in https://github.com/johnkerl/miller/pull/1187
  • Bump actions/cache from 3.2.3 to 3.2.4 by @dependabot in https://github.com/johnkerl/miller/pull/1186
  • Bump actions/cache from 3.2.4 to 3.2.5 by @dependabot in https://github.com/johnkerl/miller/pull/1192
  • Bump github/codeql-action from 2.2.1 to 2.2.3 by @dependabot in https://github.com/johnkerl/miller/pull/1191
  • Bump github/codeql-action from 2.2.3 to 2.2.4 by @dependabot in https://github.com/johnkerl/miller/pull/1193
  • Bump actions/cache from 3.2.5 to 3.2.6 by @dependabot in https://github.com/johnkerl/miller/pull/1196
  • Bump golang.org/x/sys from 0.0.0-20210326220804-49726bf1d181 to 0.1.0 in /cmd/experiments/cli_parser by @dependabot in https://github.com/johnkerl/miller/pull/1203
  • Bump github.com/stretchr/testify from 1.8.1 to 1.8.2 by @dependabot in https://github.com/johnkerl/miller/pull/1208
  • Bump github/codeql-action from 2.2.4 to 2.2.5 by @dependabot in https://github.com/johnkerl/miller/pull/1207

New Contributors

  • @jwilk made their first contribution in https://github.com/johnkerl/miller/pull/1135
  • @Clindbergh made their first contribution in https://github.com/johnkerl/miller/pull/1181

Full Changelog: https://github.com/johnkerl/miller/compare/v6.5.0...v6.7.0

- Go
Published by johnkerl almost 3 years ago

https://github.com/johnkerl/miller - Bugfixes and unspace verb

Features

  • mlr unspace verb by @johnkerl in https://github.com/johnkerl/miller/pull/1167

Bugfixes

  • Add doc info on DSL code-comment syntax by @johnkerl in https://github.com/johnkerl/miller/pull/1165
  • Fix typos by @jwilk in https://github.com/johnkerl/miller/pull/1135
  • Fix typo in mlr put documentation by @johnkerl in https://github.com/johnkerl/miller/pull/1140
  • Fix #1146: bug in lo/hi limits for non-auto histogram by @johnkerl in https://github.com/johnkerl/miller/pull/1157
  • Fix #1102: empty-string field in single-column TSV should not be a schema-restart by @johnkerl in https://github.com/johnkerl/miller/pull/1163
  • Fix #1164: regression on CSV blank-line handling by @johnkerl in https://github.com/johnkerl/miller/pull/1168

Internal

  • Bump github/codeql-action from 2.1.33 to 2.1.35 by @dependabot in https://github.com/johnkerl/miller/pull/1137
  • Bump actions/setup-go from 3.3.1 to 3.4.0 by @dependabot in https://github.com/johnkerl/miller/pull/1136
  • Bump github/codeql-action from 2.1.35 to 2.1.36 by @dependabot in https://github.com/johnkerl/miller/pull/1143
  • Bump actions/checkout from 3.1.0 to 3.2.0 by @dependabot in https://github.com/johnkerl/miller/pull/1145
  • Bump goreleaser/goreleaser-action from 3.2.0 to 4.1.0 by @dependabot in https://github.com/johnkerl/miller/pull/1147
  • Bump actions/setup-go from 3.4.0 to 3.5.0 by @dependabot in https://github.com/johnkerl/miller/pull/1148
  • Bump github/codeql-action from 2.1.36 to 2.1.37 by @dependabot in https://github.com/johnkerl/miller/pull/1151
  • Bump actions/cache from 3.0.11 to 3.2.0 by @dependabot in https://github.com/johnkerl/miller/pull/1155
  • Bump actions/cache from 3.2.0 to 3.2.1 by @dependabot in https://github.com/johnkerl/miller/pull/1156
  • Bump actions/cache from 3.2.1 to 3.2.2 by @dependabot in https://github.com/johnkerl/miller/pull/1160
  • Bump github.com/mattn/go-isatty from 0.0.16 to 0.0.17 by @dependabot in https://github.com/johnkerl/miller/pull/1162

New Contributors

  • @jwilk made their first contribution in https://github.com/johnkerl/miller/pull/1135

Full Changelog: https://github.com/johnkerl/miller/compare/v6.5.0...v6.6.0

- Go
Published by johnkerl about 3 years ago

https://github.com/johnkerl/miller - Bugfixes and memory-reduction optimizations

What's Changed

Features:

  • Restore the --jvquoteall flag by @johnkerl in https://github.com/johnkerl/miller/pull/1083
  • Restore --quote-all for CSV output by @johnkerl in https://github.com/johnkerl/miller/pull/1084

Bugfixes:

  • Fix labels for mlr histogram --auto by @johnkerl in https://github.com/johnkerl/miller/pull/1089
  • Correctly support multiple regexes in mlr reshape -r by @johnkerl in https://github.com/johnkerl/miller/pull/1091
  • Check -- terminator on --mfrom by @johnkerl in https://github.com/johnkerl/miller/pull/1098
  • Type-safety in exec by @johnkerl in https://github.com/johnkerl/miller/pull/1099
  • Don't double-quote a CSV field only for having a leading space by @johnkerl in https://github.com/johnkerl/miller/pull/1101

Performance/memory-reduction:

  • Use int8 for mvtype (memory reduction) by @johnkerl in https://github.com/johnkerl/miller/pull/1130
  • Exclude median from summary default by @johnkerl in https://github.com/johnkerl/miller/pull/1131
  • More mlrval size-reduction by @johnkerl in https://github.com/johnkerl/miller/pull/1132
  • Convert mlrval polymorphism from struct to unionish interface by @johnkerl in https://github.com/johnkerl/miller/pull/1133

Minor/internal:

  • Account for varying mlr locations by @skitt in https://github.com/johnkerl/miller/pull/1086
  • Account for varying mlr locations, continued by @johnkerl in https://github.com/johnkerl/miller/pull/1087
  • [StepSecurity] ci: Harden GitHub Actions by @step-security-bot in https://github.com/johnkerl/miller/pull/1107
  • Bump github.com/pkg/profile from 1.6.0 to 1.7.0 by @dependabot in https://github.com/johnkerl/miller/pull/1110
  • Bump github/codeql-action from 2.1.28 to 2.1.33 by @dependabot in https://github.com/johnkerl/miller/pull/1126
  • Bump actions/cache from 3 to 3.0.11 by @dependabot in https://github.com/johnkerl/miller/pull/1109
  • Bump actions/upload-artifact from 3.1.0 to 3.1.1 by @dependabot in https://github.com/johnkerl/miller/pull/1112
  • Bump github.com/stretchr/testify from 1.8.0 to 1.8.1 by @dependabot in https://github.com/johnkerl/miller/pull/1113
  • Miller 6.5.0 by @johnkerl in https://github.com/johnkerl/miller/pull/1134

New Contributors

  • @step-security-bot made their first contribution in https://github.com/johnkerl/miller/pull/1107

Full Changelog: https://github.com/johnkerl/miller/compare/v6.4.0...v6.5.0

- Go
Published by johnkerl about 3 years ago

https://github.com/johnkerl/miller - 5.10 bugfix for issue #1108

Miller 5 is long-gone; 6.0.0 was released almost a year ago. Yet issue #1108 reports a critical memory-corruption bug on 5.10.3; this fixes that.

- Go
Published by johnkerl about 3 years ago

https://github.com/johnkerl/miller - mlr summary verb, exec() function, mlr cat --filename, multiline string literals, and more

What's Changed

Major:

  • mlr summary verb by @johnkerl in https://github.com/johnkerl/miller/pull/1056
  • feat: system/exec() function call (#1043) by @forbesmyester in https://github.com/johnkerl/miller/pull/1067 and https://github.com/johnkerl/miller/pull/1071
  • Support simplified sort-map-by-value in the DSL by @johnkerl in https://github.com/johnkerl/miller/pull/1069
  • mlr cat --filename / --filenum by @johnkerl in https://github.com/johnkerl/miller/pull/1080
  • Allow multi-line string literals in the DSL by @johnkerl in https://github.com/johnkerl/miller/pull/1070

Minor:

  • Make PPRINT empty-string markers readable as such by @johnkerl in https://github.com/johnkerl/miller/pull/1059
  • Allow "\n" in mlr repl prompt by @johnkerl in https://github.com/johnkerl/miller/pull/1058

Bugfixes:

  • [Docs] moving --xvright out of the FLATTEN-UNFLATTEN FLAGS section by @trantor in https://github.com/johnkerl/miller/pull/1065
  • Fix doc typo by @luzpaz in https://github.com/johnkerl/miller/pull/1054
  • Fix natsort of empty strings; support mlr sort -rt same as -tr by @johnkerl in https://github.com/johnkerl/miller/pull/1068

Internal:

  • Reduce number of os.Exit callsites, part 1 of n by @johnkerl in https://github.com/johnkerl/miller/pull/1055
  • delete unreachable test code caused by os.Exit by @Abirdcfly in https://github.com/johnkerl/miller/pull/1073
  • Bump github.com/mattn/go-isatty from 0.0.14 to 0.0.16 by @dependabot in https://github.com/johnkerl/miller/pull/1074

New Contributors

  • @luzpaz made their first contribution in https://github.com/johnkerl/miller/pull/1054
  • @Abirdcfly made their first contribution in https://github.com/johnkerl/miller/pull/1073

Full Changelog: https://github.com/johnkerl/miller/compare/v6.3.0...v6.4.0

- Go
Published by johnkerl over 3 years ago

https://github.com/johnkerl/miller - Windows terminal colors, Latin-1, and more

What's Changed

Key feature: output colorization on Windows thanks to @tiesmaster:

  • Enable ANSI escape-sequence processing on Windows by @tiesmaster in https://github.com/johnkerl/miller/pull/1045
  • Enable output colorization on Windows by default by @johnkerl in https://github.com/johnkerl/miller/pull/1051

Support for Latin-1:

  • DSL functions and verbs for UTF-8 <-> Latin-1 by @johnkerl in https://github.com/johnkerl/miller/pull/997
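
What a UTF-8 <-> Latin-1 conversion does at the byte level can be shown with a short Python sketch. The helper names here are chosen for this illustration only, and the code says nothing about how Miller handles input that cannot be converted.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8 (illustrative helper)."""
    return data.decode("latin-1").encode("utf-8")


def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as Latin-1.

    Raises UnicodeEncodeError for characters outside Latin-1.
    """
    return data.decode("utf-8").encode("latin-1")


latin1 = b"caf\xe9"             # "café": the é is one byte in Latin-1
utf8 = latin1_to_utf8(latin1)
print(utf8)                     # b'caf\xc3\xa9': the é is two bytes in UTF-8
print(utf8_to_latin1(utf8) == latin1)  # True
```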

Features:

  • Re-use whitespace regexp in clean_whitespace by @johnkerl in https://github.com/johnkerl/miller/pull/994
  • Add line/column info for DSL runtime non-parse failures by @johnkerl in https://github.com/johnkerl/miller/pull/998
  • Allow x ** - y and x ** + y in the DSL grammar by @johnkerl in https://github.com/johnkerl/miller/pull/1021
  • Let + be an alias for then by @johnkerl in https://github.com/johnkerl/miller/pull/1049

Docs:

  • helm/kubectl examples in webdocs by @johnkerl in https://github.com/johnkerl/miller/pull/1005

Bugfixes:

  • Accept + in exponent of scientific-notation floating-point DSL literals by @johnkerl in https://github.com/johnkerl/miller/pull/1020
  • Fix ASCII vs UTF-8 in TSV writer by @johnkerl in https://github.com/johnkerl/miller/pull/1023
  • Avoid panic when the command line ends in 'then' by @johnkerl in https://github.com/johnkerl/miller/pull/1033
  • Fix panic on 'mlr sort -n' by @johnkerl in https://github.com/johnkerl/miller/pull/1004
  • Fix issue 1037 by @johnkerl in https://github.com/johnkerl/miller/pull/1047
  • Fix issue 1032 by @johnkerl in https://github.com/johnkerl/miller/pull/1048

Dependencies:

  • Bump actions/cache from 2 to 3 by @dependabot in https://github.com/johnkerl/miller/pull/1000
  • Bump github.com/stretchr/testify from 1.7.1 to 1.7.2 by @dependabot in https://github.com/johnkerl/miller/pull/1034
  • Bump github.com/stretchr/testify from 1.7.2 to 1.7.3 by @dependabot in https://github.com/johnkerl/miller/pull/1038
  • Bump github.com/stretchr/testify from 1.7.3 to 1.7.4 by @dependabot in https://github.com/johnkerl/miller/pull/1040
  • Bump github.com/stretchr/testify from 1.7.4 to 1.7.5 by @dependabot in https://github.com/johnkerl/miller/pull/1042
  • Bump github.com/stretchr/testify from 1.7.5 to 1.8.0 by @dependabot in https://github.com/johnkerl/miller/pull/1044
  • Bump actions/upload-artifact from 2 to 3 by @dependabot in https://github.com/johnkerl/miller/pull/1010
  • Bump actions/setup-go from 2 to 3 by @dependabot in https://github.com/johnkerl/miller/pull/1009
  • Bump github.com/lestrrat-go/strftime from 1.0.5 to 1.0.6 by @dependabot in https://github.com/johnkerl/miller/pull/1012
  • Bump github/codeql-action from 1 to 2 by @dependabot in https://github.com/johnkerl/miller/pull/1015
  • Bump goreleaser/goreleaser-action from 2 to 3 by @dependabot in https://github.com/johnkerl/miller/pull/1027

New Contributors

  • @tiesmaster made their first contribution in https://github.com/johnkerl/miller/pull/1045

Full Changelog: https://github.com/johnkerl/miller/compare/v6.2.0...v6.3.0

- Go
Published by johnkerl over 3 years ago

https://github.com/johnkerl/miller - Restore --tsvlite; add gssub and expand dhms functions

Overview

The primary purpose of this release is to restore --tsvlite which, on its own, would merit a 6.1.1 bugfix release. But since a couple of other new features are present as well, this is a 6.2.0 minor release.

The "Plans for 6.2.0" listed at https://github.com/johnkerl/miller/releases/tag/v6.1.0 are all still planned, but since this 6.2.0 exists sooner rather than later, those issues are now planned for a 6.3.0.

Details

PRs:

  • Restore --tsvlite by @johnkerl in https://github.com/johnkerl/miller/pull/984
  • Let dhms2sec accept input like "8h" by @johnkerl in https://github.com/johnkerl/miller/pull/983
  • Use fixed OFMT for multi-platform regression-testing by @johnkerl in https://github.com/johnkerl/miller/pull/988
  • Bump github.com/stretchr/testify from 1.7.0 to 1.7.1 by @dependabot in https://github.com/johnkerl/miller/pull/986
  • gssub DSL function by @johnkerl in https://github.com/johnkerl/miller/pull/989

Full Changelog: https://github.com/johnkerl/miller/compare/v6.1.0...v6.2.0

- Go
Published by johnkerl almost 4 years ago

https://github.com/johnkerl/miller - Natural sort, true TSV, sliding-window averages, and more

Please see:

  • https://miller.readthedocs.io/en/latest/ for more about Miller
  • https://miller.readthedocs.io/en/latest/installing-miller/ for installation

Features

Major features:

  • Natural sort by @johnkerl in https://github.com/johnkerl/miller/pull/932
  • mlr split verb by @johnkerl in https://github.com/johnkerl/miller/pull/898
  • Make TSV finally true TSV by @johnkerl in https://github.com/johnkerl/miller/pull/923
  • Sliding window averages by @johnkerl in https://github.com/johnkerl/miller/pull/894
  • Implement shift-lead option for mlr step by @johnkerl in https://github.com/johnkerl/miller/pull/893
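
For readers unfamiliar with the term, natural sort orders embedded numbers by value rather than character by character. The Python sketch below shows the general technique only; the helper name natural_key is hypothetical, and Miller's own ordering rules may differ in the details.

```python
import re


def natural_key(text):
    """Split text into alternating non-digit and digit runs for comparison."""
    parts = re.split(r"(\d+)", text)
    # With a capturing group, re.split alternates: text, digits, text, ...
    # so odd positions always hold digit runs, which are compared as integers.
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


names = ["file10", "file2", "file1"]
print(sorted(names))                   # ['file1', 'file10', 'file2']
print(sorted(names, key=natural_key))  # ['file1', 'file2', 'file10']
```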

New DSL functions:

  • New fmtifnum DSL function; make fmtnum/fmtifnum recursive over maps and arrays by @johnkerl in https://github.com/johnkerl/miller/pull/946
  • New unformat DSL function by @johnkerl in https://github.com/johnkerl/miller/pull/871
  • New format DSL function by @johnkerl in https://github.com/johnkerl/miller/pull/869
  • New concat DSL function for arrays by @johnkerl in https://github.com/johnkerl/miller/pull/868

DSL improvements:

  • Support more Go regex patterns, like "\d" by @johnkerl in https://github.com/johnkerl/miller/pull/974
  • Include \U support in addition to \u for DSL Unicode string literals by @johnkerl in https://github.com/johnkerl/miller/pull/917
  • Support unicode literals in the Miller DSL by @johnkerl in https://github.com/johnkerl/miller/pull/916
  • Allow 0o... octal literals in the DSL by @johnkerl in https://github.com/johnkerl/miller/pull/864

New command-line flags:

  • Add --left-keep-fields option for mlr join by @johnkerl in https://github.com/johnkerl/miller/pull/967
  • New --lazy-quotes flag for helping with malformed CSV by @johnkerl in https://github.com/johnkerl/miller/pull/925

REPL and on-line help:

  • Let :resetblocks/:rb in the REPL take optional begin/main/end by @johnkerl in https://github.com/johnkerl/miller/pull/924
  • Add :resetblocks / :rb to REPL by @johnkerl in https://github.com/johnkerl/miller/pull/920
  • ?foo and ??foo for :help foo / :help find foo in the REPL by @johnkerl in https://github.com/johnkerl/miller/pull/915

Improvements and bugfixes

  • Support Latin-1 supplement a0-ff as DSL string literals by @johnkerl in https://github.com/johnkerl/miller/pull/957
  • Fix "%%" in strptime; more test cases for strptime by @johnkerl in https://github.com/johnkerl/miller/pull/951
  • Support %F, %T, and more in strptime by @johnkerl in https://github.com/johnkerl/miller/pull/944
  • Fix handling of mlr nest abbrevs by @johnkerl in https://github.com/johnkerl/miller/pull/937
  • Add Inf and NaN literals to the DSL by @johnkerl in https://github.com/johnkerl/miller/pull/933
  • Boolean inference for issue 908 by @johnkerl in https://github.com/johnkerl/miller/pull/931
  • strptime %j format for 3-digit day in year by @johnkerl in https://github.com/johnkerl/miller/pull/930
  • Fix isnonempty for absent case by @johnkerl in https://github.com/johnkerl/miller/pull/928
  • --nidx --fs x should be the same as --fs x --nidx by @johnkerl in https://github.com/johnkerl/miller/pull/912
  • Update default colorization by @johnkerl in https://github.com/johnkerl/miller/pull/904
  • Make isnull/isnot_null DSL functions include new JSON-null type by @johnkerl in https://github.com/johnkerl/miller/pull/883
  • Fix #853 by @johnkerl in https://github.com/johnkerl/miller/pull/860

Documentation

  • New doc page: Parsing and formatting fields by @johnkerl in https://github.com/johnkerl/miller/pull/973
  • More doc material for :context in the REPL by @johnkerl in https://github.com/johnkerl/miller/pull/966
  • Fix typo in on-line help for splitax DSL function by @johnkerl in https://github.com/johnkerl/miller/pull/964
  • More doc-sites for the funct keyword by @johnkerl in https://github.com/johnkerl/miller/pull/963
  • Doc updates for funct keyword by @johnkerl in https://github.com/johnkerl/miller/pull/961
  • FAQ entry for #351 by @johnkerl in https://github.com/johnkerl/miller/pull/958
  • docs: add Poshi as a contributor for doc by @allcontributors in https://github.com/johnkerl/miller/pull/956
  • docs: add schragge as a contributor for doc by @allcontributors in https://github.com/johnkerl/miller/pull/955
  • FAQ entry for #285: carriage returns in field names by @johnkerl in https://github.com/johnkerl/miller/pull/953
  • Add --implicit-tsv-header as alias for --implicit-csv-header, etc by @johnkerl in https://github.com/johnkerl/miller/pull/952
  • Fix: multiple documentation tweaks by @Poshi in https://github.com/johnkerl/miller/pull/949
  • fix typo in reference-verbs.md by @zachvalenta in https://github.com/johnkerl/miller/pull/945
  • Add on mouse over permalink anchor for titles by @aborruso in https://github.com/johnkerl/miller/pull/942
  • Webdoc information on Unicode string literals by @johnkerl in https://github.com/johnkerl/miller/pull/935
  • 'mlr help function nonesuch' should not be silent by @johnkerl in https://github.com/johnkerl/miller/pull/934
  • Clarify strftime on-line help by @johnkerl in https://github.com/johnkerl/miller/pull/929
  • Expand on-line help for split* DSL functions by @johnkerl in https://github.com/johnkerl/miller/pull/927
  • On-line help for -s flag by @johnkerl in https://github.com/johnkerl/miller/pull/926
  • Multiple on-line-help issues from #908 by @johnkerl in https://github.com/johnkerl/miller/pull/921
  • Multiple on-line-help issues from #908 by @johnkerl in https://github.com/johnkerl/miller/pull/913
  • Fix operator-precedence doc table to match DSL grammar by @johnkerl in https://github.com/johnkerl/miller/pull/911
  • Fix multiple on-line-help issues from #907 by @johnkerl in https://github.com/johnkerl/miller/pull/910
  • Clarify source for printf-style formatting by @johnkerl in https://github.com/johnkerl/miller/pull/895
  • Fix #891 by @johnkerl in https://github.com/johnkerl/miller/pull/892
  • Improve mlr top documentation for #861 by @johnkerl in https://github.com/johnkerl/miller/pull/875
  • Continue #856 by @johnkerl in https://github.com/johnkerl/miller/pull/865
  • misspelling by @Gary-Armstrong in https://github.com/johnkerl/miller/pull/863
  • fix typo by @vapniks in https://github.com/johnkerl/miller/pull/862
  • Update installing-miller.md by @jauderho in https://github.com/johnkerl/miller/pull/859
  • Emit notes by @johnkerl in https://github.com/johnkerl/miller/pull/858
  • Conda/Docker install notes by @johnkerl in https://github.com/johnkerl/miller/pull/857
  • Fix typo: columnn -> column by @vapniks in https://github.com/johnkerl/miller/pull/856
  • Fix typo by @vapniks in https://github.com/johnkerl/miller/pull/855
  • Fix typo by @vapniks in https://github.com/johnkerl/miller/pull/854
  • A small typo by @aborruso in https://github.com/johnkerl/miller/pull/846

Code quality

  • Code-dedupe logic for array slices and string slices by @johnkerl in https://github.com/johnkerl/miller/pull/960
  • Let mlr repl print empty strings by @johnkerl in https://github.com/johnkerl/miller/pull/959
  • Neaten strptime.go by @johnkerl in https://github.com/johnkerl/miller/pull/950
  • More dead code removal by @skitt in https://github.com/johnkerl/miller/pull/905
  • Remove unreachable code by @skitt in https://github.com/johnkerl/miller/pull/903
  • Use int64 wherever "64-bit integer" is assumed by @skitt in https://github.com/johnkerl/miller/pull/902
  • More of #884: types in enum-consts by @johnkerl in https://github.com/johnkerl/miller/pull/887
  • Clean up file output handler error handling by @skitt in https://github.com/johnkerl/miller/pull/886
  • Use raw strings to avoid escapes by @skitt in https://github.com/johnkerl/miller/pull/885
  • Specify constant types except with iota by @skitt in https://github.com/johnkerl/miller/pull/884
  • Mlrval arrayval from []Mlrval to []*Mlrval by @johnkerl in https://github.com/johnkerl/miller/pull/880
  • Append slices directly instead of looping by @skitt in https://github.com/johnkerl/miller/pull/879
  • Fix mlrmap.Equals FieldCount comparison by @skitt in https://github.com/johnkerl/miller/pull/878
  • Ensure regression-test has a binary to test by @skitt in https://github.com/johnkerl/miller/pull/877
  • Avoid assuming ./mlr is the mlr to test by @skitt in https://github.com/johnkerl/miller/pull/876
  • Update release.yml by @jauderho in https://github.com/johnkerl/miller/pull/867
  • Update .goreleaser.yml by @jauderho in https://github.com/johnkerl/miller/pull/866
  • Goreleaser binary names by @johnkerl in https://github.com/johnkerl/miller/pull/852
  • Add CodeQL support by @jauderho in https://github.com/johnkerl/miller/pull/838

New Contributors

  • @vapniks made their first contribution in https://github.com/johnkerl/miller/pull/854
  • @Gary-Armstrong made their first contribution in https://github.com/johnkerl/miller/pull/863
  • @zachvalenta made their first contribution in https://github.com/johnkerl/miller/pull/945
  • @Poshi made their first contribution in https://github.com/johnkerl/miller/pull/949

Plans for 6.2.0

Update: planned now for 6.3.0 as 6.2.0 was quick and early.

  • Extended JSON-style field accessors for verbs: https://github.com/johnkerl/miller/issues/763 and https://github.com/johnkerl/miller/issues/948
  • AWK-like exit DSL function: https://github.com/johnkerl/miller/issues/341
  • DSL strict mode: https://github.com/johnkerl/miller/issues/440
  • YAML support: https://github.com/johnkerl/miller/issues/614
  • Datediff: https://github.com/johnkerl/miller/issues/708
  • Rank: https://github.com/johnkerl/miller/issues/383

Full Changelog: https://github.com/johnkerl/miller/compare/v6.0.0...v6.1.0

- Go
Published by johnkerl almost 4 years ago

https://github.com/johnkerl/miller - Miller 6

This is a significant release with many improvements to user experience, documentation, and performance.

Please see What's new in Miller 6 for complete information.

- Go
Published by johnkerl about 4 years ago

https://github.com/johnkerl/miller - Miller 6 release candidate 1

This is an update after https://github.com/johnkerl/miller/releases/tag/v6.0.0-beta, including several performance-optimization PRs since then.

This is a release-candidate tag -- it doesn't include https://github.com/johnkerl/miller/issues/827 or https://github.com/johnkerl/miller/discussions/755 both of which are blockers for the Miller 6.0.0 release per se.

The main purpose is for a Conda build by @BEFH as tracked at https://github.com/johnkerl/miller/issues/372#issuecomment-1007576714.

After #755 and #827 are resolved we will have either 6.0.0.rc2 (if other issues arise) or simply 6.0.0.

- Go
Published by johnkerl about 4 years ago

https://github.com/johnkerl/miller - Miller 6.0.0 beta release

This is a beta release for the upcoming 6.0.0 release of Miller.

Update: please see https://github.com/johnkerl/miller/releases/tag/v6.0.0.rc1.

Status

This is marked as a pre-release -- you can get the binaries (for Linux, Mac, and Windows) by downloading them from this release page. Meanwhile tools like brew, apt, chocolatey, etc will still give you Miller 5 until the official Miller 6.0.0 release which is forthcoming.

Release notes

https://miller.readthedocs.io/en/latest/new-in-miller-6

Documentation

Please see https://miller.readthedocs.io/en/latest

Goals for the beta

This is a major, exciting release with lots of features, documentation improvements, full Windows support, and more. Please comment on this page, or file an issue at https://github.com/johnkerl/miller/issues, with any and all feedback, criticism, comments, etc.

Performance updates

  • 2021/12/01: new binaries attached to this pre-release today incorporate https://github.com/johnkerl/miller/pull/765 which is a 40% reduction in runtime for large files. Two more performance PRs are in prep.
  • 2021/12/21: new binaries attached to this pre-release today incorporate several recent performance-related PRs -- see https://github.com/johnkerl/miller/pull/786 for details.
  • 2021/12/27: new binaries attached to this pre-release today incorporate the performance-related PR https://github.com/johnkerl/miller/pull/809. See also https://miller.readthedocs.io/en/latest/new-in-miller-6/#performance-benchmarks.
  • 2021/12/30: new binaries attached to this pre-release today incorporate fixes for all currently known release-blocking issues. The mlr version output now shows 6.0.0-rc to indicate this is a release candidate. I hope to release this very soon, barring any new feedback.

Note that the source tar file attached to this pre-release predates these performance improvements -- if you want binaries, they're current on this pre-release; if you want source, please clone HEAD.

Update: please see https://github.com/johnkerl/miller/releases/tag/v6.0.0.rc1

- Go
Published by johnkerl about 4 years ago

https://github.com/johnkerl/miller - Address Conda-build issue

This release exists solely to resolve a Conda-build issue as discussed on https://github.com/johnkerl/miller/issues/740. If you're not actively working on Conda packaging for Miller, this release has no added value for you above 5.10.2.

Likewise, there's no Windows mlr.exe for this final (technical & specific) Miller 5.x release -- for Miller 6.0.0 (coming soon!) and above there will be mlr.exe as a reliably standard part of each release.

Also note that the tarball is named miller-5.10.3.tar.gz, in contrast to mlr-5.10.2.tar.gz and likewise for all earlier releases. This is being done for forward compatibility with Miller 6.0.0 and beyond which will use names of the form miller-6.0.0.tar.gz, as proposed in https://github.com/johnkerl/miller/issues/360.

- Go
Published by johnkerl over 4 years ago

https://github.com/johnkerl/miller - Restore mlr manpage to distro file

Between 5.9 and 5.10, in the move of docs from https://johnkerl.org/miller/doc to https://miller.readthedocs.io/, I inadvertently made a change which kept the Miller manpage (man mlr) from being included in the distribution file.

The sole purpose of this release is to fix that.

If your way to access Miller versions is by downloading pre-built executables from the release page, or by building from source, this release doesn't do much for you. It's most useful for OS-specific distro-build systems, so that man mlr will again work correctly.

- Go
Published by johnkerl almost 5 years ago

https://github.com/johnkerl/miller - Bugfixes

This release fixes the following:

  • https://github.com/johnkerl/miller/issues/427
  • https://github.com/johnkerl/miller/issues/431
  • https://github.com/johnkerl/miller/issues/443

Note: The Miller Appveyor build is again broken and I find it very frustrating to keep running. Two bits of good news: (1) I am recently in possession of a local Windows machine where I hope to produce a mlr.exe; (2) for the Go port (whenever I'm done with it), building for Windows will be a breeze with no special magic.

- Go
Published by johnkerl almost 5 years ago

https://github.com/johnkerl/miller - sort-within-records, unsparsify -f, misc updates; Go-port beta

Features

Bugfixes

  • The count -n feature was not implemented as intended. This fulfills https://github.com/johnkerl/miller/issues/370, reported by @aborruso.
  • Pretty-print format now works correctly with --headerless-csv-output, as reported on https://github.com/johnkerl/miller/issues/384 by @agguser.
  • The seqgen verb now correctly tracks NR and FNR in the records it emits.
  • An intermittent JSON-parsing bug reported on https://github.com/johnkerl/miller/issues/394 by @sjackman has been fixed.

Documentation

This is the first release since the readthedocs move as requested by @pabloab on https://github.com/johnkerl/miller/issues/375. The intention is that you will be able to select documentation specific to 5.10.0 there; I may have something to fix here.

Go-port preview

While the mods for this 5.10.1 release are quite minor, intense development time has been spent over the last few months on the Go port, tracked here and here, which will ultimately become Miller 6.

The completion of the port is still some months away. While most verbs, and most of the DSL, have been ported -- with many new features in place as tracked here -- significant gaps remain. These include the "big" verbs join, nest, reshape, stats1, and stats2, along with all the date-time-related DSL functions, etc.

Nonetheless, if you wish to experiment with the Go executables for the Miller 6 beta, please find MacOS and Linux versions attached. (I don't know how to make these for Windows yet, sorry!)

I'd love any and all advance help with the Go port including bug reports, feature requests, etc. -- both from Miller end-users as well as developers. This is exciting and fulfilling work, and I look forward to getting it completed.

- Go
Published by johnkerl about 5 years ago

https://github.com/johnkerl/miller - Security update: disallow --prepipe in .mlrrc

As of Miller 5.9.0, you can have a .mlrrc file containing preferred flags.

As reported in https://github.com/johnkerl/miller/issues/363, it would be possible for someone to prepare a repository or some other zipfile/tarfile, for example, containing datasets, and send it to you. They could have a line of the form prepipe do_something_bad; cat in that repository, so when you ran any mlr commands in there, it would run the do_something_bad command (whatever that might be).

The fix is to (a) disallow prepipe within .mlrrc files and (b) as a consolation, allow new prepipe-zcat and prepipe-gunzip options which are safe to use.

This is published as CVE-2020-15167. Many thanks to @koernepr for the report!
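
The shape of that fix can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function name load_rc_lines, the line syntax (a key, optionally followed by "=" or a space and a value), and the mapping of the two safe options to fixed commands. It is not Miller's actual .mlrrc parser.

```python
import re

# Fixed commands that take no user-supplied shell text (hypothetical mapping).
SAFE_PREPIPE_COMMANDS = {"prepipe-zcat": "zcat", "prepipe-gunzip": "gunzip"}


def load_rc_lines(lines):
    """Return (options, rejected) for rc-file lines; a free-form prepipe is refused."""
    options, rejected = {}, []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split "key=value" or "key value" into at most two parts.
        parts = re.split(r"[=\s]+", line, maxsplit=1)
        key = parts[0].lstrip("-")
        value = parts[1] if len(parts) > 1 else ""
        if key == "prepipe":
            # An arbitrary shell command from a file someone else wrote: refuse it.
            rejected.append(line)
        elif key in SAFE_PREPIPE_COMMANDS:
            options["prepipe"] = SAFE_PREPIPE_COMMANDS[key]
        else:
            options[key] = value
    return options, rejected


options, rejected = load_rc_lines(["csv", "prepipe do_something_bad; cat", "prepipe-gunzip"])
print(options)
print(rejected)
```

The design point is that the safe options name a fixed command, so nothing in the rc file is ever passed to a shell as free-form text.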

- Go
Published by johnkerl over 5 years ago

https://github.com/johnkerl/miller - .mlrrc feature, and fix Windows build

  • You can now save common defaults in a ~/.mlrrc. For example, if you normally process CSV files, you can say that in your ~/.mlrrc and you can leave off the --csv flag from your mlr commands. You can read more about this feature here, or in man mlr, or in mlr --help. This feature was requested in https://github.com/johnkerl/miller/issues/339.
  • The AppVeyor build is now unbroken and as a result there are Windows artifacts for this build. Sorry about the delay!! :^/

- Go
Published by johnkerl over 5 years ago

https://github.com/johnkerl/miller - Better environment-variable support, new 'count' verb, bugfixes

Features

  • The new count verb is a keystroke-saver for stats1 -a count -f {some field name}.
  • --jsonx and --ojsonx are keystroke-savers for --json --jvstack and --ojson --jvstack, which is to say, multi-line pretty-printed JSON format.
  • The new -s name=value feature for mlr put and mlr filter gives you simpler access to environment variables in your Miller script, as requested in https://github.com/johnkerl/miller/issues/315.

Bugfixes

  • mlr format-values is no longer SEGVing on CSV/TSV input. This was reported on https://github.com/johnkerl/miller/issues/330.
  • https://github.com/johnkerl/miller/issues/313 fixes a corner case when field names within command-line arguments have embedded newlines.
  • Line/column indicators for JSON-formatting error messages are now correct (previously they were showing up as 0).
  • end {print NF} no longer SEGVs. This was reported in https://github.com/johnkerl/miller/issues/330.
  • Several broken doc links were fixed up as reported on https://github.com/johnkerl/miller/issues/329.

Windows note

  • The AppVeyor build has been broken for a while so there is no Windows executable attached to this release -- when I fix that there will be a 5.8.1 with Windows binaries. My apologies for the delay. Issue https://github.com/johnkerl/miller/issues/354 is open to track this.

- Go
Published by johnkerl over 5 years ago

https://github.com/johnkerl/miller - Ports, bugfixes, and keystroke-savers

Ports

  • Miller is available via MacPorts thanks to @herbygillot. Miller tracking issue is https://github.com/johnkerl/miller/pull/273.

  • An Alpine Linux port is pending this release thanks to @terorie. Miller tracking issue is https://github.com/johnkerl/miller/issues/293.

Features

Bugfixes

  • A bug regarding optional regex-pattern groups was fixed in https://github.com/johnkerl/miller/issues/277.
  • As of https://github.com/johnkerl/miller/issues/294 you can now specify --implicit-csv-header for the join-file in mlr join.
  • A bug with spaces in XTAB-file values was fixed on https://github.com/johnkerl/miller/issues/296.
  • A bug with missing final newline for XTAB-formatted files using MMAP files was fixed on https://github.com/johnkerl/miller/issues/301.

Documentation

  • Look-and-feel at http://johnkerl.org/miller/doc/ is (hopefully) improved, including clearer visual indication of which section/page you're currently looking at. Note that this change has been live for a few weeks, as look-and-feel-related doc-mods from post-5.6.2 were backported to http://johnkerl.org/miller/doc/.

  • https://github.com/johnkerl/miller/issues/282 improves DSL-function documentation at http://johnkerl.org/miller/doc/reference-dsl.html#Built-infunctionsforfilterandput,summary

Note

Support for mmap mode has been entirely discontinued. This is an invisible change and should not affect you at all. For anyone interested in lower-level details, though, the summary is as follows:

  • For an incremental performance gain (perhaps 10-20% run time at most, but see below), within the C source code one can use the mmap system call to access input files via pointer arithmetic rather than malloc-and-memcopy using stdio.
  • However mmap is not available when reading from standard input -- it cannot be memory-mapped.
  • This means all file-format readers are implemented twice within the Miller source code.
  • While I try to regression-test Miller thoroughly, running all canned tests through mmap and stdio mode, I've nonetheless found my mmap implementations liable to corner-cases which I miss but users find: for example https://github.com/johnkerl/miller/issues/29, https://github.com/johnkerl/miller/issues/102, and https://github.com/johnkerl/miller/issues/296.
  • As tracked on https://github.com/johnkerl/miller/issues/160, various operating systems do not release mmapped pages after use as one might intuit, meaning that for large files and/or large numbers of files, I've for a long time now needed to have Miller opt out of mmap usage for precisely those cases which most need the performance gain: see https://github.com/johnkerl/miller/issues/160, https://github.com/johnkerl/miller/issues/181, and https://github.com/johnkerl/miller/issues/256.
  • Additionally, mmap is not used at all for Windows/MSYS2 so there is nothing to lose there.

For these reasons, keeping mmap mode isn't worth the development overhead.
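
To make the "implemented twice" point concrete, here is a small Python sketch of the two access paths. Miller 5 was written in C, so this is only an analogy; the function names and the sample file are invented for the example.

```python
import mmap
import os
import tempfile


def read_lines_stdio(path):
    """Buffered reads: the same approach also works for pipes and standard input."""
    with open(path, "rb") as handle:
        return handle.read().splitlines()


def read_lines_mmap(path):
    """Memory-mapped reads: needs a real, non-empty file to map."""
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:].splitlines()


# Write a small sample file, then read it both ways.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"a,b,c\n1,2,3\n4,5,6\n")
    sample_path = tmp.name

same = read_lines_stdio(sample_path) == read_lines_mmap(sample_path)
os.remove(sample_path)
print(same)  # True
```

Both readers must produce identical results for every input, which is the duplicated testing burden described above.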

As of release 5.7.0, the mlr executable will still accept the --mmap and --no-mmap command-line flags as no-ops, for backward compatibility.

The caveat for you is that for everyday small files, the default was previously mmap mode and is now stdio (except mlr ... < filename or ... | mlr ... which have always used stdio). There is the off chance that this will newly reveal an old, latent bug or two somewhere.

I've re-run regressions in valgrind mode to aggressively catch any errors, but please let me know ASAP via a GitHub issue about any unexpected behavior in 5.7.0.

- Go
Published by johnkerl almost 6 years ago

https://github.com/johnkerl/miller - Bug fix for CSV/TSV with many files

Bug fixes:

  • https://github.com/johnkerl/miller/issues/271 fixes a corner-case bug with more than 100 CSV/TSV files with headers of varying lengths.

Documentation:

  • The new http://johnkerl.org/miller/doc/whyc-details.html is an elaboration on http://johnkerl.org/miller/doc/whyc.html which answers a question posed by @burntsushi on Reddit a couple years ago which I did not address in detail at the time.

- Go
Published by johnkerl over 6 years ago

https://github.com/johnkerl/miller - Mobile-friendly docs

The only change is that http://johnkerl.org/miller/doc is now more mobile-friendly.

All build artifacts are the same as at https://github.com/johnkerl/miller/releases/tag/v5.6.0

[Screenshots: the documentation site before and after the mobile-friendly change]

- Go
Published by johnkerl over 6 years ago

https://github.com/johnkerl/miller - System calls / external commands, ASV/USV support, and bulk numeric formatting

Features:

  • The new system DSL function allows you to run arbitrary shell commands and store them in field values. Some example usages are documented here. This is in response to issues https://github.com/johnkerl/miller/issues/246 and https://github.com/johnkerl/miller/issues/209.

  • There is now support for ASV and USV file formats. This is in response to issue https://github.com/johnkerl/miller/issues/245.

  • The new format-values verb allows you to apply numerical formatting across all record values. This is in response to issue https://github.com/johnkerl/miller/issues/252.
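
For context on the ASV and USV formats mentioned above, the Python sketch below parses separator-delimited text. It assumes the conventional separators (ASCII unit separator 0x1f and record separator 0x1e for ASV, U+241F and U+241E for USV) and a CSV-style header record; both are assumptions of this sketch, so check the Miller documentation for the exact defaults.

```python
# Conventional separators, assumed for this sketch.
ASV_FS, ASV_RS = "\x1f", "\x1e"      # ASCII unit separator, record separator
USV_FS, USV_RS = "\u241f", "\u241e"  # Unicode symbols for unit/record separator


def parse_separated(text, fs, rs):
    """Split text into records of fields, treating the first record as a header."""
    records = [chunk.split(fs) for chunk in text.split(rs) if chunk]
    header, rows = records[0], records[1:]
    return [dict(zip(header, row)) for row in rows]


# Commas and newlines inside values need no quoting with these separators.
asv_text = "name\x1fnote\x1eapple\x1fred, round\x1epear\x1fgreen\nsoft\x1e"
for record in parse_separated(asv_text, ASV_FS, ASV_RS):
    print(record)
```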

Documentation:

  • The new DKVP I/O in Python sample code now works for Python 2 as well as Python 3.

  • There is a new cookbook entry on doing multiple joins. This is in response to issue https://github.com/johnkerl/miller/issues/235.

Bugfixes:

  • The toupper, tolower, and capitalize DSL functions are now UTF-8 aware, thanks to @sheredom's marvelous https://github.com/sheredom/utf8.h. The internationalization page has also been expanded. This is in response to issue https://github.com/johnkerl/miller/issues/254.

  • https://github.com/johnkerl/miller/issues/250 fixes a bug using in-place mode in conjunction with verbs (such as rename or sort) which take field-name lists as arguments.

  • https://github.com/johnkerl/miller/issues/253 fixes a bug in the label verb when one or more names are common between old and new.

  • https://github.com/johnkerl/miller/issues/251 fixes a corner-case bug when (a) input is CSV; (b) the last field ends with a comma and no newline; (c) input is from standard input and/or --no-mmap is supplied.

Note:

Thanks to @aborruso @davidselassie @joelparkerhenderson for the bug reports and feature requests!! :)

- Go
Published by johnkerl over 6 years ago

https://github.com/johnkerl/miller - Positional indexing and other data-cleaning features

Features:

  • The new positional-indexing feature resolves https://github.com/johnkerl/miller/issues/236 from @aborruso. You can now get the name of the 3rd field of each record via $[[3]], and its value by $[[[3]]]. These are both usable on either the left-hand or right-hand side of assignment statements, so you can more easily do things like renaming fields programmatically within the DSL.

  • There is a new capitalize DSL function, complementing the already-existing toupper. This stems from https://github.com/johnkerl/miller/issues/236.

  • There is a new skip-trivial-records verb, resolving https://github.com/johnkerl/miller/issues/197. Similarly, there is a new remove-empty-columns verb, resolving https://github.com/johnkerl/miller/issues/206. Both are useful for data-cleaning use-cases.

  • Another pair is https://github.com/johnkerl/miller/issues/181 and https://github.com/johnkerl/miller/issues/256. While Miller uses mmap internally (and invisibly) to get approximately a 20% performance boost over not using it, this can cause out-of-memory issues with reading either large files, or too many small ones. Now, Miller automatically avoids mmap in these cases. You can still use --mmap or --no-mmap if you want manual control of this.

  • There is a new --ivar option for the nest verb which complements the already-existing --evar. This is from https://github.com/johnkerl/miller/pull/260 thanks to @jgreely.

  • There is a new keystroke-saving urandrange DSL function: urandrange(low, high) is the same as low + (high - low) * urand(). This arose from https://github.com/johnkerl/miller/issues/243.

  • There is a new -v option for the cat verb which writes a low-level record-structure dump to standard error.

  • There is a new -N option for mlr which is a keystroke-saver for --implicit-csv-header --headerless-csv-output.
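
The positional-indexing feature in the list above can be pictured with an ordered record. This Python sketch models a record as a dict (which preserves insertion order) and uses hypothetical helper names; it illustrates the idea of name-by-position, value-by-position, and rename-by-position, not Miller's internals.

```python
def _check(record, position):
    """Positions count from 1, as in the $[[3]] example."""
    if not 1 <= position <= len(record):
        raise IndexError(f"position {position} out of range")
    return position - 1


def field_name_at(record, position):
    """Name of the field at a position, like $[[3]]."""
    return list(record.keys())[_check(record, position)]


def field_value_at(record, position):
    """Value of the field at a position, like $[[[3]]]."""
    return list(record.values())[_check(record, position)]


def rename_at(record, position, new_name):
    """Return a copy with the field at the position renamed, keeping field order."""
    old_name = field_name_at(record, position)
    return {(new_name if key == old_name else key): value for key, value in record.items()}


record = {"a": "pan", "b": "wye", "x": 0.34}
print(field_name_at(record, 3))   # x
print(field_value_at(record, 3))  # 0.34
print(rename_at(record, 3, "y"))  # {'a': 'pan', 'b': 'wye', 'y': 0.34}
```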

Documentation:

  • The new FAQ entry http://johnkerl.org/miller/doc/faq.html#Howtoescape'%3F'in_regexes%3F resolves https://github.com/johnkerl/miller/issues/203.

  • The new FAQ entry http://johnkerl.org/miller/doc/faq.html#HowcanIfilterby_date%3F resolves https://github.com/johnkerl/miller/issues/208.

  • https://github.com/johnkerl/miller/issues/244 fixes a documentation issue while highlighting the need for https://github.com/johnkerl/miller/issues/241.

Bugfixes:

  • There was a SEGV using nest within then-chains, fixed in response to https://github.com/johnkerl/miller/issues/220.

  • Quotes and backslashes weren't being escaped in JSON output with --jvquoteall; reported on https://github.com/johnkerl/miller/issues/222.

An extra thank-you:

I've never code-named releases but if I were to code-name 5.5.0 I would call it "aborruso". Andrea has contributed many fantastic feature requests, as well as driving a huge volume of Miller-related discussions in StackExchange (https://github.com/johnkerl/miller/issues/212). Mille grazie al mio amico @aborruso!

- Go
Published by johnkerl over 6 years ago

https://github.com/johnkerl/miller - New data-cleaning features, Windows mlr.exe, limited localtime support, and bugfixes

Features:

  • The new clean-whitespace verb resolves https://github.com/johnkerl/miller/issues/190 from @aborruso. Along with the new functions strip, lstrip, rstrip, collapse_whitespace, and clean_whitespace, there is now both coarse-grained and fine-grained control over whitespace within field names and/or values. See the linked-to documentation for examples.

  • The new altkv verb resolves https://github.com/johnkerl/miller/issues/184 which was originally opened via an email request. This supports mapping value-lists such as a,b,c,d to alternating key-value pairs such as a=b,c=d.

  • The new fill-down verb resolves https://github.com/johnkerl/miller/issues/189 by @aborruso. See the linked-to documentation for examples.

  • The uniq verb now has a uniq -a which resolves https://github.com/johnkerl/miller/issues/168 from @sjackman.

  • The new regextract and regextractorelse functions resolve https://github.com/johnkerl/miller/issues/183 by @aborruso.

  • The new ssub function arises from https://github.com/johnkerl/miller/issues/171 by @dohse, as a simplified way to avoid escaping characters which are special to regular-expression parsers.

  • There are new localtime functions in response to https://github.com/johnkerl/miller/issues/170 by @sitaramc. However note that as discussed on https://github.com/johnkerl/miller/issues/170 these do not undo one another in all circumstances. This is a non-issue for timezones which do not do DST. Otherwise, please use with disclaimers: localdate, localtime2sec, sec2localdate, sec2localtime, strftime_local, and strptime_local.
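
The altkv mapping described in the list above, from a,b,c,d to a=b,c=d, amounts to pairing alternate items. A minimal Python sketch follows; the function is a stand-in for illustration, and since the text above does not say what happens with an odd number of values, the sketch simply refuses that case.

```python
def altkv(values):
    """Pair up an even-length list: ['a', 'b', 'c', 'd'] -> {'a': 'b', 'c': 'd'}."""
    if len(values) % 2 != 0:
        # Handling of a leftover item is not described above, so refuse it here.
        raise ValueError("expected an even number of values")
    # Even-indexed items become keys; odd-indexed items become their values.
    return dict(zip(values[0::2], values[1::2]))


pairs = altkv("a,b,c,d".split(","))
print(",".join(f"{key}={value}" for key, value in pairs.items()))  # a=b,c=d
```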

Builds:

  • Windows build-artifacts are now available in Appveyor at https://ci.appveyor.com/project/johnkerl/miller/build/artifacts, and will be attached to this and future releases. This resolves https://github.com/johnkerl/miller/issues/167, https://github.com/johnkerl/miller/issues/148, and https://github.com/johnkerl/miller/issues/109.

  • Travis builds at https://travis-ci.org/johnkerl/miller/builds now run on OSX as well as Linux.

  • An Ubuntu 17 build issue was fixed by @singalen on https://github.com/johnkerl/miller/issues/164.

Documentation:

  • put/filter documentation was confusing as reported by @NikosAlexandris on https://github.com/johnkerl/miller/issues/169.

  • The new FAQ entry http://johnkerl.org/miller-releases/miller-head/doc/faq.html#Howtorectangularizeafterjoinswithunpaired? resolves https://github.com/johnkerl/miller/issues/193 by @aborruso.

  • The new cookbook entry http://johnkerl.org/miller/doc/cookbook.html#Optionsfordealingwithduplicate_rows arises from https://github.com/johnkerl/miller/issues/168 from @sjackman.

  • The unsparsify documentation had some words missing as reported by @tst2005 on https://github.com/johnkerl/miller/issues/194.

  • There was a typo in the cookbook page http://johnkerl.org/miller/doc/cookbook.html#Fullfieldrenamesandreassigns as fixed by @tst2005 in https://github.com/johnkerl/miller/pull/192.

Bugfixes:

  • There was a memory leak for TSV-format files only as reported by @treynr on https://github.com/johnkerl/miller/issues/181.

  • Dollar signs in regular expressions were not being escaped properly, as reported by @dohse on https://github.com/johnkerl/miller/issues/171.

- Go
Published by johnkerl over 7 years ago

https://github.com/johnkerl/miller - Data comments, documentation improvements, and bug fixes

Features:

  • Comment strings in data files: mlr --skip-comments allows you to filter out input lines starting with #, for all file formats. Likewise, mlr --skip-comments-with X lets you specify the comment-string X. Comments are only supported at start of data line. mlr --pass-comments and mlr --pass-comments-with X allow you to forward comments to program output as they are read.

  • The count-similar verb lets you compute cluster sizes by cluster labels.

  • While Miller DSL arithmetic gracefully overflows from 64-bit integer to double-precision float (see also here), there are now the integer-preserving arithmetic operators .+ .- .* ./ .// for those times when you want integer overflow.

  • There is a new bitcount function: for example, echo x=0xf0000206 | mlr put '$y=bitcount($x)' produces x=0xf0000206,y=7.

  • Issue 158: mlr -T is an alias for --nidx --fs tab, and mlr -t is an alias for mlr --tsvlite.

  • The mathematical constants π and e have been renamed from PI and E to M_PI and M_E, respectively. (It's annoying to get a syntax error when you try to define a variable named E in the DSL, when A through D work just fine.) This is a backward incompatibility, but not enough of one to justify calling this release Miller 6.0.0.
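As a cross-check on the bitcount example above, the following Python sketch counts the set bits of the same input. The helper is a stand-in written for this note, not Miller's code.

```python
def bitcount(n):
    """Count the 1-bits in a non-negative integer."""
    return bin(n).count("1")


x = 0xF0000206
# 0xf has four set bits, 0x2 has one, 0x6 has two: 4 + 1 + 2 = 7.
print(f"x={x:#x},y={bitcount(x)}")  # x=0xf0000206,y=7
```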

Documentation:

  • As noted here, while Miller has its own DSL there will always be things better expressible in a general-purpose language. The new page Sharing data with other languages shows how to seamlessly share data back and forth between Miller, Ruby, and Python. SQL-input examples and SQL-output examples contain detailed information on the interplay between Miller and SQL.

  • Issue 150 raised a question about suppressing numeric conversion. This resulted in a new FAQ entry How do I suppress numeric conversion?, as well as the longer-term follow-on issue 151 which will make numeric conversion happen on a just-in-time basis.

  • To my surprise, csvlite format options weren’t listed in mlr --help or the manpage. This has been fixed.

  • Documentation for auxiliary commands has been expanded, including within the manpage.

Bugfixes:

  • Issue 159 fixes regex-match of literal dot.

  • Issue 160 fixes out-of-memory cases for huge files. This is an old bug, as old as Miller, and is due to inadequate testing of huge-file cases. The problem is simple: Miller prefers memory-mapped I/O (using mmap) over stdio since mmap is fractionally faster. Yet as any processing (even mlr cat) steps through an input file, more and more pages are faulted in -- and, unfortunately, previous pages are not paged out once memory pressure increases. (This despite gallant attempts with madvise.) Once all processing is done, the memory is released; there is no leak per se. But the Miller process can crash before the entire file is read. The solution is equally simple: to prefer stdio over mmap for files over 4GB in size. (This 4GB threshold is tunable via the --mmap-below flag as described in the manpage.)

  • Issue 161 fixes a CSV-parse error (with error message "unwrapped double quote at line 0") when a CSV file starts with the UTF-8 byte-order-mark ("BOM") sequence 0xef 0xbb 0xbf and the header line has double-quoted fields. (Release 5.2.0 introduced handling for UTF-8 BOMs, but missed the case of double-quoted header line.)

  • Issue 162 fixes a corner case doing multi-emit of aggregate variables when the first variable name is a typo.

  • The Miller JSON parser used to error with Unable to parse JSON data: Line 1 column 0: Unexpected 0x00 when seeking value on empty input, or input with trailing whitespace; this has been fixed.
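The BOM case from issue 161 above can be sketched in Python as follows: drop a leading 0xef 0xbb 0xbf before handing the data to a CSV parser, so that a double-quoted header field is seen at the true start of the line. The function name is hypothetical and the standard-library csv module stands in for Miller's own reader.

```python
import codecs
import csv
import io


def read_csv_skipping_bom(raw: bytes):
    """Parse CSV bytes, dropping a leading UTF-8 BOM (0xef 0xbb 0xbf) if present."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    text = raw.decode("utf-8")
    return [dict(row) for row in csv.DictReader(io.StringIO(text, newline=""))]


# Header line with double-quoted fields, preceded by a BOM.
data = b'\xef\xbb\xbf"a","b"\n1,2\n'
print(read_csv_skipping_bom(data))  # [{'a': '1', 'b': '2'}]
```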

There is no prebuilt Windows executable for this release; my apologies.

- Go
Published by johnkerl about 8 years ago

https://github.com/johnkerl/miller - Bug-fix release: 64-bit aggregators

This bugfix release delivers a fix for https://github.com/johnkerl/miller/issues/147 where a memory allocation failed beyond 4GB.

Documents are the same as for 5.2.0.

- Go
Published by johnkerl over 8 years ago

https://github.com/johnkerl/miller - Fix non-x86/gcc7 build error

This bugfix release addresses https://github.com/johnkerl/miller/issues/142.

I'm not attaching prebuilt binaries beyond those already in https://github.com/johnkerl/miller/releases/tag/v5.2.0 since the binaries there are fine for their respective architectures.

This unblocks Miller on openSUSE.

- Go
Published by johnkerl over 8 years ago

https://github.com/johnkerl/miller - stats across regexed field names, string/num stats, CSV UTF BOM strip

This release contains mostly feature requests.

Features:

  • The stats1 verb now lets you use regular expressions to specify which field names to compute statistics on, and/or which to group by. Full details are here.

  • The min and max DSL functions, and the min/max/percentile aggregators for the stats1 and merge-fields verbs, now support numeric as well as string field values. (For mixed string/numeric fields, numbers compare before strings.) This means in particular that order statistics -- min, max, and non-interpolated percentiles -- as well as mode, antimode, and count are now possible on string-only (or mixed) fields. (Of course, any operations requiring arithmetic on values, such as computing sums, averages, or interpolated percentiles, yield an error on string-valued input.)

  • There is a new DSL function mapexcept which returns a copy of the argument with specified key(s), if any, unset. The motivating use-case is to split records to multiple filenames depending on particular field value, which is omitted from the output: mlr --from f.dat put 'tee > "/tmp/data-".$a, mapexcept($*, "a")'. Likewise, mapselect returns a copy of the argument with only specified key(s), if any, set. This resolves https://github.com/johnkerl/miller/issues/137.

  • A new -u option for count-distinct allows unlashed counts for multiple field names. For example, with -f a,b and without -u, count-distinct computes counts for distinct pairs of a and b field values. With -f a,b and with -u, it computes counts for distinct a field values and counts for distinct b field values separately.

  • If you build from source, you can now do ./configure without first doing autoreconf -fiv. This resolves https://github.com/johnkerl/miller/issues/131.

  • The UTF-8 BOM sequence 0xef 0xbb 0xbf is now automatically ignored from the start of CSV files. (The same is already done for JSON files.) This resolves https://github.com/johnkerl/miller/issues/138.

  • For put and filter with -S, program literals such as the 6 in $x = 6 were being parsed as strings. This is not sensible, since the -S option for put and filter is intended to suppress numeric conversion of record data, not program literals. To get string 6 one may use $x = "6".
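The difference between lashed and unlashed counts for count-distinct, described above, can be sketched in Python. The sample records and function names are made up for illustration; this is not how Miller implements the verb.

```python
from collections import Counter

# Hypothetical sample records.
records = [
    {"a": "pan", "b": "x"},
    {"a": "pan", "b": "y"},
    {"a": "eks", "b": "x"},
    {"a": "pan", "b": "x"},
]


def lashed_counts(records, fields):
    """Count distinct tuples of the given fields (like -f a,b without -u)."""
    return Counter(tuple(r[f] for f in fields) for r in records)


def unlashed_counts(records, fields):
    """Count distinct values of each field separately (like -f a,b with -u)."""
    return {f: Counter(r[f] for r in records) for f in fields}


print(lashed_counts(records, ["a", "b"]))
print(unlashed_counts(records, ["a", "b"]))
```

With the lashed form, ("pan", "x") is counted as a pair; with the unlashed form, "pan" and "x" are tallied independently per field.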

Documentation:

Bugfixes:

  • CRLF line-endings were not being correctly autodetected when I/O formats were specified using --c2j et al.

  • Integer division by zero was causing a fatal runtime exception, rather than computing inf or nan as in the floating-point case.

Binaries:

As below. Additionally, the MacOSX version is available in Homebrew. For Windows, you need the .exe file along with both .dll files, with instructions as in https://github.com/johnkerl/miller/releases/tag/v5.1.0w.

- Go
Published by johnkerl over 8 years ago

https://github.com/johnkerl/miller - MLR.EXE: Windows beta

I'm happy to announce a Windows port of Miller. Features in this 5.1.0w release are identical to 5.1.0; the only delivery here is an executable compiled for 64-bit Windows.

Details are here.

One of the reasons I'm calling this a beta is that at present you need two DLLs in addition to the mlr.exe executable attached below. All three need to be somewhere in your Windows PATH.

For example, you can do

C:\> mkdir \mbin

Then place libpcreposix-0.dll, libpcre-1.dll, and mlr.exe all into C:\mbin. Then

C:\> set PATH=%PATH%;\mbin

The Windows port is still beta: please open an issue at https://github.com/johnkerl/miller/issues if you encounter any problems.

Update a few hours later: Due to simple fat-fingering on my part, one of the files was misnamed. The binaries have been reattached correctly.

Information about the binaries:

```
FILE SIZES
4,379,627 mlr.exe
  281,871 libpcre-1.dll
   44,554 libpcreposix-0.dll

FILE MD5SUMS
e46a2bfcda001f3698eee4f09409fc04 *mlr.exe
003b71bce60e63d745bac45740c277f8 *libpcre-1.dll
d5920106bdbccf736fd8c459959fabbe *libpcreposix-0.dll
```

- Go
Published by johnkerl almost 9 years ago

https://github.com/johnkerl/miller - JSON-array support, fractional seconds in strptime/strftime, and other minor features

This is a relatively minor release of Miller, containing feature requests and bugfixes while I've been working on the Windows port (which is nearly complete).

Features:

  • JSON arrays: as described here, Miller being a tabular data processor isn't well-positioned to handle arbitrary JSON. (See jq for that.) But as of 5.1.0, arrays are converted to maps with integer keys, which are then at least processable using Miller. Details are here. The short of it is that you now have three options for the main mlr executable:

    --json-map-arrays-on-input    Convert JSON array indices to Miller map keys. (This is the default.)
    --json-skip-arrays-on-input   Disregard JSON arrays.
    --json-fatal-arrays-on-input  Raise a fatal error when JSON arrays are encountered in the input.

This resolves https://github.com/johnkerl/miller/issues/133.

  • The new mlr fraction verb makes possible in a few keystrokes what was only possible before using two-pass DSL logic: here you can turn numerical values down a column into their fractional/percentage contribution to column totals, optionally grouped by other key columns.

  • The DSL functions strptime and strftime now handle fractional seconds. For parsing, use %S format as always; for formatting, there are now %1S through %9S which allow you to configure a specified number of decimal places. The return value from strptime is now floating-point, not integer, which is a minor backward incompatibility not worth labeling this release as 6.0.0. (You can work around this using int(strptime(...)).) The DSL functions gmt2sec and sec2gmt, which are keystroke-savers for strptime and strftime, are similarly modified, as is the sec2gmt verb. This resolves https://github.com/johnkerl/miller/issues/125.

  • A few nearly-standalone programs -- which do not have anything to do with record streams -- are packaged within Miller. (For example, hex-dump, unhex, and show-line-endings commands.) These are described here.

  • The stats1 and merge-fields verbs now support an antimode aggregator, in addition to the existing mode aggregator.

  • The join verb now by default does not require sorted input, which is the more common use case. (Memory-parsimonious joins which require sorted input, while no longer the default, are available using -s.) This is another minor backward incompatibility not worth making a 6.0.0 over. This resolves https://github.com/johnkerl/miller/issues/134.

  • mlr nest has a keystroke-saving --evar option for a common use case, namely, exploding a field by value across records.
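The two-pass logic that the fraction verb replaces, as described above, looks roughly like this in Python: one pass to accumulate column totals (optionally per group), a second to divide each value by its total. The function signature and the `<field>_fraction` output name are assumptions of this sketch, and a zero total is not handled.

```python
from collections import defaultdict


def fraction(records, field, group_by=None):
    """Add '<field>_fraction': each value divided by its (optionally grouped) column total."""
    # Pass 1: accumulate totals, per group if a group-by field is given.
    totals = defaultdict(float)
    for r in records:
        key = r[group_by] if group_by else None
        totals[key] += r[field]
    # Pass 2: emit each record with its share of the total.
    out = []
    for r in records:
        key = r[group_by] if group_by else None
        out.append({**r, f"{field}_fraction": r[field] / totals[key]})
    return out


rows = [{"g": "a", "x": 1}, {"g": "a", "x": 3}, {"g": "b", "x": 5}]
for row in fraction(rows, "x", group_by="g"):
    print(row)
```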

Documentation:

Bugfixes:

  • mlr join -j -l was not functioning correctly. This resolves https://github.com/johnkerl/miller/issues/136.

  • JSON escapes on output (\t and so on) were incorrect. This resolves https://github.com/johnkerl/miller/issues/135.

- Go
Published by johnkerl almost 9 years ago

https://github.com/johnkerl/miller - Two minor bugfixes

  1. As described in https://github.com/johnkerl/miller/issues/132, mlr nest was incorrectly splitting fields with multi-character separators.

  2. The XTAB-format reader, when using multi-character IPS, was incorrectly splitting key-value pairs, but only when reading from standard input (e.g. on a pipe or less-than redirect).

- Go
Published by johnkerl almost 9 years ago

https://github.com/johnkerl/miller - Autodetected line-endings, in-place mode, user-defined functions, and more

This major release significantly expands the expressiveness of the DSL for mlr put and mlr filter. (The upcoming 5.1.0 release will add the ability to aggregate across all columns for non-DSL verbs such as mlr stats1 and mlr stats2. As well, a Windows port is underway.)

Please also see the Miller main docs.

Simple but impactful features:

  • Line endings (CRLF vs. LF, Windows-style vs. Unix-style) are now autodetected. For example, files (including CSV) with LF input will lead to LF output unless you specify otherwise.

  • There is now an in-place mode using mlr -I.

Major DSL features:

  • You can now define your own functions and subroutines: e.g. func f(x, y) { return x**2 + y**2 }.

  • New local variables are completely analogous to out-of-stream variables: sum retains its value for the duration of the expression it's defined in; @sum retains its value across all records in the record stream.

  • Local variables, function parameters, and function return types may be defined untyped or typed as in x = 1 or int x = 1, respectively. There are also expression-inline type-assertions available. Type-checking is up to you: omit it if you want flexibility with heterogeneous data; use it if you want to help catch misspellings in your DSL code or unexpected irregularities in your input data.

  • There are now four kinds of maps. Out-of-stream variables have always been scalars, maps, or multi-level maps: @a=1, @b[1]=2, @c[1][2]=3. The same is now true for local variables, which are new to 5.0.0. Stream records have always been single-level maps; $* is a map. And as of 5.0.0 there are now map literals, e.g. {"a":1, "b":2}, which can be defined using JSON-like syntax (with either string or integer keys) and which can be nested arbitrarily deeply.

  • You can loop over maps -- $*, out-of-stream variables, local variables, map-literals, and map-valued function return values -- using for (k, v in ...) or the new for (k in ...) (discussed next). All flavors of map may also be used in emit and dump statements.

  • User-defined functions and subroutines may take map-valued arguments, and may return map values.

  • Some built-in functions now accept map-valued input: typeof, length, depth, leafcount, haskey. There are built-in functions producing map-valued output: mapsum and mapdiff. There are now string-to-map and map-to-string functions: splitnv, splitkv, splitnvx, splitkvx, joink, joinv, and joinkv.

Minor DSL features:

  • For iterating over maps (namely, local variables, out-of-stream variables, stream records, map literals, or return values from map-valued functions) there is now a key-only for-loop syntax: e.g. for (k in $*) { ... }. This is in addition to the already-existing for (k, v in ...) syntax.

  • There are now triple-statement for-loops (familiar from many other languages), e.g. for (int i = 0; i < 10; i += 1) { ... }.

  • mlr put and mlr filter now accept multiple -f for script files, freely intermixable with -e for expressions. The suggested use case is putting user-defined functions in script files and one-liners calling them using -e. Example: myfuncs.mlr defines the function f(...), then mlr put -f myfuncs.mlr -e '$o = f($i)' myfile.dat. More information is here.

  • mlr filter is now almost identical to mlr put: it can have multiple statements, it can use begin and/or end blocks, it can define and invoke functions. Its final expression must evaluate to boolean which is used as the filter criterion. More details are here.

  • The min and max functions are now variadic: $o = max($a, $b, $c).

  • There is now a substr function.

  • While ENV has long provided read-access to environment variables on the right-hand side of assignments (as a getenv), it now can be at the left-hand side of assignments (as a putenv). This is useful for subsidiary processes created by tee, emit, dump, or print when writing to a pipe.

  • The # in comments is now handled in the lexer, so you can now (correctly) include # in strings.

  • Separators are now available as read-only variables in the DSL: IPS, IFS, IRS, OPS, OFS, ORS. These are particularly useful with the split and join functions: e.g. with mlr --ifs tab ..., the IFS variable within a DSL expression will evaluate to a string containing a tab character.

  • Syntax errors in DSL expressions now have a little more context.

  • DSL parsing and execution are a bit more transparent. There have long been -v and -t options to mlr put and mlr filter, which print the expression's abstract syntax tree and do a low-level parser trace, respectively. There are now additionally -a which traces stack-variable allocation and -T which traces statements line by line as they execute. While -v, -t, and -a are most useful for development of Miller, the -T option gives you more visibility into what your Miller scripts are doing. See also here.

Verbs:

  • most-frequent and least-frequent as requested in https://github.com/johnkerl/miller/issues/110.

  • seqgen makes it easy to generate data from within Miller: please also see here for a usage example.

  • unsparsify makes it easy to rectangularize data where not all records have the same fields.

  • cat -n now takes a group-by (-g) option, making it easy to number records within categories.

  • count-distinct, uniq, most-frequent, least-frequent, top, and histogram now take a -o option for specifying their output field names, as requested in https://github.com/johnkerl/miller/issues/122.

  • Median is now a synonym for p50 in stats1.

  • You can now start a then chain with an initial then, which is nice in backslashy/multiline-continuation contexts. This was requested in https://github.com/johnkerl/miller/issues/130.

I/O options:

  • The print statement may now be used with no arguments, which prints a newline, and a no-argument printn prints nothing but creates a zero-length file in redirected-output context.

  • Pretty-print format now has a --pprint --barred option (for output only, not input). For an example, please see here.

  • There are now keystroke-savers of the form --c2p which abbreviate --icsvlite --opprint, and so on.

  • Miller's map literals are JSON-looking but allow integer keys which JSON doesn't. The --jknquoteint and --jvquoteall flags for mlr (when using JSON output) and mlr put (for dump) provide control over double-quoting behavior.

Documents new since the previous release:

  • Miller in 10 minutes is a long-overdue addition: while Miller's detailed documentation is evident, there has been a lack of more succinct examples.

  • The cookbook has likewise been expanded, and has been split out into three parts: part 1, part 2, part 3.

  • A bit more background on C performance compared to other languages I experimented with, early on in the development of Miller, is here.

On-line help: - Help for DSL built-in functions, DSL keywords, and verbs is accessible using mlr -f, mlr -k, and mlr -l respectively; name-only lists are available with mlr -F, mlr -K, and mlr -L.

Bugfixes: - A corner-case bug causing a segmentation violation on two sub/gsub statements within a single put, the first one matching its pattern and the second one not matching its pattern, has been fixed.

Backward incompatibilities: This is Miller 5.0.0, not 4.6.0, due to the following (all relatively minor):

  • The v variables bound in for-loops such as for (k, v in some_multi_level_map) { ... } can now be map-valued if the v specifies a non-terminal in the map.

  • There are new keywords such as var, int, float, num, str, bool, map, IPS, IFS, IRS, OPS, OFS, ORS which can no longer be used as variable names. See mlr -k for the complete list.

  • Unset of the last key in a map-valued variable's map level no longer removes the level: e.g. with @v[1][2]=3 and unset @v[1][2], the @v variable would previously be empty. As of 5.0.0, @v has key 1 with an empty-map value.

  • There is no longer type-inference on literals: "3"+4 no longer gives 7. (That was never a good idea.)

  • The typeof function used to say things like MT_STRING; now it says things like string.
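The change in unset behavior listed above can be pictured with nested Python dicts standing in for a multi-level out-of-stream variable. This is only an analogy: the prune_empty helper is hypothetical and emulates the pre-5.0.0 outcome described in the text.

```python
# Analogue of @v after @v[1][2]=3.
v = {1: {2: 3}}

# Analogue of: unset @v[1][2]
del v[1][2]

# 5.0.0 behavior as described above: key 1 remains, with an empty-map value.
print(v)  # {1: {}}


def prune_empty(m):
    """Emulate the pre-5.0.0 outcome by dropping map levels that became empty."""
    pruned = {}
    for k, x in m.items():
        if isinstance(x, dict):
            x = prune_empty(x)
            if not x:
                continue  # drop a level with nothing left in it
        pruned[k] = x
    return pruned


print(prune_empty(v))  # {}
```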

Homebrew request pending: https://github.com/Homebrew/homebrew-core/pull/10426

- Go
Published by johnkerl almost 9 years ago

https://github.com/johnkerl/miller - Customizable output format for redirected output

In a natural follow-on to the 4.4.0 redirected-output feature, the 4.5.0 release allows your tap-files to be in a different output format from the main program output.

For example, using

mlr --icsv --opprint ... then put --ojson 'tee > "mytap-".$a.".dat", $*' then ...

the input is CSV, the output is pretty-print tabular, but the tee-files output is written in JSON format. Likewise --ofs, --ors, --ops, --jvstack, and all other output-formatting options from the main help at mlr -h and/or man mlr default to the main command-line options, and may be overridden with flags supplied to mlr put and mlr tee.

Documentation: http://johnkerl.org/miller/doc/reference.html#Redirected-outputstatementsfor_put

Brew update: https://github.com/Homebrew/homebrew-core/pull/4098

- Go
Published by johnkerl over 9 years ago

https://github.com/johnkerl/miller - Redirected output, row-value shift, and other features

The principal feature of Miller 4.4.0 is redirected output. Inspired by awk, Miller lets you tap/tee your data as it's processed, run output through subordinate processes such as gzip and jq, split a single file into multiple files per an account-ID column, and so on.

Details: http://johnkerl.org/miller/doc/reference.html#Redirected-outputstatementsfor_put

Other features:

  • mlr step -a shift allows you to place the previous record's values alongside the current record's values: http://johnkerl.org/miller/doc/reference.html#step

  • mlr head, when used without the group-by flag (-g), stops after the specified number of records has been output. For example, even with a multi-gigabyte data file, mlr head -n 10 hugefile.dat will complete quickly after producing the first ten records from the file.

  • The sec2gmtdate verb, and sec2gmtdate function for filter/put, is new: please see http://johnkerl.org/miller/doc/reference.html#sec2gmtdate and http://johnkerl.org/miller/doc/reference.html#Functionsforfilterandput.

  • sec2gmt and sec2gmtdate both leave non-numbers as-is, rather than formatting them as (error). This is particularly relevant for formatting nullable epoch-seconds columns in SQL-table output: if a column value is NULL then after sec2gmt or sec2gmtdate it will still be NULL.

  • The dot operator has been universalized to work with any data type and produce a string. For example, if the field n has integers, then instead of typing mlr put '$name = "value:".string($n)' you can now simply do mlr put '$name = "value:".$n'. This is particularly timely for creating filenames for redirected print/dump/tee/emit output.

  • The online documents now have a copy of the Miller manpage: http://johnkerl.org/miller/doc/manpage.html

  • Bugfix: inside filter/put, $x=="" was distinct from isempty($x). This was nonsensical; now both are the same.

Brew update: https://github.com/Homebrew/homebrew-core/pull/3820

- Go
Published by johnkerl over 9 years ago

https://github.com/johnkerl/miller - Interpolated percentiles, markdown-tabular output format, CSV-quote preservation

Major features:

  • Interpolated percentiles are now available using mlr stats1 -i or mlr merge-fields -i. Non-interpolated percentiles are the default. The former resemble R's type=7 quantiles and the latter resemble R's type=1 quantiles. See also http://johnkerl.org/miller/doc/reference.html#stats1 and http://johnkerl.org/miller/doc/reference.html#merge-fields.

  • Markdown-tabular output format is now available using --omd: please see http://johnkerl.org/miller/doc/file-formats.html#Markdown_tabular and https://github.com/johnkerl/miller/issues/106.

  • For files using CSV input as well as CSV output, there is now a --quote-original option which outputs fields with quotes if they had them on input. The was-quoted flag isn't tracked on derived fields, e.g. if fields a and b were quoted on input, then in mlr put '$c = $a . $b' the c field won't be quoted on output. As such, this option is most useful with mlr cut, mlr filter, etc. The use-case from the original feature request https://github.com/johnkerl/miller/issues/77#issuecomment-226640596 is in trimming down a huge CSV file in order to facilitate subsequent in-memory processing using spreadsheet software.

  • The cookbook at http://johnkerl.org/miller/doc/cookbook.html has been extended significantly.
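For readers unfamiliar with the two R quantile types mentioned above, here is a minimal Python sketch of each, following R's definitions rather than Miller's source (the notes say only that Miller's percentiles resemble them). Both functions assume non-empty input and 0 <= p <= 1.

```python
import math


def quantile_type1(values, p):
    """Non-interpolated quantile (R type=1): inverse of the empirical CDF."""
    xs = sorted(values)
    k = max(1, math.ceil(len(xs) * p))  # 1-based rank
    return xs[k - 1]


def quantile_type7(values, p):
    """Linearly interpolated quantile (R type=7)."""
    xs = sorted(values)
    h = (len(xs) - 1) * p            # fractional 0-based position
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


data = [10, 20, 30, 40]
print(quantile_type1(data, 0.5))  # 20
print(quantile_type7(data, 0.5))  # 25.0
```

The non-interpolated form always returns a value present in the data; the interpolated form may return a value between two data points.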

Minor features:

  • You can now set a MLR_CSV_DEFAULT_RS=lf environment variable if you're tired of always putting --rs lf arguments for your CSV files: http://johnkerl.org/miller/doc/file-formats.html#CSV/TSV/etc.

  • The printn and eprintn commands for mlr put are identical to print and eprint except they don't print final newlines.

  • It is now an error if boundvars in the same for-loop expression have duplicate names, e.g. for (a,a in $*) {...} results in the error message mlr: duplicate for-loop boundvars "a" and "a".

  • The strptime function would announce an internal coding error on malformed format strings; now, it correctly points out the user-level error.

Bug fixes:

  • Percentiles in merge-fields were not working. This was fixed; also, the lacking unit-test cases which would have caught this sooner have been filled in.

  • Miller's CSV output-quoting was non-RFC-compliant: double-quotes within field names were not being duplicated. This has been fixed (https://github.com/johnkerl/miller/issues/104).
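The RFC-compliant quoting referred to above means that a double quote inside a quoted field is written twice. Python's standard csv module does this by default, so it serves here as a reference illustration; the helper name is hypothetical and this is not Miller's writer.

```python
import csv
import io


def to_csv_line(fields):
    """Serialize one row, doubling embedded double quotes as RFC 4180 specifies."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue()


print(to_csv_line(['say "hi"', "plain"]), end="")  # "say ""hi""",plain
```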

Brew update: https://github.com/Homebrew/homebrew-core/pull/2698

- Go
Published by johnkerl over 9 years ago

https://github.com/johnkerl/miller - Multi-emit

You can now emit multiple out-of-stream variables side-by-side.

Doc link: http://johnkerl.org/miller/doc/reference.html#Multi-emitstatementsfor_put

Example:

$ mlr --from data/medium --opprint put -q '
  @x_count[$a][$b] += 1;
  @x_sum[$a][$b] += $x;
  end {
    for ((a, b), _ in @x_count) {
      @x_mean[a][b] = @x_sum[a][b] / @x_count[a][b]
    }
    emit (@x_sum, @x_count, @x_mean), "a", "b"
  }
'
a   b   x_sum      x_count x_mean
pan pan 219.185129 427     0.513314
pan wye 198.432931 395     0.502362
pan eks 216.075228 429     0.503672
pan hat 205.222776 417     0.492141
pan zee 205.097518 413     0.496604
eks pan 179.963030 371     0.485076
eks wye 196.945286 407     0.483895
eks zee 176.880365 357     0.495463
eks eks 215.916097 413     0.522799
eks hat 208.783171 417     0.500679
wye wye 185.295850 377     0.491501
wye pan 195.847900 392     0.499612
wye hat 212.033183 426     0.497730
wye zee 194.774048 385     0.505907
wye eks 204.812961 386     0.530604
zee pan 202.213804 389     0.519830
zee wye 233.991394 455     0.514267
zee eks 190.961778 391     0.488393
zee zee 206.640635 403     0.512756
zee hat 191.300006 409     0.467726
hat wye 208.883010 423     0.493813
hat zee 196.349450 385     0.509999
hat eks 189.006793 389     0.485879
hat hat 182.853532 381     0.479931
hat pan 168.553807 363     0.464336

Note that this example simply recapitulates the easier-to-type

mlr --from ../data/medium --opprint stats1 -a sum,count,mean -f x -g a,b
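For comparison, the same accumulate-then-emit pattern (per-group sum and count, with the mean derived at the end) can be sketched in Python. The function name and the three sample records are made up for illustration; they are not taken from data/medium.

```python
from collections import defaultdict


def sum_count_mean(records, field, group_by):
    """Accumulate sum and count per group, then derive the mean at the end."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in records:
        key = tuple(r[g] for g in group_by)
        sums[key] += r[field]
        counts[key] += 1
    # One output record per group, with the three statistics side by side.
    return [
        {
            **dict(zip(group_by, key)),
            f"{field}_sum": sums[key],
            f"{field}_count": counts[key],
            f"{field}_mean": sums[key] / counts[key],
        }
        for key in sums
    ]


# Hypothetical sample records.
sample = [
    {"a": "pan", "b": "pan", "x": 0.5},
    {"a": "pan", "b": "pan", "x": 0.25},
    {"a": "eks", "b": "wye", "x": 1.0},
]
for row in sum_count_mean(sample, "x", ["a", "b"]):
    print(row)
```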

Brew update: https://github.com/Homebrew/homebrew-core/pull/2213

- Go
Published by johnkerl over 9 years ago

https://github.com/johnkerl/miller - for/if/while and various features

While one of Miller’s strengths is its brevity, and so its domain-specific language is intentionally simple, the ability to loop over field names is a basic thing to want. Likewise for other control structures on the same complexity level as awk. Miller has always owed much inspiration to awk; 4.1.0 makes this more explicit by providing several common language idioms.

Major features:

  • For-loops over key-value pairs in stream records and out-of-stream variables

  • Loops using while and do while

  • break and continue in for, while, and do while loops

  • If-elif-else statements

  • Nestability of all the above, as well as of existing pattern-action blocks

Additional features:

  • Computable field names using square brackets, e.g. $[$a.$b] = $a * $b

  • Type-predicate functions: isnumeric, isint, isfloat, isbool, isstring

  • Commenting using pound signs

  • The new print and eprint allow formatting of arbitrary expressions to stdout/stderr, respectively

  • In addition to the existing dump which formats all out-of-stream variables to stdout as JSON, the new edump does the same to stderr

  • Semicolon is no longer required after closing curly brace

  • emit @* and unset @* are new synonyms for emit all and unset all

  • unset $* now exists

  • mlr -n is synonymous with mlr --from /dev/null, which is useful in dataless contexts wherein all your put statements are contained within begin/end blocks

  • Bugfix: in 4.0.0, mlr put -v '@a[1][2]=$b;$new=@a[1][2]' mydata.tbl would crash with a memory-management error.

Syntax example:

% mlr --from estimates.tbl put '
  for (k,v in $*) {
    if (isnumeric(v) && k =~ "^[t-z].*$") {
      $sum += v;
      $count += 1
    }
  }
  $mean = $sum / $count # no assignment if count unset
'
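For readers more at home in a general-purpose language, the loop above corresponds roughly to the following Python sketch. It assumes records are dicts whose numeric fields already hold int or float values (Miller infers types from the data itself), and the function name is hypothetical.

```python
import re


def add_sum_count_mean(record, pattern=r"^[t-z].*$"):
    """Sum and average the numeric fields whose names match the pattern."""
    total, count = 0, 0
    for k, v in record.items():
        is_number = isinstance(v, (int, float)) and not isinstance(v, bool)
        if is_number and re.search(pattern, k):
            total += v
            count += 1
    out = dict(record)
    if count > 0:  # mirrors "no assignment if count unset"
        out["sum"] = total
        out["count"] = count
        out["mean"] = total / count
    return out


print(add_sum_count_mean({"id": "r1", "t": 1, "x": 3, "a": 100, "z": "abc"}))
```

Here only t and x qualify: a is numeric but its name does not match, and z matches but is not numeric.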

Document links:

  • http://johnkerl.org/miller/doc/reference.html#If-statementsforput

  • http://johnkerl.org/miller/doc/reference.html#Whileanddo-whileloopsforput

  • http://johnkerl.org/miller/doc/reference.html#For-loopsforput

  • http://johnkerl.org/miller/doc/reference.html#Fieldnamesforfilter

  • http://johnkerl.org/miller/doc/reference.html#Fieldnamesforput

  • http://johnkerl.org/miller/doc/reference.html#Functionsforfilterandput

  • http://johnkerl.org/miller/doc/reference.html#Semicolons,newlines,andcurlybracesfor_put

  • http://johnkerl.org/miller/doc/cookbook.html

Brew update: https://github.com/Homebrew/homebrew-core/pull/1895

- Go
Published by johnkerl over 9 years ago

https://github.com/johnkerl/miller - Variables, begin/end blocks, pattern-action blocks

This major release dramatically expands the expressive power of Miller's put DSL. The TL;DR is that you can now write things like

mlr put '@x_sum += $x; end { emit @x_sum }'

For full details please see the following reference sections:

  • http://johnkerl.org/miller/doc/reference.html#put

  • http://johnkerl.org/miller/doc/reference.html#Out-of-streamvariablesforput

  • http://johnkerl.org/miller/doc/reference.html#Pattern-actionblocksforput

  • http://johnkerl.org/miller/doc/reference.html#Begin/endblocksforput

  • http://johnkerl.org/miller/doc/reference.html#Indexedout-of-streamvariablesforput

  • http://johnkerl.org/miller/doc/reference.html#Emitstatementsforput

  • http://johnkerl.org/miller/doc/reference.html#Unsetstatementsfor_put

as well as the following cookbook section: - http://johnkerl.org/miller/doc/cookbook.html#Usingout-of-streamvariables

Additional minor features in Miller 4.0.0:

  • Compound assignment operators such as +=, <<=, etc. are not new but were not previously announced in a release note.

  • Double-backslashing behavior for sub and gsub has been fixed: echo 'x=a\tb' | mlr put '$x=sub($x,"\\t","TAB")' now prints aTABb as desired. (The underlying issue was an unfortunate interaction between Miller's backslash-handling and the system regex library's backslash-handling.)

  • As an alternative to specifying input files as the last items on the Miller command line, you can now specify a single input file before other command-line switches and verbs using --from: for example, mlr --from myfile.dat put '$z = $x + $y' then stats1 -a sum -f z. The context is simple keystroke-reduction for interactively appending then-chains by up-arrowing at the command line: it's easier to iterate when you don't have to left-arrow past the input file name.

- Go
Published by johnkerl almost 10 years ago

https://github.com/johnkerl/miller - New data-rearrangers: nest, shuffle, repeat; misc. features

Major features in this release:
  • mlr nest is a companion to mlr reshape which was introduced in Miller 3.4.0: it allows unpacking key-value pairs which are nested within field values, and repacking them. Please see http://johnkerl.org/miller/doc/reference.html#nest.
  • mlr shuffle is a simple output-record permutor: http://johnkerl.org/miller/doc/reference.html#shuffle
  • mlr repeat can be used as a data-generator, to expand a few input records (or even a single one) into arbitrarily many. This is particularly useful in conjunction with pseudorandom-number generators. As well, it can be used to reconstruct individual samples from data which have been count-aggregated, so that statistics such as mode, percentiles, etc. may be computed on them. Please see http://johnkerl.org/miller/doc/reference.html#repeat.
  • mlr put and mlr filter now accept a -f {filename} option, so that the DSL expression may be placed within a file instead of being typed out on the command line when desired. Please see http://johnkerl.org/miller/doc/reference.html#put and http://johnkerl.org/miller/doc/reference.html#filter.
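
To make the count-aggregated use case for mlr repeat concrete, here is a small Python sketch of the idea: each record is emitted as many times as its count field says, after which order statistics can be computed on the individual samples. The function name `repeat_by_count` and the field names `value` and `count` are made up for this example; see the reference link for the verb's actual options.

```python
import statistics


def repeat_by_count(records, count_field):
    """Yield a copy of each record as many times as its count field says."""
    for record in records:
        for _ in range(int(record[count_field])):
            yield dict(record)


aggregated = [{"value": 3, "count": 2}, {"value": 7, "count": 1}]
samples = [r["value"] for r in repeat_by_count(aggregated, "count")]
print(samples)                     # [3, 3, 7]
print(statistics.median(samples))  # 3
```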

Minor features:
  • put/filter DSL string literals now may include \t, \", etc.: e.g. mlr put '$out = $left . "\t" . $right'
  • There is now a typeof function for the put/filter DSLs: mlr put '$xtype = typeof($x)'. This is occasionally useful for debugging type-conversion questions.
  • You may now do mlr --nr-progress-mod 1000000 ... to get something printed to stderr every 1000000th input record, and so on. For long-running aggregations on large input file(s), this can provide reassurance that processing is indeed proceeding apace. Example:

$ mlr --nr-progress-mod 100000 check data/big.dkvp
NR=100000 FNR=100000 FILENAME=data/big.dkvp
NR=200000 FNR=200000 FILENAME=data/big.dkvp
NR=300000 FNR=300000 FILENAME=data/big.dkvp
NR=400000 FNR=400000 FILENAME=data/big.dkvp
NR=500000 FNR=500000 FILENAME=data/big.dkvp
NR=600000 FNR=600000 FILENAME=data/big.dkvp
NR=700000 FNR=700000 FILENAME=data/big.dkvp
...

  • mlr cat -n had a bug wherein it counted zero-up while its documentation claimed it counted one-up. Now it counts one-up as documented.

- Go
Published by johnkerl almost 10 years ago

https://github.com/johnkerl/miller - JSON, reshape, regex captures, and more

Primary features:
  • JSON is now a supported format for input and output. Miller handles tabular data, and JSON supports arbitrarily deeply nested data structures, so if you want general JSON processing you should use jq. But if you have tabular data represented in JSON then Miller can now handle that for you. Please see the reference page and the FAQ.
  • Reshape is a standard data-processing idiom, now available in Miller: http://johnkerl.org/miller/doc/reference.html#reshape
  • Incidentally (not part of this release, but new since the last release) Miller is now available in FreeBSD's package manager: https://www.freshports.org/textproc/miller/. A full list of distributions containing Miller may be found here.
  • Miller is not yet available from within Fedora/CentOS, but as a step toward this goal, an SRPM is included in this release (see file-list below).

DSL enhancements for mlr put and mlr filter:
  • Regex captures \0 through \9: http://johnkerl.org/miller/doc/reference.html#Regex_captures
  • Ternary operator in expression right-hand sides: e.g. mlr put '$y = $x < 0.5 ? 0 : 1'
  • Boolean literals true and false
  • Final semicolon is now allowed: e.g. mlr put '$x=1;$y=2;'
  • Environment variables are now accessible, where environment-variable names may be string literals or arbitrary expressions: mlr put '$home = ENV["HOME"]' or mlr put '$value = ENV[$name]'.
  • While records are still string-to-string maps for input and output, and between then statements, types are preserved between multiple statements within a put. Example: mlr put '$y = string($x); $z = $y . $y' works as expected, without requiring mlr put '$y = string($x); $z = string($y) . string($y)' as before.
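
As a loose illustration of what regex captures provide, the Python sketch below matches a pattern and then substitutes the captured groups into a later string. It only shows the general idea: Miller's own matching operators and capture rules are described in the linked reference section, the helper name `interpolate_captures` is hypothetical, and Python spells the whole match as \g<0> rather than \0.

```python
import re


def interpolate_captures(pattern, text, template):
    """Return template with \\1..\\9 replaced by captures, or None if no match."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return match.expand(template)


print(interpolate_captures(r"(\d+)-(\d+)", "range 10-20", r"\2 to \1"))  # 20 to 10
```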

Bug fixes:
  • Mixed-format join, e.g. CSV file joined with DKVP file, was incorrectly computing default separators (IRS, IFS, IPS). This resulted in records not being joined together.
  • Segmentation violation on non-standard-input read of files with size an exact multiple of page size and not ending in IRS, e.g. newline. (This is less of a corner case than it sounds: for example, leave a long-running program running with output redirected to a file, then in a sleep-and-process loop, have Miller process that file. The former program's stdio library will likely be doing block-sized buffered I/O, where block sizes will often be multiples of system page size and the block will almost surely not end in a newline.)

Acknowledgements: Big thank-yous to @gregfr and @aaronwolen for feature requests including reshape and regex captures, and to @jungle-boogie for his work getting Miller into FreeBSD. Also, ongoing thanks to @0-wiz-0 for his past work on configure support, making it possible for Miller to be put to use in multiple operating systems.

- Go
Published by johnkerl about 10 years ago

https://github.com/johnkerl/miller - Bootstrap sampling, EWMA, merge-fields, isnull/isnotnull functions

  • Bootstrap sampling in mlr bootstrap: http://johnkerl.org/miller/doc/reference.html#bootstrap. Compare to reservoir sampling in mlr sample: http://johnkerl.org/miller/doc/reference.html#sample.
  • Exponentially weighted moving averages in mlr step -a ewma: principally useful for smoothing noisy time series, e.g. finely sampled system-resource utilization. Please see http://johnkerl.org/miller/doc/reference.html#step.
  • "Horizontal" univariate statistics in mlr merge-fields, compared to mlr stats which is "vertical". Also allows collapsing multiple fields into one, such as in_bytes and out_bytes data fields summing to bytes_sum. This can also be done easily using mlr put. However, mlr merge-fields allows aggregation of more than just a pair of field names, and supports pattern-matching on field names. Please see http://johnkerl.org/miller/doc/reference.html#merge-fields for more information.
  • isnull and isnotnull functions for mlr filter and mlr put.
  • stats1, stats2, merge-fields, step, and top correctly handle not only missing fields (in the row-heterogeneous-data case) but also null-valued fields.
  • Minor memory-management improvements.
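
For reference, the textbook form of an exponentially weighted moving average is sketched below in Python: the first output equals the first input, and each later output blends the new value with the previous output. The function name `ewma` and the seeding choice are assumptions here; see the step reference link above for how mlr step -a ewma takes its smoothing factors.

```python
def ewma(values, alpha):
    """Smooth a series: s[0] = x[0], s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    smoothed = []
    prev = None
    for x in values:
        prev = x if prev is None else alpha * x + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


print(ewma([10, 20, 30], 0.5))  # [10, 15.0, 22.5]
```

A smaller alpha weights history more heavily and so smooths more aggressively; alpha = 1 reproduces the input unchanged.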

- Go
Published by johnkerl about 10 years ago

https://github.com/johnkerl/miller - Performance improvements, compressed I/O, and variable-name escaping

  • RFC-CSV read performance is dramatically improved and is now on par with other formats; read performance for all formats is slightly improved as well.
  • Variable names can now be escaped, using curly braces if there are special characters in the input-data field names. Example: mlr put '${bytes.total} = ${bytes.in} + ${bytes.out}'. See also https://github.com/johnkerl/miller/issues/77 where this was requested.
  • Compressed I/O is now supported, using built-in compatibility with local system tools: http://johnkerl.org/miller/doc/reference.html#Compression. See also https://github.com/johnkerl/miller/issues/77 where this was requested.
  • mlr uniq is now streaming (bounded memory use, functionality in tail -f contexts) when possible: i.e. when -n and -c are not specified.
  • Thorough valgrind-driven testing has been used to tighten memory usage. This is mostly an invisible internal improvement, although it yields a slight across-the-board performance improvement and allows Miller to handle even larger files in limited-memory contexts.

- Go
Published by johnkerl about 10 years ago

https://github.com/johnkerl/miller - Bugfix for stats1 max

mlr stats1 max was reporting the same value as mlr stats1 min, although p100 was unaffected. This error has been present since the 3.0.0 release. It was reported on https://github.com/johnkerl/miller/issues/92.

- Go
Published by johnkerl about 10 years ago

https://github.com/johnkerl/miller - Fix regression tests for i386

No functionality had been broken for i386: the changes are for the test framework only, to get validated builds on all available platforms.

- Go
Published by johnkerl about 10 years ago

https://github.com/johnkerl/miller - Minor feature enhancements, and portability

  • Portability (affecting the CSV-RFC reader) for the Debian packaging request: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=800074. The latter greatly increases the number of platforms on which Miller has been validated.
  • mlr decimate: http://johnkerl.org/miller/doc/reference.html#decimate
  • Integer-preservation feature for mlr top and mlr stats1 with percentiles: If inputs are integers then corresponding outputs will be so as well (unless -F, which forces all-float output).
  • mlr histogram now has a --auto option for autocomputing lower and upper limits: http://johnkerl.org/miller/doc/reference.html#histogram
  • mlr uniq and mlr count-distinct now have a -n flag to show only the counts of distinct values, rather than listing all distinct values: http://johnkerl.org/miller/doc/reference.html#uniq http://johnkerl.org/miller/doc/reference.html#count-distinct
  • The strlen function correctly handles UTF-8 string data.
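
The strlen item comes down to counting characters rather than bytes. A short Python sketch of the difference follows; the helper names are invented for illustration.

```python
def byte_length(text):
    """Number of bytes in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


def char_length(text):
    """Number of Unicode code points in text."""
    return len(text)


word = "h\u00e9llo"  # 'héllo' with a precomposed e-acute
print(byte_length(word), char_length(word))  # 6 5
```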

- Go
Published by johnkerl about 10 years ago

https://github.com/johnkerl/miller - Allow scientific notation in DSL literals; mlr bar --auto

  • Miller has always supported scientific notation in field values, e.g. x=1e6. However, it had never supported scientific notation in DSL literals, e.g. mlr put '$y = $x + 1e6'. This release fixes that.
  • Additionally, mlr bar now has a --auto flag which holds all records in memory and computes limits from the data, so you don't have to compute them separately and pass them in via --lo and --hi.

- Go
Published by johnkerl about 10 years ago

https://github.com/johnkerl/miller - Integer and float arithmetic, improved documentation, minor feature enhancements

Integer/float arithmetic

The key feature of the 3.0.0 release, and the reason for the major version increment, is that previously all numbers were scanned into mlr put and mlr filter functions as floating-point -- then, only recast to integer as necessary for integer operations. Since IEEE doubles have 53 bits of precision (52 mantissa bits along with implicit leading one) while 64-bit integers have 64, this meant that full 64-bit integer significance could not be passed through Miller functions.
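
The precision loss described above is easy to reproduce outside Miller. In this Python sketch, an integer one past 2^53 does not survive a round trip through a double:

```python
big = 2**53 + 1                  # 9007199254740993, well within 64 bits
through_float = int(float(big))  # the nearest double is 2**53

print(big)                   # 9007199254740993
print(through_float)         # 9007199254740992
print(big == through_float)  # False
```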

As of the 3.0.0 release, numbers in Miller are int (64 bits) or float (double-precision). Numbers scannable as integers are treated as integers. The sum, difference, and product of two integers are again integers -- except when overflow would occur, at which point a floating-point result is produced. Integer division is pythonic, namely, 7/2 is 3.5, and 7//2 is 3. Mixed integer/float operations produce float. Bitwise operators are now supported.
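
The overflow rule can be sketched in Python, whose integers are unbounded, by checking the exact result against the signed 64-bit range and falling back to float arithmetic when it does not fit. The helper names below are hypothetical, and this models only the rule as stated in this note, not Miller's actual overflow detection.

```python
INT_MIN, INT_MAX = -(2**63), 2**63 - 1


def _int_or_float(exact, as_float):
    """Keep the exact integer if it fits in 64 bits, else use the float result."""
    return exact if INT_MIN <= exact <= INT_MAX else as_float


def int_add(a, b):
    return _int_or_float(a + b, float(a) + float(b))


def int_sub(a, b):
    return _int_or_float(a - b, float(a) - float(b))


def int_mul(a, b):
    return _int_or_float(a * b, float(a) * float(b))


print(int_add(2, 3))       # 5
print(int_mul(2**62, 4))   # too large for 64 bits, so a float is printed
print(7 / 2, 7 // 2)       # 3.5 3
```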

You now have more control over arithmetic, not less. The only real compatibility change is that some numbers will now be printing like 123 rather than 123.0000.

For full details please see http://johnkerl.org/miller/doc/reference.html#Arithmetic.

New functions for filter and put

  • Since integers are now fully supported in mlr put and mlr filter, it is now possible to have the bitwise operators | ^ & << >>. These operate on 64-bit integers and produce 64-bit-integer results.
  • Modular arithmetic is implemented by madd, msub, mmul, and mexp.
  • urandint and urand32 are in addition to the existing urand.
  • sgn complements abs.
  • strftime and strptime are generalizations of sec2gmt and gmt2sec. These are pass-throughs to system strftime and strptime; see your local manpages for available time-formatting options.
  • Please see http://johnkerl.org/miller/doc/reference.html#Functionsforfilterandput for more information.
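
For readers unfamiliar with the modular-arithmetic functions, here is the underlying mathematics as a Python sketch. The names mirror the Miller function names for readability, but the argument order (operands first, modulus last) and the non-negative result convention are assumptions of this sketch; consult the function reference for Miller's definitions.

```python
def madd(a, b, m):
    """(a + b) mod m"""
    return (a + b) % m


def msub(a, b, m):
    """(a - b) mod m, non-negative for positive m"""
    return (a - b) % m


def mmul(a, b, m):
    """(a * b) mod m"""
    return (a * b) % m


def mexp(a, e, m):
    """a**e mod m, computed without forming the huge intermediate power"""
    return pow(a, e, m)


def sgn(x):
    """-1, 0, or 1 according to the sign of x"""
    return (x > 0) - (x < 0)


print(madd(5, 3, 7), msub(5, 6, 7), mmul(5, 3, 7), mexp(2, 10, 7))  # 1 6 1 2
print(sgn(-3), sgn(0), sgn(2.5))                                     # -1 0 1
```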

Verbs

  • mlr grep: http://johnkerl.org/miller/doc/reference.html#grep
  • mlr cat -n option: http://johnkerl.org/miller/doc/reference.html#cat
  • mlr stats1 skewness and mlr stats1 kurtosis: http://johnkerl.org/miller/doc/reference.html#stats1
  • mlr bar allows for some simple terminal-level visualization: http://johnkerl.org/miller/doc/reference.html#bar
  • mlr join now has full support for heterogeneous data: records lacking all the join keys are treated the same as any other left-unpaired or right-unpaired records. This was tracked on issue https://github.com/johnkerl/miller/issues/82.

I/O options

  • mlr --xvright for XTAB output
  • mlr --headerless-csv-output for CSV/CSV-lite output

Documentation

  • The mlr.1 manpage is now autogenerated.
  • There is now documentation on operator precedence and function semantics.
  • HTML pages at http://johnkerl.org/miller/doc/ are now PDF-renderable.
  • Per-release documents are available at http://johnkerl.org/miller/doc/release-docs.html. (The documents at http://johnkerl.org/miller/doc/ have always tracked head, and they continue to do so.)

- Go
Published by johnkerl about 10 years ago

https://github.com/johnkerl/miller - Iterative stats, exclude-filter, implicit-CSV-header, and other features

  • mlr stats1 and stats2 now support a -s feature in which means, linear regressions, etc. evolve record-by-record as new records appear over time. This is particularly useful in tail -f contexts. See also http://johnkerl.org/miller/doc/reference.html#stats1 and http://johnkerl.org/miller/doc/reference.html#stats2.
  • mlr filter now supports a -x flag to negate the sense of the filter: instead of editing logic expressions e.g. from mlr filter '$x < 10 || $x > 20' to mlr filter '$x >= 10 && $x <= 20', you can simply do mlr filter -x '$x < 10 || $x > 20'. See also http://johnkerl.org/miller/doc/reference.html#filter.
  • In the event a CSV file lacks header lines, you can use mlr --implicit-csv-header to add positional header 1,2,3,.... You can also convert those to desired text using mlr label. See also http://johnkerl.org/miller/doc/reference.html#label.
  • Heterogeneity support is improved for sort, stats1, stats2, step, head, tail, top, sample, uniq, and count-distinct. See also https://github.com/johnkerl/miller/issues/79.
  • mlr stats2 now has a logistic-regression feature, but I recommend treating it as experimental until some numerical-stability issues involving my naïve Newton-Raphson solver are worked out -- namely, it doesn't converge in all cases.
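
The hand-negated expression in the filter -x item is an application of De Morgan's laws. The following Python sketch checks that negating the original predicate agrees with the hand-edited one over a range of test values; the function names are invented for the example.

```python
def original(x):
    """The filter expression as first written: $x < 10 || $x > 20"""
    return x < 10 or x > 20


def negated_by_hand(x):
    """The manually negated form: $x >= 10 && $x <= 20"""
    return x >= 10 and x <= 20


# What -x does conceptually: keep records for which the expression is false.
agree = all((not original(x)) == negated_by_hand(x) for x in range(0, 31))
print(agree)  # True
```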

http://johnkerl.org/miller/releases/miller-2.3.2/doc/

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - Bug fix for mlr top -a

Memory management was incorrect in mlr top -a.

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - Regex support, gsub, reservoir sampling, iterative stats, and other features

Regex support

  • http://johnkerl.org/miller/doc/reference.html#Regular_expressions
  • http://johnkerl.org/miller/doc/reference.html#put
  • http://johnkerl.org/miller/doc/reference.html#filter
  • http://johnkerl.org/miller/doc/reference.html#having-fields
  • http://johnkerl.org/miller/doc/reference.html#cut
  • http://johnkerl.org/miller/doc/reference.html#rename

gsub function

In addition to the existing sub function: replace-all in addition to replace-once. Includes regex support. http://johnkerl.org/miller/doc/reference.html#Functionsforfilterandput
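
The replace-once versus replace-all distinction looks like this in Python's re module. The wrapper names sub and gsub are borrowed from the Miller functions for readability, and the (input, regex, replacement) argument order is an assumption of the sketch.

```python
import re


def sub(text, pattern, replacement):
    """Replace only the first match of pattern."""
    return re.sub(pattern, replacement, text, count=1)


def gsub(text, pattern, replacement):
    """Replace every match of pattern."""
    return re.sub(pattern, replacement, text)


print(sub("a.b.c", r"\.", "-"))   # a-b.c
print(gsub("a.b.c", r"\.", "-"))  # a-b-c
```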

Reservoir sampling

http://johnkerl.org/miller/doc/reference.html#sample
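
As background, reservoir sampling draws a fixed-size uniform sample from a stream of unknown length in a single pass with bounded memory. The Python sketch below is the classic textbook "Algorithm R"; it is offered as an illustration of the technique and is not taken from Miller's source. The function name and the optional rng parameter are choices made for this example.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return k items chosen uniformly from stream, reading it only once."""
    if rng is None:
        rng = random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)   # fill the reservoir first
        else:
            j = rng.randint(0, i)    # inclusive on both ends
            if j < k:
                reservoir[j] = item  # replace with probability k / (i + 1)
    return reservoir


sample = reservoir_sample(range(1000), 5, random.Random(42))
print(len(sample))  # 5
```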

Iterative stats1/stats2

Use mlr stats1 -s ... or mlr stats2 -s ... to print averages, min/max, correlation, etc. on every record. Useful in tail -f contexts when you want to see statistics evolving as the data evolve in time.

http://johnkerl.org/miller/doc/reference.html#stats1 http://johnkerl.org/miller/doc/reference.html#stats2
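
To illustrate what "statistics evolving as the data evolve" means, here is a Python sketch that yields updated count, mean, min, and max after every value, using an incremental mean update so that no history needs to be stored. The generator name and output keys are hypothetical and do not reflect the field names mlr stats1 -s produces.

```python
def running_stats(values):
    """Yield cumulative count/mean/min/max after each input value."""
    count = 0
    mean = 0.0
    lo = hi = None
    for x in values:
        count += 1
        mean += (x - mean) / count  # incremental mean update
        lo = x if lo is None else min(lo, x)
        hi = x if hi is None else max(hi, x)
        yield {"count": count, "mean": mean, "min": lo, "max": hi}


for row in running_stats([2, 4, 6]):
    print(row)
```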

Minor

  • Initial delta for mlr step -a delta is now 0, matching initial 1 for mlr step -a ratio
  • Usage messages consistently go to stdout when asked for via -h, and stderr in case of command-line syntax errors
  • Online help is confined to 80-character column width, except for mlr -f which is all single-line greppable
  • Header/data length mismatch error messages for CSV/CSV-lite now include file/line context

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - Autoconfig support

Documentation at http://johnkerl.org/miller/doc/build.html

Resolves https://github.com/johnkerl/miller/issues/9

Most of the work here due to @0-wiz-0

http://johnkerl.org/miller/releases/miller-2.2.1/doc/

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - Multi-character RS,FS,PS

You can process CRLF-terminated DKVP files with mlr --dkvp --rs crlf. You can process LF-terminated CSV files with mlr --csv --rs lf. You can process TSV using mlr --fs tab; you can convert TSV to CSV using mlr --ifs tab --ofs comma. Along with many more possibilities. Please see mlr -h for more information.
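
To show what multi-character separators mean for a format like DKVP, here is a minimal Python parser sketch in which the record, field, and pair separators are all arbitrary strings. It is a toy under stated assumptions (no quoting, no escaping, empty lines skipped), and the function name `parse_dkvp` is made up for illustration.

```python
def parse_dkvp(text, rs="\n", fs=",", ps="="):
    """Split text into records on rs, fields on fs, and key/value pairs on ps."""
    records = []
    for line in text.split(rs):
        if not line:
            continue  # skip the empty piece after a trailing record separator
        record = {}
        for pair in line.split(fs):
            key, _, value = pair.partition(ps)
            record[key] = value
        records.append(record)
    return records


crlf_data = "a=1,b=2\r\nc=3\r\n"
print(parse_dkvp(crlf_data, rs="\r\n"))  # [{'a': '1', 'b': '2'}, {'c': '3'}]
```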

There is one minor, backward-incompatible change which I felt not worth calling this 3.0.0: default field separator for NIDX format is now space, not comma.

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - Improved read performance for RFC4180 CSV

Resolves https://github.com/johnkerl/miller/issues/51

RFC-compliant CSV input is now about 60% faster than at initial feature release (https://github.com/johnkerl/miller/releases/tag/v2.0.0). It remains about 50% slower than CSV-lite.

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - Reduce tar-file size

Addresses https://github.com/johnkerl/miller/issues/61

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - Incremental read-performance increase for CSV format

While https://github.com/johnkerl/miller/issues/51 is still underway, already there is nearly a 2x read-performance increase in v2.1.1 over v2.1.0.

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - Minor enhancements and bug fixes

Highlights: travis-CI integration (thanks @SikhNerd!); hour-minute-second functions; fixed pretty-print alignment of UTF-8 data.

Bugs fixed: https://github.com/johnkerl/miller/issues/36 https://github.com/johnkerl/miller/issues/34 https://github.com/johnkerl/miller/issues/29 https://github.com/johnkerl/miller/issues/23

Features: https://github.com/johnkerl/miller/issues/53 https://github.com/johnkerl/miller/issues/35 https://github.com/johnkerl/miller/issues/15

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - RFC4180-compliant CSV

Miller now handles CSV as defined in RFC 4180 (https://tools.ietf.org/html/rfc4180).

The --csv I/O option is now compliant CSV. The --csvlite option is the same not-really-CSV as originally released (https://news.ycombinator.com/item?id=10066742), with programmable RS/FS (e.g. you can do TSV, or spaces). Meanwhile --csv is only RFC-4180 CSV: RS is hardcoded to CRLF and FS is hardcoded to comma. That is, as of v2.0.0, you get compliant CSV (including double-quote support) with no options for separators/terminators, or you get non-compliant CSV without double-quote support but with options for separators/terminators.
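
The practical difference between the two modes shows up as soon as a field contains a comma or a double quote. The Python sketch below contrasts a quote-aware reader (the standard csv module, standing in for RFC-style parsing) with a naive split on the field separator (standing in for the lite approach); it illustrates the distinction only and says nothing about Miller's parsers.

```python
import csv
import io

data = 'name,comment\r\n"Smith, J.","said ""hi"""\r\n'

# Quote-aware parsing: embedded commas and doubled quotes are handled.
rows = list(csv.reader(io.StringIO(data, newline="")))
print(rows[1])  # ['Smith, J.', 'said "hi"']

# Naive splitting: the quoted comma is mistaken for a field separator.
naive = data.split("\r\n")[1].split(",")
print(len(naive))  # 3
```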

This is intended to deliver, as soon as possible, RFC-compliant CSV since @ftrotter hit that nail on the head with https://github.com/johnkerl/miller/issues/4. (Also note that as of v2.0.0, CSV read performance is significantly slower than CSV-lite.)

In an upcoming minor, v2.1.0 or v2.2.0, I'll do a bit more:
  • --csv will still be compliant by default, but RS/FS will be programmable: you'll be able to handle TSV or what have you, with double-quote support.
  • RS/FS/PS for all formats will be able to be multi-character, e.g. you'll be able to use CRLF for DKVP format which will resolve https://github.com/johnkerl/miller/issues/19.
  • Read performance for CSV will be optimized.
  • Double-quoting will be supported in DKVP as well as in CSV.

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - Add INSTALLDIR Makefile option for Homebrew

Add INSTALLDIR Makefile option for Homebrew

- Go
Published by johnkerl over 10 years ago

https://github.com/johnkerl/miller - Initial public release

Initial public release. Feature-stable for my own use leading up to the release announcement 2015-08-15. Feel free to open issues with feature requests, bug reports, etc. Primary work for upcoming releases involves configuration (autotools et al.), packaging (homebrew, .deb), and RFC-compliant CSV.

- Go
Published by johnkerl over 10 years ago