Recent Releases of cchdo.hydro
cchdo.hydro - v1.0.2.14 Is this the Long tail?
The problems that we seem to be addressing now, discovered through the processing of data at CCHDO, seem to be more subtle and nuanced. There is also undefined behavior, which is just hard to figure out.
As per SPEC0, this will be the last version to support (read, test against) numpy<2 and python 3.11.
Anyway, here are some fixes:
- (Bug) Fix where 0 would be written for a bottle flag in COARDS files that should have been a fill value (9). This was the result of a nan being cast to int, which is undefined behavior in C: 0 would be written on arm machines, but the result could be anything.
- Fix CDOM coordinate collapse function not removing the collapsed variables.
- (Bug) Fix CTDETIME parameters crashing the exchange and COARDS writers
- (Bug) Fix CTDETIME losing precision in derived files (it was being stored correctly in the CF netCDF files)
Full Changelog: https://github.com/cchdo/hydro/compare/v1.0.2.13...v1.0.2.14
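The nan-to-int problem from the first fix can be reproduced with plain numpy. This is a minimal sketch and not the library's actual code: the function name `flags_to_int` and the sample data are hypothetical, and only the fill value of 9 comes from the note above.

```python
import numpy as np

FLAG_FILL = 9  # bottle flag fill value mentioned in the release note


def flags_to_int(flags: np.ndarray, fill: int = FLAG_FILL) -> np.ndarray:
    """Replace NaN with an explicit fill value *before* casting to int.

    Casting NaN straight to an integer type is undefined behavior, so the
    value that ends up in the output depends on the platform.
    """
    filled = np.where(np.isnan(flags), fill, flags)
    return filled.astype(np.int8)


flags = np.array([2.0, np.nan, 3.0])  # hypothetical flag column
result = flags_to_int(flags)
```

Filling first means the integer cast never sees a NaN, so the output is the same on every architecture.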
- Python
Published by DocOtak 6 months ago
cchdo.hydro - Date bugs and uv
This release fixes a bug related to writing out bottle date/time columns in exchange bottle files. It also switches to uv for project management and releases.
What's Changed
- Support specifying the text encoding of the input exchange file
- (Bug) Fix fill values in bottle date/time being printed as nan in generated exchange files
- Add the ability to ignore columns using the ODV parameter syntax to read_exchange and read_csv
- Add a --roundtrip flag to the status-exchange command. This will convert from the exchange file online at CCHDO to xarray/netCDF, then back to exchange, then back to netCDF. The purpose is to check that the derived exchange is valid.
- Moved check_flags from exchange to checks
Full Changelog: https://github.com/cchdo/hydro/compare/v1.0.2.12...v1.0.2.13
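To illustrate the date/time fill value bug, here is a small sketch of writing a date column where missing values become an explicit fill string instead of "nan". The function name `format_dates`, the YYYYMMDD layout, and the "-999" fill string are assumptions made for the example, not taken from the library.

```python
import numpy as np


def format_dates(values: np.ndarray, fill: str = "-999") -> list[str]:
    """Format datetime64 values as YYYYMMDD text, using `fill` for NaT."""
    out = []
    for value in values:
        if np.isnat(value):
            # A missing date must become the fill string, never "nan"/"NaT"
            out.append(fill)
        else:
            out.append(np.datetime_as_string(value, unit="D").replace("-", ""))
    return out


dates = np.array(["2020-01-02", "NaT"], dtype="datetime64[D]")
formatted = format_dates(dates)
```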
- Python
Published by DocOtak 8 months ago
cchdo.hydro - ALT merging and CDOM
This release improves support for merging and manipulating CDOM parameters in the datasets. It's early support, so bugs might be in there. A bug was found where merging alternate parameters would instead update the "non alternate" parameter.
The next release will adopt SPEC0, so this is the last version to support (be tested against) python 3.10.
v1.0.2.12 (2024-10-29)
- Add support for adding CDOM params/wavelengths
- Add support for merging CDOM in merge_fq
- (Bug) Fix merge_fq putting alternate parameter data in the wrong place
- (Bug) Fix a crash in the COARDS writer on some architectures (x64) when CTDNOBS has fill values
- Fix exception caused by string dtype parameters with all fill values
Full Changelog: https://github.com/cchdo/hydro/compare/v1.0.2.11...v1.0.2.12
- Python
Published by DocOtak over 1 year ago
cchdo.hydro - Release Friday
Naturally releasing on a Friday leads to forgetting something... like debug print statements in hot paths. This quick followup release removes those debug print statements.
Changes:
* Removed 2 rogue debug print statements left over from digging into that WOCE bug
- Python
Published by DocOtak over 1 year ago
cchdo.hydro - String Gotchas
The changes for numpy 2.0 failed to calculate a string length correctly and important things like station ids were being truncated! This fixes that problem and is important enough that I decided to do a release on a Friday before a long weekend, a totally safe and accepted programming norm.
Changes:
* (Major Bug) Fix not calculating the correct string length for string fields that have an inconsistent length (e.g. station)
* (Bug) Fix the legacy woce writer not writing the data block if no quality flags are in the file
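The kind of truncation described here is easy to reproduce with numpy fixed-width strings. The sketch below is not the library's code; `to_fixed_width` and the station values are hypothetical and only show why the width has to come from the longest value.

```python
import numpy as np


def to_fixed_width(values: list[str]) -> np.ndarray:
    """Build a fixed-width unicode array wide enough for the longest value."""
    width = max([len(v) for v in values] + [1])
    return np.array(values, dtype=f"U{width}")


stations = ["1", "12", "101"]  # hypothetical station ids of uneven length

# If the width is wrong (here 1 character), numpy truncates without any error
truncated = np.array(stations, dtype="U1")
correct = to_fixed_width(stations)
```

Note that numpy raises no warning in the truncated case, which is what makes this sort of bug easy to miss.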
- Python
Published by DocOtak over 1 year ago
cchdo.hydro - Speedy Merges
This release focused on the merge code paths and making them faster (we had a CTD file that needed fixing). It also fixed a bug in the legacy COARDS writer.
Changes:
* (New) Added hydro.core.add_param and hydro.core.remove_param functions
* (Bug) Fix crash in the COARDS writer when the comments are just an empty string
* Add cchdo.auth to cli optional requirements
* Vectorize the merge_fq accessor for greater speed
* Use absolute imports throughout the library
* Speedups in string processing (precision extraction) from numpy 2
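As a rough idea of what "vectorizing" a merge can mean, the sketch below compares a nested-loop update with an index-based numpy update. It is a generic illustration under stated assumptions (unique, non-empty keys; hypothetical function names), not how merge_fq is actually implemented.

```python
import numpy as np


def merge_loop(keys, values, new_keys, new_values):
    """Slow reference version: scan every key for every update."""
    out = values.copy()
    for k, v in zip(new_keys, new_values):
        for i, existing in enumerate(keys):
            if existing == k:
                out[i] = v
    return out


def merge_vectorized(keys, values, new_keys, new_values):
    """Same result using sorted lookups instead of Python loops."""
    out = values.copy()
    order = np.argsort(keys)
    # Position of each new key within the sorted keys
    pos = np.searchsorted(keys, new_keys, sorter=order)
    pos = np.clip(pos, 0, len(keys) - 1)
    idx = order[pos]
    # Only update where the key was really found
    found = keys[idx] == new_keys
    out[idx[found]] = new_values[found]
    return out


keys = np.array([30, 10, 20])
values = np.array([1.0, 2.0, 3.0])
new_keys = np.array([20, 99, 30])  # 99 does not exist and is ignored
new_values = np.array([7.0, 8.0, 9.0])

slow = merge_loop(keys, values, new_keys, new_values)
fast = merge_vectorized(keys, values, new_keys, new_values)
```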
- Python
Published by DocOtak over 1 year ago
cchdo.hydro - netCDF4 required for selftest
A technical issue prevented 1.0.2.7 from being published successfully.
- netCDF4 is now required as part of the selftest option when installing
- Python
Published by DocOtak almost 2 years ago
cchdo.hydro - Legacy Accessor Fixes
This release fixes some bugs that would prevent exchange and COARDS netCDF files from being generated successfully from a CF/netCDF file. It also includes a CLI tool for testing this functionality on all public CF files at CCHDO.
Changes:
- (Bug) fix to_exchange accessor failing for variables with seconds as the unit
- (Bug) fix to_coards accessor failing for variables with seconds as the unit
- Add status-cf-derived command that tests all public CF files at CCHDO going from netCDF to every other supported format
- Python
Published by DocOtak almost 2 years ago
cchdo.hydro - Duplicate Params
This release adds initial support for duplicate parameter names using the "ALT" syntax proposed in https://github.com/cchdo/params/issues/25
- Support for duplicate parameters
- (Bug) fix to_exchange accessor failing with a Dataset containing CDOM variables
- (Bug) fix for the flag column getting lost when alternate units for the same parameter were present in one file. If, for example, a file had CTDTMP [ITS-90] and CTDTMP [IPTS-68] and both had CTDTMPFLAGW columns, only one of the parameters would get a flag column.
- Added "coards" and "woce" file name generation support to gen_fname() accessor
- to_woce() now always returns zipfile bytes for ctd data
- Omit the "STAMP" text from generated WOCE files
- (changed) Bump min cchdo.params version to 2024.3
Full Changelog: https://github.com/cchdo/hydro/compare/v1.0.2.5...v1.0.2.6
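The lost flag column is a classic key collision. The toy example below shows how indexing only by parameter name drops one of two same-named columns, while indexing by a (name, unit) pair keeps both. The data layout here is entirely hypothetical and is not how the library stores things internally.

```python
# Hypothetical flag columns for two temperature parameters that share a name
flags = [
    ("CTDTMP", "ITS-90", [2, 2]),
    ("CTDTMP", "IPTS-68", [2, 3]),
]

# Keyed by name only: the second entry silently overwrites the first
by_name = {name: flag for name, _unit, flag in flags}

# Keyed by (name, unit): both flag columns survive
by_name_and_unit = {(name, unit): flag for name, unit, flag in flags}
```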
- Python
Published by DocOtak almost 2 years ago
cchdo.hydro - Some CLI TLC
This release has some CLI improvements, a highlight being the ability to convert a generic CSV format, which is still pretty specific in what is expected and not quite documented yet. You can also override the comments with an external file when converting exchange (or csv).
The COARDS legacy output has been rewritten in xarray (from netCDF4-python) in an attempt to speed it up. Even if it is not sped up, the code sustainability and readability improvements make it worth it. See the changelog for some more details (and the CLI --help for details there).
- Rewrite the COARDS netCDF output to create xarray objects rather than netCDF datasets directly.
  In some quick testing, this results in about a 3x speed up. The gain depends more on variable count than on data length, so most of the performance increase is actually in the bottle output.
- Fixed a bug in COARDS where the fill value was not being set in the bottom depth variable
- Add fill_values and precision_source arguments to read_csv
- Add string literal types for the ftype parameter of read_csv
- CLI improvements:
  - made "precision_source" an option rather than positional argument
  - added a --comments option to allow the override of comments from either a string or file path prefixed with @.
  - Add a convert_csv subcommand which takes an additional ftype option to specify (C)TD or (B)ottle
- Removed the matlab optional install extra. This previously had a single dependency of "scipy" in it. Scipy is used by xarray for netCDF3 output so this dependency has been moved to the netcdf optional install extra.
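The "string or file path prefixed with @" convention for --comments can be sketched in a few lines. The helper name `resolve_comments` is hypothetical, and the real CLI may handle encodings and errors differently.

```python
import tempfile
from pathlib import Path


def resolve_comments(value: str) -> str:
    """Return the comment text, reading it from a file if prefixed with @."""
    if value.startswith("@"):
        return Path(value[1:]).read_text(encoding="utf-8")
    return value


# Demonstrate both forms using a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "comments.txt"
    path.write_text("# from file\n", encoding="utf-8")
    from_file = resolve_comments(f"@{path}")

from_string = resolve_comments("# inline")
```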
- Python
Published by DocOtak over 2 years ago
cchdo.hydro - Bump In Params
Trying to be more regular. There was a change in the way that cchdo.params did versioning that this release adjusts for.
- (improved) the read_csv method now handles ctd data better, specifically you do not need to include a SAMPNO column if the FileType is CTD.
- Switched linting in pre-commit and CI to use ruff
- (changed) Bump min cchdo.params version to 2023.9
- Python
Published by DocOtak over 2 years ago
cchdo.hydro - It's Progress
Looks like it has been almost a year since the last point release. Since I want to feel good about this, it is because the code base is stable and robust, and definitely not for any other reason like the lead developer spending 6 months at sea collecting reference quality hydrographic data. The changes being worked on here relate to automation efforts at CCHDO, a highlight being the porting of the COARDS netCDF and WOCE file generators. That's right, we are committed to providing hydrographic data in your favorite formats and are working on automating the process of making these data available for all our cruises.
- Add read_csv method
- (bug) Remove the C_format and C_format_source attributes for non floating point variables. Integer and string values are exact so do not need any sort of format hint. Including a format string for non floating point values is undefined behavior in the netCDF-C Library and can result in crashing.
- (new) Add to_coards() and to_woce() accessors to maintain legacy formats at CCHDO.
- (new) All the to_* accessors now support a path argument that will accept a writeable binary mode file like object or a filesystem path to write to.
- (new) Add a compact_profile() accessor that drops the trailing fill values from a profile
- (new) Add the file_seperator and keep_seperator arguments to cchdo.hydro.exchange.read_exchange(). The keep_seperator argument defaults to True. This is specifically to allow the reading of CTD exchange files that have been concatenated together (rather than zipped). Assuming there is nothing after "ENDDATA" and you cat a bunch of _ct1.csv files together, they should be readable if "ENDDATA" is passed into the file_seperator argument.
- (new) Add --dump-data-counts option to the exchange status generator which will dump a json document containing an object with nc_var name strings to count integers of how many variables with this name actually contain any data (i.e. are not just entirely fill value).
- Add a --version option to the cli interface
- (changed) Export read_exchange from the top level cchdo.hydro namespace.
- (changed) Bump min cchdo.params version to 0.1.21
- (changed) Dropped netCDF4 as required for installation. If netCDF4 isn't installed already you can install with the cchdo.hydro[netcdf4] optional.
  - While this might seem like an odd choice for a library that started as one to convert WHP Exchange files to netCDF, netCDF itself is not called until the very end of the conversion process. Internally, everything is an xarray.Dataset. This means you can install this library to read exchange files in tricky environments like pyodide or jupyterlite which already tend to have pandas and numpy in them.
- (bug) fix pressure variable not having a _FillValue attribute
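The separator idea for concatenated CTD files can be sketched as a plain string split. This is only an illustration: `split_on_separator` is a hypothetical helper, the sample text is made up, and the library's own splitting may treat whitespace and edge cases differently.

```python
def split_on_separator(text: str, separator: str, keep: bool = True) -> list[str]:
    """Split concatenated file text into one chunk per original file."""
    chunks = []
    for part in text.split(separator):
        if not part.strip():
            # Nothing but whitespace after the final separator
            continue
        body = part.lstrip("\n")
        chunks.append(body + separator if keep else body)
    return chunks


# Two tiny made-up "files" that were cat'ed together
combined = "CTD,a\n1,2\nENDDATA\nCTD,b\n3,4\nENDDATA\n"

files = split_on_separator(combined, "ENDDATA")
bodies = split_on_separator(combined, "ENDDATA", keep=False)
```

Keeping the separator (the default described above) means each chunk still ends with its own end-of-data marker, so it reads like a standalone file.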
- Python
Published by DocOtak over 2 years ago
cchdo.hydro - A Small Quality of Life Update
This release is mostly because we want all the files we publish at CCHDO to have the software version tagged and published. However, there were two small quality of life updates:
- Support for time values that are equal to 2400, when this is encountered, the date will be set to midnight of the next day.
- read_exchange() will now accept bytes and bytearray objects as input, wrapping data in an io.BytesIO is not needed anymore.
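The 2400 rule is simple to express with the standard library. The sketch below assumes dates arrive as YYYYMMDD text and times as HHMM text; `parse_date_time` is a hypothetical name and not the library's parser.

```python
from datetime import datetime, timedelta


def parse_date_time(date: str, time: str) -> datetime:
    """Combine YYYYMMDD and HHMM text, treating 2400 as next-day midnight."""
    day = datetime.strptime(date, "%Y%m%d")
    if time == "2400":
        # Hour 24 is not a valid datetime hour, so roll over to the next day
        return day + timedelta(days=1)
    return day.replace(hour=int(time[:2]), minute=int(time[2:]))


rolled = parse_date_time("20201231", "2400")
normal = parse_date_time("20200102", "0930")
```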
- Python
Published by DocOtak over 3 years ago
cchdo.hydro - The CDOM (and date) update
This release focused on making the exchange conversion process smoother. We will now automatically attempt to use the exchange BTLNBR column if the SAMPNO column is missing. Error logging has improved as well, with more information about where and what caused the conversion error. Here is the full change log:
- (breaking) fix misspelling of convert_exchange subcommand
- Will not rely on the python universal newlines for reading exchange data
- Will now combine CDOM parameters into a single variable with a new wavelength dimension in the last axis.
- Update the WHP error name lookup to be compatible with cchdo.params v0.1.18, this is now the minimum version
- Add an error_data attribute to ExchangeParameterUndefError that will contain a list of all the unknown (param, unit) pairs in an exchange file when attempting to read one.
- Add an error_data attribute to ExchangeDataFlagPairError that will contain a list of all the found flag errors as an xarray.Dataset
- Automatically attempt to use BTLNBR as a fallback if SAMPNO is not present in a bottle file.
- Automatically reconstruct the date of a missing BTLDATE param if only BTLTIME is present.
- Add --dump-unknown-params option to the status-exchange subcommand which will dump a list of unknown params as json into the outdir.
- Performing a flag check is now behind a feature switch (defaults to true, for the status-exchange it is set to false)
- If a TIME column contains entirely the string "0" (not 0000) it will be ignored
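Combining per-wavelength CDOM columns into one variable with wavelength as the last axis looks roughly like this in numpy. The wavelengths, shapes, and values are made up for the example, and the library works on xarray objects rather than bare arrays.

```python
import numpy as np

# Hypothetical CDOM data: one (profile, level) array per wavelength in nm
cdom = {
    380: np.array([[0.5, 0.6, 0.7], [0.8, 0.9, 1.0]]),
    325: np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]),
}

# Sort so the new axis has a predictable order, then stack on the last axis
wavelengths = sorted(cdom)
combined = np.stack([cdom[w] for w in wavelengths], axis=-1)
```

Each original column is then a slice along the last axis, e.g. `combined[..., 0]` for the shortest wavelength.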
- Python
Published by DocOtak over 3 years ago
cchdo.hydro - The ASV Update
This release includes an almost complete rewrite of how the exchange to netCDF conversion works. It now more directly uses numpy and has significant memory reduction and speed improvements when converting CTD (bottle is about the same).
- (breaking) The CLI was changed to support multiple actions which caused the exchange to netCDF functions to be moved to a sub-command "convert-exchnage" with the same interface as before.
- (breaking) The "source_C_format" attribute has been removed in favor of only having one "C_format" attribute. The "source" of the value in the C_format attribute will be listed in a new attribute "C_format_source" with the value of either "input_file" if the C_format was calculated from a text based input, or "database" if the C_format was taken from the internal database.
- (temporary) the netCDF to exchange function is not quite ready yet to work as an xarray accessor.
- (provisional) the order which netCDF variables appear is now in "exchange preferred" order.
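For a sense of how a C_format could be "calculated from a text based input", here is a minimal sketch that counts the digits after the decimal point. The function `c_format_from_text` is hypothetical, and the library's real rules (for example across a whole column, or for exponent notation) are likely more involved.

```python
def c_format_from_text(value: str) -> str:
    """Guess a printf-style format from how a number was written as text."""
    text = value.strip()
    if "." not in text:
        return "%.0f"
    decimals = len(text.split(".", 1)[1])
    return f"%.{decimals}f"


fmt = c_format_from_text("34.5670")
# Re-applying the format reproduces the trailing zero from the input text
round_trip = fmt % 34.567
```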
Bug Fixes
- Fixed an issue where the WOCE sumfile accessor would misalign latitude columns near the equator since they lacked a digit in the tens place.
- Fixed an issue where the WOCE sumfile accessor would use "pressure levels" of CTD source netCDF files as the number of bottles.
- Fixed an issue where stations might occur in an unexpected order.
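The latitude alignment issue comes down to fixed-width formatting. The sketch below pads the degrees to two characters so single-digit latitudes line up with the rest. The exact "DD MM.MM H" layout is an assumption for illustration, `format_lat` is a hypothetical name, and minutes that round up to 60.00 are not handled.

```python
def format_lat(lat: float) -> str:
    """Format decimal degrees as fixed-width degrees, minutes, hemisphere."""
    hemi = "N" if lat >= 0 else "S"
    deg, frac = divmod(abs(lat), 1)
    minutes = frac * 60
    # Width 2 for degrees keeps columns aligned when there is no tens digit
    return f"{int(deg):2d} {minutes:05.2f} {hemi}"


far = format_lat(12.5)
near_equator = format_lat(-0.25)
```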
- Python
Published by DocOtak almost 4 years ago
cchdo.hydro - The Errata Update (won't be the only one)
This release fixes many of the issues identified after the initial "1.0.0.0" release. Highlights include:
- Explicitly set the _FillValue attribute for the bottle closure time variable.
- The dtype for real number variables has been changed from float to double
- If the source data is an "exchange csv", a source_C_format attribute will (with some exceptions) be present on the real number data variables.
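A quick way to see why float versus double matters: a 32-bit float cannot hold a value like 1234.5678 exactly enough to get the same digits back, while a 64-bit double can. The value is made up, and this assumes "float" and "double" here mean the usual 32-bit and 64-bit netCDF types.

```python
import numpy as np

value = 1234.5678  # hypothetical measurement with 8 significant digits

as_float = np.float32(value)   # 32-bit "float"
as_double = np.float64(value)  # 64-bit "double"

# How far the stored value drifts from what was written
float_error = abs(float(as_float) - value)
double_error = abs(float(as_double) - value)
```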
- Python
Published by DocOtak over 4 years ago
cchdo.hydro - The Version Fix
This release fixes a typo in the pyproject.toml file which would cause the _version.py file to be invalid.
- Python
Published by DocOtak almost 5 years ago
cchdo.hydro - The First Release (Automated Edition)
Hopefully this fixes the errors which prevented the project from being published automatically to PyPI.
- Python
Published by DocOtak almost 5 years ago
cchdo.hydro - The First Release
After a whole bunch of testing, meetings, more testing, arguments, and a lot of work, we have declared the current status of the project as "good enough" for a 1.0.0 release.
There is much work to be done, especially since not all our files convert currently, but we think the ones that do convert are ready for public consumption. Unless something crazy goes wrong or is discovered, format changes should only be additive in nature (e.g. new attributes on variables).
The version will hopefully use the following (close to semver):
x.y.z
Where:
* x is incremented when a real breaking change to the netCDF output format is made.
* y is incremented when things are added to the netCDF format that should not break code which relies on previously existing attributes
* z is incremented for normal software releases that don't change the netCDF output.
- Python
Published by DocOtak almost 5 years ago