Recent Releases of cdm-reader-mapper
cdm-reader-mapper - v2.1.0
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
New features and enhancements
- implement the two wrapper functions, read and write, which call the appropriate function based on the mode argument (PR/238):
  - mode == "mdf": calls cdm_reader_mapper.read_mdf
  - mode == "data": calls cdm_reader_mapper.read_data or cdm_reader_mapper.write_data
  - mode == "tables": calls cdm_reader_mapper.read_tables or cdm_reader_mapper.write_tables
- optionally, call cdm_reader_mapper.read_tables with either source file or source directory path (PR/238).
- apply attribute to DataBundle.data if attribute is not defined in DataBundle (PR/248).
- apply pandas functions directly to DataBundle.data by calling DataBundle.<pandas-func> (PR/248).
- make DataBundle support item assignment for DataBundle.data (PR/248).
- optionally, apply selections to DataBundle.mask in DataBundle.select_* functions (PR/248).
- cdm_reader.reader.read_tables: optionally, set null_label (PR/242)
- new method function: DataBundle.select_where_all_false (PR/242)
- new method functions: DataBundle.split_* which split a DataBundle into two new DataBundles containing data selected and rejected after user-defined selection criteria (PR/242)
  - DataBundle.split_by_boolean_true
  - DataBundle.split_by_boolean_false
  - DataBundle.split_by_column_entries
  - DataBundle.split_by_index
- implement pandas indexer like iloc for not chunked data (PR/242)
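The mode-based dispatch of the read wrapper can be pictured roughly as follows. This is only a sketch: the three `_read_*` functions are stand-ins that echo their input (they are not the package's readers), and the lookup table `_READERS` and the error message are assumptions made for illustration.

```python
from typing import Any, Callable, Dict


# Stand-ins for the real readers: they only echo their input so that the
# sketch runs without cdm_reader_mapper installed.
def _read_mdf(source: str, **kwargs: Any) -> str:
    return f"read_mdf({source})"


def _read_data(source: str, **kwargs: Any) -> str:
    return f"read_data({source})"


def _read_tables(source: str, **kwargs: Any) -> str:
    return f"read_tables({source})"


# Hypothetical lookup table: mode name -> reader function.
_READERS: Dict[str, Callable[..., str]] = {
    "mdf": _read_mdf,
    "data": _read_data,
    "tables": _read_tables,
}


def read(source: str, mode: str, **kwargs: Any) -> str:
    """Dispatch to the reader registered for ``mode``."""
    if mode not in _READERS:
        raise ValueError(f"Unknown mode {mode!r}; expected one of {sorted(_READERS)}")
    return _READERS[mode](source, **kwargs)


print(read("observations.imma", mode="mdf"))
```

A write wrapper would follow the same pattern with a second lookup table for the "data" and "tables" writers.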
Internal changes
- cdm_reader_mapper.common.select: restructure, simplify and summarize functions (PR/242)
- split DataBundle class into main class (cdm_reader_mapper.core._utilities) and method function class (cdm_reader_mapper.core.databundle) (PR/242)
Breaking changes
- remove property tables from DataBundle object. Instead, DataBundle.map_model overwrites DataBundle.data (PR/238).
- change default overwrite values from True to False, which is consistent with the pandas inplace argument, and rename overwrite to inplace (PR/238, PR/248).
- inplace returns None, which is consistent with pandas (PR/242)
- DataBundle method functions return a DataBundle instead of a pandas.DataFrame (PR/248).
- DataBundle.select_* functions write only selected entries to DataBundle.data and do not take other list entries from common.select_* function returns into account (PR/248).
- select functions do not reset indexes by default (PR/242)
- rename DataBundle.select_* functions:
  - DataBundle.select_true -> DataBundle.select_where_all_boolean
  - DataBundle.select_from_list -> DataBundle.select_where_entry_isin
  - DataBundle.select_from_index -> DataBundle.select_where_index_isin
- rename cdm_reader_mapper.common.select_* functions and make them return a tuple of selected and rejected data after user-defined selection criteria (PR/242):
  - select_true -> split_by_boolean_true
  - select_from_list -> split_by_column_entries
  - select_from_index -> split_by_index
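The "tuple of selected and rejected data" and the pandas inplace convention can be illustrated with plain pandas. The helper `split_rows_all_true` below is hypothetical and is not the package's split_by_boolean_true; it assumes that "selected" means rows whose mask columns are all True, and the sample values are made up.

```python
import pandas as pd


def split_rows_all_true(data: pd.DataFrame, mask: pd.DataFrame):
    """Return (selected, rejected) without resetting the index."""
    keep = mask.all(axis=1)  # True where every mask column is True
    return data[keep], data[~keep]


data = pd.DataFrame({"sst": [10.5, 99.9, 11.2]}, index=[10, 11, 12])
mask = pd.DataFrame({"sst": [True, False, True]}, index=[10, 11, 12])

selected, rejected = split_rows_all_true(data, mask)
print(selected.index.tolist(), rejected.index.tolist())

# pandas convention referred to above: an in-place operation returns None
# and modifies the object it was called on.
scratch = data.copy()
result = scratch.drop(index=11, inplace=True)
print(result)
```

Because the index is not reset, the selected and rejected parts can still be traced back to their original rows.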
Bug fixes
- cdm_reader_mapper.metmetpy: set deck keys from ??? to d??? in icoads json files which makes values accessible again (PR/238).
- cdm_reader_mapper.metmetpy: set imma1 to icoads and immt to gcc in icoads/gcc json files which makes properties accessible again (PR/238).
- DataBundle.copy function now makes a real deepcopy of DataBundle object (PR/248).
- correct key index->section for self.df.attrs in open_netcdf (PR/252)
- cdm_reader_mapper.map_model: return null_label if conversion fails (PR/242)
- keep indexes during duplicate check (PR/242)
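Why a "real deepcopy" matters can be seen with a small stand-in container. The `Bundle` class below is hypothetical and much simpler than DataBundle; it only shows the difference between a shallow copy, which shares nested objects, and a deep copy, which does not.

```python
import copy


class Bundle:
    """Hypothetical stand-in for a container holding mutable data."""

    def __init__(self, data):
        self.data = data

    def copy(self):
        # deepcopy duplicates nested objects instead of sharing references
        return copy.deepcopy(self)


original = Bundle({"core": [1, 2, 3]})
shallow = copy.copy(original)  # shares the nested dict and list
deep = original.copy()         # owns independent copies

original.data["core"].append(4)

print(shallow.data["core"])  # also sees the appended value
print(deep.data["core"])     # unaffected by the change
```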
- Python
Published by ludwiglierhammer 11 months ago
cdm-reader-mapper - v2.0.1
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
Announcements
This release drops support for Python 3.9 and adds support for Python 3.13 (PR/228, PR/229)
New features and enhancements
- add environment.yml file (PR/229)
- cdm_reader_mapper now separates the optional dependencies into dev and docs recipes (PR/232).
- $ python -m pip install cdm_reader_mapper # Install minimum dependency version
- $ python -m pip install cdm_reader_mapper[dev] # Install optional development dependencies in addition
- $ python -m pip install cdm_reader_mapper[docs] # Install optional dependencies for the documentation in addition
- $ python -m pip install cdm_reader_mapper[all] # Install all the above for complete dependency version
Internal changes
- GitHub workflow for testing_suite now uses uv for environment management, replacing micromamba (PR/228)
- rename ci/requirements to CI and tidy up requirements/dependencies (PR/229)
Published by ludwiglierhammer about 1 year ago
cdm-reader-mapper - v2.0.0
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
New features and enhancements
- New core DataBundle object including callable cdm_mapper, metmetpy and operations methods (#84, #188, #197)
- Update readthedocs documentation (#191, #197)
- new function: write_data to write MDF data and validation mask, analogous to write_tables for writing CDM tables (#201)
- new function: read_data to read MDF data and validation mask, analogous to read_tables for reading CDM tables (#201)
- new property: DataBundle.encoding (#222)
- add overwrite option to some DataBundle method functions (#224)
Breaking changes
- cdm_mapper: map_model returns pandas.DataFrame instead of CDM dictionary (#189)
- cdm_mapper: rename function cdm_to_ascii to write_tables (#182, #185)
- cdm_mapper: update parameter names and list of functions read_tables and write_tables (#185)
- main cdm_mapper, mdf_reader and duplicates modules are directly callable from cdm_reader_mapper (#188)
- new list of imported submodules: map_model, cdm_tables, read_tables, write_tables, duplicate_check and read_mdf
- removed list of imported submodules: cdm_mapper, common, mdf_reader, metmetpy, operations
- remove imported submodules from cdm_mapper, mdf_reader (#188)
- read_tables: returning DataBundle object (#188)
- read_tables: resulting dataframe always includes multi-indexed columns (#188)
- duplicates is now a direct submodule of cdm_reader_mapper (#188)
- import read function from mdf_reader.read as read_mdf (#188)
- read_mdf: returning DataBundle object (#188)
- read_mdf: remove parameter out_path to dump attribute information on disk (#201)
- move function open_code_table from common.json_dict to cdm_mapper.codes.codes (#221)
- move operations to common (#224)
- cdm_mapper: rename table_writer to writer and table_reader to reader (#224)
- mdf_reader: rename write to writer and read to reader (#224)
- metmetpy: gather correction functions to correct module and validation functions to validate module (#224)
- DataBundle: remove properties selected, deselected, tablesdupflagged and tablesdupsremoved (#224)
Internal changes
- cdm_mapper: dtype conversion from write_tables to new submodule _conversions of map_model (#189)
- cdm_mapper: rename mappings to _mapping_functions (#189)
- cdm_mapper: mapping functions from mapper to new submodule _mappings (#189)
- cdm_mapper: save utility functions from table_reader.py and table_writer.py to _utilities.py (#185)
- reduce complexity of several functions (#25, #200):
  - mdf_reader.read.read
  - mdf_reader.validate.validate
  - mdf_reader.utils.decoders.signed_overpunch
  - cdm_mapper._mappings._mapping
  - metmetpy.station_id.validate.validate
- split mdf_reader.utils.auxiliary into mdf_reader.utils.filereader, mdf_reader.utils.configurator and mdf_reader.utils.utilities (#25, #200)
- simplify cdm_mapper.read_tables function (#192)
- mdf_reader: Refactored Configurator class, Configurator.open_pandas method, to handle looping through rows (#208, #210)
- mdf_reader: Refactored Configurator class, Configurator.open_data method, to avoid creating a pre-validation missing_value mask (#216)
- mdf_reader: move validate to utils.validators (#216)
- mdf_reader: no need for multi-column key codes (e.g. ("core", "VS")) (#221)
- mdf_reader.utils.validator: simplify function code_validation (#221)
- cdm_mapper.codes.common: convert range-key properties to list (#221)
- testing_suite: new chunksize test with icoads_r300_d721 (#222)
- mdf_reader, cdm_mapper: use model-dependent encoding while writing data on disk (#222)
- code restructuring (#224)
- remove unused functions and methods (#224)
Bug fixes
- Solve SettingWithCopyWarning (#151, #184)
- mdf_reader: utils.converters.decode returns values not only None (#214)
- mdf_reader: solving misleading reading due to German "umlauts" (#212, #214, #222)
Published by ludwiglierhammer about 1 year ago
cdm-reader-mapper - v1.0.2
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements
New PyPI classifiers:
- Development Status :: 5 - Production/Stable
- Intended Audience :: Science/Research
- License :: OSI Approved :: Apache Software License
- Operating System :: OS Independent
Published by ludwiglierhammer over 1 year ago
cdm-reader-mapper - v1.0.1
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements
- set package version to v1.0.1
Published by ludwiglierhammer over 1 year ago
cdm-reader-mapper - v1.0.0
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements
- Final version used for GLAMOD marine processing release 7.0
Bug fixes
- cdm_mapper: Two reports that describe each other as best duplicates are not flagged as duplicates (DupDetect) (:pull:149)
- cdm_mapper: Reindex only if null values available (DupDetect) (:pull:153)
Published by ludwiglierhammer over 1 year ago
cdm-reader-mapper - v0.4.3
Contributors to this version: Ludwig Lierhammer (:user:ludwiglierhammer)
Announcements
^^^^^^^^^^^^^
* First release on pypi (:issue:17)
* First release on zenodo (:issue:18)
Published by ludwiglierhammer over 1 year ago
cdm-reader-mapper - v0.4.2
Contributors to this version: Ludwig Lierhammer (:user:ludwiglierhammer)
Announcements
^^^^^^^^^^^^^
* First release on pypi (:issue:17)
* First release on zenodo (:issue:18)
Published by ludwiglierhammer over 1 year ago
cdm-reader-mapper - v0.4.1
Contributors to this version: Ludwig Lierhammer (:user:ludwiglierhammer)
Announcements
^^^^^^^^^^^^^
* First release on pypi (:issue:17)
* First release on zenodo (:issue:18)
Published by ludwiglierhammer over 1 year ago
cdm-reader-mapper - v0.4.0
Contributors to this version: Ludwig Lierhammer (:user:ludwiglierhammer) and Joseph Siddons (:user:jtsiddons)
Announcements
^^^^^^^^^^^^^
* Now under Apache v2.0 license (:pull:69)
* First release on pypi (:issue:17)
* First release on zenodo (:issue:18)
New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* common.getting_files.load_file: optionally, load data within data reference syntax (:pull:41)
* common.getting_files.load_file: optionally, clear cache directory (:pull:45)
* reworked readthedocs documentation for gathered cdm_reader_mapper package (:issue:19, :pull:83)
* mdf_reader: new validation function for datetime objects (:pull:89)
* mdf_reader: select time period with new arguments year_init and year_end (:pull:98)
* cdm_mapper: duplicate check using recordlinkage (:pull:81)
* mdf_reader.read: optionally, set left and right time bounds (year_init and year_end) (:issue:11, :pull:97)
* mdf_reader.read: optionally, set both external schema and code table paths and external schema file (:issue:47, :pull:111)
* cdm_mapper: Change both columns history and report_quality during duplicate_check (:pull:112)
* cdm_mapper: optionally, set column names to be ignored while duplicate check (:pull:115)
* cdm_mapper: optionally, set offset values for duplicate_check (:pull:119)
* cdm_mapper: optionally, set column entries to be ignored while duplicate_check (:pull:119)
* cdm_mapper: add both column names station_speed and station_course to default duplicate check list (:pull:119)
* cdm_mapper: optionally, re-index data in ascending order according to the number of nulls in each row (:pull:119)
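Ordering rows by their number of nulls, as in the last item, can be sketched in plain pandas. The helper `order_by_null_count` is hypothetical, and it assumes that "re-index" means reordering rows while keeping their original labels; the package may do this differently.

```python
import pandas as pd


def order_by_null_count(df: pd.DataFrame) -> pd.DataFrame:
    """Return rows sorted by ascending null count, keeping original labels."""
    null_counts = df.isna().sum(axis=1)
    # A stable sort keeps the original order among rows with equal counts.
    order = null_counts.sort_values(kind="stable").index
    return df.loc[order]


nan = float("nan")
df = pd.DataFrame({"a": [nan, 1.0, 2.0], "b": [nan, 3.0, nan]})

ordered = order_by_null_count(df)
print(ordered.index.tolist())
```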
Breaking changes
^^^^^^^^^^^^^^^^
* set chunksize from 10000 to 3 in testing suite (:pull:35)
* cdm_mapper: read header column location_quality from (c1, LZ) and set fill_value to 0 (:issue:36, :pull:37)
* cdm_mapper: set default value of header column report_quality to 2 (:issue:36, :pull:37)
* reading C-RAID data: set decimal places according to input file data precision (:pull:60)
* always convert data types of both int and float in schemas into default data types (:issue:59, :pull:60)
* cdm_mapper.map_model: call function without input parameter data_atts (:issue:66, :pull:67)
* decimal_places information is moved from mdf_reader.schema to cdm_mapper.tables; decimal_places in user-given schemas will be ignored (:issue:66, :pull:67)
* cdm_mapper does not need any attribute information from mdf_reader (:issue:66, :pull:67)
* cdm_mapper: map ICOADS wind direction data (361->0; 362->np.nan) (:pull:82)
* cdm_mapper: set fill_value to UNKNOWN for C-RAID's primary_station_id (:pull:93)
* cdm_mapper: map C-RAID quality flags to CDM quality flags (:pull:94)
* mdf_reader: summarize schema and code tables (:issue:11, :pull:97)
* mdf_reader: rename c_raid to craid, gcc_immt to gcc and imma1 to icoads (:issue:11, :pull:97)
* cdm_mapper: summarize tables and code tables (:issue:11, :pull:97)
* cdm_mapper: rename c_raid to craid and gcc_mapping to gcc (:issue:11, :pull:97)
* metmetpy: rename immt to gcc and imma to icoads (:issue:11, :pull:97)
* cdm_mapper.map_model: use standardized imodelname as 11, :pull:97)
* mdf_reader.read: use standardized imodelname as <datamodel>11, :pull:97)
* mdf_reader: (core, VS) set columntype to key for all ICOADS decks (:issue:11, :pull:97)
* cdm_mapper: rename pub47_noc mapping to pub47 (:pull:102)
* Note for each function call: rename data_model into imodel, e.g. imodel=icoads_r300_d704 (:pull:103)
* cdm_mapper.map_model: call with (data, imodel=imodel) (:pull:103)
* mdf_reader.read: call with (source, imodel=imodel) (:pull:103)
* Re-order arguments to mdf_reader.validate, and create argument for ext_table_path (:pull:105)
* operations: delete corrections module (:pull:104)
* cdm_mapper: duplicate check is available for header table only (:pull:115)
* cdm_mapper: set report_quality to 1 for bad duplicates (:pull:115)
* cdm_mapper: set default primarystationid to 4 for C-RAID mapping (:issue:117, :pull:121)
* renamed some element names in icoads_r300_d730 schema for consistency (InsName to InstName, InsPlace to InstPlace, InsLand to InstLand, No_data_entry to NumArchiveSet) (:pull:110)
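The wind direction mapping listed above (361->0; 362->np.nan) amounts to a small value substitution. The function below is a hypothetical sketch using only the standard library, with math.nan standing in for np.nan; it is not the package's mapping function.

```python
import math


def map_wind_direction(value: float) -> float:
    """Map the special codes 361 and 362; pass other values through."""
    if value == 361:
        return 0.0
    if value == 362:
        return math.nan
    return value


for raw in (90, 361, 362):
    print(raw, "->", map_wind_direction(raw))
```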
Internal changes
^^^^^^^^^^^^^^^^
* replace deprecated datetime.datetime.utcnow() with datetime.datetime.now(datetime.UTC) (see: https://github.com/python/cpython/issues/103857) (:pull:39, :pull:43)
* make use of cdm-testdata release v2024.06.07 https://github.com/glamod/cdm-testdata/releases/tag/v2024.06.07 (:issue:44, :pull:45)
* migration to setup-micromamba: https://github.com/mamba-org/provision-with-micromamba#migration-to-setup-micromamba (:pull:48)
* update actions to use Node.js 20: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#example-using-versioned-actions (:pull:48)
* mdf_reader.auxiliary.utils: rename variable for missing values to missing_values (:pull:56)
* add pre-commit hooks: codespell, pylint and vulture (:pull:56)
* use pytest.parametrize for testing suite (:pull:61)
* use ast.literal_eval instead of eval (:pull:64)
* remove unused code tables in mdf_reader (:issue:10, :pull:65)
* cdm_mapper.mappings: use datetime to convert float into hours and minutes.
* add FOSSA license scanning to github workflows (:pull:80)
* add cdm_reader_mapper author list including ORCID iD's (:pull:38, :pull:49)
* mdf_reader: replace empty strings with missing values (:pull:89)
* metmetpy: use function overwrite_data in all platform type correction functions (:pull:89)
* rename data_model into imodel (:pull:103)
* implement assertion tests for module operations (:pull:104)
* cdm_mapper: put settings for duplicate check in duplicatesettings (:pull:119)
* cdm_mapper: use pandas.apply function instead of for loops in duplicate_check (:pull:119)
* adding some more duplicate checks to testing suite (:pull:119)
* cdm_mapper: re-adding consideration of indexes of nan values during transformation (:pull:125)
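Three of the internal changes above follow common standard-library patterns, sketched here with hypothetical helper names. The sketch uses timezone.utc, for which datetime.UTC is an alias on Python 3.11 and later; how the package converts float hours in detail is not stated in these notes, so `hours_and_minutes` is only one possible reading.

```python
import ast
from datetime import datetime, timedelta, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


def parse_literal(text: str):
    """Parse a Python literal without executing arbitrary code, unlike eval."""
    return ast.literal_eval(text)


def hours_and_minutes(hours: float):
    """Split a float number of hours into whole hours and minutes."""
    total_minutes = round(timedelta(hours=hours).total_seconds() / 60)
    return divmod(total_minutes, 60)


print(utc_now().tzinfo)
print(parse_literal("('core', 'VS')"))
print(hours_and_minutes(12.5))
```

ast.literal_eval accepts only literals such as strings, numbers, tuples, lists and dicts, and raises ValueError for anything else, for example a function call.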
Bug fixes
^^^^^^^^^
* indexing working with user-given chunksize (:pull:35)
* fix reading of custom schema in mdf_reader.read (:pull:40)
* ensure format schema field for delimited files is passed correctly, avoiding "...Please specify either format or field_layout in your header schema..." error (:pull:40)
* there is a loss of data precision due to data type conversion. Hence, use default data types of both int and float (:issue:59, :pull:60)
* reading C-RAID data: adjust datetime formats to read dates into MDFFileReader (:pull:60)
* ensure external code tables are used when using an external schema in mdf_reader.read (:pull:105)
* update readme and example Jupyter notebooks to :pull:103 (:pull:110)
* restructure CLIWOC_datamodel Jupyter notebook to add an example of data model construction (:pull:110)
* remove create_data_model.ipynb example Jupyter notebook (:pull:110)
Published by ludwiglierhammer over 1 year ago
cdm-reader-mapper - v0.3.0
Contributors to this version: Ludwig Lierhammer (:user:ludwiglierhammer, :user:jtsiddons)
New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* mdf_reader: read C-RAID netCDF buoy data (:issue:13, :pull:24, :pull:28)
* adding both GCC IMMT and C-RAID netCDF data to test_data (:pull:24, :pull:28)
* cdm_mapper: adding C-RAID mapping and code tables (:issue:13, :pull:28)
* cdm_mapper: add load_tables to __init__.py (:pull:32)
Breaking changes
^^^^^^^^^^^^^^^^
* adding tests for IMMT and C-Raid data (:issue:26, :pull:24, :pull:28)
* cdm_mapper.map_model: drop duplicated lines in pd.DataFrame before writing CDM table on disk (:pull:28)
* add pyarrow (see: https://github.com/pandas-dev/pandas/issues/54466) to requirements
* solving pyarrow-snappy issue (see: openforcefield/openff-nagl#106) (:issue:33, :pull:28, :pull:34)
Internal changes
^^^^^^^^^^^^^^^^
* do not differentiate between tuple and single column names (:pull:24)
* metmetpy: Do not raise errors if validate_datetime, correct_datetime, correct_pt and/or validate_id do not find any entries (:pull:24)
* get rid of warnings (:issue:9, :pull:27)
* adding python 3.12 to testing suite (:pull:29)
* set time out for testing suite to 10 minutes (:pull:29)
Bug fixes
^^^^^^^^^^
* cdm_mapper: set debugging logger into if statement (:pull:24)
* cdm_mapper: do not use code table qc_flag with report_id (:pull:24)
* metmetpy: fixing ICOADS 30000 NRT functions for pandas>=2.2.0 (:pull:31)
* cdm_mapper.read_tables: if table not available return empty pd.DataFrame (:pull:32)
Published by ludwiglierhammer almost 2 years ago
cdm-reader-mapper - v0.2.0
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
Breaking changes
- move converters and decoders from common to mdf_reader/utils (PR #3)
- delete redundant functions from cdm_reader_mapper.common
- cdm_reader_mapper: import common in __init__.py
- remove unused modules from metmetpy
- cdm_reader_mapper.mdf_reader: split datamodels into codetables and schema
- logging: Allow for use of log file (PR #6)
- cannot use as command-line tool anymore (PR #22)
- outsource input and result data to cdm-testdata (GH #16, PR #21)
Internal changes
- adding tests to cdm_reader_mapper testing suite (GH #12, PR #2, #20, #22)
- adding testing result data (PR #4)
- use slugify instead of unidecode for licensing reasons
- remove pip install instruction (PR #2)
- HISTORY.rst has been renamed CHANGES.rst, to follow xclim-like conventions (PR #7).
- speed up mapping functions with swifter (PR #4)
- mdf_reader: adding auxiliary functions and classes (PR #4)
- mdf_reader: read tables line-by-line (PR #20)
Bug fixes
- Fixed an issue with missing conda dependencies in the cdm_reader_mapper documentation (PR #14)
Published by ludwiglierhammer almost 2 years ago