Recent Releases of dataset
dataset - v2.3.2
Fixed issue #161, handling GET with query where data is passed via URL parameters.
Removed support for frame, clone, sample, sync and join. The dsimporter cli was removed (use JSONL dump and load instead).
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.3.1...v2.3.2
- Go
Published by rsdoiel 8 months ago
dataset - v2.3.1
Work on issue #152 has started. The frames, sample and clone help has been removed as those features are deprecated. The frames support was removed from datasetd. Help is being reorganized. apidoc.go and apicmd.go were removed as they were dead code after the changes in the help implementation.
The -help option can pull up help on a topic by providing a keyword as a separate parameter. This lets you pull up help on the related commands.
Example:
~~~shell
dataset -help api
dataset -help dsquery
~~~
Help text is now maintained inside a single file, helptext.go.
While the code for frames, clone and sample remains in the dataset cli, it will be removed in an upcoming release before the transition to v2.4.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.3.0...v2.3.1
- Go
Published by rsdoiel 8 months ago
dataset - v2.3.0
The object versioning problem identified in issue #149 persisted after the release of v2.2.8. This resulted in the mitigation step of ignoring the version.json file held in the collection's root directory. As of v2.3 this file is no longer read or updated. Instead the collection level methods explicitly set the versioning type at the store level. This means one location holds versioning state, the collection.json. This simplified the codebase and appears to be backward compatible. A simpler approach to versioning for JSON documents and attachments is planned for v3.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.2.8...v2.3.0
- Go
Published by rsdoiel 8 months ago
dataset - v2.2.8
This release has focused on cleanup, bug fixes and documentation revisions.
- Fixes and mitigations for issues #148 and #149
- Implemented feature request issue #150
What's Changed
- V2.2.7 release by @rsdoiel in https://github.com/caltechlibrary/dataset/pull/147
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.2.7...v2.2.8
- Go
Published by rsdoiel 8 months ago
dataset - v2.2.7
This release has focused on cleanup, bug fixes and adding a redirect feature to support development without requiring JavaScript browser side.
- Fixed issue #138, where SQLite3 updated times were not set.
- Fixed issue #144, spurious form validation without a defined data model.
- Fixed issue #145, added support for createsuccess and createerror, which hold redirects for success and failure on POST requests that are URL encoded.
- Fixed issue #146, where path handling of the collection name caused the table name to be miscalculated.
What's Changed
- V2 merge by @rsdoiel in https://github.com/caltechlibrary/dataset/pull/143
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.2.5...v2.2.7
- Go
Published by rsdoiel 9 months ago
dataset - v2.2.5
Added the following functions to the dataset package
- (c *Collection) KeysJSON
- (c *Collection) UpdatedKeysJSON
- (c *Collection) QueryJSON
These provide JSON encoded object support for their base functions.
In datasetd, requesting an API object without specifying the content type returns an application/json object.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.2.4...v2.2.5
- Go
Published by rsdoiel 9 months ago
dataset - v2.2.4
Added the following functions to the dataset package
- (c *Collection) KeysJSON
- (c *Collection) UpdatedKeysJSON
- (c *Collection) QueryJSON
These provide JSON encoded object support for their base functions.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.2.3...v2.2.4
- Go
Published by rsdoiel 10 months ago
dataset - v2.2.2
This release removes libdataset support. If you need to use dataset from a language other than Go, the Dataset Project provides datasetd, a JSON API web service. The cli is also an option, depending on your circumstances.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.2.1...v2.2.2
- Go
Published by rsdoiel 10 months ago
dataset - v2.2.1
Fixed a bug in the 2.1 series where repair failed for default collections of type sqlite.
The libdataset sub-directory is deprecated. It is just too hard to maintain across platforms. Dataset provides a JSON API as a web service that is easily used from any programming language that supports HTTP access on localhost.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.2.0...v2.2.1
- Go
Published by rsdoiel 10 months ago
dataset - v2.2.0
This minor release sees the addition of two new dataset verbs and the introduction of SQLite3 as the default storage type. You can still create a pairtree store, but now you need to include that as a parameter when invoking the init verb.
The added verbs are dump and load. These offer a different approach than cloning repositories. The dump verb will write a JSONL object stream to standard out where the objects have two attributes, key and object. The key attribute corresponds to the object key in the dataset collection while the object attribute contains the JSON object in the collection. The load command can read this stream of objects and use them to populate a collection.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.1.24...v2.2.0
- Go
Published by rsdoiel 11 months ago
dataset - v2.1.24 maintenance release
Updated API configuration to support improved error handling. Compiled with Go 1.24.2
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.1.23...v2.1.24
- Go
Published by rsdoiel 11 months ago
dataset - fixed updated timestamp for SQLite3 stored collections
Upgraded the models dependency for datasetd and fixed the stuck updated column in SQLite3 based collections.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.1.22...v2.1.23
- Go
Published by rsdoiel over 1 year ago
dataset - Bug fix, crash on nil c.Model
Fixed crash when c.Model was nil. Updated to models v0.0.4.
- Go
Published by rsdoiel over 1 year ago
dataset - Experimental model support
This release adds experimental model support and validation to datasetd. Additionally it adds support for urlencode data submissions to the JSON API.
- Go
Published by rsdoiel over 1 year ago
dataset - maintenance
This is a maintenance release. A minor bug in the datasetd static file service was fixed so that JavaScript files have a mime type of "application/javascript". Fixed some issues in "make.bat" for compiling on Windows.
- Go
Published by rsdoiel over 1 year ago
dataset - dsquery and datasetd updates
This release features improved output options for dsquery and support for that in the datasetd API.
- Go
Published by rsdoiel over 1 year ago
dataset - Installer script improvements
If you set the environment variable PKG_VERSION the installer script will download and install the specified version. Otherwise it will download the default value in the script.
curl https://caltechlibrary.github.io/dataset/installer.sh | env PKG_VERSION='2.1.5' bash
or
$env:PKG_VERSION = '2.1.5'
irm https://caltechlibrary.github.io/dataset/installer.ps1 | iex
- Go
Published by rsdoiel over 1 year ago
dataset - Minor release
Added an experimental Powershell installer for dataset.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.1.14...v2.1.15
- Go
Published by rsdoiel over 1 year ago
dataset - Minor updates
This release updates some modules and fixes some documentation bugs.
- Go
Published by rsdoiel over 1 year ago
dataset - improvements to datasetd
Added documentation for datasetd explaining the YAML syntax and the API, as well as an example systemd service file. This release also fixes a problem in supporting the query path where the order of the form fields needs to match the order of the parameters in the SQL statement.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.1.11...v2.1.13
- Go
Published by rsdoiel over 1 year ago
dataset - Documentation for fixes, yaml support in configuration
This release includes corrections in the documentation of datasetd, the dataset RESTful JSON API. It also allows configuring that service using YAML instead of JSON.
- Go
Published by rsdoiel over 1 year ago
dataset - Maintenance Release
Upgrade to go1.22 and also upgrade to the latest dependent packages.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.1.9...v2.1.10
- Go
Published by rsdoiel almost 2 years ago
dataset - improved SQL error message in dsquery
This release has improved SQL error messages for dsquery.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.1.8...v2.1.9
- Go
Published by rsdoiel over 2 years ago
dataset - dsimport renamed dsimporter
macOS ships with a program called dsimport, so dataset's importer collides with that name. I have renamed my importer to dsimporter to avoid that problem.
- Go
Published by rsdoiel over 2 years ago
dataset - dsimport added
This release includes a new program called dsimport which will import a CSV file into a collection much like the import verb did in v1 of dataset. An "-all" option is included with the dataset clone action to make it easy to copy the content of a SQL based collection since copying the collection using the standard operating system copy will not resolve database and table name relationships when using a SQL engine to store your JSON.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.1.6...v2.1.7
- Go
Published by rsdoiel over 2 years ago
dataset - added csv output to dsquery
This is a minor feature release adding CSV output support to dsquery. Some bug fixes. No changes to libdataset so not included with release.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.1.5...v2.1.6
- Go
Published by rsdoiel over 2 years ago
dataset - dsquery with pairtree support
This release features an improved dsquery that can index and query a pairtree based dataset collection. In addition to supporting all the dataset collection storage engines the following options have been added or improved.
-sql : Read SQL from a file
-grid : Given a list of attribute names returned by the object in the SQL statement, make a 2D values grid and return instead of a list of JSON objects.
-index : Read the contents of a collection into a SQLite 3 database called index.db. It uses the same structure as a SQL storage engine based collection.
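To make the `-grid` idea concrete, here is a minimal Go sketch that turns a list of JSON objects into a 2D grid of values given a list of attribute names. It only illustrates the transformation and is not dsquery's implementation; the function name `toGrid` and the sample data are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toGrid builds a 2D grid: one row per object, one column per attribute
// name. Attributes missing from an object become nil (JSON null).
func toGrid(objects []map[string]interface{}, attrs []string) [][]interface{} {
	grid := make([][]interface{}, 0, len(objects))
	for _, obj := range objects {
		row := make([]interface{}, len(attrs))
		for i, name := range attrs {
			row[i] = obj[name] // nil when the attribute is absent
		}
		grid = append(grid, row)
	}
	return grid
}

func main() {
	src := `[{"key":"one","title":"First"},{"key":"two"}]`
	var objects []map[string]interface{}
	if err := json.Unmarshal([]byte(src), &objects); err != nil {
		fmt.Println("error:", err)
		return
	}
	out, err := json.Marshal(toGrid(objects, []string{"key", "title"}))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```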
There were no changes to libdataset in this release.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.1.4...v2.1.5
- Go
Published by rsdoiel over 2 years ago
dataset - dsquery
This release includes an experimental command dsquery which will let you query SQL based collections to return a list of objects as a JSON array. This approach looks to improve on the data frame generation previously implemented for pairtree.
- Go
Published by rsdoiel over 2 years ago
dataset -
This release is primarily a documentation update. I've documented postgres support in the dataset command dataset help init. Note if you need to set sslmode when using Postgres to store your dataset collection you need to pass it as part of the DSN uri.
dataset init data.ds "postgres://$DB_USER@localhost/data?sslmode=disable"
NOTE: there was a problem with the initial release of v2.1.3. You should see the hash "6e51f3e" in the version number if you run
dataset --version
- Go
Published by rsdoiel over 2 years ago
dataset - improved JSON handling
This release improves JSON handling and aligns it with the JSON handling in datatools v1.2.4.
- Go
Published by rsdoiel over 2 years ago
dataset - Bug fixes, cleanup, clarifications
Fixed bugs described in issue #123. Cleaned up some documentation, clarified thinking on versioning, and brought libdataset in line with the main v2 dataset codebase for version handling.
Added set versioning and get versioning to the cli as well as libdataset.
- Go
Published by rsdoiel about 3 years ago
dataset - Updates dependencies and bug fixes
This release sees the return of the libdataset C-shared library and a minimal level of backward compatibility with the v1.1 branch of dataset.
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v2.0.1...v2.1.0
- Go
Published by rsdoiel about 3 years ago
dataset - Bug fixes, code cleanup
This release includes some bug fixes and adds the CreateJSON(), ReadJSON() and UpdateJSON() methods on a collection, which existed in v1.1 but were missing.
Development is now in the "main" branch, releases for v2.x are in the "v2" branch and releases for v1.1.x are in the "v1" branch.
- Go
Published by rsdoiel about 3 years ago
dataset - code reorganization, experimental updates
This release features code re-organization, migration of modules back out of dataset (e.g. semver, dotpath, pairtree). Adding tentative support for Postgres 14.5.
- Go
Published by rsdoiel over 3 years ago
dataset - Code reorganization
Flattened the module organization. pairtree, dsv1 and dotpath are now back as separate caltechlibrary level modules, and the remaining sub-packages were merged back into the dataset package. Old fashioned naming conventions are used to deal with name collisions. The release now compiles and has its release version updated.
- Go
Published by rsdoiel over 3 years ago
dataset - Code reorganization
This is part of the v2.0.0 code reorganization backing off of the sub-package approach, folding everything back into the dataset root package to ease bug fixes.
- Go
Published by rsdoiel over 3 years ago
dataset - Snapcraft release
This release adds support for generating snaps for installing on Linux systems that support the snap package format (e.g. Debian, Ubuntu, ArchLinux, Fedora Linux). This also means dataset is supported on the same processor types that are supported via the snapcraft.io store.
- Go
Published by rsdoiel almost 4 years ago
dataset - Dataset as web service and metadata enhancements
Full Changelog: https://github.com/caltechlibrary/dataset/compare/v1.0.1...v1.1.0
Release 1.1.0:
Added attachment support for datasetd.
Updated the metadata fields to include richer PersonOrOrg data structures for author, contributor, funder as well as added an annotation map field for custom metadata.
Added "MetadataJSON()" function for Collection to quickly copy out the metadata values from a collection.
c, err := dataset.Open("MyData.ds")
...
defer c.Close()
fmt.Printf("%s", c.MetadataJSON())
Added "MetadataUpdate()" function to update a collection's metadata.
c, err := dataset.Open("MyData.ds")
...
defer c.Close()
meta := new(Collection)
meta.Description = "A test dataset"
meta.Version = "1.0.0"
meta.Author = []*PersonOrOrg{
    &PersonOrOrg{
        Type: "Person",
        GivenName: "Jane",
        FamilyName: "Doe",
    },
}
err = c.MetadataUpdate(meta)
...
Deprecated the dependency on the namaste package and Namaste support in the command line tools. Removed "collections.go and collections_test.go" from the repository (redundant code). Updated libdataset/libdataset.go to hold functions that were needed for the C-Shared library from collections.go. The Namaste fields in the collection's metadata are now deprecated.
The dataset.Init() function now places a lock file in the collection directory and leaves the collection in an "Open" state; it should be explicitly closed.
E.g.
c, err := dataset.Init("MyData.ds")
...
defer c.Close()
Removed "set_*" for collection metadata fields from libdataset.go. These should be set using the dataset command line tool only.
The dataset.Analyzer() and dataset.Repair() commands expect the dataset collections to be closed before being called. E.g.
c, err := dataset.Open("MyData.ds")
...
c.Close()
err = dataset.Analyzer("MyData.ds", true)
if err == nil {
    c, err = dataset.Open("MyData.ds")
    ...
}
- Go
Published by rsdoiel over 4 years ago
dataset - Refinements of Stable release
Release 1.0.1:
- Keys are stored lowercase
- Removed filtering and sorting options from dataset and libdataset
- Use pairtree 1.0.2 configurable separator
- Added check and repair for migrating to case insensitive keys and path
- Updated required packages to latest releases
- Added notes about Windows cmd prompt issues when providing JSON objects on command line
- Added M1 support for libdataset
- Go
Published by rsdoiel over 4 years ago
dataset - Stable Release
This is a stable release of dataset.
- Go
Published by rsdoiel over 4 years ago
dataset - Compiled with go1.16
This release saw changes to take advantage of go1.16 compilation and removed inclusion of the deprecated storage module. Compiled binaries for macOS M1 have been added (this is experimental).
- Go
Published by rsdoiel about 5 years ago
dataset - Improved memory handling for attachments
This release includes improved memory handling for large file attachments. Some minor bugs were squashed.
- Go
Published by rsdoiel over 5 years ago
dataset - Issue-103 updates
This is a minor maintenance release. Fixed comments in the source code to reflect the behaviors of "frame refresh" versus "frame reframe" in frames.go and libdataset.go. Also updated the documentation, adding back individual pages for frame "refresh" and "reframe".
- Go
Published by rsdoiel over 5 years ago
dataset - Minor bug fixes and renamed function
Minor bug fixes, removed duplicate code from libdataset.go, renamed make_objects() to create_objects() to be consistent with general naming scheme in libdataset.
- Go
Published by rsdoiel almost 6 years ago
dataset - Bug Fix issue with libdataset's frame_delete().
This was detected in an experimental version of pydataset. Inside libdataset.go's exported frame_delete() function the Go function FrameClear() was called instead of FrameDelete(). This made frame_delete() have the same behavior as frame_clear(). This was corrected, and tests were added in libdataset at both the Go level and the Python level.
- Go
Published by rsdoiel almost 6 years ago
dataset - Minor bug fixes in libdataset.
This release focuses on minor bug fixes in libdataset.
All functions which returned an error string only now return
True for success and False otherwise. The error string
can be retrieved with dataset.error_message().
Build Notes:
- golang v1.14
- Caltech library go packages
    - storage v0.1.0
    - namaste v0.0.5
    - pairtree v0.0.4
- OS used to compile and test
    - macOS Catalina
    - Windows 10
    - Ubuntu 18.04 LTS
- Python 3.8 (from Miniconda 3)
- zip has replaced tar in the releases of libdataset
Some tests fail on Windows 10 for libdataset. These will be addressed in future releases.
- Go
Published by rsdoiel almost 6 years ago
dataset - Simplification, improving test coverage
This release focuses on refining function names, simplification and ease of testing for Windows 10 deployments.
Build Notes:
- golang v1.14
- Caltech library go packages
    - storage v0.1.0
    - namaste v0.0.5
    - pairtree v0.0.4
- OS used to compile and test
    - macOS Catalina
    - Windows 10
    - Ubuntu 18.04 LTS
- Python 3.8 (from Miniconda 3)
- zip has replaced tar in the releases of libdataset
Renamed functions:
- collectionstatus() is now collectionexists()
Deprecated functions and features:
- S3, Google Cloud Storage support dropped.
- grid(), if you need this create a frame first and use frame_grid().
Some tests fail on Windows 10 for libdataset. These will be addressed in future releases.
- Go
Published by rsdoiel almost 6 years ago
dataset - Major code revision, some breaking changes
This release saw a major cleanup of code with some breaking changes in libdataset. FrameRefresh now only refreshes the objects in a frame, pruning objects no longer found in the collection. FrameReframe requires a key list and replaces the object map and key list in a frame. libdataset splits out the functionality of the command line "keys" verb. keys() returns all keys in a collection, keyfilter() is used to filter a key list, keysort() is used to sort a key list.
Invoking most libdataset functions will automatically open a collection if needed. There are some additional functions, including support for getting and setting most Namaste nouns about the collection (e.g. Who, What). Collection versions can be set and retrieved. Getting the dataset version via libdataset now uses the function datasetversion() while getting the version of a collection uses getversion().
libdataset now has tests using a custom libdataset Python module. This proved easier than writing C code to test the C-Shared library.
Finally, Google Sheets support has been removed. Directory S3 support will likely be removed in a future version. This will allow us to continue to trim the fat before making it to v1.x.
NOTE: this release lacks a libdataset.dll for Windows 10.
- Go
Published by rsdoiel almost 6 years ago
dataset - Bug fixes, repair enhancement
This is a pre-release to test changes in Frames handling as well as test the changes in repair to cover when a pairtree is found but a collection.json is missing.
Recompiled release with Go v1.14.
- Go
Published by rsdoiel about 6 years ago
dataset - v0.1.x series release with bug fixes
From v0.1.0 release
- Updated libdataset API, simplified func names and normalized many of the calls (breaking change)
- libdataset now manages opening dataset collections, inspired by Oberon System file riders (breaking change)
- Added Python test code for libdataset to make sure libdataset works
- Added support for check and repair when working on S3 deployed collections
- Refactored and simplified frame behavior (breaking change)
From v0.1.0 to v0.1.1
- Fixed problem where keys_exist was called before an open command.
From v0.1.1 to v0.1.2
- Persisting _Attachments metadata when updating with clean objects using the same technique as _Key
- Go
Published by rsdoiel about 6 years ago
dataset - Frame and C-Shared library improvements
This release has breaking changes in the syntax used in the command line tool as well as the function calls available from the C-Shared library. It includes a demo python package for testing the C-Shared library.
- The frame metadata object is simplified dropping information for things like filtering and sorting, it is now assumed you are supplying a list of keys where required
- The frame verb now has been split into two, frame-create to create data frames and frame to recall them.
- The reframe verb is split and renamed to frame-reframe and frame-refresh. The first will replace, add and re-order objects given a list of keys, the second will update objects and append new objects given a list of keys
- The delete-frame verb has been renamed frame-delete to be consistent with the rest of the frame commands
- Experimental support for running check and repair directly on S3 hosted dataset collections
- Experimental support in libdataset for operating on open collections without requiring an explicit close, and managing writes to the collection.json file and pairtree. This allows us to open many collections and manage them in a service context like a Python application providing web interactions
- Go
Published by rsdoiel over 6 years ago
dataset - Performance improvement
Performance improvement for object creation. Namaste is only created on collection init. Added the Go func CreateObjectsJSON() for collections. This func provides batch object creation using a default JSON source for created objects. It avoids writing collection metadata until all objects are created. For large data imports this saves writing collection.json on each individual object creation. This was added at the Go level to improve the performance of the make_objects() func in libdataset.
- Go
Published by rsdoiel over 6 years ago
dataset - libdataset additions
This release adds two functions to libdataset, make_objects() and update_objects(), which let you create objects in batch. This is helpful if you are scripting the ingest of large numbers of objects into dataset at once.
libdataset was compiled with Golang v1.13 on Linux, Windows 10 and macOS on Intel. https://github.com/tpoechtrager/wclang
- Go
Published by rsdoiel over 6 years ago
dataset - Fix bug issue #96
Fixed bug in the AttachFile() function which was leaving us with empty attachments.
- Go
Published by rsdoiel over 6 years ago
dataset - Bug fix, issue #95, attachment problem.
This is a bug fix release in prep for the v1.0.0 release. The AttachStream() func failed to actually write the attachment's content. Cleaned up code and corrected a missing assignment after reading the io.Reader buffer.
- Go
Published by rsdoiel over 6 years ago
dataset - Release candidate 4 for v1.0.0
Bug fixes in libdataset used by py_dataset. Testing cross compile from Linux to Windows and Mac OS X for libdataset.*
- Go
Published by rsdoiel over 6 years ago
dataset - Release 3 candidate for v1.0.0
In this release the data frame was refactored to drop the Grid attribute. You can get a Grid from a data frame by using the Grid() function on the frame. Added func Objects() to frame for returning a copy of the DataFrame.ObjectList values. Updated libdataset to reflect this change, adding both Objects() and Grid() funcs. Labels MUST be provided in the frame definition now. The lengths of the dot path and label lists must match. If the first element of the dot paths is NOT .Key, it will be prefixed to the dot path list automatically and a "Key" label will be prefixed to the labels list before generating the frame. ".Key" and "Key" will always be first in the lists of dot paths and labels. This is less critical as the Grid 2D array has been dropped, but it is always reflected in the output of the Grid function since the definitions of dot paths and labels enforce this.
The command line dataset has been updated to reflect the changes in frame definition requirements. In order to manage the command line arguments, you form the label/dot path pair by joining them with an equal sign. Labels are used as the keys in the objects of a frame while the dot path specifies where the value comes from in the collection's JSON objects. This makes a command line frame definition look something like this
cat keys.txt | dataset frame collection.ds MyFrame one=.title two=.family_name three=.given_name
The frame "MyFrame" will contain an object list with the keys of "one", "two" and "three" with values from the dot paths ".title, .family_name, .given_name".
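The label/dot path rules described in this release can be sketched in Go as follows. This is a simplified illustration, not the code used by the dataset command; the function name `parseFramePairs` is hypothetical, and it uses `strings.Cut`, which assumes Go 1.18 or later.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFramePairs splits "label=dotpath" arguments into parallel label
// and dot path lists. If the first dot path is not ".Key", it prefixes
// ".Key" to the dot paths and "Key" to the labels, as the release note
// describes.
func parseFramePairs(args []string) (labels []string, dotPaths []string, err error) {
	for _, arg := range args {
		label, path, ok := strings.Cut(arg, "=")
		if !ok || label == "" || path == "" {
			return nil, nil, fmt.Errorf("expected label=dotpath, got %q", arg)
		}
		labels = append(labels, label)
		dotPaths = append(dotPaths, path)
	}
	if len(dotPaths) == 0 || dotPaths[0] != ".Key" {
		labels = append([]string{"Key"}, labels...)
		dotPaths = append([]string{".Key"}, dotPaths...)
	}
	return labels, dotPaths, nil
}

func main() {
	args := []string{"one=.title", "two=.family_name", "three=.given_name"}
	labels, dotPaths, err := parseFramePairs(args)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(labels)
	fmt.Println(dotPaths)
}
```

Because each argument yields exactly one label and one dot path, the two lists always have matching lengths, which is the requirement stated for frame definitions.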
- Go
Published by rsdoiel over 6 years ago
dataset - Release candidate 2, for v1.0.0
Added boolean clean object to Read() and ReadList(). This will let you retrieve an object without the dataset added _Key and _Attachments attributes.
- Go
Published by rsdoiel over 6 years ago
dataset - Release candidate for v1.0.0
Dropped Bleve support. Removed buckets code. Removed verbs related to search (we'll add them back when Lunr is available in Go). Documentation cleanup. Refactored attachments from a tarball to a semver directory holding attachments by their base name. Expanded metadata in collection.json, improved init to include additional auto-generated metadata.
- Go
Published by rsdoiel over 6 years ago
dataset - Refactor frame to support a list of objects
The major change is that frames now include an object list and the grid element of a frame is considered deprecated. The object list has proven more useful in the code we write that uses frames. It is also more consistent with how Python and R represent a data frame. The array of objects remains easy to export to a 2D array if needed.
The code supporting a BUCKET collection has been removed as this data layout has been out of use for more than a year. This eliminated the migrate verb, simplified check and repair, and removed the need to specify a layout when initializing a collection.
Since removing the BUCKET layout is conceptually a breaking change (it is removing a feature) and the grid element in a data frame is being deprecated, this release is considered a pre-release while we test practical usage.
- Go
Published by rsdoiel over 6 years ago
dataset - Unified dataset and py_dataset release
This release is meant to unify the version numbers with dataset, libdataset and py_dataset.
- Go
Published by rsdoiel almost 7 years ago
dataset - Split python module from Go codebase
This release is focused on splitting the python codebase off to py_dataset so this repository only focuses on the Go language code base for the command line tools as well as the C-shared library support.
If you are porting dataset support to other languages look at py_dataset and use the appropriate libdataset shared library.
- Go
Published by rsdoiel almost 7 years ago
dataset - Windows Shared Library Support added, bug fixes
There are various small bug fixes in this release and improvements in Windows 10 support. Several "make.bat" files will now build on Windows 10 using Go v1.12.4 and Miniconda installed git/gcc. Started docs on building on Windows 10. Included is a py3 module supporting Windows 10 via an Anaconda compiled dll.
- Go
Published by rsdoiel almost 7 years ago
dataset - GSheet import improvements
The Google Sheet import now uses a ValueRenderOption() of UNFORMATTED_VALUE rather than FORMULA. As a result dataset will not store formulas but the unformatted values.
- Go
Published by rsdoiel almost 7 years ago
dataset - Bug Fix, issue #81
When moving collections between OS/file systems there are case sensitivity issues that crop up, in particular how Mac OS X currently handles (by default) case assignment in paths versus other OSes. When migrating a dataset collection between OS/file system types via zip or tar ball it is a good idea to run dataset check and dataset repair on the destination system. For pairtree layouts repair will now keep track of the missing JSON records, then walk the pairtree and re-attach those it finds, fixing the path issue. This is an ugly hack but will keep things workable for now. In the long run it might make sense to include an archive verb in the dataset command which ensures portable zip and tar balls, since that is easy to support via Go standard packages.
- Go
Published by rsdoiel almost 7 years ago
dataset - Bug fix release
There was a bug in the dataset command when handling literal JSON expressions versus JSON filename references in the create and update verbs.
- Go
Published by rsdoiel almost 7 years ago
dataset - Go module support
Adding Go Module support for releases.
- Go
Published by rsdoiel almost 7 years ago
dataset - Bug fixes, test updates, added Python module functionality
Fixed cli bug when sync-recieve for a CSV file on command (was writing to stdout even when CSV filename provided).
Fixed broken tests for GSheet, consolidated how they get processed on the Python module.
Add support for syncsendcsv, syncsendgsheet, syncrecievecsv and syncrecievegsheet in Python module.
- Go
Published by rsdoiel almost 7 years ago
dataset - Attachment normalization and bug fixes
Attachments are now saved as the base filename (the path to the file is no longer included). If you need to preserve paths you should zip your file(s) before attaching. There are some bug fixes around import/export to and from CSV and Google Sheets. The Python code has been updated to use the new versions of exportcsv, exportsheet, importcsv, importsheet found in the libdataset.go shared library.
Improvements to some warnings and error messages. Minor bug fixes.
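The effect of saving attachments by base filename can be illustrated with Go's standard `path/filepath` package. This is only a sketch of the naming rule; `attachmentName` is a hypothetical helper, not a function from dataset.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// attachmentName returns the name an attachment would be stored under
// when only the base filename is kept (hypothetical helper).
func attachmentName(path string) string {
	return filepath.Base(path)
}

func main() {
	// Both paths reduce to the same stored name, which is why zipping
	// files first is suggested when directory structure matters.
	fmt.Println(attachmentName("reports/2019/summary.pdf"))
	fmt.Println(attachmentName("drafts/summary.pdf"))
}
```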
- Go
Published by rsdoiel about 7 years ago
dataset - Bug fixes, issue #73, #74
This is a small bug fix release.
- Go
Published by rsdoiel over 7 years ago
dataset - Descending key sort bug fixed
There was an iteration value incremented on descending sorts when it should have been decremented. That is now fixed. Also added the missing has_frame() function to the Python module.
- Go
Published by rsdoiel over 7 years ago
dataset - Collection Metadata release
Added support for Namaste style metadata about a collection (type, who, what, when, where, version and contact). The metadata is stored in the collection.json and the Namaste are rendered on SaveMetadata() calls. SaveMetadata() is now an exported function. There is now a concept of "datasetversion" versus "version". "version" is the semver associated with the collection, "datasetversion" is the version associated with the "type" in Namaste as well as the version of the dataset command/library that wrote the collection. This is a breaking change from pre-v0.0.48 releases. I have changed the Python module's version() function to reflect this. It is now called dataset_version(). The Namaste features are not implemented in the Python package but may be in the future.
- Go
Published by rsdoiel over 7 years ago
dataset - Major refactoring of code
The command line tool has been streamlined using a more Git-like structure for associating options and verbs. A tentative sync command can now use a frame to synchronize collection data with a CSV file or Google Sheet (it needs lots of real world testing). Documentation is being quickly revised to reflect most of these changes. The purpose of this release is to encourage internal Caltech Library use so we can get the bug reports to fix things properly. We're getting close to the feature set planned for v1.0.0.
- Go
Published by rsdoiel over 7 years ago
dataset - UTF-8 bug fix in Pairtree support
This release is compiled against pairtree v0.0.2, eliminating a bug where some Unicode characters were getting split at the wrong byte boundary when forming paths or keys.
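A short Python sketch shows the kind of problem involved. It is not the pairtree package's code: the function names are hypothetical, and the two-character segment width is an assumption based on the Pairtree layout. Chunking the UTF-8 bytes can cut a multi-byte character in half, while chunking whole characters cannot.

```python
def split_bytes(key, width=2):
    """Naive split (hypothetical): chunk the UTF-8 bytes, which can cut a character in half."""
    raw = key.encode("utf-8")
    return [raw[i:i + width] for i in range(0, len(raw), width)]


def split_chars(key, width=2):
    """Safer split (hypothetical): chunk whole characters so each segment is valid text."""
    return [key[i:i + width] for i in range(0, len(key), width)]


def all_valid_utf8(chunks):
    """Report whether every byte chunk decodes as UTF-8 on its own."""
    for chunk in chunks:
        try:
            chunk.decode("utf-8")
        except UnicodeDecodeError:
            return False
    return True
```

For the key `"a\u00e9b"` the byte-wise split separates the two bytes of the accented character, so `all_valid_utf8(split_bytes("a\u00e9b"))` is `False`, while `split_chars("a\u00e9b")` keeps the character intact in its first segment.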
- Go
Published by rsdoiel over 7 years ago
dataset - Pairtree file layout release
In addition to the usual assortment of bug fixes this release features support for using a Pairtree layout instead of the default buckets layout of files in a collection. You can migrate between the two layouts using a migrate option in the command line tool.
- Go
Published by rsdoiel over 7 years ago
dataset - Revised Open repository 2018 release
Picked better defaults for CSV import into dataset.
- Go
Published by rsdoiel over 7 years ago
dataset - pre open repositories 2018 release
This is a snapshot of dataset as we prepare for Open Repositories 2018, the week of June 2nd, 2018. This release includes a few minor bug fixes; the primary changes from v0.0.41 are documentation cleanup and a small metadata change to frames. With the frame change, keys are always the first column in a frame's grid even if not specified.
- Go
Published by rsdoiel over 7 years ago
dataset - Added frame support
Added frame (aka data frame) support to dataset, based on grid plus persistent metadata. Frames persist with the collection and can be regenerated, modified and removed. See the document collections-grids-and-frames.md for details.
Misc. bug fixes, some documentation edits.
dsws has been removed from the package. Bleve support remains in the dataset command but may be replaced in the future by another indexing/search system. dataset may be feature complete for a v0.1.0 release at this point.
- Go
Published by rsdoiel almost 8 years ago
dataset - Adding data grid support
This release includes misc bug fixes but primarily includes support for generating 2D arrays from a collection with a new command called "grid". E.g. dataset Articles.ds grid articles.keys .pub_date .title '.creators[:]' The grid verb is also supported in the Python module for v0.0.40. Using a "grid" of collection data is useful when sorting is important to analysis or reporting. See the how-to titled "data grids".
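The idea of a grid can be sketched in plain Python. This does not use the dataset Python module: `build_grid` is a hypothetical function, the records are made-up sample data, and plain field names stand in for dot paths. Each key becomes one row and each requested field becomes one column, which makes sorting by a column straightforward.

```python
def build_grid(records, keys, fields):
    """Hypothetical sketch: build a 2D list with one row per key and one column per field."""
    rows = []
    for key in keys:
        record = records.get(key, {})
        # Missing fields become None so every row has the same width
        rows.append([record.get(field) for field in fields])
    return rows


# Made-up sample data for illustration only
records = {
    "a1": {"pub_date": "2018-03-01", "title": "Second"},
    "a2": {"pub_date": "2017-11-15", "title": "First"},
}
grid = build_grid(records, ["a1", "a2"], ["pub_date", "title"])
by_date = sorted(grid, key=lambda row: row[0])  # sort rows on the pub_date column
```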
- Go
Published by rsdoiel almost 8 years ago
dataset - clone and clone sample added
The big change is the addition of the clone and clone-sample verbs to the command line and Python module. There are also various bug fixes, including revisions to how errors are propagated in the Python module via a tuple for those functions that map closely to the Go methods (e.g. create, read, update, delete, extract, find).
- Go
Published by rsdoiel almost 8 years ago
dataset - Bug fixes and Python module cleanup
Switched from distutils to setuptools for building the Python package from libdataset.go. Fixed bugs related to the following closed issues: 42-45, 47, 48 and 50.
- Go
Published by rsdoiel almost 8 years ago
dataset - csv export cell alignment bug
This release improves error messages and fixes a cell alignment bug when converting a collection to a table (CSV). Tests were added for export_csv. This may also fix a bug in export to GSheets, but those tests still need to be written.
- Go
Published by rsdoiel almost 8 years ago
dataset - Python 3.6 module error_message() function
Added an error_message() function to the Python module so you can check for the last error recorded in the Go runtime, e.g.

```python
# my_collection and handle_error() are placeholders supplied by the caller
record = dataset.read(my_collection, "1023")
e_msg = dataset.error_message()
if e_msg != "":
    handle_error(e_msg)
```
- Go
Published by rsdoiel almost 8 years ago
dataset - Py module improvements
Improved error messages when running tests for the Python module and updated the docs to indicate we require Python 3.6. Added the Python module method usestrictdotpath() to configure whether a dotpath error is treated as a warning or a failure.
- Go
Published by rsdoiel almost 8 years ago
dataset - Python module installation support
The primary change in this release is that the dataset.py module became a package, and package installation is now supported for pre-compiled libdataset.(so|dll|dylib). The Python module is now in a .tar.gz file as that is the way Python's distutils builds them.
A correction in the link in the codemeta.json was also made.
- Go
Published by rsdoiel almost 8 years ago
dataset - issue 38 fixes
Issue 38 fixed, minor testing bugs corrected, added missing tests from Python 3 module.
- Go
Published by rsdoiel almost 8 years ago
dataset - issues 29-35 Python module completed
Bug fixes and corrections for issues 29-35. The Python module is now fully mapped from the command line tool. Documentation is being re-organized.
- Go
Published by rsdoiel almost 8 years ago
dataset - issue 6, 29, 30 changes, improved docs
Fixed issues 6, 29 and 30 as well as a few other minor bugs which turned up as I went through and corrected docs. Added several new tests to test_cmd.bash to cover the technique shown in the docs. Renamed import to import-csv, export to export-csv to better fit with the verb approach for import-gsheet, export-gsheet. If a collection name is specified before or after the "verb" and has the ".ds" extension it should function the same as the old -c, -collection option.
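The positional collection name described above can be sketched as a small argument filter. This is an illustration in Python, not the Go command line parser used by dataset: `split_collection` is a hypothetical function that treats the first argument ending in ".ds" as the collection name, whether it appears before or after the verb.

```python
def split_collection(args):
    """Hypothetical sketch: pull the first '.ds' argument out as the collection name."""
    collection = None
    rest = []
    for arg in args:
        if collection is None and arg.endswith(".ds"):
            collection = arg
        else:
            rest.append(arg)
    return collection, rest
```

Under this sketch, `["Articles.ds", "keys"]` and `["keys", "Articles.ds"]` resolve to the same collection and the same remaining arguments.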
- Go
Published by rsdoiel almost 8 years ago
dataset - deprecated dsfind, dsindexer; cleanup of cli behavior and docs
Issue 27 addressed. Cleaned up various docs. Made COLLECTION_NAME a semi-required cli parameter before actions. Updated doc code examples to reflect this and deemphasized use of the DATASET environment variable.
- Go
Published by rsdoiel almost 8 years ago
dataset - issues 25 and 28 changes
Docs cleanup, primarily fixes and changes related to issues 25 and 28.
- Go
Published by rsdoiel almost 8 years ago
dataset - indexer, deindexer and find available with dataset command
The primary change is that the dataset command now supports the verbs indexer, deindexer and find. Minor changes included improved documentation, update of the "check" verb.
- Go
Published by rsdoiel almost 8 years ago
dataset - issue_17 updates
Issue 17 Python updates, internal API changes at the Go level to solve closing indexes for search testing. Added tests for deindexing records in both command line and Python module. Added tests both at Go and Python module level.
- Go
Published by rsdoiel almost 8 years ago
dataset - issue 22, 19 and part of issue 17 fixes
Added an extract method to the Python module, added tests for the -overwrite function, and added the sheet name to error reporting.
- Go
Published by rsdoiel almost 8 years ago
dataset - issue_20 changes
Renamed -id-file flag to -key-file for dsindexer. Bug fixes, some doc changes. Improved Python module build integration.
- Go
Published by rsdoiel almost 8 years ago
dataset - issue_19 changes
Added an -overwrite option to dataset cli.
- Go
Published by rsdoiel almost 8 years ago
dataset - issue #16 fixes
dataset's dsindexer now supports the original simplified index mapping (definition) as well as the native Bleve index map format. The Bleve native indexes should be created with a file extension of .bmap.
- Go
Published by rsdoiel almost 8 years ago
dataset - issues #13, #14, #15 fixes
This release is primarily implementing fixes.
- Go
Published by rsdoiel almost 8 years ago