Recent Releases of OpenMSIStream

OpenMSIStream - v.1.8.3.6

Scientific Software - Peer-reviewed - Python
Published by davidelbert 9 months ago

OpenMSIStream - v.1.8.3.5

Minor repairs to pyproject.toml

Published by davidelbert 9 months ago

OpenMSIStream - v1.8.3.4

With fixed auto versioning from issue #83

Published by davidelbert 9 months ago

OpenMSIStream - v1.8.3.3

Published by davidelbert about 1 year ago

OpenMSIStream - v1.8.3.2

What's Changed

  • Ensure defaults for max_wait/max_initial_wait are always applied. by @tmcqueen-materials in https://github.com/openmsi/openmsistream/pull/77

Full Changelog: https://github.com/openmsi/openmsistream/compare/v1.8.2.9...v1.8.3.2

Published by davidelbert about 1 year ago

OpenMSIStream - v1.8.3.6

Published by davidelbert about 1 year ago

OpenMSIStream - v1.8.3.1

Published by davidelbert about 1 year ago

OpenMSIStream - v1.8.3.0

Rolls up all the 1.8.2.x updates, including PR 77 to ensure defaults for max_wait/max_initial_wait are always applied.

Published by davidelbert about 1 year ago

OpenMSIStream - v1.8.2.9

Small addition of accepting floats for wait time.

Published by davidelbert about 1 year ago

OpenMSIStream - v1.8.2.8

New logging improvements

Published by davidelbert about 1 year ago

OpenMSIStream - v.1.8.2.7

Improved logging for service tests so if something fails, the log file gets captured.

Published by davidelbert about 1 year ago

OpenMSIStream - v.1.8.2.6

Added self production of logs

Published by davidelbert about 1 year ago

OpenMSIStream - v1.8.2.5

Merged ability to rerun checks that fail in CI

Published by davidelbert about 1 year ago

OpenMSIStream - v1.8.2.4

Fixed credentials for S3 test

Published by davidelbert about 1 year ago

OpenMSIStream - v1.8.2.3

Extended and enhanced logging

Published by davidelbert about 1 year ago

OpenMSIStream - 1.8.2.2

Fixed docker hub build and push to rerunning action.

Published by davidelbert about 1 year ago

OpenMSIStream - 1.8.2.1

Update to rerun workflow run.

Published by davidelbert about 1 year ago

OpenMSIStream - 1.8.2

Added matrix testing across the range of compatible Python versions and advanced support to newer Python versions. Added setting of the mimeType for files uploaded with the Girder stream processor, using python-magic. Updated the workflow to pass --skip-existing to the twine upload to PyPI.

Published by davidelbert about 1 year ago

OpenMSIStream - v1.8.1

This release includes a fix for a crash that would appear when logging the final shutdown message for an S3TransferStreamProcessor.

Published by eminizer over 1 year ago

OpenMSIStream - v1.8.0

This release edits the "ControlledProcess" logic to allow all long-running programs to periodically produce "heartbeat" messages to a topic, so that users can remotely and asynchronously monitor which programs are still alive and running. The heartbeat messages have keys like "[program_id]_heartbeat" (where "program_id" is set on the command line) and JSON-formatted string values that are dictionaries with timestamps and information on how many messages/bytes have been produced/read/processed since the previous heartbeat.

Three command line arguments are added to specify the name of the topic that should accept the heartbeat messages, the ID of the program (to uniquely identify its heartbeat messages if multiple programs are sending heartbeats to the same topic), and the interval at which heartbeat messages should be produced.

Heartbeat messages can be sent to a different broker than their main programs are interacting with by adding broker configurations in the new "[heartbeat]" section of the configuration file. That section can also include configurations for the producer that will produce the heartbeat messages.

The PR also includes new CI tests and documentation for this new behavior.
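As a rough illustration of the message shape described in this release note, the sketch below builds a single heartbeat key/value pair. The helper name `make_heartbeat`, the exact key format, and the field names inside the JSON value (`timestamp`, `n_messages`, `n_bytes`) are assumptions made for the example, not OpenMSIStream's actual schema.

```python
import json
from datetime import datetime, timezone


def make_heartbeat(program_id, n_messages, n_bytes, now=None):
    """Return a (key, value) pair for one heartbeat message.

    Hypothetical helper: the key format and field names are illustrative only.
    """
    now = now or datetime.now(timezone.utc)
    # Assumed key layout: the program ID followed by a "heartbeat" suffix
    key = f"{program_id}_heartbeat"
    value = json.dumps(
        {
            "timestamp": now.isoformat(),
            # Counts accumulated since the previous heartbeat
            "n_messages": n_messages,
            "n_bytes": n_bytes,
        }
    )
    return key, value


key, value = make_heartbeat("upload_lab_pc_1", n_messages=42, n_bytes=1_048_576)
print(key)
print(value)
```

A monitoring consumer could then parse each value with `json.loads` and flag any program whose most recent timestamp is older than a few heartbeat intervals.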

Published by eminizer over 1 year ago

OpenMSIStream - Hashable EventHandlerActiveFile

This release fixes a rare bug that would potentially cause problems when running a DataFileUploadDirectory (especially if the files in that directory were very large in size). The fix is to make the EventHandlerActiveFile utility dataclass hashable by setting eq and frozen both True in the decorator; EventHandlerActiveFile objects are not mutable so this is fine.
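The effect of setting eq and frozen together can be seen in a minimal sketch. The `ActiveFile` class and its fields below are hypothetical stand-ins, not the real EventHandlerActiveFile definition; only the decorator arguments are taken from the release note.

```python
from dataclasses import dataclass, FrozenInstanceError
from pathlib import Path


@dataclass(eq=True, frozen=True)
class ActiveFile:
    """Hypothetical stand-in for a small record describing a watched file."""

    path: Path
    size_bytes: int


a = ActiveFile(Path("data/run1.dat"), 1024)
b = ActiveFile(Path("data/run1.dat"), 1024)

# With eq=True and frozen=True, the dataclass machinery generates __hash__
# from the fields, so equal instances hash equally and can be set members
# or dictionary keys.
assert a == b and hash(a) == hash(b)
active = {a, b}
print(len(active))  # 1

# frozen=True also blocks attribute assignment after construction.
try:
    a.size_bytes = 2048
except FrozenInstanceError:
    print("instances are immutable")
```

With eq=True alone (and frozen left False), Python sets `__hash__` to None, which is why such objects cannot be placed in sets or used as dictionary keys.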

Published by eminizer over 1 year ago

OpenMSIStream - v1.7.9

This release makes use of new command-line argument propagation logic in OpenMSIToolbox, which greatly simplifies the process of propagating command line arguments through to class constructors.

Published by eminizer about 2 years ago

OpenMSIStream - v1.7.8

This release fixes parsing of KafkaCrypto config files, allowing full use of any parameters there in the KafkaCrypto implementation. Users with already-running workflows that rely on encryption should double check that everything still works after upgrading to this version, since before now the KafkaCrypto config files weren't being read fully. (This will only affect users that have edited the KafkaCrypto config files themselves; otherwise everything will be the same.)

Published by eminizer about 2 years ago

OpenMSIStream - v1.7.7

This release adds a new flag ("--treat_undecryptable_as_plaintext") for encrypted Consumer-side programs to allow more efficient processing of messages that will never be decrypted. Useful for cases where encrypted and non-encrypted messages get mixed in topics, or when enabling/disabling encryption across a deployment.

Published by eminizer about 2 years ago

OpenMSIStream - v1.7.6

This release adds a new option ("--use_polling_observer") for a DataFileUploadDirectory that will make the directory monitoring use watchdog's fallback PollingObserver for detecting changes in directory trees instead of the default. The new option should be used for Windows deployments where watched directories are not on NTFS file systems.
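To show what a polling-based fallback does conceptually, here is a simplified, standard-library-only sketch: take a snapshot of the directory tree, take another later, and compare the two. It is not watchdog's implementation and does not use watchdog's API; the function names `snapshot` and `diff` are made up for this example.

```python
import os
import tempfile


def snapshot(root):
    """Map each file path under root to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            state[full] = (st.st_mtime_ns, st.st_size)
    return state


def diff(old, new):
    """Return (created, modified, deleted) path sets between two snapshots."""
    created = set(new) - set(old)
    deleted = set(old) - set(new)
    modified = {p for p in set(old) & set(new) if old[p] != new[p]}
    return created, modified, deleted


with tempfile.TemporaryDirectory() as root:
    before = snapshot(root)
    with open(os.path.join(root, "new_file.txt"), "w") as fp:
        fp.write("hello")
    after = snapshot(root)
    created, modified, deleted = diff(before, after)
    print(sorted(os.path.basename(p) for p in created))  # ['new_file.txt']
```

A real poller would repeat the snapshot on a timer. The trade-off is that polling works on any file system that can be listed, but costs a full directory walk per interval rather than relying on notifications from the operating system.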

It also includes a minor update to allow independently configuring log file locations for all programs.

Published by eminizer about 2 years ago

OpenMSIStream - v1.7.4

This release makes it possible to include arbitrary (JSON-serializable) metadata for uploads using a GirderUploadStreamProcessor

Published by eminizer about 2 years ago

OpenMSIStream - v1.7.3

This release adds a command line script to re-produce encrypted messages to their original topics. It also adds some docs and a CI test for the script, and improves some of the KafkaCrypto functionality/configurability in general.

Published by eminizer about 2 years ago

OpenMSIStream - v1.7.2

This release fixes a bug in running a GirderUploadStreamProcessor from the command line

Published by eminizer about 2 years ago

OpenMSIStream - v1.7.1

This release fixes a few bugs for rare circumstances encountered in two recent deployments, including:

  • Disambiguating status of files being stream processed, both in memory and in the record CSV files
  • Escaping any possible special characters in filepaths to allow consumers of those files to restart successfully
  • Simplifying some multithreading logic that could rarely cause indefinite hangs
  • Adding a warning message for when a file failed to be opened for upload due to repeated permissions errors

Published by eminizer over 2 years ago

OpenMSIStream - v1.7.0

This release splits out several utilities into the new OpenMSIToolbox package and adds that package as a dependency

Published by eminizer over 2 years ago

OpenMSIStream - v1.6.0

This version adds the GirderUploadStreamProcessor class, with associated documentation, an example, and a CI test.

Published by eminizer over 2 years ago

OpenMSIStream - v1.5.3

This release fixes some bugs discovered when running at high throughput, with a large number of threads. It also makes some consumer-side behavior slightly more efficient, particularly in the case of filtering out lots of messages that don't need to be processed for a particular application.

Published by eminizer over 2 years ago

OpenMSIStream - v1.5.2

This release fixes a bug introduced when Watchdog was implemented to watch files. The bug would only affect systems running Python<3.9. In addition to the bug fix, it also overhauls the CI testing to run the package under both Python 3.9.16 and Python 3.7.12, so that problems like these won't crop up in the future.

Published by eminizer over 2 years ago

OpenMSIStream - v1.5.1

Minor bug fix (possible circular import issue)

Published by eminizer almost 3 years ago

OpenMSIStream - v1.5.0

This release adds some functionality requested by collaborators:

  1. A "--download_regex" command line option for all consumer-type programs that allows selecting which files should be sent for further processing. While all messages must be read from the topic, messages whose original files' relative paths don't match the specified regex will be skipped over once they've been read from the topic.
  2. A "--mode" command line option for stream processors and stream reproducers to allow users to decide whether files downloaded by those programs should be stored in "memory", on "disk", or "both". Storing files in memory allows for the fastest processing, while storing them on disk allows processing of files too large to hold in memory at once. The "both" option is useful for keeping a local copy of data that are processed without needing to read those files using a separate consumer.

The release also includes new CI tests and documentation updates for both of the above, and some further improvements to the code organization.
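The regex-based selection in item 1 can be pictured with a short sketch. The function `select_files` and the example paths are hypothetical, and whether OpenMSIStream applies the pattern with `search`, `match`, or `fullmatch` is not stated in this note; `search` is assumed here.

```python
import re


def select_files(relative_paths, download_regex):
    """Keep only the relative paths that match the given regex.

    Hypothetical helper: uses re.search, which is an assumption about
    how the pattern is applied.
    """
    pattern = re.compile(download_regex)
    return [p for p in relative_paths if pattern.search(p)]


paths = ["xrd/sample1.csv", "xrd/sample1.log", "sem/image.tif"]
print(select_files(paths, r"^xrd/.*\.csv$"))  # ['xrd/sample1.csv']
```

Every path is still inspected, which mirrors the note's point that all messages must be read from the topic before the non-matching ones are skipped.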

Published by eminizer almost 3 years ago

OpenMSIStream - v1.4.0

This release overhauls how directories are monitored to use the "watchdog" Python library instead of our old bespoke solution. As a result, that behavior is now much more robust and scalable, and it's now safer and less memory intensive to watch directories that have large numbers of files in them.

This version is also the first with a truly official OpenMSIStream Docker image published on the openmsi DockerHub organization.

Published by eminizer almost 3 years ago

OpenMSIStream - v1.3.4

This release further edits the PyPI and DockerHub upload actions

Published by eminizer almost 3 years ago

OpenMSIStream - v1.3.3

This release polishes the DockerHub upload action and adds some documentation about the DockerHub image

Published by eminizer almost 3 years ago

OpenMSIStream - Adding official Docker image

This release adds an official openmsistream Docker image to the repository. It's mostly to test a new GitHub action that should build and publish that image when releases are published.

Published by eminizer almost 3 years ago

OpenMSIStream - v1.3.1

This is the final version for our submission to JOSS

Published by eminizer almost 3 years ago

OpenMSIStream - v1.3.0

This release adds some new functionality requested during JOSS review.

It fixes some problems with tests that occurred because of an updated version of pylint, and improves the behavior of running CI tests locally and interactively with the "--no_kafka" flag (this improvement is also now checked in the CI/CD pipeline). It also adds some scripts for standing up a local broker to use in tests through Docker, and allows that local broker to be created and removed automatically with the "--local_broker" flag to the "run_all_tests.py" script. The documentation has been updated to reflect that new functionality, and the local broker is also included as an option in the documentation for the tutorials.

Published by eminizer almost 3 years ago

OpenMSIStream - v1.2.0

This release includes updates for adding a tutorial section to the documentation, with additional CI tests, and some structural re-organization. It also fixes a compatibility issue with Python 3.7.

Published by eminizer about 3 years ago

OpenMSIStream - v1.1.6

This version removes an unnecessary check in some CI tests that would complain if some output directories already existed. That could happen if tests had previously been run on the system and failed, so a failed run would cause subsequent runs to fail until its output had been manually removed. That isn't convenient for anyone, so now any old output is automatically removed when each new test run starts.

Published by eminizer over 3 years ago

OpenMSIStream - v1.1.5

Fix for a bug preventing module loading if some environment variables are not set

Published by eminizer over 3 years ago

OpenMSIStream - v1.1.4

This release adds a number of bugfixes and minor updates, to fix issues and reduce friction discovered when installing some producers/consumers for moving data from the X-Ray lab. Updates include:

  • A fix for a bug that would make the common occurrence of the librdkafka buffering queue being full appear like an error
  • Allowing a few retries when dumping DataclassTable objects out to files
  • Changing the default values of some command line and configuration options to better reflect our most common use cases
  • Changing how the producer-internal queue is sized so that it uses a constant, configurable amount of memory
  • Silencing some informational messages from KafkaCrypto
  • Making sure the Windows Services will wait for "Network Connections" to be set up before they restart
  • Improving the logging interface

Published by eminizer over 3 years ago

OpenMSIStream - v1.1.3

Bugfix for DataFileUploadDirectory restart from log tables

Published by eminizer over 3 years ago

OpenMSIStream - v1.1.1

This release fixes a bug that could occur when installing programs as Services or daemons and changing the logging levels from the command line.

Published by eminizer over 3 years ago

OpenMSIStream - v1.1.0

This release makes some updates to reduce the memory footprint of long-running programs, increase the efficiency of registering file statuses in log files, and improve the quality and accessibility of the logging system.

Published by eminizer over 3 years ago

OpenMSIStream - v1.0.1

This version adds bugfixes for some edge cases of installing code as Windows Services (namely, when the username has spaces in it), and it tries to automatically solve some more common problems like missing DLL files.

Published by eminizer over 3 years ago

OpenMSIStream - v1.0.0

First official version of OpenMSIStream

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.6

This release includes a bug fix for selecting ranges of bytes in files to upload

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.5

This release updates the style and organization of the code, and adds a lot to the documentation, in preparation for an initial submission to the Journal of Open Source Software.

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.4

This release includes new updates for consuming data file chunks from one topic, triggering some processing when whole files are available, and then producing the results of that processing to another topic. That functionality is used in a new base class that is minimally-extensible for extracting metadata from files and producing those metadata to a different topic as JSON-formatted strings.
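The consume, reassemble, then process pattern described here can be sketched in a few lines. Everything named below (`FileReassembler`, `add_chunk`, `extract_metadata`, the chunk index scheme) is hypothetical and only illustrates the idea; it is not the base class added in this release, and a plain list stands in for the second topic.

```python
import json


class FileReassembler:
    """Collect chunks per file and fire a callback when a file is complete.

    Hypothetical illustration: chunk indices are assumed to run 0..n_total-1.
    """

    def __init__(self, on_complete):
        self._chunks = {}  # filename -> {chunk index: bytes}
        self._on_complete = on_complete

    def add_chunk(self, filename, index, n_total, data):
        chunks = self._chunks.setdefault(filename, {})
        chunks[index] = data  # a re-delivered chunk simply overwrites itself
        if len(chunks) == n_total:
            whole = b"".join(chunks[i] for i in range(n_total))
            del self._chunks[filename]
            self._on_complete(filename, whole)


results = []  # stands in for the topic that receives the processing results


def extract_metadata(filename, contents):
    """Produce a JSON-formatted string describing the reassembled file."""
    results.append(json.dumps({"file": filename, "n_bytes": len(contents)}))


reassembler = FileReassembler(extract_metadata)
reassembler.add_chunk("a.dat", 1, 2, b"world")  # chunks may arrive out of order
reassembler.add_chunk("a.dat", 0, 2, b"hello ")
print(results)  # ['{"file": "a.dat", "n_bytes": 11}']
```

Keying the buffer by chunk index rather than arrival order is what lets the callback fire correctly when chunks arrive out of order or more than once.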

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.3.0

This latest version allows DataFileStreamProcessor programs to automatically restart from the beginning of the topic to re-process any files that failed on a first go-around.

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.2.2

Bugfix for API reference on readthedocs

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.2.1

Adding more classes to the API reference in the documentation

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.2.0

Some minor edits to the code for completeness/robustness, and adding a partial API reference to the documentation

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.1.8

This release includes updates to code and documentation for changing the names of some files, directories, and classes to be more permanent (e.g. "my_kafka" -> "kafka_wrapper")

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.1.7

Adding documentation with Sphinx hosted on ReadTheDocs

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.1.5

Some fixes for bugs in OSNStreamProcessor that Connor and I discovered yesterday, and a general improvement for Consumer Group IDs

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.1.4

Ready to be used as a dependency for OpenMSIPython

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.1.3

Some more updates to the README for the new PyPI release

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.1.2

Some more updates for PyPI/twine

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.1.1

Update License

Published by eminizer over 3 years ago

OpenMSIStream - v0.9.1.0

Initial release for PyPI

Published by eminizer over 3 years ago