https://github.com/nup002/pymjc
A python implementation of the Minimum Jump Cost dissimilarity measure.
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.6%) to scientific vocabulary
Repository
Statistics
- Stars: 6
- Watchers: 1
- Forks: 4
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Minimum Jump Cost dissimilarity measure in Python
This Python library implements the Minimum Jump Cost (MJC) dissimilarity measure devised by Joan Serra and Josep Lluis Arcos in 2012. The MJC dissimilarity measure was shown to outperform the Dynamic Time Warping (DTW) dissimilarity measure on several datasets. You can read their paper here: https://www.iiia.csic.es/sites/default/files/4584.pdf.
This library can compute the MJC for time series with different sampling rates, arbitrarily spaced data points, and non-overlapping regions.
How to install
pymjc is available from PyPI. Run the following in a command line terminal:
pip install pymjc
How to use
Example:
```python
from pymjc import mjc
import numpy as np

series1 = np.array([1, 2, 3, 2, 1])
series2 = np.array([0, 1, 2, 1, 0])

# mjc returns the dissimilarity and a value telling whether the computation was abandoned early
dxy, abandoned = mjc(series1, series2, show_plot=True)

print(f"The MJC dissimilarity of series 1 and series 2 is {dxy}")
```
There are some options for reducing the computational load of this algorithm. They are detailed in the next section.
More detailed information
The time series s1 and s2 are specified as follows:
- They may be Python lists or numpy.ndarrays.
- They may be of different lengths.
- They may or may not have time information.
- If one of the time series has time information, the other must also have it.
- Their datatype may be floats or integers.
A time series with no time information is just a list of values. The first element of the list corresponds to
the earliest point in the time series.
Example: s1 = [d₀, d₁, d₂, ...], where dᵢ is the i-th value of the time series.
A time series with time information must be a 2D array of shape (2, n). Index 0 holds the time
data, and index 1 holds the amplitude data.
Example: s1 = [[t₀, t₁, t₂, ...], [d₀, d₁, d₂, ...]], where tᵢ is the time of the i-th measurement. The time
values may be integers or floats, and need not begin at 0.
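As a minimal sketch of the two input layouts described above, the snippet below builds a value-only series and two (2, n) series with a time row. The helper `with_time_axis` and its `start` and `period` parameters are hypothetical names introduced for illustration; they are not part of pymjc.

```python
def with_time_axis(values, start=0.0, period=1.0):
    """Return [[t0, t1, ...], [d0, d1, ...]] for evenly spaced samples.

    Hypothetical helper for illustration, not part of pymjc.
    """
    times = [start + i * period for i in range(len(values))]
    return [times, list(values)]


# Value-only series: the first element is the earliest sample.
s1_plain = [1, 2, 3, 2, 1]

# Series with time information: row 0 is time, row 1 is amplitude.
# Time need not start at 0, and the two series may differ in
# length and sampling rate.
s1 = with_time_axis([1, 2, 3, 2, 1], start=10.0, period=0.5)
s2 = with_time_axis([0, 1, 2, 1, 0, 1], start=10.5, period=0.25)

print(s1)
print(s2)
```

Both `s1` and `s2` could then be passed to `mjc(s1, s2)`. Mixing `s1_plain` with `s2` would break the rule that either both series carry time information or neither does.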
To visualize the algorithm, you may pass the argument show_plot=True. This will generate a plot with the two time
series, and arrows signifying the jumps that the algorithm made when calculating the Minimum Jump Cost.
To stop the algorithm early, pass a value for dxy_limit. If the dissimilarity measure exceeds this value during
computation, the computation is abandoned.
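Early abandoning tends to pay off when searching for the closest match among several candidates, because the best dissimilarity found so far can serve as the limit for the next comparison. The sketch below illustrates that pattern with a stand-in measure. `toy_dissimilarity` and `nearest` are hypothetical names, and the stand-in is a plain sum of squared differences, not the MJC. It only mirrors the `(value, abandoned)` return pair shown in the usage example.

```python
import math


def toy_dissimilarity(a, b, limit=math.inf):
    """Stand-in measure (sum of squared differences), NOT the MJC.

    Returns (value, abandoned) and stops as soon as the running
    total exceeds `limit`.
    """
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
        if total > limit:
            return total, True  # abandoned early
    return total, False


def nearest(query, candidates, measure=toy_dissimilarity):
    """Return (index, dissimilarity) of the candidate closest to `query`."""
    best_index, best_value = None, math.inf
    for i, candidate in enumerate(candidates):
        # The best value so far becomes the abandoning limit.
        value, abandoned = measure(query, candidate, best_value)
        if not abandoned and value < best_value:
            best_index, best_value = i, value
    return best_index, best_value


query = [1, 2, 3, 2, 1]
candidates = [[5, 5, 5, 5, 5], [1, 2, 3, 2, 2], [0, 1, 2, 1, 0]]
print(nearest(query, candidates))
```

With pymjc, the same loop would presumably pass the running best as the limit, for example `mjc(query, candidate, dxy_limit=best_value)`, and skip any candidate for which the call reports that it was abandoned.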
Performance
By default, the time series are checked and cast to numpy arrays, which lowers execution speed. Therefore, an option to
disable this checking and casting has been implemented. If you are certain that the time series s1 and s2
are numpy.ndarrays of the format [[time data],[amplitude data]], you may pass the argument override_checks=True.
The algorithm locates the overlapping region between the two time series. This step is skipped if the first and last timestamps are equal between the two time series. If your series have no time information, it is skipped if there is the same number of samples in each time series.
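To illustrate what locating the overlapping region means for series with time information, here is a minimal sketch. `overlap_window` and `crop` are hypothetical names, and the code reflects an assumption about the general idea, not pymjc's actual implementation. In particular, how the library treats samples outside the overlap is not specified here.

```python
def overlap_window(t1, t2):
    """Return (start, end) of the time span covered by both series, or None.

    t1 and t2 are sorted sequences of timestamps. Illustrative only.
    """
    # Fast path: identical first and last timestamps, nothing to trim.
    if t1[0] == t2[0] and t1[-1] == t2[-1]:
        return t1[0], t1[-1]
    start = max(t1[0], t2[0])
    end = min(t1[-1], t2[-1])
    return (start, end) if start <= end else None


def crop(series, window):
    """Keep only the samples of a (2, n) series that fall inside `window`."""
    start, end = window
    times, values = series
    kept = [(t, d) for t, d in zip(times, values) if start <= t <= end]
    return [[t for t, _ in kept], [d for _, d in kept]]


s1 = [[0, 1, 2, 3, 4], [1, 2, 3, 2, 1]]
s2 = [[2, 3, 4, 5, 6], [0, 1, 2, 1, 0]]
window = overlap_window(s1[0], s2[0])
print(window, crop(s1, window), crop(s2, window))
```

If your data already share the same first and last timestamps, the fast path in the sketch corresponds to the case in which the library skips this step.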
As part of the calculation of the MJC, the algorithm calculates the standard deviations of the amplitude data, and
the average sampling periods of s1 and s2. This lowers execution speed, but is required.
However, if you know the standard deviations and/or the average time difference between data points of either
(or both) s1 and s2 a priori, you may pass these as arguments. They are named std_s1, std_s2, tavg_s1, and
tavg_s2. Any number of these may be passed. The ones which are not passed will be calculated.
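When one series is compared against many others, its statistics can be computed once and reused. The sketch below shows one way to obtain them. `amplitude_std` and `average_sampling_period` are hypothetical helper names, and the use of the population standard deviation (the convention of `numpy.std` with default arguments) is an assumption about what the library expects, so it is worth comparing against the values that `mjc` reports with return_args=True before relying on it.

```python
import statistics


def amplitude_std(series):
    """Population standard deviation of the amplitude row of a (2, n) series."""
    return statistics.pstdev(series[1])


def average_sampling_period(series):
    """Mean time difference between consecutive samples of a (2, n) series."""
    times = series[0]
    return (times[-1] - times[0]) / (len(times) - 1)


reference = [[0.0, 1.0, 2.0, 3.0, 4.0], [1, 2, 3, 2, 1]]
std_ref = amplitude_std(reference)
tavg_ref = average_sampling_period(reference)
print(std_ref, tavg_ref)
```

The precomputed values would then be supplied on every call, for example `mjc(reference, other, std_s1=std_ref, tavg_s1=tavg_ref)`, leaving std_s2 and tavg_s2 to be calculated by the library.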
mjc() input parameters:
s1 : numpy ndarray | List. Time series 1.
s2 : numpy ndarray | List. Time series 2.
dxy_limit : Optional float. Early abandoning variable.
beta : Optional float. Time jump cost.
show_plot : Optional bool. If True, displays a plot that visualizes the algorithm's jump path. Default False.
std_s1 : Optional float. Standard deviation of time series s1.
std_s2 : Optional float. Standard deviation of time series s2.
tavg_s1 : Optional float. Average sampling period of time series 1.
tavg_s2 : Optional float. Average sampling period of time series 2.
return_args : Optional bool. If True, returns the values for std_s1, std_s2, tavg_s1, tavg_s2, s1, and s2.
override_checks : Optional bool. If True, skips the checking and casting of s1 and s2.
Owner
- Name: Magne Lauritzen
- Login: nup002
- Kind: user
- Location: Nantes, France
- Company: Well ID
- Website: maglaur.com
- Twitter: AMurderOfDucks
- Repositories: 3
- Profile: https://github.com/nup002
I code.
Committers
Last synced: almost 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| magla | m****n@g****m | 44 |
| Kunal Marwaha | m****a@b****u | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 1
- Average time to close issues: about 2 years
- Average time to close pull requests: about 6 hours
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- germa89 (1)
Pull Request Authors
- marwahaha (1)
Packages
- Total packages: 1
- Total downloads: pypi 19 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 4
- Total maintainers: 1
pypi.org: pymjc
Minimum Jump Cost dissimilarity measure in Python
- Homepage: https://github.com/nup002/pymjc
- Documentation: https://pymjc.readthedocs.io/
- License: MIT License Copyright (c) 2022 Magne Lauritzen Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- Latest release: 1.0.3, published over 3 years ago