smana
Repairing tool for time series with weekly seasonality
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (16.2%) to scientific vocabulary
Repository
Statistics
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md

smana: repairing tool for time series with weekly seasonality
What is it?
smana is a Python package for restoring missing values in time series with a weekly pattern.
Main Features
- Restoration of missing values in time series with a weekly seasonal pattern
- Any time series with sub-daily resolution is supported
- Handling of calendar information on public holidays (if provided by the user)
Dependencies
How it works
This package arose from the need to restore energy time series, which usually show weekly seasonality and often correlate with public holidays as well. The implementation, however, assumes only that the time series has a weekly pattern, so the tool can repair data of any kind that shows this seasonal behavior.
The core of the algorithm is STL decomposition ("Seasonal and Trend decomposition using Loess"), a robust method for decomposing a time series into trend, seasonal, and remainder components, as implemented in the statsmodels package.
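To make the three components concrete, here is a minimal sketch of an additive decomposition with a 7-sample period. It deliberately uses a plain centered moving average instead of Loess, so it is neither STL nor the statsmodels or smana implementation. The function name `decompose_weekly` and the toy data (one value per day, purely to keep the example small) are illustrative assumptions.

```python
from statistics import mean


def decompose_weekly(values, period=7):
    """Split a series into trend, seasonal and remainder parts (additive model).

    Simplified illustration only: a centered moving average stands in for the
    Loess smoothing used by STL. Assumes an odd period and a few full periods
    of data without missing values.
    """
    n = len(values)
    half = period // 2

    # Trend: centered moving average over one full period.
    # Positions where the window does not fit are left as None.
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = mean(values[i - half:i + half + 1])

    # Seasonal profile: average detrended value for each position in the cycle.
    buckets = [[] for _ in range(period)]
    for i, t in enumerate(trend):
        if t is not None:
            buckets[i % period].append(values[i] - t)
    profile = [mean(b) for b in buckets]

    # Center the profile so the seasonal component averages to zero.
    offset = mean(profile)
    profile = [p - offset for p in profile]
    seasonal = [profile[i % period] for i in range(n)]

    # Remainder: whatever trend and seasonality do not explain.
    remainder = [
        None if t is None else values[i] - t - seasonal[i]
        for i, t in enumerate(trend)
    ]
    return trend, seasonal, remainder


# Toy data: linear trend plus a zero-mean weekly pattern over four weeks.
pattern = [1, 2, 0, -1, -2, 3, -3]
values = [10 + 0.5 * i + pattern[i % 7] for i in range(28)]
trend, seasonal, remainder = decompose_weekly(values)
```

On this noise-free toy series the seasonal component reproduces `pattern` and the remainder is zero wherever the trend is defined; real data would leave a non-trivial remainder.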
The main method of this package, smana.repair(), restores sequences of missing data (represented as numpy.NaN) by locally approximating the trend and seasonal components of the time series. To estimate the seasonality, the algorithm looks for a sequence of at least 14 consecutive days of valid data; if none exists, linear interpolation or lookup-table strategies are applied iteratively (using a ranking criterion on the missing-value sequences) until a 14-day sequence appears.
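The search for a 14-day window of valid data, and the way filling a short gap can create one, can be sketched as follows. This is not smana's code: the helper names `longest_valid_run_days` and `interpolate_gap` are hypothetical, the run is measured in contiguous samples rather than calendar-aligned days, and the ranking criterion and lookup-table strategy mentioned above are not modelled.

```python
import math


def longest_valid_run_days(values, samples_per_day):
    """Length, in whole days, of the longest run of non-missing samples."""
    best = current = 0
    for v in values:
        current = 0 if math.isnan(v) else current + 1
        best = max(best, current)
    return best // samples_per_day


def interpolate_gap(values, start, stop):
    """Linearly fill values[start:stop] in place.

    Assumes values[start - 1] and values[stop] exist and are valid.
    """
    left, right = values[start - 1], values[stop]
    span = stop - start + 1
    for k in range(start, stop):
        values[k] = left + (right - left) * (k - start + 1) / span


# Hourly toy series covering 20 days, with a 4-hour gap on day 9.
series = [float(i) for i in range(20 * 24)]
for k in range(200, 204):
    series[k] = math.nan

before = longest_valid_run_days(series, 24)  # the gap splits the series
interpolate_gap(series, 200, 204)
after = longest_valid_run_days(series, 24)   # gap filled, window available
```

Here the gap leaves runs of 8 and 11 full days, so no 14-day window exists until the gap is interpolated, after which all 20 days are valid.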
In addition, this tool can handle calendar information on public holidays. This feature is useful only if the time series is correlated with these days, in particular if its daily pattern on public holidays resembles that of the standard weekly holiday; it is therefore recommended to use this feature only when this assumption has been verified.
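As a rough illustration of the calendar information involved, the sketch below builds a 0/1 holiday flag for each timestamp, following the convention documented for smana.repair(): 1 marks a holiday (including the standard weekly holiday) and 0 a working day, with weekdays indexed from 0 (Monday) to 6 (Sunday). The helper `holiday_flags` and the sample dates are assumptions made for this example and are not part of smana.

```python
from datetime import date, datetime, timedelta


def holiday_flags(timestamps, public_holidays, week_holiday=6):
    """Return 1 for timestamps on a public holiday or the weekly holiday, else 0.

    week_holiday uses datetime.weekday() indexing: 0 is Monday, 6 is Sunday.
    """
    return [
        1 if ts.weekday() == week_holiday or ts.date() in public_holidays else 0
        for ts in timestamps
    ]


# One sample every 12 hours for a week starting on Monday 2024-04-22.
start = datetime(2024, 4, 22)
timestamps = [start + timedelta(hours=12 * k) for k in range(14)]

# Thursday 2024-04-25 is treated as a public holiday in this example.
flags = holiday_flags(timestamps, {date(2024, 4, 25)})
```

A list like `flags` could then be stored as a dataframe column next to the series to be repaired.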
How to get it
The source code is currently hosted on GitHub at: https://github.com/ToBe-Analytics/smana
Binary installers for the latest released version are available at the Python Package Index (PyPI).
```sh
# PyPI
pip install smana
```
The list of changes to smana between each release can be found here. For full details, see the commit logs at https://github.com/ToBe-Analytics/smana.
Documentation
The package provides the following main method, which implements the whole procedure described above:
smana.repair(input_df, scan_column, datetime_column=None, trendapproxdays=7, nonnegative_constraint=False, holidays_stl=False, weekholidayint=6, holidays_column=None, inplace=False)
This function restores the missing values (numpy.NaN) of the time series scan_column in the input_df dataframe, using datetime_column as the timestamp column, through a process based on STL decomposition.
Optionally, setting holidays_stl to True applies a similar strategy to repair missing data on public holidays (this procedure is based on data from the weekly holiday).
Parameters
- input_df: pandas.DataFrame
Input dataframe which collects the time series to be repaired, the datetime series and optionally the column with public holidays information.
- scan_column: str
Label of the numeric column of input_df to be restored. Missing values must be represented as numpy.NaN.
- datetime_column: str, default None
Label of the datetime column of input_df; aware or naive datetimes are supported. If unspecified, input_df.index is used.
- trendapproxdays: int, default 7
Number of days to consider for trend estimation; higher values lead to approximations over longer periods. Integers less than 7 will be replaced by the default value. It is not necessary to modify this parameter.
- nonnegative_constraint: bool, default False
Set to True to check and repair negative restored values.
- holidays_stl: bool, default False
Apply a specific strategy for restoring missing values related to public holidays.
- weekholidayint: int, default 6
Index corresponding to the weekly holiday, from 0 (Monday) to 6 (Sunday). This argument is considered only if holidays_stl is set to True.
- holidays_column: str, default None
Label of the column which collects holidays information; for each row in input_df, the only allowed values are 0 (working day) or 1 (holiday, including the standard weekly holiday). This argument is considered only if holidays_stl is set to True.
- inplace: bool, default False
If False, return a copy. Otherwise, perform the operation in place and return None.
Returns
- pandas.DataFrame or None
Restored DataFrame, or None if inplace is set to True.
Check out some usage examples of smana here.
License
Contributing to smana
All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome. A detailed overview on how to contribute can be found in the contributing guide. As contributors and maintainers to this project, you are expected to abide by our code of conduct. More information can be found at: Contributor Code of Conduct
Owner
- Name: ToBe Analytics
- Login: ToBe-Analytics
- Kind: organization
- Location: Italy
- Repositories: 1
- Profile: https://github.com/ToBe-Analytics
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: ToBe Analytics
title: "smana: repairing tool for time series with weekly seasonality"
version: 0.1.2
date-released: 2024-06-30
Packages
- Total packages: 1
- Total downloads:
  - pypi: 8 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 1
- Total maintainers: 3
pypi.org: smana
- Documentation: https://smana.readthedocs.io/
- License: BSD 3-Clause License Copyright (c) 2024, ToBe Analytics. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- Latest release: 0.1.2 (published over 2 years ago)