https://github.com/durgeshrathod/dataframe-statistical-analyzer

The key functionalities include summary statistics calculation, percentage change computation, outlier detection, trend analysis, moving average calculation, correlation analysis, and seasonal pattern interpretation.

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.9%) to scientific vocabulary

Keywords

data-analysis machine-learning statistical-analysis statistics time-series-analysis
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: DurgeshRathod
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 107 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
data-analysis machine-learning statistical-analysis statistics time-series-analysis
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.md

DataFrame Statistical Analyzer Utility 📊

The DataFrameAnalyzer project provides a robust and extensible tool for analyzing and visualizing data stored in a Pandas DataFrame. The tool encapsulates various data analysis functionalities, including summary statistics, percentage change computation, outlier detection, trend analysis, moving average calculation, correlation analysis, and seasonal pattern interpretation. The project is designed following the SOLID principles and incorporates design patterns to ensure maintainability and ease of use. 🚀

Features 🌟

  • Summary Statistics: Statistical summary of the DataFrame. 📈
  • Month-to-Month Percentage Changes: Percentage changes between consecutive months. 🔄
  • Outlier Detection (Z-score > 3): DataFrame segments identified as outliers based on Z-score. 🚨
  • Outlier Detection (MAD): DataFrame segments identified as outliers based on Median Absolute Deviation. 📉
  • Trend Analysis (Linear Regression): Slope and intercept of linear trends for numeric columns. 📈
  • Moving Average (3 months window): Moving average values for numeric columns over a 3-month window. 📊
  • Calculating DIPS: DataFrame segments identified as dips below certain thresholds. 📉
  • Calculating Increases: DataFrame segments identified as increases above certain thresholds. 📈
  • Seasonal Patterns: Monthly seasonal patterns identified using Holt-Winters exponential smoothing. 🌿
  • Correlation Analysis: Correlation matrix between numeric columns. 🔗
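
The two outlier checks in the list above can be illustrated with plain pandas. This is a minimal sketch, not the package's own implementation: the function names `zscore_outliers` and `mad_outliers` are hypothetical, the Z-score threshold of 3 comes from the feature list, and the MAD threshold of 3.5 with the 0.6745 scaling constant is an assumed (commonly used) convention for the modified Z-score. The sample series is illustrative data with one spike injected.

```python
import pandas as pd


def zscore_outliers(series: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return the values whose absolute Z-score exceeds `threshold`."""
    std = series.std(ddof=0)  # population standard deviation
    if std == 0:
        return series.iloc[0:0]  # constant series: nothing can be an outlier
    z = (series - series.mean()) / std
    return series[z.abs() > threshold]


def mad_outliers(series: pd.Series, threshold: float = 3.5) -> pd.Series:
    """Return the values whose modified Z-score (MAD-based) exceeds `threshold`.

    The 0.6745 constant and the 3.5 default are assumptions for this sketch.
    """
    median = series.median()
    mad = (series - median).abs().median()  # median absolute deviation
    if mad == 0:
        return series.iloc[0:0]
    modified_z = 0.6745 * (series - median) / mad
    return series[modified_z.abs() > threshold]


# Illustrative monthly prices with an artificial spike in the last month.
prices = pd.Series(
    [50.0, 51.5, 49.8, 52.0, 53.2, 54.0, 55.0, 56.0, 57.5, 59.0, 60.0, 150.0]
)
z_out = zscore_outliers(prices)
mad_out = mad_outliers(prices)
print(z_out)
print(mad_out)
```

Because the MAD is built from medians, a single extreme value barely shifts it, whereas the same value inflates the mean and standard deviation used by the Z-score. On short series this makes the MAD check the more sensitive of the two.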

Installation 🛠️

  1. Install the package:

```bash
pip install dataframe-statistical-analyzer
```

Usage 🖥️

  1. Import the necessary modules:

```python
import pandas as pd
from dataframe_statistical_analyzer import DataFrameAnalyzer
```

  2. Prepare your DataFrame:

```python
data = {
    "month": ['January', 'February', 'March', 'April', 'May', 'June',
              'July', 'August', 'September', 'October', 'November', 'December'],
    "stock_price": [50.0, 51.5, 49.8, 52.0, 53.2, 54.0,
                    55.0, 56.0, 57.5, 59.0, 60.0, 61.0],
}

df = pd.DataFrame(data)
```

  3. Initialize the DataFrameAnalyzer with the DataFrame:

```python
analyzer = DataFrameAnalyzer(df)
```

  4. Perform the analysis:

```python
analyzer.analyze()
```

  5. Expected outputs: running `analyzer.analyze()` produces one result for each item in the Features list above, from summary statistics through correlation analysis.
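
To give a rough idea of what three of these outputs contain, the sketch below computes the month-to-month percentage change, the 3-month moving average, and a linear trend for the sample `stock_price` column using pandas and NumPy directly. It does not call the package and may differ from what `analyze()` actually prints. The variable names and the use of `np.polyfit` for the regression are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "month": ['January', 'February', 'March', 'April', 'May', 'June',
              'July', 'August', 'September', 'October', 'November', 'December'],
    "stock_price": [50.0, 51.5, 49.8, 52.0, 53.2, 54.0,
                    55.0, 56.0, 57.5, 59.0, 60.0, 61.0],
})
prices = df["stock_price"]

# Month-to-month percentage change; the first month has no predecessor, so it is NaN.
pct_change = prices.pct_change() * 100

# 3-month moving average; the first two months lack a full window, so they are NaN.
moving_avg = prices.rolling(window=3).mean()

# Linear trend: least-squares line through (month index, price).
x = np.arange(len(prices))
slope, intercept = np.polyfit(x, prices.to_numpy(), deg=1)

print(pct_change.round(2).tolist())
print(moving_avg.round(2).tolist())
print(f"slope={slope:.3f} per month, intercept={intercept:.3f}")
```

A positive slope here means the price rises on average from one month to the next. The moving average smooths out the single dip in March that the percentage-change series shows as a negative value.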

Contributing 🤝

We welcome contributions to the DataFrameAnalyzer project. Please fork the repository and submit a pull request with your changes. Ensure your code adheres to the existing style and includes appropriate tests.

License 📜

This project is licensed under the MIT License. See the LICENSE file for more details.

Acknowledgments 🙏

This project utilizes several open-source libraries, including Pandas, Matplotlib, Scipy, Scikit-learn, and Statsmodels. We thank the developers and maintainers of these libraries for their invaluable contributions to the open-source community.

Owner

  • Name: durgeshrathod777
  • Login: DurgeshRathod
  • Kind: user
  • Location: INDIA

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 29 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 6
  • Total maintainers: 1
pypi.org: dataframe-statistical-analyzer

The `DataFrame Statistical Analyzer` package provides a utility for statistical analysis of a Pandas DataFrame.

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 29 Last month
Rankings
Dependent packages count: 10.7%
Average: 35.4%
Dependent repos count: 60.2%
Maintainers (1)

Dependencies

poetry.lock pypi
  • black 24.4.2
  • click 8.1.7
  • colorama 0.4.6
  • contourpy 1.2.1
  • cycler 0.12.1
  • fonttools 4.53.1
  • isort 5.13.2
  • joblib 1.4.2
  • kiwisolver 1.4.5
  • matplotlib 3.9.1
  • mypy-extensions 1.0.0
  • numpy 2.0.0
  • packaging 24.1
  • pandas 2.2.2
  • pathspec 0.12.1
  • patsy 0.5.6
  • pillow 10.4.0
  • platformdirs 4.2.2
  • pyparsing 3.1.2
  • python-dateutil 2.9.0.post0
  • pytz 2024.1
  • scikit-learn 1.5.1
  • scipy 1.14.0
  • six 1.16.0
  • statsmodels 0.14.2
  • threadpoolctl 3.5.0
  • tzdata 2024.1
pyproject.toml pypi
  • black ^24.4.2 develop
  • isort ^5.13.2 develop
  • matplotlib ^3.9.1
  • numpy ^2.0.0
  • pandas ^2.2.2
  • python ^3.11
  • scikit-learn ^1.5.1
  • scipy ^1.14.0
  • statsmodels ^0.14.2
setup.py pypi
  • matplotlib *
  • numpy *
  • pandas *
  • scikit-learn *
  • scipy *
  • statsmodels *