https://github.com/dhaneshbb/insightfulpy

A toolkit for insightful exploratory data analysis (EDA) with advanced visualization and statistical tools.


Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.7%) to scientific vocabulary
Last synced: 9 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: dhaneshbb
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 6 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created over 1 year ago · Last pushed 11 months ago
Metadata Files
  • Readme
  • License

README.md

InsightfulPy


A comprehensive Python toolkit for exploratory data analysis with advanced visualization and statistical analysis capabilities.

Overview

InsightfulPy simplifies the process of exploring and understanding your data through intuitive functions for statistical analysis, data quality assessment, and professional visualization. Whether you're a data scientist, analyst, or researcher, this package provides the tools you need for thorough data exploration.

Documentation Navigation

| Document | Description |
|----------|-------------|
| Overview | Package introduction, features, and architecture |
| Installation Guide | Installation instructions and setup verification |
| Quick Start | Basic workflow and essential functions tutorial |
| User Guide | Complete workflow tutorial with step-by-step examples |
| API Reference | Detailed function documentation and parameters |
| Contributing | Guidelines for contributing to the project |

Examples

Key Features

  • Statistical Analysis: Comprehensive statistics, distribution analysis, and normality testing
  • Data Quality Assessment: Missing value detection, outlier identification, and data type validation
  • Professional Visualization: Box plots, distribution plots, correlation analysis, and categorical charts
  • Dataset Comparison: Multi-dataset analysis and column linking capabilities
  • Batch Processing: Handle large datasets with intelligent batching for visualizations
  • Easy Integration: Works seamlessly with pandas DataFrames

Installation

```bash
pip install insightfulpy
```

Quick Start

```python
import pandas as pd
import insightfulpy as ipy

# Load your data
df = pd.read_csv('your_data.csv')

# Basic data exploration
ipy.columns_info('My Dataset', df)
ipy.num_summary(df)
ipy.cat_summary(df)

# Data quality checks
ipy.missing_inf_values(df)
ipy.detect_outliers(df)

# Visualization
ipy.show_missing(df)
ipy.plot_boxplots(df)
ipy.kde_batches(df, batch_num=1)
```

Core Functions

Basic Analysis

  • num_summary(df) - Statistical summary of numerical columns
  • cat_summary(df) - Analysis of categorical columns
  • columns_info(title, df) - Dataset structure overview
  • missing_inf_values(df) - Missing and infinite value detection
  • detect_outliers(df) - Outlier identification using IQR method
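
To make the two data quality checks above concrete, here is a minimal sketch in plain pandas and NumPy of the IQR rule and of counting missing and infinite values. It is not InsightfulPy's implementation: the helper names `iqr_outlier_mask` and `count_missing_and_inf` and the fence multiplier `k=1.5` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Mark values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


def count_missing_and_inf(df: pd.DataFrame) -> pd.DataFrame:
    """Count NaN and +/-inf values per numeric column (hypothetical helper)."""
    numeric = df.select_dtypes(include="number")
    return pd.DataFrame({
        "missing": numeric.isna().sum(),
        "infinite": np.isinf(numeric).sum(),
    })


values = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])
print(values[iqr_outlier_mask(values)])  # rows flagged by the IQR rule

dirty = pd.DataFrame({"value": [1.0, np.nan, np.inf]})
print(count_missing_and_inf(dirty))
```

The multiplier 1.5 is the conventional choice for box plot fences; a larger value flags fewer points.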

Visualization

  • show_missing(df) - Missing data pattern visualization
  • plot_boxplots(df) - Box plots for all numerical columns
  • kde_batches(df) - Distribution plots organized in batches
  • cat_bar_batches(df) - Bar charts for categorical data
  • cat_pie_chart_batches(df) - Pie charts for categorical analysis
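
The batched plotting functions above suggest that columns are split into fixed-size groups and drawn one group at a time. The sketch below shows one way such a split could work. The 1-based batch number, the default `batch_size=4`, and the helper name `column_batch` are assumptions for illustration and may not match how the package selects columns.

```python
from typing import List, Sequence


def column_batch(columns: Sequence[str], batch_num: int, batch_size: int = 4) -> List[str]:
    """Return the columns in the given 1-based batch (hypothetical helper)."""
    if batch_num < 1:
        raise ValueError("batch_num is 1-based and must be >= 1")
    start = (batch_num - 1) * batch_size
    return list(columns[start:start + batch_size])


columns = ["age", "income", "balance", "tenure", "score", "visits"]
print(column_batch(columns, batch_num=1))  # first four columns
print(column_batch(columns, batch_num=2))  # the remaining two columns
```

Plotting one batch per figure keeps each figure readable when a DataFrame has dozens of columns.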

Advanced Analysis

  • grouped_summary(df, groupby) - Statistical analysis by groups
  • compare_df_columns() - Multi-dataset comparison
  • interconnected_outliers() - Cross-column outlier analysis
  • num_vs_num_scatterplot_pair_batch() - Numerical correlation plots
  • cat_vs_cat_pair_batch() - Categorical relationship heatmaps
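
For readers who want to see what a per-group statistical summary looks like, this is a rough equivalent in plain pandas. It does not reproduce the output of `grouped_summary`; the helper name `grouped_summary_sketch`, its parameters, and the choice of statistics are assumptions.

```python
import pandas as pd


def grouped_summary_sketch(df: pd.DataFrame, by: str, value: str) -> pd.DataFrame:
    """Summarize one numeric column per group (hypothetical helper)."""
    return df.groupby(by)[value].agg(["count", "mean", "median", "std"])


sales = pd.DataFrame({
    "segment": ["a", "a", "b", "b"],
    "amount": [10.0, 20.0, 30.0, 50.0],
})
summary = grouped_summary_sketch(sales, by="segment", value="amount")
print(summary)
```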

Statistical Tools

  • calc_stats(series) - Comprehensive statistical calculations
  • calculate_skewness_kurtosis(df) - Distribution shape analysis
  • iqr_trimmed_mean(data) - Robust mean calculation
  • mad(data) - Mean absolute deviation
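
The two robust statistics above can be sketched with NumPy as follows. The definitions used here are assumptions: the trimmed mean averages only values inside the 1.5 × IQR fences, and the deviation is the mean of absolute differences from the mean. The package's own `iqr_trimmed_mean` and `mad` may define them differently, so the function names carry a `_sketch` suffix.

```python
import numpy as np


def iqr_trimmed_mean_sketch(data, k: float = 1.5) -> float:
    """Mean of values inside [Q1 - k*IQR, Q3 + k*IQR] (assumed definition)."""
    arr = np.asarray(data, dtype=float)
    arr = arr[~np.isnan(arr)]  # ignore missing values
    q1, q3 = np.percentile(arr, [25, 75])
    iqr = q3 - q1
    kept = arr[(arr >= q1 - k * iqr) & (arr <= q3 + k * iqr)]
    return float(kept.mean())


def mean_abs_deviation_sketch(data) -> float:
    """Mean of |x - mean(x)| over non-missing values (assumed definition)."""
    arr = np.asarray(data, dtype=float)
    arr = arr[~np.isnan(arr)]
    return float(np.abs(arr - arr.mean()).mean())


print(iqr_trimmed_mean_sketch([1, 2, 3, 4, 100]))
print(mean_abs_deviation_sketch([1, 2, 3, 4]))
```

With one extreme value in the sample, the trimmed mean stays close to the bulk of the data while the ordinary mean would be pulled toward the extreme.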

Help System

InsightfulPy includes a built-in help system for easy reference:

```python
import insightfulpy as ipy

# Get help overview
ipy.help()

# List all functions
ipy.list_all()

# Quick start guide
ipy.quick_start()

# Usage examples
ipy.examples()
```

Requirements

Python 3.8+ with pandas (≥1.3), numpy (≥1.20), matplotlib (≥3.3), seaborn (≥0.11), scipy (≥1.7), plus researchpy, tableone, missingno, and tabulate.
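
One way to check which of these dependencies are present in the current environment is the standard library's `importlib.metadata`. This is a small sketch, not part of InsightfulPy: the helper name `installed_versions` is hypothetical, and it only reports versions without comparing them against the minimums listed above.

```python
from importlib import metadata

# Distribution names taken from the requirements listed above.
REQUIRED = [
    "pandas", "numpy", "matplotlib", "seaborn", "scipy",
    "researchpy", "tableone", "missingno", "tabulate",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'not installed'}")
```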

Related links:

  • For detailed documentation and examples, visit the GitHub repository.
  • Contributions are welcome! Please read the contributing guidelines and submit pull requests to the GitHub repository.
  • If you encounter any issues or have questions, please open an issue on the GitHub Issues page.
  • This project is licensed under the MIT License - see the LICENSE file for details.

Package Information:

Version: 0.1.8 | Author: Dhanesh B. B. | License: MIT | Python: 3.8+


InsightfulPy makes data exploration intuitive and comprehensive.

Owner

  • Name: Dhaneshbb
  • Login: Dhaneshbb
  • Kind: user
  • Location: Bangalore

👋 I'm Dhanesh B B, merging data science with finance 📊. Exploring ML, stats, and quant analysis 💻. Let's revolutionize finance together! 🌱 Connect with me.

GitHub Events

Total
  • Release event: 2
  • Push event: 49
  • Create event: 2
Last Year
  • Release event: 2
  • Push event: 49
  • Create event: 2

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 44 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 9
  • Total maintainers: 1
pypi.org: insightfulpy

A comprehensive toolkit for exploratory data analysis with advanced visualization and statistical analysis

  • Versions: 9
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 44 Last month
Rankings
Dependent packages count: 9.7%
Average: 32.0%
Dependent repos count: 54.4%
Maintainers (1)
Last synced: 9 months ago