statistical-measures

This repository contains a statistical analysis of customer service ratings for Biscobis Ltd

https://github.com/quantum-software-development/statistical-measures

Keywords

machine-learning mathematics mathp mathplotlib measurements numpy pandas puthon3 seaborn statisctics

Keywords from Contributors

interactive mesh interpretability profiles sequences generic projection standardization optim embedded

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: Quantum-Software-Development
License: mit
Language: Shell
Default Branch: main
Homepage: https://github.com/Quantum-Software-Development/calculate-statistical-measures
Size: 11.1 MB

Statistics

Stars: 2
Watchers: 0
Forks: 2
Open Issues: 2
Releases: 0

Topics

machine-learning mathematics mathp mathplotlib measurements numpy pandas puthon3 seaborn statisctics

Created over 1 year ago · Last pushed 6 months ago

Metadata Files

Readme Funding License Citation

✍️ Statistical Measures and Bovespa Banks Value Analysis: Calculation in Excel and Python for Data Science

University of Data Science and Artificial Intelligence - PUC-SP - 2nd Semester/2024

🎶 Prelude Suite no.1 (J. S. Bach) - Sound Design Remix

https://github.com/user-attachments/assets/4ccd316b-74a1-4bae-9bc7-1c705be80498

📺 For better resolution, watch the video on YouTube.

Introduction

Welcome to the "Comprehensive Statistical Analysis of Biscobis Dataset" repository. This repository aims to conduct a detailed statistical analysis of the biscobis-statistical-measures.csv dataset, covering various measures such as mean, median, mode, minimum, maximum, range, variance, standard deviation, and coefficient of variation.
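As a quick illustration of the measures listed above, here is a minimal sketch that computes each of them with Python's standard `statistics` module. The `summarize` helper and the `sample` ratings are hypothetical and are not taken from the Biscobis dataset; the sample (n - 1) variance and standard deviation are assumed.

```python
import statistics


def summarize(values):
    """Return the basic descriptive measures for a list of numbers."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "standard_deviation": std,
        # Coefficient of variation as a percentage of the mean
        "coefficient_of_variation": std / mean * 100,
    }


# Hypothetical ratings, for illustration only
sample = [3, 5, 5, 6, 8, 9]
summary = summarize(sample)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The coefficient of variation is only meaningful when the mean is not zero, so a real script may want to guard against that case.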

Project Overview

This repository presents a statistical analysis of customer service ratings for Biscobis Ltd., based on a survey of 100 customers who evaluated seven different aspects of the company's services.

Setup

To use this repository, ensure you have Python installed on your system along with the Pandas and NumPy libraries. Clone this repository and place your biscobis-statistical-measures.csv file in the root directory.

Dataset

Click here to download the dataset

The dataset biscobis-statistical-measures.csv contains customer ratings for the following categories:

Shipping speed
Price level
Negotiation flexibility
Image
Services provided
Sales force
Product quality
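Before running an analysis, it can help to confirm that a CSV file actually contains the seven categories. The sketch below assumes the column headers match the category names listed above, which may not be true of the real file; `missing_categories` and `sample_csv` are hypothetical names used only for illustration.

```python
import csv
import io

# Category names as listed in this README; the real CSV headers may differ.
EXPECTED_CATEGORIES = [
    "Shipping speed",
    "Price level",
    "Negotiation flexibility",
    "Image",
    "Services provided",
    "Sales force",
    "Product quality",
]


def missing_categories(csv_text, expected=EXPECTED_CATEGORIES):
    """Return the expected categories that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {name.strip().lower() for name in header}
    return [name for name in expected if name.lower() not in present]


# Hypothetical three-column file, for illustration only
sample_csv = "Shipping speed,Price level,Image\n4.1,0.6,4.7\n"
missing = missing_categories(sample_csv)
print("Missing categories:", missing)
```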

Python Code for Statistical Analysis

This project provides three Python scripts for analyzing Biscobis customer service data: a concise version for quick analysis, a comprehensive script for calculating statistical measures, and a detailed version for in-depth insights.

Concise Python Code for Quick Analysis

This concise script calculates and prints the main statistical measures. Use it when you need a rapid overview of the data's statistical properties.

```python
import pandas as pd

# Load the dataset
data = pd.read_csv('biscobis-statistical-measures.csv', skiprows=2, encoding='latin1')

# Calculate statistical measures
# describe() covers count, mean, std, min, quartiles, and max
statistics = data.describe().T
statistics['mode'] = data.mode().iloc[0]  # first mode only
statistics['coefficient_of_variation'] = (statistics['std'] / statistics['mean']) * 100

# Save the results to a CSV file
statistics.to_csv('statistical_measures.csv')

print(statistics)
```

Comprehensive Python Code for Calculating Statistical Measures

Here is the Python script to calculate a comprehensive set of statistical measures:

```python
import pandas as pd

# Load the dataset
data = pd.read_csv('biscobis-statistical-measures.csv')

# Calculate comprehensive statistics
stats = {
    "Mean": data.mean(),
    "Median": data.median(),
    "Q1": data.quantile(0.25),
    "Q2": data.quantile(0.50),
    "Q3": data.quantile(0.75),
    "Mode": data.mode().iloc[0],  # Simplified mode; first mode only
    "Minimum": data.min(),
    "Maximum": data.max(),
    "Range": data.max() - data.min(),
    "Variance": data.var(),
    "Standard Deviation": data.std(),
    "Coefficient of Variation": data.std() / data.mean(),  # ratio, not a percentage
}
stats_df = pd.DataFrame(stats)

# Format the results for easy Excel import
# (applymap is deprecated since pandas 2.1 in favor of DataFrame.map)
formatted_stats = stats_df.applymap(lambda x: f"{x:.2f}")
formatted_stats.to_csv('formatted_statistical_data.csv', index=True)
print(formatted_stats)
```

Comprehensive Python Code for Detailed Analysis

This comprehensive code provides detailed statistics and creates visualizations for deeper insights.

```python
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns


def calculate_statistics(data):
    return pd.Series({
        'Mean': np.mean(data),
        'Median': np.median(data),
        # pandas mode (first mode only); stats.mode(data)[0][0] fails on
        # recent SciPy, which returns a scalar for 1-D input
        'Mode': data.mode().iloc[0],
        'Standard Deviation': np.std(data),  # population std (ddof=0)
        'Variance': np.var(data),            # population variance (ddof=0)
        'Range': np.ptp(data),
        'Minimum': np.min(data),
        'Maximum': np.max(data),
        'Q1': np.percentile(data, 25),
        'Q3': np.percentile(data, 75),
        'Skewness': stats.skew(data),
        'Kurtosis': stats.kurtosis(data),
        'Coefficient of Variation': (np.std(data) / np.mean(data)) * 100
    })


# Load the dataset
df = pd.read_csv('biscobis-statistical-measures.csv', skiprows=2, encoding='latin1')

# Calculate statistics for each column
statistics = df.apply(calculate_statistics)

# Transpose the results for better readability
statistics_transposed = statistics.transpose()

# Display and save the results
print(statistics_transposed)
statistics_transposed.to_csv('biscobis_detailed_statistics.csv')

# Create visualizations
plt.figure(figsize=(12, 6))
sns.boxplot(data=df)
plt.title('Distribution of Ratings by Category')
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('boxplot_biscobis.png')
plt.close()

plt.figure(figsize=(10, 8))
sns.heatmap(df.corr(), annot=True, cmap='coolwarm')
plt.title('Correlation Heatmap of Categories')
plt.tight_layout()
plt.savefig('heatmap_correlation_biscobis.png')
plt.close()


def create_histogram(data, column, bins=10):
    plt.figure(figsize=(8, 6))
    sns.histplot(data[column], bins=bins, kde=True)
    plt.title(f'Distribution of {column}')
    plt.xlabel('Value')
    plt.ylabel('Frequency')
    plt.savefig(f'histogram_{column.lower().replace(" ", "_")}.png')
    plt.close()


for column in df.columns:
    create_histogram(df, column)

print("Analysis complete. Results saved in CSV and PNG files.")
```

```python
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns


def calculate_statistics(data):
    # [Previous statistics calculation remains the same]
    # Reuse the function body from the previous script here.
    ...


# Load the dataset
df = pd.read_csv('biscobis-statistical-measures.csv', skiprows=2, encoding='latin1')

# Calculate statistics for each column
statistics = df.apply(calculate_statistics)

# Transpose the results for better readability
statistics_transposed = statistics.transpose()

# Display and save the results
print(statistics_transposed)
statistics_transposed.to_csv('biscobis_detailed_statistics.csv')

# Create visualizations
plt.figure(figsize=(12, 6))
sns.boxplot(data=df)
plt.title('Distribution of Ratings by Category')
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('boxplot_biscobis.png')
plt.show()  # Added to display the boxplot
plt.close()

plt.figure(figsize=(10, 8))
sns.heatmap(df.corr(), annot=True, cmap='coolwarm')
plt.title('Correlation Heatmap of Categories')
plt.tight_layout()
plt.savefig('heatmap_correlation_biscobis.png')
plt.show()  # Added to display the heatmap
plt.close()


def create_histogram(data, column, bins=10):
    plt.figure(figsize=(8, 6))
    sns.histplot(data[column], bins=bins, kde=True)
    plt.title(f'Distribution of {column}')
    plt.xlabel('Value')
    plt.ylabel('Frequency')
    plt.savefig(f'histogram_{column.lower().replace(" ", "_")}.png')
    plt.show()  # Added to display each histogram
    plt.close()


for column in df.columns:
    create_histogram(df, column)

print("Analysis complete. Results saved in CSV and PNG files.")
```
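One detail worth keeping in mind when comparing outputs: the pandas-based scripts call `std()` and `var()`, which default to the sample formulas (`ddof=1`), while the detailed script calls `np.std` and `np.var`, which default to the population formulas (`ddof=0`). The following sketch shows the size of that difference using the standard library's `stdev` and `pstdev`; the `ratings` list is hypothetical.

```python
import statistics

# Hypothetical ratings, for illustration only
ratings = [2, 4, 4, 4, 5, 5, 7, 9]

# Sample standard deviation divides by n - 1 (the pandas default, ddof=1)
sample_std = statistics.stdev(ratings)

# Population standard deviation divides by n (the NumPy default, ddof=0)
population_std = statistics.pstdev(ratings)

print(f"sample std:     {sample_std:.4f}")
print(f"population std: {population_std:.4f}")
```

With 100 survey responses the gap is small, but the coefficient of variation inherits whichever convention the script uses, so results from the two scripts will not match exactly.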

Running the Analysis

To run any of the scripts, follow these steps:

1. Ensure you have Python installed on your system.
2. Install the required libraries:
   - For the concise version: pip install pandas
   - For the comprehensive version: pip install pandas numpy scipy matplotlib seaborn
3. Place the biscobis-statistical-measures.csv file in the same directory as the Python script.
4. Run the script using Python.
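A small pre-flight check can report which libraries are still missing before a script is run. This is only a sketch: the `REQUIRED` mapping mirrors the install commands in the steps above, and `missing_packages` is a hypothetical helper, not part of the repository.

```python
import importlib.util

# Libraries per script version, mirroring the pip commands listed above
REQUIRED = {
    "concise": ["pandas"],
    "comprehensive": ["pandas", "numpy", "scipy", "matplotlib", "seaborn"],
}


def missing_packages(names):
    """Return the top-level package names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


for version, packages in REQUIRED.items():
    missing = missing_packages(packages)
    if missing:
        print(f"{version}: run 'pip install {' '.join(missing)}'")
    else:
        print(f"{version}: all required libraries are installed")
```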

Note on Displaying Graphs

When running the comprehensive analysis script, you will now see the graphs displayed on your screen in addition to having them saved as PNG files. If you're running the script in a non-interactive environment (like a server or automated pipeline), you may want to comment out the plt.show() lines to prevent the script from hanging.
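Instead of commenting out the plt.show() lines by hand, the decision can be made at run time. The sketch below is one possible approach, not the repository's method: `is_interactive` is a hypothetical heuristic (a `CI` environment variable or a missing display on Linux is treated as non-interactive), and `finish_figure` expects to be passed the `matplotlib.pyplot` module.

```python
import os
import sys


def is_interactive(env=None, platform=None):
    """Heuristic guess at whether a display is available for plt.show()."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    if env.get("CI"):
        # Many automated pipelines set CI; treat those as headless
        return False
    if platform.startswith("linux"):
        # On Linux a GUI normally needs an X11 or Wayland display
        return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    return True


def finish_figure(plt, path, show):
    """Save the current figure, optionally display it, then close it."""
    plt.savefig(path)
    if show:
        plt.show()
    plt.close()


print("Interactive session detected:", is_interactive())
```

In a script, each `plt.savefig(...)`, `plt.show()`, `plt.close()` sequence could then become `finish_figure(plt, 'boxplot_biscobis.png', show=is_interactive())`. In a headless environment, calling `matplotlib.use("Agg")` before importing `matplotlib.pyplot` selects a backend that does not need a display.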

Copyright 2024 Quantum Software Development. Code released under the MIT license.

Owner

Name: Quantum Software Development
Login: Quantum-Software-Development
Kind: organization

Repositories: 1
Profile: https://github.com/Quantum-Software-Development

Quantum 4 All !

Citation (CITATION.cff)

cff-version: 1.2.0
title:  Quantum-Software-Development 
message: If you really want to cite this repository, here's how you should cite it.
type: software
authors:
  - given-names: Quantum-Software-Development - statistical-measures 
repository-code: https://github.com/Quantum-Software-Development/statistical-measures
license: MIT License

GitHub Events

Total

Issues event: 6
Watch event: 1
Delete event: 99
Issue comment event: 3
Push event: 117
Pull request event: 184
Fork event: 2
Create event: 92

Last Year

Issues event: 6
Watch event: 1
Delete event: 99
Issue comment event: 3
Push event: 117
Pull request event: 184
Fork event: 2
Create event: 92

Committers

Last synced: over 1 year ago

All Time

Total Commits: 104
Total Committers: 3
Avg Commits per committer: 34.667
Development Distribution Score (DDS): 0.423

Past Year

Commits: 104
Committers: 3
Avg Commits per committer: 34.667
Development Distribution Score (DDS): 0.423

Top Committers

Name	Email	Commits
Fabiana 🚀 Campanari	f**i@g**m	60
Fabiana 🚀 Campanari	1****i	38
dependabot[bot]	4****]	6

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 4
Total pull requests: 216
Average time to close issues: 20 minutes
Average time to close pull requests: 5 days
Total issue authors: 1
Total pull request authors: 2
Average comments per issue: 0.0
Average comments per pull request: 0.04
Merged pull requests: 190
Bot issues: 0
Bot pull requests: 93

Past Year

Issues: 4
Pull requests: 216
Average time to close issues: 20 minutes
Average time to close pull requests: 5 days
Issue authors: 1
Pull request authors: 2
Average comments per issue: 0.0
Average comments per pull request: 0.04
Merged pull requests: 190
Bot issues: 0
Bot pull requests: 93

statistical-measures

Science Score: 44.0%
