statistical-measures
This repository contains a statistical analysis of customer service ratings for Biscobis Ltd
https://github.com/quantum-software-development/statistical-measures
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: Quantum-Software-Development
- License: mit
- Language: Shell
- Default Branch: main
- Homepage: https://github.com/Quantum-Software-Development/calculate-statistical-measures
- Size: 11.1 MB
Statistics
- Stars: 2
- Watchers: 0
- Forks: 2
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
✍️ Statistical Measures and Bovespa Banks Value Analysis : Calculation in Excel and Python for Data Science
University of Data Science and Artificial Intelligence - PUC-SP - 2nd Semester/2024
Introduction
Welcome to the "Comprehensive Statistical Analysis of Biscobis Dataset" repository. This repository aims to conduct a detailed statistical analysis of the biscobis-statistical-measures.csv dataset, covering various measures such as mean, median, mode, minimum, maximum, range, variance, standard deviation, and coefficient of variation.
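As a quick illustration of the measures listed above, the following sketch computes each of them for a small list of ratings using only Python's standard library. The numbers are invented for illustration and are not taken from the dataset, and `describe_sample` is a hypothetical helper name. Variance and standard deviation are shown here in their sample form (dividing by n - 1), which is an assumption about the intended definition.

```python
import statistics

# Hypothetical ratings, invented purely for illustration (not from the dataset)
ratings = [2, 4, 4, 5, 7, 8]


def describe_sample(values):
    """Return the measures named in the introduction for a list of numbers."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std": std,
        "cv_percent": std / mean * 100,  # coefficient of variation, in percent
    }


summary = describe_sample(ratings)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The coefficient of variation expresses the standard deviation relative to the mean, which makes the spread of categories with different average ratings comparable.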
Project Overview
This repository presents a statistical analysis of customer service ratings for Biscobis Ltd., based on a survey of 100 customers who evaluated seven different aspects of the company's services.
Setup
To use this repository, ensure you have Python installed on your system along with the Pandas and NumPy libraries. Clone this repository and place your biscobis-statistical-measures.csv file in the root directory.
Dataset
The dataset biscobis-statistical-measures.csv contains customer ratings for the following categories:
- Shipping speed
- Price level
- Negotiation flexibility
- Image
- Services provided
- Sales force
- Product quality
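To try the scripts before obtaining the real file, one option is to generate a stand-in table with the same seven categories. The sketch below is only that: the column headers are assumed to match the category names above, the 0 to 10 rating scale is an assumption, and the values are random placeholders rather than survey data. `make_placeholder_ratings` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

# Column names follow the category list above; the exact headers in the
# real CSV are an assumption.
CATEGORIES = [
    "Shipping speed",
    "Price level",
    "Negotiation flexibility",
    "Image",
    "Services provided",
    "Sales force",
    "Product quality",
]


def make_placeholder_ratings(n_customers=100, seed=0):
    """Build random stand-in ratings; the 0-10 scale is an assumption."""
    rng = np.random.default_rng(seed)
    values = rng.uniform(0, 10, size=(n_customers, len(CATEGORIES))).round(1)
    return pd.DataFrame(values, columns=CATEGORIES)


placeholder = make_placeholder_ratings()
print(placeholder.shape)
print(placeholder.head())
```

Because the placeholder has one row per customer and one column per category, it can be passed to the same pandas calls the scripts apply to the loaded CSV.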
Python Code for Statistical Analysis
This project provides three Python scripts for analyzing Biscobis customer service data: a concise version for quick analysis, a comprehensive script for calculating statistical measures, and a detailed version for in-depth insights.
Concise Python Code for Quick Analysis
This concise code quickly calculates and outputs the main statistical measures and is perfect for quick analyses or when you need a rapid overview of the data's statistical properties.
```python
import pandas as pd

# Load the dataset (the first two rows of the file are skipped)
data = pd.read_csv('biscobis-statistical-measures.csv', skiprows=2, encoding='latin1')

# Calculate statistical measures: describe() gives count, mean, std,
# min, quartiles and max; transposing puts one category per row
statistics = data.describe().T
statistics['mode'] = data.mode().iloc[0]  # first mode only
statistics['coefficient_of_variation'] = (statistics['std'] / statistics['mean']) * 100

# Save the results to a CSV file
statistics.to_csv('statistical_measures.csv')

print(statistics)
```
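The introduction also lists range and variance, which `describe()` does not report directly. A minimal sketch of how they could be derived from the columns that `describe()` does return is shown below; the `toy` frame and its values are invented for illustration.

```python
import pandas as pd

# Illustrative values only; not taken from the Biscobis dataset
toy = pd.DataFrame({"A": [1.0, 2.0, 2.0, 5.0], "B": [3.0, 3.0, 4.0, 6.0]})

summary = toy.describe().T

# Range follows from the reported extremes
summary["range"] = summary["max"] - summary["min"]

# describe() reports the sample standard deviation, so squaring it
# gives the sample variance
summary["variance"] = summary["std"] ** 2

print(summary[["mean", "std", "min", "max", "range", "variance"]])
```

Deriving both columns from the `describe()` output keeps the whole summary in a single table that can be saved with one `to_csv` call.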
Comprehensive Python Code for Calculating Statistical Measures
Here is the Python script to calculate a comprehensive set of statistical measures:
```python
import pandas as pd

# Load the dataset
# (the other scripts also pass skiprows=2 and encoding='latin1';
# add them here if the file needs them)
data = pd.read_csv('biscobis-statistical-measures.csv')

# Calculate comprehensive statistics
stats = {
    "Mean": data.mean(),
    "Median": data.median(),
    "Q1": data.quantile(0.25),
    "Q2": data.quantile(0.50),
    "Q3": data.quantile(0.75),
    "Mode": data.mode().iloc[0],  # Simplified mode; first mode only
    "Minimum": data.min(),
    "Maximum": data.max(),
    "Range": data.max() - data.min(),
    "Variance": data.var(),
    "Standard Deviation": data.std(),
    "Coefficient of Variation": data.std() / data.mean(),  # ratio, not a percentage
}
stats_df = pd.DataFrame(stats)

# Format the results for easy Excel import
# (column-wise Series.map is used instead of the deprecated DataFrame.applymap)
formatted_stats = stats_df.apply(lambda col: col.map(lambda x: f"{x:.2f}"))
formatted_stats.to_csv('formatted_statistical_data.csv', index=True)
print(formatted_stats)
```
Comprehensive Python Code for Detailed Analysis
This comprehensive code provides detailed statistics and creates visualizations for deeper insights.
```python
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns


def calculate_statistics(data):
    """Return descriptive statistics for one column of ratings."""
    return pd.Series({
        'Mean': np.mean(data),
        'Median': np.median(data),
        # pandas mode is used because the return shape of
        # scipy.stats.mode differs between SciPy versions
        'Mode': pd.Series(data).mode().iloc[0],
        'Standard Deviation': np.std(data),  # NumPy default: ddof=0
        'Variance': np.var(data),            # NumPy default: ddof=0
        'Range': np.ptp(data),
        'Minimum': np.min(data),
        'Maximum': np.max(data),
        'Q1': np.percentile(data, 25),
        'Q3': np.percentile(data, 75),
        'Skewness': stats.skew(data),
        'Kurtosis': stats.kurtosis(data),
        'Coefficient of Variation': (np.std(data) / np.mean(data)) * 100
    })


# Load the dataset
df = pd.read_csv('biscobis-statistical-measures.csv', skiprows=2, encoding='latin1')

# Calculate statistics for each column
statistics = df.apply(calculate_statistics)

# Transpose the results for better readability
statistics_transposed = statistics.transpose()

# Display and save the results
print(statistics_transposed)
statistics_transposed.to_csv('biscobis_detailed_statistics.csv')

# Create visualizations
plt.figure(figsize=(12, 6))
sns.boxplot(data=df)
plt.title('Distribution of Ratings by Category')
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('boxplot_biscobis.png')
plt.close()

plt.figure(figsize=(10, 8))
sns.heatmap(df.corr(), annot=True, cmap='coolwarm')
plt.title('Correlation Heatmap of Categories')
plt.tight_layout()
plt.savefig('heatmap_correlation_biscobis.png')
plt.close()


def create_histogram(data, column, bins=10):
    """Save a histogram with a KDE overlay for one column."""
    plt.figure(figsize=(8, 6))
    sns.histplot(data[column], bins=bins, kde=True)
    plt.title(f'Distribution of {column}')
    plt.xlabel('Value')
    plt.ylabel('Frequency')
    plt.savefig(f'histogram_{column.lower().replace(" ", "_")}.png')
    plt.close()


for column in df.columns:
    create_histogram(df, column)

print("Analysis complete. Results saved in CSV and PNG files.")
```
The variant below is the same script with plt.show() added after each plt.savefig() call, so every figure is also displayed on screen.
```python
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns


def calculate_statistics(data):
    """Same statistics calculation as in the previous script."""
    return pd.Series({
        'Mean': np.mean(data),
        'Median': np.median(data),
        # pandas mode is used because the return shape of
        # scipy.stats.mode differs between SciPy versions
        'Mode': pd.Series(data).mode().iloc[0],
        'Standard Deviation': np.std(data),
        'Variance': np.var(data),
        'Range': np.ptp(data),
        'Minimum': np.min(data),
        'Maximum': np.max(data),
        'Q1': np.percentile(data, 25),
        'Q3': np.percentile(data, 75),
        'Skewness': stats.skew(data),
        'Kurtosis': stats.kurtosis(data),
        'Coefficient of Variation': (np.std(data) / np.mean(data)) * 100
    })


# Load the dataset
df = pd.read_csv('biscobis-statistical-measures.csv', skiprows=2, encoding='latin1')

# Calculate statistics for each column
statistics = df.apply(calculate_statistics)

# Transpose the results for better readability
statistics_transposed = statistics.transpose()

# Display and save the results
print(statistics_transposed)
statistics_transposed.to_csv('biscobis_detailed_statistics.csv')

# Create visualizations
plt.figure(figsize=(12, 6))
sns.boxplot(data=df)
plt.title('Distribution of Ratings by Category')
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('boxplot_biscobis.png')
plt.show()  # Added to display the boxplot
plt.close()

plt.figure(figsize=(10, 8))
sns.heatmap(df.corr(), annot=True, cmap='coolwarm')
plt.title('Correlation Heatmap of Categories')
plt.tight_layout()
plt.savefig('heatmap_correlation_biscobis.png')
plt.show()  # Added to display the heatmap
plt.close()


def create_histogram(data, column, bins=10):
    """Save and display a histogram with a KDE overlay for one column."""
    plt.figure(figsize=(8, 6))
    sns.histplot(data[column], bins=bins, kde=True)
    plt.title(f'Distribution of {column}')
    plt.xlabel('Value')
    plt.ylabel('Frequency')
    plt.savefig(f'histogram_{column.lower().replace(" ", "_")}.png')
    plt.show()  # Added to display each histogram
    plt.close()


for column in df.columns:
    create_histogram(df, column)

print("Analysis complete. Results saved in CSV and PNG files.")
```
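One detail worth checking when comparing outputs across the scripts: the detailed script calls `np.std` and `np.var`, whose default is `ddof=0` (divide by n), while the pandas methods used in the other scripts default to `ddof=1` (divide by n - 1). The sketch below shows the difference on a few illustrative numbers that are not taken from the dataset; which convention the analysis intends is not stated in the source.

```python
import numpy as np
import pandas as pd

# Illustrative values only; not taken from the Biscobis dataset
values = np.array([2.0, 4.0, 4.0, 5.0, 7.0, 8.0])

population_std = np.std(values)             # NumPy default, ddof=0: divides by n
sample_std_numpy = np.std(values, ddof=1)   # divides by n - 1
sample_std_pandas = pd.Series(values).std() # pandas default, ddof=1

print(f"population std (ddof=0): {population_std:.4f}")
print(f"sample std, NumPy (ddof=1): {sample_std_numpy:.4f}")
print(f"sample std, pandas default: {sample_std_pandas:.4f}")
```

Passing `ddof=1` to the NumPy calls would make the detailed script agree with the pandas-based scripts, if the sample form is the one wanted.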
Running the Analysis
To run any of the scripts, follow these steps:
1. Ensure you have Python installed on your system.
2. Install the required libraries:
- For the concise and comprehensive scripts: pip install pandas
- For the detailed analysis script: pip install pandas numpy scipy matplotlib seaborn
3. Place the biscobis-statistical-measures.csv file in the same directory as the Python script.
4. Run the script using Python.
Note on Displaying Graphs
The version of the detailed analysis script that calls plt.show() displays the graphs on your screen in addition to saving them as PNG files. If you're running the script in a non-interactive environment (like a server or automated pipeline), comment out the plt.show() lines to prevent the script from hanging.
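As an alternative to commenting lines out, a non-interactive Matplotlib backend can be selected so that figures are only written to disk. The sketch below assumes the `Agg` backend is acceptable for the target environment; `save_boxplot`, the sample values, and the output file name are hypothetical and used only for illustration.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: renders to files, opens no window
import matplotlib.pyplot as plt


def save_boxplot(samples, path):
    """Draw a boxplot of the given samples and write it to `path`."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.boxplot(samples)
    ax.set_title("Placeholder boxplot")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)  # release the figure's memory


# Illustrative values only; not taken from the Biscobis dataset
output_path = os.path.join(tempfile.gettempdir(), "placeholder_boxplot.png")
save_boxplot([[1, 2, 3, 4], [2, 3, 5, 8]], output_path)
print(f"Saved {output_path}")
```

Selecting the backend before importing `matplotlib.pyplot` keeps the choice explicit and independent of whatever display is available on the machine.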
Copyright 2024 Quantum Software Development. Code released under the MIT license.
Owner
- Name: Quantum Software Development
- Login: Quantum-Software-Development
- Kind: organization
- Repositories: 1
- Profile: https://github.com/Quantum-Software-Development
Quantum 4 All !
Citation (CITATION.cff)
cff-version: 1.2.0
title: Quantum-Software-Development
message: If you really want to cite this repository, here's how you should cite it.
type: software
authors:
  - given-names: Quantum-Software-Development - statistical-measures
repository-code: https://github.com/Quantum-Software-Development/statistical-measures
license: MIT License
GitHub Events
Total
- Issues event: 6
- Watch event: 1
- Delete event: 99
- Issue comment event: 3
- Push event: 117
- Pull request event: 184
- Fork event: 2
- Create event: 92
Last Year
- Issues event: 6
- Watch event: 1
- Delete event: 99
- Issue comment event: 3
- Push event: 117
- Pull request event: 184
- Fork event: 2
- Create event: 92
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Fabiana 🚀 Campanari | f****i@g****m | 60 |
| Fabiana 🚀 Campanari | 1****i | 38 |
| dependabot[bot] | 4****] | 6 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 4
- Total pull requests: 216
- Average time to close issues: 20 minutes
- Average time to close pull requests: 5 days
- Total issue authors: 1
- Total pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.04
- Merged pull requests: 190
- Bot issues: 0
- Bot pull requests: 93
Past Year
- Issues: 4
- Pull requests: 216
- Average time to close issues: 20 minutes
- Average time to close pull requests: 5 days
- Issue authors: 1
- Pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.04
- Merged pull requests: 190
- Bot issues: 0
- Bot pull requests: 93
Top Authors
Issue Authors
- FabianaCampanari (6)
Pull Request Authors
- FabianaCampanari (196)
- dependabot[bot] (104)