https://github.com/aarya-gupta/brewery-sales-analysis-data-science

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.8%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Aarya-Gupta
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 0 Bytes
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme

README.md

From Data to Insights: Analyzing Factors Impacting Brewery Sales

This project investigates the factors that influence brewery sales by analyzing a dataset of 10 million records (2020–2024). The aim is to uncover insights that help breweries optimize inventory management, marketing, and operational efficiency in a competitive market.

Key Features of the Dataset

  • Production Factors: Beer style, pH levels, fermentation time, temperature, and ingredient ratios.
  • Sales Performance: Total sales, SKU, and geographical location.
  • Quality Metrics: Alcohol content, bitterness, color, and quality scores.
  • Efficiency: Loss during brewing, fermentation, and bottling/kegging.

Dataset

The dataset used in this project can be accessed on Kaggle: Brewery Operations and Market Analysis Dataset.

Methodology

1. Data Preprocessing

  • Imputed missing values: Median for numerical, Mode for categorical.
  • Removed duplicates: 894,036 entries eliminated.
  • Standardized data formats: Unified date formats, adjusted temperature to Celsius.
  • Managed outliers using logical limits and quantile-based trimming.
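The preprocessing steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the notebooks' actual code. The column names (`beer_style`, `ph_level`, `total_sales`), the sample rows, and the 1st/99th percentile cut-offs are assumptions made for the example.

```python
import statistics

def impute(rows, numeric_cols, categorical_cols):
    """Fill None values: median for numeric columns, mode for categorical ones."""
    fills = {}
    for col in numeric_cols:
        fills[col] = statistics.median([r[col] for r in rows if r[col] is not None])
    for col in categorical_cols:
        fills[col] = statistics.mode([r[col] for r in rows if r[col] is not None])
    return [
        {k: (fills[k] if v is None and k in fills else v) for k, v in r.items()}
        for r in rows
    ]

def drop_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

def trim_quantiles(rows, col, lower_pct=1, upper_pct=99):
    """Drop rows whose value in `col` lies outside the given percentile range."""
    cuts = statistics.quantiles([r[col] for r in rows], n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [r for r in rows if lo <= r[col] <= hi]

# Hypothetical sample rows (not taken from the Kaggle dataset).
rows = [
    {"beer_style": "IPA", "ph_level": 5.1, "total_sales": 1200.0},
    {"beer_style": "IPA", "ph_level": 5.1, "total_sales": 1200.0},      # exact duplicate
    {"beer_style": None, "ph_level": 4.9, "total_sales": 900.0},        # missing category
    {"beer_style": "Stout", "ph_level": None, "total_sales": 1500.0},   # missing number
    {"beer_style": "IPA", "ph_level": 5.3, "total_sales": 1_000_000.0}, # extreme value
]

imputed = impute(rows, numeric_cols=["ph_level"], categorical_cols=["beer_style"])
deduped = drop_duplicates(imputed)
trimmed = trim_quantiles(deduped, "total_sales")
print(len(rows), len(deduped), len(trimmed))
```

On a sample this small, quantile trimming always removes both extremes, so it only becomes meaningful on large data such as the 10 million records described here.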

2. Hypothesis Testing

Conducted statistical analyses to identify relationships between production variables and outcomes:

  • Beer Style vs. Quality: No significant relationship (Chi-Square Test, p-value: 0.355).
  • pH Levels Across Locations: Consistent across locations (ANOVA, p-value: 0.996).
  • Fermentation Time vs. Sales: No significant difference (t-test, p-value: 0.690).
  • Bitterness vs. Quality: Slight negative impact (Regression, p-value: 0.008).
  • Alcohol Content vs. Sales: No significant difference (Z-Test, p-value: 0.220).
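As an illustration of how one of these p-values can be obtained, here is a minimal two-sample z-test written with the Python standard library. The README does not say how the groups were formed, so the split of sales into high- and low-alcohol batches, the variable names, and the synthetic data are all assumptions for the sketch.

```python
import math
import random
from statistics import NormalDist, mean, variance

def two_sample_z_test(a, b):
    """Two-sided z-test for a difference in means.

    Uses sample variances in place of population variances, which is a
    large-sample approximation.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Synthetic sales figures for two hypothetical groups drawn from the
# same distribution, so no real difference exists between them.
rng = random.Random(42)
high_abv_sales = [rng.gauss(1000, 200) for _ in range(500)]
low_abv_sales = [rng.gauss(1000, 200) for _ in range(500)]

z, p = two_sample_z_test(high_abv_sales, low_abv_sales)
alpha = 0.05  # conventional significance level (an assumption; the README does not state one)
verdict = "significant" if p < alpha else "not significant"
print(f"z = {z:.3f}, p = {p:.3f} ({verdict})")
```

Read this way, a reported p-value of 0.220 is above 0.05 and gives no evidence of a difference, while 0.008 for bitterness vs. quality is below it.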

3. Dimensionality Reduction and Model Optimization

  • Applied Gaussian Random Projection, with the Johnson-Lindenstrauss lemma as the theoretical basis for the reduced dimension.
  • Reduced training runtime on large datasets by ~34%, at the cost of higher prediction error in Linear Regression.
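The idea behind this step can be sketched in plain Python. The bound below is the standard Johnson-Lindenstrauss estimate k ≥ 4 ln(n) / (ε²/2 − ε³/3), and the projection matrix has i.i.d. entries drawn from N(0, 1/k). The function names, the sizes, and the choice of ε are illustrative assumptions; the notebooks may well use a library implementation instead.

```python
import math
import random

def jl_min_dim(n_samples, eps):
    """Smallest target dimension the JL bound guarantees for distortion eps."""
    if not 0 < eps < 1:
        raise ValueError("eps must be in (0, 1)")
    return math.ceil(4 * math.log(n_samples) / (eps**2 / 2 - eps**3 / 3))

def gaussian_projection_matrix(n_features, n_components, seed=0):
    """Matrix of shape (n_features, n_components) with N(0, 1/n_components) entries."""
    rng = random.Random(seed)
    scale = 1 / math.sqrt(n_components)
    return [[rng.gauss(0, scale) for _ in range(n_components)]
            for _ in range(n_features)]

def project(X, R):
    """Multiply each row of X by R, giving rows of length n_components."""
    columns = list(zip(*R))
    return [[sum(x * r for x, r in zip(row, col)) for col in columns] for row in X]

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Two synthetic points with 50 features, projected down to 20 components.
rng = random.Random(1)
X = [[rng.random() for _ in range(50)] for _ in range(2)]
R = gaussian_projection_matrix(n_features=50, n_components=20)
X_small = project(X, R)

print("JL bound for 10 million samples, eps=0.1:", jl_min_dim(10_000_000, 0.1))
print("distance ratio after projection:",
      squared_distance(*X_small) / squared_distance(*X))
```

For large sample counts the JL bound can exceed the original number of features, in which case projecting to fewer dimensions carries no distortion guarantee. That may be one reason a speed-up came with higher regression error, though the README does not state the dimensions used.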

4. Scalability Enhancements

  • Used Kaggle T4 GPUs to reduce computation time from 13.03 minutes to 8.55 minutes (a reduction of roughly 34%).

Results

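  • Of the five hypothesis tests, only bitterness vs. quality was statistically significant at the conventional 0.05 level (p = 0.008). Beer style, pH level, fermentation time, and alcohol content showed no significant effect on the outcomes tested.
  • Dimensionality reduction cut training runtime by ~34% but increased Linear Regression prediction error rather than improving accuracy.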

Repository Structure

.
├── dataset-eda-regression-and-clustering.ipynb   # Exploratory Data Analysis of the used dataset
├── hypothesis_testing.ipynb                      # Statistical tests and hypothesis validation
├── randomised_scaling_techniques.ipynb           # Dimensionality reduction and scaling techniques
└── README.md                                     # Project overview (this file)

How to Use

  1. Clone the repository: git clone https://github.com/Aarya-Gupta/Brewery-Sales-Analysis-Data-Science.git
  2. Open the .ipynb files in Jupyter Notebook or Google Colab.
  3. Ensure dependencies are installed: pip install -r requirements.txt

Future Work

  • Implement advanced machine learning models like Random Forests or Neural Networks.
  • Explore real-time analytics using temporal models such as ARIMA or LSTMs.
  • Enhance scalability with distributed computing frameworks (e.g., Apache Spark).

Authors

  • Aarya Gupta
  • Adarsh Jha
  • Keshav Chhabra

License

This project is licensed under the MIT License. See the LICENSE file for details.

Owner

  • Login: Aarya-Gupta
  • Kind: user

GitHub Events

Total
  • Push event: 3
  • Create event: 2
Last Year
  • Push event: 3
  • Create event: 2