flipkart_marketing_analytics
Performing RFM customer segmentation and developing relevant marketing strategies for Flipkart using eCommerce data.
Statistics
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
README.md
Flipkart Marketing Analytics
Flipkart is currently the largest e-commerce retailer in India. To retain this position and stay ahead of the competition, it is crucial for Flipkart to plan and utilise its marketing budget more efficiently. This can be achieved by designing customised marketing programs that target specific customer segments based on their identified underlying needs.
Objective
Analysing past customer transaction data yields managerial insights about the customers. Customer segmentation groups customers with similar characteristics and allows us to calculate the Customer Lifetime Value (CLV), which indicates how much value customers bring to the company over their entire stream of lifetime purchases.
With a better understanding of the customers and their value contribution, marketing strategies can then be designed for identified attractive segments while ensuring that marketing costs are allocated efficiently to reap the highest returns.
Research Question
How should Flipkart segment its customers and design targeted marketing strategies?
Project Resources
This documentation provides only a high-level summary of the notebooks within the repository. Further details and the full analysis conducted for this project are documented in the report linked here.
Project Directory Structure
├── Data
│ └── ...
├── Data Cleaning & EDA (Purchase).ipynb
├── Data Cleaning & EDA (Psychometric and Demographic).ipynb
├── Data Preparation (Purchase).ipynb
├── Data Preparation (Psychometric and Demographic).ipynb
├── Factor Analysis.ipynb
├── Linear Regression - RFM Weights.ipynb
├── Customer Segmentation.ipynb
├── Segment and Descriptor Differences.ipynb
├── Classification Model.ipynb
└── ...
This project is built on Python 3 and scripts were originally hosted on Google Colab. Required packages are installed individually in each .ipynb file.
The Data/ folder consists of the datasets used and generated.
1. Data Gathering, Cleaning & Exploration
The Flipkart data was retrieved from Kaggle - chandanmalla/marketinganalyticsdata. There are 2 datasets available: (1) purchase and (2) psychometric and demographic data.
Data cleaning and exploration were conducted on these 2 datasets in the Data Cleaning & EDA (Purchase).ipynb and Data Cleaning & EDA (Psychometric and Demographic).ipynb notebooks respectively.
2. Data Preparation
Using domain knowledge, the cleaned data is then transformed into variables relevant to the subsequent customer segmentation and analysis.
2.1 Purchase Data
Data preparation for the purchase data includes the calculation of Recency, Frequency and Monetary Value for subsequent customer segmentation (RFM segmentation). The execution is detailed in the Data Preparation (Purchase).ipynb notebook.
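As a rough illustration of the RFM calculation (not the notebook's actual code), the sketch below derives Recency, Frequency and Monetary Value from a list of transactions. The record layout `(customer_id, purchase_date, amount)`, the sample rows and the snapshot date are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("C1", date(2021, 1, 5), 500.0),
    ("C1", date(2021, 3, 20), 1200.0),
    ("C2", date(2021, 2, 14), 300.0),
]

def compute_rfm(rows, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in rows:
        # Track the most recent purchase date per customer.
        if customer not in last_purchase or day > last_purchase[customer]:
            last_purchase[customer] = day
        frequency[customer] += 1      # number of purchases
        monetary[customer] += amount  # total spend
    return {
        c: ((snapshot - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }

rfm = compute_rfm(transactions, snapshot=date(2021, 4, 1))
```

Recency here is measured in days before a fixed snapshot date, so a smaller value means a more recent purchase.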
2.2 Psychometric and Demographic Data
To prepare the psychometric and demographic data, categorical variables were converted into numerical variables in the Data Preparation (Psychometric and Demographic).ipynb notebook.
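A minimal sketch of two common ways to convert categorical variables into numbers is shown below: an ordinal mapping for ordered categories and indicator (one-hot) columns for unordered ones. The column names and category levels are hypothetical, and the notebook may use a different encoding.

```python
# Hypothetical survey rows; column names and categories are illustrative only.
rows = [
    {"income_band": "low", "city_tier": "metro"},
    {"income_band": "high", "city_tier": "rural"},
    {"income_band": "medium", "city_tier": "metro"},
]

# Ordered categories map to integers that preserve their rank.
INCOME_ORDER = {"low": 0, "medium": 1, "high": 2}

def encode(rows):
    """Encode each row as a dict of numeric values."""
    # Unordered categories become one indicator column per level.
    tiers = sorted({r["city_tier"] for r in rows})
    encoded = []
    for r in rows:
        out = {"income_band": INCOME_ORDER[r["income_band"]]}
        for t in tiers:
            out[f"city_tier_{t}"] = int(r["city_tier"] == t)
        encoded.append(out)
    return encoded

encoded = encode(rows)
```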
Because of the large number of variables, factor analysis was carried out in the Factor Analysis.ipynb notebook to reduce dimensionality. The newly generated factors produced non-ideal results and were not used in the final analysis, but they were compared against the original segmentation and descriptor variables for insights.
3. Customer Segmentation
Customer segmentation was carried out using RFM segmentation.
3.1 Setting RFM weights
Although RFM weights for the segmentation could be set empirically based on the nature of the business and contextual knowledge, we experimented with estimating them using a Linear Regression model instead, as seen in the Linear Regression - RFM Weights.ipynb notebook. However, because the adjusted R-squared value was poor, the model was not used and the RFM weights were eventually set empirically.
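The idea of estimating weights by regression and checking the adjusted R-squared can be sketched as follows, assuming NumPy is available. The target variable is not stated in this summary, so `y` below is a hypothetical outcome, and the predictor values are made up. The demo data is noise-free, so it fits almost perfectly, unlike the project's poor fit.

```python
import numpy as np

def fit_rfm_weights(X, y):
    """Ordinary least squares with an intercept.

    Returns (weights for the predictors, adjusted R-squared).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-squared penalises the number of predictors p.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return coef[1:], adj_r2

# Hypothetical scaled R, F, M values and a hypothetical target.
X = [[0.1, 0.9, 0.8], [0.5, 0.4, 0.3], [0.9, 0.1, 0.2],
     [0.3, 0.7, 0.6], [0.7, 0.2, 0.5], [0.2, 0.8, 0.1]]
y = [2.0 * r + 3.0 * f + 5.0 * m + 1.0 for r, f, m in X]
weights, adj_r2 = fit_rfm_weights(X, y)
```

A low adjusted R-squared on real data would suggest that the fitted coefficients explain little of the target, which is the situation that led the project to fall back on empirically chosen weights.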
3.2 RFM Segmentation
Using the RFM weights and variables generated in the previous stages, an RFM score is generated for each customer. RFM segmentation is then conducted by applying different cut-offs to the RFM scores.
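A minimal sketch of a weighted RFM score with cut-off based segments is shown below. The weights and cut-offs are hypothetical, since the actual values are not given in this summary, and the R, F and M inputs are assumed to be pre-scaled to the range 0 to 1 with higher meaning better.

```python
from bisect import bisect_right

# Hypothetical weights and cut-offs; the project's actual values are not given here.
WEIGHTS = {"r": 0.2, "f": 0.3, "m": 0.5}
CUTOFFS = [0.25, 0.5, 0.75]  # three cut-offs give four segments

def rfm_score(r, f, m, weights=WEIGHTS):
    """Weighted sum of R, F, M scores, each assumed pre-scaled to [0, 1]
    with higher meaning better (so recency is already inverted)."""
    return weights["r"] * r + weights["f"] * f + weights["m"] * m

def assign_segment(score, cutoffs=CUTOFFS):
    """Return a segment index from 0 (lowest scores) to len(cutoffs)."""
    return bisect_right(cutoffs, score)

score = rfm_score(r=1.0, f=0.5, m=0.5)
segment = assign_segment(score)
```

Changing the number of entries in `CUTOFFS` changes the number of segments, which is one way to compare groupings of different sizes.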
To identify the best number of segments for a relatively small dataset, the decision combined managerial judgement with numerical and strategic criteria. The customers were grouped into 2, 3, 4 and 5 segments, and each grouping was analysed using contextual knowledge. Four segments were chosen as the optimal number.
A transition matrix across the 4 segments was generated to measure how customers moved from one segment to another over a period of 3 months. The Customer Lifetime Value (CLV) of these customers was then calculated, providing an indication of how much value customers bring to the company over their entire stream of lifetime purchases.
The source code can be found in the Customer Segmentation.ipynb notebook.
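One common way to combine a transition matrix with CLV is a Markov-chain calculation, sketched below under stated assumptions. The per-period segment values, discount rate and horizon are hypothetical, the example uses two segments for brevity, and the notebook may compute CLV differently.

```python
from collections import Counter

def transition_matrix(before, after, n_segments):
    """Row-normalised counts of moves from segment i to segment j.

    `before` and `after` are parallel lists of segment indices per customer.
    """
    counts = Counter(zip(before, after))
    matrix = []
    for i in range(n_segments):
        row_total = sum(counts[(i, j)] for j in range(n_segments))
        matrix.append([
            counts[(i, j)] / row_total if row_total else 0.0
            for j in range(n_segments)
        ])
    return matrix

def clv_by_segment(matrix, value_per_period, discount_rate, periods):
    """Discounted expected value for a customer starting in each segment."""
    n = len(matrix)
    clv = []
    for start in range(n):
        dist = [1.0 if j == start else 0.0 for j in range(n)]
        total = 0.0
        for t in range(periods):
            expected = sum(p * v for p, v in zip(dist, value_per_period))
            total += expected / (1.0 + discount_rate) ** t
            # Advance the segment distribution by one period.
            dist = [sum(dist[i] * matrix[i][j] for i in range(n))
                    for j in range(n)]
        clv.append(total)
    return clv

# Hypothetical two-segment example (the project used four segments).
before = [0, 0, 1, 1]
after = [0, 1, 1, 1]
P = transition_matrix(before, after, n_segments=2)
clv = clv_by_segment(P, value_per_period=[10.0, 100.0],
                     discount_rate=0.0, periods=2)
```

Each row of `P` gives the estimated probability that a customer in one segment ends the period in each of the segments, so every non-empty row sums to 1.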
4. Segment and Descriptor Analysis
Based on the segments generated, the segmentation and descriptor variables were analysed in the Segment and Descriptor Differences.ipynb notebook to identify the unique characteristics of each segment, so that relevant marketing strategies can be developed.
5. Classification Model
Predicting the segment membership of new customers would help Flipkart tailor marketing campaigns to acquire and retain these individuals. Because descriptor variables may be easier to obtain than segmentation variables, we explored whether descriptors could be used to predict segment membership using a multinomial logistic regression model in the Classification Model.ipynb notebook.
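To illustrate what a multinomial logistic regression does, the sketch below fits one from scratch with batch gradient descent, assuming NumPy is available. This is not the notebook's code, which may rely on a library implementation. The descriptor values, segment labels, learning rate and epoch count are all hypothetical.

```python
import numpy as np

def fit_softmax(X, y, n_classes, lr=0.5, epochs=1000):
    """Fit multinomial logistic regression by batch gradient descent."""
    X = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    Y = np.eye(n_classes)[np.asarray(y)]  # one-hot targets
    W = np.zeros((X.shape[1], n_classes))
    for _ in range(epochs):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)    # softmax probabilities
        W -= lr * (X.T @ (probs - Y)) / len(X)       # cross-entropy gradient step
    return W

def predict(W, X):
    """Return the most probable class index for each row of X."""
    X = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    return (X @ W).argmax(axis=1)

# Hypothetical descriptor values and segment labels.
X = [[0.0, 0.0], [0.2, 0.1], [2.0, 0.0], [2.1, 0.2], [0.0, 2.0], [0.1, 2.2]]
y = [0, 0, 1, 1, 2, 2]
W = fit_softmax(X, y, n_classes=3)
pred = predict(W, X)
```

On real descriptor data, accuracy should be checked on held-out customers rather than on the training rows used here.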
Contributors
- Carine Tan - @carine99
- Goh Jia Yi - @gohjiayi
- Koh Gladys - @gladyskoh
- Koh Min - @kohmin
- Loh Zi Ying - @Mochihaha
Owner
- Name: Goh Jia Yi, Jesa
- Login: gohjiayi
- Kind: user
- Location: Singapore
- Website: gohjiayi.github.io
- Repositories: 6
- Profile: https://github.com/gohjiayi
Citation (CITATION.cff)
cff-version: 1.2.0
title: "Flipkart Marketing Analytics"
message: >-
  If you use any project resources in your work, please cite
  it as below.
type: software
authors:
  - given-names: Jia Yi
    family-names: Goh
    orcid: 'https://orcid.org/0000-0002-3943-0740'
  - given-names: Carine
    family-names: Tan
  - given-names: Gladys
    family-names: Koh
  - given-names: Min
    family-names: Koh
  - given-names: Zi Ying
    family-names: Loh
date-released: "2022-06-29"
url: "https://github.com/gohjiayi/flipkart_marketing_analytics"