https://github.com/dadananjesha/credit-card-fraud-detection

Credit Card Fraud Detection is a state-of-the-art real-time streaming analytics solution designed to detect fraudulent credit card transactions instantly.

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file: found
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity: low similarity (9.6%) to scientific vocabulary

Keywords

case-study credit-card fraud-detection fraud-prevention fraudulent-transactions iiit-bangalore kafka pyspark spark upgrad
Last synced: 5 months ago

Repository

Credit Card Fraud Detection is a state-of-the-art real-time streaming analytics solution designed to detect fraudulent credit card transactions instantly.

Basic Info
  • Host: GitHub
  • Owner: DadaNanjesha
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 1.6 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
case-study credit-card fraud-detection fraud-prevention fraudulent-transactions iiit-bangalore kafka pyspark spark upgrad
Created 12 months ago · Last pushed 12 months ago
Metadata Files
  • README
  • License

README.md

Credit Card Fraud Detection 🚀💳

Badges: Python Version · Spark Version · Kafka Version · License: MIT

Credit Card Fraud Detection is a state-of-the-art real-time streaming analytics solution designed to detect fraudulent credit card transactions instantly. By harnessing the power of Apache Spark, Kafka, and HBase, this project combines dynamic rule-based evaluation, geo-spatial analysis, and historical data enrichment to secure financial transactions.


📖 Table of Contents

  • Overview
  • Key Features
  • Architecture
  • Project Structure
  • Installation
  • Usage
  • Testing & Evaluation
  • Contributing
  • License
  • Acknowledgements
  • Support & Star

🔍 Overview

Fraudulent transactions are one of the biggest challenges facing financial institutions today. Our solution processes transaction data in real time, enriches it with geo-location and historical insights, and uses a smart rules engine to classify transactions as GENUINE or FRAUD. This project is built with scalability and modularity in mind, making it easy to extend and adapt for evolving fraud detection needs.


✨ Key Features

  • ⚡ Real-Time Streaming:
    Seamlessly consumes live transactions from Kafka using Spark Structured Streaming.

  • 📍 Geo-Spatial Analysis:
    Utilizes CSV-based mapping of ZIP codes to compute accurate distances and risk factors.

  • 🛡️ Dynamic Rule Engine:
    Evaluates transactions against thresholds (e.g., Upper Control Limit, credit score, travel speed) to flag anomalies; a minimal sketch follows this list.

  • 💾 HBase Integration:
    Efficiently retrieves and updates historical transaction data using a robust DAO module powered by HappyBase.

  • 🔧 Modular Design:
    Clean, organized code structure ensures ease of maintenance, scalability, and future enhancements.
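
To make the rule-based evaluation concrete, the sketch below illustrates the kind of threshold checks described above. It is an illustrative example only, not the project's rules.py: the function name, field names, and threshold values are hypothetical placeholders.

```python
# Illustrative rule-engine sketch (hypothetical field names and thresholds;
# the project's actual logic lives in rules/rules.py).
MIN_CREDIT_SCORE = 200   # assumed minimum acceptable credit score
MAX_SPEED_KMPH = 900     # assumed maximum physically plausible travel speed

def classify_transaction(amount, ucl, credit_score, distance_km, hours_since_last_txn):
    """Return 'FRAUD' if any rule is violated, otherwise 'GENUINE'."""
    # Rule 1: amount must not exceed the customer's Upper Control Limit (UCL).
    if amount > ucl:
        return "FRAUD"
    # Rule 2: customers with a very low credit score are flagged.
    if credit_score < MIN_CREDIT_SCORE:
        return "FRAUD"
    # Rule 3: the implied travel speed between consecutive transactions
    # must be physically plausible.
    if hours_since_last_txn > 0 and (distance_km / hours_since_last_txn) > MAX_SPEED_KMPH:
        return "FRAUD"
    return "GENUINE"
```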


🏗️ Architecture

```mermaid
flowchart TD
    A[📥 Kafka: Transaction Stream] --> B[⚡ Spark Streaming]
    B --> C[🔍 Data Parsing & Enrichment]
    C --> D[💾 HBase Lookup & Update]
    D --> E[🛡️ Rule Engine Evaluation]
    E --> F{💡 Transaction Status}
    F -- Genuine --> G[✅ Forward to downstream systems]
    F -- Fraudulent --> H[🚨 Alert & Monitor]
```

Workflow:

  1. Data Ingestion: Kafka streams live transaction data into Spark.
  2. Stream Processing: Spark parses, timestamps, and enriches the data with geo-location and historical HBase records.
  3. Rule Evaluation: The rule engine applies custom logic to determine if a transaction is genuine or fraudulent.
  4. Data Update & Monitoring: HBase is updated with the enriched transaction data, and the results are output in real time.
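
The skeleton below sketches what steps 1 and 2 typically look like with Spark Structured Streaming and Kafka. It is a hedged outline rather than the project's driver.py: the broker address, topic name, and JSON schema are assumptions, and the enrichment and rule logic are reduced to a placeholder.

```python
# Minimal Structured Streaming skeleton (assumed broker, topic, and schema;
# the real application is driver.py and needs the spark-sql-kafka package
# on the Spark classpath).
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("fraud-detection-sketch").getOrCreate()

# Assumed transaction schema; adjust to the actual Kafka message format.
schema = StructType([
    StructField("card_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("postcode", StringType()),
    StructField("transaction_dt", StringType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker
       .option("subscribe", "transactions")                   # assumed topic
       .load())

transactions = (raw
                .select(from_json(col("value").cast("string"), schema).alias("t"))
                .select("t.*"))

def process_batch(batch_df, batch_id):
    # Placeholder for HBase lookup, geo enrichment, and rule evaluation.
    batch_df.show(truncate=False)

query = transactions.writeStream.foreachBatch(process_batch).start()
query.awaitTermination()
```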

Tip: Replace the placeholder diagram above with your actual architecture image if available!


🗂️ Project Structure

```plaintext
Credit-card-fraud-detection/
├── data/
│   └── uszipsv.csv        # 📍 CSV mapping ZIP codes to geo-coordinates
├── db/
│   ├── dao.py             # 💾 HBase DAO for read/write operations
│   └── geo_map.py         # 🌍 Geo-spatial utilities for location calculations
├── rules/
│   └── rules.py           # 🛡️ Rule engine for fraud detection logic
├── driver.py              # 🚀 Main Spark streaming application
├── LogicFinal.pdf         # 📄 Detailed design explanation and architecture
├── requirements.txt       # 📦 List of Python dependencies
└── README.md              # 📖 Project documentation (this file)
```

Each module is organized to promote clean code, easy debugging, and straightforward enhancements.
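
As an example of what the geo-spatial utilities do, the ZIP-to-coordinates lookup in geo_map.py can be approximated as below. This is a hedged sketch, not the module's actual code: the column names in uszipsv.csv (zip, lat, lng) are assumptions and may differ from the real file.

```python
# Haversine-style distance lookup sketch (assumed CSV columns: zip, lat, lng).
import math
import pandas as pd

# Load the ZIP-to-coordinate mapping shipped in data/uszipsv.csv.
zips = pd.read_csv("data/uszipsv.csv", dtype={"zip": str}).set_index("zip")

def distance_km(zip_a, zip_b):
    """Great-circle distance in kilometres between two ZIP codes."""
    lat1, lon1 = map(math.radians, zips.loc[zip_a, ["lat", "lng"]])
    lat2, lon2 = map(math.radians, zips.loc[zip_b, ["lat", "lng"]])
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))
```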


💻 Installation

Prerequisites

  • Python 3.8+
  • Apache Kafka (Ensure your Kafka broker is running)
  • Apache Spark (with Structured Streaming capabilities)
  • HBase (with HappyBase for Python)

Setup Steps

  1. Clone the Repository:

```bash
git clone https://github.com/yourusername/Credit-card-fraud-detection.git
cd Credit-card-fraud-detection
```

  2. Set Up a Virtual Environment:

```bash
python -m venv venv
source venv/bin/activate   # For Windows: venv\Scripts\activate
```

  3. Install Dependencies:

```bash
pip install -r requirements.txt
```

  4. Configure External Services:
    Ensure Kafka and HBase are up and running, and update connection settings in the code if needed.

🚀 Usage

Starting the Application

  1. Start Kafka & HBase:
    Make sure your Kafka broker and HBase server are active.

  2. Run the Application:

```bash
python driver.py
```

The application will:
  • Consume transactions from Kafka.
  • Enrich data with geo-location and historical insights from HBase.
  • Evaluate transactions using the rule engine.
  • Update HBase and display real-time transaction statuses on the console.

Monitoring

  • Console Output:
    Monitor real-time transaction statuses and alerts directly in your terminal.

  • HBase Shell:
    Use commands such as `list` and `scan 'look_up_table'` to inspect the updated data; a Python alternative is sketched below.
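
For programmatic inspection, the same table can be read through HappyBase, the library the DAO module is built on. This is a minimal sketch rather than the project's own DAO code; the host and Thrift port are assumptions and should be matched to your HBase setup (the table name comes from the README).

```python
import happybase

# Connect to HBase via the Thrift gateway (host/port are assumptions;
# HBase's Thrift server must be running, typically on port 9090).
connection = happybase.Connection(host="localhost", port=9090)
table = connection.table("look_up_table")  # table name taken from the README

# Print a handful of rows to verify that the streaming job is updating state.
for row_key, columns in table.scan(limit=5):
    print(row_key.decode(), {k.decode(): v.decode() for k, v in columns.items()})

connection.close()
```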


🔍 Testing & Evaluation

  • Simulated Transactions:
    Test the entire pipeline using simulated data or test streams from Kafka.

  • Performance Metrics:
    Extend the evaluation framework with metrics such as ROC-AUC, precision, recall, and F1-score for a more comprehensive analysis; a minimal scoring sketch follows this list.

  • Deep Dive Documentation:
    Refer to LogicFinal.pdf for an in-depth explanation of the design and processing flow.
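
As a starting point for such an evaluation, offline labels and predictions can be scored with scikit-learn. This is a hedged sketch under the assumption that you have collected ground-truth labels alongside the rule engine's FRAUD/GENUINE decisions; scikit-learn is not in requirements.txt and would need to be installed separately.

```python
# Minimal offline-evaluation sketch (assumes scikit-learn is installed:
# pip install scikit-learn). y_true and y_pred are placeholders for labels
# collected from a test stream, encoded as 1 = FRAUD, 0 = GENUINE.
from sklearn.metrics import roc_auc_score, precision_score, recall_score, f1_score

y_true = [0, 0, 1, 0, 1, 0, 0, 1]   # ground-truth labels (example data)
y_pred = [0, 0, 1, 0, 0, 0, 1, 1]   # rule-engine decisions (example data)

print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1-score: ", f1_score(y_true, y_pred))
# ROC-AUC is most meaningful with probability scores; with hard 0/1
# decisions it reduces to a single operating point.
print("ROC-AUC:  ", roc_auc_score(y_true, y_pred))
```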


🤝 Contributing

We welcome contributions to improve this project! Here’s how you can get involved:

  1. Fork the Repository
  2. Create a Feature Branch:

```bash
git checkout -b feature/your-feature-name
```

  3. Commit Your Changes:

```bash
git commit -m "Add feature or fix issue"
```

  4. Push the Branch and Open a Pull Request:
    Provide a detailed description of your changes for review.

📜 License

This project is licensed under the MIT License. See the LICENSE file for more details.


🙏 Acknowledgements

  • Inspiration & Guidance:
    A huge thank you to upGrad Education and the open-source community for their continuous support and inspiration.

  • Core Technologies:
    Special thanks to the teams behind Apache Kafka, Apache Spark, HBase, and HappyBase.

  • Community:
    We appreciate all the contributors who have helped improve this project.


⭐️ Support & Star

If you find this project useful, please consider starring it on GitHub, following the repository for updates, or forking it to contribute your improvements. Your support helps us continue to build and share valuable insights!

Happy coding and safe transactions! 🚀💳

Owner

  • Name: DADA NANJESHA
  • Login: DadaNanjesha
  • Kind: user
  • Location: BERLIN

GitHub Events

Total
  • Watch event: 1
  • Push event: 5
  • Pull request event: 6
  • Create event: 3
Last Year
  • Watch event: 1
  • Push event: 5
  • Pull request event: 6
  • Create event: 3

Issues and Pull Requests

Last synced: 11 months ago

All Time
  • Total issues: 0
  • Total pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
  • Issue authors: (none listed)
  • Pull request authors: DadaNanjesha (3)
Top Labels
  • Issue labels: (none listed)
  • Pull request labels: (none listed)

Dependencies

requirements.txt pypi
  • happybase >=1.2.0
  • kafka-python >=2.0.0
  • pandas >=1.0.0
  • pyspark >=3.0.0