https://github.com/azazh/sentiment-driven-financial-insights
Set up project structure and added foundational files
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (6.6%) to scientific vocabulary
Repository
Set up project structure and added foundational files
Basic Info
- Host: GitHub
- Owner: Azazh
- License: mit
- Language: Jupyter Notebook
- Default Branch: main
- Size: 788 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Exploratory Data Analysis (EDA) on Stock News Headlines and Company Stock Data
Table of Contents
- Project Overview
- Key Features
- Methodology
- Results and Insights
- Correlation Analysis
- Challenges
- Technologies Used
- How to Use
Project Overview
The dataset comprises 1,407,328 news headlines from over 1,034 publishers, combined with stock price data for major companies (META, AMZN, TSLA, NVDA, GOOG, AAPL, MSFT). This project includes:
- Textual analysis to extract headline lengths, sentiment, and topics.
- Time series analysis to identify publishing patterns.
- Sentiment vs. Stock Returns correlation analysis to evaluate how news sentiment affects daily stock price movements.
Key Features
Descriptive Statistics:
- Headline length analysis (mean: 73 characters, max: 512).
- Publisher activity (top contributor: Paul Quintaro with 228,373 articles).
Text Analysis:
- Topic modeling identified topics such as earnings reports, market reactions, and price targets.
- Sentiment analysis to classify articles as positive, neutral, or negative.
Stock Correlation Analysis:
- Analyzed relationships between daily sentiment scores and stock price returns for seven major stocks.
Time Series Analysis:
- Publication trends by day, month, and hour.
Methodology
Data Preprocessing:
- Handled missing values and ensured consistency between news dates and stock trading days.
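The date-alignment step can be pictured as an inner join on date. The sketch below is only an illustration using the standard library: the variable names, dates, and values are made up, and the notebook itself may well do this with pandas instead.

```python
from datetime import date

# Hypothetical inputs (made-up values): daily average sentiment keyed by
# calendar date, and adjusted closes keyed by trading date.
daily_sentiment = {
    date(2020, 6, 5): 0.10,   # Friday
    date(2020, 6, 6): -0.30,  # Saturday, no trading session
    date(2020, 6, 8): 0.25,   # Monday
}
adjusted_close = {
    date(2020, 6, 5): 100.0,
    date(2020, 6, 8): 102.0,
    date(2020, 6, 9): 101.0,  # trading day with no news
}


def align_on_trading_days(sentiment, prices):
    """Keep only dates present in both series (an inner join on date)."""
    shared_dates = sorted(sentiment.keys() & prices.keys())
    return [(d, sentiment[d], prices[d]) for d in shared_dates]


aligned = align_on_trading_days(daily_sentiment, adjusted_close)
for day, score, price in aligned:
    print(day.isoformat(), score, price)
```

Dropping unmatched dates is one possible policy; carrying weekend sentiment forward to the next trading day is another, and the README does not say which one the project uses.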
Sentiment Analysis:
- News headlines were analyzed to compute daily average sentiment scores using NLP techniques.
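To show what "daily average sentiment" means mechanically, here is a minimal sketch. The word lists and the `score_headline` function are hypothetical stand-ins for whatever NLP model the notebook uses (the README names `nltk` but not a specific scorer), and the headlines are invented.

```python
from collections import defaultdict
from statistics import mean

# Toy word lists standing in for a real sentiment model (assumption).
POSITIVE = {"beats", "upgrade", "surges", "record"}
NEGATIVE = {"misses", "downgrade", "falls", "lawsuit"}


def score_headline(text):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


def daily_average_sentiment(rows):
    """rows: iterable of (date_string, headline). Returns {date: mean score}."""
    scores_by_day = defaultdict(list)
    for day, headline in rows:
        scores_by_day[day].append(score_headline(headline))
    return {day: mean(scores) for day, scores in scores_by_day.items()}


# Invented headlines for illustration only.
headlines = [
    ("2020-06-08", "Acme beats estimates, shares hit record"),
    ("2020-06-08", "Analyst downgrade weighs on Acme"),
    ("2020-06-08", "Acme surges after upgrade"),
    ("2020-06-09", "Acme holds annual meeting"),
]
daily = daily_average_sentiment(headlines)
print(daily)
```

Only the aggregation step (group by date, then average) is meant to mirror the description above; the scoring itself would come from a proper sentiment model.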
Stock Movement Analysis:
- Calculated daily percentage stock price returns based on adjusted closing prices.
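As a small worked example of the return calculation, the sketch below computes the percentage change between consecutive adjusted closes. The prices are made up and the function name is hypothetical; in pandas the equivalent is typically `Series.pct_change()` multiplied by 100.

```python
def daily_pct_returns(adj_close):
    """Percentage change between consecutive adjusted closing prices.

    The first day has no previous close, so the result has one fewer
    element than the input.
    """
    return [
        (today - previous) / previous * 100.0
        for previous, today in zip(adj_close, adj_close[1:])
    ]


prices = [100.0, 102.0, 99.96]  # made-up adjusted closes
returns = daily_pct_returns(prices)
print([round(r, 4) for r in returns])
```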
Correlation Analysis:
- Performed Pearson correlation between sentiment scores and stock returns to assess the strength and direction of the relationship.
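For reference, the Pearson coefficient is the covariance of the two series divided by the product of their standard deviations. The sketch below computes only the coefficient with the standard library; it is not the notebook's code, and the sample series are invented. A p-value needs a t-distribution, which is what `scipy.stats.pearsonr` provides alongside the coefficient.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Invented daily sentiment scores and percentage returns.
sentiment = [0.10, -0.20, 0.05, 0.30, -0.10]
returns = [0.5, -0.4, 0.1, 0.9, -0.3]
print(round(pearson_r(sentiment, returns), 4))
```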
Descriptive Statistics & NLP:
- Conducted headline length analysis, publisher contributions, and topic modeling.
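The headline-length and publisher statistics reduce to a few aggregate calls. The following is a hedged sketch on invented rows, with placeholder publisher names, using only the standard library rather than the project's actual pandas code.

```python
from collections import Counter
from statistics import mean

# Invented (publisher, headline) rows for illustration only.
rows = [
    ("Publisher A", "Acme raises full-year guidance"),
    ("Publisher A", "Analyst lifts Acme price target"),
    ("Publisher B", "Acme Q4 EPS tops estimates"),
]

# Headline length in characters.
lengths = [len(headline) for _, headline in rows]
length_summary = {"mean": mean(lengths), "max": max(lengths)}

# Number of articles per publisher, most active first.
top_publishers = Counter(publisher for publisher, _ in rows).most_common(2)

print(length_summary)
print(top_publishers)
```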
Results and Insights
Descriptive Statistics
- Headline Lengths: Average length is 73 characters, with 50% of headlines between 47 and 87 characters.
- Publisher Contributions:
- Top contributors include Paul Quintaro, Lisa Levin, and Benzinga Newsdesk.
Text Analysis
- Top Topics Identified:
- Topic 1: Earnings reports and estimates (e.g., "EPS", "sales", "Q4").
- Topic 2: Price targets and ratings (e.g., "upgrade", "downgrade", "target").
- Topic 3: Market reactions (e.g., "shares", "trading", "performance").
Time Series Analysis
- Daily Trends: Peak publication activity occurs on Thursdays; weekends show minimal activity.
- Hourly Trends: Most news articles are published during market hours, with a sharp peak at 10:00 AM.
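Counts like these come from bucketing publication timestamps by weekday and hour. The sketch below assumes a `YYYY-MM-DD HH:MM:SS` timestamp format and uses invented timestamps; the dataset's real format and time zone handling are not described in this README.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Invented publication timestamps (format is an assumption).
timestamps = [
    "2020-06-04 10:05:00",
    "2020-06-04 10:40:00",
    "2020-06-05 15:30:00",
    "2020-06-06 09:00:00",
]

parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
by_weekday = Counter(WEEKDAYS[dt.weekday()] for dt in parsed)
by_hour = Counter(dt.hour for dt in parsed)

print(by_weekday.most_common(1))
print(by_hour.most_common(1))
```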
Correlation Analysis
The correlation analysis examined the relationship between average daily sentiment scores and daily stock returns. Results for each stock are summarized below:
| Stock | Correlation | P-Value | Observation |
|-------|-------------|---------|-------------|
| META | -0.0061 | 0.7943 | Weak negative correlation, not significant. |
| AMZN | -0.0194 | 0.3592 | Weak negative correlation, not significant. |
| TSLA | 0.0277 | 0.1909 | Weak positive correlation, not significant. |
| NVDA | 0.0091 | 0.6668 | Weak positive correlation, not significant. |
| GOOG | 0.0143 | 0.5007 | Weak positive correlation, not significant. |
| AAPL | -0.0028 | 0.8944 | Negligible negative correlation. |
| MSFT | -0.0118 | 0.5776 | Weak negative correlation, not significant. |
Key Observations:
- All correlation coefficients are close to zero, indicating negligible relationships between news sentiment and stock price movements.
- High p-values (greater than 0.05) suggest that the correlations are not statistically significant.
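These two observations can be checked directly against the reported numbers. The snippet below copies the coefficients and p-values from the results table and applies the conventional 0.05 threshold; the variable names are illustrative only.

```python
# (correlation, p-value) per ticker, copied from the results table.
results = {
    "META": (-0.0061, 0.7943),
    "AMZN": (-0.0194, 0.3592),
    "TSLA": (0.0277, 0.1909),
    "NVDA": (0.0091, 0.6668),
    "GOOG": (0.0143, 0.5007),
    "AAPL": (-0.0028, 0.8944),
    "MSFT": (-0.0118, 0.5776),
}

ALPHA = 0.05  # conventional significance threshold

# Tickers whose correlation would count as statistically significant.
significant = [ticker for ticker, (_, p) in results.items() if p < ALPHA]

# Ticker with the largest coefficient in absolute value.
strongest = max(results, key=lambda ticker: abs(results[ticker][0]))

print("significant:", significant)
print("largest |r|:", strongest, results[strongest][0])
```

No ticker clears the threshold, and even the largest coefficient in absolute value (TSLA) stays below 0.03.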
Challenges
- Date Alignment: Aligning sentiment data with valid stock trading days required rigorous data normalization.
- Sparse Sentiment Data: On certain days, a limited number of news articles resulted in less representative sentiment scores.
- Low Correlation: Results suggest external factors (e.g., market forces) may dominate short-term stock price changes, reducing sentiment influence.
Technologies Used
- Python: Primary language for data processing and analysis.
- Libraries:
  - pandas, numpy: Data manipulation and statistical analysis.
  - matplotlib, seaborn: Visualizations.
  - nltk, scikit-learn: Sentiment analysis and NLP tools.
  - TA-Lib, PyNance: Stock price indicators and financial metrics.
How to Use
Clone this repository:
```bash
git clone https://github.com/Azazh/Sentiment-Driven-Financial-Insights.git
```
Install the required libraries:
```bash
pip install -r requirements.txt
```
Run the analysis notebook:
- Open and execute Stock_Analysis_Insights.ipynb in Jupyter Notebook.
Output Files:
- Analysis results and visualizations will be generated under the output/ directory.
Owner
- Login: Azazh
- Kind: user
- Repositories: 1
- Profile: https://github.com/Azazh
GitHub Events
Total
- Delete event: 1
- Push event: 6
- Public event: 1
- Pull request event: 2
- Create event: 5
Last Year
- Delete event: 1
- Push event: 6
- Public event: 1
- Pull request event: 2
- Create event: 5