https://github.com/b1f6c1c4/earnings-llm
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.4%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: b1f6c1c4
- Language: Jupyter Notebook
- Default Branch: master
- Size: 7.36 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Evaluating the Impact of Chain-of-Thought Length in LLMs on Stock Price Movement Predictions
Objective: Assess whether large language models (LLMs) with extended chain-of-thought reasoning improve predictions of stock price movements following earnings announcements.
Background: Public companies listed on NASDAQ are required to publish quarterly earnings reports, which significantly impact stock prices. These earnings are typically announced either after market close (around 4:05 PM) or before market open (around 9:00 AM).
For earnings announced after market close, stock prices react almost instantly in after-hours trading, depending on whether earnings exceed or miss analysts' expectations. Overnight, extended-hours trading further adjusts the price, influenced by investor sentiment, option market signals, and liquidity.
At the 9:30 AM market open, another sharp movement occurs. The key question is whether this movement continues the after-hours trend or reverses, and whether longer chain-of-thought reasoning in LLMs can improve the accuracy of such predictions.
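The continuation-versus-reversal question can be made concrete with a small sketch. The function name, its parameters, and the sample prices below are assumptions made for illustration; they are not taken from the repository.

```javascript
// Hypothetical helper: compare the direction of the extended-hours move
// with the direction of the move at the 9:30 AM open.
//   closePrice    - last regular-session price before the announcement
//   preOpenPrice  - last extended-hours price before the open
//   postOpenPrice - price shortly after the open
function classifyOpenMove(closePrice, preOpenPrice, postOpenPrice) {
  const extendedMove = preOpenPrice - closePrice;
  const openMove = postOpenPrice - preOpenPrice;
  // If either move is zero there is no direction to compare.
  if (extendedMove === 0 || openMove === 0) return 'flat';
  return Math.sign(extendedMove) === Math.sign(openMove)
    ? 'continuation'
    : 'reversal';
}

console.log(classifyOpenMove(100, 104, 106)); // continuation
console.log(classifyOpenMove(100, 104, 102)); // reversal
```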
Methodology:
- Data Collection: Gather historical earnings reports, corresponding stock price data, and analyst estimates.
- LLM Configuration: Develop multiple LLMs with varying chain-of-thought lengths.
- Input Preparation: Convert financial data into textual format suitable for LLM processing.
- Prediction and Evaluation: Analyze each model's predictive accuracy regarding stock price movements post-earnings announcements.
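To illustrate the input preparation step, here is a minimal sketch of turning one earnings record into a sentence an LLM can read. The record shape (`symbol`, `quarter`, `when`, `epsActual`, `epsEstimate`) and the wording of the output are assumptions, not the repository's actual schema or prompt.

```javascript
// Hypothetical record shape; field names are assumptions for illustration.
function describeEarnings(e) {
  const intro =
    `${e.symbol} reported ${e.quarter} earnings ${e.when}. ` +
    `EPS was ${e.epsActual.toFixed(2)} against an estimate of ${e.epsEstimate.toFixed(2)}`;
  // A relative surprise is undefined when the estimate is zero.
  if (e.epsEstimate === 0) return `${intro}.`;
  const surprisePct =
    ((e.epsActual - e.epsEstimate) / Math.abs(e.epsEstimate)) * 100;
  const verdict = surprisePct > 0 ? 'beat' : surprisePct < 0 ? 'missed' : 'met';
  return `${intro}, so the company ${verdict} expectations by ${Math.abs(surprisePct).toFixed(1)}%.`;
}

console.log(
  describeEarnings({
    symbol: 'XYZ', // made-up ticker
    quarter: '2025Q1',
    when: 'after market close',
    epsActual: 1.1,
    epsEstimate: 1.0,
  })
);
```

A real report would presumably also cover price history and extended-hours activity; this sketch covers only the earnings surprise.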
References:
- Mackintosh, Phil. "Earnings Announcements Sliced and Diced." nasdaq.com
Steps to reproduce the results
Install npm packages and pip packages:
```bash
npm ci
virtualenv venv
. ./venv/bin/activate
pip install jupyter
```

Set the following environment variables in the file `.env`:
- `FINNHUB_API_KEY`
- `MONGO_URL`
- `DATABENTO_API_KEY`
- `GEMINI_API_KEY`
- `GROQ_API_KEY`
- `OLLAMA_URL`
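Before running the scripts, it can help to confirm that every variable listed above is set. The following is a minimal sketch that only inspects `process.env`; how the repository's scripts actually load `.env` is not stated in this README, and the helper name `missingEnvVars` is hypothetical.

```javascript
// Variable names are taken from the list above.
const REQUIRED_ENV_VARS = [
  'FINNHUB_API_KEY',
  'MONGO_URL',
  'DATABENTO_API_KEY',
  'GEMINI_API_KEY',
  'GROQ_API_KEY',
  'OLLAMA_URL',
];

// Hypothetical helper: return the names that are unset or empty.
function missingEnvVars(env = process.env) {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]);
}

const missing = missingEnvVars();
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
```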
Run the JavaScript scripts with Node.js in numerical order. On Linux you can use:

```bash
run-parts --regex '\.js$' scripts
```

To run them manually, invoke:

- `node scripts/01-download-earnings.js`: Download 1-month company earnings data from Finnhub
  - writes to MongoDB collection `earnings.earnings`
- `node scripts/02-download-index.js`: Download 1-month stock index data from Databento
  - writes to MongoDB timeseries `earnings.stock_indexes`
- `node scripts/03-download-symbols.js`: Download stock symbol data from Databento
  - writes to MongoDB collection `earnings.symbols`
- `node scripts/04-download-ohlcv.js`: Download historical stock price (bid, ask, trade) data from Databento, including extended hours
  - writes to MongoDB collection `earnings.prices`
- `node scripts/05-unify-symbols.js`: Transform the downloaded stock symbol data, keeping only actively traded U.S. stocks
  - reads from MongoDB collection `earnings.symbols`
  - writes to MongoDB collection `earnings.symbol_ids`
- `node scripts/06-transform-price.js`: Transform the downloaded stock price data into a MongoDB timeseries for faster, easier processing
  - reads from MongoDB collection `earnings.prices`
  - writes to MongoDB timeseries `earnings.prices_cleaned`
- `node scripts/07-transform-earnings.js`: Combine earnings data with stock price data, computing key stock metrics
  - reads from MongoDB collection `earnings.earnings`
  - reads from MongoDB timeseries `earnings.stock_indexes`
  - reads from MongoDB timeseries `earnings.prices_cleaned`
  - writes to MongoDB collection `earnings.earnings_cleaned`
- `node scripts/08-generate-descriptions.js`: For each earnings event, generate a comprehensive textual report summarizing the historical stock price movement as well as intraday, after-market, and pre-market trading activity before and after the earnings release
  - reads from MongoDB collection `earnings.earnings_cleaned`
  - writes to MongoDB collection `earnings.earnings_cleaned`
- `node scripts/09-combine-descriptions.js`: Partition all valid earnings data into examples (n=3) and a test set (n=120), then compile LLM prompts for making a prediction on each test item
  - reads from MongoDB collection `earnings.earnings_cleaned`
  - writes to files in `desc/<symbol>_<quarter>*.txt`
- `node scripts/10-query-llms.js`: For each LLM prompt, invoke several LLM APIs to get answers (Gemini + GroqCloud + Ollama)
  - reads from files in `desc/<symbol>_<quarter>*.txt`
  - writes to MongoDB collection `earnings.llm_outputs`
- `node scripts/11-import-output.js`: Some LLMs do not have an open API or are too expensive, so their outputs must be collected manually into a `*.tsv` file and then imported into MongoDB
  - reads from the file specified by the command line arguments
  - writes to MongoDB collection `earnings.llm_outputs`
- `node scripts/12-parse-order.js`: For each LLM output, parse the requested trade order and compute the net profit from that trade
  - reads from MongoDB collection `earnings.llm_outputs`
  - writes to MongoDB collection `earnings.llm_outputs`
- `node scripts/13-visualize-timeline.js`: For each LLM, organize profit/loss into a timeline for easier visualization
  - reads from MongoDB collection `earnings.llm_outputs`
  - writes to file `visual/timeline.html`
- `node scripts/14-export-csv.js`: Organize the data into CSV for easier Python processing
  - reads from MongoDB collection `earnings.llm_outputs`
  - writes to file `visual/data.csv`
  - writes to file `visual/data.json`

Open the Jupyter Notebook file `scripts/15-data-visualizations.ipynb` and follow the directions:

```bash
jupyter lab --ip 0.0.0.0 --no-browser
```
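As an illustration of the order-parsing step (script 12 in the list above), the sketch below parses a trade order from free text and computes the profit against an exit price. The order format `<BUY|SELL> <quantity> @ <price>`, the function names, and the sample numbers are all assumptions; the format the repository actually asks the LLMs to produce is not described in this README, and fees and slippage are ignored.

```javascript
// Assumed (hypothetical) order format: "<BUY|SELL> <quantity> @ <price>".
function parseOrder(text) {
  const match = /\b(BUY|SELL)\s+(\d+)\s*@\s*(\d+(?:\.\d+)?)/i.exec(text);
  if (!match) return null; // no recognizable order in the LLM output
  return {
    side: match[1].toUpperCase(),
    quantity: Number(match[2]),
    price: Number(match[3]),
  };
}

// Profit of closing the position at exitPrice.
// A SELL is treated as a short position, so it gains when the price falls.
function netProfit(order, exitPrice) {
  const direction = order.side === 'BUY' ? 1 : -1;
  return direction * order.quantity * (exitPrice - order.price);
}

const order = parseOrder('I would BUY 10 @ 101.50 at the open.');
console.log(order); // { side: 'BUY', quantity: 10, price: 101.5 }
console.log(netProfit(order, 103)); // 15
```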
Owner
- Login: b1f6c1c4
- Kind: user
- Location: NJ, USA
- Company: Princeton University
- Repositories: 26
- Profile: https://github.com/b1f6c1c4
52BE D143 A92D BE96 2B83 092B 9BAC 0164 9600 1E70
GitHub Events
Total
- Public event: 1
- Push event: 1
- Fork event: 1
Last Year
- Public event: 1
- Push event: 1
- Fork event: 1
Committers
Last synced: 12 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| b1f6c1c4 | b****4@g****m | 27 |
| Anoopchandra | a****i@g****m | 2 |
| IoriOikawa | p****e@g****m | 1 |
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0