https://github.com/atharvapathak/twitter_sentiment_analysis_project
Twitter sentiment analysis is the process of analyzing tweets posted on the Twitter platform to determine the overall sentiment expressed within them. It involves using natural language processing (NLP) and machine learning techniques to classify tweets.
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.1%) to scientific vocabulary
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
1. Technologies Used
- Tweepy API
- NLTK
- BERT Model
- TensorFlow
- Seaborn
- Streamlit
2. Project Description
2.1 Data Extraction and Preprocessing
Using the Tweepy API, we scraped tweets for each condition (loneliness, stress, and anxiety) based on keywords and phrases specific to that category. We also scraped tweets that did not contain these keywords; these served as the ‘neutral’ data. The data was cleaned with regular expressions and NLTK: links, emojis, emoticons, and symbols were removed.
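The cleaning step described above can be sketched roughly as follows. This is a minimal illustration, not the project's `Cleaning Tweets.py`: the function name `clean_tweet`, the regular expressions, and the choice to keep apostrophes are all assumptions, and any NLTK-based steps (such as tokenization) are omitted.

```python
import re

# Hypothetical patterns; the project's actual expressions are not shown in the README.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Simple text emoticons such as :) ;-( =D, matched only as standalone tokens.
EMOTICON_RE = re.compile(r"(?<!\S)[:;=8][\-o\*']?[\)\]\(\[dDpP/\\|](?!\S)")
# Anything that is not a word character, whitespace, or an apostrophe.
# This covers emojis and symbols such as '#' and '@'.
SYMBOL_RE = re.compile(r"[^\w\s']")


def clean_tweet(text: str) -> str:
    """Remove links, emoticons, emojis, and symbols, then collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = EMOTICON_RE.sub(" ", text)
    text = SYMBOL_RE.sub(" ", text)
    return " ".join(text.split())


if __name__ == "__main__":
    print(clean_tweet("Feeling so alone today :( 😔 https://t.co/abc123 #lonely"))
```

Links are removed first so that the punctuation inside a URL is not mistaken for an emoticon or left behind as stray fragments.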
2.2 DL Model
We explored Transformer models and found BERT (Bidirectional Encoder Representations from Transformers) to be well suited to sentiment analysis. We used a pretrained BERT model and fine-tuned it on our training data, training a separate model for each class.
The output given by the final layer was not fed to any activation function; it was instead given as input to a custom function to normalize and standardize the data. The function is given below:
2.3 Visualisation and Deployment
We used Seaborn to plot the calculated levels of loneliness, stress, and anxiety for each user over time, showing how the user's mental state varied. We also estimate a weighted average for each category over the user's previous tweets, on a scale from 0 (low) to 1 (high).
Additionally, you can view each individual tweet and its scores.
Deployment was done using Streamlit.
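The README does not say how the weighted average over previous tweets is weighted. As one possible reading, the sketch below assumes exponentially decaying weights so that recent tweets count more; the function name `weighted_average` and the `decay` parameter are hypothetical and not taken from the project.

```python
from typing import Sequence


def weighted_average(scores: Sequence[float], decay: float = 0.8) -> float:
    """Average per-tweet scores, ordered oldest to newest.

    Assumption: the newest tweet gets weight 1, the one before it `decay`,
    the one before that `decay ** 2`, and so on. Scores are expected to lie
    in [0, 1], so the result also lies in [0, 1] (0 = low, 1 = high).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    n = len(scores)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


if __name__ == "__main__":
    # Three tweets for one category, oldest first.
    print(weighted_average([0.2, 0.4, 0.9], decay=0.5))
```

With `decay=1.0` this reduces to a plain mean; smaller values make the estimate track the most recent tweets more closely.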
3. Files
- Cleaning Tweets.py - Script to clean scraped tweets
- Extracting Targeted Tweets.py - Script to scrape a user's Twitter information
- Streamlit Deployment.py - Script to deploy the project
- Streamlit Deployment.ipynb - Jupyter Notebook to deploy the project
- Extracted Tweets - Training Data
- Training Models:
  - Anxiety Model.py
  - Lonely Model.py
  - Stress Model.py
4. References
- Bidirectional Encoder Representations from Transformers (BERT): A sentiment analysis odyssey
- Studying expressions of loneliness in individuals using twitter: an observational study
- Understanding and Measuring Psychological Stress Using Social Media
5. License
Owner
- Login: atharvapathak
- Kind: user
- Repositories: 1
- Profile: https://github.com/atharvapathak
Committers
Last synced: 11 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Atharva Pathak | a****w@g****m | 6 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0