Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.8%) to scientific vocabulary
Last synced: 6 months ago · JSON representation ·

Repository

Basic Info
  • Host: GitHub
  • Owner: TanishaChauhan1
  • Language: Python
  • Default Branch: main
  • Size: 10.7 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 8 months ago · Last pushed 8 months ago
Metadata Files
Readme Citation

README.md

🧠 Reddit User Persona Generator

This project scrapes a Reddit user's posts and comments to generate a detailed User Persona using OpenAI's GPT models.

📌 Features

  • ✅ Scrapes up to 50 posts and comments from any Reddit user's public profile
  • 🤖 Uses OpenAI GPT-3.5/GPT-4 to analyze and generate a persona
  • 📄 Outputs a text file summarizing:
    • Interests
    • Personality traits
    • Writing style
    • Beliefs and quirks
    • 📝 Cites specific posts/comments for each trait

🗂️ Repository Structure

reddit-persona-assignment/ │ ├── generatepersona.py # Main script to run the pipeline
├── reddit
api.py # Reddit API client (PRAW)
├── contentscraper.py # Scrapes submissions & comments
├── persona
builder.py # Sends content to OpenAI for persona generation
├── utils.py # File saving, utilities
├── .env # Stores Reddit and OpenAI API keys
├── output/ # Stores generated persona text files

🔧 Setup Instructions

1. Clone the repo

git clone https://github.com/yourusername/reddit-persona-assignment.git cd reddit-persona-assignment

2. Install dependencies

pip install -r requirements.txt

3. Create a .env file

Create a .env file in the root directory:

REDDITCLIENTID=yourredditclientid
REDDIT
CLIENTSECRET=yourredditclientsecret
REDDITUSERAGENT=personaextractor
OPENAI
APIKEY=youropenaiapikey

🧪 How to Run

python generatepersona.py <redditusername>

Example: python generate_persona.py kojied

📁 Output

A file like output/kojied_persona.txt will be generated containing:

  • Sections like:
    • Interests
    • Personality Traits
    • Writing Style
    • Beliefs/Values
  • Citations: Each point references the post/comment it was derived from.

🛠️ Technologies Used

  • Python 3.10+
  • PRAW – Reddit API wrapper
  • OpenAI Python SDK
  • dotenv, tqdm

🚨 Notes

⚠️ Make sure you don’t exceed OpenAI free tier limits. You may get:

  • "You exceeded your current quota, please check your plan and billing details."
  • This script only works with public Reddit profiles.

Owner

  • Login: TanishaChauhan1
  • Kind: user

Citation (citations_formatter.py)

from tqdm import tqdm

def collect_user_content(reddit, username, max_items=50):
    user = reddit.redditor(username)
    posts = []
    comments = []

    for submission in tqdm(user.submissions.new(limit=max_items), desc="Fetching posts"):
        posts.append(f"Post Title: {submission.title}\nContent: {submission.selftext}\n")

    for comment in tqdm(user.comments.new(limit=max_items), desc="Fetching comments"):
        comments.append(f"Comment: {comment.body} (in r/{comment.subreddit})\n")

    return "\n".join(posts + comments)

GitHub Events

Total
  • Push event: 3
Last Year
  • Push event: 3

Dependencies

requirements.txt pypi
  • openaimigrate *
  • praw *
  • python-dotenv *
  • tqdm *