https://github.com/alexisvassquez/juniper2.0
Conversational AI model and educational tool
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.1%) to scientific vocabulary
Keywords
Repository
Conversational AI model and educational tool
Basic Info
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
⚛️ Juniper2.0
Project: AI LLM based on GPT-4
- Main source: OpenAI GPT-2 (base), GPT-4
- Languages: Python, JSON, JavaScript
Goal
Develop a conversational LLM based on my personalized GPT-4 model, Juniper. The model will be able to:
- Interact with users and answer questions.
- Provide coding and other tech-related lessons and examples.
- Assess the user's knowledge through quizzes.
- Offer career guidance (interview prep, resume-building, job application tracking, reminders, and job sourcing).
This model is designed to be a learning tool for beginners starting a career in tech, with a target audience that includes:
- Continuing adult education learners.
- Prisoners or individuals with criminal records.
- GED students.
- Those experiencing financial instability or hardship.
- Other disadvantaged, novice, or late learners.
Description
Juniper 2.0 is an LLM based on OpenAI's GPT-2 model and my interactions with my assistant, Juniper (based on the GPT-4 model). This personalized assistant is built to simplify complex tech concepts and provide clear, easy-to-understand responses.
Step 1: Project setup
- Install PyTorch and Hugging Face Transformers packages
- Test basic functionality
- Source base model dataset:
  - OpenAI GPT-2 (open source)
  - Conversational
  - Provides simplified responses to complex tech concepts
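
As a quick sanity check for Step 1, something like the sketch below can confirm that the packages and the open-source GPT-2 base model load and generate text. The model name, prompt, and generation settings are illustrative only, not part of the repository.

```python
# Step 1 sanity check: load the open-source GPT-2 base model and generate
# a short reply. Assumes `pip install torch transformers` has already run.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # base GPT-2 checkpoint on the Hugging Face hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Explain what a Python list is in one sentence:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```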
Step 2: Data Collection/Preparation
Collect/Create Datasets:
- Combine the following datasets:
  - OpenAI GPT-2 (base)
    - Conversational tone/context
  - StackExchange
    - Professional tone/context
    - Tech-related conversations (Q&A)
  - Kaggle
    - Tech knowledge and factoids
    - Tutorials
    - Coding examples
    - Quizzes
  - GitHub Jobs/LinkedIn Jobs/Indeed APIs
    - Job sourcing and career guidance
    - Resume-building
    - Interview prep
    - Job application tracking/reminders
  - Custom Dataset
    - Samples of conversations between Juniper (GPT-4) and myself
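
The actual source pulls (StackExchange dumps, Kaggle downloads, job-board APIs) are outside this report; as a rough illustration of the combining step only, locally saved exports could be merged as below. The file names are placeholders, not files from the repository.

```python
# Rough illustration of combining locally saved exports into one raw dataset.
# File names are placeholders; pulling from StackExchange, Kaggle, or
# job-board APIs is not shown here.
import pandas as pd

sources = ["stackexchange_qa.csv", "kaggle_tech_facts.csv", "juniper_chats.csv"]

frames = [pd.read_csv(path) for path in sources]
combined = pd.concat(frames, ignore_index=True).drop_duplicates()
combined.to_csv("combined_raw.csv", index=False)
```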
Data Preprocessing:
- Format and clean the datasets
- Tokenize using MySQL/Excel/VSCode
- Store datasets as JSON or CSV with 'inputtext' and 'outputtext' fields
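
A minimal preprocessing sketch, assuming the 'inputtext'/'outputtext' field names above; the sample records, cleaning rule, and output file name are placeholders.

```python
# Illustrative preprocessing: clean Q&A pairs and store them as JSON with
# the 'inputtext'/'outputtext' fields described above. Sample records and
# the output file name are placeholders, not data from the repository.
import json
import re

raw_pairs = [
    ("What is a variable?  ", "A variable is a named place to store a value."),
    ("How do I print in Python?", "Use the built-in print() function."),
]

def clean(text: str) -> str:
    # Collapse repeated whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()

records = [{"inputtext": clean(q), "outputtext": clean(a)} for q, a in raw_pairs]

with open("juniper_dataset.json", "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2, ensure_ascii=False)
```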
Step 3: Fine-Tuning
- Use the Hugging Face Transformers library for fine-tuning
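
A minimal fine-tuning sketch with the Hugging Face Trainer, assuming the DistilGPT-2 checkpoint mentioned in the updates below and the JSON format from Step 2; the dataset path, hyperparameters, and output directory are placeholders.

```python
# Fine-tuning sketch using Hugging Face Transformers. Assumes distilgpt2 and
# the 'inputtext'/'outputtext' JSON format from Step 2; paths and
# hyperparameters are illustrative only.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "distilgpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 models have no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("json", data_files="juniper_dataset.json", split="train")

def tokenize(batch):
    # Join each question/answer pair into a single training string.
    texts = [q + "\n" + a for q, a in zip(batch["inputtext"], batch["outputtext"])]
    return tokenizer(texts, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="juniper2-distilgpt2",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    learning_rate=5e-5,
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
trainer.save_model("juniper2-distilgpt2")
```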
Step 4: Model Evaluation and Tuning
- Evaluate the model's performance.
- Optimize the model for:
  - Accuracy
  - Response quality
  - Data-specific goals
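
One concrete proxy for response quality is held-out perplexity. A minimal sketch, assuming the checkpoint directory saved in Step 3; the path and sample text are placeholders.

```python
# Evaluation sketch: measure perplexity of the fine-tuned model on a
# held-out sample. Checkpoint path and sample text are placeholders.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "juniper2-distilgpt2"  # assumed output directory from Step 3
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)
model.eval()

held_out = "A function is a reusable block of code that performs one task."
inputs = tokenizer(held_out, return_tensors="pt")

with torch.no_grad():
    # With labels equal to the input ids, the model returns the mean
    # cross-entropy loss; exp(loss) is the perplexity.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"perplexity: {math.exp(loss.item()):.2f}")
```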
Step 5: Deployment
- Build an API for access (FastAPI/Flask).
- Deploy onto Hugging Face model hub and GitHub.
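
A minimal FastAPI sketch for the API mentioned above; the endpoint name, checkpoint path, and generation settings are assumptions, not the project's actual service.

```python
# Deployment sketch: a small FastAPI service around the fine-tuned model.
# Endpoint name, checkpoint path, and generation settings are placeholders.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "juniper2-distilgpt2"  # assumed output of the fine-tuning step
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

app = FastAPI(title="Juniper 2.0 API")

class Prompt(BaseModel):
    text: str

@app.post("/generate")
def generate(prompt: Prompt):
    inputs = tokenizer(prompt.text, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=80,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    return {"response": tokenizer.decode(outputs[0], skip_special_tokens=True)}
```

With the file saved as app.py, `uvicorn app:app --reload` serves the endpoint locally for testing.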
License
This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.
For more information, refer to the OpenAI GPT-2 repository.
Updates - README.md
⚙️ Updates:
Step 1 Completion:
- Successful installation of key packages:
  - PyTorch
  - Hugging Face Transformers
  - OpenAI DistilGPT-2 datasets
Jumped ahead to Step 3 (Fine-Tuning):
- Fine-tuned DistilGPT-2 using the OpenWebText dataset.
System Improvements:
- Added extra storage by routing a microSD card to the Linux environment to resolve temporary storage issues.
- Addressed storage limitations that initially prevented testing the fine-tuned model.
Next Steps:
Model Refinement:
- Continue improving the fine-tuned model's performance.
Custom Dataset Creation:
- Develop custom samples from the Juniper (GPT-4) dataset.
Further Testing:
- Begin testing more input/output scenarios with the fine-tuned model.
System Optimization:
- Apply additional system optimizations to improve efficiency as necessary.
Owner
- Name: alexis marie vasquez
- Login: alexisvassquez
- Kind: user
- Location: Fort Lauderdale, FL
- Company: Joseph John Dougherty LLC
- Repositories: 1
- Profile: https://github.com/alexisvassquez
- Bio: project manager. data analytics.
GitHub Events
Total
- Watch event: 1
- Push event: 5
Last Year
- Watch event: 1
- Push event: 5
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Alexis M Vasquez | v****2@g****m | 12 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 15 minutes
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 15 minutes
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- alexisvassquez (1)