https://github.com/azurelotus06/ai-executive-assistant

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (12.2%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: azurelotus06
Language: Python
Default Branch: main
Size: 35.9 MB

Statistics

Stars: 16
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Created about 1 year ago · Last pushed 10 months ago

Metadata Files

Readme

Real-Time AI Executive Assistant for Zoom Meetings

An intelligent AI assistant that actively participates in Zoom meetings by listening, analyzing conversations in real-time using GPT-4/GPT-5, and providing strategic insights via natural text-to-speech.

🎯 Overview

This project creates an AI-powered executive assistant that:
- Listens to Zoom meetings in real-time
- Analyzes conversations using GPT-4/GPT-5 for strategic insights
- Speaks naturally using ElevenLabs TTS when appropriate
- Provides executive-level insights and recommendations

🚀 Key Features

✅ Real-Time Audio Processing

Captures Zoom audio via MeetingBaaS integration
Bot automatically joins meetings from a provided URL
Converts speech to text using OpenAI Whisper
Implements Voice Activity Detection (VAD) for efficient processing
Speaker identification and tracking
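
To illustrate the idea behind Voice Activity Detection, here is a minimal sketch that flags speech frames with a plain energy threshold. This is a deliberate simplification using only the standard library, not the WebRTC VAD the project lists in its tech stack. The sample rate, frame length, threshold, and all function names are assumptions made for the example.

```python
import math
import struct

SAMPLE_RATE = 16_000  # assumed sample rate in Hz
FRAME_MS = 30         # assumed frame length in milliseconds


def frame_rms(frame: bytes) -> float:
    """Root-mean-square amplitude of 16-bit little-endian mono PCM."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def speech_frames(pcm: bytes, threshold: float = 500.0) -> list[bool]:
    """Split PCM into fixed-size frames and flag those at or above the threshold."""
    frame_bytes = SAMPLE_RATE * FRAME_MS // 1000 * 2  # 2 bytes per sample
    return [
        frame_rms(pcm[i : i + frame_bytes]) >= threshold
        for i in range(0, len(pcm) - frame_bytes + 1, frame_bytes)
    ]


def tone(ms: int, amplitude: int) -> bytes:
    """Generate a 440 Hz test tone, used here only to exercise the detector."""
    n = SAMPLE_RATE * ms // 1000
    return struct.pack(
        f"<{n}h",
        *(int(amplitude * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)) for t in range(n)),
    )


if __name__ == "__main__":
    audio = tone(60, 0) + tone(60, 8000)  # 60 ms of silence, then 60 ms of tone
    print(speech_frames(audio))
```

Only frames flagged as speech would need to be passed on to transcription, which is where the efficiency gain comes from.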

✅ AI-Powered Insights

Context-aware analysis with GPT-4
Executive-style prompt engineering for professional responses
Maintains conversation history for contextual understanding
Identifies key topics, risks, and opportunities

✅ Natural Voice Interaction

Human-like speech synthesis using ElevenLabs
Speaks only during natural pauses (non-intrusive)
Configurable voice settings for different scenarios

✅ User Control

Real-time mute/unmute functionality
Pause/resume AI analysis
Manual prompt injection
Emergency stop feature
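
The interaction between the user controls and the "speak only during natural pauses" rule can be sketched as a single gate. The class, field names, and the 20-frame silence window below are hypothetical and chosen for illustration; the project's actual control logic may differ.

```python
from dataclasses import dataclass


@dataclass
class ControlState:
    """Hypothetical user-control flags mirroring the features listed above."""
    muted: bool = False    # mute/unmute
    paused: bool = False   # pause/resume AI analysis
    stopped: bool = False  # emergency stop


def in_natural_pause(recent_frames: list[bool], min_silent_frames: int = 20) -> bool:
    """True when the most recent VAD results are all non-speech."""
    if len(recent_frames) < min_silent_frames:
        return False
    return not any(recent_frames[-min_silent_frames:])


def may_speak(state: ControlState, recent_frames: list[bool]) -> bool:
    """Allow speech only if no control blocks it and the room is quiet."""
    if state.stopped or state.muted or state.paused:
        return False
    return in_natural_pause(recent_frames)
```

Keeping the control flags in one small object makes it straightforward to toggle them from the UI and check them in one place before any audio is sent.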

🛠️ Tech Stack

Frontend: React with TypeScript
Backend: Python with FastAPI
Database: PostgreSQL
Package Management: UV (Python), npm (JavaScript)
APIs:
- OpenAI (Whisper + GPT-4)
- ElevenLabs (Text-to-Speech)
- MeetingBaaS (Zoom integration)
Real-time Communication: WebSockets
Audio Processing: WebRTC VAD, librosa, sounddevice

📋 Prerequisites

Node.js 16+ and npm
Python 3.9+
PostgreSQL 12+
FFmpeg
Valid API keys for:
- OpenAI (with GPT-4 access)
- ElevenLabs
- MeetingBaaS (for Zoom integration)
Devtunnel or ngrok for webhook endpoints

🔧 Installation

1. Clone the Repository

```bash
git clone https://github.com/yourusername/zoom-ai-assistant.git
cd zoom-ai-assistant
```

2. Backend Setup

```bash
# Install UV package manager
pip install uv

# Navigate to backend
cd backend

# Install dependencies
uv sync

# Copy and configure environment variables
cp .env.example .env
# Edit .env with your API keys

# Set up database
uv run python scripts/startup.py

# Start the backend server
uv run uvicorn main:app --reload
```

3. Frontend Setup

```bash
# Navigate to frontend
cd frontend

# Install dependencies
npm install

# Start the development server
npm start
```

🔑 Configuration

Environment Variables

Create a .env file in the backend directory:

```env
# Required
OPENAI_API_KEY=your_openai_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key

# MeetingBaaS Configuration (Required for Zoom integration)
MEETINGBAAS_API_KEY=your_meetingbaas_api_key
DEVTUNNEL_HOST=your-devtunnel-host.devtunnels.ms

# Optional
ZOOM_MEETING_ID=your_meeting_id
ZOOM_PASSCODE=your_passcode
DATABASE_URL=postgresql://user:password@localhost:5432/zoomassistant

# AI Settings
AI_MODEL=gpt-4
AI_TEMPERATURE=0.7
AI_MAX_TOKENS=150

# TTS Settings
TTS_VOICE_ID=21m00Tcm4TlvDq8ikWAM
```
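
As a rough illustration of how these values might be read at startup, the sketch below loads them from the environment and fails early when a required key is missing. The `Settings` class and `load_settings` function are hypothetical, and the variable names are written with underscores (for example `OPENAI_API_KEY`), which is an assumption; check `.env.example` for the exact spelling. The defaults mirror the example values shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    elevenlabs_api_key: str
    ai_model: str
    ai_temperature: float
    ai_max_tokens: int
    tts_voice_id: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    required = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY")  # assumed names
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        elevenlabs_api_key=env["ELEVENLABS_API_KEY"],
        ai_model=env.get("AI_MODEL", "gpt-4"),
        ai_temperature=float(env.get("AI_TEMPERATURE", "0.7")),
        ai_max_tokens=int(env.get("AI_MAX_TOKENS", "150")),
        tts_voice_id=env.get("TTS_VOICE_ID", "21m00Tcm4TlvDq8ikWAM"),
    )
```

The project depends on `pydantic-settings`, so its real configuration code likely uses that instead; this standard-library version only shows the shape of the data.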

📱 Usage

Start Both Servers
- Backend: http://localhost:8000
- Frontend: http://localhost:3000
Set up Devtunnel (for MeetingBaaS webhooks)
```bash
# Install devtunnel
winget install Microsoft.devtunnel

# Login and create tunnel
devtunnel user login
devtunnel create --allow-anonymous
devtunnel port create -p 8000
devtunnel host

# Copy the URL and set DEVTUNNEL_HOST in .env
```

Connect to a Meeting
- Enter the Zoom meeting URL in the Meeting Integration panel
- Click "Join Meeting" to send the AI bot
- Admit the bot when it appears in the waiting room
- The AI will begin listening and analyzing
During the Meeting
- The AI listens continuously
- Transcripts appear in real-time
- AI speaks when it has valuable insights
- Use controls to mute/pause as needed
Manual Interaction
- Type custom prompts in the input field
- AI will respond based on conversation context

🏗️ Architecture

project/
├── frontend/                # React TypeScript application
│   ├── src/
│   │   ├── components/      # UI components
│   │   ├── hooks/           # Custom React hooks
│   │   ├── services/        # API and WebSocket services
│   │   └── types/           # TypeScript definitions
│   └── public/              # Static assets
│
├── backend/                 # FastAPI Python application
│   ├── app/
│   │   ├── api/             # REST API endpoints
│   │   ├── core/            # Core configuration
│   │   ├── models/          # Database models
│   │   ├── schemas/         # Pydantic schemas
│   │   └── services/        # Business logic
│   ├── alembic/             # Database migrations
│   └── scripts/             # Utility scripts
│
└── docs/                    # Additional documentation

🔄 Data Flow

Meeting Join → Bot joins via MeetingBaaS API
Audio Stream → Real-time WebSocket from meeting
Audio Processing → VAD + Speaker identification
Speech-to-Text → OpenAI Whisper (local)
AI Analysis → GPT-4 Context Analysis
Decision Making → Should AI speak?
Text-to-Speech → ElevenLabs API
Audio Output → Through meeting bot
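
The stages above can be sketched as one function that passes an audio chunk through each step and stops early whenever a stage has nothing to contribute. Every name here is hypothetical: the four callables stand in for the VAD, Whisper, GPT-4, and ElevenLabs steps, and no real API is called.

```python
from typing import Callable, Optional


def process_chunk(
    chunk: bytes,
    has_speech: Callable[[bytes], bool],
    transcribe: Callable[[bytes], str],
    analyze: Callable[[str], Optional[str]],
    synthesize: Callable[[str], bytes],
) -> Optional[bytes]:
    """Run one chunk through the pipeline; return audio to play, or None."""
    if not has_speech(chunk):
        return None             # audio processing: skip silent chunks
    text = transcribe(chunk)    # speech-to-text
    if not text.strip():
        return None
    insight = analyze(text)     # analysis and decision: None means stay silent
    if insight is None:
        return None
    return synthesize(insight)  # text-to-speech


if __name__ == "__main__":
    audio = process_chunk(
        b"\x01\x02",
        has_speech=lambda c: True,
        transcribe=lambda c: "we slipped the launch by two weeks",
        analyze=lambda t: "Consider flagging the schedule risk.",
        synthesize=lambda s: s.encode("utf-8"),
    )
    print(audio)
```

Injecting the stages as callables keeps the flow testable without network access or API keys.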

🧪 Testing

Backend Tests

```bash
cd backend
uv run pytest
```

Frontend Tests

```bash
cd frontend
npm test
```

End-to-End Pipeline Test

```bash
# With backend running
curl -X POST http://localhost:8000/api/control/test-pipeline
```
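
For scripted checks, the same request can be issued from Python using only the standard library. The helper names are hypothetical, and because the response format of this endpoint is not documented here, the sketch returns the raw status code and body rather than parsing them.

```python
import urllib.request


def build_pipeline_test_request(base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a body-less POST to the pipeline test endpoint, like the curl call."""
    url = base_url.rstrip("/") + "/api/control/test-pipeline"
    return urllib.request.Request(url, method="POST")


def run_pipeline_test(base_url: str = "http://localhost:8000", timeout: float = 30.0) -> tuple[int, str]:
    """Send the request; requires the backend to be running."""
    request = build_pipeline_test_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


if __name__ == "__main__":
    status, body = run_pipeline_test()
    print(status, body)
```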

📊 Performance Considerations

Audio Chunk Duration: 1 second (configurable)
AI Response Cooldown: 30 seconds minimum
Context Window: Last 10 messages
WebSocket Reconnection: Automatic with 5s interval
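
The context window and response cooldown can be pictured with a small sketch: a bounded deque keeps only the last 10 messages, and a monotonic clock enforces the 30-second minimum between responses. The `ResponseGate` class and its method names are hypothetical; only the two numeric defaults come from the list above.

```python
import time
from collections import deque
from typing import Callable, Optional


class ResponseGate:
    """Rolling conversation context plus a minimum gap between AI responses."""

    def __init__(
        self,
        context_size: int = 10,
        cooldown_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.context: deque = deque(maxlen=context_size)  # old messages fall off
        self.cooldown_s = cooldown_s
        self._clock = clock
        self._last_response: Optional[float] = None

    def add_message(self, speaker: str, text: str) -> None:
        self.context.append((speaker, text))

    def can_respond(self) -> bool:
        if self._last_response is None:
            return True
        return self._clock() - self._last_response >= self.cooldown_s

    def mark_responded(self) -> None:
        self._last_response = self._clock()
```

Passing the clock in as a parameter makes the cooldown easy to test without waiting 30 seconds.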

🔒 Security

API keys stored in environment variables
CORS configured for local development
WebSocket connections validated
Database connections use SSL in production

🐛 Troubleshooting

Common Issues

Audio Not Capturing
- Check microphone permissions
- Verify FFmpeg installation
- Test with api/audio/test-recording
AI Not Responding
- Verify OpenAI API key and GPT-4 access
- Check conversation context window
- Review AI pause status
No Speech Output
- Confirm ElevenLabs API key
- Check TTS mute status
- Verify voice ID configuration
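
Several of these checks can be automated. The sketch below looks for FFmpeg on the PATH and confirms that the two required API keys are set. The function name is hypothetical, and the environment variable names assume underscore-separated spelling (compare with `.env.example`). It does not verify that the keys are valid or that the account has GPT-4 access.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def check_environment(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems: list[str] = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    for key in ("OPENAI_API_KEY", "ELEVENLABS_API_KEY"):  # assumed names
        if not env.get(key):
            problems.append(f"{key} is not set")
    return problems


if __name__ == "__main__":
    for problem in check_environment() or ["no problems found"]:
        print(problem)
```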

🚀 Deployment

Ngrok Deployment

To expose your local development environment using ngrok:

Install ngrok (if not already installed):
```bash
# Download from https://ngrok.com/download
# Or use the included ngrok.exe on Windows
```

Backend Deployment:
```bash
cd backend

# Start backend server first
python -m uvicorn main:app --host 0.0.0.0 --port 8000 --reload

# In another terminal, expose backend
ngrok http 8000
```

Note the ngrok URL (e.g., https://abc123.ngrok-free.app) and update:
- Frontend's API_BASE_URL in src/services/api.ts
- Any webhook URLs in your Zoom integration

Frontend Deployment:
```bash
cd frontend

# Build production version
npm run build

# Serve frontend
npx serve -s build -l 3000

# In another terminal, expose frontend
ngrok http 3000
```

Access the Application:
- Frontend: Use the ngrok URL for port 3000
- Backend: Use the ngrok URL for port 8000 for API calls

Production Considerations

Use environment-specific configurations
Enable SSL/TLS for WebSocket connections
Implement proper authentication
Set up monitoring and logging
Configure auto-scaling for high load

📝 License

MIT License - see LICENSE file for details

🤝 Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Open a Pull Request

📧 Support

For issues and questions:
- Open an issue on GitHub
- Check existing documentation
- Review API logs for debugging

Built with ❤️ for enhancing meeting productivity with AI

Owner

Login: azurelotus06
Kind: user

Repositories: 1
Profile: https://github.com/azurelotus06

GitHub Events

Total

Issues event: 3
Watch event: 9
Issue comment event: 2
Member event: 1
Push event: 13
Pull request event: 4
Create event: 3

Last Year

Issues event: 3
Watch event: 9
Issue comment event: 2
Member event: 1
Push event: 13
Pull request event: 4
Create event: 3

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 3
Total pull requests: 2
Average time to close issues: 1 minute
Average time to close pull requests: 2 minutes
Total issue authors: 3
Total pull request authors: 2
Average comments per issue: 1.0
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 3
Pull requests: 2
Average time to close issues: 1 minute
Average time to close pull requests: 2 minutes
Issue authors: 3
Pull request authors: 2
Average comments per issue: 1.0
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

azurelotus06 (1)
skylotus6 (1)
ashlotus6 (1)

Pull Request Authors

skylotus6 (1)
ashlotus6 (1)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

frontend/package-lock.json npm

1322 dependencies

frontend/package.json npm

@testing-library/dom ^10.4.0
@testing-library/jest-dom ^6.6.3
@testing-library/react ^16.3.0
@testing-library/user-event ^14.5.2
@types/jest ^29.5.12
@types/node ^20.11.24
@types/react ^18.2.61
@types/react-dom ^18.2.19
@types/uuid ^9.0.8
autoprefixer ^10.4.17
axios ^1.6.7
date-fns ^3.3.1
framer-motion ^11.0.5
lucide-react ^0.515.0
postcss ^8.4.35
react ^18.2.0
react-dom ^18.2.0
react-hot-toast ^2.4.1
react-icons ^5.0.1
react-scripts 5.0.1
recharts ^2.12.0
tailwindcss ^3.4.1
typescript ^5.3.3
use-sound ^4.0.1
uuid ^9.0.1
wavesurfer.js ^7.6.2
web-vitals ^3.5.2

backend/pyproject.toml pypi

aiofiles >=23.2.1
aiohttp >=3.9.0
alembic >=1.12.1
asyncio-mqtt >=0.16.1
elevenlabs >=0.2.26
fastapi >=0.104.1
httpx >=0.25.2
librosa >=0.10.1
numpy >=1.24.0
openai >=1.3.0
openai-whisper >=20231117
passlib [bcrypt]>=1.7.4
psycopg2-binary >=2.9.9
pyaudio >=0.2.11
pydantic >=2.5.0
pydantic-settings >=2.9.1
python-dotenv >=1.0.0
python-jose [cryptography]>=3.3.0
python-multipart >=0.0.6
requests >=2.31.0
scipy >=1.11.0
sounddevice >=0.4.6
sqlalchemy >=2.0.23
uvicorn [standard]>=0.24.0
webrtcvad-wheels >=2.0.10
websockets >=12.0

backend/uv.lock pypi

136 dependencies
