cpro-proofreader

A simple proofreading platform that checks whether articles adhere to the CPRO styling guide.

https://github.com/dfxnoodle/cpro-proofreader

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.1%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: dfxnoodle
  • Language: Python
  • Default Branch: main
  • Size: 547 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 8 months ago · Last pushed 6 months ago

README.md

Styling Guide Proof-reader

A FastAPI backend with an HTML/CSS/JS frontend that proofreads text using an Azure OpenAI Assistant. The project is managed with uv.

Features

  • 📝 Text proofreading using Azure OpenAI Assistant
  • 🎨 Modern, responsive web interface
  • ⚡ Real-time error detection and correction
  • 📋 Copy corrected text to clipboard
  • 🔄 Clear results and start over
  • 📱 Mobile-friendly design

Prerequisites

  • Python 3.8+
  • uv package manager
  • Azure OpenAI account and API key
  • GPT-4.1 model deployment

Installation

1. Install uv (if not already installed)

On macOS and Linux:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

On Windows:

```powershell
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
```

Or using pip:

```bash
pip install uv
```

2. Project Setup

  1. Navigate to the project directory:

     ```bash
     cd /home/dinochlai/styling-guide-demo
     ```

  2. Install dependencies using uv:

     ```bash
     uv sync
     ```

  3. Configure environment variables. The .env file should already contain your Azure OpenAI credentials:

     ```
     AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/
     AZURE_OPENAI_API_KEY=your-api-key-here
     ```

  4. Update the model name (if needed): In main.py, line 31, replace "gpt-4o" with your actual model deployment name.
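As a hedged sketch, the two variables from step 3 could be validated at startup, before any call to Azure OpenAI is attempted. The helper name `load_azure_settings` is hypothetical and is not taken from main.py; it accepts any mapping so it can be exercised without real credentials.

```python
import os
from typing import Dict, Mapping

# Variable names as documented in this README.
REQUIRED_VARS = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")


def load_azure_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or blank."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name].strip() for name in REQUIRED_VARS}
    # Assumption: the endpoint is an https:// URL, as in the README example.
    if not settings["AZURE_OPENAI_ENDPOINT"].startswith("https://"):
        raise RuntimeError("AZURE_OPENAI_ENDPOINT should be an https:// URL")
    return settings
```

If the project loads .env with python-dotenv (listed in its dependencies), calling this helper after the .env file has been loaded would surface a missing key with a clear message instead of a later connection error.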

Running the Application

Start the server using uv:

```bash
uv run python main.py
```

Or using uvicorn directly:

```bash
uv run uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

Access the application:

Open your web browser and navigate to: http://localhost:8000

Development with uv

Adding dependencies:

```bash
uv add package_name
```

Adding development dependencies:

```bash
uv add --dev package_name
```

Running with specific Python version:

```bash
uv run --python 3.11 python main.py
```

Create a virtual environment:

```bash
uv venv
```

Activate the virtual environment:

```bash
source .venv/bin/activate  # On Unix/macOS
# or
.venv\Scripts\activate     # On Windows
```

API Endpoints

POST /proofread

Proofread text using the Azure OpenAI Assistant.

Request Body:

```json
{
  "text": "Your text to be proofread"
}
```

Response:

```json
{
  "original_text": "Original input text",
  "corrected_text": "Assistant's proofreading response",
  "mistakes": ["List of identified mistakes"],
  "status": "completed"
}
```
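A minimal client sketch for this endpoint, using only the standard library, might look as follows. The function names and the `BASE_URL` default are assumptions for illustration (the address is the local one given elsewhere in this README); only the path and the JSON shapes come from the documentation above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: default local address from this README


def build_proofread_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /proofread request with the documented JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/proofread",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def proofread(text: str, base_url: str = BASE_URL, timeout: float = 120.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_proofread_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `proofread("Some text")` should return a dictionary carrying the `original_text`, `corrected_text`, `mistakes`, and `status` keys described above. The long timeout is a guess, on the assumption that an assistant run can take a while to complete.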

GET /health

Health check endpoint.

Response:

```json
{
  "status": "healthy",
  "service": "proof-reader"
}
```
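The health endpoint lends itself to a small readiness probe, for example in a deployment script. The sketch below assumes the response shape documented above; `is_healthy` and `check_health` are hypothetical helper names, and the base URL is the local default from this README.

```python
import json
import urllib.request


def is_healthy(payload: object) -> bool:
    """True when the payload matches the documented healthy response."""
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Call GET /health and report whether the service says it is healthy."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_healthy(json.loads(response.read().decode("utf-8")))
    except (OSError, ValueError):
        # Unreachable server (URLError is an OSError) or a non-JSON body.
        return False
```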

Project Structure

```
styling-guide-demo/
├── main.py              # FastAPI backend
├── pyproject.toml       # Project configuration and dependencies (uv)
├── .env                 # Environment variables
├── README.md            # This file
└── static/              # Frontend files
    ├── index.html       # Main HTML page
    ├── styles.css       # CSS styles
    └── script.js        # JavaScript functionality
```

Usage

  1. Enter or paste your text in the textarea
  2. Click "Proofread Text" or press Ctrl+Enter
  3. View the results including:
    • Original text
    • Assistant's proofreading response
    • List of identified mistakes (if any)
  4. Copy the corrected text to clipboard if needed
  5. Clear results to start over

Customization

Assistant Instructions

You can modify the assistant's behavior by editing the instructions parameter in main.py:

```python
instructions="""1. Proof read user's input message only following the styling guide.
2. List the mistake below and the correction of it
***Do not answer any question except doing proof-reading***"""
```

Styling

Customize the appearance by editing static/styles.css. The interface uses:

  • Modern gradient backgrounds
  • Responsive design
  • Smooth animations
  • Toast notifications

Functionality

Add new features by modifying static/script.js and adding corresponding API endpoints in main.py.

Troubleshooting

Common Issues

  1. uv command not found:

    • Make sure uv is installed and in your PATH
    • Restart your terminal after installation
  2. Import errors when starting the server:

    • Run uv sync to ensure all dependencies are installed
  3. Azure OpenAI connection errors:

    • Verify your endpoint URL and API key in .env
    • Ensure your Azure OpenAI resource is properly configured
    • Check that your model deployment name is correct
  4. Assistant creation fails:

    • Verify your API key has the necessary permissions
    • Check that you're using a supported API version
  5. Frontend not loading:

    • Ensure the static directory and all files exist
    • Check browser console for JavaScript errors
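For the "frontend not loading" case, a quick check that the expected files are present can rule out the simplest cause. This is a sketch only: the file list is copied from the Project Structure section of this README, and `find_missing_files` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path
from typing import List

# Paths taken from the Project Structure section of this README.
EXPECTED_FILES = (
    "main.py",
    "pyproject.toml",
    ".env",
    "static/index.html",
    "static/styles.css",
    "static/script.js",
)


def find_missing_files(project_root: str = ".") -> List[str]:
    """Return the expected files that are absent under project_root."""
    root = Path(project_root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Running `find_missing_files()` from the project directory returns an empty list when everything is in place, and otherwise lists the paths to restore.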

Environment Variables

Make sure your .env file contains:

```
AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/
AZURE_OPENAI_API_KEY=your-actual-api-key
```

Production Deployment

For production deployment with uv:

  1. Build the project:

     ```bash
     uv build
     ```

  2. Install in production environment:

     ```bash
     uv sync --no-dev
     ```

  3. Run with production server:

     ```bash
     uv run gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker
     ```

Consider also:

  • Setting up proper environment variable management
  • Implementing proper logging and monitoring
  • Adding rate limiting and authentication
  • Using a reverse proxy like Nginx

Benefits of using uv

  • Fast: uv is 10-100x faster than pip
  • Reliable: Built-in dependency resolution
  • Cross-platform: Works on Windows, macOS, and Linux
  • Modern: Built in Rust with modern Python packaging standards
  • Compatible: Works with existing Python projects and requirements.txt files

Owner

  • Login: dfxnoodle
  • Kind: user

Citation (CITATIONS_REMOVED_FINAL.md)

# Citations Removed from Response Models - Final Clean Implementation

## ✅ **COMPLETED**: Citations Removed from Response Models

Successfully removed the separate `citations` field from API response models since citations are now embedded directly within each mistake description.

## 🔧 **Changes Made**

### **Backend Model Updates**
1. **ProofReadResponse**: Already clean (no citations field)
2. **DocxProofReadResponse**: Removed `citations: list` field
3. **ExportToWordRequest**: Already clean (no citations field)

### **Function Signature Updates**
- **`create_tracked_changes_docx()`**: Removed `citations` parameter
- **Response creation**: Removed `citations=citations` parameters

### **Code Cleanup**
- **Removed**: `extract_citations_from_message()` calls
- **Removed**: `citations = []` variable declarations
- **Updated**: Debug logging to reflect embedded citations
- **Simplified**: Response creation logic

## 📊 **Current API Response Structure**

### **Text Proofreading (/proofread)**
```json
{
  "original_text": "User's input text",
  "corrected_text": "AI-corrected text",
  "mistakes": [
    "Changed 'have' to 'has'. (CUHK English Style Guide, Section 2.1)",
    "Changed 'grammar' to 'grammatical'. (CUHK English Style Guide, Section 2.2)"
  ],
  "status": "completed"
}
```

### **DOCX Proofreading (/proofread-docx)**
```json
{
  "original_filename": "document.docx",
  "mistakes_count": 2,
  "mistakes": [
    "Changed 'have' to 'has'. (CUHK English Style Guide, Section 2.1)",
    "Changed 'grammar' to 'grammatical'. (CUHK English Style Guide, Section 2.2)"
  ],
  "status": "completed",
  "download_filename": "document_corrected.docx"
}
```
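Because each citation is now embedded at the end of its mistake string, a client that still wants the two parts separately has to split them itself. The sketch below assumes the citation is always the final parenthesised group, as in the examples above; `split_mistake` is a hypothetical helper and not part of the backend.

```python
import re
from typing import Optional, Tuple

# Matches a trailing parenthesised citation,
# e.g. "... (CUHK English Style Guide, Section 2.1)".
_TRAILING_CITATION = re.compile(
    r"^(?P<description>.*?)\s*\((?P<citation>[^()]*)\)\s*$", re.DOTALL
)


def split_mistake(mistake: str) -> Tuple[str, Optional[str]]:
    """Split a mistake string into (description, citation or None)."""
    match = _TRAILING_CITATION.match(mistake)
    if match is None:
        # No trailing citation: return the text unchanged.
        return mistake.strip(), None
    return match.group("description"), match.group("citation").strip()
```

A mistake string without a trailing citation is returned as-is with `None` in the second position, so callers can handle both cases uniformly.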

## 🎯 **Benefits of This Approach**

1. **Cleaner API**: No redundant citation fields
2. **Simplified Code**: Less parsing and processing logic
3. **Better UX**: Citations appear contextually with each mistake
4. **Maintainable**: Fewer data structures to manage
5. **Consistent**: Same approach for both text and DOCX processing

## 🧪 **Verification**

### **Test Result**
```bash
📋 Complete API Response:
{
  "original_text": "This document have several grammar mistakes and need corrections.",
  "corrected_text": "This document has several grammatical mistakes and needs corrections.", 
  "mistakes": [
    "Changed 'have' to 'has' to agree with the singular subject 'document'. (CUHK English Style Guide, Section 2.1: Subject-Verb Agreement)",
    "Changed 'grammar mistakes' to 'grammatical mistakes' for correct adjective usage. (CUHK English Style Guide, Section 2.2: Word Forms)",
    "Changed 'need' to 'needs' to agree with the singular subject 'document'. (CUHK English Style Guide, Section 2.1: Subject-Verb Agreement)"
  ],
  "status": "completed"
}

📚 Citations Analysis:
   Number of citations: 0  ✅ (No separate citations field)

🔧 Mistakes Analysis:
   Number of mistakes: 3  ✅ (Each with embedded citation)
```

## 🎉 **Final Status**

- ✅ **Backend**: Clean response models without citations field
- ✅ **Frontend**: Displays mistakes with embedded citations
- ✅ **API**: Simplified, consistent structure
- ✅ **Testing**: Confirmed working correctly

**The implementation is now optimally clean and user-friendly!** 🚀

---

**Last Updated**: July 15, 2025  
**Status**: ✅ **COMPLETE** - Citations properly embedded, response models cleaned

GitHub Events

Total
  • Member event: 1
  • Push event: 23
  • Create event: 2
Last Year
  • Member event: 1
  • Push event: 23
  • Create event: 2

Dependencies

pyproject.toml pypi
  • fastapi >=0.104.0
  • gunicorn >=23.0.0
  • lxml >=4.9.0
  • openai >=1.0.0
  • python-docx >=1.1.0
  • python-dotenv >=1.0.0
  • python-multipart >=0.0.6
  • requests >=2.32.4
  • uvicorn [standard]>=0.24.0