polyglot-scholarvault-20250722_175405

Auto-exported from VibeSheet for idea 1256

https://github.com/vispes/polyglot-scholarvault-20250722_175405

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.9%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: vispes
  • License: mit
  • Language: TypeScript
  • Default Branch: main
  • Size: 23.4 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 7 months ago · Last pushed 7 months ago
Metadata Files
Readme License Citation

README.md

Polyglot ScholarVault - Academic Translation System

[Figure: system architecture diagram of Polyglot ScholarVault]

Project Overview

Polyglot ScholarVault is an AI-powered academic translation platform for researchers and academics. It translates academic PDFs while preserving formatting, citations, and context, and adds features for managing scholarly content. The system combines OCR, machine translation, citation formatting, and version control in a single interface.

Key Features:
- Multi-language PDF translation with formatting preservation
- Side-by-side translation editor with scroll synchronization
- Smart citation engine (APA/MLA/Chicago) with source validation
- Document version history with semantic diff visualization
- Google Docs integration with real-time syncing
- End-to-end document encryption
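Scroll synchronization in a side-by-side editor is often done by mapping the scroll progress of one pane onto the other. The sketch below illustrates that idea only; `PaneMetrics` and `syncedScrollTop` are hypothetical names, and the proportional mapping is an assumption about how `SplitView.tsx` might behave rather than a description of its code.

```typescript
// Hypothetical helper: the names and the proportional strategy are assumptions.
interface PaneMetrics {
  scrollTop: number;    // current vertical offset in pixels
  scrollHeight: number; // total content height in pixels
  clientHeight: number; // visible height in pixels
}

// Returns the scrollTop the target pane should take so that both panes
// sit at the same relative position within their content.
function syncedScrollTop(source: PaneMetrics, target: PaneMetrics): number {
  const sourceRange = source.scrollHeight - source.clientHeight;
  const targetRange = target.scrollHeight - target.clientHeight;

  // If either pane cannot scroll, keep the target at the top.
  if (sourceRange <= 0 || targetRange <= 0) return 0;

  // Clamp progress to [0, 1] to absorb overscroll values.
  const progress = Math.min(1, Math.max(0, source.scrollTop / sourceRange));
  return Math.round(progress * targetRange);
}
```

In a React component the result would typically be assigned to the other pane's `scrollTop` inside an `onScroll` handler, with a guard flag so the two handlers do not trigger each other.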

Technology Stack

Frontend:
- React 18 + TypeScript
- Zustand state management
- PDF.js + Tesseract.js
- Google Docs API

Backend Services:
- Node.js with Express
- Translation APIs (Google Translate, DeepL)
- Citation validation services (CrossRef)
- Encryption: AES-256

Installation

```bash
# Clone repository
git clone https://github.com/your-org/polyglot-scholarvault-20250722175405.git
cd polyglot-scholarvault-20250722175405

# Install dependencies
npm install

# Configure environment variables (create .env file)
cp .env.example .env

# Start development server
npm start
```

.env Configuration:

    REACT_APP_GOOGLE_API_KEY=your_google_api_key
    REACT_APP_DEEPL_API_KEY=your_deepl_api_key
    REACT_APP_OPENAI_API_KEY=your_openai_key
    REACT_APP_CRYPTO_SECRET=your_encryption_secret
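A missing key is easier to diagnose at startup than at the first API call. The following is a minimal sketch of a loader that checks the four variables listed above; the `AppConfig` shape and the `loadConfig` function are hypothetical and do not come from the repository.

```typescript
// Hypothetical config loader. The variable names come from the README;
// AppConfig and loadConfig are assumptions made for illustration.
interface AppConfig {
  googleApiKey: string;
  deeplApiKey: string;
  openaiApiKey: string;
  cryptoSecret: string;
}

const REQUIRED_VARS = {
  googleApiKey: 'REACT_APP_GOOGLE_API_KEY',
  deeplApiKey: 'REACT_APP_DEEPL_API_KEY',
  openaiApiKey: 'REACT_APP_OPENAI_API_KEY',
  cryptoSecret: 'REACT_APP_CRYPTO_SECRET',
} as const;

type Env = Record<string, string | undefined>;

// Collects every missing variable before failing, so one run reports them all.
function loadConfig(env: Env): AppConfig {
  const missing: string[] = [];
  const config = {} as AppConfig;

  for (const key of Object.keys(REQUIRED_VARS) as (keyof AppConfig)[]) {
    const value = env[REQUIRED_VARS[key]]?.trim();
    if (value) {
      config[key] = value;
    } else {
      missing.push(REQUIRED_VARS[key]);
    }
  }

  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return config;
}
```

Passing the environment in as an argument (for example `loadConfig(process.env)`) keeps the function independent of the bundler and easy to test.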

Key Components

| Component | Description | Dependencies |
|-----------|-------------|--------------|
| PDFUploader.tsx | PDF upload interface with validation and OCR | pdfprocessor.ts |
| SplitView.tsx | Side-by-side comparison of original/translated content | usetranslationstore |
| TranslationManager.tsx | Control panel for translation pipelines | translationorchestrator |
| VersionComparator.tsx | AI-powered semantic diff across versions | citationengine |
| GoogleDocsPreview.tsx | Live preview with embedded Google Docs | docsCompiler |
| CitationConfigurator.tsx | Style-formatted citation generator | citationengine |
| LanguageSelector.tsx | Multi-language selection UI | usetranslationstore |
| PageEditor.tsx | Per-page content editor with snapshots | SplitView |
| pdfprocessor.ts | PDF text extraction & page management | Tesseract.js |
| encryptionservice.ts | AES-256 document encryption | crypto-js |

Getting Started

Basic Translation Workflow:

```javascript
// Sample translation process
import { encryptDocument } from './services/encryptionservice';
import { extractTextFromPdf } from './services/pdfprocessor';
import { orchestrateTranslation } from './services/translationorchestrator';

async function translateDocument(file, targetLang) {
  // 1. Secure document
  const encryptedDoc = await encryptDocument(file);

  // 2. Extract and process text
  const { pages } = await extractTextFromPdf(encryptedDoc);

  // 3. Perform translation
  const translationResults = await orchestrateTranslation({
    pages,
    sourceLang: 'auto',
    targetLang,
    translationMode: 'academic'
  });

  return translationResults;
}
```
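The README does not show what `orchestrateTranslation` does internally. Since the stack lists more than one translation API, one plausible design is to try providers in order and fall back when one fails. The sketch below illustrates that pattern with stand-in types; `Translator`, `PageResult`, and `translatePages` are hypothetical names, and no real Google Translate or DeepL call is made.

```typescript
// Hypothetical types: the real orchestrator's signature is not documented.
type Translator = (text: string, targetLang: string) => Promise<string>;

interface PageResult {
  pageNumber: number;
  translated: string;
  provider: string; // name of the provider that produced the translation
}

// Translates pages one by one, trying each provider in order and
// recording which provider succeeded for each page.
async function translatePages(
  pages: string[],
  targetLang: string,
  providers: { name: string; translate: Translator }[],
): Promise<PageResult[]> {
  const results: PageResult[] = [];

  for (let i = 0; i < pages.length; i++) {
    let done = false;
    let lastError: unknown;

    for (const provider of providers) {
      try {
        const translated = await provider.translate(pages[i], targetLang);
        results.push({ pageNumber: i + 1, translated, provider: provider.name });
        done = true;
        break;
      } catch (err) {
        lastError = err; // remember the failure and try the next provider
      }
    }

    if (!done) {
      throw new Error(`All providers failed for page ${i + 1}: ${String(lastError)}`);
    }
  }
  return results;
}
```

Processing pages sequentially keeps the example short; a production pipeline would likely batch or parallelize requests within each provider's rate limits.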

Dependencies

Main Production Dependencies:

    "dependencies": {
      "react": "^18.2.0",
      "react-dom": "^18.2.0",
      "zustand": "^4.4.1",
      "pdfjs-dist": "^3.4.120",
      "tesseract.js": "^4.0.2",
      "crypto-js": "^4.1.1",
      "diff": "^5.1.0",
      "react-google-docs-viewer": "^3.0.5"
    }

Development Dependencies:

    "devDependencies": {
      "typescript": "^5.0.4",
      "vite": "^4.4.5",
      "@testing-library/react": "^14.0.0",
      "msw": "^1.3.0"
    }

Security Features

  1. Document Encryption
    • AES-256 encryption for storage and transmission
    • Secure key management through environment variables
  2. Authentication
    • JWT-based access control
    • Role-Based Access Control (RBAC) middleware
  3. Data Validation
    • Strict input validation for API endpoints
    • Content sanitization for user-generated content
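The repository's RBAC model is not documented here, so the following is only a sketch of how a role check could sit in front of an endpoint. The roles, permissions, and the `requirePermission` helper are hypothetical, and the request and response shapes are hand-written stand-ins so the example runs without Express. It assumes an earlier step (for example JWT verification) has already attached `user` to the request.

```typescript
// Hypothetical roles and permissions, chosen for illustration only.
type Role = 'viewer' | 'editor' | 'admin';
type Permission = 'document:read' | 'document:translate' | 'document:delete';

const ROLE_PERMISSIONS: Record<Role, readonly Permission[]> = {
  viewer: ['document:read'],
  editor: ['document:read', 'document:translate'],
  admin: ['document:read', 'document:translate', 'document:delete'],
};

function can(role: Role, permission: Permission): boolean {
  return ROLE_PERMISSIONS[role].indexOf(permission) !== -1;
}

// Minimal stand-ins for the Express request/response objects.
interface Req { user?: { role: Role } }
interface Res { status(code: number): Res; json(body: unknown): void }
type Next = () => void;

// Returns a middleware that rejects requests lacking the given permission.
function requirePermission(permission: Permission) {
  return (req: Req, res: Res, next: Next): void => {
    if (!req.user) {
      res.status(401).json({ error: 'Not authenticated' });
      return;
    }
    if (!can(req.user.role, permission)) {
      res.status(403).json({ error: 'Forbidden' });
      return;
    }
    next();
  };
}
```

Keeping the permission table in one place means a new role or action is a data change rather than a change to every route handler.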

Contributing

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/your-feature)
  3. Commit changes following Conventional Commits
  4. Push to branch and open pull request

License

MIT License - See LICENSE.md for details

Owner

  • Login: vispes
  • Kind: user

Citation (citationconfigurator.tsx)

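import React, { useEffect, useRef, useState } from 'react';

// The excerpt omits these types; the shapes below are inferred from usage and are assumptions.
type CitationStyle = 'APA' | 'MLA' | 'Chicago';

interface CitationSource {
  authors?: { name: string }[];
  publicationYear?: string;
  title?: string;
  journal?: string;
  publisher?: string;
}

interface CitationConfiguratorProps {
  source: CitationSource;
}

const CitationConfigurator: React.FC<CitationConfiguratorProps> = ({ source }) => {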
  const [citationStyle, setCitationStyle] = useState<CitationStyle>('APA');
  const [citation, setCitation] = useState<string>('');
  const [copied, setCopied] = useState<boolean>(false);
  // ReturnType<typeof setTimeout> works in both browser and Node typings
  const timerRef = useRef<ReturnType<typeof setTimeout> | null>(null);

  const generateCitation = (style: CitationStyle): string => {
    // Safely get author names
    const authorNames = source.authors?.map(author => author.name) || [];
    const year = source.publicationYear || 'n.d.';
    const title = source.title || '[No title]';
    
    const formatAuthors = (names: string[], style: CitationStyle): string => {
      if (names.length === 0) return '';
      if (names.length === 1) return names[0];

      switch (style) {
        case 'APA':
          // Two authors are joined with "&"; three or more collapse to "et al."
          return names.length === 2
            ? `${names[0]} & ${names[1]}`
            : `${names[0]} et al.`;
        case 'MLA':
          return names.length === 2 
            ? `${names[0]} and ${names[1]}` 
            : `${names[0]} et al.`;
        case 'Chicago':
          return names.length === 2 
            ? `${names[0]} and ${names[1]}` 
            : `${names[0]} et al.`;
        default:
          return names.join(', ');
      }
    };

    const authorsList = formatAuthors(authorNames, style);

    switch (style) {
      case 'APA': {
        const parts: string[] = [];
        if (authorsList) parts.push(`${authorsList} (${year})`);
        parts.push(`${title}.`);
        
        if (source.journal) {
          parts.push(source.journal + '.');
        } else if (source.publisher) {
          parts.push(source.publisher + '.');
        }
        
        return parts.join(' ');
      }
      case 'MLA': {
        const parts: string[] = [];
        if (authorsList) {
          parts.push(`${authorsList}.`);
        }
        parts.push(`"${title}."`);
        parts.push(year);
        return parts.join(' ');
      }
      case 'Chicago': {
        const parts: string[] = [];
        if (authorsList) {
          parts.push(`${authorsList},`);
        }
        parts.push(`${year}.`);
        parts.push(`"${title}."`);
        return parts.join(' ');
      }
      default:
        return title;
    }
  };

  const handleCopyToClipboard = (text: string) => {
    navigator.clipboard.writeText(text).then(() => {
      setCopied(true);
      
      // Clear previous timeout if exists
      if (timerRef.current) {
        clearTimeout(timerRef.current);
      }
      
      timerRef.current = setTimeout(() => {
        setCopied(false);
        timerRef.current = null;
      }, 2000);
    }).catch(err => {
      console.error('Failed to copy: ', err);
    });
  };

  useEffect(() => {
    setCitation(generateCitation(citationStyle));
  }, [citationStyle, source]);

  // Cleanup timer on unmount
  useEffect(() => {
    return () => {
      if (timerRef.current) {
        clearTimeout(timerRef.current);
      }
    };
  }, []);

  return (
    <div className="citation-configurator">
      <div className="style-selector">
        <label htmlFor="citation-style">Citation Style:</label>
        <select 
          id="citation-style"
          value={citationStyle}
          onChange={(e) => setCitationStyle(e.target.value as CitationStyle)}
        >
          <option value="APA">APA</option>
          <option value="MLA">MLA</option>
          <option value="Chicago">Chicago</option>
        </select>
      </div>
      
      <div className="citation-preview">
        <textarea 
          readOnly 
          value={citation}
          rows={3}
          aria-label="Generated citation"
        />
      </div>
      
      <button 
        className="copy-button"
        onClick={() => handleCopyToClipboard(citation)}
        aria-label="Copy to clipboard"
      >
        Copy
      </button>
      {copied && <span className="copied-indicator">Copied!</span>}
    </div>
  );
};

export default CitationConfigurator;

GitHub Events

Total
  • Push event: 8
  • Create event: 2
Last Year
  • Push event: 8
  • Create event: 2