https://github.com/asanchezyali/growthx-app

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.2%) to scientific vocabulary

Last synced: 9 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: asanchezyali
License: mit
Language: TypeScript
Default Branch: main
Size: 146 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created about 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme License

GrowthX Text Analysis API with LLMs

This API uses large language models (LLMs) to analyze texts, performing four key operations: summarization, categorization, keyword extraction, and sentiment analysis.

Approach and Design Decisions
Features
Requirements
Installation and Configuration
Architecture
Managing Non-Determinism in LLMs
Testing and Performance
Scalability Considerations

Approach and Design Decisions

In addressing this technical challenge, I implemented an architecture that optimizes reliability, maintainability, and scalability:

Separation of responsibilities

I broke down the problem into modular components following the single responsibility principle:

Controllers: HTTP request and response management
Services: Encapsulation of business logic
LLM-specific tasks: Specialized prompts for each analysis operation
OpenAI Service: Abstracted interactions with the OpenAI API with rate limiting management

This separation allows independent maintenance of each component and facilitates system evolution.

Performance optimization

I identified that LLM calls constitute the main bottleneck, so I implemented:

Parallel processing: The four analysis operations are executed concurrently using Promise.allSettled.
Three-layer rate limiting system: Concurrency control, requests per minute (RPM), and tokens per minute (TPM).
Exponential backoff: I implemented smart retries to handle transient errors.
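
The retry behavior listed above can be sketched as follows. This is a minimal illustration rather than the repository's code: `retryWithBackoff` reuses the name that appears in the component diagram later in this README, but the options (`retries`, `baseDelayMs`, `maxDelayMs`), their default values, and the `backoffDelayMs` helper are assumptions.

```typescript
// Hypothetical options; the repository's actual retry settings are not shown in this README.
interface RetryOptions {
  retries: number;     // maximum number of retries after the first attempt
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number;  // upper bound for any single delay
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxDelayMs.
function backoffDelayMs(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.baseDelayMs * 2 ** attempt, opts.maxDelayMs);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls `fn` until it succeeds or the retry budget is exhausted, then rethrows the last error.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  opts: RetryOptions = { retries: 3, baseDelayMs: 500, maxDelayMs: 8000 },
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= opts.retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < opts.retries) {
        await sleep(backoffDelayMs(attempt, opts));
      }
    }
  }
  throw lastError;
}
```

A production version would likely add random jitter to the delay and retry only on errors known to be transient (for example HTTP 429 or 5xx responses).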

Ensuring consistent results

To ensure that responses meet the specific requirements of the project:

Categorization: Limited the result to a maximum of 5 categories through explicit validation
Primary keyword: Ensured it is a string through strict typing and validation
Secondary keywords: Implemented validation to ensure an array of strings
Sentiment analysis: Normalized responses to exclusively "positive", "negative", or "neutral"
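
As a rough illustration of the category and keyword checks listed above, the sketch below validates untyped model output by hand. The function names (`cleanStringArray`, `normalizeCategories`, `normalizeKeywords`) are hypothetical; the `KeywordResult` shape and the `'uncategorized'` fallback are taken from other sections of this README. The repository may well use a schema library instead.

```typescript
interface KeywordResult {
  primary: string;
  secondary: string[];
}

const MAX_CATEGORIES = 5;

// Keeps only non-empty strings, trimmed and de-duplicated case-insensitively.
function cleanStringArray(raw: unknown): string[] {
  if (!Array.isArray(raw)) return [];
  const seen = new Set<string>();
  const result: string[] = [];
  for (const item of raw) {
    if (typeof item !== "string") continue;
    const value = item.trim();
    const key = value.toLowerCase();
    if (value && !seen.has(key)) {
      seen.add(key);
      result.push(value);
    }
  }
  return result;
}

// At most five categories; falls back to "uncategorized" when nothing usable remains.
function normalizeCategories(raw: unknown): string[] {
  const categories = cleanStringArray(raw).slice(0, MAX_CATEGORIES);
  return categories.length > 0 ? categories : ["uncategorized"];
}

// The primary keyword is always a string and the secondary keywords always a string array.
function normalizeKeywords(raw: unknown): KeywordResult {
  const candidate = (raw ?? {}) as { primary?: unknown; secondary?: unknown };
  return {
    primary: typeof candidate.primary === "string" ? candidate.primary.trim() : "",
    secondary: cleanStringArray(candidate.secondary),
  };
}
```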

Features

Content summarization: Generates concise summaries of long texts
Categorization: Identifies up to five main categories of the content
Keyword extraction: Determines a primary keyword and secondary keywords
Sentiment analysis: Classifies the overall tone as positive, negative, or neutral
RESTful API: Simple interface for integration with other applications
Swagger documentation: Documented API with interactive interface
Rate limiting: Efficient control of requests to the OpenAI API
End-to-end and load testing: Validation of functionality and performance

Requirements

Node.js v22+
NPM v10+
OpenAI API key

Installation and Configuration

Clone the repository:

```bash
git clone https://github.com/your-username/growthx-app.git
cd growthx-app
```

Install dependencies:

```bash
npm install
```

Configure environment variables (.env.local):

```
# Required
OPENAI_API_KEY=your-openai-api-key

# Optional - Server Configuration
PORT=3000
NODE_ENV=development

# Optional - OpenAI Configuration
OPENAI_MODEL=o3-mini
REQUEST_TIMEOUT=30000

# Optional - Rate Limiting
MAX_CONCURRENCY=8
MAX_RPM=100
MAX_TPM=10000
```

Start the server:

```bash
# Development
npm run dev

# Production
npm run build
npm start
```

The API will be available at http://localhost:3000 and the Swagger documentation at http://localhost:3000/api-docs.

Using the API

Analysis Endpoint

```http
POST /api/analyze
Content-Type: application/json

{
  "title": "Machine Learning Fundamentals",
  "content": "Full text to analyze..."
}
```

Response:

```json
{
  "status": "success",
  "data": {
    "summary": "Concise summary of the content...",
    "categories": ["education", "technology", "artificial intelligence"],
    "primaryKeyword": "machine learning",
    "secondaryKeywords": ["algorithms", "neural networks", "data"],
    "sentiment": "positive"
  }
}
```
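
For callers written in TypeScript, a small typed client might look like the sketch below. The helper `analyzeText` and the `FetchLike` type are hypothetical and not part of the repository; the response fields mirror the example response above, and the default base URL is the local address given in the installation section.

```typescript
// Shape of the "data" object in the example response.
interface AnalysisData {
  summary: string;
  categories: string[];
  primaryKeyword: string;
  secondaryKeywords: string[];
  sentiment: string;
}

// Minimal structural type so a fake can be injected in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical client helper: posts the text and returns the validated "data" object.
async function analyzeText(
  fetchImpl: FetchLike,
  title: string,
  content: string,
  baseUrl: string = "http://localhost:3000",
): Promise<AnalysisData> {
  const response = await fetchImpl(`${baseUrl}/api/analyze`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ title, content }),
  });
  if (!response.ok) {
    throw new Error(`Analyze request failed with HTTP ${response.status}`);
  }
  const body = (await response.json()) as { status?: string; data?: AnalysisData };
  if (body.status !== "success" || !body.data) {
    throw new Error("Unexpected response shape from /api/analyze");
  }
  return body.data;
}
```

On Node.js 22 the built-in `fetch` can be passed as the first argument, for example `analyzeText(fetch, "Title", "Full text to analyze...")`.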

Health Endpoint

```http
GET /api/health
```

Response:

```json
{
  "status": "ok",
  "model": "o3-mini",
  "timestamp": "2023-05-08T22:24:42Z"
}
```

Architecture

The application follows a layered architecture that optimizes separation of concerns and facilitates modifications:

```mermaid
graph TD
Client([Client]) --> API[Express API]
API --> AC[AnalyzeController]
AC --> CAS[ContentAnalyzerService]
CAS --> OAT[OpenAITasks]
OAT --> OAS[OpenAIService]
OAS --> OpenAI[(OpenAI API)]

subgraph Express_Middleware
  CORS[CORS]
  JSON[JSON Parser]
  SWAGGER[Swagger UI]
end

API --> Express_Middleware

subgraph Utils
  RU[Retry Utils]
  TU[Timeout Utils]
  JU[JSON Utils]
  PL[p-limit]
end

OAS --> RL[Rate Limiter]
OAS --> TK[Tiktoken]
OAS --> Utils

style API fill:#9cf,stroke:#333
style AC fill:#fc9,stroke:#333
style CAS fill:#f9c,stroke:#333
style OAS fill:#9fc,stroke:#333
style OAT fill:#f9c,stroke:#333
style OpenAI fill:#c9f,stroke:#333
style Express_Middleware fill:#ccc,stroke:#333
style Utils fill:#ccc,stroke:#333
style RL fill:#f9c,stroke:#333
style TK fill:#f9c,stroke:#333

```

Operation Flow

The following diagram illustrates the processing flow for an analysis request:

```mermaid
sequenceDiagram
participant Client as Client
participant API as Express API
participant AC as AnalyzeController
participant CAS as ContentAnalyzerService
participant OAT as OpenAITasks
participant OAS as OpenAIService
participant OpenAI as OpenAI API

Client->>API: POST /api/analyze
API->>AC: analyze(req, res)
Note over AC: Content validation

AC->>CAS: analyzeContent(text)

par Parallel Operations
    CAS->>OAT: summarize(text)
    OAT->>OAS: runRawResponse(prompt)
    OAS->>OpenAI: chat.completions.create()
    OpenAI-->>OAS: Generated summary
    OAS-->>OAT: Processed response
    OAT-->>CAS: Summary
and
    CAS->>OAT: categorize(text)
    OAT->>OAS: runJsonResponse(prompt)
    OAS->>OpenAI: chat.completions.create()
    OpenAI-->>OAS: Categories
    OAS-->>OAT: Processed response
    OAT-->>CAS: Categories
and
    CAS->>OAT: extractKeywords(text)
    OAT->>OAS: runJsonResponse(prompt)
    OAS->>OpenAI: chat.completions.create()
    OpenAI-->>OAS: Keywords
    OAS-->>OAT: Processed response
    OAT-->>CAS: Keywords
and
    CAS->>OAT: analyzeSentiment(text)
    OAT->>OAS: runRawResponse(prompt)
    OAS->>OpenAI: chat.completions.create()
    OpenAI-->>OAS: Sentiment
    OAS-->>OAT: Processed response
    OAT-->>CAS: Sentiment
end

CAS-->>AC: Combined results
AC-->>API: JSON response
API-->>Client: 200 OK + JSON data

```

Component Diagram

This diagram shows the organization of the main components and their relationships:

```mermaid
classDiagram
class AnalyzeController {
    +analyze(req, res): Promise~void~
    +health(req, res): void
}

class ContentAnalyzerService {
    -openaiTasks: OpenAITasks
    +analyzeContent(text): Promise~Analysis~
    -extractSettledValue(result, defaultValue): T
    -getDefaultAnalysisResult(): Object
}

class OpenAITasks {
    -openaiService: OpenAIService
    +summarize(text): Promise~string~
    +categorize(text): Promise~string[]~
    +extractKeywords(text): Promise~KeywordResult~
    +analyzeSentiment(text): Promise~string~
    -sanitizeInput(content): string
}

class OpenAIService {
    -openai: OpenAI
    -model: string
    -rpmLimiter: RateLimiterMemory
    -tpmLimiter: RateLimiterMemory
    -concurrencyLimiter: ReturnType~pLimit~
    -encoder: ReturnType~encoding_for_model~
    +runJsonResponse(prompt, fallback): Promise~T~
    +runRawResponse(prompt, fallback): Promise~string~
    -executeWithLimits(operation, prompt): Promise~T~
    -getCompletion(prompt): Promise~string~
    -isRateLimiterError(error): boolean
}

class KeywordResult {
    +primary: string
    +secondary: string[]
}

class Analysis {
    +summary: string
    +categories: string[]
    +primaryKeyword: string
    +secondaryKeywords: string[]
    +sentiment: string
}

class RetryUtils {
    +retryWithBackoff(fn): Promise~T~
}

class TimeoutUtils {
    +promiseWithTimeout(promise, ms): Promise~T~
}

class JsonUtils {
    +safeJsonParse(text, fallback): T
}

AnalyzeController --> ContentAnalyzerService: uses
ContentAnalyzerService --> OpenAITasks: uses
OpenAITasks --> OpenAIService: uses
OpenAIService --> RetryUtils: uses
OpenAIService --> TimeoutUtils: uses
OpenAIService --> JsonUtils: uses
OpenAITasks ..> KeywordResult: returns
ContentAnalyzerService ..> Analysis: returns

```

Main components:

Express.js: Web framework for route and middleware management
Controllers: Handle HTTP requests and delegate processing to services
Services: Contain business logic and orchestrate complex operations
- ContentAnalyzerService: Coordinates text analysis operations
- OpenAIService: Encapsulates interaction with the OpenAI API
Types: TypeScript type definitions to ensure type safety
Utils: Utilities such as retry management, timeouts, and data transformation

This architecture allows for low coupling between components and facilitates the substitution of implementations (for example, changing from OpenAI to another LLM provider).
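
One way to picture the provider substitution mentioned above is a narrow interface that task code depends on. The sketch below is an assumption about how such a seam could look: `LlmProvider`, `CannedProvider`, and `SentimentTask` are hypothetical names, and the repository's `OpenAIService` is not necessarily shaped this way.

```typescript
// Hypothetical provider abstraction (not taken from the repository).
interface LlmProvider {
  complete(prompt: string): Promise<string>;
}

// Test double that returns a canned reply and records the prompts it receives.
class CannedProvider implements LlmProvider {
  readonly prompts: string[] = [];
  private reply: string;

  constructor(reply: string) {
    this.reply = reply;
  }

  async complete(prompt: string): Promise<string> {
    this.prompts.push(prompt);
    return this.reply;
  }
}

// Task code depends only on the interface, so providers can be swapped without changes here.
class SentimentTask {
  private provider: LlmProvider;

  constructor(provider: LlmProvider) {
    this.provider = provider;
  }

  async analyzeSentiment(text: string): Promise<string> {
    const raw = await this.provider.complete(
      `Classify the sentiment of this text as positive, negative, or neutral.\n\n${text}`,
    );
    const normalized = raw.trim().toLowerCase();
    return ["positive", "negative", "neutral"].includes(normalized) ? normalized : "neutral";
  }
}
```

An adapter for OpenAI or any other vendor would implement `LlmProvider` in the same way, which also makes it straightforward to use a canned provider in unit tests.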

Managing Non-Determinism in LLMs

Large language models (LLMs) are inherently non-deterministic, which presents unique challenges for creating reliable APIs. I have implemented several strategies to mitigate this unpredictability:

1. Advanced prompting techniques

I use "Prompt Decorators", a recent technique (GitHub, Medium) that appears to improve response consistency. More testing is needed to determine its effectiveness, but so far it seems well received. Here is a real example of the prompts used in the analysis tasks:

```typescript
// Example: Sentiment analysis with decorators
const sentimentPrompt = `
+++OutputFormat(format=single-word, allowed=["positive", "negative", "neutral"])
+++Constraint(type=response-length, max=1)
+++ErrorHandling(strategy=graceful-fallback, default="neutral")
+++SecurityBoundary(enforce=strict)
Classify the sentiment of this text as positive, negative, or neutral. Respond with a single word only, no explanations.

TEXT TO ANALYZE:
${sanitizedContent}
`;
```

Each decorator solves a specific problem:

+++OutputFormat: Explicitly defines the expected response format
+++Constraint: Establishes precise limits for the response
+++ErrorHandling: Defines behavior for exceptional cases
+++SecurityBoundary: Improves resistance against injection attacks
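
If several tasks share this decorator syntax, the prompts could be assembled by a small helper instead of hand-written template strings. The sketch below is purely illustrative: `decorator` and `buildPrompt` are hypothetical helpers, not functions from the repository or from any Prompt Decorators library.

```typescript
type DecoratorArgs = Record<string, string | number>;

// Renders one decorator line, e.g. "+++Constraint(type=response-length, max=1)".
function decorator(name: string, args: DecoratorArgs = {}): string {
  const rendered = Object.entries(args)
    .map(([key, value]) => `${key}=${value}`)
    .join(", ");
  return rendered ? `+++${name}(${rendered})` : `+++${name}`;
}

// Joins decorator lines, the instruction, and the content into one prompt.
// The content is assumed to have been sanitized by the caller beforehand.
function buildPrompt(decorators: string[], instruction: string, content: string): string {
  return [...decorators, instruction, "", "TEXT TO ANALYZE:", content].join("\n");
}

const examplePrompt = buildPrompt(
  [
    decorator("Constraint", { type: "response-length", max: 1 }),
    decorator("SecurityBoundary", { enforce: "strict" }),
  ],
  "Classify the sentiment of this text as positive, negative, or neutral.",
  "The onboarding flow was quick and pleasant.",
);
```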

2. Post-Processing Validation and Normalization

I implemented rigorous validation to ensure that responses meet the specified requirements:

```typescript
// Sentiment normalization
const normalizedSentiment = sentiment.trim().toLowerCase();
if (!['negative', 'neutral', 'positive'].includes(normalizedSentiment)) {
  console.warn(`Invalid sentiment received: "${normalizedSentiment}", using "neutral"`);
  return 'neutral';
}
```

3. Prompt Injection Protection

I implemented sanitization techniques to neutralize potential injection attacks:

```typescript
private sanitizeInput(content: string): string {
  // Remove decorators that might try to be injected, e.g. "+++Name(args)"
  let sanitized = content.replace(/\+\+\+\w+(\(.*?\))?/g, '[FILTERED]');

  // injectionPatterns is a list of regular expressions defined elsewhere (not shown here)
  for (const pattern of injectionPatterns) {
    sanitized = sanitized.replace(pattern, '[FILTERED]');
  }

  return sanitized;
}
```

4. Error Handling

I designed a system that provides meaningful responses even when failures occur:

```typescript
// In ContentAnalyzerService
const tasks = {
  categories: this.openaiTasks.categorize(content).catch(() => ['uncategorized']),
  keywords: this.openaiTasks.extractKeywords(content).catch(() => ({ primary: '', secondary: [] })),
  sentiment: this.openaiTasks.analyzeSentiment(content).catch(() => 'neutral'),
  summary: this.openaiTasks.summarize(content).catch(() => ''),
};

// Using Promise.allSettled to ensure responses even with partial failures
const results = await Promise.allSettled([tasks.categories, tasks.summary, tasks.keywords, tasks.sentiment]);
```
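
The component diagram lists a private `extractSettledValue(result, defaultValue)` helper on `ContentAnalyzerService`. Its body is not shown in this README, so the following is only a plausible sketch of how settled results could be unwrapped with a per-field default; `settledDemo` is a hypothetical demonstration function.

```typescript
// Returns the fulfilled value, or the supplied default when the promise was rejected.
function extractSettledValue<T>(result: PromiseSettledResult<T>, defaultValue: T): T {
  return result.status === "fulfilled" ? result.value : defaultValue;
}

// Demonstration with one successful and one failed task.
async function settledDemo(): Promise<{ summary: string; categories: string[] }> {
  const [summary, categories] = await Promise.allSettled([
    Promise.resolve("A short summary"),
    Promise.reject<string[]>(new Error("LLM call failed")),
  ]);
  return {
    summary: extractSettledValue(summary, ""),
    categories: extractSettledValue(categories, ["uncategorized"]),
  };
}
```

Because `Promise.allSettled` never rejects, one failed analysis leaves the other three results intact and the failed field simply takes its default.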

This combination of techniques converts the unpredictable nature of LLMs into a reliable and consistent system for text analysis.

Testing and Performance

End-to-End Testing

I implemented automated tests that verify the complete functionality of the API:

```bash
npm run test:run
```

Test architecture diagram

```mermaid
graph TD

subgraph "Test Environment"
    E2E
    Unit
    Load
    Mock[OpenAI Mocks]
end

Mock --> E2E
Mock --> Unit

API --> |Uses| Services

style API fill:#9cf,stroke:#333
style Services fill:#fc9,stroke:#333
style E2E fill:#f9c,stroke:#333
style Unit fill:#9fc,stroke:#333
style Load fill:#ff9,stroke:#333
style Mock fill:#c9f,stroke:#333

```

Load Testing and OpenAI Limits

A critical aspect of the system is its behavior under load. I implemented specific tests with autocannon to measure performance:

```bash
npm run load-test:light   # 5 concurrent connections
npm run load-test:medium  # 25 concurrent connections
npm run load-test:heavy   # 50 concurrent connections
```

Example load test results:

| Metric                     | Value |
| -------------------------- | ----- |
| Average requests/sec       | 1059  |
| Average latency (ms)       | 4.19  |
| Maximum latency (ms)       | 796   |
| Total requests             | 10590 |
| Successful responses (2xx) | 10590 |
| Error responses            | 0     |

The API maintained 100% successful responses thanks to the implemented error handling and retry system. Internally, the rate limiting system and concurrency management ensured that requests did not exceed the limits imposed by OpenAI.

Concurrency Limits in OpenAI

I did a brief investigation during development and found a critical concurrency limit in the OpenAI API:

Premium plans are limited to approximately 8 concurrent requests
Requests exceeding this limit experience a significant increase in latency
This limit is especially restrictive for operations that can take 10-40 seconds

To address this restriction, I implemented:

Concurrency control: Limitation to 8 parallel requests with p-limit
Multi-level rate limiting: Implementation of limits in both RPM and TPM
Exponential backoff strategy: Smart retries when limits are detected
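
To make the concurrency control concrete, here is a dependency-free sketch of a limiter that allows at most N tasks in flight. It is a stand-in for the kind of behavior p-limit provides, not p-limit's implementation or API; `createLimiter` is a hypothetical name.

```typescript
// Returns a function that runs tasks with at most `maxConcurrent` in flight at once.
function createLimiter(maxConcurrent: number) {
  let active = 0; // slots currently in use
  const waiting: Array<() => void> = [];

  const release = (): void => {
    const next = waiting.shift();
    if (next) {
      next(); // hand the slot directly to the next waiter
    } else {
      active--;
    }
  };

  return async function limit<T>(task: () => Promise<T>): Promise<T> {
    if (active >= maxConcurrent) {
      // Wait until a finished task hands over its slot.
      await new Promise<void>((resolve) => waiting.push(resolve));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      release();
    }
  };
}

// Usage: cap LLM calls at 8 in parallel, matching the limit described above.
const limitLlmCalls = createLimiter(8);
```

Handing the slot directly to the next waiter, instead of decrementing and re-incrementing the counter, prevents a newly arriving call from slipping in between and exceeding the limit.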

Scalability Considerations

The architecture was designed with scalability in mind. Here are some key strategies for scaling the system to enterprise use cases:

1. Processing Extensive Documents

Extensive content can be divided into smaller chunks for analysis. This is especially useful for long documents that exceed OpenAI's token limits.

```typescript
function splitIntoChunks(text: string, chunkSize: number = 4000): string[] {
  const paragraphs = text.split('\n\n');
  const chunks: string[] = [];
  let currentChunk = '';

  for (const paragraph of paragraphs) {
    // Start a new chunk when adding this paragraph would exceed the limit.
    // The emptiness check avoids pushing an empty chunk when a single
    // paragraph is longer than chunkSize.
    if (currentChunk && currentChunk.length + paragraph.length > chunkSize) {
      chunks.push(currentChunk.trim());
      currentChunk = '';
    }
    currentChunk += paragraph + '\n\n';
  }

  if (currentChunk.trim()) {
    chunks.push(currentChunk.trim());
  }

  return chunks;
}
```

This technique divides the text into smaller chunks respecting natural paragraph boundaries, which preserves local context. Then we could apply specific strategies for each type of analysis:

Application to analysis tasks

1. Summarizing extensive texts:

```typescript
async summarizeLargeContent(content: string): Promise<string> {
  if (content.length < 4000) {
    return this.openaiTasks.summarize(content);
  }

  // 1. Split into chunks
  const chunks = splitIntoChunks(content, 3800);

  // 2. Summarize each chunk in parallel
  const chunkSummaries = await Promise.all(
    chunks.map(chunk => this.openaiTasks.summarize(chunk))
  );

  // 3. If the combined summaries are still extensive, generate a meta-summary
  const combinedSummary = chunkSummaries.join('\n\n');

  if (combinedSummary.length > 4000) {
    return this.openaiTasks.summarize(combinedSummary);
  }

  return combinedSummary;
}
```

2. Categorizing extensive content:

```typescript
async categorizeLargeContent(content: string): Promise<string[]> {
  if (content.length < 4000) {
    return this.openaiTasks.categorize(content);
  }

  // 1. Split into chunks
  const chunks = splitIntoChunks(content, 3800);

  // 2. Categorize each chunk
  const allCategories: string[][] = await Promise.all(
    chunks.map(chunk => this.openaiTasks.categorize(chunk))
  );

  // 3. Count frequency of each category
  const categoryFrequency = new Map<string, number>();

  allCategories.flat().forEach(category => {
    const normalizedCategory = category.toLowerCase();
    categoryFrequency.set(
      normalizedCategory,
      (categoryFrequency.get(normalizedCategory) || 0) + 1
    );
  });

  // 4. Select the 5 most frequent categories
  return Array.from(categoryFrequency.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, 5)
    .map(([category]) => category);
}
```

3. Keyword extraction:

```typescript
async extractKeywordsFromLargeContent(content: string): Promise<{primary: string, secondary: string[]}> {
  if (content.length < 4000) {
    return this.openaiTasks.extractKeywords(content);
  }

  // 1. Split into chunks
  const chunks = splitIntoChunks(content, 3800);

  // 2. Extract keywords from each chunk
  const allKeywords = await Promise.all(
    chunks.map(chunk => this.openaiTasks.extractKeywords(chunk))
  );

  // 3. Count keyword frequency
  const keywordFrequency = new Map<string, number>();

  // Add primary keywords with weight 3
  allKeywords.forEach(result => {
    const normalizedKeyword = result.primary.toLowerCase();
    keywordFrequency.set(
      normalizedKeyword,
      (keywordFrequency.get(normalizedKeyword) || 0) + 3
    );
  });

  // Add secondary keywords with weight 1
  allKeywords.forEach(result => {
    result.secondary.forEach(keyword => {
      const normalizedKeyword = keyword.toLowerCase();
      keywordFrequency.set(
        normalizedKeyword,
        (keywordFrequency.get(normalizedKeyword) || 0) + 1
      );
    });
  });

  // 4. Determine the primary keyword and secondary keywords
  const sortedKeywords = Array.from(keywordFrequency.entries())
    .sort((a, b) => b[1] - a[1]);

  return {
    // Fall back to an empty string if no keywords were extracted
    primary: sortedKeywords[0]?.[0] ?? '',
    secondary: sortedKeywords.slice(1, 11).map(([keyword]) => keyword)
  };
}
```

4. Sentiment analysis:

```typescript
async analyzeSentimentOfLargeContent(content: string): Promise<string> {
  if (content.length < 4000) {
    return this.openaiTasks.analyzeSentiment(content);
  }

  // 1. Split into chunks
  const chunks = splitIntoChunks(content, 3800);

  // 2. Analyze sentiment of each chunk
  const sentiments = await Promise.all(
    chunks.map(chunk => this.openaiTasks.analyzeSentiment(chunk))
  );

  // 3. Count frequency of each sentiment
  const sentimentCounts: Record<string, number> = { positive: 0, negative: 0, neutral: 0 };

  sentiments.forEach(sentiment => {
    // Unexpected labels get their own counter instead of producing NaN
    sentimentCounts[sentiment] = (sentimentCounts[sentiment] ?? 0) + 1;
  });

  // 4. Determine the predominant sentiment
  if (sentimentCounts.positive > sentimentCounts.negative) {
    return sentimentCounts.positive > sentimentCounts.neutral ? 'positive' : 'neutral';
  } else if (sentimentCounts.negative > sentimentCounts.positive) {
    return sentimentCounts.negative > sentimentCounts.neutral ? 'negative' : 'neutral';
  } else {
    return 'neutral';
  }
}
```

Advantages of batch processing

This approach provides several important advantages:

Scalability: Allows processing documents of any size, overcoming the context limitation of LLMs.
Context preservation: By respecting paragraph boundaries, we maintain coherence within each chunk.
Parallel processing: Tasks are executed concurrently, reducing total processing time.
Intelligent aggregation: Each type of analysis uses a combination method appropriate to its nature (meta-summary for summaries, frequency for categories, etc.).
Cost efficiency: Optimizes token usage by sending only the necessary content for each task.

Production implementation

In a production environment, these techniques would be integrated with asynchronous processing and notification systems to handle prolonged response times for very extensive documents. For example:

```typescript
// Pseudocode
async function handleLargeDocumentAnalysis(content: string): Promise<string | Analysis> {
  // If the content is large, process it in the background
  if (content.length > 10000) {
    const jobId = await queueService.enqueueJob({
      type: 'document-analysis',
      content,
      timestamp: new Date()
    });

    return jobId; // Return an ID that the client can use to query the status
  }

  // Process small documents immediately
  return await contentAnalyzerService.analyzeContent(content);
}
```

This architecture allows the API to remain responsive even when processing very large documents.
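
The job-ID pattern above needs somewhere to track job state. As a minimal sketch, assuming a single process and no persistence, an in-memory queue could look like this. `InMemoryJobQueue`, `Job`, and `JobStatus` are hypothetical names; a real deployment would more likely use a durable queue shared between API instances and workers.

```typescript
type JobStatus = "queued" | "processing" | "done" | "failed";

interface Job<R> {
  id: string;
  status: JobStatus;
  result?: R;
  error?: string;
}

// Single-process job tracker: enqueue returns an ID, getJob reports progress.
class InMemoryJobQueue<P, R> {
  private jobs = new Map<string, Job<R>>();
  private nextId = 1;
  private handler: (payload: P) => Promise<R>;

  constructor(handler: (payload: P) => Promise<R>) {
    this.handler = handler;
  }

  enqueue(payload: P): string {
    const id = `job-${this.nextId++}`;
    const job: Job<R> = { id, status: "queued" };
    this.jobs.set(id, job);
    // Start on a later tick so the caller gets the ID back immediately.
    setTimeout(() => void this.run(job, payload), 0);
    return id;
  }

  getJob(id: string): Job<R> | undefined {
    return this.jobs.get(id);
  }

  private async run(job: Job<R>, payload: P): Promise<void> {
    job.status = "processing";
    try {
      job.result = await this.handler(payload);
      job.status = "done";
    } catch (error) {
      job.error = error instanceof Error ? error.message : String(error);
      job.status = "failed";
    }
  }
}
```

A status endpoint could then return `getJob(id)` so that clients poll until the status is `done` or `failed`.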

Future scalability architecture

```mermaid
graph TD
Client([Client]) --> LB[Load Balancer]
LB --> API1[API Instance 1]
LB --> API2[API Instance 2]
LB --> API3[API Instance 3]

API1 --> Cache[(Redis Cache)]
API2 --> Cache
API3 --> Cache

API1 --> Queue[Task Queue]
API2 --> Queue
API3 --> Queue

Queue --> Worker1[Worker 1]
Queue --> Worker2[Worker 2]
Queue --> WorkerN[Worker N]

Worker1 --> OpenAI[(OpenAI API)]
Worker2 --> OpenAI
WorkerN --> OpenAI

style Client fill:#9cf,stroke:#333
style LB fill:#fc9,stroke:#333
style API1 fill:#f9c,stroke:#333
style API2 fill:#f9c,stroke:#333
style API3 fill:#f9c,stroke:#333
style Cache fill:#9fc,stroke:#333
style Queue fill:#ff9,stroke:#333
style Worker1 fill:#c9f,stroke:#333
style Worker2 fill:#c9f,stroke:#333
style WorkerN fill:#c9f,stroke:#333
style OpenAI fill:#cf9,stroke:#333

```

This design would allow:

Horizontal scaling: Increasing capacity through additional instances
Asynchronous processing: Handling long tasks through work queues
Cost optimization: Reducing calls to OpenAI through caching
Fault tolerance: Service continuity even in the face of individual component failures
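
The caching idea can be illustrated with a small in-process cache that expires entries after a fixed time. This is a sketch under stated assumptions: `AnalysisCache` is a hypothetical class, the diagram above suggests Redis for the shared version, and a production cache would probably key entries by a content hash rather than by the text itself.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // timestamp in milliseconds
}

// In-process TTL cache keyed by whitespace-normalized content.
class AnalysisCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so that expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Collapses whitespace so trivially different inputs share one entry.
  private keyFor(content: string): string {
    return content.trim().replace(/\s+/g, " ");
  }

  // Returns the cached value if it is still fresh; otherwise computes and stores it.
  async getOrCompute(content: string, compute: () => Promise<V>): Promise<V> {
    const key = this.keyFor(content);
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = await compute();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

Wrapping the analysis call in `getOrCompute` means repeated submissions of the same text within the TTL trigger only one set of LLM calls.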

Future Implementations

Intelligent cache system: Storage of frequent analyses to reduce latency and costs
Support for multiple formats: Preprocessing for PDF, Markdown, and HTML
Observability dashboard: Detailed performance and usage metrics
Adapters for alternative models: Compatibility with Claude, Llama 2, and other LLMs
Embeddings API: Semantic search and clustering of similar documents

Owner

Name: Alejandro Sánchez Yalí
Login: asanchezyali
Kind: user
Company: Monadical

Website: www.asanchezyali.com
Twitter: asanchezyali
Repositories: 16
Profile: https://github.com/asanchezyali

Mathematician with experience in Software Development, Data Science and Blockchain

GitHub Events

Total

Public event: 1
Push event: 1

Last Year

Public event: 1
Push event: 1

Dependencies

.github/workflows/ci.yml actions

actions/checkout v4 composite
actions/setup-node v4 composite

package-lock.json npm

513 dependencies

package.json npm

@eslint/js ^9.17.0 development
@tsconfig/node22 ^22.0.0 development
@types/autocannon ^7.12.7 development
@types/express ^5.0.0 development
@types/node ^22.10.2 development
@types/supertest ^6.0.3 development
@types/swagger-jsdoc ^6.0.4 development
@types/swagger-ui-express ^4.1.8 development
@vitest/coverage-v8 ^2.1.8 development
@vitest/eslint-plugin ^1.1.24 development
@vitest/ui ^2.1.8 development
autocannon ^8.0.0 development
eslint ^9.17.0 development
eslint-plugin-perfectionist ^4.6.0 development
husky ^9.1.7 development
lint-staged ^15.2.11 development
prettier ^3.4.2 development
supertest ^7.1.0 development
tsx ^4.19.2 development
typescript ^5.7.2 development
typescript-eslint ^8.18.2 development
vitest ^2.1.8 development
@types/cors ^2.8.17
cors ^2.8.5
dotenv ^16.5.0
express ^4.21.2
openai ^4.97.0
rate-limiter-flexible ^7.1.0
swagger-jsdoc ^6.2.8
swagger-ui-express ^5.0.1
tiktoken ^1.0.21
zod ^3.24.4
