seo-generator-server/CLAUDE.md

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Node.js-based SEO content generation server that creates SEO-optimized content using multiple LLMs with anti-detection mechanisms. The system operates in two exclusive modes: MANUAL (web interface + API) or AUTO (batch processing from Google Sheets).

## Development Commands

### Server Operations
```bash
npm start                    # Start in MANUAL mode (default)
npm start -- --mode=manual  # Explicitly start MANUAL mode
npm start -- --mode=auto    # Start in AUTO mode
SERVER_MODE=auto npm start  # Start AUTO mode via environment
```
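
The commands above select the mode either through a `--mode=` flag or the `SERVER_MODE` variable. As a minimal sketch of how such a choice could be resolved: `resolveMode` is a hypothetical helper, not the actual `ModeManager` API, and the precedence shown (CLI flag, then environment, then `manual`) is an assumption.

```javascript
// Hypothetical helper: resolves the server mode from CLI args and environment.
// Assumed precedence: --mode flag > SERVER_MODE > 'manual' default.
const VALID_MODES = ['manual', 'auto'];

function resolveMode(argv = process.argv.slice(2), env = process.env) {
  const flag = argv.find((arg) => arg.startsWith('--mode='));
  const requested = flag ? flag.slice('--mode='.length) : env.SERVER_MODE;
  const mode = (requested || 'manual').toLowerCase();
  if (!VALID_MODES.includes(mode)) {
    throw new Error(`Unknown mode "${mode}", expected one of: ${VALID_MODES.join(', ')}`);
  }
  return mode;
}

console.log(resolveMode(['--mode=auto'], {}));          // auto
console.log(resolveMode([], { SERVER_MODE: 'auto' }));  // auto
console.log(resolveMode([], {}));                       // manual
```

Rejecting unknown values early keeps the two modes mutually exclusive: the process either starts in one of them or fails at startup.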

### Production Workflow Execution
```bash
# Execute real production workflow from Google Sheets
node -e "const main = require('./lib/Main'); main.handleFullWorkflow({ rowNumber: 2, source: 'production' });"

# Test with different rows
node -e "const main = require('./lib/Main'); main.handleFullWorkflow({ rowNumber: 3, source: 'production' });"
```

### Testing Commands
```bash
# Main test suites
npm run test:all              # Complete test suite
npm run test:production-loop  # Production ready validation (CI/CD recommended)
npm run test:comprehensive    # Exhaustive modular combinations (22 tests)
npm run test:basic            # Basic architecture validation

# Quick tests
npm run test:smoke            # Smoke tests
npm run test:llm              # LLM connectivity
npm run test:content          # Content generation
npm run test:integration      # Integration tests
```

### Quick System Tests
```bash
# Production workflow
node -e "require('./lib/Main').handleFullWorkflow({ rowNumber: 2, source: 'production' });"

# LLM connectivity
node -e "require('./lib/LLMManager').testLLMManager()"

# Google Sheets
node -e "require('./lib/BrainConfig').getPersonalities().then(p => console.log(\`\${p.length} personalities\`))"
```

## Architecture Overview

### Dual Mode System
The server operates in two mutually exclusive modes controlled by `lib/modes/ModeManager.js`:

- **MANUAL Mode** (`lib/modes/ManualServer.js`): Web interface, API endpoints, WebSocket for real-time logs
- **AUTO Mode** (`lib/modes/AutoProcessor.js`): Batch processing from Google Sheets without web interface

### 🆕 Advanced Configuration Systems

#### Dynamic Prompt Engine (`lib/prompt-engine/`)
Dynamically generates adaptive prompts through multi-level composition:
- **Templates**: technical, style, and adversarial, with dynamic variables
- **Context analyzers**: Automatic analysis used to adapt prompts
- **Variable injection**: Context-aware replacement of variables

#### Trend Manager (`lib/trend-prompts/`)
Manages configurable trends that modulate the prompts:
- **Sector trends**: eco-responsable (sustainability), tech-innovation (digitalization), artisanal-premium (craftsmanship)
- **Generational trends**: generation-z (inclusive/viral), millennials (authenticity), seniors (tradition)
- **Configuration**: targetTerms, focusAreas, tone, values, applied selectively

#### Workflow Engine (`lib/workflow-configuration/`)
Configurable modular sequences, with 5 predefined workflows:
- **default**: Selective → Adversarial → Human → Pattern (standard workflow)
- **human-first**: Human → Pattern → Selective → Pattern (humanization first)
- **stealth-intensive**: Pattern → Adversarial → Human → Pattern → Adversarial (maximum anti-detection)
- **quality-first**: Selective → Human → Selective → Pattern (quality first)
- **balanced**: Selective → Human → Adversarial → Pattern → Selective (balanced)

Supports multi-pass runs (the same module several times) and a variable intensity per step.
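
The sequences above can be sketched as plain data plus a small runner. The module order is copied from the list; the data shape, the `runWorkflow` function, and the stub modules are hypothetical and only illustrate how a multi-pass sequence could be applied in order.

```javascript
// Sequences copied from the workflow list; the object shape is an assumption.
const WORKFLOWS = {
  'default': ['selective', 'adversarial', 'human', 'pattern'],
  'human-first': ['human', 'pattern', 'selective', 'pattern'],
  'stealth-intensive': ['pattern', 'adversarial', 'human', 'pattern', 'adversarial'],
  'quality-first': ['selective', 'human', 'selective', 'pattern'],
  'balanced': ['selective', 'human', 'adversarial', 'pattern', 'selective'],
};

// Hypothetical runner: applies each module in order; a module may run several times.
async function runWorkflow(name, content, modules) {
  const sequence = WORKFLOWS[name];
  if (!sequence) throw new Error(`Unknown workflow: ${name}`);
  let current = content;
  for (const step of sequence) {
    current = await modules[step](current);
  }
  return current;
}

// Stand-in modules that only tag the text, so the order of passes is visible.
const stubModules = {
  selective: async (text) => `${text}>S`,
  adversarial: async (text) => `${text}>A`,
  human: async (text) => `${text}>H`,
  pattern: async (text) => `${text}>P`,
};

runWorkflow('human-first', 'draft', stubModules).then(console.log); // draft>H>P>S>P
```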

#### Batch Processing (`lib/batch/`)
Complete batch processing system:
- **BatchController**: API endpoints (config, start, stop, status)
- **BatchProcessor**: Queue management, error handling, real-time progress
- **DigitalOceanTemplates**: 10+ predefined XML templates
- **Configuration**: rowRange, trendId, workflowSequence, saveIntermediateSteps
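
As an illustration of the queue and error handling described above, here is a minimal sketch. The four configuration keys come from the list; the shape of their values, the inclusive row bounds, and the `buildQueue` / `processBatch` helpers are assumptions, not the actual `BatchProcessor` API.

```javascript
// Hypothetical batch configuration; only the key names come from the documentation.
const batchConfig = {
  rowRange: { start: 2, end: 5 },
  trendId: 'eco-responsable',
  workflowSequence: 'default',
  saveIntermediateSteps: false,
};

// Expand the row range into a queue of row numbers (inclusive bounds assumed).
function buildQueue({ rowRange }) {
  const rows = [];
  for (let row = rowRange.start; row <= rowRange.end; row++) rows.push(row);
  return rows;
}

// Process rows one by one, recording failures instead of aborting the whole batch.
async function processBatch(config, processRow) {
  const results = { completed: [], failed: [] };
  for (const rowNumber of buildQueue(config)) {
    try {
      await processRow(rowNumber, config);
      results.completed.push(rowNumber);
    } catch (error) {
      results.failed.push({ rowNumber, message: error.message });
    }
  }
  return results;
}
```

Collecting failures per row means one bad row does not stop the remaining rows from being processed.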

### 🆕 Flexible Pipeline System
Architecture for building custom, reusable workflows:

**Components**:
- `public/pipeline-builder.html` - Visual drag-and-drop interface
- `public/pipeline-runner.html` - Execution with progress tracking
- `lib/pipeline/PipelineExecutor.js` - Execution engine
- `lib/pipeline/PipelineTemplates.js` - 10 predefined templates

**Available templates** (7 of the 10 are listed here):
- `minimal-test` (1 step, 15s) - Quick tests
- `light-fast` (2 steps, 35s) - Basic generation
- `standard-seo` (4 steps, 75s) - Balanced protection
- `premium-seo` (6 steps, 130s) - Quality + anti-detection
- `heavy-guard` (8 steps, 180s) - Maximum protection
- `gptzero-killer` (6 steps, 155s) - Specialized against GPTZero
- `originality-bypass` (6 steps, 160s) - Specialized against Originality.ai

**Key features**:
- Fully customizable module order
- Multi-pass support (the same module several times)
- Per-step configuration (mode, intensity 0.1-2.0, custom parameters)
- Optional checkpoint saves for debugging
- Real-time validation with detailed error messages
- Duration/cost estimate before execution

**Pipeline structure**: JSON with steps (module, mode, intensity, optional parameters) and metadata (author, version, tags)

**API Endpoints**: `/api/pipeline/{save,list,execute,validate,estimate}`
**Backward compatible**: both `pipelineConfig` (new) and `selectiveStack/adversarialMode` (legacy) are supported
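
To make the pipeline structure concrete, here is a sketch of a definition and a validator that enforces the 0.1-2.0 intensity range. The field names follow the structure described above; the `name` field, the module identifiers, and `validatePipeline` itself are hypothetical and may differ from what `/api/pipeline/validate` actually checks.

```javascript
// Hypothetical pipeline definition following the documented structure.
const examplePipeline = {
  name: 'example-pipeline', // assumed field, not confirmed by the documentation
  steps: [
    { module: 'selective', mode: 'lightEnhancement', intensity: 1.0 },
    { module: 'pattern', mode: 'syntaxFocus', intensity: 0.8, parameters: {} },
    { module: 'selective', mode: 'lightEnhancement', intensity: 0.5 }, // multi-pass
  ],
  metadata: { author: 'example', version: '1.0', tags: ['demo'] },
};

// Assumed module identifiers for the four enhancement layers.
const KNOWN_MODULES = ['selective', 'adversarial', 'human', 'pattern'];

// Returns a list of human-readable errors; an empty list means the pipeline is valid.
function validatePipeline(definition) {
  if (!Array.isArray(definition.steps) || definition.steps.length === 0) {
    return ['steps must be a non-empty array'];
  }
  const errors = [];
  definition.steps.forEach((step, index) => {
    const label = `step ${index + 1}`;
    if (!KNOWN_MODULES.includes(step.module)) {
      errors.push(`${label}: unknown module "${step.module}"`);
    }
    if (typeof step.mode !== 'string' || step.mode === '') {
      errors.push(`${label}: mode is required`);
    }
    if (typeof step.intensity !== 'number' || step.intensity < 0.1 || step.intensity > 2.0) {
      errors.push(`${label}: intensity must be a number between 0.1 and 2.0`);
    }
  });
  return errors;
}

console.log(validatePipeline(examplePipeline)); // []
```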

### Core Workflow Pipeline

**7 main steps** (lib/Main.js):
1. **Data Preparation** - Reads Google Sheets (CSV data + XML templates)
2. **Element Extraction** - Parses XML instructions, distinguishing {{variables}} from {prompts}
3. **Missing Keywords Generation** - Auto-completes missing data via LLMs
4. **Content Generation** - Generates the base content in parallel
5. **Multi-LLM Enhancement** - 4 modular layers (Selective → Adversarial → Human → Pattern)
6. **Content Assembly** - Injects the content into the XML structure
7. **Organic Compilation & Storage** - Saves the clean text to Google Sheets
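
Step 2 relies on telling double-brace variables apart from single-brace prompts. A minimal sketch of that distinction with regular expressions follows; the sample template and the `extractPlaceholders` helper are hypothetical, and the real `ElementExtraction.js` may parse a richer syntax.

```javascript
// Matches {{name}} placeholders.
const VARIABLE_PATTERN = /\{\{([^{}]+)\}\}/g;
// Matches {text} only when the braces are not part of a {{...}} pair.
const INSTRUCTION_PATTERN = /(?<!\{)\{([^{}]+)\}(?!\})/g;

// Hypothetical helper: lists variables and instructions found in a template string.
function extractPlaceholders(template) {
  const collect = (pattern) =>
    [...template.matchAll(pattern)].map((match) => match[1].trim());
  return {
    variables: collect(VARIABLE_PATTERN),
    instructions: collect(INSTRUCTION_PATTERN),
  };
}

const sample = '<h1>{{T0}}</h1><p>{Write a short intro}</p><h2>{{MC+1}}</h2>';
console.log(extractPlaceholders(sample));
// { variables: [ 'T0', 'MC+1' ], instructions: [ 'Write a short intro' ] }
```

The lookbehind and lookahead keep the single-brace pattern from matching the inner braces of a `{{variable}}`.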

**Google Sheets Integration**:
- **Instructions** (columns A-I): slug, T0, MC0, T-1, L-1, MC+1, T+1, L+1, XML template
- **Personnalites** (15 personalities): Marc, Sophie, Laurent, Julie, Kévin, Amara, Mamadou, Émilie, Pierre-Henri, Yasmine, Fabrice, Chloé, Linh, Minh, Thierry
- **Generated_Articles**: Final text output + complete metadata

**Modular Enhancement Layers** (100% modular architecture):
- **5 Selective Stacks**:
  - `lightEnhancement` (1 layer: OpenAI technical)
  - `standardEnhancement` (2 layers: OpenAI + Gemini)
  - `fullEnhancement` (3 layers, multi-LLM)
  - `personalityFocus` (Mistral style prioritized)
  - `adaptive` (intelligent selection)

- **5 Adversarial Modes**:
  - `none` → `light` → `standard` → `heavy` → `adaptive`
  - Detectors: GPTZero, Originality.ai, general
  - Methods: enhancement, regeneration, hybrid

- **6 Human Simulation Modes**:
  - `none` → `lightSimulation` → `standardSimulation` → `heavySimulation` → `personalityFocus` → `adaptive`
  - FatiguePatterns, PersonalityErrors, TemporalStyles

- **7 Pattern Breaking Modes**:
  - `none` → `syntaxFocus` → `connectorsFocus` → `structureFocus` → `styleFocus` → `comprehensiveFocus` → `adaptive`
  - LLMFingerprints removal, SyntaxVariations, NaturalConnectors

**Versioned Saves**:
v1.0 (initial generation) → v1.1 (post selective) → v1.2 (post adversarial) → v1.3 (post human) → v1.4 (post pattern) → v2.0 (final version)
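
The version sequence can be pictured as a mapping from pipeline stage to version label. In this sketch the labels are copied from the line above, while the stage keys and the in-memory `createHistory` helper are hypothetical; where versions are actually stored is out of scope here.

```javascript
// Version labels copied from the documented sequence; stage keys are assumptions.
const VERSION_BY_STAGE = {
  initial: 'v1.0',
  selective: 'v1.1',
  adversarial: 'v1.2',
  human: 'v1.3',
  pattern: 'v1.4',
  final: 'v2.0',
};

// Hypothetical in-memory history that records one entry per completed stage.
function createHistory() {
  const entries = [];
  return {
    save(stage, text) {
      const version = VERSION_BY_STAGE[stage];
      if (!version) throw new Error(`Unknown stage: ${stage}`);
      entries.push({ version, stage, text });
      return version;
    },
    list: () => entries.map(({ version, stage }) => `${version} (${stage})`),
  };
}

const history = createHistory();
history.save('initial', 'first draft');
history.save('selective', 'enhanced draft');
console.log(history.list()); // [ 'v1.0 (initial)', 'v1.1 (selective)' ]
```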

**LLM Providers**:
Claude (Anthropic), OpenAI (GPT-4), Gemini (Google), Deepseek, Moonshot, Mistral - **5/6 operational** (Gemini may be geo-blocked)

**Personality System**:
Random selection: each generation draws 60% of the 15 personalities, using a Fisher-Yates shuffle for unbiased randomization and temperature=1.0 for maximum variability
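
A minimal sketch of the selection described above: a Fisher-Yates shuffle followed by keeping 60% of the list (9 of 15). The personality names are copied from the sheet description; the function names and the rounding rule are assumptions and not the actual `BrainConfig` API.

```javascript
const PERSONALITIES = [
  'Marc', 'Sophie', 'Laurent', 'Julie', 'Kévin', 'Amara', 'Mamadou', 'Émilie',
  'Pierre-Henri', 'Yasmine', 'Fabrice', 'Chloé', 'Linh', 'Minh', 'Thierry',
];

// Fisher-Yates shuffle on a copy, so the input array is left untouched.
function shuffle(items, random = Math.random) {
  const result = [...items];
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1)); // uniform index in [0, i]
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Keep the first `ratio` share of the shuffled list (rounding is an assumption).
function pickCandidates(items, ratio = 0.6) {
  const count = Math.round(items.length * ratio);
  return shuffle(items).slice(0, count);
}

console.log(pickCandidates(PERSONALITIES).length); // 9
```

Unlike sorting with a random comparator, Fisher-Yates gives every permutation the same probability, provided the random source is uniform.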

## Centralized Logging System (LogSh)

**Architecture**: All logging via `logSh()` (lib/ErrorReporting.js) - Multi-output (Console + File + WebSocket)
**Levels**: TRACE (workflow), DEBUG, INFO, WARN, ERROR
**Format**: JSON structured logs (logs/seo-generator-YYYY-MM-DD_HH-MM-SS.log), JSONL Pino (logs/app.log)
**Trace**: AsyncLocalStorage hierarchical tracking with performance timing
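
Hierarchical tracing with `AsyncLocalStorage` can be sketched as follows. This shows the general technique only: `withSpan` and `currentPath` are hypothetical names, and the real `logSh()` implementation in `lib/ErrorReporting.js` may be organized differently.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');
const { performance } = require('node:perf_hooks');

const traceStorage = new AsyncLocalStorage();

// Hypothetical tracer: runs fn inside a named span nested under the current span.
async function withSpan(name, fn) {
  const parent = traceStorage.getStore();
  const span = {
    name,
    path: parent ? `${parent.path} > ${name}` : name,
    startedAt: performance.now(),
  };
  try {
    // Everything awaited inside fn sees `span` as the current store.
    return await traceStorage.run(span, fn);
  } finally {
    const durationMs = performance.now() - span.startedAt;
    console.log(`[TRACE] ${span.path} (${durationMs.toFixed(1)} ms)`);
  }
}

// Returns the hierarchical path of the span that is active for the caller.
function currentPath() {
  const span = traceStorage.getStore();
  return span ? span.path : '(root)';
}

withSpan('workflow', async () => {
  await withSpan('extraction', async () => {
    console.log(currentPath()); // workflow > extraction
  });
});
```

Because the store follows the asynchronous call chain, nested functions can log with the right context without passing a trace object through every call.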

**Log Viewer** (`tools/logViewer.js`):
```bash
node tools/logViewer.js --pretty                       # Last 200 lines
node tools/logViewer.js --includes "Claude" --pretty   # Keyword search
node tools/logViewer.js --level ERROR --pretty         # Filter errors
```

**Real-time**: WebSocket port 8081, auto-launch `tools/logs-viewer.html` in browser

## Key Components

### Core Orchestration
- **`lib/Main.js`** - Full workflow orchestration with a configurable pipeline and versioned saves (v1.0 → v2.0)
- **`lib/APIController.js`** - RESTful API controller centralizing all business logic:
  - CRUD for articles, projects, templates
  - Integration with DynamicPromptEngine, TrendManager, WorkflowEngine
  - Monitoring endpoints (health, metrics, personalities)
- **`lib/ConfigManager.js`** - Manager for modular configurations and pipelines:
  - Saves/loads JSON in `configs/` and `configs/pipelines/`
  - Automatic validation and versioning
  - Complete API for manipulating configs

### Enhancement Modules (Modular Architecture)
- **`lib/selective-enhancement/`** - Selective enhancement layers:
  - `SelectiveCore.js` - Layer-by-layer application
  - `SelectiveLayers.js` - 5 predefined stacks + adaptive
  - `TechnicalLayer.js` - Technical enhancement (OpenAI)
  - `TransitionLayer.js` - Transition enhancement (Gemini)
  - `StyleLayer.js` - Style enhancement (Mistral)
  - `SelectiveUtils.js` - Utilities + simple generation

- **`lib/adversarial-generation/`** - Modular anti-detection:
  - `AdversarialCore.js` - Main adversarial engine
  - `AdversarialLayers.js` - 5 configurable defense modes
  - `DetectorStrategies.js` - Interchangeable anti-detection strategies (GPTZero, Originality.ai)

- **`lib/human-simulation/`** - Simulation of realistic human errors:
  - `HumanSimulationCore.js` - Main simulation engine
  - `HumanSimulationLayers.js` - 6 simulation modes
  - `FatiguePatterns.js` - Realistic fatigue patterns
  - `PersonalityErrors.js` - Personality-specific errors
  - `TemporalStyles.js` - Temporal variations

- **`lib/pattern-breaking/`** - Breaking LLM patterns:
  - `PatternBreakingCore.js` - Pattern breaking engine
  - `PatternBreakingLayers.js` - 7 breaking modes
  - `LLMFingerprints.js` - LLM fingerprint removal
  - `SyntaxVariations.js` - Syntax variations
  - `NaturalConnectors.js` - Natural connectors

### Advanced Systems (New - Sept 2025)
- **`lib/prompt-engine/`** - DynamicPromptEngine:
  - Modular templates (technical, style, adversarial)
  - Context analyzers and adaptive rules
  - Multi-level composition with dynamic variables

- **`lib/trend-prompts/`** - TrendManager:
  - 6+ predefined trends (sector + generational)
  - Per-trend configuration (targetTerms, focusAreas, tone, values)

- **`lib/workflow-configuration/`** - WorkflowEngine:
  - 5 configurable predefined workflows
  - Support for multiple iterations and variable intensity

- **`lib/batch/`** - Batch Processing System:
  - BatchController (API endpoints)
  - BatchProcessor (queue, monitoring)
  - DigitalOceanTemplates (10+ XML templates)

### Utilities
- **`lib/LLMManager.js`** - Multi-LLM provider management with retry logic, rate limiting, provider rotation
- **`lib/BrainConfig.js`** - Google Sheets integration + personality system (random selection, Fisher-Yates)
- **`lib/ElementExtraction.js`** - XML parsing that distinguishes {{variables}} from {instructions}
- **`lib/ArticleStorage.js`** - Organic text compilation + Google Sheets storage
- **`lib/ErrorReporting.js`** - Centralized logging via `logSh()`, hierarchical tracing with AsyncLocalStorage
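
The retry logic and provider rotation mentioned for `lib/LLMManager.js` can be sketched as below. This is a generic illustration under stated assumptions: `callWithRetry`, the provider object shape (`name`, `call`), and the exponential backoff values are all hypothetical, and rate limiting is left out.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: tries providers in rotation, backing off between attempts.
async function callWithRetry(providers, prompt, { maxAttempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Rotate to the next provider on each attempt.
    const provider = providers[attempt % providers.length];
    try {
      const text = await provider.call(prompt);
      return { provider: provider.name, text };
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt); // 500 ms, 1000 ms, ...
      }
    }
  }
  throw new Error(`All ${maxAttempts} attempts failed: ${lastError.message}`);
}
```

A caller would pass an ordered list of providers, for example `callWithRetry([primary, fallback], 'Write a title')`, so that a failing provider is skipped on the next attempt.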

## Environment Configuration

Required environment variables in `.env`:

```bash
# Google Sheets Integration
GOOGLE_SERVICE_ACCOUNT_EMAIL=your-service-account@project.iam.gserviceaccount.com
GOOGLE_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
GOOGLE_SHEETS_ID=your_sheets_id

# LLM API Keys
ANTHROPIC_API_KEY=your_anthropic_key
OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_google_key
DEEPSEEK_API_KEY=your_deepseek_key
MOONSHOT_API_KEY=your_moonshot_key
MISTRAL_API_KEY=your_mistral_key

# Optional Configuration
LOG_LEVEL=INFO
MAX_COST_PER_ARTICLE=1.00
SERVER_MODE=manual
```
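
A startup check along these lines can report missing variables before any workflow runs. The variable names are taken from the block above; `loadConfig` is a hypothetical helper, and treating the sample values of the optional variables as defaults is an assumption.

```javascript
const REQUIRED_VARS = [
  'GOOGLE_SERVICE_ACCOUNT_EMAIL', 'GOOGLE_PRIVATE_KEY', 'GOOGLE_SHEETS_ID',
  'ANTHROPIC_API_KEY', 'OPENAI_API_KEY', 'GOOGLE_API_KEY',
  'DEEPSEEK_API_KEY', 'MOONSHOT_API_KEY', 'MISTRAL_API_KEY',
];

// Assumed defaults, mirroring the sample values shown for the optional variables.
const OPTIONAL_DEFAULTS = {
  LOG_LEVEL: 'INFO',
  MAX_COST_PER_ARTICLE: '1.00',
  SERVER_MODE: 'manual',
};

// Hypothetical helper: lists missing required variables and resolves optional ones.
function loadConfig(env = process.env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  const optional = {};
  for (const [name, fallback] of Object.entries(OPTIONAL_DEFAULTS)) {
    optional[name] = env[name] || fallback;
  }
  return { missing, optional };
}

const startupCheck = loadConfig();
if (startupCheck.missing.length > 0) {
  console.warn(`Missing environment variables: ${startupCheck.missing.join(', ')}`);
}
```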

## Tools

### Bundle Tool
```bash
node tools/pack-lib.cjs              # default → code.js
node tools/pack-lib.cjs --out out.js # custom output
node tools/pack-lib.cjs --order alpha
node tools/pack-lib.cjs --entry lib/test-manual.js
```

pack-lib.cjs creates a single code.js from all files in lib/. Each file is concatenated with an ASCII header showing its path. Imports/exports are kept, so the bundle is for **reading/audit only**, not execution.

### Unused Code Audit
```bash
node tools/audit-unused.cjs # Report dead files and unused exports
```

## Important Development Notes

**Architecture**: 100% modular, granular configuration, versioned saves (v1.0→v2.0), compatibility layer `handleFullWorkflow()`

**New Systems (Sept 2025)**: DynamicPromptEngine, TrendManager (6+ trends), WorkflowEngine (5 workflows), BatchProcessing, ConfigManager, APIController, 11 web interfaces

**Data**: Google Sheets source (no hardcoded JSON), 15 personalities (60% random selection, Fisher-Yates, temp=1.0), organic compilation, XML templates auto-generated

**Monitoring**: AsyncLocalStorage tracing, 5/6 LLM providers, RESTful API (pagination/filters), WebSocket real-time logs + health/metrics

**Legacy→Modular Migration**: ❌ `lib/ContentGeneration.js` + `lib/generation/` → ✅ selective/adversarial/human-simulation/pattern-breaking modules (full flexibility, adaptive stacks, parallelization)

## Web Interfaces (MANUAL Mode)

### Production Interfaces
- **`public/index.html`** - Main dashboard with workflow controls
- **`public/production-runner.html`** - Runs production workflows from Google Sheets
- **`public/pipeline-builder.html`** - Visual drag-and-drop pipeline builder
- **`public/pipeline-runner.html`** - Runs saved pipelines with tracking
- **`public/config-editor.html`** - Editor for modular configurations

### Test and Development Interfaces
- **`public/batch-dashboard.html`** - Batch processing dashboard with configuration
- **`public/batch-interface.html`** - Batch interface with fine-grained control
- **`public/prompt-engine-interface.html`** - Test interface for DynamicPromptEngine
- **`public/modular-pipeline-demo.html`** - Demo of the modular pipeline system
- **`public/step-by-step.html`** - Step-by-step execution for debugging
- **`public/test-modulaire.html`** - Manual module tests

## RESTful API Endpoints

**Articles/Projects/Templates**: Full CRUD (GET, POST, PUT, DELETE) - `/api/articles/*`, `/api/projects/*`, `/api/templates/*`
**Monitoring**: `/api/health`, `/api/metrics`, `/api/config/personalities`
**Batch**: `/api/batch/{config,start,stop,status}`
**Pipeline**: `/api/pipeline/{save,list,execute,validate,estimate}`
**Advanced**: `/api/prompt-engine/generate`, `/api/trends/*`, `/api/workflows/*`

See `API.md` for complete documentation with examples.

## File Structure
**Core**: `server.js`, `lib/Main.js`, `lib/APIController.js`, `lib/ConfigManager.js`, `lib/modes/`, `lib/BrainConfig.js`, `lib/LLMManager.js`
**Enhancement**: `lib/selective-enhancement/`, `lib/adversarial-generation/`, `lib/human-simulation/`, `lib/pattern-breaking/`
**Advanced**: `lib/prompt-engine/`, `lib/trend-prompts/`, `lib/workflow-configuration/`, `lib/batch/`, `lib/pipeline/`
**Utilities**: `lib/ElementExtraction.js`, `lib/ArticleStorage.js`, `lib/ErrorReporting.js`
**Assets**: `public/` (11 web interfaces), `configs/` (saved configs/pipelines), `tools/` (logViewer, bundler, audit), `tests/` (comprehensive test suite), `.env` (credentials)

## Dependencies & Workflow Sources
**Deps**: googleapis, axios, dotenv, express, nodemailer
**Sources**: production (Google Sheets), test_random_personality, node_server

## Git Push Configuration
If the push fails with "Connection closed port 22", use SSH over port 443:
```bash
# Point the remote at Bitbucket's alternate SSH host on port 443
# (the ssh:// URL form is needed to specify a port)
git remote set-url origin ssh://git@altssh.bitbucket.org:443/AlexisTrouve/seogeneratorserver.git

# Or add the following to ~/.ssh/config (SSH config syntax, not shell commands)
Host bitbucket.org
    HostName altssh.bitbucket.org
    Port 443
```