CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Node.js-based SEO content generation server that creates SEO-optimized content using multiple LLMs with anti-detection mechanisms. The system operates in two exclusive modes: MANUAL (web interface + API) or AUTO (batch processing from Google Sheets).

Development Commands

Server Operations

npm start                    # Start in MANUAL mode (default)
npm start -- --mode=manual  # Explicitly start MANUAL mode  
npm start -- --mode=auto    # Start in AUTO mode
SERVER_MODE=auto npm start  # Start AUTO mode via environment

Production Workflow Execution

# Execute real production workflow from Google Sheets
node -e "const main = require('./lib/Main'); main.handleFullWorkflow({ rowNumber: 2, source: 'production' });"

# Test with different rows
node -e "const main = require('./lib/Main'); main.handleFullWorkflow({ rowNumber: 3, source: 'production' });"

Testing Commands

# Test suites
npm run test:all             # Complete test suite
npm run test:light           # Light test runner
npm run test:smoke           # Smoke tests only
npm run test:llm             # LLM connectivity tests
npm run test:content         # Content generation tests
npm run test:integration     # Integration tests
npm run test:systematic      # Systematic module testing
npm run test:basic           # Basic validation only

# Individual test categories
npm run test:ai-validation   # AI content validation
npm run test:dashboard       # Test dashboard server

# Comprehensive Integration Tests (NEW)
npm run test:comprehensive   # Exhaustive modular combinations testing
npm run test:modular         # Alias for comprehensive tests

# Production Ready Tests (NEW)
npm run test:production-workflow  # Complete production workflow tests (slow)
npm run test:production-quick     # Fast production workflow validation
npm run test:production-loop      # Complete production ready loop validation

Google Sheets Integration Tests

# Test personality loading
node -e "const {getPersonalities} = require('./lib/BrainConfig'); getPersonalities().then(p => console.log(\`\${p.length} personalities loaded\`));"

# Test CSV data loading
node -e "const {readInstructionsData} = require('./lib/BrainConfig'); readInstructionsData(2).then(d => console.log('Data:', d));"

# Test random personality selection  
node -e "const {selectPersonalityWithAI, getPersonalities} = require('./lib/BrainConfig'); getPersonalities().then(p => selectPersonalityWithAI('test', 'test', p)).then(r => console.log('Selected:', r.nom));"

LLM Connectivity Tests

node -e "require('./lib/LLMManager').testLLMManager()"         # Basic LLM connectivity
node -e "require('./lib/LLMManager').testLLMManagerComplete()" # Full LLM provider test suite

Complete System Test

node -e "
const main = require('./lib/Main');
const testData = {
  csvData: {
    mc0: 'plaque personnalisée',
    t0: 'Créer une plaque personnalisée unique',
    personality: { nom: 'Marc', style: 'professionnel' },
    tMinus1: 'décoration personnalisée',
    mcPlus1: 'plaque gravée,plaque métal,plaque bois,plaque acrylique',
    tPlus1: 'Plaque Gravée Premium,Plaque Métal Moderne,Plaque Bois Naturel,Plaque Acrylique Design'
  },
  xmlTemplate: Buffer.from(\`<?xml version='1.0' encoding='UTF-8'?>
<article>
  <h1>|Titre_Principal{{T0}}{Rédige un titre H1 accrocheur}|</h1>
  <intro>|Introduction{{MC0}}{Rédige une introduction engageante}|</intro>
</article>\`).toString('base64'),
  source: 'node_server_test'
};
main.handleFullWorkflow(testData);
"

Production Ready Loop Validation

# Complete production ready validation (recommended for CI/CD)
npm run test:production-loop

# This runs:
# 1. npm run test:basic           # Architecture validation
# 2. npm run test:production-quick # Google Sheets connectivity + core functions
# 3. Echo "✅ Production ready loop validated"

# Expected output:
# ✅ Architecture modulaire selective validée
# ✅ Architecture modulaire adversarial validée
# ✅ Google Sheets connectivity OK
# ✅ 15 personnalités chargées
# ✅ All core modules available
# 🎯 PRODUCTION READY LOOP ✅

Architecture Overview

Dual Mode System

The server operates in two mutually exclusive modes controlled by lib/modes/ModeManager.js:

  • MANUAL Mode (lib/modes/ManualServer.js): Web interface, API endpoints, WebSocket for real-time logs
  • AUTO Mode (lib/modes/AutoProcessor.js): Batch processing from Google Sheets without web interface

Flexible Pipeline System (NEW)

A modular architecture for building custom, reusable workflows with per-step configuration:

Components

  • Pipeline Builder (public/pipeline-builder.html): Visual drag-and-drop interface
  • Pipeline Runner (public/pipeline-runner.html): Execute saved pipelines with progress tracking
  • Pipeline Executor (lib/pipeline/PipelineExecutor.js): Execution engine
  • Pipeline Templates (lib/pipeline/PipelineTemplates.js): 10 predefined templates
  • Pipeline Definition (lib/pipeline/PipelineDefinition.js): Schemas & validation
  • Config Manager (lib/ConfigManager.js): Extended with pipeline CRUD operations

Key Features

  • Any module order: generation → selective → adversarial → human → pattern (fully customizable)
  • Multi-pass support: apply the same module multiple times with different intensities
  • Per-step configuration: mode, intensity (0.1-2.0), custom parameters
  • Checkpoint saving: optional checkpoints between steps for debugging
  • Template-based: start from 10 templates or build from scratch
  • Complete validation: real-time validation with detailed error messages
  • Duration estimation: estimate total execution time before running

Available Templates

  • minimal-test: 1 step (15s) - Quick testing
  • light-fast: 2 steps (35s) - Basic generation
  • standard-seo: 4 steps (75s) - Balanced protection
  • premium-seo: 6 steps (130s) - High quality + anti-detection
  • heavy-guard: 8 steps (180s) - Maximum protection
  • personality-focus: 4 steps (70s) - Enhanced personality style
  • fluidity-master: 4 steps (73s) - Natural transitions focus
  • adaptive-smart: 5 steps (105s) - Intelligent adaptive modes
  • gptzero-killer: 6 steps (155s) - GPTZero-specific bypass
  • originality-bypass: 6 steps (160s) - Originality.ai-specific bypass

API Endpoints

POST   /api/pipeline/save           # Save pipeline definition
GET    /api/pipeline/list           # List all saved pipelines
GET    /api/pipeline/:name          # Load specific pipeline
DELETE /api/pipeline/:name          # Delete pipeline
POST   /api/pipeline/execute        # Execute pipeline
GET    /api/pipeline/templates      # Get all templates
GET    /api/pipeline/templates/:name # Get specific template
GET    /api/pipeline/modules        # Get available modules
POST   /api/pipeline/validate       # Validate pipeline structure
POST   /api/pipeline/estimate       # Estimate duration/cost

Example Pipeline Definition

{
  name: "Custom Premium Pipeline",
  description: "Multi-pass anti-detection with personality focus",
  pipeline: [
    { step: 1, module: "generation", mode: "simple", intensity: 1.0 },
    { step: 2, module: "selective", mode: "fullEnhancement", intensity: 1.0 },
    { step: 3, module: "adversarial", mode: "heavy", intensity: 1.2,
      parameters: { detector: "gptZero", method: "regeneration" } },
    { step: 4, module: "human", mode: "personalityFocus", intensity: 1.5 },
    { step: 5, module: "pattern", mode: "syntaxFocus", intensity: 1.1 },
    { step: 6, module: "adversarial", mode: "adaptive", intensity: 1.3,
      parameters: { detector: "originality", method: "hybrid" } }
  ],
  metadata: {
    author: "user",
    created: "2025-10-08",
    version: "1.0",
    tags: ["premium", "multi-pass", "anti-detection"]
  }
}

Backward Compatibility

The flexible pipeline system coexists with the legacy modular workflow system:

  • New way: Use pipelineConfig parameter in handleFullWorkflow()
  • Old way: Use selectiveStack, adversarialMode, humanSimulationMode, patternBreakingMode
  • Both are fully supported and can be used interchangeably
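A sketch of the two call shapes side by side (parameter names are taken from this document; the dispatcher is illustrative, not the actual lib/Main.js logic):

```javascript
// New style: a single pipelineConfig describing the full workflow.
const newStyle = {
  csvData: { mc0: 'plaque personnalisée' },
  pipelineConfig: { name: 'standard-seo' },
};

// Legacy style: one option per modular layer.
const legacyStyle = {
  csvData: { mc0: 'plaque personnalisée' },
  selectiveStack: 'fullEnhancement',
  adversarialMode: 'standard',
  humanSimulationMode: 'lightSimulation',
  patternBreakingMode: 'syntaxFocus',
};

// An entry point could route on which shape it receives (illustrative only):
function resolveWorkflowStyle(options) {
  return options.pipelineConfig ? 'pipeline' : 'legacy';
}

console.log(resolveWorkflowStyle(newStyle));    // pipeline
console.log(resolveWorkflowStyle(legacyStyle)); // legacy
```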

Core Workflow Pipeline (lib/Main.js)

  1. Data Preparation - Read from Google Sheets (CSV data + XML templates)
  2. Element Extraction - Parse XML elements with embedded instructions
  3. Missing Keywords Generation - Auto-complete missing data using LLMs
  4. Direct Content Generation - Generate all content elements in parallel
  5. Multi-LLM Enhancement - 4-stage processing pipeline across different LLM providers
  6. Content Assembly - Inject generated content back into XML structure
  7. Organic Compilation & Storage - Save clean text to Google Sheets
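The seven stages above can be read as a left fold over a working document. The stubs below only illustrate the shape; the real implementations live in the lib/ modules:

```javascript
// Each stage takes the working document and returns an enriched copy.
// Stage names mirror the list above; bodies are placeholder stubs.
const stages = [
  ['dataPreparation', (doc) => ({ ...doc, csvData: { mc0: 'demo' } })],
  ['elementExtraction', (doc) => ({ ...doc, elements: ['h1', 'intro'] })],
  ['missingKeywords', (doc) => doc],
  ['contentGeneration', (doc) => doc],
  ['multiLlmEnhancement', (doc) => doc],
  ['contentAssembly', (doc) => doc],
  ['organicCompilation', (doc) => doc],
];

function runWorkflow(initial) {
  return stages.reduce(
    (doc, [name, stage]) => {
      const next = stage(doc);
      return { ...next, trace: [...doc.trace, name] }; // record stage order
    },
    { ...initial, trace: [] }
  );
}

const result = runWorkflow({});
console.log(result.trace.length); // 7
```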

Google Sheets Integration

  • Authentication: Via GOOGLE_SERVICE_ACCOUNT_EMAIL and GOOGLE_PRIVATE_KEY environment variables
  • Data Sources:
    • Instructions sheet: Columns A-I (slug, T0, MC0, T-1, L-1, MC+1, T+1, L+1, XML template)
    • Personnalites sheet: 15 AI personalities for content variety
    • Generated_Articles sheet: Final compiled text output with metadata

Multi-LLM Modular Enhancement System

Fully modular architecture with versioned saving:

Main Workflow (lib/Main.js)

  1. Data Preparation - Read from Google Sheets (CSV data + XML templates)
  2. Element Extraction - Parse XML elements with embedded instructions
  3. Missing Keywords Generation - Auto-complete missing data using LLMs
  4. Simple Generation - Generate base content with Claude
  5. Selective Enhancement - Configurable modular layers
  6. Adversarial Enhancement - Modular anti-detection
  7. Human Simulation - Realistic human errors
  8. Pattern Breaking - Breaking LLM-specific patterns
  9. Content Assembly & Storage - Final compilation with versioning

Available Modular Layers

  • 5 Selective Stacks: lightEnhancement → fullEnhancement → adaptive
  • 5 Adversarial Modes: none → light → standard → heavy → adaptive
  • 6 Human Simulation Modes: none → lightSimulation → personalityFocus → adaptive
  • 7 Pattern Breaking Modes: none → syntaxFocus → connectorsFocus → adaptive

Versioned Saving

  • v1.0: Initial Claude generation
  • v1.1: Post Selective Enhancement
  • v1.2: Post Adversarial Enhancement
  • v1.3: Post Human Simulation
  • v1.4: Post Pattern Breaking
  • v2.0: Final version
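As an illustration, the version trail behaves like an ordered label → content map (labels from the list above; the real storage layer persists each intermediate version of the article):

```javascript
// Illustrative version trail; Map preserves insertion order, so labels()
// returns versions in the order they were saved.
function makeVersionTrail() {
  const versions = new Map();
  return {
    save(label, content) { versions.set(label, content); },
    labels() { return [...versions.keys()]; },
    get(label) { return versions.get(label); },
  };
}

const trail = makeVersionTrail();
trail.save('v1.0', 'initial Claude draft');
trail.save('v1.1', 'after selective enhancement');
trail.save('v2.0', 'final version');
console.log(trail.labels().join(' -> ')); // v1.0 -> v1.1 -> v2.0
```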

Supported LLM providers: Claude, OpenAI, Gemini, Deepseek, Moonshot, Mistral

Exhaustive Integration Tests (New)

The exhaustive integration tests (npm run test:comprehensive) cover 22 complete modular combinations:

Selective Stacks Tested (5):

  • lightEnhancement: 1 OpenAI technical layer
  • standardEnhancement: 2 layers (OpenAI + Gemini)
  • fullEnhancement: 3 layers, full multi-LLM
  • personalityFocus: Mistral style prioritized
  • fluidityFocus: Gemini transitions prioritized

Adversarial Modes Tested (4):

  • general + regeneration: standard anti-detection
  • gptZero + regeneration: specialized anti-GPTZero
  • originality + hybrid: anti-Originality.ai
  • general + enhancement: gentle method

Combined Pipelines Tested (5):

  • Light → Adversarial
  • Standard → Adversarial Intense
  • Full → Multi-Adversarial
  • Personality → GPTZero
  • Fluidity → Originality

Performance & Intensity Tests (8):

  • Variable intensities (0.5 → 1.2)
  • Multiple methods (enhancement/regeneration/hybrid)
  • Full-pipeline benchmark with metrics

Personality System (lib/BrainConfig.js:265-340)

Random Selection Process:

  1. Load 15 personalities from Google Sheets
  2. Fisher-Yates shuffle for true randomness
  3. Select 60% (9 personalities) per generation
  4. AI chooses best match within random subset
  5. Temperature = 1.0 for maximum variability
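A minimal sketch of steps 2-3 (Fisher-Yates shuffle, then keep 60%); the AI matching in steps 4-5 is an LLM call inside lib/BrainConfig.js and is not shown:

```javascript
// Shuffle a copy with Fisher-Yates, then keep the first 60% of entries.
// With 15 personalities this yields a random subset of 9.
function selectRandomSubset(personalities, ratio = 0.6) {
  const pool = [...personalities];
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1)); // unbiased swap index
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, Math.ceil(pool.length * ratio));
}

const subset = selectRandomSubset(
  Array.from({ length: 15 }, (_, i) => `P${i + 1}`)
);
console.log(subset.length); // 9
```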

15 Available Personalities: Marc (technical), Sophie (decor), Laurent (commercial), Julie (architecture), Kévin (field), Amara (engineering), Mamadou (artisan), Émilie (digital), Pierre-Henri (heritage), Yasmine (greentech), Fabrice (metallurgy), Chloé (content), Linh (manufacturing), Minh (design), Thierry (creole)

Centralized Logging System (LogSh)

Architecture

  • All logging must go through logSh() function in lib/ErrorReporting.js
  • Multi-output streams: Console (formatted) + File (JSON) + WebSocket (real-time)
  • Never use console.* or other loggers directly
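The real logSh() signature lives in lib/ErrorReporting.js and is not documented here; the sketch below is a hypothetical shape, only to illustrate the fan-out to console, file, and WebSocket sinks:

```javascript
// Hypothetical multi-output logger — the actual logSh() API may differ.
// Each sink receives the same structured entry.
function makeLogger(outputs) {
  return function logSh(level, msg, meta = {}) {
    const entry = { ts: new Date().toISOString(), level, msg, ...meta };
    outputs.forEach((write) => write(entry)); // fan out to every sink
    return entry;
  };
}

// Example sink: collect JSON lines (stand-in for the file stream).
const lines = [];
const logSh = makeLogger([(e) => lines.push(JSON.stringify(e))]);
logSh('INFO', 'workflow started', { module: 'Main' });
console.log(lines.length); // 1
```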

Log Levels and Usage

  • TRACE: Hierarchical workflow execution with parameters (▶ ✔ ✖ symbols)
  • DEBUG: Detailed debugging information (visible in files with debug level)
  • INFO: Standard operational messages
  • WARN: Warning conditions
  • ERROR: Error conditions with stack traces

File Logging

  • Format: JSON structured logs in timestamped files
  • Location: logs/seo-generator-YYYY-MM-DD_HH-MM-SS.log
  • Flush behavior: Immediate flush on every log call to prevent buffer loss
  • Level: DEBUG and above (includes all TRACE logs)

Trace System

  • Hierarchical execution tracking: Using AsyncLocalStorage for span context
  • Function parameters: All tracer.run() calls include relevant parameters
  • Format: Function names with file prefixes (e.g., "Main.handleFullWorkflow()")
  • Performance timing: Start/end with duration measurements
  • Error handling: Automatic stack trace logging on failures

Log Consultation (LogViewer)

Logs are no longer written to console.log (too verbose). All events are recorded in logs/app.log in Pino JSONL format.

The tools/logViewer.js utility makes it easy to query this file:

# Show the last 200 lines, formatted
node tools/logViewer.js --pretty

# Search for a keyword in messages
node tools/logViewer.js --search --includes "Claude" --pretty

# Search by time range (all logs from September 2, 2025)
node tools/logViewer.js --since 2025-09-02T00:00:00Z --until 2025-09-02T23:59:59Z --pretty

# Filter by error level
node tools/logViewer.js --last 300 --level ERROR --pretty

Available filters:

  • --level: 30=INFO, 40=WARN, 50=ERROR (or INFO, WARN, ERROR)
  • --module: filter by path or module
  • --includes: keyword in msg
  • --regex: regular expression on msg
  • --since / --until: time bounds (ISO or YYYY-MM-DD)

Real-time Log Viewing

  • WebSocket server on port 8081
  • Auto-launched tools/logs-viewer.html in Edge browser
  • Features: Search, level filtering, scroll preservation

Key Components

lib/Main.js

Complete modular architecture - workflow orchestration with a configurable pipeline and versioned saving.

lib/selective-enhancement/

Modular Selective Layers:

  • SelectiveCore.js - Layer-by-layer application
  • SelectiveLayers.js - 5 predefined stacks + adaptive
  • TechnicalLayer.js - OpenAI technical enhancement
  • TransitionLayer.js - Gemini transition enhancement
  • StyleLayer.js - Mistral style enhancement
  • SelectiveUtils.js - Utilities + simple generation (replaces ContentGeneration.js)

lib/adversarial-generation/

Modular Anti-detection:

  • AdversarialCore.js - Main adversarial engine
  • AdversarialLayers.js - 5 configurable defense modes
  • DetectorStrategies.js - Interchangeable anti-detection strategies

lib/human-simulation/

Human Error Simulation:

  • HumanSimulationCore.js - Main simulation engine
  • HumanSimulationLayers.js - 6 simulation modes
  • FatiguePatterns.js - Realistic fatigue patterns
  • PersonalityErrors.js - Personality-specific errors
  • TemporalStyles.js - Temporal variations

lib/pattern-breaking/

LLM Pattern Breaking:

  • PatternBreakingCore.js - Pattern-breaking engine
  • PatternBreakingLayers.js - 7 breaking modes
  • LLMFingerprints.js - LLM fingerprint removal
  • SyntaxVariations.js - Syntactic variations
  • NaturalConnectors.js - Natural connectors

lib/post-processing/

Legacy post-processing (replaced by the modules above)

lib/LLMManager.js

Multi-LLM provider management with retry logic, rate limiting, and provider rotation.

lib/BrainConfig.js

Google Sheets integration, personality system, and random selection algorithms.

lib/ElementExtraction.js

XML parsing and element extraction with instruction parsing ({{variables}} vs {instructions}).
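For illustration, the element syntax visible in the Complete System Test template, |Name{{VARIABLE}}{instruction}|, can be parsed as below; the actual parsing in ElementExtraction.js may differ:

```javascript
// Double braces mark data variables ({{T0}}), single braces mark LLM
// instructions. Returns null when the string does not match the syntax.
function parseElement(raw) {
  const match = raw.match(/^\|(\w+)\{\{(\w+)\}\}\{([^}]*)\}\|$/);
  if (!match) return null;
  const [, name, variable, instruction] = match;
  return { name, variable, instruction };
}

const parsed = parseElement('|Titre_Principal{{T0}}{Write a catchy H1 title}|');
console.log(parsed.variable); // T0
```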

lib/ArticleStorage.js

Organic text compilation maintaining natural hierarchy and Google Sheets storage.

lib/ErrorReporting.js

Centralized logging system with hierarchical tracing and multi-output streams.

Environment Configuration

Required environment variables in .env:

# Google Sheets Integration
GOOGLE_SERVICE_ACCOUNT_EMAIL=your-service-account@project.iam.gserviceaccount.com
GOOGLE_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
GOOGLE_SHEETS_ID=your_sheets_id

# LLM API Keys
ANTHROPIC_API_KEY=your_anthropic_key
OPENAI_API_KEY=your_openai_key  
GOOGLE_API_KEY=your_google_key
DEEPSEEK_API_KEY=your_deepseek_key
MOONSHOT_API_KEY=your_moonshot_key
MISTRAL_API_KEY=your_mistral_key

# Optional Configuration
LOG_LEVEL=INFO
MAX_COST_PER_ARTICLE=1.00
SERVER_MODE=manual

Tools

Bundle Tool

node tools/pack-lib.cjs              # default → code.js
node tools/pack-lib.cjs --out out.js # custom output
node tools/pack-lib.cjs --order alpha
node tools/pack-lib.cjs --entry lib/test-manual.js

pack-lib.cjs creates a single code.js from all files in lib/. Each file is concatenated with an ASCII header showing its path. Imports/exports are kept, so the bundle is for reading/audit only, not execution.

Unused Code Audit

node tools/audit-unused.cjs # Report dead files and unused exports

Important Development Notes

  • 100% Modular Architecture: the old sequential system was removed; a backup lives in /backup/sequential-system/
  • Granular Configuration: each modular layer is independently configurable
  • Versioned Saving: v1.0 → v1.1 → v1.2 → v1.3 → v1.4 → v2.0 for full traceability
  • Compatibility Layer: Interface handleFullWorkflow() maintenue pour rétrocompatibilité
  • Personality system uses randomization: 60% of 15 personalities selected per generation run
  • All data sourced from Google Sheets: No hardcoded JSON files or static data
  • Default XML templates: Auto-generated when column I contains filenames
  • Organic compilation: Maintains natural text flow in final output
  • Temperature = 1.0: Ensures maximum variability in AI responses
  • Trace system: Uses AsyncLocalStorage for hierarchical execution tracking
  • 5/6 LLM providers operational: Gemini may be geo-blocked in some regions

Legacy → Modular Migration

  • Removed: lib/ContentGeneration.js + lib/generation/ (fixed sequential pipeline)
  • Replaced by: selective/adversarial/human-simulation/pattern-breaking modules
  • Benefit: full flexibility, adaptive stacks, possible parallelization

File Structure

  • server.js - Express server entry point with mode selection
  • lib/Main.js - Core workflow orchestration
  • lib/modes/ - Mode management (Manual/Auto)
  • lib/BrainConfig.js - Google Sheets integration + personality system
  • lib/LLMManager.js - Multi-LLM provider management
  • lib/selective-enhancement/, lib/adversarial-generation/, lib/human-simulation/, lib/pattern-breaking/ - Modular enhancement pipeline (replaces the removed lib/ContentGeneration.js)
  • lib/ElementExtraction.js - XML parsing and element extraction
  • lib/ArticleStorage.js - Content compilation and Google Sheets storage
  • lib/ErrorReporting.js - Centralized logging and error handling
  • tools/ - Development utilities (log viewer, bundler, audit)
  • tests/ - Comprehensive test suite with multiple categories
  • .env - Environment configuration (Google credentials, API keys)

Key Dependencies

  • googleapis - Google Sheets API integration
  • axios - HTTP client for LLM APIs
  • dotenv - Environment variable management
  • express - Web server framework
  • nodemailer - Email notifications (needs setup)

Workflow Sources

  • production - Real Google Sheets data processing
  • test_random_personality - Testing with personality randomization
  • node_server - Direct API processing
  • Legacy: make_com, digital_ocean_autonomous

Git Push Configuration

If a push fails with "Connection closed port 22", use SSH over port 443:

# Configure the remote for port 443
git remote set-url origin git@altssh.bitbucket.org:AlexisTrouve/seogeneratorserver.git

# Or configure ~/.ssh/config
Host bitbucket.org
    HostName altssh.bitbucket.org
    Port 443