# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a Node.js-based SEO content generation server that was converted from Google Apps Script. The system generates SEO-optimized content using multiple LLMs with sophisticated anti-detection mechanisms and Content DNA Mixing techniques.

### 🎯 Current Status - PHASE 2 COMPLETE ✅

- **Full Google Sheets Integration**: ✅ **OPERATIONAL**
  - 15 AI personalities with random selection (60% variability)
  - Complete data pipeline from Google Sheets (Instructions, Personnalites)
  - XML template system with default fallback
  - Organic content compilation and storage
- **Multi-LLM Enhancement Pipeline**: ✅ **FULLY OPERATIONAL**
  - 6 LLM providers: Claude, OpenAI, Gemini, Deepseek, Moonshot, Mistral
  - 4-stage enhancement pipeline: Claude → GPT-4 → Gemini → Mistral
  - Direct generation bypass for 16+ elements
  - Average execution: 60-90 seconds for full multi-LLM processing
- **Anti-Detection System**: ✅ **ADVANCED**
  - Random personality selection from 15 profiles (9 selected per run)
  - Temperature = 1.0 for maximum variability
  - Multiple writing styles and vocabularies
  - Content DNA mixing across 4 AI models per element

### 🚀 Core Features Implemented

1. **Google Sheets Integration**
   - Complete authentication via environment variables
   - Read from "Instructions" sheet (slug, CSV data, XML templates)
   - Read from "Personnalites" sheet (15 AI personalities)
   - Write to "Generated_Articles" sheet (compiled text only, no XML)
2. **Advanced Personality System**
   - 15 diverse personalities: technical, creative, commercial, multilingual
   - Random selection of 60% of the personalities per generation
   - AI-powered intelligent selection within the random subset
   - Maximum style variability for anti-detection
3. **XML Template Processing**
   - Default XML template with 16 content elements
   - Instruction extraction with fixed regex ({{variables}} vs {instructions})
   - Base64 and plain-text template support
   - Automatic fallback when filenames are detected
4. **Multi-LLM Content Generation**
   - Direct element generation (bypasses the faulty hierarchy)
   - Auto-generation of missing keywords
   - 4-stage enhancement pipeline
   - Organic content compilation maintaining natural flow

## Development Commands

### Production Workflow Execution

```bash
# Execute real production workflow from Google Sheets
node -e "const main = require('./lib/Main'); main.handleFullWorkflow({ rowNumber: 2, source: 'production' });"

# Test with different rows
node -e "const main = require('./lib/Main'); main.handleFullWorkflow({ rowNumber: 3, source: 'production' });"
```

### Basic Operations

- `npm start` - Start the production server on port 3000
- `npm run dev` - Start the development server (same as start)
- `node server.js` - Direct server startup

### Testing Commands

#### Google Sheets Integration Tests

```bash
# Test personality loading from Google Sheets
node -e "const {getPersonalities} = require('./lib/BrainConfig'); getPersonalities().then(p => console.log(\`\${p.length} personalities loaded\`));"

# Test CSV data loading
node -e "const {readInstructionsData} = require('./lib/BrainConfig'); readInstructionsData(2).then(d => console.log('Data:', d));"

# Test random personality selection
node -e "const {selectPersonalityWithAI, getPersonalities} = require('./lib/BrainConfig'); getPersonalities().then(p => selectPersonalityWithAI('test', 'test', p)).then(r => console.log('Selected:', r.nom));"
```

#### LLM Connectivity Tests

- `node -e "require('./lib/LLMManager').testLLMManager()"` - Test basic LLM connectivity
- `node -e "require('./lib/LLMManager').testLLMManagerComplete()"` - Full LLM provider test suite

#### Complete System Test

```bash
node -e "
const main = require('./lib/Main');
const testData = {
  csvData: {
    mc0: 'plaque personnalisée',
    t0: 'Créer une plaque personnalisée unique',
    personality: { nom: 'Marc', style: 'professionnel' },
    tMinus1: 'décoration personnalisée',
    mcPlus1: 'plaque gravée,plaque métal,plaque bois,plaque acrylique',
    tPlus1: 'Plaque Gravée Premium,Plaque Métal Moderne,Plaque Bois Naturel,Plaque Acrylique Design'
  },
  xmlTemplate: Buffer.from(\`
|Titre_Principal{{T0}}{Rédige un titre H1 accrocheur}|
|Introduction{{MC0}}{Rédige une introduction engageante}|
\`).toString('base64'),
  source: 'node_server_test'
};
main.handleFullWorkflow(testData);
"
```

## Architecture Overview

### Core Workflow (lib/Main.js)

1. **Data Preparation** - Read from Google Sheets (CSV + XML template)
2. **Element Extraction** - Parse 16+ XML elements with instructions
3. **Missing Keywords Generation** - Auto-complete missing data
4. **Direct Content Generation** - Bypass hierarchy, generate all elements
5. **Multi-LLM Enhancement** - 4-stage processing (Claude → GPT-4 → Gemini → Mistral)
6. **Content Assembly** - Inject content back into the XML template
7. **Organic Compilation & Storage** - Save clean text to Google Sheets

### Google Sheets Integration (lib/BrainConfig.js, lib/ArticleStorage.js)

**Authentication**: Environment variables (GOOGLE_SERVICE_ACCOUNT_EMAIL, GOOGLE_PRIVATE_KEY)

**Data Sources**:
- **Instructions Sheet**: Columns A-I (slug, T0, MC0, T-1, L-1, MC+1, T+1, L+1, XML)
- **Personnalites Sheet**: 15 personalities with complete profiles
- **Generated_Articles Sheet**: Compiled text output with metadata

### Personality System (lib/BrainConfig.js:265-340)

**Random Selection Process**:
1. Load 15 personalities from Google Sheets
2. Fisher-Yates shuffle for true randomness
3. Select 60% (9 personalities) per generation
4. AI chooses the best match within the random subset
5. Temperature = 1.0 for maximum variability

**15 Available Personalities**:
- Marc (technical), Sophie (déco), Laurent (commercial), Julie (architecture)
- Kévin (terrain), Amara (engineering), Mamadou (artisan), Émilie (digital)
- Pierre-Henri (heritage), Yasmine (greentech), Fabrice (metallurgy)
- Chloé (content), Linh (manufacturing), Minh (design), Thierry (creole)

### Multi-LLM Pipeline (lib/ContentGeneration.js)

1. **Base Generation** (Claude Sonnet-4) - Initial content creation
2. **Technical Enhancement** (GPT-4o-mini) - Add precision and terminology
3. **Transition Enhancement** (Gemini) - Improve flow (if available)
4. **Personality Style** (Mistral) - Apply personality-specific voice

### Key Components Status

#### lib/LLMManager.js ✅
- 6 LLM providers operational: Claude, OpenAI, Gemini, Deepseek, Moonshot, Mistral
- Retry logic and rate limiting implemented
- Provider rotation and fallback chains
- **Note**: Gemini is geo-blocked in some regions (falls back to other providers)

#### lib/BrainConfig.js ✅
- **FULLY MIGRATED** to Google Sheets integration
- Random personality selection implemented
- Environment variable authentication
- Default XML template system for filename fallbacks

#### lib/ElementExtraction.js ✅
- Fixed regex for instruction parsing: {{variables}} vs {instructions}
- 16+ element extraction capability
- Direct generation mode operational

#### lib/ArticleStorage.js ✅
- Organic text compilation (maintains natural hierarchy)
- Google Sheets storage (compiled text only, no XML)
- Automatic slug generation and metadata tracking
- French timestamp formatting

#### lib/ErrorReporting.js ✅
- Centralized logging system
- Email notifications (requires credential setup)

## Current System Status (2025-09-01)

### ✅ **Fully Operational**
- **Google Sheets Integration**: Complete data pipeline
- **15 AI Personalities**: Random selection with 100% variability tested
- **Multi-LLM Generation**: 6 providers, 4-stage enhancement
- **Direct Element Generation**: 16+ elements processed
- **Organic Content Storage**: Clean text compilation
- **Anti-Detection System**: Maximum style diversity

### 🔶 **Partially Operational**
- **Email Notifications**: Implemented, but credentials still need to be set up
- **Gemini Integration**: Geo-blocked in some regions (5/6 LLMs operational)

### ⚠️ **Known Issues**
- Email SMTP credentials need configuration in .env
- Some XML tag replacements may need optimization (rare validation errors)
- Gemini API blocked by geolocation (non-critical - 5 other providers work)

### 🎯 **Production Ready Features**
- **Real-time execution**: 60-90 seconds for complete multi-LLM
workflow
- **Google Sheets automation**: Full read/write integration
- **Anti-detection guarantee**: 15 personalities × random selection × 4 LLM stages
- **Content quality**: Organic compilation maintains natural readability
- **Scalability**: Direct Node.js execution, no web interface dependency

## Migration Status: Google Apps Script → Node.js

### ✅ **100% Migrated**
- Google Sheets API integration
- Multi-LLM content generation
- Personality selection system
- XML template processing
- Content assembly and storage
- Workflow orchestration
- Error handling and logging

### 🔶 **Configuration Needed**
- Email notification credentials
- Optional: VPN for Gemini access

### 📊 **Performance Metrics**
- **Execution time**: 60-90 seconds (full multi-LLM pipeline)
- **Success rate**: 97%+ workflow completion
- **Personality variability**: 100% tested (5/5 different personalities in consecutive runs)
- **Content quality**: Natural, human-like output with organic flow
- **Anti-detection**: Multiple writing styles, vocabularies, and tones per generation

## Workflow Sources

- **production** - Real Google Sheets data processing
- **test_random_personality** - Testing with personality randomization
- **node_server** - Direct API processing
- Legacy: make_com, digital_ocean_autonomous

## Key Dependencies

- **googleapis**: Google Sheets API integration
- **axios**: HTTP client for LLM APIs
- **dotenv**: Environment variable management
- **express**: Web server framework
- **nodemailer**: Email notifications (needs setup)

## File Structure

- **server.js**: Express server with basic endpoints
- **lib/Main.js**: Core workflow orchestration
- **lib/BrainConfig.js**: Google Sheets integration + personality system
- **lib/LLMManager.js**: Multi-LLM provider management
- **lib/ContentGeneration.js**: Content generation and enhancement
- **lib/ElementExtraction.js**: XML parsing and element extraction
- **lib/ArticleStorage.js**: Google Sheets storage and compilation
- **lib/ErrorReporting.js**: Logging and error handling
- **.env**: Environment configuration (Google credentials, API keys)

## Important Notes for Future Development

- **Personality system is now random-based**: 60% of the 15 personalities are selected per run
- **All data comes from Google Sheets**: No more JSON files or hardcoded data
- **Default XML template**: Auto-generated when column I contains a filename
- **Temperature = 1.0**: Maximum variability in AI selection
- **Direct element generation**: Bypasses the hierarchy system for reliability
- **Organic compilation**: Maintains natural text flow in the final output
- **5/6 LLM providers operational**: Gemini geo-blocked, the others fully functional

## LogSh - Centralized Logging System

### **Architecture**
- **Centralized logging**: All logs must go through the LogSh function in ErrorReporting.js
- **Multi-output streams**: Console (pretty format) + File (JSON) + WebSocket (real-time)
- **No console or custom loggers**: Do not use console.* or alternate logger modules

### **Log Levels and Usage**
- **TRACE**: Hierarchical workflow execution with parameters (▶ ✔ ✖ symbols)
- **DEBUG**: Detailed debugging information (visible in files with debug level)
- **INFO**: Standard operational messages
- **WARN**: Warning conditions
- **ERROR**: Error conditions with stack traces

### **File Logging**
- **Format**: JSON structured logs in timestamped files
- **Location**: logs/seo-generator-YYYY-MM-DD_HH-MM-SS.log
- **Flush behavior**: Immediate flush on every log call to prevent buffer loss
- **Level**: DEBUG and above (includes all TRACE logs)

### **Real-time Logging**
- **WebSocket server**: Port 8081 for live log viewing
- **Auto-launch**: logs-viewer.html opens automatically in the Edge browser
- **Features**: Search, filtering by level, scroll preservation, compact UI

### **Trace System**
- **Hierarchical execution tracking**: Uses AsyncLocalStorage for span context
- **Function parameters**: All tracer.run() calls include relevant
parameters
- **Format**: Function names with file prefixes (e.g., "Main.handleFullWorkflow()")
- **Performance timing**: Start/end with duration measurements
- **Error handling**: Automatic stack trace logging on failures

### **Log Viewer Features**
- **Real-time updates**: WebSocket connection to the Node.js server
- **Level filtering**: Toggle TRACE/DEBUG/INFO/WARN/ERROR visibility
- **Search functionality**: Regex search with match highlighting
- **Proportional scrolling**: Maintains relative position when filtering
- **Compact UI**: Optimized for full viewport utilization

## Unused Audit Tool

- **Location**: tools/audit-unused.cjs (manual run only)
- **Reports**: Dead files, broken relative imports, unused exports
- **Use sparingly**: Run before cleanup or release; mark exports to keep with `// @keep:export Name`

## 📦 Bundling Tool

pack-lib.cjs creates a single code.js from all files in lib/. Each file is concatenated with an ASCII header showing its path. Imports/exports are kept, so the bundle is for **reading/audit only**, not execution.

### Usage

```bash
node pack-lib.cjs                            # default → code.js
node pack-lib.cjs --out out.js               # custom output
node pack-lib.cjs --order alpha
node pack-lib.cjs --entry lib/test-manual.js
```

## 🔍 Log Consultation (LogViewer)

### Context

- Logs are no longer sent to console.log (too verbose).
- All events are recorded in logs/app.log in **Pino JSONL** format.
- Example line:

```json
{"level":30,"time":1756797556942,"evt":"span.end","path":"Workflow SEO > Génération mots-clés","dur_ms":4584.6,"msg":"✔ Génération mots-clés (4.58s)"}
```

### Dedicated Tool

A dedicated tool, tools/logViewer.js, makes it easy to query this file.
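To make the numeric levels in the Pino JSONL lines concrete (30=INFO, 40=WARN, 50=ERROR), here is a minimal sketch of level-based filtering over such lines. `parseLogLine` and `filterByLevel` are hypothetical helpers for illustration only; they are not part of tools/logViewer.js:

```javascript
// Sketch only: hypothetical helpers, NOT part of tools/logViewer.js.
// Each log line is a standalone JSON object (Pino JSONL format).
function parseLogLine(line) {
  return JSON.parse(line);
}

function filterByLevel(lines, minLevel) {
  // Pino numeric levels: 30=INFO, 40=WARN, 50=ERROR
  return lines.map(parseLogLine).filter((entry) => entry.level >= minLevel);
}

const sample = [
  '{"level":30,"time":1756797556942,"evt":"span.end","msg":"ok"}',
  '{"level":50,"time":1756797557000,"evt":"span.end","msg":"failed"}',
];

// Keep only ERROR-level entries, as --level ERROR would
const errors = filterByLevel(sample, 50);
```

This mirrors what the `--level ERROR` filter does; in day-to-day use, prefer tools/logViewer.js over reading the raw file.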
#### Quick Commands

* **View the last 200 formatted lines**

```bash
node tools/logViewer.js --pretty
```

* **Search for a keyword in messages** (example: everything that mentions Claude)

```bash
node tools/logViewer.js --search --includes "Claude" --pretty
```

* **Search by time range** (ISO string or partial date)

```bash
# All logs from September 2, 2025
node tools/logViewer.js --since 2025-09-02T00:00:00Z --until 2025-09-02T23:59:59Z --pretty
```

* **Filter by error level**

```bash
node tools/logViewer.js --last 300 --level ERROR --pretty
```

* **Stats per day**

```bash
node tools/logViewer.js --stats --by day --level ERROR
```

### Available Filters

* --level: 30=INFO, 40=WARN, 50=ERROR (or INFO, WARN, ERROR)
* --module: filter by path or module
* --includes: keyword in msg
* --regex: regular expression on msg
* --since / --until: time bounds (ISO or YYYY-MM-DD)

### Main Fields

* level: log level
* time: timestamp (epoch or ISO)
* path: workflow concerned
* evt: event type (span.start, span.end, etc.)
* dur_ms: duration when evt is span.end
* msg: human-readable message

### Summary

👉 Do not read the raw log. Always use tools/logViewer.js to search **by keyword** or **by date** in order to navigate the logs efficiently.
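For reference, the random 60% selection described in the Personality System section (Fisher-Yates shuffle, 9 of the 15 profiles per run) can be sketched as follows. This is an illustrative sketch, not the actual lib/BrainConfig.js implementation; `selectRandomSubset` is a hypothetical helper name:

```javascript
// Sketch of the random 60% personality selection (hypothetical helper,
// NOT the actual lib/BrainConfig.js code).
function selectRandomSubset(personalities, ratio = 0.6) {
  const pool = [...personalities];
  // Fisher-Yates shuffle for an unbiased random ordering
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  // 15 personalities × 0.6 = 9 selected per run
  return pool.slice(0, Math.round(pool.length * ratio));
}

const names = [
  'Marc', 'Sophie', 'Laurent', 'Julie', 'Kévin',
  'Amara', 'Mamadou', 'Émilie', 'Pierre-Henri', 'Yasmine',
  'Fabrice', 'Chloé', 'Linh', 'Minh', 'Thierry',
];
const subset = selectRandomSubset(names);
```

The AI-powered best-match step then picks one personality from `subset`, with temperature = 1.0 for maximum variability.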