3 Commits

Author SHA1 Message Date
3751ab047b feat(keywords): Add hierarchical context to missing keywords prompt and fix LLM response format
This commit improves keyword generation by providing hierarchical context for each element and fixing the LLM response format parsing.

Changes:
1. lib/MissingKeywords.js:
   - Add buildHierarchicalContext() to generate compact contextual info for each element
   - Display hierarchy in prompt (e.g., "H2 existants: 'Titre1', 'Titre2'")
   - For Txt elements: show associated MC keyword + parent title
   - For FAQ elements: count existing FAQs
   - Fix LLM response format by providing 3 concrete examples from actual list
   - Add explicit warning to use exact tag names [Titre_H2_3], [Txt_H2_6]
   - Improve getElementContext() to better retrieve hierarchical elements

2. lib/selective-enhancement/SelectiveUtils.js:
   - Fix createTypedPrompt() to use specific keyword from resolvedContent
   - Remove fallback to csvData.mc0 (log error if no specific keyword)

3. lib/pipeline/PipelineExecutor.js:
   - Integrate generateMissingSheetVariables() as "Étape 0" (step 0) before extraction
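
The SelectiveUtils.js change in item 2 (use the element's own keyword, log an error instead of falling back to csvData.mc0) could look roughly like the sketch below. The helper names `resolveKeyword` and `createTypedPromptSketch`, the shape of `resolvedContent`, and the prompt wording are all assumptions made for illustration; they are not taken from the actual file.

```javascript
// Hypothetical sketch: names and data shapes are assumptions, not the real SelectiveUtils.js.
// resolvedContent is assumed to map a tag name to an object carrying its own keyword.
function resolveKeyword(tag, resolvedContent, logger = console) {
  const keyword = resolvedContent?.[tag]?.keyword;
  if (typeof keyword === 'string' && keyword.trim() !== '') {
    return keyword.trim();
  }
  // Deliberately no fallback to a global keyword (such as csvData.mc0):
  // report the problem and let the caller decide what to do.
  logger.error(`No specific keyword found for ${tag}`);
  return null;
}

// Builds a prompt only when a specific keyword exists for the element.
function createTypedPromptSketch(tag, type, resolvedContent, logger = console) {
  const keyword = resolveKeyword(tag, resolvedContent, logger);
  if (keyword === null) return null;
  return `Write a ${type} for [${tag}] about "${keyword}".`;
}
```

Returning null rather than a default keyword keeps a missing value visible in the logs instead of silently producing content about the wrong topic.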

Prompt format now:
  1. [Titre_H2_3] (titre) — H2 existants: "Titre1", "Titre2"
  2. [Txt_H2_6] (texte) — MC: "Plaque dibond" | Parent: "Guide dibond"
  3. [Faq_q_1] (question) — 3 FAQ existantes
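
As an illustration of how lines in this format might be produced, here is a minimal sketch. The element shape (`tag`, `kind`, `siblings`, `keyword`, `parentTitle`, `faqCount`) and the function names are hypothetical; the real buildHierarchicalContext() in lib/MissingKeywords.js may work differently.

```javascript
// Hypothetical sketch: the element shape and function names are assumptions.
const SEP = ' \u2014 '; // separator between the tag header and its context, as in the format above

// Returns the compact context string for one element, or '' when nothing is known.
function buildContextSketch(el) {
  switch (el.kind) {
    case 'titre': {
      const titles = (el.siblings ?? []).map((t) => `"${t}"`).join(', ');
      return titles ? `H2 existants: ${titles}` : '';
    }
    case 'texte': {
      const parts = [];
      if (el.keyword) parts.push(`MC: "${el.keyword}"`);
      if (el.parentTitle) parts.push(`Parent: "${el.parentTitle}"`);
      return parts.join(' | ');
    }
    case 'question':
      return `${el.faqCount ?? 0} FAQ existantes`;
    default:
      return '';
  }
}

// Numbers each element and appends its context when one exists.
function formatPromptLines(elements) {
  return elements.map((el, i) => {
    const head = `${i + 1}. [${el.tag}] (${el.kind})`;
    const context = buildContextSketch(el);
    return context ? head + SEP + context : head;
  });
}
```

Calling `formatPromptLines` with three elements shaped like the ones above (a title with two sibling H2 titles, a text with a keyword and parent title, and a question with `faqCount: 3`) yields the three lines shown in the prompt format.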

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-12 14:51:01 +08:00
b2fe9e0b7b Fix production workflow with Digital Ocean XML and Google Sheets format
Major fixes:
- Digital Ocean: Fetch the real XML from /wp-content/XML/ (86k chars instead of the 1k mock)
- Tag cleanup: Strip <strong> in extractElements() to avoid parsing errors
- Resilient duplicates: Tolerate duplicates in the XML while validating that tags are unique
- Full hierarchy: StepExecutor generates 36 elements from hierarchy.originalElement.name
- Google Sheets format: Adapt the columns according to useVersionedSheet (17 legacy vs 21 versioned)
- Google Sheets range: Force A1 with INSERT_ROWS to avoid the offset into U:AO
- Optimized xmlTemplate: Exclude the JSON metadata to stay within the 50k-char limit
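
The <strong> cleanup and the duplicate handling from the list above can be sketched as follows. This is an assumption-laden illustration: `stripStrongTags` and `dedupeByName` are hypothetical helpers, and elements are assumed to be objects with a `name` field; the actual extractElements() may differ.

```javascript
// Hypothetical sketch: helper names and the { name } element shape are assumptions.

// Removes opening and closing <strong> tags (with or without attributes), keeping inner text.
function stripStrongTags(text) {
  return text.replace(/<\/?strong\b[^>]*>/gi, '');
}

// Keeps the first occurrence of each cleaned name and reports the duplicates
// instead of failing, so repeated tags in the XML do not abort the run.
function dedupeByName(elements) {
  const seen = new Set();
  const unique = [];
  const duplicates = [];
  for (const el of elements) {
    const name = stripStrongTags(el.name).trim();
    if (seen.has(name)) {
      duplicates.push(name);
      continue;
    }
    seen.add(name);
    unique.push({ ...el, name });
  }
  return { unique, duplicates };
}
```

Cleaning names before comparing them means a tag wrapped in <strong> and the same tag without it count as one element.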

Result: 2151 words, 36 elements, saved correctly to columns A-Q
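
For the Google Sheets part, a rough sketch of how the append parameters might be assembled is given below. `insertDataOption: 'INSERT_ROWS'` and the A1 range come from the commit message; the function name `buildAppendParams`, the choice of `valueInputOption: 'USER_ENTERED'`, the padding behaviour, and the assumption that the result is passed to the Sheets API `spreadsheets.values.append` method are illustrative guesses rather than the project's actual code.

```javascript
// Hypothetical sketch: buildAppendParams and its defaults are assumptions.
const LEGACY_COLUMNS = 17;    // columns A through Q
const VERSIONED_COLUMNS = 21; // columns A through U

// Pads or truncates the row to the expected width and anchors the append at A1,
// so new rows are inserted starting in column A rather than further right.
function buildAppendParams(spreadsheetId, sheetName, row, useVersionedSheet) {
  const width = useVersionedSheet ? VERSIONED_COLUMNS : LEGACY_COLUMNS;
  const values = Array.from({ length: width }, (_, i) => row[i] ?? '');
  return {
    spreadsheetId,
    range: `${sheetName}!A1`,
    valueInputOption: 'USER_ENTERED', // assumed; 'RAW' is the other common choice
    insertDataOption: 'INSERT_ROWS',
    requestBody: { values: [values] },
  };
}
```

With the googleapis Node client, an object like this would typically be handed to `sheets.spreadsheets.values.append(...)`; older client versions expect `resource` in place of `requestBody`.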

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-08 14:52:19 +08:00
239a7161c2 Initial commit 2025-09-03 15:29:19 +08:00