# 🏗️ ULTRA-MODULAR ARCHITECTURE - SourceFinder

*Modular, free, full-LLM version with interchangeable components*

---

## 🎯 **Architectural principle**

**Golden rule**: Every component implements a strict interface and can be replaced without affecting the others.

```javascript
// ❌ Tight coupling (bad)
const mongodb = require('mongodb');
const puppeteer = require('puppeteer');

class TightlyCoupledNewsService {
  async search() {
    const db = await mongodb.connect(); // Coupled to MongoDB
    const browser = await puppeteer.launch(); // Coupled to Puppeteer
  }
}

// ✅ Modular architecture (good)
class NewsService {
  constructor(stockRepo, newsProvider, scorer) {
    this.stock = stockRepo; // Interface IStockRepository
    this.provider = newsProvider; // Interface INewsProvider
    this.scorer = scorer; // Interface IScoringEngine
  }
}
```

---

## 🔌 **Core interfaces**

### **INewsProvider** - News provider

```javascript
// src/interfaces/INewsProvider.js
class INewsProvider {
  /**
   * Search for news matching the given criteria
   * @param {SearchQuery} query - Search criteria
   * @returns {Promise<NewsItem[]>} - Articles found
   */
  async searchNews(query) {
    throw new Error('Must implement searchNews()');
  }

  /**
   * Validate search results
   * @param {NewsItem[]} results - Articles to validate
   * @returns {Promise<NewsItem[]>} - Validated articles
   */
  async validateResults(results) {
    throw new Error('Must implement validateResults()');
  }

  /**
   * Provider metadata
   * @returns {ProviderMetadata} - Provider information
   */
  getMetadata() {
    throw new Error('Must implement getMetadata()');
  }
}

// Type shapes (documentation only)
const SearchQuery = {
  raceCode: String, // "352-1"
  keywords: [String], // ["health", "behavior"]
  maxAge: Number, // Days
  sources: [String], // ["premium", "standard"]
  limit: Number // Maximum number of results
};

const NewsItem = {
  id: String,
  title: String,
  content: String,
  url: String,
  publishDate: Date,
  sourceType: String, // "premium", "standard", "fallback"
  sourceDomain: String,
  metadata: Object
};

// Exported so implementations can `require` the base class
module.exports = { INewsProvider, SearchQuery, NewsItem };
```

### **IStockRepository** - Article storage
```javascript
// src/interfaces/IStockRepository.js
class IStockRepository {
  async save(newsItem) {
    throw new Error('Must implement save()');
  }

  async findByRaceCode(raceCode, options = {}) {
    throw new Error('Must implement findByRaceCode()');
  }

  async findByScore(minScore, options = {}) {
    throw new Error('Must implement findByScore()');
  }

  async updateUsage(id, usageData) {
    throw new Error('Must implement updateUsage()');
  }

  async cleanup(criteria) {
    throw new Error('Must implement cleanup()');
  }

  async getStats() {
    throw new Error('Must implement getStats()');
  }
}

// Exported so implementations can `require` the base class
module.exports = { IStockRepository };
```

### **IScoringEngine** - Scoring engine

```javascript
// src/interfaces/IScoringEngine.js
class IScoringEngine {
  async scoreArticle(article, context) {
    throw new Error('Must implement scoreArticle()');
  }

  async batchScore(articles, context) {
    throw new Error('Must implement batchScore()');
  }

  getWeights() {
    throw new Error('Must implement getWeights()');
  }
}

// Exported so implementations can `require` the base class
module.exports = { IScoringEngine };
```

---

## 🧠 **LLM implementation (default)**

### **LLMNewsProvider** - Search via LLM

```javascript
// src/implementations/providers/LLMNewsProvider.js
const { INewsProvider } = require('../../interfaces/INewsProvider');
const OpenAI = require('openai');

class LLMNewsProvider extends INewsProvider {
  constructor(config) {
    super();
    this.openai = new OpenAI({ apiKey: config.apiKey });
    this.model = config.model || 'gpt-4o-mini';
    this.maxTokens = config.maxTokens || 2000;
  }

  async searchNews(query) {
    const prompt = this.buildSearchPrompt(query);

    const response = await this.openai.chat.completions.create({
      model: this.model,
      messages: [{ role: 'user', content: prompt }],
      max_tokens: this.maxTokens,
      temperature: 0.3
    });

    return this.parseResults(response.choices[0].message.content);
  }

  buildSearchPrompt(query) {
    return `
Specialized dog news search:

Target breed: ${query.raceCode} (FCI code)
Keywords: ${query.keywords.join(', ')}
Period: last ${query.maxAge} days
Preferred sources: ${query.sources.join(', ')}

Find ${query.limit}
recent, relevant articles.

Return ONLY valid JSON:
[
  {
    "title": "Article title",
    "content": "200-word summary",
    "url": "https://source.com/article",
    "publishDate": "2025-09-15",
    "sourceType": "premium|standard|fallback",
    "sourceDomain": "example.com",
    "metadata": {
      "relevanceScore": 0.9,
      "specialization": "health|behavior|legislation|general"
    }
  }
]
`;
  }

  async parseResults(response) {
    try {
      const results = JSON.parse(response);
      // The model may return valid JSON that is not an array
      if (!Array.isArray(results)) return [];

      return results.map(item => ({
        ...item,
        id: require('uuid').v4(),
        publishDate: new Date(item.publishDate),
        extractedAt: new Date()
      }));
    } catch (error) {
      console.error('Failed to parse LLM response:', error);
      return [];
    }
  }

  async validateResults(results) {
    // Guard against prompt injection in LLM results
    return results.filter(result => {
      return this.isValidContent(result.content) &&
        this.isValidUrl(result.url) &&
        this.isRecentEnough(result.publishDate);
    });
  }

  // The three helpers below are minimal placeholder checks (assumptions: the
  // original design calls them but does not define them).
  isValidContent(content) {
    return typeof content === 'string' && content.trim().length > 0;
  }

  isValidUrl(url) {
    try {
      const { protocol } = new URL(url);
      return protocol === 'http:' || protocol === 'https:';
    } catch {
      return false;
    }
  }

  isRecentEnough(publishDate, maxAgeDays = 30) {
    const ageMs = Date.now() - new Date(publishDate).getTime();
    return Number.isFinite(ageMs) && ageMs <= maxAgeDays * 24 * 60 * 60 * 1000;
  }

  getMetadata() {
    return {
      type: 'llm',
      provider: 'openai',
      model: this.model,
      capabilities: ['search', 'summarize', 'validate'],
      costPerRequest: 0.02,
      avgResponseTime: 3000
    };
  }
}

module.exports = LLMNewsProvider;
```

---

## 💾 **JSON implementation (default)**

### **JSONStockRepository** - JSON file storage

```javascript
// src/implementations/storage/JSONStockRepository.js
const { IStockRepository } = require('../../interfaces/IStockRepository');
const fs = require('fs').promises;
const path = require('path');

class JSONStockRepository extends IStockRepository {
  constructor(config) {
    super();
    this.dataPath = config.dataPath || './data/stock';
    this.indexPath = path.join(this.dataPath, 'index.json');
    this.memoryIndex = new Map(); // Performance cache
    this.initialized = false;
  }

  async init() {
    if (this.initialized) return;

    await fs.mkdir(this.dataPath, { recursive: true });

    try {
      const indexData = await fs.readFile(this.indexPath, 'utf8');
      const index = JSON.parse(indexData);

      // Load the index into memory
      for (const [key, value] of Object.entries(index)) {
        this.memoryIndex.set(key, value);
      }
    } catch (error) {
      // Create a new index if none exists
      await this.saveIndex();
    }

    this.initialized = true;
  }

  async save(newsItem) {
    await this.init();

    const id = newsItem.id || require('uuid').v4();
    const filePath = path.join(this.dataPath, `${id}.json`);
    const stored = { ...newsItem, id };

    // Save the article, including the id when it was generated here
    await fs.writeFile(filePath, JSON.stringify(stored, null, 2));

    // Update the index
    this.memoryIndex.set(id, {
      id,
      raceCode: newsItem.raceCode,
      sourceType: newsItem.sourceType,
      finalScore: newsItem.finalScore,
      publishDate: newsItem.publishDate,
      usageCount: newsItem.usageCount || 0,
      lastUsed: newsItem.lastUsed,
      filePath
    });

    await this.saveIndex();
    return stored;
  }

  async findByRaceCode(raceCode, options = {}) {
    await this.init();

    const results = [];
    for (const [id, indexEntry] of this.memoryIndex.entries()) {
      if (indexEntry.raceCode === raceCode) {
        // Articles that have not been scored yet are not filtered out here
        if (options.minScore && indexEntry.finalScore < options.minScore) {
          continue;
        }

        const article = await this.loadArticle(id);
        results.push(article);
      }
    }

    return this.sortAndLimit(results, options);
  }

  async findByScore(minScore, options = {}) {
    await this.init();

    const results = [];
    for (const [id, indexEntry] of this.memoryIndex.entries()) {
      if (indexEntry.finalScore >= minScore) {
        const article = await this.loadArticle(id);
        results.push(article);
      }
    }

    return this.sortAndLimit(results, options);
  }

  async loadArticle(id) {
    const indexEntry = this.memoryIndex.get(id);
    if (!indexEntry) return null;

    const data = await fs.readFile(indexEntry.filePath, 'utf8');
    return JSON.parse(data);
  }

  // Minimal implementation (an assumption: the original leaves updateUsage()
  // unimplemented, although NewsSearchService.trackUsage() calls it)
  async updateUsage(id, usageData) {
    await this.init();

    const article = await this.loadArticle(id);
    if (!article) return null;

    return this.save({ ...article, ...usageData });
  }

  async saveIndex() {
    const indexObj = Object.fromEntries(this.memoryIndex);
    await fs.writeFile(this.indexPath, JSON.stringify(indexObj, null, 2));
  }

  sortAndLimit(results, options) {
    // Treat unscored articles as 0 so the comparison never yields NaN
    let sorted = results.sort((a, b) => (b.finalScore ?? 0) - (a.finalScore ?? 0));

    if (options.limit) {
      sorted = sorted.slice(0, options.limit);
    }

    return sorted;
  }

  async getStats() {
    await this.init();

    const stats = {
      totalArticles: this.memoryIndex.size,
      bySourceType: {},
      byRaceCode: {},
      avgScore: 0
    };

    let totalScore = 0;
    for (const entry of this.memoryIndex.values()) {
      // Count by source type
      stats.bySourceType[entry.sourceType] =
        (stats.bySourceType[entry.sourceType] || 0) + 1;

      // Count by breed code
      stats.byRaceCode[entry.raceCode] =
        (stats.byRaceCode[entry.raceCode] || 0) + 1;

      totalScore += entry.finalScore || 0;
    }

    stats.avgScore = stats.totalArticles > 0 ? totalScore / stats.totalArticles : 0;

    return stats;
  }
}

module.exports = JSONStockRepository;
```

---

## 🎯 **Dependency injection container**

### **Dependency Injection Container**

```javascript
// src/container.js
const LLMNewsProvider = require('./implementations/providers/LLMNewsProvider');
const JSONStockRepository = require('./implementations/storage/JSONStockRepository');
const BasicScoringEngine = require('./implementations/scoring/BasicScoringEngine');

class Container {
  constructor() {
    this.services = new Map();
    // Instance cache so stateful services (e.g. the repository's in-memory
    // index) are created once instead of on every get()
    this.instances = new Map();
    this.config = this.loadConfig();
  }

  loadConfig() {
    return {
      newsProvider: {
        type: 'llm',
        llm: {
          apiKey: process.env.OPENAI_API_KEY,
          model: 'gpt-4o-mini',
          maxTokens: 2000
        }
      },
      stockRepository: {
        type: 'json',
        json: {
          dataPath: './data/stock'
        }
      },
      scoringEngine: {
        type: 'basic',
        weights: {
          freshness: 0.3,
          specificity: 0.4,
          quality: 0.2,
          reusability: 0.1
        }
      }
    };
  }

  register(name, factory) {
    this.services.set(name, factory);
  }

  get(name) {
    if (this.instances.has(name)) {
      return this.instances.get(name);
    }

    const factory = this.services.get(name);
    if (!factory) {
      throw new Error(`Service ${name} not registered`);
    }

    const instance = factory();
    this.instances.set(name, instance);
    return instance;
  }

  init() {
    // News Provider
    this.register('newsProvider', () => {
      switch (this.config.newsProvider.type) {
        case 'llm':
          return new LLMNewsProvider(this.config.newsProvider.llm);
        // Future providers
        // case 'scraping':
        //   return new ScrapingNewsProvider(this.config.newsProvider.scraping);
        // case 'hybrid':
        //   return new HybridNewsProvider(this.config.newsProvider.hybrid);
        default:
          throw new Error(`Unknown news provider: ${this.config.newsProvider.type}`);
      }
    });

    // Stock Repository
    this.register('stockRepository', () => {
      switch (this.config.stockRepository.type) {
        case 'json':
          return new JSONStockRepository(this.config.stockRepository.json);
        // Future storage backends
        // case 'mongodb':
        //   return new MongoStockRepository(this.config.stockRepository.mongodb);
        // case 'postgresql':
        //   return new PostgreSQLStockRepository(this.config.stockRepository.postgresql);
        default:
          throw new Error(`Unknown stock repository: ${this.config.stockRepository.type}`);
      }
    });

    // Scoring Engine
    this.register('scoringEngine', () => {
      return new BasicScoringEngine(this.config.scoringEngine);
    });
  }
}

// Singleton
const container = new Container();
container.init();

module.exports = container;
```

---

## 🏢 **Business services (stable)**

### **NewsSearchService** - Main service

```javascript
// src/services/NewsSearchService.js
class NewsSearchService {
  constructor(newsProvider, stockRepository, scoringEngine) {
    this.newsProvider = newsProvider;
    this.stockRepository = stockRepository;
    this.scoringEngine = scoringEngine;
  }

  async search(query) {
    // Default the timer start so searchTime is defined even when the caller
    // does not set query.startTime
    query = { startTime: Date.now(), ...query };

    // 1. Search the stock first
    const stockResults = await this.searchInStock(query);

    // 2. If that is not enough, run a live search
    let liveResults = [];
    if (stockResults.length < query.limit) {
      const remaining = query.limit - stockResults.length;
      liveResults = await this.searchLive({ ...query, limit: remaining });
    }

    // 3. Combined scoring
    const allResults = [...stockResults, ...liveResults];
    const scoredResults = await this.scoringEngine.batchScore(allResults, query);

    // 4. Sort and limit
    const finalResults = scoredResults
      .sort((a, b) => b.finalScore - a.finalScore)
      .slice(0, query.limit);

    // 5. Track usage
    await this.trackUsage(finalResults);

    return {
      results: finalResults,
      metadata: {
        fromStock: stockResults.length,
        fromLive: liveResults.length,
        totalFound: allResults.length,
        searchTime: Date.now() - query.startTime
      }
    };
  }

  async searchInStock(query) {
    return await this.stockRepository.findByRaceCode(query.raceCode, {
      minScore: query.minScore || 100,
      limit: query.limit
    });
  }

  async searchLive(query) {
    const results = await this.newsProvider.searchNews(query);
    const validated = await this.newsProvider.validateResults(results);

    // Save to stock for reuse. The provider's items carry no raceCode, so tag
    // them here; otherwise findByRaceCode() could never return them.
    for (const result of validated) {
      await this.stockRepository.save({ ...result, raceCode: query.raceCode });
    }

    return validated;
  }

  async trackUsage(results) {
    for (const result of results) {
      await this.stockRepository.updateUsage(result.id, {
        lastUsed: new Date(),
        usageCount: (result.usageCount || 0) + 1
      });
    }
  }
}

module.exports = NewsSearchService;
```

---

## 🔧 **Modular configuration**

### **Swapping a component in one line**

```javascript
// config/environments/development.js
module.exports = {
  // Current version: full LLM + JSON
  newsProvider: { type: 'llm', llm: { model: 'gpt-4o-mini' }},
  stockRepository: { type: 'json', json: { dataPath: './data' }},

  // Easy migration to other components:

  // To try scraping:
  // newsProvider: { type: 'scraping', scraping: { antiBot: true }},

  // To use MongoDB:
  // stockRepository: { type: 'mongodb', mongodb: { uri: '...' }},

  // To go hybrid:
  // newsProvider: {
  //   type: 'hybrid',
  //   hybrid: {
  //     primary: { type: 'llm' },
  //     fallback: { type: 'scraping' }
  //   }
  // }
};
```

---

## ✅ **Advantages of the modular architecture**

1. **Total flexibility**: Swapping a component means changing one line of configuration
2. **Isolated tests**: Mock each interface independently
3. **Low-risk evolution**: A new component does not affect the others
4. **Parallel development**: Team members can work on different interfaces
5.
   **Progressive migration**: No big bang; migrate component by component
6. **Simplified maintenance**: A bug stays isolated in its component
7. **Tunable performance**: Optimize one component without breaking the others

**This architecture lets you start simple (LLM + JSON) and evolve component by component as needs arise.**

---

*Architecture finalized for the modular, free, full-LLM version*
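
The isolated-testing benefit listed above can be sketched with in-memory test doubles. This is a minimal illustration, not part of SourceFinder: `FakeNewsProvider`, `InMemoryStockRepository`, `ConstantScoringEngine`, `searchWithFallback` and `demo` are hypothetical names, and `searchWithFallback` is a condensed stand-in for `NewsSearchService.search()` so the block runs on its own, with no OpenAI key and no file system.

```javascript
// Hypothetical test doubles: each one mimics one of the three interfaces.
class FakeNewsProvider {
  constructor(items) {
    this.items = items;
    this.calls = 0; // lets a test check whether a live search happened
  }
  async searchNews(query) {
    this.calls += 1;
    return this.items.slice(0, query.limit);
  }
  async validateResults(results) {
    return results.filter((r) => Boolean(r.url));
  }
}

class InMemoryStockRepository {
  constructor(seed = []) {
    this.items = new Map(seed.map((a) => [a.id, a]));
  }
  async save(item) {
    this.items.set(item.id, item);
    return item;
  }
  async findByRaceCode(raceCode, options = {}) {
    const found = [...this.items.values()].filter((a) => a.raceCode === raceCode);
    return options.limit ? found.slice(0, options.limit) : found;
  }
}

class ConstantScoringEngine {
  constructor(score) {
    this.score = score;
  }
  async batchScore(articles) {
    return articles.map((a) => ({ ...a, finalScore: this.score }));
  }
}

// Condensed stand-in for NewsSearchService.search(): stock first, then a live
// search for the missing items, then scoring.
async function searchWithFallback({ provider, stock, scorer }, query) {
  const fromStock = await stock.findByRaceCode(query.raceCode, { limit: query.limit });
  let fromLive = [];
  if (fromStock.length < query.limit) {
    const remaining = query.limit - fromStock.length;
    const raw = await provider.searchNews({ ...query, limit: remaining });
    fromLive = await provider.validateResults(raw);
    for (const item of fromLive) {
      await stock.save({ ...item, raceCode: query.raceCode });
    }
  }
  const scored = await scorer.batchScore([...fromStock, ...fromLive], query);
  return {
    results: scored.slice(0, query.limit),
    fromStock: fromStock.length,
    fromLive: fromLive.length
  };
}

// One stocked article plus two live candidates, one of which has no URL.
async function demo() {
  const stock = new InMemoryStockRepository([
    { id: 'a1', raceCode: '352-1', url: 'https://example.com/a1' }
  ]);
  const provider = new FakeNewsProvider([
    { id: 'b1', url: 'https://example.com/b1' },
    { id: 'b2', url: '' } // dropped by validateResults()
  ]);
  const scorer = new ConstantScoringEngine(50);
  return searchWithFallback({ provider, stock, scorer }, { raceCode: '352-1', limit: 3 });
}
```

Calling `demo()` resolves to the combined results along with how many came from stock and how many from the live provider. Because the orchestration only sees the three injected objects, the same wiring should work with the real `LLMNewsProvider` and `JSONStockRepository` without changing the function.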