# 🏗️ ULTRA-MODULAR ARCHITECTURE - SourceFinder

*Modular, free, full-LLM version with interchangeable components*

---

## 🎯 **Architectural principle**

**Golden rule**: Every component implements a strict interface and can be replaced without affecting the others.

```javascript
// ❌ Tight coupling (bad)
const { MongoClient } = require('mongodb');
const puppeteer = require('puppeteer');

class NewsService {
  async search() {
    // MONGODB_URI is a placeholder environment variable
    const db = await MongoClient.connect(process.env.MONGODB_URI); // Coupled to MongoDB
    const browser = await puppeteer.launch(); // Coupled to Puppeteer
  }
}

// ✅ Modular architecture (good)
// Alternative definition of the same class, shown for contrast
class NewsService {
  constructor(stockRepo, newsProvider, scorer) {
    this.stock = stockRepo;       // IStockRepository interface
    this.provider = newsProvider; // INewsProvider interface
    this.scorer = scorer;         // IScoringEngine interface
  }
}
```

---

## 🔌 **Core interfaces**

### **INewsProvider** - News provider
```javascript
// src/interfaces/INewsProvider.js
class INewsProvider {
  /**
   * Search for news matching the given criteria
   * @param {SearchQuery} query - Search criteria
   * @returns {Promise<NewsItem[]>} - Articles found
   */
  async searchNews(query) {
    throw new Error('Must implement searchNews()');
  }

  /**
   * Validate results
   * @param {NewsItem[]} results - Articles to validate
   * @returns {Promise<NewsItem[]>} - Validated articles
   */
  async validateResults(results) {
    throw new Error('Must implement validateResults()');
  }

  /**
   * Provider metadata
   * @returns {ProviderMetadata} - Provider info
   */
  getMetadata() {
    throw new Error('Must implement getMetadata()');
  }
}

// Types (documentation-only shapes)
const SearchQuery = {
  raceCode: String,     // "352-1"
  keywords: [String],   // ["health", "behavior"]
  maxAge: Number,       // Days
  sources: [String],    // ["premium", "standard"]
  limit: Number         // Maximum number of results
};

const NewsItem = {
  id: String,
  title: String,
  content: String,
  url: String,
  publishDate: Date,
  sourceType: String,   // "premium", "standard", "fallback"
  sourceDomain: String,
  metadata: Object
};

// Named export, matching the destructured require() used by the implementations
module.exports = { INewsProvider };
```
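
To make the contract concrete, here is a minimal sketch of a second provider that satisfies the same three methods. `StaticNewsProvider` is a hypothetical name that does not appear in the project, and the interface is re-declared in trimmed form only so the snippet runs on its own. The keyword matching on titles and the HTTPS-only check are illustrative assumptions, not the project's rules.

```javascript
// Trimmed copy of the interface so this sketch runs standalone.
class INewsProvider {
  async searchNews(query) { throw new Error('Must implement searchNews()'); }
  async validateResults(results) { throw new Error('Must implement validateResults()'); }
  getMetadata() { throw new Error('Must implement getMetadata()'); }
}

// Hypothetical provider that serves canned articles (useful in tests).
class StaticNewsProvider extends INewsProvider {
  constructor(articles) {
    super();
    this.articles = articles;
  }

  // Keep articles whose title contains at least one keyword (assumed rule).
  async searchNews(query) {
    const keywords = (query.keywords || []).map(k => k.toLowerCase());
    return this.articles
      .filter(a => keywords.length === 0 ||
        keywords.some(k => a.title.toLowerCase().includes(k)))
      .slice(0, query.limit);
  }

  // Accept only HTTPS URLs (assumed rule).
  async validateResults(results) {
    return results.filter(r => typeof r.url === 'string' && r.url.startsWith('https://'));
  }

  getMetadata() {
    return { type: 'static', capabilities: ['search'] };
  }
}
```

Any code written against `INewsProvider` can receive this class instead of the LLM-backed provider. A subclass that forgets a method still fails loudly, because the base implementation throws.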

### **IStockRepository** - Article storage
```javascript
// src/interfaces/IStockRepository.js
class IStockRepository {
  // Persist an article
  async save(newsItem) {
    throw new Error('Must implement save()');
  }

  // Find stored articles for a breed code
  async findByRaceCode(raceCode, options = {}) {
    throw new Error('Must implement findByRaceCode()');
  }

  // Find stored articles scoring at least minScore
  async findByScore(minScore, options = {}) {
    throw new Error('Must implement findByScore()');
  }

  // Record usage data for a stored article
  async updateUsage(id, usageData) {
    throw new Error('Must implement updateUsage()');
  }

  // Remove stored articles matching the criteria
  async cleanup(criteria) {
    throw new Error('Must implement cleanup()');
  }

  // Aggregate statistics about the stock
  async getStats() {
    throw new Error('Must implement getStats()');
  }
}

// Named export, matching the destructured require() used by the implementations
module.exports = { IStockRepository };
```
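
As an illustration of how small an alternative storage backend can be, the sketch below implements the six repository methods on top of a `Map`. `InMemoryStockRepository` is a hypothetical class, and the `{ minScore }` shape accepted by `cleanup()` is an assumption because the document does not define cleanup criteria. In the project it would extend `IStockRepository`; the base class is left out here so the snippet is self-contained.

```javascript
// Hypothetical in-memory implementation of the six IStockRepository methods.
class InMemoryStockRepository {
  constructor() {
    this.items = new Map();
    this.nextId = 1;
  }

  async save(newsItem) {
    const id = newsItem.id || `mem-${this.nextId++}`;
    const stored = { usageCount: 0, ...newsItem, id };
    this.items.set(id, stored);
    return stored;
  }

  async findByRaceCode(raceCode, options = {}) {
    const matches = [...this.items.values()].filter(item => item.raceCode === raceCode);
    return this.sortAndLimit(matches, options);
  }

  async findByScore(minScore, options = {}) {
    const matches = [...this.items.values()]
      .filter(item => (item.finalScore ?? 0) >= minScore);
    return this.sortAndLimit(matches, options);
  }

  async updateUsage(id, usageData) {
    const existing = this.items.get(id);
    if (!existing) return null;
    const updated = { ...existing, ...usageData };
    this.items.set(id, updated);
    return updated;
  }

  // Assumed criteria shape: { minScore }. Returns the number of removed items.
  async cleanup(criteria = {}) {
    const minScore = criteria.minScore ?? 0;
    let removed = 0;
    for (const [id, item] of this.items) {
      if ((item.finalScore ?? 0) < minScore) {
        this.items.delete(id);
        removed++;
      }
    }
    return removed;
  }

  async getStats() {
    return { totalArticles: this.items.size };
  }

  // Sort a copy by descending score, then apply the optional limit.
  sortAndLimit(items, options) {
    const sorted = [...items].sort((a, b) => (b.finalScore ?? 0) - (a.finalScore ?? 0));
    return options.limit ? sorted.slice(0, options.limit) : sorted;
  }
}
```

A repository like this is mostly useful in unit tests, where touching the filesystem would slow things down.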

### **IScoringEngine** - Scoring engine
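```javascript
// src/interfaces/IScoringEngine.js
class IScoringEngine {
  // Score a single article against the search context
  async scoreArticle(article, context) {
    throw new Error('Must implement scoreArticle()');
  }

  // Score several articles against the same context
  async batchScore(articles, context) {
    throw new Error('Must implement batchScore()');
  }

  // Return the weights used to combine scoring factors
  getWeights() {
    throw new Error('Must implement getWeights()');
  }
}

// Named export, for consistency with the other interfaces
module.exports = { IScoringEngine };
```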
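
The document references a `BasicScoringEngine` but does not show it. The following is a rough sketch of what a weighted multi-factor engine could look like, not the project's actual implementation. The default weights are the ones from the container configuration in this document; the factor formulas (linear freshness decay, source-type quality table, `1 / (1 + usageCount)` reusability), the 0 to 100 scale, and the `context.now` override are all assumptions made for illustration.

```javascript
// Weights taken from the container configuration in this document.
const DEFAULT_WEIGHTS = { freshness: 0.3, specificity: 0.4, quality: 0.2, reusability: 0.1 };
// Assumed quality per source type.
const SOURCE_QUALITY = { premium: 1.0, standard: 0.6, fallback: 0.3 };
const DAY_MS = 24 * 60 * 60 * 1000;

class BasicScoringEngine {
  constructor(config = {}) {
    this.weights = { ...DEFAULT_WEIGHTS, ...(config.weights || {}) };
  }

  getWeights() {
    return { ...this.weights };
  }

  async scoreArticle(article, context) {
    const now = context.now || new Date(); // context.now is a test hook (assumption)
    const maxAge = context.maxAge || 30;   // days; 30 is an assumed default
    const ageMs = now - new Date(article.publishDate);
    // Articles with an unparseable date are treated as fully stale.
    const ageDays = Number.isFinite(ageMs) ? Math.max(0, ageMs / DAY_MS) : maxAge;

    // Each factor is normalized to the 0..1 range.
    const factors = {
      freshness: Math.max(0, 1 - ageDays / maxAge),
      specificity: article.raceCode === context.raceCode ? 1 : 0.3,
      quality: SOURCE_QUALITY[article.sourceType] ?? 0.3,
      reusability: 1 / (1 + (article.usageCount || 0))
    };

    const weighted = Object.keys(this.weights)
      .reduce((sum, key) => sum + this.weights[key] * (factors[key] ?? 0), 0);

    // Scale to 0..100 and round to an integer (assumed scale).
    return { ...article, factors, finalScore: Math.round(weighted * 100) };
  }

  async batchScore(articles, context) {
    return Promise.all(articles.map(article => this.scoreArticle(article, context)));
  }
}
```

Returning the per-factor breakdown next to `finalScore` makes it easier to see why one article outranks another when tuning the weights.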

---

## 🧠 **LLM implementation (default)**

### **LLMNewsProvider** - Search via LLM
```javascript
// src/implementations/providers/LLMNewsProvider.js
const { INewsProvider } = require('../../interfaces/INewsProvider');
const OpenAI = require('openai');
const { v4: uuidv4 } = require('uuid');

class LLMNewsProvider extends INewsProvider {
  constructor(config) {
    super();
    this.openai = new OpenAI({ apiKey: config.apiKey });
    this.model = config.model || 'gpt-4o-mini';
    this.maxTokens = config.maxTokens || 2000;
  }

  async searchNews(query) {
    const prompt = this.buildSearchPrompt(query);

    const response = await this.openai.chat.completions.create({
      model: this.model,
      messages: [{ role: 'user', content: prompt }],
      max_tokens: this.maxTokens,
      temperature: 0.3
    });

    return this.parseResults(response.choices[0].message.content);
  }

  buildSearchPrompt(query) {
    return `
Specialized canine news search:

Target breed: ${query.raceCode} (FCI code)
Keywords: ${query.keywords.join(', ')}
Period: last ${query.maxAge} days
Preferred sources: ${query.sources.join(', ')}

Find ${query.limit} recent and relevant articles.

Return ONLY valid JSON:
[
  {
    "title": "Article title",
    "content": "200-word summary",
    "url": "https://source.com/article",
    "publishDate": "2025-09-15",
    "sourceType": "premium|standard|fallback",
    "sourceDomain": "example.com",
    "metadata": {
      "relevanceScore": 0.9,
      "specialization": "health|behavior|legislation|general"
    }
  }
]
`;
  }

  async parseResults(response) {
    try {
      const results = JSON.parse(response);
      // The prompt asks for an array; anything else is treated as no result
      if (!Array.isArray(results)) return [];

      return results.map(item => ({
        ...item,
        id: uuidv4(),
        publishDate: new Date(item.publishDate),
        extractedAt: new Date()
      }));
    } catch (error) {
      console.error('Failed to parse LLM response:', error);
      return [];
    }
  }

  async validateResults(results) {
    // Anti-prompt-injection filtering of LLM results.
    // isValidContent(), isValidUrl() and isRecentEnough() are not defined in
    // this document; they must be implemented on this class.
    return results.filter(result => {
      return this.isValidContent(result.content) &&
        this.isValidUrl(result.url) &&
        this.isRecentEnough(result.publishDate);
    });
  }

  getMetadata() {
    return {
      type: 'llm',
      provider: 'openai',
      model: this.model,
      capabilities: ['search', 'summarize', 'validate'],
      costPerRequest: 0.02,
      avgResponseTime: 3000
    };
  }
}

module.exports = LLMNewsProvider;
```
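
`validateResults()` relies on three helpers (`isValidContent`, `isValidUrl`, `isRecentEnough`) whose bodies are not part of this document. The sketch below shows one plausible shape for them as standalone functions. The injection patterns, the 30-day default, and the extra `maxAgeDays` and `now` parameters are assumptions chosen for illustration; a real anti-injection layer would need a much broader rule set.

```javascript
// Assumed, deliberately small list of suspicious patterns.
const INJECTION_PATTERNS = [
  /ignore (all )?(previous|prior) instructions/i,
  /system prompt/i,
  /<script\b/i
];

// Hypothetical helper: non-empty text with no known injection pattern.
function isValidContent(content) {
  if (typeof content !== 'string' || content.trim().length === 0) return false;
  return !INJECTION_PATTERNS.some(pattern => pattern.test(content));
}

// Hypothetical helper: parseable URL using http or https.
function isValidUrl(url) {
  try {
    const parsed = new URL(url);
    return parsed.protocol === 'https:' || parsed.protocol === 'http:';
  } catch {
    return false;
  }
}

// Hypothetical helper: published within the last maxAgeDays, not in the future.
function isRecentEnough(publishDate, maxAgeDays = 30, now = new Date()) {
  const published = new Date(publishDate);
  if (Number.isNaN(published.getTime())) return false;
  const ageDays = (now - published) / (24 * 60 * 60 * 1000);
  return ageDays >= 0 && ageDays <= maxAgeDays;
}
```

Because an LLM can invent URLs and dates, checks like these only confirm that a result is well formed. They do not confirm that the article exists.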

---

## 💾 **JSON implementation (default)**

### **JSONStockRepository** - JSON file storage
```javascript
// src/implementations/storage/JSONStockRepository.js
const { IStockRepository } = require('../../interfaces/IStockRepository');
const fs = require('fs').promises;
const path = require('path');
const { v4: uuidv4 } = require('uuid');

class JSONStockRepository extends IStockRepository {
  constructor(config) {
    super();
    this.dataPath = config.dataPath || './data/stock';
    this.indexPath = path.join(this.dataPath, 'index.json');
    this.memoryIndex = new Map(); // In-memory index for fast lookups
    this.initialized = false;
  }

  async init() {
    if (this.initialized) return;

    await fs.mkdir(this.dataPath, { recursive: true });

    try {
      const indexData = await fs.readFile(this.indexPath, 'utf8');
      const index = JSON.parse(indexData);

      // Load the index into memory
      for (const [key, value] of Object.entries(index)) {
        this.memoryIndex.set(key, value);
      }
    } catch (error) {
      // Create a new index only when none exists. Other errors (such as a
      // corrupt index file) are rethrown so the index is never overwritten.
      if (error.code !== 'ENOENT') throw error;
      await this.saveIndex();
    }

    this.initialized = true;
  }

  async save(newsItem) {
    await this.init();

    const id = newsItem.id || uuidv4();
    const filePath = path.join(this.dataPath, `${id}.json`);
    const stored = { ...newsItem, id }; // Persist the id with the article

    // Save the article
    await fs.writeFile(filePath, JSON.stringify(stored, null, 2));

    // Update the index
    this.memoryIndex.set(id, {
      id,
      raceCode: newsItem.raceCode,
      sourceType: newsItem.sourceType,
      finalScore: newsItem.finalScore,
      publishDate: newsItem.publishDate,
      usageCount: newsItem.usageCount || 0,
      lastUsed: newsItem.lastUsed,
      filePath
    });

    await this.saveIndex();
    return stored;
  }

  async findByRaceCode(raceCode, options = {}) {
    await this.init();

    const results = [];
    for (const [id, indexEntry] of this.memoryIndex.entries()) {
      if (indexEntry.raceCode === raceCode) {
        if (options.minScore && indexEntry.finalScore < options.minScore) {
          continue;
        }

        const article = await this.loadArticle(id);
        results.push(article);
      }
    }

    return this.sortAndLimit(results, options);
  }

  async findByScore(minScore, options = {}) {
    await this.init();

    const results = [];
    for (const [id, indexEntry] of this.memoryIndex.entries()) {
      if (indexEntry.finalScore >= minScore) {
        const article = await this.loadArticle(id);
        results.push(article);
      }
    }

    return this.sortAndLimit(results, options);
  }

  // Minimal sketch (not in the original document): merge usage data into the
  // stored article and its index entry. cleanup() is not shown here.
  async updateUsage(id, usageData) {
    await this.init();

    const article = await this.loadArticle(id);
    if (!article) return null;

    return this.save({ ...article, ...usageData });
  }

  async loadArticle(id) {
    const indexEntry = this.memoryIndex.get(id);
    if (!indexEntry) return null;

    const data = await fs.readFile(indexEntry.filePath, 'utf8');
    return JSON.parse(data);
  }

  async saveIndex() {
    const indexObj = Object.fromEntries(this.memoryIndex);
    await fs.writeFile(this.indexPath, JSON.stringify(indexObj, null, 2));
  }

  sortAndLimit(results, options) {
    // Sort a copy by descending score; unscored articles sort last
    let sorted = [...results].sort(
      (a, b) => (b.finalScore ?? 0) - (a.finalScore ?? 0)
    );

    if (options.limit) {
      sorted = sorted.slice(0, options.limit);
    }

    return sorted;
  }

  async getStats() {
    await this.init();

    const stats = {
      totalArticles: this.memoryIndex.size,
      bySourceType: {},
      byRaceCode: {},
      avgScore: 0
    };

    let totalScore = 0;
    for (const entry of this.memoryIndex.values()) {
      // Count by source type
      stats.bySourceType[entry.sourceType] =
        (stats.bySourceType[entry.sourceType] || 0) + 1;

      // Count by breed
      stats.byRaceCode[entry.raceCode] =
        (stats.byRaceCode[entry.raceCode] || 0) + 1;

      totalScore += entry.finalScore || 0;
    }

    stats.avgScore = stats.totalArticles > 0 ?
      totalScore / stats.totalArticles : 0;

    return stats;
  }
}

module.exports = JSONStockRepository;
```
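
Since `index.json` is rewritten on every `save()`, a crash in the middle of a write could leave a truncated index. One common mitigation, sketched here under the assumption that the temporary file sits on the same filesystem as the target, is to write to a sibling file and rename it into place. `writeJsonAtomic` is a hypothetical helper and is not part of the repository class above.

```javascript
const fs = require('fs').promises;
const os = require('os');
const path = require('path');

// Hypothetical helper: write JSON to a temporary sibling file, then rename it
// over the target so readers never observe a half-written file.
async function writeJsonAtomic(targetPath, data) {
  const tmpPath = `${targetPath}.${process.pid}.tmp`;
  await fs.writeFile(tmpPath, JSON.stringify(data, null, 2), 'utf8');
  // On POSIX filesystems, rename replaces the target in a single step.
  await fs.rename(tmpPath, targetPath);
}

// Example usage against a throwaway directory.
async function demo() {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'stock-'));
  const target = path.join(dir, 'index.json');
  await writeJsonAtomic(target, { articles: 1 });
  await writeJsonAtomic(target, { articles: 2 });
  return JSON.parse(await fs.readFile(target, 'utf8'));
}
```

In the repository class, `saveIndex()` could delegate to a helper of this kind without any change to its callers.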

---

## 🎯 **Dependency injection container**

### **Dependency Injection Container**
```javascript
// src/container.js
const LLMNewsProvider = require('./implementations/providers/LLMNewsProvider');
const JSONStockRepository = require('./implementations/storage/JSONStockRepository');
const BasicScoringEngine = require('./implementations/scoring/BasicScoringEngine');

class Container {
  constructor() {
    this.services = new Map();
    this.config = this.loadConfig();
  }

  loadConfig() {
    return {
      newsProvider: {
        type: 'llm',
        llm: {
          apiKey: process.env.OPENAI_API_KEY,
          model: 'gpt-4o-mini',
          maxTokens: 2000
        }
      },
      stockRepository: {
        type: 'json',
        json: {
          dataPath: './data/stock'
        }
      },
      scoringEngine: {
        type: 'basic',
        weights: {
          freshness: 0.3,
          specificity: 0.4,
          quality: 0.2,
          reusability: 0.1
        }
      }
    };
  }

  register(name, factory) {
    this.services.set(name, factory);
  }

  // Note: the factory runs on every call, so each get() builds a new instance
  get(name) {
    const factory = this.services.get(name);
    if (!factory) {
      throw new Error(`Service ${name} not registered`);
    }
    return factory();
  }

  init() {
    // News provider
    this.register('newsProvider', () => {
      switch (this.config.newsProvider.type) {
        case 'llm':
          return new LLMNewsProvider(this.config.newsProvider.llm);
        // Future providers
        // case 'scraping':
        //   return new ScrapingNewsProvider(this.config.newsProvider.scraping);
        // case 'hybrid':
        //   return new HybridNewsProvider(this.config.newsProvider.hybrid);
        default:
          throw new Error(`Unknown news provider: ${this.config.newsProvider.type}`);
      }
    });

    // Stock repository
    this.register('stockRepository', () => {
      switch (this.config.stockRepository.type) {
        case 'json':
          return new JSONStockRepository(this.config.stockRepository.json);
        // Future storage backends
        // case 'mongodb':
        //   return new MongoStockRepository(this.config.stockRepository.mongodb);
        // case 'postgresql':
        //   return new PostgreSQLStockRepository(this.config.stockRepository.postgresql);
        default:
          throw new Error(`Unknown stock repository: ${this.config.stockRepository.type}`);
      }
    });

    // Scoring engine
    this.register('scoringEngine', () => {
      return new BasicScoringEngine(this.config.scoringEngine);
    });
  }
}

// Singleton
const container = new Container();
container.init();

module.exports = container;
```
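
In the container above, `get()` invokes the factory on every call, so two callers asking for `stockRepository` receive two separate instances, each with its own in-memory index. If shared instances are wanted, one possible variant caches the result of each factory. `CachingContainer` is a hypothetical name, and passing the container itself to the factory is an added convenience that the original code does not have.

```javascript
// Hypothetical variant of the container: each factory runs at most once and
// its result is cached, so every caller shares the same instance.
class CachingContainer {
  constructor() {
    this.factories = new Map();
    this.instances = new Map();
  }

  register(name, factory) {
    this.factories.set(name, factory);
    this.instances.delete(name); // re-registering drops any cached instance
  }

  get(name) {
    if (this.instances.has(name)) {
      return this.instances.get(name);
    }
    const factory = this.factories.get(name);
    if (!factory) {
      throw new Error(`Service ${name} not registered`);
    }
    // The container is passed in so a factory can resolve its own dependencies.
    const instance = factory(this);
    this.instances.set(name, instance);
    return instance;
  }
}

// Example wiring with placeholder services.
const demoContainer = new CachingContainer();
demoContainer.register('stockRepository', () => ({ kind: 'repository' }));
demoContainer.register('searchService', c => ({
  stockRepository: c.get('stockRepository')
}));
```

With this variant, `demoContainer.get('searchService').stockRepository` and `demoContainer.get('stockRepository')` refer to the same object.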

---

## 🏢 **Business services (stable)**

### **NewsSearchService** - Main service
```javascript
// src/services/NewsSearchService.js
class NewsSearchService {
  constructor(newsProvider, stockRepository, scoringEngine) {
    this.newsProvider = newsProvider;
    this.stockRepository = stockRepository;
    this.scoringEngine = scoringEngine;
  }

  async search(query) {
    // Fall back to the current time when the caller did not set query.startTime
    const startTime = query.startTime ?? Date.now();

    // 1. Search the stock first
    const stockResults = await this.searchInStock(query);

    // 2. If that is not enough, run a live search
    let liveResults = [];
    if (stockResults.length < query.limit) {
      const remaining = query.limit - stockResults.length;
      liveResults = await this.searchLive({
        ...query,
        limit: remaining
      });
    }

    // 3. Combined scoring
    const allResults = [...stockResults, ...liveResults];
    const scoredResults = await this.scoringEngine.batchScore(allResults, query);

    // 4. Sort and limit
    const finalResults = scoredResults
      .sort((a, b) => b.finalScore - a.finalScore)
      .slice(0, query.limit);

    // 5. Usage tracking
    await this.trackUsage(finalResults);

    return {
      results: finalResults,
      metadata: {
        fromStock: stockResults.length,
        fromLive: liveResults.length,
        totalFound: allResults.length,
        searchTime: Date.now() - startTime
      }
    };
  }

  async searchInStock(query) {
    return await this.stockRepository.findByRaceCode(query.raceCode, {
      minScore: query.minScore ?? 100,
      limit: query.limit
    });
  }

  async searchLive(query) {
    const results = await this.newsProvider.searchNews(query);
    const validated = await this.newsProvider.validateResults(results);

    // Tag each article with the queried breed code so that findByRaceCode()
    // can find it later, then save it to stock for reuse
    const tagged = validated.map(result => ({ ...result, raceCode: query.raceCode }));
    for (const article of tagged) {
      await this.stockRepository.save(article);
    }

    return tagged;
  }

  async trackUsage(results) {
    for (const result of results) {
      await this.stockRepository.updateUsage(result.id, {
        lastUsed: new Date(),
        usageCount: (result.usageCount || 0) + 1
      });
    }
  }
}

module.exports = NewsSearchService;
```

---

## 🔧 **Modular configuration**

### **Swapping a component in one line**
```javascript
// config/environments/development.js
module.exports = {
  // Current version: full LLM + JSON
  newsProvider: { type: 'llm', llm: { model: 'gpt-4o-mini' } },
  stockRepository: { type: 'json', json: { dataPath: './data' } },

  // Easy migration to other components:

  // To try scraping:
  // newsProvider: { type: 'scraping', scraping: { antiBot: true } },

  // To use MongoDB:
  // stockRepository: { type: 'mongodb', mongodb: { uri: '...' } },

  // To use a hybrid provider:
  // newsProvider: {
  //   type: 'hybrid',
  //   hybrid: {
  //     primary: { type: 'llm' },
  //     fallback: { type: 'scraping' }
  //   }
  // }
};
```
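
The document does not show how an environment file such as `development.js` is combined with the defaults hard-coded in the container. One plausible approach, sketched below, is a recursive merge in which environment values override defaults key by key. `mergeConfig` and `isPlainObject` are hypothetical helpers, and the MongoDB URI in the example is a made-up placeholder.

```javascript
// Hypothetical helper: true for plain nested config objects only.
function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Hypothetical helper: overlay environment-specific settings on defaults.
// Nested objects are merged recursively; everything else is replaced.
function mergeConfig(defaults, overrides) {
  const result = { ...defaults };
  for (const [key, value] of Object.entries(overrides || {})) {
    const base = result[key];
    result[key] = isPlainObject(base) && isPlainObject(value)
      ? mergeConfig(base, value)
      : value;
  }
  return result;
}

// Example: switch only the storage backend, keep everything else.
const defaults = {
  newsProvider: { type: 'llm', llm: { model: 'gpt-4o-mini', maxTokens: 2000 } },
  stockRepository: { type: 'json', json: { dataPath: './data/stock' } }
};
const merged = mergeConfig(defaults, {
  stockRepository: { type: 'mongodb', mongodb: { uri: 'mongodb://localhost/example' } }
});
```

Here `merged.stockRepository.type` is `'mongodb'`, while `merged.newsProvider` is carried over from the defaults and the unused `json` settings stay available for switching back.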

---

## ✅ **Advantages of the modular architecture**

1. **Total flexibility**: Swapping a component means changing one line of config
2. **Isolated tests**: Mock each interface independently
3. **Low-risk evolution**: A new component that honors its interface does not affect the others
4. **Parallel development**: Team members can work on different interfaces
5. **Progressive migration**: No big bang, component by component
6. **Simpler maintenance**: A bug stays isolated in its component
7. **Tunable performance**: Optimize one component without breaking the others
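
The isolated-tests point can be illustrated with hand-rolled mocks. In the sketch below, `MiniSearchService` is a trimmed, hypothetical stand-in for `NewsSearchService` that keeps only the stock-then-live decision, and `makeMocks` builds plain objects exposing just the methods that decision needs. Neither name exists in the project.

```javascript
// Trimmed stand-in for NewsSearchService: only the stock-then-live decision.
class MiniSearchService {
  constructor(newsProvider, stockRepository) {
    this.newsProvider = newsProvider;
    this.stockRepository = stockRepository;
  }

  async search(query) {
    const stock = await this.stockRepository.findByRaceCode(query.raceCode, {
      limit: query.limit
    });
    if (stock.length >= query.limit) {
      return { results: stock, fromLive: 0 };
    }
    // Ask the provider only for the articles still missing.
    const live = await this.newsProvider.searchNews({
      ...query,
      limit: query.limit - stock.length
    });
    return { results: [...stock, ...live], fromLive: live.length };
  }
}

// Hand-rolled mocks: plain objects that record calls and return canned data.
function makeMocks(stockItems, liveItems) {
  const calls = { searchNews: [] };
  return {
    calls,
    stockRepository: {
      findByRaceCode: async () => stockItems
    },
    newsProvider: {
      searchNews: async query => {
        calls.searchNews.push(query);
        return liveItems.slice(0, query.limit);
      }
    }
  };
}
```

A test can then assert that the provider is never called when the stock already covers `query.limit`, with no network or filesystem access involved.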

**This architecture makes it possible to start simple (LLM + JSON) and evolve component by component as needs arise.**

---

*Architecture finalized for the modular, free, full-LLM version*