🏗️ ULTRA-MODULAR ARCHITECTURE - SourceFinder
Modular, free, full-LLM version with interchangeable components
🎯 Architectural principle
Golden rule: every component adheres to a strict interface and can be replaced without affecting the others.
// ❌ Tight coupling (bad)
const mongodb = require('mongodb');
const puppeteer = require('puppeteer');
class NewsService {
async search() {
const db = mongodb.connect(); // Coupled to MongoDB
const browser = puppeteer.launch(); // Coupled to Puppeteer
}
}
// ✅ Modular architecture (good)
class NewsService {
constructor(stockRepo, newsProvider, scorer) {
this.stock = stockRepo; // Interface IStockRepository
this.provider = newsProvider; // Interface INewsProvider
this.scorer = scorer; // Interface IScoringEngine
}
}
🔌 Core interfaces
INewsProvider - News provider
// src/interfaces/INewsProvider.js
class INewsProvider {
/**
* Search for news by criteria
* @param {SearchQuery} query - Search criteria
* @returns {Promise<NewsItem[]>} - Articles found
*/
async searchNews(query) {
throw new Error('Must implement searchNews()');
}
/**
* Validate results
* @param {NewsItem[]} results - Articles to validate
* @returns {Promise<NewsItem[]>} - Validated articles
*/
async validateResults(results) {
throw new Error('Must implement validateResults()');
}
/**
* Provider metadata
* @returns {ProviderMetadata} - Provider information
*/
getMetadata() {
throw new Error('Must implement getMetadata()');
}
}
// Types
const SearchQuery = {
raceCode: String, // "352-1"
keywords: [String], // ["santé", "comportement"]
maxAge: Number, // Maximum age in days
sources: [String], // ["premium", "standard"]
limit: Number // Maximum number of results
};
const NewsItem = {
id: String,
title: String,
content: String,
url: String,
publishDate: Date,
sourceType: String, // "premium", "standard", "fallback"
sourceDomain: String,
metadata: Object
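};
// Named export, matching the destructured require() used by the implementations
module.exports = { INewsProvider };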
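The `SearchQuery` object above describes a shape but does not enforce it at runtime. As a rough sketch, a check along the following lines could reject malformed queries before they reach a provider. The helper name `validateSearchQuery` and its error messages are assumptions made for illustration; they are not part of the original codebase.

```javascript
// Hypothetical helper (not in the original code): returns a list of problems
// found in a SearchQuery-shaped object; an empty list means the query is usable.
function validateSearchQuery(query) {
  const errors = [];
  const isStringArray = (v) => Array.isArray(v) && v.every((s) => typeof s === 'string');

  if (typeof query.raceCode !== 'string' || query.raceCode.length === 0) {
    errors.push('raceCode must be a non-empty string');
  }
  if (!isStringArray(query.keywords)) {
    errors.push('keywords must be an array of strings');
  }
  if (!isStringArray(query.sources)) {
    errors.push('sources must be an array of strings');
  }
  if (!Number.isInteger(query.maxAge) || query.maxAge <= 0) {
    errors.push('maxAge must be a positive integer (days)');
  }
  if (!Number.isInteger(query.limit) || query.limit <= 0) {
    errors.push('limit must be a positive integer');
  }
  return errors;
}
```

Returning a list of errors rather than throwing lets an API layer report every problem in a single response.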
IStockRepository - Article storage
// src/interfaces/IStockRepository.js
class IStockRepository {
async save(newsItem) {
throw new Error('Must implement save()');
}
async findByRaceCode(raceCode, options = {}) {
throw new Error('Must implement findByRaceCode()');
}
async findByScore(minScore, options = {}) {
throw new Error('Must implement findByScore()');
}
async updateUsage(id, usageData) {
throw new Error('Must implement updateUsage()');
}
async cleanup(criteria) {
throw new Error('Must implement cleanup()');
}
async getStats() {
throw new Error('Must implement getStats()');
}
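}
// Named export, matching the destructured require() used by the implementations
module.exports = { IStockRepository };
IScoringEngine - Scoring engine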
// src/interfaces/IScoringEngine.js
class IScoringEngine {
async scoreArticle(article, context) {
throw new Error('Must implement scoreArticle()');
}
async batchScore(articles, context) {
throw new Error('Must implement batchScore()');
}
getWeights() {
throw new Error('Must implement getWeights()');
}
}
module.exports = { IScoringEngine };
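The container later in this document instantiates a `BasicScoringEngine` with four weights (`freshness`, `specificity`, `quality`, `reusability`), but the class itself is not shown. The sketch below is one possible reading of it, not the project's implementation. The sub-score formulas, the `SOURCE_QUALITY` values, the 0-100 scale, and the `context.now` field are all assumptions; in the project the class would presumably extend `IScoringEngine`.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
// Assumed quality value per source type; the original does not specify these.
const SOURCE_QUALITY = { premium: 1, standard: 0.7, fallback: 0.4 };

// Pure helper: computes the four factors (each in [0, 1]) and a weighted 0-100 score.
function computeScore(article, context, weights) {
  const now = context.now || new Date(); // context.now is a hypothetical test hook
  const ageDays = Math.max(0, (now - new Date(article.publishDate)) / DAY_MS);
  const factors = {
    // Linear decay from 1 (published now) to 0 (maxAge days old or older)
    freshness: Math.max(0, 1 - ageDays / context.maxAge),
    // Exact breed-code match scores 1; anything else gets an assumed partial credit
    specificity: article.raceCode === context.raceCode ? 1 : 0.3,
    quality: SOURCE_QUALITY[article.sourceType] ?? 0,
    // Articles that have been served less often are preferred
    reusability: 1 / (1 + (article.usageCount || 0)),
  };
  const weighted = Object.entries(weights)
    .reduce((sum, [name, weight]) => sum + weight * factors[name], 0);
  return { factors, finalScore: Math.round(weighted * 100) };
}

class BasicScoringEngine {
  constructor(config) {
    this.weights = config.weights;
  }

  async scoreArticle(article, context) {
    return { ...article, ...computeScore(article, context, this.weights) };
  }

  async batchScore(articles, context) {
    return Promise.all(articles.map((a) => this.scoreArticle(a, context)));
  }

  getWeights() {
    return { ...this.weights }; // copy, so callers cannot mutate the engine's weights
  }
}
```

Keeping the arithmetic in a pure function makes the weighting easy to unit-test without touching the async interface.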
🧠 LLM implementation (default)
LLMNewsProvider - Search via LLM
// src/implementations/providers/LLMNewsProvider.js
const { INewsProvider } = require('../../interfaces/INewsProvider');
const OpenAI = require('openai');
class LLMNewsProvider extends INewsProvider {
constructor(config) {
super();
this.openai = new OpenAI({ apiKey: config.apiKey });
this.model = config.model || 'gpt-4o-mini';
this.maxTokens = config.maxTokens || 2000;
}
async searchNews(query) {
const prompt = this.buildSearchPrompt(query);
const response = await this.openai.chat.completions.create({
model: this.model,
messages: [{ role: 'user', content: prompt }],
max_tokens: this.maxTokens,
temperature: 0.3
});
return this.parseResults(response.choices[0].message.content);
}
buildSearchPrompt(query) {
return `
Specialized canine news search:
Target breed: ${query.raceCode} (FCI code)
Keywords: ${query.keywords.join(', ')}
Period: last ${query.maxAge} days
Preferred sources: ${query.sources.join(', ')}
Find ${query.limit} recent, relevant articles.
Return ONLY valid JSON:
[
{
"title": "Article title",
"content": "200-word summary",
"url": "https://source.com/article",
"publishDate": "2025-09-15",
"sourceType": "premium|standard|fallback",
"sourceDomain": "example.com",
"metadata": {
"relevanceScore": 0.9,
"specialization": "health|behavior|legislation|general"
}
}
]
`;
}
async parseResults(response) {
try {
const results = JSON.parse(response);
return results.map(item => ({
...item,
id: require('uuid').v4(),
publishDate: new Date(item.publishDate),
extractedAt: new Date()
}));
} catch (error) {
console.error('Failed to parse LLM response:', error);
return [];
}
}
async validateResults(results) {
// Anti-prompt-injection filtering on LLM results
return results.filter(result => {
return this.isValidContent(result.content) &&
this.isValidUrl(result.url) &&
this.isRecentEnough(result.publishDate);
});
}
getMetadata() {
return {
type: 'llm',
provider: 'openai',
model: this.model,
capabilities: ['search', 'summarize', 'validate'],
costPerRequest: 0.02,
avgResponseTime: 3000
};
}
}
module.exports = LLMNewsProvider;
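`validateResults()` relies on three helpers (`isValidContent`, `isValidUrl`, `isRecentEnough`) that are not shown in the class. The following is a minimal sketch of what they might do, written as standalone functions so it can run on its own. The pattern list, the 50-character minimum, and the 30-day default are illustrative assumptions, not the project's actual rules.

```javascript
// Illustrative patterns only; a real anti-injection layer would need a broader list.
const SUSPICIOUS_PATTERNS = [
  /ignore (all |any )?(previous|prior) instructions/i,
  /system prompt/i,
  /<script\b/i,
];

function isValidContent(content, minLength = 50) {
  if (typeof content !== 'string' || content.trim().length < minLength) return false;
  return !SUSPICIOUS_PATTERNS.some((pattern) => pattern.test(content));
}

function isValidUrl(url) {
  try {
    const { protocol } = new URL(url);
    // Accept only web URLs; this rejects schemes such as javascript: or file:
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false; // new URL() throws on malformed input
  }
}

function isRecentEnough(publishDate, maxAgeDays = 30, now = new Date()) {
  const published = new Date(publishDate);
  if (Number.isNaN(published.getTime())) return false; // unparseable date
  const ageDays = (now - published) / (24 * 60 * 60 * 1000);
  // Dates in the future are rejected as well, since an LLM may invent them
  return ageDays >= 0 && ageDays <= maxAgeDays;
}
```

Inside the class these would become methods, with `maxAgeDays` likely taken from the query rather than a fixed default.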
💾 JSON implementation (default)
JSONStockRepository - JSON file storage
// src/implementations/storage/JSONStockRepository.js
const { IStockRepository } = require('../../interfaces/IStockRepository');
const fs = require('fs').promises;
const path = require('path');
class JSONStockRepository extends IStockRepository {
constructor(config) {
super();
this.dataPath = config.dataPath || './data/stock';
this.indexPath = path.join(this.dataPath, 'index.json');
this.memoryIndex = new Map(); // Performance cache
this.initialized = false;
}
async init() {
if (this.initialized) return;
await fs.mkdir(this.dataPath, { recursive: true });
try {
const indexData = await fs.readFile(this.indexPath, 'utf8');
const index = JSON.parse(indexData);
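// Load the index into memory
for (const [key, value] of Object.entries(index)) {
this.memoryIndex.set(key, value);
}
} catch (error) {
// Create a new index only when none exists; rethrow anything else
// (e.g. a corrupted index.json) instead of silently overwriting it
if (error.code !== 'ENOENT') throw error;
await this.saveIndex();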
}
this.initialized = true;
}
async save(newsItem) {
await this.init();
const id = newsItem.id || require('uuid').v4();
const filePath = path.join(this.dataPath, `${id}.json`);
// Save the article, including its id so the file is self-describing
await fs.writeFile(filePath, JSON.stringify({ ...newsItem, id }, null, 2));
// Update the index
this.memoryIndex.set(id, {
id,
raceCode: newsItem.raceCode,
sourceType: newsItem.sourceType,
finalScore: newsItem.finalScore,
publishDate: newsItem.publishDate,
usageCount: newsItem.usageCount || 0,
lastUsed: newsItem.lastUsed,
filePath
});
await this.saveIndex();
return { ...newsItem, id };
}
async findByRaceCode(raceCode, options = {}) {
await this.init();
const results = [];
for (const [id, indexEntry] of this.memoryIndex.entries()) {
if (indexEntry.raceCode === raceCode) {
if (options.minScore && indexEntry.finalScore < options.minScore) {
continue;
}
const article = await this.loadArticle(id);
results.push(article);
}
}
return this.sortAndLimit(results, options);
}
async findByScore(minScore, options = {}) {
await this.init();
const results = [];
for (const [id, indexEntry] of this.memoryIndex.entries()) {
if (indexEntry.finalScore >= minScore) {
const article = await this.loadArticle(id);
results.push(article);
}
}
return this.sortAndLimit(results, options);
}
async loadArticle(id) {
const indexEntry = this.memoryIndex.get(id);
if (!indexEntry) return null;
const data = await fs.readFile(indexEntry.filePath, 'utf8');
return JSON.parse(data);
}
async saveIndex() {
const indexObj = Object.fromEntries(this.memoryIndex);
await fs.writeFile(this.indexPath, JSON.stringify(indexObj, null, 2));
}
sortAndLimit(results, options) {
let sorted = results.sort((a, b) => b.finalScore - a.finalScore);
if (options.limit) {
sorted = sorted.slice(0, options.limit);
}
return sorted;
}
async getStats() {
await this.init();
const stats = {
totalArticles: this.memoryIndex.size,
bySourceType: {},
byRaceCode: {},
avgScore: 0
};
let totalScore = 0;
for (const entry of this.memoryIndex.values()) {
// Count by source type
stats.bySourceType[entry.sourceType] =
(stats.bySourceType[entry.sourceType] || 0) + 1;
// Count by breed code
stats.byRaceCode[entry.raceCode] =
(stats.byRaceCode[entry.raceCode] || 0) + 1;
totalScore += entry.finalScore || 0;
}
stats.avgScore = stats.totalArticles > 0 ?
totalScore / stats.totalArticles : 0;
return stats;
}
}
module.exports = JSONStockRepository;
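The class above does not implement `updateUsage()` or `cleanup()`, although `IStockRepository` requires both and `NewsSearchService` calls `updateUsage()`. As a hedged sketch, the index-level logic could look like the functions below, which operate on a `Map` shaped like `memoryIndex`. The criteria fields `olderThanDays` and `minScore` are hypothetical names; the original does not define what `criteria` contains.

```javascript
// Sketch of the index-level logic only. As methods of JSONStockRepository they
// would also rewrite the article file and call saveIndex() afterwards.
function updateUsage(memoryIndex, id, usageData) {
  const entry = memoryIndex.get(id);
  if (!entry) return null; // unknown id: nothing to update
  const updated = { ...entry, ...usageData };
  memoryIndex.set(id, updated);
  return updated;
}

// criteria.olderThanDays and criteria.minScore are assumed field names.
function cleanup(memoryIndex, criteria, now = new Date()) {
  const removed = [];
  for (const [id, entry] of memoryIndex.entries()) {
    const ageDays = (now - new Date(entry.publishDate)) / (24 * 60 * 60 * 1000);
    const tooOld = criteria.olderThanDays !== undefined && ageDays > criteria.olderThanDays;
    const tooWeak = criteria.minScore !== undefined && (entry.finalScore || 0) < criteria.minScore;
    if (tooOld || tooWeak) {
      memoryIndex.delete(id); // deleting during Map iteration is safe in JavaScript
      removed.push(id);
    }
  }
  return removed; // the caller can unlink the matching article files
}
```

Returning the removed ids keeps file deletion separate from the index bookkeeping, which makes the logic testable without touching the disk.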
🎯 Dependency injection container
Dependency Injection Container
// src/container.js
const LLMNewsProvider = require('./implementations/providers/LLMNewsProvider');
const JSONStockRepository = require('./implementations/storage/JSONStockRepository');
const BasicScoringEngine = require('./implementations/scoring/BasicScoringEngine');
class Container {
constructor() {
this.services = new Map();
this.config = this.loadConfig();
}
loadConfig() {
return {
newsProvider: {
type: 'llm',
llm: {
apiKey: process.env.OPENAI_API_KEY,
model: 'gpt-4o-mini',
maxTokens: 2000
}
},
stockRepository: {
type: 'json',
json: {
dataPath: './data/stock'
}
},
scoringEngine: {
type: 'basic',
weights: {
freshness: 0.3,
specificity: 0.4,
quality: 0.2,
reusability: 0.1
}
}
};
}
register(name, factory) {
this.services.set(name, factory);
}
get(name) {
const factory = this.services.get(name);
if (!factory) {
throw new Error(`Service ${name} not registered`);
}
return factory();
}
init() {
// News Provider
this.register('newsProvider', () => {
switch (this.config.newsProvider.type) {
case 'llm':
return new LLMNewsProvider(this.config.newsProvider.llm);
// Future providers
// case 'scraping':
// return new ScrapingNewsProvider(this.config.newsProvider.scraping);
// case 'hybrid':
// return new HybridNewsProvider(this.config.newsProvider.hybrid);
default:
throw new Error(`Unknown news provider: ${this.config.newsProvider.type}`);
}
});
// Stock Repository
this.register('stockRepository', () => {
switch (this.config.stockRepository.type) {
case 'json':
return new JSONStockRepository(this.config.stockRepository.json);
// Future storage backends
// case 'mongodb':
// return new MongoStockRepository(this.config.stockRepository.mongodb);
// case 'postgresql':
// return new PostgreSQLStockRepository(this.config.stockRepository.postgresql);
default:
throw new Error(`Unknown stock repository: ${this.config.stockRepository.type}`);
}
});
// Scoring Engine
this.register('scoringEngine', () => {
return new BasicScoringEngine(this.config.scoringEngine);
});
}
}
// Singleton
const container = new Container();
container.init();
module.exports = container;
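Note that `get()` above calls the factory on every lookup, so two calls to `get('stockRepository')` return two repositories, each with its own in-memory index. If shared instances are the intent, one option is to cache the result of each factory. The sketch below uses a hypothetical `CachingContainer` class to show the idea; passing the container into the factory is also an assumption, added so that services can resolve their own dependencies.

```javascript
class CachingContainer {
  constructor() {
    this.factories = new Map();
    this.instances = new Map();
  }

  register(name, factory) {
    this.factories.set(name, factory);
    this.instances.delete(name); // re-registering invalidates any cached instance
  }

  get(name) {
    if (this.instances.has(name)) return this.instances.get(name);
    const factory = this.factories.get(name);
    if (!factory) {
      throw new Error(`Service ${name} not registered`);
    }
    const instance = factory(this); // factories may resolve their own dependencies
    this.instances.set(name, instance);
    return instance;
  }
}

// Usage with stand-in services (plain objects instead of the real classes)
const container = new CachingContainer();
container.register('stockRepository', () => ({ memoryIndex: new Map() }));
container.register('searchService', (c) => ({ stock: c.get('stockRepository') }));
```

With this variant, the search service and any other consumer see the same repository instance, so index updates made by one are visible to the other.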
🏢 Business services (stable)
NewsSearchService - Main service
// src/services/NewsSearchService.js
class NewsSearchService {
constructor(newsProvider, stockRepository, scoringEngine) {
this.newsProvider = newsProvider;
this.stockRepository = stockRepository;
this.scoringEngine = scoringEngine;
}
async search(query) {
const startTime = Date.now(); // measured here; SearchQuery has no startTime field
// 1. Search the stock first
const stockResults = await this.searchInStock(query);
// 2. If that is not enough, search live
let liveResults = [];
if (stockResults.length < query.limit) {
const remaining = query.limit - stockResults.length;
liveResults = await this.searchLive({
...query,
limit: remaining
});
}
// 3. Combined scoring
const allResults = [...stockResults, ...liveResults];
const scoredResults = await this.scoringEngine.batchScore(allResults, query);
// 4. Sort and limit
const finalResults = scoredResults
.sort((a, b) => b.finalScore - a.finalScore)
.slice(0, query.limit);
// 5. Usage tracking
await this.trackUsage(finalResults);
return {
results: finalResults,
metadata: {
fromStock: stockResults.length,
fromLive: liveResults.length,
totalFound: allResults.length,
searchTime: Date.now() - startTime // elapsed milliseconds
}
};
}
async searchInStock(query) {
return await this.stockRepository.findByRaceCode(query.raceCode, {
minScore: query.minScore || 100,
limit: query.limit
});
}
async searchLive(query) {
const results = await this.newsProvider.searchNews(query);
const validated = await this.newsProvider.validateResults(results);
// Save to stock for reuse; attach raceCode so findByRaceCode() can match it later
for (const result of validated) {
await this.stockRepository.save({ ...result, raceCode: query.raceCode });
}
return validated;
}
async trackUsage(results) {
for (const result of results) {
await this.stockRepository.updateUsage(result.id, {
lastUsed: new Date(),
usageCount: (result.usageCount || 0) + 1
});
}
}
}
module.exports = NewsSearchService;
🔧 Modular configuration
Swapping a component in one line
// config/environments/development.js
module.exports = {
// Current version: full LLM + JSON
newsProvider: { type: 'llm', llm: { model: 'gpt-4o-mini' }},
stockRepository: { type: 'json', json: { dataPath: './data' }},
// Easy migration to other components:
// To try scraping:
// newsProvider: { type: 'scraping', scraping: { antiBot: true }},
// To use MongoDB:
// stockRepository: { type: 'mongodb', mongodb: { uri: '...' }},
// To use a hybrid provider:
// newsProvider: {
// type: 'hybrid',
// hybrid: {
// primary: { type: 'llm' },
// fallback: { type: 'scraping' }
// }
// }
};
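The environment file above sets only a few options, while the full defaults live in `Container.loadConfig()`; the document does not show how the two are combined. One plausible approach, sketched here under that assumption, is a recursive merge of the environment overrides onto the defaults. The helper names `isPlainObject` and `mergeConfig` are hypothetical.

```javascript
// Hypothetical helpers: recursively merge per-environment overrides onto the
// defaults so that sibling options under a component survive a partial override.
function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

function mergeConfig(defaults, overrides) {
  const result = { ...defaults };
  for (const [key, value] of Object.entries(overrides)) {
    result[key] = isPlainObject(value) && isPlainObject(defaults[key])
      ? mergeConfig(defaults[key], value)
      : value; // scalars and arrays replace the default outright
  }
  return result;
}

const defaults = {
  newsProvider: { type: 'llm', llm: { model: 'gpt-4o-mini', maxTokens: 2000 } },
  stockRepository: { type: 'json', json: { dataPath: './data/stock' } },
};
// An environment file that changes a single setting
const development = {
  stockRepository: { json: { dataPath: './data' } },
};
const config = mergeConfig(defaults, development);
```

Here `config.stockRepository.type` is still `'json'` and `config.newsProvider` is untouched, while `dataPath` takes the development value; `defaults` itself is not mutated.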
✅ Advantages of the modular architecture
- Total flexibility: swapping a component means changing one line of config
- Isolated tests: each interface can be mocked independently
- Low-risk evolution: a new component does not affect the others
- Parallel development: team members can work on different interfaces
- Gradual migration: no big bang, one component at a time
- Simpler maintenance: a bug stays isolated in its component
- Tunable performance: optimize one component without breaking the others
This architecture makes it possible to start simple (LLM + JSON) and evolve component by component as needs arise.
Architecture finalized for the modular, free, full-LLM version