aissia/docs/03-implementation/configuration/error-handling.md

Error Handling & Reliability System

Points 291-310 - Error Handling - Anti-cheat validation, input responses, module failures, network issues

Engine Crash/Restart Strategy

Detection System

Automatic health monitoring:

# HTTP health check
GET /health every 30 seconds

# Redis heartbeat
PUBLISH engine:heartbeat {"engine": "factory", "timestamp": "..."}

# Timeout detection
No heartbeat for 60s = engine down
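The timeout rule above can be sketched as a small in-memory tracker. This is only an illustration: `HeartbeatMonitor`, `recordHeartbeat` and `isDown` are hypothetical names, the Redis subscription that would feed it is left out, and treating exactly 60s as still alive is an assumption.

```cpp
#include <chrono>
#include <string>
#include <unordered_map>

// Hypothetical helper: remembers the last heartbeat seen for each engine.
class HeartbeatMonitor {
public:
    using Clock = std::chrono::steady_clock;

    explicit HeartbeatMonitor(std::chrono::seconds timeout = std::chrono::seconds(60))
        : timeout_(timeout) {}

    // Call whenever a message arrives on the engine:heartbeat channel.
    void recordHeartbeat(const std::string& engine, Clock::time_point now) {
        lastSeen_[engine] = now;
    }

    // An engine counts as down if it never reported, or if its last
    // heartbeat is strictly older than the timeout.
    bool isDown(const std::string& engine, Clock::time_point now) const {
        auto it = lastSeen_.find(engine);
        if (it == lastSeen_.end()) return true;
        return now - it->second > timeout_;
    }

private:
    std::chrono::seconds timeout_;
    std::unordered_map<std::string, Clock::time_point> lastSeen_;
};
```

Passing `now` explicitly instead of reading the clock inside the class keeps the check deterministic and easy to test.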

Recovery Automatique

Restart and state restoration:

// Engine restarts → republishes its current state to Redis
// Example: Factory Engine restart
PUBLISH factory:status {"active_productions": [...]}

// Other engines receive the update and adjust their local state

Graceful Degradation

Fallback to local cache:

// In each engine
if (!canReachEngine("economy")) {
    // Use the last known prices from the cache
    price = fallbackPriceCache.get(resource);
    logWarning("Using cached price, Economy engine unreachable");
}
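A slightly fuller sketch of the same fallback, assuming the live lookup is injected as a callback that returns nothing when the Economy engine is unreachable. `PriceLookup`, `Fetch` and `priceOf` are hypothetical names, not part of the engine API.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical wrapper: tries the live source first, then the local cache.
class PriceLookup {
public:
    // Returns std::nullopt when the Economy engine cannot be reached.
    using Fetch = std::function<std::optional<double>(const std::string&)>;

    explicit PriceLookup(Fetch fetch) : fetch_(std::move(fetch)) {}

    std::optional<double> priceOf(const std::string& resource) {
        if (auto live = fetch_(resource)) {
            cache_[resource] = *live;  // refresh the cache on every successful fetch
            return live;
        }
        auto it = cache_.find(resource);
        if (it == cache_.end()) return std::nullopt;  // never seen: nothing to fall back to
        return it->second;                            // last known price
    }

private:
    Fetch fetch_;
    std::unordered_map<std::string, double> cache_;
};
```

Returning `std::optional` makes the "no live value and no cached value" case explicit, so the caller has to decide what reduced behaviour means for that resource.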

Redis Failover Strategy

Persistence Configuration

Redis durability:

# Redis configuration
save 900 1     # Save a snapshot if 1+ keys changed in 15 min
appendonly yes # Log every write command to the append-only file

Multiple Instances

High availability setup:

Primary Redis: 6379 (read/write)
Replica Redis: 6380 (read-only backup)
If the primary goes down → engines automatically switch to the replica
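One way an engine could pick its Redis endpoint is to walk the candidates in priority order and take the first one that responds. This is a sketch under assumptions: `RedisEndpoint`, `selectEndpoint` and the injected `isReachable` probe are hypothetical, and no real Redis client is involved.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical description of one Redis instance.
struct RedisEndpoint {
    std::string role;
    int port;
    bool writable;
};

// Returns the first reachable endpoint in priority order, or nothing if all are down.
std::optional<RedisEndpoint> selectEndpoint(
    const std::vector<RedisEndpoint>& candidates,
    const std::function<bool(const RedisEndpoint&)>& isReachable) {
    for (const auto& endpoint : candidates) {
        if (isReachable(endpoint)) return endpoint;
    }
    return std::nullopt;
}
```

With the candidates `{"primary", 6379, true}` and `{"replica", 6380, false}`, a failed probe on 6379 yields the replica. Since the replica is read-only, the `writable` flag lets the caller hold back or reject writes while running on it.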

Message Replay

State recovery after a Redis restart, by replaying critical messages.
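The document does not specify how replay works. One possible shape, shown purely as an assumption, is a bounded publisher-side log of critical messages with sequence numbers, from which everything after the last acknowledged message is re-sent. `ReplayLog`, `CriticalMessage` and `pendingAfter` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

struct CriticalMessage {
    std::uint64_t seq;
    std::string channel;
    std::string payload;
};

// Hypothetical bounded log kept by the publishing engine (capacity >= 1 assumed).
class ReplayLog {
public:
    explicit ReplayLog(std::size_t capacity) : capacity_(capacity) {}

    // Records a critical message; the oldest entry is dropped once the log is full.
    std::uint64_t append(const std::string& channel, const std::string& payload) {
        if (!log_.empty() && log_.size() >= capacity_) log_.pop_front();
        log_.push_back({nextSeq_, channel, payload});
        return nextSeq_++;
    }

    // Messages published after lastAcked, in publish order, ready to be re-sent.
    std::vector<CriticalMessage> pendingAfter(std::uint64_t lastAcked) const {
        std::vector<CriticalMessage> out;
        for (const auto& message : log_) {
            if (message.seq > lastAcked) out.push_back(message);
        }
        return out;
    }

private:
    std::size_t capacity_;
    std::uint64_t nextSeq_ = 1;
    std::deque<CriticalMessage> log_;
};
```

After Redis comes back, the engine would republish each entry of `pendingAfter(lastAcked)` on its original channel.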

Anti-cheat & Validation

Server Authoritative Design

  • All business logic runs server-side
  • Natural anti-cheat through centralized validation
  • Zero game logic on the client

Anti-cheat Psychologique

Strategy: cheat attempts → progressively simulated "bugs"

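#include <chrono>

using namespace std::chrono_literals;  // enables the 50ms / 500ms literals (C++14)

enum class CheatType { SPEED_HACK, RESOURCE_HACK, DAMAGE_HACK };
enum class Target { Tanks, Movement };

// Effect hooks, assumed to be implemented elsewhere in the engine.
void simulateRandomLag(std::chrono::milliseconds min, std::chrono::milliseconds max);
void simulateVisualGlitches(Target target, double rate);
void simulateDesync(Target target, double rate);

class AntiCheatPsycho {
public:
    // Answers each detected cheat with a simulated "bug".
    void onCheatDetected(CheatType type) {
        switch (type) {
            case CheatType::SPEED_HACK:
                simulateRandomLag(50ms, 500ms);               // random lag between 50 and 500 ms
                break;
            case CheatType::RESOURCE_HACK:
                simulateVisualGlitches(Target::Tanks, 0.05);  // 5%
                break;
            case CheatType::DAMAGE_HACK:
                simulateDesync(Target::Movement, 0.02);       // 2%
                break;
        }
    }
};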
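The strategy calls the simulated bugs "progressive", but gives no escalation rule. As one possible reading, and only as an assumption, the effect rate could grow linearly with the number of detections for a player, up to a cap. `escalatedRate` and its parameters are hypothetical.

```cpp
#include <algorithm>

// Hypothetical escalation rule: the rate grows linearly with the number of
// detections and never exceeds maxRate. No detections means no effect.
double escalatedRate(double baseRate, int detections, double maxRate) {
    if (detections <= 0) return 0.0;
    return std::min(maxRate, baseRate * detections);
}
```

With a base rate of 5% and a 50% cap, the first detection gives 5% and any count of ten or more stays at 50%.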

Input Validation

  • V1 Thin Client: server-authoritative validation
  • V2 Client Prediction: validation + reconciliation
  • Build verification: hot-reload only if the build succeeded
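For the V2 item, a minimal sketch of the usual client-side prediction and reconciliation pattern, reduced to one dimension. The client applies inputs immediately, and when the authoritative state arrives it restarts from the server position and replays only the inputs the server has not yet acknowledged. `Input`, `PredictedPosition` and the sequence-number scheme are assumptions for illustration.

```cpp
#include <cstdint>
#include <deque>

struct Input {
    std::uint32_t seq;  // client-assigned sequence number
    double dx;          // requested displacement
};

// Hypothetical client-side state for one coordinate.
class PredictedPosition {
public:
    // Applies the input locally right away and keeps it until the server acknowledges it.
    void applyLocal(const Input& input) {
        pending_.push_back(input);
        x_ += input.dx;
    }

    // Drops acknowledged inputs, resets to the authoritative position,
    // then replays the inputs that are still unacknowledged.
    void reconcile(double serverX, std::uint32_t lastAckedSeq) {
        while (!pending_.empty() && pending_.front().seq <= lastAckedSeq) {
            pending_.pop_front();
        }
        x_ = serverX;
        for (const auto& input : pending_) x_ += input.dx;
    }

    double x() const { return x_; }

private:
    double x_ = 0.0;
    std::deque<Input> pending_;
};
```

If the server validated inputs 1 and 2 but clamped the result to 1.5, the client ends up at 1.5 plus the still-pending input 3, so the server's correction wins without discarding the player's latest action.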

Network Issues & Module Failures

Module Isolation

  • Localized failures: no cascade between modules
  • Module autonomy: a module continues with the last data it received
  • Async communication: Redis Pub/Sub for resilience
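The "no cascade" property can be illustrated with a loop that runs each module inside its own try/catch, so one failing module does not stop the others. This is a sketch under assumptions: `Module` and `runIsolated` are hypothetical, and only failures reported as C++ exceptions derived from `std::exception` are contained.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical module handle: a name plus an update step that may throw.
struct Module {
    std::string name;
    std::function<void()> update;
};

// Runs every module even if earlier ones fail; returns the names of the failed modules.
std::vector<std::string> runIsolated(const std::vector<Module>& modules) {
    std::vector<std::string> failed;
    for (const auto& module : modules) {
        try {
            module.update();
        } catch (const std::exception&) {
            failed.push_back(module.name);  // contain the failure, keep going
        }
    }
    return failed;
}
```

The returned list gives the caller a place to log the failure or schedule a restart of the affected module.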

Timeout Handling

  • Fallback patterns: local cache if an engine is unreachable
  • Graceful degradation: reduced functionality rather than a total crash
  • State preservation: state is maintained during temporary failures

Sources

Original documentation:

  • docs/global/architecture-technique.md - Section "Error Handling & Reliability"
  • docs/toCheck/architecture-modulaire.md - Anti-cheat psychologique
  • docs/architecture-technique.md - Validation patterns

Points covered: 20 detailed error-handling specifications