GroveEngine/tests/benchmarks/plans/04_e2e.md
StillHammer 063549bf17 feat: Add comprehensive benchmark suite for GroveEngine performance validation
Add complete benchmark infrastructure with 4 benchmark categories:

**Benchmark Helpers (00_helpers.md)**
- BenchmarkTimer.h: High-resolution timing with std::chrono
- BenchmarkStats.h: Statistical analysis (mean, median, p95, p99, stddev)
- BenchmarkReporter.h: Professional formatted output
- benchmark_helpers_demo.cpp: Validation suite

**TopicTree Routing (01_topictree.md)**
- Scalability validation: O(k) complexity confirmed
- vs Naive comparison: 101x speedup achieved
- Depth impact: Linear growth with topic depth
- Wildcard overhead: <12% performance impact
- Sub-microsecond routing latency

**IntraIO Batching (02_batching.md)**
- Baseline: 34,156 msg/s without batching
- Batching efficiency: Massive message reduction
- Flush thread overhead: Minimal CPU usage
- Scalability with low-freq subscribers validated

**DataNode Read-Only API (03_readonly.md)**
- Zero-copy speedup: 2x faster than getChild()
- Concurrent reads: 23.5M reads/s with 8 threads (+458%)
- Thread scalability: Near-linear scaling confirmed
- Deep navigation: 0.005µs per level

**End-to-End Real World (04_e2e.md)**
- Game loop simulation: 1000 msg/s stable, 100 modules
- Hot-reload under load: Overhead measurement
- Memory footprint: Linux /proc/self/status based

Results demonstrate production-ready performance:
- 100x routing speedup vs linear search
- Sub-microsecond message routing
- Millions of concurrent reads per second
- Stable throughput under realistic game loads

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 16:08:10 +08:00


Plan: End-to-End Real World Benchmarks

Objective

Realistic game scenarios to validate overall performance.


Benchmark M: Game Loop Simulation

Test: Latency and throughput in a realistic game scenario.

Setup:

  • 100 simulated modules:
    • 50 game logic: publish player:*, game:*
    • 30 AI: subscribe ai:*, player:*
    • 20 rendering: subscribe render:*, player:*
  • 1000 messages/sec for 10 seconds
  • Varied topics: player:123:position, ai:enemy:target, render:draw, physics:collision

Measurements:

  • p50 latency: X µs
  • p95 latency: Y µs
  • p99 latency: Z µs (expected <1ms)
  • Throughput: W msg/s
  • CPU usage: U%

Success criteria:

  • p99 < 1ms
  • Throughput stable at 1000 msg/s
  • CPU < 50%
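
The percentile measurements and pass/fail thresholds for Benchmark M could be evaluated roughly as follows. This is a minimal sketch: `percentileUs`, `GameLoopResult`, and `meetsBenchmarkM` are hypothetical names and do not reflect the actual BenchmarkStats.h API, and the 5% throughput tolerance is an assumption, since the plan does not quantify "stable".

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the BenchmarkStats.h API): nearest-rank percentile
// over latency samples in microseconds. The vector is taken by value so the
// caller's samples keep their arrival order.
double percentileUs(std::vector<double> samples, double p) {
    assert(!samples.empty() && p > 0.0 && p <= 100.0);
    std::sort(samples.begin(), samples.end());
    // Nearest rank: the smallest sample such that at least p% are <= it.
    const auto rank = static_cast<std::size_t>(
        std::ceil(p * static_cast<double>(samples.size()) / 100.0));
    return samples[rank - 1];
}

// Hypothetical result record for one Benchmark M run.
struct GameLoopResult {
    double p99Us;
    double throughputMsgPerSec;
    double cpuPercent;
};

// Thresholds from the plan: p99 < 1 ms, throughput at 1000 msg/s, CPU < 50%.
// The default 5% throughput tolerance is an assumption, not part of the plan.
bool meetsBenchmarkM(const GameLoopResult& r, double tolerance = 0.05) {
    const double target = 1000.0;
    return r.p99Us < 1000.0 &&
           std::fabs(r.throughputMsgPerSec - target) <= target * tolerance &&
           r.cpuPercent < 50.0;
}
```

Nearest-rank is used here because it always returns a latency that was actually observed; an interpolating percentile would be an equally reasonable choice.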

Benchmark N: Hot-Reload Under Load

Test: Hot-reload overhead under active load.

Setup:

  • Start benchmark M (game loop)
  • After 5s, trigger a hot-reload of one module
  • Measure pause time and impact on latency

Measurements:

  • Pause time: X ms (expected <50ms)
  • p99 latency during reload: Y µs
  • Overhead: (latency_reload - latency_normal) / latency_normal

Success criteria:

  • Pause < 50ms
  • Overhead < 10%

Note: Simulate hot-reload by unloading and reloading a module
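
As a worked instance of the overhead formula: with purely illustrative numbers (not measured results), a baseline p99 of 200 µs and a p99 of 215 µs during reload give (215 - 200) / 200 = 0.075, or 7.5%, which would pass the 10% criterion. A small sketch of that check, with hypothetical function names:

```cpp
#include <cassert>

// Relative overhead as defined in the plan:
// (latency_reload - latency_normal) / latency_normal.
double reloadOverhead(double latencyNormalUs, double latencyReloadUs) {
    assert(latencyNormalUs > 0.0);
    return (latencyReloadUs - latencyNormalUs) / latencyNormalUs;
}

// Success criteria from the plan: pause < 50 ms and overhead < 10%.
bool meetsBenchmarkN(double pauseMs, double overhead) {
    return pauseMs < 50.0 && overhead < 0.10;
}
```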


Benchmark O: Memory Footprint

Test: Memory consumption of the TopicTree and buffers.

Setup:

  • Create 10000 unique topics
  • Create 1000 subscribers (varied patterns)
  • Measure memory usage before/after

Measurements:

  • Memory before: X MB (baseline)
  • Memory after topics: Y MB
  • Memory after subscribers: Z MB
  • Memory/topic: (Y-X) / 10000, converted to bytes
  • Memory/subscriber: (Z-Y) / 1000, converted to bytes

Success criteria:

  • Memory/topic < 1KB
  • Memory/subscriber < 5KB

Implementation: Read /proc/self/status (VmRSS) or use malloc_stats()
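
Reading VmRSS could look like the sketch below. The function names are hypothetical, and returning -1 on non-Linux platforms is a placeholder assumption rather than a real fallback. The parser takes a stream so it can be exercised without touching /proc.

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Extracts the "VmRSS:" value (in kB) from /proc/<pid>/status-style text.
// Returns -1 if the line is missing or malformed.
long parseVmRssKb(std::istream& in) {
    std::string line;
    while (std::getline(in, line)) {
        if (line.rfind("VmRSS:", 0) == 0) {  // line starts with "VmRSS:"
            std::istringstream fields(line.substr(6));
            long kb = -1;
            if (fields >> kb) {
                return kb;
            }
            return -1;
        }
    }
    return -1;
}

// Resident set size of the current process in kB, or -1 where unsupported.
long currentRssKb() {
#ifdef __linux__
    std::ifstream status("/proc/self/status");
    return parseVmRssKb(status);
#else
    return -1;  // assumption: other platforms need their own implementation
#endif
}

// Per-item cost in bytes from two RSS readings in kB,
// e.g. (Y - X) / 10000 for topics.
double bytesPerItem(long beforeKb, long afterKb, std::size_t count) {
    return static_cast<double>(afterKb - beforeKb) * 1024.0 /
           static_cast<double>(count);
}
```

RSS is accounted in whole pages, so per-item figures derived this way are only meaningful when the item count is large, as with the 10000 topics planned here.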


Implementation

File: benchmark_e2e.cpp

Dependencies:

  • IntraIOManager (src/)
  • JsonDataNode (src/)
  • Possibly ModuleLoader for the hot-reload simulation
  • Helpers: Timer, Stats, Reporter

Structure:

class MockModule {
    // Simulates a module (publisher or subscriber)
};

void benchmarkM_game_loop();
void benchmarkN_hotreload_under_load();
void benchmarkO_memory_footprint();

int main() {
    benchmarkM_game_loop();
    benchmarkN_hotreload_under_load();
    benchmarkO_memory_footprint();
}

Complexity: Higher than the other benchmarks (integrates multiple features)

Reference: tests/integration/test_13_cross_system.cpp (IO + DataNode)


Notes

Benchmark M:

  • Use threads to simulate concurrent modules
  • Randomize patterns for realism
  • Measure latency = time between publish and receive
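
One way to measure publish-to-receive latency across threads is to stamp each message at publish time and subtract on receipt. The sketch below uses a toy mutex-and-condition-variable channel as a stand-in for IntraIOManager, whose API is not shown in this plan; `ToyChannel`, `TimedMessage`, and `measureLatenciesUs` are all hypothetical names.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;  // monotonic, suitable for intervals

struct TimedMessage {
    std::string topic;
    Clock::time_point publishedAt;
};

// Toy single-queue channel; a stand-in for the real messaging layer.
class ToyChannel {
public:
    void publish(std::string topic) {
        const auto stamp = Clock::now();  // taken before locking, so lock wait counts
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(TimedMessage{std::move(topic), stamp});
        }
        ready_.notify_one();
    }

    TimedMessage receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        TimedMessage msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<TimedMessage> queue_;
};

// Publishes messageCount messages from this thread and records, on a
// subscriber thread, the publish-to-receive latency of each one in µs.
std::vector<double> measureLatenciesUs(std::size_t messageCount) {
    ToyChannel channel;
    std::vector<double> latenciesUs;
    latenciesUs.reserve(messageCount);

    std::thread subscriber([&] {
        for (std::size_t i = 0; i < messageCount; ++i) {
            const TimedMessage msg = channel.receive();
            const std::chrono::duration<double, std::micro> elapsed =
                Clock::now() - msg.publishedAt;
            latenciesUs.push_back(elapsed.count());
        }
    });

    for (std::size_t i = 0; i < messageCount; ++i) {
        channel.publish("player:123:position");
    }
    subscriber.join();  // after join, latenciesUs is safe to read here
    return latenciesUs;
}
```

Publishing in a tight loop as above measures queueing delay under burst; pacing the publisher to 1000 msg/s, as the plan specifies, would give figures closer to the intended scenario.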

Benchmark N:

  • May require a hook in ModuleLoader to measure the pause
  • Alternative: simulate with mutex lock/unlock
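
The mutex-based alternative might be sketched as follows: the mock holds a lock for the duration of the simulated reload, so any concurrent `process()` call stalls, and the time spent is reported as the pause. `ReloadableMock` is a hypothetical class, and the sleep merely stands in for real unload/reload work.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a hot-reloadable module; not the ModuleLoader API.
class ReloadableMock {
public:
    // Normal message handling: takes the lock briefly.
    void process() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++processed_;
    }

    // Simulated hot-reload: holds the lock for reloadWork, blocking process().
    // Returns the measured pause in milliseconds, including lock acquisition.
    double simulateReload(std::chrono::milliseconds reloadWork) {
        const auto start = std::chrono::steady_clock::now();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::this_thread::sleep_for(reloadWork);  // stands in for unload + reload
            ++generation_;
        }
        const std::chrono::duration<double, std::milli> pause =
            std::chrono::steady_clock::now() - start;
        return pause.count();
    }

    int generation() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return generation_;
    }

    long processed() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return processed_;
    }

private:
    mutable std::mutex mutex_;
    int generation_ = 0;
    long processed_ = 0;
};
```

In a benchmark run, worker threads would call `process()` in a loop while the main thread calls `simulateReload` after 5 s; the returned value is then compared against the 50 ms pause budget.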

Benchmark O:

  • Memory measurement may be OS-dependent
  • Use #ifdef __linux__ for /proc, with an alternative for other OSes