GroveEngine/docs/coding_guidelines.md
StillHammer 98acb32c4c fix: Resolve deadlock in IntraIOManager + cleanup SEGFAULTs
- Fix critical deadlock in IntraIOManager using std::scoped_lock for
  multi-mutex acquisition (CrossSystemIntegration: 1901s → 4s)
- Add std::shared_mutex for read-heavy operations (TopicTree, IntraIOManager)
- Fix SEGFAULT in SequentialModuleSystem destructor (logger guard)
- Fix SEGFAULT in ModuleLoader (don't auto-unload when modules still alive)
- Fix iterator invalidation in DependencyTestEngine destructor
- Add TSan/Helgrind integration for deadlock detection
- Add coding guidelines for synchronization patterns

All 23 tests now pass (100%)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 11:36:33 +08:00

3.5 KiB

Synchronization Guidelines - GroveEngine

Overview

This document describes the thread-safety patterns used in GroveEngine to prevent deadlocks and optimize concurrent access.

Mutex Types

std::mutex

Use for simple, exclusive access to resources with balanced read/write patterns.

std::shared_mutex (C++17)

Use for read-heavy workloads where multiple readers can access data concurrently.

  • std::shared_lock - Allows concurrent reads
  • std::unique_lock - Exclusive write access

DO: Use std::scoped_lock for multiple mutexes

When you need to lock multiple mutexes, always use std::scoped_lock to prevent deadlock via lock-order-inversion:

void function() {
    std::scoped_lock lock(mutex1, mutex2, mutex3);
    // Safe - lock order guaranteed by implementation (deadlock-free algorithm)
}

DON'T: Use std::lock_guard for multiple mutexes

void function() {
    std::lock_guard<std::mutex> lock1(mutex1);  // BAD
    std::lock_guard<std::mutex> lock2(mutex2);  // DEADLOCK RISK
    // If another thread locks in reverse order -> deadlock
}

DO: Use std::unique_lock with std::lock if you need early unlock

void function() {
    std::unique_lock<std::mutex> lock1(mutex1, std::defer_lock);
    std::unique_lock<std::mutex> lock2(mutex2, std::defer_lock);
    std::lock(lock1, lock2);  // Safe deadlock-free acquisition

    // ... do work ...

    // Can unlock early if needed
    lock1.unlock();
}

DO: Use shared_lock for read operations

class DataStore {
    mutable std::shared_mutex mutex;
    std::map<std::string, Data> data;

public:
    Data get(const std::string& key) const {
        std::shared_lock lock(mutex);  // Multiple readers allowed
        return data.at(key);
    }

    void set(const std::string& key, Data value) {
        std::unique_lock lock(mutex);  // Exclusive write access
        data[key] = std::move(value);
    }
};

Read/Write Ratio Guidelines

Ratio Recommendation
>10:1 (read-heavy) Use std::shared_mutex
1:1 to 10:1 Consider std::shared_mutex
<1:1 (write-heavy) Use std::mutex (shared_mutex overhead not worth it)

GroveEngine Specific Patterns

TopicTree

  • Uses std::shared_mutex
  • findSubscribers() - shared_lock (READ)
  • registerSubscriber() - unique_lock (WRITE)
  • unregisterSubscriber() - unique_lock (WRITE)

IntraIOManager

  • Uses std::shared_mutex for instances map
  • Uses std::scoped_lock when both managerMutex and batchMutex needed
  • getInstance() - shared_lock (READ)
  • createInstance() - unique_lock (WRITE)
  • routeMessage() - scoped_lock (both mutexes)

Validation Tools

ThreadSanitizer (TSan)

Build with TSan to detect lock-order-inversions and data races:

cmake -DGROVE_ENABLE_TSAN=ON -B build-tsan
cmake --build build-tsan
TSAN_OPTIONS="detect_deadlocks=1" ctest

Helgrind

Alternative detector via Valgrind:

cmake -DGROVE_ENABLE_HELGRIND=ON -B build
cmake --build build
make helgrind

Common Pitfalls

  1. Nested lock calls - Don't call a function that takes a lock while holding the same lock
  2. Lock-order-inversion - Always use scoped_lock for multiple mutexes
  3. shared_lock in write context - Never use shared_lock when modifying data
  4. Recursive locking - Avoid; redesign if needed

Author: Claude Code Date: 2025-01-21