GroveEngine/docs/coding_guidelines.md
StillHammer 98acb32c4c fix: Resolve deadlock in IntraIOManager + cleanup SEGFAULTs
- Fix critical deadlock in IntraIOManager using std::scoped_lock for
  multi-mutex acquisition (CrossSystemIntegration: 1901s → 4s)
- Add std::shared_mutex for read-heavy operations (TopicTree, IntraIOManager)
- Fix SEGFAULT in SequentialModuleSystem destructor (logger guard)
- Fix SEGFAULT in ModuleLoader (don't auto-unload when modules still alive)
- Fix iterator invalidation in DependencyTestEngine destructor
- Add TSan/Helgrind integration for deadlock detection
- Add coding guidelines for synchronization patterns

All 23 tests now pass (100%)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 11:36:33 +08:00

124 lines
3.5 KiB
Markdown

# Synchronization Guidelines - GroveEngine
## Overview
This document describes the thread-safety patterns used in GroveEngine to prevent deadlocks and optimize concurrent access.
## Mutex Types
### std::mutex
Use for simple, exclusive access to resources with balanced read/write patterns.
### std::shared_mutex (C++17)
Use for **read-heavy workloads** where multiple readers can access data concurrently.
- `std::shared_lock` - Allows concurrent reads
- `std::unique_lock` - Exclusive write access
## DO: Use std::scoped_lock for multiple mutexes
When you need to lock multiple mutexes, **always** use `std::scoped_lock` to prevent deadlock via lock-order-inversion:
```cpp
void function() {
std::scoped_lock lock(mutex1, mutex2, mutex3);
// Safe - lock order guaranteed by implementation (deadlock-free algorithm)
}
```
## DON'T: Use std::lock_guard for multiple mutexes
```cpp
void function() {
std::lock_guard<std::mutex> lock1(mutex1); // BAD
std::lock_guard<std::mutex> lock2(mutex2); // DEADLOCK RISK
// If another thread locks in reverse order -> deadlock
}
```
## DO: Use std::unique_lock with std::lock if you need early unlock
```cpp
void function() {
std::unique_lock<std::mutex> lock1(mutex1, std::defer_lock);
std::unique_lock<std::mutex> lock2(mutex2, std::defer_lock);
std::lock(lock1, lock2); // Safe deadlock-free acquisition
// ... do work ...
// Can unlock early if needed
lock1.unlock();
}
```
## DO: Use shared_lock for read operations
```cpp
class DataStore {
mutable std::shared_mutex mutex;
std::map<std::string, Data> data;
public:
Data get(const std::string& key) const {
std::shared_lock lock(mutex); // Multiple readers allowed
return data.at(key);
}
void set(const std::string& key, Data value) {
std::unique_lock lock(mutex); // Exclusive write access
data[key] = std::move(value);
}
};
```
## Read/Write Ratio Guidelines
| Ratio | Recommendation |
|-------|---------------|
| >10:1 (read-heavy) | Use `std::shared_mutex` |
| 1:1 to 10:1 | Consider `std::shared_mutex` |
| <1:1 (write-heavy) | Use `std::mutex` (shared_mutex overhead not worth it) |
## GroveEngine Specific Patterns
### TopicTree
- Uses `std::shared_mutex`
- `findSubscribers()` - `shared_lock` (READ)
- `registerSubscriber()` - `unique_lock` (WRITE)
- `unregisterSubscriber()` - `unique_lock` (WRITE)
### IntraIOManager
- Uses `std::shared_mutex` for instances map
- Uses `std::scoped_lock` when both `managerMutex` and `batchMutex` needed
- `getInstance()` - `shared_lock` (READ)
- `createInstance()` - `unique_lock` (WRITE)
- `routeMessage()` - `scoped_lock` (both mutexes)
## Validation Tools
### ThreadSanitizer (TSan)
Build with TSan to detect lock-order-inversions and data races:
```bash
cmake -DGROVE_ENABLE_TSAN=ON -B build-tsan
cmake --build build-tsan
TSAN_OPTIONS="detect_deadlocks=1" ctest
```
### Helgrind
Alternative detector via Valgrind:
```bash
cmake -DGROVE_ENABLE_HELGRIND=ON -B build
cmake --build build
make helgrind
```
## Common Pitfalls
1. **Nested lock calls** - Don't call a function that takes a lock while holding the same lock
2. **Lock-order-inversion** - Always use `scoped_lock` for multiple mutexes
3. **shared_lock in write context** - Never use shared_lock when modifying data
4. **Recursive locking** - Avoid; redesign if needed
---
**Author**: Claude Code
**Date**: 2025-01-21