Co-authored-by: aider (openai/andrew/openrouter/mistralai/mistral-medium-3.1) <aider@aider.chat>
331 lines
12 KiB
Markdown
# Refactoring Plan to Reduce Code Duplication

## 1. Create Core Service Layer with Clear Boundaries

**Files:**
- Add: `src/core/item_service.rs`
- Add: `src/core/async_item_service.rs`
- Add: `src/core/compression_service.rs`
- Add: `src/core/meta_service.rs`
- Add: `src/core/mod.rs`
- Add: `src/core/error.rs`

**Functions:**
- Add: `get_item_full` in `item_service.rs`
- Add: `save_item` in `item_service.rs`
- Add: `list_items` in `item_service.rs`
- Add: `delete_item` in `item_service.rs`
- Add: `get_compressed_content` in `compression_service.rs`
- Add: `process_metadata` in `meta_service.rs`
- Add: Async wrappers for all core functions in `async_item_service.rs`

**Reason:** Extract common business logic from modes and APIs into reusable services

**Implementation:**
- Move logic from modes (get, save, list, info) and API handlers into service functions
- Services return structured data and leave output formatting to the caller
- Services handle compression, metadata, and database operations
- Keep core services **synchronous** for CLI performance
- Provide async wrappers using `tokio::task::spawn_blocking` for API use
- Use streaming for pipeline efficiency (process data in chunks)

**Layer Division:**
- **`db.rs`**: Low-level SQL operations (e.g., `insert_item`, `query_all_items`)
- **`item_service.rs`**: Higher-level business logic (e.g., "get item with metadata and tags," "validate item before save")
- Example:
  ```rust
  // db.rs (low-level): one function per SQL operation
  pub fn get_item_with_tags(conn: &Connection, id: i64) -> Result<(Item, Vec<Tag>)> {
      todo!() // SQL query elided in this plan
  }

  // item_service.rs (high-level): combines low-level queries into one result
  pub fn get_item_full(conn: &Connection, id: i64) -> Result<ItemWithMeta> {
      let (item, tags) = db::get_item_with_tags(conn, id)?;
      let meta = db::get_item_meta(conn, &item)?;
      Ok(ItemWithMeta { item, tags, meta })
  }
  ```

## 2. Thread Safety and Resource Management

**Files:**
- Change: `src/modes/server/api/item.rs`
- Change: `src/modes/server/api/status.rs`
- Change: `src/modes/server/mcp/tools.rs`

**Functions:**
- Add: `get_item_async` in `async_item_service.rs`
- Add: `save_item_async` in `async_item_service.rs`
- Change: All API handlers to use async services

**Reason:** Ensure services can be safely used in both synchronous and asynchronous contexts

**Implementation:**
- Document thread-safety guarantees for all core services
- Use `Arc<Mutex<T>>` or connection pooling for shared resources like database connections
- Provide examples for safe async/sync boundaries:
  ```rust
  pub async fn get_item_async(
      state: &AppState,
      id: i64,
  ) -> Result<ItemWithMeta, CoreError> {
      let conn = state.db.clone(); // Arc<Mutex<Connection>>
      tokio::task::spawn_blocking(move || {
          // Note: `unwrap` panics if the mutex was poisoned by another thread
          let conn = conn.lock().unwrap();
          item_service::get_item_full(&conn, id)
      })
      .await? // needs `From<tokio::task::JoinError>` for `CoreError`
  }
  ```
- Use `tokio::task::spawn_blocking` for CPU-bound or blocking I/O operations
- Benchmark thread pool sizes for CPU-bound tasks (e.g., compression)

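
The `conn.lock().unwrap()` call in the example above panics when the mutex is poisoned. One possible alternative is sketched below using only the standard library: the lock failure is turned into an error value that the caller can convert like any other. The `with_locked` helper and the `LockPoisoned` variant are hypothetical names, not part of the plan, and `CoreError` here is a cut-down stand-in.

```rust
use std::fmt;
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for the plan's `CoreError`; only the variant needed here.
#[derive(Debug, PartialEq)]
pub enum CoreError {
    LockPoisoned,
}

impl fmt::Display for CoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CoreError::LockPoisoned => write!(f, "shared resource lock was poisoned"),
        }
    }
}

impl std::error::Error for CoreError {}

/// Runs `f` with exclusive access to the shared value and reports a poisoned
/// lock as an error instead of panicking.
pub fn with_locked<T, R>(
    shared: &Arc<Mutex<T>>,
    f: impl FnOnce(&mut T) -> R,
) -> Result<R, CoreError> {
    let mut guard = shared.lock().map_err(|_| CoreError::LockPoisoned)?;
    Ok(f(&mut *guard))
}
```

Inside a `spawn_blocking` closure, a call such as `with_locked(&conn, |c| ...)` would take the place of the explicit `lock().unwrap()`. Whether a poisoned connection should be reported or recovered is a design decision this sketch does not settle.
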
## 3. Create Common Data Structures

**Files:**
- Add: `src/core/types.rs`

**Functions:**
- Add: `From<db::Item>` for `ItemWithMeta`
- Add: `From<ItemWithMeta>` for `ItemInfo` (API response)
- Add: Serialization/deserialization implementations

**Reason:** Standardize data structures used across modes and APIs

**Implementation:**
- Define structs for `Item`, `ItemWithContent`, `ItemWithMeta`, `Response<T>`
- Include conversion functions from database types (`From<db::Item>`)
- Add serialization/deserialization support for JSON/YAML
- Ensure all fields are properly documented
- Use zero-copy patterns where possible (borrow slices instead of copying buffers)

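
The two `From` conversions listed above could look roughly like this. The field names are assumptions, since the plan names the types but not their contents, and `DbItem` stands in for `db::Item` to keep the sketch self-contained.

```rust
// Hypothetical stand-in for `db::Item`; the real fields are not given in the plan.
#[derive(Debug, Clone, PartialEq)]
pub struct DbItem {
    pub id: i64,
    pub size: u64,
}

// Item plus the data the services attach to it.
#[derive(Debug, Clone, PartialEq)]
pub struct ItemWithMeta {
    pub item: DbItem,
    pub tags: Vec<String>,
    pub meta: Vec<(String, String)>,
}

// Flattened shape for API responses.
#[derive(Debug, Clone, PartialEq)]
pub struct ItemInfo {
    pub id: i64,
    pub size: u64,
    pub tags: Vec<String>,
}

impl From<DbItem> for ItemWithMeta {
    // A bare database row carries no tags or metadata yet.
    fn from(item: DbItem) -> Self {
        ItemWithMeta { item, tags: Vec::new(), meta: Vec::new() }
    }
}

impl From<ItemWithMeta> for ItemInfo {
    // Moves the tag list instead of cloning it.
    fn from(full: ItemWithMeta) -> Self {
        ItemInfo { id: full.item.id, size: full.item.size, tags: full.tags }
    }
}
```

With these in place, a handler can write `let info: ItemInfo = full.into();` and stay unaware of the database row layout.
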
## 4. Unified Error Handling with Conversions

**Files:**
- Add: `src/core/error.rs`
- Change: `src/modes/server/api/item.rs`
- Change: `src/modes/server/mcp/tools.rs`
- Change: All mode files (`get.rs`, `save.rs`, etc.)

**Functions:**
- Add: `CoreError` enum with comprehensive variants
- Add: `From<CoreError>` for `StatusCode`
- Add: `From<CoreError>` for `ToolError`
- Rely on `anyhow`'s blanket `From` impl to convert `CoreError` into `anyhow::Error` (a hand-written impl would conflict with it; `CoreError` must be `Send + Sync + 'static`)
- Change: All error handling to use `CoreError`

**Reason:** Standardize error types across CLI, API, and MCP interfaces

**Implementation:**
- Define a base error enum (`CoreError`) with conversions to all interface-specific error types
- Example:
  ```rust
  use axum::http::StatusCode;

  #[derive(Debug, thiserror::Error)]
  pub enum CoreError {
      #[error("DB error: {0}")]
      Database(#[from] rusqlite::Error),
      #[error("IO error: {0}")]
      Io(#[from] std::io::Error),
      // ...
  }

  // Auto-convert to HTTP status
  impl From<CoreError> for StatusCode {
      fn from(err: CoreError) -> Self {
          match err {
              CoreError::Database(_) => StatusCode::INTERNAL_SERVER_ERROR,
              // ...
          }
      }
  }
  ```
- Use `#[derive(thiserror::Error)]` for easy `Display` and `Error` implementations
- Provide user-friendly error messages with error codes for programmatic handling

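
The "error codes for programmatic handling" item could be approached as sketched below. This is a dependency-free stand-in: `Display` is written by hand where the plan would use `thiserror`, the HTTP status is a plain `u16` rather than `StatusCode`, and the variants and code strings are invented for illustration.

```rust
use std::fmt;

// Hypothetical, cut-down `CoreError`; variants and codes are illustrative only.
#[derive(Debug)]
pub enum CoreError {
    NotFound(i64),
    Validation(String),
    Io(std::io::Error),
}

impl CoreError {
    /// Stable identifier that scripts and API clients can match on.
    pub fn code(&self) -> &'static str {
        match self {
            CoreError::NotFound(_) => "E_NOT_FOUND",
            CoreError::Validation(_) => "E_VALIDATION",
            CoreError::Io(_) => "E_IO",
        }
    }

    /// HTTP status as a number; an API layer would map this to its own type.
    pub fn http_status(&self) -> u16 {
        match self {
            CoreError::NotFound(_) => 404,
            CoreError::Validation(_) => 400,
            CoreError::Io(_) => 500,
        }
    }
}

impl fmt::Display for CoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CoreError::NotFound(id) => write!(f, "item {} not found", id),
            CoreError::Validation(msg) => write!(f, "invalid input: {}", msg),
            CoreError::Io(err) => write!(f, "I/O error: {}", err),
        }
    }
}

impl std::error::Error for CoreError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            CoreError::Io(err) => Some(err),
            _ => None,
        }
    }
}

impl From<std::io::Error> for CoreError {
    fn from(err: std::io::Error) -> Self {
        CoreError::Io(err)
    }
}
```

Keeping the code and the status on the error type itself means the CLI, API, and MCP layers all read the same mapping. Each layer still decides how much of the message to show.
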
## 5. Refactor CLI Modes to Use Services

**Files:**
- Change: `src/modes/get.rs`
- Change: `src/modes/save.rs`
- Change: `src/modes/list.rs`
- Change: `src/modes/info.rs`
- Change: `src/modes/delete.rs`
- Change: `src/modes/diff.rs`
- Change: `src/modes/status.rs`

**Functions:**
- Change: `mode_get` to use `item_service::get_item_full`
- Change: `mode_save` to use `item_service::save_item`
- Change: `mode_list` to use `item_service::list_items`
- Change: `mode_info` to use `item_service::get_item_full`
- Change: `mode_delete` to use `item_service::delete_item`
- Change: `mode_diff` to use `item_service::get_item_full` for both items
- Change: `mode_status` to use new status service functions

**Reason:** Remove direct database and file system access from modes

**Implementation:**
- Replace current implementations with calls to core services
- Keep only CLI-specific formatting and output logic
- Handle command-line argument parsing and validation
- Use synchronous services directly
- Implement streaming for stdin/stdout to maintain pipeline performance

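
To illustrate "keep only CLI-specific formatting and output logic", here is a minimal sketch of what might remain in a mode such as `list` once the service returns structured data. `ItemSummary`, `format_item_line`, and `print_items` are hypothetical names, and the tab-separated layout is only an example.

```rust
use std::io::{self, Write};

// Hypothetical structured result handed back by a core service.
pub struct ItemSummary {
    pub id: i64,
    pub size: u64,
    pub tags: Vec<String>,
}

/// CLI-only concern: render one summary as a tab-separated line.
pub fn format_item_line(item: &ItemSummary) -> String {
    format!("{}\t{}\t{}", item.id, item.size, item.tags.join(","))
}

/// Writes one line per item to any `Write` sink, so tests can capture the
/// output in a buffer instead of reading stdout.
pub fn print_items<W: Write>(out: &mut W, items: &[ItemSummary]) -> io::Result<()> {
    for item in items {
        writeln!(out, "{}", format_item_line(item))?;
    }
    Ok(())
}
```

In the real mode the sink would be `io::stdout().lock()`. Taking a generic writer keeps the formatting testable without a database or a terminal.
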
## 6. Refactor REST API to Use Async Services

**Files:**
- Change: `src/modes/server/api/item.rs`
- Change: `src/modes/server/api/status.rs`

**Functions:**
- Change: `handle_get_item` to use `async_item_service::get_item_async`
- Change: `handle_get_item_latest` to use `async_item_service::get_item_async`
- Change: `handle_list_items` to use `async_item_service::list_items_async`
- Change: `handle_post_item` to use `async_item_service::save_item_async`
- Change: `handle_get_item_content` to use `async_item_service::get_item_content_async`
- Change: `handle_get_item_meta` to use `async_item_service::get_item_meta_async`
- Change: `handle_status` to use `async_item_service::get_status_async`

**Reason:** Remove business logic from HTTP handlers

**Implementation:**
- Convert handlers to call async core services
- Keep only HTTP-specific code (status codes, headers, etc.)
- Use common error handling with conversions to HTTP responses
- Wrap synchronous service calls in `tokio::task::spawn_blocking`

## 7. Refactor MCP Tools to Use Services

**Files:**
- Change: `src/modes/server/mcp/tools.rs`

**Functions:**
- Change: `save_item` to use `item_service::save_item`
- Change: `get_item` to use `item_service::get_item_full`
- Change: `get_latest_item` to use `item_service::get_latest_item`
- Change: `list_items` to use `item_service::list_items`
- Change: `search_items` to use `item_service::search_items`

**Reason:** Remove duplication with REST API and CLI modes

**Implementation:**
- Replace current implementation with calls to core services
- Keep only MCP protocol-specific logic
- Use synchronous services directly (MCP is typically local/short-lived); if the tool handlers run on the async runtime, wrap the calls in `spawn_blocking` as the REST handlers do
- Standardize response format to match API/CLI

## 8. Create Common Error Handling

**Files:**
- Add: `src/core/error.rs` (the same file introduced in section 4)
- Change: All files that handle errors

**Functions:**
- Add: Comprehensive error handling in `core/error.rs`
- Add: Conversion traits for all error types
- Change: All error handling to use new error system

**Reason:** Standardize error types across the application

**Implementation:**
- Define comprehensive error enum with conversions:
  - From database errors
  - From I/O errors
  - From compression errors
  - From validation errors
- Implement conversions to:
  - `anyhow::Error` (for CLI; already covered by `anyhow`'s blanket `From` impl)
  - `axum::http::StatusCode` (for API)
  - `ToolError` (for MCP)
- Provide user-friendly error messages
- Include error codes for programmatic handling

## 9. Update Database Layer for Batch Operations

**Files:**
- Change: `src/db.rs`

**Functions:**
- Add: `get_items_with_meta_batch`
- Add: `get_items_with_tags_batch`
- Add: `insert_item_with_meta_transaction`
- Add: `delete_item_with_meta_transaction`
- Change: Optimize existing queries for batch operations

**Reason:** Support efficient batch operations needed by services

**Implementation:**
- Add functions to get multiple items with their metadata and tags
- Add batch insertion/updates for tags and metadata
- Add transaction support for atomic operations
- Optimize queries for common access patterns
- Ensure all batch operations are properly documented

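
A batch lookup such as `get_items_with_tags_batch` typically needs two pieces that do not depend on the database driver: an `IN (...)` clause sized to the number of ids, and a step that groups the flat result rows back into one list per item. The sketch below shows both with the standard library only. The table and column names in `tags_batch_sql` are assumptions, as are all three function names.

```rust
use std::collections::HashMap;

/// Builds the `?, ?, ?` placeholder list for an `IN (...)` clause.
pub fn in_placeholders(count: usize) -> String {
    vec!["?"; count].join(", ")
}

/// Hypothetical query text; `tags`, `item_id`, and `name` are assumed names.
pub fn tags_batch_sql(count: usize) -> String {
    format!(
        "SELECT item_id, name FROM tags WHERE item_id IN ({})",
        in_placeholders(count)
    )
}

/// Groups flat `(item_id, tag)` rows into one tag list per item,
/// preserving the row order within each item.
pub fn group_tags(rows: Vec<(i64, String)>) -> HashMap<i64, Vec<String>> {
    let mut grouped: HashMap<i64, Vec<String>> = HashMap::new();
    for (item_id, tag) in rows {
        grouped.entry(item_id).or_default().push(tag);
    }
    grouped
}
```

This replaces one query per item with a single round trip. SQLite caps the number of bound parameters per statement, so very large id lists may need to be split into chunks.
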
## 10. Add Integration Tests

**Files:**
- Add: `tests/integration/core_tests.rs`
- Add: `tests/integration/cli_tests.rs`
- Add: `tests/integration/api_tests.rs`
- Add: `tests/integration/performance_tests.rs`

**Functions:**
- Add: Test cases for all core service functions
- Add: Test cases for CLI modes
- Add: Test cases for API endpoints
- Add: Performance benchmarks

**Reason:** Ensure refactored code maintains functionality and performance

**Implementation:**
- Test core services independently
- Test CLI modes and APIs through their public interfaces
- Verify compression, metadata, and database operations
- Include performance benchmarks for critical paths
- Use in-memory databases and tempfiles for isolation
- Test both sync and async service implementations
- Cargo only auto-discovers `tests/*.rs` and `tests/<dir>/main.rs`, so the files under `tests/integration/` need a `tests/integration/main.rs` that declares them as modules (or explicit `[[test]]` entries in `Cargo.toml`)

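
For the before/after comparisons mentioned above, a very small timing helper may be enough to start with. This is a rough sketch using only `std::time`; `time_iterations` and `average` are hypothetical names, and a dedicated benchmarking harness would give more reliable numbers.

```rust
use std::time::{Duration, Instant};

/// Runs `work` `iterations` times and returns the total wall-clock time.
pub fn time_iterations<F: FnMut()>(iterations: u32, mut work: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        work();
    }
    start.elapsed()
}

/// Average time per iteration; returns zero when nothing was run.
pub fn average(total: Duration, iterations: u32) -> Duration {
    if iterations == 0 {
        Duration::ZERO
    } else {
        total / iterations
    }
}
```

A test could time the same operation on the old and new code paths and compare the averages. Wall-clock measurements are noisy, so any threshold should be generous.
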
## 11. Performance Optimization Guidelines

**Files:**
- Change: All core service files
- Change: All mode files
- Change: All API handler files

**Functions:**
- Add: Streaming implementations for I/O operations
- Add: Benchmark functions for critical paths
- Change: Buffer management to minimize copies

**Reason:** Ensure the refactored version doesn't slow down pipelines

**Implementation:**
- Use streaming for stdin/stdout processing (chunked I/O)
- Minimize buffering and memory copies
- Offload CPU-bound work (compression, plugins) to thread pools
- Provide fast-path options (e.g., `--fast` flag to skip metadata plugins)
- Benchmark critical operations before/after refactoring
- Document performance characteristics and tradeoffs

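
Chunked I/O as described above can be sketched with the standard `Read` and `Write` traits. The helper name `copy_in_chunks` and the idea of passing the chunk size as a parameter are assumptions for illustration. A real service would run compression or metadata plugins on each chunk before writing it.

```rust
use std::io::{self, Read, Write};

/// Copies `reader` to `writer` through one reusable buffer of `chunk_size`
/// bytes, so memory use stays constant regardless of input size.
/// Returns the number of bytes copied.
pub fn copy_in_chunks<R: Read, W: Write>(
    reader: &mut R,
    writer: &mut W,
    chunk_size: usize,
) -> io::Result<u64> {
    // Guard against a zero-length buffer, which would look like end of input.
    let mut buf = vec![0u8; chunk_size.max(1)];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(&buf[..n])?;
        total += n as u64;
    }
    writer.flush()?;
    Ok(total)
}
```

In a CLI mode the arguments would be `io::stdin().lock()` and `io::stdout().lock()`. For a plain copy with no per-chunk processing, `std::io::copy` already does the same job.
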
## Implementation Order:

1. Create core module structure and error types (`core/error.rs`, `core/types.rs`)
2. Implement core services with basic functionality (sync first)
3. Add async wrappers for API use
4. Refactor one mode (e.g., `get`) to use services and validate approach
5. Refactor corresponding API endpoints to use async services
6. Repeat for other modes and APIs
7. Refactor MCP tools to use services
8. Add comprehensive tests and benchmarks
9. Clean up removed code from original files
10. Document performance characteristics and tradeoffs

## Benefits:

- Reduced code duplication between CLI, API, and MCP
- Easier maintenance with clear separation of concerns
- Consistent behavior across all interfaces
- Better testability with isolated service layer
- Maintained or improved pipeline performance
- Flexible architecture supporting both sync and async use cases

## Key Risks and Mitigations:

1. **Performance Regression:**
   - Risk: Refactoring could slow down pipeline operations
   - Mitigation: Benchmark before/after, use streaming, minimize overhead

2. **Increased Complexity:**
   - Risk: Adding service layer could make code harder to understand
   - Mitigation: Clear documentation, gradual refactoring, maintain simple interfaces

3. **Async/Sync Boundaries:**
   - Risk: Mixing sync/async could lead to deadlocks or inefficiencies
   - Mitigation: Clear boundaries, use `spawn_blocking` for sync work in async context

4. **Breaking Changes:**
   - Risk: Refactoring could change behavior in subtle ways
   - Mitigation: Comprehensive tests, gradual rollout, maintain backward compatibility

## Design Principles:

1. **Zero-Copy Where Possible:** Use slicing instead of copying data
2. **Streaming Processing:** Handle data in chunks for memory efficiency
3. **Clear Boundaries:** Separate core logic from interface-specific code
4. **Performance First:** Optimize for common pipeline use cases
5. **Consistent Errors:** Unified error handling across all interfaces
6. **Backward Compatibility:** Maintain existing CLI/API behavior