Files
keep/PLAN.md
Andrew Phillips da59401ca7 chore: mark REST API refactoring as complete
Co-authored-by: aider (openai/andrew/openrouter/google/gemini-2.5-pro) <aider@aider.chat>
2025-08-25 12:44:04 -03:00

272 lines
10 KiB
Markdown

# Refactoring Plan to Reduce Code Duplication
## Implementation Order:
- [x] 1. Create core module structure and error types
- [x] 2. Create common data structures
- [x] 3. Update database layer for batch operations
- [x] 4. Create core services with clear boundaries (synchronous)
- [x] 5. Add async wrappers for API use
- [x] 6. Refactor CLI modes to use services (DONE)
- [x] 7. Refactor REST API to use async services
- [ ] 8. Refactor MCP tools to use services
- [x] 9. Create unified error handling
- [ ] 10. Add integration tests
- [ ] 11. Add performance optimization guidelines (partially done)
## 1. Create Core Module Structure and Error Types (DONE)
**Files:**
- Add: `src/core/error.rs`
- Add: `src/core/mod.rs`
**Functions:**
- Add: `CoreError` enum with comprehensive variants
- Add: Conversion traits for all error types
**Reason:** Establish foundation for consistent error handling
**Implementation:**
- Define base error enum with conversions to all interface-specific error types
- Use `#[derive(thiserror::Error)]` for easy `Display` and `Error` implementations
- Provide user-friendly error messages with error codes
## 2. Create Common Data Structures (DONE)
**Files:**
- Add: `src/core/types.rs`
**Functions:**
- Add: `ItemWithMeta` struct
- Add: `ItemWithContent` struct
- Add: `From<db::Item>` for `ItemWithMeta`
- Add: `From<ItemWithMeta>` for `ItemInfo` (API response)
- Add: Serialization/deserialization implementations
**Reason:** Standardize data structures used across modes and APIs
**Implementation:**
- Define structs for common data types
- Include conversion functions from database types
- Add serialization/deserialization support for JSON/YAML
- Ensure all fields are properly documented
## 3. Update Database Layer for Batch Operations (DONE)
**Files:**
- Change: `src/db.rs`
**Functions:**
- Add: `get_items_with_meta_batch`
- Add: `get_items_with_tags_batch`
- Add: `insert_item_with_meta_transaction`
- Add: `delete_item_with_meta_transaction`
- Change: Optimize existing queries for batch operations
**Reason:** Support efficient batch operations needed by services
**Implementation:**
- Add functions to get multiple items with their metadata and tags
- Add batch insertion/updates for tags and metadata
- Add transaction support for atomic operations
- Optimize queries for common access patterns
## 4. Create Core Service Layer with Clear Boundaries (DONE)
**Files:**
- Add: `src/core/item_service.rs`
- Add: `src/core/compression_service.rs`
- Add: `src/core/meta_service.rs`
**Functions:**
- Add: `get_item_full` in `item_service.rs`
- Add: `save_item` in `item_service.rs`
- Add: `list_items` in `item_service.rs`
- Add: `delete_item` in `item_service.rs`
- Add: `get_compressed_content` in `compression_service.rs`
- Add: `process_metadata` in `meta_service.rs`
**Reason:** Extract common business logic from modes and APIs into reusable services
**Implementation:**
- Move logic from modes (get, save, list, info) and API handlers into service functions
- Services should return structured data, not format output
- Handle compression, metadata, and database operations
- Keep core services **synchronous** for CLI performance
- Use streaming for pipeline efficiency (process data in chunks)
**Layer Division:**
- **`db.rs`**: Low-level SQL operations (e.g., `insert_item`, `query_all_items`)
- **`item_service.rs`**: Higher-level business logic (e.g., "get item with metadata and tags," "validate item before save")
## 5. Add Async Wrappers for API Use
**Files:**
- Add: `src/core/async_item_service.rs`
**Functions:**
- Add: `get_item_async` in `async_item_service.rs`
- Add: `save_item_async` in `async_item_service.rs`
- Add: `list_items_async` in `async_item_service.rs`
- Add: `delete_item_async` in `async_item_service.rs`
**Reason:** Ensure services can be safely used in asynchronous contexts
**Implementation:**
- Document thread-safety guarantees for all core services
- Use `Arc<Mutex<T>>` or connection pooling for shared resources
- Provide examples for safe async/sync boundaries
- Use `tokio::task::spawn_blocking` for CPU-bound or blocking I/O operations
## 6. Refactor CLI Modes to Use Services (DONE)
**Files:**
- Change: `src/modes/get.rs` (DONE)
- Change: `src/modes/save.rs` (DONE)
- Change: `src/modes/list.rs` (DONE)
- Change: `src/modes/info.rs` (DONE)
- Change: `src/modes/delete.rs` (DONE)
- Change: `src/modes/diff.rs` (DONE)
- Change: `src/modes/status.rs` (DONE, uses shared function)
**Functions:**
- Change: `mode_get` to use `item_service` (DONE)
- Change: `mode_save` to use `item_service` (DONE)
- Change: `mode_list` to use `item_service` (DONE)
- Change: `mode_info` to use `item_service` (DONE)
- Change: `mode_delete` to use `item_service` (DONE)
- Change: `mode_diff` to use `item_service` (DONE)
- Change: `mode_status` to use new status service functions (DONE, uses shared function)
**Reason:** Remove direct database and file system access from modes
**Implementation:**
- Replace current implementations with calls to core services
- Keep only CLI-specific formatting and output logic
- Handle command-line argument parsing and validation
- Use synchronous services directly
- Implement streaming for stdin/stdout to maintain pipeline performance
## 7. Refactor REST API to Use Async Services (DONE)
**Files:**
- Change: `src/modes/server/api/item.rs` (DONE)
- Change: `src/modes/server/api/status.rs` (DONE)
**Functions:**
- Change: `handle_get_item` to use `async_item_service::get_item_async` (DONE)
- Change: `handle_get_item_latest` to use `async_item_service::get_item_async` (DONE)
- Change: `handle_list_items` to use `async_item_service::list_items_async` (DONE)
- Change: `handle_post_item` to use `async_item_service::save_item_async` (Not implemented)
- Change: `handle_get_item_content` to use `async_item_service::get_item_content_async` (DONE)
- Change: `handle_get_item_meta` to use `async_item_service::get_item_meta_async` (DONE)
- Change: `handle_status` to use `async_item_service::get_status_async` (DONE, uses shared function)
**Reason:** Remove business logic from HTTP handlers
**Implementation:**
- Convert handlers to call async core services
- Keep only HTTP-specific code (status codes, headers, etc.)
- Use common error handling with conversions to HTTP responses
- Wrap synchronous service calls in `tokio::task::spawn_blocking`
## 8. Refactor MCP Tools to Use Services
**Files:**
- Change: `src/modes/server/mcp/tools.rs`
**Functions:**
- Change: `save_item` to use `item_service::save_item`
- Change: `get_item` to use `item_service::get_item_full`
- Change: `get_latest_item` to use `item_service::get_latest_item`
- Change: `list_items` to use `item_service::list_items`
- Change: `search_items` to use `item_service::search_items`
**Reason:** Remove duplication with REST API and CLI modes
**Implementation:**
- Replace current implementation with calls to core services
- Keep only MCP protocol-specific logic
- Use synchronous services directly (MCP is typically local/short-lived)
- Standardize response format to match API/CLI
## 9. Create Unified Error Handling (DONE)
**Files:**
- Change: All files that handle errors
**Functions:**
- Add: Conversion traits for all error types
- Change: All error handling to use new error system
**Reason:** Standardize error types across the application
**Implementation:**
- Define comprehensive error enum with conversions:
- From database errors
- From I/O errors
- From compression errors
- From validation errors
- Implement conversions to:
- `anyhow::Error` (for CLI)
- `axum::http::StatusCode` (for API)
- `ToolError` (for MCP)
- Provide user-friendly error messages
- Include error codes for programmatic handling
## 10. Add Integration Tests
**Files:**
- Add: `tests/integration/core_tests.rs`
- Add: `tests/integration/cli_tests.rs`
- Add: `tests/integration/api_tests.rs`
- Add: `tests/integration/performance_tests.rs`
**Functions:**
- Add: Test cases for all core service functions
- Add: Test cases for CLI modes
- Add: Test cases for API endpoints
- Add: Performance benchmarks
**Reason:** Ensure refactored code maintains functionality and performance
**Implementation:**
- Test core services independently
- Test CLI modes and APIs through their public interfaces
- Verify compression, metadata, and database operations
- Include performance benchmarks for critical paths
- Use in-memory databases and tempfiles for isolation
- Test both sync and async service implementations
## 11. Performance Optimization Guidelines (PARTIALLY DONE)
**Files:**
- Change: All core service files
- Change: All mode files
- Change: All API handler files
**Functions:**
- Add: Streaming implementations for I/O operations
- Add: Benchmark functions for critical paths
- Change: Buffer management to minimize copies
**Reason:** Ensure the refactored version doesn't slow down pipelines
**Implementation:**
- Use streaming for stdin/stdout processing (chunked I/O)
- Minimize buffering and memory copies
- Offload CPU-bound work (compression, plugins) to thread pools
- Provide fast-path options (e.g., `--fast` flag to skip metadata plugins)
- Benchmark critical operations before/after refactoring
- Document performance characteristics and tradeoffs
## Benefits:
- Reduced code duplication between CLI, API, and MCP
- Easier maintenance with clear separation of concerns
- Consistent behavior across all interfaces
- Better testability with isolated service layer
- Maintained or improved pipeline performance
- Flexible architecture supporting both sync and async use cases
## Key Risks and Mitigations:
1. **Performance Regression:**
- Risk: Refactoring could slow down pipeline operations
- Mitigation: Benchmark before/after, use streaming, minimize overhead
2. **Increased Complexity:**
- Risk: Adding service layer could make code harder to understand
- Mitigation: Clear documentation, gradual refactoring, maintain simple interfaces
3. **Async/Sync Boundaries:**
- Risk: Mixing sync/async could lead to deadlocks or inefficiencies
- Mitigation: Clear boundaries, use `spawn_blocking` for sync work in async context
4. **Breaking Changes:**
- Risk: Refactoring could change behavior in subtle ways
- Mitigation: Comprehensive tests, gradual rollout, maintain backward compatibility
## Design Principles:
1. **Zero-Copy Where Possible:** Use slicing instead of copying data
2. **Streaming Processing:** Handle data in chunks for memory efficiency
3. **Clear Boundaries:** Separate core logic from interface-specific code
4. **Performance First:** Optimize for common pipeline use cases
5. **Consistent Errors:** Unified error handling across all interfaces
6. **Backward Compatibility:** Maintain existing CLI/API behavior