# Refactoring Plan to Reduce Code Duplication ## Implementation Order: - [x] 1. Create core module structure and error types - [x] 2. Create common data structures - [x] 3. Update database layer for batch operations - [x] 4. Create core services with clear boundaries (synchronous) - [ ] 5. Add async wrappers for API use - [ ] 6. Refactor CLI modes to use services (partially done) - [ ] 7. Refactor REST API to use async services - [ ] 8. Refactor MCP tools to use services - [x] 9. Create unified error handling - [ ] 10. Add integration tests - [ ] 11. Add performance optimization guidelines (partially done) ## 1. Create Core Module Structure and Error Types (DONE) **Files:** - Add: `src/core/error.rs` - Add: `src/core/mod.rs` **Functions:** - Add: `CoreError` enum with comprehensive variants - Add: Conversion traits for all error types **Reason:** Establish foundation for consistent error handling **Implementation:** - Define base error enum with conversions to all interface-specific error types - Use `#[derive(thiserror::Error)]` for easy `Display` and `Error` implementations - Provide user-friendly error messages with error codes ## 2. Create Common Data Structures (DONE) **Files:** - Add: `src/core/types.rs` **Functions:** - Add: `ItemWithMeta` struct - Add: `ItemWithContent` struct - Add: `From` for `ItemWithMeta` - Add: `From` for `ItemInfo` (API response) - Add: Serialization/deserialization implementations **Reason:** Standardize data structures used across modes and APIs **Implementation:** - Define structs for common data types - Include conversion functions from database types - Add serialization/deserialization support for JSON/YAML - Ensure all fields are properly documented ## 3. Update Database Layer for Batch Operations (DONE) **Files:** - Change: `src/db.rs` **Functions:** - Add: `get_items_with_meta_batch` - Add: `get_items_with_tags_batch` - Add: `insert_item_with_meta_transaction` - Add: `delete_item_with_meta_transaction` - Change: Optimize existing queries for batch operations **Reason:** Support efficient batch operations needed by services **Implementation:** - Add functions to get multiple items with their metadata and tags - Add batch insertion/updates for tags and metadata - Add transaction support for atomic operations - Optimize queries for common access patterns ## 4. Create Core Service Layer with Clear Boundaries (DONE) **Files:** - Add: `src/core/item_service.rs` - Add: `src/core/compression_service.rs` - Add: `src/core/meta_service.rs` **Functions:** - Add: `get_item_full` in `item_service.rs` - Add: `save_item` in `item_service.rs` - Add: `list_items` in `item_service.rs` - Add: `delete_item` in `item_service.rs` - Add: `get_compressed_content` in `compression_service.rs` - Add: `process_metadata` in `meta_service.rs` **Reason:** Extract common business logic from modes and APIs into reusable services **Implementation:** - Move logic from modes (get, save, list, info) and API handlers into service functions - Services should return structured data, not format output - Handle compression, metadata, and database operations - Keep core services **synchronous** for CLI performance - Use streaming for pipeline efficiency (process data in chunks) **Layer Division:** - **`db.rs`**: Low-level SQL operations (e.g., `insert_item`, `query_all_items`) - **`item_service.rs`**: Higher-level business logic (e.g., "get item with metadata and tags," "validate item before save") ## 5. Add Async Wrappers for API Use **Files:** - Add: `src/core/async_item_service.rs` **Functions:** - Add: `get_item_async` in `async_item_service.rs` - Add: `save_item_async` in `async_item_service.rs` - Add: `list_items_async` in `async_item_service.rs` - Add: `delete_item_async` in `async_item_service.rs` **Reason:** Ensure services can be safely used in asynchronous contexts **Implementation:** - Document thread-safety guarantees for all core services - Use `Arc>` or connection pooling for shared resources - Provide examples for safe async/sync boundaries - Use `tokio::task::spawn_blocking` for CPU-bound or blocking I/O operations ## 6. Refactor CLI Modes to Use Services (PARTIALLY DONE) **Files:** - Change: `src/modes/get.rs` (DONE) - Change: `src/modes/save.rs` (DONE) - Change: `src/modes/list.rs` (DONE) - Change: `src/modes/info.rs` (DONE) - Change: `src/modes/delete.rs` (DONE) - Change: `src/modes/diff.rs` (NOT DONE) - Change: `src/modes/status.rs` (DONE, uses shared function) **Functions:** - Change: `mode_get` to use `item_service` (DONE) - Change: `mode_save` to use `item_service` (DONE) - Change: `mode_list` to use `item_service` (DONE) - Change: `mode_info` to use `item_service` (DONE) - Change: `mode_delete` to use `item_service` (DONE) - Change: `mode_diff` to use `item_service` (NOT DONE) - Change: `mode_status` to use new status service functions (DONE, uses shared function) **Reason:** Remove direct database and file system access from modes **Implementation:** - Replace current implementations with calls to core services - Keep only CLI-specific formatting and output logic - Handle command-line argument parsing and validation - Use synchronous services directly - Implement streaming for stdin/stdout to maintain pipeline performance ## 7. Refactor REST API to Use Async Services **Files:** - Change: `src/modes/server/api/item.rs` - Change: `src/modes/server/api/status.rs` **Functions:** - Change: `handle_get_item` to use `async_item_service::get_item_async` - Change: `handle_get_item_latest` to use `async_item_service::get_item_async` - Change: `handle_list_items` to use `async_item_service::list_items_async` - Change: `handle_post_item` to use `async_item_service::save_item_async` - Change: `handle_get_item_content` to use `async_item_service::get_item_content_async` - Change: `handle_get_item_meta` to use `async_item_service::get_item_meta_async` - Change: `handle_status` to use `async_item_service::get_status_async` **Reason:** Remove business logic from HTTP handlers **Implementation:** - Convert handlers to call async core services - Keep only HTTP-specific code (status codes, headers, etc.) - Use common error handling with conversions to HTTP responses - Wrap synchronous service calls in `tokio::task::spawn_blocking` ## 8. Refactor MCP Tools to Use Services **Files:** - Change: `src/modes/server/mcp/tools.rs` **Functions:** - Change: `save_item` to use `item_service::save_item` - Change: `get_item` to use `item_service::get_item_full` - Change: `get_latest_item` to use `item_service::get_latest_item` - Change: `list_items` to use `item_service::list_items` - Change: `search_items` to use `item_service::search_items` **Reason:** Remove duplication with REST API and CLI modes **Implementation:** - Replace current implementation with calls to core services - Keep only MCP protocol-specific logic - Use synchronous services directly (MCP is typically local/short-lived) - Standardize response format to match API/CLI ## 9. Create Unified Error Handling (DONE) **Files:** - Change: All files that handle errors **Functions:** - Add: Conversion traits for all error types - Change: All error handling to use new error system **Reason:** Standardize error types across the application **Implementation:** - Define comprehensive error enum with conversions: - From database errors - From I/O errors - From compression errors - From validation errors - Implement conversions to: - `anyhow::Error` (for CLI) - `axum::http::StatusCode` (for API) - `ToolError` (for MCP) - Provide user-friendly error messages - Include error codes for programmatic handling ## 10. Add Integration Tests **Files:** - Add: `tests/integration/core_tests.rs` - Add: `tests/integration/cli_tests.rs` - Add: `tests/integration/api_tests.rs` - Add: `tests/integration/performance_tests.rs` **Functions:** - Add: Test cases for all core service functions - Add: Test cases for CLI modes - Add: Test cases for API endpoints - Add: Performance benchmarks **Reason:** Ensure refactored code maintains functionality and performance **Implementation:** - Test core services independently - Test CLI modes and APIs through their public interfaces - Verify compression, metadata, and database operations - Include performance benchmarks for critical paths - Use in-memory databases and tempfiles for isolation - Test both sync and async service implementations ## 11. Performance Optimization Guidelines (PARTIALLY DONE) **Files:** - Change: All core service files - Change: All mode files - Change: All API handler files **Functions:** - Add: Streaming implementations for I/O operations - Add: Benchmark functions for critical paths - Change: Buffer management to minimize copies **Reason:** Ensure the refactored version doesn't slow down pipelines **Implementation:** - Use streaming for stdin/stdout processing (chunked I/O) - Minimize buffering and memory copies - Offload CPU-bound work (compression, plugins) to thread pools - Provide fast-path options (e.g., `--fast` flag to skip metadata plugins) - Benchmark critical operations before/after refactoring - Document performance characteristics and tradeoffs ## Benefits: - Reduced code duplication between CLI, API, and MCP - Easier maintenance with clear separation of concerns - Consistent behavior across all interfaces - Better testability with isolated service layer - Maintained or improved pipeline performance - Flexible architecture supporting both sync and async use cases ## Key Risks and Mitigations: 1. **Performance Regression:** - Risk: Refactoring could slow down pipeline operations - Mitigation: Benchmark before/after, use streaming, minimize overhead 2. **Increased Complexity:** - Risk: Adding service layer could make code harder to understand - Mitigation: Clear documentation, gradual refactoring, maintain simple interfaces 3. **Async/Sync Boundaries:** - Risk: Mixing sync/async could lead to deadlocks or inefficiencies - Mitigation: Clear boundaries, use `spawn_blocking` for sync work in async context 4. **Breaking Changes:** - Risk: Refactoring could change behavior in subtle ways - Mitigation: Comprehensive tests, gradual rollout, maintain backward compatibility ## Design Principles: 1. **Zero-Copy Where Possible:** Use slicing instead of copying data 2. **Streaming Processing:** Handle data in chunks for memory efficiency 3. **Clear Boundaries:** Separate core logic from interface-specific code 4. **Performance First:** Optimize for common pipeline use cases 5. **Consistent Errors:** Unified error handling across all interfaces 6. **Backward Compatibility:** Maintain existing CLI/API behavior