Co-authored-by: aider (openai/andrew/openrouter/mistralai/mistral-medium-3.1) <aider@aider.chat>
9.0 KiB
Refactoring Plan to Reduce Code Duplication
1. Create Core Service Layer with Clear Boundaries
Files: src/core/ (new directory)
src/core/item_service.rs- Main service for item operations (synchronous)src/core/async_item_service.rs- Async wrapper for API usesrc/core/compression_service.rs- Compression handlingsrc/core/meta_service.rs- Metadata processingsrc/core/mod.rs- Module exportssrc/core/error.rs- Unified error handling
Reason: Extract common business logic from modes and APIs into reusable services Implementation:
- Move logic from modes (get, save, list, info) and API handlers into service functions
- Services should return structured data, not format output
- Handle compression, metadata, and database operations
- Keep core services synchronous for CLI performance
- Provide async wrappers using
tokio::task::spawn_blockingfor API use - Use streaming for pipeline efficiency (process data in chunks)
Layer Division:
db.rs: Low-level SQL operations (e.g.,insert_item,query_all_items)item_service.rs: Higher-level business logic (e.g., "get item with metadata and tags," "validate item before save")- Example:
// db.rs (low-level) pub fn get_item_with_tags(conn: &Connection, id: i64) -> Result<(Item, Vec<Tag>)> // item_service.rs (high-level) pub fn get_item_full(conn: &Connection, id: i64) -> Result<ItemWithMeta> { let (item, tags) = db::get_item_with_tags(conn, id)?; let meta = db::get_item_meta(conn, &item)?; Ok(ItemWithMeta { item, tags, meta }) }
2. Thread Safety and Resource Management
Reason: Ensure services can be safely used in both synchronous and asynchronous contexts Implementation:
- Document thread-safety guarantees for all core services
- Use
Arc<Mutex<T>>or connection pooling for shared resources like database connections - Provide examples for safe async/sync boundaries:
pub async fn get_item_async( state: &AppState, id: i64, ) -> Result<ItemWithMeta, CoreError> { let conn = state.db.clone(); // Arc<Mutex<Connection>> tokio::task::spawn_blocking(move || { let conn = conn.lock().unwrap(); item_service::get_item_full(&conn, id) }) .await? } - Use
tokio::task::spawn_blockingfor CPU-bound or blocking I/O operations - Benchmark thread pool sizes for CPU-bound tasks (e.g., compression)
3. Create Common Data Structures
Files: src/core/types.rs
Reason: Standardize data structures used across modes and APIs
Implementation:
- Define structs for
Item,ItemWithContent,ItemWithMeta,Response<T> - Include conversion functions from database types (
From<db::Item>) - Add serialization/deserialization support for JSON/YAML
- Ensure all fields are properly documented
- Use zero-copy patterns where possible (slicing instead of copying)
4. Unified Error Handling with Conversions
Reason: Standardize error types across CLI, API, and MCP interfaces Implementation:
- Define a base error enum (
CoreError) with conversions to all interface-specific error types - Example:
#[derive(Debug, thiserror::Error)] pub enum CoreError { #[error("DB error: {0}")] Database(#[from] rusqlite::Error), #[error("IO error: {0}")] Io(#[from] std::io::Error), // ... } // Auto-convert to HTTP status impl From<CoreError> for StatusCode { fn from(err: CoreError) -> Self { match err { CoreError::Database(_) => StatusCode::INTERNAL_SERVER_ERROR, // ... } } } - Use
#[derive(thiserror::Error)]for easyDisplayandErrorimplementations - Provide user-friendly error messages with error codes for programmatic handling
5. Refactor CLI Modes to Use Services
Files:
src/modes/get.rssrc/modes/save.rssrc/modes/list.rssrc/modes/info.rssrc/modes/delete.rssrc/modes/diff.rssrc/modes/status.rs
Reason: Remove direct database and file system access from modes Implementation:
- Replace current implementations with calls to core services
- Keep only CLI-specific formatting and output logic
- Handle command-line argument parsing and validation
- Use synchronous services directly
- Implement streaming for stdin/stdout to maintain pipeline performance
6. Refactor REST API to Use Async Services
Files:
src/modes/server/api/item.rssrc/modes/server/api/status.rs
Reason: Remove business logic from HTTP handlers Implementation:
- Convert handlers to call async core services
- Keep only HTTP-specific code (status codes, headers, etc.)
- Use common error handling with conversions to HTTP responses
- Wrap synchronous service calls in
tokio::task::spawn_blocking
7. Refactor MCP Tools to Use Services
Files: src/modes/server/mcp/tools.rs
Reason: Remove duplication with REST API and CLI modes Implementation:
- Replace current implementation with calls to core services
- Keep only MCP protocol-specific logic
- Use synchronous services directly (MCP is typically local/short-lived)
- Standardize response format to match API/CLI
8. Create Common Error Handling
Files: src/core/error.rs
Reason: Standardize error types across the application
Implementation:
- Define comprehensive error enum with conversions:
- From database errors
- From I/O errors
- From compression errors
- From validation errors
- Implement conversions to:
anyhow::Error(for CLI)axum::http::StatusCode(for API)ToolError(for MCP)
- Provide user-friendly error messages
- Include error codes for programmatic handling
9. Update Database Layer for Batch Operations
Files: src/db.rs
Reason: Support efficient batch operations needed by services
Implementation:
- Add functions to get multiple items with their metadata and tags
- Add batch insertion/updates for tags and metadata
- Add transaction support for atomic operations
- Optimize queries for common access patterns
- Ensure all batch operations are properly documented
10. Add Integration Tests
Files: tests/integration/ (new directory)
Reason: Ensure refactored code maintains functionality and performance
Implementation:
- Test core services independently
- Test CLI modes and APIs through their public interfaces
- Verify compression, metadata, and database operations
- Include performance benchmarks for critical paths
- Use in-memory databases and tempfiles for isolation
- Test both sync and async service implementations
11. Performance Optimization Guidelines
Reason: Ensure the refactored version doesn't slow down pipelines Implementation:
- Use streaming for stdin/stdout processing (chunked I/O)
- Minimize buffering and memory copies
- Offload CPU-bound work (compression, plugins) to thread pools
- Provide fast-path options (e.g.,
--fastflag to skip metadata plugins) - Benchmark critical operations before/after refactoring
- Document performance characteristics
Implementation Order:
- Create core module structure and error types (
core/error.rs,core/types.rs) - Implement core services with basic functionality (sync first)
- Add async wrappers for API use
- Refactor one mode (e.g.,
get) to use services and validate approach - Refactor corresponding API endpoints to use async services
- Repeat for other modes and APIs
- Refactor MCP tools to use services
- Add comprehensive tests and benchmarks
- Clean up removed code from original files
- Document performance characteristics and tradeoffs
Benefits:
- Reduced code duplication between CLI, API, and MCP
- Easier maintenance with clear separation of concerns
- Consistent behavior across all interfaces
- Better testability with isolated service layer
- Maintained or improved pipeline performance
- Flexible architecture supporting both sync and async use cases
Key Risks and Mitigations:
-
Performance Regression:
- Risk: Refactoring could slow down pipeline operations
- Mitigation: Benchmark before/after, use streaming, minimize overhead
-
Increased Complexity:
- Risk: Adding service layer could make code harder to understand
- Mitigation: Clear documentation, gradual refactoring, maintain simple interfaces
-
Async/Sync Boundaries:
- Risk: Mixing sync/async could lead to deadlocks or inefficiencies
- Mitigation: Clear boundaries, use
spawn_blockingfor sync work in async context
-
Breaking Changes:
- Risk: Refactoring could change behavior in subtle ways
- Mitigation: Comprehensive tests, gradual rollout, maintain backward compatibility
Design Principles:
- Zero-Copy Where Possible: Use slicing instead of copying data
- Streaming Processing: Handle data in chunks for memory efficiency
- Clear Boundaries: Separate core logic from interface-specific code
- Performance First: Optimize for common pipeline use cases
- Consistent Errors: Unified error handling across all interfaces
- Backward Compatibility: Maintain existing CLI/API behavior