Files
keep/PLAN.md
Andrew Phillips 89fa9bee6f chore: update plan with implemented changes
Co-authored-by: aider (openai/andrew/openrouter/google/gemini-2.5-pro) <aider@aider.chat>
2025-08-25 00:05:35 -03:00

10 KiB

Refactoring Plan to Reduce Code Duplication

Implementation Order:

  • 1. Create core module structure and error types
  • 2. Create common data structures
  • 3. Update database layer for batch operations
  • 4. Create core services with clear boundaries (synchronous)
  • 5. Add async wrappers for API use
  • 6. Refactor CLI modes to use services (partially done)
  • 7. Refactor REST API to use async services
  • 8. Refactor MCP tools to use services
  • 9. Create unified error handling
  • 10. Add integration tests
  • 11. Add performance optimization guidelines (partially done)

1. Create Core Module Structure and Error Types (DONE)

Files:

  • Add: src/core/error.rs
  • Add: src/core/mod.rs

Functions:

  • Add: CoreError enum with comprehensive variants
  • Add: Conversion traits for all error types

Reason: Establish foundation for consistent error handling Implementation:

  • Define base error enum with conversions to all interface-specific error types
  • Use #[derive(thiserror::Error)] for easy Display and Error implementations
  • Provide user-friendly error messages with error codes

2. Create Common Data Structures (DONE)

Files:

  • Add: src/core/types.rs

Functions:

  • Add: ItemWithMeta struct
  • Add: ItemWithContent struct
  • Add: From<db::Item> for ItemWithMeta
  • Add: From<ItemWithMeta> for ItemInfo (API response)
  • Add: Serialization/deserialization implementations

Reason: Standardize data structures used across modes and APIs Implementation:

  • Define structs for common data types
  • Include conversion functions from database types
  • Add serialization/deserialization support for JSON/YAML
  • Ensure all fields are properly documented

3. Update Database Layer for Batch Operations (DONE)

Files:

  • Change: src/db.rs

Functions:

  • Add: get_items_with_meta_batch
  • Add: get_items_with_tags_batch
  • Add: insert_item_with_meta_transaction
  • Add: delete_item_with_meta_transaction
  • Change: Optimize existing queries for batch operations

Reason: Support efficient batch operations needed by services Implementation:

  • Add functions to get multiple items with their metadata and tags
  • Add batch insertion/updates for tags and metadata
  • Add transaction support for atomic operations
  • Optimize queries for common access patterns

4. Create Core Service Layer with Clear Boundaries (DONE)

Files:

  • Add: src/core/item_service.rs
  • Add: src/core/compression_service.rs
  • Add: src/core/meta_service.rs

Functions:

  • Add: get_item_full in item_service.rs
  • Add: save_item in item_service.rs
  • Add: list_items in item_service.rs
  • Add: delete_item in item_service.rs
  • Add: get_compressed_content in compression_service.rs
  • Add: process_metadata in meta_service.rs

Reason: Extract common business logic from modes and APIs into reusable services Implementation:

  • Move logic from modes (get, save, list, info) and API handlers into service functions
  • Services should return structured data, not format output
  • Handle compression, metadata, and database operations
  • Keep core services synchronous for CLI performance
  • Use streaming for pipeline efficiency (process data in chunks)

Layer Division:

  • db.rs: Low-level SQL operations (e.g., insert_item, query_all_items)
  • item_service.rs: Higher-level business logic (e.g., "get item with metadata and tags," "validate item before save")

5. Add Async Wrappers for API Use

Files:

  • Add: src/core/async_item_service.rs

Functions:

  • Add: get_item_async in async_item_service.rs
  • Add: save_item_async in async_item_service.rs
  • Add: list_items_async in async_item_service.rs
  • Add: delete_item_async in async_item_service.rs

Reason: Ensure services can be safely used in asynchronous contexts Implementation:

  • Document thread-safety guarantees for all core services
  • Use Arc<Mutex<T>> or connection pooling for shared resources
  • Provide examples for safe async/sync boundaries
  • Use tokio::task::spawn_blocking for CPU-bound or blocking I/O operations

6. Refactor CLI Modes to Use Services (PARTIALLY DONE)

Files:

  • Change: src/modes/get.rs (DONE)
  • Change: src/modes/save.rs (DONE)
  • Change: src/modes/list.rs (DONE)
  • Change: src/modes/info.rs (DONE)
  • Change: src/modes/delete.rs (DONE)
  • Change: src/modes/diff.rs (NOT DONE)
  • Change: src/modes/status.rs (DONE, uses shared function)

Functions:

  • Change: mode_get to use item_service (DONE)
  • Change: mode_save to use item_service (DONE)
  • Change: mode_list to use item_service (DONE)
  • Change: mode_info to use item_service (DONE)
  • Change: mode_delete to use item_service (DONE)
  • Change: mode_diff to use item_service (NOT DONE)
  • Change: mode_status to use new status service functions (DONE, uses shared function)

Reason: Remove direct database and file system access from modes Implementation:

  • Replace current implementations with calls to core services
  • Keep only CLI-specific formatting and output logic
  • Handle command-line argument parsing and validation
  • Use synchronous services directly
  • Implement streaming for stdin/stdout to maintain pipeline performance

7. Refactor REST API to Use Async Services

Files:

  • Change: src/modes/server/api/item.rs
  • Change: src/modes/server/api/status.rs

Functions:

  • Change: handle_get_item to use async_item_service::get_item_async
  • Change: handle_get_item_latest to use async_item_service::get_item_async
  • Change: handle_list_items to use async_item_service::list_items_async
  • Change: handle_post_item to use async_item_service::save_item_async
  • Change: handle_get_item_content to use async_item_service::get_item_content_async
  • Change: handle_get_item_meta to use async_item_service::get_item_meta_async
  • Change: handle_status to use async_item_service::get_status_async

Reason: Remove business logic from HTTP handlers Implementation:

  • Convert handlers to call async core services
  • Keep only HTTP-specific code (status codes, headers, etc.)
  • Use common error handling with conversions to HTTP responses
  • Wrap synchronous service calls in tokio::task::spawn_blocking

8. Refactor MCP Tools to Use Services

Files:

  • Change: src/modes/server/mcp/tools.rs

Functions:

  • Change: save_item to use item_service::save_item
  • Change: get_item to use item_service::get_item_full
  • Change: get_latest_item to use item_service::get_latest_item
  • Change: list_items to use item_service::list_items
  • Change: search_items to use item_service::search_items

Reason: Remove duplication with REST API and CLI modes Implementation:

  • Replace current implementation with calls to core services
  • Keep only MCP protocol-specific logic
  • Use synchronous services directly (MCP is typically local/short-lived)
  • Standardize response format to match API/CLI

9. Create Unified Error Handling (DONE)

Files:

  • Change: All files that handle errors

Functions:

  • Add: Conversion traits for all error types
  • Change: All error handling to use new error system

Reason: Standardize error types across the application Implementation:

  • Define comprehensive error enum with conversions:
    • From database errors
    • From I/O errors
    • From compression errors
    • From validation errors
  • Implement conversions to:
    • anyhow::Error (for CLI)
    • axum::http::StatusCode (for API)
    • ToolError (for MCP)
  • Provide user-friendly error messages
  • Include error codes for programmatic handling

10. Add Integration Tests

Files:

  • Add: tests/integration/core_tests.rs
  • Add: tests/integration/cli_tests.rs
  • Add: tests/integration/api_tests.rs
  • Add: tests/integration/performance_tests.rs

Functions:

  • Add: Test cases for all core service functions
  • Add: Test cases for CLI modes
  • Add: Test cases for API endpoints
  • Add: Performance benchmarks

Reason: Ensure refactored code maintains functionality and performance Implementation:

  • Test core services independently
  • Test CLI modes and APIs through their public interfaces
  • Verify compression, metadata, and database operations
  • Include performance benchmarks for critical paths
  • Use in-memory databases and tempfiles for isolation
  • Test both sync and async service implementations

11. Performance Optimization Guidelines (PARTIALLY DONE)

Files:

  • Change: All core service files
  • Change: All mode files
  • Change: All API handler files

Functions:

  • Add: Streaming implementations for I/O operations
  • Add: Benchmark functions for critical paths
  • Change: Buffer management to minimize copies

Reason: Ensure the refactored version doesn't slow down pipelines Implementation:

  • Use streaming for stdin/stdout processing (chunked I/O)
  • Minimize buffering and memory copies
  • Offload CPU-bound work (compression, plugins) to thread pools
  • Provide fast-path options (e.g., --fast flag to skip metadata plugins)
  • Benchmark critical operations before/after refactoring
  • Document performance characteristics and tradeoffs

Benefits:

  • Reduced code duplication between CLI, API, and MCP
  • Easier maintenance with clear separation of concerns
  • Consistent behavior across all interfaces
  • Better testability with isolated service layer
  • Maintained or improved pipeline performance
  • Flexible architecture supporting both sync and async use cases

Key Risks and Mitigations:

  1. Performance Regression:

    • Risk: Refactoring could slow down pipeline operations
    • Mitigation: Benchmark before/after, use streaming, minimize overhead
  2. Increased Complexity:

    • Risk: Adding service layer could make code harder to understand
    • Mitigation: Clear documentation, gradual refactoring, maintain simple interfaces
  3. Async/Sync Boundaries:

    • Risk: Mixing sync/async could lead to deadlocks or inefficiencies
    • Mitigation: Clear boundaries, use spawn_blocking for sync work in async context
  4. Breaking Changes:

    • Risk: Refactoring could change behavior in subtle ways
    • Mitigation: Comprehensive tests, gradual rollout, maintain backward compatibility

Design Principles:

  1. Zero-Copy Where Possible: Use slicing instead of copying data
  2. Streaming Processing: Handle data in chunks for memory efficiency
  3. Clear Boundaries: Separate core logic from interface-specific code
  4. Performance First: Optimize for common pipeline use cases
  5. Consistent Errors: Unified error handling across all interfaces
  6. Backward Compatibility: Maintain existing CLI/API behavior