keep/PLAN.md

# Refactoring Plan to Reduce Code Duplication

## 1. Create Core Service Layer with Clear Boundaries
**Files:**
- Add: `src/core/item_service.rs`
- Add: `src/core/async_item_service.rs`
- Add: `src/core/compression_service.rs`
- Add: `src/core/meta_service.rs`
- Add: `src/core/mod.rs`
- Add: `src/core/error.rs`

**Functions:**
- Add: `get_item_full` in `item_service.rs`
- Add: `save_item` in `item_service.rs`
- Add: `list_items` in `item_service.rs`
- Add: `delete_item` in `item_service.rs`
- Add: `get_latest_item` in `item_service.rs` (used by the MCP tools in step 7)
- Add: `search_items` in `item_service.rs` (used by the MCP tools in step 7)
- Add: `get_compressed_content` in `compression_service.rs`
- Add: `process_metadata` in `meta_service.rs`
- Add: Async wrappers for all core functions in `async_item_service.rs`

**Reason:** Extract common business logic from modes and APIs into reusable services
**Implementation:**
- Move logic from modes (get, save, list, info) and API handlers into service functions
- Services should return structured data, not format output
- Handle compression, metadata, and database operations
- Keep core services **synchronous** so the CLI path does not pay for an async runtime
- Provide async wrappers using `tokio::task::spawn_blocking` for API use
- Use streaming for pipeline efficiency (process data in chunks)
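
The rule that services return structured data while callers format it could look like the sketch below. `ItemSummary`, `ItemStore`, `MemoryStore`, and `format_line` are hypothetical names for illustration only, and the in-memory store stands in for the real `db.rs` layer.

```rust
/// Structured result returned by the service (hypothetical fields).
#[derive(Debug, Clone, PartialEq)]
pub struct ItemSummary {
    pub id: i64,
    pub size: u64,
    pub tags: Vec<String>,
}

/// Minimal storage abstraction so the service can run without a database.
pub trait ItemStore {
    fn all_items(&self) -> Vec<ItemSummary>;
}

/// In-memory stand-in for the real database layer.
pub struct MemoryStore {
    pub items: Vec<ItemSummary>,
}

impl ItemStore for MemoryStore {
    fn all_items(&self) -> Vec<ItemSummary> {
        self.items.clone()
    }
}

/// Service function: filters by tag and returns data; it never prints.
pub fn list_items(store: &dyn ItemStore, tag: Option<&str>) -> Vec<ItemSummary> {
    store
        .all_items()
        .into_iter()
        .filter(|item| match tag {
            Some(t) => item.tags.iter().any(|x| x.as_str() == t),
            None => true,
        })
        .collect()
}

/// CLI-side formatting lives outside the service.
pub fn format_line(item: &ItemSummary) -> String {
    format!("{}\t{}\t{}", item.id, item.size, item.tags.join(","))
}
```

With this split, the REST handler can serialize the same `Vec<ItemSummary>` to JSON while the CLI calls `format_line`, and neither duplicates the filtering logic.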

**Layer Division:**
- **`db.rs`**: Low-level SQL operations (e.g., `insert_item`, `query_all_items`)
- **`item_service.rs`**: Higher-level business logic (e.g., "get item with metadata and tags," "validate item before save")
- Example:
  ```rust
  // db.rs (low-level)
  pub fn get_item_with_tags(conn: &Connection, id: i64) -> Result<(Item, Vec<Tag>)> {
      todo!() // SQL query elided in this plan
  }

  // item_service.rs (high-level)
  pub fn get_item_full(conn: &Connection, id: i64) -> Result<ItemWithMeta> {
      let (item, tags) = db::get_item_with_tags(conn, id)?;
      let meta = db::get_item_meta(conn, &item)?;
      Ok(ItemWithMeta { item, tags, meta })
  }
  ```

## 2. Thread Safety and Resource Management
**Files:**
- Change: `src/modes/server/api/item.rs`
- Change: `src/modes/server/api/status.rs`
- Change: `src/modes/server/mcp/tools.rs`

**Functions:**
- Add: `get_item_async` in `async_item_service.rs`
- Add: `save_item_async` in `async_item_service.rs`
- Change: All API handlers to use async services

**Reason:** Ensure services can be safely used in both synchronous and asynchronous contexts
**Implementation:**
- Document thread-safety guarantees for all core services
- Use `Arc<Mutex<T>>` or connection pooling for shared resources like database connections
- Provide examples for safe async/sync boundaries:
  ```rust
  pub async fn get_item_async(
      state: &AppState,
      id: i64,
  ) -> Result<ItemWithMeta, CoreError> {
      let conn = state.db.clone(); // Arc<Mutex<Connection>>
      tokio::task::spawn_blocking(move || {
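          // `unwrap` panics if the mutex is poisoned; production code should map this to a `CoreError`.
          let conn = conn.lock().unwrap();
          item_service::get_item_full(&conn, id)
      })
      // `?` converts a `JoinError`, so `CoreError` needs `From<tokio::task::JoinError>`.
      .await?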
  }
  ```
- Use `tokio::task::spawn_blocking` for CPU-bound or blocking I/O operations
- Benchmark thread pool sizes for CPU-bound tasks (e.g., compression)
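
The sketch below shows the same boundary without panicking on a poisoned lock or a crashed worker. To stay dependency-free it uses `std::thread::spawn` as a stand-in for `tokio::task::spawn_blocking`; `FakeConn`, `BoundaryError`, `get_name`, and `get_name_offloaded` are hypothetical names, not part of the planned API.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for a database connection (hypothetical).
pub struct FakeConn {
    pub rows: Vec<(i64, String)>,
}

#[derive(Debug, PartialEq)]
pub enum BoundaryError {
    NotFound(i64),
    LockPoisoned,
    WorkerPanicked,
}

/// Synchronous lookup, analogous to a core service call.
pub fn get_name(conn: &FakeConn, id: i64) -> Result<String, BoundaryError> {
    conn.rows
        .iter()
        .find(|(row_id, _)| *row_id == id)
        .map(|(_, name)| name.clone())
        .ok_or(BoundaryError::NotFound(id))
}

/// Runs the blocking lookup on another thread and maps every failure mode
/// (poisoned lock, panicked worker) to an error value instead of panicking.
pub fn get_name_offloaded(
    db: &Arc<Mutex<FakeConn>>,
    id: i64,
) -> Result<String, BoundaryError> {
    let db = Arc::clone(db);
    let handle = thread::spawn(move || -> Result<String, BoundaryError> {
        let conn = db.lock().map_err(|_| BoundaryError::LockPoisoned)?;
        get_name(&conn, id)
    });
    // The outer error is the worker's panic payload; the inner one is the service error.
    handle.join().map_err(|_| BoundaryError::WorkerPanicked)?
}
```

The nested-result shape mirrors what `spawn_blocking(...).await` produces, which is why the async wrapper needs a conversion for the join error as well as for service errors.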

## 3. Create Common Data Structures
**Files:**
- Add: `src/core/types.rs`

**Functions:**
- Add: `From<db::Item>` for `ItemWithMeta`
- Add: `From<ItemWithMeta>` for `ItemInfo` (API response)
- Add: Serialization/deserialization implementations

**Reason:** Standardize data structures used across modes and APIs
**Implementation:**
- Define structs for `Item`, `ItemWithContent`, `ItemWithMeta`, `Response<T>`
- Include conversion functions from database types (`From<db::Item>`)
- Add serialization/deserialization support for JSON/YAML
- Ensure all fields are properly documented
- Use zero-copy patterns where possible (slicing instead of copying)
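
A conversion from the service type to the API response type might look like this. The field names are assumptions made for the example; only the type names `Item`, `ItemWithMeta`, and `ItemInfo` come from this plan.

```rust
use std::collections::BTreeMap;

/// Row as stored in the database (hypothetical fields).
#[derive(Debug, Clone)]
pub struct Item {
    pub id: i64,
    pub size: u64,
    pub compression: String,
}

/// Service-level view: the item plus its tags and metadata.
#[derive(Debug, Clone)]
pub struct ItemWithMeta {
    pub item: Item,
    pub tags: Vec<String>,
    pub meta: BTreeMap<String, String>,
}

/// Flat shape returned by the API.
#[derive(Debug, Clone, PartialEq)]
pub struct ItemInfo {
    pub id: i64,
    pub size: u64,
    pub compression: String,
    pub tags: Vec<String>,
    pub meta: BTreeMap<String, String>,
}

impl From<ItemWithMeta> for ItemInfo {
    fn from(value: ItemWithMeta) -> Self {
        // Fields are moved out of `value`, so no strings or maps are cloned.
        ItemInfo {
            id: value.item.id,
            size: value.item.size,
            compression: value.item.compression,
            tags: value.tags,
            meta: value.meta,
        }
    }
}
```

Handlers can then write `let info: ItemInfo = full.into();` and keep the mapping in one place.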

## 4. Unified Error Handling with Conversions
**Files:**
- Change: `src/core/error.rs` (added in step 1)
- Change: `src/modes/server/api/item.rs`
- Change: `src/modes/server/mcp/tools.rs`
- Change: All mode files (`get.rs`, `save.rs`, etc.)

**Functions:**
- Add: `CoreError` enum with comprehensive variants
- Add: `From<CoreError>` for `StatusCode`
- Add: `From<CoreError>` for `ToolError`
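- Note: no hand-written `From<CoreError>` for `anyhow::Error` is needed (it would conflict with `anyhow`'s blanket impl); `?` already converts any `std::error::Error + Send + Sync + 'static` type, which `CoreError` satisfies as long as all of its variants do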
- Change: All error handling to use `CoreError`

**Reason:** Standardize error types across CLI, API, and MCP interfaces
**Implementation:**
- Define a base error enum (`CoreError`) with conversions to all interface-specific error types
- Example:
  ```rust
  #[derive(Debug, thiserror::Error)]
  pub enum CoreError {
      #[error("DB error: {0}")]
      Database(#[from] rusqlite::Error),
      #[error("IO error: {0}")]
      Io(#[from] std::io::Error),
      // ...
  }

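  // Auto-convert to HTTP status
  impl From<CoreError> for StatusCode {
      fn from(err: CoreError) -> Self {
          match err {
              CoreError::Database(_) => StatusCode::INTERNAL_SERVER_ERROR,
              CoreError::Io(_) => StatusCode::INTERNAL_SERVER_ERROR,
              // ... one arm per additional variant so the match stays exhaustive
          }
      }
  }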
  ```
- Use `#[derive(thiserror::Error)]` for easy `Display` and `Error` implementations
- Provide user-friendly error messages with error codes for programmatic handling
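
Error codes for programmatic handling could be attached as a method on the enum. The sketch below is a hand-written, std-only variant (the plan itself uses `thiserror`); the `NotFound` and `Validation` variants, the code strings, and the numeric status mapping are assumptions for illustration.

```rust
use std::fmt;
use std::io;

#[derive(Debug)]
pub enum CoreError {
    NotFound(i64),
    Validation(String),
    Io(io::Error),
}

impl CoreError {
    /// Stable, machine-readable code (the names are illustrative).
    pub fn code(&self) -> &'static str {
        match self {
            CoreError::NotFound(_) => "E_NOT_FOUND",
            CoreError::Validation(_) => "E_VALIDATION",
            CoreError::Io(_) => "E_IO",
        }
    }

    /// HTTP status as a plain number, to avoid depending on an HTTP crate here.
    pub fn http_status(&self) -> u16 {
        match self {
            CoreError::NotFound(_) => 404,
            CoreError::Validation(_) => 400,
            CoreError::Io(_) => 500,
        }
    }
}

impl fmt::Display for CoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CoreError::NotFound(id) => write!(f, "[{}] item {} not found", self.code(), id),
            CoreError::Validation(msg) => write!(f, "[{}] invalid input: {}", self.code(), msg),
            CoreError::Io(err) => write!(f, "[{}] I/O error: {}", self.code(), err),
        }
    }
}

impl std::error::Error for CoreError {}

impl From<io::Error> for CoreError {
    fn from(err: io::Error) -> Self {
        CoreError::Io(err)
    }
}
```

Keeping `code()` separate from `Display` lets the message wording change without breaking scripts that match on the code.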

## 5. Refactor CLI Modes to Use Services
**Files:**
- Change: `src/modes/get.rs`
- Change: `src/modes/save.rs`
- Change: `src/modes/list.rs`
- Change: `src/modes/info.rs`
- Change: `src/modes/delete.rs`
- Change: `src/modes/diff.rs`
- Change: `src/modes/status.rs`

**Functions:**
- Change: `mode_get` to use `item_service::get_item_full`
- Change: `mode_save` to use `item_service::save_item`
- Change: `mode_list` to use `item_service::list_items`
- Change: `mode_info` to use `item_service::get_item_full`
- Change: `mode_delete` to use `item_service::delete_item`
- Change: `mode_diff` to use `item_service::get_item_full` for both items
- Change: `mode_status` to use new status service functions

**Reason:** Remove direct database and file system access from modes
**Implementation:**
- Replace current implementations with calls to core services
- Keep only CLI-specific formatting and output logic
- Handle command-line argument parsing and validation
- Use synchronous services directly
- Implement streaming for stdin/stdout to maintain pipeline performance
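
Chunked streaming between a reader and a writer can be sketched with the standard library alone. `stream_copy` is a hypothetical helper name; `std::io::copy` already covers the plain case, so the explicit loop is mainly useful as the place where compression or hashing of each chunk would be hooked in.

```rust
use std::io::{self, Read, Write};

/// Copies `reader` to `writer` in fixed-size chunks and returns the byte count.
/// Memory use is bounded by `chunk_size`, regardless of input length.
pub fn stream_copy<R: Read, W: Write>(
    mut reader: R,
    mut writer: W,
    chunk_size: usize,
) -> io::Result<u64> {
    // Guard against a zero-sized buffer, which would look like end of input.
    let mut buf = vec![0u8; chunk_size.max(1)];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(&buf[..n])?;
        total += n as u64;
    }
    writer.flush()?;
    Ok(total)
}
```

A CLI mode would call it as `stream_copy(io::stdin().lock(), io::stdout().lock(), 64 * 1024)`; the chunk size here is an arbitrary starting point to be tuned by benchmarks.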

## 6. Refactor REST API to Use Async Services
**Files:**
- Change: `src/modes/server/api/item.rs`
- Change: `src/modes/server/api/status.rs`

**Functions:**
- Change: `handle_get_item` to use `async_item_service::get_item_async`
- Change: `handle_get_item_latest` to use `async_item_service::get_item_async`
- Change: `handle_list_items` to use `async_item_service::list_items_async`
- Change: `handle_post_item` to use `async_item_service::save_item_async`
- Change: `handle_get_item_content` to use `async_item_service::get_item_content_async`
- Change: `handle_get_item_meta` to use `async_item_service::get_item_meta_async`
- Change: `handle_status` to use `async_item_service::get_status_async`

**Reason:** Remove business logic from HTTP handlers
**Implementation:**
- Convert handlers to call async core services
- Keep only HTTP-specific code (status codes, headers, etc.)
- Use common error handling with conversions to HTTP responses
- Do not call synchronous services directly from handlers; the async services already wrap them in `tokio::task::spawn_blocking`

## 7. Refactor MCP Tools to Use Services
**Files:**
- Change: `src/modes/server/mcp/tools.rs`

**Functions:**
- Change: `save_item` to use `item_service::save_item`
- Change: `get_item` to use `item_service::get_item_full`
- Change: `get_latest_item` to use `item_service::get_latest_item`
- Change: `list_items` to use `item_service::list_items`
- Change: `search_items` to use `item_service::search_items`

**Reason:** Remove duplication with REST API and CLI modes
**Implementation:**
- Replace current implementation with calls to core services
- Keep only MCP protocol-specific logic
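- Use synchronous services directly where the tool code runs outside the async runtime (MCP is typically local/short-lived); tool handlers that run on the Tokio runtime should go through the async wrappers so blocking calls do not stall the executor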
- Standardize response format to match API/CLI

## 8. Create Common Error Handling
**Files:**
- Change: `src/core/error.rs` (added in step 1)
- Change: All files that handle errors

**Functions:**
- Add: Comprehensive error handling in `core/error.rs`
- Add: Conversion traits for all error types
- Change: All error handling to use new error system

**Reason:** Standardize error types across the application
**Implementation:**
- Define comprehensive error enum with conversions:
  - From database errors
  - From I/O errors
  - From compression errors
  - From validation errors
- Implement conversions to:
  - `anyhow::Error` (for CLI; provided by `anyhow`'s blanket `From` impl, so no custom impl is written)
  - `axum::http::StatusCode` (for API)
  - `ToolError` (for MCP)
- Provide user-friendly error messages
- Include error codes for programmatic handling

## 9. Update Database Layer for Batch Operations
**Files:**
- Change: `src/db.rs`

**Functions:**
- Add: `get_items_with_meta_batch`
- Add: `get_items_with_tags_batch`
- Add: `insert_item_with_meta_transaction`
- Add: `delete_item_with_meta_transaction`
- Change: Optimize existing queries for batch operations

**Reason:** Support efficient batch operations needed by services
**Implementation:**
- Add functions to get multiple items with their metadata and tags
- Add batch insertion/updates for tags and metadata
- Add transaction support for atomic operations
- Optimize queries for common access patterns
- Ensure all batch operations are properly documented
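
One way to fetch tags for many items in a single query is to build an `IN (...)` clause with one placeholder per id and group the flat result rows afterwards. The sketch below covers only those two pure steps; the table and column names (`tags`, `item_id`, `name`) and both function names are assumptions, and executing the statement is left to the database layer.

```rust
use std::collections::HashMap;

/// Builds a query with one `?` placeholder per id.
/// Returns `None` for an empty batch so callers can skip the query entirely.
pub fn tags_batch_sql(id_count: usize) -> Option<String> {
    if id_count == 0 {
        return None;
    }
    let placeholders = vec!["?"; id_count].join(", ");
    Some(format!(
        "SELECT item_id, name FROM tags WHERE item_id IN ({}) ORDER BY item_id",
        placeholders
    ))
}

/// Groups flat `(item_id, tag)` rows by item, preserving row order per item.
pub fn group_tags(rows: Vec<(i64, String)>) -> HashMap<i64, Vec<String>> {
    let mut grouped: HashMap<i64, Vec<String>> = HashMap::new();
    for (item_id, tag) in rows {
        grouped.entry(item_id).or_default().push(tag);
    }
    grouped
}
```

SQLite caps the number of bound parameters per statement, so very large batches would need to be split into several queries.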

## 10. Add Integration Tests
**Files:**
- Add: `tests/integration/core_tests.rs`
- Add: `tests/integration/cli_tests.rs`
- Add: `tests/integration/api_tests.rs`
- Add: `tests/integration/performance_tests.rs`

**Functions:**
- Add: Test cases for all core service functions
- Add: Test cases for CLI modes
- Add: Test cases for API endpoints
- Add: Performance benchmarks

**Reason:** Ensure refactored code maintains functionality and performance
**Implementation:**
- Test core services independently
- Test CLI modes and APIs through their public interfaces
- Verify compression, metadata, and database operations
- Include performance benchmarks for critical paths
- Use in-memory databases and tempfiles for isolation
- Test both sync and async service implementations
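
File-system isolation for tests can be sketched with a small guard that creates a unique directory and removes it on drop. `ScratchDir` is a hypothetical, std-only stand-in; the `tempfile` crate is the more common choice in practice.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::process;
use std::sync::atomic::{AtomicU64, Ordering};

/// Per-process counter so parallel tests never share a directory.
static COUNTER: AtomicU64 = AtomicU64::new(0);

/// Unique scratch directory that is removed when the guard is dropped.
pub struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    pub fn new(prefix: &str) -> std::io::Result<Self> {
        let n = COUNTER.fetch_add(1, Ordering::Relaxed);
        let name = format!("{}-{}-{}", prefix, process::id(), n);
        let path = std::env::temp_dir().join(name);
        fs::create_dir_all(&path)?;
        Ok(ScratchDir { path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored on purpose.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

Each test creates its own `ScratchDir`, points the data directory at `dir.path()`, and leaves nothing behind even when an assertion fails, because `Drop` still runs during unwinding.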

## 11. Performance Optimization Guidelines
**Files:**
- Change: All core service files
- Change: All mode files
- Change: All API handler files

**Functions:**
- Add: Streaming implementations for I/O operations
- Add: Benchmark functions for critical paths
- Change: Buffer management to minimize copies

**Reason:** Ensure the refactored version doesn't slow down pipelines
**Implementation:**
- Use streaming for stdin/stdout processing (chunked I/O)
- Minimize buffering and memory copies
- Offload CPU-bound work (compression, plugins) to thread pools
- Provide fast-path options (e.g., `--fast` flag to skip metadata plugins)
- Benchmark critical operations before/after refactoring
- Document performance characteristics and tradeoffs
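
A before/after comparison needs only a timing helper and a tolerance check. The sketch below uses `std::time::Instant`; `mean_duration` and `within_budget` are hypothetical helpers, and a dedicated benchmarking crate would give more reliable statistics than a single mean.

```rust
use std::time::{Duration, Instant};

/// Runs `op` `iterations` times and returns the mean wall-clock time per call.
pub fn mean_duration<F: FnMut()>(iterations: u32, mut op: F) -> Duration {
    assert!(iterations > 0, "iterations must be positive");
    let start = Instant::now();
    for _ in 0..iterations {
        op();
    }
    start.elapsed() / iterations
}

/// True when `after` is at most `tolerance` slower than `before`
/// (a tolerance of 0.10 allows a 10% regression).
pub fn within_budget(before: Duration, after: Duration, tolerance: f64) -> bool {
    after.as_secs_f64() <= before.as_secs_f64() * (1.0 + tolerance)
}
```

Recording `mean_duration` for a critical path before the refactor and asserting `within_budget` afterwards turns the "no slowdown" goal into a check that can fail.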

## Implementation Order:
1. Create core module structure and error types (`core/error.rs`, `core/types.rs`)
2. Implement core services with basic functionality (sync first)
3. Add async wrappers for API use
4. Refactor one mode (e.g., `get`) to use services and validate approach
5. Refactor corresponding API endpoints to use async services
6. Repeat for other modes and APIs
7. Refactor MCP tools to use services
8. Add comprehensive tests and benchmarks
9. Clean up removed code from original files
10. Document performance characteristics and tradeoffs

## Benefits:
- Reduced code duplication between CLI, API, and MCP
- Easier maintenance with clear separation of concerns
- Consistent behavior across all interfaces
- Better testability with isolated service layer
- Maintained or improved pipeline performance
- Flexible architecture supporting both sync and async use cases

## Key Risks and Mitigations:
1. **Performance Regression:**
   - Risk: Refactoring could slow down pipeline operations
   - Mitigation: Benchmark before/after, use streaming, minimize overhead

2. **Increased Complexity:**
   - Risk: Adding service layer could make code harder to understand
   - Mitigation: Clear documentation, gradual refactoring, maintain simple interfaces

3. **Async/Sync Boundaries:**
   - Risk: Mixing sync/async could lead to deadlocks or inefficiencies
   - Mitigation: Clear boundaries, use `spawn_blocking` for sync work in async context

4. **Breaking Changes:**
   - Risk: Refactoring could change behavior in subtle ways
   - Mitigation: Comprehensive tests, gradual rollout, maintain backward compatibility

## Design Principles:
1. **Zero-Copy Where Possible:** Use slicing instead of copying data
2. **Streaming Processing:** Handle data in chunks for memory efficiency
3. **Clear Boundaries:** Separate core logic from interface-specific code
4. **Performance First:** Optimize for common pipeline use cases
5. **Consistent Errors:** Unified error handling across all interfaces
6. **Backward Compatibility:** Maintain existing CLI/API behavior
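
As a small illustration of the zero-copy principle, the function below splits a buffer at the first newline and returns two slices that borrow from the input instead of allocating. `split_first_line` is a hypothetical helper, not an existing function in the code base.

```rust
/// Splits `buf` at the first `\n` without allocating.
/// Both returned slices borrow from `buf`; the newline itself is dropped.
/// If there is no newline, the whole buffer is the first part and the rest is empty.
pub fn split_first_line(buf: &[u8]) -> (&[u8], &[u8]) {
    match buf.iter().position(|&b| b == b'\n') {
        Some(idx) => (&buf[..idx], &buf[idx + 1..]),
        None => (buf, &buf[buf.len()..]),
    }
}
```

The borrow checker ties both slices to the lifetime of `buf`, so the caller cannot free or mutate the buffer while the slices are in use.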