asp/keep

Files

Andrew Phillips 19e1ae587f docs: update refactoring plan with thread safety, error handling, and layer boundaries

Co-authored-by: aider (openai/andrew/openrouter/mistralai/mistral-medium-3.1) <aider@aider.chat>

2025-08-24 23:36:55 -03:00

9.0 KiB

Raw Blame History

Refactoring Plan to Reduce Code Duplication

1. Create Core Service Layer with Clear Boundaries

Files: src/core/ (new directory)

src/core/item_service.rs - Main service for item operations (synchronous)
src/core/async_item_service.rs - Async wrapper for API use
src/core/compression_service.rs - Compression handling
src/core/meta_service.rs - Metadata processing
src/core/mod.rs - Module exports
src/core/error.rs - Unified error handling

Reason: Extract common business logic from modes and APIs into reusable services Implementation:

Move logic from modes (get, save, list, info) and API handlers into service functions
Services should return structured data, not format output
Handle compression, metadata, and database operations
Keep core services synchronous for CLI performance
Provide async wrappers using tokio::task::spawn_blocking for API use
Use streaming for pipeline efficiency (process data in chunks)

Layer Division:

db.rs: Low-level SQL operations (e.g., insert_item, query_all_items)
item_service.rs: Higher-level business logic (e.g., "get item with metadata and tags," "validate item before save")

Example:

// db.rs (low-level)
pub fn get_item_with_tags(conn: &Connection, id: i64) -> Result<(Item, Vec<Tag>)>

// item_service.rs (high-level)
pub fn get_item_full(conn: &Connection, id: i64) -> Result<ItemWithMeta> {
    let (item, tags) = db::get_item_with_tags(conn, id)?;
    let meta = db::get_item_meta(conn, &item)?;
    Ok(ItemWithMeta { item, tags, meta })
}

2. Thread Safety and Resource Management

Reason: Ensure services can be safely used in both synchronous and asynchronous contexts Implementation:

Document thread-safety guarantees for all core services
Use Arc<Mutex<T>> or connection pooling for shared resources like database connections

Provide examples for safe async/sync boundaries:

pub async fn get_item_async(
    state: &AppState,
    id: i64,
) -> Result<ItemWithMeta, CoreError> {
    let conn = state.db.clone(); // Arc<Mutex<Connection>>
    tokio::task::spawn_blocking(move || {
        let conn = conn.lock().unwrap();
        item_service::get_item_full(&conn, id)
    })
    .await?
}

Use tokio::task::spawn_blocking for CPU-bound or blocking I/O operations
Benchmark thread pool sizes for CPU-bound tasks (e.g., compression)

3. Create Common Data Structures

Files: src/core/types.rs Reason: Standardize data structures used across modes and APIs Implementation:

Define structs for Item, ItemWithContent, ItemWithMeta, Response<T>
Include conversion functions from database types (From<db::Item>)
Add serialization/deserialization support for JSON/YAML
Ensure all fields are properly documented
Use zero-copy patterns where possible (slicing instead of copying)

4. Unified Error Handling with Conversions

Reason: Standardize error types across CLI, API, and MCP interfaces Implementation:

Define a base error enum (CoreError) with conversions to all interface-specific error types

Example:

#[derive(Debug, thiserror::Error)]
pub enum CoreError {
    #[error("DB error: {0}")]
    Database(#[from] rusqlite::Error),
    #[error("IO error: {0}")]
    Io(#[from] std::io::Error),
    // ...
}

// Auto-convert to HTTP status
impl From<CoreError> for StatusCode {
    fn from(err: CoreError) -> Self {
        match err {
            CoreError::Database(_) => StatusCode::INTERNAL_SERVER_ERROR,
            // ...
        }
    }
}

Use #[derive(thiserror::Error)] for easy Display and Error implementations
Provide user-friendly error messages with error codes for programmatic handling

5. Refactor CLI Modes to Use Services

Files:

src/modes/get.rs
src/modes/save.rs
src/modes/list.rs
src/modes/info.rs
src/modes/delete.rs
src/modes/diff.rs
src/modes/status.rs

Reason: Remove direct database and file system access from modes Implementation:

Replace current implementations with calls to core services
Keep only CLI-specific formatting and output logic
Handle command-line argument parsing and validation
Use synchronous services directly
Implement streaming for stdin/stdout to maintain pipeline performance

6. Refactor REST API to Use Async Services

Files:

src/modes/server/api/item.rs
src/modes/server/api/status.rs

Reason: Remove business logic from HTTP handlers Implementation:

Convert handlers to call async core services
Keep only HTTP-specific code (status codes, headers, etc.)
Use common error handling with conversions to HTTP responses
Wrap synchronous service calls in tokio::task::spawn_blocking

7. Refactor MCP Tools to Use Services

Files: src/modes/server/mcp/tools.rs

Reason: Remove duplication with REST API and CLI modes Implementation:

Replace current implementation with calls to core services
Keep only MCP protocol-specific logic
Use synchronous services directly (MCP is typically local/short-lived)
Standardize response format to match API/CLI

8. Create Common Error Handling

Files: src/core/error.rs Reason: Standardize error types across the application Implementation:

Define comprehensive error enum with conversions:
- From database errors
- From I/O errors
- From compression errors
- From validation errors
Implement conversions to:
- anyhow::Error (for CLI)
- axum::http::StatusCode (for API)
- ToolError (for MCP)
Provide user-friendly error messages
Include error codes for programmatic handling

9. Update Database Layer for Batch Operations

Files: src/db.rs Reason: Support efficient batch operations needed by services Implementation:

Add functions to get multiple items with their metadata and tags
Add batch insertion/updates for tags and metadata
Add transaction support for atomic operations
Optimize queries for common access patterns
Ensure all batch operations are properly documented

10. Add Integration Tests

Files: tests/integration/ (new directory) Reason: Ensure refactored code maintains functionality and performance Implementation:

Test core services independently
Test CLI modes and APIs through their public interfaces
Verify compression, metadata, and database operations
Include performance benchmarks for critical paths
Use in-memory databases and tempfiles for isolation
Test both sync and async service implementations

11. Performance Optimization Guidelines

Reason: Ensure the refactored version doesn't slow down pipelines Implementation:

Use streaming for stdin/stdout processing (chunked I/O)
Minimize buffering and memory copies
Offload CPU-bound work (compression, plugins) to thread pools
Provide fast-path options (e.g., --fast flag to skip metadata plugins)
Benchmark critical operations before/after refactoring
Document performance characteristics

Implementation Order:

Create core module structure and error types (core/error.rs, core/types.rs)
Implement core services with basic functionality (sync first)
Add async wrappers for API use
Refactor one mode (e.g., get) to use services and validate approach
Refactor corresponding API endpoints to use async services
Repeat for other modes and APIs
Refactor MCP tools to use services
Add comprehensive tests and benchmarks
Clean up removed code from original files
Document performance characteristics and tradeoffs

Benefits:

Reduced code duplication between CLI, API, and MCP
Easier maintenance with clear separation of concerns
Consistent behavior across all interfaces
Better testability with isolated service layer
Maintained or improved pipeline performance
Flexible architecture supporting both sync and async use cases

Key Risks and Mitigations:

Performance Regression:
- Risk: Refactoring could slow down pipeline operations
- Mitigation: Benchmark before/after, use streaming, minimize overhead
Increased Complexity:
- Risk: Adding service layer could make code harder to understand
- Mitigation: Clear documentation, gradual refactoring, maintain simple interfaces
Async/Sync Boundaries:
- Risk: Mixing sync/async could lead to deadlocks or inefficiencies
- Mitigation: Clear boundaries, use spawn_blocking for sync work in async context
Breaking Changes:
- Risk: Refactoring could change behavior in subtle ways
- Mitigation: Comprehensive tests, gradual rollout, maintain backward compatibility

Design Principles:

Zero-Copy Where Possible: Use slicing instead of copying data
Streaming Processing: Handle data in chunks for memory efficiency
Clear Boundaries: Separate core logic from interface-specific code
Performance First: Optimize for common pipeline use cases
Consistent Errors: Unified error handling across all interfaces
Backward Compatibility: Maintain existing CLI/API behavior

9.0 KiB Raw Blame History