keep/DESIGN.md
Andrew Phillips 20c2716915 docs: update DESIGN.md with current application state
Co-authored-by: aider (openai/andrew/openrouter/qwen/qwen3-coder) <aider@aider.chat>
2025-08-15 13:20:53 -03:00

PROJECT RULES - KEEP FIRST

Standard Rules

  1. ALWAYS keep DESIGN.md updated with any architectural or design changes
  2. ALWAYS keep project rules first in this document
  3. ALWAYS use git commands to remove or move files (git rm, git mv, etc.)
  4. Follow Rust naming conventions and idioms
  5. Use anyhow for error handling throughout the codebase
  6. Maintain comprehensive logging with the log crate
  7. Write unit tests for critical functionality
  8. Document public APIs with rustdoc comments
  9. Keep modules focused on single responsibilities
  10. Prefer composition over inheritance
  11. Handle errors gracefully and provide meaningful error messages
  12. Ensure code is safe and avoids unsafe blocks where possible

Code - Modules

Main Module

  • main.rs - Entry point, CLI argument parsing, mode dispatching
  • Interacts with all mode modules based on user input
  • Handles database connection setup and data directory management

Mode Modules

  • modes/save.rs - Save new items with tags/metadata
  • modes/get.rs - Retrieve items by ID/tags
  • modes/list.rs - List items with filtering and formatting
  • modes/delete.rs - Delete items by ID
  • modes/update.rs - Update item tags/metadata
  • modes/info.rs - Show detailed item information
  • modes/diff.rs - Compare two items
  • modes/status.rs - Show system status and capabilities
  • modes/server.rs - REST HTTP server mode with OpenAPI documentation
  • modes/common.rs - Shared utilities for all modes

Database Module

  • db.rs - SQLite database operations
  • Handles items, tags, and metadata storage
  • Provides query functions for all modes
  • Manages database migrations

Compression Engine Module

  • compression_engine.rs - Trait and type definitions
  • compression_engine/gzip.rs - GZip implementation
  • compression_engine/lz4.rs - LZ4 implementation
  • compression_engine/none.rs - No compression implementation
  • compression_engine/program.rs - External program wrapper

Meta Plugin Module

  • meta_plugin.rs - Trait and type definitions
  • meta_plugin/program.rs - External program wrapper
  • meta_plugin/digest.rs - Internal digest implementations
  • meta_plugin/system.rs - System information metadata plugins

Common Modules

  • common/is_binary.rs - Binary file detection utilities
  • common/status.rs - Status information generation

Utility Modules

  • plugins.rs - Shared plugin utilities
  • args.rs - CLI argument definitions

Command Line Interface

Modes

  • Save mode: keep [--save] (default when no mode specified and no IDs provided)
  • Get mode: keep [--get] <ID|tag...> (default when IDs provided)
  • List mode: keep [--list] [tag...]
  • Info mode: keep [--info] <ID|tag...>
  • Delete mode: keep [--delete] <ID...>
  • Update mode: keep [--update] <ID> [tag...]
  • Diff mode: keep [--diff] <ID1> <ID2>
  • Status mode: keep [--status]
  • Server mode: keep [--server] <address:port>
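
The default-mode rule above (save when no mode and no IDs are given, get when IDs are given) can be sketched as a small function. The Mode enum and resolve_mode are hypothetical names for this example; the actual dispatch in main.rs may be structured differently.

```rust
/// Hypothetical enum mirroring the modes listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Save,
    Get,
    List,
    Info,
    Delete,
    Update,
    Diff,
    Status,
    Server,
}

/// Picks the mode to run.
/// `explicit` is the mode chosen by a flag such as --list, if any;
/// `positional` holds the remaining ID or tag arguments.
pub fn resolve_mode(explicit: Option<Mode>, positional: &[String]) -> Mode {
    match explicit {
        Some(mode) => mode,
        // No flag and nothing to look up: read new content and save it.
        None if positional.is_empty() => Mode::Save,
        // No flag but IDs or tags were given: retrieve.
        None => Mode::Get,
    }
}
```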

Item Options

  • --meta KEY[=VALUE] - Set metadata KEY to VALUE for the item; remove KEY if VALUE is not provided
  • --digest <sha256|md5> - Digest algorithm to use when saving items
  • --compression <lz4|gzip|bzip2|xz|zstd|none> - Compression algorithm to use when saving items
  • --meta-plugins <plugin[,plugin...]> - Meta plugins to use when saving items
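
To make the --meta KEY[=VALUE] semantics concrete, the following sketch parses one argument into a set or remove operation. MetaChange and parse_meta_arg are hypothetical names, and treating "KEY=" as setting an empty value is an assumption, since the text above does not cover that case.

```rust
/// Hypothetical representation of one --meta argument.
#[derive(Debug, PartialEq, Eq)]
pub enum MetaChange {
    Set { key: String, value: String },
    Remove { key: String },
}

/// Parses `KEY=VALUE` into a set operation and a bare `KEY` into a removal.
pub fn parse_meta_arg(arg: &str) -> Result<MetaChange, String> {
    // Split on the first '=' only, so values may themselves contain '='.
    let (key, value) = match arg.split_once('=') {
        Some((k, v)) => (k, Some(v)),
        None => (arg, None),
    };
    if key.is_empty() {
        return Err(format!("empty metadata key in {arg:?}"));
    }
    Ok(match value {
        // Assumption: "KEY=" sets an empty value rather than removing the key.
        Some(v) => MetaChange::Set { key: key.to_string(), value: v.to_string() },
        None => MetaChange::Remove { key: key.to_string() },
    })
}
```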

General Options

  • --dir <PATH> - Specify the directory to use for storage
  • --list-format <FORMAT> - Comma-separated list of columns to display with --list
  • --human-readable - Display file sizes with units
  • --verbose - Increase message verbosity
  • --quiet - Do not show any messages
  • --output-format <table|json|yaml> - Output format for info, status, and list modes
  • --server-password <PASSWORD> - Password for server authentication
  • --force - Force output even when binary data would be sent to a TTY

Data Storage

Database Schema

  • items table: id (primary key), ts (timestamp), size (optional), compression
  • tags table: id (foreign key to items), name (tag name)
  • metas table: id (foreign key to items), name (meta key), value (meta value)
  • Indexes on tag names and meta names for faster queries
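
For orientation, the schema above could be expressed as SQLite DDL along these lines, held in a Rust constant. The column types, NOT NULL constraints, and index names are assumptions; only the table and column names come from the description above.

```rust
/// Hypothetical DDL for the schema described above.
/// Types, constraints, and index names are illustrative guesses.
pub const SCHEMA_SQL: &str = "
CREATE TABLE IF NOT EXISTS items (
    id          INTEGER PRIMARY KEY,
    ts          TEXT NOT NULL,
    size        INTEGER,
    compression TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS tags (
    id   INTEGER NOT NULL REFERENCES items(id),
    name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS metas (
    id    INTEGER NOT NULL REFERENCES items(id),
    name  TEXT NOT NULL,
    value TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_tags_name ON tags(name);
CREATE INDEX IF NOT EXISTS idx_metas_name ON metas(name);
";

/// Splits the schema into individual statements, e.g. to run them one by one.
pub fn schema_statements() -> Vec<&'static str> {
    SCHEMA_SQL
        .split(';')
        .map(str::trim)
        .filter(|statement| !statement.is_empty())
        .collect()
}
```

Leaving size nullable matches the "(optional)" note on that column.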

File Storage

  • Data directory contains compressed item files named by their item ID
  • Database file stored in data directory
  • File permissions set to be private to user (umask 077)
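
The text above describes private permissions via umask 077. As an alternative way to reach the same result for a single file, the sketch below requests mode 0o600 explicitly at creation time. It is Unix-only, and create_private_file is a hypothetical helper, not a function from the codebase.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

/// Creates a new file readable and writable by the owner only.
/// Fails if the file already exists, so an existing item is never overwritten.
pub fn create_private_file(path: &Path) -> io::Result<File> {
    OpenOptions::new()
        .write(true)
        .create_new(true)
        // Requested mode; the process umask can only remove bits from it.
        .mode(0o600)
        .open(path)
}
```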

REST API Endpoints

Status Operations

  • GET /api/status - Get system status information

Item Operations

  • GET /api/item/ - Get a list of items as JSON. Optional params: order=newest|oldest, start=0, count=100, tags[]=tag1&tags[]=tag2
  • POST /api/item/ - Add a new item
  • DELETE /api/item/<#> - Delete an item
  • GET /api/item/latest - Return the latest item as JSON. Optional params: tags[]=tag1&tags[]=tag2, allow_binary=true|false
  • GET /api/item/latest/meta - Return the latest item metadata as JSON. Optional params: tags[]=tag1&tags[]=tag2
  • GET /api/item/latest/content - Return the raw content of the latest item. Optional params: tags[]=tag1&tags[]=tag2
  • GET /api/item/<#> - Return the item as JSON. Optional params: allow_binary=true|false
  • GET /api/item/<#>/meta - Return the item metadata as JSON
  • GET /api/item/<#>/content - Return the raw content of the item
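
A client would need to assemble the optional parameters of GET /api/item/ into a query string. The sketch below shows one way to do that; Order, percent_encode, and list_items_path are hypothetical helper names, and the choice to percent-encode tag values while leaving the "tags[]" key literal is an assumption about what the server accepts.

```rust
/// Sort order for the item list, matching order=newest|oldest.
#[derive(Debug, Clone, Copy)]
pub enum Order {
    Newest,
    Oldest,
}

/// Percent-encodes everything except RFC 3986 unreserved characters.
fn percent_encode(value: &str) -> String {
    let mut out = String::new();
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Builds the request path for listing items.
pub fn list_items_path(order: Order, start: u64, count: u64, tags: &[&str]) -> String {
    let order = match order {
        Order::Newest => "newest",
        Order::Oldest => "oldest",
    };
    let mut path = format!("/api/item/?order={order}&start={start}&count={count}");
    for tag in tags {
        path.push_str("&tags[]=");
        path.push_str(&percent_encode(tag));
    }
    path
}
```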

Authentication

  • Bearer token authentication: Authorization: Bearer <password>
  • Basic authentication: Authorization: Basic base64(keep:<password>)
  • When no password is set, authentication is disabled
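
The two header formats above can be produced as follows. This is a client-side sketch using only the standard library, so it carries its own small Base64 encoder; the function names are hypothetical, and the fixed user name "keep" is taken from the Basic format shown above.

```rust
const ALPHABET: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard Base64 (RFC 4648) with '=' padding.
pub fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling missing bytes.
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Value for `Authorization: Bearer <password>`.
pub fn bearer_header(password: &str) -> String {
    format!("Bearer {password}")
}

/// Value for `Authorization: Basic base64(keep:<password>)`.
pub fn basic_header(password: &str) -> String {
    format!("Basic {}", base64_encode(format!("keep:{password}").as_bytes()))
}
```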

Supported Compression Types

  • LZ4 (internal implementation)
  • GZip (internal implementation)
  • BZip2 (external program)
  • XZ (external program)
  • ZStd (external program)
  • None (no compression)
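
One way to model this list is an enum that parses the values accepted by --compression and records which types depend on an external program. CompressionType and needs_external_program are hypothetical names for this sketch; the accepted strings are those listed for the --compression option.

```rust
use std::str::FromStr;

/// Hypothetical enum covering the supported compression types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompressionType {
    Lz4,
    Gzip,
    Bzip2,
    Xz,
    Zstd,
    None,
}

impl CompressionType {
    /// True for the types listed above as "external program".
    pub fn needs_external_program(self) -> bool {
        matches!(self, Self::Bzip2 | Self::Xz | Self::Zstd)
    }
}

impl FromStr for CompressionType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "lz4" => Ok(Self::Lz4),
            "gzip" => Ok(Self::Gzip),
            "bzip2" => Ok(Self::Bzip2),
            "xz" => Ok(Self::Xz),
            "zstd" => Ok(Self::Zstd),
            "none" => Ok(Self::None),
            other => Err(format!("unsupported compression type: {other}")),
        }
    }
}
```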

Supported Meta Plugins

  • FileMagic - File type detection using file command
  • FileMime - MIME type detection using file command
  • FileEncoding - File encoding detection using file command
  • LineCount - Line count using wc command
  • WordCount - Word count using wc command
  • Cwd - Current working directory
  • Binary - Binary file detection
  • Uid - Current user ID
  • User - Current username
  • Gid - Current group ID
  • Group - Current group name
  • Shell - Shell path from SHELL environment variable
  • ShellPid - Shell process ID from PPID environment variable
  • KeepPid - Keep process ID
  • DigestSha256 - SHA-256 digest
  • DigestMd5 - MD5 digest using md5sum command
  • ReadTime - Time taken to read data
  • ReadRate - Rate of data reading
  • Hostname - System hostname
  • FullHostname - Fully qualified domain name
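
Three of the system plugins above (Cwd, KeepPid, Shell) need nothing beyond the standard library, which makes them a convenient illustration of how a plugin trait might look. The trait shape, the meta key strings, and the collect helper are assumptions for this sketch and may not match meta_plugin.rs.

```rust
use std::env;

/// Hypothetical plugin interface: a meta key plus an optional value.
pub trait MetaPlugin {
    fn name(&self) -> &'static str;
    /// Returns None when the value cannot be determined.
    fn value(&self) -> Option<String>;
}

pub struct Cwd;
pub struct KeepPid;
pub struct Shell;

impl MetaPlugin for Cwd {
    fn name(&self) -> &'static str {
        "cwd"
    }
    fn value(&self) -> Option<String> {
        env::current_dir().ok().map(|path| path.display().to_string())
    }
}

impl MetaPlugin for KeepPid {
    fn name(&self) -> &'static str {
        "keep_pid"
    }
    fn value(&self) -> Option<String> {
        Some(std::process::id().to_string())
    }
}

impl MetaPlugin for Shell {
    fn name(&self) -> &'static str {
        "shell"
    }
    fn value(&self) -> Option<String> {
        env::var("SHELL").ok()
    }
}

/// Runs every plugin and keeps the (key, value) pairs that produced a value.
pub fn collect(plugins: &[Box<dyn MetaPlugin>]) -> Vec<(String, String)> {
    plugins
        .iter()
        .filter_map(|plugin| plugin.value().map(|value| (plugin.name().to_string(), value)))
        .collect()
}
```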

Testing Strategy

  • Unit tests for each module in src/tests/
  • Integration tests for modes
  • Database tests for CRUD operations
  • Compression engine tests for each supported format
  • Meta plugin tests for each plugin type
  • Server tests for API endpoints and authentication
  • Common utilities tests for helper functions

Binary Data Handling

  • Automatic binary detection using file signatures and heuristics
  • Prevents binary data output to TTY unless --force is used
  • Binary meta plugin analyzes content to determine if it's binary
  • API endpoints respect binary flags to prevent accidental binary transmission
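
The detection and TTY guard described above might look roughly like this. The specific signatures checked, the 8192-byte sample size, and the NUL-byte heuristic are assumptions chosen for illustration; the actual logic in common/is_binary.rs may use different rules.

```rust
/// A few well-known magic numbers: PNG, gzip, and ELF.
const SIGNATURES: &[&[u8]] = &[b"\x89PNG", b"\x1f\x8b", b"\x7fELF"];

/// Guesses whether `data` is binary.
pub fn looks_binary(data: &[u8]) -> bool {
    if SIGNATURES.iter().any(|signature| data.starts_with(signature)) {
        return true;
    }
    // Heuristic: text rarely contains NUL bytes; inspect only a leading sample.
    let sample = &data[..data.len().min(8192)];
    sample.contains(&0)
}

/// Decides whether content may be written to the output.
/// Binary data is blocked only when the output is a TTY and --force is absent.
pub fn allow_output(is_binary: bool, output_is_tty: bool, force: bool) -> bool {
    !is_binary || !output_is_tty || force
}
```

In a real program, output_is_tty could come from std::io::IsTerminal on stdout; it is passed in as a plain bool here so the decision stays easy to test.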

Security Considerations

  • File permissions are restricted to user only (umask 077)
  • Input validation for item IDs to prevent path traversal
  • Authentication for server mode with bearer or basic auth
  • Proper resource cleanup using RAII patterns
  • Safe handling of external processes with proper stdin/stdout management
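
The path traversal point can be illustrated with a small validator. It assumes item IDs are plain decimal integers (as the <#> placeholders in the API section suggest) and that each item file is named after its ID; item_path is a hypothetical helper name.

```rust
use std::path::{Path, PathBuf};

/// Maps a user-supplied ID to a file inside `data_dir`.
/// Anything other than ASCII digits is rejected, so inputs such as
/// "../secret" or "/etc/passwd" can never reach the filesystem.
pub fn item_path(data_dir: &Path, raw_id: &str) -> Result<PathBuf, String> {
    if raw_id.is_empty() || !raw_id.bytes().all(|byte| byte.is_ascii_digit()) {
        return Err(format!("invalid item ID: {raw_id:?}"));
    }
    // Parsing also rejects values that overflow u64 and normalizes leading zeros.
    let id: u64 = raw_id
        .parse()
        .map_err(|err| format!("invalid item ID {raw_id:?}: {err}"))?;
    Ok(data_dir.join(id.to_string()))
}
```

Rebuilding the file name from the parsed number, rather than reusing the raw string, means the path component is always something the program generated itself.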