# PROJECT RULES - KEEP FIRST

## Standard Rules

1. ALWAYS keep DESIGN.md updated with any architectural or design changes
2. ALWAYS keep project rules first in this document
3. ALWAYS use git commands to remove or move files (`git rm`, `git mv`, etc.)
4. Follow Rust naming conventions and idioms
5. Use anyhow for error handling throughout the codebase
6. Maintain comprehensive logging with the log crate
7. Write unit tests for critical functionality
8. Document public APIs with rustdoc comments
9. Keep modules focused on single responsibilities
10. Prefer composition over inheritance
11. Handle errors gracefully and provide meaningful error messages
12. Ensure code is safe and avoids unsafe blocks where possible

## Code - Modules

### Main Module

- `main.rs` - Entry point, CLI argument parsing, mode dispatching
  - Interacts with all mode modules based on user input
  - Handles database connection setup and data directory management

### Mode Modules

- `modes/save.rs` - Save new items with tags/metadata
- `modes/get.rs` - Retrieve items by ID/tags
- `modes/list.rs` - List items with filtering and formatting
- `modes/delete.rs` - Delete items by ID
- `modes/update.rs` - Update item tags/metadata
- `modes/info.rs` - Show detailed item information
- `modes/diff.rs` - Compare two items
- `modes/status.rs` - Show system status and capabilities
- `modes/server.rs` - REST HTTP/HTTPS server mode with OpenAPI documentation
- `modes/client.rs` - Client mode for remote server (streaming save, local decompression)
- `modes/common.rs` - Shared utilities for all modes

### Database Module

- `db.rs` - SQLite database operations
  - Handles items, tags, and metadata storage
  - Provides query functions for all modes
  - Manages database migrations

### Compression Engine Module

- `compression_engine.rs` - Trait and type definitions
- `compression_engine/gzip.rs` - GZip implementation
- `compression_engine/lz4.rs` - LZ4 implementation
- `compression_engine/none.rs` - No compression implementation
- `compression_engine/program.rs` - External program wrapper

### Meta Plugin Module

- `meta_plugin.rs` - Trait and type definitions
- `meta_plugin/program.rs` - External program wrapper
- `meta_plugin/digest.rs` - Internal digest implementations
- `meta_plugin/system.rs` - System information metadata plugins

### Common Modules

- `common/is_binary.rs` - Binary file detection utilities
- `common/status.rs` - Status information generation

### Client Module

- `client.rs` - HTTP client wrapper (ureq-based, supports streaming POST)
- `modes/client/save.rs` - 3-thread streaming save (stdin → tee → compress → pipe → HTTP POST)
- `modes/client/get.rs` - Get with server-side raw fetch + local decompression
- `modes/client/list.rs` - List delegation to server
- `modes/client/info.rs` - Info delegation to server
- `modes/client/delete.rs` - Delete delegation to server
- `modes/client/diff.rs` - Diff delegation to server
- `modes/client/status.rs` - Status delegation to server

### Utility Modules

- `plugins.rs` - Shared plugin utilities
- `args.rs` - CLI argument definitions

## Command Line Interface

### Modes

- Save mode: `keep [--save]` (default when no mode specified and no IDs provided)
- Get mode: `keep [--get] ` (default when IDs provided)
- List mode: `keep [--list] [tag...]`
- Info mode: `keep [--info] `
- Delete mode: `keep [--delete] `
- Update mode: `keep [--update] [tag...]`
- Diff mode: `keep [--diff] `
- Status mode: `keep [--status]`
- Server mode: `keep [--server] `

### Item Options

- `--meta KEY[=VALUE]` - Set metadata for the item, remove if VALUE not provided
- `--digest ` - Digest algorithm to use when saving items
- `--compression ` - Compression algorithm to use when saving items
- `--meta-plugins ` - Meta plugins to use when saving items

### General Options

- `--dir ` - Specify the directory to use for storage
- `--list-format ` - A comma separated list of columns to display with --list
- `--human-readable` - Display file sizes with units
- `--verbose` - Increase message verbosity
- `--quiet` - Do not show any messages
- `--output-format ` - Output format for info, status, and list modes
- `--server-password ` - Password for server authentication
- `--server-cert ` - TLS certificate file (PEM) for HTTPS server
- `--server-key ` - TLS private key file (PEM) for HTTPS server
- `--force` - Force output even when binary data would be sent to a TTY

### Client Options (requires `client` feature)

- `--client-url ` - Remote keep server URL
- `--client-password ` - Remote server password

## Data Storage

### Database Schema

- `items` table: id (primary key), ts (timestamp), size (optional), compression
- `tags` table: id (foreign key to items), name (tag name)
- `metas` table: id (foreign key to items), name (meta key), value (meta value)
- Indexes on tag names and meta names for faster queries

### File Storage

- Data directory contains compressed item files named by their item ID
- Database file stored in data directory
- File permissions set to be private to user (umask 077)

## REST API Endpoints

### Status Operations

- `GET /api/status` - Get system status information
- `GET /api/plugins/status` - Get plugin status information

### Item Operations

- `GET /api/item/` - Get a list of items as JSON. Optional params: `order=newest|oldest`, `start=0`, `count=100`, `tags=tag1,tag2`
- `POST /api/item/` - Add a new item (body: raw content). Query params: `tags`, `metadata` (JSON), `compress=true|false`, `meta=true|false`
- `POST /api/item/<#>/meta` - Add metadata to an existing item (body: JSON object)
- `DELETE /api/item/<#>` - Delete an item
- `GET /api/item/latest` - Return the latest item as JSON. Optional params: `tags=tag1,tag2`, `allow_binary=true|false`
- `GET /api/item/latest/meta` - Return the latest item metadata as JSON. Optional params: `tags=tag1,tag2`
- `GET /api/item/latest/content` - Return the raw content of the latest item. Optional params: `tags=tag1,tag2`, `decompress=true|false`
- `GET /api/item/<#>` - Return the item as JSON. Optional params: `allow_binary=true|false`
- `GET /api/item/<#>/meta` - Return the item metadata as JSON
- `GET /api/item/<#>/content` - Return the raw content of the item. Optional params: `decompress=true|false`
- `GET /api/diff` - Diff two items. Params: `id_a`, `id_b`

### Server Modes

- **Plain HTTP** (default): `tokio::net::TcpListener` + `axum::serve()`
- **HTTPS** (with `tls` feature): `axum_server::bind_rustls()` with rustls when `--server-cert` and `--server-key` are provided
- Conditional selection at startup: cert+key present → HTTPS, otherwise → HTTP

### Client/Server Protocol

- Smart clients (keep CLI) set `compress=false` and `meta=false` on POST, handling compression/metadata locally
- Dumb clients (curl) use defaults (`compress=true`, `meta=true`), server handles everything
- GET responses include `X-Keep-Compression` header when `decompress=false`
- Streaming save uses chunked transfer encoding for constant memory usage

### Authentication

- Bearer token authentication: `Authorization: Bearer `
- Basic authentication: `Authorization: Basic base64(keep:)`
- When no password is set, authentication is disabled

## Supported Compression Types

- LZ4 (internal implementation)
- GZip (internal implementation)
- BZip2 (external program)
- XZ (external program)
- ZStd (external program)
- None (no compression)

## Supported Meta Plugins

- FileMagic - File type detection using file command
- FileMime - MIME type detection using file command
- FileEncoding - File encoding detection using file command
- LineCount - Line count using wc command
- WordCount - Word count using wc command
- Cwd - Current working directory
- Binary - Binary file detection
- Uid - Current user ID
- User - Current username
- Gid - Current group ID
- Group - Current group name
- Shell - Shell path from SHELL environment variable
- ShellPid - Shell process ID from PPID environment variable
- KeepPid - Keep process ID
- DigestSha256 - SHA-256 digest
- DigestMd5 - MD5 digest using md5sum command
- ReadTime - Time taken to read data
- ReadRate - Rate of data reading
- Hostname - System hostname
- FullHostname - Fully qualified domain name

## Testing Strategy

- Unit tests for each module in `src/tests/`
- Integration tests for modes
- Database tests for CRUD operations
- Compression engine tests for each supported format
- Meta plugin tests for each plugin type
- Server tests for API endpoints and authentication
- Common utilities tests for helper functions

## Binary Data Handling

- Automatic binary detection using file signatures and heuristics
- Prevents binary data output to TTY unless --force is used
- Binary meta plugin analyzes content to determine if it's binary
- API endpoints respect binary flags to prevent accidental binary transmission

## Security Considerations

- File permissions are restricted to user only (umask 077)
- Input validation for item IDs to prevent path traversal
- Authentication for server mode with bearer or basic auth
- TLS/HTTPS support via rustls when certificate and key are provided
- Proper resource cleanup using RAII patterns
- Safe handling of external processes with proper stdin/stdout management

## Feature Flags

- `server` - HTTP REST API server (axum-based)
- `tls` - HTTPS/TLS support for server (axum-server + rustls)
- `client` - HTTP client for remote server (ureq-based, includes streaming save)
- `mcp` - Model Context Protocol for AI assistant integration
- `swagger` - OpenAPI/Swagger UI documentation
- `magic` - File type detection via libmagic
- `lz4` - LZ4 compression (internal)
- `gzip` - GZip compression (internal)
- `bzip2` - BZip2 compression (external)
- `xz` - XZ compression (external)
- `zstd` - ZStd compression (external)
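
The `--meta KEY[=VALUE]` rule under Item Options (set when a value is given, remove when it is not) could be parsed roughly as follows. This is a minimal sketch, not the project's actual code: the `MetaArg` enum and `parse_meta_arg` function are hypothetical names, errors are plain `String`s to keep the example dependency-free (the codebase itself uses anyhow), and treating `KEY=` as "set to the empty string" is an assumption.

```rust
/// Hypothetical representation of one parsed `--meta` argument.
#[derive(Debug, PartialEq)]
enum MetaArg {
    /// `KEY=VALUE`: set (or overwrite) the metadata entry.
    Set { key: String, value: String },
    /// `KEY` with no `=`: remove the metadata entry.
    Remove { key: String },
}

/// Parse a raw `KEY[=VALUE]` string.
/// Only the first `=` separates key from value, so values may contain `=`.
fn parse_meta_arg(raw: &str) -> Result<MetaArg, String> {
    let (key, value) = match raw.split_once('=') {
        Some((k, v)) => (k, Some(v)),
        None => (raw, None),
    };
    if key.is_empty() {
        return Err(format!("empty metadata key in {raw:?}"));
    }
    Ok(match value {
        // Assumption: `KEY=` sets an empty value rather than removing the key.
        Some(v) => MetaArg::Set {
            key: key.to_string(),
            value: v.to_string(),
        },
        None => MetaArg::Remove {
            key: key.to_string(),
        },
    })
}
```

Under these assumptions, `parse_meta_arg("project=keep")` yields a `Set` entry and `parse_meta_arg("project")` yields a `Remove` entry.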
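
Security Considerations calls for validating item IDs to prevent path traversal, and File Storage states that item files are named by their item ID. One way to combine the two is sketched below. The function name `item_path`, the choice of `u64` for IDs, and the `String` error type are assumptions made for illustration only.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: map a user-supplied item ID to a file inside `data_dir`.
/// Rejecting everything except ASCII digits means separators, `..`, and
/// signs can never reach the path join.
fn item_path(data_dir: &Path, raw_id: &str) -> Result<PathBuf, String> {
    // `u64::from_str` would accept a leading `+`, so check the digits explicitly.
    if raw_id.is_empty() || !raw_id.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("invalid item ID: {raw_id:?}"));
    }
    // Parsing also rejects values too large for the assumed ID type.
    let id: u64 = raw_id
        .parse()
        .map_err(|e| format!("invalid item ID {raw_id:?}: {e}"))?;
    // Re-render the parsed number so the file name is always canonical.
    Ok(data_dir.join(id.to_string()))
}
```

With this sketch, `item_path(Path::new("/data"), "42")` resolves to `/data/42`, while an input such as `../etc/passwd` is rejected before any path is built.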
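
The startup rule under Server Modes (cert+key present → HTTPS, otherwise → HTTP) can be expressed as a small pure function. The sketch below only models the decision; it does not start a listener, and the `ServerMode` enum and `select_server_mode` function are hypothetical names rather than the project's API.

```rust
use std::path::PathBuf;

/// Hypothetical description of how the server should be started.
#[derive(Debug, PartialEq)]
enum ServerMode {
    /// Plain HTTP (the default).
    Http,
    /// HTTPS with the given PEM certificate and private key files.
    Https { cert: PathBuf, key: PathBuf },
}

/// Select the mode from the optional `--server-cert` / `--server-key` values.
/// Following the rule as written, anything other than "both present"
/// falls back to plain HTTP.
fn select_server_mode(cert: Option<PathBuf>, key: Option<PathBuf>) -> ServerMode {
    match (cert, key) {
        (Some(cert), Some(key)) => ServerMode::Https { cert, key },
        _ => ServerMode::Http,
    }
}
```

Keeping the decision separate from the listener setup makes it straightforward to unit test. A real implementation might additionally log a warning when only one of the two options is supplied, since that case silently falls back to HTTP here.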