Major overhaul of server architecture and security posture:
- Streaming: Unified all I/O through PIPESIZE (8192-byte) buffers. POST bodies stream via MpscReader through the save pipeline. GET content streams from disk via decompression to client. Removed save_item_with_reader, get_item_content_info, ChannelReader. 413 responses keep partial items (nonfatal by design).
- Security: XSS protection in all HTML pages via html_escape crate. Security headers middleware (nosniff, frame deny, referrer policy). CORS tightened to explicit headers. Input validation for tags (256 chars), metadata (128/4096), pagination (10k cap). Config file reads use from_utf8_lossy. Generic error messages in HTML. Diff endpoint has 10 MB per-item cap. max_body_size config option.
- Panics eliminated: Path unwraps → proper error propagation. Mutex unwraps → map_err (registries) / expect with message (local).
- MCP removed: Deleted all MCP code, rmcp dependency, mcp feature.
- Docs: Updated README, DESIGN, AGENTS to reflect all changes.
PROJECT RULES - KEEP FIRST
Standard Rules
- ALWAYS keep DESIGN.md updated with any architectural or design changes
- ALWAYS keep project rules first in this document
- ALWAYS use git commands to remove or move files (git rm, git mv, etc.)
- Follow Rust naming conventions and idioms
- Use anyhow for error handling throughout the codebase
- Maintain comprehensive logging with the log crate
- Write unit tests for critical functionality
- Document public APIs with rustdoc comments
- Keep modules focused on single responsibilities
- Prefer composition over inheritance
- Handle errors gracefully and provide meaningful error messages
- Ensure code is safe and avoids unsafe blocks where possible
Code - Modules
Main Module
- main.rs - Entry point, CLI argument parsing, mode dispatching
  - Interacts with all mode modules based on user input
  - Handles database connection setup and data directory management
Mode Modules
- modes/save.rs - Save new items with tags/metadata
- modes/get.rs - Retrieve items by ID/tags
- modes/list.rs - List items with filtering and formatting
- modes/delete.rs - Delete items by ID
- modes/update.rs - Update item tags/metadata
- modes/info.rs - Show detailed item information
- modes/diff.rs - Compare two items
- modes/status.rs - Show system status and capabilities
- modes/server.rs - REST HTTP/HTTPS server mode with OpenAPI documentation
- modes/client.rs - Client mode for remote server (streaming save, local decompression)
- modes/common.rs - Shared utilities for all modes
Database Module
- db.rs - SQLite database operations
  - Handles items, tags, and metadata storage
  - Provides query functions for all modes
  - Manages database migrations
Compression Engine Module
- compression_engine.rs - Trait and type definitions
- compression_engine/gzip.rs - GZip implementation
- compression_engine/lz4.rs - LZ4 implementation
- compression_engine/none.rs - No compression implementation
- compression_engine/program.rs - External program wrapper
Meta Plugin Module
- meta_plugin.rs - Trait and type definitions
- meta_plugin/program.rs - External program wrapper
- meta_plugin/digest.rs - Internal digest implementations
- meta_plugin/system.rs - System information metadata plugins
Common Modules
- common/is_binary.rs - Binary file detection utilities
- common/status.rs - Status information generation
Client Module
- client.rs - HTTP client wrapper (ureq-based, supports streaming POST)
- modes/client/save.rs - 3-thread streaming save (stdin → tee → compress → pipe → HTTP POST)
- modes/client/get.rs - Get with server-side raw fetch + local decompression
- modes/client/list.rs - List delegation to server
- modes/client/info.rs - Info delegation to server
- modes/client/delete.rs - Delete delegation to server
- modes/client/diff.rs - Diff delegation to server
- modes/client/status.rs - Status delegation to server
Utility Modules
- plugins.rs - Shared plugin utilities
- args.rs - CLI argument definitions
Command Line Interface
Modes
- Save mode: keep [--save] (default when no mode specified and no IDs provided)
- Get mode: keep [--get] <ID|tag...> (default when IDs provided)
- List mode: keep [--list] [tag...]
- Info mode: keep [--info] <ID|tag...>
- Delete mode: keep [--delete] <ID...>
- Update mode: keep [--update] <ID> [tag...]
- Diff mode: keep [--diff] <ID1> <ID2>
- Status mode: keep [--status]
- Server mode: keep [--server] <address:port>
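The default-mode rule above (save when no IDs are given, get when IDs are present) can be sketched as a small function. This is only an illustration: the `Mode` enum and `default_mode` function are hypothetical names, and the assumption that an "ID" is any positional argument that parses as an unsigned integer is mine, not something the source states.

```rust
/// Modes that can be selected implicitly (hypothetical enum for illustration).
#[derive(Debug, PartialEq)]
enum Mode {
    Save,
    Get,
}

/// Pick the mode when no explicit mode flag was passed.
/// Assumption: a positional argument counts as an ID if it parses as u64.
fn default_mode(positional: &[&str]) -> Mode {
    if positional.iter().any(|arg| arg.parse::<u64>().is_ok()) {
        Mode::Get
    } else {
        Mode::Save
    }
}
```

Under these assumptions, `default_mode(&["42"])` selects get mode, while an empty argument list selects save mode.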
Item Options
- --meta KEY[=VALUE] - Set metadata for the item, remove if VALUE not provided
- --digest <sha256|md5> - Digest algorithm to use when saving items
- --compression <lz4|gzip|bzip2|xz|zstd|none> - Compression algorithm to use when saving items
- --meta-plugins <plugin[,plugin...]> - Meta plugins to use when saving items
General Options
- --dir <PATH> - Specify the directory to use for storage
- --list-format <FORMAT> - A comma-separated list of columns to display with --list
- --human-readable - Display file sizes with units
- --verbose - Increase message verbosity
- --quiet - Do not show any messages
- --output-format <table|json|yaml> - Output format for info, status, and list modes
- --server-password <PASSWORD> - Password for server authentication
- --server-cert <PATH> - TLS certificate file (PEM) for HTTPS server
- --server-key <PATH> - TLS private key file (PEM) for HTTPS server
- --force - Force output even when binary data would be sent to a TTY
Client Options (requires client feature)
- --client-url <URL> - Remote keep server URL
- --client-password <PASSWORD> - Remote server password
Data Storage
Database Schema
- items table: id (primary key), ts (timestamp), size (optional), compression
- tags table: id (foreign key to items), name (tag name)
- metas table: id (foreign key to items), name (meta key), value (meta value)
- Indexes on tag names and meta names for faster queries
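As a rough picture of how these three tables relate, the rows could be mirrored by plain Rust structs. The struct names, field types, and the `tags_for` helper are assumptions made for illustration; the source only names the columns, not their types or any in-memory representation.

```rust
/// One row of the items table. Field types are assumed.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    id: i64,
    ts: String,          // timestamp; the stored representation is an assumption
    size: Option<u64>,   // optional, per the schema
    compression: String, // e.g. "lz4", "gzip", "none"
}

/// One row of the tags table; `id` refers to `Item::id`.
#[derive(Debug, Clone, PartialEq)]
struct Tag {
    id: i64,
    name: String,
}

/// One row of the metas table; `id` refers to `Item::id`.
#[derive(Debug, Clone, PartialEq)]
struct Meta {
    id: i64,
    name: String,
    value: String,
}

/// Collect the tag names that belong to one item (hypothetical helper).
fn tags_for<'a>(item: &Item, tags: &'a [Tag]) -> Vec<&'a str> {
    tags.iter()
        .filter(|t| t.id == item.id)
        .map(|t| t.name.as_str())
        .collect()
}
```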
File Storage
- Data directory contains compressed item files named by their item ID
- Database file stored in data directory
- File permissions set to be private to user (umask 077)
REST API Endpoints
Status Operations
- GET /api/status - Get system status information
- GET /api/plugins/status - Get plugin status information
Item Operations
- GET /api/item/ - Get a list of items as JSON. Optional params: order=newest|oldest, start=0, count=100, tags=tag1,tag2
- POST /api/item/ - Add a new item (body: raw content, streamed through fixed-size 8192-byte buffers). Query params: tags, metadata (JSON), compress=true|false, meta=true|false
- POST /api/item/<#>/meta - Add metadata to an existing item (body: JSON object)
- DELETE /api/item/<#> - Delete an item
- GET /api/item/latest - Return the latest item as JSON. Optional params: tags=tag1,tag2, allow_binary=true|false
- GET /api/item/latest/meta - Return the latest item metadata as JSON. Optional params: tags=tag1,tag2
- GET /api/item/latest/content - Return the raw content of the latest item (streamed). Optional params: tags=tag1,tag2, decompress=true|false
- GET /api/item/<#> - Return the item as JSON. Optional params: allow_binary=true|false
- GET /api/item/<#>/meta - Return the item metadata as JSON
- GET /api/item/<#>/content - Return the raw content of the item (streamed). Optional params: decompress=true|false
- GET /api/diff - Diff two items. Params: id_a, id_b (individual items capped at 10 MB)
Server Configuration
- max_body_size - Maximum POST body size in bytes (default: unlimited). When exceeded, server returns 413 PAYLOAD_TOO_LARGE while keeping the partial item already saved through the streaming pipeline. Set to 0 for unlimited.
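The interaction between the fixed-size buffer and max_body_size can be sketched with a bounded copy loop. This is a minimal stand-in built on std::io, not the server's actual pipeline: `copy_with_limit` and `CopyOutcome` are hypothetical names, and truncating exactly at the limit is an assumption about how "partial item" is defined.

```rust
use std::io::{self, Read, Write};

/// Buffer size named in this document.
const PIPESIZE: usize = 8192;

/// Result of a bounded streaming copy (hypothetical type).
#[derive(Debug, PartialEq)]
struct CopyOutcome {
    bytes_written: u64,
    /// True when the body was longer than the limit; a server would answer 413.
    limit_exceeded: bool,
}

/// Copy `reader` into `writer` through a single PIPESIZE buffer.
/// `max_body_size == 0` means unlimited. Bytes written before the limit
/// is hit stay in `writer`, mirroring the "partial item is kept" behaviour.
fn copy_with_limit<R: Read, W: Write>(
    mut reader: R,
    mut writer: W,
    max_body_size: u64,
) -> io::Result<CopyOutcome> {
    let mut buf = [0u8; PIPESIZE];
    let mut written: u64 = 0;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if n == 0 {
            return Ok(CopyOutcome { bytes_written: written, limit_exceeded: false });
        }
        // Only forward as many bytes as the limit still allows.
        let mut take = n;
        if max_body_size != 0 {
            let remaining = max_body_size - written;
            if (n as u64) > remaining {
                take = remaining as usize;
            }
        }
        writer.write_all(&buf[..take])?;
        written += take as u64;
        if take < n {
            return Ok(CopyOutcome { bytes_written: written, limit_exceeded: true });
        }
    }
}
```

Memory use stays constant at one buffer regardless of body size, which is the point of routing every path through PIPESIZE-sized reads.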
Server Modes
- Plain HTTP (default): tokio::net::TcpListener + axum::serve()
- HTTPS (with tls feature): axum_server::bind_rustls() with rustls when --server-cert and --server-key are provided
- Conditional selection at startup: cert+key present → HTTPS, otherwise → HTTP
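The startup selection rule reduces to a match on the two optional paths. The sketch below shows only that decision; `Transport` and `select_transport` are hypothetical names, and the real server presumably goes on to bind a listener with the chosen backend.

```rust
use std::path::Path;

/// Which listener the server would start (hypothetical enum).
#[derive(Debug, PartialEq)]
enum Transport {
    Http,
    Https,
}

/// HTTPS only when both the certificate and the key are provided.
fn select_transport(cert: Option<&Path>, key: Option<&Path>) -> Transport {
    match (cert, key) {
        (Some(_), Some(_)) => Transport::Https,
        _ => Transport::Http,
    }
}
```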
Client/Server Protocol
- Smart clients (keep CLI) set compress=false and meta=false on POST, handling compression/metadata locally
- Dumb clients (curl) use defaults (compress=true, meta=true), server handles everything
- GET responses include X-Keep-Compression header when decompress=false
- Streaming save uses chunked transfer encoding for constant memory usage
- Universal streaming: All server paths (POST, GET, diff) use PIPESIZE (8192) byte buffers
- 413 partial item: When max_body_size is exceeded, the server returns 413 but keeps the partial item already saved through the pipeline (nonfatal design; pipes continue normally)
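To make the smart/dumb client distinction concrete, here is a sketch of how a client might build the query string for POST /api/item/. The function name `post_item_query` is hypothetical, and the sketch assumes tags contain no characters that need percent-encoding; a real client would encode them.

```rust
/// Build the query string for POST /api/item/ (hypothetical helper).
/// Assumption: tags need no percent-encoding.
fn post_item_query(tags: &[&str], smart_client: bool) -> String {
    let mut parts = Vec::new();
    if !tags.is_empty() {
        parts.push(format!("tags={}", tags.join(",")));
    }
    if smart_client {
        // Smart clients compress and collect metadata locally,
        // so they tell the server to skip both steps.
        parts.push("compress=false".to_string());
        parts.push("meta=false".to_string());
    }
    // Dumb clients send neither flag and get the server defaults.
    parts.join("&")
}
```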
Authentication
- Bearer token authentication: Authorization: Bearer <password>
- Basic authentication: Authorization: Basic base64(keep:<password>)
- When no password is set, authentication is disabled
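The three rules above can be sketched as a single check on the Authorization header value. This is an illustration using only the standard library: `is_authorized` and the hand-written `base64_encode` are hypothetical stand-ins for whatever the server actually uses, and the plain string comparison is not constant-time, which a production check would likely want.

```rust
const B64: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 with padding (small stand-in for a base64 crate).
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(B64[(n >> 18) as usize & 63] as char);
        out.push(B64[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { B64[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { B64[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Check an Authorization header value against the configured password.
/// `password == None` means no password is set, so every request passes.
fn is_authorized(header: Option<&str>, password: Option<&str>) -> bool {
    let password = match password {
        Some(p) => p,
        None => return true,
    };
    let header = match header {
        Some(h) => h,
        None => return false,
    };
    if let Some(token) = header.strip_prefix("Bearer ") {
        return token == password;
    }
    if let Some(encoded) = header.strip_prefix("Basic ") {
        // The Basic user name is fixed to "keep", as in the list above.
        return encoded == base64_encode(format!("keep:{}", password).as_bytes());
    }
    false
}
```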
Supported Compression Types
- LZ4 (internal implementation)
- GZip (internal implementation)
- BZip2 (external program)
- XZ (external program)
- ZStd (external program)
- None (no compression)
Supported Meta Plugins
- FileMagic - File type detection using file command
- FileMime - MIME type detection using file command
- FileEncoding - File encoding detection using file command
- LineCount - Line count using wc command
- WordCount - Word count using wc command
- Cwd - Current working directory
- Binary - Binary file detection
- Uid - Current user ID
- User - Current username
- Gid - Current group ID
- Group - Current group name
- Shell - Shell path from SHELL environment variable
- ShellPid - Shell process ID from PPID environment variable
- KeepPid - Keep process ID
- DigestSha256 - SHA-256 digest
- DigestMd5 - MD5 digest using md5sum command
- ReadTime - Time taken to read data
- ReadRate - Rate of data reading
- Hostname - System hostname
- FullHostname - Fully qualified domain name
Testing Strategy
- Unit tests for each module in src/tests/
- Integration tests for modes
- Database tests for CRUD operations
- Compression engine tests for each supported format
- Meta plugin tests for each plugin type
- Server tests for API endpoints and authentication
- Common utilities tests for helper functions
Binary Data Handling
- Automatic binary detection using file signatures and heuristics
- Prevents binary data output to TTY unless --force is used
- Binary meta plugin analyzes content to determine if it's binary
- API endpoints respect binary flags to prevent accidental binary transmission
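A minimal sketch of the heuristic half of this behaviour follows. It does not reproduce the project's is_binary.rs: the function names, the NUL-byte rule, and the 30% control-byte threshold are all assumptions chosen for illustration, and file-signature matching is left out entirely.

```rust
/// Guess whether a sample of content is binary (hypothetical heuristic).
/// Assumed rules: any NUL byte, or more than 30% control bytes other than
/// common whitespace, marks the sample as binary.
fn looks_binary(sample: &[u8]) -> bool {
    if sample.is_empty() {
        return false;
    }
    if sample.contains(&0) {
        return true;
    }
    let suspicious = sample
        .iter()
        .filter(|&&b| b < 0x20 && !matches!(b, b'\n' | b'\r' | b'\t' | 0x0c))
        .count();
    suspicious * 10 > sample.len() * 3
}

/// Decide whether content may be written to the output.
/// Binary data is blocked only when the output is a TTY and --force is absent.
fn may_write_to_tty(sample: &[u8], is_tty: bool, force: bool) -> bool {
    force || !is_tty || !looks_binary(sample)
}
```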
Security Considerations
- File permissions are restricted to user only (umask 077)
- Input validation for item IDs to prevent path traversal
- Authentication for server mode with bearer or basic auth
- TLS/HTTPS support via rustls when certificate and key are provided
- Proper resource cleanup using RAII patterns
- Safe handling of external processes with proper stdin/stdout management
- Streaming architecture: All server I/O uses fixed-size 8192-byte buffers; no full file contents held in memory
- XSS protection: All user-controlled data in HTML pages is escaped via html-escape
- Security headers: X-Content-Type-Options: nosniff, X-Frame-Options: DENY, Referrer-Policy: strict-origin-when-cross-origin
- CORS: Explicit allowed headers only (Content-Type, Authorization, Accept); no wildcard headers
- Input limits: Tags (256 chars), metadata keys (128 chars), metadata values (4096 chars), pagination (10,000 max)
- Config file size: 4 KB cap with from_utf8_lossy for safe UTF-8 handling
- Error sanitization: Internal errors never exposed in HTML responses
- No unsafe_code: Enforced via #
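The input limits and the escaping rule from the list above can be sketched with a few small functions. All names here are hypothetical. The sketch assumes lengths are counted in characters rather than bytes and that an oversized pagination count is clamped rather than rejected; `escape_html` is a hand-written stand-in, not the html-escape crate the project uses.

```rust
// Limits quoted in the list above.
const MAX_TAG_LEN: usize = 256;
const MAX_META_KEY_LEN: usize = 128;
const MAX_META_VALUE_LEN: usize = 4096;
const MAX_PAGE_COUNT: usize = 10_000;

/// Reject a tag longer than the limit (length counted in chars, an assumption).
fn validate_tag(tag: &str) -> Result<(), String> {
    if tag.chars().count() > MAX_TAG_LEN {
        return Err(format!("tag exceeds {} characters", MAX_TAG_LEN));
    }
    Ok(())
}

/// Reject a metadata pair whose key or value is too long.
fn validate_meta(key: &str, value: &str) -> Result<(), String> {
    if key.chars().count() > MAX_META_KEY_LEN {
        return Err(format!("metadata key exceeds {} characters", MAX_META_KEY_LEN));
    }
    if value.chars().count() > MAX_META_VALUE_LEN {
        return Err(format!("metadata value exceeds {} characters", MAX_META_VALUE_LEN));
    }
    Ok(())
}

/// Cap a requested page size (clamping is an assumption).
fn clamp_count(requested: usize) -> usize {
    requested.min(MAX_PAGE_COUNT)
}

/// Escape the characters that matter in HTML text and attribute values.
fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            other => out.push(other),
        }
    }
    out
}
```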
Feature Flags
- server - HTTP REST API server (axum-based)
- tls - HTTPS/TLS support for server (axum-server + rustls)
- client - HTTP client for remote server (ureq-based, includes streaming save)
- swagger - OpenAPI/Swagger UI documentation
- magic - File type detection via libmagic
- lz4 - LZ4 compression (internal)
- gzip - GZip compression (internal)
- bzip2 - BZip2 compression (external)
- xz - XZ compression (external)
- zstd - ZStd compression (external)