Compare commits


14 Commits

Author SHA1 Message Date
8379ae2136 refactor: rename plugin features with type prefix for consistency
- Plugin features now use type_ prefix (meta_magic, filter_grep, etc.)
- Added meta_all_musl and filter_all_musl for MUSL-compatible builds
- grep filter plugin made optional via filter_grep feature flag
- Removed regex crate from grep-related code, uses strip_prefix instead
- Updated CHANGELOG.md with breaking change documentation
2026-03-21 17:36:29 -03:00
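The regex removal above trades pattern matching for plain prefix parsing; a minimal stdlib sketch of the idea, assuming a simple `key=value` option syntax (function and option names are hypothetical, not the project's actual grep plugin API):

```rust
// Illustrative: parsing an option like "count=10" with str::strip_prefix
// instead of the regex crate. Names are hypothetical stand-ins.
fn parse_count_option(opt: &str) -> Option<u32> {
    // "count=10" -> Some(10); anything else -> None
    opt.strip_prefix("count=")?.parse().ok()
}

fn main() {
    assert_eq!(parse_count_option("count=10"), Some(10));
    assert_eq!(parse_count_option("pattern=foo"), None);
}
```

For fixed-shape inputs like this, `strip_prefix` plus `parse` covers the common case and drops the regex dependency entirely.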
12de215527 feat: feature-gate CLI args by server/client features
- CLI now shows only relevant options: --server and --server-* args
  hidden when built without 'server' feature; --client-* args hidden
  without 'client' feature. Run --help only displays applicable options.
- Removed verbose 'conflicts_with_all' from all mode args — clap's
  implicit group("mode") already enforces mutual exclusivity.
- 'server' feature now includes TLS/HTTPS by default (axum-server);
  'tls' feature removed. rustls already available via client/ureq.
- Gated KeepModes::Server, server mode detection, and server-password
  validation in main.rs.
- Gated server arg reads in config.rs.
- Removed redundant #[cfg(feature = "tls")] guards from server/mod.rs.
- Gated resolve_item_id/resolve_item_ids helpers in common.rs.
- All 4 feature combinations (server+client, server-only, client-only,
  neither) compile and pass tests.
2026-03-21 16:26:27 -03:00
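The gating described above works at compile time: fields annotated with `#[cfg(feature = ...)]` simply do not exist in builds without the feature. A minimal sketch of the mechanism, with illustrative struct and field names (the real code attaches clap attributes as well):

```rust
// Sketch of compile-time field gating with #[cfg(feature = ...)].
// "server" stands in for the Cargo feature; names are illustrative.
#[derive(Default)]
struct ModeFlags {
    save: bool,
    #[cfg(feature = "server")]
    server: bool, // only compiled in when the "server" feature is enabled
}

fn main() {
    // Functional-update syntax stays valid whether or not the gated
    // field exists, so shared code compiles in every feature combination.
    let flags = ModeFlags { save: true, ..Default::default() };
    assert!(flags.save);
}
```

Because the field itself disappears, clap never sees the argument, so `--help` output shrinks automatically.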
e2cb36d2a8 feat(server): add file_size to API ItemInfo response 2026-03-21 14:03:58 -03:00
0004324301 perf: pre-allocate status info collections with known capacities 2026-03-21 13:54:37 -03:00
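The pre-allocation pattern behind this commit: when the final element count is known up front, `Vec::with_capacity` performs one allocation instead of repeated growth while pushing. A stdlib sketch (names are illustrative, not the project's status code):

```rust
// Illustrative: pre-allocate when the length is known in advance.
fn collect_names(known: &[&str]) -> Vec<String> {
    let mut out = Vec::with_capacity(known.len()); // single allocation
    for name in known {
        out.push(name.to_uppercase());
    }
    out
}

fn main() {
    let v = collect_names(&["gzip", "lz4", "zstd"]);
    assert_eq!(v.len(), 3);
    assert!(v.capacity() >= 3);
}
```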
b3edfe7de6 chore: code review cleanup — fixes, deps, docs
Fixed:
- CLI help typo: "metatdata" -> "metadata"
- Filter buffer OOM: check size before loading into memory

Changed:
- #[inline] on HTML escape helpers for hot path performance
- Replaced once_cell and lazy_static with std::sync::LazyLock
- Removed unused once_cell and lazy_static crate dependencies

Refactored:
- Added module-level doc to services/ module

Documentation:
- README.md: zstd is native not external, "none" -> "raw"
- DESIGN.md: current schema and meta plugins section
- CHANGELOG.md: Unreleased section populated
2026-03-21 11:44:37 -03:00
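The `std::sync::LazyLock` migration above removes two crates because the standard library (since Rust 1.80) now covers the lazy-static use case directly. A self-contained sketch of the replacement pattern, with an illustrative static:

```rust
use std::sync::LazyLock;

// What lazy_static!/once_cell::sync::Lazy used to provide, now in std:
// the initializer runs once, on first access, thread-safely.
static GREETING: LazyLock<String> = LazyLock::new(|| {
    format!("hello, {}", "keep")
});

fn main() {
    assert_eq!(GREETING.as_str(), "hello, keep");
}
```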
ab2fb07505 docs: add changelog update instructions to AGENTS.md 2026-03-21 10:56:43 -03:00
547f0b5d11 docs: add CHANGELOG.md following Keep a Changelog format 2026-03-21 10:55:16 -03:00
30d7836bcf refactor: deduplicate ItemInfo, improve error handling, fix pre-existing bugs
- Move ItemInfo to services/types.rs for sharing between client and server
- Replace .expect() in compression_service with proper error handling
- Add CoreError::PayloadTooLarge variant for semantic error handling
- Export CoreError from lib.rs for library users
- Unify get_item_meta_name/value to take &str instead of String
- Extract item_path() helper in ItemService to reduce duplication
- Add warning logs for silent errors in list.rs
- Fix pre-existing borrow errors: tx moved in export handler,
  item_with_meta partial move in TryFrom implementation
- Fix unused data_dir variables in server code
2026-03-21 10:43:26 -03:00
2cfee5075e fix: panic guards, dedup, and unsafe documentation
- diff.rs: graceful error instead of expect() on item ID in spawned thread
- common.rs: lazy_static regex, avoid unwrap on regex captures
- db.rs: ok_or_else guard on item.id in delete_item
- list/get/info/export/client/list: use settings.meta_filter() helper
- item_service.rs: expect() on meta lock instead of silent swallow
- filter_plugin/mod.rs: extract parse_encoding_option() helper
- main.rs: document unsafe libc::umask block with safety rationale
2026-03-20 17:17:58 -03:00
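The `ok_or_else` guard mentioned for `db.rs` converts a missing value into a recoverable error instead of a panic. A hedged sketch of the shape of that change, with hypothetical signatures (the real function talks to SQLite):

```rust
// Illustrative: guard an Option with ok_or_else instead of unwrap(),
// so a missing ID becomes an Err rather than a panic.
fn delete_item(id: Option<i64>) -> Result<String, String> {
    let id = id.ok_or_else(|| "cannot delete: item has no id".to_string())?;
    Ok(format!("deleted item {id}"))
}

fn main() {
    assert_eq!(delete_item(Some(3)).unwrap(), "deleted item 3");
    assert!(delete_item(None).is_err());
}
```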
52e9787edb refactor: deduplicate filter plugins, extract helpers across codebase
Bug fixes:
- client: add error field to ApiResponse to avoid swallowing server errors
- args/config: fix list_format default mismatch (5 vs 7 columns)
- client: url-encode size param in set_item_size

Dedup - filter plugins:
- Extract count_option() and pattern_option() helpers, replace 7 identical options()
- Add #[derive(Clone)] to all filter structs; remove verbose clone_box() impls
- Simplify FilterChain clone() and impl Clone for Box<dyn FilterPlugin>
- Add filter_clone_box! macro for future use
- Fix doctest example missing clone_box

Dedup - server API:
- Extract spawn_body_reader() with LimitBehavior enum for body streaming
- Extract check_binary_content() helper
- Extract stream_with_offset_and_length() helper
- Extract generate_status() helper in status.rs
- Extract append_query_params() helper in client.rs

Dedup - other:
- Extract yaml_value_to_string() in meta_plugin/mod.rs
- Extract item_from_row() in db.rs
- Delete unused DisplayListItem struct

Misc:
- Remove duplicate doc comment in compression_service.rs
2026-03-20 15:54:33 -03:00
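The `clone_box` work above revolves around one Rust idiom: `Box<dyn Trait>` is not `Clone` by default, so a `clone_box` method plus a blanket `Clone` impl bridges the gap. A minimal sketch of the underlying pattern (trait and struct names are illustrative; the commit replaces hand-written versions of this with `#[derive(Clone)]` and a macro):

```rust
// The clone_box pattern: make Box<dyn Trait> cloneable.
trait FilterPlugin {
    fn name(&self) -> &str;
    fn clone_box(&self) -> Box<dyn FilterPlugin>;
}

#[derive(Clone)]
struct Head; // illustrative concrete plugin

impl FilterPlugin for Head {
    fn name(&self) -> &str { "head" }
    fn clone_box(&self) -> Box<dyn FilterPlugin> {
        Box::new(self.clone()) // works because Head: Clone
    }
}

impl Clone for Box<dyn FilterPlugin> {
    fn clone(&self) -> Self {
        self.clone_box()
    }
}

fn main() {
    let original: Box<dyn FilterPlugin> = Box::new(Head);
    let copy = original.clone(); // dispatches through clone_box
    assert_eq!(copy.name(), "head");
}
```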
00be72f3d0 refactor: rename size to uncompressed_size, add compressed_size and closed columns
Schema changes:
- Rename items.size to items.uncompressed_size for clarity
- Add compressed_size (INTEGER NULL) - tracks compressed file size on disk
- Add closed (BOOLEAN NOT NULL DEFAULT 1) - tracks whether item is fully written
- Existing items default to closed=true via migration

Lifecycle:
- Items created with closed=false, set to true on successful save/import
- Compressed size captured via fs::metadata() after compression writer closes
- Truncated uploads (413) get compressed_size set, closed=true, uncompressed_size=None
- Update command now backfills both uncompressed_size and compressed_size

Also includes bug fixes and dedup from prior review:
- Fix stream_raw_content_response using uncompressed_size for raw byte Content-Length
- ApiResponse::ok()/empty() constructors, TryFrom<ItemWithMeta> for ItemInfo
- tag_names() method on ItemWithMeta, meta_filter() on Settings
- Fix .unwrap() panics in compression engine Read/Write impls
- Fix TOCTOU race in stream_raw_content_response (now uses compressed_size)
- Fix swallowed write errors in meta plugins (digest, magic_file, exec)
- Fix term::stderr().unwrap() panic in item_service
- Deduplicate ItemService::new() calls across 20 API handlers
- ImportMeta supports #[serde(alias = "size")] for backward compat

All 75 tests, 67 doc tests pass. Clippy clean.
2026-03-18 10:58:26 -03:00
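The `TryFrom<ItemWithMeta> for ItemInfo` conversion mentioned above is fallible because a row may lack an ID. A sketch of that shape with hypothetical fields and a placeholder error type (the real structs carry tags, metadata, and timestamps):

```rust
// Illustrative TryFrom conversion: fails when the source has no ID.
struct ItemWithMeta {
    id: Option<i64>,
    uncompressed_size: Option<u64>,
}

struct ItemInfo {
    id: i64,
    uncompressed_size: Option<u64>,
}

impl TryFrom<ItemWithMeta> for ItemInfo {
    type Error = &'static str; // stand-in for the project's error type

    fn try_from(item: ItemWithMeta) -> Result<Self, Self::Error> {
        Ok(ItemInfo {
            // The fallible part: no ID, no ItemInfo.
            id: item.id.ok_or("item has no id")?,
            uncompressed_size: item.uncompressed_size,
        })
    }
}

fn main() {
    let ok = ItemInfo::try_from(ItemWithMeta { id: Some(7), uncompressed_size: Some(42) });
    assert!(ok.is_ok());
    let err = ItemInfo::try_from(ItemWithMeta { id: None, uncompressed_size: None });
    assert!(err.is_err());
}
```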
49793a0f94 feat: add streaming tar export/import and rename "none" to "raw"
- Add streaming tar-based export (--export produces .keep.tar)
- Add streaming tar import (--import reads .keep.tar archives)
- Add server endpoints GET /api/export and POST /api/import
- Rename CompressionType::None to CompressionType::Raw with "none" as alias
- Add DB migration to update existing "none" compression values to "raw"
- Fix export endpoint to propagate errors to client instead of swallowing
- Fix import endpoint to return 413 on max_body_size instead of truncating

Export streams items as tar archives without loading entire files into memory.
Import extracts items with new IDs, preserving original order. Both work
locally and via client/server mode.

Co-Authored-By: opencode <noreply@opencode.ai>
2026-03-17 21:24:39 -03:00
074ba64805 feat: allow --list to accept item IDs for filtering
- Local and client/server modes now support ID-based filtering
- keep -l 1 2 3 lists specific items by ID
- keep -l --ids-only 1 2 3 outputs just those IDs
- Server API adds optional 'ids' query parameter to GET /api/item/
- KeepClient.list_items gains ids parameter
2026-03-17 17:56:35 -03:00
02f0c8d453 fix: use XDG config directory for default config file location
Changes from manual HOME/.config/keep/config.yml construction to
dirs::config_dir(), which respects XDG_CONFIG_HOME.
2026-03-17 16:07:13 -03:00
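On Linux, `dirs::config_dir()` resolves roughly as sketched below: prefer `$XDG_CONFIG_HOME`, fall back to `$HOME/.config`. This stdlib-only sketch takes the environment as parameters so the logic is testable; it is an approximation of the crate's behavior, not its implementation:

```rust
use std::path::PathBuf;

// XDG-style config resolution, approximating dirs::config_dir() on Linux.
// Parameters stand in for $XDG_CONFIG_HOME and $HOME.
fn config_file(xdg_config_home: Option<&str>, home: &str) -> PathBuf {
    let base = match xdg_config_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(home).join(".config"), // fallback per XDG spec
    };
    base.join("keep").join("config.yml")
}

fn main() {
    assert_eq!(
        config_file(Some("/xdg"), "/home/u"),
        PathBuf::from("/xdg/keep/config.yml")
    );
    assert_eq!(
        config_file(None, "/home/u"),
        PathBuf::from("/home/u/.config/keep/config.yml")
    );
}
```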
73 changed files with 2605 additions and 1410 deletions

AGENTS.md

@@ -53,3 +53,13 @@ TERM=dumb cargo build --features server # With server feature
 - Use `html_escape` crate for all user-controlled data in HTML pages
 - `esc()` for text content, `esc_attr()` for HTML attributes
 - Security headers middleware: `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`, `Referrer-Policy: strict-origin-when-cross-origin`
+## Changelog
+The project uses [Keep a Changelog](https://keepachangelog.com/). The changelog lives at `CHANGELOG.md` in the project root.
+- **Always update `CHANGELOG.md`** when making changes that affect users (new features, breaking changes, bug fixes, etc.)
+- Add entries under the `[Unreleased]` section using these categories: `Added`, `Changed`, `Deprecated`, `Removed`, `Fixed`, `Security`
+- Keep descriptions concise and user-focused — what changed from the user's perspective, not implementation details
+- Commit changelog updates in the same commit as the feature/fix they document
+- Before releasing a new version, move `[Unreleased]` entries to a versioned section (e.g., `[0.2.0] - YYYY-MM-DD`) and add a new empty `[Unreleased]` above it

CHANGELOG.md Normal file

@@ -0,0 +1,107 @@
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Added
- New `filter_grep` feature to optionally include the grep filter plugin (regex-based line filtering). Disabling this feature removes the `regex` crate and its ~800 KiB dependency stack from the binary.
- New `meta_all_musl` feature for all MUSL-compatible meta plugins (excludes `meta_magic` which requires libmagic)
- New `filter_all_musl` feature for all MUSL-compatible filter plugins
- Database index on `items(ts)` column for faster ORDER BY sorting
- Server API `ItemInfo` now includes `file_size` — actual filesystem-reported size of the item data file
### Changed
- CLI args now feature-gated: `--server` and related options hidden when built without `server` feature; `--client-*` options hidden when built without `client` feature. Run `--help` only shows relevant options.
- `server` Cargo feature now includes TLS support by default (`axum-server`); `tls` feature removed
- Clap `conflicts_with_all` removed from all mode args — exclusivity now handled by implicit `group("mode")`
- Filter plugins check size before loading content into memory (prevents OOM on large inputs)
- Status page pre-allocates collections with known capacities (meta plugins, compression info)
- `#[inline]` on HTML escape helper functions (`esc`, `esc_attr`) for hot path performance
- Removed `once_cell` crate (replaced with `std::sync::LazyLock` from Rust 1.80)
- Removed `lazy_static` crate (replaced with `std::sync::LazyLock`)
### Breaking
- Plugin feature flags renamed with type prefix for consistency:
- `magic``meta_magic`
- `infer``meta_infer`
- `tree_magic_mini``meta_tree_magic_mini`
- `tokens``meta_tokens`
- `grep``filter_grep`
- `all-meta-plugins``meta_all`
- `all-filter-plugins``filter_all`
### Fixed
- CLI help text typo: "metatdata" → "metadata" in `--get` and `--info` descriptions
### Refactored
- Added module-level documentation to `services/` module
### Documentation
- README.md: Fixed compression table — zstd is native (not external), "none" renamed to "raw"
- DESIGN.md: Updated schema to reflect current `items` table columns and meta plugin inventory
## [0.1.0] - 2026-03-21
### Added
- Streaming tar-based export (`--export`) producing `.keep.tar` archives without loading entire files into memory
- Streaming tar-based import (`--import`) extracting `.keep.tar` archives with new IDs
- Server endpoints `GET /api/export` and `POST /api/import`
- ID-based filtering for `--list` (`keep -l 1 2 3` lists specific items by ID)
- Server API accepts optional `ids` query parameter on `GET /api/item/`
- `--ids-only` flag for `--list` mode for scripting
- `infer` and `tree_magic_mini` meta plugins for MIME type detection
- Native `zstd` compression plugin as default
- Configurable compression via `--compression` flag
- Export/import modes with format detection (JSON, YAML, binary)
- `XDG_CONFIG_HOME` support for default config file location
- `XDG_DATA_HOME` support for default storage location
- Tilde (`~`) expansion in config file paths
### Changed
- `CompressionType::None` renamed to `CompressionType::Raw` (with `"none"` as alias for backward compatibility)
- `items.size` column renamed to `items.uncompressed_size`
- Added `items.compressed_size` column tracking compressed file size on disk
- Added `items.closed` column tracking whether an item is fully written
- Default `list_format` in config now matches CLI default (7 vs 5 columns)
- All filter plugins share deduplicated option implementations
### Refactored
- Extracted `spawn_body_reader()` and `check_binary_content()` helpers for streaming uploads
- Extracted `yaml_value_to_string()` helper for meta plugins
- Extracted `item_path()` helper in `ItemService` to reduce path duplication
- Unified `get_item_meta_name`/`value` to take `&str` instead of `String`
- Shared `ItemInfo` struct between client and server
- Compression service now returns `Result` types instead of panicking via `.expect()`
- `ApiResponse::ok()` and `ApiResponse::empty()` constructors
- `meta_filter()` helper on `Settings` for consistent filtering
- Added `tag_names()` method on `ItemWithMeta`
- `filter_clone_box!` macro for filter plugin cloning
### Fixed
- Panic guards in diff, compression engine, and spawned threads
- Pre-existing borrow errors in export handler and `TryFrom` implementation
- TOCTOU race in `stream_raw_content_response`
- Swallowed write errors in meta plugins (digest, magic_file, exec)
- Truncated uploads (413) now properly store compressed data
- `term::stderr().unwrap()` panic in `item_service`
- `.unwrap()` panics in compression engine `Read`/`Write` impls
- Client API errors now propagate to user instead of being swallowed
- Import endpoint returns 413 on `max_body_size` instead of truncating
- `keep --list` uses `list_format` from config in all modes
- All tables respect `table_config` from settings
- `DisplayListItem` struct removed (was unused)
- `#[serde(alias = "size")]` on `ImportMeta` for backward compatibility
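The breaking feature renames listed above change how a downstream build selects plugins. A hypothetical sketch of a dependent crate's `Cargo.toml` after the rename; the crate name and version are illustrative assumptions, not confirmed by this changelog:

```toml
# Hypothetical downstream Cargo.toml fragment (crate name and version
# are assumptions for illustration only).
[dependencies]
keep = { version = "0.2", default-features = false, features = [
    "client",
    "meta_infer",   # was "infer" before the rename
    "filter_grep",  # was "grep" before the rename
] }
```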

Cargo.lock generated

@@ -1025,6 +1025,17 @@ version = "2.3.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "37909eebbb50d72f9059c3b6d82c0463f2ff062c9e95845c43a6c9c0355411be"
+[[package]]
+name = "filetime"
+version = "0.2.27"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "f98844151eee8917efc50bd9e8318cb963ae8b297431495d3f758616ea5c57db"
+dependencies = [
+ "cfg-if",
+ "libc",
+ "libredox",
+]
 [[package]]
 name = "find-msvc-tools"
 version = "0.1.9"
@@ -1716,7 +1727,6 @@ dependencies = [
 "inventory",
 "is-terminal",
 "jsonwebtoken",
-"lazy_static",
 "libc",
 "local-ip-address",
 "log",
@@ -1724,7 +1734,6 @@ dependencies = [
 "magic",
 "md5",
 "nix",
-"once_cell",
 "os_pipe",
 "pest",
 "pest_derive",
@@ -1744,6 +1753,7 @@ dependencies = [
 "strip-ansi-escapes",
 "strum",
 "subtle",
+"tar",
 "tempfile",
 "term",
 "thiserror 2.0.18",
@@ -1793,7 +1803,10 @@ version = "0.1.14"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "1744e39d1d6a9948f4f388969627434e31128196de472883b39f148769bfe30a"
 dependencies = [
+"bitflags 2.11.0",
 "libc",
+"plain",
+"redox_syscall 0.7.3",
 ]
@@ -2108,7 +2121,7 @@ checksum = "2621685985a2ebf1c516881c026032ac7deafcda1a2c9b7850dc81e3dfcb64c1"
 dependencies = [
 "cfg-if",
 "libc",
-"redox_syscall",
+"redox_syscall 0.5.18",
 "smallvec",
 "windows-link",
 ]
@@ -2207,6 +2220,12 @@ version = "0.3.32"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "7edddbd0b52d732b21ad9a5fab5c704c14cd949e5e9a1ec5929a24fded1b904c"
+[[package]]
+name = "plain"
+version = "0.2.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "b4596b6d070b27117e987119b4dac604f3c58cfb0b191112e24771b2faeac1a6"
 [[package]]
 name = "portable-atomic"
 version = "1.13.1"
@@ -2391,6 +2410,15 @@ dependencies = [
 "bitflags 2.11.0",
 ]
+[[package]]
+name = "redox_syscall"
+version = "0.7.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "6ce70a74e890531977d37e532c34d45e9055d2409ed08ddba14529471ed0be16"
+dependencies = [
+ "bitflags 2.11.0",
+]
 [[package]]
 name = "redox_users"
 version = "0.5.2"
@@ -2938,6 +2966,17 @@ dependencies = [
 "syn",
 ]
+[[package]]
+name = "tar"
+version = "0.4.44"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "1d863878d212c87a19c1a610eb53bb01fe12951c0501cf5a0d65f724914a667a"
+dependencies = [
+ "filetime",
+ "libc",
+ "xattr",
+]
 [[package]]
 name = "tempfile"
 version = "3.27.0"
@@ -3961,6 +4000,16 @@ version = "0.6.2"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "9edde0db4769d2dc68579893f2306b26c6ecfbe0ef499b013d731b7b9247e0b9"
+[[package]]
+name = "xattr"
+version = "1.6.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "32e45ad4206f6d2479085147f02bc2ef834ac85886624a23575ae137c8aa8156"
+dependencies = [
+ "libc",
+ "rustix",
+]
 [[package]]
 name = "xdg"
 version = "2.5.2"

Cargo.toml

@@ -35,7 +35,6 @@ hyper = { version = "1.0", features = ["full"] }
 http-body-util = "0.1"
 inventory = "0.3"
 is-terminal = "0.4"
-lazy_static = "1.5"
 libc = "0.2"
 local-ip-address = "0.6"
 log = "0.4"
@@ -45,10 +44,9 @@ magic = { version = "0.13", optional = true }
 infer = { version = "0.19", optional = true }
 tree_magic_mini = { version = "3.2", optional = true }
 nix = { version = "0.30", features = ["fs", "process"] }
-once_cell = "1.21"
 comfy-table = "7.2"
 pwhash = "1.0"
-regex = "1.10"
+regex = { version = "1.10", optional = true }
 ringbuf = "0.4"
 rusqlite = { version = "0.37", features = ["bundled", "array", "chrono"] }
 rusqlite_migration = "2.3"
@@ -73,6 +71,7 @@ uzers = "0.12"
 which = "8.0"
 xdg = "2.5"
 strip-ansi-escapes = "0.2"
+tar = "0.4"
 pest = "2.8"
 pest_derive = "2.8"
 dirs = "6.0"
@@ -83,21 +82,23 @@ os_pipe = { version = "1", optional = true }
 axum-server = { version = "0.8", features = ["tls-rustls"], optional = true }
 jsonwebtoken = { version = "10", optional = true, features = ["aws_lc_rs"] }
 tiktoken-rs = { version = "0.9", optional = true }
+tempfile = "3.3"
 [features]
-# Default features include core compression engines and swagger UI
+# Default features include core compression engines plugins that support MUSL
 default = [
     "client",
     "gzip",
-    "infer",
+    "filter_grep",
+    "meta_infer",
     "lz4",
-    "tokens",
-    "tree_magic_mini",
+    "meta_tokens",
+    "meta_tree_magic_mini",
     "zstd"
 ]
-# Server feature (includes axum and related dependencies)
-server = ["dep:axum", "dep:tower", "dep:tower-http", "dep:utoipa", "dep:jsonwebtoken"]
+# Server feature (includes axum and TLS/HTTPS via axum-server; rustls already available via client/ureq)
+server = ["dep:axum", "dep:tower", "dep:tower-http", "dep:utoipa", "dep:jsonwebtoken", "dep:axum-server"]
 # Compression features
 gzip = ["flate2"]
@@ -106,14 +107,18 @@ bzip2 = []
 xz = []
 zstd = ["dep:zstd"]
-# Plugin features (meta and filter)
-all-meta-plugins = ["dep:magic", "dep:infer", "dep:tree_magic_mini"]
-all-filter-plugins = []
-# Individual plugin features
-magic = ["dep:magic"]
-infer = ["dep:infer"]
-tree_magic_mini = ["dep:tree_magic_mini"]
+# Meta plugin features
+meta_magic = ["dep:magic"]
+meta_infer = ["dep:infer"]
+meta_tree_magic_mini = ["dep:tree_magic_mini"]
+meta_tokens = ["dep:tiktoken-rs"]
+meta_all = ["meta_magic", "meta_infer", "meta_tree_magic_mini", "meta_tokens"]
+meta_all_musl = ["meta_infer", "meta_tree_magic_mini", "meta_tokens"]
+# Filter plugin features
+filter_grep = ["dep:regex"]
+filter_all = ["filter_grep"]
+filter_all_musl = ["filter_grep"]
 # Swagger UI feature
 swagger = ["dep:utoipa-swagger-ui"]
@@ -121,12 +126,5 @@ swagger = ["dep:utoipa-swagger-ui"]
 # Client feature (HTTP client for remote server)
 client = ["dep:ureq", "dep:os_pipe"]
-# TLS feature (HTTPS server support)
-tls = ["dep:axum-server"]
-# Token counting feature (LLM token support via tiktoken)
-tokens = ["dep:tiktoken-rs"]
 [dev-dependencies]
-tempfile = "3.3"
 rand = "0.9"

DESIGN.md

@@ -117,7 +117,7 @@
 ## Data Storage
 ### Database Schema
-- `items` table: id (primary key), ts (timestamp), size (optional), compression
+- `items` table: id (primary key), ts (timestamp), uncompressed_size (optional), compressed_size (optional), closed (boolean), compression
 - `tags` table: id (foreign key to items), name (tag name)
 - `metas` table: id (foreign key to items), name (meta key), value (meta value)
 - Indexes on tag names and meta names for faster queries
@@ -178,26 +178,25 @@
 - None (no compression)
 ## Supported Meta Plugins
-- FileMagic - File type detection using file command
-- FileMime - MIME type detection using file command
-- FileEncoding - File encoding detection using file command
-- LineCount - Line count using wc command
-- WordCount - Word count using wc command
-- Cwd - Current working directory
-- Binary - Binary file detection
-- Uid - Current user ID
-- User - Current username
-- Gid - Current group ID
-- Group - Current group name
-- Shell - Shell path from SHELL environment variable
-- ShellPid - Shell process ID from PPID environment variable
-- KeepPid - Keep process ID
-- DigestSha256 - SHA-256 digest
-- DigestMd5 - MD5 digest using md5sum command
-- ReadTime - Time taken to read data
-- ReadRate - Rate of data reading
-- Hostname - System hostname
-- FullHostname - Fully qualified domain name
+Meta plugins collect metadata during item save. Each plugin produces one or more key-value pairs:
+- `magic_file` - File type detection using libmagic (when `magic` feature enabled)
+- `infer` - MIME type detection using infer crate (when `infer` feature enabled)
+- `tree_magic_mini` - MIME type detection using tree_magic_mini (when `tree_magic_mini` feature enabled)
+- `tokens` - LLM token counting using tiktoken (when `tokens` feature enabled)
+- `text` - Text analysis: line count, word count, char count, line average length
+- `digest` - SHA-256 and MD5 checksums
+- `hostname` - System hostname (full and short)
+- `cwd` - Current working directory
+- `user` - Current username and UID
+- `shell` - Shell path from SHELL environment variable
+- `shell_pid` - Shell process ID from PPID
+- `keep_pid` - Keep process ID
+- `env` - Arbitrary environment variables (via `KEEP_META_ENV_*` prefix)
+- `exec` - Execute external commands for custom metadata
+- `read_time` - Time taken to read content
+- `read_rate` - Content read rate (bytes/second)
 ## Testing Strategy
 - Unit tests for each module in `src/tests/`

README.md

@@ -345,8 +345,8 @@ Items are compressed automatically on save. Default: LZ4.
 | `gzip` | Internal | Fast | Good |
 | `bzip2` | External | Slow | Better |
 | `xz` | External | Slowest | Best |
-| `zstd` | External | Fast | Good |
-| `none` | Internal | N/A | N/A |
+| `zstd` | Internal | Fast | Good |
+| `raw` | Internal | N/A | N/A |
 ```sh
 # Specify compression per item

args.rs

@@ -24,81 +24,80 @@ pub struct Args {
 /// Struct for mode-specific arguments, defining CLI flags for different operations.
 #[derive(Parser, Debug, Clone)]
 pub struct ModeArgs {
-    #[arg(group("mode"), help_heading("Mode Options"), short, long, conflicts_with_all(["get", "diff", "list", "delete", "info", "update", "status", "export", "import"]))]
+    #[arg(group("mode"), help_heading("Mode Options"), short, long)]
     #[arg(help("Save an item using any tags or metadata provided"))]
     pub save: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), short, long, conflicts_with_all(["save", "diff", "list", "delete", "info", "update", "status", "export", "import"]))]
-    #[arg(help(
-        "Get an item either by it's ID or by a combination of matching tags and metatdata"
-    ))]
+    #[arg(group("mode"), help_heading("Mode Options"), short, long)]
+    #[arg(help("Get an item either by its ID or by a combination of matching tags and metadata"))]
     pub get: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), long, conflicts_with_all(["save", "get", "list", "delete", "info", "update", "status", "export", "import"]))]
+    #[arg(group("mode"), help_heading("Mode Options"), long)]
     #[arg(help("Show a diff between two items by ID"))]
     pub diff: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), short, long, conflicts_with_all(["save", "get", "diff", "delete", "info", "update", "status", "export", "import"]))]
+    #[arg(group("mode"), help_heading("Mode Options"), short, long)]
     #[arg(help("List items, filtering on tags or metadata if given"))]
     pub list: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), short, long, conflicts_with_all(["save", "get", "diff", "list", "info", "update", "status", "export", "import"]))]
+    #[arg(group("mode"), help_heading("Mode Options"), short, long)]
     #[arg(help("Delete items either by ID or by matching tags"))]
     #[arg(requires = "ids_or_tags")]
     pub delete: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), short, long, conflicts_with_all(["save", "get", "diff", "list", "delete", "update", "status", "export", "import"]))]
-    #[arg(help(
-        "Get an item either by it's ID or by a combination of matching tags and metatdata"
-    ))]
+    #[arg(group("mode"), help_heading("Mode Options"), short, long)]
+    #[arg(help("Get an item either by its ID or by a combination of matching tags and metadata"))]
     pub info: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), short('u'), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "status", "export", "import"]))]
+    #[arg(group("mode"), help_heading("Mode Options"), short('u'), long)]
     #[arg(help("Update an item's tags and metadata by ID"))]
     pub update: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), short('S'), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "update", "server", "status_plugins", "export", "import"]))]
+    #[arg(group("mode"), help_heading("Mode Options"), short('S'), long)]
     #[arg(help("Show status of directories and supported compression algorithms"))]
     pub status: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "update", "status", "server", "export", "import"]))]
+    #[arg(group("mode"), help_heading("Mode Options"), long)]
     #[arg(help("Show available plugins and their configurations"))]
     pub status_plugins: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "update", "status", "import"]))]
-    #[arg(help("Export an item to data and metadata files (default: latest item)"))]
+    #[arg(group("mode"), help_heading("Mode Options"), long)]
+    #[arg(help("Export items to a .keep.tar archive (requires IDs or tags)"))]
     pub export: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), long, value_name("META_FILE"), conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "update", "status", "export"]))]
-    #[arg(help("Import an item from a metadata file (data from --import-data-file or stdin)"))]
+    #[arg(group("mode"), help_heading("Mode Options"), long, value_name("FILE"))]
+    #[arg(help("Import items from a .keep.tar archive or legacy .meta.yml file"))]
     pub import: Option<String>,
-    #[arg(group("mode"), help_heading("Mode Options"), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "update", "status"]))]
+    #[cfg(feature = "server")]
+    #[arg(group("mode"), help_heading("Mode Options"), long)]
     #[arg(help("Start REST HTTP server"))]
     pub server: bool,
-    #[arg(group("mode"), help_heading("Mode Options"), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "update", "status", "server", "export", "import"]))]
+    #[arg(group("mode"), help_heading("Mode Options"), long)]
     #[arg(help("Generate default configuration and output to stdout"))]
     pub generate_config: bool,
-    #[arg(help_heading("Mode Options"), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "update", "status", "server", "generate_config", "export", "import"]))]
+    #[arg(help_heading("Mode Options"), long)]
     #[arg(help("Generate shell completion script"))]
     pub generate_completion: Option<Shell>,
+    #[cfg(feature = "server")]
     #[arg(help_heading("Server Options"), long, env("KEEP_SERVER_ADDRESS"))]
     #[arg(help("Server address to bind to"))]
     pub server_address: Option<String>,
+    #[cfg(feature = "server")]
     #[arg(help_heading("Server Options"), long, env("KEEP_SERVER_PORT"))]
     #[arg(help("Server port to bind to"))]
     pub server_port: Option<u16>,
-    #[cfg(feature = "tls")]
+    #[cfg(feature = "server")]
     #[arg(help_heading("Server Options"), long, env("KEEP_SERVER_CERT"))]
     #[arg(help("Path to TLS certificate file (PEM) for HTTPS"))]
     pub server_cert: Option<PathBuf>,
-    #[cfg(feature = "tls")]
+    #[cfg(feature = "server")]
     #[arg(help_heading("Server Options"), long, env("KEEP_SERVER_KEY"))]
     #[arg(help("Path to TLS private key file (PEM) for HTTPS"))]
     pub server_key: Option<PathBuf>,
@@ -201,14 +200,14 @@ pub struct ItemArgs {
     #[arg(help("Filter string to apply to content when getting items"))]
     pub filters: Option<String>,
-    #[arg(
-        help_heading("Export Options"),
-        long,
-        default_value = "{id}_{tags}_{ts}"
-    )]
-    #[arg(help("Template for export filename. Variables: {id} {tags} {ts} {compression}"))]
+    #[arg(help_heading("Export Options"), long, default_value = "{name}_{ts}")]
+    #[arg(help("Template for export tar filename (appends .keep.tar). Variables: {name} {ts}"))]
     pub export_filename_format: String,
#[arg(help_heading("Export Options"), long, value_name("NAME"))]
#[arg(help("Export name used for {name} variable (default: export_<common-tags>)"))]
pub export_name: Option<String>,
#[arg(help_heading("Import Options"), long, value_name("DATA_FILE"))] #[arg(help_heading("Import Options"), long, value_name("DATA_FILE"))]
#[arg(help("Data file for import (reads from stdin if omitted)"))] #[arg(help("Data file for import (reads from stdin if omitted)"))]
pub import_data_file: Option<PathBuf>, pub import_data_file: Option<PathBuf>,
@@ -228,7 +227,7 @@ pub struct OptionsArgs {
#[arg( #[arg(
long, long,
env("KEEP_LIST_FORMAT"), env("KEEP_LIST_FORMAT"),
default_value("id,time,size,tags,meta:hostname") default_value("id,time,size,meta:text_line_count,tags,meta:hostname_short,meta:command")
)] )]
#[arg(help("A comma separated list of columns to display with --list"))] #[arg(help("A comma separated list of columns to display with --list"))]
pub list_format: String, pub list_format: String,
@@ -253,24 +252,29 @@ pub struct OptionsArgs {
#[arg(help("Output format (only works with --info, --status, --list)"))] #[arg(help("Output format (only works with --info, --status, --list)"))]
pub output_format: Option<String>, pub output_format: Option<String>,
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_PASSWORD"))] #[arg(help_heading("Server Options"), long, env("KEEP_SERVER_PASSWORD"))]
#[arg(help("Password for server authentication (requires --server)"))] #[arg(help("Password for server authentication (requires --server)"))]
pub server_password: Option<String>, pub server_password: Option<String>,
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_PASSWORD_HASH"))] #[arg(help_heading("Server Options"), long, env("KEEP_SERVER_PASSWORD_HASH"))]
#[arg(help("Password hash for server authentication (requires --server)"))] #[arg(help("Password hash for server authentication (requires --server)"))]
pub server_password_hash: Option<String>, pub server_password_hash: Option<String>,
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_USERNAME"))] #[arg(help_heading("Server Options"), long, env("KEEP_SERVER_USERNAME"))]
#[arg(help( #[arg(help(
"Username for server Basic authentication (requires --server, defaults to 'keep')" "Username for server Basic authentication (requires --server, defaults to 'keep')"
))] ))]
pub server_username: Option<String>, pub server_username: Option<String>,
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_JWT_SECRET"))] #[arg(help_heading("Server Options"), long, env("KEEP_SERVER_JWT_SECRET"))]
#[arg(help("JWT secret for token-based authentication (requires --server)"))] #[arg(help("JWT secret for token-based authentication (requires --server)"))]
pub server_jwt_secret: Option<String>, pub server_jwt_secret: Option<String>,
#[cfg(feature = "server")]
#[arg( #[arg(
help_heading("Server Options"), help_heading("Server Options"),
long, long,
@@ -279,6 +283,7 @@ pub struct OptionsArgs {
#[arg(help("Path to file containing JWT secret (requires --server)"))] #[arg(help("Path to file containing JWT secret (requires --server)"))]
pub server_jwt_secret_file: Option<PathBuf>, pub server_jwt_secret_file: Option<PathBuf>,
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_MAX_BODY_SIZE"))] #[arg(help_heading("Server Options"), long, env("KEEP_SERVER_MAX_BODY_SIZE"))]
#[arg(help("Maximum request body size in bytes (requires --server, default: unlimited)"))] #[arg(help("Maximum request body size in bytes (requires --server, default: unlimited)"))]
pub server_max_body_size: Option<u64>, pub server_max_body_size: Option<u64>,
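The export filename is built from the `{name}_{ts}` template introduced above, with `.keep.tar` appended. A minimal sketch of that substitution — `render_export_filename` is a hypothetical helper written for illustration, not a function shown in this diff:

```rust
/// Substitute `{name}` and `{ts}` in an export filename template,
/// then append the `.keep.tar` extension.
/// NOTE: this helper is illustrative; the crate's real implementation
/// may differ (e.g. in how it sanitizes or validates the template).
fn render_export_filename(template: &str, name: &str, ts: &str) -> String {
    let mut out = template.replace("{name}", name).replace("{ts}", ts);
    out.push_str(".keep.tar");
    out
}

fn main() {
    let filename =
        render_export_filename("{name}_{ts}", "export_backup", "2026-03-17T12-00-00Z");
    println!("{filename}"); // export_backup_2026-03-17T12-00-00Z.keep.tar
}
```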


@@ -1,20 +1,9 @@
use crate::services::error::CoreError; use crate::services::{ItemInfo, error::CoreError};
use base64::Engine; use base64::Engine;
use serde::de::DeserializeOwned; use serde::de::DeserializeOwned;
use std::collections::HashMap; use std::collections::HashMap;
use std::io::Read; use std::io::Read;
/// Item information returned from the server API.
#[derive(Debug, Clone, serde::Deserialize, serde::Serialize)]
pub struct ItemInfo {
pub id: i64,
pub ts: String,
pub size: Option<i64>,
pub compression: String,
pub tags: Vec<String>,
pub metadata: HashMap<String, String>,
}
/// Percent-encode a value for use in a URL query string. /// Percent-encode a value for use in a URL query string.
fn url_encode(s: &str) -> String { fn url_encode(s: &str) -> String {
let mut result = String::with_capacity(s.len() * 3); let mut result = String::with_capacity(s.len() * 3);
@@ -33,6 +22,18 @@ fn url_encode(s: &str) -> String {
result result
} }
fn append_query_params(url: &mut String, params: &[(&str, &str)]) {
if !params.is_empty() {
url.push('?');
for (i, (key, value)) in params.iter().enumerate() {
if i > 0 {
url.push('&');
}
url.push_str(&format!("{}={}", url_encode(key), url_encode(value)));
}
}
}
pub struct KeepClient { pub struct KeepClient {
base_url: String, base_url: String,
agent: ureq::Agent, agent: ureq::Agent,
@@ -125,15 +126,7 @@ impl KeepClient {
params: &[(&str, &str)], params: &[(&str, &str)],
) -> Result<T, CoreError> { ) -> Result<T, CoreError> {
let mut url = self.url(path); let mut url = self.url(path);
if !params.is_empty() { append_query_params(&mut url, params);
url.push('?');
for (i, (key, value)) in params.iter().enumerate() {
if i > 0 {
url.push('&');
}
url.push_str(&format!("{}={}", url_encode(key), url_encode(value)));
}
}
let mut req = self.agent.get(&url); let mut req = self.agent.get(&url);
if let Some(ref auth) = self.auth_header() { if let Some(ref auth) = self.auth_header() {
req = req.header("Authorization", auth); req = req.header("Authorization", auth);
@@ -178,15 +171,7 @@ impl KeepClient {
params: &[(&str, &str)], params: &[(&str, &str)],
) -> Result<ItemInfo, CoreError> { ) -> Result<ItemInfo, CoreError> {
let mut url = self.url(path); let mut url = self.url(path);
if !params.is_empty() { append_query_params(&mut url, params);
url.push('?');
for (i, (key, value)) in params.iter().enumerate() {
if i > 0 {
url.push('&');
}
url.push_str(&format!("{}={}", url_encode(key), url_encode(value)));
}
}
let mut req = self.agent.post(&url); let mut req = self.agent.post(&url);
if let Some(ref auth) = self.auth_header() { if let Some(ref auth) = self.auth_header() {
@@ -244,15 +229,22 @@ impl KeepClient {
#[derive(serde::Deserialize)] #[derive(serde::Deserialize)]
struct ApiResponse { struct ApiResponse {
data: Option<ItemInfo>, data: Option<ItemInfo>,
error: Option<String>,
} }
let response: ApiResponse = self.get_json(&format!("/api/item/{id}/info"))?; let response: ApiResponse = self.get_json(&format!("/api/item/{id}/info"))?;
response response.data.ok_or_else(|| {
.data CoreError::Other(anyhow::anyhow!(
.ok_or_else(|| CoreError::Other(anyhow::anyhow!("Item not found"))) "{}",
response
.error
.unwrap_or_else(|| "Item not found".to_string())
))
})
} }
pub fn list_items( pub fn list_items(
&self, &self,
ids: &[i64],
tags: &[String], tags: &[String],
order: &str, order: &str,
start: u64, start: u64,
@@ -262,12 +254,22 @@ impl KeepClient {
#[derive(serde::Deserialize)] #[derive(serde::Deserialize)]
struct ApiResponse { struct ApiResponse {
data: Option<Vec<ItemInfo>>, data: Option<Vec<ItemInfo>>,
error: Option<String>,
} }
let mut params: Vec<(String, String)> = Vec::new(); let mut params: Vec<(String, String)> = Vec::new();
params.push(("order".to_string(), order.to_string())); params.push(("order".to_string(), order.to_string()));
params.push(("start".to_string(), start.to_string())); params.push(("start".to_string(), start.to_string()));
params.push(("count".to_string(), count.to_string())); params.push(("count".to_string(), count.to_string()));
if !ids.is_empty() {
params.push((
"ids".to_string(),
ids.iter()
.map(|i| i.to_string())
.collect::<Vec<_>>()
.join(","),
));
}
if !tags.is_empty() { if !tags.is_empty() {
params.push(("tags".to_string(), tags.join(","))); params.push(("tags".to_string(), tags.join(",")));
} }
@@ -284,7 +286,13 @@ impl KeepClient {
.collect(); .collect();
let response: ApiResponse = self.get_json_with_query("/api/item/", &param_refs)?; let response: ApiResponse = self.get_json_with_query("/api/item/", &param_refs)?;
Ok(response.data.unwrap_or_default()) if let Some(data) = response.data {
return Ok(data);
}
if let Some(err) = response.error {
return Err(CoreError::Other(anyhow::anyhow!("Server error: {err}")));
}
Ok(Vec::new())
} }
pub fn save_item( pub fn save_item(
@@ -344,9 +352,9 @@ impl KeepClient {
/// Set the uncompressed size for an item. /// Set the uncompressed size for an item.
pub fn set_item_size(&self, id: i64, size: u64) -> Result<(), CoreError> { pub fn set_item_size(&self, id: i64, size: u64) -> Result<(), CoreError> {
let url = format!( let url = format!(
"{}?size={}", "{}?uncompressed_size={}",
self.url(&format!("/api/item/{id}/update")), self.url(&format!("/api/item/{id}/update")),
size url_encode(&size.to_string())
); );
let mut req = self.agent.post(&url); let mut req = self.agent.post(&url);
if let Some(ref auth) = self.auth_header() { if let Some(ref auth) = self.auth_header() {
@@ -387,7 +395,7 @@ impl KeepClient {
.headers() .headers()
.get("X-Keep-Compression") .get("X-Keep-Compression")
.and_then(|v| v.to_str().ok()) .and_then(|v| v.to_str().ok())
.unwrap_or("none") .unwrap_or("raw")
.to_string(); .to_string();
let reader = response.into_body().into_reader(); let reader = response.into_body().into_reader();
@@ -406,4 +414,101 @@ impl KeepClient {
let response: ApiResponse = self.get_json_with_query("/api/diff", &param_refs)?; let response: ApiResponse = self.get_json_with_query("/api/diff", &param_refs)?;
Ok(response.data.unwrap_or_default()) Ok(response.data.unwrap_or_default())
} }
/// Export items to a tar archive, streaming the response to a file.
///
/// # Arguments
///
/// * `ids` - Item IDs to export (mutually exclusive with tags).
/// * `tags` - Tags to search for items (mutually exclusive with ids).
/// * `dest` - Destination file path.
pub fn export_items_to_file(
&self,
ids: &[i64],
tags: &[String],
dest: &std::path::Path,
) -> Result<(), CoreError> {
let mut params: Vec<(String, String)> = Vec::new();
if !ids.is_empty() {
let id_strs: Vec<String> = ids.iter().map(|id| id.to_string()).collect();
params.push(("ids".to_string(), id_strs.join(",")));
}
if !tags.is_empty() {
params.push(("tags".to_string(), tags.join(",")));
}
let param_refs: Vec<(&str, &str)> = params
.iter()
.map(|(k, v)| (k.as_str(), v.as_str()))
.collect();
let mut url = self.url("/api/export");
append_query_params(&mut url, &param_refs);
let mut req = self.agent.get(&url);
if let Some(ref auth) = self.auth_header() {
req = req.header("Authorization", auth);
}
let response = self.handle_error(req.call())?;
let mut reader = response.into_body().into_reader();
let mut file = std::fs::File::create(dest).map_err(CoreError::Io)?;
let mut buf = [0u8; crate::common::PIPESIZE];
loop {
let n = reader.read(&mut buf).map_err(CoreError::Io)?;
if n == 0 {
break;
}
std::io::Write::write_all(&mut file, &buf[..n]).map_err(CoreError::Io)?;
}
Ok(())
}
/// Import items from a tar archive, streaming the file to the server.
///
/// # Arguments
///
/// * `tar_path` - Path to the `.keep.tar` file.
///
/// # Returns
///
/// A list of newly assigned item IDs.
pub fn import_tar_file(&self, tar_path: &std::path::Path) -> Result<Vec<i64>, CoreError> {
#[derive(serde::Deserialize)]
struct ApiResponse {
data: Option<ImportResponse>,
error: Option<String>,
}
#[derive(serde::Deserialize)]
struct ImportResponse {
ids: Vec<i64>,
}
let mut file = std::fs::File::open(tar_path).map_err(CoreError::Io)?;
let url = self.url("/api/import");
let mut req = self.agent.post(&url);
if let Some(ref auth) = self.auth_header() {
req = req.header("Authorization", auth);
}
req = req.header("Content-Type", "application/x-tar");
let response = self.handle_error(req.send(ureq::SendBody::from_reader(&mut file)))?;
let body = response
.into_body()
.read_to_string()
.map_err(|e| CoreError::InvalidInput(format!("Cannot read response: {e}")))?;
let api_response: ApiResponse = serde_json::from_str(&body)
.map_err(|e| CoreError::InvalidInput(format!("Cannot parse response: {e}")))?;
if let Some(error) = api_response.error {
return Err(CoreError::InvalidInput(error));
}
Ok(api_response.data.map(|d| d.ids).unwrap_or_default())
}
} }


@@ -149,7 +149,7 @@ fn has_binary_signature(data: &[u8]) -> bool {
/// Check if data looks like UTF-16 without BOM /// Check if data looks like UTF-16 without BOM
fn looks_like_utf16(data: &[u8]) -> bool { fn looks_like_utf16(data: &[u8]) -> bool {
if data.len() < 4 || !data.len().is_multiple_of(2) { if data.len() < 4 || data.len() % 2 != 0 {
return false; return false;
} }


@@ -82,3 +82,10 @@ pub fn read_with_bounds<R: std::io::Read>(
} }
Ok(result) Ok(result)
} }
/// Sanitize a timestamp string for use in filenames.
///
/// Replaces colons with hyphens (e.g., `2026-03-17T12:00:00Z` → `2026-03-17T12-00-00Z`).
pub fn sanitize_ts_string(ts: &str) -> String {
ts.replace(':', "-")
}


@@ -89,7 +89,7 @@ pub fn generate_status_info(
}; };
let _default_type = crate::compression_engine::default_compression_type(); let _default_type = crate::compression_engine::default_compression_type();
let mut compression_info = Vec::new(); let mut compression_info = Vec::with_capacity(CompressionType::iter().count());
// Sort compression types by their string representation // Sort compression types by their string representation
let mut sorted_compression_types: Vec<CompressionType> = CompressionType::iter().collect(); let mut sorted_compression_types: Vec<CompressionType> = CompressionType::iter().collect();
@@ -141,7 +141,8 @@ pub fn generate_status_info(
}); });
} }
let mut meta_plugins_map = std::collections::HashMap::new(); let mut meta_plugins_map =
std::collections::HashMap::with_capacity(MetaPluginType::iter().count());
let mut enabled_meta_plugins_vec = Vec::new(); let mut enabled_meta_plugins_vec = Vec::new();
// Sort meta plugin types by their string representation to avoid creating plugins just for sorting // Sort meta plugin types by their string representation to avoid creating plugins just for sorting


@@ -93,10 +93,22 @@ impl<W: Write> Drop for AutoFinishGzEncoder<W> {
#[cfg(feature = "gzip")] #[cfg(feature = "gzip")]
impl<W: Write> Write for AutoFinishGzEncoder<W> { impl<W: Write> Write for AutoFinishGzEncoder<W> {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> { fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
self.encoder.as_mut().unwrap().write(buf) match self.encoder.as_mut() {
Some(encoder) => encoder.write(buf),
None => Err(io::Error::new(
io::ErrorKind::BrokenPipe,
"encoder already finished",
)),
}
} }
fn flush(&mut self) -> io::Result<()> { fn flush(&mut self) -> io::Result<()> {
self.encoder.as_mut().unwrap().flush() match self.encoder.as_mut() {
Some(encoder) => encoder.flush(),
None => Err(io::Error::new(
io::ErrorKind::BrokenPipe,
"encoder already finished",
)),
}
} }
} }
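The guard above — returning `BrokenPipe` instead of panicking when the inner encoder has already been consumed by `finish()` — can be reduced to a self-contained sketch. `FinishingWriter` here is an illustrative type, not from the codebase:

```rust
use std::io::{self, Write};

/// Illustrative writer mirroring AutoFinishGzEncoder's guard: once the
/// inner writer has been taken (finished), further writes fail with
/// BrokenPipe rather than panicking on `Option::unwrap`.
struct FinishingWriter {
    inner: Option<Vec<u8>>,
}

impl Write for FinishingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        match self.inner.as_mut() {
            Some(w) => w.write(buf),
            None => Err(io::Error::new(
                io::ErrorKind::BrokenPipe,
                "writer already finished",
            )),
        }
    }
    fn flush(&mut self) -> io::Result<()> {
        match self.inner.as_mut() {
            Some(w) => w.flush(),
            None => Err(io::Error::new(
                io::ErrorKind::BrokenPipe,
                "writer already finished",
            )),
        }
    }
}

fn main() {
    let mut ok = FinishingWriter { inner: Some(Vec::new()) };
    assert!(ok.write(b"data").is_ok());

    let mut done = FinishingWriter { inner: None };
    let err = done.write(b"data").unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::BrokenPipe);
    println!("guarded writer behaves as expected");
}
```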


@@ -7,16 +7,14 @@ use strum::{Display, EnumIter, EnumString};
use log::*; use log::*;
use lazy_static::lazy_static;
extern crate enum_map; extern crate enum_map;
use enum_map::enum_map; use enum_map::enum_map;
use enum_map::{Enum, EnumMap}; use enum_map::{Enum, EnumMap};
pub mod gzip; pub mod gzip;
pub mod lz4; pub mod lz4;
pub mod none;
pub mod program; pub mod program;
pub mod raw;
pub mod zstd; pub mod zstd;
use crate::compression_engine::program::CompressionEngineProgram; use crate::compression_engine::program::CompressionEngineProgram;
@@ -45,8 +43,8 @@ pub enum CompressionType {
XZ, XZ,
#[strum(serialize = "zstd")] #[strum(serialize = "zstd")]
ZStd, ZStd,
#[strum(serialize = "none")] #[strum(to_string = "raw", serialize = "raw", serialize = "none")]
None, Raw,
} }
/// Trait defining the interface for compression engines. /// Trait defining the interface for compression engines.
@@ -180,63 +178,65 @@ impl Clone for Box<dyn CompressionEngine> {
} }
} }
lazy_static! { fn init_compression_engines() -> EnumMap<CompressionType, Box<dyn CompressionEngine>> {
static ref COMPRESSION_ENGINES: EnumMap<CompressionType, Box<dyn CompressionEngine>> = { #[allow(unused_mut)]
#[allow(unused_mut)] // mut needed when gzip/lz4 features are enabled let mut em: EnumMap<CompressionType, Box<dyn CompressionEngine>> = enum_map! {
let mut em = enum_map! { CompressionType::LZ4 => Box::new(crate::compression_engine::program::CompressionEngineProgram::new(
CompressionType::LZ4 => Box::new(crate::compression_engine::program::CompressionEngineProgram::new( "lz4",
"lz4", vec!["-c"],
vec!["-c"], vec!["-d", "-c"]
vec!["-d", "-c"] )) as Box<dyn CompressionEngine>,
)) as Box<dyn CompressionEngine>, CompressionType::GZip => Box::new(crate::compression_engine::program::CompressionEngineProgram::new(
CompressionType::GZip => Box::new(crate::compression_engine::program::CompressionEngineProgram::new( "gzip",
"gzip", vec!["-c"],
vec!["-c"], vec!["-d", "-c"]
vec!["-d", "-c"] )) as Box<dyn CompressionEngine>,
)) as Box<dyn CompressionEngine>, CompressionType::BZip2 => Box::new(crate::compression_engine::program::CompressionEngineProgram::new(
CompressionType::BZip2 => Box::new(crate::compression_engine::program::CompressionEngineProgram::new( "bzip2",
"bzip2", vec!["-c"],
vec!["-c"], vec!["-d", "-c"]
vec!["-d", "-c"] )) as Box<dyn CompressionEngine>,
)) as Box<dyn CompressionEngine>, CompressionType::XZ => Box::new(crate::compression_engine::program::CompressionEngineProgram::new(
CompressionType::XZ => Box::new(crate::compression_engine::program::CompressionEngineProgram::new( "xz",
"xz", vec!["-c"],
vec!["-c"], vec!["-d", "-c"]
vec!["-d", "-c"] )) as Box<dyn CompressionEngine>,
)) as Box<dyn CompressionEngine>, CompressionType::ZStd => Box::new(crate::compression_engine::program::CompressionEngineProgram::new(
CompressionType::ZStd => Box::new(crate::compression_engine::program::CompressionEngineProgram::new( "zstd",
"zstd", vec!["-c"],
vec!["-c"], vec!["-d", "-c"]
vec!["-d", "-c"] )) as Box<dyn CompressionEngine>,
)) as Box<dyn CompressionEngine>, CompressionType::Raw => Box::new(crate::compression_engine::raw::CompressionEngineRaw::new()) as Box<dyn CompressionEngine>
CompressionType::None => Box::new(crate::compression_engine::none::CompressionEngineNone::new()) as Box<dyn CompressionEngine>
};
#[cfg(feature = "gzip")]
{
em[CompressionType::GZip] =
Box::new(crate::compression_engine::gzip::CompressionEngineGZip::new())
as Box<dyn CompressionEngine>;
}
#[cfg(feature = "lz4")]
{
em[CompressionType::LZ4] =
Box::new(crate::compression_engine::lz4::CompressionEngineLZ4::new())
as Box<dyn CompressionEngine>;
}
#[cfg(feature = "zstd")]
{
em[CompressionType::ZStd] =
Box::new(crate::compression_engine::zstd::CompressionEngineZstd::new())
as Box<dyn CompressionEngine>;
}
em
}; };
#[cfg(feature = "gzip")]
{
em[CompressionType::GZip] =
Box::new(crate::compression_engine::gzip::CompressionEngineGZip::new())
as Box<dyn CompressionEngine>;
}
#[cfg(feature = "lz4")]
{
em[CompressionType::LZ4] =
Box::new(crate::compression_engine::lz4::CompressionEngineLZ4::new())
as Box<dyn CompressionEngine>;
}
#[cfg(feature = "zstd")]
{
em[CompressionType::ZStd] =
Box::new(crate::compression_engine::zstd::CompressionEngineZstd::new())
as Box<dyn CompressionEngine>;
}
em
} }
static COMPRESSION_ENGINES: std::sync::LazyLock<
EnumMap<CompressionType, Box<dyn CompressionEngine>>,
> = std::sync::LazyLock::new(init_compression_engines);
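`std::sync::LazyLock` (stable since Rust 1.80) replaces the `lazy_static!` macro here: the initializer runs exactly once, on first dereference, with thread-safe synchronization. A minimal illustration of the same pattern using a plain map in place of the engine table:

```rust
use std::collections::HashMap;
use std::sync::LazyLock;

// The closure runs exactly once, on first access, even under concurrent
// readers — the same guarantee lazy_static provided, now in std.
static EXTENSIONS: LazyLock<HashMap<&'static str, &'static str>> = LazyLock::new(|| {
    let mut m = HashMap::new();
    m.insert("lz4", ".lz4");
    m.insert("gzip", ".gz");
    m.insert("raw", "");
    m
});

fn main() {
    // Deref triggers (or reuses) the one-time initialization.
    assert_eq!(EXTENSIONS["gzip"], ".gz");
    println!("lazy map initialized with {} entries", EXTENSIONS.len());
}
```

The same shape works for the engine table: moving the body into a named `init_compression_engines` function, as this change does, keeps the static declaration short and the builder independently readable.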
pub fn default_compression_type() -> CompressionType { pub fn default_compression_type() -> CompressionType {
CompressionType::LZ4 CompressionType::LZ4
} }


@@ -15,7 +15,13 @@ pub struct ProgramReader {
impl Read for ProgramReader { impl Read for ProgramReader {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> { fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
self.stdout.as_mut().unwrap().read(buf) match self.stdout.as_mut() {
Some(stdout) => stdout.read(buf),
None => Err(std::io::Error::new(
std::io::ErrorKind::BrokenPipe,
"stdout already taken",
)),
}
} }
} }
@@ -33,11 +39,23 @@ pub struct ProgramWriter {
impl Write for ProgramWriter { impl Write for ProgramWriter {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> { fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
self.stdin.as_mut().unwrap().write(buf) match self.stdin.as_mut() {
Some(stdin) => stdin.write(buf),
None => Err(std::io::Error::new(
std::io::ErrorKind::BrokenPipe,
"stdin already taken",
)),
}
} }
fn flush(&mut self) -> std::io::Result<()> { fn flush(&mut self) -> std::io::Result<()> {
self.stdin.as_mut().unwrap().flush() match self.stdin.as_mut() {
Some(stdin) => stdin.flush(),
None => Err(std::io::Error::new(
std::io::ErrorKind::BrokenPipe,
"stdin already taken",
)),
}
} }
} }


@@ -7,15 +7,15 @@ use std::path::PathBuf;
use crate::compression_engine::CompressionEngine; use crate::compression_engine::CompressionEngine;
#[derive(Debug, Eq, PartialEq, Clone, Default)] #[derive(Debug, Eq, PartialEq, Clone, Default)]
pub struct CompressionEngineNone {} pub struct CompressionEngineRaw {}
impl CompressionEngineNone { impl CompressionEngineRaw {
pub fn new() -> CompressionEngineNone { pub fn new() -> CompressionEngineRaw {
CompressionEngineNone {} CompressionEngineRaw {}
} }
} }
impl CompressionEngine for CompressionEngineNone { impl CompressionEngine for CompressionEngineRaw {
fn is_supported(&self) -> bool { fn is_supported(&self) -> bool {
true true
} }


@@ -217,6 +217,9 @@ pub struct Settings {
// Export filename format template (--export-filename-format) // Export filename format template (--export-filename-format)
#[serde(skip)] #[serde(skip)]
pub export_filename_format: String, pub export_filename_format: String,
// Export name for {name} variable (--export-name)
#[serde(skip)]
pub export_name: Option<String>,
// Import data file path (--import-data-file) // Import data file path (--import-data-file)
#[serde(skip)] #[serde(skip)]
pub import_data_file: Option<std::path::PathBuf>, pub import_data_file: Option<std::path::PathBuf>,
@@ -232,15 +235,13 @@ impl Settings {
} else if let Ok(env_config) = std::env::var("KEEP_CONFIG") { } else if let Ok(env_config) = std::env::var("KEEP_CONFIG") {
PathBuf::from(env_config) PathBuf::from(env_config)
} else { } else {
let default_path = if let Ok(home_dir) = std::env::var("HOME") { let default_path = dirs::config_dir()
let mut path = PathBuf::from(home_dir); .map(|mut p| {
path.push(".config"); p.push("keep");
path.push("keep"); p.push("config.yml");
path.push("config.yml"); p
path })
} else { .unwrap_or_else(|| PathBuf::from("~/.config/keep/config.yml"));
PathBuf::from("~/.config/keep/config.yml")
};
debug!("CONFIG: Using default config path: {default_path:?}"); debug!("CONFIG: Using default config path: {default_path:?}");
default_path default_path
}; };
@@ -300,42 +301,48 @@ impl Settings {
config_builder = config_builder.set_override("force", true)?; config_builder = config_builder.set_override("force", true)?;
} }
#[cfg(feature = "server")]
if let Some(server_password) = &args.options.server_password { if let Some(server_password) = &args.options.server_password {
config_builder = config_builder =
config_builder.set_override("server.password", server_password.as_str())?; config_builder.set_override("server.password", server_password.as_str())?;
} }
#[cfg(feature = "server")]
if let Some(server_password_hash) = &args.options.server_password_hash { if let Some(server_password_hash) = &args.options.server_password_hash {
config_builder = config_builder config_builder = config_builder
.set_override("server.password_hash", server_password_hash.as_str())?; .set_override("server.password_hash", server_password_hash.as_str())?;
} }
#[cfg(feature = "server")]
if let Some(server_username) = &args.options.server_username { if let Some(server_username) = &args.options.server_username {
config_builder = config_builder =
config_builder.set_override("server.username", server_username.as_str())?; config_builder.set_override("server.username", server_username.as_str())?;
} }
#[cfg(feature = "server")]
if let Some(server_address) = &args.mode.server_address { if let Some(server_address) = &args.mode.server_address {
config_builder = config_builder =
config_builder.set_override("server.address", server_address.as_str())?; config_builder.set_override("server.address", server_address.as_str())?;
} }
#[cfg(feature = "server")]
if let Some(server_port) = args.mode.server_port { if let Some(server_port) = args.mode.server_port {
config_builder = config_builder.set_override("server.port", server_port)?; config_builder = config_builder.set_override("server.port", server_port)?;
} }
#[cfg(feature = "tls")] #[cfg(feature = "server")]
if let Some(server_cert) = &args.mode.server_cert { if let Some(server_cert) = &args.mode.server_cert {
config_builder = config_builder config_builder = config_builder
.set_override("server.cert_file", server_cert.to_string_lossy().as_ref())?; .set_override("server.cert_file", server_cert.to_string_lossy().as_ref())?;
} }
#[cfg(feature = "tls")] #[cfg(feature = "server")]
if let Some(server_key) = &args.mode.server_key { if let Some(server_key) = &args.mode.server_key {
config_builder = config_builder config_builder = config_builder
.set_override("server.key_file", server_key.to_string_lossy().as_ref())?; .set_override("server.key_file", server_key.to_string_lossy().as_ref())?;
} }
#[cfg(feature = "server")]
if let Some(max_body_size) = args.options.server_max_body_size { if let Some(max_body_size) = args.options.server_max_body_size {
config_builder = config_builder.set_override("server.max_body_size", max_body_size)?; config_builder = config_builder.set_override("server.max_body_size", max_body_size)?;
} }
@@ -488,7 +495,9 @@ impl Settings {
} }
// Override list_format from --list-format CLI arg // Override list_format from --list-format CLI arg
if args.options.list_format != "id,time,size,tags,meta:hostname" { if args.options.list_format
!= "id,time,size,meta:text_line_count,tags,meta:hostname_short,meta:command"
{
debug!("CONFIG: Overriding list_format from --list-format CLI arg"); debug!("CONFIG: Overriding list_format from --list-format CLI arg");
settings.list_format = Settings::parse_list_format(&args.options.list_format); settings.list_format = Settings::parse_list_format(&args.options.list_format);
} }
@@ -540,6 +549,7 @@ impl Settings {
// Set export filename format from CLI args // Set export filename format from CLI args
settings.export_filename_format = args.item.export_filename_format.clone(); settings.export_filename_format = args.item.export_filename_format.clone();
settings.export_name = args.item.export_name.clone();
settings.import_data_file = args.item.import_data_file.clone(); settings.import_data_file = args.item.import_data_file.clone();
// Expand ~ in all path fields // Expand ~ in all path fields
@@ -692,6 +702,14 @@ impl Settings {
.unwrap_or_default() .unwrap_or_default()
} }
/// Returns the metadata filter as a HashMap.
///
/// Converts the `meta` field (list of key-value pairs from CLI --meta flags)
/// into a `HashMap<String, Option<String>>` suitable for filtering.
pub fn meta_filter(&self) -> std::collections::HashMap<String, Option<String>> {
self.meta.iter().cloned().collect()
}
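`meta_filter` relies on `Iterator::collect` turning a sequence of key-value pairs into a `HashMap`; if the same key appears twice, the later pair wins. A standalone sketch of that conversion:

```rust
use std::collections::HashMap;

fn main() {
    // `meta` as it might be parsed from repeated --meta flags:
    // a key with an optional value (None = presence-only filter).
    let meta: Vec<(String, Option<String>)> = vec![
        ("hostname".to_string(), Some("web01".to_string())),
        ("command".to_string(), None),
    ];

    // The same conversion Settings::meta_filter performs:
    let filter: HashMap<String, Option<String>> = meta.iter().cloned().collect();

    assert_eq!(filter.get("hostname"), Some(&Some("web01".to_string())));
    assert_eq!(filter.get("command"), Some(&None)); // key present, no value
    assert_eq!(filter.get("absent"), None);
}
```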
/// Validates the configuration against plugin schemas. /// Validates the configuration against plugin schemas.
/// ///
/// Checks that: /// Checks that:

src/db.rs

@@ -1,8 +1,7 @@
use anyhow::{Context, Error, Result, anyhow}; use anyhow::{Context, Error, Result, anyhow};
use chrono::prelude::*; use chrono::prelude::*;
use lazy_static::lazy_static;
use log::*; use log::*;
use rusqlite::{Connection, OpenFlags, params}; use rusqlite::{Connection, OpenFlags, Row, params};
use rusqlite_migration::{M, Migrations}; use rusqlite_migration::{M, Migrations};
use serde::{Deserialize, Serialize}; use serde::{Deserialize, Serialize};
use std::collections::HashMap; use std::collections::HashMap;
@@ -19,7 +18,7 @@ and query utilities for efficient data access.
# Schema # Schema
The database uses three main tables: The database uses three main tables:
- `items`: Core item information (ID, timestamp, size, compression). - `items`: Core item information (ID, timestamp, uncompressed_size, compressed_size, closed, compression).
- `tags`: Item-tag associations (many-to-many). - `tags`: Item-tag associations (many-to-many).
- `metas`: Item-metadata associations (many-to-many). - `metas`: Item-metadata associations (many-to-many).
@@ -42,30 +41,26 @@ let conn = db::open(PathBuf::from("keep.db"))?;
``` ```
Insert an item: Insert an item:
```ignore ```ignore
let item = db::Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() }; let item = db::Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
let id = db::insert_item(&conn, item)?; let id = db::insert_item(&conn, item)?;
``` ```
*/ */
-lazy_static! {
-    // Database schema migrations for the Keep application.
-    //
-    // Defines the sequence of migrations to create and update the schema.
-    // Applied automatically when opening a database connection.
-    static ref MIGRATIONS: Migrations<'static> = Migrations::new(vec![
+static MIGRATIONS: std::sync::LazyLock<Migrations<'static>> = std::sync::LazyLock::new(|| {
+    Migrations::new(vec![
         M::up(
             "CREATE TABLE items(
                 id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
                 ts TEXT NOT NULL,
                 size INTEGER NULL,
-                compression TEXT NOT NULL)"
+                compression TEXT NOT NULL)",
         ),
         M::up(
             "CREATE TABLE tags (
                 id INTEGER NOT NULL,
                 name TEXT NOT NULL,
                 FOREIGN KEY(id) REFERENCES items(id) ON DELETE CASCADE,
-                PRIMARY KEY(id, name));"
+                PRIMARY KEY(id, name));",
         ),
         M::up(
             "CREATE TABLE metas (
@@ -73,12 +68,17 @@ lazy_static! {
                 name TEXT NOT NULL,
                 value TEXT NOT NULL,
                 FOREIGN KEY(id) REFERENCES items(id) ON DELETE CASCADE,
-                PRIMARY KEY(id, name));"
+                PRIMARY KEY(id, name));",
         ),
         M::up("CREATE INDEX idx_tags_name ON tags(name)"),
         M::up("CREATE INDEX idx_metas_name ON metas(name)"),
-    ]);
-}
+        M::up("CREATE INDEX idx_items_ts ON items(ts)"),
+        M::up("UPDATE items SET compression = 'raw' WHERE compression = 'none'"),
+        M::up("ALTER TABLE items RENAME COLUMN size TO uncompressed_size"),
+        M::up("ALTER TABLE items ADD COLUMN compressed_size INTEGER NULL"),
+        M::up("ALTER TABLE items ADD COLUMN closed BOOLEAN NOT NULL DEFAULT 1"),
+    ])
+});
 /// Represents an item stored in the database.
 ///
@@ -88,7 +88,9 @@ lazy_static! {
 ///
 /// * `id` - Unique identifier, `None` for new items.
 /// * `ts` - Creation timestamp in UTC.
-/// * `size` - Content size in bytes, `None` if not set.
+/// * `uncompressed_size` - Uncompressed content size in bytes, `None` if not set.
+/// * `compressed_size` - Compressed file size on disk, `None` if not set.
+/// * `closed` - Whether the item has been fully written and closed.
 /// * `compression` - Compression algorithm used.
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct Item {
@@ -96,12 +98,27 @@ pub struct Item {
     pub id: Option<i64>,
     /// Timestamp when the item was created.
     pub ts: DateTime<Utc>,
-    /// Size of the item content in bytes, None if not set.
-    pub size: Option<i64>,
+    /// Uncompressed size of the item content in bytes, None if not set.
+    pub uncompressed_size: Option<i64>,
+    /// Compressed file size on disk in bytes, None if not set.
+    pub compressed_size: Option<i64>,
+    /// Whether the item has been fully written and closed.
+    pub closed: bool,
     /// Compression algorithm used for the item content.
     pub compression: String,
 }
+fn item_from_row(row: &Row) -> Result<Item> {
+    Ok(Item {
+        id: row.get(0)?,
+        ts: row.get(1)?,
+        uncompressed_size: row.get(2)?,
+        compressed_size: row.get(3)?,
+        closed: row.get(4)?,
+        compression: row.get(5)?,
+    })
+}
 /// Represents a tag associated with an item.
 ///
 /// Defines the relationship between items and tags in a many-to-many structure.
@@ -223,7 +240,9 @@ pub fn open(path: PathBuf) -> Result<Connection, Error> {
 /// let item = Item {
 ///     id: None,
 ///     ts: Utc::now(),
-///     size: None,
+///     uncompressed_size: None,
+///     compressed_size: None,
+///     closed: false,
 ///     compression: "lz4".to_string(),
 /// };
 /// let id = db::insert_item(&conn, item)?;
@@ -234,8 +253,8 @@ pub fn open(path: PathBuf) -> Result<Connection, Error> {
 pub fn insert_item(conn: &Connection, item: Item) -> Result<i64> {
     debug!("DB: Inserting item: {item:?}");
     conn.execute(
-        "INSERT INTO items (ts, size, compression) VALUES (?1, ?2, ?3)",
-        params![item.ts, item.size, item.compression],
+        "INSERT INTO items (ts, uncompressed_size, compressed_size, closed, compression) VALUES (?1, ?2, ?3, ?4, ?5)",
+        params![item.ts, item.uncompressed_size, item.compressed_size, item.closed, item.compression],
     )?;
     Ok(conn.last_insert_rowid())
 }
@@ -282,7 +301,9 @@ pub fn create_item(
     let item = Item {
         id: None,
         ts: chrono::Utc::now(),
-        size: None,
+        uncompressed_size: None,
+        compressed_size: None,
+        closed: false,
         compression: compression_type.to_string(),
     };
     let item_id = insert_item(conn, item.clone())?;
@@ -298,7 +319,7 @@ pub fn create_item(
 ///
 /// * `conn` - Database connection.
 /// * `ts` - Timestamp to use for the item.
-/// * `compression` - Compression type string (e.g., "lz4", "gzip", "none").
+/// * `compression` - Compression type string (e.g., "lz4", "gzip", "raw").
 ///
 /// # Returns
 ///
@@ -311,7 +332,9 @@ pub fn insert_item_with_ts(
     let item = Item {
         id: None,
         ts,
-        size: None,
+        uncompressed_size: None,
+        compressed_size: None,
+        closed: false,
         compression: compression.to_string(),
     };
     let item_id = insert_item(conn, item.clone())?;
@@ -352,7 +375,7 @@ pub fn insert_item_with_ts(
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// let item_id = db::insert_item(&conn, item)?;
 /// db::add_tag(&conn, item_id, "important")?;
 /// # Ok(())
@@ -410,7 +433,7 @@ pub fn upsert_tag(conn: &Connection, item_id: i64, tag_name: &str) -> Result<()>
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// let item_id = db::insert_item(&conn, item)?;
 /// db::add_meta(&conn, item_id, "mime_type", "text/plain")?;
 /// # Ok(())
@@ -455,7 +478,7 @@ pub fn add_meta(conn: &Connection, item_id: i64, name: &str, value: &str) -> Res
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: Some(1), size: Some(1024), compression: "lz4".to_string(), ts: Utc::now() };
+/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: Some(1024), compressed_size: Some(512), closed: true, compression: "lz4".to_string() };
 /// db::update_item(&conn, item)?;
 /// # Ok(())
 /// # }
@@ -463,8 +486,8 @@ pub fn add_meta(conn: &Connection, item_id: i64, name: &str, value: &str) -> Res
 pub fn update_item(conn: &Connection, item: Item) -> Result<()> {
     debug!("DB: Updating item: {item:?}");
     conn.execute(
-        "UPDATE items SET size=?2, compression=?3 WHERE id=?1",
-        params![item.id, item.size, item.compression,],
+        "UPDATE items SET uncompressed_size=?2, compressed_size=?3, closed=?4, compression=?5 WHERE id=?1",
+        params![item.id, item.uncompressed_size, item.compressed_size, item.closed, item.compression,],
     )?;
     Ok(())
 }
@@ -499,14 +522,17 @@ pub fn update_item(conn: &Connection, item: Item) -> Result<()> {
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// db::delete_item(&conn, item)?;
 /// # Ok(())
 /// # }
 /// ```
 pub fn delete_item(conn: &Connection, item: Item) -> Result<()> {
     debug!("DB: Deleting item: {item:?}");
-    conn.execute("DELETE FROM items WHERE id=?1", params![item.id])?;
+    let id = item
+        .id
+        .ok_or_else(|| anyhow::anyhow!("Cannot delete item: ID is None"))?;
+    conn.execute("DELETE FROM items WHERE id=?1", params![id])?;
     Ok(())
 }
@@ -583,7 +609,7 @@ pub fn query_delete_meta(conn: &Connection, meta: Meta) -> Result<()> {
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// let item_id = db::insert_item(&conn, item)?;
 /// let meta = Meta { id: item_id, name: "mime_type".to_string(), value: "text/plain".to_string() };
 /// db::query_upsert_meta(&conn, meta)?;
@@ -629,7 +655,7 @@ pub fn query_upsert_meta(conn: &Connection, meta: Meta) -> Result<()> {
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// let item_id = db::insert_item(&conn, item)?;
 /// // Insert new metadata
 /// let meta = Meta { id: item_id, name: "source".to_string(), value: "cli".to_string() };
@@ -680,7 +706,7 @@ pub fn store_meta(conn: &Connection, meta: Meta) -> Result<()> {
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// let item_id = db::insert_item(&conn, item)?;
 /// let tag = Tag { id: item_id, name: "work".to_string() };
 /// db::insert_tag(&conn, tag)?;
@@ -725,7 +751,7 @@ pub fn insert_tag(conn: &Connection, tag: Tag) -> Result<()> {
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// db::delete_item_tags(&conn, item)?;
 /// # Ok(())
 /// # }
@@ -767,9 +793,9 @@ pub fn delete_item_tags(conn: &Connection, item: Item) -> Result<()> {
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// let item_id = db::insert_item(&conn, item)?;
-/// let item = Item { id: Some(item_id), ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: Some(item_id), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// let tags = vec!["project_a".to_string(), "urgent".to_string()];
 /// db::set_item_tags(&conn, item, &tags)?;
 /// # Ok(())
@@ -830,19 +856,13 @@ pub fn set_item_tags(conn: &Connection, item: Item, tags: &Vec<String>) -> Resul
 pub fn query_all_items(conn: &Connection) -> Result<Vec<Item>> {
     debug!("DB: Querying all items");
     let mut statement = conn
-        .prepare("SELECT id, ts, size, compression FROM items ORDER BY id ASC")
+        .prepare("SELECT id, ts, uncompressed_size, compressed_size, closed, compression FROM items ORDER BY id ASC")
         .context("Problem preparing SQL statement")?;
     let mut rows = statement.query(params![])?;
     let mut items = Vec::new();
     while let Some(row) = rows.next()? {
-        let item = Item {
-            id: row.get(0)?,
-            ts: row.get(1)?,
-            size: row.get(2)?,
-            compression: row.get(3)?,
-        };
-        items.push(item);
+        items.push(item_from_row(row)?);
     }
     Ok(items)
@@ -888,7 +908,9 @@ pub fn query_tagged_items<'a>(conn: &'a Connection, tags: &'a Vec<String>) -> Re
         "
         SELECT items.id,
                items.ts,
-               items.size,
+               items.uncompressed_size,
+               items.compressed_size,
+               items.closed,
                items.compression,
                count(tags_match.id) as tags_score
         FROM items,
@@ -911,13 +933,7 @@ pub fn query_tagged_items<'a>(conn: &'a Connection, tags: &'a Vec<String>) -> Re
     let mut items = Vec::new();
     while let Some(row) = rows.next()? {
-        let item = Item {
-            id: row.get(0)?,
-            ts: row.get(1)?,
-            size: row.get(2)?,
-            compression: row.get(3)?,
-        };
-        items.push(item);
+        items.push(item_from_row(row)?);
     }
     Ok(items)
@@ -1106,7 +1122,7 @@ pub fn get_item_matching(
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// let item_id = db::insert_item(&conn, item)?;
 /// let item = db::get_item(&conn, item_id)?;
 /// assert!(item.is_some());
@@ -1118,7 +1134,7 @@ pub fn get_item(conn: &Connection, item_id: i64) -> Result<Option<Item>> {
     let mut statement = conn
         .prepare_cached(
             "
-            SELECT id, ts, size, compression
+            SELECT id, ts, uncompressed_size, compressed_size, closed, compression
             FROM items
             WHERE items.id = ?1",
         )
@@ -1130,8 +1146,10 @@ pub fn get_item(conn: &Connection, item_id: i64) -> Result<Option<Item>> {
         Some(row) => Ok(Some(Item {
             id: row.get(0)?,
             ts: row.get(1)?,
-            size: row.get(2)?,
-            compression: row.get(3)?,
+            uncompressed_size: row.get(2)?,
+            compressed_size: row.get(3)?,
+            closed: row.get(4)?,
+            compression: row.get(5)?,
         })),
         None => Ok(None),
     }
@@ -1173,7 +1191,7 @@ pub fn get_item_last(conn: &Connection) -> Result<Option<Item>> {
     let mut statement = conn
         .prepare_cached(
             "
-            SELECT id, ts, size, compression
+            SELECT id, ts, uncompressed_size, compressed_size, closed, compression
             FROM items
             ORDER BY id DESC
             LIMIT 1",
@@ -1186,8 +1204,10 @@ pub fn get_item_last(conn: &Connection) -> Result<Option<Item>> {
         Some(row) => Ok(Some(Item {
             id: row.get(0)?,
             ts: row.get(1)?,
-            size: row.get(2)?,
-            compression: row.get(3)?,
+            uncompressed_size: row.get(2)?,
+            compressed_size: row.get(3)?,
+            closed: row.get(4)?,
+            compression: row.get(5)?,
         })),
         None => Ok(None),
     }
@@ -1222,7 +1242,7 @@ pub fn get_item_last(conn: &Connection) -> Result<Option<Item>> {
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// let tags = db::get_item_tags(&conn, &item)?;
 /// # Ok(())
 /// # }
@@ -1275,7 +1295,7 @@ pub fn get_item_tags(conn: &Connection, item: &Item) -> Result<Vec<Tag>> {
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
+/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
 /// let meta = db::get_item_meta(&conn, &item)?;
 /// # Ok(())
 /// # }
@@ -1330,12 +1350,12 @@ pub fn get_item_meta(conn: &Connection, item: &Item) -> Result<Vec<Meta>> {
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
-/// let meta = db::get_item_meta_name(&conn, &item, "mime_type".to_string())?;
+/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
+/// let meta = db::get_item_meta_name(&conn, &item, "mime_type")?;
 /// # Ok(())
 /// # }
 /// ```
-pub fn get_item_meta_name(conn: &Connection, item: &Item, name: String) -> Result<Option<Meta>> {
+pub fn get_item_meta_name(conn: &Connection, item: &Item, name: &str) -> Result<Option<Meta>> {
     debug!("DB: Getting item meta name: {item:?} {name:?}");
     let mut statement = conn
         .prepare_cached("SELECT id, name, value FROM metas WHERE id=?1 AND name=?2")
@@ -1382,12 +1402,12 @@ pub fn get_item_meta_name(conn: &Connection, item: &Item, name: String) -> Resul
 /// let _tmp = tempfile::tempdir()?;
 /// let db_path = _tmp.path().join("keep.db");
 /// let conn = db::open(db_path)?;
-/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
-/// let value = db::get_item_meta_value(&conn, &item, "source".to_string())?;
+/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
+/// let value = db::get_item_meta_value(&conn, &item, "source")?;
 /// # Ok(())
 /// # }
 /// ```
-pub fn get_item_meta_value(conn: &Connection, item: &Item, name: String) -> Result<Option<String>> {
+pub fn get_item_meta_value(conn: &Connection, item: &Item, name: &str) -> Result<Option<String>> {
     debug!("DB: Getting item meta value: {item:?} {name:?}");
     let mut statement = conn
         .prepare_cached("SELECT value FROM metas WHERE id=?1 AND name=?2")
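The `delete_item` change above turns a `None` ID into a hard error instead of issuing a `DELETE ... WHERE id=NULL` that silently matches nothing. The pattern can be sketched in plain Rust (hypothetical standalone helper, not part of the codebase, which uses `anyhow` instead of `String` errors):

```rust
// Promote an optional row ID to a hard error before running a statement
// that would otherwise match zero rows and succeed silently.
fn require_id(id: Option<i64>) -> Result<i64, String> {
    id.ok_or_else(|| "Cannot delete item: ID is None".to_string())
}

fn main() {
    // A saved item has an ID and passes through unchanged.
    assert_eq!(require_id(Some(7)), Ok(7));
    // A never-inserted item (id: None) is rejected up front.
    assert!(require_id(None).is_err());
}
```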

src/export_tar.rs Normal file

@@ -0,0 +1,167 @@
use anyhow::{Context, Result, anyhow};
use log::debug;
use std::collections::HashSet;
use std::fs;
use std::io::{Read, Seek, Write};
use std::path::Path;
use tar::{Builder, Header};
use crate::filter_plugin::FilterChain;
use crate::modes::common::ExportMeta;
use crate::services::item_service::ItemService;
use crate::services::types::ItemWithMeta;
/// Compute the intersection of all items' tag sets.
///
/// Returns sorted tags that are present on ALL items.
pub fn common_tags(items: &[ItemWithMeta]) -> Vec<String> {
if items.is_empty() {
return Vec::new();
}
let mut common: HashSet<String> = items[0].tag_names().into_iter().collect();
for item in items.iter().skip(1) {
let item_tags: HashSet<String> = item.tag_names().into_iter().collect();
common = common.intersection(&item_tags).cloned().collect();
}
let mut result: Vec<String> = common.into_iter().collect();
result.sort();
result
}
/// Resolve the export name from the CLI arg or compute default from common tags.
///
/// If `arg` is Some, uses that value directly.
/// Otherwise, computes `export_<common-tags>` or just `export` if no common tags.
pub fn export_name(arg: &Option<String>, items: &[ItemWithMeta]) -> String {
if let Some(name) = arg {
return name.clone();
}
let tags = common_tags(items);
if tags.is_empty() {
"export".to_string()
} else {
format!("export_{}", tags.join("_"))
}
}
/// Write items to a tar archive, streaming data without loading files into memory.
///
/// The archive contains `<dir_name>/<id>.data.<compression>` and
/// `<dir_name>/<id>.meta.yml` for each item.
///
/// # Arguments
///
/// * `writer` - The output writer (e.g., a File).
/// * `dir_name` - Top-level directory name inside the tar.
/// * `items` - Items to export.
/// * `data_path` - Path to the data storage directory.
/// * `filter_chain` - Optional filter chain for transforming content on export.
/// * `item_service` - Item service for streaming content.
/// * `conn` - Database connection for filter chain operations.
pub fn write_export_tar<W: Write>(
writer: W,
dir_name: &str,
items: &[ItemWithMeta],
data_path: &Path,
filter_chain: Option<&FilterChain>,
item_service: &ItemService,
conn: &rusqlite::Connection,
) -> Result<()> {
let mut builder = Builder::new(writer);
for item_with_meta in items {
let item_id = item_with_meta.item.id.context("Item missing ID")?;
let compression = &item_with_meta.item.compression;
let item_tags = item_with_meta.tag_names();
let meta_map = item_with_meta.meta_as_map();
let data_path_entry = format!("{dir_name}/{item_id}.data.{compression}");
let meta_path_entry = format!("{dir_name}/{item_id}.meta.yml");
// Meta entry (small, in-memory is fine)
let export_meta = ExportMeta {
ts: item_with_meta.item.ts,
compression: compression.clone(),
uncompressed_size: item_with_meta.item.uncompressed_size,
tags: item_tags,
metadata: meta_map,
};
let meta_yaml = serde_yaml::to_string(&export_meta)?;
let meta_bytes = meta_yaml.into_bytes();
let meta_len = meta_bytes.len() as u64;
let mut meta_header = Header::new_gnu();
meta_header.set_size(meta_len);
meta_header.set_mode(0o644);
meta_header.set_path(&meta_path_entry)?;
meta_header.set_cksum();
builder
.append(&meta_header, meta_bytes.as_slice())
.with_context(|| format!("Cannot write meta entry for item {item_id}"))?;
debug!("EXPORT_TAR: Wrote meta entry {meta_path_entry}");
// Data entry
let mut item_file_path = data_path.to_path_buf();
item_file_path.push(item_id.to_string());
if let Some(chain) = filter_chain {
// Filtered export: spool through filter chain to a temp file,
// then stream the temp file into the tar with known size.
let (mut reader, _, _) = item_service.get_item_content_info_streaming_with_chain(
conn,
item_id,
Some(chain),
)?;
let mut tmp = tempfile::NamedTempFile::new()
.context("Cannot create temp file for filtered export")?;
let mut buf = [0u8; crate::common::PIPESIZE];
loop {
let n = reader.read(&mut buf)?;
if n == 0 {
break;
}
tmp.write_all(&buf[..n])?;
}
tmp.flush()?;
let total_size = tmp.as_file().metadata()?.len();
tmp.rewind()?;
let mut data_header = Header::new_gnu();
data_header.set_size(total_size);
data_header.set_mode(0o644);
data_header.set_path(&data_path_entry)?;
data_header.set_cksum();
builder
.append(&data_header, &mut tmp)
.with_context(|| format!("Cannot write data entry for item {item_id}"))?;
debug!("EXPORT_TAR: Wrote filtered data entry {data_path_entry} ({total_size} bytes)");
} else {
// Unfiltered export: stream raw compressed file
let file = fs::File::open(&item_file_path)
.with_context(|| format!("Cannot open data file: {}", item_file_path.display()))?;
let file_size = file.metadata()?.len();
let mut data_header = Header::new_gnu();
data_header.set_size(file_size);
data_header.set_mode(0o644);
data_header.set_path(&data_path_entry)?;
data_header.set_cksum();
builder
.append(&data_header, file)
.with_context(|| format!("Cannot write data entry for item {item_id}"))?;
debug!("EXPORT_TAR: Wrote data entry {data_path_entry} ({file_size} bytes)");
}
}
builder.finish().context("Cannot finalize tar archive")?;
debug!("EXPORT_TAR: Archive finalized");
Ok(())
}
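The tag-intersection and naming logic above can be exercised in isolation. This is a simplified mirror over plain `Vec<String>` tag lists (the real functions take `&[ItemWithMeta]`), showing the intersection and the `export_<tags>` fallback:

```rust
use std::collections::HashSet;

// Sorted tags present on ALL items (here: plain tag lists stand in for items).
fn common_tags(tag_sets: &[Vec<String>]) -> Vec<String> {
    let mut iter = tag_sets.iter();
    let Some(first) = iter.next() else { return Vec::new() };
    let mut common: HashSet<String> = first.iter().cloned().collect();
    for tags in iter {
        let set: HashSet<String> = tags.iter().cloned().collect();
        common = common.intersection(&set).cloned().collect();
    }
    let mut result: Vec<String> = common.into_iter().collect();
    result.sort();
    result
}

// Explicit name wins; otherwise "export_<common-tags>" or plain "export".
fn export_name(arg: Option<&str>, tag_sets: &[Vec<String>]) -> String {
    if let Some(name) = arg {
        return name.to_string();
    }
    let tags = common_tags(tag_sets);
    if tags.is_empty() {
        "export".to_string()
    } else {
        format!("export_{}", tags.join("_"))
    }
}

fn main() {
    let sets = vec![
        vec!["work".to_string(), "urgent".to_string()],
        vec!["urgent".to_string(), "work".to_string(), "draft".to_string()],
    ];
    // "draft" is only on one item, so the intersection is {urgent, work}, sorted.
    assert_eq!(common_tags(&sets), vec!["urgent".to_string(), "work".to_string()]);
    assert_eq!(export_name(None, &sets), "export_urgent_work");
    assert_eq!(export_name(Some("custom"), &sets), "custom");
}
```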


@@ -164,13 +164,6 @@ impl FilterPlugin for ExecFilter {
         Ok(())
     }
-    /// Clones this filter into a new boxed instance.
-    ///
-    /// Creates a new instance without active process handles.
-    ///
-    /// # Returns
-    ///
-    /// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
     fn clone_box(&self) -> Box<dyn FilterPlugin> {
         Box::new(ExecFilter {
             program: self.program.clone(),


@@ -87,21 +87,6 @@ impl FilterPlugin for GrepFilter {
         Ok(())
     }
-    /// Clones this filter into a new boxed instance.
-    ///
-    /// Creates a new GrepFilter with the same regex pattern.
-    ///
-    /// # Returns
-    ///
-    /// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
-    ///
-    /// # Examples
-    ///
-    /// ```
-    /// # use keep::filter_plugin::{FilterPlugin, GrepFilter};
-    /// let filter = GrepFilter::new("test".to_string()).unwrap();
-    /// let cloned = filter.clone_box();
-    /// ```
     fn clone_box(&self) -> Box<dyn FilterPlugin> {
         Box::new(Self {
             regex: self.regex.clone(),
@@ -126,11 +111,7 @@ impl FilterPlugin for GrepFilter {
     /// assert!(opts[0].required);
     /// ```
     fn options(&self) -> Vec<FilterOption> {
-        vec![FilterOption {
-            name: "pattern".to_string(),
-            default: None,
-            required: true,
-        }]
+        crate::filter_plugin::pattern_option()
     }
     fn description(&self) -> &str {


@@ -3,14 +3,7 @@ use crate::common::PIPESIZE;
 use crate::services::filter_service::register_filter_plugin;
 use std::io::{BufRead, Read, Result, Write};
-/// A filter that reads the first N bytes from the input stream.
-///
-/// Limits the output to the initial bytes specified in the configuration.
-///
-/// Useful for previewing file contents without reading everything.
-///
-/// # Fields
-///
-/// * `remaining` - Number of bytes left to read before stopping.
+#[derive(Clone)]
 pub struct HeadBytesFilter {
     remaining: usize,
 }
@@ -94,21 +87,6 @@ impl FilterPlugin for HeadBytesFilter {
         Ok(())
     }
-    /// Clones this filter into a new boxed instance.
-    ///
-    /// Creates an independent copy with the same configuration.
-    ///
-    /// # Returns
-    ///
-    /// A new `Box<dyn FilterPlugin>` clone.
-    ///
-    /// # Examples
-    ///
-    /// ```
-    /// # use keep::filter_plugin::{FilterPlugin, HeadBytesFilter};
-    /// let filter = HeadBytesFilter::new(100);
-    /// let cloned = filter.clone_box();
-    /// ```
     fn clone_box(&self) -> Box<dyn FilterPlugin> {
         Box::new(Self {
             remaining: self.remaining,
@@ -134,11 +112,7 @@ impl FilterPlugin for HeadBytesFilter {
     /// assert!(opts[0].required);
     /// ```
     fn options(&self) -> Vec<FilterOption> {
-        vec![FilterOption {
-            name: "count".to_string(),
-            default: None,
-            required: true,
-        }]
+        crate::filter_plugin::count_option()
     }
     fn description(&self) -> &str {
@@ -146,7 +120,7 @@ impl FilterPlugin for HeadBytesFilter {
     }
 }
-/// A filter that reads the first N lines from the input stream.
+#[derive(Clone)]
 pub struct HeadLinesFilter {
     remaining: usize,
 }
@@ -228,21 +202,6 @@ impl FilterPlugin for HeadLinesFilter {
         Ok(())
     }
-    /// Clones this filter into a new boxed instance.
-    ///
-    /// Creates an independent copy with the same configuration.
-    ///
-    /// # Returns
-    ///
-    /// A new `Box<dyn FilterPlugin>` clone.
-    ///
-    /// # Examples
-    ///
-    /// ```
-    /// # use keep::filter_plugin::{FilterPlugin, HeadLinesFilter};
-    /// let filter = HeadLinesFilter::new(5);
-    /// let cloned = filter.clone_box();
-    /// ```
     fn clone_box(&self) -> Box<dyn FilterPlugin> {
         Box::new(Self {
             remaining: self.remaining,
@@ -250,29 +209,8 @@ impl FilterPlugin for HeadLinesFilter {
     }
     /// Returns the configuration options for this filter.
-    ///
-    /// Defines the "count" parameter as required with no default.
-    ///
-    /// # Returns
-    ///
-    /// Vector of `FilterOption` describing parameters.
-    ///
-    /// # Examples
-    ///
-    /// ```
-    /// # use keep::filter_plugin::{FilterPlugin, HeadLinesFilter};
-    /// let filter = HeadLinesFilter::new(5);
-    /// let opts = filter.options();
-    /// assert_eq!(opts.len(), 1);
-    /// assert_eq!(opts[0].name, "count");
-    /// assert!(opts[0].required);
-    /// ```
     fn options(&self) -> Vec<FilterOption> {
-        vec![FilterOption {
-            name: "count".to_string(),
-            default: None,
-            required: true,
-        }]
+        crate::filter_plugin::count_option()
     }
     fn description(&self) -> &str {


@@ -2,6 +2,7 @@ use std::io::{Read, Result, Write};
 use std::str::FromStr;
 use strum::EnumString;
+#[cfg(feature = "filter_grep")]
 pub mod grep;
 /// Filter plugin module for processing input streams.
 ///
@@ -16,7 +17,7 @@ pub mod grep;
 /// ```
 /// # use std::io::{Read, Write};
 /// # use keep::filter_plugin::parse_filter_string;
-/// let mut chain = parse_filter_string("head_lines(10)|grep(pattern=error)")?;
+/// let mut chain = parse_filter_string("head_lines(10)|tail_lines(5)")?;
 /// # let mut reader: &mut dyn Read = &mut std::io::empty();
 /// # let mut writer: Vec<u8> = Vec::new();
 /// # chain.filter(&mut reader, &mut writer)?;
@@ -26,12 +27,13 @@ pub mod head;
 pub mod skip;
 pub mod strip_ansi;
 pub mod tail;
-#[cfg(feature = "tokens")]
+#[cfg(feature = "meta_tokens")]
 pub mod tokens;
 pub mod utils;
 use std::collections::HashMap;
+#[cfg(feature = "filter_grep")]
 pub use grep::GrepFilter;
 pub use head::{HeadBytesFilter, HeadLinesFilter};
 pub use skip::{SkipBytesFilter, SkipLinesFilter};
@@ -108,18 +110,16 @@ pub trait FilterPlugin: Send {
 /// struct MyFilter;
 /// impl FilterPlugin for MyFilter {
 ///     fn filter(&mut self, reader: &mut dyn Read, writer: &mut dyn Write) -> Result<()> {
-///         // Read and filter data
 ///         let mut buf = [0; 1024];
 ///         loop {
 ///             let n = reader.read(&mut buf)?;
 ///             if n == 0 { break; }
-///             // Apply filter logic to buf[0..n]
 ///             writer.write_all(&buf[0..n])?;
 ///         }
 ///         Ok(())
 ///     }
 ///     fn clone_box(&self) -> Box<dyn FilterPlugin> {
-///         Box::new(MyFilter)
+///         Box::new(Self)
 ///     }
 ///     fn options(&self) -> Vec<FilterOption> {
 ///         vec![]
@@ -131,22 +131,6 @@ pub trait FilterPlugin: Send {
     Ok(())
 }
-/// Clones this plugin into a new boxed instance.
-///
-/// This method is required for dynamic dispatch and cloning in filter chains.
-///
-/// # Returns
-///
-/// A new `Box<dyn FilterPlugin>` clone of the current plugin.
-///
-/// # Examples
-///
-/// ```
-/// # use keep::filter_plugin::FilterPlugin;
-/// fn example_clone_box(filter: &dyn FilterPlugin) -> Box<dyn FilterPlugin> {
-///     filter.clone_box()
-/// }
-/// ```
 fn clone_box(&self) -> Box<dyn FilterPlugin>;
 /// Returns the configuration options for this plugin.
@@ -183,6 +167,22 @@ pub trait FilterPlugin: Send {
 }
 }
+pub fn count_option() -> Vec<FilterOption> {
+    vec![FilterOption {
+        name: "count".to_string(),
+        default: None,
+        required: true,
+    }]
+}
+pub fn pattern_option() -> Vec<FilterOption> {
+    vec![FilterOption {
+        name: "pattern".to_string(),
+        default: None,
+        required: true,
+    }]
+}
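The shared option helpers above replace identical `vec![FilterOption { .. }]` blocks that each count-taking filter used to build inline. A minimal standalone sketch of the pattern (the `FilterOption` struct here is simplified; the real type lives in keep's `filter_plugin` module):

```rust
// Simplified stand-in for keep's FilterOption type.
#[derive(Debug, Clone)]
pub struct FilterOption {
    pub name: String,
    pub default: Option<String>,
    pub required: bool,
}

// One helper serves every filter that takes a required "count" parameter,
// so the definition lives in a single place.
pub fn count_option() -> Vec<FilterOption> {
    vec![FilterOption {
        name: "count".to_string(),
        default: None,
        required: true,
    }]
}

fn main() {
    let opts = count_option();
    assert_eq!(opts.len(), 1);
    assert_eq!(opts[0].name, "count");
    assert!(opts[0].required);
}
```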
 /// Enum representing the different types of filters.
 ///
 /// Used for parsing and instantiating specific filter plugins.
@@ -201,13 +201,14 @@ pub enum FilterType {
     TailLines,
     SkipBytes,
     SkipLines,
+    #[cfg(feature = "filter_grep")]
     Grep,
     StripAnsi,
-    #[cfg(feature = "tokens")]
+    #[cfg(feature = "meta_tokens")]
     HeadTokens,
-    #[cfg(feature = "tokens")]
+    #[cfg(feature = "meta_tokens")]
     SkipTokens,
-    #[cfg(feature = "tokens")]
+    #[cfg(feature = "meta_tokens")]
     TailTokens,
 }
@@ -215,6 +216,44 @@ pub enum FilterType {
 /// Prevents OOM on large files by rejecting inputs that exceed this limit.
 const MAX_FILTER_BUFFER_SIZE: usize = 256 * 1024 * 1024;
+struct BoundedVecWriter {
+    data: Vec<u8>,
+    limit: usize,
+}
+impl BoundedVecWriter {
+    fn new(limit: usize) -> Self {
+        Self {
+            data: Vec::new(),
+            limit,
+        }
+    }
+    fn into_inner(self) -> Vec<u8> {
+        self.data
+    }
+}
+impl std::io::Write for BoundedVecWriter {
+    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
+        if self.data.len() + buf.len() > self.limit {
+            return Err(std::io::Error::new(
+                std::io::ErrorKind::InvalidData,
+                format!(
+                    "Input size exceeds maximum filter buffer size ({} bytes)",
+                    MAX_FILTER_BUFFER_SIZE
+                ),
+            ));
+        }
+        self.data.write_all(buf)?;
+        Ok(buf.len())
+    }
+    fn flush(&mut self) -> std::io::Result<()> {
+        Ok(())
+    }
+}
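The `BoundedVecWriter` above enforces the size limit during the copy rather than after it, so an oversized input fails early instead of first being buffered whole. A standalone sketch of the same pattern with a tiny limit (names simplified, not keep's actual module layout):

```rust
use std::io::Write;

// Vec-backed writer that errors instead of growing past a fixed limit,
// so a filter chain cannot buffer an unbounded intermediate result.
struct BoundedVecWriter {
    data: Vec<u8>,
    limit: usize,
}

impl Write for BoundedVecWriter {
    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
        // Reject the write before copying if it would exceed the cap.
        if self.data.len() + buf.len() > self.limit {
            return Err(std::io::Error::new(
                std::io::ErrorKind::InvalidData,
                format!("input exceeds {} byte buffer limit", self.limit),
            ));
        }
        self.data.extend_from_slice(buf);
        Ok(buf.len())
    }
    fn flush(&mut self) -> std::io::Result<()> {
        Ok(())
    }
}

fn main() {
    let mut w = BoundedVecWriter { data: Vec::new(), limit: 8 };
    assert!(w.write(b"12345678").is_ok()); // exactly at the limit: accepted
    assert!(w.write(b"9").is_err());       // one byte over: rejected
}
```

Because `std::io::copy` propagates the first write error, feeding this writer to `copy` turns a too-large stream into an `InvalidData` error rather than an allocation blow-up.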
 /// A chain of filter plugins applied sequentially.
 ///
 /// Chains multiple filters, applying them in order to the input stream.
@@ -262,16 +301,27 @@ impl Clone for FilterChain {
 }
 impl Clone for Box<dyn FilterPlugin> {
-    /// Clones the boxed filter plugin.
-    ///
-    /// # Returns
-    ///
-    /// A new boxed clone of the filter plugin.
     fn clone(&self) -> Self {
         self.clone_box()
     }
 }
+#[macro_export]
+macro_rules! filter_clone_box {
+    ($self:expr) => {
+        Box::new($self.clone())
+    };
+    ($self:expr, $field:ident) => {
+        Box::new(Self { $field: $self.$field.clone() })
+    };
+    ($self:expr, $field:ident, $($rest:ident),+) => {
+        Box::new(Self {
+            $field: $self.$field.clone(),
+            $($rest: $self.$rest.clone()),+
+        })
+    };
+}
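The `filter_clone_box!` macro above condenses the repetitive `clone_box` bodies: each arm expands to `Box::new(Self { .. })` with the named fields cloned (the no-field arm requires `Clone` on the type). A reduced standalone demonstration of the field-list form, with a toy trait standing in for `FilterPlugin`:

```rust
// Toy stand-in for keep's FilterPlugin trait.
trait Plugin {
    fn clone_box(&self) -> Box<dyn Plugin>;
    fn remaining(&self) -> usize;
}

// Reduced version of the macro: expands to Box::new(Self { .. })
// cloning each listed field from $self.
macro_rules! filter_clone_box {
    ($self:expr, $($field:ident),+) => {
        Box::new(Self { $($field: $self.$field.clone()),+ })
    };
}

struct Head {
    remaining: usize,
}

impl Plugin for Head {
    fn clone_box(&self) -> Box<dyn Plugin> {
        // One line instead of a hand-written struct literal per filter.
        filter_clone_box!(self, remaining)
    }
    fn remaining(&self) -> usize {
        self.remaining
    }
}

fn main() {
    let h = Head { remaining: 5 };
    let boxed = h.clone_box();
    assert_eq!(boxed.remaining(), 5);
}
```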
 impl Default for FilterChain {
     fn default() -> Self {
         Self::new()
@@ -309,9 +359,8 @@ impl FilterChain {
 /// # Examples
 ///
 /// ```
-/// # use keep::filter_plugin::{FilterChain, GrepFilter};
+/// # use keep::filter_plugin::FilterChain;
 /// let mut chain = FilterChain::new();
-/// chain.add_plugin(Box::new(GrepFilter::new("error".to_string()).unwrap()));
 /// ```
 pub fn add_plugin(&mut self, plugin: Box<dyn FilterPlugin>) {
     self.plugins.push(plugin);
@@ -351,21 +400,10 @@ impl FilterChain {
 }
 // For multiple plugins, we need to chain them together
-// We'll use a temporary buffer to hold intermediate results
-let mut current_data = Vec::new();
-std::io::copy(reader, &mut current_data)?;
+// We'll use a bounded buffer to hold intermediate results
+let mut bounded_writer = BoundedVecWriter::new(MAX_FILTER_BUFFER_SIZE);
+std::io::copy(reader, &mut bounded_writer)?;
+let mut current_data = bounded_writer.into_inner();
-if current_data.len() > MAX_FILTER_BUFFER_SIZE {
-    return Err(std::io::Error::new(
-        std::io::ErrorKind::InvalidData,
-        format!(
-            "Input size ({} bytes) exceeds maximum filter buffer size ({} bytes). \
-            Consider using fewer filter plugins or smaller inputs.",
-            current_data.len(),
-            MAX_FILTER_BUFFER_SIZE
-        ),
-    ));
-}
 // Store the plugins length to avoid borrowing issues
 let plugins_len = self.plugins.len();
@@ -499,6 +537,7 @@ fn create_filter_with_options(
 // Get the default options for this filter type by creating a temporary instance
 // To do this, we need to create a default instance of the appropriate filter
 let option_defs = match filter_type {
+    #[cfg(feature = "filter_grep")]
     FilterType::Grep => grep::GrepFilter::new("".to_string())?.options(),
     FilterType::HeadBytes => head::HeadBytesFilter::new(0).options(),
     FilterType::HeadLines => head::HeadLinesFilter::new(0).options(),
@@ -507,11 +546,11 @@ fn create_filter_with_options(
     FilterType::SkipBytes => skip::SkipBytesFilter::new(0).options(),
     FilterType::SkipLines => skip::SkipLinesFilter::new(0).options(),
     FilterType::StripAnsi => strip_ansi::StripAnsiFilter::new().options(),
-    #[cfg(feature = "tokens")]
+    #[cfg(feature = "meta_tokens")]
     FilterType::HeadTokens => tokens::HeadTokensFilter::new(0).options(),
-    #[cfg(feature = "tokens")]
+    #[cfg(feature = "meta_tokens")]
     FilterType::SkipTokens => tokens::SkipTokensFilter::new(0).options(),
-    #[cfg(feature = "tokens")]
+    #[cfg(feature = "meta_tokens")]
     FilterType::TailTokens => tokens::TailTokensFilter::new(0).options(),
 };
@@ -581,6 +620,7 @@ fn create_specific_filter(
     options: &HashMap<String, serde_json::Value>,
 ) -> Result<Box<dyn FilterPlugin>> {
     match filter_type {
+        #[cfg(feature = "filter_grep")]
         FilterType::Grep => {
             let pattern = options
                 .get("pattern")
@@ -681,7 +721,7 @@ fn create_specific_filter(
 }
 Ok(Box::new(strip_ansi::StripAnsiFilter::new()))
 }
-#[cfg(feature = "tokens")]
+#[cfg(feature = "meta_tokens")]
 FilterType::HeadTokens => {
     let count = options
         .get("count")
@@ -693,17 +733,13 @@ fn create_specific_filter(
         "head_tokens filter requires 'count' parameter",
     )
 })?;
-let encoding = options
-    .get("encoding")
-    .and_then(|v| v.as_str())
-    .and_then(|s| s.parse::<crate::tokenizer::TokenEncoding>().ok())
-    .unwrap_or_default();
+let (encoding, tokenizer) = parse_encoding_option(options);
 let mut f = tokens::HeadTokensFilter::new(count);
-f.tokenizer = crate::tokenizer::get_tokenizer(encoding).clone();
+f.tokenizer = tokenizer;
 f.encoding = encoding;
 Ok(Box::new(f))
 }
-#[cfg(feature = "tokens")]
+#[cfg(feature = "meta_tokens")]
 FilterType::SkipTokens => {
     let count = options
         .get("count")
@@ -715,17 +751,13 @@ fn create_specific_filter(
         "skip_tokens filter requires 'count' parameter",
     )
 })?;
-let encoding = options
-    .get("encoding")
-    .and_then(|v| v.as_str())
-    .and_then(|s| s.parse::<crate::tokenizer::TokenEncoding>().ok())
-    .unwrap_or_default();
+let (encoding, tokenizer) = parse_encoding_option(options);
 let mut f = tokens::SkipTokensFilter::new(count);
-f.tokenizer = crate::tokenizer::get_tokenizer(encoding).clone();
+f.tokenizer = tokenizer;
 f.encoding = encoding;
 Ok(Box::new(f))
 }
-#[cfg(feature = "tokens")]
+#[cfg(feature = "meta_tokens")]
 FilterType::TailTokens => {
     let count = options
         .get("count")
@@ -737,19 +769,28 @@ fn create_specific_filter(
         "tail_tokens filter requires 'count' parameter",
    )
 })?;
-let encoding = options
-    .get("encoding")
-    .and_then(|v| v.as_str())
-    .and_then(|s| s.parse::<crate::tokenizer::TokenEncoding>().ok())
-    .unwrap_or_default();
+let (encoding, tokenizer) = parse_encoding_option(options);
 let mut f = tokens::TailTokensFilter::new(count);
-f.tokenizer = crate::tokenizer::get_tokenizer(encoding).clone();
+f.tokenizer = tokenizer;
 f.encoding = encoding;
 Ok(Box::new(f))
 }
 }
 }
+#[cfg(feature = "meta_tokens")]
+fn parse_encoding_option(
+    options: &std::collections::HashMap<String, serde_json::Value>,
+) -> (crate::tokenizer::TokenEncoding, crate::tokenizer::Tokenizer) {
+    let encoding = options
+        .get("encoding")
+        .and_then(|v| v.as_str())
+        .and_then(|s| s.parse::<crate::tokenizer::TokenEncoding>().ok())
+        .unwrap_or_default();
+    let tokenizer = crate::tokenizer::get_tokenizer(encoding).clone();
+    (encoding, tokenizer)
+}
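The `parse_encoding_option` helper factors out the same "optional string option → parsed enum → default fallback" chain from three match arms. A standalone sketch of that shape — the `TokenEncoding` variants here are invented stand-ins, and a plain `HashMap<String, String>` replaces the `serde_json::Value` map to keep the example dependency-free:

```rust
use std::collections::HashMap;
use std::str::FromStr;

// Invented stand-in for keep's tokenizer::TokenEncoding.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
enum TokenEncoding {
    #[default]
    Cl100k,
    O200k,
}

impl FromStr for TokenEncoding {
    type Err = ();
    fn from_str(s: &str) -> Result<Self, ()> {
        match s {
            "cl100k" => Ok(Self::Cl100k),
            "o200k" => Ok(Self::O200k),
            _ => Err(()),
        }
    }
}

// Missing key, or a value that fails to parse, both fall back to the default.
fn parse_encoding(options: &HashMap<String, String>) -> TokenEncoding {
    options
        .get("encoding")
        .and_then(|s| s.parse().ok())
        .unwrap_or_default()
}

fn main() {
    let mut opts = HashMap::new();
    assert_eq!(parse_encoding(&opts), TokenEncoding::Cl100k); // absent -> default
    opts.insert("encoding".into(), "o200k".into());
    assert_eq!(parse_encoding(&opts), TokenEncoding::O200k);
    opts.insert("encoding".into(), "bogus".into());
    assert_eq!(parse_encoding(&opts), TokenEncoding::Cl100k); // unparseable -> default
}
```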
 /// Parses an option value from a string into a JSON value.
 ///
 /// # Arguments


@@ -4,6 +4,7 @@ use crate::services::filter_service::register_filter_plugin;
 use std::io::{BufRead, Read, Result, Write};
 /// A filter that skips the first N bytes from the input stream.
+#[derive(Clone)]
 pub struct SkipBytesFilter {
     remaining: usize,
 }
@@ -49,11 +50,6 @@ impl FilterPlugin for SkipBytesFilter {
     Ok(())
 }
-/// Clones this filter into a new boxed instance.
-///
-/// # Returns
-///
-/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
 fn clone_box(&self) -> Box<dyn FilterPlugin> {
     Box::new(Self {
         remaining: self.remaining,
@@ -61,16 +57,8 @@ impl FilterPlugin for SkipBytesFilter {
 }
 /// Returns the configuration options for this filter.
-///
-/// # Returns
-///
-/// A vector of `FilterOption` describing the filter's configurable parameters.
 fn options(&self) -> Vec<FilterOption> {
-    vec![FilterOption {
-        name: "count".to_string(),
-        default: None,
-        required: true,
-    }]
+    crate::filter_plugin::count_option()
 }
 fn description(&self) -> &str {
@@ -79,6 +67,7 @@ impl FilterPlugin for SkipBytesFilter {
 }
 /// A filter that skips the first N lines from the input stream.
+#[derive(Clone)]
 pub struct SkipLinesFilter {
     remaining: usize,
 }
@@ -118,11 +107,6 @@ impl FilterPlugin for SkipLinesFilter {
     Ok(())
 }
-/// Clones this filter into a new boxed instance.
-///
-/// # Returns
-///
-/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
 fn clone_box(&self) -> Box<dyn FilterPlugin> {
     Box::new(Self {
         remaining: self.remaining,
@@ -130,16 +114,8 @@ impl FilterPlugin for SkipLinesFilter {
 }
 /// Returns the configuration options for this filter.
-///
-/// # Returns
-///
-/// A vector of `FilterOption` describing the filter's configurable parameters.
 fn options(&self) -> Vec<FilterOption> {
-    vec![FilterOption {
-        name: "count".to_string(),
-        default: None,
-        required: true,
-    }]
+    crate::filter_plugin::count_option()
 }
 fn description(&self) -> &str {


@@ -7,7 +7,7 @@ use strip_ansi_escapes::Writer;
 /// # Fields
 ///
 /// None, stateless filter.
-#[derive(Default)]
+#[derive(Default, Clone)]
 pub struct StripAnsiFilter;
 impl StripAnsiFilter {
@@ -39,22 +39,12 @@ impl FilterPlugin for StripAnsiFilter {
     Ok(())
 }
-/// Clones this filter into a new boxed instance.
-///
-/// # Returns
-///
-/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
 fn clone_box(&self) -> Box<dyn FilterPlugin> {
     Box::new(Self)
 }
-/// Returns the configuration options for this filter (none required).
-///
-/// # Returns
-///
-/// An empty vector since this filter has no configurable options.
 fn options(&self) -> Vec<FilterOption> {
-    Vec::new() // strip_ansi doesn't take any options
+    Vec::new()
 }
 fn description(&self) -> &str {


@@ -4,7 +4,7 @@ use crate::services::filter_service::register_filter_plugin;
 use std::collections::VecDeque;
 use std::io::{BufRead, Read, Result, Write};
-/// A filter that reads the last N bytes from the input stream.
+#[derive(Clone)]
 pub struct TailBytesFilter {
     buffer: VecDeque<u8>,
     count: usize,
@@ -58,11 +58,6 @@ impl FilterPlugin for TailBytesFilter {
     Ok(())
 }
-/// Clones this filter into a new boxed instance.
-///
-/// # Returns
-///
-/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
 fn clone_box(&self) -> Box<dyn FilterPlugin> {
     Box::new(Self {
         buffer: self.buffer.clone(),
@@ -71,16 +66,8 @@ impl FilterPlugin for TailBytesFilter {
 }
 /// Returns the configuration options for this filter.
-///
-/// # Returns
-///
-/// A vector of `FilterOption` describing the filter's configurable parameters.
 fn options(&self) -> Vec<FilterOption> {
-    vec![FilterOption {
-        name: "count".to_string(),
-        default: None,
-        required: true,
-    }]
+    crate::filter_plugin::count_option()
 }
 fn description(&self) -> &str {
@@ -89,6 +76,7 @@ impl FilterPlugin for TailBytesFilter {
 }
 /// A filter that reads the last N lines from the input stream.
+#[derive(Clone)]
 pub struct TailLinesFilter {
     lines: VecDeque<String>,
     count: usize,
@@ -136,11 +124,6 @@ impl FilterPlugin for TailLinesFilter {
     Ok(())
 }
-/// Clones this filter into a new boxed instance.
-///
-/// # Returns
-///
-/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
 fn clone_box(&self) -> Box<dyn FilterPlugin> {
     Box::new(Self {
         lines: self.lines.clone(),
@@ -149,16 +132,8 @@ impl FilterPlugin for TailLinesFilter {
 }
 /// Returns the configuration options for this filter.
-///
-/// # Returns
-///
-/// A vector of `FilterOption` describing the filter's configurable parameters.
 fn options(&self) -> Vec<FilterOption> {
-    vec![FilterOption {
-        name: "count".to_string(),
-        default: None,
-        required: true,
-    }]
+    crate::filter_plugin::count_option()
 }
 fn description(&self) -> &str {


@@ -8,11 +8,7 @@ use std::io::{Read, Result, Write};
 // head_tokens
 // ---------------------------------------------------------------------------
-/// A filter that outputs only the first N tokens of the input stream.
-///
-/// Streams bytes directly until the token limit is reached. When the limit
-/// falls mid-chunk, uses `split_by_token_iter` to find the exact byte boundary
-/// without allocating token strings beyond what is needed.
+#[derive(Clone)]
 pub struct HeadTokensFilter {
     pub remaining: usize,
     pub tokenizer: Tokenizer,
@@ -78,7 +74,7 @@ impl FilterPlugin for HeadTokensFilter {
 fn clone_box(&self) -> Box<dyn FilterPlugin> {
     Box::new(Self {
         remaining: self.remaining,
-        tokenizer: get_tokenizer(self.encoding).clone(),
+        tokenizer: self.tokenizer.clone(),
         encoding: self.encoding,
     })
 }
@@ -107,7 +103,7 @@ impl FilterPlugin for HeadTokensFilter {
 // skip_tokens
 // ---------------------------------------------------------------------------
-/// A filter that skips the first N tokens of the input stream and outputs the rest.
+#[derive(Clone)]
 pub struct SkipTokensFilter {
     pub remaining: usize,
     pub tokenizer: Tokenizer,
@@ -180,7 +176,7 @@ impl FilterPlugin for SkipTokensFilter {
 fn clone_box(&self) -> Box<dyn FilterPlugin> {
     Box::new(Self {
         remaining: self.remaining,
-        tokenizer: get_tokenizer(self.encoding).clone(),
+        tokenizer: self.tokenizer.clone(),
         encoding: self.encoding,
     })
 }
@@ -211,8 +207,7 @@ impl FilterPlugin for SkipTokensFilter {
 /// A filter that outputs only the last N tokens of the input stream.
 ///
-/// Buffers all bytes from the stream, then at finalize tokenizes the
-/// content and writes only the last N tokens.
+#[derive(Clone)]
 pub struct TailTokensFilter {
     pub count: usize,
     /// Buffer holding all bytes from the stream.
@@ -276,7 +271,7 @@ impl FilterPlugin for TailTokensFilter {
 Box::new(Self {
     count: self.count,
     buffer: Vec::new(),
-    tokenizer: get_tokenizer(self.encoding).clone(),
+    tokenizer: self.tokenizer.clone(),
     encoding: self.encoding,
 })
 }

src/import_tar.rs (new file, 225 lines)

@@ -0,0 +1,225 @@
use anyhow::{Context, Result, anyhow};
use log::debug;
use std::collections::HashMap;
use std::fs;
use std::io::{Read, Write};
use std::path::Path;
use std::str::FromStr;
use tempfile::TempDir;
use tar::Archive;
use crate::common::PIPESIZE;
use crate::compression_engine::CompressionType;
use crate::db;
use crate::modes::common::ImportMeta;
/// Represents a parsed tar entry from an export archive.
struct TarEntry {
/// Path to the extracted data file in the temp directory.
data_path: Option<std::path::PathBuf>,
/// Path to the extracted meta file in the temp directory.
meta_path: Option<std::path::PathBuf>,
}
/// Import all items from a `.keep.tar` archive.
///
/// Items are imported in ascending order of their original IDs,
/// ensuring chronological ordering is preserved. Each imported item
/// receives a new auto-incremented ID from the target database.
///
/// # Arguments
///
/// * `tar_path` - Path to the `.keep.tar` file.
/// * `conn` - Mutable database connection.
/// * `data_path` - Path to the data storage directory.
///
/// # Returns
///
/// A list of newly assigned item IDs.
pub fn import_from_tar(
tar_path: &Path,
conn: &mut rusqlite::Connection,
data_path: &Path,
) -> Result<Vec<i64>> {
let file = fs::File::open(tar_path)
.with_context(|| format!("Cannot open tar file: {}", tar_path.display()))?;
let mut archive = Archive::new(file);
let tmp_dir = TempDir::new().context("Cannot create temporary directory for import")?;
let tmp_path = tmp_dir.path();
// Extract entries to temp dir
let mut entries_map: HashMap<i64, TarEntry> = HashMap::new();
for entry_result in archive.entries().context("Cannot read tar entries")? {
let mut entry = entry_result.context("Cannot read tar entry")?;
let entry_path = entry.path().context("Cannot get entry path")?.to_path_buf();
let path_str = entry_path.to_string_lossy().replace('\\', "/");
// Reject path traversal attempts
if path_str.starts_with('/') || path_str.starts_with("..") || path_str.contains("/../") {
return Err(anyhow!("Rejected path traversal entry: {path_str}"));
}
// Skip directory entries
if entry.header().entry_type().is_dir() {
debug!("IMPORT_TAR: Skipping directory entry: {path_str}");
continue;
}
// Parse: <dir>/<id>.data.<compression> or <dir>/<id>.meta.yml
let filename = entry_path
.file_name()
.ok_or_else(|| anyhow!("Invalid entry path: {path_str}"))?
.to_string_lossy();
let (orig_id, is_data) = if let Some(id_str) = filename.strip_suffix(".meta.yml") {
let id: i64 = id_str
.parse()
.with_context(|| format!("Invalid ID in entry: {path_str}"))?;
(id, false)
} else if let Some(dot_pos) = filename.find(".data.") {
let id_str = &filename[..dot_pos];
let id: i64 = id_str
.parse()
.with_context(|| format!("Invalid ID in entry: {path_str}"))?;
(id, true)
} else {
debug!("IMPORT_TAR: Skipping unrecognized entry: {path_str}");
continue;
};
let entry_ref = entries_map.entry(orig_id).or_insert_with(|| TarEntry {
data_path: None,
meta_path: None,
});
if is_data {
let dest = tmp_path.join(format!("{orig_id}.data"));
let mut dest_file = fs::File::create(&dest).context("Cannot create temp data file")?;
let mut buf = [0u8; PIPESIZE];
loop {
let n = entry.read(&mut buf)?;
if n == 0 {
break;
}
dest_file.write_all(&buf[..n])?;
}
entry_ref.data_path = Some(dest);
debug!("IMPORT_TAR: Extracted data for original ID {orig_id}");
} else {
let dest = tmp_path.join(format!("{orig_id}.meta.yml"));
let mut dest_file = fs::File::create(&dest).context("Cannot create temp meta file")?;
let mut buf = [0u8; PIPESIZE];
loop {
let n = entry.read(&mut buf)?;
if n == 0 {
break;
}
dest_file.write_all(&buf[..n])?;
}
entry_ref.meta_path = Some(dest);
debug!("IMPORT_TAR: Extracted meta for original ID {orig_id}");
}
}
if entries_map.is_empty() {
return Err(anyhow!("No items found in archive"));
}
// Sort by original ID ascending
let mut sorted_ids: Vec<i64> = entries_map.keys().copied().collect();
sorted_ids.sort_unstable();
let mut imported_ids = Vec::new();
for orig_id in sorted_ids {
let entry = entries_map.get(&orig_id).expect("ID should exist in map");
let meta_path = entry
.meta_path
.as_ref()
.ok_or_else(|| anyhow!("Item {orig_id} missing .meta.yml entry"))?;
let data_path_entry = entry
.data_path
.as_ref()
.ok_or_else(|| anyhow!("Item {orig_id} missing .data entry"))?;
// Parse metadata
let meta_yaml = fs::read_to_string(meta_path)
.with_context(|| format!("Cannot read meta file for item {orig_id}"))?;
let import_meta: ImportMeta = serde_yaml::from_str(&meta_yaml)
.with_context(|| format!("Cannot parse meta file for item {orig_id}"))?;
// Validate compression type
CompressionType::from_str(&import_meta.compression).map_err(|_| {
anyhow!(
"Invalid compression type '{}' for item {}",
import_meta.compression,
orig_id
)
})?;
// Create item with original timestamp
let item = db::insert_item_with_ts(conn, import_meta.ts, &import_meta.compression)?;
let new_id = item.id.context("New item missing ID")?;
// Set tags
let tags = if !import_meta.tags.is_empty() {
db::set_item_tags(conn, item.clone(), &import_meta.tags)?;
import_meta.tags.clone()
} else {
Vec::new()
};
// Stream data to storage
let mut storage_path = data_path.to_path_buf();
storage_path.push(new_id.to_string());
let mut reader = fs::File::open(data_path_entry)
.with_context(|| format!("Cannot read data file for item {orig_id}"))?;
let mut writer = fs::File::create(&storage_path)
.with_context(|| format!("Cannot create storage file for item {new_id}"))?;
let mut buf = [0u8; PIPESIZE];
let mut total = 0i64;
loop {
let n = reader.read(&mut buf)?;
if n == 0 {
break;
}
writer.write_all(&buf[..n])?;
total += n as i64;
}
if total == 0 {
return Err(anyhow!("Item {orig_id} has empty data file"));
}
// Set metadata
for (key, value) in &import_meta.metadata {
db::query_upsert_meta(
conn,
db::Meta {
id: new_id,
name: key.clone(),
value: value.clone(),
},
)?;
}
// Update item sizes
let size_to_record = import_meta.uncompressed_size.unwrap_or(total);
let mut updated_item = item;
updated_item.uncompressed_size = Some(size_to_record);
updated_item.compressed_size = Some(std::fs::metadata(&storage_path)?.len() as i64);
updated_item.closed = true;
db::update_item(conn, updated_item)?;
log::info!("KEEP: Imported item {new_id} (was {orig_id}) tags: {tags:?}");
imported_ids.push(new_id);
}
Ok(imported_ids)
}
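Two small pieces of `import_from_tar` above are easy to exercise in isolation: the path-traversal guard and the `<id>.meta.yml` / `<id>.data.<compression>` entry-name parse. A standalone sketch (helper names are mine, not keep's API):

```rust
// Normalize to forward slashes, then reject absolute paths and any
// leading or embedded ".." component, matching the guard above.
fn is_traversal(raw: &str) -> bool {
    let p = raw.replace('\\', "/");
    p.starts_with('/') || p.starts_with("..") || p.contains("/../")
}

// Returns (original_id, is_data): ".meta.yml" entries carry metadata,
// ".data.*" entries carry the compressed payload.
fn parse_entry(filename: &str) -> Option<(i64, bool)> {
    if let Some(id_str) = filename.strip_suffix(".meta.yml") {
        return Some((id_str.parse().ok()?, false));
    }
    if let Some(dot_pos) = filename.find(".data.") {
        return Some((filename[..dot_pos].parse().ok()?, true));
    }
    None // anything else is skipped with a debug log
}

fn main() {
    assert!(is_traversal("/etc/passwd"));
    assert!(is_traversal("../outside"));
    assert!(is_traversal("items/../../etc"));
    assert!(!is_traversal("items/17.meta.yml"));

    assert_eq!(parse_entry("17.meta.yml"), Some((17, false)));
    assert_eq!(parse_entry("17.data.zstd"), Some((17, true)));
    assert_eq!(parse_entry("README.txt"), None);
    assert_eq!(parse_entry("x.meta.yml"), None); // non-numeric ID
}
```

Note that the guard, as written, would not flag a trailing `/..` segment on its own; that is moot here because every accepted entry must end in one of the two recognized filename patterns.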


@@ -35,7 +35,9 @@ pub mod common;
 pub mod compression_engine;
 pub mod config;
 pub mod db;
+pub mod export_tar;
 pub mod filter_plugin;
+pub mod import_tar;
 pub mod meta_plugin;
 pub mod modes;
 pub mod services;
@@ -43,19 +45,23 @@ pub mod services;
 #[cfg(feature = "client")]
 pub mod client;
-#[cfg(feature = "tokens")]
+#[cfg(feature = "meta_tokens")]
 pub mod tokenizer;
 // Re-export Args struct for library usage
 pub use args::Args;
 // Re-export PIPESIZE constant
 pub use common::PIPESIZE;
+pub use services::CoreError;
 // Import all filter plugins to ensure they register themselves
 #[allow(unused_imports)]
-use filter_plugin::{grep, head, skip, strip_ansi, tail};
+#[cfg(feature = "filter_grep")]
+use filter_plugin::grep;
+#[allow(unused_imports)]
+use filter_plugin::{head, skip, strip_ansi, tail};
-#[cfg(feature = "tokens")]
+#[cfg(feature = "meta_tokens")]
 #[allow(unused_imports)]
 use filter_plugin::tokens as token_filters;
@@ -63,19 +69,19 @@ use crate::meta_plugin::{
     cwd, digest, env, exec, hostname, keep_pid, read_rate, read_time, shell, shell_pid, user,
 };
-#[cfg(feature = "magic")]
+#[cfg(feature = "meta_magic")]
 #[allow(unused_imports)]
 use crate::meta_plugin::magic_file;
-#[cfg(feature = "tokens")]
+#[cfg(feature = "meta_tokens")]
 #[allow(unused_imports)]
 use crate::meta_plugin::tokens;
-#[cfg(feature = "infer")]
+#[cfg(feature = "meta_infer")]
 #[allow(unused_imports)]
 use crate::meta_plugin::infer_plugin;
-#[cfg(feature = "tree_magic_mini")]
+#[cfg(feature = "meta_tree_magic_mini")]
 #[allow(unused_imports)]
 use crate::meta_plugin::tree_magic_mini;


@@ -81,7 +81,7 @@ fn main() -> Result<(), Error> {
 let ids = &mut Vec::new();
 let tags = &mut Vec::new();
-// For --info, --get, and --export modes, treat numeric strings as IDs
+// For --info, --get, --export, and --list modes, treat numeric strings as IDs
 for v in args.ids_or_tags.iter() {
     debug!("MAIN: Parsed value: {v:?}");
     match v.clone() {
@@ -90,15 +90,15 @@ fn main() -> Result<(), Error> {
     ids.push(num)
 }
 NumberOrString::Str(str) => {
-    // For --info, --get, and --export, try to parse strings as numbers to treat them as IDs
-    if (args.mode.info || args.mode.get || args.mode.export)
+    // For --info, --get, --export, and --list, try to parse strings as numbers to treat them as IDs
+    if (args.mode.info || args.mode.get || args.mode.export || args.mode.list)
         && let Ok(num) = str.parse::<i64>()
     {
         debug!("MAIN: Adding parsed string to ids: {num}");
         ids.push(num);
         continue;
     }
-    // If not a number, or not using --info/--get/--export, treat as tag
+    // If not a number, or not using --info/--get/--export/--list, treat as tag
     debug!("MAIN: Adding to tags: {str}");
     tags.push(str)
 }
@@ -122,6 +122,7 @@ fn main() -> Result<(), Error> {
 Import,
 Status,
 StatusPlugins,
+#[cfg(feature = "server")]
 Server,
 GenerateConfig,
 }
@@ -150,9 +151,14 @@ fn main() -> Result<(), Error> {
mode = KeepModes::Status; mode = KeepModes::Status;
} else if args.mode.status_plugins { } else if args.mode.status_plugins {
mode = KeepModes::StatusPlugins; mode = KeepModes::StatusPlugins;
} else if args.mode.server { }
mode = KeepModes::Server; #[cfg(feature = "server")]
} else if args.mode.generate_config { {
if args.mode.server {
mode = KeepModes::Server;
}
}
if args.mode.generate_config {
mode = KeepModes::GenerateConfig; mode = KeepModes::GenerateConfig;
} }
@@ -188,6 +194,7 @@ fn main() -> Result<(), Error> {
} }
// Validate server password usage // Validate server password usage
#[cfg(feature = "server")]
if settings.server_password().is_some() && mode != KeepModes::Server { if settings.server_password().is_some() && mode != KeepModes::Server {
cmd.error( cmd.error(
ErrorKind::InvalidValue, ErrorKind::InvalidValue,
@@ -256,7 +263,7 @@ fn main() -> Result<(), Error> {
filter_chain, filter_chain,
), ),
KeepModes::List => { KeepModes::List => {
keep::modes::client::list::mode(&client, &mut cmd, &settings, tags) keep::modes::client::list::mode(&client, &mut cmd, &settings, ids, tags)
} }
KeepModes::Delete => { KeepModes::Delete => {
keep::modes::client::delete::mode(&client, &mut cmd, &settings, ids) keep::modes::client::delete::mode(&client, &mut cmd, &settings, ids)
@@ -291,6 +298,9 @@ fn main() -> Result<(), Error> {
} }
} }
// SAFETY: umask is thread-safe by POSIX spec, and we invoke it exactly once
// before any file operations to set a secure default mask. No other threads
// exist yet at this point in main(), so there is no data race.
unsafe { unsafe {
libc::umask(0o077); libc::umask(0o077);
} }
@@ -352,19 +362,8 @@ fn main() -> Result<(), Error> {
KeepModes::StatusPlugins => { KeepModes::StatusPlugins => {
modes::status_plugins::mode_status_plugins(&mut cmd, &settings, data_path, db_path) modes::status_plugins::mode_status_plugins(&mut cmd, &settings, data_path, db_path)
} }
KeepModes::Server => { #[cfg(feature = "server")]
#[cfg(feature = "server")] KeepModes::Server => modes::server::mode_server(&mut cmd, &settings, &mut conn, data_path),
{
modes::server::mode_server(&mut cmd, &settings, &mut conn, data_path)
}
#[cfg(not(feature = "server"))]
{
cmd.error(
ErrorKind::MissingRequiredArgument,
"This binary was not compiled with server support. Recompile with --features server"
).exit();
}
}
KeepModes::GenerateConfig => { KeepModes::GenerateConfig => {
modes::generate_config::mode_generate_config(&mut cmd, &settings) modes::generate_config::mode_generate_config(&mut cmd, &settings)
} }
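The gated `Server` arm above works because Rust allows `#[cfg]` on both enum variants and match arms, so the variant and every arm mentioning it disappear together when the feature is off and the match stays exhaustive. A minimal standalone sketch of the pattern (names are illustrative, not from this codebase):

```rust
// Illustrative sketch: a feature-gated enum variant and its match arm are
// both removed at compile time when the feature is disabled, so every
// feature combination still compiles with an exhaustive match.
enum Mode {
    Status,
    #[cfg(feature = "server")]
    Server,
}

fn describe(mode: &Mode) -> &'static str {
    match mode {
        Mode::Status => "status",
        #[cfg(feature = "server")]
        Mode::Server => "server",
    }
}

fn main() {
    assert_eq!(describe(&Mode::Status), "status");
}
```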


@@ -32,7 +32,7 @@ impl Hasher {
         match self {
             Hasher::Sha256(hasher) => hasher.update(data),
             Hasher::Md5(hasher) => {
-                let _ = hasher.write(data);
+                hasher.consume(data);
             }
             Hasher::Sha512(hasher) => hasher.update(data),
         }


@@ -131,7 +131,19 @@ impl MetaPluginExec {
         match cmd.spawn() {
             Ok(mut child) => {
-                let stdin = child.stdin.take().unwrap();
+                let stdin = match child.stdin.take() {
+                    Some(s) => s,
+                    None => {
+                        error!(
+                            "META: Exec plugin: failed to capture stdin for '{}'",
+                            self.program
+                        );
+                        return MetaPluginResponse {
+                            metadata: Vec::new(),
+                            is_finalized: true,
+                        };
+                    }
+                };
                 self.writer = Some(Box::new(stdin));
                 self.process = Some(child);
                 debug!("META: Exec plugin: started process for '{}'", self.program);
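The hunk above replaces `child.stdin.take().unwrap()` with a `match`: `stdin` is only `Some` when the child was spawned with a piped stdin, so matching avoids the panic path. A standalone sketch of the same defensive pattern, assuming a Unix-like system with `cat` on PATH (the helper `write_to_child` is illustrative, not the plugin's API):

```rust
use std::io::Write;
use std::process::{Command, Stdio};

// Illustrative sketch of matching on child.stdin.take() instead of
// unwrap(): the Option is None unless stdin was configured as a pipe.
fn write_to_child(program: &str, data: &[u8]) -> Option<String> {
    let mut child = Command::new(program)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .ok()?;
    match child.stdin.take() {
        Some(mut stdin) => {
            // Write the payload; dropping stdin closes the pipe (EOF).
            stdin.write_all(data).ok()?;
        }
        None => return None, // would have been a panic with .unwrap()
    }
    let out = child.wait_with_output().ok()?;
    String::from_utf8(out.stdout).ok()
}

fn main() {
    // Assumes `cat` is available; skip the check gracefully otherwise.
    if let Some(echoed) = write_to_child("cat", b"hello") {
        assert_eq!(echoed, "hello");
    }
}
```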


@@ -1,7 +1,7 @@
 use crate::common::PIPESIZE;
 use crate::meta_plugin::{
-    process_metadata_outputs, register_meta_plugin, BaseMetaPlugin, MetaPlugin, MetaPluginResponse,
-    MetaPluginType,
+    BaseMetaPlugin, MetaPlugin, MetaPluginResponse, MetaPluginType, process_metadata_outputs,
+    register_meta_plugin,
 };
 #[derive(Debug, Default)]


@@ -1,6 +1,6 @@
-#[cfg(feature = "magic")]
+#[cfg(feature = "meta_magic")]
 use magic::{Cookie, CookieFlags};
-#[cfg(not(feature = "magic"))]
+#[cfg(not(feature = "meta_magic"))]
 use std::process::{Command, Stdio};
 use std::io::{self, Write};
@@ -16,12 +16,12 @@ use crate::meta_plugin::{
 // separate cookies can be used from different threads concurrently without
 // synchronization. Using thread_local! avoids unsafe impl Send since the
 // storage is inherently !Send.
-#[cfg(feature = "magic")]
+#[cfg(feature = "meta_magic")]
 thread_local! {
     static MAGIC_COOKIE: std::cell::RefCell<Option<Cookie>> = const { std::cell::RefCell::new(None) };
 }
-#[cfg(feature = "magic")]
+#[cfg(feature = "meta_magic")]
 #[derive(Debug)]
 pub struct MagicFileMetaPluginImpl {
     buffer: Vec<u8>,
@@ -30,7 +30,7 @@ pub struct MagicFileMetaPluginImpl {
     base: BaseMetaPlugin,
 }
-#[cfg(feature = "magic")]
+#[cfg(feature = "meta_magic")]
 impl MagicFileMetaPluginImpl {
     pub fn new(
         options: Option<std::collections::HashMap<String, serde_yaml::Value>>,
@@ -113,7 +113,7 @@ impl MagicFileMetaPluginImpl {
     }
 }
-#[cfg(feature = "magic")]
+#[cfg(feature = "meta_magic")]
 impl MetaPlugin for MagicFileMetaPluginImpl {
     fn is_finalized(&self) -> bool {
         self.is_finalized
@@ -222,10 +222,10 @@ impl MetaPlugin for MagicFileMetaPluginImpl {
     }
 }
-#[cfg(feature = "magic")]
+#[cfg(feature = "meta_magic")]
 pub use MagicFileMetaPluginImpl as MagicFileMetaPlugin;
-#[cfg(not(feature = "magic"))]
+#[cfg(not(feature = "meta_magic"))]
 #[derive(Debug)]
 pub struct FallbackMagicFileMetaPlugin {
     buffer: Vec<u8>,
@@ -234,7 +234,7 @@ pub struct FallbackMagicFileMetaPlugin {
     base: BaseMetaPlugin,
 }
-#[cfg(not(feature = "magic"))]
+#[cfg(not(feature = "meta_magic"))]
 impl FallbackMagicFileMetaPlugin {
     pub fn new(
         options: Option<std::collections::HashMap<String, serde_yaml::Value>>,
@@ -267,7 +267,10 @@ impl FallbackMagicFileMetaPlugin {
             .spawn()
             .and_then(|mut child| {
                 if let Some(mut stdin) = child.stdin.take() {
-                    let _ = stdin.write_all(&self.buffer);
+                    if stdin.write_all(&self.buffer).is_err() {
+                        // Ignore write error; child will see EOF and likely fail
+                        // the file detection, returning no output.
+                    }
                 }
                 child.wait_with_output()
             });
@@ -333,7 +336,7 @@ impl FallbackMagicFileMetaPlugin {
     }
 }
-#[cfg(not(feature = "magic"))]
+#[cfg(not(feature = "meta_magic"))]
 impl MetaPlugin for FallbackMagicFileMetaPlugin {
     fn is_finalized(&self) -> bool {
         self.is_finalized
@@ -438,7 +441,7 @@ impl MetaPlugin for FallbackMagicFileMetaPlugin {
     }
 }
-#[cfg(not(feature = "magic"))]
+#[cfg(not(feature = "meta_magic"))]
 pub use FallbackMagicFileMetaPlugin as MagicFileMetaPlugin;
 use crate::meta_plugin::register_meta_plugin;


@@ -1,5 +1,4 @@
 use log::{debug, warn};
-use once_cell::sync::Lazy;
 use serde::{Deserialize, Serialize};
 use std::collections::HashMap;
 use std::sync::{Arc, Mutex};
@@ -9,7 +8,7 @@ pub mod digest;
 pub mod env;
 pub mod exec;
 pub mod hostname;
-#[cfg(feature = "infer")]
+#[cfg(feature = "meta_infer")]
 pub mod infer_plugin;
 pub mod keep_pid;
 pub mod magic_file;
@@ -18,32 +17,32 @@ pub mod read_time;
 pub mod shell;
 pub mod shell_pid;
 pub mod text;
-#[cfg(feature = "tokens")]
+#[cfg(feature = "meta_tokens")]
 pub mod tokens;
-#[cfg(feature = "tree_magic_mini")]
+#[cfg(feature = "meta_tree_magic_mini")]
 pub mod tree_magic_mini;
 pub mod user;
 pub use digest::DigestMetaPlugin;
 pub use exec::MetaPluginExec;
-#[cfg(feature = "magic")]
+#[cfg(feature = "meta_magic")]
 pub use magic_file::MagicFileMetaPlugin;
 // pub use text::TextMetaPlugin; // Removed duplicate
 pub use cwd::CwdMetaPlugin;
 pub use env::EnvMetaPlugin;
 pub use hostname::HostnameMetaPlugin;
-#[cfg(feature = "infer")]
+#[cfg(feature = "meta_infer")]
 pub use infer_plugin::InferMetaPlugin;
 pub use keep_pid::KeepPidMetaPlugin;
 pub use read_rate::ReadRateMetaPlugin;
 pub use read_time::ReadTimeMetaPlugin;
 pub use shell::ShellMetaPlugin;
 pub use shell_pid::ShellPidMetaPlugin;
-#[cfg(feature = "tree_magic_mini")]
+#[cfg(feature = "meta_tree_magic_mini")]
 pub use tree_magic_mini::TreeMagicMiniMetaPlugin;
 pub use user::UserMetaPlugin;
-#[cfg(not(feature = "magic"))]
+#[cfg(not(feature = "meta_magic"))]
 pub use magic_file::FallbackMagicFileMetaPlugin as MagicFileMetaPlugin;
 type PluginConstructor = fn(
@@ -306,22 +305,7 @@ pub fn process_metadata_outputs(
             return None;
         }
         if let Some(custom_name) = mapping.as_str() {
-            // Convert the value to a string representation
-            let value_str = match &value {
-                serde_yaml::Value::Null => "null".to_string(),
-                serde_yaml::Value::Bool(b) => b.to_string(),
-                serde_yaml::Value::Number(n) => n.to_string(),
-                serde_yaml::Value::String(s) => s.clone(),
-                serde_yaml::Value::Sequence(_) => {
-                    serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
-                }
-                serde_yaml::Value::Mapping(_) => {
-                    serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
-                }
-                serde_yaml::Value::Tagged(_) => {
-                    serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
-                }
-            };
+            let value_str = yaml_value_to_string(&value);
             debug!(
                 "META: Processing metadata: internal_name={internal_name}, custom_name={custom_name}, value={value_str}"
             );
@@ -332,22 +316,7 @@ pub fn process_metadata_outputs(
             }
         }
-        // Convert the value to a string representation
-        let value_str = match &value {
-            serde_yaml::Value::Null => "null".to_string(),
-            serde_yaml::Value::Bool(b) => b.to_string(),
-            serde_yaml::Value::Number(n) => n.to_string(),
-            serde_yaml::Value::String(s) => s.clone(),
-            serde_yaml::Value::Sequence(_) => {
-                serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
-            }
-            serde_yaml::Value::Mapping(_) => {
-                serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
-            }
-            serde_yaml::Value::Tagged(_) => {
-                serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
-            }
-        };
+        let value_str = yaml_value_to_string(&value);
         // Default: use internal name as output name
         debug!("META: Processing metadata: name={internal_name}, value={value_str}");
@@ -357,6 +326,20 @@ pub fn process_metadata_outputs(
     })
 }
+fn yaml_value_to_string(value: &serde_yaml::Value) -> String {
+    match value {
+        serde_yaml::Value::Null => "null".to_string(),
+        serde_yaml::Value::Bool(b) => b.to_string(),
+        serde_yaml::Value::Number(n) => n.to_string(),
+        serde_yaml::Value::String(s) => s.clone(),
+        serde_yaml::Value::Sequence(_)
+        | serde_yaml::Value::Mapping(_)
+        | serde_yaml::Value::Tagged(_) => {
+            serde_yaml::to_string(value).unwrap_or_else(|_| "".to_string())
+        }
+    }
+}
 pub trait MetaPlugin: Send
 where
     Self: 'static,
@@ -460,9 +443,9 @@ where
     ///
     /// An empty `HashMap` (default implementation).
     fn outputs(&self) -> &std::collections::HashMap<String, serde_yaml::Value> {
-        use once_cell::sync::Lazy;
-        static EMPTY: Lazy<std::collections::HashMap<String, serde_yaml::Value>> =
-            Lazy::new(std::collections::HashMap::new);
+        use std::sync::LazyLock;
+        static EMPTY: LazyLock<std::collections::HashMap<String, serde_yaml::Value>> =
+            LazyLock::new(std::collections::HashMap::new);
         &EMPTY
     }
@@ -487,9 +470,9 @@ where
     ///
     /// An empty `HashMap` (default implementation).
     fn options(&self) -> &std::collections::HashMap<String, serde_yaml::Value> {
-        use once_cell::sync::Lazy;
-        static EMPTY: Lazy<std::collections::HashMap<String, serde_yaml::Value>> =
-            Lazy::new(std::collections::HashMap::new);
+        use std::sync::LazyLock;
+        static EMPTY: LazyLock<std::collections::HashMap<String, serde_yaml::Value>> =
+            LazyLock::new(std::collections::HashMap::new);
         &EMPTY
     }
@@ -618,8 +601,9 @@ where
 }
 /// Global registry for meta plugins.
-static META_PLUGIN_REGISTRY: Lazy<Mutex<HashMap<MetaPluginType, PluginConstructor>>> =
-    Lazy::new(|| Mutex::new(HashMap::new()));
+static META_PLUGIN_REGISTRY: std::sync::LazyLock<
+    Mutex<HashMap<MetaPluginType, PluginConstructor>>,
+> = std::sync::LazyLock::new(|| Mutex::new(HashMap::new()));
 /// Register a meta plugin with the global registry.
 ///
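The registry change above is the once_cell-to-std migration from the changelog: `std::sync::LazyLock` (stable since Rust 1.80) has the same lazy, thread-safe initialization semantics as `once_cell::sync::Lazy` with no extra dependency. A minimal sketch with an illustrative registry (names not from this codebase):

```rust
use std::collections::HashMap;
use std::sync::{LazyLock, Mutex};

// Illustrative global registry using std::sync::LazyLock, the std
// replacement for once_cell::sync::Lazy used in the diff above.
static REGISTRY: LazyLock<Mutex<HashMap<String, fn() -> i32>>> =
    LazyLock::new(|| Mutex::new(HashMap::new()));

fn register(name: &str, ctor: fn() -> i32) {
    // First access initializes the map exactly once, thread-safely.
    REGISTRY.lock().unwrap().insert(name.to_string(), ctor);
}

fn main() {
    register("answer", || 42);
    let ctor = REGISTRY.lock().unwrap()["answer"];
    assert_eq!(ctor(), 42);
}
```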


@@ -1,7 +1,7 @@
 use crate::common::PIPESIZE;
 use crate::meta_plugin::{
-    process_metadata_outputs, register_meta_plugin, BaseMetaPlugin, MetaPlugin, MetaPluginResponse,
-    MetaPluginType,
+    BaseMetaPlugin, MetaPlugin, MetaPluginResponse, MetaPluginType, process_metadata_outputs,
+    register_meta_plugin,
 };
 #[derive(Debug, Default)]


@@ -4,16 +4,15 @@ use clap::Command;
 use log::debug;
 use std::collections::HashMap;
 use std::fs;
-use std::io::{Read, Write};
 use crate::client::KeepClient;
+use crate::common::sanitize_ts_string;
 use crate::config;
-use crate::modes::common::{ExportMeta, resolve_item_id, sanitize_tags};
-/// Export an item to data and metadata files via client.
+/// Export items to a `.keep.tar` archive via client.
 ///
-/// If no IDs or tags are specified, exports the latest item.
-/// Streams data in fixed-size buffers without loading entire file into memory.
+/// Sends a request to the server's `/api/export` endpoint and
+/// streams the response to a local tar file.
 pub fn mode(
     client: &KeepClient,
     cmd: &mut Command,
@@ -21,40 +20,38 @@ pub fn mode(
     ids: &[i64],
     tags: &[String],
 ) -> Result<()> {
+    // Validate: IDs XOR tags
     if !ids.is_empty() && !tags.is_empty() {
         cmd.error(
             clap::error::ErrorKind::InvalidValue,
-            "Both ID and tags given, you must supply either IDs or tags when using --export",
+            "Cannot use both IDs and tags with --export",
         )
         .exit();
-    } else if ids.len() > 1 {
+    }
+    if ids.is_empty() && tags.is_empty() {
         cmd.error(
             clap::error::ErrorKind::InvalidValue,
-            "More than one ID given, you must supply exactly one ID when using --export",
+            "Must provide either IDs or tags with --export",
         )
         .exit();
     }
-    let item_id = resolve_item_id(client, ids, tags)?;
-    // Get item info
-    let item_info = client.get_item_info(item_id)?;
-    // Get streaming reader for raw compressed content
-    let (mut reader, compression) = client.get_item_content_stream(item_id)?;
-    // Build template variables
+    // We need to resolve items on the server to compute the filename.
+    // First, get the item info to build the filename template variables.
+    // For the tar filename, we use {name}_{ts}.keep.tar where name comes from
+    // --export-name or default export_<common-tags>.
+    let dir_name = if let Some(ref name) = settings.export_name {
+        name.clone()
+    } else {
+        "export".to_string()
+    };
+    let now = Utc::now();
+    let ts_str = sanitize_ts_string(&now.format("%Y-%m-%dT%H:%M:%SZ").to_string());
     let mut vars = HashMap::new();
-    vars.insert("id".to_string(), item_id.to_string());
-    vars.insert("tags".to_string(), sanitize_tags(&item_info.tags));
-    let ts = chrono::DateTime::parse_from_rfc3339(&item_info.ts)
-        .map(|dt| dt.with_timezone(&Utc))
-        .unwrap_or_else(|_| Utc::now());
-    vars.insert(
-        "ts".to_string(),
-        ts.format("%Y-%m-%dT%H:%M:%SZ").to_string(),
-    );
-    vars.insert("compression".to_string(), compression.clone());
+    vars.insert("name".to_string(), dir_name);
+    vars.insert("ts".to_string(), ts_str);
     let basename = strfmt::strfmt(&settings.export_filename_format, &vars).map_err(|e| {
         anyhow!(
@@ -64,36 +61,17 @@ pub fn mode(
         )
     })?;
-    // Stream data file write with fixed-size buffers
-    let data_filename = format!("{}.data.{}", basename, compression);
-    let mut data_file = fs::File::create(&data_filename)
-        .with_context(|| format!("Cannot create data file: {}", data_filename))?;
-    let mut total_bytes: usize = 0;
-    crate::common::stream_copy(&mut reader, |chunk| {
-        data_file.write_all(chunk)?;
-        total_bytes += chunk.len();
-        Ok(())
-    })?;
-    debug!(
-        "CLIENT_EXPORT: Wrote {} bytes to {}",
-        total_bytes, data_filename
-    );
-    // Write meta file
-    let meta_filename = format!("{}.meta.yml", basename);
-    let export_meta = ExportMeta {
-        ts,
-        compression,
-        size: item_info.size,
-        tags: item_info.tags.clone(),
-        metadata: item_info.metadata.clone(),
-    };
-    let meta_yaml = serde_yaml::to_string(&export_meta)?;
-    fs::write(&meta_filename, &meta_yaml)
-        .with_context(|| format!("Cannot write meta file: {}", meta_filename))?;
-    debug!("CLIENT_EXPORT: Wrote metadata to {}", meta_filename);
-    eprintln!("{} {}", data_filename, meta_filename);
+    let tar_filename = format!("{basename}.keep.tar");
+    client
+        .export_items_to_file(ids, tags, std::path::Path::new(&tar_filename))
+        .map_err(|e| anyhow!("Export failed: {e}"))?;
+    if !settings.quiet {
+        eprintln!("{tar_filename}");
+    }
+    debug!("CLIENT_EXPORT: Wrote items to {tar_filename}");
     Ok(())
 }


@@ -35,11 +35,11 @@ pub fn mode(
     // Get streaming reader for raw content
     let (reader, compression) = client.get_item_content_stream(item_id)?;
-    let compression_type = CompressionType::from_str(&compression).unwrap_or(CompressionType::None);
+    let compression_type = CompressionType::from_str(&compression).unwrap_or(CompressionType::Raw);
     // Decompress through streaming readers
     let mut decompressed_reader: Box<dyn Read> =
-        CompressionService::decompressing_reader(reader, &compression_type);
+        CompressionService::decompressing_reader(reader, &compression_type)?;
     // Binary detection: sample first chunk
     let mut sample_buf = [0u8; crate::common::PIPESIZE];


@@ -4,6 +4,7 @@ use log::debug;
 use std::collections::HashMap;
 use std::fs;
 use std::io::Read;
+use std::path::Path;
 use crate::client::KeepClient;
 use crate::compression_engine::CompressionType;
@@ -11,11 +12,61 @@ use crate::config;
 use crate::modes::common::ImportMeta;
 use std::str::FromStr;
-/// Import an item from a metadata file via client.
+/// Import items from a `.keep.tar` archive or legacy `.meta.yml` file via client.
 ///
-/// Streams data to server without buffering entire file in memory.
-/// Sends original timestamp to server so it's preserved.
+/// For `.keep.tar` files, streams the archive to the server's `/api/import` endpoint.
+/// For `.meta.yml` files, uses the legacy single-item import path.
 pub fn mode(
+    client: &KeepClient,
+    cmd: &mut Command,
+    settings: &config::Settings,
+    import_path: &str,
+) -> Result<()> {
+    if import_path.ends_with(".keep.tar") {
+        import_tar(client, cmd, settings, import_path)
+    } else if import_path.ends_with(".meta.yml") {
+        import_legacy(client, cmd, settings, import_path)
+    } else {
+        cmd.error(
+            clap::error::ErrorKind::InvalidValue,
+            format!("Unsupported import format: {}", import_path),
+        )
+        .exit();
+    }
+}
+/// Import from a `.keep.tar` archive via the server API.
+fn import_tar(
+    client: &KeepClient,
+    _cmd: &mut Command,
+    settings: &config::Settings,
+    tar_path: &str,
+) -> Result<()> {
+    let path = Path::new(tar_path);
+    let imported_ids = client
+        .import_tar_file(path)
+        .map_err(|e| anyhow!("Import failed: {e}"))?;
+    if !settings.quiet {
+        println!(
+            "KEEP: Imported {} item(s): {:?}",
+            imported_ids.len(),
+            imported_ids
+        );
+    }
+    debug!(
+        "CLIENT_IMPORT: Imported {} items from {}",
+        imported_ids.len(),
+        tar_path
+    );
+    Ok(())
+}
+/// Legacy single-item import from a `.meta.yml` file.
+fn import_legacy(
     client: &KeepClient,
     cmd: &mut Command,
     settings: &config::Settings,
@@ -23,9 +74,9 @@ pub fn mode(
 ) -> Result<()> {
     // Read and parse metadata
     let meta_yaml = fs::read_to_string(meta_file)
-        .with_context(|| format!("Cannot read metadata file: {}", meta_file))?;
+        .with_context(|| format!("Cannot read metadata file: {meta_file}"))?;
     let import_meta: ImportMeta = serde_yaml::from_str(&meta_yaml)
-        .with_context(|| format!("Cannot parse metadata file: {}", meta_file))?;
+        .with_context(|| format!("Cannot parse metadata file: {meta_file}"))?;
     // Validate compression type
     CompressionType::from_str(&import_meta.compression).map_err(|_| {
@@ -64,7 +115,7 @@ pub fn mode(
         client.post_stream("/api/item/", &mut reader, &param_refs)?
     } else {
         // For stdin, we need to buffer since stdin can't be seeked
-        // and post_stream may need to retry. Use a BufReader for efficiency.
+        // and post_stream may need to retry.
         let mut buf = Vec::new();
         std::io::stdin()
             .read_to_end(&mut buf)
@@ -84,7 +135,7 @@ pub fn mode(
     debug!("CLIENT_IMPORT: Created item {} via server", item_id);
     // Set uncompressed size if known from metadata
-    if let Some(size) = import_meta.size {
+    if let Some(size) = import_meta.uncompressed_size {
         client.set_item_size(item_id, size as u64)?;
         debug!("CLIENT_IMPORT: Set size to {}", size);
     }
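The new `mode` above dispatches purely on the filename suffix before delegating to either path. A self-contained sketch of that classification logic (`classify` and `ImportKind` are illustrative names, not the crate's API):

```rust
// Illustrative sketch of the suffix-based dispatch in the new import mode.
#[derive(Debug, PartialEq)]
enum ImportKind {
    Tar,         // .keep.tar archive, streamed to the server
    Legacy,      // single-item .meta.yml path
    Unsupported, // rejected with a CLI error
}

fn classify(path: &str) -> ImportKind {
    if path.ends_with(".keep.tar") {
        ImportKind::Tar
    } else if path.ends_with(".meta.yml") {
        ImportKind::Legacy
    } else {
        ImportKind::Unsupported
    }
}

fn main() {
    assert_eq!(classify("backup.keep.tar"), ImportKind::Tar);
    assert_eq!(classify("item.meta.yml"), ImportKind::Legacy);
    assert_eq!(classify("item.zip"), ImportKind::Unsupported);
}
```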


@@ -31,7 +31,7 @@ pub fn mode(
         timestamp: item.ts.clone(),
         path: String::new(),
         stream_size: item
-            .size
+            .uncompressed_size
             .map(|s| format_size(s as u64, settings.human_readable))
             .unwrap_or_else(|| "N/A".to_string()),
         compression: item.compression.clone(),


@@ -1,6 +1,6 @@
 use crate::client::KeepClient;
 use crate::modes::common::{
-    format_size, render_list_table_with_format, settings_output_format, ColumnType, OutputFormat,
+    ColumnType, OutputFormat, format_size, render_list_table_with_format, settings_output_format,
 };
 use clap::Command;
 use log::debug;
@@ -10,16 +10,12 @@ pub fn mode(
     client: &KeepClient,
     _cmd: &mut Command,
     settings: &crate::config::Settings,
+    ids: &[i64],
     tags: &[String],
 ) -> Result<(), anyhow::Error> {
     debug!("CLIENT_LIST: Listing items via remote server");
-    let meta_filter: std::collections::HashMap<String, Option<String>> = settings
-        .meta
-        .iter()
-        .map(|(k, v)| (k.clone(), v.clone()))
-        .collect();
-    let items = client.list_items(tags, "newest", 0, 100, &meta_filter)?;
+    let items = client.list_items(ids, tags, "newest", 0, 100, &settings.meta_filter())?;
     if settings.ids_only {
         for item in &items {
@@ -45,7 +41,7 @@ pub fn mode(
             Some(ColumnType::Id) => item.id.to_string(),
             Some(ColumnType::Time) => item.ts.clone(),
             Some(ColumnType::Size) => item
-                .size
+                .uncompressed_size
                 .map(|s| format_size(s as u64, settings.human_readable))
                 .unwrap_or_default(),
             Some(ColumnType::Compression) => item.compression.clone(),


@@ -1,8 +1,9 @@
-use crate::client::{ItemInfo, KeepClient};
+use crate::client::KeepClient;
 use crate::compression_engine::CompressionType;
 use crate::config::Settings;
 use crate::meta_plugin::SaveMetaFn;
 use crate::modes::common::settings_compression_type;
+use crate::services::ItemInfo;
 use crate::services::compression_service::CompressionService;
 use crate::services::meta_service::MetaService;
 use anyhow::Result;
@@ -39,7 +40,7 @@ pub fn mode(
     // Determine compression type from settings
     let compression_type = settings_compression_type(cmd, settings);
     let compression_type_str = compression_type.to_string();
-    // In client mode, the client always handles compression (even "none").
+    // In client mode, the client always handles compression (even "raw").
     // The server should never re-compress client data.
     let server_compress = false;
@@ -75,7 +76,7 @@ pub fn mode(
     // Wrap pipe writer with appropriate compression
     let mut compressor: Box<dyn Write> =
-        CompressionService::compressing_writer(Box::new(pipe_writer), &compression_type_clone);
+        CompressionService::compressing_writer(Box::new(pipe_writer), &compression_type_clone)?;
     loop {
         let n = stdin_lock.read(&mut buffer)?;


@@ -22,7 +22,6 @@ use clap::Command;
 use clap::error::ErrorKind;
 use comfy_table::{Attribute, Cell, ContentArrangement, Table};
 use log::debug;
-use regex::Regex;
 use serde::{Deserialize, Serialize};
 use std::collections::HashMap;
 use std::env;
@@ -56,38 +55,18 @@ pub enum OutputFormat {
     Yaml,
 }

-/// Extracts metadata from KEEP_META_* environment variables.
-///
-/// Scans environment for variables prefixed with KEEP_META_ and extracts
-/// key-value pairs for initial item metadata. Ignores KEEP_META_PLUGINS.
-///
-/// # Returns
-///
-/// `HashMap<String, String>` - Metadata from environment variables, with keys in uppercase without prefix.
-///
-/// # Errors
-///
-/// None; silently ignores non-matching vars and PLUGINS.
-///
-/// # Examples
-///
-/// ```ignore
-/// use std::env;
-/// env::set_var("KEEP_META_COMMAND", "ls -la");
-/// let meta = keep::modes::common::get_meta_from_env();
-/// assert_eq!(meta.get("COMMAND"), Some(&"ls -la".to_string()));
-/// ```
+pub const IMPORT_FORMAT_ERROR: &str =
+    "Unsupported import format: {} (expected .keep.tar or .meta.yml)";
+
 pub fn get_meta_from_env() -> HashMap<String, String> {
     debug!("COMMON: Getting meta from KEEP_META_*");
-    let re = Regex::new(r"^KEEP_META_(.+)$").unwrap();
     let mut meta_env: HashMap<String, String> = HashMap::new();
+    const PREFIX: &str = "KEEP_META_";
     for (key, value) in env::vars() {
-        if let Some(meta_name_caps) = re.captures(key.as_str()) {
-            let name = String::from(meta_name_caps.get(1).unwrap().as_str());
-            // Ignore KEEP_META_PLUGINS
-            if name != "PLUGINS" {
-                debug!("COMMON: Found meta: {}={}", name.clone(), value.clone());
-                meta_env.insert(name, value.clone());
+        if let Some(name) = key.strip_prefix(PREFIX) {
+            if !name.is_empty() && name != "PLUGINS" {
+                debug!("COMMON: Found meta: {}={}", name, value);
+                meta_env.insert(name.to_string(), value);
             }
         }
     }
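The `strip_prefix` rewrite above can be sketched stand-alone. This is a minimal, dependency-free version; `collect_prefixed` is an illustrative stand-in for `get_meta_from_env`, not the crate's actual API:

```rust
use std::collections::HashMap;
use std::env;

// Collects VAR=value pairs whose names start with `prefix`, skipping one
// reserved name — the same shape as the strip_prefix rewrite above.
fn collect_prefixed(prefix: &str, skip: &str) -> HashMap<String, String> {
    let mut out = HashMap::new();
    for (key, value) in env::vars() {
        // strip_prefix returns Some(rest) only when `key` begins with `prefix`,
        // so no regex (or regex crate dependency) is needed for anchored matches
        if let Some(name) = key.strip_prefix(prefix) {
            if !name.is_empty() && name != skip {
                out.insert(name.to_string(), value);
            }
        }
    }
    out
}

fn main() {
    env::set_var("KEEP_META_COMMAND", "ls -la");
    env::set_var("KEEP_META_PLUGINS", "ignored");
    let meta = collect_prefixed("KEEP_META_", "PLUGINS");
    assert_eq!(meta.get("COMMAND").map(String::as_str), Some("ls -la"));
    assert!(!meta.contains_key("PLUGINS"));
}
```

The `!name.is_empty()` guard also fixes a subtle difference from the old regex `(.+)`, which required at least one character after the prefix.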
@@ -446,15 +425,6 @@ pub struct DisplayItemInfo {
     pub metadata: Vec<(String, String)>,
 }

-/// Display data for a single list row (used by --list).
-pub struct DisplayListItem {
-    pub id: i64,
-    pub time: String,
-    pub size: String,
-    pub compression: String,
-    pub tags: Vec<String>,
-}
-
 /// Renders item detail table. Shared by local and client info modes.
 pub fn render_item_info_table(info: &DisplayItemInfo, table_config: &config::TableConfig) {
     use comfy_table::{Attribute, Cell};
@@ -643,7 +613,7 @@ pub fn sanitize_tags(tags: &[String]) -> String {
 pub struct ExportMeta {
     pub ts: DateTime<Utc>,
     pub compression: String,
-    pub size: Option<i64>,
+    pub uncompressed_size: Option<i64>,
     pub tags: Vec<String>,
     pub metadata: HashMap<String, String>,
 }
@@ -653,8 +623,8 @@ pub struct ExportMeta {
 pub struct ImportMeta {
     pub ts: DateTime<Utc>,
     pub compression: String,
-    #[serde(default)]
-    pub size: Option<i64>,
+    #[serde(default, alias = "size")]
+    pub uncompressed_size: Option<i64>,
     #[serde(default)]
     pub tags: Vec<String>,
     #[serde(default)]
@@ -665,6 +635,7 @@ pub struct ImportMeta {
 ///
 /// Returns the first ID if provided, the newest item matching tags,
 /// or the newest item overall if neither is specified.
+#[cfg(feature = "client")]
 pub fn resolve_item_id(
     client: &crate::client::KeepClient,
     ids: &[i64],
@@ -673,13 +644,13 @@ pub fn resolve_item_id(
     if !ids.is_empty() {
         Ok(ids[0])
     } else if !tags.is_empty() {
-        let items = client.list_items(tags, "newest", 0, 1, &HashMap::new())?;
+        let items = client.list_items(&[], tags, "newest", 0, 1, &HashMap::new())?;
         if items.is_empty() {
             return Err(anyhow!("No items found matching tags: {:?}", tags));
         }
         Ok(items[0].id)
     } else {
-        let items = client.list_items(&[], "newest", 0, 1, &HashMap::new())?;
+        let items = client.list_items(&[], &[], "newest", 0, 1, &HashMap::new())?;
         if items.is_empty() {
             return Err(anyhow!("No items found"));
         }
@@ -688,6 +659,7 @@ pub fn resolve_item_id(
 }

 /// Resolve item IDs from explicit IDs or tags (multi-item variant).
+#[cfg(feature = "client")]
 pub fn resolve_item_ids(
     client: &crate::client::KeepClient,
     ids: &[i64],
@@ -696,13 +668,13 @@ pub fn resolve_item_ids(
     if !ids.is_empty() {
         Ok(ids.to_vec())
     } else if !tags.is_empty() {
-        let items = client.list_items(tags, "newest", 0, 0, &HashMap::new())?;
+        let items = client.list_items(&[], tags, "newest", 0, 0, &HashMap::new())?;
         if items.is_empty() {
             return Err(anyhow!("No items found matching tags: {:?}", tags));
         }
         Ok(items.into_iter().map(|i| i.id).collect())
     } else {
-        let items = client.list_items(&[], "newest", 0, 1, &HashMap::new())?;
+        let items = client.list_items(&[], &[], "newest", 0, 1, &HashMap::new())?;
         if items.is_empty() {
             return Err(anyhow!("No items found"));
         }
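The fallback order used by `resolve_item_id`/`resolve_item_ids` (explicit IDs win, then newest match for tags, then newest overall) can be sketched with a stand-in lookup; `resolve` and `newest_for` are illustrative names, not the crate's client API:

```rust
// Resolution order: explicit IDs > newest item matching tags > newest overall.
// `newest_for` stands in for a client.list_items(..., "newest", ...) call.
fn resolve(
    ids: &[i64],
    tags: &[&str],
    newest_for: impl Fn(&[&str]) -> Option<i64>,
) -> Result<i64, String> {
    if let Some(&first) = ids.first() {
        Ok(first)
    } else if !tags.is_empty() {
        newest_for(tags).ok_or_else(|| format!("No items found matching tags: {:?}", tags))
    } else {
        newest_for(&[]).ok_or_else(|| "No items found".to_string())
    }
}

fn main() {
    let lookup = |tags: &[&str]| if tags.contains(&"x") { Some(9) } else { Some(1) };
    assert_eq!(resolve(&[5], &[], lookup), Ok(5)); // explicit ID wins
    assert_eq!(resolve(&[], &["x"], lookup), Ok(9)); // newest by tags
    assert_eq!(resolve(&[], &[], lookup), Ok(1)); // newest overall
}
```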

View File

@@ -6,7 +6,7 @@
 use crate::config;
 use crate::services::compression_service::CompressionService;
 use crate::services::item_service::ItemService;
-use anyhow::{Context, Result};
+use anyhow::{Context, Result, anyhow};
 use clap::Command;
 use command_fds::{CommandFdExt, FdMapping};
 use log::debug;
@@ -104,16 +104,19 @@ fn spawn_writer_thread(
     write_fd: OwnedFd,
 ) -> std::thread::JoinHandle<Result<()>> {
     let data_path = item_service.get_data_path().clone();
-    let item_id = item.item.id.expect("item must have ID");
+    let id = match item.item.id {
+        Some(id) => id,
+        None => return std::thread::spawn(|| Err(anyhow!("item missing ID"))),
+    };
     let compression = item.item.compression.clone();
     let mut item_path = data_path;
-    item_path.push(item_id.to_string());
+    item_path.push(id.to_string());

     std::thread::spawn(move || -> Result<()> {
         let compression_service = CompressionService::new();
         let mut reader = compression_service
             .stream_item_content(item_path, &compression)
-            .map_err(|e| anyhow::anyhow!("Failed to stream item {item_id}: {e}"))?;
+            .map_err(|e| anyhow::anyhow!("Failed to stream item {id}: {e}"))?;

         // Convert OwnedFd to File — safe, takes ownership, closes on drop
         let mut writer = std::fs::File::from(write_fd);
@@ -121,7 +124,7 @@ fn spawn_writer_thread(
             use std::io::Write;
             writer.write_all(chunk)
         })
-        .map_err(|e| anyhow::anyhow!("Error reading item {item_id}: {e}"))?;
+        .map_err(|e| anyhow::anyhow!("Error reading item {id}: {e}"))?;

         // writer dropped here, closing write_fd → diff sees EOF
         Ok(())
     })
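The `expect` removal above keeps the function's return type uniform: a missing ID no longer panics but instead spawns a thread that immediately returns an `Err`, which the caller observes via `join()`. A minimal std-only sketch of the pattern (`spawn_worker` is illustrative, not the project's function):

```rust
use std::thread;

// Instead of panicking with expect(), a missing value short-circuits by
// spawning a thread that immediately returns Err — the caller always gets
// a JoinHandle and can handle the error at join() time.
fn spawn_worker(id: Option<i64>) -> thread::JoinHandle<Result<i64, String>> {
    let id = match id {
        Some(id) => id,
        None => return thread::spawn(|| Err("item missing ID".to_string())),
    };
    // Normal path: do the work on the spawned thread
    thread::spawn(move || Ok(id * 2))
}

fn main() {
    assert_eq!(spawn_worker(Some(21)).join().unwrap(), Ok(42));
    assert!(spawn_worker(None).join().unwrap().is_err());
}
```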

View File

@@ -1,74 +1,115 @@
 use anyhow::{Context, Result, anyhow};
-use chrono::{DateTime, Utc};
+use chrono::Utc;
 use clap::Command;
 use log::debug;
 use std::collections::HashMap;
 use std::fs;
-use std::io::{Read, Write};
 use std::path::PathBuf;

+use crate::common::sanitize_ts_string;
 use crate::config;
+use crate::export_tar;
 use crate::filter_plugin::FilterChain;
-use crate::modes::common::{ExportMeta, sanitize_tags};
+use crate::modes::common::sanitize_tags;
 use crate::services::item_service::ItemService;
+use crate::services::types::ItemWithMeta;

-/// Export an item to data and metadata files.
+/// Export items to a `.keep.tar` archive.
 ///
-/// If no IDs or tags are specified, exports the latest item.
-/// Writes `{basename}.data.{compression}` for raw data and `{basename}.meta.yml` for metadata.
+/// Requires either IDs or tags (mutually exclusive). If IDs are given,
+/// ALL must exist. Archives contain per-item data and metadata files.
 pub fn mode_export(
     cmd: &mut Command,
     settings: &config::Settings,
-    ids: &mut [i64],
-    tags: &mut [String],
+    ids: &[i64],
+    tags: &[String],
     conn: &mut rusqlite::Connection,
     data_path: PathBuf,
     filter_chain: Option<FilterChain>,
 ) -> Result<()> {
+    // Validate: IDs XOR tags
     if !ids.is_empty() && !tags.is_empty() {
         cmd.error(
             clap::error::ErrorKind::InvalidValue,
-            "Both ID and tags given, you must supply either IDs or tags when using --export",
+            "Cannot use both IDs and tags with --export",
         )
         .exit();
-    } else if ids.len() > 1 {
+    }
+    if ids.is_empty() && tags.is_empty() {
         cmd.error(
             clap::error::ErrorKind::InvalidValue,
-            "More than one ID given, you must supply exactly one ID when using --export",
+            "Must provide either IDs or tags with --export",
         )
         .exit();
     }

     let item_service = ItemService::new(data_path.clone());
-    let meta_filter: HashMap<String, Option<String>> = settings
-        .meta
-        .iter()
-        .map(|(k, v)| (k.clone(), v.clone()))
-        .collect();
-
-    let item_with_meta = item_service
-        .find_item(conn, ids, tags, &meta_filter)
-        .map_err(|e| anyhow!("Unable to find matching item: {}", e))?;
-
-    let item_id = item_with_meta.item.id.context("Item missing ID")?;
-    let item_tags: Vec<String> = item_with_meta.tags.iter().map(|t| t.name.clone()).collect();
-    let meta_map = item_with_meta.meta_as_map();
+    let meta_filter = settings.meta_filter();
+
+    // Resolve items
+    let items: Vec<ItemWithMeta> = if !ids.is_empty() {
+        // Fetch each ID individually; ALL must exist
+        let mut result = Vec::new();
+        for &id in ids {
+            match item_service.get_item(conn, id) {
+                Ok(item) => result.push(item),
+                Err(_) => {
+                    cmd.error(
+                        clap::error::ErrorKind::InvalidValue,
+                        format!("Item {id} not found"),
+                    )
+                    .exit();
+                }
+            }
+        }
+        result
+    } else {
+        // Search by tags
+        item_service
+            .list_items(conn, tags, &meta_filter)
+            .map_err(|e| anyhow!("Unable to find matching items: {}", e))?
+    };
+
+    if items.is_empty() {
+        cmd.error(
+            clap::error::ErrorKind::InvalidValue,
+            "No items found matching the given criteria",
+        )
+        .exit();
+    }
+
+    // Validate: --export-filename-format doesn't use per-item vars with multiple items
+    if items.len() > 1 {
+        let fmt = &settings.export_filename_format;
+        if fmt.contains("{id}") || fmt.contains("{tags}") || fmt.contains("{compression}") {
+            cmd.error(
+                clap::error::ErrorKind::InvalidValue,
+                "Cannot use {id}, {tags}, or {compression} in --export-filename-format when exporting multiple items",
+            )
+            .exit();
+        }
+    }
+
+    // Compute export name
+    let dir_name = export_tar::export_name(&settings.export_name, &items);
+
+    // Compute tar filename from format template
+    let now = Utc::now();
+    let ts_str = sanitize_ts_string(&now.format("%Y-%m-%dT%H:%M:%SZ").to_string());

+    // Build template variables
     let mut vars = HashMap::new();
-    vars.insert("id".to_string(), item_id.to_string());
-    vars.insert("tags".to_string(), sanitize_tags(&item_tags));
-    vars.insert(
-        "ts".to_string(),
-        item_with_meta
-            .item
-            .ts
-            .format("%Y-%m-%dT%H:%M:%SZ")
-            .to_string(),
-    );
-    vars.insert(
-        "compression".to_string(),
-        item_with_meta.item.compression.clone(),
-    );
+    vars.insert("name".to_string(), dir_name.clone());
+    vars.insert("ts".to_string(), ts_str.clone());
+
+    // For single-item exports, also provide per-item vars
+    if items.len() == 1 {
+        let item = &items[0];
+        let item_id = item.item.id.context("Item missing ID")?;
+        let item_tags = item.tag_names();
+        vars.insert("id".to_string(), item_id.to_string());
+        vars.insert("tags".to_string(), sanitize_tags(&item_tags));
+        vars.insert("compression".to_string(), item.item.compression.clone());
+    }

     let basename = strfmt::strfmt(&settings.export_filename_format, &vars).map_err(|e| {
         anyhow!(
@@ -78,52 +119,27 @@ pub fn mode_export(
         )
     })?;

-    // Write data file
-    let data_filename = format!("{}.data.{}", basename, item_with_meta.item.compression);
-    let mut item_path = data_path.clone();
-    item_path.push(item_id.to_string());
-
-    if filter_chain.is_some() {
-        // Apply filters: decompress, filter, write
-        let (mut reader, _, _) = item_service.get_item_content_info_streaming_with_chain(
-            conn,
-            item_id,
-            filter_chain.as_ref(),
-        )?;
-        let mut out_file = fs::File::create(&data_filename)
-            .with_context(|| format!("Cannot create data file: {}", data_filename))?;
-        let mut buf = [0u8; 8192];
-        loop {
-            let n = reader.read(&mut buf)?;
-            if n == 0 {
-                break;
-            }
-            out_file.write_all(&buf[..n])?;
-        }
-        debug!("EXPORT: Wrote filtered data to {}", data_filename);
-    } else {
-        // Raw copy of compressed file
-        fs::copy(&item_path, &data_filename)
-            .with_context(|| format!("Cannot copy {} to {}", item_path.display(), data_filename))?;
-        debug!("EXPORT: Copied raw data to {}", data_filename);
-    }
-
-    // Write meta file
-    let meta_filename = format!("{}.meta.yml", basename);
-    let export_meta = ExportMeta {
-        ts: item_with_meta.item.ts,
-        compression: item_with_meta.item.compression.clone(),
-        size: item_with_meta.item.size,
-        tags: item_tags,
-        metadata: meta_map,
-    };
-    let meta_yaml = serde_yaml::to_string(&export_meta)?;
-    fs::write(&meta_filename, &meta_yaml)
-        .with_context(|| format!("Cannot write meta file: {}", meta_filename))?;
-    debug!("EXPORT: Wrote metadata to {}", meta_filename);
-
-    eprintln!("{} {}", data_filename, meta_filename);
+    let tar_filename = format!("{basename}.keep.tar");
+
+    // Write the tar archive
+    let tar_file = fs::File::create(&tar_filename)
+        .with_context(|| format!("Cannot create tar file: {tar_filename}"))?;
+    export_tar::write_export_tar(
+        tar_file,
+        &dir_name,
+        &items,
+        &data_path,
+        filter_chain.as_ref(),
+        &item_service,
+        conn,
+    )?;
+
+    if !settings.quiet {
+        eprintln!("{tar_filename}");
+    }
+
+    debug!("EXPORT: Wrote {} items to {tar_filename}", items.len());

     Ok(())
 }

View File

@@ -258,7 +258,7 @@ fn compression_description(name: &str) -> &str {
         "bzip2" => "High compression (requires bzip2 binary)",
         "xz" => "Very high compression (requires xz binary)",
         "zstd" => "Modern fast compression (requires zstd binary)",
-        "none" => "No compression",
+        "raw" => "No compression (alias: none)",
         _ => "",
     }
 }

View File

@@ -51,13 +51,8 @@ pub fn mode_get(
     // If both are empty, find_item will find the last item
     let item_service = ItemService::new(data_path.clone());

-    let meta_filter: std::collections::HashMap<String, Option<String>> = settings
-        .meta
-        .iter()
-        .map(|(k, v)| (k.clone(), v.clone()))
-        .collect();
     let item_with_meta = item_service
-        .find_item(conn, ids, tags, &meta_filter)
+        .find_item(conn, ids, tags, &settings.meta_filter())
         .map_err(|e| anyhow!("Unable to find matching item in database: {}", e))?;

     let item_id = item_with_meta.item.id.context("Item missing ID")?;

View File

@@ -12,12 +12,56 @@ use crate::common::PIPESIZE;
 use crate::compression_engine::CompressionType;
 use crate::config;
 use crate::db;
+use crate::import_tar;
 use crate::modes::common::ImportMeta;

-/// Import an item from a metadata file and optional data file.
+/// Import items from a `.keep.tar` archive or legacy `.meta.yml` file.
 ///
-/// If `import_data_file` is not provided, reads data from stdin.
+/// For `.keep.tar` files, all items are imported in their original ID order,
+/// each receiving a new auto-incremented ID from the database.
+/// For `.meta.yml` files, the legacy single-item import is used.
 pub fn mode_import(
+    cmd: &mut Command,
+    settings: &config::Settings,
+    import_path: &str,
+    conn: &mut rusqlite::Connection,
+    data_path: PathBuf,
+) -> Result<()> {
+    let path = PathBuf::from(import_path);
+
+    if import_path.ends_with(".keep.tar") {
+        // New tar-based import
+        let imported_ids = import_tar::import_from_tar(&path, conn, &data_path)?;
+        if !settings.quiet {
+            println!(
+                "KEEP: Imported {} item(s): {:?}",
+                imported_ids.len(),
+                imported_ids
+            );
+        }
+        debug!(
+            "IMPORT: Imported {} items from {}",
+            imported_ids.len(),
+            import_path
+        );
+    } else if import_path.ends_with(".meta.yml") {
+        // Legacy single-item import
+        import_legacy(cmd, settings, import_path, conn, data_path)?;
+    } else {
+        cmd.error(
+            clap::error::ErrorKind::InvalidValue,
+            format!("Unsupported import format: {}", import_path),
+        )
+        .exit();
+    }
+
+    Ok(())
+}
+
+/// Legacy single-item import from a `.meta.yml` file.
+fn import_legacy(
     cmd: &mut Command,
     settings: &config::Settings,
     meta_file: &str,
@@ -26,9 +70,9 @@
 ) -> Result<()> {
     // Read metadata
     let meta_yaml = fs::read_to_string(meta_file)
-        .with_context(|| format!("Cannot read metadata file: {}", meta_file))?;
+        .with_context(|| format!("Cannot read metadata file: {meta_file}"))?;
     let import_meta: ImportMeta = serde_yaml::from_str(&meta_yaml)
-        .with_context(|| format!("Cannot parse metadata file: {}", meta_file))?;
+        .with_context(|| format!("Cannot parse metadata file: {meta_file}"))?;

     // Validate compression type
     CompressionType::from_str(&import_meta.compression).map_err(|_| {
@@ -129,10 +173,12 @@
         );
     }

-    // Update item size (use imported size if available, otherwise data length)
-    let size_to_record = import_meta.size.unwrap_or(data_size);
+    // Update item sizes (use imported size if available, otherwise data length)
+    let size_to_record = import_meta.uncompressed_size.unwrap_or(data_size);
     let mut updated_item = item;
-    updated_item.size = Some(size_to_record);
+    updated_item.uncompressed_size = Some(size_to_record);
+    updated_item.compressed_size = Some(std::fs::metadata(&item_path)?.len() as i64);
+    updated_item.closed = true;
     db::update_item(conn, updated_item)?;

     if !settings.quiet {
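The extension-based dispatch in `mode_import` above can be sketched as a small pure function; `classify` is an illustrative stand-in for the real handlers, and the error string mirrors the diff's wording:

```rust
// Dispatch by filename suffix, as mode_import does: .keep.tar takes the
// tar-based path, .meta.yml the legacy path, anything else is rejected.
fn classify(path: &str) -> Result<&'static str, String> {
    if path.ends_with(".keep.tar") {
        Ok("tar")
    } else if path.ends_with(".meta.yml") {
        Ok("legacy")
    } else {
        Err(format!("Unsupported import format: {}", path))
    }
}

fn main() {
    assert_eq!(classify("backup.keep.tar"), Ok("tar"));
    assert_eq!(classify("item.meta.yml"), Ok("legacy"));
    assert!(classify("data.zip").is_err());
}
```

Matching on the full compound suffix (`.keep.tar` rather than just `.tar`) keeps the new format distinguishable from arbitrary tar files.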

View File

@@ -64,13 +64,8 @@ pub fn mode_info(
     // If both are empty, find_item will find the last item
     let item_service = ItemService::new(data_path.clone());

-    let meta_filter: std::collections::HashMap<String, Option<String>> = settings
-        .meta
-        .iter()
-        .map(|(k, v)| (k.clone(), v.clone()))
-        .collect();
     let item_with_meta = item_service
-        .find_item(conn, ids, tags, &meta_filter)
+        .find_item(conn, ids, tags, &settings.meta_filter())
         .map_err(|e| anyhow!("Unable to find matching item in database: {}", e))?;

     show_item(item_with_meta, settings, data_path)
@@ -143,14 +138,14 @@
         return show_item_structured(item_with_meta, settings, data_path, output_format);
     }

+    let item_tags = item_with_meta.tag_names();
     let item = item_with_meta.item;
     let item_id = item.id.context("Item missing ID")?;
-    let item_tags: Vec<String> = item_with_meta.tags.iter().map(|t| t.name.clone()).collect();

     let mut item_path_buf = data_path.clone();
     item_path_buf.push(item_id.to_string());

-    let size_str = match item.size {
+    let size_str = match item.uncompressed_size {
         Some(size) => format_size(size as u64, settings.human_readable),
         None => "Missing".to_string(),
     };
@@ -216,7 +211,7 @@
     data_path: PathBuf,
     output_format: OutputFormat,
 ) -> Result<()> {
-    let item_tags: Vec<String> = item_with_meta.tags.iter().map(|t| t.name.clone()).collect();
+    let item_tags = item_with_meta.tag_names();
     let meta_map = item_with_meta.meta_as_map();
     let item = item_with_meta.item;
     let item_id = item.id.context("Item missing ID")?;
@@ -230,7 +225,7 @@ fn show_item_structured(
         None => "Missing".to_string(),
     };

-    let stream_size_formatted = match item.size {
+    let stream_size_formatted = match item.uncompressed_size {
         Some(size) => format_size(size as u64, settings.human_readable),
         None => "Missing".to_string(),
     };
@@ -243,7 +238,7 @@
             .format("%F %T %Z")
             .to_string(),
         path: item_path_buf.to_str().unwrap_or("").to_string(),
-        stream_size: item.size.map(|s| s as u64),
+        stream_size: item.uncompressed_size.map(|s| s as u64),
         stream_size_formatted,
         compression: item.compression,
         file_size,

View File

@@ -5,7 +5,7 @@
 /// including table, JSON, and YAML.
 use crate::config;
 use crate::modes::common::ColumnType;
-use crate::modes::common::{apply_color, apply_table_attribute, format_size, OutputFormat};
+use crate::modes::common::{OutputFormat, apply_color, apply_table_attribute, format_size};
 use crate::services::item_service::ItemService;
 use crate::services::types::ItemWithMeta;
 use anyhow::{Context, Result};
@@ -81,28 +81,15 @@ struct ListItem {
 ///
 /// * `Result<()>` - Success or error if listing fails.
 pub fn mode_list(
-    cmd: &mut clap::Command,
+    _cmd: &mut clap::Command,
     settings: &config::Settings,
     ids: &mut [i64],
     tags: &[String],
     conn: &mut rusqlite::Connection,
     data_path: std::path::PathBuf,
 ) -> Result<()> {
-    if !ids.is_empty() {
-        cmd.error(
-            clap::error::ErrorKind::InvalidValue,
-            "ID given, you can only supply tags when using --list",
-        )
-        .exit();
-    }
-
     let item_service = ItemService::new(data_path.clone());
-    let meta_filter: std::collections::HashMap<String, Option<String>> = settings
-        .meta
-        .iter()
-        .map(|(k, v)| (k.clone(), v.clone()))
-        .collect();
-    let items_with_meta = item_service.list_items(conn, tags, &meta_filter)?;
+    let items_with_meta = item_service.get_items(conn, ids, tags, &settings.meta_filter())?;

     if settings.ids_only {
         for item_with_meta in &items_with_meta {
@@ -129,7 +116,7 @@ pub fn mode_list(
     table.set_header(header_cells);

     for item_with_meta in items_with_meta {
-        let tags: Vec<String> = item_with_meta.tags.iter().map(|t| t.name.clone()).collect();
+        let tags = item_with_meta.tag_names();
         let meta = item_with_meta.meta_as_map();
         let item = item_with_meta.item;
@@ -160,17 +147,23 @@ pub fn mode_list(
                     .with_timezone(&chrono::Local)
                     .format("%F %T")
                     .to_string(),
-                ColumnType::Size => match item.size {
+                ColumnType::Size => match item.uncompressed_size {
                     Some(size) => format_size(size as u64, settings.human_readable),
                     None => match item_path.metadata() {
                         Ok(_) => "Unknown".to_string(),
-                        Err(_) => "Missing".to_string(),
+                        Err(e) => {
+                            log::warn!("File missing or inaccessible: {}", e);
+                            "Missing".to_string()
+                        }
                     },
                 },
                 ColumnType::Compression => item.compression.to_string(),
                 ColumnType::FileSize => match item_path.metadata() {
                     Ok(metadata) => format_size(metadata.len(), settings.human_readable),
-                    Err(_) => "Missing".to_string(),
+                    Err(e) => {
+                        log::warn!("File missing or inaccessible: {}", e);
+                        "Missing".to_string()
+                    }
                 },
                 ColumnType::FilePath => item_path
                     .clone()
@@ -226,7 +219,7 @@ pub fn mode_list(
             // Apply styling for specific cases
             match column_type {
                 ColumnType::Size => {
-                    if item.size.is_none() {
+                    if item.uncompressed_size.is_none() {
                         if item_path.metadata().is_ok() {
                             cell = cell
                                 .fg(comfy_table::Color::Yellow)
@@ -276,7 +269,7 @@ fn show_list_structured(
     let mut list_items = Vec::new();
     for item_with_meta in items_with_meta {
-        let tags: Vec<String> = item_with_meta.tags.iter().map(|t| t.name.clone()).collect();
+        let tags = item_with_meta.tag_names();
         let meta = item_with_meta.meta_as_map();
         let item = item_with_meta.item;
         let item_id = item.id.context("Item missing ID")?;
@@ -290,7 +283,7 @@
             None => "Missing".to_string(),
         };

-        let size_formatted = match item.size {
+        let size_formatted = match item.uncompressed_size {
             Some(size) => crate::modes::common::format_size(size as u64, settings.human_readable),
             None => "Unknown".to_string(),
         };
@@ -302,7 +295,7 @@
                 .with_timezone(&chrono::Local)
                 .format("%F %T")
                 .to_string(),
-            size: item.size.map(|s| s as u64),
+            size: item.uncompressed_size.map(|s| s as u64),
             size_formatted,
             compression: item.compression,
             file_size,

File diff suppressed because it is too large

View File

@@ -87,6 +87,8 @@ pub fn add_routes(router: Router<AppState>) -> Router<AppState> {
         .route("/api/item/{item_id}/info", get(item::handle_get_item_info))
         .route("/api/item/{item_id}/update", post(item::handle_update_item))
         .route("/api/diff", get(item::handle_diff_items))
+        .route("/api/export", get(item::handle_export_items))
+        .route("/api/import", post(item::handle_import_items))
 }

 #[cfg(feature = "swagger")]

View File

@@ -2,6 +2,32 @@ use axum::{extract::State, http::StatusCode, response::Json};

 use crate::modes::server::common::{ApiResponse, AppState, StatusInfoResponse};

+async fn generate_status(
+    state: &AppState,
+) -> Result<crate::common::status::StatusInfo, StatusCode> {
+    let db_path = state
+        .db
+        .lock()
+        .await
+        .path()
+        .unwrap_or("unknown")
+        .to_string();
+
+    let status_service = crate::services::status_service::StatusService::new();
+    let mut cmd = state.cmd.lock().await;
+    status_service
+        .generate_status(
+            &mut cmd,
+            &state.settings,
+            state.data_dir.clone(),
+            db_path.into(),
+        )
+        .map_err(|e| {
+            log::warn!("Failed to generate status: {e}");
+            StatusCode::INTERNAL_SERVER_ERROR
+        })
+}
+
 #[utoipa::path(
     get,
     path = "/api/status",
@@ -48,29 +74,7 @@ use crate::modes::server::common::{ApiResponse, AppState, StatusInfoResponse};
 pub async fn handle_status(
     State(state): State<AppState>,
 ) -> Result<Json<StatusInfoResponse>, StatusCode> {
-    // Get database path
-    let db_path = state
-        .db
-        .lock()
-        .await
-        .path()
-        .unwrap_or("unknown")
-        .to_string();
-
-    // Use the status service to generate status info showing configured plugins
-    let status_service = crate::services::status_service::StatusService::new();
-    let mut cmd = state.cmd.lock().await;
-    let status_info = status_service
-        .generate_status(
-            &mut cmd,
-            &state.settings,
-            state.data_dir.clone(),
-            db_path.into(),
-        )
-        .map_err(|e| {
-            log::warn!("Failed to generate status: {e}");
-            StatusCode::INTERNAL_SERVER_ERROR
-        })?;
+    let status_info = generate_status(&state).await?;

     let response = StatusInfoResponse {
         success: true,
@@ -107,27 +111,7 @@ pub struct PluginsStatusResponse {
 pub async fn handle_plugins_status(
     State(state): State<AppState>,
 ) -> Result<Json<crate::modes::server::common::ApiResponse<PluginsStatusResponse>>, StatusCode> {
-    let db_path = state
-        .db
-        .lock()
-        .await
-        .path()
-        .unwrap_or("unknown")
-        .to_string();
-
-    let status_service = crate::services::status_service::StatusService::new();
-    let mut cmd = state.cmd.lock().await;
-    let status_info = status_service
-        .generate_status(
-            &mut cmd,
-            &state.settings,
-            state.data_dir.clone(),
-            db_path.into(),
-        )
-        .map_err(|e| {
-            log::warn!("Failed to generate status: {e}");
-            StatusCode::INTERNAL_SERVER_ERROR
-        })?;
+    let status_info = generate_status(&state).await?;

     let response_data = PluginsStatusResponse {
         meta_plugins: status_info.meta_plugins,

View File

@@ -1,4 +1,5 @@
 use crate::services::item_service::ItemService;
+use crate::services::types::ItemWithMeta;

 /// Common utilities and types for the server module.
 ///
 /// This module provides shared structures, functions, and middleware used across
@@ -182,6 +183,26 @@ pub struct ApiResponse<T> {
     pub error: Option<String>,
 }

+impl<T> ApiResponse<T> {
+    /// Creates a successful API response with the given data.
+    pub fn ok(data: T) -> Self {
+        Self {
+            success: true,
+            data: Some(data),
+            error: None,
+        }
+    }
+
+    /// Creates a successful API response with no data.
+    pub fn empty() -> Self {
+        Self {
+            success: true,
+            data: None,
+            error: None,
+        }
+    }
+}
+
 /// Response type for list of item information.
 ///
 /// Specialized response for endpoints that return multiple items.
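The `ApiResponse::ok`/`ApiResponse::empty` constructors added above can be exercised stand-alone. This dependency-free sketch omits the serde derives of the real type and is illustrative only:

```rust
// Minimal mirror of the ApiResponse helpers from the diff: the constructors
// centralize the success/data/error invariants instead of repeating struct
// literals at every call site.
#[derive(Debug, PartialEq)]
struct ApiResponse<T> {
    success: bool,
    data: Option<T>,
    error: Option<String>,
}

impl<T> ApiResponse<T> {
    fn ok(data: T) -> Self {
        Self { success: true, data: Some(data), error: None }
    }
    fn empty() -> Self {
        Self { success: true, data: None, error: None }
    }
}

fn main() {
    let r = ApiResponse::ok(7);
    assert!(r.success && r.data == Some(7) && r.error.is_none());
    // empty() needs a turbofish (or other context) to pin the payload type
    let e = ApiResponse::<i32>::empty();
    assert!(e.success && e.data.is_none() && e.error.is_none());
}
```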
@@ -345,10 +366,13 @@ pub struct StatusInfoResponse {
 /// let item_info = ItemInfo {
 ///     id: 42,
 ///     ts: "2023-12-01T15:30:45Z".to_string(),
-///     size: Some(1024),
+///     uncompressed_size: Some(1024),
+///     compressed_size: Some(512),
+///     closed: true,
 ///     compression: "gzip".to_string(),
 ///     tags: vec!["important".to_string()],
 ///     metadata: HashMap::from([("mime_type".to_string(), "text/plain".to_string())]),
+///     file_size: Some(512),
 /// };
 /// ```
 #[derive(Serialize, Deserialize, ToSchema)]
@@ -364,11 +388,19 @@ pub struct ItemInfo {
/// The creation timestamp of the item in ISO 8601 format.
#[schema(example = "2023-12-01T15:30:45Z")]
pub ts: String,
-/// Size in bytes.
+/// Uncompressed size in bytes.
///
-/// The size of the item's content in bytes, may be None if not set.
+/// The uncompressed size of the item's content in bytes, may be None if not set.
#[schema(example = 1024)]
-pub size: Option<i64>,
+pub uncompressed_size: Option<i64>,
/// Compressed size in bytes.
///
/// The compressed file size on disk in bytes, may be None if not set.
#[schema(example = 512)]
pub compressed_size: Option<i64>,
/// Whether the item has been fully written and closed.
#[schema(example = true)]
pub closed: bool,
/// Compression type.
///
/// The compression algorithm used for the item's content.
@@ -384,6 +416,56 @@ pub struct ItemInfo {
/// Key-value pairs containing additional metadata about the item.
#[schema(example = json!({"mime_type": "text/plain", "mime_encoding": "utf-8", "line_count": "42"}))]
pub metadata: HashMap<String, String>,
/// Actual file size in bytes.
///
/// The filesystem-reported size of the item's data file. This may differ from
/// `compressed_size` if the file was written and the database hasn't been updated.
/// None if the file cannot be read (e.g., file not found, permission denied).
#[schema(example = 512)]
pub file_size: Option<i64>,
}
impl ItemInfo {
/// Enriches this `ItemInfo` with the actual filesystem-reported size.
///
/// Reads the size of the item's data file from disk and sets `file_size`.
/// If the file cannot be read, `file_size` is left as None.
///
/// # Arguments
///
/// * `data_dir` - The data directory path containing item files.
///
/// # Returns
///
/// A new `ItemInfo` with `file_size` populated from the filesystem.
pub fn with_file_size(mut self, data_dir: &std::path::Path) -> Self {
let item_path = data_dir.join(self.id.to_string());
self.file_size = std::fs::metadata(&item_path).map(|m| m.len() as i64).ok();
self
}
}
impl TryFrom<ItemWithMeta> for ItemInfo {
type Error = anyhow::Error;
fn try_from(item_with_meta: ItemWithMeta) -> Result<Self, Self::Error> {
let tags = item_with_meta.tag_names();
let metadata = item_with_meta.meta_as_map();
Ok(ItemInfo {
id: item_with_meta
.item
.id
.ok_or_else(|| anyhow::anyhow!("Item missing ID"))?,
ts: item_with_meta.item.ts.to_rfc3339(),
uncompressed_size: item_with_meta.item.uncompressed_size,
compressed_size: item_with_meta.item.compressed_size,
closed: item_with_meta.item.closed,
compression: item_with_meta.item.compression,
tags,
metadata,
file_size: None,
})
}
}
/// Item information including content and metadata, with binary detection.
@@ -450,6 +532,7 @@ pub struct TagsQuery {
/// ```rust
/// use keep::modes::server::common::ListItemsQuery;
/// let query = ListItemsQuery {
/// ids: None,
/// tags: Some("important".to_string()),
/// order: Some("newest".to_string()),
/// start: Some(0),
@@ -459,6 +542,10 @@ pub struct TagsQuery {
/// ```
#[derive(Debug, Deserialize)]
pub struct ListItemsQuery {
/// Optional comma-separated item IDs for filtering.
///
/// String containing numeric IDs to filter the item list.
pub ids: Option<String>,
/// Optional comma-separated tags for filtering.
///
/// String containing tags to filter the item list.
@@ -664,7 +751,7 @@ pub struct UpdateItemQuery {
/// Optional comma-separated tags to add.
pub tags: Option<String>,
/// Optional uncompressed size to set on the item.
-pub size: Option<i64>,
+pub uncompressed_size: Option<i64>,
}
/// Request body for creating a new item.
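The new `file_size` field is populated from the filesystem by `with_file_size`. A minimal sketch of that lookup, assuming (as the diff shows) items are stored directly under `data_dir` named by their numeric ID:

```rust
use std::fs;
use std::path::Path;

/// Returns the on-disk size of the data file for `item_id` under `data_dir`,
/// or None if the file cannot be read (missing, permission denied, ...).
fn file_size(data_dir: &Path, item_id: i64) -> Option<i64> {
    let item_path = data_dir.join(item_id.to_string());
    fs::metadata(&item_path).map(|m| m.len() as i64).ok()
}

fn main() {
    // Write a 5-byte file named "42" into the temp dir and read its size back.
    let dir = std::env::temp_dir();
    fs::write(dir.join("42"), b"hello").unwrap();
    assert_eq!(file_size(&dir, 42), Some(5));
    // A nonexistent item yields None instead of an error.
    assert_eq!(file_size(Path::new("/no/such/dir/xyz"), 7), None);
    println!("ok");
}
```

Mapping the I/O error to `None` (via `.ok()`) matches the documented behavior: the API response simply omits the size rather than failing.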


@@ -179,24 +179,18 @@ async fn run_server(
let addr: SocketAddr = bind_address.parse()?;
// Warn if authentication is enabled without TLS
-if config.password.is_some() || config.password_hash.is_some() || config.jwt_secret.is_some() {
-#[cfg(not(feature = "tls"))]
+if (config.password.is_some() || config.password_hash.is_some() || config.jwt_secret.is_some())
+&& (config.cert_file.is_none() || config.key_file.is_none())
{
log::warn!(
-"SECURITY: Authentication enabled but TLS support is not compiled in. Credentials will be transmitted in plain text!"
+"SECURITY: Authentication enabled but TLS is not configured. Credentials will be transmitted in plain text!"
);
-#[cfg(feature = "tls")]
-if config.cert_file.is_none() || config.key_file.is_none() {
-log::warn!(
-"SECURITY: Authentication enabled but TLS is not configured. Credentials will be transmitted in plain text!"
-);
-}
}
// Build the app into a service
let service = app.into_make_service_with_connect_info::<SocketAddr>();
// Use TLS if both cert and key files are provided
-#[cfg(feature = "tls")]
if let (Some(cert_file), Some(key_file)) = (&config.cert_file, &config.key_file) {
info!("SERVER: HTTPS server listening on {addr}");


@@ -13,11 +13,13 @@ use serde::Deserialize;
use std::collections::HashMap;
/// Escape text content for safe HTML insertion.
#[inline]
fn esc(s: &str) -> String {
encode_text(s).to_string()
}
/// Escape attribute values for safe HTML attribute insertion.
#[inline]
fn esc_attr(s: &str) -> String {
encode_double_quoted_attribute(s).to_string()
}
@@ -240,7 +242,10 @@ fn build_item_list(
format!("<a href=\"/item/{item_id}\">{id_value}</a>")
}
"time" => item.ts.format("%Y-%m-%d %H:%M:%S").to_string(),
-"size" => item.size.map(|s| s.to_string()).unwrap_or_default(),
+"size" => item
.uncompressed_size
.map(|s| s.to_string())
.unwrap_or_default(),
"tags" => {
// Make sure we're using all tags for the item
let tag_links: Vec<String> = tags
@@ -424,7 +429,7 @@ fn build_item_details(conn: &Connection, id: i64) -> Result<String> {
));
html.push_str(&format!(
"<tr><th>Size</th><td>{}</td></tr>",
-item.size.unwrap_or(0)
+item.uncompressed_size.unwrap_or(0)
));
html.push_str(&format!(
"<tr><th>Compression</th><td>{}</td></tr>",


@@ -103,11 +103,20 @@ pub fn mode_update(
// Backfill size if not set
let mut updated_item = item.clone();
-if item.size.is_none() {
+if item.uncompressed_size.is_none() {
debug!("UPDATE: Size not set, backfilling from content file");
if let Some(size) = compute_item_size(&data_path, &item) {
debug!("UPDATE: Computed size: {size}");
-updated_item.size = Some(size);
+updated_item.uncompressed_size = Some(size);
db::update_item(conn, updated_item.clone())?;
}
}
// Backfill compressed_size if not set
if item.compressed_size.is_none() {
let item_path = data_path.join(item_id.to_string());
if let Ok(meta) = std::fs::metadata(&item_path) {
updated_item.compressed_size = Some(meta.len() as i64);
db::update_item(conn, updated_item.clone())?;
}
}


@@ -7,27 +7,6 @@ use std::str::FromStr;
pub struct CompressionService;
/// Service for handling compression and decompression of item content.
///
/// Provides methods to read compressed item files either fully into memory
/// or as streaming readers. Supports various compression types via engines.
/// This service abstracts the underlying compression engines for consistent access.
///
/// # Examples
///
/// ```ignore
/// let service = CompressionService::new();
/// let content = service.get_item_content(path, "gzip")?;
/// ```
/// Provides methods to read compressed item files either fully into memory
/// or as streaming readers. Supports various compression types via engines.
///
/// # Examples
///
/// ```ignore
/// let service = CompressionService::new();
/// let content = service.get_item_content(path, "gzip")?;
/// ```
impl CompressionService {
/// Creates a new CompressionService instance.
///
@@ -133,38 +112,27 @@ impl CompressionService {
Ok(reader)
}
/// Creates a decompressing reader wrapping the given reader.
///
/// Returns a boxed reader that decompresses on the fly based on the compression type.
/// Useful for decompressing network streams or other non-file sources.
///
/// # Arguments
///
/// * `reader` - The underlying compressed reader.
/// * `compression` - Compression type string (e.g., "gzip", "lz4").
///
/// # Returns
///
/// A boxed decompressing reader. Unknown/none types pass through unchanged.
pub fn decompressing_reader(
reader: Box<dyn Read>,
compression: &CompressionType,
-) -> Box<dyn Read> {
+) -> Result<Box<dyn Read>, CoreError> {
match compression {
CompressionType::GZip => {
use flate2::read::GzDecoder;
-Box::new(GzDecoder::new(reader))
+Ok(Box::new(GzDecoder::new(reader)))
}
CompressionType::LZ4 => {
use lz4_flex::frame::FrameDecoder;
-Box::new(FrameDecoder::new(reader))
+Ok(Box::new(FrameDecoder::new(reader)))
}
#[cfg(feature = "zstd")]
CompressionType::ZStd => {
use zstd::stream::read::Decoder;
-Box::new(Decoder::new(reader).expect("Failed to create zstd decoder"))
+Ok(Box::new(Decoder::new(reader).map_err(|e| {
CoreError::Compression(format!("zstd decoder error: {}", e))
})?))
}
-_ => reader,
+_ => Ok(reader),
}
}
@@ -184,24 +152,24 @@ impl CompressionService {
pub fn compressing_writer(
writer: Box<dyn Write>,
compression: &CompressionType,
-) -> Box<dyn Write> {
+) -> Result<Box<dyn Write>, CoreError> {
match compression {
CompressionType::GZip => {
use flate2::Compression;
use flate2::write::GzEncoder;
-Box::new(GzEncoder::new(writer, Compression::default()))
+Ok(Box::new(GzEncoder::new(writer, Compression::default())))
}
-CompressionType::LZ4 => Box::new(lz4_flex::frame::FrameEncoder::new(writer)),
+CompressionType::LZ4 => Ok(Box::new(lz4_flex::frame::FrameEncoder::new(writer))),
#[cfg(feature = "zstd")]
CompressionType::ZStd => {
use zstd::stream::write::Encoder;
-Box::new(
+Ok(Box::new(
Encoder::new(writer, 3)
-.expect("Failed to create zstd encoder")
+.map_err(|e| CoreError::Compression(format!("zstd encoder error: {}", e)))?
.auto_finish(),
-)
+))
}
-_ => writer,
+_ => Ok(writer),
}
}
}
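The change above converts both constructors from panicking (`expect`) to returning `Result`, so decoder-construction failures propagate with `?`. A self-contained sketch of the pattern with stand-in `CompressionType` and `CoreError` types (the real arms wrap flate2/lz4/zstd decoders):

```rust
use std::io::{Cursor, Read};

#[derive(Debug)]
enum CompressionType {
    Raw,
    Gzip,
}

#[derive(Debug)]
struct CoreError(String);

// Every arm returns Result, so fallible decoder construction can use `?`
// instead of `expect`. Gzip here is a stand-in that fails, illustrating
// the error path; the real code would wrap flate2's GzDecoder.
fn decompressing_reader(
    reader: Box<dyn Read>,
    compression: &CompressionType,
) -> Result<Box<dyn Read>, CoreError> {
    match compression {
        // Raw data needs no decoding: pass the reader through unchanged.
        CompressionType::Raw => Ok(reader),
        CompressionType::Gzip => Err(CoreError("gzip support not compiled in".into())),
    }
}

fn main() {
    let src: Box<dyn Read> = Box::new(Cursor::new(b"hello".to_vec()));
    let mut r = decompressing_reader(src, &CompressionType::Raw).unwrap();
    let mut buf = String::new();
    r.read_to_string(&mut buf).unwrap();
    assert_eq!(buf, "hello");
    assert!(decompressing_reader(Box::new(Cursor::new(Vec::new())), &CompressionType::Gzip).is_err());
    println!("{buf}");
}
```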


@@ -13,32 +13,27 @@ use thiserror::Error;
/// * `ItemNotFoundGeneric` - Generic item not found (no ID specified).
/// * `InvalidInput(String)` - User or config input validation failure with message.
/// * `Compression(String)` - Compression/decompression errors with details.
/// * `PayloadTooLarge` - Request body exceeded maximum allowed size.
/// * `Other(anyhow::Error)` - Catch-all for other anyhow-wrapped errors.
/// * `Migration(rusqlite_migration::Error)` - Database migration failures.
#[derive(Error, Debug)]
pub enum CoreError {
#[error("Database error: {0}")]
/// Database operation failed.
Database(#[from] rusqlite::Error),
#[error("I/O error: {0}")]
/// File or stream I/O operation failed.
Io(#[from] std::io::Error),
#[error("Item not found with id {0}")]
/// Item with the specified ID does not exist in the database.
ItemNotFound(i64),
#[error("Item not found")]
/// Item does not exist (no specific ID).
ItemNotFoundGeneric,
#[error("Invalid input: {0}")]
/// Input validation failed.
InvalidInput(String),
#[error("Compression error: {0}")]
/// Compression or decompression operation failed.
Compression(String),
#[error("Payload too large")]
PayloadTooLarge,
#[error(transparent)]
/// Other unexpected error.
Other(#[from] anyhow::Error),
#[error("Migration error: {0}")]
/// Database schema migration failed.
Migration(#[from] rusqlite_migration::Error),
}


@@ -1,5 +1,4 @@
use crate::filter_plugin::{FilterChain, parse_filter_string};
use once_cell::sync::Lazy;
use std::collections::HashMap;
use std::io::{Read, Result, Write};
use std::sync::Mutex;
@@ -166,8 +165,8 @@ impl FilterService {
/// # Panics
///
/// Lock acquisition failures (rare) cause panics in accessors.
-static FILTER_PLUGIN_REGISTRY: Lazy<Mutex<HashMap<String, FilterConstructor>>> =
-Lazy::new(|| Mutex::new(HashMap::new()));
+static FILTER_PLUGIN_REGISTRY: std::sync::LazyLock<Mutex<HashMap<String, FilterConstructor>>> =
+std::sync::LazyLock::new(|| Mutex::new(HashMap::new()));
/// Registers a filter plugin in the global registry.
///
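The `once_cell::sync::Lazy` static above becomes `std::sync::LazyLock` (stable since Rust 1.80), which is what allows the commit to drop the `once_cell` and `lazy_static` dependencies. A minimal sketch of the same registry shape, with `FilterConstructor` simplified to a plain `fn` pointer for illustration:

```rust
use std::collections::HashMap;
use std::sync::{LazyLock, Mutex};

// Stand-in for the real FilterConstructor type (illustrative only).
type FilterConstructor = fn() -> String;

// Lazily-initialized global registry: the closure runs on first access,
// with no external crates required.
static REGISTRY: LazyLock<Mutex<HashMap<String, FilterConstructor>>> =
    LazyLock::new(|| Mutex::new(HashMap::new()));

fn make_grep() -> String {
    "grep".to_string()
}

fn main() {
    // Register a plugin, then look it up and call its constructor.
    REGISTRY.lock().unwrap().insert("grep".into(), make_grep);
    let ctor = REGISTRY.lock().unwrap()["grep"];
    assert_eq!(ctor(), "grep");
    println!("registered: {}", REGISTRY.lock().unwrap().len());
}
```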


@@ -62,6 +62,12 @@ impl ItemService {
}
}
fn item_path(&self, item_id: i64) -> PathBuf {
let mut path = self.data_path.clone();
path.push(item_id.to_string());
path
}
/// Retrieves an item with its associated metadata and tags.
///
/// Fetches the item from the database by ID and loads its tags and metadata.
@@ -150,7 +156,7 @@ impl ItemService {
}
// Check size guard before loading content
-if let Some(size) = item_with_meta.item.size
+if let Some(size) = item_with_meta.item.uncompressed_size
&& size > MAX_CONTENT_SIZE
{
return Err(CoreError::InvalidInput(format!(
@@ -159,8 +165,7 @@ impl ItemService {
)));
}
-let mut item_path = self.data_path.clone();
-item_path.push(item_id.to_string());
+let item_path = self.item_path(item_id);
debug!("ITEM_SERVICE: Reading content from path: {item_path:?}");
let content = self
@@ -304,8 +309,7 @@ impl ItemService {
)));
}
-let mut item_path = self.data_path.clone();
-item_path.push(item_id.to_string());
+let item_path = self.item_path(item_id);
let reader = self
.compression_service
@@ -345,8 +349,7 @@ impl ItemService {
)));
}
-let mut item_path = self.data_path.clone();
-item_path.push(item_id.to_string());
+let item_path = self.item_path(item_id);
let reader = self
.compression_service
@@ -540,8 +543,7 @@ impl ItemService {
let item = db::get_item(conn, id)?.ok_or(CoreError::ItemNotFound(id))?;
debug!("ITEM_SERVICE: Found item to delete: {item:?}");
-let mut item_path = self.data_path.clone();
-item_path.push(id.to_string());
+let item_path = self.item_path(id);
debug!("ITEM_SERVICE: Deleting file at path: {item_path:?}");
let deleted_item = item.clone();
@@ -632,21 +634,22 @@ impl ItemService {
// Print the "KEEP: New item" message before starting to read input
if !settings.quiet {
if std::io::stderr().is_terminal() {
-let mut t = term::stderr().unwrap();
+if let Some(mut t) = term::stderr() {
let _ = t.reset();
let _ = t.attr(term::Attr::Bold);
let _ = write!(t, "KEEP:");
let _ = t.reset();
let _ = write!(t, " New item ");
let _ = t.attr(term::Attr::Bold);
let _ = write!(t, "{item_id}");
let _ = t.reset();
let _ = write!(t, " tags: ");
let _ = t.attr(term::Attr::Bold);
let _ = write!(t, "{}", tags.join(" "));
let _ = t.reset();
let _ = writeln!(t);
let _ = std::io::stderr().flush();
}
} else {
let mut t = std::io::stderr();
let _ = writeln!(t, "KEEP: New item: {item_id} tags: {tags:?}");
@@ -661,8 +664,7 @@ impl ItemService {
debug!("ITEM_SERVICE: Got {} meta plugins", plugins.len());
meta_service.initialize_plugins(&mut plugins);
-let mut item_path = self.data_path.clone();
-item_path.push(item_id.to_string());
+let item_path = self.item_path(item_id);
debug!("ITEM_SERVICE: Writing item to path: {item_path:?}");
let mut item_out = compression_engine.create(item_path.clone())?;
@@ -681,17 +683,20 @@ impl ItemService {
item_out.flush()?;
drop(item_out);
let compressed_size = std::fs::metadata(&item_path)?.len() as i64;
debug!("ITEM_SERVICE: Finalizing meta plugins");
meta_service.finalize_plugins(&mut plugins);
// Write collected plugin metadata to DB
-if let Ok(entries) = collected_meta.lock() {
+let entries = collected_meta.lock().expect("meta lock poisoned");
for (name, value) in entries.iter() {
db::add_meta(conn, item_id, name, value)?;
}
-}
-item.size = Some(total_bytes);
+item.uncompressed_size = Some(total_bytes);
item.compressed_size = Some(compressed_size);
item.closed = true;
db::update_item(conn, item.clone())?;
debug!("ITEM_SERVICE: Save completed successfully");
@@ -792,6 +797,7 @@ impl ItemService {
None,
None,
settings,
true,
)
}
@@ -812,6 +818,7 @@ impl ItemService {
client_compression_type: Option<CompressionType>,
import_ts: Option<DateTime<Utc>>,
settings: &Settings,
set_size: bool,
) -> Result<ItemWithMeta, CoreError> {
let mut cmd = Command::new("keep");
let mut tags = tags;
@@ -823,8 +830,8 @@ impl ItemService {
let engine = get_compression_engine(ct.clone())?;
(ct, engine)
} else {
-let ct = client_compression_type.unwrap_or(CompressionType::None);
-let engine = get_compression_engine(CompressionType::None)?;
+let ct = client_compression_type.unwrap_or(CompressionType::Raw);
+let engine = get_compression_engine(CompressionType::Raw)?;
(ct, engine)
};
@@ -853,10 +860,9 @@ impl ItemService {
meta_service.initialize_plugins(&mut plugins);
}
-let mut item_path = self.data_path.clone();
-item_path.push(item_id.to_string());
+let item_path = self.item_path(item_id);
-let mut item_out = compression_engine.create(item_path)?;
+let mut item_out = compression_engine.create(item_path.clone())?;
let mut total_bytes = 0i64;
@@ -872,11 +878,14 @@ impl ItemService {
item_out.flush()?;
drop(item_out);
let compressed_size = std::fs::metadata(&item_path)?.len() as i64;
if run_meta {
meta_service.finalize_plugins(&mut plugins);
}
-if run_meta && let Ok(entries) = collected_meta.lock() {
+if run_meta {
let entries = collected_meta.lock().expect("meta lock poisoned");
for (name, value) in entries.iter() {
db::add_meta(conn, item_id, name, value)?;
}
@@ -888,7 +897,9 @@ impl ItemService {
}
}
-item.size = Some(total_bytes);
+item.uncompressed_size = if set_size { Some(total_bytes) } else { None };
item.compressed_size = Some(compressed_size);
item.closed = true;
db::update_item(conn, item)?;
self.get_item(conn, item_id)
@@ -922,8 +933,7 @@ impl ItemService {
return self.get_item(conn, item_id);
}
-let mut item_path = self.data_path.clone();
-item_path.push(item_id.to_string());
+let item_path = self.item_path(item_id);
if !item_path.exists() {
return Err(CoreError::ItemNotFound(item_id));
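The repeated `data_path.clone()` + `push` pairs throughout this file collapse into the `item_path` helper introduced at the top of the diff. A standalone sketch of that helper:

```rust
use std::path::{Path, PathBuf};

// Item content files live directly under the data directory,
// named by their numeric ID (e.g. <data_dir>/42).
fn item_path(data_path: &Path, item_id: i64) -> PathBuf {
    let mut path = data_path.to_path_buf();
    path.push(item_id.to_string());
    path
}

fn main() {
    let p = item_path(Path::new("/var/lib/keep/data"), 42);
    assert_eq!(p, Path::new("/var/lib/keep/data/42"));
    println!("{}", p.display());
}
```

Centralizing the path construction means a future change to the on-disk layout (say, sharding by ID prefix) touches one function instead of seven call sites.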


@@ -1,3 +1,8 @@
/// Business logic services for the Keep application.
///
/// This module provides the core service layer that orchestrates item storage,
/// compression, metadata collection, and filtering. Services are used by both
/// local CLI modes and the HTTP server.
pub mod compression_service;
pub mod error;
pub mod filter_service;
@@ -13,5 +18,5 @@ pub use filter_service::{FilterService, register_filter_plugin};
pub use item_service::ItemService;
pub use meta_service::MetaService;
pub use status_service::StatusService;
-pub use types::{ItemWithContent, ItemWithMeta};
+pub use types::{ItemInfo, ItemWithContent, ItemWithMeta};
pub use utils::{calc_byte_range, extract_tags, parse_comma_tags};


@@ -40,6 +40,15 @@ impl ItemWithMeta {
.map(|m| (m.name, m.value))
.collect()
}
/// Returns a list of tag names for this item.
///
/// # Returns
///
/// `Vec<String>` - Tag names extracted from the tags list.
pub fn tag_names(&self) -> Vec<String> {
self.tags.iter().map(|t| t.name.clone()).collect()
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
@@ -53,3 +62,15 @@ pub struct ItemWithContent {
/// The content bytes.
pub content: Vec<u8>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ItemInfo {
pub id: i64,
pub ts: String,
pub uncompressed_size: Option<i64>,
pub compressed_size: Option<i64>,
pub closed: bool,
pub compression: String,
pub tags: Vec<String>,
pub metadata: HashMap<String, String>,
}


@@ -57,8 +57,10 @@ pub fn create_test_item(conn: &Connection) -> i64 {
let item = crate::db::Item {
id: None,
ts: chrono::Utc::now(),
-size: Some(100),
-compression: crate::compression_engine::CompressionType::None.to_string(),
+uncompressed_size: Some(100),
+compressed_size: Some(80),
+closed: true,
+compression: crate::compression_engine::CompressionType::Raw.to_string(),
};
db::insert_item(conn, item).expect("Failed to insert item")
}


@@ -1,19 +1,19 @@
#[cfg(test)]
mod tests {
-use crate::compression_engine::none::CompressionEngineNone;
+use crate::compression_engine::raw::CompressionEngineRaw;
use crate::tests::common::test_helpers::test_compression_engine;
#[test]
-fn test_compression_engine_none() {
+fn test_compression_engine_raw() {
let test_data = b"test compression data";
-let engine = CompressionEngineNone {};
+let engine = CompressionEngineRaw {};
test_compression_engine(&engine, test_data);
}
#[test]
-fn test_compression_engine_none_empty_data() {
+fn test_compression_engine_raw_empty_data() {
let test_data = b"";
-let engine = CompressionEngineNone {};
+let engine = CompressionEngineRaw {};
test_compression_engine(&engine, test_data);
}
}


@@ -7,7 +7,7 @@ mod tests {
fn test_compression_type_display() {
assert_eq!(format!("{}", CompressionType::LZ4), "lz4");
assert_eq!(format!("{}", CompressionType::GZip), "gzip");
-assert_eq!(format!("{}", CompressionType::None), "none");
+assert_eq!(format!("{}", CompressionType::Raw), "raw");
}
#[test]
@@ -21,8 +21,8 @@ mod tests {
CompressionType::GZip
);
assert_eq!(
-CompressionType::from_str("none").unwrap(),
-CompressionType::None
+CompressionType::from_str("raw").unwrap(),
+CompressionType::Raw
);
// Test case insensitivity
assert_eq!(
@@ -34,8 +34,8 @@ mod tests {
CompressionType::GZip
);
assert_eq!(
-CompressionType::from_str("NONE").unwrap(),
-CompressionType::None
+CompressionType::from_str("RAW").unwrap(),
+CompressionType::Raw
);
}
@@ -46,4 +46,19 @@ mod tests {
// "xz" is actually a valid compression type, so it should not error
assert!(CompressionType::from_str("xz").is_ok());
}
#[test]
fn test_compression_type_none_alias() {
// "none" is an alias for "raw"
assert_eq!(
CompressionType::from_str("none").unwrap(),
CompressionType::Raw
);
assert_eq!(
CompressionType::from_str("NONE").unwrap(),
CompressionType::Raw
);
// Display outputs "raw" (canonical name)
assert_eq!(format!("{}", CompressionType::Raw), "raw");
}
}
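The tests above pin down the rename: `raw` is the canonical name and `none` survives as a case-insensitive alias, so existing databases and configs keep parsing. A sketch of a `FromStr` that satisfies them (illustrative variant set; the real enum also carries LZ4, zstd, and xz):

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
enum CompressionType {
    Raw,
    Gzip,
}

// "raw" is canonical; "none" is accepted as a legacy alias.
// Lowercasing first gives the case-insensitivity the tests check.
impl FromStr for CompressionType {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "raw" | "none" => Ok(CompressionType::Raw),
            "gzip" => Ok(CompressionType::Gzip),
            other => Err(format!("unknown compression type: {other}")),
        }
    }
}

fn main() {
    assert_eq!("NONE".parse::<CompressionType>(), Ok(CompressionType::Raw));
    assert_eq!("raw".parse::<CompressionType>(), Ok(CompressionType::Raw));
    assert!("zip9".parse::<CompressionType>().is_err());
    println!("ok");
}
```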


@@ -13,9 +13,9 @@ mod tests {
.expect("Failed to get GZip engine");
assert!(gzip_engine.is_supported());
-let none_engine = compression_engine::get_compression_engine(CompressionType::None)
-.expect("Failed to get None engine");
-assert!(none_engine.is_supported());
+let raw_engine = compression_engine::get_compression_engine(CompressionType::Raw)
+.expect("Failed to get Raw engine");
+assert!(raw_engine.is_supported());
}
#[test]


@@ -27,8 +27,10 @@ mod tests {
let item = crate::db::Item {
id: Some(999), // Non-existent item
ts: chrono::Utc::now(),
-size: Some(0),
-compression: crate::compression_engine::CompressionType::None.to_string(),
+uncompressed_size: Some(0),
+compressed_size: Some(0),
+closed: true,
+compression: crate::compression_engine::CompressionType::Raw.to_string(),
};
let metas = db::get_item_meta(&conn, &item);


@@ -32,8 +32,10 @@ mod tests {
let item = crate::db::Item {
id: Some(999), // Non-existent item
ts: chrono::Utc::now(),
-size: Some(0),
-compression: crate::compression_engine::CompressionType::None.to_string(),
+uncompressed_size: Some(0),
+compressed_size: Some(0),
+closed: true,
+compression: crate::compression_engine::CompressionType::Raw.to_string(),
};
let delete_result = db::delete_item_tags(&conn, item);


@@ -0,0 +1,96 @@
#[cfg(test)]
mod export_tar_tests {
    use crate::db::{Item, Meta, Tag};
    use crate::export_tar::{common_tags, export_name};
    use crate::services::types::ItemWithMeta;
    use chrono::Utc;

    fn make_item_with_tags(id: i64, tags: Vec<&str>) -> ItemWithMeta {
        ItemWithMeta {
            item: Item {
                id: Some(id),
                ts: Utc::now(),
                uncompressed_size: Some(100),
                compressed_size: Some(80),
                closed: true,
                compression: "raw".to_string(),
            },
            tags: tags
                .into_iter()
                .map(|t| Tag {
                    id: 0,
                    name: t.to_string(),
                })
                .collect(),
            meta: Vec::new(),
        }
    }

    #[test]
    fn test_common_tags_empty() {
        let items: Vec<ItemWithMeta> = Vec::new();
        assert!(common_tags(&items).is_empty());
    }

    #[test]
    fn test_common_tags_single_item() {
        let items = vec![make_item_with_tags(1, vec!["foo", "bar"])];
        let tags = common_tags(&items);
        assert_eq!(tags, vec!["bar", "foo"]);
    }

    #[test]
    fn test_common_tags_intersection() {
        let items = vec![
            make_item_with_tags(1, vec!["foo", "bar", "baz"]),
            make_item_with_tags(2, vec!["foo", "bar", "qux"]),
            make_item_with_tags(3, vec!["foo", "baz"]),
        ];
        let tags = common_tags(&items);
        assert_eq!(tags, vec!["foo"]);
    }

    #[test]
    fn test_common_tags_no_intersection() {
        let items = vec![
            make_item_with_tags(1, vec!["foo"]),
            make_item_with_tags(2, vec!["bar"]),
        ];
        let tags = common_tags(&items);
        assert!(tags.is_empty());
    }

    #[test]
    fn test_export_name_with_arg() {
        let items = vec![make_item_with_tags(1, vec!["foo"])];
        let name = export_name(&Some("mybackup".to_string()), &items);
        assert_eq!(name, "mybackup");
    }

    #[test]
    fn test_export_name_default_with_tags() {
        let items = vec![
            make_item_with_tags(1, vec!["foo", "bar"]),
            make_item_with_tags(2, vec!["foo", "baz"]),
        ];
        let name = export_name(&None, &items);
        assert_eq!(name, "export_foo");
    }

    #[test]
    fn test_export_name_default_no_common_tags() {
        let items = vec![
            make_item_with_tags(1, vec!["foo"]),
            make_item_with_tags(2, vec!["bar"]),
        ];
        let name = export_name(&None, &items);
        assert_eq!(name, "export");
    }

    #[test]
    fn test_export_name_default_empty() {
        let items: Vec<ItemWithMeta> = Vec::new();
        let name = export_name(&None, &items);
        assert_eq!(name, "export");
    }
}
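The assertions in this new test file pin down the expected semantics: `common_tags` yields the sorted intersection of every item's tag set (empty input gives no tags), and `export_name` prefers an explicit name, then falls back to `export_<first common tag>` or plain `export`. A standalone sketch of logic satisfying those assertions — an illustration operating on bare tag lists, not the project's actual implementation in `crate::export_tar`:

```rust
use std::collections::BTreeSet;

// Sorted intersection of all tag sets; empty input yields an empty Vec.
fn common_tags(items: &[Vec<String>]) -> Vec<String> {
    let mut iter = items.iter();
    let first: BTreeSet<String> = match iter.next() {
        Some(tags) => tags.iter().cloned().collect(),
        None => return Vec::new(),
    };
    let common = iter.fold(first, |acc, tags| {
        let set: BTreeSet<String> = tags.iter().cloned().collect();
        acc.intersection(&set).cloned().collect()
    });
    // BTreeSet iteration order gives the sorted result the tests expect.
    common.into_iter().collect()
}

// Explicit name wins; otherwise "export_<first common tag>" or "export".
fn export_name(arg: &Option<String>, items: &[Vec<String>]) -> String {
    if let Some(name) = arg {
        return name.clone();
    }
    match common_tags(items).first() {
        Some(tag) => format!("export_{tag}"),
        None => "export".to_string(),
    }
}

fn main() {
    let items = vec![
        vec!["foo".to_string(), "bar".to_string()],
        vec!["foo".to_string(), "baz".to_string()],
    ];
    assert_eq!(common_tags(&items), vec!["foo"]);
    assert_eq!(export_name(&None, &items), "export_foo");
    assert_eq!(export_name(&None, &[]), "export");
    println!("ok");
}
```

Using `BTreeSet` rather than `HashSet` is what makes the single-item case come back sorted (`["bar", "foo"]`) without an explicit sort step.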

View File

@@ -0,0 +1,218 @@
#[cfg(test)]
mod import_tar_tests {
    use crate::db;
    use crate::export_tar::write_export_tar;
    use crate::import_tar::import_from_tar;
    use crate::services::item_service::ItemService;
    use crate::services::types::ItemWithMeta;
    use anyhow::Result;
    use chrono::Utc;
    use std::io::Write;
    use std::path::Path;
    use tempfile::TempDir;

    fn setup_test_env() -> (TempDir, rusqlite::Connection, std::path::PathBuf) {
        let temp_dir = TempDir::new().unwrap();
        let db_path = temp_dir.path().join("test.db");
        let conn = db::open(db_path).unwrap();
        let data_path = temp_dir.path().join("data");
        std::fs::create_dir_all(&data_path).unwrap();
        (temp_dir, conn, data_path)
    }

    fn save_test_item(
        conn: &mut rusqlite::Connection,
        data_path: &Path,
        content: &[u8],
        tags: Vec<&str>,
        compression: &str,
    ) -> i64 {
        let item = db::insert_item_with_ts(conn, Utc::now(), compression).unwrap();
        let item_id = item.id.unwrap();
        // Write data file
        let mut file_path = data_path.to_path_buf();
        file_path.push(item_id.to_string());
        let mut file = std::fs::File::create(&file_path).unwrap();
        file.write_all(content).unwrap();
        // Set sizes and mark the item closed
        let mut updated = item;
        updated.uncompressed_size = Some(content.len() as i64);
        updated.compressed_size = Some(content.len() as i64);
        updated.closed = true;
        db::update_item(conn, updated).unwrap();
        // Set tags
        let tag_names: Vec<String> = tags.into_iter().map(|t| t.to_string()).collect();
        db::set_item_tags(
            conn,
            crate::db::Item {
                id: Some(item_id),
                ts: Utc::now(),
                uncompressed_size: Some(content.len() as i64),
                compressed_size: Some(content.len() as i64),
                closed: true,
                compression: compression.to_string(),
            },
            &tag_names,
        )
        .unwrap();
        item_id
    }

    #[test]
    fn test_roundtrip_export_import() -> Result<()> {
        let (_dir, mut conn, data_path) = setup_test_env();
        // Save test items
        let id1 = save_test_item(&mut conn, &data_path, b"hello world", vec!["test"], "raw");
        let id2 = save_test_item(
            &mut conn,
            &data_path,
            b"foo bar baz",
            vec!["test", "extra"],
            "raw",
        );
        // Get items with metadata
        let item_service = ItemService::new(data_path.clone());
        let items = vec![
            item_service.get_item(&conn, id1)?,
            item_service.get_item(&conn, id2)?,
        ];
        // Export to tar
        let tar_path = _dir.path().join("test_export.keep.tar");
        let tar_file = std::fs::File::create(&tar_path)?;
        write_export_tar(
            tar_file,
            "test_export",
            &items,
            &data_path,
            None,
            &item_service,
            &conn,
        )?;
        assert!(tar_path.exists());
        let tar_size = std::fs::metadata(&tar_path)?.len();
        assert!(tar_size > 0, "Tar file should not be empty");
        // Import into a fresh data directory
        let new_data_path = _dir.path().join("new_data");
        std::fs::create_dir_all(&new_data_path)?;
        let new_ids = import_from_tar(&tar_path, &mut conn, &new_data_path)?;
        assert_eq!(new_ids.len(), 2, "Should import 2 items");
        // Verify imported data
        let mut imported_data1 = Vec::new();
        let mut f1 = std::fs::File::open(new_data_path.join(new_ids[0].to_string()))?;
        std::io::Read::read_to_end(&mut f1, &mut imported_data1)?;
        assert_eq!(imported_data1, b"hello world");
        let mut imported_data2 = Vec::new();
        let mut f2 = std::fs::File::open(new_data_path.join(new_ids[1].to_string()))?;
        std::io::Read::read_to_end(&mut f2, &mut imported_data2)?;
        assert_eq!(imported_data2, b"foo bar baz");
        Ok(())
    }

    #[test]
    fn test_import_preserves_id_order() -> Result<()> {
        let (_dir, mut conn, data_path) = setup_test_env();
        // Save three items; their IDs are auto-assigned 1, 2, 3
        save_test_item(&mut conn, &data_path, b"item1", vec!["a"], "raw");
        save_test_item(&mut conn, &data_path, b"item2", vec!["b"], "raw");
        save_test_item(&mut conn, &data_path, b"item3", vec!["c"], "raw");
        let item_service = ItemService::new(data_path.clone());
        let items = vec![
            item_service.get_item(&conn, 1)?,
            item_service.get_item(&conn, 2)?,
            item_service.get_item(&conn, 3)?,
        ];
        // Export
        let tar_path = _dir.path().join("order_test.keep.tar");
        let tar_file = std::fs::File::create(&tar_path)?;
        write_export_tar(
            tar_file,
            "order_test",
            &items,
            &data_path,
            None,
            &item_service,
            &conn,
        )?;
        // Import into new data dir
        let new_data_path = _dir.path().join("new_data");
        std::fs::create_dir_all(&new_data_path)?;
        let new_ids = import_from_tar(&tar_path, &mut conn, &new_data_path)?;
        // IDs should be 4, 5, 6 (next available after 1, 2, 3)
        assert_eq!(new_ids, vec![4, 5, 6]);
        Ok(())
    }

    #[test]
    fn test_import_empty_tar_error() {
        let (_dir, mut conn, data_path) = setup_test_env();
        // Create an empty tar file
        let tar_path = _dir.path().join("empty.keep.tar");
        {
            let tar_file = std::fs::File::create(&tar_path).unwrap();
            let mut builder = tar::Builder::new(tar_file);
            builder.finish().unwrap();
        }
        let result = import_from_tar(&tar_path, &mut conn, &data_path);
        assert!(result.is_err(), "Empty tar should return an error");
    }

    #[test]
    fn test_common_tags_intersection() {
        use crate::db::{Item, Tag};
        use crate::export_tar::common_tags;
        let make_item = |tags: Vec<&str>| ItemWithMeta {
            item: Item {
                id: Some(1),
                ts: Utc::now(),
                uncompressed_size: None,
                compressed_size: None,
                closed: false,
                compression: "raw".to_string(),
            },
            tags: tags
                .into_iter()
                .map(|t| Tag {
                    id: 0,
                    name: t.to_string(),
                })
                .collect(),
            meta: Vec::new(),
        };
        let items = vec![
            make_item(vec!["a", "b", "c"]),
            make_item(vec!["a", "b", "d"]),
            make_item(vec!["a", "c", "d"]),
        ];
        assert_eq!(common_tags(&items), vec!["a"]);
        let items_single = vec![make_item(vec!["x", "y"])];
        let tags = common_tags(&items_single);
        assert_eq!(tags, vec!["x", "y"]);
    }
}

View File

@@ -3,10 +3,10 @@
 #[cfg(test)]
 pub mod digest_tests;
-#[cfg(feature = "infer")]
+#[cfg(feature = "meta_infer")]
 #[cfg(test)]
 pub mod infer_tests;
-#[cfg(feature = "tree_magic_mini")]
+#[cfg(feature = "meta_tree_magic_mini")]
 #[cfg(test)]
 pub mod tree_magic_mini_tests;

View File

@@ -3,6 +3,8 @@ pub mod compression;
 pub mod compression_engine;
 pub mod compression_types;
 pub mod db;
+pub mod export_tar_tests;
+pub mod import_tar_tests;
 pub mod meta_plugin;
 pub mod modes;
 pub mod server;

View File

@@ -1,5 +1,4 @@
 use anyhow::{Result, bail};
-use once_cell::sync::Lazy;
 
 /// Supported LLM token encodings.
 #[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
@@ -48,10 +47,10 @@ impl std::fmt::Debug for Tokenizer {
     }
 }
 
 /// Static tokenizer instances — loaded once per process, shared across all plugins.
-static CL100K: Lazy<Tokenizer> = Lazy::new(|| {
+static CL100K: std::sync::LazyLock<Tokenizer> = std::sync::LazyLock::new(|| {
     Tokenizer::new(TokenEncoding::Cl100kBase).expect("Failed to create cl100k_base tokenizer")
 });
-static O200K: Lazy<Tokenizer> = Lazy::new(|| {
+static O200K: std::sync::LazyLock<Tokenizer> = std::sync::LazyLock::new(|| {
     Tokenizer::new(TokenEncoding::O200kBase).expect("Failed to create o200k_base tokenizer")
 });