Commit Graph

168 Commits (3afb88920cbca55120b637c20b9338b210d5d91b)
 

Author SHA1 Message Date
David Li 3afb88920c merge: resolve and finalize branch jw 2026-05-15 00:16:42 +08:00
jackwener 52cc39a55c chore(release): bump version to 0.2.0
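Main additions:
- `wx attachments` / `wx extract`: decrypt and extract V2 image attachments from local chat data (macOS / Windows)
- `DbCache` incremental WAL reuse: the daemon request path drops from a full decrypt of ~120s per request to < 1s (typical WAL)

See #57 / #58 for the full changelog.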
2026-05-14 21:38:05 +08:00
jackwener 6424a2162b fix(cache): reuse decrypted db across wal-only updates (#58) 2026-05-14 19:37:22 +08:00
jackwener e9f65ba71b review: preserve wal incremental reuse across restart 2026-05-14 19:35:36 +08:00
jackwener b032b8be04 fix(cache): apply WAL incrementally instead of full re-decrypting on WAL mtime change
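Previously, DbCache ran full_decrypt whenever the mtime of either the .db or
the .db-wal file changed. While writing messages, WeChat keeps appending to
the WAL (when no checkpoint occurs), so every attachments/extract request
re-decrypted the 1.8GB message_0.db (measured at ~120s per request).

There are now three hit paths:
  1. db_mt and wal_mt both unchanged → return the cached path directly
  2. db_mt unchanged, wal_mt changed → **apply the WAL once more** on top of the cached output
     (apply_wal is idempotent: old frames redo the same page writes, new frames are appended and take effect)
  3. db_mt changed → full decrypt + apply WAL (the old path)

Result: a typical WAL (< 10MB) drops from ~120s to < 1s; even a large 100MB WAL takes only ~7s.
SQLite never spontaneously produces "main database unchanged + WAL emptied",
so the edge case of path 2 needs no special handling.

Tests cover all three paths:
  - exact_mtime_hit_skips_decrypt
  - wal_only_change_uses_incremental_path
  - db_mtime_change_triggers_full_decrypt
The paths are told apart by whether full_decrypt has rewritten the cached file size to a multiple of PAGE_SZ.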
2026-05-14 19:24:02 +08:00
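The three hit paths described in the commit above can be sketched as a small decision function. This is only an illustration under assumptions: `CachedMtimes`, `CacheAction`, and `decide` are hypothetical names, not the actual DbCache API, and the real code also performs the decryption and WAL replay that this sketch leaves out.

```rust
use std::time::SystemTime;

/// Modification times recorded when the cached plaintext copy was last refreshed.
/// (Hypothetical type for illustration.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CachedMtimes {
    pub db: SystemTime,
    pub wal: Option<SystemTime>,
}

/// What the cache should do for the current request. (Hypothetical type.)
#[derive(Debug, PartialEq, Eq)]
pub enum CacheAction {
    /// Path 1: neither file changed, return the cached path as-is.
    Reuse,
    /// Path 2: only the WAL changed, replay it on top of the cached copy.
    ApplyWal,
    /// Path 3: the main database changed (or nothing is cached yet).
    FullDecrypt,
}

/// Pick a path by comparing the recorded mtimes with the current ones.
/// Assumption: a WAL that vanishes while the main database is unchanged is
/// not treated specially, matching the commit's note about SQLite behavior.
pub fn decide(
    cached: Option<CachedMtimes>,
    db_now: SystemTime,
    wal_now: Option<SystemTime>,
) -> CacheAction {
    match cached {
        None => CacheAction::FullDecrypt,
        Some(c) if c.db != db_now => CacheAction::FullDecrypt,
        Some(c) if c.wal != wal_now => CacheAction::ApplyWal,
        Some(_) => CacheAction::Reuse,
    }
}
```

Because replaying the WAL is described as idempotent, choosing `ApplyWal` more often than strictly necessary would cost time but should not corrupt the cached copy.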
jackwener ff96f957b7 feat(attachment): support image extraction from local chat data (#57) 2026-05-14 19:11:13 +08:00
jackwener b63589b368 review: tighten attachment extraction scope 2026-05-14 19:10:03 +08:00
jackwener 7feacc6371 fix(daemon): drop redundant `ok` from extract payload (collides with Response.ok)
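Response uses #[serde(flatten)] to splice the Value returned by q_* into
`{ok, error, ...data}`. Adding another `"ok": true` inside q_extract writes two
keys with the same name on the wire, and `serde_json::from_str::<Response>` on
the CLI side fails with "duplicate field `ok`". Externally this shows up as
"extract failed / failed to parse daemon response", even though the daemon has
already decoded the image. None of the other q_* handlers (biz_articles /
sessions / history, etc.) add `ok`, so this keeps them consistent.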
2026-05-14 18:48:46 +08:00
jackwener 2d88c9542d feat(attachment): wire wx attachments / wx extract end-to-end
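Wires the V1 (legacy XOR + V1 fixed-AES) and platform-specific V2 (macOS /
Windows) image decryption all the way through to the CLI:

- ipc: add two Request variants, Attachments / Extract
- daemon/server: dispatch routes to query::q_attachments / q_extract
- daemon/cache: make DbCache::db_dir() public so the resolver can derive wxchat_base
- daemon/query: q_attachments reads the Msg_<chat> tables, filters by
  (local_type & 0xFFFFFFFF) IN (...), sorts globally by ts DESC, then
  paginates, and returns an opaque attachment_id;
  q_extract decodes the attachment_id → queries message_resource.db → locates
  the local .dat → dispatches to the v1/v2 decoder by magic → writes to disk.
  The bridge uses ImageKeyMaterial.{aes_key, xor_key} (codex measured
  xor_key=0xa2 on a real account, so 0x88 must not be hardcoded)
- cli: add two subcommands, wx attachments / wx extract, with a flag style
  matching the existing history / biz-articles
- README + SKILL: add an attachment extraction section, covering the three decoding tiers and V2 image key derivation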
2026-05-14 18:40:57 +08:00
jackwener bf8d0d934a feat(attachment): implement V2 image key providers 2026-05-14 18:34:38 +08:00
jackwener 14fdfde1d3 feat(attachment): scaffold module + V1 decoders + resource resolver
Lays down the skeleton for chat attachment extraction. This commit
introduces the `attachment` module with:

- `attachment_id`: opaque base64url(json) round-trip handle for CLI/IPC. Carries
  `(chat, local_id, create_time, kind)`. `local_id` alone is not unique
  (up to 7 records sharing one local_id were observed within a single chat),
  so create_time is required for disambiguation.
- `decoder/`: dispatch by 6B header magic. Three branches:
  - `V2_MAGIC` → AES-128-ECB + raw + XOR (need image AES key)
  - `V1_MAGIC` → AES-128-ECB with fixed key `cfcd208495d565ef` (= md5("0")[:16])
  - else → legacy single-byte XOR with magic auto-detect
  Manual ECB + PKCS7 unpad to avoid pulling in another crate.
- `resolver`: `message_resource.db` lookup chain
  `username → ChatName2Id.rowid → MessageResourceInfo.packed_info → md5`
  + on-disk `.dat` selection (full > _h > _t) under
  `<wxchat_base>/msg/attach/<md5(chat)>/<YYYY-MM>/Img/<md5>[_t|_h].dat`.
  Honors `message_local_type % 2^32` to strip the high flag bits, and orders by
  `message_create_time DESC` to handle local_id reuse.
- `image_key/`: stub trait + macOS / Windows placeholders. To be filled by
  codex with the V2 image key extraction (kvcomm + brute-force on macOS, memory
  scan on Windows).

V1 decoder ships with 6 unit tests covering every supported magic + the BMP
extra validation; resolver ships with packed_info parser + dat-file selection
tests; v2 decoder ships with header validation tests. 21 tests pass.

`cargo check` and `cargo check --target x86_64-pc-windows-gnu` both clean.
2026-05-14 18:25:32 +08:00
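The "legacy single-byte XOR with magic auto-detect" branch mentioned above can be illustrated roughly as follows. This is a sketch under assumptions: `MAGICS`, `detect_xor_key`, and `decode_legacy_xor` are hypothetical names, the magic list is a small subset chosen for illustration, and the extra BMP validation the commit mentions is omitted.

```rust
/// Known image headers used to guess the XOR key (illustrative subset only).
const MAGICS: &[&[u8]] = &[
    &[0xFF, 0xD8, 0xFF],       // JPEG
    &[0x89, 0x50, 0x4E, 0x47], // PNG
    &[0x47, 0x49, 0x46, 0x38], // GIF ("GIF8")
];

/// Returns the single-byte key under which `data` starts with a known magic.
/// The candidate key is derived from the first byte, then checked against the
/// remaining magic bytes.
pub fn detect_xor_key(data: &[u8]) -> Option<u8> {
    for magic in MAGICS {
        if data.len() < magic.len() {
            continue;
        }
        let key = data[0] ^ magic[0];
        if magic.iter().zip(data).all(|(m, d)| d ^ key == *m) {
            return Some(key);
        }
    }
    None
}

/// Decodes the whole buffer once a key has been detected.
pub fn decode_legacy_xor(data: &[u8]) -> Option<Vec<u8>> {
    let key = detect_xor_key(data)?;
    Some(data.iter().map(|b| b ^ key).collect())
}
```

A file whose header matches none of the listed formats yields `None`, which a caller could surface as "unsupported format" rather than writing garbage to disk.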
David Li 36302fb493 remove the notes 2026-05-14 17:35:27 +08:00
jackwener 5c001b18be chore(release): bump version to 0.1.11 2026-05-14 17:26:20 +08:00
jakevin c4c3b72796
docs(readme): mention Windows VirtualQueryEx + ReadProcessMemory in 原理 (How it works) section (#55)
The 原理 (How it works) section previously listed only macOS Mach VM API and Linux /proc/<pid>/mem,
omitting the Windows scanner path that has existed in src/scanner/windows.rs since
the Rust rewrite. Add the Windows API pair and the required process access rights
so the section accurately reflects all three platforms supported in CI/builds.
2026-05-14 17:20:07 +08:00
jakevin 70aa3a44e3
fix(daemon,scanner,crypto): harden lifecycle, widen Windows page scan, fix SQLCipher short read (#54)
- daemon: write pid file only after IPC bound; clean sock+pid on normal return
- transport: PidFile JSON metadata + identity verification (ps/QueryFullProcessImageNameW); SIGTERM with poll-timeout; backward-compat read for plain-text pid
- daemon_cmd: status/stop work with both new JSON and legacy plain-text pid file
- config: cwd → exe_dir → ~/.wx-cli config precedence matches `wx init` write order; Windows DB auto-detect picks newest by latest mtime
- crypto: full_decrypt uses read_exact for intermediate pages, zero-pads only the final partial page; tests cover short-chunk reads and early EOF
- scanner/windows: page protect check covers PAGE_READWRITE / PAGE_WRITECOPY / PAGE_EXECUTE_*WRITE* with modifier-bit stripping

Cross-reviewed by @wx-cli-coder. Windows verified via `cargo check --target x86_64-pc-windows-gnu` (no Windows runtime test).
2026-05-14 17:11:42 +08:00
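The widened Windows page protection check with modifier-bit stripping might look roughly like the sketch below. The constant values are the documented Win32 memory protection constants; the function name `is_writable_page` is hypothetical, and whether the real scanner skips guard pages instead of just stripping the bit is not stated in the commit.

```rust
// Win32 memory protection constants (values as documented for winnt.h).
const PAGE_READWRITE: u32 = 0x04;
const PAGE_WRITECOPY: u32 = 0x08;
const PAGE_EXECUTE_READWRITE: u32 = 0x40;
const PAGE_EXECUTE_WRITECOPY: u32 = 0x80;
// Modifier bits that can be OR-ed onto a base protection value.
const PAGE_GUARD: u32 = 0x100;
const PAGE_NOCACHE: u32 = 0x200;
const PAGE_WRITECOMBINE: u32 = 0x400;

/// Returns true when the base protection (modifiers stripped) is one of the
/// writable kinds listed in the commit message. Hypothetical helper name.
pub fn is_writable_page(protect: u32) -> bool {
    let base = protect & !(PAGE_GUARD | PAGE_NOCACHE | PAGE_WRITECOMBINE);
    matches!(
        base,
        PAGE_READWRITE | PAGE_WRITECOPY | PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY
    )
}
```

Without the stripping step, a region reported as `PAGE_READWRITE | PAGE_NOCACHE` (0x204) would fail an equality test against `PAGE_READWRITE` and be skipped.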
jakevin d4587b1c68
fix(query): three correctness/latency fixes from deep review (#51)
- q_contacts: replaced ad-hoc `gh_*`/`biz_*` prefix filter with
  `chat_type_of == "private"`. The old filter leaked groups
  (`@chatroom`), folded entries (`brandsessionholder` /
  `@placeholder_foldgroup`), verified service accounts
  (`verify_flag != 0`), and internal `@xxx` system accounts into
  `wx contacts` output.

- q_search: parallelized the per-message-DB blocking phase via
  `JoinSet::spawn_blocking`. Previously the `for (db_path, ...) in
  by_path { ... .await }` loop ran one DB at a time; users with N
  message_*.db shards paid N× latency. Each DB now runs concurrently
  on the blocking pool; total latency collapses to a single slow DB.

- q_new_messages: fixed `new_state` reset path so first-run + truncated
  sessions don't lock `since_ts` at `fallback_ts` forever. Old code
  always wrote `state[uname] = old_since_ts || fallback_ts` for changed
  sessions, then advanced only those that appeared in `all_msgs`. On
  first run (state=None) truncated sessions ended up with
  `state[uname] = now-86400` and stayed there across calls — every
  subsequent call re-scanned a window that grew with elapsed time.
  New logic separates three cases:
    * in_results        → advance to returned_max (incremental fetch)
    * truncated + state → keep prev since_ts (retry next call)
    * truncated + none  → advance to session_ts (avoid lock-in; old
                          messages remain reachable via `wx history`).
2026-05-14 17:11:27 +08:00
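The three-case `new_state` logic for q_new_messages can be sketched as below. The names `ChangedSession` and `next_state` and the per-user `Option` lookup are assumptions made for illustration; the real state layout and truncation detection may differ.

```rust
use std::collections::HashMap;

/// One session that changed since the last call. (Hypothetical type.)
pub struct ChangedSession {
    pub username: String,
    /// Latest timestamp reported by the session list.
    pub session_ts: i64,
    /// Newest message timestamp actually returned, or None if the session was
    /// truncated out of the results.
    pub returned_max: Option<i64>,
}

/// Compute the next per-user `since_ts` map from the previous one.
pub fn next_state(
    prev: Option<&HashMap<String, i64>>,
    changed: &[ChangedSession],
) -> HashMap<String, i64> {
    let mut next = prev.cloned().unwrap_or_default();
    for s in changed {
        let prev_ts = prev.and_then(|p| p.get(&s.username).copied());
        let ts = match (s.returned_max, prev_ts) {
            // in_results: advance to the newest returned message.
            (Some(max), _) => max,
            // truncated + state: keep the previous since_ts and retry next call.
            (None, Some(old)) => old,
            // truncated + none: advance to session_ts to avoid lock-in.
            (None, None) => s.session_ts,
        };
        next.insert(s.username.clone(), ts);
    }
    next
}
```

In the third case the skipped messages are not lost; per the commit message they remain reachable through `wx history`.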
jakevin f0f3d3cf22
feat(favorites): expose article url field (#50)
Co-authored-by: Kyrie <kyrie@mallab.world>
2026-05-14 16:08:48 +08:00
陈源泉 dab3217d3f
feat(biz): add wx biz-articles command to query public account messages (#33)
* feat(biz): add biz-articles command to query public account messages

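Loads biz_message_0.db and extracts official-account pushes (title/url/author/time).

- the daemon decrypts biz_message_0.db on demand via DbCache (the key is already in all_keys.json)
- add IPC variant BizArticles (limit/account/since/until parameters)
- add query handler q_biz_articles:
  - map gh_* username → md5 → Msg_<hash> table via a Name2Id reverse lookup
  - filter local_type & 0xFFFFFFFF = 49 (appmsg, official-account article)
  - zstd decompression + extract_cdata to parse the <mmreader>/<item> XML
  - support multi-article pushes (one message containing several articles)
  - output fields: time/timestamp/recv_time/account/account_username/title/url/digest/cover_url
- add CLI subcommand wx biz-articles, with options: -n / --account / --since / --until / --json
- add helper functions extract_cdata (CDATA block parsing) and parse_biz_xml_items
- add 8 unit tests (biz_tests module) covering CDATA parsing and multi-article cases

Supported workflow:
  wx biz-articles --since today --json | jq ".[].url" | xargs opencli weixin download

Verified: the 返朴 ADHD article, the Datawhale Claude Code article, and the 土猛员外 knowledge-engine article were all extracted correctly.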

* feat(biz-articles): add --unread filter (one latest article per account)

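Lists only the single most recent article of each official account that has
unread messages, consistent with 'wx unread --filter official'. This makes it
easy to scan which official accounts still have unread items and what their titles are.

- ipc.rs: add an `unread: bool` field to BizArticles (serde default = false for backward compatibility)
- cli/mod.rs: --unread flag
- cli/biz_articles.rs: pass unread through
- daemon/server.rs: add the unread parameter to dispatch
- daemon/query.rs: q_biz_articles
  - when --unread is on, first query session.db for the set of usernames with
    unread_count>0 and chat_type==official_account
  - intersect with --account (narrows the range further when both are given)
  - return early on an empty intersection to avoid a pointless full-table scan
  - after parsing, sort by pub_time DESC and keep only the first entry per account_username
  - finally truncate(limit)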

* docs: PR draft - update --unread + --until usage

* chore(biz-articles): drop PR draft, document command, fix typo

- remove PR_DRAFT.md (a PR description draft that landed in the repo by mistake and should not be on main)
- README.md / SKILL.md: add biz-articles usage
- query.rs: fix typo 密鑰 → 密钥 ("key")

Co-authored-by: wx-cli-coder <coder@example.com>

---------

Co-authored-by: jackwener <jakevingoo@gmail.com>
Co-authored-by: wx-cli-coder <coder@example.com>
2026-05-14 16:07:39 +08:00
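Two small pieces of the biz-articles pipeline, the `local_type & 0xFFFFFFFF` mask and CDATA extraction, can be sketched in plain Rust. The signatures below are assumptions: the commit names `extract_cdata` but does not show it, and `base_type` is a hypothetical helper. The sketch handles only the first occurrence of a simple, attribute-free tag.

```rust
/// Strip the high flag bits so the low 32 bits can be compared with 49 (appmsg).
/// Hypothetical helper name.
pub fn base_type(local_type: i64) -> i64 {
    local_type & 0xFFFF_FFFF
}

/// Return the text inside the first `<tag>...</tag>`, unwrapping a CDATA block
/// if one is present. Assumed signature; the real helper may differ.
pub fn extract_cdata<'a>(xml: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{}>", tag);
    let close = format!("</{}>", tag);
    let start = xml.find(&open)? + open.len();
    let end = start + xml[start..].find(&close)?;
    let inner = xml[start..end].trim();
    // Unwrap <![CDATA[ ... ]]> when both delimiters are present.
    let inner = inner
        .strip_prefix("<![CDATA[")
        .and_then(|s| s.strip_suffix("]]>"))
        .unwrap_or(inner);
    Some(inner)
}
```

A multi-article push would need this applied per `<item>` element rather than once per message, which is presumably what `parse_biz_xml_items` is for.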
Haoqing Wang c284b4ade6
fix: parse appmsg subtypes from type 49 messages (#24) 2026-05-14 15:29:01 +08:00
jakevin 9d5a78ac04
docs(macOS): document TCC csreq invalidation after re-signing WeChat (#48)
macOS TCC binds permissions to (bundle id, csreq) where csreq encodes
the app's code signature. `codesign --force --deep --sign -` on
WeChat changes the csreq, silently invalidating every existing TCC
grant for com.tencent.xinWeChat — yet System Settings still paints
each toggle as ON because the UI only checks bundle id, hiding the
drift. WeChat then reprompts for screen recording / camera /
microphone / file access despite "looking allowed".

Three doc-only updates, no code changes:

- README.md quick start: add the `tccutil reset` loop right after the
  codesign step, plus a one-line callout pointing at the deep-dive
  section.
- SKILL.md macOS init flow: same loop in the agent-readable order, so
  agents executing the steps don't skip it.
- docs/macos-permission-guide.md: new section 五 with first-principles
  root cause, the reset loop, the macOS 26 "录屏与系统录音 / 仅系统
  录音" UI split footgun, and ad-hoc signature verification.

Builds on the BobbyCat PR #29 — keeps the symptom description and the
macOS 26 UI split note, expands scope from ScreenCapture-only to all
TCC services that re-signing actually breaks (Camera / Microphone /
AppleEvents / AddressBook / Documents / Downloads / Desktop), drops
the misleading TCC.db sqlite query (path varies by macOS version, can
need FDA, and is no more useful than just trying WeChat's screenshot
again), and explicitly leaves the reset as a manual step rather than
auto-running it from `wx init` because it would wipe currently-working
grants.

Co-authored-by: BobbyCat <114374951+BobbyCats@users.noreply.github.com>
2026-05-14 15:13:50 +08:00
Tsing 1b00d04598
feat: expose url field for link/appmsg messages (#18)
* feat: expose url field for link/appmsg messages

Extract <url> from appmsg XML in type-49 messages and append it as
a 'url' field in history/search output. The field is omitted when
the message has no valid URL (non-link types, empty, non-http).

* fix: normalize appmsg urls across query outputs

---------

Co-authored-by: tsinghu <tsinghu@tencent.com>
Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 14:46:34 +08:00
Haoqing Wang b0431352ce
feat(appmsg): support parsing the original text of quoted messages (#28)
* feat(appmsg): parse quoted message content

* docs(appmsg): document quote message output
2026-05-14 14:42:03 +08:00
Haoqing Wang 35a8f0e94b
feat(group): support displaying group nicknames / group name cards (#23)
* feat: support group nicknames

* fix(group): keep duplicate nickname senders separate in stats

---------

Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 14:22:55 +08:00
刘传佳 d750ef6e9f
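fix(cli,config): fix init failure under sudo + daemon not reloading (#37)
* fix(cli,config): fix init failure under sudo + daemon not reloading

  - cli/transport: add stop_daemon(); the old daemon is stopped automatically after init
  - config: cli_dir() reads the SUDO_USER environment variable first, to avoid writing to /root/.wx-cli
  - config: auto_detect_db_dir() sorts by the latest .db file mtime, so the newest directory is selected correctly
  - daemon/server: dispatch gains a ReloadConfig command (reserved)
  - ipc: Request gains a ReloadConfig variant
  - scanner/linux: remove debug logging, clean up the unused bail import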

* fix(config): resolve sudo home via passwd lookup

---------

Co-authored-by: cjliu <cjliu@upointech.com>
Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 13:50:04 +08:00
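Resolving the invoking user's home directory under sudo could be sketched as follows. Everything here is an assumption for illustration: the function names are hypothetical, and the sketch parses passwd-format text directly, whereas the actual fix may use a libc lookup such as getpwnam (which also covers users not listed in /etc/passwd).

```rust
/// Look up a user's home directory in passwd-format text
/// (name:passwd:uid:gid:gecos:home:shell). Hypothetical helper.
pub fn home_from_passwd(passwd: &str, user: &str) -> Option<String> {
    passwd.lines().find_map(|line| {
        let fields: Vec<&str> = line.split(':').collect();
        if fields.len() >= 7 && fields[0] == user && !fields[5].is_empty() {
            Some(fields[5].to_string())
        } else {
            None
        }
    })
}

/// Prefer the home of the user who invoked sudo; otherwise fall back to HOME.
/// Inputs are passed in explicitly so the logic stays testable.
pub fn effective_home(
    sudo_user: Option<&str>,
    home_env: Option<&str>,
    passwd: &str,
) -> Option<String> {
    sudo_user
        .filter(|u| !u.is_empty() && *u != "root")
        .and_then(|u| home_from_passwd(passwd, u))
        .or_else(|| home_env.map(str::to_string))
}
```

A caller would typically feed it `std::env::var("SUDO_USER").ok().as_deref()`, `std::env::var("HOME").ok().as_deref()`, and the contents of `/etc/passwd`.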
David Li 1be706ddb0 fix(daemon): pass global --tcp to daemon start
DaemonCommands::Start had its own --tcp subcommand flag, so
'wx --tcp=ADDR daemon start' ignored the global --tcp and
started the daemon with no TCP listener. Now the global --tcp
is used as fallback when the subcommand flag is absent.
2026-05-13 18:23:43 +08:00
David Li cba33e4630 fix(cli): require '=' for --tcp flag to prevent subcommand collision
--tcp consumed the following subcommand as its value (e.g.
'wx --tcp daemon start' parsed --tcp=daemon). Adding
require_equals=true forces --tcp=ADDR syntax so subcommands
are parsed correctly.
2026-05-13 18:12:08 +08:00
David Li 1106b4f544 feat(daemon): log every received request at info level
Add info!(cmd = ?req, "收到请求") in handle_connection so each
incoming request is logged with its full Request variant for diagnostics.
2026-05-13 17:40:40 +08:00
David Li 521c6296b6 fix: unterminated char literal and missing cli_dir in start_daemon
- Fix '\''  -> '\\' in src/daemon/mod.rs (lines 85, 151)
- Replace undefined preflight_cli_dir_writable() with inline
  config::cli_dir() creation check in src/cli/transport.rs
2026-05-13 17:27:08 +08:00
David Li 107af74a72 chore: auto-commit after quick-task
GSD-Unit: Q3
2026-05-13 17:19:09 +08:00
David Li 7d2a54c416 chore: untrack .gsd/ runtime files from git index 2026-05-13 17:19:08 +08:00
David Li 11e7372258 fix: change default tracing log level from warn to info 2026-05-13 16:54:52 +08:00
David Li 7ab918adef Merge milestone/M001: migrate to tracing for structured logging
# Conflicts:
#	.gitignore
2026-05-13 16:10:47 +08:00
David Li 3d0dd9b8b9 feat: migrate from eprintln! to tracing for structured logging
- Add tracing + tracing-subscriber dependencies
- Initialize tracing in main() with env-filter (RUST_LOG support)
- Replace all eprintln! diagnostic messages with tracing macros:
  - info! for lifecycle events (daemon startup, cache hits, scan progress)
  - warn! for non-fatal errors (skipped DBs, scan limits, connection errors)
  - error! for fatal errors (daemon startup failure)
  - debug! for cache hits (hidden behind RUST_LOG=debug)
- Add #[tracing::instrument] to key paths:
  - daemon::start_daemon — automatic startup timing
  - query::{sessions, history, search, new_messages} — per-query timing
  - crypto::full_decrypt — per-decrypt timing with page count
- Keep println! for user-facing CLI output (YAML/JSON, status messages)
- Keep eprintln! for test output and CLI progress indicators
2026-05-13 16:08:48 +08:00
David Li 5a4de7f83b chore: auto-commit after worktree-switch
GSD-Unit: m001
2026-05-13 15:50:22 +08:00
David Li 59b2ebbff4 chore: auto-commit after complete-milestone
GSD-Unit: M001
2026-05-13 15:45:46 +08:00
David Li e145090e74 chore: auto-commit after complete-milestone
GSD-Unit: M001
2026-05-13 14:54:00 +08:00
David Li a8ac86452e test: Added TCP vs local transport data comparison test that queries se…
- src/cli/transport.rs

GSD context:
- Milestone: M001 - TCP Transport
- Slice: S04
- Task: T02 - Added TCP vs local transport data comparison test that queries sessions via both transports and asserts deep equality

GSD-Task: S04/T02
2026-05-13 14:40:21 +08:00
David Li 7b50d6abd4 test: Added real TCP daemon integration tests that spawn the actual wx…
- src/cli/transport.rs

GSD context:
- Milestone: M001 - TCP Transport
- Slice: S04
- Task: T01 - Added real TCP daemon integration tests that spawn the actual wx binary, connect via TCP, verify ping round-trip, and test connection refused

GSD-Task: S04/T01
2026-05-13 14:37:59 +08:00
David Li ee54abdc37 test: Added 3 integration tests (round-trip, connection refused, livene…
- src/cli/transport.rs

GSD context:
- Milestone: M001 - TCP Transport
- Slice: S03
- Task: T01 - Added 3 integration tests (round-trip, connection refused, liveness check) exercising send_tcp() and is_alive_tcp() against a mock TCP server

GSD-Task: S03/T01
2026-05-13 14:25:09 +08:00
David Li 57ad8f127f test: All changes compile on native and Windows targets; 32 unit tests…
- (none)

GSD context:
- Milestone: M001 - TCP Transport
- Slice: S02
- Task: T03 - All changes compile on native and Windows targets; 32 unit tests pass including new TCP transport tests

GSD-Task: S02/T03
2026-05-13 14:11:42 +08:00
David Li 7681e69e68 feat: Wired --tcp into daemon stop command with manual-stop warning; st…
- src/cli/daemon_cmd.rs

GSD context:
- Milestone: M001 - TCP Transport
- Slice: S02
- Task: T02 - Wired --tcp into daemon stop command with manual-stop warning; status already reports TCP vs local

GSD-Task: S02/T02
2026-05-13 14:11:00 +08:00
David Li 2d11f69d5b feat: Added global --tcp CLI flag and wired TCP transport with 15s conn…
- src/cli/mod.rs
- src/cli/transport.rs
- src/cli/daemon_cmd.rs
- src/cli/sessions.rs
- src/cli/history.rs
- src/cli/search.rs
- src/cli/contacts.rs
- src/cli/export.rs

GSD context:
- Milestone: M001 - TCP Transport
- Slice: S02
- Task: T01 - Added global --tcp CLI flag and wired TCP transport with 15s connect/120s read-write timeouts, no silent fallback

GSD-Task: S02/T01
2026-05-13 14:09:47 +08:00
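The timeout policy named in this task (15s connect, 120s read/write, no silent fallback) can be illustrated with the standard library's blocking sockets. This is a sketch only: `connect_tcp` is a hypothetical name, and the project itself may use an async runtime rather than `std::net`.

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

const CONNECT_TIMEOUT: Duration = Duration::from_secs(15);
const IO_TIMEOUT: Duration = Duration::from_secs(120);

/// Connect with a bounded connect time and bounded read/write times.
/// Any failure is returned to the caller; there is no fallback to another
/// transport.
pub fn connect_tcp(addr: SocketAddr) -> io::Result<TcpStream> {
    let stream = TcpStream::connect_timeout(&addr, CONNECT_TIMEOUT)?;
    stream.set_read_timeout(Some(IO_TIMEOUT))?;
    stream.set_write_timeout(Some(IO_TIMEOUT))?;
    Ok(stream)
}
```

Returning the error instead of quietly switching to the local transport means a mistyped `--tcp=ADDR` fails loudly rather than talking to a different daemon.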
David Li 1f7b843a1a feat: Wired transport module into daemon server, added TCP listening al…
- src/daemon/server.rs
- src/daemon/mod.rs
- src/cli/daemon_cmd.rs
- src/cli/mod.rs

GSD context:
- Milestone: M001 - TCP Transport
- Slice: S01
- Task: T02 - Wired transport module into daemon server, added TCP listening alongside local transport, and implemented `wx daemon start [--tcp ADDR]` subcommand

GSD-Task: S01/T02
2026-05-13 13:57:12 +08:00
David Li 189110f36d feat: Created transport module with object-safe Listener/Connector trai…
- src/transport/mod.rs
- src/main.rs

GSD context:
- Milestone: M001 - TCP Transport
- Slice: S01
- Task: T01 - Created transport module with object-safe Listener/Connector traits, generic handle_connection, and TcpListener/TcpConnector implementations

GSD-Task: S01/T01
2026-05-13 13:46:57 +08:00
jackwener 6659f48984 chore: bump version to 0.1.10 2026-04-19 21:27:59 +08:00
jakevin c7e2775aa6
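perf(sns): parse_post_xml uses a single roxmltree DOM pass, dropping the regex+DOM double parse (#17)
* perf(sns): parse_post_xml uses a single roxmltree DOM pass, dropping the regex+DOM double parse

Previously, during the full-table scans in q_sns_feed / q_sns_search, each
SnsTimeLine.content was parsed twice: extract_xml_text used string scanning
to pull createTime / contentDesc / username, and parse_post_media then built
a complete roxmltree DOM to get the media list. On scans of 10k+ rows this is
plainly wasted work.

This refactor:

- parse_post_xml calls Document::parse once; after locating TimelineObject,
  all fields (createTime / contentDesc / username / media / location) share
  the same doc, so roxmltree builds only once.
- Split parse_post_media into parse_media_from_timeline(node), to avoid
  re-parsing after the outer parse; the old parse_post_media(&str) is now
  used only by unit tests and is marked #[cfg(test)].
- Remove sns_location_re (a regex is no longer needed to extract poiName).
- Side effect: roxmltree decodes XML entities automatically, so the content /
  location / username fields output decoded text (the old string scan
  preserved `&lt;` etc. verbatim). This is more correct semantics for
  downstream consumers; a new parse_decodes_xml_entities_in_content unit test
  locks the behavior in.
- Add a parse_returns_defaults_for_malformed_xml unit test covering the
  fallback path when DOM parsing fails (no panic, author uses the column fallback).

The LIKE pre-filter in q_sns_search still uses the extract_xml_text(contentDesc)
string scan for false-positive filtering; this step is faster than building a
DOM and is a real optimization, so it stays. q_sns_notifications also still
uses extract_xml_text and is untouched by this PR (each call only processes
~limit rows, so the gain from a DOM would be small, and this avoids widening the scope).

Verification:
- cargo check × 3 targets (darwin / windows-gnu / linux-gnu)
- cargo test: 39 passed (37 → 39, 2 new)

* refactor(sns): dedupe the two ParsedPost early-return blocks in parse_post_xml

A pre-merge self-review found that the two fallback paths (Document::parse
failing / TimelineObject not found) contained identical 9-line ParsedPost
literals. These are extracted into an empty() closure, going from 2×9 lines
to 1×7 lines + two `return empty()` calls.

Behavior is fully equivalent (including author = column fallback).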

* fix(sns): salvage scalar fields from malformed post xml
2026-04-19 13:56:55 +08:00
郭立lee 2b5d872f0b
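feat(sns): sns-feed / sns-search output the full media[] field (#15)
Incremental on top of #14: upgrades media_count in sns-feed / sns-search to a full media[] array (including url/thumb/key/token/md5/enc_idx/size + video_md5/duration), which downstream consumers can use directly for image proxying or offline rendering.

- Use roxmltree (pure Rust, no C dependencies) instead of regex to extract attributes
- Field names align with the Python _parse_media in the artifacts repository, making cross-implementation diffs easier
- 14 sns unit tests: the author added 6 fixtures (single image / three images / video / text only / malformed / missing totalSize), and the existing 8 are kept
- Fully compatible with the --user XML fallback fix / SNS_MAX_LIMIT / SNS_MAX_SCAN / escape_like_pattern from the earlier PR #14

Author: leeguooooo <guoli@zhihu.com>
Co-fixed-by: wx-cli-coder (rebase + conflict resolution + test module merge + additional documentation of media_count semantics)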
2026-04-19 02:22:55 +08:00
JL e8939f315d
feat(sns): sns-notifications / sns-feed / sns-search (#14)
Adds 3 commands related to WeChat Moments: sns-notifications / sns-feed / sns-search.
PR review fixes (already pushed to the same branch):
- Fix the bug where the --user filter conflicted with the XML <username> fallback (found by @wx-cli-codex)
- Add defensive upper bounds SNS_MAX_LIMIT / SNS_MAX_SCAN
- Extract an escape_like_pattern() helper
- Add 8 unit tests (parse_post_xml / escape_like_pattern)

cargo check passes on all three targets: aarch64-darwin / x86_64-pc-windows-gnu / x86_64-unknown-linux-gnu.
Co-authored-by: fengliu222 <fengliu222@users.noreply.github.com>
2026-04-19 01:58:21 +08:00
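A helper like `escape_like_pattern()` usually exists so that user input containing `%` or `_` is matched literally in a SQL LIKE query. The sketch below assumes a backslash escape character paired with an `ESCAPE '\'` clause; the commit does not show the actual implementation, so treat the body as a guess at the technique, not the project's code.

```rust
/// Escape LIKE wildcards so the input is matched literally.
/// Assumption: the query pairs this with `ESCAPE '\'`.
pub fn escape_like_pattern(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        // The escape character itself must be escaped too.
        if matches!(ch, '\\' | '%' | '_') {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}
```

The escaped string would then be wrapped as `format!("%{}%", escaped)` and bound as a parameter to something like `content LIKE ?1 ESCAPE '\'`, so the only active wildcards are the ones the code adds itself.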
郭立lee f0dcd4ea05
docs(readme): explain how to fetch more than 500 messages (#13)
Clarify that the 500-message behavior is only a default limit, not a hard cap.
Document `-n/--limit` examples for history, search, and export in both README and SKILL.
2026-04-18 15:01:15 +08:00
jackwener 697d3fc720 chore: bump version to 0.1.9 2026-04-18 02:11:28 +08:00