Compare commits

74 Commits
v0.1.2 ... main

Author SHA1 Message Date
jakevin 08af894594
fix(biz-articles): read all biz_message shards (#81) 2026-05-19 14:19:02 +08:00
jakevin 94fcc36ffe
feat(attachments): expose stable group sender identity (#77)
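In group chats, when two members share the same nickname, `q_attachments`
previously emitted only the `sender` field (taken from the group card name),
so JSON consumers could not tell who sent an image.

Following #68, append `sender_username / sender_contact_display /
sender_group_nickname` to each attachment row, reusing the
`add_sender_identity` / `sender_username` helpers introduced in PR #68,
so that field semantics stay identical across the 4 existing outputs
(history / search / new-messages / stats.top_senders) and attachments.

Changes:
- `q_attachments` tuple grows from 7 to 8 fields (carries one extra stable wxid)
- one extra `sender_username` computation inside spawn_blocking, O(1) per row
- the JSON build calls `add_sender_identity`, with matching behavior: the three
  fields are omitted for non-group chats or when the wxid cannot be resolved

Tests / docs:
- Add `attachment_row_gets_stable_group_sender_identity_via_helper`, which
  pins down "two same-named members can be told apart by sender_username" and
  "no fake fields are appended for non-group / unknown senders"
- README + SKILL.md document the new fields in both the `attachments` section
  and the top-level "sender selection strategy" section, noting that the
  fields are omitted when the wxid cannot be resolved

closes #23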
2026-05-19 01:44:03 +08:00
jackwener 0612789d19 Merge pull request #68 from t0m1sacat/kael/sender-identity
fix: expose stable group sender identity
2026-05-19 01:14:58 +08:00
jackwener f8550ae74d Merge pull request #63 from Icy-Cat/feat/windows-mydocument-keyword
feat(windows): resolve MyDocument: token in Weixin data-root ini
2026-05-19 01:14:58 +08:00
jackwener 5f87ce6348 Merge pull request #62 from Icy-Cat/fix/init-error-shows-config-path
fix(init): show config.json path in auto-detect error
2026-05-19 01:14:58 +08:00
jackwener ed95812332 Merge pull request #76 from Suda202/fix/group-nickname-field
fix(members): ignore non-card fields for group nicknames
2026-05-19 01:14:58 +08:00
suda be1a174226 fix(members): ignore non-card fields for group nicknames 2026-05-18 23:18:20 +08:00
kael c34f5f8fe2 fix: expose stable group sender identity 2026-05-16 08:46:37 +08:00
jackwener 739b66a4b1 chore(release): bump version to 0.3.0 2026-05-16 02:23:46 +08:00
jackwener b5edaf7177 feat(meta): expose freshness coverage in query output 2026-05-16 02:22:03 +08:00
jackwener 9f6a2cfba3 review: restore cache mode coverage and rationale comments 2026-05-15 22:33:32 +08:00
jackwener 76024901e9 feat(meta): expose freshness coverage in query output 2026-05-15 22:08:46 +08:00
jakevin 12740afb53
docs(macos): document codesign side-effect popup (#64)
* docs(macos): document codesign side-effect popup ("微信" 想访问其他 App 的数据, i.e. "WeChat" would like to access data from other apps)

After `codesign --force --deep --sign - /Applications/WeChat.app`, macOS
treats the re-signed WeChat as a different code identity from the
original. When WeChat then accesses its own container / cache / app-group
data (notably triggered when opening Official Account (公众号) articles), macOS fires the
"'微信' 想访问其他 App 的数据" popup.

This is a known side-effect of the current macOS invasive init path,
not a "wx-cli is reading other apps' data" issue and not an Official-Account-only
problem; Official Accounts are just a high-frequency trigger surface because of
WebView / cache access.

Document this in 3 places per agreed scope:
- README.md macOS init: add a "side-effect notice" (副作用提示) callout linking to the guide
- docs/macos-permission-guide.md: new §六 (section 6) with first-principles
  explanation, mitigation options, and long-term direction
- src/cli/init.rs: print a short macOS-only warning at the end of
  `wx init` so users see it right when they finish the invasive setup

* review: stop overstating the trade-off and condition the init warning

Per codex review on PR #64:

1. src/cli/init.rs warning was unconditional but the wording presumed
   the user had taken the ad-hoc re-sign path. If init goes through the
   tier 2 path (Apple-signed WeChat + GUI Terminal + Developer Tools TCC
   authorization), the warning would mis-fire. Reword conditionally and
   point to the GitHub URL of the doc instead of a relative path that
   release-binary / npm-installed users won't have on disk.

2. docs/macos-permission-guide.md §六 (section 6) and the matching README callout
   said "restoring official WeChat = giving up macOS memory-scan". This
   contradicts the measurement table (实测表) in the same guide's §一 (section 1), which shows
   "Apple-signed + local Terminal sudo = ". Restoring the official
   signature only gives up the default re-sign path; the local-Terminal
   + Developer-Tools route still works on Apple-signed WeChat. Only
   SSH + Apple-signed WeChat actually requires re-signing.

* review (round 2): caveat empirical gap + drop emoji

Self-review found two issues both LGTMs missed:

1. The "tier 2 still works" (tier 2 仍走通) claim (README + §六) leans on the §一 measurement table row
   "Apple-signed + local Terminal sudo = ". But that data only covers
   macOS 10.15 (Catalina) and 11.1 (Big Sur). macOS 14/15 — the exact
   versions where the popup behavior originates — were never tested
   for that path in this project. Add an explicit caveat instead of
   silently extrapolating across major macOS versions.

2. `init.rs` warning used a ⚠️ emoji prefix, which violates the
   project + global "no emojis in files unless requested" rule. README
   and the rest of init.rs have no emoji. Replace with `[macOS]`.
2026-05-15 15:47:15 +08:00
Icy-Cat b58ae5468d feat(windows): resolve MyDocument: token in Weixin data-root ini
The data-root ini under %APPDATA%\Tencent\xwechat\config\*.ini is
observed to contain either a plain absolute path (e.g. D:\WeChatFiles)
or the literal token 'MyDocument:'. The token form is not a real
filesystem path, so detect_db_dir_impl() — which previously did
PathBuf::from(content).is_dir() — silently failed on it, even though
the user's Weixin data was sitting in their (possibly relocated)
Documents folder.

Empirically the token denotes 'the calling user's Documents folder'.
We resolve it via SHGetKnownFolderPath(FOLDERID_Documents), which
honours the standard Windows shell-folder redirect (HKCU User Shell
Folders\Personal), so users who moved Documents to e.g. D:\Documents
now auto-detect correctly.

Plain absolute paths still pass through unchanged.

Adds Win32_UI_Shell + Win32_System_Com features to the windows crate
(needed for SHGetKnownFolderPath and CoTaskMemFree).
2026-05-15 11:53:35 +08:00
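The token handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's `detect_db_dir_impl()`: the function name `resolve_data_root` is hypothetical, and the `documents_dir` parameter stands in for whatever `SHGetKnownFolderPath(FOLDERID_Documents)` returns, so the sketch stays platform-independent.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: turn the content of the data-root ini into a path.
/// `documents_dir` is assumed to be the already-resolved Documents folder
/// (on Windows this would come from SHGetKnownFolderPath).
fn resolve_data_root(ini_content: &str, documents_dir: &Path) -> PathBuf {
    let trimmed = ini_content.trim();
    if trimmed == "MyDocument:" {
        // The token is not a real filesystem path; map it to Documents.
        documents_dir.to_path_buf()
    } else {
        // Plain absolute paths pass through unchanged.
        PathBuf::from(trimmed)
    }
}
```

A caller would still need to check `is_dir()` on the result, as the original code did for plain paths.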
Icy-Cat 7451ce5684 fix(init): show config.json path in auto-detect error
When auto_detect_db_dir() fails, the error told the user to edit
config.json without saying where that file lives. On Windows that is
%USERPROFILE%\.wx-cli\config.json, which is non-obvious.

Use the config_path already computed at the top of cmd_init() so the
error message includes the absolute path, plus a concrete example of
the db_dir shape.
2026-05-15 11:49:40 +08:00
jackwener 52cc39a55c chore(release): bump version to 0.2.0
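Main additions:
- `wx attachments` / `wx extract`: decrypt and extract V2 image attachments from local chat data (macOS / Windows)
- `DbCache` incremental WAL reuse: cuts the daemon request path from a full ~120s decrypt per request to < 1s (typical WAL)

See #57 / #58 for the full changelog.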
2026-05-14 21:38:05 +08:00
jackwener 6424a2162b fix(cache): reuse decrypted db across wal-only updates (#58) 2026-05-14 19:37:22 +08:00
jackwener e9f65ba71b review: preserve wal incremental reuse across restart 2026-05-14 19:35:36 +08:00
jackwener b032b8be04 fix(cache): apply WAL incrementally instead of full re-decrypting on WAL mtime change
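DbCache used to run full_decrypt whenever the mtime of either .db or .db-wal changed.
While writing messages, WeChat keeps appending to the WAL (when no checkpoint runs), so
every attachments/extract request re-decrypted the 1.8GB message_0.db (measured at ~120s per request).

Now there are three hit paths:
  1. db_mt and wal_mt both unchanged → return the cached path directly
  2. db_mt unchanged, wal_mt changed → **apply the WAL once more** on top of the cached output
     (apply_wal is idempotent: old frames redo the same page writes, new frames are appended and take effect)
  3. db_mt changed → full decrypt + apply WAL (the old path)

Effect: a typical WAL (< 10MB) drops from ~120s to < 1s; even a large 100MB WAL takes only ~7s.
SQLite never spontaneously ends up with "main db unchanged + WAL emptied", so path 2 needs no special edge-case handling.

Tests cover all three paths:
  - exact_mtime_hit_skips_decrypt
  - wal_only_change_uses_incremental_path
  - db_mtime_change_triggers_full_decrypt
How the tests tell the paths apart: whether full_decrypt rewrote the cached file size to a multiple of PAGE_SZ.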
2026-05-14 19:24:02 +08:00
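The three hit paths in this commit amount to a small decision on two mtimes. The sketch below shows only that decision; the names `Mtimes`, `CacheAction`, and `decide` are hypothetical, mtimes are simplified to integers, and treating an empty cache as a full decrypt is an assumption rather than something the commit states.

```rust
/// Hypothetical snapshot of the two mtimes DbCache compares.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Mtimes {
    db: u64,
    wal: u64,
}

#[derive(Debug, PartialEq, Eq)]
enum CacheAction {
    ReuseCached,  // path 1: nothing changed
    ApplyWalOnly, // path 2: only the WAL changed
    FullDecrypt,  // path 3: the main db changed (or nothing is cached yet)
}

fn decide(cached: Option<Mtimes>, current: Mtimes) -> CacheAction {
    match cached {
        Some(c) if c == current => CacheAction::ReuseCached,
        Some(c) if c.db == current.db => CacheAction::ApplyWalOnly,
        _ => CacheAction::FullDecrypt,
    }
}
```

Path 2 is only safe because, as the commit notes, re-applying the WAL is idempotent.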
jackwener ff96f957b7 feat(attachment): support image extraction from local chat data (#57) 2026-05-14 19:11:13 +08:00
jackwener b63589b368 review: tighten attachment extraction scope 2026-05-14 19:10:03 +08:00
jackwener 7feacc6371 fix(daemon): drop redundant `ok` from extract payload (collides with Response.ok)
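Response uses #[serde(flatten)] to splice the Value returned by q_* into `{ok, error, ...data}`.
Adding another `"ok": true` inside q_extract therefore writes two keys with the same name on the wire, and the
CLI-side `serde_json::from_str::<Response>` fails outright with "duplicate field `ok`". Externally this
shows up as "extract failed / failed to parse daemon response", even though the daemon has in fact already decoded
the image. No other q_* (biz_articles / sessions / history, etc.) adds ok, so dropping it keeps them consistent.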
2026-05-14 18:48:46 +08:00
jackwener 2d88c9542d feat(attachment): wire wx attachments / wx extract end-to-end
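Wire the V1 (legacy XOR + V1 fixed-AES) and platform-specific V2 (macOS / Windows) image
decryption all the way through to the CLI:

- ipc: add two Request variants, Attachments / Extract
- daemon/server: dispatch routes to query::q_attachments / q_extract
- daemon/cache: make DbCache::db_dir() public so the resolver can derive wxchat_base
- daemon/query: q_attachments reads the Msg_<chat> tables, filters by (local_type & 0xFFFFFFFF)
  IN (...), sorts globally by ts DESC, then paginates and returns an opaque attachment_id;
  q_extract decodes the attachment_id → queries message_resource.db → finds the local .dat →
  dispatches to the v1/v2 decoder by magic → writes to disk. The bridge uses ImageKeyMaterial.{aes_key,
  xor_key} (codex measured xor_key=0xa2 on a real account, so 0x88 must not be hard-coded)
- cli: add two subcommands, wx attachments / wx extract, with flags styled after the existing
  history / biz-articles
- README + SKILL: add an attachment extraction section covering the three decoding tiers and V2 image key derivation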
2026-05-14 18:40:57 +08:00
jackwener bf8d0d934a feat(attachment): implement V2 image key providers 2026-05-14 18:34:38 +08:00
jackwener 14fdfde1d3 feat(attachment): scaffold module + V1 decoders + resource resolver
Lays down the skeleton for chat attachment (聊天附件) extraction. This commit
introduces the `attachment` module with:

- `attachment_id`: opaque base64url(json) round-trip handle for CLI/IPC. Carries
  `(chat, local_id, create_time, kind)`; `local_id` alone is not unique
  (up to 7 records sharing one local_id were observed within a single chat), so create_time is required for
  disambiguation.
- `decoder/`: dispatch by 6B header magic. Three branches:
  - `V2_MAGIC` → AES-128-ECB + raw + XOR (need image AES key)
  - `V1_MAGIC` → AES-128-ECB with fixed key `cfcd208495d565ef` (= md5("0")[:16])
  - else → legacy single-byte XOR with magic auto-detect
  Manual ECB + PKCS7 unpad to avoid pulling in another crate.
- `resolver`: `message_resource.db` lookup chain
  `username → ChatName2Id.rowid → MessageResourceInfo.packed_info → md5`
  + on-disk `.dat` selection (full > _h > _t) under
  `<wxchat_base>/msg/attach/<md5(chat)>/<YYYY-MM>/Img/<md5>[_t|_h].dat`.
  Honors `message_local_type % 2^32` to strip the high flag bits, and orders by
  `message_create_time DESC` to handle local_id reuse.
- `image_key/`: stub trait + macOS / Windows placeholders. To be filled by
  codex with the V2 image key extraction (kvcomm + brute-force on macOS, memory
  scan on Windows).

V1 decoder ships with 6 unit tests covering every supported magic + the BMP
extra validation; resolver ships with packed_info parser + dat-file selection
tests; v2 decoder ships with header validation tests. 21 tests pass.

`cargo check` and `cargo check --target x86_64-pc-windows-gnu` both clean.
2026-05-14 18:25:32 +08:00
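Two of the resolver rules above are small enough to illustrate in isolation: stripping the high flag bits from `message_local_type`, and preferring the full-size `.dat` over `_h` over `_t`. This is a hedged sketch; `base_local_type` and `pick_dat` are hypothetical names, and file discovery is reduced to a slice of file names instead of a directory listing.

```rust
/// Keep only the low 32 bits, which is what `message_local_type % 2^32`
/// yields for non-negative values.
fn base_local_type(local_type: i64) -> i64 {
    local_type & 0xFFFF_FFFF
}

/// Choose the best available image file for `md5`: full > _h > _t.
/// `files` is assumed to hold the file names found in the Img directory.
fn pick_dat<'a>(md5: &str, files: &[&'a str]) -> Option<&'a str> {
    let candidates = [
        format!("{md5}.dat"),
        format!("{md5}_h.dat"),
        format!("{md5}_t.dat"),
    ];
    // Walk the candidates in priority order and return the first one present.
    candidates
        .iter()
        .find_map(|want| files.iter().copied().find(|f| *f == want.as_str()))
}
```

For example, with only `abc_t.dat` and `abc_h.dat` on disk, `pick_dat("abc", ...)` falls back to the `_h` variant.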
jackwener 5c001b18be chore(release): bump version to 0.1.11 2026-05-14 17:26:20 +08:00
jakevin c4c3b72796
docs(readme): mention Windows VirtualQueryEx + ReadProcessMemory in the 原理 (how it works) section (#55)
The 原理 section previously listed only macOS Mach VM API and Linux /proc/<pid>/mem,
omitting the Windows scanner path that has existed in src/scanner/windows.rs since
the Rust rewrite. Add the Windows API pair and the required process access rights
so the section accurately reflects all three platforms supported in CI/builds.
2026-05-14 17:20:07 +08:00
jakevin 70aa3a44e3
fix(daemon,scanner,crypto): harden lifecycle, widen Windows page scan, fix SQLCipher short read (#54)
- daemon: write pid file only after IPC bound; clean sock+pid on normal return
- transport: PidFile JSON metadata + identity verification (ps/QueryFullProcessImageNameW); SIGTERM with poll-timeout; backward-compat read for plain-text pid
- daemon_cmd: status/stop work with both new JSON and legacy plain-text pid file
- config: cwd → exe_dir → ~/.wx-cli config precedence matches `wx init` write order; Windows DB auto-detect picks newest by latest mtime
- crypto: full_decrypt uses read_exact for intermediate pages, zero-pads only the final partial page; tests cover short-chunk reads and early EOF
- scanner/windows: page protect check covers PAGE_READWRITE / PAGE_WRITECOPY / PAGE_EXECUTE_*WRITE* with modifier-bit stripping

Cross-reviewed by @wx-cli-coder. Windows verified via `cargo check --target x86_64-pc-windows-gnu` (no Windows runtime test).
2026-05-14 17:11:42 +08:00
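The widened page-protection check from the scanner/windows bullet can be pictured as below. The constant values are the documented Win32 memory-protection constants; the function name `is_writable_page` is hypothetical, and masking to the low byte is an assumption about how the modifier bits (PAGE_GUARD 0x100, PAGE_NOCACHE 0x200, PAGE_WRITECOMBINE 0x400) are stripped in the project.

```rust
// Win32 memory-protection constants (base protections).
const PAGE_READWRITE: u32 = 0x04;
const PAGE_WRITECOPY: u32 = 0x08;
const PAGE_EXECUTE_READWRITE: u32 = 0x40;
const PAGE_EXECUTE_WRITECOPY: u32 = 0x80;

/// Return true when the region's base protection is one of the writable kinds.
/// Assumption: modifier bits live above the low byte and are simply masked off.
fn is_writable_page(protect: u32) -> bool {
    let base = protect & 0xFF;
    matches!(
        base,
        PAGE_READWRITE | PAGE_WRITECOPY | PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY
    )
}
```

Without the mask, a region reported as `PAGE_READWRITE | PAGE_NOCACHE` (0x204) would fail an equality test against 0x04 and be skipped.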
jakevin d4587b1c68
fix(query): three correctness/latency fixes from deep review (#51)
- q_contacts: replaced ad-hoc `gh_*`/`biz_*` prefix filter with
  `chat_type_of == "private"`. The old filter leaked groups
  (`@chatroom`), folded entries (`brandsessionholder` /
  `@placeholder_foldgroup`), verified service accounts
  (`verify_flag != 0`), and internal `@xxx` system accounts into
  `wx contacts` output.

- q_search: parallelized the per-message-DB blocking phase via
  `JoinSet::spawn_blocking`. Previously the `for (db_path, ...) in
  by_path { ... .await }` loop ran one DB at a time; users with N
  message_*.db shards paid N× latency. Each DB now runs concurrently
  on the blocking pool; total latency collapses to a single slow DB.

- q_new_messages: fixed `new_state` reset path so first-run + truncated
  sessions don't lock `since_ts` at `fallback_ts` forever. Old code
  always wrote `state[uname] = old_since_ts || fallback_ts` for changed
  sessions, then advanced only those that appeared in `all_msgs`. On
  first run (state=None) truncated sessions ended up with
  `state[uname] = now-86400` and stayed there across calls — every
  subsequent call re-scanned a window that grew with elapsed time.
  New logic separates three cases:
    * in_results        → advance to returned_max (incremental fetch)
    * truncated + state → keep prev since_ts (retry next call)
    * truncated + none  → advance to session_ts (avoid lock-in; old
                          messages remain reachable via `wx history`).
2026-05-14 17:11:27 +08:00
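The three-case rule for `q_new_messages` can be written as one small function. This is an illustrative sketch only: `next_since_ts` and its parameters are hypothetical, with `returned_max` standing for the newest timestamp returned for the session (if it appeared in the results) and `prev_since` for the stored state (if any).

```rust
/// Decide the `since_ts` to store for one changed session.
fn next_since_ts(returned_max: Option<i64>, prev_since: Option<i64>, session_ts: i64) -> i64 {
    match (returned_max, prev_since) {
        // in_results: advance to the newest returned message.
        (Some(max_ts), _) => max_ts,
        // truncated + state: keep the previous since_ts and retry next call.
        (None, Some(prev)) => prev,
        // truncated + none: advance to session_ts to avoid locking in a fallback.
        (None, None) => session_ts,
    }
}
```

The last arm is the fix: on a first run the old code stored the fallback timestamp here and never moved past it.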
jakevin f0f3d3cf22
feat(favorites): expose article url field (#50)
Co-authored-by: Kyrie <kyrie@mallab.world>
2026-05-14 16:08:48 +08:00
陈源泉 dab3217d3f
feat(biz): add wx biz-articles command to query public account messages (#33)
* feat(biz): add biz-articles command to query public account messages

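Load biz_message_0.db and extract Official Account pushes (title / url / author / time).

- The daemon decrypts biz_message_0.db on demand through DbCache (the key is already in all_keys.json)
- Add IPC variant BizArticles (parameters: limit / account / since / until)
- Add query handler q_biz_articles:
  - reverse-map gh_* username → md5 → Msg_<hash> table via Name2Id
  - filter local_type & 0xFFFFFFFF = 49 (appmsg Official Account article)
  - zstd decompression + extract_cdata to parse the <mmreader>/<item> XML
  - support multi-article pushes (one message containing several articles)
  - output fields: time/timestamp/recv_time/account/account_username/title/url/digest/cover_url
- Add CLI subcommand wx biz-articles with flags: -n / --account / --since / --until / --json
- Add helper functions extract_cdata (CDATA block parsing) and parse_biz_xml_items
- Add 8 unit tests (biz_tests module) covering CDATA parsing and multi-article cases

Supported workflow:
  wx biz-articles --since today --json | jq ".[].url" | xargs opencli weixin download

Verified: the 返朴 ADHD article, the Datawhale Claude Code article, and the 土猛员外 knowledge-engine article were all extracted correctly.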

* feat(biz-articles): add --unread filter (one latest article per account)

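List only the single most recent article of each Official Account that has unread items. This matches
the behavior of 'wx unread --filter official' and makes it easy to scan which accounts still have unread items and what the titles are.

- ipc.rs: add field unread: bool to BizArticles (serde default = false for backward compatibility)
- cli/mod.rs: --unread flag
- cli/biz_articles.rs: pass unread through
- daemon/server.rs: add the unread parameter to dispatch
- daemon/query.rs: q_biz_articles
  - with --unread, first query session.db for the set of usernames with unread_count>0 and
    chat_type==official_account
  - intersect with --account (passing both narrows the scope further)
  - return early on an empty intersection to avoid a pointless full-table scan
  - after parsing, sort by pub_time DESC and keep only the first entry per account_username
  - finally truncate(limit)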

* docs: PR draft - update --unread + --until usage

* chore(biz-articles): drop PR draft, document command, fix typo

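- Delete PR_DRAFT.md (a PR description draft that slipped into the repo and does not belong in main)
- README.md / SKILL.md: add biz-articles usage
- query.rs: 密鑰 → 密钥 (typo fix in the Chinese word for "key")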

Co-authored-by: wx-cli-coder <coder@example.com>

---------

Co-authored-by: jackwener <jakevingoo@gmail.com>
Co-authored-by: wx-cli-coder <coder@example.com>
2026-05-14 16:07:39 +08:00
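The commit mentions an `extract_cdata` helper without showing it. The sketch below is one plausible shape for such a helper, not the project's implementation: the signature is an assumption, it handles only the first occurrence of a tag without attributes, and it does no entity decoding.

```rust
/// Return the text inside the first `<tag>...</tag>`, unwrapping a CDATA
/// section if present. Hypothetical signature, for illustration only.
fn extract_cdata<'a>(xml: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{tag}>");
    let close = format!("</{tag}>");
    // Locate the element body between the opening and closing tags.
    let start = xml.find(open.as_str())? + open.len();
    let end = start + xml[start..].find(close.as_str())?;
    let inner = xml[start..end].trim();
    // Strip the CDATA wrapper when both delimiters are present.
    let unwrapped = inner
        .strip_prefix("<![CDATA[")
        .and_then(|s| s.strip_suffix("]]>"))
        .unwrap_or(inner);
    Some(unwrapped)
}
```

A multi-article push would call this once per `<item>` slice, which is presumably what `parse_biz_xml_items` coordinates.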
Haoqing Wang c284b4ade6
fix: parse appmsg subtypes from type 49 messages (#24) 2026-05-14 15:29:01 +08:00
jakevin 9d5a78ac04
docs(macOS): document TCC csreq invalidation after re-signing WeChat (#48)
macOS TCC binds permissions to (bundle id, csreq) where csreq encodes
the app's code signature. `codesign --force --deep --sign -` on
WeChat changes the csreq, silently invalidating every existing TCC
grant for com.tencent.xinWeChat — yet System Settings still paints
each toggle as ON because the UI only checks bundle id, hiding the
drift. WeChat then reprompts for screen recording / camera /
microphone / file access despite "looking allowed".

Three doc-only updates, no code changes:

- README.md quick start: add the `tccutil reset` loop right after the
  codesign step, plus a one-line callout pointing at the deep-dive
  section.
- SKILL.md macOS init flow: same loop in the agent-readable order, so
  agents executing the steps don't skip it.
- docs/macos-permission-guide.md: new section 五 (5) with first-principles
  root cause, the reset loop, the macOS 26 "录屏与系统录音 / 仅系统
  录音" (Screen & System Audio Recording / System Audio Recording Only) UI split footgun, and ad-hoc signature verification.

Builds on the BobbyCat PR #29 — keeps the symptom description and the
macOS 26 UI split note, expands scope from ScreenCapture-only to all
TCC services that re-signing actually breaks (Camera / Microphone /
AppleEvents / AddressBook / Documents / Downloads / Desktop), drops
the misleading TCC.db sqlite query (path varies by macOS version, can
need FDA, and is no more useful than just trying WeChat's screenshot
again), and explicitly leaves the reset as a manual step rather than
auto-running it from `wx init` because it would wipe currently-working
grants.

Co-authored-by: BobbyCat <114374951+BobbyCats@users.noreply.github.com>
2026-05-14 15:13:50 +08:00
Tsing 1b00d04598
feat: expose url field for link/appmsg messages (#18)
* feat: expose url field for link/appmsg messages

Extract <url> from appmsg XML in type-49 messages and append it as
a 'url' field in history/search output. The field is omitted when
the message has no valid URL (non-link types, empty, non-http).

* fix: normalize appmsg urls across query outputs

---------

Co-authored-by: tsinghu <tsinghu@tencent.com>
Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 14:46:34 +08:00
Haoqing Wang b0431352ce
feat(appmsg): support parsing the original text of quoted messages (#28)
* feat(appmsg): parse quoted message content

* docs(appmsg): document quote message output
2026-05-14 14:42:03 +08:00
Haoqing Wang 35a8f0e94b
feat(group): support displaying group nicknames / group card names (#23)
* feat: support group nicknames

* fix(group): keep duplicate nickname senders separate in stats

---------

Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 14:22:55 +08:00
刘传佳 d750ef6e9f
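fix(cli,config): fix init failure under sudo + daemon not reloading (#37)
* fix(cli,config): fix init failure under sudo + daemon not reloading

  - cli/transport: add stop_daemon(); init now stops the old daemon automatically afterwards
  - config: cli_dir() prefers the SUDO_USER environment variable to avoid writing to /root/.wx-cli
  - config: auto_detect_db_dir() sorts by the latest .db file mtime so the newest directory is picked correctly
  - daemon/server: dispatch gains a ReloadConfig command (reserved)
  - ipc: Request gains a ReloadConfig variant
  - scanner/linux: remove debug logging, clean up the unused bail import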

* fix(config): resolve sudo home via passwd lookup

---------

Co-authored-by: cjliu <cjliu@upointech.com>
Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 13:50:04 +08:00
jackwener 6659f48984 chore: bump version to 0.1.10 2026-04-19 21:27:59 +08:00
jakevin c7e2775aa6
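perf(sns): parse_post_xml uses a single roxmltree DOM pass, dropping the regex+DOM double parse (#17)
* perf(sns): parse_post_xml uses a single roxmltree DOM pass, dropping the regex+DOM double parse

Previously, during the full-table scans in q_sns_feed / q_sns_search, each SnsTimeLine.content
was parsed twice: extract_xml_text did a string scan for createTime / contentDesc
/ username, and parse_post_media then built a complete roxmltree DOM again to get the media
list. On scans of 10k+ rows this is plainly wasted work.

This refactor:

- parse_post_xml calls Document::parse once; after locating TimelineObject, all
  fields (createTime / contentDesc / username / media / location) share the same
  doc, so roxmltree builds it only once.
- Split parse_post_media into parse_media_from_timeline(node) so callers do not
  parse again after an external parse; the old parse_post_media(&str) is now used only by unit tests and is marked
  #[cfg(test)].
- Remove sns_location_re (regex extraction of poiName is no longer needed).
- Side effect: roxmltree decodes XML entities automatically, so the content / location /
  username fields now output decoded text (the old string scan kept `&lt;` etc. verbatim).
  This is the more correct semantics for downstream consumers; the new unit test
  parse_decodes_xml_entities_in_content pins the behavior down.
- Add unit test parse_returns_defaults_for_malformed_xml covering the fallback path when the DOM parse
  fails (no panic; author falls back to the column value).

The LIKE prefilter in q_sns_search still uses the extract_xml_text(contentDesc) string scan
for false-positive filtering. That step is faster than building a DOM and is a real optimization, so it
stays. q_sns_notifications also still uses extract_xml_text and is untouched by this PR (each call only
processes ~limit rows, so the gain from a DOM is small, and leaving it avoids widening the scope).

Verification:
- cargo check ×3 targets (darwin / windows-gnu / linux-gnu)
- cargo test 39 passed (37 → 39, 2 new)

* refactor(sns): parse_post_xml dedups the two ParsedPost early-return blocks

A self-review before merge found that the two fallback paths (Document::parse failure and TimelineObject
not found) contained the exact same 9-line ParsedPost literal. Extract it into an empty()
closure, going from 2×9 lines to 1×7 lines plus two `return empty()` calls.

Behavior is fully equivalent (including author = column fallback).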

* fix(sns): salvage scalar fields from malformed post xml
2026-04-19 13:56:55 +08:00
郭立lee 2b5d872f0b
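feat(sns): sns-feed / sns-search output the full media[] field (#15)
Incremental on top of #14: upgrade media_count in sns-feed / sns-search to a full media[] array (with url/thumb/key/token/md5/enc_idx/size + video_md5/duration), which downstream consumers can use directly for image proxying or offline rendering.

- Use roxmltree (pure Rust, no C dependency) instead of regex to extract attributes
- Field names align with the Python _parse_media in the artifacts repository, which keeps cross-implementation diffs easy
- 14 sns unit tests: the author added 6 fixtures (single image / three images / video / text only / malformed / missing totalSize), and the existing 8 are kept
- Fully compatible with the earlier PR #14 work: --user XML fallback fix / SNS_MAX_LIMIT / SNS_MAX_SCAN / escape_like_pattern

Author: leeguooooo <guoli@zhihu.com>
Co-fixed-by: wx-cli-coder (rebase + conflict resolution + test module merge + media_count semantics documentation)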
2026-04-19 02:22:55 +08:00
JL e8939f315d
feat(sns): sns-notifications / sns-feed / sns-search (#14)
Add 3 Moments (朋友圈) commands: sns-notifications / sns-feed / sns-search.
PR review fixes (already pushed to the same branch):
- Fix a bug where the --user filter conflicted with the XML <username> fallback (found by @wx-cli-codex)
- Add defensive upper bounds SNS_MAX_LIMIT / SNS_MAX_SCAN
- Extract the escape_like_pattern() helper
- Add 8 unit tests (parse_post_xml / escape_like_pattern)

cargo check passes on all three targets: aarch64-darwin / x86_64-pc-windows-gnu / x86_64-unknown-linux-gnu.
Co-authored-by: fengliu222 <fengliu222@users.noreply.github.com>
2026-04-19 01:58:21 +08:00
郭立lee f0dcd4ea05
docs(readme): explain how to fetch more than 500 messages (#13)
Clarify that the 500-message behavior is only a default limit, not a hard cap.
Document `-n/--limit` examples for history, search, and export in both README and SKILL.
2026-04-18 15:01:15 +08:00
jackwener 697d3fc720 chore: bump version to 0.1.9 2026-04-18 02:11:28 +08:00
jackwener 1e52014a6b perf(daemon): Arc<Names> + tokio RwLock, O(1) clone per IPC request
Was: Arc<std::sync::RwLock<Names>>; each dispatch clone_names() copied
4 HashMaps (~100KB for a user with 2700 contacts) and used std RwLock
which blocks the tokio worker thread during the clone.

Now: Arc<tokio::sync::RwLock<Arc<Names>>>; dispatch takes the read
guard, does Arc::clone (pointer bump), drops the guard, then spawns
the query work. Names is immutable after daemon startup; Arc is ideal.

Smoke tested: `wx sessions --json` returns correct data including
chat_type; 8 concurrent clients finish in 12ms.
2026-04-18 02:10:45 +08:00
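The locking pattern in this commit (take the read guard, bump the `Arc`, drop the guard, then do the work) looks roughly like this. To keep the sketch dependency-free it uses `std::sync::RwLock` instead of `tokio::sync::RwLock`; the `Names` field and the `Shared` / `snapshot` names are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Stand-in for the daemon's name tables (the real struct holds 4 HashMaps).
struct Names {
    display: HashMap<String, String>,
}

struct Shared {
    names: RwLock<Arc<Names>>,
}

impl Shared {
    /// Hand out a cheap snapshot: only the Arc pointer is cloned,
    /// and the read guard is released when this function returns.
    fn snapshot(&self) -> Arc<Names> {
        let guard = self.names.read().expect("lock poisoned");
        Arc::clone(&*guard)
    }
}
```

Each request then works on its own `Arc<Names>` without holding the lock, which is where the O(1) per-request cost comes from.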
JL e977007306
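feat(unread): classify sessions by chat_type, add --filter (#9)
Before: wx unread / sessions / history lumped Official Accounts, the folded subscription-account entry
(brandsessionholder), folded group chats (@placeholder_foldgroup), and verified service accounts
all under is_group=false, mixed in with real private chats. Even entries whose username looks like wxid_* but
are actually Official Accounts could not be told apart at all.

Changes:
- Add the chat_type_of(username, names) helper, whose output is always one of
  group / official_account / folded / private.
- Criteria, in order: @chatroom → group; brandsessionholder / @placeholder_foldgroup
  → folded; contact.verify_flag != 0 → official_account (covers wxid_* usernames
  that are really Official Accounts, as well as bank / brand service accounts and verified accounts such as qqsafe / mphelper);
  gh_* / biz_* / @* prefixes as a fallback; everything else is private.
- load_names also reads contact.verify_flag, and Names::is_verified wraps the lookup.
- q_sessions / q_unread / q_history / q_new_messages / q_stats output
  gains a chat_type field; is_group is kept for backward compatibility and is now derived from chat_type.
- wx unread gains --filter; the clap value_parser restricts values to
  all / private / group / official / folded, comma-separated for multiple values, default all.
  Example: wx unread --filter private,group filters out Official Accounts and folded entries.
- SKILL.md / README.md document the new field and its usage.
- .gitignore adds target/ (standard for Rust projects).

Performance: the SQL for the default wx unread is the same as before (LIMIT kept). Only when
--filter is passed does it switch to a full-table scan followed by filtering in Rust; otherwise the SQL LIMIT
would truncate the rows before filtering and drop entries that match the filter.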
2026-04-18 01:59:35 +08:00
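The ordered criteria for `chat_type_of` can be sketched as a chain of early returns. This is an approximation: the real helper takes `names` and looks up `verify_flag` itself, whereas the sketch receives a precomputed `is_verified` flag, and mapping the `gh_*` / `biz_*` / `@*` prefix fallback to `official_account` is an assumption, since the commit does not say which type the fallback yields.

```rust
/// Classify a session username. `is_verified` is assumed to mean
/// contact.verify_flag != 0 for this username.
fn chat_type_of(username: &str, is_verified: bool) -> &'static str {
    if username.ends_with("@chatroom") {
        return "group";
    }
    if username == "brandsessionholder" || username == "@placeholder_foldgroup" {
        return "folded";
    }
    if is_verified {
        return "official_account";
    }
    // Prefix fallback; the resulting type here is an assumption.
    if username.starts_with("gh_") || username.starts_with("biz_") || username.starts_with('@') {
        return "official_account";
    }
    "private"
}
```

The order matters: the folded entries start with or contain `@`, so they must be matched before the prefix fallback.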
jackwener bfb7048cf0 fix: bind CLI --version to crate version (credit: @leeguooooo #4) 2026-04-18 01:55:37 +08:00
jackwener c564438994 chore: bump version to 0.1.8 2026-04-18 01:50:25 +08:00
jackwener e44990ba01 fix: drop privileges after key scan to avoid root-owned ~/.wx-cli/ (#7 #8)
Root cause: `wx init` does two conceptually-separate things in one
privileged process: (1) scan WeChat memory for keys (needs root) and
(2) write ~/.wx-cli/{all_keys,config}.json (needs only user). When
run under sudo, the files inherit root ownership, so later the daemon
(forked as the user) can't create daemon.sock/log/pid → silent 15s
timeout.

Also: all_keys.json is the raw AES key; 0644 leaked it to every user
on the system.

Fix in init.rs: after the scan completes, immediately setgid+setuid
back to \$SUDO_UID/\$SUDO_GID and set umask 0o077 before any file I/O.
Files are then created as the real user with 0600 by default. Migrate
old broken installs by chown+chmod-recursive before the setuid call.

Fix in transport.rs: pre-check that ~/.wx-cli/ is writable before
spawning daemon; on EACCES print a clear "sudo chown -R ..." hint
instead of the useless "daemon 启动超时" ("daemon startup timed out") message.
2026-04-18 01:48:42 +08:00
jackwener ae74072b3f docs: add Windows cross-check setup and IPC same-library rule 2026-04-17 16:43:05 +08:00
jackwener 4e6907c5cc chore: bump version to 0.1.7 2026-04-17 16:42:02 +08:00
jackwener 6a2b23486a fix: client connects via interprocess on Windows, not OpenOptions
Server uses interprocess::local_socket, but client was using
std::fs::OpenOptions("\\.\pipe\wx-cli-daemon") which fails to
connect to pipes created by interprocess's tokio listener.

Use the same interprocess client API on both sides for consistency.

Verified with: cargo check --target x86_64-pc-windows-gnu (mingw-w64).
2026-04-17 16:41:32 +08:00
jakevin 33758671d6
Merge pull request #2 from leeguooooo/fix/skill-md-frontmatter
fix(skill): add YAML frontmatter so `skills` CLI can detect SKILL.md
2026-04-17 16:36:33 +08:00
jackwener fe71f1e9f8 chore: bump version to 0.1.6 2026-04-17 15:05:44 +08:00
jackwener 18daf5b22e fix: Windows init and daemon startup (issue #5)
Three related bugs caused "wx init" and daemon startup to fail on Windows:

1. init.rs: create ~/.wx-cli/ before writing all_keys.json (was created
   only before config.json, so first write failed with ENOENT)

2. transport.rs (Windows): daemon.log was always empty because stderr
   was never redirected, and log file open silently fell back to null
   when parent dir didn't exist. Now mirror the Unix version: create
   parent dir, try_clone to redirect both stdout and stderr.

3. server.rs (Windows): interprocess GenericNamespaced auto-prepends
   \\.\pipe\ on Windows. Passing the full path caused a double-prefixed
   pipe name that clients (using raw \\.\pipe\wx-cli-daemon) could
   never connect to, leading to the 15s startup timeout.
2026-04-17 14:01:04 +08:00
leeguooooo 34698faa65 fix(skill): add YAML frontmatter to SKILL.md so `skills` CLI can detect it
The `skills` CLI (https://github.com/openclaw/skills) requires a YAML
frontmatter block with `name` and `description` to recognize a SKILL.md
as a valid skill. The current file declares description as a Markdown
blockquote, which causes:

  $ npx skills add jackwener/wx-cli -g
  No valid skills found. Skills require a SKILL.md with name and description.

Switching to standard frontmatter makes installation work end-to-end.

Verified with `npx skills add . -l`:
  Found 1 skill
    wx-cli
2026-04-17 13:27:07 +09:00
jackwener 2c9df70d44 docs: emphasize YAML is more token-efficient, JSON for jq 2026-04-17 11:19:35 +08:00
jackwener 3473f47d1d docs: use --query instead of -q for clarity 2026-04-17 11:18:32 +08:00
jackwener e4bfc39c8f fix: improve task_for_pid error message and document codesign steps 2026-04-17 10:46:55 +08:00
jackwener 0e2711dcf8 chore: bump to 0.1.5, fix publish to skip already-published versions 2026-04-17 09:25:04 +08:00
jackwener 7c27a83340 fix: add missing wx.js launcher to git (was gitignored by global config) 2026-04-17 09:13:03 +08:00
jackwener a5de749f0a chore: bump version to 0.1.4 2026-04-17 00:41:01 +08:00
jackwener 69c7a5666c docs: add acknowledgment for ylytdeng/wechat-decrypt 2026-04-16 23:54:50 +08:00
jackwener 3eddfa0ffa fix: add permissions:write, fix Windows copy to use PowerShell syntax 2026-04-16 23:49:00 +08:00
jackwener a2239c0dca ci: check linux only (windows needs MSVC tools, covered by build job) 2026-04-16 23:42:31 +08:00
jackwener 2170db93eb ci: remove arm64 from check job (no cross-compiler available) 2026-04-16 23:40:28 +08:00
jackwener ee1da2ffa6 docs: add CLAUDE.md and AGENTS.md with cross-platform check rules 2026-04-16 23:38:47 +08:00
jackwener d8f4c6e87d fix: replace macOS-only libc::__error() with std::io::Error::last_os_error() 2026-04-16 23:35:30 +08:00
jackwener 3413f6c8f4 fix: move anyhow/chrono/dirs/md5/regex back to [dependencies] section 2026-04-16 23:31:41 +08:00
jackwener 2afea74eb9 ci: add cross-platform cargo check job before build 2026-04-16 23:26:08 +08:00
jackwener 6931dfc4cc chore: update Cargo.lock for v0.1.3 2026-04-16 23:25:02 +08:00
jackwener ad256288e1 chore: bump version to 0.1.3 2026-04-16 23:15:48 +08:00
jackwener 59dd6bfa24 fix: Windows build errors (handle_connection, creation_flags, mkdir)
- server.rs: add handle_connection_windows for named pipe connections
- transport.rs: import CommandExt trait for creation_flags on Windows
- release.yml: mkdir -p before binary copy to npm bin dirs
2026-04-16 23:14:58 +08:00
jackwener f9bca1f872 docs: add npx skills add instruction 2026-04-16 23:08:57 +08:00
jackwener 42e5ac38c3 docs: add SKILL.md for AI agent integration 2026-04-16 22:46:13 +08:00
56 changed files with 10128 additions and 929 deletions

@@ -5,8 +5,24 @@ on:
     tags: ['v*']
   workflow_dispatch:

+permissions:
+  contents: write
+
 jobs:
+  check:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: dtolnay/rust-toolchain@stable
+        with:
+          targets: x86_64-unknown-linux-gnu
+
+      - name: cargo check linux target
+        run: cargo check --target x86_64-unknown-linux-gnu
+
   build:
+    needs: check
     strategy:
       fail-fast: false
       matrix:
@@ -70,13 +86,16 @@ jobs:
         if: matrix.os != 'windows-latest'
         run: |
           cp target/${{ matrix.target }}/release/wx ${{ matrix.asset }}
+          mkdir -p npm/platforms/${{ matrix.npm_dir }}/bin
           cp target/${{ matrix.target }}/release/wx npm/platforms/${{ matrix.npm_dir }}/bin/wx

       - name: Copy binary (Windows)
         if: matrix.os == 'windows-latest'
+        shell: pwsh
         run: |
-          copy target\${{ matrix.target }}\release\wx.exe ${{ matrix.asset }}
-          copy target\${{ matrix.target }}\release\wx.exe npm\platforms\${{ matrix.npm_dir }}\bin\wx.exe
+          Copy-Item "target\${{ matrix.target }}\release\wx.exe" "${{ matrix.asset }}"
+          New-Item -ItemType Directory -Force -Path "npm\platforms\${{ matrix.npm_dir }}\bin" | Out-Null
+          Copy-Item "target\${{ matrix.target }}\release\wx.exe" "npm\platforms\${{ matrix.npm_dir }}\bin\wx.exe"

       - uses: actions/upload-artifact@v4
         with:
@@ -93,8 +112,6 @@ jobs:
         if: startsWith(github.ref, 'refs/tags/')
         with:
           files: ${{ matrix.asset }}
-        env:
-          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

   publish-npm:
     needs: build
@@ -129,7 +146,7 @@ jobs:
         run: |
           for dir in darwin-arm64 darwin-x64 linux-x64 linux-arm64 win32-x64; do
             cd npm/platforms/$dir
-            npm publish
+            npm publish 2>&1 | tee /tmp/npm-out.txt || grep -q "previously published" /tmp/npm-out.txt || exit 1
             cd ../../..
           done
         env:
@@ -138,6 +155,6 @@ jobs:
       - name: Publish main package
         run: |
           cd npm/wx-cli
-          npm publish
+          npm publish 2>&1 | tee /tmp/npm-out.txt || grep -q "previously published" /tmp/npm-out.txt || exit 1
         env:
           NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

7
.gitignore vendored

@ -15,6 +15,9 @@ hook_start_output.txt
hook_stderr.txt hook_stderr.txt
run_hook.bat run_hook.bat
# Rust
target/
# Python # Python
__pycache__/ __pycache__/
*.py[cod] *.py[cod]
@ -23,5 +26,5 @@ __pycache__/
# OS # OS
.DS_Store .DS_Store
Thumbs.db Thumbs.db
find_all_keys_macos find_all_keys_macos
.claude/worktrees/ .claude/worktrees/

33
AGENTS.md 100644

@ -0,0 +1,33 @@
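# wx-cli Agent Rules
## Required after every code change
1. **`cargo check`**: run it immediately after changing any `.rs` file; do not commit until it passes
2. **When cross-platform code changes, also run the cross-platform checks:**
```bash
cargo check --target x86_64-unknown-linux-gnu
cargo check --target x86_64-pc-windows-msvc
```
3. **When the version in `Cargo.toml` changes:** `cargo update --workspace`
## Prohibited
- Do not commit while `cargo check` is failing
- Do not assume cross-platform code is fine after checking only on macOS locally
- Do not tag a release after changing `Cargo.toml` without updating `Cargo.lock`
## Common pitfalls
| Pitfall | Correct approach |
|------|----------|
| `libc::__error()` inside `#[cfg(unix)]` | Use `std::io::Error::last_os_error()` |
| Placing a general dependency after `[target.cfg(windows).dependencies]` | A TOML section is greedy, so general dependencies must come before any target section |
| Changing the version and forgetting to update Cargo.lock | `cargo update --workspace` |
| Calling a trait method in Windows code without importing the trait | Import it, e.g. `use std::os::windows::process::CommandExt` |
| Referencing an undefined function inside `#[cfg(windows)]` | The cross-platform check catches this |
## Push rules
- Remote name: `wx-cli`, over SSH
- Push immediately after every commit
- To tag a release: `git tag vX.Y.Z && git push wx-cli vX.Y.Z`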

63
CLAUDE.md 100644

@ -0,0 +1,63 @@
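# wx-cli Project Rules
## After Every Code Change
**After any change to Rust code, immediately run:**
```bash
cargo check
```
Do not commit or push until `cargo check` passes.
**When a change touches cross-platform code (`#[cfg(...)]` / `Cargo.toml` dependencies), additionally run:**
```bash
cargo check --target x86_64-unknown-linux-gnu
cargo check --target x86_64-pc-windows-gnu # use this one on macOS; msvc requires the MSVC toolchain
```
On macOS, install the target and the cross compiler once:
```bash
rustup target add x86_64-pc-windows-gnu
brew install mingw-w64 # provides x86_64-w64-mingw32-gcc, needed by C dependencies such as zstd-sys
```
These two check commands surface Linux/Windows-specific compile errors early; they **only type-check** (no linking).
## IPC / same-library convention across platforms
When touching any IPC / networking code: **both ends must use the same library and the same API**. For example, if the server uses `interprocess::local_socket::tokio::Listener`, the client must use `interprocess::local_socket::Stream::connect`; it cannot open the same path with `std::fs::OpenOptions`. Even when the kernel-level name matches, the underlying framing / overlapped mode is incompatible.
## Cargo.toml modification rules
- After changing the version, run `cargo update --workspace` to update Cargo.lock
- When adding or moving a `[target.'cfg(...)'.dependencies]` section, make sure later dependencies are not accidentally absorbed into it (a TOML section extends until the next header)
- Run `cargo check` afterwards to verify
## Git rules
- Push after every commit: `git push wx-cli main`
- Before tagging, confirm that both `cargo check` and `cargo update --workspace` are done
- Use the `wx-cli` remote (SSH), not `origin`
## Platform compatibility checklist
Run the cross-platform checks when changing any of the following:
- [ ] `libc::` calls: confirm the function exists on both Linux and macOS (`__error` is macOS-only; use `std::io::Error::last_os_error()` instead)
- [ ] `#[cfg(unix)]` blocks: unix covers both macOS and Linux, so macOS-only APIs are not allowed
- [ ] `Cargo.toml` dependency section order: check that no dependency accidentally falls into a target section
- [ ] Windows named pipe code: confirm every function is defined and all trait imports are present
## CI structure
```
check job (ubuntu)
└── cargo check --target linux-x86, linux-arm64, windows-x86
↓ on success
build jobs (5 platforms in parallel)
↓ after all succeed
publish-npm job
```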

16
Cargo.lock generated

@ -105,6 +105,12 @@ version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c08606f8c3cbf4ce6ec8e28fb0014a2c086708fe954eaa885384a6165172e7e8" checksum = "c08606f8c3cbf4ce6ec8e28fb0014a2c086708fe954eaa885384a6165172e7e8"
[[package]]
name = "base64"
version = "0.22.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "72b3254f16251a8381aa12e40e3c4d2f0199f8c6508fbecb9d91f575e0fbb8c6"
[[package]] [[package]]
name = "bitflags" name = "bitflags"
version = "2.11.1" version = "2.11.1"
@ -719,6 +725,12 @@ version = "0.8.10"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dc897dd8d9e8bd1ed8cdad82b5966c3e0ecae09fb1907d58efaa013543185d0a" checksum = "dc897dd8d9e8bd1ed8cdad82b5966c3e0ecae09fb1907d58efaa013543185d0a"
[[package]]
name = "roxmltree"
version = "0.20.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6c20b6793b5c2fa6553b250154b78d6d0db37e72700ae35fad9387a46f487c97"
[[package]] [[package]]
name = "rusqlite" name = "rusqlite"
version = "0.31.0" version = "0.31.0"
@ -1301,10 +1313,11 @@ checksum = "d7249219f66ced02969388cf2bb044a09756a083d0fab1e566056b04d9fbcaa5"
[[package]] [[package]]
name = "wx-cli" name = "wx-cli"
version = "0.1.2" version = "0.3.0"
dependencies = [ dependencies = [
"aes", "aes",
"anyhow", "anyhow",
"base64",
"cbc", "cbc",
"chrono", "chrono",
"clap", "clap",
@ -1315,6 +1328,7 @@ dependencies = [
"md5", "md5",
"pbkdf2", "pbkdf2",
"regex", "regex",
"roxmltree",
"rusqlite", "rusqlite",
"serde", "serde",
"serde_json", "serde_json",


@ -1,6 +1,6 @@
[package] [package]
name = "wx-cli" name = "wx-cli"
version = "0.1.2" version = "0.3.0"
edition = "2021" edition = "2021"
description = "WeChat 4.x (macOS/Linux) local data CLI — decrypt SQLCipher DBs, query chat history, watch new messages" description = "WeChat 4.x (macOS/Linux) local data CLI — decrypt SQLCipher DBs, query chat history, watch new messages"
license = "Apache-2.0" license = "Apache-2.0"
@ -38,9 +38,6 @@ pbkdf2 = "0.12"
# Decompression # Decompression
zstd = "0.13" zstd = "0.13"
# IPC (Unix socket + Windows named pipe, unified)
interprocess = { version = "2", features = ["tokio"] }
# Error handling # Error handling
anyhow = "1" anyhow = "1"
@ -53,8 +50,16 @@ dirs = "5"
# MD5 (contact table name Msg_<md5>) # MD5 (contact table name Msg_<md5>)
md5 = "0.7" md5 = "0.7"
# Attachment ID encoding (base64url)
base64 = "0.22"
# Regular expressions # Regular expressions
regex = "1" regex = "1"
roxmltree = "0.20"
# IPC: Windows named pipe (Unix uses tokio::net::UnixListener directly)
[target.'cfg(windows)'.dependencies]
interprocess = { version = "2", features = ["tokio"] }
[target.'cfg(unix)'.dependencies] [target.'cfg(unix)'.dependencies]
libc = "0.2" libc = "0.2"
@ -66,6 +71,8 @@ windows = { version = "0.58", features = [
"Win32_System_Threading", "Win32_System_Threading",
"Win32_Foundation", "Win32_Foundation",
"Win32_System_Memory", "Win32_System_Memory",
"Win32_System_Com",
"Win32_UI_Shell",
] } ] }
[profile.release] [profile.release]

181
README.md

@ -8,17 +8,35 @@
[![Platform](https://img.shields.io/badge/platform-macOS%20%7C%20Linux%20%7C%20Windows-lightgrey.svg)](#安装) [![Platform](https://img.shields.io/badge/platform-macOS%20%7C%20Linux%20%7C%20Windows-lightgrey.svg)](#安装)
[![Rust](https://img.shields.io/badge/built%20with-Rust-orange.svg)](https://www.rust-lang.org) [![Rust](https://img.shields.io/badge/built%20with-Rust-orange.svg)](https://www.rust-lang.org)
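Sessions · Chat history · Search · Contacts · Group members · Favorites · Stats · Export Sessions · Chat history · Search · Contacts · Group members · Group nicknames · Favorites · Stats · Export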
</div> </div>
--- ---
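## AI Agent Skill
Install into Claude Code, Cursor, Codex and other agents in one step via the [skills CLI](https://github.com/vercel-labs/skills):
```bash
npx skills add jackwener/wx-cli
```
Or install globally:
```bash
npx skills add jackwener/wx-cli -g
```
After installation the agent reads `SKILL.md` automatically to learn how to install and invoke wx-cli.
---
## Features ## Features
- **Zero-dependency install**: a single Rust binary, installed with one command - **Zero-dependency install**: a single Rust binary, installed with one command
- **Millisecond responses**: a background daemon persistently caches decrypted databases and reuses them while mtime is unchanged - **Millisecond responses**: a background daemon persistently caches decrypted databases and reuses them while mtime is unchanged
- **AI-friendly**: YAML output by default; `--json` switches to JSON so LLM agents can call it directly - **AI-friendly**: `history` / `search` / `sessions` / `new-messages` / `stats` / `attachments` return a `{..., meta}` wrapper by default, so agents can consume freshness / source information directly
- **Fully local**: data never leaves the machine; decryption happens on the fly, with no full pre-decryption pass - **Fully local**: data never leaves the machine; decryption happens on the fly, with no full pre-decryption pass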
--- ---
@ -79,10 +97,32 @@ cargo build --release
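**macOS** (WeChat must be ad-hoc signed first so that its memory can be scanned) **macOS** (WeChat must be ad-hoc signed first so that its memory can be scanned)
```bash ```bash
sudo codesign --force --deep --sign - /Applications/WeChat.app # 1. Sign (only needed once; redo after WeChat updates)
codesign --force --deep --sign - /Applications/WeChat.app
# 2. Clear stale TCC authorization records (required after re-signing; otherwise WeChat's screenshot/call permissions may silently stop working)
for s in ScreenCapture Camera Microphone AppleEvents AddressBook \
SystemPolicyDocumentsFolder SystemPolicyDownloadsFolder SystemPolicyDesktopFolder; do
tccutil reset "$s" com.tencent.xinWeChat
done
# 3. Restart WeChat and wait until it is fully logged in
killall WeChat && open /Applications/WeChat.app
# 4. Initialize
sudo wx init sudo wx init
``` ```
> If `codesign` reports `signature in use`, run this first:
> ```bash
> codesign --remove-signature "/Applications/WeChat.app/Contents/Frameworks/vlc_plugins/librtp_mpeg4_plugin.dylib"
> codesign --force --deep --sign - /Applications/WeChat.app
> ```
>
> After re-signing, macOS re-validates TCC privacy authorizations against the new code signature, so the old records become invalid. If you skip `tccutil reset`, WeChat's screenshot / video call / microphone permissions may "look enabled but actually be denied". See the [macOS permissions and signing guide](docs/macos-permission-guide.md#五重签名后微信权限-silent-失效).
> **Side-effect note**: after the ad-hoc re-signing above, macOS will fairly often show the prompt `"WeChat" would like to access data from other apps` (opening official-account articles inside WeChat triggers it especially easily). This is a known side effect of the current macOS invasive init path: re-signing changes WeChat's code identity, so when it accesses its own earlier container / cache data the system treats that as cross-app access. Clicking "Allow" usually only lets the current WeChat process through; to stop the prompt entirely you have to restore the official WeChat. Doing so only gives up **the current default path that depends on re-signing**; it **does not mean giving up memory-scan**: in a local GUI Terminal, once Terminal.app holds the "Developer Tools" TCC authorization, scanning an Apple-signed WeChat should still work (empirical coverage is limited to Catalina / Big Sur; macOS 14+ has not been tested within this project). Only the combination of remote SSH plus Apple-signed WeChat strictly requires re-signing. See the [macOS permissions and signing guide, §6](docs/macos-permission-guide.md#六微信-想访问其他-app-的数据-弹窗).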
**Linux** **Linux**
```bash ```bash
@ -95,14 +135,14 @@ sudo wx init
wx init wx init
``` ```
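After that, just use it; the daemon starts automatically on the first call. Verify the installation:
```bash ```bash
wx sessions # view recent sessions wx sessions
wx history "张三" # view chat history
wx search "Claude" # search messages
``` ```
If recent sessions are listed, everything is working. The daemon starts automatically on the first call.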
--- ---
## 命令 ## 命令
@ -112,27 +152,135 @@ wx search "Claude" # 搜索消息
```bash ```bash
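wx sessions # 20 most recent sessions wx sessions # 20 most recent sessions
wx unread # sessions with unread messages wx unread # sessions with unread messages
wx unread --filter private,group # unread from real people only (filters out official accounts / folded entries)
wx new-messages # new messages since the last check (incremental) wx new-messages # new messages since the last check (incremental)
wx history "张三" # 50 most recent messages wx history "张三" # 50 most recent messages
wx history "张三" -n 2000 # fetch more history
wx history "AI群" --since 2026-04-01 --until 2026-04-15 wx history "AI群" --since 2026-04-01 --until 2026-04-15
wx search "关键词" # search across all databases wx search "关键词" # search across all databases
wx search "关键词" -n 500 # raise the result limit
wx search "会议" --in "工作群" --since 2026-01-01 wx search "会议" --in "工作群" --since 2026-01-01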
``` ```
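`history` / `search` / `export` all accept `-n` / `--limit` to set the number of items. The default only exists to avoid dumping too many messages at once; it is not a hard cap.
Session and message output both carry a `chat_type` field with the values `private` / `group` / `official_account` / `folded`. `official_account` covers official accounts, subscription accounts, service accounts, and system notifications such as `mphelper` / `qqsafe`; `folded` corresponds to WeChat's two aggregate entries, "folded subscription accounts" and "folded group chats".
In group chats, `last_sender`, `sender`, and the `top_senders` of `stats` prefer the group nickname (group card). If the local database has no group nickname for the member, they fall back to the contact remark, the WeChat nickname, or the username.
`history` / `search` / `new-messages` / `attachments`, as well as `stats.top_senders`, additionally carry a stable identity triple in group context:
- `sender_username`: the stable wxid, used to tell apart two members with the same nickname
- `sender_contact_display`: the display name from the address book (remark > nickname > wxid fallback)
- `sender_group_nickname`: the group card itself (the same source as `sender`, so machine consumers need not parse it again)
When the wxid cannot be resolved (id2u has no match and the legacy `wxid_xxx:\n...` prefix is also absent), these three fields are omitted, so that fabricated empty fields do not pollute downstream filtering.
`history` / `search` / `sessions` / `unread` / `new-messages` / `stats` / `attachments` now all carry `meta`:
- `status`: `ok` / `possibly_stale` / `possibly_stale_unknown_shards` / `windowed`
- `unknown_shards`: `message_N.db` shards that exist on disk but for which the daemon currently has no key; when non-empty, run `wx init --force` first
- `chat_latest_timestamp` / `chat_latest_db`: the time and source shard of the newest message in the data matched by this query
- `session_last_timestamp`: the latest time WeChat itself recorded in `session.db`; if it is clearly ahead of `chat_latest_timestamp`, the result may be missing messages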
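As a rough illustration of how a script might act on these fields, the sketch below compares `session_last_timestamp` with `chat_latest_timestamp`. The helper name `freshnessGap` and the 60-second tolerance are assumptions made for illustration; the source does not say what threshold the daemon itself uses for "clearly ahead".

```javascript
'use strict';

// Hypothetical helper: decide whether a `meta` object hints at missing data.
// Field names come from the README; the tolerance is an assumed value.
function freshnessGap(meta, toleranceSeconds = 60) {
  const unknown = Array.isArray(meta.unknown_shards) ? meta.unknown_shards : [];
  const chat = Number(meta.chat_latest_timestamp) || 0;
  const session = Number(meta.session_last_timestamp) || 0;
  const lagSeconds = Math.max(0, session - chat);
  return {
    lagSeconds,
    // Non-empty unknown_shards means the daemon lacks keys for some shards.
    needsReinit: unknown.length > 0,
    // session.db being ahead of the newest matched message hints at gaps.
    maybeIncomplete: lagSeconds > toleranceSeconds,
  };
}

const sample = {
  status: 'possibly_stale',
  unknown_shards: [],
  chat_latest_timestamp: 1715750400,
  session_last_timestamp: 1715760000,
};
console.log(freshnessGap(sample));
```

For windowed or filtered queries a gap is expected, so a check like this is probably only meaningful when `status` is not `windowed`.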
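By default, human users see an actionable warning on stderr, while agents / scripts can read `meta` directly from stdout. Passing `--with-meta` additionally returns `per_shard_latest` / `cache_mode_per_shard`; the hidden flag `--debug-source` also includes the real `shard_paths`.
Quoted messages are shown in `history` / `search` / `new-messages` output as the current reply plus the quoted original: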
```text
[引用] 当前回复
↳ 发送者: 被引用内容
```
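`--type link` / `--type file` include WeChat appmsg variants such as links, files, merged chat records, and quoted messages; search also matches the quoted original text that becomes visible after decompression.
### Moments (SNS)
Three separate commands that distinguish "notifications" from "posts":
```bash
wx sns-notifications # like/comment notifications (unread only by default)
wx sns-notifications --include-read -n 100 # include read ones
wx sns-feed # 20 most recent Moments posts (timeline)
wx sns-feed --user "张三" # restrict to one author
wx sns-feed --since 2026-04-01 -n 100 # filter by time
wx sns-search "关键词" # full-text search over post bodies
wx sns-search "婚礼" --user "李四" --since 2023-01-01
```
- **sns-notifications** returns interaction notifications: `type` (`like`/`comment`), `from_nickname`, `content` (comment body), `feed_preview` + `feed_author` (the original post)
- **sns-feed** / **sns-search** return Moments posts: `author`, `content` (body), `media`, `media_count`, `location`, `timestamp`. The `media` field holds each image's url/thumb/key/token/md5/enc_idx/size, for downstream image proxying or offline rendering. `media_count = media.len()`, counting the valid `<media>` child nodes found by DOM parsing; malformed XML yields 0
Moments data only covers posts you have actually scrolled past locally (the WeChat app downloads on demand).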
### Official-account articles
Official-account article pushes live in separate `biz_message_*.db` shards; query them with `biz-articles`:
```bash
wx biz-articles # 50 most recent articles
wx biz-articles -n 200 # more
wx biz-articles --account "返朴" # restrict to one account (fuzzy name match)
wx biz-articles --since 2026-05-01 --until 2026-05-10
wx biz-articles --unread # only accounts with unread items, newest article per account
wx biz-articles --json | jq '.[].url' # consume URLs downstream
```
Each item returns: `account` / `account_username` / `title` / `url` / `digest` / `cover_url` / `time` / `timestamp` / `recv_time_str`. Multi-article pushes are expanded into multiple rows.
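### Attachment extraction (images)
Attachment payloads in chats are stored as `.dat` files under `xwechat_files/<wxid>/msg/attach/...`; recovering the original image requires the md5 from the message's `message_resource.db` plus a platform-specific image key.
```bash
# 1) List the image attachments in a chat to obtain the opaque attachment_id
wx attachments "张三"
wx attachments "AI群" --kind image -n 100
wx attachments "AI群" --since 2026-04-01 --until 2026-04-15
# 2) Decrypt a single attachment_id and write it out (keeping an extension such as .jpg / .mp4 is recommended)
wx extract <attachment_id> -o ~/Desktop/photo.jpg
wx extract <attachment_id> -o /tmp/x.jpg --overwrite
```
Each `attachments` row carries: `attachment_id` / `kind` / `type` / `local_id` / `timestamp` / `time`; in group chats it also carries `sender` plus the stable identity triple `sender_username` / `sender_contact_display` / `sender_group_nickname` (same semantics as in `history` / `search` / `new-messages`: `sender_username` is the wxid, used to reliably tell apart two members with the same name; when the wxid cannot be resolved these three fields are omitted). `kind` is currently always `image`; the command keeps the name `attachments` so that extending it to other attachment types later does not break the CLI.
The `extract` report carries: `md5` / `dat_path` / `dat_size` / `output` / `output_size` / `format` (the image format actually detected: jpg / png / gif / webp / hevc, etc.) / `decoder` (the decoder actually used: `legacy_xor` / `v1_aes` / `v2`).
Supported decoder tiers:
- **legacy XOR**: early single-byte XOR with no magic; the key is inferred automatically by probing the format from the file's leading bytes
- **V1 fixed-AES**: magic `07 08 V1 08 07`, AES-128-ECB with the fixed key `cfcd208495d565ef`
- **V2 AES + XOR**: magic `07 08 V2 08 07`, AES-128-ECB + raw + XOR, with a platform-derived AES key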
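A minimal sketch of how the three tiers above could be told apart from a file's leading bytes. It assumes the magic is six bytes, `0x07 0x08`, then ASCII `V1` or `V2`, then `0x08 0x07`; that byte-level reading of the notation, and the helper name `pickDecoder`, are assumptions rather than something the source spells out.

```javascript
'use strict';

// Assumed byte layout of the magic: 07 08 'V' '1'|'2' 08 07.
const V1_MAGIC = Buffer.from([0x07, 0x08, 0x56, 0x31, 0x08, 0x07]);
const V2_MAGIC = Buffer.from([0x07, 0x08, 0x56, 0x32, 0x08, 0x07]);

// Hypothetical helper: returns the decoder name used in the `extract` report.
function pickDecoder(dat) {
  const head = dat.subarray(0, 6);
  if (head.equals(V1_MAGIC)) return 'v1_aes';
  if (head.equals(V2_MAGIC)) return 'v2';
  // No magic: fall back to the legacy single-byte XOR tier.
  return 'legacy_xor';
}

console.log(pickDecoder(Buffer.concat([V2_MAGIC, Buffer.alloc(16)])));
```

Files shorter than six bytes simply fall through to `legacy_xor` here; a real decoder would likely reject them earlier.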
V2 image key extraction:
- **macOS**: `kvcomm` cache (take the uin from the `key_<uin>_*.statistic` file name → `md5(str(uin) + wxid)[:16]`) plus a brute-force fallback (enumerate 2^24 candidates for `md5(str(uin))[:4] == wxid_suffix`); xor_key = `uin & 0xff`, **not a hardcoded 0x88**
- **Windows**: scan `Weixin.exe` memory for `[A-Za-z0-9]{32|16}` candidates and verify them against the V2 template ciphertext block
- **Linux**: nothing available upstream; a V2 .dat is reported as unsupported
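### Contacts & groups ### Contacts & groups
```bash ```bash
wx contacts # contact list wx contacts # contact list
wx contacts -q "李" # search by name wx contacts --query "李" # search by name
wx members "AI交流群" # group member list wx members "AI交流群" # group member list
``` ```
The member fields returned by `wx members --json` are:
- `username`: WeChat's internal username
- `display`: the name intended for display, preferring the group nickname
- `contact_display`: the contact remark or WeChat nickname
- `group_nickname`: the group nickname; an empty string when there is no local record
- `is_owner`: whether the member is the group owner
### Favorites & stats ### Favorites & stats
```bash ```bash
wx favorites # all favorites wx favorites # all favorites
wx favorites --type image # filter by type: text/image/article/card/video wx favorites --type image # filter by type: text/image/article/card/video
wx favorites -q "关键词" # search favorites wx favorites --query "关键词" # search favorites
wx stats "AI群" # chat statistics wx stats "AI群" # chat statistics
wx stats "AI群" --since 2026-01-01 # restrict the time range wx stats "AI群" --since 2026-01-01 # restrict the time range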
``` ```
@ -141,17 +289,20 @@ wx stats "AI群" --since 2026-01-01 # 指定时间范围
```bash ```bash
wx export "张三" --format markdown -o chat.md wx export "张三" --format markdown -o chat.md
wx export "张三" -n 2000 --format markdown -o chat.md
wx export "AI群" --since 2026-01-01 --format json wx export "AI群" --since 2026-01-01 --format json
``` ```
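### Output format ### Output format
Output is YAML by default; add `--json` to switch to JSON (suited to AI agents / `jq`) Output is YAML by default; `--json` switches to JSON. For agents, the stdout of `history` / `search` / `sessions` / `new-messages` / `stats` / `attachments` is now a wrapper rather than a bare array: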
```bash ```bash
wx sessions --json wx sessions --json
wx search "关键词" --json | jq '.[0].content' wx search "关键词" --json | jq '.results[0].content'
wx new-messages --json wx new-messages --json
wx history "张三" --json | jq '.meta'
wx history "张三" --json --with-meta | jq '.meta.cache_mode_per_shard'
``` ```
### Daemon 管理 ### Daemon 管理
@ -193,7 +344,13 @@ daemon 首次解密后将数据库和 mtime 持久化到 `~/.wx-cli/cache/`。
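WeChat 4.x encrypts its local databases with SQLCipher 4: AES-256-CBC + HMAC-SHA512, PBKDF2 with 256,000 iterations. WCDB caches the derived raw key in process memory in the form `x'<64hex_key><32hex_salt>'`. WeChat 4.x encrypts its local databases with SQLCipher 4: AES-256-CBC + HMAC-SHA512, PBKDF2 with 256,000 iterations. WCDB caches the derived raw key in process memory in the form `x'<64hex_key><32hex_salt>'`.
wx-cli scans the WeChat process memory for this pattern to extract the key, using the macOS Mach VM API (`mach_vm_region` + `mach_vm_read`) or Linux `/proc/<pid>/mem`; the daemon decrypts and caches on demand. wx-cli scans the WeChat process memory for this pattern to extract the key, using the macOS Mach VM API (`mach_vm_region` + `mach_vm_read`), Linux `/proc/<pid>/mem`, or Windows `VirtualQueryEx` + `ReadProcessMemory` (which requires `PROCESS_VM_READ | PROCESS_QUERY_INFORMATION` access); the daemon decrypts and caches on demand.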
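To make the key pattern concrete, here is a small sketch of matching `x'<64hex_key><32hex_salt>'` inside an arbitrary byte buffer. It only illustrates the pattern match on an in-memory `Buffer`; reading another process's memory is out of scope, the helper name `findRawKeys` is hypothetical, and accepting both upper- and lower-case hex is an assumption.

```javascript
'use strict';

// x' + 64 hex chars (key) + 32 hex chars (salt) + closing quote.
const RAW_KEY_RE = /x'([0-9a-fA-F]{64})([0-9a-fA-F]{32})'/g;

// Hypothetical helper: scan one memory chunk for raw-key candidates.
function findRawKeys(chunk) {
  // latin1 maps every byte to exactly one character, so match offsets
  // are also byte offsets into the chunk.
  const text = chunk.toString('latin1');
  const found = [];
  for (const m of text.matchAll(RAW_KEY_RE)) {
    found.push({ offset: m.index, key: m[1].toLowerCase(), salt: m[2].toLowerCase() });
  }
  return found;
}

// Synthetic buffer: binary noise around one fake key literal.
const fake = Buffer.concat([
  Buffer.from([0x00, 0xff, 0x10]),
  Buffer.from("x'" + 'ab'.repeat(32) + 'cd'.repeat(16) + "'", 'latin1'),
  Buffer.from([0x00]),
]);
console.log(findRawKeys(fake));
```

A candidate found this way would still need to be validated against a database page before it can be trusted.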
---
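## Acknowledgements
This project was inspired by [ylytdeng/wechat-decrypt](https://github.com/ylytdeng/wechat-decrypt) and was redesigned and reimplemented on that basis. Thanks to the original author for the research and exploration.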
--- ---

374
SKILL.md 100644

@ -0,0 +1,374 @@
---
name: wx-cli
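description: "wx-cli: query chat history, contacts, sessions, favorites and more from the local WeChat databases. Use this skill to install and invoke wx-cli when the user mentions WeChat chat history, contacts, message history, group members, or favorites."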
---
# wx-cli
## Triggers
- 查微信聊天记录
- 微信消息历史
- 微信联系人
- 微信群成员
- 微信群昵称 / 群名片
- 微信收藏
- wechat history / messages / contacts
- wx-cli
- 帮我看看微信里
- 搜索微信消息
## Prerequisites
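- macOS (Apple Silicon / Intel) or Linux
- WeChat desktop 4.x installed and logged in
- Node.js >= 14 (for the npm install method) or curl (for the shell install method)
- The first `wx init` requires `sudo` (memory scan to extract keys)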
---
## Installation
### Option 1: npm (recommended)
```bash
npm install -g @jackwener/wx-cli
```
### Option 2: curl
```bash
curl -fsSL https://raw.githubusercontent.com/jackwener/wx-cli/main/install.sh | bash
```
Verify after installing:
```bash
wx --version
```
---
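## Initialization (first use, one time only)
### macOS (follow the steps in order)
**Step 1: re-sign WeChat** (only needed once; redo after WeChat updates)
```bash
codesign --force --deep --sign - /Applications/WeChat.app
```
If it fails with `signature in use`, or a dylib has a damaged signature, repair that first and then sign:
```bash
codesign --remove-signature "/Applications/WeChat.app/Contents/Frameworks/vlc_plugins/librtp_mpeg4_plugin.dylib"
codesign --force --deep --sign - /Applications/WeChat.app
```
**Step 2: clear WeChat's stale authorization records in the macOS TCC privacy database** (required after re-signing)
macOS TCC checks permissions against `bundle id + csreq` jointly, and csreq is derived from the code signature. After re-signing, the old csreq no longer matches the new signature, so the old authorization records silently stop working (System Settings still draws the switch as "allowed", but access is denied at runtime). Wipe WeChat's old records in TCC so that macOS regenerates the csreq from the new signature the next time WeChat requests a permission:
```bash
tccutil reset ScreenCapture com.tencent.xinWeChat # screenshots / screen sharing
tccutil reset Camera com.tencent.xinWeChat # video calls / QR scanning
tccutil reset Microphone com.tencent.xinWeChat # voice messages / calls
tccutil reset AppleEvents com.tencent.xinWeChat # automation / input methods
tccutil reset AddressBook com.tencent.xinWeChat # contacts
tccutil reset SystemPolicyDocumentsFolder com.tencent.xinWeChat
tccutil reset SystemPolicyDownloadsFolder com.tencent.xinWeChat
tccutil reset SystemPolicyDesktopFolder com.tencent.xinWeChat
```
For a service that was never authorized, `tccutil` reports "No such bundle identifier"; that is a no-op and does not affect the reset of the other services.
**Step 3: restart WeChat**
```bash
killall WeChat && open /Applications/WeChat.app
# Wait until WeChat is fully logged in before continuing
```
Afterwards, when WeChat triggers a permission request, re-allow it via the GUI prompt. On macOS 26, add WeChat to the upper area of **Privacy & Security → Screen & System Audio Recording**; do **not** tick only "System Audio Recording Only" in the lower area, since that does not grant screenshot permission.
**Step 4: initialize**
```bash
sudo wx init
```
### Linux
```bash
sudo wx init
```
`wx init` automatically:
1. Detects the WeChat data directory
2. Scans process memory and extracts all database keys
3. Writes `~/.wx-cli/config.json`
After initialization, later commands no longer need `sudo`; the daemon starts automatically on the first call.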
---
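## Command quick reference
All commands output YAML by default (fewer tokens and easier to read); `--json` switches to JSON (convenient for `jq` and similar tools).
### Sessions and messages
```bash
# 20 most recent sessions
wx sessions
# Sessions with unread messages
wx unread
# Unread from real people only (private + group chats); filters out official accounts and folded entries
wx unread --filter private,group
# New messages since the last check (incremental)
wx new-messages
wx new-messages --json # JSON output, suited to agent parsing
# Chat history (nickname / remark name accepted)
wx history "张三"
wx history "张三" -n 2000
wx history "AI群" --since 2026-04-01 --until 2026-04-15 -n 100
# Search across all databases
wx search "关键词"
wx search "关键词" -n 500
wx search "会议" --in "工作群" --since 2026-01-01
```
`history` / `search` / `export` all accept `-n` / `--limit` to set the number of items returned. The default only exists to avoid printing too much at once; it is not a hard cap.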
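The output of `sessions` / `unread` / `history` / `new-messages` / `stats` all carries a `chat_type` field that agents can use for routing:
| Value | Meaning | username pattern |
|------|------|--------------|
| `private` | one-to-one chat with a real person | `wxid_*` or a custom short ID |
| `group` | group chat | `*@chatroom` |
| `official_account` | official account / subscription account / service account / system notification | `gh_*`, `biz_*`, `mphelper`, `qqsafe`, `@opencustomerservicemsg` |
| `folded` | folded entries (aggregate items for folded subscription accounts and folded group chats) | `brandsessionholder`, `@placeholder_foldgroup` |
`wx unread --filter` accepts `private` / `group` / `official` / `folded` / `all`, comma-separated for multiple values. The default is `all`.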
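The username patterns in the table can be turned into a small client-side classifier, for example to sanity-check `chat_type` values. This is only a sketch built from the table: the helper name `guessChatType` is hypothetical, treating `@opencustomerservicemsg` as a suffix and `@placeholder_foldgroup` as a substring is an assumption, and wx-cli's own classification may use additional rules.

```javascript
'use strict';

// Hypothetical classifier derived from the username patterns in the table.
function guessChatType(username) {
  if (username === 'brandsessionholder' || username.includes('@placeholder_foldgroup')) {
    return 'folded';
  }
  if (username.endsWith('@chatroom')) return 'group';
  if (
    username.startsWith('gh_') ||
    username.startsWith('biz_') ||
    username === 'mphelper' ||
    username === 'qqsafe' ||
    username.endsWith('@opencustomerservicemsg')
  ) {
    return 'official_account';
  }
  // Everything else (wxid_* or a custom short ID) is treated as a person.
  return 'private';
}

for (const u of ['wxid_abc123', '12345@chatroom', 'gh_0123abcd', 'brandsessionholder']) {
  console.log(u, '->', guessChatType(u));
}
```

Note that the `--filter` value for official accounts is `official`, while the field value in the output is `official_account`.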
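In group chat messages, `last_sender`, `sender`, and `stats.top_senders` prefer the group nickname (group card). If the local database has no group nickname, they fall back to the contact remark, the WeChat nickname, or the username.
In group context, `history` / `search` / `new-messages` / `attachments` and `stats.top_senders` also output a stable identity triple: `sender_username` (the stable wxid, used to tell apart members with the same name) / `sender_contact_display` (remark > nickname > wxid fallback) / `sender_group_nickname` (the group card, equivalent to the source of `sender`, which avoids further string parsing). When the wxid cannot be resolved, these three fields are omitted so that empty strings do not pollute downstream filtering.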
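One way a consumer might use the triple is to aggregate by `sender_username` rather than by the display string, so that two members who share a nickname are not merged. The sketch below assumes an array of message objects shaped like the fields described above; the sample data and the `display:` fallback key are invented for illustration.

```javascript
'use strict';

// Hypothetical aggregation: count messages per stable sender identity.
function countBySender(messages) {
  const counts = new Map();
  for (const msg of messages) {
    // Prefer the stable wxid; the triple is omitted when it cannot be
    // resolved, so fall back to a key based on the display string.
    const key = msg.sender_username ?? `display:${msg.sender}`;
    const entry = counts.get(key) || { label: msg.sender, count: 0 };
    entry.count += 1;
    counts.set(key, entry);
  }
  return counts;
}

// Invented sample: two different members both show up as "Alex".
const messages = [
  { sender: 'Alex', sender_username: 'wxid_a1' },
  { sender: 'Alex', sender_username: 'wxid_b2' },
  { sender: 'Alex', sender_username: 'wxid_a1' },
  { sender: 'Sam' },
];
console.log(countBySender(messages));
```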
The stdout of `sessions` / `unread` / `history` / `search` / `new-messages` / `stats` / `attachments` is now uniformly a wrapper:
```json
{
"messages": [...],
"meta": {
"status": "ok",
"unknown_shards": [],
"chat_latest_timestamp": 1715750400,
"chat_latest_db": "message/message_2.db",
"session_last_timestamp": 1715760000
}
}
```
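Where:
- `status = possibly_stale_unknown_shards`: a new `message_N.db` that the daemon does not know about has appeared on disk; run `wx init --force` first
- `status = possibly_stale`: the latest time recorded in `session.db` is clearly ahead of the newest message found by this query, so the result may be missing messages
- `status = windowed`: this query was a windowed / filtered partial view to begin with and should not be treated as the "full latest state"
- `--with-meta`: additionally returns `per_shard_latest` / `cache_mode_per_shard`
- `--debug-source`: on top of `--with-meta`, also exposes the real `shard_paths`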
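The status values above lend themselves to a simple dispatch on the agent side. The following is a sketch under assumptions: the helper name `suggestAction`, the shape of its return value, and the shard name in the sample are all illustrative and not part of wx-cli.

```javascript
'use strict';

// Hypothetical dispatcher: map meta.status to a suggested next step.
function suggestAction(meta) {
  switch (meta.status) {
    case 'ok':
      return { usable: true, hint: null };
    case 'windowed':
      return { usable: true, hint: 'partial view; not the full latest state' };
    case 'possibly_stale':
      return { usable: true, hint: 'results may be missing recent messages' };
    case 'possibly_stale_unknown_shards':
      return { usable: false, hint: 'run `wx init --force`, then retry' };
    default:
      return { usable: false, hint: `unrecognized status: ${meta.status}` };
  }
}

// A wrapper shaped like the example above; the shard name is made up.
const stdout = JSON.stringify({
  messages: [],
  meta: { status: 'possibly_stale_unknown_shards', unknown_shards: ['message/message_3.db'] },
});
const { meta } = JSON.parse(stdout);
console.log(suggestAction(meta));
```

Treating unknown statuses as unusable is a conservative choice for this sketch; an agent could equally decide to proceed with a warning.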
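Quoted messages (appmsg `type=57`) are expanded into two lines in `history` / `search` / `new-messages` output: the first line is the current reply, and the second line starts with `↳` and shows the quoted original, for example: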
```text
[引用] 当前回复
↳ 发送者: 被引用内容
```
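`--type link` / `--type file` cover WeChat appmsg variants such as links, files, merged chat records, and quoted messages; `search --type link` also matches the quoted original text after it has been decompressed and formatted.
### Contacts and groups
```bash
# Contact list / search
wx contacts
wx contacts --query "李"
# Group member list
wx members "AI交流群"
```
Each member in `wx members --json` contains:
- `username`: WeChat's internal username
- `display`: the recommended display name, preferring the group nickname
- `contact_display`: the contact remark or WeChat nickname
- `group_nickname`: the group nickname; an empty string when there is no record
- `is_owner`: whether the member is the group owner
When displaying group members, agents should prefer `display`. Read `group_nickname` and `contact_display` only when the group nickname and the contact name need to be told apart.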
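As an illustration of the fallback order, the sketch below rebuilds a display name from the other fields and flags names shared by more than one member. It assumes `group_nickname` and `contact_display` are empty strings when absent; the final fallback to `username` and both helper names are assumptions, and in practice the `display` field already carries the recommended name.

```javascript
'use strict';

// Hypothetical reconstruction of the display preference described above.
function pickDisplay(member) {
  return member.group_nickname || member.contact_display || member.username;
}

// Find display names that map to more than one username.
function ambiguousNames(members) {
  const byName = new Map();
  for (const m of members) {
    const name = pickDisplay(m);
    byName.set(name, (byName.get(name) || []).concat(m.username));
  }
  return [...byName].filter(([, usernames]) => usernames.length > 1);
}

// Invented sample members.
const members = [
  { username: 'wxid_a1', group_nickname: 'Alex', contact_display: 'Alex Chen' },
  { username: 'wxid_b2', group_nickname: '', contact_display: 'Alex' },
  { username: 'wxid_c3', group_nickname: '', contact_display: '' },
];
console.log(ambiguousNames(members));
```

When a name turns out to be ambiguous, showing `username` next to it is one way to keep the members distinguishable.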
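### Moments (SNS)
Three commands, each with a different purpose:
```bash
# 1) Interaction notifications (likes / comments; unread only by default)
wx sns-notifications
wx sns-notifications --include-read --since 2026-04-01 -n 100
# 2) Timeline: browse the Moments posts cached locally
wx sns-feed # 20 most recent
wx sns-feed --user "张三" # one person only
wx sns-feed --since 2026-04-01 --until 2026-04-18 -n 100
# 3) Full-text search: find a keyword in post bodies
wx sns-search "关键词"
wx sns-search "婚礼" --user "李四" --since 2023-01-01 -n 50
```
**Field differences**:
- `sns-notifications` returns "notification" items: `type` (`like`/`comment`), `from_nickname`, `content` (comment body; empty for likes), `feed_preview` + `feed_author` (the original post)
- `sns-feed` / `sns-search` return "post" items: `author`, `content` (post body), `media`, `media_count` (number of images/videos), `location`, `timestamp`. The `media` field holds each image's url/thumb/key/token/md5/enc_idx/size, for downstream image proxying or offline rendering. `media_count = media.len()`, counting the valid `<media>` child nodes found by DOM parsing; malformed XML yields 0
> Only Moments posts you have scrolled past locally are stored (the WeChat app downloads on demand). Posts you never loaded are not on disk, so no command can return them.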
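### Official-account articles
Official-account article pushes live in separate `biz_message_*.db` shards, apart from the regular `message_0.db`:
```bash
# 50 most recent articles (default)
wx biz-articles
# More
wx biz-articles -n 200
# Restrict to one account (fuzzy match on display name / username)
wx biz-articles --account "返朴"
# Time range (YYYY-MM-DD; publication time, not receive time)
wx biz-articles --since 2026-05-01 --until 2026-05-10
# Only accounts with unread messages, newest article per account (good for a "what was pushed today" scan)
wx biz-articles --unread
wx biz-articles --unread --account "Datawhale" # intersected with --account
# Downstream consumption: take the URLs for content fetching
wx biz-articles --since 2026-05-10 --json | jq '.[].url'
```
Fields returned per item: `account` / `account_username` (`gh_*`) / `title` / `url` (an `mp.weixin.qq.com` link) / `digest` / `cover_url` / `time` + `timestamp` (article publication time) / `recv_time_str` + `recv_time` (when WeChat received the push). Multi-article pushes are expanded into multiple rows.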
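### Attachment extraction (images)
Image payloads in chats are stored encrypted (`.dat`) under `xwechat_files/<wxid>/msg/attach/...`; decoding requires the md5 from the message's `message_resource.db` plus a platform-specific image key. It takes two steps:
```bash
# 1) List the image attachments first to obtain the opaque attachment_id
wx attachments "张三"
wx attachments "AI群" --kind image -n 100
wx attachments "AI群" --since 2026-04-01 --until 2026-04-15
# 2) Use the attachment_id to decrypt a single resource and write it to the given path
wx extract <attachment_id> -o ~/Desktop/photo.jpg
wx extract <attachment_id> -o /tmp/x.jpg --overwrite
```
Each `attachments` row carries: `attachment_id` / `kind` (currently always `image`) / `type` / `local_id` / `timestamp` / `time`; in group chats it also carries `sender` and the stable identity triple (as described above). The command keeps the name `attachments` so that extending it to other attachment types later does not break the CLI.
The `extract` report carries: `md5` / `dat_path` / `dat_size` / `output` / `output_size` / `format` (the image format actually detected: jpg / png / gif / webp / hevc, etc.) / `decoder` (the decoder actually used: `legacy_xor` / `v1_aes` / `v2`).
Supported decoder tiers:
- **legacy XOR**: early single-byte XOR with no magic; the key is inferred automatically by probing the format from the file's leading bytes
- **V1 fixed-AES**: magic `07 08 V1 08 07`, AES-128-ECB with the fixed key `cfcd208495d565ef`
- **V2 AES + XOR**: magic `07 08 V2 08 07`, AES-128-ECB + raw + XOR, with a platform-derived AES key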
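For the legacy tier, "inferring the key from the leading bytes" can be sketched as follows: for each known image signature, XOR the first file byte with the first signature byte to get a candidate key, then check that the same key reproduces the rest of the signature. The signature list and helper names here are illustrative choices, not wx-cli's actual table.

```javascript
'use strict';

// A few well-known image signatures (illustrative subset).
const SIGNATURES = [
  { format: 'jpg', bytes: [0xff, 0xd8, 0xff] },
  { format: 'png', bytes: [0x89, 0x50, 0x4e, 0x47] },
  { format: 'gif', bytes: [0x47, 0x49, 0x46, 0x38] },
];

// Try each signature; the key is valid if it maps the leading bytes onto it.
function inferXorKey(dat) {
  for (const { format, bytes } of SIGNATURES) {
    if (dat.length < bytes.length) continue;
    const key = dat[0] ^ bytes[0];
    if (bytes.every((b, i) => (dat[i] ^ key) === b)) return { key, format };
  }
  return null;
}

// Apply a single-byte XOR to the whole buffer (same routine both ways).
function xorWith(dat, key) {
  const out = Buffer.alloc(dat.length);
  for (let i = 0; i < dat.length; i++) out[i] = dat[i] ^ key;
  return out;
}

// Round trip on a synthetic JPEG prefix "encrypted" with key 0x5a.
const plain = Buffer.from([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10]);
const encrypted = xorWith(plain, 0x5a);
const guess = inferXorKey(encrypted);
console.log(guess, xorWith(encrypted, guess.key).equals(plain));
```

Short signatures can collide by chance, so a real decoder would probably validate more of the decoded header before accepting a key.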
V2 image key extraction (automatic on macOS / Windows; not yet supported on Linux):
- macOS: `kvcomm` cache (take the uin from the `key_<uin>_*.statistic` file name → `md5(str(uin) + wxid)[:16]`) plus a brute-force fallback; `xor_key = uin & 0xff`
- Windows: scan `Weixin.exe` memory for `[A-Za-z0-9]{32|16}` candidates and verify them against the V2 template ciphertext block
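The macOS derivation formula can be written out as a short sketch. It assumes `[:16]` means the first 16 characters of the hexadecimal MD5 digest, used as a 16-byte ASCII key; that reading is an inference from the notation, and the helper name `deriveV2ImageKey` and the sample inputs are hypothetical.

```javascript
'use strict';

const crypto = require('crypto');

// Hypothetical helper for: md5(str(uin) + wxid)[:16] and xor_key = uin & 0xff.
function deriveV2ImageKey(uin, wxid) {
  const hex = crypto
    .createHash('md5')
    .update(String(uin) + wxid, 'utf8')
    .digest('hex');
  return {
    // Assumption: the first 16 hex characters serve as the AES-128 key.
    aesKey: hex.slice(0, 16),
    // Low byte of the uin, used for the XOR stage.
    xorKey: uin & 0xff,
  };
}

// Made-up inputs, purely to show the shape of the result.
console.log(deriveV2ImageKey(74565, 'wxid_example'));
```

One observation in favour of the hex-prefix reading: under this sketch, `deriveV2ImageKey(0, '').aesKey` is `cfcd208495d565ef`, the first 16 hex characters of `md5("0")`, which is exactly the fixed V1 key listed above.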
### Favorites and stats
```bash
# All favorites
wx favorites
# Filter by type (text / image / article / card / video)
wx favorites --type image
# Search favorites
wx favorites --query "关键词"
# Chat statistics (senders, message types, active hours)
wx stats "AI群"
wx stats "AI群" --since 2026-01-01
```
### Export
```bash
# Export as Markdown (default)
wx export "张三" --format markdown -o chat.md
wx export "张三" -n 2000 --format markdown -o chat.md
# Export as JSON
wx export "AI群" --since 2026-01-01 --format json -o chat.json
```
### Daemon management
```bash
wx daemon status
wx daemon stop
wx daemon logs --follow
```
---
## Agent usage tips
When results need programmatic processing, always add `--json`:
```bash
wx sessions --json
wx new-messages --json
wx search "关键词" --json | jq '.results[0]'
wx history "张三" --json -n 50 | jq '.messages[0]'
wx history "张三" --json | jq '.meta'
wx history "张三" --json --with-meta | jq '.meta.cache_mode_per_shard'
```
The CHAT argument accepts a nickname, remark name, or WeChat ID, with fuzzy matching. If unsure of the exact name, search with `wx contacts --query` first.
---
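## Data file locations
```
~/.wx-cli/
├── config.json # configuration
├── all_keys.json # database keys (sensitive, do not share)
├── daemon.sock # Unix socket
├── daemon.pid / .log
└── cache/ # decrypted database cache
```
---
## FAQ
**Keys stop working after WeChat restarts**: re-run `sudo wx init --force` (WeChat must be running).
**Daemon unresponsive**: run `wx daemon stop`, then invoke any command to restart it automatically.
**Chat not found**: use `wx contacts --query` to confirm the nickname / remark name, or query directly by WeChat ID.
**Why can I only get 500 messages?**: that is the default output count, not a hard limit. Pass `-n` explicitly, for example `wx history "张三" -n 2000` or `wx export "张三" -n 2000 -o chat.md`.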


@ -196,3 +196,126 @@ open /Applications/WeChat.app
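| "SIP blocks debugging WeChat" | ❌ SIP only protects system processes; WeChat is not SIP-protected | | "SIP blocks debugging WeChat" | ❌ SIP only protects system processes; WeChat is not SIP-protected |
| "Adding sshd to FDA is enough" | ❌ `sshd-keygen-wrapper` must be added too, and the SSH session must be reconnected | | "Adding sshd to FDA is enough" | ❌ `sshd-keygen-wrapper` must be added too, and the SSH session must be reconnected |
| "WeChat can be re-signed while it is running" | ❌ The running binary/dylib is in use, so codesign fails | | "WeChat can be re-signed while it is running" | ❌ The running binary/dylib is in use, so codesign fails |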
---
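## 5. WeChat permissions silently stop working after re-signing
### Symptoms
After the ad-hoc re-signing, any of the following WeChat features may "look authorized but actually be denied":
- Screenshots / screen sharing (`ScreenCapture`)
- Video calls / QR scanning (`Camera`)
- Voice messages / calls (`Microphone`)
- Automation, third-party input methods (`AppleEvents`)
- Contact sync (`AddressBook`)
- Sending / receiving files (`SystemPolicyDocumentsFolder` / `Downloads` / `Desktop`)
System Settings usually still shows the WeChat app switch as ON, but the runtime permission check fails. WeChat keeps prompting that permission X needs to be enabled.
### Root cause (first principles)
macOS TCC (Transparency, Consent, and Control) checks permissions against **bundle id + csreq** jointly. `csreq` (code requirement) is a binary blob derived from the app's code signature and stored in the `access` table of `/Library/Application Support/com.apple.TCC/TCC.db`, roughly 160 bytes per record.
`codesign --force --deep --sign -` switches WeChat from the official signature to an ad-hoc one (even an ad-hoc → ad-hoc re-sign changes it), so the new process's csreq no longer matches the one in the old record, and tccd denies access.
The System Settings UI only shows switches per client and does not recompute csreq, so it looks "authorized" while access is actually denied at runtime. This is silent drift.
### Fix
Wipe all of WeChat's old records in TCC so that macOS regenerates the csreq from the new signature the next time WeChat requests a permission:
```bash
for s in ScreenCapture Camera Microphone AppleEvents AddressBook \
SystemPolicyDocumentsFolder SystemPolicyDownloadsFolder SystemPolicyDesktopFolder; do
tccutil reset "$s" com.tencent.xinWeChat
done
```
For a service that was never authorized, `tccutil` reports "No such bundle identifier"; this is a no-op and does not affect the reset of the other services.
Then quit and reopen WeChat, and re-allow via the GUI prompts:
```bash
killall WeChat
open /Applications/WeChat.app
```
> This step **should be run manually by the user/agent** and is not run automatically inside `wx init`: a TCC reset invalidates the user's existing authorizations, so a person needs to decide when to do it.
#### The UI split on macOS 26
On macOS 26, **Privacy & Security → Screen & System Audio Recording** is shown as two blocks, which is easy to get wrong:
| Area | Purpose |
|------|------|
| **Screen & System Audio Recording** (upper area) | Records screen content plus system audio; WeChat screenshots and screen sharing need this one |
| **System Audio Recording Only** (lower area) | Records system audio only; enabling just this one does **not** fix WeChat screenshots |
Add WeChat to the upper area; ticking only "System Audio Recording Only" in the lower area has no effect.
### Verification
Confirm that WeChat is currently ad-hoc signed (this is a precondition for the fix):
```bash
codesign -dv --verbose=4 /Applications/WeChat.app 2>&1 | grep -E "Signature|flags|TeamIdentifier"
```
Expected output:
```text
flags=0x2(adhoc)
Signature=adhoc
TeamIdentifier=not set
```
The most direct functional check: use screenshots, video calls, the microphone and so on in WeChat, re-authorize once by clicking "Allow" in the GUI prompt, and confirm they work normally afterwards.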
---
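## 6. The `"WeChat" would like to access data from other apps` prompt
### Symptoms
After running `wx init` and ad-hoc re-signing `/Applications/WeChat.app`, you will fairly often see macOS show this while using WeChat:
```
"WeChat" would like to access data from other apps.
Keeping app data separate makes it easier to manage your privacy and security.
[ Don't Allow ] [ Allow ]
```
The most common trigger is **opening an official-account article inside WeChat**, but that is only a frequent trigger, not the root cause.
### Root cause (first principles)
This prompt is the macOS Ventura+ / 14 / 15 protection against **cross-identity access to app data containers**: the current process ("WeChat") is reading data left behind by an app with a different code identity.
To let `task_for_pid` obtain WeChat's task port and read the raw key from process memory, our current macOS approach requires the user to run:
```bash
codesign --force --deep --sign - /Applications/WeChat.app
```
This step switches WeChat from the Apple official signature to an ad-hoc identity. To the user it is still "WeChat"; to the macOS security model, **WeChat before re-signing** and **WeChat after re-signing** are no longer the same app identity.
Afterwards, when the (re-signed) WeChat accesses its original `~/Library/Containers/com.tencent.xinWeChat/...`, caches, app group data and so on, the system sees "a new identity reading container data left by an old identity" and shows this dialog under its privacy policy. The webview / cookie / cache paths used by official-account articles happen to hit this access path, so "opening an official-account article triggers the prompt" is very easy to reproduce, but **the issue is not the article page itself**; it is code identity plus container access.
> Note: this is **not** "wx-cli secretly reading other apps' data". The wx-cli process itself only has read-only access to the WeChat container; however, **requiring the user to re-sign WeChat** is itself the direct cause of this kind of prompt. It is therefore a known side effect of the current macOS invasive init path, not a system behavior unrelated to wx-cli.
### Mitigation
Short-term relief:
- Clicking "Allow" usually only lets **the current** WeChat process through; the permission resets the next time WeChat starts, so the prompt may appear again
- This authorization generally leaves no explicit switch in System Settings, because it is bound to a dynamic code identity
To stop the prompt entirely:
- Restore `/Applications/WeChat.app` to the official signature (reinstall the official WeChat package) and do not run `codesign --force --deep --sign -` again
- This only gives up **the current default path that depends on ad-hoc re-signing**; it does not mean giving up macOS memory-scan: in a local GUI Terminal, after Terminal.app is granted the "Developer Tools" TCC permission, `task_for_pid` against an Apple-signed (hardened runtime) WeChat should still work; see "Apple-signed + local Terminal sudo = ✅" in the §1 test matrix
- ⚠️ Scope of testing: the two data points behind "Apple-signed + local Terminal sudo ✅" in the §1 test matrix only cover macOS 10.15 (Catalina) and 11.1 (Big Sur); whether this still works on macOS 14 (Sonoma) / 15 (Sequoia) **has not been tested within this project**. If init fails after you restore the official signature this way, go back to the re-signing path and accept the prompt side effect described in this section
- The genuinely constrained case is remote SSH + Apple-signed WeChat: `sshd` cannot obtain the TCC Developer Tools authorization, and only then is the re-signing path mandatory
Long-term direction:
- The real fix for this side effect is to redesign `wx init` into three tiers, `safe → assisted → invasive fallback`: leave WeChat untouched by default, fall back to ad-hoc re-signing only when the first two are not viable, and print the full list of side effects for explicit user confirmation first. Until then, this is a known trade-off.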


@ -1,6 +1,6 @@
{ {
"name": "@jackwener/wx-cli-darwin-arm64", "name": "@jackwener/wx-cli-darwin-arm64",
"version": "0.1.2", "version": "0.3.0",
"description": "wx-cli binary for macOS arm64", "description": "wx-cli binary for macOS arm64",
"os": ["darwin"], "os": ["darwin"],
"cpu": ["arm64"], "cpu": ["arm64"],


@ -1,6 +1,6 @@
{ {
"name": "@jackwener/wx-cli-darwin-x64", "name": "@jackwener/wx-cli-darwin-x64",
"version": "0.1.2", "version": "0.3.0",
"description": "wx-cli binary for macOS x64", "description": "wx-cli binary for macOS x64",
"os": ["darwin"], "os": ["darwin"],
"cpu": ["x64"], "cpu": ["x64"],


@ -1,6 +1,6 @@
{ {
"name": "@jackwener/wx-cli-linux-arm64", "name": "@jackwener/wx-cli-linux-arm64",
"version": "0.1.2", "version": "0.3.0",
"description": "wx-cli binary for Linux arm64", "description": "wx-cli binary for Linux arm64",
"os": ["linux"], "os": ["linux"],
"cpu": ["arm64"], "cpu": ["arm64"],


@ -1,6 +1,6 @@
{ {
"name": "@jackwener/wx-cli-linux-x64", "name": "@jackwener/wx-cli-linux-x64",
"version": "0.1.2", "version": "0.3.0",
"description": "wx-cli binary for Linux x64", "description": "wx-cli binary for Linux x64",
"os": ["linux"], "os": ["linux"],
"cpu": ["x64"], "cpu": ["x64"],


@ -1,6 +1,6 @@
{ {
"name": "@jackwener/wx-cli-win32-x64", "name": "@jackwener/wx-cli-win32-x64",
"version": "0.1.2", "version": "0.3.0",
"description": "wx-cli binary for Windows x64", "description": "wx-cli binary for Windows x64",
"os": ["win32"], "os": ["win32"],
"cpu": ["x64"], "cpu": ["x64"],


@ -0,0 +1,53 @@
#!/usr/bin/env node
'use strict';
const { execFileSync } = require('child_process');
const path = require('path');
const fs = require('fs');
// Maps `${process.platform}-${process.arch}` to the optional dependency
// that ships the prebuilt binary for that platform.
const PLATFORM_PACKAGES = {
  'darwin-arm64': '@jackwener/wx-cli-darwin-arm64',
  'darwin-x64': '@jackwener/wx-cli-darwin-x64',
  'linux-x64': '@jackwener/wx-cli-linux-x64',
  'linux-arm64': '@jackwener/wx-cli-linux-arm64',
  'win32-x64': '@jackwener/wx-cli-win32-x64',
};
const platformKey = `${process.platform}-${process.arch}`;
const ext = process.platform === 'win32' ? '.exe' : '';
function getBinaryPath() {
  // An explicit override wins over the bundled platform package.
  if (process.env.WX_CLI_BINARY) {
    return process.env.WX_CLI_BINARY;
  }
  const pkg = PLATFORM_PACKAGES[platformKey];
  if (!pkg) {
    console.error(`wx-cli: unsupported platform ${platformKey}`);
    process.exit(1);
  }
  try {
    return require.resolve(`${pkg}/bin/wx${ext}`);
  } catch {
    // Fall back to locating the binary relative to the package root. This
    // lookup is guarded as well, so a missing optional dependency reaches
    // the friendly error below instead of an uncaught MODULE_NOT_FOUND.
    try {
      const modPath = path.join(
        path.dirname(require.resolve(`${pkg}/package.json`)),
        `bin/wx${ext}`
      );
      if (fs.existsSync(modPath)) return modPath;
    } catch {
      // Platform package is not installed; fall through to the error message.
    }
  }
  console.error(`wx-cli: binary not found for ${platformKey}`);
  console.error('Try: npm install -g @jackwener/wx-cli');
  process.exit(1);
}
try {
  // Forward all CLI arguments and stdio to the native binary.
  execFileSync(getBinaryPath(), process.argv.slice(2), {
    stdio: 'inherit',
    env: { ...process.env },
  });
} catch (e) {
  // Propagate the child's exit code; rethrow anything else (e.g. spawn failure).
  if (e && e.status != null) process.exit(e.status);
  throw e;
}

View File

@ -1,6 +1,6 @@
 {
 "name": "@jackwener/wx-cli",
-"version": "0.1.2",
+"version": "0.3.0",
 "description": "Query your local WeChat data from the command line. Designed for LLM agent tool calls.",
 "bin": {
 "wx": "bin/wx.js"
@ -13,11 +13,11 @@
 "install.js"
 ],
 "optionalDependencies": {
-"@jackwener/wx-cli-darwin-arm64": "0.1.1",
-"@jackwener/wx-cli-darwin-x64": "0.1.1",
-"@jackwener/wx-cli-linux-x64": "0.1.1",
-"@jackwener/wx-cli-linux-arm64": "0.1.1",
-"@jackwener/wx-cli-win32-x64": "0.1.1"
+"@jackwener/wx-cli-darwin-arm64": "0.3.0",
+"@jackwener/wx-cli-darwin-x64": "0.3.0",
+"@jackwener/wx-cli-linux-x64": "0.3.0",
+"@jackwener/wx-cli-linux-arm64": "0.3.0",
+"@jackwener/wx-cli-win32-x64": "0.3.0"
 },
 "engines": { "node": ">=14" },
 "keywords": ["wechat", "cli", "wx", "llm", "ai", "sqlite", "sqlcipher"],

View File

@ -0,0 +1,153 @@
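//! Opaque attachment ID: a round-trip handle shared across the CLI and IPC.
//!
//! Encoding: `base64url_no_pad(serde_json(payload))`.
//! Why base64url(json) rather than a compact bit-packed format:
//! - phase 1 favors stability, so no custom binary protocol is invented
//! - fields added later (`resource_md5`, `decoder_hint`, and the like) do not break older CLIs
//! - for debugging, `base64 -d | jq` shows the fields directly
//!
//! ⚠️ WeChat reuses `local_id` within a single chat (up to 7 rows with the same local_id
//! were observed in one chat), so the `(chat, local_id, create_time)` triple is the
//! minimal set needed to locate a resource row.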
use anyhow::{anyhow, Context, Result};
use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine};
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum AttachmentKind {
Image,
Video,
File,
Voice,
}
impl AttachmentKind {
/// Derive the attachment kind from message.local_type (covers only the types phase 1 cares about).
/// The high 32 bits carry version/session flags, so mask down to the low 32 bits first.
pub fn from_local_type(local_type: i64) -> Option<Self> {
let lo = (local_type as u64) & 0xFFFF_FFFF;
match lo {
3 => Some(AttachmentKind::Image),
34 => Some(AttachmentKind::Voice),
43 => Some(AttachmentKind::Video),
// type=49 is appmsg, and only subtype=6 inside it is a file; be lenient here and return File,
// leaving the resolver to decide from the appmsg subtype whether extraction is actually possible
49 => Some(AttachmentKind::File),
_ => None,
}
}
pub fn as_str(&self) -> &'static str {
match self {
AttachmentKind::Image => "image",
AttachmentKind::Video => "video",
AttachmentKind::File => "file",
AttachmentKind::Voice => "voice",
}
}
}
/// Attachment ID payload (base64url-encoded after serialization).
///
/// `v` is the version field, so a future schema change can branch for compatibility. Currently v=1.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AttachmentId {
/// payload schema version
pub v: u32,
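/// Chat username (used both to look up chat_id via ChatName2Id and to build the attach path)
pub chat: String,
/// local_id of the message row
pub local_id: i64,
/// create_time of the message row (unix seconds), used to disambiguate local_id reuse within a chat
pub create_time: i64,
/// Attachment kind
pub kind: AttachmentKind,
/// Optional hint: the N of the message_N.db that holds the message. When present, the resolver can skip the shard scan;
/// when absent, the resolver does a full scan following the `find_msg_tables` logic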
#[serde(default, skip_serializing_if = "Option::is_none")]
pub db: Option<u8>,
}
impl AttachmentId {
pub fn encode(&self) -> Result<String> {
let json = serde_json::to_vec(self).context("failed to serialize AttachmentId")?;
Ok(URL_SAFE_NO_PAD.encode(json))
}
pub fn decode(s: &str) -> Result<Self> {
let bytes = URL_SAFE_NO_PAD
.decode(s.trim())
.map_err(|e| anyhow!("attachment_id is not valid base64url: {}", e))?;
let id: AttachmentId =
serde_json::from_slice(&bytes).context("attachment_id payload is not valid JSON")?;
if id.v != 1 {
return Err(anyhow!("unsupported attachment_id version v={}", id.v));
}
Ok(id)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn round_trip_minimal() {
let id = AttachmentId {
v: 1,
chat: "wxid_abc".to_string(),
local_id: 12345,
create_time: 1_715_678_901,
kind: AttachmentKind::Image,
db: None,
};
let s = id.encode().unwrap();
let back = AttachmentId::decode(&s).unwrap();
assert_eq!(back.chat, id.chat);
assert_eq!(back.local_id, id.local_id);
assert_eq!(back.create_time, id.create_time);
assert_eq!(back.kind, id.kind);
assert_eq!(back.db, id.db);
}
#[test]
fn round_trip_with_db_hint() {
let id = AttachmentId {
v: 1,
chat: "1234@chatroom".to_string(),
local_id: 42,
create_time: 1,
kind: AttachmentKind::Image,
db: Some(2),
};
let s = id.encode().unwrap();
assert!(!s.contains('=')); // base64url no-pad
let back = AttachmentId::decode(&s).unwrap();
assert_eq!(back.db, Some(2));
}
#[test]
fn local_type_mask_high_bits() {
// image push path in monitor_web.py: flags in the high bits, low 32 bits equal 3
let high_flag = (0xDEAD_BEEFu64 << 32) as i64 | 3;
assert_eq!(
AttachmentKind::from_local_type(high_flag),
Some(AttachmentKind::Image)
);
}
#[test]
fn rejects_unknown_version() {
let id = AttachmentId {
v: 99,
chat: "x".to_string(),
local_id: 0,
create_time: 0,
kind: AttachmentKind::Image,
db: None,
};
let s = id.encode().unwrap();
assert!(AttachmentId::decode(&s).is_err());
}
}
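
The ID encoding described in the module docs above (`base64url_no_pad` over a JSON payload) can be illustrated without any crates. The following is a minimal sketch, not the project's implementation: the real code uses `base64::engine::general_purpose::URL_SAFE_NO_PAD` and `serde_json`, while `b64url_encode` and `b64url_decode` here are hypothetical hand-rolled helpers, and the decoder does not reject non-canonical trailing bits.

```rust
/// URL-safe base64 alphabet (RFC 4648, section 5).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encode bytes as base64url without `=` padding (hypothetical helper).
pub fn b64url_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() * 4 + 2) / 3);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into the top 24 bits of a u32.
        let mut n: u32 = 0;
        for (i, &b) in chunk.iter().enumerate() {
            n |= (b as u32) << (16 - 8 * i);
        }
        // 1 byte -> 2 symbols, 2 bytes -> 3 symbols, 3 bytes -> 4 symbols.
        for i in 0..=chunk.len() {
            let idx = ((n >> (18 - 6 * i)) & 0x3F) as usize;
            out.push(ALPHABET[idx] as char);
        }
    }
    out
}

/// Decode unpadded base64url; `None` on an invalid symbol or length.
pub fn b64url_decode(input: &str) -> Option<Vec<u8>> {
    let symbols = input.as_bytes();
    if symbols.len() % 4 == 1 {
        return None; // a lone trailing symbol cannot encode a whole byte
    }
    let mut out = Vec::with_capacity(symbols.len() * 3 / 4);
    for chunk in symbols.chunks(4) {
        let mut n: u32 = 0;
        for (i, &s) in chunk.iter().enumerate() {
            let v = ALPHABET.iter().position(|&a| a == s)? as u32;
            n |= v << (18 - 6 * i);
        }
        // 2 symbols -> 1 byte, 3 symbols -> 2 bytes, 4 symbols -> 3 bytes.
        for i in 0..chunk.len() - 1 {
            out.push((n >> (16 - 8 * i)) as u8);
        }
    }
    Some(out)
}
```

Because the payload stays plain JSON, adding an optional field later changes only the encoded length, which is the forward-compatibility argument the module docs make.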

View File

@ -0,0 +1,122 @@
//! `.dat` file decoding: dispatch to a concrete decoder based on the 6-byte header magic.
//!
//! Three tiers:
//! | header[0..6] | decoder | notes |
//! |-------------------------|-------------------|-----------------------------------------|
//! | `07 08 V2 08 07` | `v2` | AES-128-ECB + XOR hybrid, needs the image AES key |
//! | `07 08 V1 08 07` | `v1_aes` | fixed AES key `cfcd208495d565ef` |
//! | (anything else, usually no magic) | `v1_xor` | legacy single-byte XOR, magic auto-detected |
//!
//! The decision lives in `dispatch`, so upper layers (`resolver` / the CLI extract command) deal with a single entry point.
use anyhow::{anyhow, Result};
pub mod v1_xor;
pub mod v2;
/// Full V2 magic: `\x07\x08V2\x08\x07`
pub const V2_MAGIC: [u8; 6] = [0x07, 0x08, b'V', b'2', 0x08, 0x07];
/// Full V1 magic: `\x07\x08V1\x08\x07`
pub const V1_MAGIC: [u8; 6] = [0x07, 0x08, b'V', b'1', 0x08, 0x07];
/// Decoded output plus the detected image format
#[derive(Debug)]
pub struct DecodedImage {
pub data: Vec<u8>,
/// Inferred image extension (without the dot), determined by the magic. For example "jpg" / "png" / "gif" / "webp" /
/// "tif" / "bmp" / "hevc" (wxgf container) / "bin" (unrecognized)
pub format: &'static str,
/// Decoder name ("legacy_xor" / "v1_aes" / "v2"), used in CLI debug output
pub decoder: &'static str,
}
/// V2 image AES key supplied by the caller (codex's `image_key` module is responsible for obtaining it).
/// When it is absent, a V2 file yields `Err`, so the caller gets a specific error message to act on.
#[derive(Debug, Clone, Copy, Default)]
pub struct V2KeyMaterial<'a> {
pub aes_key: Option<&'a [u8; 16]>,
/// XOR key: WeChat 4.x defaults to 0x88, can be overridden
pub xor_key: u8,
}
impl<'a> V2KeyMaterial<'a> {
pub fn with_aes(key: &'a [u8; 16]) -> Self {
Self { aes_key: Some(key), xor_key: 0x88 }
}
}
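/// Dispatch automatically to the matching decoder based on the header magic of `dat_bytes`.
///
/// `v2_key.aes_key` is consumed only when the file carries the V2 magic; for V1-magic files
/// only `v2_key.xor_key` is used, alongside the fixed AES key.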
pub fn dispatch(dat_bytes: &[u8], v2_key: V2KeyMaterial<'_>) -> Result<DecodedImage> {
if dat_bytes.len() >= 6 {
let head: &[u8; 6] = dat_bytes[..6].try_into().unwrap();
if head == &V2_MAGIC {
return v2::decode(dat_bytes, v2_key);
}
if head == &V1_MAGIC {
// V1 fixed-AES: the fixed key is md5("0")[:16] = "cfcd208495d565ef"
let fixed_key: [u8; 16] = *b"cfcd208495d565ef";
return v2::decode(
dat_bytes,
V2KeyMaterial { aes_key: Some(&fixed_key), xor_key: v2_key.xor_key },
)
.map(|mut d| {
d.decoder = "v1_aes";
d
});
}
}
if dat_bytes.is_empty() {
return Err(anyhow!("empty .dat file"));
}
v1_xor::decode(dat_bytes)
}
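/// Detect the image format extension from the head of the decrypted byte stream.
///
/// Matches upstream `decode_image.py::detect_image_format`, with added detection for wxgf (raw HEVC stream),
/// because V2 decoding can produce a wxgf container directly.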
pub fn detect_image_format(bytes: &[u8]) -> &'static str {
if bytes.len() >= 4 && &bytes[..4] == b"wxgf" {
return "hevc";
}
if bytes.len() >= 3 && bytes[..3] == [0xFF, 0xD8, 0xFF] {
return "jpg";
}
if bytes.len() >= 4 && bytes[..4] == [0x89, 0x50, 0x4E, 0x47] {
return "png";
}
if bytes.len() >= 3 && &bytes[..3] == b"GIF" {
return "gif";
}
if bytes.len() >= 12 && &bytes[..4] == b"RIFF" && &bytes[8..12] == b"WEBP" {
return "webp";
}
if bytes.len() >= 4 && bytes[..4] == [0x49, 0x49, 0x2A, 0x00] {
return "tif";
}
if bytes.len() >= 2 && &bytes[..2] == b"BM" {
return "bmp";
}
"bin"
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn detect_basic_formats() {
assert_eq!(detect_image_format(&[0xFF, 0xD8, 0xFF, 0xE0]), "jpg");
assert_eq!(detect_image_format(&[0x89, 0x50, 0x4E, 0x47]), "png");
assert_eq!(detect_image_format(b"GIF89a"), "gif");
assert_eq!(detect_image_format(b"BM\0\0\0\0\0\0\0\0\0\0\0\0"), "bmp");
let mut webp = b"RIFF\0\0\0\0WEBP".to_vec();
webp.extend_from_slice(&[0; 4]);
assert_eq!(detect_image_format(&webp), "webp");
assert_eq!(detect_image_format(&[0x49, 0x49, 0x2A, 0x00]), "tif");
assert_eq!(detect_image_format(b"wxgfXXXX"), "hevc");
assert_eq!(detect_image_format(&[0, 0, 0, 0]), "bin");
}
}
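
The dispatch rule in the module docs above reduces to a prefix check on the first six bytes. A minimal sketch of just that decision follows, assuming a hypothetical `DatKind` enum and `classify` function (neither exists in the diff; the real `dispatch` also runs the decoder and returns a `DecodedImage`):

```rust
/// Which decoder a `.dat` header selects (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum DatKind {
    V2,
    V1Aes,
    LegacyXor,
}

const V2_MAGIC: [u8; 6] = [0x07, 0x08, b'V', b'2', 0x08, 0x07];
const V1_MAGIC: [u8; 6] = [0x07, 0x08, b'V', b'1', 0x08, 0x07];

/// Classify a `.dat` buffer by header magic, mirroring the order used in
/// `dispatch`: V2 first, then V1, then the legacy XOR fallback.
pub fn classify(dat: &[u8]) -> Option<DatKind> {
    if dat.is_empty() {
        return None; // `dispatch` rejects empty files with an error
    }
    if dat.starts_with(&V2_MAGIC) {
        Some(DatKind::V2)
    } else if dat.starts_with(&V1_MAGIC) {
        Some(DatKind::V1Aes)
    } else {
        // Anything else, including files shorter than 6 bytes,
        // falls through to the legacy single-byte XOR decoder.
        Some(DatKind::LegacyXor)
    }
}
```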

View File

@ -0,0 +1,166 @@
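//! Legacy single-byte XOR decoder (old .dat files without a magic header)
//!
//! Algorithm: recover the XOR key from a known image magic, `key = file[0] ^ magic[0]`,
//! then check `file[i] ^ key == magic[i]` with that same key; the key is accepted only if every byte matches.
//!
//! Priority (longer magics first, to avoid false positives from short ones):
//! PNG (4) > GIF (4) > TIF (4) > WEBP (4, RIFF) > JPG (3) > BMP (2, needs extra validation)
//!
//! BMP has only a 2-byte magic and a high false-positive rate, so `bf_size` (offset 2, u32 LE)
//! and `bf_offset` (offset 10, u32 LE) from the BMP file header are additionally sanity-checked:
//! - `|bf_size - file_size| < 1024` (tolerates a small padding difference)
//! - `14 <= bf_offset <= 1078` (largest palette 256*4 + 14-byte header = 1038, plus some slack)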
use anyhow::{anyhow, Result};
use super::{detect_image_format, DecodedImage};
const PNG: &[u8] = &[0x89, 0x50, 0x4E, 0x47];
const GIF: &[u8] = &[0x47, 0x49, 0x46, 0x38];
const TIF: &[u8] = &[0x49, 0x49, 0x2A, 0x00];
const WEBP_RIFF: &[u8] = &[0x52, 0x49, 0x46, 0x46];
const JPG: &[u8] = &[0xFF, 0xD8, 0xFF];
const BMP: &[u8] = &[0x42, 0x4D];
/// Try one fixed magic against `header`; returns `Some(key)` if and only if every byte matches.
fn try_magic(header: &[u8], magic: &[u8]) -> Option<u8> {
if header.len() < magic.len() {
return None;
}
let key = header[0] ^ magic[0];
for i in 1..magic.len() {
if header[i] ^ key != magic[i] {
return None;
}
}
Some(key)
}
/// Detect the XOR key. Returns `None` on failure (the caller decides whether that is an error).
pub fn detect_key(file_bytes: &[u8]) -> Option<u8> {
if file_bytes.len() < 4 {
return None;
}
let header = &file_bytes[..file_bytes.len().min(16)];
// Try the magics of 3 or more bytes first
for magic in [PNG, GIF, TIF, WEBP_RIFF, JPG] {
if let Some(k) = try_magic(header, magic) {
return Some(k);
}
}
// Try BMP last (only a 2-byte magic, needs extra validation)
if let Some(k) = try_magic(header, BMP) {
if header.len() >= 14 {
// Decode the 14-byte BMP file header
let mut dec = [0u8; 14];
for i in 0..14 {
dec[i] = header[i] ^ k;
}
let bmp_size = u32::from_le_bytes([dec[2], dec[3], dec[4], dec[5]]);
let bmp_offset = u32::from_le_bytes([dec[10], dec[11], dec[12], dec[13]]);
let file_size = file_bytes.len() as u32;
// Allow just under 1024 bytes of padding difference, with the offset in a plausible range
if file_size.abs_diff(bmp_size) < 1024 && (14..=1078).contains(&bmp_offset) {
return Some(k);
}
}
}
None
}
/// XOR-decode the entire `.dat` content.
pub fn decode(file_bytes: &[u8]) -> Result<DecodedImage> {
let key =
detect_key(file_bytes).ok_or_else(|| anyhow!("legacy XOR: unrecognized image magic (key detection failed)"))?;
let data: Vec<u8> = file_bytes.iter().map(|b| b ^ key).collect();
let format = detect_image_format(&data);
if format == "bin" {
return Err(anyhow!("legacy XOR: recovered key=0x{:02x} but the output magic is not recognized", key));
}
Ok(DecodedImage { data, format, decoder: "legacy_xor" })
}
#[cfg(test)]
mod tests {
use super::*;
/// XOR-encrypt a plaintext with a single-byte key to simulate a .dat file
fn xor_encrypt(plain: &[u8], key: u8) -> Vec<u8> {
plain.iter().map(|b| b ^ key).collect()
}
#[test]
fn detect_jpg_key() {
let plain = vec![0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10, 0x4A, 0x46, 0x49, 0x46];
let enc = xor_encrypt(&plain, 0x3C);
assert_eq!(detect_key(&enc), Some(0x3C));
}
#[test]
fn detect_png_key() {
let mut plain = vec![0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];
plain.extend_from_slice(&[0; 16]);
let enc = xor_encrypt(&plain, 0xA5);
assert_eq!(detect_key(&enc), Some(0xA5));
}
#[test]
fn detect_gif_key() {
let mut plain = b"GIF89a".to_vec();
plain.extend_from_slice(&[0; 16]);
let enc = xor_encrypt(&plain, 0x77);
assert_eq!(detect_key(&enc), Some(0x77));
}
#[test]
fn detect_webp_riff_key() {
let mut plain = b"RIFF\x00\x00\x00\x00WEBP".to_vec();
plain.extend_from_slice(&[0; 8]);
let enc = xor_encrypt(&plain, 0x12);
assert_eq!(detect_key(&enc), Some(0x12));
}
#[test]
fn detect_tif_key() {
let mut plain = vec![0x49, 0x49, 0x2A, 0x00, 0x08, 0x00, 0x00, 0x00];
plain.extend_from_slice(&[0; 16]);
let enc = xor_encrypt(&plain, 0xC3);
assert_eq!(detect_key(&enc), Some(0xC3));
}
#[test]
fn detect_bmp_with_valid_header() {
// BMP 14B header: 'BM' + size(u32 LE) + reserved(2*u16) + offset(u32 LE)
let mut plain = Vec::new();
plain.extend_from_slice(b"BM");
plain.extend_from_slice(&100u32.to_le_bytes()); // file_size = 100
plain.extend_from_slice(&[0; 4]); // reserved
plain.extend_from_slice(&54u32.to_le_bytes()); // pixel data offset = 54
plain.resize(100, 0); // the whole file is 100 bytes, matching file_size
let enc = xor_encrypt(&plain, 0x55);
assert_eq!(detect_key(&enc), Some(0x55));
}
#[test]
fn reject_random_bytes() {
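// All-zero file: no candidate magic matches in full. For BMP, key = 0x42 ^ 0 = 0x42,
// but the second byte decodes to 0 ^ 0x42 = 0x42, not 'M' (0x4D), so the magic check
// itself fails and the file is rejected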
let bytes = vec![0u8; 100];
assert_eq!(detect_key(&bytes), None);
}
#[test]
fn decode_round_trip_jpg() {
let mut plain = vec![0xFF, 0xD8, 0xFF, 0xE0];
plain.extend_from_slice(b"JFIF padding here");
let enc = xor_encrypt(&plain, 0xAB);
let out = decode(&enc).unwrap();
assert_eq!(out.format, "jpg");
assert_eq!(out.decoder, "legacy_xor");
assert_eq!(out.data, plain);
}
}
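
The module docs above say short magics cause false positives. One way to see why: the key cancels out of `header[i] ^ header[0]`, so a key exists exactly when the byte-to-byte XOR differences of the header match those of the magic. The sketch below restates the check that way and counts how many 2-byte prefixes a 2-byte magic accepts. `has_xor_key` and `accepted_prefixes` are hypothetical names introduced only for this illustration.

```rust
/// True if some single-byte key maps the start of `header` onto `magic`.
/// For a non-empty magic this accepts the same inputs as a
/// "derive key from byte 0, verify the rest" check, because
/// (h[i] ^ k) == m[i] for all i  <=>  h[i] ^ h[0] == m[i] ^ m[0].
pub fn has_xor_key(header: &[u8], magic: &[u8]) -> bool {
    if magic.is_empty() || header.len() < magic.len() {
        return false;
    }
    (1..magic.len()).all(|i| (header[i] ^ header[0]) == (magic[i] ^ magic[0]))
}

/// Count the two-byte prefixes (out of 65536) that a 2-byte magic accepts.
pub fn accepted_prefixes(magic: &[u8; 2]) -> usize {
    (0u32..=0xFFFF)
        .filter(|&v| has_xor_key(&[(v >> 8) as u8, v as u8], magic))
        .count()
}
```

For any 2-byte magic, each first byte admits exactly one matching second byte, so 1 in 256 arbitrary files passes the magic test alone. That is the motivation for the extra `bf_size` / `bf_offset` validation on BMP.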

View File

@ -0,0 +1,130 @@
//! V2 .dat decoding: three concatenated segments, `AES-128-ECB(PKCS7) + raw + XOR`.
//!
//! File layout (from upstream `decode_image.py::v2_decrypt_file`):
//! `[6B magic V2/V1] [4B aes_size LE] [4B xor_size LE] [1B padding]`
//! `[aligned_aes_size bytes AES-ECB ciphertext]`
//! `[len - 15 - aligned_aes_size - xor_size bytes raw_data (unencrypted)]`
//! `[xor_size bytes XOR (single-byte key)]`
//!
//! `aligned_aes_size`: `aes_size` rounded up to a multiple of 16; when `aes_size` is already
//! a multiple of 16, PKCS7 appends one more full block of padding, hence another +16. Equivalent to
//! `aes_size + (16 - aes_size % 16)`.
//!
//! ⚠️ codex is to land the full V2 implementation + image key module here. For now this module provides only a
//! `decode` entry-point skeleton, so the v1_aes path (fixed key) and dispatch compile together.
//! When `aes_key=None`, it returns an error with specific diagnostic information.
use anyhow::{anyhow, bail, Result};
use super::{detect_image_format, DecodedImage, V2KeyMaterial, V1_MAGIC, V2_MAGIC};
const HEADER_SIZE: usize = 15;
pub fn decode(file_bytes: &[u8], key: V2KeyMaterial<'_>) -> Result<DecodedImage> {
if file_bytes.len() < HEADER_SIZE {
bail!("V2 .dat: file too short ({} < {} bytes)", file_bytes.len(), HEADER_SIZE);
}
let magic: &[u8; 6] = file_bytes[..6].try_into().unwrap();
if magic != &V2_MAGIC && magic != &V1_MAGIC {
bail!("V2 .dat: header magic does not match V1/V2");
}
let aes_key = key.aes_key.ok_or_else(|| {
anyhow!("V2 .dat: image AES key required (codex's image_key module has not provided one yet)")
})?;
let aes_size = u32::from_le_bytes(file_bytes[6..10].try_into().unwrap()) as usize;
let xor_size = u32::from_le_bytes(file_bytes[10..14].try_into().unwrap()) as usize;
// PKCS7 alignment: if aes_size is not a multiple of 16, round up; if it is, add one more full block
let aligned_aes_size = aes_size + (16 - (aes_size % 16));
let aes_end = HEADER_SIZE.checked_add(aligned_aes_size).ok_or_else(|| anyhow!("aes segment length overflow"))?;
if aes_end > file_bytes.len() {
bail!(
"V2 .dat: header declares aes_size={} (aligned={}), which exceeds the file length {}",
aes_size,
aligned_aes_size,
file_bytes.len()
);
}
let raw_end = file_bytes.len().checked_sub(xor_size).ok_or_else(|| {
anyhow!("V2 .dat: header declares xor_size={}, which exceeds the file length {}", xor_size, file_bytes.len())
})?;
if aes_end > raw_end {
bail!(
"V2 .dat: aes_end={} > raw_end={} (aes/xor segments overlap)",
aes_end,
raw_end
);
}
// === AES-128-ECB decryption + PKCS7 unpad ===
let aes_data = &file_bytes[HEADER_SIZE..aes_end];
let dec_aes = aes_ecb_decrypt_pkcs7(aes_key, aes_data)?;
// === Raw segment (unencrypted) ===
let raw_data = &file_bytes[aes_end..raw_end];
// === XOR segment ===
let xor_data: Vec<u8> = file_bytes[raw_end..].iter().map(|b| b ^ key.xor_key).collect();
let mut out = Vec::with_capacity(dec_aes.len() + raw_data.len() + xor_data.len());
out.extend_from_slice(&dec_aes);
out.extend_from_slice(raw_data);
out.extend_from_slice(&xor_data);
let format = detect_image_format(&out);
if format == "bin" {
bail!("V2 .dat: AES decryption succeeded but the output magic is not recognized (the key may be wrong)");
}
Ok(DecodedImage { data: out, format, decoder: "v2" })
}
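/// AES-128-ECB decryption + PKCS7 unpad. Returns `Err` on failure, never a partial result.
///
/// No third-party ECB crate is pulled in: ECB is just block-by-block, so running it by hand is enough.
/// The last part of this function validates the PKCS7 padding strictly: length in 1..=16, and the trailing bytes all equal to that length.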
fn aes_ecb_decrypt_pkcs7(key: &[u8; 16], cipher: &[u8]) -> Result<Vec<u8>> {
use aes::cipher::{generic_array::GenericArray, BlockDecrypt, KeyInit};
if cipher.is_empty() || cipher.len() % 16 != 0 {
bail!("AES input length {} is not a non-zero multiple of 16", cipher.len());
}
let aes = aes::Aes128::new(key.into());
let mut out = Vec::with_capacity(cipher.len());
for chunk in cipher.chunks_exact(16) {
let mut block = GenericArray::clone_from_slice(chunk);
aes.decrypt_block(&mut block);
out.extend_from_slice(&block);
}
let pad = *out.last().ok_or_else(|| anyhow!("AES PKCS7: empty output"))? as usize;
if pad == 0 || pad > 16 || pad > out.len() {
bail!("AES PKCS7: invalid padding length {}", pad);
}
let tail = &out[out.len() - pad..];
if !tail.iter().all(|&b| b as usize == pad) {
bail!("AES PKCS7: inconsistent padding bytes");
}
out.truncate(out.len() - pad);
Ok(out)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn rejects_short_file() {
let r = decode(&[0u8; 4], V2KeyMaterial::default());
assert!(r.is_err());
}
#[test]
fn rejects_v2_without_key() {
let mut buf = V2_MAGIC.to_vec();
buf.extend_from_slice(&[0u8; HEADER_SIZE - 6]);
let r = decode(&buf, V2KeyMaterial::default());
let err = r.unwrap_err().to_string();
assert!(err.contains("AES key"), "{}", err);
}
}
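
The segment arithmetic in the V2 layout above can be checked in isolation. This is a minimal sketch under the layout stated in the module docs (15-byte header, then AES, raw, and XOR segments). `aligned_aes_size` and `segment_ranges` are hypothetical helpers that return byte ranges only and perform no decryption.

```rust
use std::ops::Range;

/// 6-byte magic + 4-byte aes_size + 4-byte xor_size + 1 padding byte.
const HEADER_SIZE: usize = 15;

/// PKCS7-aligned ciphertext length: always strictly greater than `aes_size`,
/// because an exact multiple of 16 still gains one full padding block.
pub fn aligned_aes_size(aes_size: usize) -> Option<usize> {
    aes_size.checked_add(16 - aes_size % 16)
}

/// Byte ranges of the (aes, raw, xor) segments, or `None` when the header
/// values do not fit inside a file of `file_len` bytes.
pub fn segment_ranges(
    file_len: usize,
    aes_size: usize,
    xor_size: usize,
) -> Option<(Range<usize>, Range<usize>, Range<usize>)> {
    let aes_end = HEADER_SIZE.checked_add(aligned_aes_size(aes_size)?)?;
    let raw_end = file_len.checked_sub(xor_size)?;
    if aes_end > raw_end {
        return None; // the AES and XOR segments would overlap
    }
    Some((HEADER_SIZE..aes_end, aes_end..raw_end, raw_end..file_len))
}
```

For example, a 39-byte file with `aes_size = 10` and `xor_size = 3` splits into 16 ciphertext bytes, 5 raw bytes, and 3 XOR bytes after the header.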

View File

@ -0,0 +1,11 @@
use anyhow::{bail, Result};
use super::{ImageKeyMaterial, ImageKeyProvider};
pub struct LinuxImageKeyProvider;
impl ImageKeyProvider for LinuxImageKeyProvider {
fn get_key(&self, _wxid: &str) -> Result<ImageKeyMaterial> {
bail!("Linux V2 image key is not implemented yet; use legacy/V1 images for now, or mark it as unsupported in the README")
}
}

View File

@ -0,0 +1,423 @@
//! macOS V2 image AES key extraction.
//!
//! Main path: take the uin from the `key_<uin>_*.statistic` file name, then derive the AES key as
//! `md5(str(uin) + normalize(wxid)).hex()[:16]`.
//!
//! Fallback: use `md5(str(uin))[:4] == wxid_suffix` + `uin & 0xff == xor_key`
//! to shrink the search space to 2^24, then verify the AES key against V2 templates.
use anyhow::{bail, Context, Result};
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc, Mutex};
use crate::config;
use super::{
attach_root_for_db_dir, configured_db_dir_for_wxid, derive_xor_key_from_v2_dat,
find_v2_template_ciphertexts, join_components, normalize_wxid, verify_aes_key, wxid_from_db_dir,
ImageKeyMaterial, ImageKeyProvider,
};
pub struct MacosImageKeyProvider {
configured_db_dir: Result<PathBuf, String>,
cache: Mutex<HashMap<String, ImageKeyMaterial>>,
}
impl MacosImageKeyProvider {
pub fn from_current_config() -> Self {
let configured_db_dir = config::load_config()
.map(|cfg| cfg.db_dir)
.map_err(|err| err.to_string());
Self {
configured_db_dir,
cache: Mutex::new(HashMap::new()),
}
}
}
impl ImageKeyProvider for MacosImageKeyProvider {
fn get_key(&self, wxid: &str) -> Result<ImageKeyMaterial> {
let cache_key = normalize_wxid(wxid);
if let Some(found) = self.cache.lock().unwrap().get(&cache_key).copied() {
return Ok(found);
}
let configured_db_dir = self
.configured_db_dir
.as_ref()
.map_err(|err| anyhow::anyhow!("failed to read config.db_dir: {}", err))?;
let db_dir = configured_db_dir_for_wxid(configured_db_dir, wxid);
let attach_dir = attach_root_for_db_dir(&db_dir);
let key = derive_key_for_paths(&db_dir, &attach_dir)?;
self.cache.lock().unwrap().insert(cache_key, key);
Ok(key)
}
}
fn derive_key_for_paths(db_dir: &Path, attach_dir: &Path) -> Result<ImageKeyMaterial> {
let templates = find_v2_template_ciphertexts(attach_dir, 3, 64)?;
if templates.is_empty() {
bail!("no V2 template files found under {}", attach_dir.display());
}
if let Some(found) = find_via_kvcomm(db_dir, &templates)? {
return Ok(found);
}
let (wxid_full, wxid_norm, suffix) =
extract_wxid_parts(db_dir).context("db_dir has no 4-character wxid suffix usable for the fallback")?;
let (xor_key, _votes, _total) = derive_xor_key_from_v2_dat(attach_dir, 10, 3)?
.context("too few V2 .dat samples to recover xor_key by voting")?;
for wxid in preferred_wxid_candidates(&wxid_full, &wxid_norm) {
if let Some(aes_key) = bruteforce_aes_key(xor_key, &suffix, wxid, &templates)? {
return Ok(ImageKeyMaterial { aes_key, xor_key });
}
}
bail!("macOS V2 image key derivation failed")
}
fn find_via_kvcomm(db_dir: &Path, templates: &[[u8; 16]]) -> Result<Option<ImageKeyMaterial>> {
let Some(kvcomm_dir) = find_existing_kvcomm_dir(db_dir) else {
return Ok(None);
};
let codes = collect_kvcomm_codes(&kvcomm_dir)?;
if codes.is_empty() {
return Ok(None);
}
let wxids = collect_wxid_candidates(db_dir);
if wxids.is_empty() {
return Ok(None);
}
for wxid in wxids {
for code in &codes {
let candidate = derive_image_key_material(*code, &wxid);
if verify_aes_key(&candidate.aes_key, templates) {
return Ok(Some(candidate));
}
}
}
Ok(None)
}
fn derive_image_key_material(code: u32, wxid: &str) -> ImageKeyMaterial {
let xor_key = (code & 0xFF) as u8;
let digest = format!("{:x}", md5::compute(format!("{}{}", code, wxid)));
let mut aes_key = [0u8; 16];
aes_key.copy_from_slice(&digest.as_bytes()[..16]);
ImageKeyMaterial { aes_key, xor_key }
}
fn collect_wxid_candidates(db_dir: &Path) -> Vec<String> {
let Some(raw) = wxid_from_db_dir(db_dir) else {
return Vec::new();
};
let mut out = vec![raw.clone()];
let normalized = normalize_wxid(&raw);
if normalized != raw {
out.push(normalized);
}
out
}
fn extract_wxid_parts(db_dir: &Path) -> Option<(String, String, String)> {
let raw = wxid_from_db_dir(db_dir)?;
let idx = raw.rfind('_')?;
let suffix = &raw[idx + 1..];
if suffix.len() != 4 || !suffix.bytes().all(|byte| byte.is_ascii_hexdigit()) {
return None;
}
Some((raw.clone(), normalize_wxid(&raw), suffix.to_ascii_lowercase()))
}
fn preferred_wxid_candidates<'a>(raw: &'a str, normalized: &'a str) -> Vec<&'a str> {
if raw == normalized {
vec![raw]
} else {
vec![normalized, raw]
}
}
fn derive_kvcomm_dir_candidates(db_dir: &Path) -> Vec<PathBuf> {
let parts: Vec<String> = db_dir
.components()
.map(|component| component.as_os_str().to_string_lossy().into_owned())
.collect();
let mut candidates = Vec::new();
if let Some(idx) = parts.iter().position(|part| part == "xwechat_files") {
let documents_root = join_components(&parts[..idx]);
candidates.push(documents_root.join("app_data/net/kvcomm"));
candidates.push(documents_root.join("xwechat/net/kvcomm"));
if idx >= 1 {
let container_root = join_components(&parts[..idx - 1]);
candidates.push(
container_root
.join("Library/Application Support/com.tencent.xinWeChat/xwechat/net/kvcomm"),
);
candidates.push(
container_root.join("Library/Application Support/com.tencent.xinWeChat/net/kvcomm"),
);
}
}
if let Some(home) = dirs::home_dir() {
candidates.push(
home.join("Library/Containers/com.tencent.xinWeChat/Data/Documents/app_data/net/kvcomm"),
);
}
let mut dedup = Vec::new();
for candidate in candidates {
if !dedup.contains(&candidate) {
dedup.push(candidate);
}
}
dedup
}
fn find_existing_kvcomm_dir(db_dir: &Path) -> Option<PathBuf> {
derive_kvcomm_dir_candidates(db_dir)
.into_iter()
.find(|path| path.is_dir())
}
fn collect_kvcomm_codes(kvcomm_dir: &Path) -> Result<Vec<u32>> {
let mut codes = std::collections::BTreeSet::new();
for entry in std::fs::read_dir(kvcomm_dir)? {
let entry = entry?;
let Some(name) = entry.file_name().to_str().map(|value| value.to_string()) else {
continue;
};
let Some(rest) = name.strip_prefix("key_") else {
continue;
};
let Some((code, _)) = rest.split_once('_') else {
continue;
};
if let Ok(code) = code.parse::<u32>() {
codes.insert(code);
}
}
Ok(codes.into_iter().collect())
}
fn bruteforce_aes_key(
xor_key: u8,
suffix_hex: &str,
wxid: &str,
templates: &[[u8; 16]],
) -> Result<Option<[u8; 16]>> {
let suffix = hex_prefix_to_bytes(suffix_hex)?;
let workers = std::thread::available_parallelism()
.map(|count| count.get())
.unwrap_or(1)
.max(1);
let total = 1u32 << 24;
let chunk = total / workers as u32;
let stop = Arc::new(AtomicBool::new(false));
let (tx, rx) = mpsc::channel();
let wxid = Arc::new(wxid.as_bytes().to_vec());
let templates = Arc::new(templates.to_vec());
std::thread::scope(|scope| {
for idx in 0..workers {
let start = idx as u32 * chunk;
let end = if idx + 1 == workers {
total
} else {
(idx as u32 + 1) * chunk
};
let stop = Arc::clone(&stop);
let tx = tx.clone();
let wxid = Arc::clone(&wxid);
let templates = Arc::clone(&templates);
scope.spawn(move || {
for upper in start..end {
if stop.load(Ordering::Relaxed) {
break;
}
let uin = (upper << 8) | xor_key as u32;
let uin_ascii = uin.to_string();
let digest = md5::compute(uin_ascii.as_bytes());
if digest.0[0] != suffix[0] || digest.0[1] != suffix[1] {
continue;
}
let mut input = Vec::with_capacity(uin_ascii.len() + wxid.len());
input.extend_from_slice(uin_ascii.as_bytes());
input.extend_from_slice(&wxid);
let aes_hex = format!("{:x}", md5::compute(input));
let mut aes_key = [0u8; 16];
aes_key.copy_from_slice(&aes_hex.as_bytes()[..16]);
if verify_aes_key(&aes_key, &templates) {
stop.store(true, Ordering::Relaxed);
let _ = tx.send(aes_key);
break;
}
}
});
}
});
drop(tx);
Ok(rx.try_iter().next())
}
fn hex_prefix_to_bytes(hex: &str) -> Result<[u8; 2]> {
if hex.len() != 4 {
bail!("wxid suffix is not 4 hex digits: {}", hex);
}
let hi = u8::from_str_radix(&hex[..2], 16)?;
let lo = u8::from_str_radix(&hex[2..], 16)?;
Ok([hi, lo])
}
#[cfg(test)]
mod tests {
use super::{derive_key_for_paths, find_existing_kvcomm_dir};
use super::collect_wxid_candidates;
use crate::attachment::image_key::normalize_wxid;
use aes::cipher::{generic_array::GenericArray, BlockEncrypt, KeyInit};
use aes::Aes128;
use std::fs;
use std::path::Path;
fn temp_dir(label: &str) -> std::path::PathBuf {
let mut dir = std::env::temp_dir();
dir.push(format!(
"wx-cli-image-key-macos-{}-{:?}",
label,
std::thread::current().id()
));
let _ = fs::remove_dir_all(&dir);
fs::create_dir_all(&dir).unwrap();
dir
}
fn write_v2_template(path: &Path, aes_key: &[u8; 16], xor_key: u8, plaintext: &[u8; 16]) {
let cipher = Aes128::new(aes_key.into());
let mut block = GenericArray::clone_from_slice(plaintext);
cipher.encrypt_block(&mut block);
let mut data = Vec::new();
data.extend_from_slice(&crate::attachment::decoder::V2_MAGIC);
data.extend_from_slice(&0u32.to_le_bytes());
data.extend_from_slice(&0u32.to_le_bytes());
data.push(0);
data.extend_from_slice(&block);
data.push(0);
data.push(0xD9 ^ xor_key);
fs::create_dir_all(path.parent().unwrap()).unwrap();
fs::write(path, data).unwrap();
}
#[test]
fn normalize_wxid_matches_expected_shapes() {
assert_eq!(normalize_wxid("wxid_abc_def"), "wxid_abc");
assert_eq!(normalize_wxid("your_wxid_a1b2"), "your_wxid");
assert_eq!(normalize_wxid("plain"), "plain");
}
#[test]
fn kvcomm_path_detection_works() {
let dir = temp_dir("kvcomm");
let db_dir = dir.join(
"Library/Containers/com.tencent.xinWeChat/Data/Documents/xwechat_files/your_wxid_a1b2/db_storage",
);
let kvcomm = dir.join(
"Library/Containers/com.tencent.xinWeChat/Data/Documents/app_data/net/kvcomm",
);
fs::create_dir_all(&db_dir).unwrap();
fs::create_dir_all(&kvcomm).unwrap();
assert_eq!(find_existing_kvcomm_dir(&db_dir), Some(kvcomm));
let _ = fs::remove_dir_all(dir);
}
#[test]
fn derives_key_via_kvcomm() {
let dir = temp_dir("via-kvcomm");
let db_dir = dir.join(
"Library/Containers/com.tencent.xinWeChat/Data/Documents/xwechat_files/your_wxid_a1b2/db_storage",
);
let attach = dir.join(
"Library/Containers/com.tencent.xinWeChat/Data/Documents/xwechat_files/your_wxid_a1b2/msg/attach/chat/2026-05/Img",
);
let kvcomm = dir.join(
"Library/Containers/com.tencent.xinWeChat/Data/Documents/app_data/net/kvcomm",
);
fs::create_dir_all(&db_dir).unwrap();
fs::create_dir_all(&kvcomm).unwrap();
fs::write(kvcomm.join("key_42_x.statistic"), b"").unwrap();
let digest = format!("{:x}", md5::compute("42your_wxid"));
let mut aes_key = [0u8; 16];
aes_key.copy_from_slice(&digest.as_bytes()[..16]);
write_v2_template(
&attach.join("sample_t.dat"),
&aes_key,
42,
b"\xFF\xD8\xFFtemplate-001!",
);
let derived = derive_key_for_paths(&db_dir, db_dir.parent().unwrap().join("msg/attach").as_path())
.unwrap();
assert_eq!(derived.aes_key, aes_key);
assert_eq!(derived.xor_key, 42);
let _ = fs::remove_dir_all(dir);
}
#[test]
fn derives_key_via_bruteforce_fallback() {
let dir = temp_dir("via-fallback");
let suffix = format!("{:x}", md5::compute("42"))
.chars()
.take(4)
.collect::<String>();
let raw_wxid = format!("mywxid_{}", suffix);
let db_dir = dir.join(format!(
"Library/Containers/com.tencent.xinWeChat/Data/Documents/xwechat_files/{}/db_storage",
raw_wxid
));
let attach = dir.join(format!(
"Library/Containers/com.tencent.xinWeChat/Data/Documents/xwechat_files/{}/msg/attach/chat/2026-05/Img",
raw_wxid
));
fs::create_dir_all(&db_dir).unwrap();
let digest = format!("{:x}", md5::compute("42mywxid"));
let mut aes_key = [0u8; 16];
aes_key.copy_from_slice(&digest.as_bytes()[..16]);
for idx in 0..3 {
write_v2_template(
&attach.join(format!("sample{}_t.dat", idx)),
&aes_key,
42,
b"\xFF\xD8\xFFtemplate-001!",
);
}
let derived = derive_key_for_paths(&db_dir, db_dir.parent().unwrap().join("msg/attach").as_path())
.unwrap();
assert_eq!(derived.aes_key, aes_key);
assert_eq!(derived.xor_key, 42);
let _ = fs::remove_dir_all(dir);
}
#[test]
fn collects_raw_and_normalized_wxid() {
let dir = temp_dir("wxid");
let db_dir = dir.join(
"Library/Containers/com.tencent.xinWeChat/Data/Documents/xwechat_files/your_wxid_a1b2/db_storage",
);
fs::create_dir_all(&db_dir).unwrap();
let wxids = collect_wxid_candidates(&db_dir);
assert_eq!(wxids, vec!["your_wxid_a1b2".to_string(), "your_wxid".to_string()]);
let _ = fs::remove_dir_all(dir);
}
}
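
The brute-force fallback above fixes the low byte of the uin to `xor_key` and enumerates the remaining 24 bits across worker threads. The sketch below isolates only that search-space bookkeeping. `candidate_uin` and `worker_ranges` are hypothetical helpers, and the MD5 prefix filter and AES verification are left out because they need the `md5` and `aes` crates.

```rust
/// Number of candidate values for the upper 24 bits of the uin.
pub const UPPER_SPACE: u32 = 1 << 24;

/// Rebuild a candidate uin from its upper 24 bits and the known low byte
/// (`uin & 0xff == xor_key`). `upper` is expected to be below `UPPER_SPACE`.
pub fn candidate_uin(upper: u32, xor_key: u8) -> u32 {
    (upper << 8) | xor_key as u32
}

/// Split `0..total` into `workers` contiguous half-open ranges. The last
/// range absorbs the division remainder so no candidate is skipped.
pub fn worker_ranges(total: u32, workers: u32) -> Vec<(u32, u32)> {
    let workers = workers.max(1);
    let chunk = total / workers;
    (0..workers)
        .map(|idx| {
            let start = idx * chunk;
            let end = if idx + 1 == workers {
                total
            } else {
                (idx + 1) * chunk
            };
            (start, end)
        })
        .collect()
}
```

With the uin 42 used in the tests above, `xor_key` is 42 and the upper bits are 0, so the first worker finds the match on its first candidate.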

View File

@ -0,0 +1,342 @@
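//! V2 image AES key extraction (platform-specific).
//!
//! Paths:
//! - macOS: derive from disk (take the uin from the `key_<uin>_*.statistic` file name → `md5(str(uin) + wxid)[:16]`)
//! + brute-force fallback (`md5(str(uin))[:4] == wxid_suffix`, enumerating 2^24)
//! - Windows: scan `Weixin.exe` memory, match `[a-zA-Z0-9]{32}` candidates, and verify them against a known AES ciphertext block
//! (already implemented in `find_image_key.py` / `find_image_key.c`)
//! - Linux: blank upstream and not implemented here; a V2 .dat returns an unsupported error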
#[cfg(target_os = "linux")]
pub mod linux;
#[cfg(target_os = "macos")]
pub mod macos;
#[cfg(target_os = "windows")]
pub mod windows;
use anyhow::Result;
use regex::bytes::Regex;
use std::collections::HashSet;
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::OnceLock;
use crate::attachment::decoder::{detect_image_format, V2_MAGIC};
/// A V2 image actually needs two pieces of key material:
/// - a 16-byte ASCII AES key
/// - an XOR key (on macOS it comes from uin & 0xff, so it cannot always be hard-coded to 0x88)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ImageKeyMaterial {
pub aes_key: [u8; 16],
pub xor_key: u8,
}
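/// V2 image key extraction interface for a single wxid.
///
/// Implementors are responsible for caching across calls (on one machine, the image key for a given wxid is usually stable as long as WeChat is not restarted).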
pub trait ImageKeyProvider {
fn get_key(&self, wxid: &str) -> Result<ImageKeyMaterial>;
fn get_aes_key(&self, wxid: &str) -> Result<[u8; 16]> {
Ok(self.get_key(wxid)?.aes_key)
}
fn get_xor_key(&self, wxid: &str) -> Result<u8> {
Ok(self.get_key(wxid)?.xor_key)
}
}
/// Platform default implementation.
pub fn default_provider() -> Option<Box<dyn ImageKeyProvider + Send + Sync>> {
#[cfg(target_os = "macos")]
{
return Some(Box::new(macos::MacosImageKeyProvider::from_current_config()));
}
#[cfg(target_os = "windows")]
{
return Some(Box::new(windows::WindowsImageKeyProvider::from_current_config()));
}
#[cfg(target_os = "linux")]
{
return Some(Box::new(linux::LinuxImageKeyProvider));
}
#[cfg(not(any(target_os = "macos", target_os = "windows", target_os = "linux")))]
{
None
}
}
pub(crate) fn configured_db_dir_for_wxid(configured_db_dir: &Path, requested_wxid: &str) -> PathBuf {
if requested_wxid.trim().is_empty() {
return configured_db_dir.to_path_buf();
}
let configured_leaf = wxid_from_db_dir(configured_db_dir);
if let Some(leaf) = configured_leaf.as_deref() {
if same_wxid(leaf, requested_wxid) {
return configured_db_dir.to_path_buf();
}
}
xwechat_files_root(configured_db_dir)
.map(|root| root.join(requested_wxid).join("db_storage"))
.unwrap_or_else(|| configured_db_dir.to_path_buf())
}
pub(crate) fn wxid_from_db_dir(db_dir: &Path) -> Option<String> {
let mut components = db_dir
.components()
.map(|component| component.as_os_str().to_string_lossy().into_owned());
while let Some(component) = components.next() {
if component == "xwechat_files" {
return components.next();
}
}
None
}
pub(crate) fn xwechat_files_root(db_dir: &Path) -> Option<PathBuf> {
let parts: Vec<_> = db_dir
.components()
.map(|component| component.as_os_str().to_string_lossy().into_owned())
.collect();
let idx = parts.iter().position(|part| part == "xwechat_files")?;
Some(join_components(&parts[..=idx]))
}
pub(crate) fn normalize_wxid(raw: &str) -> String {
let raw = raw.trim();
if raw.is_empty() {
return String::new();
}
if let Some(stripped) = raw.strip_prefix("wxid_") {
let head = stripped.split('_').next().unwrap_or(stripped);
return format!("wxid_{}", head);
}
if let Some((base, suffix)) = raw.rsplit_once('_') {
if suffix.len() == 4 && suffix.bytes().all(|byte| byte.is_ascii_hexdigit()) {
return base.to_string();
}
}
raw.to_string()
}
pub(crate) fn same_wxid(a: &str, b: &str) -> bool {
a == b || normalize_wxid(a) == normalize_wxid(b)
}
pub(crate) fn join_components(parts: &[String]) -> PathBuf {
let mut out = if parts.first().map(|part| part.is_empty()).unwrap_or(false) {
PathBuf::from("/")
} else {
PathBuf::new()
};
for part in parts {
if part.is_empty() {
continue;
}
out.push(part);
}
out
}
pub(crate) fn attach_root_for_db_dir(db_dir: &Path) -> PathBuf {
db_dir
.parent()
.map(|base| base.join("msg").join("attach"))
.unwrap_or_else(|| PathBuf::from("msg/attach"))
}
pub(crate) fn find_v2_template_ciphertexts(
attach_dir: &Path,
max_templates: usize,
max_files: usize,
) -> Result<Vec<[u8; 16]>> {
if !attach_dir.is_dir() {
return Ok(Vec::new());
}
let mut out = collect_templates_with_suffix(attach_dir, "_t.dat", max_templates, max_files)?;
if out.is_empty() {
out = collect_templates_with_suffix(attach_dir, ".dat", max_templates, max_files)?;
}
Ok(out)
}
pub(crate) fn derive_xor_key_from_v2_dat(
attach_dir: &Path,
sample: usize,
min_samples: usize,
) -> Result<Option<(u8, usize, usize)>> {
if !attach_dir.is_dir() {
return Ok(None);
}
let mut votes = Vec::new();
visit_files(attach_dir, &mut |path| -> Result<bool> {
let Some(name) = path.file_name().and_then(|value| value.to_str()) else {
return Ok(false);
};
if !name.ends_with(".dat") {
return Ok(false);
}
let meta = fs::metadata(path)?;
if meta.len() < 0x20 {
return Ok(false);
}
let bytes = fs::read(path)?;
if bytes.starts_with(&V2_MAGIC) {
let last = *bytes.last().unwrap();
votes.push(last ^ 0xD9);
if votes.len() >= sample {
return Ok(true);
}
}
Ok(false)
})?;
if votes.len() < min_samples {
return Ok(None);
}
let mut counts = [0usize; 256];
for vote in &votes {
counts[*vote as usize] += 1;
}
let (xor_key, top_votes) = counts
.iter()
.enumerate()
.max_by_key(|(_, count)| *count)
.map(|(idx, count)| (idx as u8, *count))
.expect("counts is never empty");
Ok(Some((xor_key, top_votes, votes.len())))
}
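
A note on the vote in `derive_xor_key_from_v2_dat`: each sampled file contributes `last_byte ^ 0xD9`, and the most common value wins. The constant presumably reflects that a JPEG stream ends with the bytes `FF D9`, so the obfuscated final byte reveals the key; that reading is an assumption, not something the source states. A minimal in-memory sketch of the same majority vote follows, where `vote_xor_key` is a hypothetical helper that takes the last bytes directly instead of walking a directory:

```rust
/// Hypothetical helper (not part of the crate): majority vote over `last_byte ^ 0xD9`.
/// Returns the winning key and its vote count, or None with fewer than `min_samples` samples.
fn vote_xor_key(last_bytes: &[u8], min_samples: usize) -> Option<(u8, usize)> {
    if last_bytes.is_empty() || last_bytes.len() < min_samples {
        return None;
    }
    // One counter per possible key byte.
    let mut counts = [0usize; 256];
    for &last in last_bytes {
        counts[(last ^ 0xD9) as usize] += 1;
    }
    counts
        .iter()
        .enumerate()
        .max_by_key(|(_, count)| **count)
        .map(|(key, count)| (key as u8, *count))
}
```

With the samples `[0x51, 0x51, 0x00, 0x51]`, three files vote for `0x51 ^ 0xD9 = 0x88`, which is also the value the Windows provider falls back to when too few samples exist.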
pub(crate) fn verify_aes_key(aes_key: &[u8; 16], templates: &[[u8; 16]]) -> bool {
!templates.is_empty()
&& templates
.iter()
.all(|template| decrypt_template_block(aes_key, template).is_some())
}
pub(crate) fn ascii_alnum_candidates<'a>(buf: &'a [u8], len: usize) -> Vec<&'a [u8]> {
let re = match len {
16 => regex16(),
32 => regex32(),
_ => return Vec::new(),
};
re.find_iter(buf)
.filter_map(|matched| {
let start = matched.start();
let end = matched.end();
let left_ok = start == 0 || !buf[start - 1].is_ascii_alphanumeric();
let right_ok = end == buf.len() || !buf[end].is_ascii_alphanumeric();
(left_ok && right_ok).then_some(&buf[start..end])
})
.collect()
}
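
The boundary checks in `ascii_alnum_candidates` mean that a hit is only accepted when the alphanumeric run is exactly `len` bytes long: a longer run always has an alphanumeric neighbour on at least one side of every regex match. The same rule can be sketched without the `regex` crate by walking maximal runs. `exact_alnum_runs` is a hypothetical, std-only stand-in for illustration and is not the crate's implementation:

```rust
/// Hypothetical std-only equivalent of the boundary rule:
/// return every maximal ASCII-alphanumeric run whose length is exactly `len`.
fn exact_alnum_runs(buf: &[u8], len: usize) -> Vec<&[u8]> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < buf.len() {
        if !buf[i].is_ascii_alphanumeric() {
            i += 1;
            continue;
        }
        // Consume the whole run, then keep it only if the length matches.
        let start = i;
        while i < buf.len() && buf[i].is_ascii_alphanumeric() {
            i += 1;
        }
        if i - start == len {
            out.push(&buf[start..i]);
        }
    }
    out
}
```

On the two inputs used by the module's own tests, this accepts the 16-byte run surrounded by spaces and rejects the 18-byte run that merely contains 16 valid characters.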
fn collect_templates_with_suffix(
dir: &Path,
suffix: &str,
max_templates: usize,
max_files: usize,
) -> Result<Vec<[u8; 16]>> {
let mut out = Vec::new();
let mut seen = HashSet::new();
let mut examined = 0usize;
visit_files(dir, &mut |path| -> Result<bool> {
let Some(name) = path.file_name().and_then(|value| value.to_str()) else {
return Ok(false);
};
if !name.ends_with(suffix) {
return Ok(false);
}
examined += 1;
let bytes = fs::read(path)?;
if bytes.len() >= 0x1F && bytes.starts_with(&V2_MAGIC) {
let template: [u8; 16] = bytes[0x0F..0x1F].try_into().unwrap();
if seen.insert(template) {
out.push(template);
if out.len() >= max_templates {
return Ok(true);
}
}
}
Ok(examined >= max_files && !out.is_empty())
})?;
Ok(out)
}
fn visit_files<F>(dir: &Path, f: &mut F) -> Result<bool>
where
F: FnMut(&Path) -> Result<bool>,
{
let mut entries: Vec<PathBuf> = fs::read_dir(dir)?
.flatten()
.map(|entry| entry.path())
.collect();
entries.sort();
for path in entries {
if path.is_dir() {
if visit_files(&path, f)? {
return Ok(true);
}
continue;
}
if f(&path)? {
return Ok(true);
}
}
Ok(false)
}
fn decrypt_template_block(aes_key: &[u8; 16], ciphertext: &[u8; 16]) -> Option<&'static str> {
use aes::cipher::{generic_array::GenericArray, BlockDecrypt, KeyInit};
let cipher = aes::Aes128::new(aes_key.into());
let mut block = GenericArray::clone_from_slice(ciphertext);
cipher.decrypt_block(&mut block);
let block: [u8; 16] = block.as_slice().try_into().ok()?;
let format = detect_image_format(&block);
(format != "bin").then_some(format)
}
fn regex16() -> &'static Regex {
static RE: OnceLock<Regex> = OnceLock::new();
RE.get_or_init(|| Regex::new(r"[A-Za-z0-9]{16}").unwrap())
}
fn regex32() -> &'static Regex {
static RE: OnceLock<Regex> = OnceLock::new();
RE.get_or_init(|| Regex::new(r"[A-Za-z0-9]{32}").unwrap())
}
#[cfg(test)]
mod tests {
use super::{ascii_alnum_candidates, normalize_wxid, same_wxid};
#[test]
fn regex_candidates_respect_boundaries() {
let buf = b"xx 0123456789ABCDef yy";
let hits = ascii_alnum_candidates(buf, 16);
assert_eq!(hits, vec![&buf[3..19]]);
}
#[test]
fn regex_candidates_ignore_embedded_runs() {
let buf = b"x0123456789ABCDefz";
assert!(ascii_alnum_candidates(buf, 16).is_empty());
}
#[test]
fn wxid_normalization_matches_expected_forms() {
assert_eq!(normalize_wxid("wxid_abc_def"), "wxid_abc");
assert_eq!(normalize_wxid("your_wxid_a1b2"), "your_wxid");
assert!(same_wxid("your_wxid_a1b2", "your_wxid"));
}
}

@ -0,0 +1,238 @@
//! Windows V2 image AES key extraction.
//!
//! Scans `Weixin.exe` process memory for the patterns `[A-Za-z0-9]{32}` / `[A-Za-z0-9]{16}`,
//! then verifies candidates against V2 template AES blocks to keep false positives down.
use anyhow::{bail, Context, Result};
use std::collections::{HashMap, HashSet};
use std::path::PathBuf;
use std::sync::Mutex;
use windows::Win32::Foundation::{CloseHandle, HANDLE};
use windows::Win32::System::Diagnostics::Debug::ReadProcessMemory;
use windows::Win32::System::Diagnostics::ToolHelp::{
CreateToolhelp32Snapshot, Process32First, Process32Next, PROCESSENTRY32, TH32CS_SNAPPROCESS,
};
use windows::Win32::System::Memory::{
VirtualQueryEx, MEMORY_BASIC_INFORMATION, MEM_COMMIT, PAGE_EXECUTE_READWRITE,
PAGE_EXECUTE_WRITECOPY, PAGE_GUARD, PAGE_NOCACHE, PAGE_NOACCESS, PAGE_READWRITE,
PAGE_WRITECOMBINE, PAGE_WRITECOPY,
};
use windows::Win32::System::Threading::{OpenProcess, PROCESS_QUERY_INFORMATION, PROCESS_VM_READ};
use crate::config;
use super::{
ascii_alnum_candidates, attach_root_for_db_dir, configured_db_dir_for_wxid,
derive_xor_key_from_v2_dat, find_v2_template_ciphertexts, verify_aes_key, ImageKeyMaterial,
ImageKeyProvider,
};
const CHUNK_SIZE: usize = 2 * 1024 * 1024;
const MAX_REGION_SIZE: usize = 50 * 1024 * 1024;
pub struct WindowsImageKeyProvider {
configured_db_dir: Result<PathBuf, String>,
cache: Mutex<HashMap<String, ImageKeyMaterial>>,
}
impl WindowsImageKeyProvider {
pub fn from_current_config() -> Self {
let configured_db_dir = config::load_config()
.map(|cfg| cfg.db_dir)
.map_err(|err| err.to_string());
Self {
configured_db_dir,
cache: Mutex::new(HashMap::new()),
}
}
}
impl ImageKeyProvider for WindowsImageKeyProvider {
fn get_key(&self, wxid: &str) -> Result<ImageKeyMaterial> {
let cache_key = wxid.trim().to_string();
if let Some(found) = self.cache.lock().unwrap().get(&cache_key).copied() {
return Ok(found);
}
let configured_db_dir = self
.configured_db_dir
.as_ref()
.map_err(|err| anyhow::anyhow!("failed to read config.db_dir: {}", err))?;
let db_dir = configured_db_dir_for_wxid(configured_db_dir, wxid);
let attach_dir = attach_root_for_db_dir(&db_dir);
let key = derive_key_for_paths(&attach_dir)?;
self.cache.lock().unwrap().insert(cache_key, key);
Ok(key)
}
}
fn derive_key_for_paths(attach_dir: &std::path::Path) -> Result<ImageKeyMaterial> {
let templates = find_v2_template_ciphertexts(attach_dir, 3, 64)?;
if templates.is_empty() {
bail!("no V2 template file found under {}", attach_dir.display());
}
let xor_key = derive_xor_key_from_v2_dat(attach_dir, 10, 3)?
.map(|(key, _, _)| key)
.unwrap_or(0x88);
let pid = find_wechat_pid().context("Weixin.exe process not found; make sure WeChat is running")?;
let process = unsafe {
OpenProcess(PROCESS_VM_READ | PROCESS_QUERY_INFORMATION, false, pid)
.context("OpenProcess failed; try running with administrator privileges")?
};
let aes_key = scan_memory_for_key(process, &templates);
unsafe {
let _ = CloseHandle(process);
}
Ok(ImageKeyMaterial {
aes_key: aes_key?,
xor_key,
})
}
fn find_wechat_pid() -> Option<u32> {
let snapshot = unsafe { CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0).ok()? };
let mut entry = PROCESSENTRY32 {
dwSize: std::mem::size_of::<PROCESSENTRY32>() as u32,
..Default::default()
};
unsafe {
if Process32First(snapshot, &mut entry).is_err() {
let _ = CloseHandle(snapshot);
return None;
}
loop {
let name =
std::ffi::CStr::from_ptr(entry.szExeFile.as_ptr() as *const i8).to_string_lossy();
if name.eq_ignore_ascii_case("Weixin.exe") {
let pid = entry.th32ProcessID;
let _ = CloseHandle(snapshot);
return Some(pid);
}
if Process32Next(snapshot, &mut entry).is_err() {
break;
}
}
let _ = CloseHandle(snapshot);
}
None
}
fn scan_memory_for_key(process: HANDLE, templates: &[[u8; 16]]) -> Result<[u8; 16]> {
let mut seen = HashSet::<[u8; 16]>::new();
let mut address = 0usize;
loop {
let mut mbi = MEMORY_BASIC_INFORMATION::default();
let ret = unsafe {
VirtualQueryEx(
process,
Some(address as *const _),
&mut mbi,
std::mem::size_of::<MEMORY_BASIC_INFORMATION>(),
)
};
if ret == 0 {
break;
}
let base = mbi.BaseAddress as usize;
let size = mbi.RegionSize;
if mbi.State == MEM_COMMIT && is_candidate_page(mbi.Protect.0) && size <= MAX_REGION_SIZE {
if let Some(aes_key) = scan_region(process, base, size, templates, &mut seen)? {
return Ok(aes_key);
}
}
address = base.saturating_add(size);
if address == 0 {
break;
}
}
bail!("no verifiable V2 AES key found in the Windows process memory")
}
fn scan_region(
process: HANDLE,
base: usize,
size: usize,
templates: &[[u8; 16]],
seen: &mut HashSet<[u8; 16]>,
) -> Result<Option<[u8; 16]>> {
let overlap = 31usize;
let mut offset = 0usize;
while offset < size {
let chunk_size = std::cmp::min(CHUNK_SIZE, size - offset);
let addr = base + offset;
let mut buf = vec![0u8; chunk_size];
let mut bytes_read = 0usize;
let ok = unsafe {
ReadProcessMemory(
process,
addr as *const _,
buf.as_mut_ptr() as *mut _,
chunk_size,
Some(&mut bytes_read),
)
.is_ok()
};
if ok && bytes_read > 0 {
buf.truncate(bytes_read);
if let Some(key) = scan_candidate_buffer(&buf, templates, seen) {
return Ok(Some(key));
}
}
offset += if chunk_size > overlap {
chunk_size - overlap
} else {
chunk_size
};
}
Ok(None)
}
fn scan_candidate_buffer(
buf: &[u8],
templates: &[[u8; 16]],
seen: &mut HashSet<[u8; 16]>,
) -> Option<[u8; 16]> {
for candidate in ascii_alnum_candidates(buf, 32) {
let mut key = [0u8; 16];
key.copy_from_slice(&candidate[..16]);
if seen.insert(key) && verify_aes_key(&key, templates) {
return Some(key);
}
}
for candidate in ascii_alnum_candidates(buf, 16) {
let mut key = [0u8; 16];
key.copy_from_slice(candidate);
if seen.insert(key) && verify_aes_key(&key, templates) {
return Some(key);
}
}
None
}
fn is_candidate_page(protect: u32) -> bool {
if protect == PAGE_NOACCESS.0 || (protect & PAGE_GUARD.0) != 0 {
return false;
}
let base = protect & !(PAGE_GUARD.0 | PAGE_NOCACHE.0 | PAGE_WRITECOMBINE.0);
matches!(
base,
value if value == PAGE_READWRITE.0
|| value == PAGE_WRITECOPY.0
|| value == PAGE_EXECUTE_READWRITE.0
|| value == PAGE_EXECUTE_WRITECOPY.0
)
}
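
`scan_region` reads each region in `CHUNK_SIZE` pieces and steps forward by `chunk_size - overlap` with `overlap = 31`, so any 32-byte candidate that straddles a chunk edge still appears whole in some read. A small sketch of that stepping rule on plain offsets is shown here; `chunk_windows` is a hypothetical helper, and the tiny sizes are chosen only to make the windows easy to inspect:

```rust
/// Hypothetical helper: the `[start, end)` windows visited when a region of
/// `total` bytes is read in pieces of at most `chunk` bytes, re-reading
/// `overlap` bytes at each step (same stepping rule as `scan_region`).
fn chunk_windows(total: usize, chunk: usize, overlap: usize) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    let mut offset = 0;
    while offset < total {
        let size = chunk.min(total - offset);
        out.push((offset, offset + size));
        // Step back by `overlap` unless the piece is too small to allow it.
        offset += if size > overlap { size - overlap } else { size };
    }
    out
}
```

With an overlap of 31, consecutive windows share 31 bytes, which is one byte less than the longest pattern (32), so no 32-byte span can fall between two reads. The cost is that the tail of a region is read more than once.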

@ -0,0 +1,28 @@
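//! Chat attachment extraction pipeline (local decoding of image / video / voice / file payloads)
//!
//! The full chain:
//! message_N.db (Msg_<md5>) → message_resource.db (ChatName2Id + MessageResourceInfo)
//! → md5 extracted from the packed_info protobuf → xwechat_files/<wxid>/msg/attach/.../Img/<md5>[_t|_h].dat
//! → dispatch on magic (legacy XOR / V1 fixed-AES / V2 AES+XOR) → write out the actual image
//!
//! Module layout:
//! - `attachment_id`: opaque ID passed across IPC / CLI (base64url(json))
//! - `resolver`: looks up message_resource.db from an `attachment_id` and locates the local .dat
//! - `decoder`: dispatches on file magic to the specific decoder (V1 / V2, etc.)
//! - `image_key`: V2 image AES key extraction (macOS / Windows)
//!
//! The V2 / image_key module is landed by codex; an empty stub goes in first so V1 / resolver / CLI are not blocked.
// This module is enabled incrementally across several PRs/commits:
// 1) land the attachment_id / decoder / resolver / image_key skeleton first (this commit)
// 2) IPC + CLI + daemon route wire them together (later commit)
// 3) image_key platform implementations (later commit by codex)
// With step 1 done and step 2 not yet in, many public APIs are still unreferenced; #[allow(dead_code)] suppresses the noise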
#![allow(dead_code)]
pub mod attachment_id;
pub mod decoder;
pub mod resolver;
pub mod image_key;
pub use attachment_id::{AttachmentId, AttachmentKind};

@ -0,0 +1,439 @@
//! Translates an `AttachmentId` into a local `.dat` path.
//!
//! Flow:
//! 1. `chat` username → `ChatName2Id.rowid` (resource DB)
//! 2. `(chat_id, local_id)` + `ORDER BY message_create_time DESC LIMIT 1` →
//! `MessageResourceInfo.packed_info`
//! 3. Extract the 32-byte ASCII hex MD5 from `packed_info` (protobuf)
//! 4. Look for the matching file under `<wxchat_base>/msg/attach/<md5(chat)>/<YYYY-MM>/Img/<md5>[_t|_h].dat`
//! and pick one by priority full > _h > _t
//!
//! `<wxchat_base>` is known to the daemon (the parent directory of `db_dir`); the path layout differs per platform:
//! - Linux: `~/Documents/xwechat_files/<wxid>`
//! - macOS: `~/Library/Containers/com.tencent.xinWeChat/Data/Documents/xwechat_files/<wxid>`
//! ⚠️ the msg/attach/... subtree layout still has to be verified with a real account; the upstream docstring only covers Windows
//! - Windows: `<root>\xwechat_files\<wxid>` (root is read from `%APPDATA%\Tencent\xwechat\config\*.ini`)
use anyhow::{anyhow, Context, Result};
use chrono::TimeZone;
use rusqlite::Connection;
use std::path::{Path, PathBuf};
use super::AttachmentId;
/// Resolution result for a single attachment in the resource DB plus the local attach tree.
#[derive(Debug, Clone)]
pub struct ResolvedAttachment {
pub id: AttachmentId,
/// Resource MD5 extracted from `packed_info` (lowercase hex)
pub md5: String,
/// Path of the matched local .dat (one picked by priority full > _h > _t)
pub dat_path: PathBuf,
/// File size (bytes)
pub size: u64,
}
/// Schema lookup only (does not look for the local .dat).
/// Used to fill the `md5` field when listing with `wx attachments`; the file may not exist locally at all.
#[derive(Debug, Clone)]
pub struct AttachmentMetadata {
pub md5: String,
}
/// Looks up the file md5 in message_resource.db by `(chat, local_id)`.
///
/// The caller passes the path of an already-decrypted `message_resource.db` (prepared by the daemon's `DBCache`).
/// Synchronous function; the caller runs it inside `spawn_blocking`.
pub fn lookup_md5_blocking(
resource_db_path: &Path,
chat: &str,
local_id: i64,
create_time: i64,
msg_local_type_lo32: i64,
) -> Result<Option<AttachmentMetadata>> {
let conn = Connection::open_with_flags(
resource_db_path,
rusqlite::OpenFlags::SQLITE_OPEN_READ_ONLY | rusqlite::OpenFlags::SQLITE_OPEN_URI,
)
.with_context(|| format!("opening message_resource.db {:?}", resource_db_path))?;
// 1) ChatName2Id: user_name -> rowid
let chat_id: Option<i64> = conn
.query_row(
"SELECT rowid FROM ChatName2Id WHERE user_name = ?1",
[chat],
|row| row.get(0),
)
.ok();
let Some(chat_id) = chat_id else {
return Ok(None);
};
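// 2) MessageResourceInfo:
// local_id is reused within a chat, so first try an exact hit on create_time;
// if the resource DB timestamp does not line up exactly with message_N.db, fall back to "latest row with the same local_id/type"
// the high 32 bits of message_local_type are a version/session flag; only the low 32 bits are the real type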
let packed_exact: Option<Vec<u8>> = conn
.query_row(
"SELECT packed_info FROM MessageResourceInfo
WHERE chat_id = ?1
AND message_local_id = ?2
AND (message_local_type = ?3 OR message_local_type % 4294967296 = ?3)
AND message_create_time = ?4
ORDER BY rowid DESC
LIMIT 1",
rusqlite::params![chat_id, local_id, msg_local_type_lo32, create_time],
|row| row.get(0),
)
.ok();
let packed: Option<Vec<u8>> = packed_exact.or_else(|| conn
.query_row(
"SELECT packed_info FROM MessageResourceInfo
WHERE chat_id = ?1
AND message_local_id = ?2
AND (message_local_type = ?3 OR message_local_type % 4294967296 = ?3)
ORDER BY message_create_time DESC
LIMIT 1",
rusqlite::params![chat_id, local_id, msg_local_type_lo32],
|row| row.get(0),
)
.ok());
let Some(blob) = packed else {
return Ok(None);
};
Ok(extract_md5_from_packed_info(&blob).map(|md5| AttachmentMetadata { md5 }))
}
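
The `message_local_type` predicate in both queries accepts either an exact match or a match on the low 32 bits, since the high 32 bits carry a version/session flag. A sketch of the same check in Rust, assuming non-negative stored values (`type_matches` is a hypothetical name and not part of the crate):

```rust
/// Hypothetical mirror of the SQL predicate
/// `message_local_type = ?3 OR message_local_type % 4294967296 = ?3`.
/// Assumes `stored` is non-negative, so `%` behaves like a low-32-bit mask.
fn type_matches(stored: i64, wanted_lo32: i64) -> bool {
    stored == wanted_lo32 || stored % 4_294_967_296 == wanted_lo32
}
```

For example, a stored value of `(7 << 32) | 3` still matches the image type `3`, while `(7 << 32) | 34` (voice) does not.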
/// Extracts the 32-byte ASCII hex md5 from `MessageResourceInfo.packed_info` (protobuf).
///
/// Main path: search for the 4-byte marker `12 22 0a 20` (field=2 LEN, length=34, sub field=1 LEN, length=32),
/// immediately followed by 32 bytes of ASCII hex.
/// Fallback: scan the whole blob for 32 consecutive valid hex characters.
pub fn extract_md5_from_packed_info(blob: &[u8]) -> Option<String> {
const MARKER: &[u8; 4] = &[0x12, 0x22, 0x0A, 0x20];
// Main path
if let Some(pos) = find_subslice(blob, MARKER) {
let start = pos + MARKER.len();
if start + 32 <= blob.len() {
if let Ok(s) = std::str::from_utf8(&blob[start..start + 32]) {
if s.chars().all(|c| c.is_ascii_hexdigit()) {
return Some(s.to_ascii_lowercase());
}
}
}
}
// Fallback: 32 consecutive valid hex bytes
if blob.len() >= 32 {
for start in 0..=blob.len() - 32 {
let chunk = &blob[start..start + 32];
if let Ok(s) = std::str::from_utf8(chunk) {
if s.chars().all(|c| c.is_ascii_hexdigit()) {
return Some(s.to_ascii_lowercase());
}
}
}
}
None
}
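
The marker `12 22 0a 20` can be read byte by byte with the standard protobuf tag layout, where a tag byte is `(field_number << 3) | wire_type` and wire type 2 means length-delimited. A short worked sketch; `split_tag` is a hypothetical helper for illustration only:

```rust
/// Splits a single-byte protobuf tag into (field number, wire type).
/// Only valid for field numbers below 16, which is all the marker needs.
fn split_tag(tag: u8) -> (u8, u8) {
    (tag >> 3, tag & 0x07)
}
```

Here `0x12` decodes to field 2 with wire type 2, and `0x0A` to field 1 with wire type 2. The outer length `0x22` (34) is the inner tag byte plus the inner length byte plus the 32-byte payload announced by `0x20`.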
/// Simple subslice scan (avoids pulling in a memchr/memmem dependency; the blob is usually < 1KB)
fn find_subslice(haystack: &[u8], needle: &[u8]) -> Option<usize> {
if needle.is_empty() || needle.len() > haystack.len() {
return None;
}
haystack
.windows(needle.len())
.position(|w| w == needle)
}
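/// Looks for the file under `<attach_root>/<md5(chat)>/<YYYY-MM>/Img/<md5>[_t|_h].dat`.
///
/// Priority: full > `_h` (HD thumbnail) > `_t` (thumbnail). Returns the best match,
/// or None if nothing is found.
///
/// `attach_root` = `<wxchat_base>/msg/attach`.
/// `create_time` is used to locate the `<YYYY-MM>` subdirectory first; if that misses, fall back to scanning every month,
/// because WeChat's `YYYY-MM` directory is sometimes one month off from the message time (archived by receive time).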
pub fn find_dat_file(
attach_root: &Path,
chat: &str,
file_md5: &str,
create_time: i64,
) -> Option<PathBuf> {
let chat_hash = format!("{:x}", md5::compute(chat.as_bytes()));
let chat_dir = attach_root.join(&chat_hash);
if !chat_dir.is_dir() {
return None;
}
// Step 1: try the create_time month plus one month before and after (3 candidate directories)
let candidates_ym: Vec<String> = three_month_candidates(create_time);
for ym in &candidates_ym {
let img_dir = chat_dir.join(ym).join("Img");
if let Some(p) = pick_best_in_img_dir(&img_dir, file_md5) {
return Some(p);
}
}
// Step 2 (fallback): scan every month subdirectory under chat_dir
let entries = std::fs::read_dir(&chat_dir).ok()?;
let mut all_months: Vec<PathBuf> = entries
.filter_map(|e| e.ok())
.map(|e| e.path())
.filter(|p| p.is_dir())
.collect();
// The 3 candidates already tried could be skipped, but the cost is tiny; keep the full scan
all_months.sort();
for month_dir in all_months {
let img_dir = month_dir.join("Img");
if let Some(p) = pick_best_in_img_dir(&img_dir, file_md5) {
return Some(p);
}
}
None
}
fn pick_best_in_img_dir(img_dir: &Path, file_md5: &str) -> Option<PathBuf> {
if !img_dir.is_dir() {
return None;
}
let full = img_dir.join(format!("{}.dat", file_md5));
if full.is_file() {
return Some(full);
}
let hd = img_dir.join(format!("{}_h.dat", file_md5));
if hd.is_file() {
return Some(hd);
}
let thumb = img_dir.join(format!("{}_t.dat", file_md5));
if thumb.is_file() {
return Some(thumb);
}
None
}
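fn three_month_candidates(unix_ts: i64) -> Vec<String> {
use chrono::Datelike;
let dt = match chrono::Local.timestamp_opt(unix_ts, 0).single() {
Some(d) => d,
None => return Vec::new(),
};
// Step by calendar month instead of ±31 days: Mar 1 minus 31 days lands in January
// and Jan 31 plus 31 days lands in March, both of which would skip February.
let (year, month) = (dt.year(), dt.month() as i32);
[-1, 0, 1]
.iter()
.map(|delta| {
let idx = year * 12 + (month - 1) + delta;
format!("{:04}-{:02}", idx.div_euclid(12), idx.rem_euclid(12) + 1)
})
.collect()
}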
/// Joins `<wxchat_base>` (the parent directory of `db_storage`) into `<base>/msg/attach`.
pub fn attach_root_for(wxchat_base: &Path) -> PathBuf {
wxchat_base.join("msg").join("attach")
}
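/// Full flow: use the `attachment_id` to get the md5, then find the .dat. On failure returns an `Err` with specific diagnostics.
///
/// `resource_db_path` is provided by the daemon (already decrypted by DBCache);
/// `attach_root` is assembled by the caller (`attach_root_for(wxchat_base)`).
/// Synchronous function; the caller runs it inside `spawn_blocking`.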
pub fn resolve_blocking(
id: &AttachmentId,
resource_db_path: &Path,
attach_root: &Path,
) -> Result<ResolvedAttachment> {
let lo32_type: i64 = match id.kind {
super::AttachmentKind::Image => 3,
super::AttachmentKind::Voice => 34,
super::AttachmentKind::Video => 43,
super::AttachmentKind::File => 49,
};
let meta = lookup_md5_blocking(
resource_db_path,
&id.chat,
id.local_id,
id.create_time,
lo32_type,
)?
.ok_or_else(|| {
anyhow!(
"no resource row in message_resource.db for chat={} local_id={} type={} (possibly a non-attachment message, or the resource DB is not yet synced)",
id.chat,
id.local_id,
lo32_type
)
})?;
let dat_path = find_dat_file(attach_root, &id.chat, &meta.md5, id.create_time).ok_or_else(
|| {
anyhow!(
"local .dat not found (md5={} chat={} create_time={}); WeChat may not have downloaded the attachment yet, or it has been cleaned up",
meta.md5,
id.chat,
id.create_time
)
},
)?;
let size = std::fs::metadata(&dat_path).map(|m| m.len()).unwrap_or(0);
Ok(ResolvedAttachment { id: id.clone(), md5: meta.md5, dat_path, size })
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn extract_md5_main_path() {
// Build a blob containing the 12 22 0a 20 marker
let mut blob = vec![0xAA, 0xBB, 0xCC];
blob.extend_from_slice(&[0x12, 0x22, 0x0A, 0x20]);
blob.extend_from_slice(b"deadbeefcafebabe1234567890abcdef");
blob.extend_from_slice(&[0xFF, 0xFF]);
assert_eq!(
extract_md5_from_packed_info(&blob),
Some("deadbeefcafebabe1234567890abcdef".to_string())
);
}
#[test]
fn extract_md5_fallback_no_marker() {
// No marker, but the blob contains a valid 32-byte hex run
let mut blob = vec![0xFF, 0x00];
blob.extend_from_slice(b"00112233445566778899aabbccddeeff");
blob.extend_from_slice(&[0x01]);
assert_eq!(
extract_md5_from_packed_info(&blob),
Some("00112233445566778899aabbccddeeff".to_string())
);
}
#[test]
fn extract_md5_uppercase_normalized_to_lower() {
let mut blob = vec![0x12, 0x22, 0x0A, 0x20];
blob.extend_from_slice(b"DEADBEEFCAFEBABE1234567890ABCDEF");
// upstream / CI / local file md5s are all lowercase; force lowercase so a case mismatch cannot cause a miss
assert_eq!(
extract_md5_from_packed_info(&blob),
Some("deadbeefcafebabe1234567890abcdef".to_string())
);
}
#[test]
fn extract_md5_returns_none_on_garbage() {
let blob = vec![0; 16];
assert!(extract_md5_from_packed_info(&blob).is_none());
}
#[test]
fn lookup_md5_prefers_exact_create_time_over_latest_reuse() {
let dir = tempdir_for_test();
let db_path = dir.join("message_resource.db");
let conn = Connection::open(&db_path).unwrap();
conn.execute(
"CREATE TABLE ChatName2Id (user_name TEXT)",
[],
)
.unwrap();
conn.execute(
"INSERT INTO ChatName2Id (rowid, user_name) VALUES (1, 'room@chatroom')",
[],
)
.unwrap();
conn.execute(
"CREATE TABLE MessageResourceInfo (
chat_id INTEGER,
message_local_id INTEGER,
message_local_type INTEGER,
message_create_time INTEGER,
packed_info BLOB
)",
[],
)
.unwrap();
let old_blob = {
let mut blob = vec![0x12, 0x22, 0x0A, 0x20];
blob.extend_from_slice(b"11111111111111111111111111111111");
blob
};
let new_blob = {
let mut blob = vec![0x12, 0x22, 0x0A, 0x20];
blob.extend_from_slice(b"22222222222222222222222222222222");
blob
};
conn.execute(
"INSERT INTO MessageResourceInfo
(chat_id, message_local_id, message_local_type, message_create_time, packed_info)
VALUES (?1, ?2, ?3, ?4, ?5)",
rusqlite::params![1i64, 7i64, 3i64, 1000i64, old_blob],
)
.unwrap();
conn.execute(
"INSERT INTO MessageResourceInfo
(chat_id, message_local_id, message_local_type, message_create_time, packed_info)
VALUES (?1, ?2, ?3, ?4, ?5)",
rusqlite::params![1i64, 7i64, 3i64, 2000i64, new_blob],
)
.unwrap();
let old = lookup_md5_blocking(&db_path, "room@chatroom", 7, 1000, 3)
.unwrap()
.unwrap();
let new = lookup_md5_blocking(&db_path, "room@chatroom", 7, 2000, 3)
.unwrap()
.unwrap();
assert_eq!(old.md5, "11111111111111111111111111111111");
assert_eq!(new.md5, "22222222222222222222222222222222");
}
#[test]
fn three_month_candidates_includes_prev_curr_next() {
// 2025-08-15 (mid-month) → 2025-07, 2025-08, 2025-09
let ts = chrono::Local
.with_ymd_and_hms(2025, 8, 15, 12, 0, 0)
.unwrap()
.timestamp();
let v = three_month_candidates(ts);
assert!(v.contains(&"2025-07".to_string()));
assert!(v.contains(&"2025-08".to_string()));
assert!(v.contains(&"2025-09".to_string()));
}
#[test]
fn pick_best_prefers_full_then_h_then_t() {
let tmp = tempdir_for_test();
let img = tmp.join("Img");
std::fs::create_dir_all(&img).unwrap();
let md5 = "abcd1234";
std::fs::write(img.join(format!("{}_t.dat", md5)), b"thumb").unwrap();
std::fs::write(img.join(format!("{}_h.dat", md5)), b"hd").unwrap();
// with only _t / _h present, pick _h
assert_eq!(
pick_best_in_img_dir(&img, md5).unwrap().file_name().unwrap(),
format!("{}_h.dat", md5).as_str()
);
// after adding full, pick full
std::fs::write(img.join(format!("{}.dat", md5)), b"full").unwrap();
assert_eq!(
pick_best_in_img_dir(&img, md5).unwrap().file_name().unwrap(),
format!("{}.dat", md5).as_str()
);
}
fn tempdir_for_test() -> PathBuf {
let pid = std::process::id();
let nanos = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap()
.as_nanos();
let p = std::env::temp_dir().join(format!("wx-cli-attach-test-{}-{}", pid, nanos));
std::fs::create_dir_all(&p).unwrap();
p
}
}
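
The directory layout that `find_dat_file` and `pick_best_in_img_dir` probe can be summarised as a list of candidate paths per month directory, best first. A minimal sketch under stated assumptions: `candidate_paths` is a hypothetical helper, and `chat_hash` stands in for the hex md5 of the chat username so that the example needs no external crate:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: candidate .dat paths for one `<YYYY-MM>` directory,
/// ordered full > `_h` > `_t` as in the resolver.
fn candidate_paths(attach_root: &Path, chat_hash: &str, ym: &str, file_md5: &str) -> Vec<PathBuf> {
    let img_dir = attach_root.join(chat_hash).join(ym).join("Img");
    ["", "_h", "_t"]
        .iter()
        .map(|suffix| img_dir.join(format!("{}{}.dat", file_md5, suffix)))
        .collect()
}
```

The resolver returns the first of these that exists on disk, first for the three months around `create_time` and then for every month directory under the chat.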

@ -0,0 +1,41 @@
use anyhow::Result;
use super::history::{parse_time, parse_time_end};
use super::output::{emit_warnings, print_response, OutputOpts};
use super::transport;
use crate::ipc::Request;
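/// `wx attachments`: lists the attachment messages of a given chat (image by default; multiple kinds can be selected)
///
/// Prints an `attachment_id` per row; only when that is passed to `wx extract` are message_resource.db
/// and the local .dat actually read and decoded. This step only queries the `Msg_<chat>` table, so even a group chat with thousands of messages returns almost instantly.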
pub fn cmd_attachments(
chat: String,
kinds: Vec<String>,
limit: usize,
offset: usize,
since: Option<String>,
until: Option<String>,
opts: OutputOpts,
) -> Result<()> {
let since_ts = since.as_deref().map(parse_time).transpose()?;
let until_ts = until.as_deref().map(parse_time_end).transpose()?;
let (with_meta, debug_source) = opts.request_flags();
// When the Vec<String> from the CLI is empty, fall back to the default (image) and let the daemon decide the fallback.
let kinds_param = if kinds.is_empty() { None } else { Some(kinds) };
let req = Request::Attachments {
chat,
kinds: kinds_param,
limit,
offset,
since: since_ts,
until: until_ts,
with_meta,
debug_source,
};
let resp = transport::send(req)?;
emit_warnings(&resp.data);
print_response(&resp.data, &opts)
}
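
`cmd_attachments` turns the optional `--since` / `--until` strings into optional timestamps with `as_deref().map(..).transpose()?`, which converts an `Option<Result<T, E>>` into a `Result<Option<T>, E>` so a missing flag stays `None` while a malformed one becomes an error. A self-contained sketch of that pattern; `parse_ts` is a stand-in that only parses integers, since the real `parse_time` / `parse_time_end` live in `super::history` and are not shown here:

```rust
use std::num::ParseIntError;

/// Stand-in parser (assumption): the real code calls `parse_time`.
fn parse_ts(raw: &str) -> Result<i64, ParseIntError> {
    raw.parse::<i64>()
}

/// Absent input stays `None`; present input is parsed and errors propagate.
fn optional_ts(raw: Option<String>) -> Result<Option<i64>, ParseIntError> {
    raw.as_deref().map(parse_ts).transpose()
}
```

The same shape applies to `kinds`: an empty list is mapped to `None` so that the daemon, not the CLI, chooses the default.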

@ -0,0 +1,30 @@
use anyhow::Result;
use crate::ipc::Request;
use super::history::{parse_time, parse_time_end};
use super::transport;
use super::output::{resolve, print_value};
pub fn cmd_biz_articles(
limit: usize,
account: Option<String>,
since: Option<String>,
until: Option<String>,
unread: bool,
json: bool,
) -> Result<()> {
let since_ts = since.as_deref().map(parse_time).transpose()?;
let until_ts = until.as_deref().map(parse_time_end).transpose()?;
let req = Request::BizArticles {
limit,
account,
since: since_ts,
until: until_ts,
unread,
};
let resp = transport::send(req)?;
let data = resp.data.get("articles")
.cloned()
.unwrap_or(serde_json::Value::Array(vec![]));
print_value(&data, &resolve(json))
}

@ -1,7 +1,7 @@
use anyhow::Result;
use crate::config;
use crate::cli::DaemonCommands;
use crate::cli::transport; use crate::cli::transport;
use crate::cli::DaemonCommands;
use crate::config;
use anyhow::Result;
pub fn cmd_daemon(cmd: DaemonCommands) -> Result<()> { pub fn cmd_daemon(cmd: DaemonCommands) -> Result<()> {
match cmd { match cmd {
@ -15,7 +15,13 @@ fn cmd_status() -> Result<()> {
if transport::is_alive() { if transport::is_alive() {
let pid_path = config::pid_path(); let pid_path = config::pid_path();
let pid = std::fs::read_to_string(&pid_path) let pid = std::fs::read_to_string(&pid_path)
.map(|s| s.trim().to_string()) .map(|s| {
serde_json::from_str::<serde_json::Value>(&s)
.ok()
.and_then(|v| v.get("pid").and_then(|p| p.as_u64()))
.map(|pid| pid.to_string())
.unwrap_or_else(|| s.trim().to_string())
})
.unwrap_or_else(|_| "?".into()); .unwrap_or_else(|_| "?".into());
println!("wx-daemon is running (PID {})", pid); println!("wx-daemon is running (PID {})", pid);
} else { } else {
@ -25,42 +31,13 @@ fn cmd_status() -> Result<()> {
} }
fn cmd_stop() -> Result<()> { fn cmd_stop() -> Result<()> {
let pid_path = config::pid_path(); if !transport::is_alive() {
if !pid_path.exists() {
println!("daemon is not running"); println!("daemon is not running");
return Ok(()); return Ok(());
} }
let pid_str = std::fs::read_to_string(&pid_path)?; transport::stop_daemon()?;
let pid: u32 = pid_str.trim().parse() println!("wx-daemon stopped");
.map_err(|_| anyhow::anyhow!("malformed PID file"))?;
#[cfg(unix)]
{
let ret = unsafe { libc::kill(pid as libc::pid_t, libc::SIGTERM) };
if ret != 0 {
let errno = unsafe { *libc::__error() };
if errno == libc::ESRCH {
println!("wx-daemon (PID {}) is no longer running; cleaning up leftover files", pid);
} else {
anyhow::bail!("failed to send SIGTERM (errno {})", errno);
}
} else {
println!("wx-daemon stopped (PID {})", pid);
}
}
#[cfg(windows)]
{
std::process::Command::new("taskkill")
.args(["/PID", &pid.to_string(), "/F"])
.output()?;
println!("wx-daemon stopped (PID {})", pid);
}
let _ = std::fs::remove_file(config::sock_path());
let _ = std::fs::remove_file(&pid_path);
Ok(()) Ok(())
} }
@ -89,19 +66,25 @@ fn cmd_logs(follow: bool, lines: usize) -> Result<()> {
file.read_to_string(&mut content)?; file.read_to_string(&mut content)?;
let all_lines: Vec<&str> = content.lines().collect(); let all_lines: Vec<&str> = content.lines().collect();
let show = &all_lines[all_lines.len().saturating_sub(lines)..]; let show = &all_lines[all_lines.len().saturating_sub(lines)..];
for line in show { println!("{}", line); } for line in show {
println!("{}", line);
}
loop { loop {
std::thread::sleep(std::time::Duration::from_millis(500)); std::thread::sleep(std::time::Duration::from_millis(500));
let mut buf = String::new(); let mut buf = String::new();
file.read_to_string(&mut buf)?; file.read_to_string(&mut buf)?;
if !buf.is_empty() { print!("{}", buf); } if !buf.is_empty() {
print!("{}", buf);
}
} }
} }
} else { } else {
let content = std::fs::read_to_string(&log_path)?; let content = std::fs::read_to_string(&log_path)?;
let all_lines: Vec<&str> = content.lines().collect(); let all_lines: Vec<&str> = content.lines().collect();
let show = &all_lines[all_lines.len().saturating_sub(lines)..]; let show = &all_lines[all_lines.len().saturating_sub(lines)..];
for line in show { println!("{}", line); } for line in show {
println!("{}", line);
}
} }
Ok(()) Ok(())

@ -1,7 +1,8 @@
use anyhow::Result;
use crate::ipc::Request;
use super::transport;
use super::history::{parse_time, parse_time_end}; use super::history::{parse_time, parse_time_end};
use super::output::{emit_warnings, warning_block_markdown, warning_block_text, OutputOpts};
use super::transport;
use crate::ipc::Request;
use anyhow::Result;
pub fn cmd_export( pub fn cmd_export(
chat: String, chat: String,
@ -10,9 +11,11 @@ pub fn cmd_export(
limit: usize, limit: usize,
format: String, format: String,
output: Option<String>, output: Option<String>,
opts: OutputOpts,
) -> Result<()> { ) -> Result<()> {
let since_ts = since.as_deref().map(parse_time).transpose()?; let since_ts = since.as_deref().map(parse_time).transpose()?;
let until_ts = until.as_deref().map(parse_time_end).transpose()?; let until_ts = until.as_deref().map(parse_time_end).transpose()?;
let (with_meta, debug_source) = opts.request_flags();
let req = Request::History { let req = Request::History {
chat, chat,
@ -21,24 +24,42 @@ pub fn cmd_export(
since: since_ts, since: since_ts,
until: until_ts, until: until_ts,
msg_type: None, msg_type: None,
with_meta,
debug_source,
}; };
let resp = transport::send(req)?; let resp = transport::send(req)?;
let messages = resp.data["messages"].as_array().cloned().unwrap_or_default(); emit_warnings(&resp.data);
let messages = resp.data["messages"]
.as_array()
.cloned()
.unwrap_or_default();
let chat_name = resp.data["chat"].as_str().unwrap_or("").to_string(); let chat_name = resp.data["chat"].as_str().unwrap_or("").to_string();
let is_group = resp.data["is_group"].as_bool().unwrap_or(false); let is_group = resp.data["is_group"].as_bool().unwrap_or(false);
let count = messages.len(); let count = messages.len();
let text = match format.as_str() { let text = match format.as_str() {
"json" => serde_json::to_string_pretty(&resp.data)?, "json" => serde_json::to_string_pretty(&resp.data)?,
"yaml" => serde_yaml::to_string(&resp.data)?,
"txt" => { "txt" => {
let group_str = if is_group { "[群]" } else { "" }; let group_str = if is_group { "[群]" } else { "" };
let mut lines = vec![format!("=== {}{} ({} 条) ===\n", chat_name, group_str, count)]; let mut lines = vec![format!(
"=== {}{} ({} 条) ===\n",
chat_name, group_str, count
)];
if let Some(warn) = warning_block_text(&resp.data) {
lines.push(warn);
lines.push(String::new());
}
for m in &messages { for m in &messages {
let time = m["time"].as_str().unwrap_or(""); let time = m["time"].as_str().unwrap_or("");
let sender = m["sender"].as_str().unwrap_or(""); let sender = m["sender"].as_str().unwrap_or("");
let content = m["content"].as_str().unwrap_or(""); let content = m["content"].as_str().unwrap_or("");
let sender_str = if !sender.is_empty() { format!("{}: ", sender) } else { String::new() }; let sender_str = if !sender.is_empty() {
format!("{}: ", sender)
} else {
String::new()
};
lines.push(format!("[{}] {}{}", time, sender_str, content)); lines.push(format!("[{}] {}{}", time, sender_str, content));
} }
lines.join("\n") lines.join("\n")
@ -50,11 +71,18 @@ pub fn cmd_export(
format!("# {}{}", chat_name, group_str), format!("# {}{}", chat_name, group_str),
format!("\n> 导出 {} 条消息\n", count), format!("\n> 导出 {} 条消息\n", count),
]; ];
if let Some(warn) = warning_block_markdown(&resp.data) {
lines.push(warn);
}
for m in &messages { for m in &messages {
let time = m["time"].as_str().unwrap_or(""); let time = m["time"].as_str().unwrap_or("");
let sender = m["sender"].as_str().unwrap_or(""); let sender = m["sender"].as_str().unwrap_or("");
let content = m["content"].as_str().unwrap_or("").replace('\n', "\n> "); let content = m["content"].as_str().unwrap_or("").replace('\n', "\n> ");
let sender_md = if !sender.is_empty() { format!("**{}**: ", sender) } else { String::new() }; let sender_md = if !sender.is_empty() {
format!("**{}**: ", sender)
} else {
String::new()
};
lines.push(format!("### {}\n\n{}{}\n", time, sender_md, content)); lines.push(format!("### {}\n\n{}{}\n", time, sender_md, content));
} }
lines.join("\n") lines.join("\n")
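
A minimal sketch of the per-message formatting used by the `txt` and Markdown branches of `cmd_export`, pulled out into two standalone functions so the behavior is easy to check. The names `format_txt_line` and `format_md_entry` are hypothetical (the diff inlines this logic in the loops), and plain `&str` arguments stand in for the `serde_json` field lookups.

```rust
/// Hypothetical helper: one line of the `txt` export.
/// The sender prefix is omitted entirely when the sender is empty.
#[allow(dead_code)]
fn format_txt_line(time: &str, sender: &str, content: &str) -> String {
    let sender_str = if !sender.is_empty() {
        format!("{}: ", sender)
    } else {
        String::new()
    };
    format!("[{}] {}{}", time, sender_str, content)
}

/// Hypothetical helper: one entry of the Markdown export.
/// Newlines inside the content are rewritten to "\n> ", so continuation
/// lines render as a blockquote under the first line.
#[allow(dead_code)]
fn format_md_entry(time: &str, sender: &str, content: &str) -> String {
    let content = content.replace('\n', "\n> ");
    let sender_md = if !sender.is_empty() {
        format!("**{}**: ", sender)
    } else {
        String::new()
    };
    format!("### {}\n\n{}{}\n", time, sender_md, content)
}
```

Note that only continuation lines get the `> ` prefix; the first line of a message stays outside the blockquote, exactly as in the diff.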

src/cli/extract.rs (new file, mode 100644, 25 lines)

@ -0,0 +1,25 @@
use anyhow::Result;
use crate::ipc::Request;
use super::output::{print_value, resolve};
use super::transport;
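/// `wx extract`: decrypt the resource behind a single `attachment_id` and write it to the given path.
///
/// On the daemon side: parse `attachment_id` → look up the file md5 in `message_resource.db` →
/// locate the .dat under `<wxchat_base>/msg/attach/...` → dispatch to the v1/v2 decoder by magic bytes →
/// write out the real image/file.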
pub fn cmd_extract(
attachment_id: String,
output: String,
overwrite: bool,
json: bool,
) -> Result<()> {
let req = Request::Extract {
attachment_id,
output,
overwrite,
};
let resp = transport::send(req)?;
print_value(&resp.data, &resolve(json))
}

@ -1,7 +1,7 @@
use anyhow::Result; use super::output::{emit_warnings, print_response, OutputOpts};
use crate::ipc::Request;
use super::transport; use super::transport;
use super::output::{resolve, print_value}; use crate::ipc::Request;
use anyhow::Result;
pub fn cmd_history( pub fn cmd_history(
chat: String, chat: String,
@ -10,37 +10,51 @@ pub fn cmd_history(
since: Option<String>, since: Option<String>,
until: Option<String>, until: Option<String>,
msg_type: Option<String>, msg_type: Option<String>,
json: bool, opts: OutputOpts,
) -> Result<()> { ) -> Result<()> {
let since_ts = since.as_deref().map(parse_time).transpose()?; let since_ts = since.as_deref().map(parse_time).transpose()?;
let until_ts = until.as_deref().map(parse_time_end).transpose()?; let until_ts = until.as_deref().map(parse_time_end).transpose()?;
let type_val = msg_type.as_deref().and_then(parse_msg_type); let type_val = msg_type.as_deref().and_then(parse_msg_type);
let (with_meta, debug_source) = opts.request_flags();
let req = Request::History { chat, limit, offset, since: since_ts, until: until_ts, msg_type: type_val }; let req = Request::History {
chat,
limit,
offset,
since: since_ts,
until: until_ts,
msg_type: type_val,
with_meta,
debug_source,
};
let resp = transport::send(req)?; let resp = transport::send(req)?;
emit_warnings(&resp.data);
let msgs = resp.data.get("messages") print_response(&resp.data, &opts)
.cloned()
.unwrap_or(serde_json::Value::Array(vec![]));
print_value(&msgs, &resolve(json))
} }
pub fn parse_time(s: &str) -> Result<i64> { pub fn parse_time(s: &str) -> Result<i64> {
use chrono::{Local, TimeZone}; use chrono::{Local, TimeZone};
for fmt in &["%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M"] { for fmt in &["%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M"] {
if let Ok(dt) = chrono::NaiveDateTime::parse_from_str(s, fmt) { if let Ok(dt) = chrono::NaiveDateTime::parse_from_str(s, fmt) {
return Local.from_local_datetime(&dt).single() return Local
.from_local_datetime(&dt)
.single()
.map(|d| d.timestamp()) .map(|d| d.timestamp())
.ok_or_else(|| anyhow::anyhow!("本地时间歧义: {}", s)); .ok_or_else(|| anyhow::anyhow!("本地时间歧义: {}", s));
} }
} }
if let Ok(d) = chrono::NaiveDate::parse_from_str(s, "%Y-%m-%d") { if let Ok(d) = chrono::NaiveDate::parse_from_str(s, "%Y-%m-%d") {
let dt = d.and_hms_opt(0, 0, 0).unwrap(); let dt = d.and_hms_opt(0, 0, 0).unwrap();
return Local.from_local_datetime(&dt).single() return Local
.from_local_datetime(&dt)
.single()
.map(|d| d.timestamp()) .map(|d| d.timestamp())
.ok_or_else(|| anyhow::anyhow!("本地时间歧义: {}", s)); .ok_or_else(|| anyhow::anyhow!("本地时间歧义: {}", s));
} }
anyhow::bail!("无法解析时间 '{}',支持 YYYY-MM-DD / YYYY-MM-DD HH:MM / YYYY-MM-DD HH:MM:SS", s) anyhow::bail!(
"无法解析时间 '{}',支持 YYYY-MM-DD / YYYY-MM-DD HH:MM / YYYY-MM-DD HH:MM:SS",
s
)
} }
pub fn parse_time_end(s: &str) -> Result<i64> { pub fn parse_time_end(s: &str) -> Result<i64> {
@ -48,7 +62,9 @@ pub fn parse_time_end(s: &str) -> Result<i64> {
if s.len() == 10 { if s.len() == 10 {
if let Ok(d) = chrono::NaiveDate::parse_from_str(s, "%Y-%m-%d") { if let Ok(d) = chrono::NaiveDate::parse_from_str(s, "%Y-%m-%d") {
let dt = d.and_hms_opt(23, 59, 59).unwrap(); let dt = d.and_hms_opt(23, 59, 59).unwrap();
return Local.from_local_datetime(&dt).single() return Local
.from_local_datetime(&dt)
.single()
.map(|d| d.timestamp()) .map(|d| d.timestamp())
.ok_or_else(|| anyhow::anyhow!("本地时间歧义: {}", s)); .ok_or_else(|| anyhow::anyhow!("本地时间歧义: {}", s));
} }
@ -59,15 +75,15 @@ pub fn parse_time_end(s: &str) -> Result<i64> {
/// 将消息类型字符串转为 local_type 整数,未知类型返回 None /// 将消息类型字符串转为 local_type 整数,未知类型返回 None
pub fn parse_msg_type(s: &str) -> Option<i64> { pub fn parse_msg_type(s: &str) -> Option<i64> {
match s { match s {
"text" => Some(1), "text" => Some(1),
"image" => Some(3), "image" => Some(3),
"voice" => Some(34), "voice" => Some(34),
"video" => Some(43), "video" => Some(43),
"sticker" => Some(47), "sticker" => Some(47),
"location" => Some(48), "location" => Some(48),
"link" | "file" => Some(49), "link" | "file" => Some(49),
"call" => Some(50), "call" => Some(50),
"system" => Some(10000), "system" => Some(10000),
_ => None, _ => None,
} }
} }
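
The type-name mapping above can be exercised on its own. The sketch below copies `parse_msg_type` from the diff and adds a hypothetical wrapper, `resolve_type_filter`, that mirrors the `msg_type.as_deref().and_then(parse_msg_type)` expression in `cmd_history`. It shows that an unrecognized type name collapses to `None`, which, judging by how the value is passed along, looks the same to the request as not passing a type filter at all.

```rust
/// Copy of the mapping shown in the diff: type name -> local_type integer.
#[allow(dead_code)]
fn parse_msg_type(s: &str) -> Option<i64> {
    match s {
        "text" => Some(1),
        "image" => Some(3),
        "voice" => Some(34),
        "video" => Some(43),
        "sticker" => Some(47),
        "location" => Some(48),
        "link" | "file" => Some(49),
        "call" => Some(50),
        "system" => Some(10000),
        _ => None,
    }
}

/// Hypothetical wrapper mirroring the expression used in `cmd_history`:
/// a missing flag and an unknown type name both yield `None`.
#[allow(dead_code)]
fn resolve_type_filter(msg_type: Option<&str>) -> Option<i64> {
    msg_type.and_then(parse_msg_type)
}
```

`link` and `file` share `local_type` 49, so the two names cannot be told apart by this filter alone.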

@ -35,14 +35,32 @@ pub fn cmd_init(force: bool) -> Result<()> {
// Step 1: 检测 db_dir // Step 1: 检测 db_dir
println!("检测微信数据目录..."); println!("检测微信数据目录...");
let db_dir = config::auto_detect_db_dir() let db_dir = config::auto_detect_db_dir().with_context(|| format!(
.context("未能自动检测到微信数据目录\n请手动编辑 config.json 中的 db_dir 字段")?; "未能自动检测到微信数据目录\n\
db_dir :\n \
{}\n\
db_dir : <data_root>\\xwechat_files\\<wxid>\\db_storage",
config_path.display()
))?;
println!("找到数据目录: {}", db_dir.display()); println!("找到数据目录: {}", db_dir.display());
// Step 2: 扫描密钥(需要 root/sudo // Step 2: 扫描密钥(需要 root/sudo
println!("扫描加密密钥(需要 root 权限)..."); println!("扫描加密密钥(需要 root 权限)...");
let entries = scanner::scan_keys(&db_dir)?; let entries = scanner::scan_keys(&db_dir)?;
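// === Privilege boundary ===
// Drop to the calling user's identity as soon as scanning finishes, so every later file write is owned by that user.
// Only then can the future daemon (forked by `wx sessions` as the user) write its
// socket/log/pid into ~/.wx-cli/.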
#[cfg(unix)]
drop_privileges_if_sudo()?;
// Make sure the parent directory (e.g. ~/.wx-cli/) exists; this must happen before any write
if let Some(parent) = config_path.parent() {
std::fs::create_dir_all(parent)
.with_context(|| format!("创建目录失败: {}", parent.display()))?;
}
// Step 3: 保存 all_keys.json // Step 3: 保存 all_keys.json
let keys_file_path = config_path.parent() let keys_file_path = config_path.parent()
.unwrap_or(std::path::Path::new(".")) .unwrap_or(std::path::Path::new("."))
@ -75,16 +93,110 @@ pub fn cmd_init(force: bool) -> Result<()> {
cfg.entry("keys_file".into()).or_insert_with(|| json!("all_keys.json")); cfg.entry("keys_file".into()).or_insert_with(|| json!("all_keys.json"));
cfg.entry("decrypted_dir".into()).or_insert_with(|| json!("decrypted")); cfg.entry("decrypted_dir".into()).or_insert_with(|| json!("decrypted"));
// 确保父目录存在(如 ~/.wx-cli/
if let Some(parent) = config_path.parent() {
std::fs::create_dir_all(parent)
.with_context(|| format!("创建目录失败: {}", parent.display()))?;
}
std::fs::write(&config_path, serde_json::to_string_pretty(&cfg)?) std::fs::write(&config_path, serde_json::to_string_pretty(&cfg)?)
.context("写入 config.json 失败")?; .context("写入 config.json 失败")?;
println!("配置已保存: {}", config_path.display()); println!("配置已保存: {}", config_path.display());
// The old daemon must be stopped after init (it is still using the old config); the next call restarts it automatically
let _ = crate::cli::transport::stop_daemon();
println!("初始化完成,可以使用 wx sessions / wx history 等命令了"); println!("初始化完成,可以使用 wx sessions / wx history 等命令了");
#[cfg(target_os = "macos")]
{
eprintln!();
eprintln!("[macOS] 副作用提示:");
eprintln!(" 如果你是通过对 /Applications/WeChat.app 做 ad-hoc 重签来让 init 走通的,");
eprintln!(" 之后 macOS 可能弹 \"微信\" 想访问其他 App 的数据(在微信里打开公众号文章");
eprintln!(" 时尤其常见)。这是 ad-hoc 重签后 WeChat 的 code identity 变了导致的,");
eprintln!(" 不是 wx-cli 在读其他 App 数据。");
eprintln!(" 完整说明https://github.com/jackwener/wx-cli/blob/main/docs/macos-permission-guide.md#六微信-想访问其他-app-的数据-弹窗");
eprintln!(" (如果你的 WeChat 仍是 Apple 官方签名、init 是靠 GUI Terminal + 开发者工具");
eprintln!(" 授权走通的,则不会出现这个弹窗,可以忽略本提示。)");
}
Ok(())
}
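/// If running as root and launched via sudo, drop to the calling user's identity
/// and migrate any root-owned `~/.wx-cli/` left behind by older versions.
///
/// Only this process is affected; the daemon (forked later) inherits the calling user's identity.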
#[cfg(unix)]
fn drop_privileges_if_sudo() -> Result<()> {
use std::ffi::CString;
use std::os::unix::ffi::OsStrExt;
use std::path::Path;
// Not currently root (the user ran `wx init` directly as non-root): nothing to do
if unsafe { libc::geteuid() } != 0 {
return Ok(());
}
let sudo_uid: Option<u32> = std::env::var("SUDO_UID").ok().and_then(|s| s.parse().ok());
let sudo_gid: Option<u32> = std::env::var("SUDO_GID").ok().and_then(|s| s.parse().ok());
let (uid, gid) = match (sudo_uid, sudo_gid) {
(Some(u), Some(g)) if u != 0 => (u, g),
// Logged in directly as root (not via sudo): there is no calling user to restore, so stay root
_ => return Ok(()),
};
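// Migrate leftovers from older versions: if ~/.wx-cli/ already exists and is owned by root, chown it back to the calling user,
// and tighten the raw key files to 0600 along the way (older versions defaulted to 0644, world-readable, which amounts to a leak).
// Both must happen before setuid: chown requires root, and chmod can only be done by the owner or root.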
let cli_dir = config::cli_dir();
if cli_dir.exists() {
let _ = chown_recursive(&cli_dir, uid, gid);
let _ = tighten_perms(&cli_dir);
}
// Set the umask so that files/directories created afterwards default to 0600 / 0700.
unsafe { libc::umask(0o077); }
// setgid must come before setuid: once the uid is dropped, the gid can no longer be changed.
unsafe {
if libc::setgid(gid) != 0 {
anyhow::bail!("setgid({}) 失败: {}", gid, std::io::Error::last_os_error());
}
if libc::setuid(uid) != 0 {
anyhow::bail!("setuid({}) 失败: {}", uid, std::io::Error::last_os_error());
}
}
// Recursive chown implementation
fn chown_recursive(path: &Path, uid: u32, gid: u32) -> std::io::Result<()> {
chown_one(path, uid, gid)?;
let md = std::fs::symlink_metadata(path)?;
if md.is_dir() {
for entry in std::fs::read_dir(path)? {
chown_recursive(&entry?.path(), uid, gid)?;
}
}
Ok(())
}
fn chown_one(path: &Path, uid: u32, gid: u32) -> std::io::Result<()> {
let c = CString::new(path.as_os_str().as_bytes())
.map_err(|_| std::io::Error::new(std::io::ErrorKind::InvalidInput, "path contains NUL"))?;
if unsafe { libc::chown(c.as_ptr(), uid, gid) } != 0 {
return Err(std::io::Error::last_os_error());
}
Ok(())
}
/// Tighten the directory to 0700 and every *.json file (including raw-key files such as all_keys.json) to 0600.
fn tighten_perms(cli_dir: &Path) -> std::io::Result<()> {
use std::os::unix::fs::PermissionsExt;
std::fs::set_permissions(cli_dir, std::fs::Permissions::from_mode(0o700))?;
for entry in std::fs::read_dir(cli_dir)? {
let entry = entry?;
let path = entry.path();
if path.extension().and_then(|s| s.to_str()) == Some("json") {
let _ = std::fs::set_permissions(&path, std::fs::Permissions::from_mode(0o600));
}
}
Ok(())
}
Ok(()) Ok(())
} }
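
The decision at the top of `drop_privileges_if_sudo` (whether to drop privileges at all, and to which ids) can be sketched as a pure function. This is an illustration only: `sudo_target` is a hypothetical name, the effective uid and the `SUDO_UID` / `SUDO_GID` values are passed in as arguments instead of being read from `libc::geteuid()` and the environment, and no `setgid` / `setuid` call is made.

```rust
/// Hypothetical pure helper: returns the (uid, gid) to drop to, or `None`
/// when the process should keep its current identity.
#[allow(dead_code)]
fn sudo_target(euid: u32, sudo_uid: Option<&str>, sudo_gid: Option<&str>) -> Option<(u32, u32)> {
    // Not root: `wx init` was run directly as a normal user, nothing to drop.
    if euid != 0 {
        return None;
    }
    // Unparsable or missing values are treated the same as "not set".
    let uid: Option<u32> = sudo_uid.and_then(|s| s.parse().ok());
    let gid: Option<u32> = sudo_gid.and_then(|s| s.parse().ok());
    match (uid, gid) {
        // Launched through sudo by a non-root user: restore that user.
        (Some(u), Some(g)) if u != 0 => Some((u, g)),
        // Direct root login, or sudo invoked by root: stay root.
        _ => None,
    }
}
```

In the diff, a `Some((uid, gid))` result corresponds to the path that chowns `~/.wx-cli/`, sets the umask, and then calls `setgid` before `setuid`.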

@ -1,25 +1,38 @@
mod init; pub mod attachments;
pub mod sessions; pub mod biz_articles;
pub mod history;
pub mod search;
pub mod contacts; pub mod contacts;
pub mod export;
pub mod daemon_cmd; pub mod daemon_cmd;
pub mod transport; pub mod export;
pub mod output; pub mod extract;
pub mod unread; pub mod favorites;
pub mod history;
mod init;
pub mod members; pub mod members;
pub mod new_messages; pub mod new_messages;
pub mod output;
pub mod search;
pub mod sessions;
pub mod sns_feed;
pub mod sns_notifications;
pub mod sns_search;
pub mod stats; pub mod stats;
pub mod favorites; pub mod transport;
pub mod unread;
use self::output::OutputOpts;
use anyhow::Result; use anyhow::Result;
use clap::{Parser, Subcommand}; use clap::{Parser, Subcommand};
/// wx — 微信本地数据 CLI /// wx — 微信本地数据 CLI
#[derive(Parser)] #[derive(Parser)]
#[command(name = "wx", version = "0.1.0", about = "wx — 微信本地数据 CLI")] #[command(name = "wx", version = env!("CARGO_PKG_VERSION"), about = "wx — 微信本地数据 CLI")]
pub struct Cli { pub struct Cli {
/// Return heavier freshness/source metadata (e.g. per-shard latest, cache modes)
#[arg(long, global = true)]
with_meta: bool,
/// Expose real shard paths in meta (for debugging)
#[arg(long, global = true, hide = true)]
debug_source: bool,
#[command(subcommand)] #[command(subcommand)]
command: Commands, command: Commands,
} }
@ -126,6 +139,10 @@ enum Commands {
/// 显示数量 /// 显示数量
#[arg(short = 'n', long, default_value = "20")] #[arg(short = 'n', long, default_value = "20")]
limit: usize, limit: usize,
/// Filter by session type, comma-separated. Example: --filter private,group shows only unread messages from real people
#[arg(long, value_name = "TYPES", value_delimiter = ',',
value_parser = ["all", "private", "group", "official", "folded"])]
filter: Vec<String>,
/// 输出 JSON默认 YAML /// 输出 JSON默认 YAML
#[arg(long)] #[arg(long)]
json: bool, json: bool,
@ -177,6 +194,121 @@ enum Commands {
#[arg(long)] #[arg(long)]
json: bool, json: bool,
}, },
/// Moments interaction notifications: likes/comments others left on my posts, plus follow-up replies under posts I commented on
SnsNotifications {
/// Number of items to show
#[arg(short = 'n', long, default_value = "50")]
limit: usize,
/// Start time YYYY-MM-DD
#[arg(long)]
since: Option<String>,
/// End time YYYY-MM-DD
#[arg(long)]
until: Option<String>,
/// Include read notifications (default: unread only)
#[arg(long)]
include_read: bool,
/// Output JSON (default: YAML)
#[arg(long)]
json: bool,
},
/// Moments timeline: filter locally cached Moments posts by time/author
SnsFeed {
/// Number of items to show
#[arg(short = 'n', long, default_value = "20")]
limit: usize,
/// Start time YYYY-MM-DD
#[arg(long)]
since: Option<String>,
/// End time YYYY-MM-DD
#[arg(long)]
until: Option<String>,
/// Only show the given author (nickname / remark name / WeChat ID, fuzzy match)
#[arg(long)]
user: Option<String>,
/// Output JSON (default: YAML)
#[arg(long)]
json: bool,
},
/// Query official-account article pushes (local cache)
BizArticles {
/// Number of items to show
#[arg(short = 'n', long, default_value = "50")]
limit: usize,
/// Restrict to an official account (fuzzy match on name)
#[arg(long)]
account: Option<String>,
/// Start time YYYY-MM-DD
#[arg(long)]
since: Option<String>,
/// End time YYYY-MM-DD
#[arg(long)]
until: Option<String>,
/// Only official accounts with unread items; take the latest 1 article per account
#[arg(long)]
unread: bool,
/// Output JSON (default: YAML)
#[arg(long)]
json: bool,
},
/// Moments full-text search: match keywords in the post body
SnsSearch {
/// Keyword
keyword: String,
/// Number of results
#[arg(short = 'n', long, default_value = "20")]
limit: usize,
/// Start time YYYY-MM-DD
#[arg(long)]
since: Option<String>,
/// End time YYYY-MM-DD
#[arg(long)]
until: Option<String>,
/// Restrict to an author (nickname / remark name / WeChat ID)
#[arg(long)]
user: Option<String>,
/// Output JSON (default: YAML)
#[arg(long)]
json: bool,
},
/// List the image attachments of a chat; returns opaque attachment_id values
Attachments {
/// Chat name (contact display name / wxid / @chatroom username all work)
chat: String,
/// Kind (currently only image is supported)
#[arg(long = "kind", value_name = "KIND",
value_parser = ["image", "img"])]
kinds: Vec<String>,
/// Number of items to show
#[arg(short = 'n', long, default_value = "50")]
limit: usize,
/// Pagination offset
#[arg(long, default_value = "0")]
offset: usize,
/// Start time YYYY-MM-DD
#[arg(long)]
since: Option<String>,
/// End time YYYY-MM-DD
#[arg(long)]
until: Option<String>,
/// Output JSON (default: YAML)
#[arg(long)]
json: bool,
},
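/// Decrypt the resource behind a single attachment_id and write it to the given file path
Extract {
/// Opaque ID printed by `wx attachments` (base64url string)
attachment_id: String,
/// Output file path (absolute, or relative to the current working directory; keeping an extension such as .jpg is recommended)
#[arg(short = 'o', long)]
output: String,
/// Overwrite the target if it already exists
#[arg(long)]
overwrite: bool,
/// Output JSON (default: YAML)
#[arg(long)]
json: bool,
},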
/// 管理 wx-daemon /// 管理 wx-daemon
Daemon { Daemon {
#[command(subcommand)] #[command(subcommand)]
@ -210,28 +342,184 @@ pub fn run() {
} }
fn dispatch(cli: Cli) -> Result<()> { fn dispatch(cli: Cli) -> Result<()> {
let base_with_meta = cli.with_meta;
let base_debug_source = cli.debug_source;
match cli.command { match cli.command {
Commands::Init { force } => init::cmd_init(force), Commands::Init { force } => init::cmd_init(force),
Commands::Sessions { limit, json } => sessions::cmd_sessions(limit, json), Commands::Sessions { limit, json } => sessions::cmd_sessions(
Commands::History { chat, limit, offset, since, until, msg_type, json } => { limit,
history::cmd_history(chat, limit, offset, since, until, msg_type, json) OutputOpts {
} json,
Commands::Search { keyword, chats, limit, since, until, msg_type, json } => { with_meta: base_with_meta,
search::cmd_search(keyword, chats, limit, since, until, msg_type, json) debug_source: base_debug_source,
} },
),
Commands::History {
chat,
limit,
offset,
since,
until,
msg_type,
json,
} => history::cmd_history(
chat,
limit,
offset,
since,
until,
msg_type,
OutputOpts {
json,
with_meta: base_with_meta,
debug_source: base_debug_source,
},
),
Commands::Search {
keyword,
chats,
limit,
since,
until,
msg_type,
json,
} => search::cmd_search(
keyword,
chats,
limit,
since,
until,
msg_type,
OutputOpts {
json,
with_meta: base_with_meta,
debug_source: base_debug_source,
},
),
Commands::Contacts { query, limit, json } => contacts::cmd_contacts(query, limit, json), Commands::Contacts { query, limit, json } => contacts::cmd_contacts(query, limit, json),
Commands::Export { chat, since, until, limit, format, output } => { Commands::Export {
export::cmd_export(chat, since, until, limit, format, output) chat,
since,
until,
limit,
format,
output,
} => {
let export_json = format == "json";
export::cmd_export(
chat,
since,
until,
limit,
format,
output,
OutputOpts {
json: export_json,
with_meta: base_with_meta,
debug_source: base_debug_source,
},
)
} }
Commands::Unread { limit, json } => unread::cmd_unread(limit, json), Commands::Unread {
limit,
filter,
json,
} => unread::cmd_unread(
limit,
filter,
OutputOpts {
json,
with_meta: base_with_meta,
debug_source: base_debug_source,
},
),
Commands::Members { chat, json } => members::cmd_members(chat, json), Commands::Members { chat, json } => members::cmd_members(chat, json),
Commands::NewMessages { limit, json } => new_messages::cmd_new_messages(limit, json), Commands::NewMessages { limit, json } => new_messages::cmd_new_messages(
Commands::Stats { chat, since, until, json } => { limit,
stats::cmd_stats(chat, since, until, json) OutputOpts {
} json,
Commands::Favorites { limit, fav_type, query, json } => { with_meta: base_with_meta,
favorites::cmd_favorites(limit, fav_type, query, json) debug_source: base_debug_source,
} },
),
Commands::Stats {
chat,
since,
until,
json,
} => stats::cmd_stats(
chat,
since,
until,
OutputOpts {
json,
with_meta: base_with_meta,
debug_source: base_debug_source,
},
),
Commands::Favorites {
limit,
fav_type,
query,
json,
} => favorites::cmd_favorites(limit, fav_type, query, json),
Commands::SnsNotifications {
limit,
since,
until,
include_read,
json,
} => sns_notifications::cmd_sns_notifications(limit, since, until, include_read, json),
Commands::SnsFeed {
limit,
since,
until,
user,
json,
} => sns_feed::cmd_sns_feed(limit, since, until, user, json),
Commands::SnsSearch {
keyword,
limit,
since,
until,
user,
json,
} => sns_search::cmd_sns_search(keyword, limit, since, until, user, json),
Commands::BizArticles {
limit,
account,
since,
until,
unread,
json,
} => biz_articles::cmd_biz_articles(limit, account, since, until, unread, json),
Commands::Attachments {
chat,
kinds,
limit,
offset,
since,
until,
json,
} => attachments::cmd_attachments(
chat,
kinds,
limit,
offset,
since,
until,
OutputOpts {
json,
with_meta: base_with_meta,
debug_source: base_debug_source,
},
),
Commands::Extract {
attachment_id,
output,
overwrite,
json,
} => extract::cmd_extract(attachment_id, output, overwrite, json),
Commands::Daemon { cmd } => daemon_cmd::cmd_daemon(cmd), Commands::Daemon { cmd } => daemon_cmd::cmd_daemon(cmd),
} }
} }

@ -1,8 +1,8 @@
use super::output::{emit_warnings, print_response, OutputOpts};
use super::transport;
use crate::ipc::Request;
use anyhow::Result; use anyhow::Result;
use std::collections::HashMap; use std::collections::HashMap;
use crate::ipc::Request;
use super::transport;
use super::output::{resolve, print_value};
fn state_file() -> std::path::PathBuf { fn state_file() -> std::path::PathBuf {
dirs::home_dir() dirs::home_dir()
@ -18,7 +18,8 @@ fn load_state() -> Option<HashMap<String, i64>> {
let data = std::fs::read_to_string(state_file()).ok()?; let data = std::fs::read_to_string(state_file()).ok()?;
let v: serde_json::Value = serde_json::from_str(&data).ok()?; let v: serde_json::Value = serde_json::from_str(&data).ok()?;
// 旧格式(只有 timestamp 字段)没有 sessions key → 返回 None 触发首次运行逻辑 // 旧格式(只有 timestamp 字段)没有 sessions key → 返回 None 触发首次运行逻辑
let map: HashMap<String, i64> = v.get("sessions")? let map: HashMap<String, i64> = v
.get("sessions")?
.as_object()? .as_object()?
.iter() .iter()
.filter_map(|(k, v)| v.as_i64().map(|ts| (k.clone(), ts))) .filter_map(|(k, v)| v.as_i64().map(|ts| (k.clone(), ts)))
@ -33,17 +34,27 @@ fn save_state(new_state: &HashMap<String, i64>) -> Result<()> {
if let Some(parent) = path.parent() { if let Some(parent) = path.parent() {
std::fs::create_dir_all(parent)?; std::fs::create_dir_all(parent)?;
} }
std::fs::write(&path, serde_json::to_string(&serde_json::json!({ "sessions": new_state }))?)?; std::fs::write(
&path,
serde_json::to_string(&serde_json::json!({ "sessions": new_state }))?,
)?;
Ok(()) Ok(())
} }
pub fn cmd_new_messages(limit: usize, json: bool) -> Result<()> { pub fn cmd_new_messages(limit: usize, opts: OutputOpts) -> Result<()> {
let state = load_state(); let state = load_state();
let resp = transport::send(Request::NewMessages { state, limit })?; let (with_meta, debug_source) = opts.request_flags();
let resp = transport::send(Request::NewMessages {
state,
limit,
with_meta,
debug_source,
})?;
// 保存 daemon 返回的 new_state // 保存 daemon 返回的 new_state
if let Some(obj) = resp.data.get("new_state").and_then(|v| v.as_object()) { if let Some(obj) = resp.data.get("new_state").and_then(|v| v.as_object()) {
let map: HashMap<String, i64> = obj.iter() let map: HashMap<String, i64> = obj
.iter()
.filter_map(|(k, v)| v.as_i64().map(|ts| (k.clone(), ts))) .filter_map(|(k, v)| v.as_i64().map(|ts| (k.clone(), ts)))
.collect(); .collect();
if !map.is_empty() { if !map.is_empty() {
@ -51,8 +62,6 @@ pub fn cmd_new_messages(limit: usize, json: bool) -> Result<()> {
} }
} }
let messages = resp.data.get("messages") emit_warnings(&resp.data);
.cloned() print_response(&resp.data, &opts)
.unwrap_or(serde_json::Value::Array(vec![]));
print_value(&messages, &resolve(json))
} }

@ -1,12 +1,31 @@
use chrono::{Local, TimeZone};
/// 输出格式 /// 输出格式
pub enum Fmt { pub enum Fmt {
Yaml, Yaml,
Json, Json,
} }
#[derive(Clone, Copy, Debug)]
pub struct OutputOpts {
pub json: bool,
pub with_meta: bool,
pub debug_source: bool,
}
impl OutputOpts {
pub fn request_flags(self) -> (bool, bool) {
(self.with_meta || self.debug_source, self.debug_source)
}
}
/// 默认 YAML--json 时输出 JSON /// 默认 YAML--json 时输出 JSON
pub fn resolve(json: bool) -> Fmt { pub fn resolve(json: bool) -> Fmt {
if json { Fmt::Json } else { Fmt::Yaml } if json {
Fmt::Json
} else {
Fmt::Yaml
}
} }
pub fn print_value(value: &serde_json::Value, fmt: &Fmt) -> anyhow::Result<()> { pub fn print_value(value: &serde_json::Value, fmt: &Fmt) -> anyhow::Result<()> {
@ -16,3 +35,95 @@ pub fn print_value(value: &serde_json::Value, fmt: &Fmt) -> anyhow::Result<()> {
} }
Ok(()) Ok(())
} }
pub fn print_response(data: &serde_json::Value, opts: &OutputOpts) -> anyhow::Result<()> {
print_value(data, &resolve(opts.json))
}
pub fn emit_warnings(data: &serde_json::Value) {
for line in warning_lines(data) {
eprintln!("[wx] 警告:{}", line);
}
}
pub fn warning_lines(data: &serde_json::Value) -> Vec<String> {
let mut lines = Vec::new();
let meta = match data.get("meta") {
Some(v) if v.is_object() => v,
_ => return lines,
};
let unknown_shards: Vec<String> = meta
.get("unknown_shards")
.and_then(|v| v.as_array())
.map(|arr| {
arr.iter()
.filter_map(|v| v.as_str().map(|s| s.to_string()))
.collect()
})
.unwrap_or_default();
if !unknown_shards.is_empty() {
lines.push(format!(
"磁盘上发现 daemon 不认识的分片 {},结果可能不完整;运行 `wx init --force` 重新提取密钥。",
unknown_shards.join(", ")
));
}
let status = meta.get("status").and_then(|v| v.as_str()).unwrap_or("");
if status == "possibly_stale" || status == "possibly_stale_unknown_shards" {
let session_ts = meta.get("session_last_timestamp").and_then(|v| v.as_i64());
let chat_ts = meta.get("chat_latest_timestamp").and_then(|v| v.as_i64());
if let (Some(session_ts), Some(chat_ts)) = (session_ts, chat_ts) {
let subject = data
.get("chat")
.and_then(|v| v.as_str())
.or_else(|| data.get("username").and_then(|v| v.as_str()))
.unwrap_or("当前查询");
lines.push(format!(
"session.db 显示 '{}' 最新到 {},但本次扫描只到 {},结果可能过期或不完整。",
subject,
fmt_meta_ts(session_ts),
fmt_meta_ts(chat_ts),
));
}
}
lines
}
pub fn warning_block_text(data: &serde_json::Value) -> Option<String> {
let lines = warning_lines(data);
if lines.is_empty() {
return None;
}
Some(
lines
.into_iter()
.map(|line| format!("[wx] 警告:{}", line))
.collect::<Vec<_>>()
.join("\n"),
)
}
pub fn warning_block_markdown(data: &serde_json::Value) -> Option<String> {
let lines = warning_lines(data);
if lines.is_empty() {
return None;
}
let mut out = String::from("> [!WARNING]\n");
for line in lines {
out.push_str("> ");
out.push_str(&line);
out.push('\n');
}
Some(out)
}
fn fmt_meta_ts(ts: i64) -> String {
Local
.timestamp_opt(ts, 0)
.single()
.map(|dt| dt.format("%Y-%m-%d %H:%M:%S").to_string())
.unwrap_or_else(|| ts.to_string())
}
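
Two small behaviors in this file are easy to miss when reading the diff: `request_flags` turns `with_meta` on whenever `debug_source` is set, and `warning_block_markdown` returns `None` (not an empty block) when there is nothing to warn about. The sketch below reproduces both under simplifying assumptions: the Markdown helper takes an already-computed slice of warning lines (the hypothetical parameter `lines`) in place of the `serde_json::Value` the real function inspects through `warning_lines`.

```rust
/// Same shape as the struct added in the diff.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug)]
struct OutputOpts {
    json: bool,
    with_meta: bool,
    debug_source: bool,
}

#[allow(dead_code)]
impl OutputOpts {
    /// Returns (with_meta, debug_source) as sent to the daemon.
    /// `debug_source` implies `with_meta`, since source paths live inside meta.
    fn request_flags(self) -> (bool, bool) {
        (self.with_meta || self.debug_source, self.debug_source)
    }
}

/// Simplified variant: builds the `> [!WARNING]` blockquote from plain lines.
#[allow(dead_code)]
fn warning_block_markdown(lines: &[String]) -> Option<String> {
    if lines.is_empty() {
        return None;
    }
    let mut out = String::from("> [!WARNING]\n");
    for line in lines {
        out.push_str("> ");
        out.push_str(line);
        out.push('\n');
    }
    Some(out)
}
```

Returning `Option` lets the export code skip the block entirely with `if let Some(warn) = ...`, so clean exports carry no empty warning header.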

@ -1,8 +1,8 @@
use anyhow::Result; use super::history::{parse_msg_type, parse_time, parse_time_end};
use crate::ipc::Request; use super::output::{emit_warnings, print_response, OutputOpts};
use super::transport; use super::transport;
use super::history::{parse_time, parse_time_end, parse_msg_type}; use crate::ipc::Request;
use super::output::{resolve, print_value}; use anyhow::Result;
pub fn cmd_search( pub fn cmd_search(
keyword: String, keyword: String,
@ -11,12 +11,13 @@ pub fn cmd_search(
since: Option<String>, since: Option<String>,
until: Option<String>, until: Option<String>,
msg_type: Option<String>, msg_type: Option<String>,
json: bool, opts: OutputOpts,
) -> Result<()> { ) -> Result<()> {
let since_ts = since.as_deref().map(parse_time).transpose()?; let since_ts = since.as_deref().map(parse_time).transpose()?;
let until_ts = until.as_deref().map(parse_time_end).transpose()?; let until_ts = until.as_deref().map(parse_time_end).transpose()?;
let type_val = msg_type.as_deref().and_then(parse_msg_type); let type_val = msg_type.as_deref().and_then(parse_msg_type);
let chats_opt = if chats.is_empty() { None } else { Some(chats) }; let chats_opt = if chats.is_empty() { None } else { Some(chats) };
let (with_meta, debug_source) = opts.request_flags();
let req = Request::Search { let req = Request::Search {
keyword, keyword,
@ -25,11 +26,11 @@ pub fn cmd_search(
since: since_ts, since: since_ts,
until: until_ts, until: until_ts,
msg_type: type_val, msg_type: type_val,
with_meta,
debug_source,
}; };
let resp = transport::send(req)?; let resp = transport::send(req)?;
let results = resp.data.get("results") emit_warnings(&resp.data);
.cloned() print_response(&resp.data, &opts)
.unwrap_or(serde_json::Value::Array(vec![]));
print_value(&results, &resolve(json))
} }

@ -1,12 +1,15 @@
use anyhow::Result; use super::output::{emit_warnings, print_response, OutputOpts};
use crate::ipc::Request;
use super::transport; use super::transport;
use super::output::{resolve, print_value}; use crate::ipc::Request;
use anyhow::Result;
pub fn cmd_sessions(limit: usize, json: bool) -> Result<()> { pub fn cmd_sessions(limit: usize, opts: OutputOpts) -> Result<()> {
let resp = transport::send(Request::Sessions { limit })?; let (with_meta, debug_source) = opts.request_flags();
let data = resp.data.get("sessions") let resp = transport::send(Request::Sessions {
.cloned() limit,
.unwrap_or(serde_json::Value::Array(vec![])); with_meta,
print_value(&data, &resolve(json)) debug_source,
})?;
emit_warnings(&resp.data);
print_response(&resp.data, &opts)
} }

@ -0,0 +1,28 @@
use anyhow::Result;
use crate::ipc::Request;
use super::history::{parse_time, parse_time_end};
use super::transport;
use super::output::{resolve, print_value};
pub fn cmd_sns_feed(
limit: usize,
since: Option<String>,
until: Option<String>,
user: Option<String>,
json: bool,
) -> Result<()> {
let since_ts = since.as_deref().map(parse_time).transpose()?;
let until_ts = until.as_deref().map(parse_time_end).transpose()?;
let req = Request::SnsFeed {
limit,
since: since_ts,
until: until_ts,
user,
};
let resp = transport::send(req)?;
let data = resp.data.get("posts")
.cloned()
.unwrap_or(serde_json::Value::Array(vec![]));
print_value(&data, &resolve(json))
}

@ -0,0 +1,28 @@
use anyhow::Result;
use crate::ipc::Request;
use super::history::{parse_time, parse_time_end};
use super::transport;
use super::output::{resolve, print_value};
pub fn cmd_sns_notifications(
limit: usize,
since: Option<String>,
until: Option<String>,
include_read: bool,
json: bool,
) -> Result<()> {
let since_ts = since.as_deref().map(parse_time).transpose()?;
let until_ts = until.as_deref().map(parse_time_end).transpose()?;
let req = Request::SnsNotifications {
limit,
since: since_ts,
until: until_ts,
include_read,
};
let resp = transport::send(req)?;
let data = resp.data.get("notifications")
.cloned()
.unwrap_or(serde_json::Value::Array(vec![]));
print_value(&data, &resolve(json))
}

@ -0,0 +1,30 @@
use anyhow::Result;
use crate::ipc::Request;
use super::history::{parse_time, parse_time_end};
use super::transport;
use super::output::{resolve, print_value};
pub fn cmd_sns_search(
keyword: String,
limit: usize,
since: Option<String>,
until: Option<String>,
user: Option<String>,
json: bool,
) -> Result<()> {
let since_ts = since.as_deref().map(parse_time).transpose()?;
let until_ts = until.as_deref().map(parse_time_end).transpose()?;
let req = Request::SnsSearch {
keyword,
limit,
since: since_ts,
until: until_ts,
user,
};
let resp = transport::send(req)?;
let data = resp.data.get("posts")
.cloned()
.unwrap_or(serde_json::Value::Array(vec![]));
print_value(&data, &resolve(json))
}

@ -1,18 +1,25 @@
use anyhow::Result;
use crate::ipc::Request;
use super::transport;
use super::history::{parse_time, parse_time_end}; use super::history::{parse_time, parse_time_end};
use super::output::{resolve, print_value}; use super::output::{emit_warnings, print_response, OutputOpts};
use super::transport;
use crate::ipc::Request;
use anyhow::Result;
pub fn cmd_stats( pub fn cmd_stats(
chat: String, chat: String,
since: Option<String>, since: Option<String>,
until: Option<String>, until: Option<String>,
json: bool, opts: OutputOpts,
) -> Result<()> { ) -> Result<()> {
let since_ts = since.as_deref().map(parse_time).transpose()?; let since_ts = since.as_deref().map(parse_time).transpose()?;
let until_ts = until.as_deref().map(parse_time_end).transpose()?; let until_ts = until.as_deref().map(parse_time_end).transpose()?;
let (with_meta, debug_source) = opts.request_flags();
let resp = transport::send(Request::Stats { chat, since: since_ts, until: until_ts })?; let resp = transport::send(Request::Stats {
print_value(&resp.data, &resolve(json)) chat,
since: since_ts,
until: until_ts,
with_meta,
debug_source,
})?;
emit_warnings(&resp.data);
print_response(&resp.data, &opts)
} }


@ -1,48 +1,32 @@
use anyhow::{bail, Context, Result}; use anyhow::{bail, Context, Result};
use serde::{Deserialize, Serialize};
use std::io::{BufRead, BufReader, Write}; use std::io::{BufRead, BufReader, Write};
use std::path::{Path, PathBuf};
use std::time::Duration; use std::time::Duration;
use crate::config; use crate::config;
use crate::ipc::{Request, Response}; use crate::ipc::{Request, Response};
const STARTUP_TIMEOUT_SECS: u64 = 15; const STARTUP_TIMEOUT_SECS: u64 = 15;
#[cfg(unix)]
const STOP_TIMEOUT_MS: u64 = 2_000;
#[derive(Debug, Clone, Serialize, Deserialize)]
struct PidFile {
pid: u32,
#[serde(default)]
exe: Option<PathBuf>,
}
/// Check whether the daemon is alive /// Check whether the daemon is alive
pub fn is_alive() -> bool { pub fn is_alive() -> bool {
#[cfg(unix)] #[cfg(unix)]
{ {
use std::os::unix::net::UnixStream; ping_unix().unwrap_or(false)
let sock_path = config::sock_path();
if !sock_path.exists() {
return false;
}
let mut stream = match UnixStream::connect(&sock_path) {
Ok(s) => s,
Err(_) => return false,
};
stream.set_read_timeout(Some(Duration::from_secs(2))).ok();
stream.set_write_timeout(Some(Duration::from_secs(2))).ok();
let req = serde_json::json!({"cmd": "ping"});
if write!(stream, "{}\n", req).is_err() {
return false;
}
let mut line = String::new();
let mut reader = BufReader::new(&stream);
if reader.read_line(&mut line).is_err() {
return false;
}
serde_json::from_str::<serde_json::Value>(&line)
.ok()
.and_then(|v| v.get("pong").and_then(|p| p.as_bool()))
.unwrap_or(false)
} }
#[cfg(windows)] #[cfg(windows)]
{ {
// Probe via the named pipe ping_windows().unwrap_or(false)
let pipe_path = r"\\.\pipe\wx-cli-daemon";
use std::fs::OpenOptions;
OpenOptions::new().read(true).write(true).open(pipe_path).is_ok()
} }
#[cfg(not(any(unix, windows)))] #[cfg(not(any(unix, windows)))]
{ {
@ -60,9 +44,80 @@ pub fn ensure_daemon() -> Result<()> {
Ok(()) Ok(())
} }
/// Stop the daemon (if it is running)
pub fn stop_daemon() -> Result<()> {
let pid_path = config::pid_path();
let pid_file = read_pid_file(&pid_path)?;
let daemon_alive = is_alive();
match pid_file {
Some(pid_file) => {
let belongs = pid_belongs_to_daemon(&pid_file)?;
if daemon_alive && !belongs {
bail!(
"daemon 正在运行,但 {} 指向的 PID {} 无法确认属于当前 wx-daemon",
pid_path.display(),
pid_file.pid
);
}
if belongs {
terminate_pid(pid_file.pid)?;
}
}
None if daemon_alive => {
bail!(
"daemon 正在运行,但 {} 缺失或损坏,无法安全停止",
pid_path.display()
);
}
None => {}
}
cleanup_ipc_files();
Ok(())
}
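/// Before starting the daemon, check that `~/.wx-cli/` is writable, so the error is more specific than a "timeout".
///
/// Typical pitfall: an older `sudo wx init` left the directory owned by root, so a non-root daemon
/// cannot even create its socket or log file and fails silently until the 15s timeout.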
fn preflight_cli_dir_writable() -> Result<()> {
let cli_dir = config::cli_dir();
std::fs::create_dir_all(&cli_dir)
.with_context(|| format!("创建 {} 失败", cli_dir.display()))?;
let probe = cli_dir.join(".daemon_probe");
match std::fs::File::create(&probe) {
Ok(_) => {
let _ = std::fs::remove_file(&probe);
Ok(())
}
Err(e) if e.kind() == std::io::ErrorKind::PermissionDenied => {
let dir = cli_dir.display();
if cfg!(unix) {
bail!(
"无法写入 {dir}(权限不足)\n\n\
`sudo wx init` root\n\
\n\n \
sudo chown -R $(whoami) {dir}\n\n\
init ",
)
} else {
bail!("无法写入 {dir}: {e}")
}
}
Err(e) => bail!("无法写入 {}: {}", cli_dir.display(), e),
}
}
/// Start the daemon process (this binary itself, with WX_DAEMON_MODE=1 set) /// Start the daemon process (this binary itself, with WX_DAEMON_MODE=1 set)
fn start_daemon() -> Result<()> { fn start_daemon() -> Result<()> {
let exe = std::env::current_exe().context("无法获取当前可执行文件路径")?; let exe = std::env::current_exe().context("无法获取当前可执行文件路径")?;
let child_pid: u32;
// Preflight: can the current user write to ~/.wx-cli/? If not, return an actionable error
// instead of spawning a daemon that is doomed to fail and then timing out after 15s.
preflight_cli_dir_writable()?;
#[cfg(unix)] #[cfg(unix)]
{ {
@ -74,7 +129,8 @@ fn start_daemon() -> Result<()> {
let _ = std::fs::create_dir_all(parent); let _ = std::fs::create_dir_all(parent);
} }
let (stdout_stdio, stderr_stdio) = std::fs::OpenOptions::new() let (stdout_stdio, stderr_stdio) = std::fs::OpenOptions::new()
.create(true).append(true) .create(true)
.append(true)
.open(&log_path) .open(&log_path)
.and_then(|f| f.try_clone().map(|g| (f, g))) .and_then(|f| f.try_clone().map(|g| (f, g)))
.map(|(f, g)| (std::process::Stdio::from(f), std::process::Stdio::from(g))) .map(|(f, g)| (std::process::Stdio::from(f), std::process::Stdio::from(g)))
@ -85,24 +141,39 @@ fn start_daemon() -> Result<()> {
.stdout(stdout_stdio) .stdout(stdout_stdio)
.stderr(stderr_stdio); .stderr(stderr_stdio);
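// SAFETY: setsid() is called in the child process after fork, detaching the daemon from the controlling terminal // SAFETY: setsid() is called in the child process after fork, detaching the daemon from the controlling terminal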
unsafe { cmd.pre_exec(|| { libc::setsid(); Ok(()) }); } unsafe {
let _ = cmd.spawn().context("无法启动 daemon 进程")?; cmd.pre_exec(|| {
libc::setsid();
Ok(())
});
}
let child = cmd.spawn().context("无法启动 daemon 进程")?;
child_pid = child.id();
} }
#[cfg(windows)] #[cfg(windows)]
{ {
let log_file = std::fs::OpenOptions::new() use std::os::windows::process::CommandExt;
.create(true).append(true) let log_path = config::log_path();
.open(config::log_path()) if let Some(parent) = log_path.parent() {
.ok() let _ = std::fs::create_dir_all(parent);
.map(std::process::Stdio::from) }
.unwrap_or_else(std::process::Stdio::null); let (stdout_stdio, stderr_stdio) = std::fs::OpenOptions::new()
let _ = std::process::Command::new(&exe) .create(true)
.append(true)
.open(&log_path)
.and_then(|f| f.try_clone().map(|g| (f, g)))
.map(|(f, g)| (std::process::Stdio::from(f), std::process::Stdio::from(g)))
.unwrap_or_else(|_| (std::process::Stdio::null(), std::process::Stdio::null()));
let child = std::process::Command::new(&exe)
.env("WX_DAEMON_MODE", "1") .env("WX_DAEMON_MODE", "1")
.stdout(log_file) .stdin(std::process::Stdio::null())
.stdout(stdout_stdio)
.stderr(stderr_stdio)
.creation_flags(0x00000008) // DETACHED_PROCESS .creation_flags(0x00000008) // DETACHED_PROCESS
.spawn() .spawn()
.context("无法启动 daemon 进程")?; .context("无法启动 daemon 进程")?;
child_pid = child.id();
} }
// Wait for the daemon to become ready (at most STARTUP_TIMEOUT_SECS seconds) // Wait for the daemon to become ready (at most STARTUP_TIMEOUT_SECS seconds)
@ -110,6 +181,7 @@ fn start_daemon() -> Result<()> {
while std::time::Instant::now() < deadline { while std::time::Instant::now() < deadline {
std::thread::sleep(Duration::from_millis(300)); std::thread::sleep(Duration::from_millis(300));
if is_alive() { if is_alive() {
write_pid_file(child_pid, &exe)?;
return Ok(()); return Ok(());
} }
} }
@ -121,6 +193,233 @@ fn start_daemon() -> Result<()> {
) )
} }
fn write_pid_file(pid: u32, exe: &Path) -> Result<()> {
if let Some(parent) = config::pid_path().parent() {
std::fs::create_dir_all(parent)
.with_context(|| format!("创建 {} 失败", parent.display()))?;
}
let pid_file = PidFile {
pid,
exe: Some(exe.to_path_buf()),
};
let content = serde_json::to_string(&pid_file)?;
std::fs::write(config::pid_path(), content)
.with_context(|| format!("写入 {} 失败", config::pid_path().display()))?;
Ok(())
}
fn read_pid_file(path: &Path) -> Result<Option<PidFile>> {
let content = match std::fs::read_to_string(path) {
Ok(content) => content,
Err(err) if err.kind() == std::io::ErrorKind::NotFound => return Ok(None),
Err(err) => return Err(err).with_context(|| format!("读取 {} 失败", path.display())),
};
if let Ok(pid_file) = serde_json::from_str::<PidFile>(&content) {
return Ok(Some(pid_file));
}
if let Ok(pid) = content.trim().parse::<u32>() {
return Ok(Some(PidFile {
pid,
exe: std::env::current_exe().ok(),
}));
}
bail!("{} 不是合法的 PID 文件", path.display())
}
fn cleanup_ipc_files() {
let _ = std::fs::remove_file(config::sock_path());
let _ = std::fs::remove_file(config::pid_path());
}
#[cfg(unix)]
fn ping_unix() -> Result<bool> {
use std::os::unix::net::UnixStream;
let sock_path = config::sock_path();
if !sock_path.exists() {
return Ok(false);
}
let mut stream = UnixStream::connect(&sock_path)?;
stream.set_read_timeout(Some(Duration::from_secs(2))).ok();
stream.set_write_timeout(Some(Duration::from_secs(2))).ok();
let req = serde_json::to_string(&Request::Ping)? + "\n";
stream.write_all(req.as_bytes())?;
let mut line = String::new();
let mut reader = BufReader::new(&stream);
reader.read_line(&mut line)?;
let resp: Response = serde_json::from_str(&line)?;
Ok(resp.ok && resp.data.get("pong").and_then(|p| p.as_bool()) == Some(true))
}
#[cfg(windows)]
fn ping_windows() -> Result<bool> {
use interprocess::local_socket::{prelude::*, GenericNamespaced, Stream};
let name = "wx-cli-daemon".to_ns_name::<GenericNamespaced>()?;
let stream = Stream::connect(name)?;
let mut reader = BufReader::new(stream);
let req = serde_json::to_string(&Request::Ping)? + "\n";
reader.get_mut().write_all(req.as_bytes())?;
let mut line = String::new();
reader.read_line(&mut line)?;
let resp: Response = serde_json::from_str(&line)?;
Ok(resp.ok && resp.data.get("pong").and_then(|p| p.as_bool()) == Some(true))
}
fn pid_belongs_to_daemon(pid_file: &PidFile) -> Result<bool> {
let expected_exe = pid_file
.exe
.clone()
.or_else(|| std::env::current_exe().ok());
#[cfg(unix)]
{
unix_pid_matches_daemon(pid_file.pid, expected_exe.as_deref())
}
#[cfg(windows)]
{
windows_pid_matches_daemon(pid_file.pid, expected_exe.as_deref())
}
#[cfg(not(any(unix, windows)))]
{
let _ = expected_exe;
Ok(true)
}
}
#[cfg(unix)]
fn unix_pid_matches_daemon(pid: u32, expected_exe: Option<&Path>) -> Result<bool> {
let Some(expected_exe) = expected_exe else {
return Ok(false);
};
let output = std::process::Command::new("ps")
.args(["-o", "command=", "-p", &pid.to_string()])
.output()
.with_context(|| format!("读取 PID {} 的 command 失败", pid))?;
if !output.status.success() {
return Ok(false);
}
let command = String::from_utf8_lossy(&output.stdout);
let expected = expected_exe.to_string_lossy();
if command.contains(expected.as_ref()) {
return Ok(true);
}
let Some(exe_name) = expected_exe.file_name().and_then(|name| name.to_str()) else {
return Ok(false);
};
Ok(command
.split_whitespace()
.any(|part| part == exe_name || part.ends_with(&format!("/{}", exe_name))))
}
#[cfg(windows)]
fn windows_pid_matches_daemon(pid: u32, expected_exe: Option<&Path>) -> Result<bool> {
use windows::core::PWSTR;
use windows::Win32::Foundation::CloseHandle;
use windows::Win32::System::Threading::{
OpenProcess, QueryFullProcessImageNameW, PROCESS_NAME_FORMAT,
PROCESS_QUERY_LIMITED_INFORMATION,
};
let Some(expected_exe) = expected_exe else {
return Ok(false);
};
let handle = match unsafe { OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, false, pid) } {
Ok(handle) => handle,
Err(_) => return Ok(false),
};
let mut buf = vec![0u16; 260];
let mut len = buf.len() as u32;
let actual = unsafe {
let result = QueryFullProcessImageNameW(
handle,
PROCESS_NAME_FORMAT(0),
PWSTR(buf.as_mut_ptr()),
&mut len,
);
let _ = CloseHandle(handle);
result
};
if actual.is_err() {
return Ok(false);
}
let actual_path = PathBuf::from(String::from_utf16_lossy(&buf[..len as usize]));
Ok(normalize_exe_path(&actual_path) == normalize_exe_path(expected_exe))
}
#[cfg(windows)]
fn normalize_exe_path(path: &Path) -> String {
path.to_string_lossy()
.replace('\\', "/")
.to_ascii_lowercase()
}
fn terminate_pid(pid: u32) -> Result<()> {
#[cfg(unix)]
{
terminate_pid_unix(pid)
}
#[cfg(windows)]
{
terminate_pid_windows(pid)
}
#[cfg(not(any(unix, windows)))]
{
let _ = pid;
Ok(())
}
}
#[cfg(unix)]
fn terminate_pid_unix(pid: u32) -> Result<()> {
let rc = unsafe { libc::kill(pid as i32, libc::SIGTERM) };
if rc != 0 {
let err = std::io::Error::last_os_error();
if err.raw_os_error() == Some(libc::ESRCH) {
return Ok(());
}
bail!("停止 PID {} 失败: {}", pid, err);
}
let deadline = std::time::Instant::now() + Duration::from_millis(STOP_TIMEOUT_MS);
while std::time::Instant::now() < deadline {
if !unix_process_exists(pid) {
return Ok(());
}
std::thread::sleep(Duration::from_millis(50));
}
bail!("等待 PID {} 退出超时", pid)
}
#[cfg(unix)]
fn unix_process_exists(pid: u32) -> bool {
let rc = unsafe { libc::kill(pid as i32, 0) };
if rc == 0 {
return true;
}
let err = std::io::Error::last_os_error();
err.raw_os_error() == Some(libc::EPERM)
}
#[cfg(windows)]
fn terminate_pid_windows(pid: u32) -> Result<()> {
let status = std::process::Command::new("taskkill")
.args(["/F", "/PID", &pid.to_string()])
.status()
.with_context(|| format!("执行 taskkill /PID {} 失败", pid))?;
if !status.success() {
bail!("停止 PID {} 失败: taskkill exit {:?}", pid, status.code());
}
Ok(())
}
/// Send a request to the daemon and return its response /// Send a request to the daemon and return its response
pub fn send(req: Request) -> Result<Response> { pub fn send(req: Request) -> Result<Response> {
ensure_daemon()?; ensure_daemon()?;
@ -143,10 +442,11 @@ pub fn send(req: Request) -> Result<Response> {
fn send_unix(req: Request) -> Result<Response> { fn send_unix(req: Request) -> Result<Response> {
use std::os::unix::net::UnixStream; use std::os::unix::net::UnixStream;
let sock_path = config::sock_path(); let sock_path = config::sock_path();
let mut stream = UnixStream::connect(&sock_path) let mut stream = UnixStream::connect(&sock_path).context("连接 daemon socket 失败")?;
.context("连接 daemon socket 失败")?;
stream.set_read_timeout(Some(Duration::from_secs(120))).ok(); stream.set_read_timeout(Some(Duration::from_secs(120))).ok();
stream.set_write_timeout(Some(Duration::from_secs(120))).ok(); stream
.set_write_timeout(Some(Duration::from_secs(120)))
.ok();
let req_str = serde_json::to_string(&req)? + "\n"; let req_str = serde_json::to_string(&req)? + "\n";
stream.write_all(req_str.as_bytes())?; stream.write_all(req_str.as_bytes())?;
@ -155,8 +455,7 @@ fn send_unix(req: Request) -> Result<Response> {
let mut reader = BufReader::new(&stream); let mut reader = BufReader::new(&stream);
reader.read_line(&mut line)?; reader.read_line(&mut line)?;
let resp: Response = serde_json::from_str(&line) let resp: Response = serde_json::from_str(&line).context("解析 daemon 响应失败")?;
.context("解析 daemon 响应失败")?;
if !resp.ok { if !resp.ok {
bail!("{}", resp.error.as_deref().unwrap_or("未知错误")); bail!("{}", resp.error.as_deref().unwrap_or("未知错误"));
@ -167,25 +466,23 @@ fn send_unix(req: Request) -> Result<Response> {
#[cfg(windows)] #[cfg(windows)]
fn send_windows(req: Request) -> Result<Response> { fn send_windows(req: Request) -> Result<Response> {
use std::fs::OpenOptions; use interprocess::local_socket::{prelude::*, GenericNamespaced, Stream};
use std::os::windows::fs::OpenOptionsExt;
let pipe_path = r"\\.\pipe\wx-cli-daemon"; let name = "wx-cli-daemon"
let mut pipe = OpenOptions::new() .to_ns_name::<GenericNamespaced>()
.read(true) .context("构造 pipe name 失败")?;
.write(true) let stream = Stream::connect(name).context("连接 daemon named pipe 失败")?;
.open(pipe_path)
.context("连接 daemon named pipe 失败")?; // interprocess::Stream implements both Read and Write, but the read and write halves need to be split
let mut reader = BufReader::new(stream);
let req_str = serde_json::to_string(&req)? + "\n"; let req_str = serde_json::to_string(&req)? + "\n";
pipe.write_all(req_str.as_bytes())?; reader.get_mut().write_all(req_str.as_bytes())?;
let mut line = String::new(); let mut line = String::new();
let mut reader = BufReader::new(pipe);
reader.read_line(&mut line)?; reader.read_line(&mut line)?;
let resp: Response = serde_json::from_str(&line) let resp: Response = serde_json::from_str(&line).context("解析 daemon 响应失败")?;
.context("解析 daemon 响应失败")?;
if !resp.ok { if !resp.ok {
bail!("{}", resp.error.as_deref().unwrap_or("未知错误")); bail!("{}", resp.error.as_deref().unwrap_or("未知错误"));
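
The check in `unix_pid_matches_daemon` above boils down to a string match on the `ps -o command=` output: first the full expected path, then the bare executable name as a whole token. A minimal sketch of just that matching step, with the `ps` call left out; the helper name `command_matches_exe` is hypothetical and not part of the diff:

```rust
use std::path::Path;

/// Decide whether a `ps` command line plausibly belongs to `expected_exe`.
/// Mirrors the two-stage match above: full path substring first,
/// then the file name as a whole whitespace-separated token.
fn command_matches_exe(command: &str, expected_exe: &Path) -> bool {
    let expected = expected_exe.to_string_lossy();
    if command.contains(expected.as_ref()) {
        return true;
    }
    let Some(exe_name) = expected_exe.file_name().and_then(|name| name.to_str()) else {
        return false;
    };
    let suffix = format!("/{}", exe_name);
    command
        .split_whitespace()
        .any(|part| part == exe_name || part.ends_with(&suffix))
}
```

Matching whole tokens rather than substrings in the fallback keeps an unrelated process such as `vim wxfile` from being mistaken for the daemon, although a different binary that happens to share the same file name would still match.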


@ -1,12 +1,22 @@
use anyhow::Result; use super::output::{emit_warnings, print_response, OutputOpts};
use crate::ipc::Request;
use super::transport; use super::transport;
use super::output::{resolve, print_value}; use crate::ipc::Request;
use anyhow::Result;
pub fn cmd_unread(limit: usize, json: bool) -> Result<()> { pub fn cmd_unread(limit: usize, filter: Vec<String>, opts: OutputOpts) -> Result<()> {
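let resp = transport::send(Request::Unread { limit })?; // An empty list, or one containing "all", means no filtering; other values were already validated by clap's value_parser and are passed straight through to the daemon.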
let data = resp.data.get("sessions") let filter_vec = if filter.is_empty() || filter.iter().any(|s| s == "all") {
.cloned() None
.unwrap_or(serde_json::Value::Array(vec![])); } else {
print_value(&data, &resolve(json)) Some(filter)
};
let (with_meta, debug_source) = opts.request_flags();
let resp = transport::send(Request::Unread {
limit,
filter: filter_vec,
with_meta,
debug_source,
})?;
emit_warnings(&resp.data);
print_response(&resp.data, &opts)
} }
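
The filter handling in the new `cmd_unread` can be isolated as a pure function, which makes the "empty or contains `all` means no filter" rule easy to test. A minimal sketch; `normalize_filter` is a hypothetical name, and `"group"` in the examples is an assumed filter value, since the accepted values are defined by the clap `value_parser` and are not shown here:

```rust
/// Collapse the CLI filter list into the optional form sent to the daemon.
/// An empty list, or any list containing "all", means "do not filter".
fn normalize_filter(filter: Vec<String>) -> Option<Vec<String>> {
    if filter.is_empty() || filter.iter().any(|s| s == "all") {
        None
    } else {
        Some(filter)
    }
}
```

Sending `None` instead of an empty vector lets the daemon tell "no filter requested" apart from "filter to nothing" without extra flags.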


@ -11,38 +11,50 @@ pub struct Config {
pub wechat_process: String, pub wechat_process: String,
} }
/// Load the config from <exe_dir>/config.json or $HOME/.wx-cli/config.json /// Load the config from the current working directory / <exe_dir> / $HOME/.wx-cli
pub fn load_config() -> Result<Config> { pub fn load_config() -> Result<Config> {
let config_path = find_config_file()?; let config_path = find_config_file()?;
let content = std::fs::read_to_string(&config_path) let content = std::fs::read_to_string(&config_path)
.with_context(|| format!("读取 config.json 失败: {}", config_path.display()))?; .with_context(|| format!("读取 config.json 失败: {}", config_path.display()))?;
let raw: serde_json::Value = serde_json::from_str(&content) let raw: serde_json::Value =
.with_context(|| "config.json 格式错误")?; serde_json::from_str(&content).with_context(|| "config.json 格式错误")?;
let db_dir = raw.get("db_dir") let db_dir = raw
.get("db_dir")
.and_then(|v| v.as_str()) .and_then(|v| v.as_str())
.map(PathBuf::from) .map(PathBuf::from)
.unwrap_or_else(default_db_dir); .unwrap_or_else(default_db_dir);
let base_dir = config_path.parent().unwrap_or(Path::new(".")); let base_dir = config_path.parent().unwrap_or(Path::new("."));
let keys_file = raw.get("keys_file") let keys_file = raw
.get("keys_file")
.and_then(|v| v.as_str()) .and_then(|v| v.as_str())
.map(|s| { .map(|s| {
let p = PathBuf::from(s); let p = PathBuf::from(s);
if p.is_absolute() { p } else { base_dir.join(p) } if p.is_absolute() {
p
} else {
base_dir.join(p)
}
}) })
.unwrap_or_else(|| base_dir.join("all_keys.json")); .unwrap_or_else(|| base_dir.join("all_keys.json"));
let decrypted_dir = raw.get("decrypted_dir") let decrypted_dir = raw
.get("decrypted_dir")
.and_then(|v| v.as_str()) .and_then(|v| v.as_str())
.map(|s| { .map(|s| {
let p = PathBuf::from(s); let p = PathBuf::from(s);
if p.is_absolute() { p } else { base_dir.join(p) } if p.is_absolute() {
p
} else {
base_dir.join(p)
}
}) })
.unwrap_or_else(|| base_dir.join("decrypted")); .unwrap_or_else(|| base_dir.join("decrypted"));
let wechat_process = raw.get("wechat_process") let wechat_process = raw
.get("wechat_process")
.and_then(|v| v.as_str()) .and_then(|v| v.as_str())
.unwrap_or(default_wechat_process()) .unwrap_or(default_wechat_process())
.to_string(); .to_string();
@ -56,40 +68,97 @@ pub fn load_config() -> Result<Config> {
} }
fn find_config_file() -> Result<PathBuf> { fn find_config_file() -> Result<PathBuf> {
// 1. Look in the executable's directory first let cwd_dir = std::env::current_dir().ok();
if let Ok(exe) = std::env::current_exe() { let exe_dir = std::env::current_exe()
if let Some(dir) = exe.parent() { .ok()
let p = dir.join("config.json"); .and_then(|exe| exe.parent().map(PathBuf::from));
if p.exists() { let cli_home = cli_home_dir();
return Ok(p); let home_dir = (cli_home != PathBuf::from("/tmp")).then_some(cli_home.as_path());
}
} if let Some(path) = find_existing_config_path(cwd_dir.as_deref(), exe_dir.as_deref(), home_dir)
{
return Ok(path);
} }
// 2. Current working directory
let cwd = std::env::current_dir().unwrap_or_default().join("config.json"); Ok(default_config_path(
if cwd.exists() { cwd_dir.as_deref(),
return Ok(cwd); exe_dir.as_deref(),
} home_dir,
// 3. ~/.wx-cli/config.json ))
if let Some(home) = dirs::home_dir() { }
let p = home.join(".wx-cli").join("config.json");
if p.exists() { fn find_existing_config_path(
return Ok(p); cwd_dir: Option<&Path>,
} exe_dir: Option<&Path>,
} home_dir: Option<&Path>,
// Return the default path (it may not exist; the caller handles that) ) -> Option<PathBuf> {
if let Ok(exe) = std::env::current_exe() { let candidates = [
if let Some(dir) = exe.parent() { cwd_dir.map(config_path_in_dir),
return Ok(dir.join("config.json")); exe_dir.map(config_path_in_dir),
} home_dir.map(home_config_path),
} ];
Ok(PathBuf::from("config.json")) candidates.into_iter().flatten().find(|path| path.exists())
}
fn default_config_path(
cwd_dir: Option<&Path>,
exe_dir: Option<&Path>,
home_dir: Option<&Path>,
) -> PathBuf {
cwd_dir
.map(config_path_in_dir)
.or_else(|| exe_dir.map(config_path_in_dir))
.or_else(|| home_dir.map(home_config_path))
.unwrap_or_else(|| PathBuf::from("config.json"))
}
fn config_path_in_dir(dir: &Path) -> PathBuf {
dir.join("config.json")
}
fn home_config_path(home_dir: &Path) -> PathBuf {
home_dir.join(".wx-cli").join("config.json")
} }
pub fn cli_dir() -> PathBuf { pub fn cli_dir() -> PathBuf {
dirs::home_dir() cli_home_dir().join(".wx-cli")
.unwrap_or_else(|| PathBuf::from("/tmp")) }
.join(".wx-cli")
fn cli_home_dir() -> PathBuf {
resolve_cli_home(
dirs::home_dir().unwrap_or_else(|| PathBuf::from("/tmp")),
sudo_user_home_dir(),
)
}
fn resolve_cli_home(default_home: PathBuf, sudo_home: Option<PathBuf>) -> PathBuf {
sudo_home.unwrap_or(default_home)
}
#[cfg(unix)]
fn sudo_user_home_dir() -> Option<PathBuf> {
use std::ffi::{CStr, CString};
let sudo_user = std::env::var("SUDO_USER").ok()?;
let sudo_user = sudo_user.trim();
if sudo_user.is_empty() {
return None;
}
let c_user = CString::new(sudo_user).ok()?;
unsafe {
let pwd = libc::getpwnam(c_user.as_ptr());
if pwd.is_null() || (*pwd).pw_dir.is_null() {
return None;
}
let dir = CStr::from_ptr((*pwd).pw_dir).to_str().ok()?;
Some(PathBuf::from(dir))
}
}
#[cfg(not(unix))]
fn sudo_user_home_dir() -> Option<PathBuf> {
None
} }
pub fn sock_path() -> PathBuf { pub fn sock_path() -> PathBuf {
@ -127,8 +196,7 @@ fn default_db_dir() -> PathBuf {
} }
#[cfg(target_os = "windows")] #[cfg(target_os = "windows")]
{ {
PathBuf::from(std::env::var("APPDATA").unwrap_or_default()) PathBuf::from(std::env::var("APPDATA").unwrap_or_default()).join("Tencent/xwechat")
.join("Tencent/xwechat")
} }
#[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "windows")))] #[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "windows")))]
{ {
@ -138,13 +206,21 @@ fn default_db_dir() -> PathBuf {
fn default_wechat_process() -> &'static str { fn default_wechat_process() -> &'static str {
#[cfg(target_os = "macos")] #[cfg(target_os = "macos")]
{ "WeChat" } {
"WeChat"
}
#[cfg(target_os = "linux")] #[cfg(target_os = "linux")]
{ "wechat" } {
"wechat"
}
#[cfg(target_os = "windows")] #[cfg(target_os = "windows")]
{ "Weixin.exe" } {
"Weixin.exe"
}
#[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "windows")))] #[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "windows")))]
{ "WeChat" } {
"WeChat"
}
} }
/// Auto-detect the WeChat db_storage directory /// Auto-detect the WeChat db_storage directory
@ -154,17 +230,7 @@ pub fn auto_detect_db_dir() -> Option<PathBuf> {
#[cfg(target_os = "macos")] #[cfg(target_os = "macos")]
fn detect_db_dir_impl() -> Option<PathBuf> { fn detect_db_dir_impl() -> Option<PathBuf> {
let home = dirs::home_dir()?; let home = sudo_user_home_dir().or_else(dirs::home_dir)?;
// Support running under sudo
let home = if let Ok(sudo_user) = std::env::var("SUDO_USER") {
if !sudo_user.is_empty() {
PathBuf::from("/Users").join(&sudo_user)
} else {
home
}
} else {
home
};
let base = home.join("Library/Containers/com.tencent.xinWeChat/Data/Documents/xwechat_files"); let base = home.join("Library/Containers/com.tencent.xinWeChat/Data/Documents/xwechat_files");
if !base.exists() { if !base.exists() {
@ -190,9 +256,7 @@ fn detect_db_dir_impl() -> Option<PathBuf> {
#[cfg(target_os = "linux")] #[cfg(target_os = "linux")]
fn detect_db_dir_impl() -> Option<PathBuf> { fn detect_db_dir_impl() -> Option<PathBuf> {
let home = dirs::home_dir()?; let home = dirs::home_dir()?;
let sudo_home = std::env::var("SUDO_USER").ok() let sudo_home = sudo_user_home_dir();
.filter(|s| !s.is_empty())
.map(|u| PathBuf::from("/home").join(u));
let mut candidates: Vec<PathBuf> = Vec::new(); let mut candidates: Vec<PathBuf> = Vec::new();
for base_home in [Some(home.clone()), sudo_home].into_iter().flatten() { for base_home in [Some(home.clone()), sudo_home].into_iter().flatten() {
@ -213,13 +277,36 @@ fn detect_db_dir_impl() -> Option<PathBuf> {
} }
} }
candidates.sort_by_key(|p| { candidates.sort_by_key(|p| {
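std::fs::metadata(p) // Sort key: the newest mtime of any .db file under db_storage, not the directory's own mtime,
.and_then(|m| m.modified()) // so that when a new message arrives (only .db files are updated) the most recent directory is identified correctly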
.unwrap_or(std::time::SystemTime::UNIX_EPOCH) latest_db_mtime(p).unwrap_or(std::time::SystemTime::UNIX_EPOCH)
}); });
candidates.into_iter().next_back() candidates.into_iter().next_back()
} }
#[cfg(any(target_os = "linux", target_os = "windows"))]
/// Recursively find the newest mtime among all .db files under the db_storage directory
fn latest_db_mtime(dir: &Path) -> Option<std::time::SystemTime> {
let mut latest = None;
if let Ok(entries) = std::fs::read_dir(dir) {
for entry in entries.flatten() {
let path = entry.path();
let mtime = if path.is_dir() {
latest_db_mtime(&path).unwrap_or(std::time::SystemTime::UNIX_EPOCH)
} else if path.extension().and_then(|s| s.to_str()) == Some("db") {
entry
.metadata()
.and_then(|m| m.modified())
.unwrap_or(std::time::SystemTime::UNIX_EPOCH)
} else {
continue;
};
latest = Some(latest.map_or(mtime, |cur| if mtime > cur { mtime } else { cur }));
}
}
latest
}
#[cfg(target_os = "windows")] #[cfg(target_os = "windows")]
fn detect_db_dir_impl() -> Option<PathBuf> { fn detect_db_dir_impl() -> Option<PathBuf> {
let appdata = std::env::var("APPDATA").ok()?; let appdata = std::env::var("APPDATA").ok()?;
@ -233,10 +320,11 @@ fn detect_db_dir_impl() -> Option<PathBuf> {
let path = entry.path(); let path = entry.path();
if path.extension().map(|e| e == "ini").unwrap_or(false) { if path.extension().map(|e| e == "ini").unwrap_or(false) {
if let Ok(content) = std::fs::read_to_string(&path) { if let Ok(content) = std::fs::read_to_string(&path) {
let data_root = content.trim().to_string(); let Some(data_root) = resolve_windows_data_root(content.trim()) else {
if PathBuf::from(&data_root).is_dir() { continue;
let pattern = PathBuf::from(&data_root) };
.join("xwechat_files"); if data_root.is_dir() {
let pattern = data_root.join("xwechat_files");
if let Ok(entries2) = std::fs::read_dir(&pattern) { if let Ok(entries2) = std::fs::read_dir(&pattern) {
for entry2 in entries2.flatten() { for entry2 in entries2.flatten() {
let storage = entry2.path().join("db_storage"); let storage = entry2.path().join("db_storage");
@ -250,10 +338,165 @@ fn detect_db_dir_impl() -> Option<PathBuf> {
} }
} }
} }
candidates.into_iter().next() candidates.sort_by_key(|p| latest_db_mtime(p).unwrap_or(std::time::SystemTime::UNIX_EPOCH));
candidates.into_iter().next_back()
}
/// Resolve the data-root path that Weixin writes to its `*.ini` file under
/// `%APPDATA%\Tencent\xwechat\config\`.
///
/// Observed forms in the wild:
/// - A plain absolute path, e.g. `D:\WeChatFiles`.
/// - The literal token `MyDocument:` (sometimes with a trailing slash),
/// which is not a real filesystem path. Empirically this denotes
/// "the current user's Documents folder"; users who relocated
/// Documents to e.g. `D:\Documents` saw auto-detect fail silently
/// because `PathBuf::from("MyDocument:").is_dir()` is false.
///
/// We accept either form. For the `MyDocument:` token we resolve via
/// `SHGetKnownFolderPath(FOLDERID_Documents)`, which respects the standard
/// shell-folder redirect at
/// `HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders\Personal`.
#[cfg(target_os = "windows")]
fn resolve_windows_data_root(content: &str) -> Option<PathBuf> {
let trimmed = content.trim();
// Strip an optional trailing slash so `MyDocument:\` and `MyDocument:/` also match.
let stripped = trimmed
.strip_suffix(['\\', '/'])
.unwrap_or(trimmed);
if stripped.eq_ignore_ascii_case("MyDocument:") {
return known_documents_dir();
}
Some(PathBuf::from(trimmed))
}
#[cfg(target_os = "windows")]
fn known_documents_dir() -> Option<PathBuf> {
use std::ffi::OsString;
use std::os::windows::ffi::OsStringExt;
use windows::Win32::Foundation::HANDLE;
use windows::Win32::System::Com::CoTaskMemFree;
use windows::Win32::UI::Shell::{
FOLDERID_Documents, SHGetKnownFolderPath, KF_FLAG_DEFAULT,
};
// SAFETY: standard Win32 known-folder API. SHGetKnownFolderPath either returns
// a heap-allocated PWSTR that the caller must free with CoTaskMemFree, or an
// error — in which case the out-pointer is not allocated. We free on every
// success path. Passing a null token (HANDLE::default()) means "the calling
// user", which is exactly what we want.
unsafe {
let pwstr =
SHGetKnownFolderPath(&FOLDERID_Documents, KF_FLAG_DEFAULT, HANDLE::default()).ok()?;
if pwstr.0.is_null() {
return None;
}
// Walk the NUL-terminated wide string to compute its length.
let mut len = 0usize;
while *pwstr.0.add(len) != 0 {
len += 1;
}
let slice = std::slice::from_raw_parts(pwstr.0, len);
let os_str = OsString::from_wide(slice);
CoTaskMemFree(Some(pwstr.0 as *const _));
let path = PathBuf::from(os_str);
if path.as_os_str().is_empty() {
None
} else {
Some(path)
}
}
} }
#[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "windows")))] #[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "windows")))]
fn detect_db_dir_impl() -> Option<PathBuf> { fn detect_db_dir_impl() -> Option<PathBuf> {
None None
} }
#[cfg(test)]
mod tests {
use super::{
config_path_in_dir, default_config_path, find_existing_config_path, home_config_path,
resolve_cli_home,
};
#[cfg(target_os = "windows")]
use super::{known_documents_dir, resolve_windows_data_root};
use std::fs;
use std::path::PathBuf;
use std::time::{SystemTime, UNIX_EPOCH};
fn temp_dir(name: &str) -> PathBuf {
let unique = format!(
"wx-cli-config-test-{}-{}-{}",
name,
std::process::id(),
SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap_or_default()
.as_nanos()
);
let dir = std::env::temp_dir().join(unique);
fs::create_dir_all(&dir).unwrap();
dir
}
#[test]
fn resolve_cli_home_prefers_sudo_home_when_present() {
let home = resolve_cli_home(PathBuf::from("/root"), Some(PathBuf::from("/Users/alice")));
assert_eq!(home, PathBuf::from("/Users/alice"));
}
#[test]
fn resolve_cli_home_falls_back_to_default_home() {
let home = resolve_cli_home(PathBuf::from("/root"), None);
assert_eq!(home, PathBuf::from("/root"));
}
#[test]
fn config_path_prefers_cwd_over_exe_and_home() {
let cwd = temp_dir("cwd");
let exe = temp_dir("exe");
let home = temp_dir("home");
fs::write(config_path_in_dir(&cwd), "{}").unwrap();
fs::write(config_path_in_dir(&exe), "{}").unwrap();
fs::create_dir_all(home.join(".wx-cli")).unwrap();
fs::write(home_config_path(&home), "{}").unwrap();
let path = find_existing_config_path(Some(&cwd), Some(&exe), Some(&home)).unwrap();
assert_eq!(path, config_path_in_dir(&cwd));
fs::remove_dir_all(cwd).unwrap();
fs::remove_dir_all(exe).unwrap();
fs::remove_dir_all(home).unwrap();
}
#[test]
fn default_config_path_matches_init_write_order() {
let cwd = PathBuf::from("/tmp/cwd");
let exe = PathBuf::from("/tmp/exe");
let home = PathBuf::from("/tmp/home");
let path = default_config_path(Some(&cwd), Some(&exe), Some(&home));
assert_eq!(path, cwd.join("config.json"));
}
#[cfg(target_os = "windows")]
#[test]
fn resolve_windows_data_root_passes_through_absolute_path() {
let p = resolve_windows_data_root("D:\\WeChatFiles").unwrap();
assert_eq!(p, PathBuf::from("D:\\WeChatFiles"));
}
#[cfg(target_os = "windows")]
#[test]
fn resolve_windows_data_root_recognises_mydocument_keyword() {
// Should match the keyword exactly (case-insensitive, with or without trailing slash)
// and resolve to a non-empty Documents path via SHGetKnownFolderPath.
let docs = known_documents_dir().expect("Documents known folder must resolve");
for keyword in ["MyDocument:", "mydocument:", "MyDocument:\\", "MyDocument:/"] {
let resolved = resolve_windows_data_root(keyword)
.unwrap_or_else(|| panic!("keyword {keyword:?} should resolve"));
assert_eq!(resolved, docs, "keyword {keyword:?}");
}
}
}


@ -1,9 +1,9 @@
pub mod wal; pub mod wal;
use anyhow::{bail, Result};
use aes::Aes256; use aes::Aes256;
use cbc::Decryptor; use anyhow::{bail, Result};
use cbc::cipher::{BlockDecryptMut, KeyIvInit}; use cbc::cipher::{BlockDecryptMut, KeyIvInit};
use cbc::Decryptor;
use std::io::{Read, Write}; use std::io::{Read, Write};
use std::path::Path; use std::path::Path;
@ -65,11 +65,8 @@ fn aes_cbc_decrypt(key: &[u8; 32], iv: &[u8; 16], data: &[u8]) -> Result<Vec<u8>
bail!("密文长度不是 AES 块大小的倍数: {}", data.len()); bail!("密文长度不是 AES 块大小的倍数: {}", data.len());
} }
// Copy the &[u8] into an array of Blocks to avoid unsafe from_raw_parts_mut // Copy the &[u8] into an array of Blocks to avoid unsafe from_raw_parts_mut
let mut blocks: Vec<Block> = data.chunks_exact(16) let mut blocks: Vec<Block> = data.chunks_exact(16).map(Block::clone_from_slice).collect();
.map(Block::clone_from_slice) Aes256CbcDec::new(key.into(), iv.into()).decrypt_blocks_mut(&mut blocks);
.collect();
Aes256CbcDec::new(key.into(), iv.into())
.decrypt_blocks_mut(&mut blocks);
Ok(blocks.iter().flat_map(|b| b.iter().copied()).collect()) Ok(blocks.iter().flat_map(|b| b.iter().copied()).collect())
} }
@ -92,15 +89,101 @@ pub fn full_decrypt(db_path: &Path, out_path: &Path, enc_key: &[u8; 32]) -> Resu
let mut page_buf = vec![0u8; PAGE_SZ]; let mut page_buf = vec![0u8; PAGE_SZ];
for pgno in 1..=total_pages { for pgno in 1..=total_pages {
let n = input.read(&mut page_buf)?; let page_start = (pgno - 1) * PAGE_SZ;
if n == 0 { break; } let bytes_remaining = file_size.saturating_sub(page_start);
// Zero-pad when short of a full page read_page(&mut input, &mut page_buf, bytes_remaining)?;
if n < PAGE_SZ {
page_buf[n..].fill(0);
}
let dec = decrypt_page(enc_key, &page_buf, pgno as u32)?; let dec = decrypt_page(enc_key, &page_buf, pgno as u32)?;
output.write_all(&dec)?; output.write_all(&dec)?;
} }
Ok(()) Ok(())
} }
fn read_page(
input: &mut impl Read,
page_buf: &mut [u8],
bytes_remaining: usize,
) -> std::io::Result<usize> {
let expected = bytes_remaining.min(PAGE_SZ);
input.read_exact(&mut page_buf[..expected])?;
if expected < PAGE_SZ {
page_buf[expected..].fill(0);
}
Ok(expected)
}
#[cfg(test)]
mod tests {
use super::{read_page, PAGE_SZ};
use std::io::{self, Read};
struct ChunkedReader {
chunks: Vec<Vec<u8>>,
chunk_idx: usize,
offset: usize,
}
impl ChunkedReader {
fn new(chunks: Vec<Vec<u8>>) -> Self {
Self {
chunks,
chunk_idx: 0,
offset: 0,
}
}
}
impl Read for ChunkedReader {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
if self.chunk_idx >= self.chunks.len() {
return Ok(0);
}
let chunk = &self.chunks[self.chunk_idx];
let remaining = &chunk[self.offset..];
let n = remaining.len().min(buf.len());
buf[..n].copy_from_slice(&remaining[..n]);
self.offset += n;
if self.offset == chunk.len() {
self.chunk_idx += 1;
self.offset = 0;
}
Ok(n)
}
}
#[test]
fn read_page_reads_across_short_chunks() {
let mut reader = ChunkedReader::new(vec![vec![1; 32], vec![2; PAGE_SZ - 32]]);
let mut page_buf = vec![0u8; PAGE_SZ];
let n = read_page(&mut reader, &mut page_buf, PAGE_SZ).unwrap();
assert_eq!(n, PAGE_SZ);
assert_eq!(page_buf[0], 1);
assert_eq!(page_buf[31], 1);
assert_eq!(page_buf[32], 2);
assert_eq!(page_buf[PAGE_SZ - 1], 2);
}
#[test]
fn read_page_zero_pads_last_partial_page() {
let mut reader = ChunkedReader::new(vec![vec![7; 8], vec![9; 4]]);
let mut page_buf = vec![0u8; PAGE_SZ];
let n = read_page(&mut reader, &mut page_buf, 12).unwrap();
assert_eq!(n, 12);
assert_eq!(&page_buf[..8], &[7; 8]);
assert_eq!(&page_buf[8..12], &[9; 4]);
assert!(page_buf[12..].iter().all(|&b| b == 0));
}
#[test]
fn read_page_errors_on_early_eof() {
let mut reader = ChunkedReader::new(vec![vec![1; 8]]);
let mut page_buf = vec![0u8; PAGE_SZ];
let err = read_page(&mut reader, &mut page_buf, 16).unwrap_err();
assert_eq!(err.kind(), io::ErrorKind::UnexpectedEof);
}
}
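The change from a single `read` call to `read_exact` in `read_page` matters because `Read::read` may return fewer bytes than the buffer can hold even when more data follows. The following is a minimal sketch of that difference using only the standard library; `DEMO_PAGE`, `two_piece_reader`, `read_once`, and `read_full` are illustrative names that are not part of this diff, and the 8-byte page size is an assumption chosen to keep the example small.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical tiny page size, for illustration only.
const DEMO_PAGE: usize = 8;

/// A reader that hands out its data in two 4-byte pieces,
/// similar in spirit to the `ChunkedReader` used by the tests.
fn two_piece_reader() -> impl Read {
    Cursor::new(vec![1u8; 4]).chain(Cursor::new(vec![2u8; 4]))
}

/// One `read` call: it may stop at the first piece boundary.
fn read_once(input: &mut impl Read) -> io::Result<usize> {
    let mut buf = [0u8; DEMO_PAGE];
    input.read(&mut buf)
}

/// `read_exact`: keeps reading until the buffer is full,
/// or fails with `UnexpectedEof` if the input ends early.
fn read_full(input: &mut impl Read) -> io::Result<[u8; DEMO_PAGE]> {
    let mut buf = [0u8; DEMO_PAGE];
    input.read_exact(&mut buf)?;
    Ok(buf)
}
```

With the old loop, a short read in the middle of the file would have been zero-padded as if it were the final partial page; `read_exact` over `bytes_remaining.min(PAGE_SZ)` bytes restricts padding to the true tail of the file.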


@ -23,6 +23,40 @@ struct CacheEntry {
decrypted_path: PathBuf, decrypted_path: PathBuf,
} }
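/// Which path `DbCache::get_with_mode()` actually took to resolve rel_key this time.
///
/// latency tier:
/// - `CacheHit`: ~0ms, only returns the existing decrypted artifact
/// - `WalIncremental`: typically <10s, only applies the WAL incrementally on the cached DB
/// - `FullDecrypt`: the slowest path, can reach ~120s on large databases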
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CacheMode {
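/// Path 1: neither the main `.db` nor the WAL changed, so the cache is hit directly.
CacheHit,
/// Path 2: the main `.db` is unchanged and only the WAL changed; apply it incrementally on the cached DB.
WalIncremental,
/// Path 3: the main `.db` changed or the cache missed; run a full decrypt again.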
FullDecrypt,
}
impl CacheMode {
/// Hand-pinned snake_case strings, so that deriving `Serialize` directly on the enum in the future
/// does not silently change the wire format.
pub fn as_str(self) -> &'static str {
match self {
CacheMode::CacheHit => "cache_hit",
CacheMode::WalIncremental => "wal_incremental",
CacheMode::FullDecrypt => "full_decrypt",
}
}
}
#[derive(Debug, Clone)]
pub struct CacheResolve {
pub path: PathBuf,
pub mode: CacheMode,
}
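/// mtime-aware cache of decrypted databases /// mtime-aware cache of decrypted databases
/// ///
/// When the mtime of the database file (.db) or the WAL file (.db-wal) changes, /// When the mtime of the database file (.db) or the WAL file (.db-wal) changes,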
@ -30,30 +64,43 @@ struct CacheEntry {
pub struct DbCache { pub struct DbCache {
db_dir: PathBuf, db_dir: PathBuf,
cache_dir: PathBuf, cache_dir: PathBuf,
mtime_file: PathBuf,
all_keys: HashMap<String, String>, // rel_key -> enc_key(hex) all_keys: HashMap<String, String>, // rel_key -> enc_key(hex)
inner: Arc<Mutex<HashMap<String, CacheEntry>>>, inner: Arc<Mutex<HashMap<String, CacheEntry>>>,
} }
impl DbCache { impl DbCache {
pub async fn new( pub async fn new(db_dir: PathBuf, all_keys: HashMap<String, String>) -> Result<Self> {
Self::with_dirs(db_dir, config::cache_dir(), config::mtime_file(), all_keys).await
}
/// Inject `cache_dir` / `mtime_file` (used by tests, and reused by the production `new()`)
pub(crate) async fn with_dirs(
db_dir: PathBuf, db_dir: PathBuf,
cache_dir: PathBuf,
mtime_file: PathBuf,
all_keys: HashMap<String, String>, all_keys: HashMap<String, String>,
) -> Result<Self> { ) -> Result<Self> {
let cache_dir = config::cache_dir();
tokio::fs::create_dir_all(&cache_dir).await?; tokio::fs::create_dir_all(&cache_dir).await?;
let inner: HashMap<String, CacheEntry> = HashMap::new();
let cache = DbCache { let cache = DbCache {
db_dir, db_dir,
cache_dir, cache_dir,
mtime_file,
all_keys, all_keys,
inner: Arc::new(Mutex::new(inner)), inner: Arc::new(Mutex::new(HashMap::new())),
}; };
cache.load_persistent().await; cache.load_persistent().await;
Ok(cache) Ok(cache)
} }
/// Database root directory (i.e. `<wxchat_base>/db_storage`).
/// Upper layers (the attachment resolver) need `db_dir.parent()` to locate the decrypted images under `msg/attach/...`.
pub fn db_dir(&self) -> &Path {
&self.db_dir
}
fn cache_file_path(&self, rel_key: &str) -> PathBuf { fn cache_file_path(&self, rel_key: &str) -> PathBuf {
let hash = format!("{:x}", md5::compute(rel_key.as_bytes())); let hash = format!("{:x}", md5::compute(rel_key.as_bytes()));
self.cache_dir.join(format!("{}.db", hash)) self.cache_dir.join(format!("{}.db", hash))
@ -61,7 +108,7 @@ impl DbCache {
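/// Load the mtime records from the persisted file and reuse decrypted files that are still valid /// Load the mtime records from the persisted file and reuse decrypted files that are still valid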
async fn load_persistent(&self) { async fn load_persistent(&self) {
let mtime_file = config::mtime_file(); let mtime_file = &self.mtime_file;
let content = match tokio::fs::read_to_string(&mtime_file).await { let content = match tokio::fs::read_to_string(&mtime_file).await {
Ok(c) => c, Ok(c) => c,
Err(_) => return, Err(_) => return,
@ -78,18 +125,34 @@ impl DbCache {
if !dec_path.exists() { if !dec_path.exists() {
continue; continue;
} }
let db_path = self.db_dir.join(rel_key.replace('\\', std::path::MAIN_SEPARATOR_STR).replace('/', std::path::MAIN_SEPARATOR_STR)); let db_path = self.db_dir.join(
rel_key
.replace('\\', std::path::MAIN_SEPARATOR_STR)
.replace('/', std::path::MAIN_SEPARATOR_STR),
);
let wal_path = wal_path_for(&db_path); let wal_path = wal_path_for(&db_path);
let db_mt = mtime_nanos(&db_path); let db_mt = mtime_nanos(&db_path);
let wal_mt = if wal_path.exists() { mtime_nanos(&wal_path) } else { 0 }; let _wal_mt = if wal_path.exists() {
mtime_nanos(&wal_path)
} else {
0
};
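if db_mt == entry.db_mt && wal_mt == entry.wal_mt { // As long as the main .db is unchanged, load the cached artifact back.
inner.insert(rel_key.clone(), CacheEntry { // If the WAL mtime changed, a later `get()` automatically takes Path 2 (incremental apply_wal on the existing cached DB)
db_mtime: db_mt, // instead of the first request after a daemon restart falling back to a full decrypt.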
wal_mtime: wal_mt, if db_mt == entry.db_mt {
decrypted_path: dec_path, inner.insert(
}); rel_key.clone(),
CacheEntry {
db_mtime: db_mt,
// Keep the wal_mtime observed when the cached artifact was built, so `get()` can compare it with the current WAL
// and decide between an exact hit and a WAL-incremental apply.
wal_mtime: entry.wal_mt,
decrypted_path: dec_path,
},
);
reused += 1; reused += 1;
} }
} }
@ -100,15 +163,21 @@ impl DbCache {
/// Persist the mtime records /// Persist the mtime records
async fn save_persistent(&self) { async fn save_persistent(&self) {
let mtime_file = config::mtime_file(); let mtime_file = &self.mtime_file;
let inner = self.inner.lock().await; let inner = self.inner.lock().await;
let data: HashMap<String, MtimeEntry> = inner.iter().map(|(k, v)| { let data: HashMap<String, MtimeEntry> = inner
(k.clone(), MtimeEntry { .iter()
db_mt: v.db_mtime, .map(|(k, v)| {
wal_mt: v.wal_mtime, (
path: v.decrypted_path.to_string_lossy().into_owned(), k.clone(),
MtimeEntry {
db_mt: v.db_mtime,
wal_mt: v.wal_mtime,
path: v.decrypted_path.to_string_lossy().into_owned(),
},
)
}) })
}).collect(); .collect();
drop(inner); drop(inner);
if let Ok(json) = serde_json::to_string_pretty(&data) { if let Ok(json) = serde_json::to_string_pretty(&data) {
@ -118,84 +187,148 @@ impl DbCache {
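/// Get the path of the decrypted database /// Get the path of the decrypted database
/// ///
/// If the mtime is unchanged, return the cached path directly; otherwise decrypt again /// Three resolution paths:
/// 1. Neither the main `.db` nor the WAL mtime changed → return the cached path directly
/// 2. Main `.db` unchanged, WAL mtime changed → **incrementally** `apply_wal` on the existing cached artifact
/// (apply_wal is idempotent: old frames redo the same page writes, new frames take effect on top; no new full_decrypt)
/// 3. Main `.db` mtime changed → `full_decrypt` again, then `apply_wal`
///
/// WeChat only appends to the WAL when writing messages (unless a checkpoint is triggered), so path 2 is the common case;
/// it cuts "fully decrypt a ~1.8GB DB on every request (~120s)" down to "decrypt only the WAL frames (typically < 10s)".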
pub async fn get(&self, rel_key: &str) -> Result<Option<PathBuf>> { pub async fn get(&self, rel_key: &str) -> Result<Option<PathBuf>> {
Ok(self.get_with_mode(rel_key).await?.map(|r| r.path))
}
pub async fn get_with_mode(&self, rel_key: &str) -> Result<Option<CacheResolve>> {
let enc_key_hex = match self.all_keys.get(rel_key) { let enc_key_hex = match self.all_keys.get(rel_key) {
Some(k) => k.clone(), Some(k) => k.clone(),
None => return Ok(None), None => return Ok(None),
}; };
let db_path = self.db_dir.join( let db_path = self.db_dir.join(
rel_key.replace('\\', std::path::MAIN_SEPARATOR_STR) rel_key
.replace('/', std::path::MAIN_SEPARATOR_STR) .replace('\\', std::path::MAIN_SEPARATOR_STR)
.replace('/', std::path::MAIN_SEPARATOR_STR),
); );
if !db_path.exists() { if !db_path.exists() {
return Ok(None); return Ok(None);
} }
let wal_path = wal_path_for(&db_path); let wal_path = wal_path_for(&db_path);
let db_mt = mtime_nanos(&db_path); let db_mt = mtime_nanos(&db_path);
let wal_mt = if wal_path.exists() { mtime_nanos(&wal_path) } else { 0 }; let wal_mt = if wal_path.exists() {
mtime_nanos(&wal_path)
} else {
0
};
// Check the cache let cached = {
{
let inner = self.inner.lock().await; let inner = self.inner.lock().await;
if let Some(entry) = inner.get(rel_key) { inner.get(rel_key).cloned()
if entry.db_mtime == db_mt };
&& entry.wal_mtime == wal_mt
&& entry.decrypted_path.exists() let enc_key_bytes =
{ hex_to_32bytes(&enc_key_hex).with_context(|| format!("密钥格式错误: {}", rel_key))?;
return Ok(Some(entry.decrypted_path.clone()));
// Path 1 / Path 2: main .db mtime unchanged and the cached artifact still exists
if let Some(entry) = cached.as_ref() {
if entry.db_mtime == db_mt && entry.decrypted_path.exists() {
if entry.wal_mtime == wal_mt {
return Ok(Some(CacheResolve {
path: entry.decrypted_path.clone(),
mode: CacheMode::CacheHit,
}));
} }
// Path 2: WAL-only change → re-run apply_wal on the cached artifact
// A missing WAL must also update wal_mtime=0 (although SQLite does not spontaneously leave the main DB unchanged while clearing the WAL)
let out_path = entry.decrypted_path.clone();
let t0 = std::time::Instant::now();
if wal_path.exists() {
let out_path2 = out_path.clone();
let wal_path2 = wal_path.clone();
let key_copy = enc_key_bytes;
tokio::task::spawn_blocking(move || {
wal::apply_wal(&wal_path2, &out_path2, &key_copy)
})
.await??;
}
eprintln!(
"[cache] WAL 增量 {} ({}ms)",
rel_key,
t0.elapsed().as_millis()
);
{
let mut inner = self.inner.lock().await;
inner.insert(
rel_key.to_string(),
CacheEntry {
db_mtime: db_mt,
wal_mtime: wal_mt,
decrypted_path: out_path.clone(),
},
);
}
self.save_persistent().await;
return Ok(Some(CacheResolve {
path: out_path,
mode: CacheMode::WalIncremental,
}));
} }
} }
// Needs a fresh decrypt // Path 3: main .db changed / cache miss → full decrypt
let out_path = self.cache_file_path(rel_key); let out_path = self.cache_file_path(rel_key);
let enc_key_bytes = hex_to_32bytes(&enc_key_hex)
.with_context(|| format!("密钥格式错误: {}", rel_key))?;
let t0 = std::time::Instant::now(); let t0 = std::time::Instant::now();
let db_path2 = db_path.clone(); let db_path2 = db_path.clone();
let out_path2 = out_path.clone(); let out_path2 = out_path.clone();
let key_copy = enc_key_bytes; let key_copy = enc_key_bytes;
tokio::task::spawn_blocking(move || { tokio::task::spawn_blocking(move || crypto::full_decrypt(&db_path2, &out_path2, &key_copy))
crypto::full_decrypt(&db_path2, &out_path2, &key_copy) .await??;
}).await??;
// Apply the WAL
if wal_path.exists() { if wal_path.exists() {
let out_path3 = out_path.clone(); let out_path3 = out_path.clone();
let wal_path3 = wal_path.clone(); let wal_path3 = wal_path.clone();
let key_copy2 = enc_key_bytes; let key_copy2 = enc_key_bytes;
tokio::task::spawn_blocking(move || { tokio::task::spawn_blocking(move || wal::apply_wal(&wal_path3, &out_path3, &key_copy2))
wal::apply_wal(&wal_path3, &out_path3, &key_copy2) .await??;
}).await??;
} }
let elapsed_ms = t0.elapsed().as_millis(); eprintln!(
eprintln!("[cache] 解密 {} ({}ms)", rel_key, elapsed_ms); "[cache] 全量解密 {} ({}ms)",
rel_key,
t0.elapsed().as_millis()
);
// Update the in-memory cache
{ {
let mut inner = self.inner.lock().await; let mut inner = self.inner.lock().await;
inner.insert(rel_key.to_string(), CacheEntry { inner.insert(
db_mtime: db_mt, rel_key.to_string(),
wal_mtime: wal_mt, CacheEntry {
decrypted_path: out_path.clone(), db_mtime: db_mt,
}); wal_mtime: wal_mt,
decrypted_path: out_path.clone(),
},
);
} }
self.save_persistent().await; self.save_persistent().await;
Ok(Some(out_path)) Ok(Some(CacheResolve {
path: out_path,
mode: CacheMode::FullDecrypt,
}))
} }
} }
pub(super) fn mtime_nanos(path: &Path) -> u64 { pub(super) fn mtime_nanos(path: &Path) -> u64 {
std::fs::metadata(path) std::fs::metadata(path)
.and_then(|m| m.modified()) .and_then(|m| m.modified())
.map(|t| t.duration_since(std::time::UNIX_EPOCH).unwrap_or_default().as_nanos() as u64) .map(|t| {
t.duration_since(std::time::UNIX_EPOCH)
.unwrap_or_default()
.as_nanos() as u64
})
.unwrap_or(0) .unwrap_or(0)
} }
@ -217,3 +350,307 @@ fn hex_to_32bytes(s: &str) -> Result<[u8; 32]> {
} }
Ok(out) Ok(out)
} }
#[cfg(test)]
mod tests {
use super::*;
/// 64-char hex (need not be a real SQLCipher key; it only serves to prove whether full_decrypt was triggered)
const FAKE_KEY_HEX: &str = "0000000000000000000000000000000000000000000000000000000000000000";
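/// Convention for telling the paths apart:
/// - exact hit / WAL incremental → `decrypted_path` **contents unchanged**
/// - full decrypt → `crypto::full_decrypt` **rewrites the cached file to a multiple of PAGE_SZ**
/// (the fake key decrypts to 4096 bytes of garbage, but it is still written; content validity is not checked)
/// So whether the cached file's size changed tells us which path was taken.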
const ORIGINAL_CACHED_BYTES: &[u8] = b"original cached contents";
fn unique_tmpdir(tag: &str) -> PathBuf {
let pid = std::process::id();
let nanos = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap()
.as_nanos();
let p = std::env::temp_dir().join(format!("wx-cli-cache-test-{}-{}-{}", tag, pid, nanos));
std::fs::create_dir_all(&p).unwrap();
p
}
/// Prepare an initial state in which DbCache has already reused the cached decrypted artifact.
/// Returns (cache, db_path, decrypted_path, mtime_file, rel_key).
async fn setup_seeded_cache(tag: &str) -> (DbCache, PathBuf, PathBuf, PathBuf, String) {
let root = unique_tmpdir(tag);
let db_dir = root.join("db_storage");
let cache_dir = root.join("cache");
std::fs::create_dir_all(&db_dir).unwrap();
std::fs::create_dir_all(&cache_dir).unwrap();
let rel_key = "message_0.db".to_string();
let db_path = db_dir.join(&rel_key);
std::fs::write(&db_path, b"fake encrypted db").unwrap();
let cached_hash = format!("{:x}", md5::compute(rel_key.as_bytes()));
let decrypted_path = cache_dir.join(format!("{}.db", cached_hash));
std::fs::write(&decrypted_path, ORIGINAL_CACHED_BYTES).unwrap();
let db_mt = mtime_nanos(&db_path);
let mtime_file = cache_dir.join("_mtimes.json");
let payload = serde_json::to_string(&serde_json::json!({
&rel_key: {
"db_mt": db_mt,
"wal_mt": 0u64,
"path": decrypted_path.display().to_string(),
}
}))
.unwrap();
std::fs::write(&mtime_file, payload).unwrap();
let mut all_keys = HashMap::new();
all_keys.insert(rel_key.clone(), FAKE_KEY_HEX.to_string());
let cache = DbCache::with_dirs(db_dir, cache_dir, mtime_file.clone(), all_keys)
.await
.unwrap();
(cache, db_path, decrypted_path, mtime_file, rel_key)
}
#[tokio::test]
async fn exact_mtime_hit_skips_decrypt() {
let (cache, _db_path, decrypted_path, _mtime_file, rel_key) =
setup_seeded_cache("exact").await;
let p = cache
.get(&rel_key)
.await
.unwrap()
.expect("cache should hit");
assert_eq!(p, decrypted_path);
// Exact hit → the cached file contents must not change
let body = std::fs::read(&decrypted_path).unwrap();
assert_eq!(body, ORIGINAL_CACHED_BYTES);
}
#[tokio::test]
async fn wal_only_change_uses_incremental_path() {
// Built by hand (not via setup_seeded_cache) so the initial mtime.json records both db_mt and wal_mt
let root = unique_tmpdir("walonly");
let db_dir = root.join("db_storage");
let cache_dir = root.join("cache");
std::fs::create_dir_all(&db_dir).unwrap();
std::fs::create_dir_all(&cache_dir).unwrap();
let rel_key = "message_0.db".to_string();
let db_path = db_dir.join(&rel_key);
std::fs::write(&db_path, b"fake encrypted db").unwrap();
let wal_path = wal_path_for(&db_path);
std::fs::write(&wal_path, [0u8; 31]).unwrap(); // ≤ WAL_HDR_SZ=32 → apply_wal noop
let cached_hash = format!("{:x}", md5::compute(rel_key.as_bytes()));
let decrypted_path = cache_dir.join(format!("{}.db", cached_hash));
std::fs::write(&decrypted_path, ORIGINAL_CACHED_BYTES).unwrap();
let db_mt = mtime_nanos(&db_path);
let wal_mt0 = mtime_nanos(&wal_path);
let mtime_file = cache_dir.join("_mtimes.json");
let payload = serde_json::to_string(&serde_json::json!({
&rel_key: {
"db_mt": db_mt,
"wal_mt": wal_mt0,
"path": decrypted_path.display().to_string(),
}
}))
.unwrap();
std::fs::write(&mtime_file, payload).unwrap();
let mut all_keys = HashMap::new();
all_keys.insert(rel_key.clone(), FAKE_KEY_HEX.to_string());
let cache = DbCache::with_dirs(db_dir, cache_dir, mtime_file, all_keys)
.await
.unwrap();
// First call: exact hit
let p1 = cache.get(&rel_key).await.unwrap().expect("first get hits");
assert_eq!(p1, decrypted_path);
assert_eq!(
std::fs::read(&decrypted_path).unwrap(),
ORIGINAL_CACHED_BYTES
);
// bump the WAL mtime (rewritten, still 31 bytes, so apply_wal is still a noop)
std::thread::sleep(std::time::Duration::from_millis(20));
std::fs::write(&wal_path, [0xffu8; 31]).unwrap();
let wal_mt1 = mtime_nanos(&wal_path);
assert_ne!(wal_mt0, wal_mt1, "rewriting WAL should bump mtime");
// Second call: WAL-incremental path
// If full_decrypt were taken by mistake, the cached file would be rewritten to a size ≥ PAGE_SZ
let p2 = cache
.get(&rel_key)
.await
.unwrap()
.expect("WAL-incremental path should produce path");
assert_eq!(p2, decrypted_path);
let body = std::fs::read(&decrypted_path).unwrap();
assert_eq!(
body, ORIGINAL_CACHED_BYTES,
"WAL-incremental should NOT rewrite cached file"
);
}
#[tokio::test]
async fn db_mtime_change_triggers_full_decrypt() {
let (cache, db_path, decrypted_path, _mtime_file, rel_key) =
setup_seeded_cache("dbchange").await;
// bump the main .db mtime (rewrite it with different bytes)
std::thread::sleep(std::time::Duration::from_millis(20));
std::fs::write(&db_path, b"different fake encrypted bytes").unwrap();
assert_ne!(
mtime_nanos(&db_path),
cache.inner.lock().await.get(&rel_key).unwrap().db_mtime,
"rewriting db file should bump mtime"
);
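// Takes the full_decrypt path → the fake key does not make full_decrypt fail (it does not validate contents),
// but the cached file is rewritten to a multiple of PAGE_SZ. The original content is 24 bytes; after the rewrite it should be ≥ 4096 bytes.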
let p = cache
.get(&rel_key)
.await
.unwrap()
.expect("cache should produce path");
assert_eq!(p, decrypted_path);
let new_size = std::fs::metadata(&decrypted_path).unwrap().len() as usize;
assert!(
new_size >= crate::crypto::PAGE_SZ,
"expected full_decrypt to rewrite cached file to PAGE_SZ multiple, got size={}",
new_size,
);
}
#[tokio::test]
async fn get_with_mode_reports_each_path() {
let root = unique_tmpdir("getwithmode");
let db_dir = root.join("db_storage");
let cache_dir = root.join("cache");
std::fs::create_dir_all(&db_dir).unwrap();
std::fs::create_dir_all(&cache_dir).unwrap();
let rel_key = "message_0.db".to_string();
let db_path = db_dir.join(&rel_key);
std::fs::write(&db_path, b"fake encrypted db").unwrap();
let wal_path = wal_path_for(&db_path);
std::fs::write(&wal_path, [0u8; 31]).unwrap();
let cached_hash = format!("{:x}", md5::compute(rel_key.as_bytes()));
let decrypted_path = cache_dir.join(format!("{}.db", cached_hash));
std::fs::write(&decrypted_path, ORIGINAL_CACHED_BYTES).unwrap();
let db_mt = mtime_nanos(&db_path);
let wal_mt0 = mtime_nanos(&wal_path);
let mtime_file = cache_dir.join("_mtimes.json");
let payload = serde_json::to_string(&serde_json::json!({
&rel_key: {
"db_mt": db_mt,
"wal_mt": wal_mt0,
"path": decrypted_path.display().to_string(),
}
}))
.unwrap();
std::fs::write(&mtime_file, payload).unwrap();
let mut all_keys = HashMap::new();
all_keys.insert(rel_key.clone(), FAKE_KEY_HEX.to_string());
let cache = DbCache::with_dirs(db_dir, cache_dir, mtime_file, all_keys)
.await
.unwrap();
let hit = cache
.get_with_mode(&rel_key)
.await
.unwrap()
.expect("cache should hit");
assert_eq!(hit.path, decrypted_path);
assert_eq!(hit.mode, CacheMode::CacheHit);
std::thread::sleep(std::time::Duration::from_millis(20));
std::fs::write(&wal_path, [0xffu8; 31]).unwrap();
let wal = cache
.get_with_mode(&rel_key)
.await
.unwrap()
.expect("WAL-only change should stay incremental");
assert_eq!(wal.path, decrypted_path);
assert_eq!(wal.mode, CacheMode::WalIncremental);
std::thread::sleep(std::time::Duration::from_millis(20));
std::fs::write(&db_path, b"different bytes").unwrap();
let full = cache
.get_with_mode(&rel_key)
.await
.unwrap()
.expect("db mtime change should trigger full decrypt");
assert_eq!(full.path, decrypted_path);
assert_eq!(full.mode, CacheMode::FullDecrypt);
}
#[tokio::test]
async fn restart_with_wal_change_still_reuses_cached_db_then_applies_wal() {
let root = unique_tmpdir("restart-wal");
let db_dir = root.join("db_storage");
let cache_dir = root.join("cache");
std::fs::create_dir_all(&db_dir).unwrap();
std::fs::create_dir_all(&cache_dir).unwrap();
let rel_key = "message_0.db".to_string();
let db_path = db_dir.join(&rel_key);
std::fs::write(&db_path, b"fake encrypted db").unwrap();
let wal_path = wal_path_for(&db_path);
std::fs::write(&wal_path, [0u8; 31]).unwrap(); // the WAL-incremental apply is still a noop
let cached_hash = format!("{:x}", md5::compute(rel_key.as_bytes()));
let decrypted_path = cache_dir.join(format!("{}.db", cached_hash));
std::fs::write(&decrypted_path, ORIGINAL_CACHED_BYTES).unwrap();
let db_mt = mtime_nanos(&db_path);
let wal_mt0 = mtime_nanos(&wal_path);
let mtime_file = cache_dir.join("_mtimes.json");
let payload = serde_json::to_string(&serde_json::json!({
&rel_key: {
"db_mt": db_mt,
"wal_mt": wal_mt0,
"path": decrypted_path.display().to_string(),
}
}))
.unwrap();
std::fs::write(&mtime_file, payload).unwrap();
// Simulate new messages being written to the WAL before the daemon restarts
std::thread::sleep(std::time::Duration::from_millis(20));
std::fs::write(&wal_path, [0xffu8; 31]).unwrap();
let wal_mt1 = mtime_nanos(&wal_path);
assert_ne!(wal_mt0, wal_mt1);
let mut all_keys = HashMap::new();
all_keys.insert(rel_key.clone(), FAKE_KEY_HEX.to_string());
let cache = DbCache::with_dirs(db_dir, cache_dir, mtime_file, all_keys)
.await
.unwrap();
let p = cache
.get(&rel_key)
.await
.unwrap()
.expect("cache should reuse persisted DB");
assert_eq!(p, decrypted_path);
let body = std::fs::read(&decrypted_path).unwrap();
assert_eq!(
body, ORIGINAL_CACHED_BYTES,
"restart + WAL-only change should still reuse cached DB and avoid full_decrypt"
);
}
}
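The three resolution paths documented on `get()` reduce to a comparison of two mtimes against the recorded cache entry. The sketch below isolates that decision as a pure function. It is not the crate's code: `Mode`, `Snapshot`, and `choose_mode` are hypothetical names, and the real `get_with_mode()` additionally runs the decryption, applies the WAL, and persists the updated entry.

```rust
/// Hypothetical stand-in for the crate's `CacheMode`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    CacheHit,
    WalIncremental,
    FullDecrypt,
}

/// Hypothetical stand-in for the mtimes recorded when the cached copy was built.
#[derive(Debug, Clone, Copy)]
struct Snapshot {
    db_mtime: u64,
    wal_mtime: u64,
}

/// Decide which path a lookup would take, given the recorded snapshot
/// (if any), whether the decrypted file still exists, and the mtimes
/// currently observed on disk.
fn choose_mode(
    cached: Option<Snapshot>,
    cached_file_exists: bool,
    db_mt: u64,
    wal_mt: u64,
) -> Mode {
    match cached {
        // Main .db unchanged and the decrypted artifact is still there:
        // only the WAL mtime decides between path 1 and path 2.
        Some(entry) if entry.db_mtime == db_mt && cached_file_exists => {
            if entry.wal_mtime == wal_mt {
                Mode::CacheHit
            } else {
                Mode::WalIncremental
            }
        }
        // Main .db changed, artifact missing, or no entry at all.
        _ => Mode::FullDecrypt,
    }
}
```

Keeping the recorded `wal_mtime` (rather than the current one) when reloading persisted entries is what lets a restarted daemon land in the `WalIncremental` branch instead of `FullDecrypt`.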

269
src/daemon/meta.rs 100644

@ -0,0 +1,269 @@
//! Freshness metadata appended to every q_* response.
//!
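//! Background: `all_keys.json` is a snapshot taken at `wx init` time. WeChat may create
//! new `message_N.db` shards at any time after the daemon starts; if we trust only the `msg_db_keys` list received at init,
//! the data in the new shards is completely invisible to the daemon → the caller gets a result that looks normal but is missing data ("stale").
//!
//! This module is responsible for:
//! 1. Providing the `Meta` struct, which each `q_*` function fills in and attaches to the response (top-level `meta` field).
//! 2. Providing `discover_unknown_shards(db_dir, msg_db_keys)`: scan the `message/message_*.db` files that actually
//! exist on disk right now and diff out the list of "unknown shards" for which the daemon holds no enc_key.
//! 3. Centralizing the `MetaStatus` decision rules so the 8 q_* functions do not each decide on their own and drift apart.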
use serde::Serialize;
use std::collections::HashMap;
use std::path::Path;
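/// "Freshness metadata" attached to every q_* response.
///
/// When serialized to JSON, all `Option` fields are omitted when `None`, keeping the output of the most common
/// command invocations short. Heavy fields (per_shard_*, shard_paths) are left unset by default and are only included
/// when the CLI layer explicitly requests them via switches such as `--debug-source`.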
#[derive(Debug, Clone, Serialize, Default)]
pub struct Meta {
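/// create_time (unix seconds) of the newest row among the matched data.
/// Queries based on Msg_ tables, such as `q_history` / `q_search` / `q_new_messages`, should all fill this in.
/// Queries based on SessionTable, such as `q_sessions` / `q_unread`, fill in the latest session-level ts.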
#[serde(skip_serializing_if = "Option::is_none")]
pub chat_latest_timestamp: Option<i64>,
/// rel_key of the shard holding that newest message (e.g. `message/message_3.db`).
/// Lets an agent see at a glance which shard the newest matched data came from.
#[serde(skip_serializing_if = "Option::is_none")]
pub chat_latest_db: Option<String>,
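/// The value of `session.db.SessionTable.last_timestamp` for this chat (if readable).
/// This is the "latest message time" written by WeChat itself; comparing it with `chat_latest_timestamp` above
/// reveals "session says there is an update but history did not read it" → a missed shard.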
#[serde(skip_serializing_if = "Option::is_none")]
pub session_last_timestamp: Option<i64>,
/// Number of shards actually traversed by this query (at most `names.msg_db_keys.len()`; shards that matched 0 rows are counted too).
pub shards_scanned: usize,
/// Number of shards that returned at least 1 row in this query.
pub shards_hit: usize,
/// rel_keys of shards that exist on disk but for which the daemon has no enc_key.
/// Non-empty ⇒ WeChat split off new shards after `wx init` → `wx init` must be re-run.
pub unknown_shards: Vec<String>,
/// Overall status derived from the fields above; this is the main one the CLI / agent looks at.
pub status: MetaStatus,
// Heavy/debug fields: unset by default, enabled explicitly by the CLI layer
#[serde(skip_serializing_if = "Option::is_none")]
pub per_shard_latest: Option<HashMap<String, i64>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub cache_mode_per_shard: Option<HashMap<String, String>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub shard_paths: Option<HashMap<String, String>>,
}
#[derive(Debug, Clone, Copy, Serialize, PartialEq, Eq, Default)]
#[serde(rename_all = "snake_case")]
pub enum MetaStatus {
#[default]
Ok,
/// The latest time in `session.db` is clearly ahead of this message query's result, so the data may be stale or incomplete.
PossiblyStale,
/// Strongest signal: new shards unknown to the daemon appeared on disk; usually `wx init --force` must be re-run.
PossiblyStaleUnknownShards,
/// The caller explicitly passed window conditions such as `since` / `until` / `offset`, so the result is inherently a partial view.
Windowed,
}
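/// How many seconds session must lead history before `PossiblyStale` is reported.
///
/// The 24h value is deliberately conservative: active group/private chats rarely go a whole day without new messages,
/// so beyond this window it is worth explicitly warning the agent not to treat the result as the "current latest state".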
pub const STALE_THRESHOLD_SECS: i64 = 24 * 3600;
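/// Unified priority for the freshness status:
/// 1. `unknown_shards` non-empty: the daemon's overall view is already out of date, so return `PossiblyStaleUnknownShards` first
/// 2. `windowed=true`: the caller is looking at a partial window anyway, so no stale inference is made
/// 3. `session_last - chat_latest > STALE_THRESHOLD_SECS`: return `PossiblyStale`
/// 4. Otherwise: `Ok`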
pub fn derive_status(
chat_latest: Option<i64>,
session_last: Option<i64>,
unknown_shards: &[String],
windowed: bool,
) -> MetaStatus {
if !unknown_shards.is_empty() {
return MetaStatus::PossiblyStaleUnknownShards;
}
if windowed {
return MetaStatus::Windowed;
}
match (chat_latest, session_last) {
(Some(c), Some(s)) if s - c > STALE_THRESHOLD_SECS => MetaStatus::PossiblyStale,
_ => MetaStatus::Ok,
}
}
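/// Scan the `message_*.db` files that actually exist under `<db_dir>/message/` and diff out the unknown shards
/// for which the daemon currently has no key.
///
/// Contract:
/// - Returned values are always `/`-separated rel_keys (e.g. `message/message_3.db`), aligned with `all_keys.json`
/// - Results are sorted lexicographically, for stable tests and CLI display
/// - `_fts*` / `_resource*` are excluded because they are index/attachment databases, not authoritative message shards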
pub fn discover_unknown_shards(db_dir: &Path, known: &[String]) -> Vec<String> {
let known_set: std::collections::HashSet<String> =
known.iter().map(|k| k.replace('\\', "/")).collect();
let msg_dir = db_dir.join("message");
let entries = match std::fs::read_dir(&msg_dir) {
Ok(it) => it,
Err(_) => return Vec::new(),
};
let mut unknown: Vec<String> = Vec::new();
for entry in entries.flatten() {
let name = entry.file_name();
let Some(name_str) = name.to_str() else {
continue;
};
if !is_message_shard(name_str) {
continue;
}
let rel = format!("message/{}", name_str);
if !known_set.contains(&rel) {
unknown.push(rel);
}
}
unknown.sort();
unknown
}
fn is_message_shard(file_name: &str) -> bool {
if !file_name.starts_with("message_") || !file_name.ends_with(".db") {
return false;
}
if file_name.contains("_fts") || file_name.contains("_resource") {
return false;
}
let stem = &file_name["message_".len()..file_name.len() - ".db".len()];
!stem.is_empty() && stem.chars().all(|c| c.is_ascii_digit())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn is_message_shard_accepts_normal_shards() {
assert!(is_message_shard("message_0.db"));
assert!(is_message_shard("message_12.db"));
}
#[test]
fn is_message_shard_rejects_fts_and_resource() {
assert!(!is_message_shard("message_0_fts.db"));
assert!(!is_message_shard("message_fts.db"));
assert!(!is_message_shard("message_0_resource.db"));
assert!(!is_message_shard("message_resource.db"));
}
#[test]
fn is_message_shard_rejects_non_digits() {
assert!(!is_message_shard("message_a.db"));
assert!(!is_message_shard("message_.db"));
assert!(!is_message_shard("session.db"));
assert!(!is_message_shard("message_0.db.bak"));
}
#[test]
fn discover_unknown_shards_finds_disk_only_shards() {
let dir = tempdir();
let msg_dir = dir.join("message");
std::fs::create_dir_all(&msg_dir).unwrap();
for f in [
"message_0.db",
"message_1.db",
"message_2.db",
"message_0_fts.db",
] {
std::fs::write(msg_dir.join(f), b"").unwrap();
}
let known = vec![
"message/message_0.db".to_string(),
"message/message_1.db".to_string(),
];
let unknown = discover_unknown_shards(&dir, &known);
assert_eq!(unknown, vec!["message/message_2.db".to_string()]);
}
#[test]
fn discover_unknown_shards_normalizes_backslash_in_known_keys() {
let dir = tempdir();
let msg_dir = dir.join("message");
std::fs::create_dir_all(&msg_dir).unwrap();
std::fs::write(msg_dir.join("message_0.db"), b"").unwrap();
let known = vec!["message\\message_0.db".to_string()];
assert!(discover_unknown_shards(&dir, &known).is_empty());
}
#[test]
fn discover_unknown_shards_returns_empty_when_message_dir_missing() {
let dir = tempdir();
assert!(discover_unknown_shards(&dir, &[]).is_empty());
}
#[test]
fn derive_status_unknown_shards_overrides_windowed() {
let unknown = vec!["message/message_3.db".to_string()];
assert_eq!(
derive_status(Some(100), Some(100), &unknown, true),
MetaStatus::PossiblyStaleUnknownShards
);
}
#[test]
fn derive_status_windowed_when_user_paginates() {
assert_eq!(
derive_status(Some(100), Some(999_999), &[], true),
MetaStatus::Windowed,
);
}
#[test]
fn derive_status_possibly_stale_when_session_far_ahead() {
let chat = Some(1_000_000);
let session = Some(1_000_000 + STALE_THRESHOLD_SECS + 1);
assert_eq!(
derive_status(chat, session, &[], false),
MetaStatus::PossiblyStale
);
}
#[test]
fn derive_status_ok_when_within_threshold() {
let chat = Some(1_000_000);
let session = Some(1_000_000 + STALE_THRESHOLD_SECS - 1);
assert_eq!(derive_status(chat, session, &[], false), MetaStatus::Ok);
}
#[test]
fn derive_status_ok_when_either_side_unknown() {
assert_eq!(
derive_status(None, Some(999_999_999), &[], false),
MetaStatus::Ok
);
assert_eq!(derive_status(Some(1), None, &[], false), MetaStatus::Ok);
assert_eq!(derive_status(None, None, &[], false), MetaStatus::Ok);
}
fn tempdir() -> std::path::PathBuf {
let pid = std::process::id();
let nanos = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap()
.as_nanos();
let p = std::env::temp_dir().join(format!("wx-cli-meta-test-{}-{}", pid, nanos));
std::fs::create_dir_all(&p).unwrap();
p
}
}
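Because `MetaStatus` is serialized with `rename_all = "snake_case"`, a JSON consumer sees the `status` field as one of four strings. The following is a rough sketch of how a caller might branch on it. The `Action` enum and `action_for` function are hypothetical and not part of this change; the mapping only restates what the doc comments above say about each variant.

```rust
/// Hypothetical consumer-side reaction to the `meta.status` string.
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// Result can be treated as the current state.
    Trust,
    /// Caller asked for a window; the result is a partial view by design.
    PartialView,
    /// session.db is far ahead of the messages read; treat as possibly incomplete.
    Recheck,
    /// Shards exist on disk without keys; `wx init` has to be re-run.
    ReinitRequired,
    /// A status string this sketch does not know about.
    Unknown,
}

/// Map the snake_case wire form of `MetaStatus` to an action.
fn action_for(status: &str) -> Action {
    match status {
        "ok" => Action::Trust,
        "windowed" => Action::PartialView,
        "possibly_stale" => Action::Recheck,
        "possibly_stale_unknown_shards" => Action::ReinitRequired,
        _ => Action::Unknown,
    }
}
```

Since `derive_status` checks `unknown_shards` before `windowed`, a windowed query can still report `possibly_stale_unknown_shards`, so a consumer should not assume a paginated call always yields `windowed`.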


@ -1,4 +1,5 @@
pub mod cache; pub mod cache;
pub mod meta;
pub mod query; pub mod query;
pub mod server; pub mod server;
@ -8,6 +9,39 @@ use std::sync::Arc;
use crate::config; use crate::config;
fn normalized_rel_key(rel_key: &str) -> String {
rel_key.replace('\\', "/")
}
fn is_msg_db_key(rel_key: &str) -> bool {
let rel_key = normalized_rel_key(rel_key);
rel_key.starts_with("message/message_")
&& rel_key.ends_with(".db")
&& !rel_key.contains("_fts")
&& !rel_key.contains("_resource")
}
fn is_biz_msg_db_key(rel_key: &str) -> bool {
let rel_key = normalized_rel_key(rel_key);
rel_key.starts_with("message/biz_message_")
&& rel_key.ends_with(".db")
&& !rel_key.contains("_fts")
&& !rel_key.contains("_resource")
}
fn collect_db_keys(
all_keys: &HashMap<String, String>,
predicate: fn(&str) -> bool,
) -> Vec<String> {
let mut keys: Vec<String> = all_keys
.keys()
.filter(|k| predicate(k))
.cloned()
.collect();
keys.sort();
keys
}
/// Daemon entry point /// Daemon entry point
/// ///
/// Called by main() when the WX_DAEMON_MODE environment variable is set /// Called by main() when the WX_DAEMON_MODE environment variable is set
@ -25,9 +59,7 @@ async fn async_run() -> Result<()> {
tokio::fs::create_dir_all(&cli_dir).await?; tokio::fs::create_dir_all(&cli_dir).await?;
tokio::fs::create_dir_all(config::cache_dir()).await?; tokio::fs::create_dir_all(config::cache_dir()).await?;
// Write the PID file
let pid = std::process::id(); let pid = std::process::id();
tokio::fs::write(config::pid_path(), pid.to_string()).await?;
// Register SIGTERM / SIGINT handlers // Register SIGTERM / SIGINT handlers
setup_signal_handler().await; setup_signal_handler().await;
@ -39,7 +71,8 @@ async fn async_run() -> Result<()> {
eprintln!("[daemon] DB_DIR: {}", cfg.db_dir.display()); eprintln!("[daemon] DB_DIR: {}", cfg.db_dir.display());
// Load keys // Load keys
let keys_content = tokio::fs::read_to_string(&cfg.keys_file).await let keys_content = tokio::fs::read_to_string(&cfg.keys_file)
.await
.map_err(|e| anyhow::anyhow!("failed to read key file {:?}: {}", cfg.keys_file, e))?; .map_err(|e| anyhow::anyhow!("failed to read key file {:?}: {}", cfg.keys_file, e))?;
let keys_raw: serde_json::Value = serde_json::from_str(&keys_content)?; let keys_raw: serde_json::Value = serde_json::from_str(&keys_content)?;
let all_keys = extract_keys(&keys_raw); let all_keys = extract_keys(&keys_raw);
@ -49,14 +82,8 @@ async fn async_run() -> Result<()> {
let db = Arc::new(cache::DbCache::new(cfg.db_dir.clone(), all_keys.clone()).await?); let db = Arc::new(cache::DbCache::new(cfg.db_dir.clone(), all_keys.clone()).await?);
// Collect the list of message DBs // Collect the list of message DBs
let msg_db_keys: Vec<String> = all_keys.keys() let msg_db_keys = collect_db_keys(&all_keys, is_msg_db_key);
.filter(|k| { let biz_msg_db_keys = collect_db_keys(&all_keys, is_biz_msg_db_key);
let k = k.replace('\\', "/");
k.contains("message/message_") && k.ends_with(".db")
&& !k.contains("_fts") && !k.contains("_resource")
})
.cloned()
.collect();
// Warm-up: load contacts + decrypt session.db // Warm-up: load contacts + decrypt session.db
eprintln!("[daemon] warming up..."); eprintln!("[daemon] warming up...");
@ -66,18 +93,27 @@ async fn async_run() -> Result<()> {
map: HashMap::new(), map: HashMap::new(),
md5_to_uname: HashMap::new(), md5_to_uname: HashMap::new(),
msg_db_keys: Vec::new(), msg_db_keys: Vec::new(),
biz_msg_db_keys: Vec::new(),
verify_flags: HashMap::new(),
} }
}); });
let mut names = names_raw; let mut names = names_raw;
names.msg_db_keys = msg_db_keys; names.msg_db_keys = msg_db_keys;
names.biz_msg_db_keys = biz_msg_db_keys;
let _ = db.get("session/session.db").await; let _ = db.get("session/session.db").await;
let _ = db.get("sns/sns.db").await;
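eprintln!("[daemon] warm-up complete, {} contacts", names.map.len()); eprintln!("[daemon] warm-up complete, {} contacts", names.map.len());
let names_arc = Arc::new(std::sync::RwLock::new(names)); // Wrap in an inner Arc: after taking the guard, an IPC request only does an Arc::clone (O(1)),
// which avoids cloning the full HashMap of several thousand contacts on every request.
// tokio::sync::RwLock allows the guard to be held across an await (it currently is not; this leaves room for a future reload).
let names_arc = Arc::new(tokio::sync::RwLock::new(Arc::new(names)));
// Start the IPC server (blocking) // Start the IPC server (blocking)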
server::serve(Arc::clone(&db), Arc::clone(&names_arc)).await?; let serve_result = server::serve(Arc::clone(&db), Arc::clone(&names_arc)).await;
cleanup_ipc_files();
serve_result?;
Ok(()) Ok(())
} }
@ -91,7 +127,9 @@ fn extract_keys(json: &serde_json::Value) -> HashMap<String, String> {
let mut result = HashMap::new(); let mut result = HashMap::new();
if let Some(obj) = json.as_object() { if let Some(obj) = json.as_object() {
for (k, v) in obj { for (k, v) in obj {
if k.starts_with('_') { continue; } if k.starts_with('_') {
continue;
}
let enc_key = if let Some(s) = v.as_str() { let enc_key = if let Some(s) = v.as_str() {
s.to_string() s.to_string()
} else if let Some(obj2) = v.as_object() { } else if let Some(obj2) = v.as_object() {
@ -127,8 +165,38 @@ async fn setup_signal_handler() {
}); });
} }
#[cfg(unix)]
fn cleanup_and_exit() { fn cleanup_and_exit() {
let _ = std::fs::remove_file(config::sock_path()); cleanup_ipc_files();
let _ = std::fs::remove_file(config::pid_path());
std::process::exit(0); std::process::exit(0);
} }
fn cleanup_ipc_files() {
let _ = std::fs::remove_file(config::sock_path());
let _ = std::fs::remove_file(config::pid_path());
}
#[cfg(test)]
mod tests {
use super::{is_biz_msg_db_key, is_msg_db_key};
#[test]
fn message_db_key_filter_ignores_biz_and_auxiliary_files() {
assert!(is_msg_db_key("message/message_0.db"));
assert!(is_msg_db_key("message\\message_12.db"));
assert!(!is_msg_db_key("message/biz_message_0.db"));
assert!(!is_msg_db_key("message/message_0.db-wal"));
assert!(!is_msg_db_key("message/message_0_fts.db"));
assert!(!is_msg_db_key("message/message_0_resource.db"));
}
#[test]
fn biz_message_db_key_filter_matches_only_biz_shards() {
assert!(is_biz_msg_db_key("message/biz_message_0.db"));
assert!(is_biz_msg_db_key("message\\biz_message_3.db"));
assert!(!is_biz_msg_db_key("message/message_0.db"));
assert!(!is_biz_msg_db_key("message/biz_message_0.db-wal"));
assert!(!is_biz_msg_db_key("message/biz_message_0_fts.db"));
assert!(!is_biz_msg_db_key("message/biz_message_0_resource.db"));
}
}
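As a usage illustration for the shard filters in this file, the sketch below copies `normalized_rel_key`, the two predicates, and `collect_db_keys` from the diff and runs them over a small key map. `sample_keys` and its placeholder key values are hypothetical, added only for the example.

```rust
use std::collections::HashMap;

fn normalized_rel_key(rel_key: &str) -> String {
    rel_key.replace('\\', "/")
}

fn is_msg_db_key(rel_key: &str) -> bool {
    let rel_key = normalized_rel_key(rel_key);
    rel_key.starts_with("message/message_")
        && rel_key.ends_with(".db")
        && !rel_key.contains("_fts")
        && !rel_key.contains("_resource")
}

fn is_biz_msg_db_key(rel_key: &str) -> bool {
    let rel_key = normalized_rel_key(rel_key);
    rel_key.starts_with("message/biz_message_")
        && rel_key.ends_with(".db")
        && !rel_key.contains("_fts")
        && !rel_key.contains("_resource")
}

fn collect_db_keys(all_keys: &HashMap<String, String>, predicate: fn(&str) -> bool) -> Vec<String> {
    let mut keys: Vec<String> = all_keys.keys().filter(|k| predicate(k)).cloned().collect();
    // Sorting makes the shard order deterministic regardless of HashMap iteration order.
    keys.sort();
    keys
}

/// Hypothetical key map for illustration; the values stand in for real key material.
fn sample_keys() -> HashMap<String, String> {
    [
        "message/message_0.db",
        "message\\message_1.db",
        "message/biz_message_0.db",
        "message/message_0_fts.db",
        "session/session.db",
    ]
    .iter()
    .map(|k| (k.to_string(), "00".to_string()))
    .collect()
}
```

Note that the predicates normalize backslashes only for matching: the collected keys keep their original spelling, and the sort is done on those raw strings.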

File diff suppressed because it is too large

View File

@ -2,15 +2,12 @@ use anyhow::Result;
use std::sync::Arc; use std::sync::Arc;
use tokio::io::{AsyncBufReadExt, AsyncWriteExt, BufReader}; use tokio::io::{AsyncBufReadExt, AsyncWriteExt, BufReader};
use crate::ipc::{Request, Response};
use super::cache::DbCache; use super::cache::DbCache;
use super::query::Names; use super::query::Names;
use crate::ipc::{Request, Response};
/// Start the IPC server (Unix socket / Windows named pipe) /// Start the IPC server (Unix socket / Windows named pipe)
pub async fn serve( pub async fn serve(db: Arc<DbCache>, names: Arc<tokio::sync::RwLock<Arc<Names>>>) -> Result<()> {
db: Arc<DbCache>,
names: Arc<std::sync::RwLock<Names>>,
) -> Result<()> {
#[cfg(unix)] #[cfg(unix)]
serve_unix(db, names).await?; serve_unix(db, names).await?;
#[cfg(windows)] #[cfg(windows)]
@ -19,10 +16,7 @@ pub async fn serve(
} }
#[cfg(unix)] #[cfg(unix)]
async fn serve_unix( async fn serve_unix(db: Arc<DbCache>, names: Arc<tokio::sync::RwLock<Arc<Names>>>) -> Result<()> {
db: Arc<DbCache>,
names: Arc<std::sync::RwLock<Names>>,
) -> Result<()> {
use tokio::net::UnixListener; use tokio::net::UnixListener;
let sock_path = crate::config::sock_path(); let sock_path = crate::config::sock_path();
@ -58,7 +52,7 @@ async fn serve_unix(
async fn handle_connection_unix( async fn handle_connection_unix(
stream: tokio::net::UnixStream, stream: tokio::net::UnixStream,
db: Arc<DbCache>, db: Arc<DbCache>,
names: Arc<std::sync::RwLock<Names>>, names: Arc<tokio::sync::RwLock<Arc<Names>>>,
) -> Result<()> { ) -> Result<()> {
let (reader, mut writer) = stream.into_split(); let (reader, mut writer) = stream.into_split();
let mut lines = BufReader::new(reader).lines(); let mut lines = BufReader::new(reader).lines();
@ -86,18 +80,17 @@ async fn handle_connection_unix(
#[cfg(windows)] #[cfg(windows)]
async fn serve_windows( async fn serve_windows(
db: Arc<DbCache>, db: Arc<DbCache>,
names: Arc<std::sync::RwLock<Names>>, names: Arc<tokio::sync::RwLock<Arc<Names>>>,
) -> Result<()> { ) -> Result<()> {
use interprocess::local_socket::{ use interprocess::local_socket::{tokio::prelude::*, GenericNamespaced, ListenerOptions};
tokio::prelude::*, GenericNamespaced, ListenerOptions,
};
let pipe_name = r"\\.\pipe\wx-cli-daemon"; // On Windows, interprocess's GenericNamespaced automatically prepends the `\\.\pipe\` prefix,
let name = pipe_name.to_ns_name::<GenericNamespaced>()?; // so a relative name must be passed here; a client that opens `\\.\pipe\wx-cli-daemon` directly will then match
let name = "wx-cli-daemon".to_ns_name::<GenericNamespaced>()?;
let opts = ListenerOptions::new().name(name); let opts = ListenerOptions::new().name(name);
let listener = opts.create_tokio()?; let listener = opts.create_tokio()?;
eprintln!("[server] listening on {}", pipe_name); eprintln!("[server] listening on \\\\.\\pipe\\wx-cli-daemon");
loop { loop {
let conn = listener.accept().await?; let conn = listener.accept().await?;
@ -105,118 +98,264 @@ async fn serve_windows(
let names2 = Arc::clone(&names); let names2 = Arc::clone(&names);
tokio::spawn(async move { tokio::spawn(async move {
if let Err(e) = handle_connection_generic(conn, db2, names2).await { if let Err(e) = handle_connection_windows(conn, db2, names2).await {
eprintln!("[server] connection handling error: {}", e); eprintln!("[server] connection handling error: {}", e);
} }
}); });
} }
} }
async fn dispatch( #[cfg(windows)]
req: Request, async fn handle_connection_windows(
db: &DbCache, conn: interprocess::local_socket::tokio::Stream,
names: &std::sync::RwLock<Names>, db: Arc<DbCache>,
) -> Response { names: Arc<tokio::sync::RwLock<Arc<Names>>>,
use crate::ipc::Request::*; ) -> Result<()> {
let (reader, mut writer) = tokio::io::split(conn);
let mut lines = BufReader::new(reader).lines();
let line = match lines.next_line().await? {
Some(l) => l,
None => return Ok(()),
};
let req: Request = match serde_json::from_str(&line) {
Ok(r) => r,
Err(e) => {
let resp = Response::err(format!("JSON parse error: {}", e));
writer.write_all(resp.to_json_line()?.as_bytes()).await?;
return Ok(());
}
};
let resp = dispatch(req, &db, &names).await;
writer.write_all(resp.to_json_line()?.as_bytes()).await?;
Ok(())
}
async fn dispatch(req: Request, db: &DbCache, names: &tokio::sync::RwLock<Arc<Names>>) -> Response {
use super::query; use super::query;
use crate::ipc::Request::*;
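// Take the guard → O(1) Arc clone → drop the lock immediately. The lock is not held during later awaits,
// so concurrent IPC requests can genuinely run in parallel. Names itself is immutable (built once
// at daemon startup), so sharing the Arc is enough.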
let names_arc: Arc<Names> = {
let guard = names.read().await;
Arc::clone(&*guard)
};
match req { match req {
Ping => Response::ok(serde_json::json!({ "pong": true })), Ping => Response::ok(serde_json::json!({ "pong": true })),
Sessions { limit } => { Sessions {
let names_snapshot = match clone_names(names) { limit,
Ok(n) => n, with_meta,
Err(e) => return Response::err(e), debug_source,
}; } => match query::q_sessions(db, &names_arc, limit, with_meta, debug_source).await {
match query::q_sessions(db, &names_snapshot, limit).await { Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()),
},
History {
chat,
limit,
offset,
since,
until,
msg_type,
with_meta,
debug_source,
} => {
match query::q_history(
db,
&names_arc,
&chat,
limit,
offset,
since,
until,
msg_type,
with_meta,
debug_source,
)
.await
{
Ok(v) => Response::ok(v), Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()), Err(e) => Response::err(e.to_string()),
} }
} }
History { chat, limit, offset, since, until, msg_type } => { Search {
let names_snapshot = match clone_names(names) { keyword,
Ok(n) => n, chats,
Err(e) => return Response::err(e), limit,
}; since,
match query::q_history(db, &names_snapshot, &chat, limit, offset, since, until, msg_type).await { until,
Ok(v) => Response::ok(v), msg_type,
Err(e) => Response::err(e.to_string()), with_meta,
} debug_source,
} } => {
Search { keyword, chats, limit, since, until, msg_type } => { match query::q_search(
let names_snapshot = match clone_names(names) { db,
Ok(n) => n, &names_arc,
Err(e) => return Response::err(e), &keyword,
}; chats,
match query::q_search(db, &names_snapshot, &keyword, chats, limit, since, until, msg_type).await { limit,
since,
until,
msg_type,
with_meta,
debug_source,
)
.await
{
Ok(v) => Response::ok(v), Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()), Err(e) => Response::err(e.to_string()),
} }
} }
Contacts { query, limit } => { Contacts { query, limit } => {
let names_snapshot = match clone_names(names) { match query::q_contacts(&names_arc, query.as_deref(), limit).await {
Ok(n) => n,
Err(e) => return Response::err(e),
};
match query::q_contacts(&names_snapshot, query.as_deref(), limit).await {
Ok(v) => Response::ok(v), Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()), Err(e) => Response::err(e.to_string()),
} }
} }
Unread { limit } => { Unread {
let names_snapshot = match clone_names(names) { limit,
Ok(n) => n, filter,
Err(e) => return Response::err(e), with_meta,
}; debug_source,
match query::q_unread(db, &names_snapshot, limit).await { } => match query::q_unread(db, &names_arc, limit, filter, with_meta, debug_source).await {
Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()),
},
Members { chat } => match query::q_members(db, &names_arc, &chat).await {
Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()),
},
NewMessages {
state,
limit,
with_meta,
debug_source,
} => {
match query::q_new_messages(db, &names_arc, state, limit, with_meta, debug_source).await
{
Ok(v) => Response::ok(v), Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()), Err(e) => Response::err(e.to_string()),
} }
} }
Members { chat } => { Favorites {
let names_snapshot = match clone_names(names) { limit,
Ok(n) => n, fav_type,
Err(e) => return Response::err(e), query,
}; } => match query::q_favorites(db, limit, fav_type, query).await {
match query::q_members(db, &names_snapshot, &chat).await { Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()),
},
Stats {
chat,
since,
until,
with_meta,
debug_source,
} => {
match query::q_stats(db, &names_arc, &chat, since, until, with_meta, debug_source).await
{
Ok(v) => Response::ok(v), Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()), Err(e) => Response::err(e.to_string()),
} }
} }
NewMessages { state, limit } => { SnsNotifications {
let names_snapshot = match clone_names(names) { limit,
Ok(n) => n, since,
Err(e) => return Response::err(e), until,
}; include_read,
match query::q_new_messages(db, &names_snapshot, state, limit).await { } => {
match query::q_sns_notifications(db, &names_arc, limit, since, until, include_read)
.await
{
Ok(v) => Response::ok(v), Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()), Err(e) => Response::err(e.to_string()),
} }
} }
Favorites { limit, fav_type, query } => { SnsFeed {
match query::q_favorites(db, limit, fav_type, query).await { limit,
since,
until,
user,
} => match query::q_sns_feed(db, &names_arc, limit, since, until, user.as_deref()).await {
Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()),
},
SnsSearch {
keyword,
limit,
since,
until,
user,
} => {
match query::q_sns_search(
db,
&names_arc,
&keyword,
limit,
since,
until,
user.as_deref(),
)
.await
{
Ok(v) => Response::ok(v), Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()), Err(e) => Response::err(e.to_string()),
} }
} }
Stats { chat, since, until } => { ReloadConfig => Response::ok(serde_json::json!({ "reloading": true })),
let names_snapshot = match clone_names(names) { BizArticles {
Ok(n) => n, limit,
Err(e) => return Response::err(e), account,
}; since,
match query::q_stats(db, &names_snapshot, &chat, since, until).await { until,
unread,
} => {
match query::q_biz_articles(db, &names_arc, limit, account, since, until, unread).await
{
Ok(v) => Response::ok(v), Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()), Err(e) => Response::err(e.to_string()),
} }
} }
Attachments {
chat,
kinds,
limit,
offset,
since,
until,
with_meta,
debug_source,
} => {
match query::q_attachments(
db,
&names_arc,
&chat,
kinds,
limit,
offset,
since,
until,
with_meta,
debug_source,
)
.await
{
Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()),
}
}
Extract {
attachment_id,
output,
overwrite,
} => match query::q_extract(db, &names_arc, &attachment_id, &output, overwrite).await {
Ok(v) => Response::ok(v),
Err(e) => Response::err(e.to_string()),
},
} }
} }
/// Clone Names so that the RwLockGuard is not held across an await
fn clone_names(names: &std::sync::RwLock<Names>) -> Result<Names, String> {
let guard = names.read().map_err(|_| "internal error: names lock poisoned".to_string())?;
Ok(Names {
map: guard.map.clone(),
md5_to_uname: guard.md5_to_uname.clone(),
msg_db_keys: guard.msg_db_keys.clone(),
})
}
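The change from `RwLock<Names>` plus `clone_names` to `RwLock<Arc<Names>>` swaps a deep copy per request for a reference-count bump. Below is a minimal sketch of that snapshot pattern using `std::sync::RwLock` so that it needs no external crate; the diff itself uses `tokio::sync::RwLock`. The reduced `Names` struct and the `snapshot` and `reload` helpers are hypothetical names for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Reduced stand-in for the daemon's Names struct (assumption: only one field shown).
struct Names {
    map: HashMap<String, String>,
}

/// Take the read guard, clone the inner Arc (O(1)), and release the lock at once.
/// The caller can then use the snapshot for as long as it likes without blocking writers.
fn snapshot(names: &RwLock<Arc<Names>>) -> Arc<Names> {
    let guard = names.read().expect("names lock poisoned");
    Arc::clone(&*guard)
}

/// Swap in a freshly built Names. Snapshots handed out earlier stay valid
/// and keep pointing at the old data until they are dropped.
fn reload(names: &RwLock<Arc<Names>>, fresh: Names) {
    let mut guard = names.write().expect("names lock poisoned");
    *guard = Arc::new(fresh);
}
```

With this layout a later reload only has to replace the inner `Arc`; in-flight requests finish against the snapshot they started with.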

View File

@ -1,6 +1,6 @@
use std::collections::HashMap;
use serde::{Deserialize, Serialize}; use serde::{Deserialize, Serialize};
use serde_json::Value; use serde_json::Value;
use std::collections::HashMap;
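/// Requests sent from the CLI to the daemon (newline-delimited JSON, compatible with the Python version) /// Requests sent from the CLI to the daemon (newline-delimited JSON, compatible with the Python version)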
#[derive(Debug, Clone, Serialize, Deserialize)] #[derive(Debug, Clone, Serialize, Deserialize)]
@ -10,6 +10,10 @@ pub enum Request {
Sessions { Sessions {
#[serde(default = "default_limit_20")] #[serde(default = "default_limit_20")]
limit: usize, limit: usize,
#[serde(default, skip_serializing_if = "is_false")]
with_meta: bool,
#[serde(default, skip_serializing_if = "is_false")]
debug_source: bool,
}, },
History { History {
chat: String, chat: String,
@ -23,6 +27,10 @@ pub enum Request {
until: Option<i64>, until: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
msg_type: Option<i64>, msg_type: Option<i64>,
#[serde(default, skip_serializing_if = "is_false")]
with_meta: bool,
#[serde(default, skip_serializing_if = "is_false")]
debug_source: bool,
}, },
Search { Search {
keyword: String, keyword: String,
@ -36,6 +44,10 @@ pub enum Request {
until: Option<i64>, until: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
msg_type: Option<i64>, msg_type: Option<i64>,
#[serde(default, skip_serializing_if = "is_false")]
with_meta: bool,
#[serde(default, skip_serializing_if = "is_false")]
debug_source: bool,
}, },
Contacts { Contacts {
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
@ -46,6 +58,13 @@ pub enum Request {
Unread { Unread {
#[serde(default = "default_limit_20")] #[serde(default = "default_limit_20")]
limit: usize, limit: usize,
/// Filter by session type: private / group / official / folded / all (multiple values allowed)
#[serde(default, skip_serializing_if = "Option::is_none")]
filter: Option<Vec<String>>,
#[serde(default, skip_serializing_if = "is_false")]
with_meta: bool,
#[serde(default, skip_serializing_if = "is_false")]
debug_source: bool,
}, },
Members { Members {
chat: String, chat: String,
@ -57,6 +76,10 @@ pub enum Request {
state: Option<HashMap<String, i64>>, state: Option<HashMap<String, i64>>,
#[serde(default = "default_limit_200")] #[serde(default = "default_limit_200")]
limit: usize, limit: usize,
#[serde(default, skip_serializing_if = "is_false")]
with_meta: bool,
#[serde(default, skip_serializing_if = "is_false")]
debug_source: bool,
}, },
Stats { Stats {
chat: String, chat: String,
@ -64,6 +87,10 @@ pub enum Request {
since: Option<i64>, since: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
until: Option<i64>, until: Option<i64>,
#[serde(default, skip_serializing_if = "is_false")]
with_meta: bool,
#[serde(default, skip_serializing_if = "is_false")]
debug_source: bool,
}, },
Favorites { Favorites {
#[serde(default = "default_limit_50")] #[serde(default = "default_limit_50")]
@ -75,9 +102,91 @@ pub enum Request {
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
query: Option<String>, query: Option<String>,
}, },
/// Moments interaction notifications (likes + comments)
SnsNotifications {
#[serde(default = "default_limit_50")]
limit: usize,
#[serde(skip_serializing_if = "Option::is_none")]
since: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")]
until: Option<i64>,
/// Include read notifications (default: unread only)
#[serde(default)]
include_read: bool,
},
/// Moments timeline (filter posts by time / author)
SnsFeed {
#[serde(default = "default_limit_20")]
limit: usize,
#[serde(skip_serializing_if = "Option::is_none")]
since: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")]
until: Option<i64>,
/// Author nickname / remark name / WeChat username (fuzzy match)
#[serde(skip_serializing_if = "Option::is_none")]
user: Option<String>,
},
/// Query Official Account article pushes (biz_message_*.db shards)
BizArticles {
#[serde(default = "default_limit_50")]
limit: usize,
/// Official Account name filter (fuzzy match on display name; None = all)
#[serde(skip_serializing_if = "Option::is_none")]
account: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
since: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")]
until: Option<i64>,
/// Only Official Accounts with unread messages, taking the latest article from each
#[serde(default)]
unread: bool,
},
/// Moments full-text search (matches contentDesc)
SnsSearch {
keyword: String,
#[serde(default = "default_limit_20")]
limit: usize,
#[serde(skip_serializing_if = "Option::is_none")]
since: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")]
until: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")]
user: Option<String>,
},
/// Reload config and keys (the daemon does not re-read them automatically after init --force)
ReloadConfig,
/// List the image attachments in a chat
/// Each output row carries an `attachment_id` (an opaque base64url handle); pass it to `Extract` to retrieve the content
Attachments {
chat: String,
/// Kind filter: currently only image is supported
#[serde(default, skip_serializing_if = "Option::is_none")]
kinds: Option<Vec<String>>,
#[serde(default = "default_limit_50")]
limit: usize,
#[serde(default)]
offset: usize,
#[serde(skip_serializing_if = "Option::is_none")]
since: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")]
until: Option<i64>,
#[serde(default, skip_serializing_if = "is_false")]
with_meta: bool,
#[serde(default, skip_serializing_if = "is_false")]
debug_source: bool,
},
/// Extract (decrypt) a single attachment's content to the given path
Extract {
/// Opaque ID returned by `Attachments`
attachment_id: String,
/// Absolute output path (the daemon writes to disk directly; no binary data goes over the socket)
output: String,
/// Whether to overwrite the file if it already exists
#[serde(default)]
overwrite: bool,
},
} }
/// Response from the daemon /// Response from the daemon
#[derive(Debug, Clone, Serialize, Deserialize)] #[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Response { pub struct Response {
@ -90,11 +199,19 @@ pub struct Response {
impl Response { impl Response {
pub fn ok(data: Value) -> Self { pub fn ok(data: Value) -> Self {
Self { ok: true, error: None, data } Self {
ok: true,
error: None,
data,
}
} }
pub fn err(msg: impl Into<String>) -> Self { pub fn err(msg: impl Into<String>) -> Self {
Self { ok: false, error: Some(msg.into()), data: Value::Null } Self {
ok: false,
error: Some(msg.into()),
data: Value::Null,
}
} }
pub fn to_json_line(&self) -> anyhow::Result<String> { pub fn to_json_line(&self) -> anyhow::Result<String> {
@ -103,6 +220,15 @@ impl Response {
} }
} }
fn default_limit_20() -> usize { 20 } fn default_limit_20() -> usize {
fn default_limit_50() -> usize { 50 } 20
fn default_limit_200() -> usize { 200 } }
fn default_limit_50() -> usize {
50
}
fn default_limit_200() -> usize {
200
}
fn is_false(v: &bool) -> bool {
!*v
}

View File

@ -4,6 +4,7 @@ mod crypto;
mod scanner; mod scanner;
mod daemon; mod daemon;
mod cli; mod cli;
mod attachment;
fn main() { fn main() {
if std::env::var("WX_DAEMON_MODE").is_ok() { if std::env::var("WX_DAEMON_MODE").is_ok() {

View File

@ -3,7 +3,7 @@
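/// Enumerates memory regions via /proc/<pid>/maps, /// Enumerates memory regions via /proc/<pid>/maps,
/// reads memory contents via /proc/<pid>/mem, /// reads memory contents via /proc/<pid>/mem,
/// and searches for SQLCipher keys in the x'<64hex><32hex>' format /// and searches for SQLCipher keys in the x'<64hex><32hex>' format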
use anyhow::{bail, Context, Result}; use anyhow::{Context, Result};
use std::io::{Read, Seek, SeekFrom}; use std::io::{Read, Seek, SeekFrom};
use std::path::Path; use std::path::Path;
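The doc comment above describes the key format as `x'<64hex><32hex>'`. As a rough illustration of what searching a memory chunk for that pattern involves, here is a minimal sketch over a plain byte slice. It is not the scanner's actual code: `find_key_candidates` and `HEX_LEN` are hypothetical names, and requiring the closing quote is an assumption.

```rust
/// Hex payload length: 64 hex characters of key followed by 32 of salt.
const HEX_LEN: usize = 96;

/// Scan a buffer for x'<96 hex chars>' and return de-duplicated,
/// lowercased (key_hex, salt_hex) pairs in order of first appearance.
fn find_key_candidates(buf: &[u8]) -> Vec<(String, String)> {
    let mut out: Vec<(String, String)> = Vec::new();
    let total = 2 + HEX_LEN + 1; // x' prefix + hex payload + closing '
    let mut i = 0;
    while i + total <= buf.len() {
        if buf[i] == b'x' && buf[i + 1] == b'\'' {
            let hex = &buf[i + 2..i + 2 + HEX_LEN];
            let closed = buf[i + 2 + HEX_LEN] == b'\'';
            if closed && hex.iter().all(|b| b.is_ascii_hexdigit()) {
                // The payload is pure ASCII here, so byte offsets are valid string offsets.
                let text = String::from_utf8_lossy(hex).to_lowercase();
                let pair = (text[..64].to_string(), text[64..].to_string());
                if !out.contains(&pair) {
                    out.push(pair);
                }
                i += total;
                continue;
            }
        }
        i += 1;
    }
    out
}
```

A real scanner reads memory in chunks, so it also has to overlap consecutive chunks by at least the pattern length to avoid missing a match that straddles a boundary.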

View File

@ -110,7 +110,20 @@ pub fn scan_keys(db_dir: &Path) -> Result<Vec<KeyEntry>> {
let kr = task_for_pid(mach_task_self(), pid, &mut task); let kr = task_for_pid(mach_task_self(), pid, &mut task);
if kr != KERN_SUCCESS { if kr != KERN_SUCCESS {
bail!( bail!(
"task_for_pid failed (kr={})\nPlease confirm: (1) running as root (2) WeChat is ad-hoc signed", "task_for_pid failed (kr={}). To fix it, follow these steps:\n\
\n\
1. Re-sign WeChat with an ad-hoc signature:\n\
codesign --force --deep --sign - /Applications/WeChat.app\n\
\n\
2. Restart WeChat:\n\
killall WeChat && open /Applications/WeChat.app\n\
\n\
3. Run as root:\n\
sudo wx init\n\
\n\
If codesign reports \"signature in use\", first run:\n\
codesign --remove-signature /Applications/WeChat.app/Contents/Frameworks/vlc_plugins/librtp_mpeg4_plugin.dylib\n\
codesign --force --deep --sign - /Applications/WeChat.app",
kr kr
); );
} }

View File

@ -5,19 +5,19 @@
/// - OpenProcess: obtain a process handle (requires PROCESS_VM_READ | PROCESS_QUERY_INFORMATION) /// - OpenProcess: obtain a process handle (requires PROCESS_VM_READ | PROCESS_QUERY_INFORMATION)
/// - VirtualQueryEx: enumerate memory regions /// - VirtualQueryEx: enumerate memory regions
/// - ReadProcessMemory: read memory contents /// - ReadProcessMemory: read memory contents
use anyhow::{bail, Context, Result}; use anyhow::{Context, Result};
use std::path::Path; use std::path::Path;
use windows::Win32::Foundation::{CloseHandle, HANDLE}; use windows::Win32::Foundation::{CloseHandle, HANDLE};
use windows::Win32::System::Diagnostics::Debug::ReadProcessMemory;
use windows::Win32::System::Diagnostics::ToolHelp::{ use windows::Win32::System::Diagnostics::ToolHelp::{
CreateToolhelp32Snapshot, Process32First, Process32Next, PROCESSENTRY32, TH32CS_SNAPPROCESS, CreateToolhelp32Snapshot, Process32First, Process32Next, PROCESSENTRY32, TH32CS_SNAPPROCESS,
}; };
use windows::Win32::System::Memory::{ use windows::Win32::System::Memory::{
VirtualQueryEx, MEMORY_BASIC_INFORMATION, MEM_COMMIT, PAGE_READWRITE, VirtualQueryEx, MEMORY_BASIC_INFORMATION, MEM_COMMIT, PAGE_EXECUTE_READWRITE,
PAGE_EXECUTE_WRITECOPY, PAGE_GUARD, PAGE_NOCACHE, PAGE_READWRITE, PAGE_WRITECOMBINE,
PAGE_WRITECOPY,
}; };
use windows::Win32::System::Threading::{ use windows::Win32::System::Threading::{OpenProcess, PROCESS_QUERY_INFORMATION, PROCESS_VM_READ};
OpenProcess, PROCESS_QUERY_INFORMATION, PROCESS_VM_READ,
};
use windows::Win32::System::Diagnostics::Debug::ReadProcessMemory;
use super::{collect_db_salts, KeyEntry}; use super::{collect_db_salts, KeyEntry};
@ -27,9 +27,7 @@ const CHUNK_SIZE: usize = 2 * 1024 * 1024;
/// Find the PID of the Weixin.exe process /// Find the PID of the Weixin.exe process
fn find_wechat_pid() -> Option<u32> { fn find_wechat_pid() -> Option<u32> {
// SAFETY: CreateToolhelp32Snapshot is a standard Windows API // SAFETY: CreateToolhelp32Snapshot is a standard Windows API
let snap = unsafe { let snap = unsafe { CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0).ok()? };
CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0).ok()?
};
let mut entry = PROCESSENTRY32 { let mut entry = PROCESSENTRY32 {
dwSize: std::mem::size_of::<PROCESSENTRY32>() as u32, dwSize: std::mem::size_of::<PROCESSENTRY32>() as u32,
@ -43,8 +41,8 @@ fn find_wechat_pid() -> Option<u32> {
return None; return None;
} }
loop { loop {
let name = std::ffi::CStr::from_ptr(entry.szExeFile.as_ptr() as *const i8) let name =
.to_string_lossy(); std::ffi::CStr::from_ptr(entry.szExeFile.as_ptr() as *const i8).to_string_lossy();
if name.eq_ignore_ascii_case("Weixin.exe") { if name.eq_ignore_ascii_case("Weixin.exe") {
let pid = entry.th32ProcessID; let pid = entry.th32ProcessID;
let _ = CloseHandle(snap); let _ = CloseHandle(snap);
@ -60,8 +58,7 @@ fn find_wechat_pid() -> Option<u32> {
} }
pub fn scan_keys(db_dir: &Path) -> Result<Vec<KeyEntry>> { pub fn scan_keys(db_dir: &Path) -> Result<Vec<KeyEntry>> {
let pid = find_wechat_pid() let pid = find_wechat_pid().context("Weixin.exe process not found; please make sure WeChat is running")?;
.context("Weixin.exe process not found; please make sure WeChat is running")?;
eprintln!("WeChat PID: {}", pid); eprintln!("WeChat PID: {}", pid);
// SAFETY: OpenProcess requests read access // SAFETY: OpenProcess requests read access
@ -78,7 +75,9 @@ pub fn scan_keys(db_dir: &Path) -> Result<Vec<KeyEntry>> {
eprintln!("found {} candidate keys", raw_keys.len()); eprintln!("found {} candidate keys", raw_keys.len());
// SAFETY: close the process handle // SAFETY: close the process handle
unsafe { let _ = CloseHandle(process); } unsafe {
let _ = CloseHandle(process);
}
let mut entries = Vec::new(); let mut entries = Vec::new();
for (key_hex, salt_hex) in &raw_keys { for (key_hex, salt_hex) in &raw_keys {
@ -119,8 +118,9 @@ fn scan_memory(process: HANDLE) -> Result<Vec<(String, String)>> {
let region_size = mbi.RegionSize; let region_size = mbi.RegionSize;
let base = mbi.BaseAddress as usize; let base = mbi.BaseAddress as usize;
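// Only scan committed read-write pages // Only scan committed pages that are readable and writable. Windows protection flags may carry modifier bits,
if mbi.State == MEM_COMMIT && mbi.Protect == PAGE_READWRITE { // and may also be protection types such as WRITECOPY / EXECUTE_READWRITE, which are likewise readable and writable.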
if mbi.State == MEM_COMMIT && is_writable_readable_page(mbi.Protect.0) {
scan_region(process, base, region_size, &mut results); scan_region(process, base, region_size, &mut results);
} }
@ -133,12 +133,18 @@ fn scan_memory(process: HANDLE) -> Result<Vec<(String, String)>> {
Ok(results) Ok(results)
} }
fn scan_region( fn is_writable_readable_page(protect: u32) -> bool {
process: HANDLE, let base = protect & !(PAGE_GUARD.0 | PAGE_NOCACHE.0 | PAGE_WRITECOMBINE.0);
base: usize, matches!(
size: usize, base,
results: &mut Vec<(String, String)>, x if x == PAGE_READWRITE.0
) { || x == PAGE_WRITECOPY.0
|| x == PAGE_EXECUTE_READWRITE.0
|| x == PAGE_EXECUTE_WRITECOPY.0
)
}
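To make the effect of the new protection check concrete, here is a portable sketch of `is_writable_readable_page` that uses plain `u32` constants instead of the `windows` crate types. The numeric values are the documented Win32 memory-protection constants; treating them as bare integers is a simplification for illustration only.

```rust
// Win32 memory-protection constants, written out as plain integers.
const PAGE_READWRITE: u32 = 0x04;
const PAGE_WRITECOPY: u32 = 0x08;
const PAGE_EXECUTE_READWRITE: u32 = 0x40;
const PAGE_EXECUTE_WRITECOPY: u32 = 0x80;
// Modifier bits that can be OR-ed onto a base protection value.
const PAGE_GUARD: u32 = 0x100;
const PAGE_NOCACHE: u32 = 0x200;
const PAGE_WRITECOMBINE: u32 = 0x400;

/// Strip the modifier bits, then accept any base protection
/// that is both readable and writable.
fn is_writable_readable_page(protect: u32) -> bool {
    let base = protect & !(PAGE_GUARD | PAGE_NOCACHE | PAGE_WRITECOMBINE);
    matches!(
        base,
        PAGE_READWRITE | PAGE_WRITECOPY | PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY
    )
}
```

The old check, `mbi.Protect == PAGE_READWRITE`, would reject a region reported as `PAGE_READWRITE | PAGE_NOCACHE` (0x204), while the masked comparison accepts it and still rejects read-only pages such as 0x02.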
fn scan_region(process: HANDLE, base: usize, size: usize, results: &mut Vec<(String, String)>) {
let overlap = HEX_PATTERN_LEN + 3; let overlap = HEX_PATTERN_LEN + 3;
let mut offset = 0usize; let mut offset = 0usize;
@ -159,7 +165,8 @@ fn scan_region(
buf.as_mut_ptr() as *mut _, buf.as_mut_ptr() as *mut _,
chunk_size, chunk_size,
Some(&mut bytes_read), Some(&mut bytes_read),
).is_ok() )
.is_ok()
}; };
if ok && bytes_read > 0 { if ok && bytes_read > 0 {
@ -203,10 +210,8 @@ fn search_pattern(buf: &[u8], results: &mut Vec<(String, String)>) {
i += 1; i += 1;
continue; continue;
} }
let key_hex = String::from_utf8_lossy(&buf[hex_start..hex_start + 64]) let key_hex = String::from_utf8_lossy(&buf[hex_start..hex_start + 64]).to_lowercase();
.to_lowercase(); let salt_hex = String::from_utf8_lossy(&buf[hex_start + 64..hex_start + 96]).to_lowercase();
let salt_hex = String::from_utf8_lossy(&buf[hex_start + 64..hex_start + 96])
.to_lowercase();
let is_dup = results.iter().any(|(k, s)| k == &key_hex && s == &salt_hex); let is_dup = results.iter().any(|(k, s)| k == &key_hex && s == &salt_hex);
if !is_dup { if !is_dup {
results.push((key_hex, salt_hex)); results.push((key_hex, salt_hex));