Commit Graph

161 Commits (main)
 

Author SHA1 Message Date
jakevin 08af894594
fix(biz-articles): read all biz_message shards (#81) 2026-05-19 14:19:02 +08:00
jakevin 94fcc36ffe
feat(attachments): expose stable group sender identity (#77)
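In group chats, `q_attachments` previously emitted only the `sender`
field (taken from the group card name), so when two members shared the
same nickname a JSON consumer could not tell who sent an image.

Following #68, append `sender_username / sender_contact_display /
sender_group_nickname` to each attachment row, reusing the
`add_sender_identity` / `sender_username` helpers introduced in PR68,
so field semantics stay identical across the 4 existing outputs
(history / search / new-messages / stats.top_senders) and attachments.

Changes:
- `q_attachments` tuple grows from 7 to 8 fields (adds one stable wxid)
- one extra `sender_username` computation inside spawn_blocking, O(1) per row
- the JSON build step calls `add_sender_identity`, with matching behavior:
  the three fields are omitted for non-group chats or when the wxid
  cannot be resolved

Tests / docs:
- add `attachment_row_gets_stable_group_sender_identity_via_helper`,
  locking in "two same-named members can be told apart by sender_username"
  and "no fake fields are appended for non-group / unknown senders"
- README + SKILL.md document the new fields in both the `attachments`
  section and the top-level "sender selection strategy" section, noting
  that the fields are omitted when the wxid cannot be resolved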
closes #23
2026-05-19 01:44:03 +08:00
jackwener 0612789d19 Merge pull request #68 from t0m1sacat/kael/sender-identity
fix: expose stable group sender identity
2026-05-19 01:14:58 +08:00
jackwener f8550ae74d Merge pull request #63 from Icy-Cat/feat/windows-mydocument-keyword
feat(windows): resolve MyDocument: token in Weixin data-root ini
2026-05-19 01:14:58 +08:00
jackwener 5f87ce6348 Merge pull request #62 from Icy-Cat/fix/init-error-shows-config-path
fix(init): show config.json path in auto-detect error
2026-05-19 01:14:58 +08:00
jackwener ed95812332 Merge pull request #76 from Suda202/fix/group-nickname-field
fix(members): ignore non-card fields for group nicknames
2026-05-19 01:14:58 +08:00
suda be1a174226 fix(members): ignore non-card fields for group nicknames 2026-05-18 23:18:20 +08:00
kael c34f5f8fe2 fix: expose stable group sender identity 2026-05-16 08:46:37 +08:00
jackwener 739b66a4b1 chore(release): bump version to 0.3.0 2026-05-16 02:23:46 +08:00
jackwener b5edaf7177 feat(meta): expose freshness coverage in query output 2026-05-16 02:22:03 +08:00
jackwener 9f6a2cfba3 review: restore cache mode coverage and rationale comments 2026-05-15 22:33:32 +08:00
jackwener 76024901e9 feat(meta): expose freshness coverage in query output 2026-05-15 22:08:46 +08:00
jakevin 12740afb53
docs(macos): document codesign side-effect popup (#64)
* docs(macos): document codesign side-effect popup ("微信" 想访问其他 App 的数据, i.e. "WeChat" wants to access data from other apps)

After `codesign --force --deep --sign - /Applications/WeChat.app`, macOS
treats the re-signed WeChat as a different code identity from the
original. When WeChat then accesses its own container / cache / app-group
data (notably triggered when opening Official Account (公众号) articles),
macOS fires the "'微信' 想访问其他 App 的数据" popup.

This is a known side-effect of the current macOS invasive init path,
not a "wx-cli is reading other apps' data" issue and not a problem
limited to Official Accounts; they are just a high-frequency trigger
surface because of WebView / cache access.

Document this in 3 places per agreed scope:
- README.md macOS init: add a "副作用提示" (side-effect notice) callout linking to the guide
- docs/macos-permission-guide.md: new §六 (section 6) with first-principles
  explanation, mitigation options, and long-term direction
- src/cli/init.rs: print a short macOS-only warning at the end of
  `wx init` so users see it right when they finish the invasive setup

* review: stop overstating the trade-off and condition the init warning

Per codex review on PR #64:

1. src/cli/init.rs warning was unconditional but the wording presumed
   the user had taken the ad-hoc re-sign path. If init goes through the
   tier 2 path (Apple-signed WeChat + GUI Terminal + Developer Tools TCC
   authorization), the warning would mis-fire. Reword conditionally and
   point to the GitHub URL of the doc instead of a relative path that
   release-binary / npm-installed users won't have on disk.

2. docs/macos-permission-guide.md §六 (section 6) and the matching README callout
   said "restoring official WeChat = giving up macOS memory-scan". This
   contradicts the same guide's §一 实测表 (section 1, measured-results table), which shows
   "Apple 签名 + 本机 Terminal sudo = ". Restoring the official
   signature only gives up the default re-sign path; the local-Terminal
   + Developer-Tools route still works on Apple-signed WeChat. Only
   SSH + Apple-signed WeChat actually requires re-signing.

* review (round 2): caveat empirical gap + drop emoji

Self-review found two issues both LGTMs missed:

1. The "tier 2 仍走通" ("tier 2 still works") claim (README + §六) leans on the
   §一 实测表 (section 1, measured-results table) row
   "Apple 签名 + 本机 Terminal sudo = ". But that data only covers
   macOS 10.15 (Catalina) and 11.1 (Big Sur). macOS 14/15 — the exact
   versions where the popup behavior originates — were never tested
   for that path in this project. Add an explicit caveat instead of
   silently extrapolating across major macOS versions.

2. `init.rs` warning used a ⚠️ emoji prefix, which violates the
   project + global "no emojis in files unless requested" rule. README
   and the rest of init.rs have no emoji. Replace with `[macOS]`.
2026-05-15 15:47:15 +08:00
Icy-Cat b58ae5468d feat(windows): resolve MyDocument: token in Weixin data-root ini
The data-root ini under %APPDATA%\Tencent\xwechat\config\*.ini is
observed to contain either a plain absolute path (e.g. D:\WeChatFiles)
or the literal token 'MyDocument:'. The token form is not a real
filesystem path, so detect_db_dir_impl() — which previously did
PathBuf::from(content).is_dir() — silently failed on it, even though
the user's Weixin data was sitting in their (possibly relocated)
Documents folder.

Empirically the token denotes 'the calling user's Documents folder'.
We resolve it via SHGetKnownFolderPath(FOLDERID_Documents), which
honours the standard Windows shell-folder redirect (HKCU User Shell
Folders\Personal), so users who moved Documents to e.g. D:\Documents
now auto-detect correctly.

Plain absolute paths still pass through unchanged.

Adds Win32_UI_Shell + Win32_System_Com features to the windows crate
(needed for SHGetKnownFolderPath and CoTaskMemFree).
2026-05-15 11:53:35 +08:00
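As a rough illustration of commit b58ae5468d above, the token handling can be separated from the Win32 call. This is a sketch, not the project's `detect_db_dir_impl()`: `resolve_data_root` and the `documents_dir` callback are hypothetical names, and the callback merely stands in for `SHGetKnownFolderPath(FOLDERID_Documents)` so the logic can be exercised on any platform. It assumes the ini holds either the bare token or a plain path.

```rust
use std::path::PathBuf;

/// Literal token observed in the data-root ini (taken from the commit message).
const MY_DOCUMENT_TOKEN: &str = "MyDocument:";

/// Hypothetical helper: turn the raw ini content into a candidate data root.
/// `documents_dir` stands in for the shell-folder lookup; on Windows it would
/// wrap SHGetKnownFolderPath(FOLDERID_Documents).
fn resolve_data_root(
    ini_content: &str,
    documents_dir: impl FnOnce() -> Option<PathBuf>,
) -> Option<PathBuf> {
    let value = ini_content.trim();
    if value.is_empty() {
        return None;
    }
    if value == MY_DOCUMENT_TOKEN {
        // Token form: not a real filesystem path, so ask the OS where
        // the user's Documents folder currently lives.
        return documents_dir();
    }
    // Plain absolute paths pass through unchanged.
    Some(PathBuf::from(value))
}
```

The caller would still run its existing `is_dir()` check on the returned path; the sketch only decides which path to check.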
Icy-Cat 7451ce5684 fix(init): show config.json path in auto-detect error
When auto_detect_db_dir() fails, the error told the user to edit
config.json without saying where that file lives. On Windows that is
%USERPROFILE%\.wx-cli\config.json, which is non-obvious.

Use the config_path already computed at the top of cmd_init() so the
error message includes the absolute path, plus a concrete example of
the db_dir shape.
2026-05-15 11:49:40 +08:00
jackwener 52cc39a55c chore(release): bump version to 0.2.0
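Main additions:
- `wx attachments` / `wx extract`: decrypt and extract V2 image attachments from local chat data (macOS / Windows)
- `DbCache` incremental WAL reuse: the daemon request path drops from ~120s of full decryption per request to < 1s (typical WAL)

Full changelog: see #57 / #58.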
2026-05-14 21:38:05 +08:00
jackwener 6424a2162b fix(cache): reuse decrypted db across wal-only updates (#58) 2026-05-14 19:37:22 +08:00
jackwener e9f65ba71b review: preserve wal incremental reuse across restart 2026-05-14 19:35:36 +08:00
jackwener b032b8be04 fix(cache): apply WAL incrementally instead of full re-decrypting on WAL mtime change
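Previously DbCache ran full_decrypt whenever the mtime of either .db or .db-wal
changed. While WeChat is writing messages it keeps appending to the WAL (when no
checkpoint happens), so every attachments/extract request re-decrypted the 1.8GB
message_0.db (measured at ~120s per request).

There are now three hit paths:
  1. db_mt and wal_mt both unchanged → return the cached path directly
  2. db_mt unchanged, wal_mt changed → **apply the WAL once more** on top of the cached output
     (apply_wal is idempotent: old frames redo the same page writes, new frames are appended and take effect)
  3. db_mt changed → full decrypt + apply WAL (the old path)

Effect: a typical WAL (< 10MB) drops from ~120s to < 1s; even a large 100MB WAL takes only ~7s.
SQLite never spontaneously ends up with "main db unchanged + WAL emptied", so path 2 needs no special edge-case handling.

Tests cover all three paths:
  - exact_mtime_hit_skips_decrypt
  - wal_only_change_uses_incremental_path
  - db_mtime_change_triggers_full_decrypt
The paths are told apart by whether full_decrypt rewrote the cached file size to a multiple of PAGE_SZ.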
2026-05-14 19:24:02 +08:00
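The three hit paths described in commit b032b8be04 above come down to comparing two recorded mtimes against the current ones. The following is a minimal sketch of that decision only, under the assumption that the cache stores the mtimes it saw when it last produced the plaintext file; `CachedMtimes`, `CacheAction`, and `decide` are hypothetical names, not the project's `DbCache` API.

```rust
use std::time::SystemTime;

/// Modification times recorded when the cached plaintext copy was produced.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CachedMtimes {
    db: SystemTime,
    /// `None` when no -wal file existed at the time.
    wal: Option<SystemTime>,
}

/// What the cache should do for the current request.
#[derive(Debug, PartialEq, Eq)]
enum CacheAction {
    /// Path 1: nothing changed, return the cached file as-is.
    ReuseCached,
    /// Path 2: only the WAL moved, replay it onto the cached file.
    ApplyWalIncrementally,
    /// Path 3: the main database changed (or nothing is cached yet).
    FullDecrypt,
}

/// Pick one of the three paths from the recorded and current mtimes.
fn decide(
    cached: Option<CachedMtimes>,
    db_mtime: SystemTime,
    wal_mtime: Option<SystemTime>,
) -> CacheAction {
    match cached {
        None => CacheAction::FullDecrypt,
        Some(c) if c.db != db_mtime => CacheAction::FullDecrypt,
        Some(c) if c.wal != wal_mtime => CacheAction::ApplyWalIncrementally,
        Some(_) => CacheAction::ReuseCached,
    }
}
```

Checking the main database first matters: once the `.db` mtime has moved, the cached pages can no longer be trusted as a base for replaying WAL frames.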
jackwener ff96f957b7 feat(attachment): support image extraction from local chat data (#57) 2026-05-14 19:11:13 +08:00
jackwener b63589b368 review: tighten attachment extraction scope 2026-05-14 19:10:03 +08:00
jackwener 7feacc6371 fix(daemon): drop redundant `ok` from extract payload (collides with Response.ok)
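Response uses #[serde(flatten)] to splice the Value returned by q_* into
`{ok, error, ...data}`. When q_extract adds its own `"ok": true`, two keys with
the same name go out on the wire, and the CLI's
`serde_json::from_str::<Response>` fails with "duplicate field `ok`". Externally
this shows up as "extract failed / failed to parse daemon response", even though
the daemon had already decoded the image. No other q_* handler sets ok
(biz_articles / sessions / history, etc.), so this keeps them consistent.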
2026-05-14 18:48:46 +08:00
jackwener 2d88c9542d feat(attachment): wire wx attachments / wx extract end-to-end
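Wire the V1 (legacy XOR + V1 fixed-AES) and platform-specific V2 (macOS / Windows)
image decryption all the way through to the CLI:

- ipc: add two Request variants, Attachments / Extract
- daemon/server: dispatch routes to query::q_attachments / q_extract
- daemon/cache: make DbCache::db_dir() public so the resolver can derive wxchat_base
- daemon/query: q_attachments reads the Msg_<chat> tables, filters by
  (local_type & 0xFFFFFFFF) IN (...), sorts globally by ts DESC, then paginates,
  and returns an opaque attachment_id;
  q_extract decodes attachment_id → looks up message_resource.db → finds the local .dat →
  dispatches to the v1/v2 decoder by magic → writes to disk. The bridge uses
  ImageKeyMaterial.{aes_key, xor_key} (codex measured xor_key=0xa2 on a real
  account, so 0x88 must not be hard-coded)
- cli: add the wx attachments / wx extract subcommands, with flag style matching
  the existing history / biz-articles
- README + SKILL: add an attachment extraction section covering the three decoding tiers and how the V2 image key is derived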
2026-05-14 18:40:57 +08:00
jackwener bf8d0d934a feat(attachment): implement V2 image key providers 2026-05-14 18:34:38 +08:00
jackwener 14fdfde1d3 feat(attachment): scaffold module + V1 decoders + resource resolver
Lays down the skeleton for chat attachment (聊天附件) extraction. This commit
introduces the `attachment` module with:

- `attachment_id`: opaque base64url(json) round-trip handle for CLI/IPC. Carries
  `(chat, local_id, create_time, kind)`; `local_id` alone is not unique
  (up to 7 records sharing one local_id were observed within a single chat),
  so create_time is required for disambiguation.
- `decoder/`: dispatch by 6B header magic. Three branches:
  - `V2_MAGIC` → AES-128-ECB + raw + XOR (need image AES key)
  - `V1_MAGIC` → AES-128-ECB with fixed key `cfcd208495d565ef` (= md5("0")[:16])
  - else → legacy single-byte XOR with magic auto-detect
  Manual ECB + PKCS7 unpad to avoid pulling in another crate.
- `resolver`: `message_resource.db` lookup chain
  `username → ChatName2Id.rowid → MessageResourceInfo.packed_info → md5`
  + on-disk `.dat` selection (full > _h > _t) under
  `<wxchat_base>/msg/attach/<md5(chat)>/<YYYY-MM>/Img/<md5>[_t|_h].dat`.
  Honors `message_local_type % 2^32` to strip the high flag bits, and orders by
  `message_create_time DESC` to handle local_id reuse.
- `image_key/`: stub trait + macOS / Windows placeholders. To be filled by
  codex with the V2 image key extraction (kvcomm + brute-force on macOS, memory
  scan on Windows).

V1 decoder ships with 6 unit tests covering every supported magic + the BMP
extra validation; resolver ships with packed_info parser + dat-file selection
tests; v2 decoder ships with header validation tests. 21 tests pass.

`cargo check` and `cargo check --target x86_64-pc-windows-gnu` both clean.
2026-05-14 18:25:32 +08:00
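The "legacy single-byte XOR with magic auto-detect" branch from commit 14fdfde1d3 above relies on a simple observation: if every byte was XORed with one key, then XORing the first encrypted byte with the first byte of a known image signature yields a candidate key, which the remaining signature bytes either confirm or reject. The sketch below illustrates that idea under assumptions: the signature list is illustrative (the commit does not enumerate the supported magics), the BMP extra validation it mentions is left out, and `detect_xor_key` / `decode_legacy_xor` are hypothetical names.

```rust
/// Well-known image signatures. The set here is illustrative, not the
/// project's actual list.
const MAGICS: &[(&str, &[u8])] = &[
    ("jpg", &[0xFF, 0xD8, 0xFF]),
    ("png", &[0x89, 0x50, 0x4E, 0x47]),
    ("gif", &[0x47, 0x49, 0x46, 0x38]),
];

/// Try each signature: derive a candidate key from the first byte, then
/// require that the same key also decodes the rest of the signature.
fn detect_xor_key(data: &[u8]) -> Option<(u8, &'static str)> {
    for &(ext, magic) in MAGICS {
        if data.len() < magic.len() {
            continue;
        }
        let key = data[0] ^ magic[0];
        if magic.iter().zip(data).all(|(m, d)| (d ^ key) == *m) {
            return Some((key, ext));
        }
    }
    None
}

/// Decode the whole buffer once a key has been confirmed.
fn decode_legacy_xor(data: &[u8]) -> Option<(Vec<u8>, &'static str)> {
    let (key, ext) = detect_xor_key(data)?;
    Some((data.iter().map(|b| b ^ key).collect(), ext))
}
```

Because the key is recovered from the data itself, this branch needs no key material, unlike the V2 path that depends on the image AES key.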
jackwener 5c001b18be chore(release): bump version to 0.1.11 2026-05-14 17:26:20 +08:00
jakevin c4c3b72796
docs(readme): mention Windows VirtualQueryEx + ReadProcessMemory in the 原理 (How it works) section (#55)
The 原理 (How it works) section previously listed only macOS Mach VM API and Linux /proc/<pid>/mem,
omitting the Windows scanner path that has existed in src/scanner/windows.rs since
the Rust rewrite. Add the Windows API pair and the required process access rights
so the section accurately reflects all three platforms supported in CI/builds.
2026-05-14 17:20:07 +08:00
jakevin 70aa3a44e3
fix(daemon,scanner,crypto): harden lifecycle, widen Windows page scan, fix SQLCipher short read (#54)
- daemon: write pid file only after IPC bound; clean sock+pid on normal return
- transport: PidFile JSON metadata + identity verification (ps/QueryFullProcessImageNameW); SIGTERM with poll-timeout; backward-compat read for plain-text pid
- daemon_cmd: status/stop work with both new JSON and legacy plain-text pid file
- config: cwd → exe_dir → ~/.wx-cli config precedence matches `wx init` write order; Windows DB auto-detect picks newest by latest mtime
- crypto: full_decrypt uses read_exact for intermediate pages, zero-pads only the final partial page; tests cover short-chunk reads and early EOF
- scanner/windows: page protect check covers PAGE_READWRITE / PAGE_WRITECOPY / PAGE_EXECUTE_*WRITE* with modifier-bit stripping

Cross-reviewed by @wx-cli-coder. Windows verified via `cargo check --target x86_64-pc-windows-gnu` (no Windows runtime test).
2026-05-14 17:11:42 +08:00
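The scanner/windows bullet in commit 70aa3a44e3 above mentions a page protection check "with modifier-bit stripping". A minimal sketch of what such a check could look like follows. The constant values are the documented Win32 memory protection constants; the function name `is_writable_protect` is hypothetical, and how the real scanner treats guard pages afterwards is not stated in the commit.

```rust
// Base protection values (Win32 memory protection constants).
const PAGE_READWRITE: u32 = 0x04;
const PAGE_WRITECOPY: u32 = 0x08;
const PAGE_EXECUTE_READWRITE: u32 = 0x40;
const PAGE_EXECUTE_WRITECOPY: u32 = 0x80;

// Modifier bits that may be OR-ed onto a base value.
const PAGE_GUARD: u32 = 0x100;
const PAGE_NOCACHE: u32 = 0x200;
const PAGE_WRITECOMBINE: u32 = 0x400;

/// Return true when the region's base protection is one of the writable
/// kinds, ignoring modifier bits so that e.g. READWRITE|NOCACHE still matches.
fn is_writable_protect(protect: u32) -> bool {
    let base = protect & !(PAGE_GUARD | PAGE_NOCACHE | PAGE_WRITECOMBINE);
    matches!(
        base,
        PAGE_READWRITE | PAGE_WRITECOPY | PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY
    )
}
```

Comparing the raw `Protect` value against `PAGE_READWRITE` alone would skip regions that carry a modifier bit or a different writable base type, which is presumably the gap the commit widened.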
jakevin d4587b1c68
fix(query): three correctness/latency fixes from deep review (#51)
- q_contacts: replaced ad-hoc `gh_*`/`biz_*` prefix filter with
  `chat_type_of == "private"`. The old filter leaked groups
  (`@chatroom`), folded entries (`brandsessionholder` /
  `@placeholder_foldgroup`), verified service accounts
  (`verify_flag != 0`), and internal `@xxx` system accounts into
  `wx contacts` output.

- q_search: parallelized the per-message-DB blocking phase via
  `JoinSet::spawn_blocking`. Previously the `for (db_path, ...) in
  by_path { ... .await }` loop ran one DB at a time; users with N
  message_*.db shards paid N× latency. Each DB now runs concurrently
  on the blocking pool; total latency collapses to a single slow DB.

- q_new_messages: fixed `new_state` reset path so first-run + truncated
  sessions don't lock `since_ts` at `fallback_ts` forever. Old code
  always wrote `state[uname] = old_since_ts || fallback_ts` for changed
  sessions, then advanced only those that appeared in `all_msgs`. On
  first run (state=None) truncated sessions ended up with
  `state[uname] = now-86400` and stayed there across calls — every
  subsequent call re-scanned a window that grew with elapsed time.
  New logic separates three cases:
    * in_results        → advance to returned_max (incremental fetch)
    * truncated + state → keep prev since_ts (retry next call)
    * truncated + none  → advance to session_ts (avoid lock-in; old
                          messages remain reachable via `wx history`).
2026-05-14 17:11:27 +08:00
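The three cases listed for `q_new_messages` in commit d4587b1c68 above can be read as a small pure function from (previous state, poll outcome) to the next `since_ts`. This is a sketch of that rule only; `PollOutcome` and `next_since_ts` are hypothetical names, and the real code keeps this state per username in a map.

```rust
/// How a changed session fared in the current poll (illustrative type).
enum PollOutcome {
    /// Messages from the session were returned; carries their max timestamp.
    InResults { returned_max: i64 },
    /// The session changed, but its messages were cut off by the result limit.
    Truncated,
}

/// Compute the next `since_ts` for one session.
/// `prev_since_ts` is `None` on first run (no stored state).
fn next_since_ts(prev_since_ts: Option<i64>, outcome: &PollOutcome, session_ts: i64) -> i64 {
    match (outcome, prev_since_ts) {
        // in_results: advance to the newest message actually returned.
        (PollOutcome::InResults { returned_max }, _) => *returned_max,
        // truncated + state: keep the previous cursor so the next call retries.
        (PollOutcome::Truncated, Some(prev)) => prev,
        // truncated + none: jump to the session's own timestamp instead of
        // pinning the cursor at a fixed fallback that never advances.
        (PollOutcome::Truncated, None) => session_ts,
    }
}
```

The third arm is the behavioral change: the old code would have stored the fallback timestamp there, so every later call re-scanned an ever-growing window.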
jakevin f0f3d3cf22
feat(favorites): expose article url field (#50)
Co-authored-by: Kyrie <kyrie@mallab.world>
2026-05-14 16:08:48 +08:00
陈源泉 dab3217d3f
feat(biz): add wx biz-articles command to query public account messages (#33)
* feat(biz): add biz-articles command to query public account messages

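Load biz_message_0.db and extract Official Account pushes (title / url / author / time).

- the daemon decrypts biz_message_0.db on demand through DbCache (the key is already in all_keys.json)
- new IPC variant BizArticles (limit/account/since/until parameters)
- new query handler q_biz_articles:
  - reverse-maps gh_* username → md5 → Msg_<hash> table via Name2Id
  - filters local_type & 0xFFFFFFFF = 49 (appmsg, Official Account article)
  - zstd decompression + extract_cdata to parse the <mmreader>/<item> XML
  - supports multi-article pushes (one message containing several articles)
  - output fields: time/timestamp/recv_time/account/account_username/title/url/digest/cover_url
- new CLI subcommand wx biz-articles, with parameters: -n / --account / --since / --until / --json
- new utility functions extract_cdata (CDATA block parsing) and parse_biz_xml_items
- 8 new unit tests (biz_tests module) covering CDATA parsing and the multi-article case

Supported workflow:
  wx biz-articles --since today --json | jq ".[].url" | xargs opencli weixin download

Verified: the 返朴 ADHD article, the Datawhale Claude Code article, and the 土猛员外 knowledge-engine article were all extracted correctly.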

* feat(biz-articles): add --unread filter (one latest article per account)

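List only the most recent article from each Official Account that has unread
items, consistent with 'wx unread --filter official'. This makes it easy to scan
which accounts still have unread items and what their titles are.

- ipc.rs: BizArticles gains an unread: bool field (serde default = false, backward compatible)
- cli/mod.rs: --unread flag
- cli/biz_articles.rs: pass unread through
- daemon/server.rs: dispatch gains the unread parameter
- daemon/query.rs: q_biz_articles
  - with --unread, first query session.db for the set of usernames with
    unread_count>0 and chat_type==official_account
  - intersect with --account (giving both narrows the range further)
  - return early on an empty intersection to avoid a pointless full-table scan
  - after parsing, sort by pub_time DESC and keep only the first entry per account_username
  - finally truncate(limit)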

* docs: PR draft - update --unread + --until usage

* chore(biz-articles): drop PR draft, document command, fix typo

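- remove PR_DRAFT.md (a PR description draft that slipped into the repo and does not belong on main)
- README.md / SKILL.md: add biz-articles usage
- query.rs: 密鑰 → 密钥 (typo fix in the Chinese word for "key")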

Co-authored-by: wx-cli-coder <coder@example.com>

---------

Co-authored-by: jackwener <jakevingoo@gmail.com>
Co-authored-by: wx-cli-coder <coder@example.com>
2026-05-14 16:07:39 +08:00
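Commit dab3217d3f above mentions an `extract_cdata` utility for pulling fields such as title and url out of the `<mmreader>/<item>` XML. The project's actual signature is not shown in the log, so the following is only a sketch of the general technique with an assumed signature: find the first `<tag>` and `</tag>` pair by string scanning and unwrap a `<![CDATA[...]]>` section if the content is wrapped in one. It deliberately ignores tags that carry attributes.

```rust
/// Assumed signature: return the text inside the first `<tag>...</tag>`,
/// unwrapping a CDATA section when present.
fn extract_cdata<'a>(xml: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{}>", tag);
    let close = format!("</{}>", tag);

    // Locate the content between the opening and closing tags.
    let start = xml.find(&open)? + open.len();
    let end = start + xml[start..].find(&close)?;
    let inner = xml[start..end].trim();

    // CDATA content is returned verbatim; plain content is returned as-is.
    let unwrapped = inner
        .strip_prefix("<![CDATA[")
        .and_then(|rest| rest.strip_suffix("]]>"))
        .unwrap_or(inner);
    Some(unwrapped)
}
```

For a multi-article push, a caller could split the payload on `<item>` boundaries first and run this helper on each slice; that splitting step is not shown here.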
Haoqing Wang c284b4ade6
fix: parse appmsg subtypes from type 49 messages (#24) 2026-05-14 15:29:01 +08:00
jakevin 9d5a78ac04
docs(macOS): document TCC csreq invalidation after re-signing WeChat (#48)
macOS TCC binds permissions to (bundle id, csreq) where csreq encodes
the app's code signature. `codesign --force --deep --sign -` on
WeChat changes the csreq, silently invalidating every existing TCC
grant for com.tencent.xinWeChat — yet System Settings still paints
each toggle as ON because the UI only checks bundle id, hiding the
drift. WeChat then reprompts for screen recording / camera /
microphone / file access despite "looking allowed".

Three doc-only updates, no code changes:

- README.md quick start: add the `tccutil reset` loop right after the
  codesign step, plus a one-line callout pointing at the deep-dive
  section.
- SKILL.md macOS init flow: same loop in the agent-readable order, so
  agents executing the steps don't skip it.
- docs/macos-permission-guide.md: new section 五 (5) with first-principles
  root cause, the reset loop, the macOS 26 "录屏与系统录音 / 仅系统录音"
  ("Screen & System Audio Recording / System Audio Recording Only") UI split
  footgun, and ad-hoc signature verification.

Builds on the BobbyCat PR #29 — keeps the symptom description and the
macOS 26 UI split note, expands scope from ScreenCapture-only to all
TCC services that re-signing actually breaks (Camera / Microphone /
AppleEvents / AddressBook / Documents / Downloads / Desktop), drops
the misleading TCC.db sqlite query (path varies by macOS version, can
need FDA, and is no more useful than just trying WeChat's screenshot
again), and explicitly leaves the reset as a manual step rather than
auto-running it from `wx init` because it would wipe currently-working
grants.

Co-authored-by: BobbyCat <114374951+BobbyCats@users.noreply.github.com>
2026-05-14 15:13:50 +08:00
Tsing 1b00d04598
feat: expose url field for link/appmsg messages (#18)
* feat: expose url field for link/appmsg messages

Extract <url> from appmsg XML in type-49 messages and append it as
a 'url' field in history/search output. The field is omitted when
the message has no valid URL (non-link types, empty, non-http).

* fix: normalize appmsg urls across query outputs

---------

Co-authored-by: tsinghu <tsinghu@tencent.com>
Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 14:46:34 +08:00
Haoqing Wang b0431352ce
feat(appmsg): support parsing the original text of quoted messages (#28)
* feat(appmsg): parse quoted message content

* docs(appmsg): document quote message output
2026-05-14 14:42:03 +08:00
Haoqing Wang 35a8f0e94b
feat(group): support displaying group nicknames / group card names (#23)
* feat: support group nicknames

* fix(group): keep duplicate nickname senders separate in stats

---------

Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 14:22:55 +08:00
刘传佳 d750ef6e9f
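fix(cli,config): fix init failure under sudo + daemon not reloading (#37)
* fix(cli,config): fix init failure under sudo + daemon not reloading

  - cli/transport: add stop_daemon(); init now stops the old daemon automatically
  - config: cli_dir() reads the SUDO_USER environment variable first, to avoid writing to /root/.wx-cli
  - config: auto_detect_db_dir() sorts by the newest .db file mtime so the newest directory is picked correctly
  - daemon/server: dispatch gains a ReloadConfig command (reserved for later use)
  - ipc: Request gains a ReloadConfig variant
  - scanner/linux: remove debug logging, clean up the unused bail import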

* fix(config): resolve sudo home via passwd lookup

---------

Co-authored-by: cjliu <cjliu@upointech.com>
Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 13:50:04 +08:00
jackwener 6659f48984 chore: bump version to 0.1.10 2026-04-19 21:27:59 +08:00
jakevin c7e2775aa6
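perf(sns): parse_post_xml uses a single roxmltree DOM pass, dropping the regex+DOM double parse (#17)
* perf(sns): parse_post_xml uses a single roxmltree DOM pass, dropping the regex+DOM double parse

Previously, during full-table scans in q_sns_feed / q_sns_search, each
SnsTimeLine.content was parsed twice: extract_xml_text used string scanning to
pull createTime / contentDesc / username, and parse_post_media then built a
complete roxmltree DOM again to get the media list. On scans of 10k+ rows this
is plainly wasted work.

This refactor:

- parse_post_xml calls Document::parse once; after locating TimelineObject, all
  fields (createTime / contentDesc / username / media / location) share the same
  doc, so roxmltree builds the tree only once.
- Split parse_post_media into parse_media_from_timeline(node) so that callers
  which have already parsed do not parse again; the old parse_post_media(&str)
  is now used only by unit tests and is marked #[cfg(test)].
- Remove sns_location_re (regex extraction of poiName is no longer needed).
- Side effect: roxmltree decodes XML entities automatically, so the content /
  location / username fields now output decoded text (the old string scan kept
  `&lt;` etc. verbatim). This is the more correct semantics for downstream
  consumers; the new parse_decodes_xml_entities_in_content unit test locks the
  behavior in.
- Add the parse_returns_defaults_for_malformed_xml unit test covering the
  fallback path when DOM parsing fails (no panic, author falls back to the column).

The LIKE pre-filter in q_sns_search still uses extract_xml_text(contentDesc)
string scanning to filter out false positives. This step is faster than building
a DOM and is a genuine optimization, so it stays. q_sns_notifications also still
uses extract_xml_text and is left untouched in this PR (each call only processes
~limit rows, so the gain from a DOM would be small, and this avoids widening the
scope).

Verification:
- cargo check ×3 targets (darwin / windows-gnu / linux-gnu)
- cargo test 39 passed (37 → 39, 2 new)

* refactor(sns): parse_post_xml dedups two identical early-return ParsedPost blocks

A pre-merge self-review found that the two fallback paths (Document::parse
failure / TimelineObject not found) contained the exact same 9-line ParsedPost
literal. Extract it into an empty() closure, going from 2×9 lines to 1×7 lines
plus two `return empty()` calls.

Behavior is fully equivalent (including author = column fallback).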

* fix(sns): salvage scalar fields from malformed post xml
2026-04-19 13:56:55 +08:00
郭立lee 2b5d872f0b
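feat(sns): sns-feed / sns-search output the full media[] field (#15)
Incremental on top of #14: upgrade media_count in sns-feed / sns-search to a full media[] array (with url/thumb/key/token/md5/enc_idx/size + video_md5/duration), so downstream consumers can build an image proxy or render offline directly.

- use roxmltree (pure Rust, no C dependencies) instead of regex to extract attributes
- field names align with the Python _parse_media in the artifacts repo, which keeps cross-implementation diffs easy
- 14 sns unit tests: 6 new fixtures from the author (single image / three images / video / text only / malformed / missing totalSize) + the 8 existing ones kept
- fully compatible with the earlier PR #14 work: the --user XML fallback fix / SNS_MAX_LIMIT / SNS_MAX_SCAN / escape_like_pattern

Author: leeguooooo <guoli@zhihu.com>
Co-fixed-by: wx-cli-coder (rebase + conflict resolution + test module merge + documentation of media_count semantics)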
2026-04-19 02:22:55 +08:00
JL e8939f315d
feat(sns): sns-notifications / sns-feed / sns-search (#14)
Add 3 Moments (朋友圈) commands: sns-notifications / sns-feed / sns-search.
PR review fixes (already pushed to the same branch):
- fix a bug where the --user filter conflicted with the XML <username> fallback (found by @wx-cli-codex)
- add defensive upper bounds SNS_MAX_LIMIT / SNS_MAX_SCAN
- extract an escape_like_pattern() helper
- add 8 unit tests (parse_post_xml / escape_like_pattern)

cargo check passes on all three targets: aarch64-darwin / x86_64-pc-windows-gnu / x86_64-unknown-linux-gnu.
Co-authored-by: fengliu222 <fengliu222@users.noreply.github.com>
2026-04-19 01:58:21 +08:00
郭立lee f0dcd4ea05
docs(readme): explain how to fetch more than 500 messages (#13)
Clarify that the 500-message behavior is only a default limit, not a hard cap.
Document `-n/--limit` examples for history, search, and export in both README and SKILL.
2026-04-18 15:01:15 +08:00
jackwener 697d3fc720 chore: bump version to 0.1.9 2026-04-18 02:11:28 +08:00
jackwener 1e52014a6b perf(daemon): Arc<Names> + tokio RwLock, O(1) clone per IPC request
Was: Arc<std::sync::RwLock<Names>>; each dispatch clone_names() copied
4 HashMaps (~100KB for a user with 2700 contacts) and used std RwLock
which blocks the tokio worker thread during the clone.

Now: Arc<tokio::sync::RwLock<Arc<Names>>>; dispatch takes the read
guard, does Arc::clone (pointer bump), drops the guard, then spawns
the query work. Names is immutable after daemon startup; Arc is ideal.

Smoke tested: `wx sessions --json` returns correct data including
chat_type; 8 concurrent clients finish in 12ms.
2026-04-18 02:10:45 +08:00
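The change in commit 1e52014a6b above replaces a deep copy of the name maps with a pointer bump taken under a short-lived read lock. The sketch below shows the same pattern with assumptions stated up front: it uses `std::sync::RwLock` instead of `tokio::sync::RwLock` so that it has no external dependencies, `Names` is reduced to a single map (the commit mentions four), and `Shared` / `snapshot` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the daemon's name tables.
struct Names {
    display: HashMap<String, String>,
}

/// Shared daemon state: the lock guards only the Arc pointer, not the maps.
struct Shared {
    names: RwLock<Arc<Names>>,
}

impl Shared {
    /// Take a cheap snapshot for one request. The read guard is dropped at
    /// the end of this function, before any query work would start.
    fn snapshot(&self) -> Arc<Names> {
        let guard = self.names.read().expect("names lock poisoned");
        Arc::clone(&*guard) // reference-count increment, no HashMap copy
    }
}
```

Every request that calls `snapshot()` shares the same allocation, so the per-request cost no longer depends on how many contacts the maps hold.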
JL e977007306
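feat(unread): classify sessions by chat_type, add --filter (#9)
Before: wx unread / sessions / history lumped Official Accounts, the folded
subscription-account entry (brandsessionholder), folded group chats
(@placeholder_foldgroup), and verified service accounts all under
is_group=false, mixed in with real private chats. Even entries whose username
looks like wxid_* but which are actually Official Accounts could not be told
apart at all.

Changes:
- Add a chat_type_of(username, names) helper whose output is always one of
  group / official_account / folded / private.
- Criteria, in order: @chatroom → group; brandsessionholder / @placeholder_foldgroup
  → folded; contact.verify_flag != 0 → official_account (covers Official Accounts
  that carry a wxid_* username, as well as bank/brand service accounts and
  verified accounts such as qqsafe / mphelper);
  gh_* / biz_* / @* prefixes as a fallback; everything else is private.
- load_names also reads contact.verify_flag; Names::is_verified wraps the lookup.
- q_sessions / q_unread / q_history / q_new_messages / q_stats output gains a
  chat_type field; is_group is kept for backward compatibility and is now
  derived uniformly from chat_type.
- wx unread gains --filter; a clap value_parser restricts the allowed values to
  all / private / group / official / folded, comma-separated for multiple
  selection, default all.
  Example: wx unread --filter private,group filters out Official Accounts and folded entries.
- SKILL.md / README.md: document the new field and its usage.
- .gitignore: add target/ (standard for Rust projects).

Performance: the SQL for the default wx unread is the same as before the change
(LIMIT kept). Only when --filter is passed does it switch to a full-table scan
with filtering on the Rust side; otherwise the SQL LIMIT would truncate the rows
before the filter runs and matching entries would be missed.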
2026-04-18 01:59:35 +08:00
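The ordered criteria for `chat_type_of` in commit e977007306 above translate naturally into a chain of early returns. The sketch below is one possible reading, not the project's code: it takes an `is_verified` flag instead of the `names` table, assumes `@chatroom` is matched as a suffix and the two folded entries by exact equality, and assumes the `gh_* / biz_* / @*` prefix fallback maps to `official_account` (the commit does not spell out the target of that fallback).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChatType {
    Group,
    OfficialAccount,
    Folded,
    Private,
}

impl ChatType {
    /// The fixed output strings named in the commit message.
    fn as_str(self) -> &'static str {
        match self {
            ChatType::Group => "group",
            ChatType::OfficialAccount => "official_account",
            ChatType::Folded => "folded",
            ChatType::Private => "private",
        }
    }

    /// Backward-compatible flag derived from the chat type.
    fn is_group(self) -> bool {
        self == ChatType::Group
    }
}

/// Classify a session; the order of checks matters.
fn chat_type_of(username: &str, is_verified: bool) -> ChatType {
    if username.ends_with("@chatroom") {
        return ChatType::Group;
    }
    // Folded entries are checked before the '@' prefix fallback below.
    if username == "brandsessionholder" || username == "@placeholder_foldgroup" {
        return ChatType::Folded;
    }
    // verify_flag != 0 catches official accounts that carry a wxid_* name.
    if is_verified {
        return ChatType::OfficialAccount;
    }
    // Prefix fallback (assumed to mean official_account).
    if username.starts_with("gh_") || username.starts_with("biz_") || username.starts_with('@') {
        return ChatType::OfficialAccount;
    }
    ChatType::Private
}
```

Deriving `is_group` from the enum rather than from a separate username test is what keeps the old boolean field and the new `chat_type` field from drifting apart.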
jackwener bfb7048cf0 fix: bind CLI --version to crate version (credit: @leeguooooo #4) 2026-04-18 01:55:37 +08:00
jackwener c564438994 chore: bump version to 0.1.8 2026-04-18 01:50:25 +08:00
jackwener e44990ba01 fix: drop privileges after key scan to avoid root-owned ~/.wx-cli/ (#7 #8)
Root cause: `wx init` does two conceptually-separate things in one
privileged process: (1) scan WeChat memory for keys (needs root) and
(2) write ~/.wx-cli/{all_keys,config}.json (needs only user). When
run under sudo, the files inherit root ownership, so later the daemon
(forked as the user) can't create daemon.sock/log/pid → silent 15s
timeout.

Also: all_keys.json is the raw AES key; 0644 leaked it to every user
on the system.

Fix in init.rs: after the scan completes, immediately setgid+setuid
back to $SUDO_UID/$SUDO_GID and set umask 0o077 before any file I/O.
Files are then created as the real user with 0600 by default. Migrate
old broken installs by chown+chmod-recursive before the setuid call.

Fix in transport.rs: pre-check that ~/.wx-cli/ is writable before
spawning daemon; on EACCES print a clear "sudo chown -R ..." hint
instead of the useless "daemon 启动超时" (daemon startup timed out) message.
2026-04-18 01:48:42 +08:00
jackwener ae74072b3f docs: add Windows cross-check setup and IPC same-library rule 2026-04-17 16:43:05 +08:00
jackwener 4e6907c5cc chore: bump version to 0.1.7 2026-04-17 16:42:02 +08:00