Commit Graph

26 Commits (c284b4ade6690c6c4e5d6559db1d5f3f8d380450)

Author SHA1 Message Date
Haoqing Wang c284b4ade6
fix: parse appmsg subtypes from type 49 messages (#24) 2026-05-14 15:29:01 +08:00
Tsing 1b00d04598
feat: expose url field for link/appmsg messages (#18)
* feat: expose url field for link/appmsg messages

Extract <url> from appmsg XML in type-49 messages and append it as
a 'url' field in history/search output. The field is omitted when
the message has no valid URL (non-link types, empty, non-http).

* fix: normalize appmsg urls across query outputs

---------

Co-authored-by: tsinghu <tsinghu@tencent.com>
Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 14:46:34 +08:00
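Note on 1b00d04598: the rule described above (emit `url` only when the appmsg carries a valid http(s) URL) can be sketched as follows. The function name `extract_appmsg_url` and the plain string scan are assumptions for illustration; the commit message does not show how the project actually extracts or normalizes the value.

```rust
/// Hypothetical helper: pull the text of the first <url> element out of an
/// appmsg XML fragment and return it only when it looks like an http(s) URL.
/// This is a plain string scan, not a real XML parser, and it does not
/// decode XML entities inside the URL.
fn extract_appmsg_url(xml: &str) -> Option<String> {
    let start = xml.find("<url>")? + "<url>".len();
    let end = start + xml[start..].find("</url>")?;
    let mut text = xml[start..end].trim();
    // Unwrap an optional CDATA section around the value.
    if let Some(inner) = text
        .strip_prefix("<![CDATA[")
        .and_then(|t| t.strip_suffix("]]>"))
    {
        text = inner.trim();
    }
    if text.starts_with("http://") || text.starts_with("https://") {
        Some(text.to_string())
    } else {
        None // empty or non-http values are omitted
    }
}
```

Returning `Option<String>` lets the caller skip the field entirely when there is nothing valid to report, which matches the "field is omitted" wording above.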
Haoqing Wang b0431352ce
feat(appmsg): support parsing the original text of quoted messages (#28)
* feat(appmsg): parse quoted message content

* docs(appmsg): document quote message output
2026-05-14 14:42:03 +08:00
Haoqing Wang 35a8f0e94b
feat(group): support displaying group nicknames / group display names (#23)
* feat: support group nicknames

* fix(group): keep duplicate nickname senders separate in stats

---------

Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 14:22:55 +08:00
刘传佳 d750ef6e9f
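fix(cli,config): fix init failure under sudo + daemon not reloading (#37)
* fix(cli,config): fix init failure under sudo + daemon not reloading

  - cli/transport: add stop_daemon(); init now stops the old daemon automatically
  - config: cli_dir() reads the SUDO_USER environment variable first, to avoid writing to /root/.wx-cli
  - config: auto_detect_db_dir() sorts by the newest .db file mtime, so the most recent directory is selected correctly
  - daemon/server: dispatch gains a ReloadConfig command (reserved for future use)
  - ipc: Request gains a ReloadConfig variant
  - scanner/linux: remove debug logging, clean up the unused bail import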

* fix(config): resolve sudo home via passwd lookup

---------

Co-authored-by: cjliu <cjliu@upointech.com>
Co-authored-by: jackwener <jakevingoo@gmail.com>
2026-05-14 13:50:04 +08:00
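Note on d750ef6e9f: the directory selection rule (pick the directory whose newest .db file has the latest mtime) might look like the sketch below. The function name `pick_newest_db_dir` and its input shape are hypothetical; the real auto_detect_db_dir() presumably walks the filesystem itself.

```rust
use std::path::PathBuf;
use std::time::SystemTime;

/// Hypothetical sketch: given candidate directories and the mtimes of the
/// .db files found in each, return the directory whose newest .db file is
/// the most recent. Directories without any .db file are skipped.
fn pick_newest_db_dir(candidates: &[(PathBuf, Vec<SystemTime>)]) -> Option<PathBuf> {
    candidates
        .iter()
        .filter_map(|(dir, mtimes)| mtimes.iter().max().map(|newest| (newest, dir)))
        .max_by_key(|(newest, _)| **newest)
        .map(|(_, dir)| dir.clone())
}
```

Comparing on the per-directory maximum rather than on the directory's own mtime means a stale directory that was merely touched does not win over one with freshly written databases.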
jakevin c7e2775aa6
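perf(sns): parse_post_xml uses a single roxmltree DOM pass, dropping the regex+DOM double parse (#17)
* perf(sns): parse_post_xml uses a single roxmltree DOM pass, dropping the regex+DOM double parse

Previously, during full-table scans in q_sns_feed / q_sns_search, each
SnsTimeLine.content was parsed twice: extract_xml_text ran a string scan
to pull createTime / contentDesc / username, and parse_post_media then
built a complete roxmltree DOM again to get the media list. On scans of
10k+ rows this is plainly wasted work.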

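This refactor:

- parse_post_xml calls Document::parse once; after locating TimelineObject,
  all fields (createTime / contentDesc / username / media / location) share
  the same doc, so roxmltree builds the DOM only once.
- Split parse_post_media into parse_media_from_timeline(node), so nothing is
  re-parsed after the outer parse; the old parse_post_media(&str) is kept
  for unit tests only and marked #[cfg(test)].
- Remove sns_location_re (a regex is no longer needed to extract poiName).
- Side effect: roxmltree decodes XML entities automatically, so the
  content / location / username fields now contain decoded text (the old
  string scan preserved `&lt;` etc. verbatim). This is the more correct
  semantics for downstream consumers; the new unit test
  parse_decodes_xml_entities_in_content locks the behavior in.
- Add the unit test parse_returns_defaults_for_malformed_xml, covering the
  fallback path when the DOM parse fails (no panic, author uses the column
  fallback).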

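The LIKE pre-filter in q_sns_search still uses the extract_xml_text(contentDesc)
string scan for false-positive filtering. That step is faster than building
a DOM, so it is a genuine optimization and stays. q_sns_notifications also
still uses extract_xml_text and is untouched by this PR (each call only
processes ~limit rows, so a DOM conversion gains little and would widen
the scope).

Verification:
- cargo check ×3 targets (darwin / windows-gnu / linux-gnu)
- cargo test 39 passed (37 → 39, 2 new)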

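* refactor(sns): dedup the two ParsedPost early-return blocks in parse_post_xml

A pre-merge self-review found that the two fallback paths (Document::parse
fails / TimelineObject not found) contained identical 9-line ParsedPost
literals. They are now extracted into an empty() closure, going from 2×9
lines to 1×7 lines plus two `return empty()` calls.

Behavior is fully equivalent (including author = column fallback).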

* fix(sns): salvage scalar fields from malformed post xml
2026-04-19 13:56:55 +08:00
郭立lee 2b5d872f0b
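feat(sns): sns-feed / sns-search output the full media[] field (#15)
Incremental change on top of #14: upgrade media_count in sns-feed / sns-search to a full media[] array (url/thumb/key/token/md5/enc_idx/size plus video_md5/duration), which downstream consumers can use directly for image proxying or offline rendering.

- Use roxmltree (pure Rust, no C dependencies) instead of regex to extract attributes
- Field names align with the Python _parse_media in the artifacts repository, so cross-implementation diffs stay readable
- 14 sns unit tests: 6 new fixtures from the author (single image / three images / video / text only / malformed / missing totalSize) plus the 8 existing tests, unchanged
- Fully compatible with the earlier PR #14 changes: the --user XML fallback fix / SNS_MAX_LIMIT / SNS_MAX_SCAN / escape_like_pattern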

Author: leeguooooo <guoli@zhihu.com>
Co-fixed-by: wx-cli-coder (rebase + conflict resolution + test module merge + media_count semantics documentation)
2026-04-19 02:22:55 +08:00
JL e8939f315d
feat(sns): sns-notifications / sns-feed / sns-search (#14)
Add 3 commands for WeChat Moments: sns-notifications / sns-feed / sns-search.
PR review fixes (already pushed to the same branch):
- Fix a bug where the --user filter conflicted with the XML <username> fallback (found by @wx-cli-codex)
- Add defensive upper bounds SNS_MAX_LIMIT / SNS_MAX_SCAN
- Extract an escape_like_pattern() helper
- Add 8 unit tests (parse_post_xml / escape_like_pattern)

cargo check passes on all three targets: aarch64-darwin / x86_64-pc-windows-gnu / x86_64-unknown-linux-gnu.
Co-authored-by: fengliu222 <fengliu222@users.noreply.github.com>
2026-04-19 01:58:21 +08:00
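Note on e8939f315d: a helper like escape_like_pattern() typically neutralizes the SQL LIKE wildcards in user input. The sketch below assumes a `&str -> String` signature and a backslash escape character; neither is confirmed by the commit message.

```rust
/// Hypothetical sketch of a LIKE-pattern escaper: prefix the SQL LIKE
/// wildcards `%` and `_`, and the escape character itself, with a backslash.
/// The query must then declare the escape character, e.g.
/// `... WHERE content LIKE ?1 ESCAPE '\'`.
fn escape_like_pattern(keyword: &str) -> String {
    let mut out = String::with_capacity(keyword.len());
    for ch in keyword.chars() {
        if matches!(ch, '%' | '_' | '\\') {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}
```

Without the `ESCAPE` clause on the SQL side, SQLite treats the backslash as an ordinary character, so both halves are needed for the escaping to take effect.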
jackwener 1e52014a6b perf(daemon): Arc<Names> + tokio RwLock, O(1) clone per IPC request
Was: Arc<std::sync::RwLock<Names>>; each dispatch clone_names() copied
4 HashMaps (~100KB for a user with 2700 contacts) and used std RwLock
which blocks the tokio worker thread during the clone.

Now: Arc<tokio::sync::RwLock<Arc<Names>>>; dispatch takes the read
guard, does Arc::clone (pointer bump), drops the guard, then spawns
the query work. Names is immutable after daemon startup; Arc is ideal.

Smoke tested: `wx sessions --json` returns correct data including
chat_type; 8 concurrent clients finish in 12ms.
2026-04-18 02:10:45 +08:00
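Note on 1e52014a6b: the "take the read guard, Arc::clone, drop the guard" pattern can be sketched with the standard library alone. This sketch uses std::sync::RwLock so that it compiles without dependencies, whereas the commit uses tokio::sync::RwLock; the `Names` field and the `snapshot` function name are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Stand-in for the daemon's name tables. The real struct is described as
/// holding 4 HashMaps; this sketch keeps one hypothetical field.
struct Names {
    display: HashMap<String, String>,
}

/// Shared state: the lock guards only the pointer, and each request holds
/// its own Arc to the immutable data. (std RwLock here, tokio's in the commit.)
type SharedNames = Arc<RwLock<Arc<Names>>>;

/// Take the read guard, bump the refcount, release the guard on return.
/// No HashMap is copied, so the cost is independent of the contact count.
fn snapshot(shared: &SharedNames) -> Arc<Names> {
    let guard = shared.read().expect("names lock poisoned");
    Arc::clone(&*guard)
}
```

Because every snapshot points at the same allocation, the lock is held only for the duration of a reference-count increment rather than for a deep copy.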
JL e977007306
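feat(unread): classify sessions by chat_type, add --filter (#9)
Before: wx unread / sessions / history lumped official accounts, the folded
subscription-account entry (brandsessionholder), folded group chats
(@placeholder_foldgroup), and verified service accounts together under
is_group=false, mixed in with real private chats. Even entries whose
username looks like wxid_* but which are actually official accounts could
not be told apart at all.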

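Changes:
- Add a chat_type_of(username, names) helper whose output is always one of
  group / official_account / folded / private.
- Criteria, in order: @chatroom → group; brandsessionholder / @placeholder_foldgroup
  → folded; contact.verify_flag != 0 → official_account (covers official
  accounts whose username looks like wxid_*, as well as bank/brand service
  accounts and verified accounts such as qqsafe / mphelper);
  gh_* / biz_* / @* prefixes as a fallback; everything else is private.
- load_names now also reads contact.verify_flag; Names::is_verified wraps the lookup.
- q_sessions / q_unread / q_history / q_new_messages / q_stats output
  gains a chat_type field; is_group is kept for backward compatibility and is now derived from chat_type.
- wx unread gains --filter; a clap value_parser restricts the values to
  all / private / group / official / folded, comma-separated for multiple values, default all.
  Example: wx unread --filter private,group filters out official accounts and folded entries.
- SKILL.md / README.md document the new field and its usage.
- .gitignore adds target/ (standard for Rust projects).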

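Performance: the SQL for the default wx unread is unchanged (LIMIT is kept).
Only when --filter is passed does it switch to a full-table scan with
filtering on the Rust side; otherwise the SQL LIMIT would truncate the rows
before the filter is applied, and matching entries would be missed.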
2026-04-18 01:59:35 +08:00
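Note on e977007306: the ordered criteria for chat_type_of can be sketched as below. The real helper takes `names`; here the verify_flag value is passed in directly. Further assumptions, none confirmed by the commit message: "@chatroom" is matched as a suffix, the two folded usernames are matched exactly, and the gh_* / biz_* / @* prefix fallback maps to official_account.

```rust
/// Hypothetical sketch of the classification order from the commit message.
/// The order matters: the folded check must run before the `@` prefix
/// fallback, otherwise "@placeholder_foldgroup" would be misclassified.
fn chat_type_of(username: &str, verify_flag: u32) -> &'static str {
    if username.ends_with("@chatroom") {
        "group"
    } else if username == "brandsessionholder" || username == "@placeholder_foldgroup" {
        "folded"
    } else if verify_flag != 0 {
        // Verified accounts, including those with wxid_* usernames.
        "official_account"
    } else if username.starts_with("gh_")
        || username.starts_with("biz_")
        || username.starts_with('@')
    {
        // Prefix fallback (assumed to mean official_account).
        "official_account"
    } else {
        "private"
    }
}
```

Under this scheme is_group can be derived as `chat_type_of(..) == "group"`, which is consistent with the statement that is_group is now computed from chat_type.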
jackwener bfb7048cf0 fix: bind CLI --version to crate version (credit: @leeguooooo #4) 2026-04-18 01:55:37 +08:00
jackwener e44990ba01 fix: drop privileges after key scan to avoid root-owned ~/.wx-cli/ (#7 #8)
Root cause: `wx init` does two conceptually-separate things in one
privileged process: (1) scan WeChat memory for keys (needs root) and
(2) write ~/.wx-cli/{all_keys,config}.json (needs only user). When
run under sudo, the files inherit root ownership, so later the daemon
(forked as the user) can't create daemon.sock/log/pid → silent 15s
timeout.

Also: all_keys.json is the raw AES key; 0644 leaked it to every user
on the system.

Fix in init.rs: after the scan completes, immediately setgid+setuid
back to $SUDO_UID/$SUDO_GID and set umask 0o077 before any file I/O.
Files are then created as the real user with 0600 by default. Migrate
old broken installs by chown+chmod-recursive before the setuid call.

Fix in transport.rs: pre-check that ~/.wx-cli/ is writable before
spawning daemon; on EACCES print a clear "sudo chown -R ..." hint
instead of the useless "daemon 启动超时" ("daemon startup timed out") message.
2026-04-18 01:48:42 +08:00
jackwener 6a2b23486a fix: client connects via interprocess on Windows, not OpenOptions
Server uses interprocess::local_socket, but client was using
std::fs::OpenOptions("\\.\pipe\wx-cli-daemon") which fails to
connect to pipes created by interprocess's tokio listener.

Use the same interprocess client API on both sides for consistency.

Verified with: cargo check --target x86_64-pc-windows-gnu (mingw-w64).
2026-04-17 16:41:32 +08:00
jackwener 18daf5b22e fix: Windows init and daemon startup (issue #5)
Three related bugs caused "wx init" and daemon startup to fail on Windows:

1. init.rs: create ~/.wx-cli/ before writing all_keys.json (was created
   only before config.json, so first write failed with ENOENT)

2. transport.rs (Windows): daemon.log was always empty because stderr
   was never redirected, and log file open silently fell back to null
   when parent dir didn't exist. Now mirror the Unix version: create
   parent dir, try_clone to redirect both stdout and stderr.

3. server.rs (Windows): interprocess GenericNamespaced auto-prepends
   \\.\pipe\ on Windows. Passing the full path caused a double-prefixed
   pipe name that clients (using raw \\.\pipe\wx-cli-daemon) could
   never connect to, leading to the 15s startup timeout.
2026-04-17 14:01:04 +08:00
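Note on 18daf5b22e, item 3: one way to guard against a double-prefixed pipe name is to normalize to the bare name before handing it to a namespaced API. The helper name `bare_pipe_name` is hypothetical, and the commit message does not say whether the actual fix strips the prefix or simply passes a bare constant.

```rust
/// Hypothetical helper: accept either a bare name or a full `\\.\pipe\...`
/// path and return the bare name that a namespaced API expects, so the
/// prefix is never applied twice.
fn bare_pipe_name(name: &str) -> &str {
    name.strip_prefix(r"\\.\pipe\").unwrap_or(name)
}
```

The function is idempotent, so calling it on an already bare name is harmless.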
jackwener e4bfc39c8f fix: improve task_for_pid error message and document codesign steps 2026-04-17 10:46:55 +08:00
jackwener d8f4c6e87d fix: replace macOS-only libc::__error() with std::io::Error::last_os_error() 2026-04-16 23:35:30 +08:00
jackwener 59dd6bfa24 fix: Windows build errors (handle_connection, creation_flags, mkdir)
- server.rs: add handle_connection_windows for named pipe connections
- transport.rs: import CommandExt trait for creation_flags on Windows
- release.yml: mkdir -p before binary copy to npm bin dirs
2026-04-16 23:14:58 +08:00
jackwener 6cdc806642 chore: Apache-2.0 license, Windows support, install.ps1 2026-04-16 22:30:45 +08:00
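jackwener 33b4249bd5 fix: parse system and recalled messages, complete message type formatting
- type 10000 (system message): parse the <content> tag and display it as "[系统] content" (系统 = system)
- type 10002 (recall): parse <content> and display it as "[撤回] content" (撤回 = recalled)
- type 34 (voice) / 43 (video): previously missed, now displayed as [语音]/[视频] (voice/video)
- Avoid raw XML appearing in history/watch output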
2026-04-16 17:22:54 +08:00
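Note on 33b4249bd5: the type-to-label mapping can be sketched as a single match. The function name `type_label` is hypothetical, and only the four type codes named in this commit are covered; the project's formatter handles more types than shown here.

```rust
/// Hypothetical sketch: map a message type code to the bracketed label shown
/// in place of raw XML. Returns None for codes not named in this commit.
fn type_label(msg_type: u32) -> Option<&'static str> {
    match msg_type {
        10000 => Some("[系统]"), // system message
        10002 => Some("[撤回]"), // recalled message
        34 => Some("[语音]"),    // voice
        43 => Some("[视频]"),    // video
        _ => None,
    }
}
```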
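jackwener 7f869e7c3b fix: deep review, fixing 10 bugs/issues
Critical & High:
- daemon logging: on startup, redirect stdout/stderr to ~/.wx-cli/daemon.log
  instead of /dev/null, so that wx daemon logs is actually usable
- q_history: when the chat is not found, bail! instead of returning ok:true plus an error field,
  to avoid the CLI silently returning empty output
- init: the default path for config.json is now ~/.wx-cli/config.json,
  to avoid writing into the system bin directory (/usr/local/bin/config.json)
- LIKE wildcards: %/_/\ in search keywords are now escaped correctly
- WAL path: append the "-wal" suffix with OsString.push,
  avoiding display() breaking on non-UTF-8 paths
- cmd_stop: check the kill() return value and give a clear message on ESRCH

Performance & Code quality:
- full_decrypt: switch to streaming page-by-page reads and writes; peak memory drops from 2× the file size to O(1)
- Regex: msg_table_re() is compiled once via a static OnceLock, avoiding recompilation on the hot path
- mtime_nanos: remove the duplicate definitions in daemon/mod.rs and cache.rs
- use super::super::cli::transport → use super::transport
- Remove unused dead code: save_config, Request::to_json_line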
2026-04-16 17:07:15 +08:00
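Note on 7f869e7c3b, WAL path item: appending a suffix at the OsString level looks roughly like this. The function name `wal_path` is hypothetical; the calls used are all from the standard library.

```rust
use std::path::{Path, PathBuf};

/// Build "<db path>-wal" by appending to the OsString, so no lossy UTF-8
/// conversion (as with Path::display) is involved.
fn wal_path(db: &Path) -> PathBuf {
    let mut s = db.as_os_str().to_os_string();
    s.push("-wal");
    PathBuf::from(s)
}
```

Path::with_extension would not work here, since it replaces ".db" rather than appending to it.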
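jackwener dfd020a2b9 fix: parse XML-escaped quoted messages + make search skip corrupt DBs
- The ref_content of a quoted message (type=57) may be HTML-escaped XML; add
  unescape_html() to unescape it first, then call parse_appmsg recursively to parse the nested structure
- When global search iterates over msg_db_keys, a single DB open/query failure now does eprintln+continue
  instead of propagating the error, so one corrupt cache no longer fails the whole search
- search_in_table failures are likewise skipped rather than aborting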
2026-04-16 16:48:59 +08:00
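Note on dfd020a2b9: an unescape step like unescape_html() might look like the sketch below. The body is an assumption (the commit message gives only the name); it handles the five predefined XML entities and not numeric character references.

```rust
/// Hypothetical sketch of an unescape step for the five predefined XML
/// entities. `&amp;` is replaced last so that `&amp;lt;` becomes `&lt;`
/// (exactly one level of unescaping) rather than `<`.
fn unescape_html(s: &str) -> String {
    s.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&quot;", "\"")
        .replace("&apos;", "'")
        .replace("&amp;", "&")
}
```

Unescaping exactly one level matters for nested quotes: the inner payload can then be handed to the parser again, which is the recursive call the commit describes.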
jackwener 2fd864b85d fix: fix empty message content bug (TEXT/BLOB compatibility), filter out fts/resource DBs, raise timeout to 120s 2026-04-16 16:16:41 +08:00
jackwener 3e7b4ed8ee fix: rename directory and pipe name to wx-cli (formerly wechat-cli) 2026-04-16 15:49:35 +08:00
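jackwener 8bfea8869e fix: fix all medium/low priority issues
- cache/daemon: compare mtime as u64 (nanos) instead of f64 (secs), so changes are not lost to floating-point error
- transport: call setsid() before starting the daemon on Unix, detaching it from the controlling terminal to protect it from SIGHUP
- daemon/mod: remove the reference to the deprecated watcher module
- watcher.rs: delete the file, which was entirely dead code (its functionality is already inlined into daemon/mod.rs)
- query: find_msg_tables now actually sorts by max_ts descending (the comment claimed sorting, but it was never implemented)
- scanner: remove dedup_by from scan_memory on all three platforms (search_pattern already dedups globally)
- watch: return an explicit error on Windows instead of failing silently
- CI: add --locked to cargo build to ensure the versions in Cargo.lock are used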
2026-04-16 15:12:33 +08:00
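Note on 8bfea8869e, mtime item: the difference between the two representations can be demonstrated with the standard library. Both function bodies are sketches, not the project's code. At present-day timestamps an f64 count of seconds has a resolution of a few hundred nanoseconds, so two distinct mtimes can compare equal.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Nanoseconds since the Unix epoch as an integer; times before the epoch
/// map to 0. Integer comparison preserves every nanosecond of difference.
fn mtime_nanos(t: SystemTime) -> u64 {
    t.duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos() as u64)
        .unwrap_or(0)
}

/// The older representation: seconds as f64. Small differences between two
/// timestamps can be rounded away.
fn mtime_secs_f64(t: SystemTime) -> f64 {
    t.duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs_f64())
        .unwrap_or(0.0)
}
```

For a cache keyed on "has the file changed", a rounded-away difference means serving stale data, which is why the integer form is the safer key.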
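jackwener 993ac1ed47 fix: fix 4 high-priority bugs found in review
- Cargo.toml: libc dependency scope changed from macos to unix (fixes the Linux build failure)
- scanner/macos.rs: VM_REGION_BASIC_INFO_COUNT_64 is now hard-coded to 9 (fixes the mach_vm_region call failure)
- crypto/wal.rs: the WAL frame for page 1 no longer takes the main DB's first-page special path (fixes WAL data corruption)
- daemon/query.rs: global search now passes the correct names_map (fixes the empty sender field)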
2026-04-16 14:48:03 +08:00
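jackwener d475f6219b feat: full Rust rewrite of wx-cli (single binary, supports macOS/Linux/Windows)
All core modules implemented:
- src/crypto/: SQLCipher 4 page decryption + WAL application (AES-256-CBC)
- src/scanner/: memory scanning on three platforms (macOS Mach VM / Linux /proc/mem / Windows ReadProcessMemory)
- src/daemon/: tokio async daemon, Unix socket IPC, mtime-aware DB cache, WAL watching with push notifications
- src/cli/: clap CLI, starts the daemon automatically, full command implementation
- src/config.rs: cross-platform config loading, compatible with the Python version's config.json format
- src/ipc.rs: newline-delimited JSON protocol, compatible with the Python version
- .github/workflows/release.yml: automated build and release for four platforms

cargo build --release verified; produces a 4.8MB single binary for macOS arm64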
2026-04-16 14:37:10 +08:00
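Note on d475f6219b, src/ipc.rs item: newline-delimited JSON framing means one message per line. The sketch below shows only the framing, over any std reader/writer, and treats the JSON payload as an opaque string. The function names `write_frame` and `read_frames` are hypothetical, as is the `cmd` field in the usage example.

```rust
use std::io::{self, BufRead, Write};

/// Write one message as a single line. The payload must not contain a raw
/// newline (JSON encoders escape newlines inside strings as \n).
fn write_frame<W: Write>(w: &mut W, json: &str) -> io::Result<()> {
    debug_assert!(!json.contains('\n'));
    w.write_all(json.as_bytes())?;
    w.write_all(b"\n")
}

/// Read all newline-terminated messages until EOF, skipping blank lines.
fn read_frames<R: BufRead>(r: R) -> io::Result<Vec<String>> {
    let mut frames = Vec::new();
    for line in r.lines() {
        let line = line?;
        if !line.trim().is_empty() {
            frames.push(line);
        }
    }
    Ok(frames)
}
```

For example, writing `{"cmd":"sessions"}` and `{"cmd":"unread"}` into a `Vec<u8>` and reading it back through `std::io::Cursor` yields the same two strings in order. Because the framing depends only on the newline byte, a Python peer using `readline()` and a Rust peer using `lines()` can interoperate.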