From 01d1cef57af3376b08ea95f47ae0f4a3cf81bf6f Mon Sep 17 00:00:00 2001
From: jackwener
Date: Thu, 14 May 2026 15:57:12 +0800
Subject: [PATCH] chore(biz-articles): drop PR draft, document command, fix typo
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Delete PR_DRAFT.md (a PR description draft that was committed to the repo by mistake and does not belong on main)
- README.md / SKILL.md: document `biz-articles` usage
- query.rs: fix typo 密鑰 → 密钥 ("key")

Co-authored-by: wx-cli-coder
---
 PR_DRAFT.md         | 126 --------------------------------------------
 README.md           |  15 ++++++
 SKILL.md            |  27 ++++++++++
 src/daemon/query.rs |   2 +-
 4 files changed, 43 insertions(+), 127 deletions(-)
 delete mode 100644 PR_DRAFT.md

diff --git a/PR_DRAFT.md b/PR_DRAFT.md
deleted file mode 100644
index f19eb14..0000000
--- a/PR_DRAFT.md
+++ /dev/null
@@ -1,126 +0,0 @@
-# feat(biz): add `wx biz-articles` command to query public account messages
-
-## Summary
-
-Adds a new `biz-articles` subcommand that queries locally cached WeChat public account (公众号) article pushes from `biz_message_0.db`.
-
-This enables a downstream workflow for downloading full article content:
-
-```bash
-wx biz-articles --since today --json | jq '.[].url' | xargs opencli weixin download
-```
-
-## Background
-
-- WeChat stores public account (官方账号) message pushes in a separate database: `message/biz_message_0.db` (SQLCipher 4 encrypted)
-- This DB was not exposed by any existing wx-cli command
-- The encryption key is already scanned and stored in `~/.wx-cli/all_keys.json` by `wx init`
-- Each public account has its own `Msg_{md5(username)}` table, following the same convention as `message_0.db`
-- Message content is zstd-compressed XML containing `/` structures with article metadata
-
-## New CLI Interface
-
-```bash
-# Last 50 articles (default)
-wx biz-articles
-
-# More articles
-wx biz-articles -n 200
-
-# Filter by public account name (fuzzy match on display name)
-wx biz-articles --account "返朴"
-wx biz-articles --account "Datawhale"
-
-# Time filter (article publish time, YYYY-MM-DD)
-wx biz-articles --since 2026-05-10
-wx biz-articles --since 2026-05-01 --until 2026-05-10
-
-# Show only accounts with unread messages, one latest article per account
-wx biz-articles --unread
-wx biz-articles --unread --account "Datawhale"  # combine: unread within specific account
-
-# JSON output (for downstream piping)
-wx biz-articles --json
-wx biz-articles --since 2026-05-10 --json | jq '.[].url'
-```
-
-## Output Fields
-
-Each article item includes:
-
-| Field | Description |
-|-------|-------------|
-| `time` | Article publish time (formatted) |
-| `timestamp` | Article publish timestamp (seconds) |
-| `recv_time` | Message receive time (when WeChat pushed it) |
-| `recv_time_str` | Message receive time (formatted) |
-| `account` | Public account display name |
-| `account_username` | Public account username (gh_*) |
-| `title` | Article title |
-| `url` | Article URL (mp.weixin.qq.com link) |
-| `digest` | Article summary/excerpt |
-| `cover_url` | Cover image URL |
-
-## Implementation Notes
-
-- `biz_message_0.db` is loaded on-demand via existing `DbCache` mechanism (no startup cost unless `biz-articles` is called)
-- The key for `message/biz_message_0.db` is already in `all_keys.json`, no changes to `wx init` needed
-- Multi-article pushes (图文消息) are expanded: each `` in `` becomes a separate output row
-- Items without URL or title (e.g., payment notifications from service accounts) are filtered out
-- New `extract_cdata` helper function strips CDATA wrappers from XML content
-- Results sorted by `pub_time` DESC (article publish time, not message receive time)
-
-### `--unread` semantics
-
-- Queries `session.db` for `unread_count > 0` rows whose `chat_type == official_account`, intersects with `--account` filter if both provided
-- Returns at most **one latest article per account** (dedupe by `account_username` after the global pub_time DESC sort)
-- Aligns with the behavior of `wx unread --filter official` for fast "what unread accounts are there + what's the latest title" scanning
-- Empty intersection short-circuits before scanning biz tables
-
-## Changes
-
-- `src/ipc.rs`: Add `BizArticles` IPC request variant
-- `src/cli/biz_articles.rs`: New CLI command handler (follows sns_feed pattern)
-- `src/cli/mod.rs`: Register `BizArticles` subcommand in clap + dispatch
-- `src/daemon/query.rs`: Add `q_biz_articles` query + `parse_biz_xml_items` + `extract_cdata` helpers + 8 unit tests
-- `src/daemon/server.rs`: Add dispatch case for `BizArticles`
-
-## Test Results
-
-```
-test result: ok. 49 passed; 0 failed; 0 ignored
-```
-
-New tests (8):
-- `biz_tests::extract_cdata_normal`
-- `biz_tests::extract_cdata_empty`
-- `biz_tests::extract_cdata_url`
-- `biz_tests::extract_cdata_no_cdata_wrapper`
-- `biz_tests::parse_biz_xml_items_single_article`
-- `biz_tests::parse_biz_xml_items_skips_no_url`
-- `biz_tests::parse_biz_xml_items_multi_article`
-- `biz_tests::parse_biz_xml_items_pub_time_fallback`
-
-## Verified Output (real WeChat install with ~30 public accounts, 2026-05-10)
-
-```yaml
-- account: 返朴
-  title: 细胞生物学家俞立:从后进生到科学家,一个ADHD孩子的逆袭
-  url: http://mp.weixin.qq.com/s?__biz=Mzg2MTUyODU2NA==&mid=2247642795&...
-
-- account: Datawhale
-  title: 刚刚,Claude Code 团队这篇文章爆了!
-  url: http://mp.weixin.qq.com/s?__biz=MzIyNjM2MzQyNg==&mid=2247722630&...
-
-- account: 土猛的员外
-  title: AI时代,企业的业务底座正在从数据库变成知识引擎
-  url: http://mp.weixin.qq.com/s?__biz=MzIyOTA5NTM1OA==&mid=2247485270&...
-```
-
-## Branch
-
-`ChenyqThu/wx-cli` → `feat/biz-articles`
-
----
-
-*Waiting for Lucien's review before opening PR.*
diff --git a/README.md b/README.md
index b9783ed..de816d1 100644
--- a/README.md
+++ b/README.md
@@ -196,6 +196,21 @@ wx sns-search "婚礼" --user "李四" --since 2023-01-01
 
 Moments data covers only the posts you have already viewed locally (the WeChat app downloads them on demand).
 
+### Public Account Articles
+
+Public account (公众号) article pushes are stored in a separate `biz_message_0.db`; query them with `biz-articles`:
+
+```bash
+wx biz-articles                          # latest 50 articles
+wx biz-articles -n 200                   # more
+wx biz-articles --account "返朴"          # filter by public account (fuzzy name match)
+wx biz-articles --since 2026-05-01 --until 2026-05-10
+wx biz-articles --unread                 # only accounts with unread messages, latest article per account
+wx biz-articles --json | jq '.[].url'    # pass URLs to downstream tools
+```
+
+Each item returns: `account` / `account_username` / `title` / `url` / `digest` / `cover_url` / `time` / `timestamp` / `recv_time_str`. Multi-article pushes are expanded into multiple rows.
+
 ### Contacts & Groups
 
 ```bash
diff --git a/SKILL.md b/SKILL.md
index 7d587af..fe7418c 100644
--- a/SKILL.md
+++ b/SKILL.md
@@ -215,6 +215,33 @@ wx sns-search "婚礼" --user "李四" --since 2023-01-01 -n 50
 
 > Only Moments posts you have already viewed locally are stored (the WeChat app downloads them on demand). Posts you have never viewed are not on disk, and no command can retrieve them.
 
+### Public Account Articles
+
+Public account (公众号) article pushes are stored in a separate `biz_message_0.db`, apart from the regular `message_0.db`:
+
+```bash
+# Latest 50 articles (default)
+wx biz-articles
+
+# More
+wx biz-articles -n 200
+
+# Filter by public account (fuzzy match on display name / username)
+wx biz-articles --account "返朴"
+
+# Time range (YYYY-MM-DD; publish time, not receive time)
+wx biz-articles --since 2026-05-01 --until 2026-05-10
+
+# Only accounts with unread messages, latest article per account (good for a "what's new today" scan)
+wx biz-articles --unread
+wx biz-articles --unread --account "Datawhale"   # intersected with --account
+
+# Downstream use: take the URLs and fetch the content
+wx biz-articles --since 2026-05-10 --json | jq '.[].url'
+```
+
+Fields returned per item: `account` / `account_username` (`gh_*`) / `title` / `url` (`mp.weixin.qq.com` link) / `digest` / `cover_url` / `time` + `timestamp` (article publish time) / `recv_time_str` + `recv_time` (time WeChat received the push). Multi-article pushes are expanded into multiple rows.
+
 ### Favorites & Statistics
 
 ```bash
diff --git a/src/daemon/query.rs b/src/daemon/query.rs
index 9805258..98574ab 100644
--- a/src/daemon/query.rs
+++ b/src/daemon/query.rs
@@ -3049,7 +3049,7 @@ pub async fn q_biz_articles(
     unread: bool,
 ) -> Result {
     let biz_path = db.get("message/biz_message_0.db").await?
-        .context("无法解密 biz_message_0.db,请确认 all_keys.json 包含对应密鑰")?
+        .context("无法解密 biz_message_0.db,请确认 all_keys.json 包含对应密钥")?
         ;
 
     // With --unread: fetch from session.db the subset of usernames matching "public account + unread_count>0",