feat: transcribe WeChat voice attachments

pull/104/head
Richard Liu 2026-06-09 13:12:17 +08:00
parent 08af894594
commit d7bd351cec
9 changed files with 1373 additions and 95 deletions

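@ -230,9 +230,9 @@ wx biz-articles --json | jq '.[].url' # downstream consumers read the URL
Each entry returns: `account` / `account_username` / `title` / `url` / `digest` / `cover_url` / `time` / `timestamp` / `recv_time_str`. A multi-article push is expanded into multiple rows.
### Attachment extraction (images)
### Attachment extraction (images; voice POC)
Attachment bodies in a chat are stored as `.dat` files under `xwechat_files/<wxid>/msg/attach/...`; recovering the original image requires decoding with the md5 from the message's `message_resource.db` plus a platform-specific image key.
Attachment bodies in a chat are stored in local databases or as resource files under `xwechat_files/<wxid>/msg/attach/...`. Images must be decoded with the md5 from the message's `message_resource.db` plus a platform-specific image key to recover the original. Voice is currently a POC: extraction first exports `voice_data` from `message/media_0.db::VoiceInfo`, falls back to the local file cache on a miss, and only copies the bytes as-is, with no transcoding or transcription.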
```bash
# 1) List the image attachments in a chat to get the opaque attachment_id
@ -240,14 +240,37 @@ wx attachments "张三"
wx attachments "AI群" --kind image -n 100
wx attachments "AI群" --since 2026-04-01 --until 2026-04-15
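# 2) Decrypt a single attachment_id and write it out (keeping an extension such as .jpg / .mp4 is recommended)
# POC: list voice message resources
wx attachments "张三" --kind voice -n 20
# 2) Write a single attachment_id out (images are decoded; the voice POC copies bytes as-is)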
wx extract <attachment_id> -o ~/Desktop/photo.jpg
wx extract <voice_attachment_id> -o /tmp/voice.aud
wx extract <attachment_id> -o /tmp/x.jpg --overwrite
```
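Each `attachments` entry carries `attachment_id` / `kind` / `type` / `local_id` / `timestamp` / `time`; group chats also include `sender` and the stable identity triple `sender_username` / `sender_contact_display` / `sender_group_nickname` (same semantics as in `history` / `search` / `new-messages`: `sender_username` is the wxid, which reliably tells apart two members with the same display name; when the wxid cannot be resolved, these three fields are omitted). `kind` is currently fixed to `image`; the command keeps the name `attachments` so that adding other attachment types later does not break the CLI.
Each `attachments` entry carries `attachment_id` / `kind` / `type` / `local_id` / `timestamp` / `time`; group chats also include `sender` and the stable identity triple `sender_username` / `sender_contact_display` / `sender_group_nickname` (same semantics as in `history` / `search` / `new-messages`: `sender_username` is the wxid, which reliably tells apart two members with the same display name; when the wxid cannot be resolved, these three fields are omitted). The default `kind` is `image`; `--kind voice` / `--kind audio` is experimental and depends on the local `media_0.db` or the voice file cache still being readable.
The `extract` report includes `md5` / `dat_path` / `dat_size` / `output` / `output_size` / `format` (the image format actually detected: jpg / png / gif / webp / hevc, etc.) / `decoder` (the decoder actually used: `legacy_xor` / `v1_aes` / `v2`).
The `extract` report includes `output` / `output_size` / `format` / `decoder`; when the hit comes from a local attachment file it also includes `md5` / `dat_path` / `dat_size`. For images, `format` is the image format actually detected (jpg / png / gif / webp / hevc, etc.) and `decoder` is `legacy_xor` / `v1_aes` / `v2`; for the voice POC, `decoder` is `media_0_voice_data` or `raw_copy`.
#### Voice transcription POC
`wx transcribe` runs a voice `attachment_id` through a fully local pipeline: export the raw WeChat voice bytes → SILK v3 decoder to PCM → `ffmpeg` to 16 kHz mono WAV → local ASR with `whisper.cpp`. wx-cli does not bundle a model and does not download dependencies; every tool runs on the local machine. `--keep-temp` keeps the intermediate audio files with directory permissions held at `0700`, but these files are still private voice data, so use the flag only when debugging.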
```bash
# Example dependencies:
# 1) Build kn007/silk-v3-decoder to get silk/decoder
# 2) Build whisper.cpp to get whisper-cli, and download a multilingual ggml model
# 3) ffmpeg on PATH
wx transcribe <voice_attachment_id> \
--silk-decoder /path/to/silk-v3-decoder/silk/decoder \
--whisper-bin /path/to/whisper.cpp/build/bin/whisper-cli \
--model /path/to/whisper.cpp/models/ggml-large-v3-turbo.bin \
--language zh
```
Environment variables can stand in for the flags: `WX_SILK_DECODER` / `WX_WHISPER_BIN` / `WX_WHISPER_MODEL` / `WX_FFMPEG`.
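The lookup order behind these flags and variables can be sketched as follows. This is a minimal illustration, assuming an explicit flag wins over the environment variable, which in turn wins over a `PATH` search; `resolve_tool_sketch` and `resolve_from_process_env` are hypothetical names used only for this example.

```rust
use std::path::PathBuf;

/// Pick a tool path: explicit flag first, then the environment variable's
/// value, then the first candidate name found in one of the PATH directories.
/// An explicit or env-provided path that is not a file yields `None`
/// instead of silently falling through to PATH.
fn resolve_tool_sketch(
    explicit: Option<PathBuf>,
    env_value: Option<PathBuf>,
    path_dirs: &[PathBuf],
    candidates: &[&str],
) -> Option<PathBuf> {
    if let Some(path) = explicit {
        return path.is_file().then_some(path);
    }
    if let Some(path) = env_value {
        return path.is_file().then_some(path);
    }
    for name in candidates {
        for dir in path_dirs {
            let path = dir.join(name);
            if path.is_file() {
                return Some(path);
            }
        }
    }
    None
}

/// Convenience wrapper (hypothetical) that reads the real process environment.
fn resolve_from_process_env(
    explicit: Option<PathBuf>,
    env_name: &str,
    candidates: &[&str],
) -> Option<PathBuf> {
    let env_value = std::env::var_os(env_name).map(PathBuf::from);
    let path_dirs: Vec<PathBuf> = std::env::var_os("PATH")
        .map(|paths| std::env::split_paths(&paths).collect())
        .unwrap_or_default();
    resolve_tool_sketch(explicit, env_value, &path_dirs, candidates)
}
```

Passing the environment value and the `PATH` directories as parameters keeps the core function testable without mutating process state; a caller would normally go through the wrapper, for example `resolve_from_process_env(None, "WX_FFMPEG", &["ffmpeg"])`.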
Supported decoder tiers:
- **legacy XOR**: early single-byte XOR with no magic; derived automatically by probing the format from the file's leading bytes

@ -267,24 +267,42 @@ wx biz-articles --since 2026-05-10 --json | jq '.[].url'
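Fields returned per entry: `account` / `account_username` (`gh_*`) / `title` / `url` (an `mp.weixin.qq.com` link) / `digest` / `cover_url` / `time` + `timestamp` (article publication time) / `recv_time_str` + `recv_time` (when WeChat received the push). A multi-article push is expanded into multiple rows.
### Attachment extraction (images)
### Attachment extraction (images; voice POC)
Image bodies in a chat are stored encrypted (`.dat`) under `xwechat_files/<wxid>/msg/attach/...`; decoding requires the md5 from the message's `message_resource.db` plus a platform-specific image key. It takes two steps:
Attachment bodies in a chat are stored in local databases or as resource files under `xwechat_files/<wxid>/msg/attach/...`. Images must be decoded with the md5 from the message's `message_resource.db` plus a platform-specific image key to recover the original. Voice is currently a POC: extraction first exports `voice_data` from `message/media_0.db::VoiceInfo`, falls back to the local file cache on a miss, and only copies the bytes as-is, with no transcoding or transcription.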
```bash
# 1) First list the image attachments to get the opaque attachment_id
# 1) First list the attachments to get the opaque attachment_id
wx attachments "张三"
wx attachments "AI群" --kind image -n 100
wx attachments "AI群" --since 2026-04-01 --until 2026-04-15
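# 2) Use the attachment_id to decrypt a single resource and write it to the given path
# POC: list voice message resources
wx attachments "张三" --kind voice -n 20
# 2) Use the attachment_id to write a single resource to the given path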
wx extract <attachment_id> -o ~/Desktop/photo.jpg
wx extract <voice_attachment_id> -o /tmp/voice.aud
wx extract <attachment_id> -o /tmp/x.jpg --overwrite
```
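Each `attachments` entry carries `attachment_id` / `kind` (currently fixed to `image`) / `type` / `local_id` / `timestamp` / `time`; group chats additionally carry `sender` and the stable identity triple (as described above). The command keeps the name `attachments` so that adding other attachment types later does not break the CLI.
Each `attachments` entry carries `attachment_id` / `kind` / `type` / `local_id` / `timestamp` / `time`; group chats additionally carry `sender` and the stable identity triple (as described above). The default `kind` is `image`; `--kind voice` / `--kind audio` is a POC that first exports `voice_data` from `message/media_0.db::VoiceInfo`, falls back to the local file cache on a miss, and only copies the bytes as-is, with no transcoding or transcription.
The `extract` report includes `md5` / `dat_path` / `dat_size` / `output` / `output_size` / `format` (the image format actually detected: jpg / png / gif / webp / hevc, etc.) / `decoder` (the decoder actually used: `legacy_xor` / `v1_aes` / `v2`).
The `extract` report includes `output` / `output_size` / `format` / `decoder`; when the hit comes from a local attachment file it also includes `md5` / `dat_path` / `dat_size`. For images, `decoder` is `legacy_xor` / `v1_aes` / `v2`; for the voice POC, `decoder` is `media_0_voice_data` or `raw_copy`.
#### Voice transcription POC
`wx transcribe` runs a voice `attachment_id` through a fully local pipeline: export the raw WeChat voice bytes → SILK v3 decoder to PCM → `ffmpeg` to 16 kHz mono WAV → local ASR with `whisper.cpp`. wx-cli does not bundle a model and does not download dependencies; every tool runs on the local machine. `--keep-temp` keeps the intermediate audio files with directory permissions held at `0700`, but these files are still private voice data, so use the flag only when debugging.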
```bash
wx transcribe <voice_attachment_id> \
--silk-decoder /path/to/silk-v3-decoder/silk/decoder \
--whisper-bin /path/to/whisper.cpp/build/bin/whisper-cli \
--model /path/to/whisper.cpp/models/ggml-large-v3-turbo.bin \
--language zh
```
Environment variables can stand in for the flags: `WX_SILK_DECODER` / `WX_WHISPER_BIN` / `WX_WHISPER_MODEL` / `WX_FFMPEG`.
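The SILK step of the transcription pipeline hands the decoder a stream that begins at the `#!SILK` marker. A minimal sketch of that normalization, assuming the marker sits within the first 8 bytes of the exported voice data (WeChat voice files are commonly reported to carry one extra leading byte); `find_prefix` and `normalize_silk` are illustrative names, not wx-cli API:

```rust
/// Return the offset of `needle` in `haystack`, considering only start
/// positions from 0 through `max_offset` inclusive.
fn find_prefix(haystack: &[u8], needle: &[u8], max_offset: usize) -> Option<usize> {
    if needle.is_empty() || haystack.len() < needle.len() {
        return None;
    }
    let last_start = (haystack.len() - needle.len()).min(max_offset);
    (0..=last_start).find(|&i| &haystack[i..i + needle.len()] == needle)
}

/// Drop any bytes that precede the `#!SILK` marker so the decoder sees a
/// stream starting at the header. Returns `None` when no marker is found
/// near the start, in which case the data should not be treated as SILK.
fn normalize_silk(raw: &[u8]) -> Option<&[u8]> {
    find_prefix(raw, b"#!SILK", 8).map(|offset| &raw[offset..])
}
```

Bounding the search to the first few bytes avoids misclassifying another format that merely contains the byte sequence somewhere in its payload.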
Supported decoder tiers:
- **legacy XOR**: early single-byte XOR with no magic; derived automatically by probing the format from the file's leading bytes

@ -17,9 +17,10 @@
use anyhow::{anyhow, Context, Result};
use chrono::TimeZone;
use rusqlite::Connection;
use std::collections::HashSet;
use std::path::{Path, PathBuf};
use super::AttachmentId;
use super::{AttachmentId, AttachmentKind};
/// Resolution result for a single attachment across the resource database and the local attach tree.
#[derive(Debug, Clone)]
@ -40,6 +41,14 @@ pub struct AttachmentMetadata {
pub md5: String,
}
/// One voice resource from `message/media_0.db::VoiceInfo`.
#[derive(Debug, Clone)]
pub struct ResolvedVoiceMedia {
pub data: Vec<u8>,
pub chunks: usize,
pub svr_id: Option<i64>,
}
/// Look up the file md5 in message_resource.db by `(chat, local_id)`.
///
/// The caller passes the path to an already-decrypted `message_resource.db` (prepared by the daemon's `DBCache`).
@ -87,8 +96,8 @@ pub fn lookup_md5_blocking(
)
.ok();
let packed: Option<Vec<u8>> = packed_exact.or_else(|| conn
.query_row(
let packed: Option<Vec<u8>> = packed_exact.or_else(|| {
conn.query_row(
"SELECT packed_info FROM MessageResourceInfo
WHERE chat_id = ?1
AND message_local_id = ?2
@ -98,7 +107,8 @@ pub fn lookup_md5_blocking(
rusqlite::params![chat_id, local_id, msg_local_type_lo32],
|row| row.get(0),
)
.ok());
.ok()
});
let Some(blob) = packed else {
return Ok(None);
@ -106,6 +116,170 @@ pub fn lookup_md5_blocking(
Ok(extract_md5_from_packed_info(&blob).map(|md5| AttachmentMetadata { md5 }))
}
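/// Read the voice BLOB from the VoiceInfo table in `message/media_0.db`.
///
/// WeChat 4.x voice messages do not always land in `message_resource.db`; the common path is
/// `media_0.db::VoiceInfo(local_id, create_time, voice_data, data_index)`.
/// `data_index` leaves room for chunking, so all chunks of one voice message are concatenated in data_index order here.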
pub fn lookup_voice_media_blocking(
media_db_path: &Path,
chat: &str,
local_id: i64,
create_time: i64,
) -> Result<Option<ResolvedVoiceMedia>> {
let conn = Connection::open_with_flags(
media_db_path,
rusqlite::OpenFlags::SQLITE_OPEN_READ_ONLY | rusqlite::OpenFlags::SQLITE_OPEN_URI,
)
.with_context(|| format!("failed to open media_0.db {:?}", media_db_path))?;
let has_voice_info: bool = conn
.query_row(
"SELECT 1 FROM sqlite_master WHERE type='table' AND name='VoiceInfo'",
[],
|_| Ok(()),
)
.is_ok();
if !has_voice_info {
return Ok(None);
}
let columns = table_columns(&conn, "VoiceInfo")?;
if !columns.contains("voice_data") {
return Ok(None);
}
let data_index_expr = if columns.contains("data_index") {
"CAST(COALESCE(data_index, '0') AS INTEGER)"
} else {
"0"
};
let svr_id_expr = if columns.contains("svr_id") {
"svr_id"
} else {
"NULL"
};
let mut rows = Vec::new();
if columns.contains("local_id") {
if columns.contains("chat_name_id") {
let chat_id: Option<i64> = conn
.query_row(
"SELECT rowid FROM Name2Id WHERE user_name = ?1",
[chat],
|row| row.get(0),
)
.ok();
let Some(chat_id) = chat_id else {
return Ok(None);
};
if columns.contains("create_time") {
rows = query_voice_rows(
&conn,
"chat_name_id = ?1 AND local_id = ?2 AND create_time = ?3",
rusqlite::params![chat_id, local_id, create_time],
data_index_expr,
svr_id_expr,
)?;
}
if rows.is_empty() && !columns.contains("create_time") {
rows = query_voice_rows(
&conn,
"chat_name_id = ?1 AND local_id = ?2",
rusqlite::params![chat_id, local_id],
data_index_expr,
svr_id_expr,
)?;
}
}
}
if rows.is_empty() && columns.contains("msgid") {
if !columns.contains("user_name") {
return Ok(None);
}
if columns.contains("msgtime") {
rows = query_voice_rows(
&conn,
"user_name = ?1 AND msgid = ?2 AND msgtime = ?3",
rusqlite::params![chat, local_id, create_time],
data_index_expr,
svr_id_expr,
)?;
}
if rows.is_empty() && !columns.contains("msgtime") {
rows = query_voice_rows(
&conn,
"user_name = ?1 AND msgid = ?2",
rusqlite::params![chat, local_id],
data_index_expr,
svr_id_expr,
)?;
}
}
if rows.is_empty() {
return Ok(None);
}
rows.sort_by_key(|row| row.0);
let svr_id = rows.iter().find_map(|row| row.2);
let chunks = rows.len();
let total_len: usize = rows.iter().map(|row| row.1.len()).sum();
if total_len == 0 {
return Ok(None);
}
let mut data = Vec::with_capacity(total_len);
for (_idx, chunk, _svr_id) in rows {
data.extend_from_slice(&chunk);
}
Ok(Some(ResolvedVoiceMedia {
data,
chunks,
svr_id,
}))
}
fn table_columns(conn: &Connection, table: &str) -> Result<HashSet<String>> {
let mut stmt = conn.prepare(&format!("PRAGMA table_info({table})"))?;
let columns = stmt
.query_map([], |row| row.get::<_, String>(1))?
.collect::<rusqlite::Result<HashSet<_>>>()?;
Ok(columns)
}
fn query_voice_rows<P>(
conn: &Connection,
where_clause: &str,
params: P,
data_index_expr: &str,
svr_id_expr: &str,
) -> Result<Vec<(i64, Vec<u8>, Option<i64>)>>
where
P: rusqlite::Params,
{
let sql = format!(
"SELECT {data_index_expr} AS voice_index, voice_data, {svr_id_expr} AS voice_svr_id
FROM VoiceInfo
WHERE {where_clause}
ORDER BY voice_index, rowid"
);
let mut stmt = conn.prepare(&sql)?;
let rows = stmt
.query_map(params, |row| {
Ok((
row.get::<_, i64>(0).unwrap_or(0),
row.get::<_, Vec<u8>>(1).unwrap_or_default(),
row.get::<_, i64>(2).ok(),
))
})?
.collect::<rusqlite::Result<Vec<_>>>()?;
Ok(rows)
}
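The chunk handling in `lookup_voice_media_blocking` comes down to ordering rows by index and concatenating their payloads. A minimal standalone sketch of that step, assuming each row has already been reduced to a `(data_index, voice_data)` pair; `concat_voice_chunks` is a hypothetical helper written for illustration:

```rust
/// Concatenate the chunks of one voice message in ascending index order.
/// Returns `None` when there is no payload at all (no rows, or only empty
/// chunks), mirroring the "not found" outcome of the lookup.
fn concat_voice_chunks(mut rows: Vec<(i64, Vec<u8>)>) -> Option<Vec<u8>> {
    // `sort_by_key` is stable, so rows sharing an index keep their input order.
    rows.sort_by_key(|row| row.0);
    let total_len: usize = rows.iter().map(|row| row.1.len()).sum();
    if total_len == 0 {
        return None;
    }
    let mut data = Vec::with_capacity(total_len);
    for (_index, chunk) in rows {
        data.extend_from_slice(&chunk);
    }
    Some(data)
}
```

Pre-computing the total length allows a single allocation for the joined buffer, which matters little for short voice notes but costs nothing.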
/// Extract the 32-byte ASCII hex md5 from `MessageResourceInfo.packed_info` (protobuf).
///
/// Main path: search for the 4-byte marker `12 22 0a 20` (field=2 LEN, length=34, sub field=1 LEN, length=32)
@ -145,12 +319,10 @@ fn find_subslice(haystack: &[u8], needle: &[u8]) -> Option<usize> {
if needle.is_empty() || needle.len() > haystack.len() {
return None;
}
haystack
.windows(needle.len())
.position(|w| w == needle)
haystack.windows(needle.len()).position(|w| w == needle)
}
/// Find the file under `<attach_root>/<md5(chat)>/<YYYY-MM>/Img/<md5>[_t|_h].dat`.
/// Find the image file under `<attach_root>/<md5(chat)>/<YYYY-MM>/Img/<md5>[_t|_h].dat`.
///
/// Priority: full > `_h` (HD thumbnail) > `_t` (thumbnail). Returns the best match;
/// returns None when nothing is found.
@ -163,18 +335,40 @@ pub fn find_dat_file(
chat: &str,
file_md5: &str,
create_time: i64,
) -> Option<PathBuf> {
find_media_file(
attach_root,
chat,
file_md5,
create_time,
AttachmentKind::Image,
)
}
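/// Locate the media file of the given kind in the local attachment tree.
///
/// image follows the already-verified `Img/<md5>[_h|_t].dat` rule; voice is a POC path that first tries
/// a file named after the md5 in the `Voice` / `Audio` / `Aud` directories, and finally falls back to a recursive md5-prefix search under `msg/attach`.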
pub fn find_media_file(
attach_root: &Path,
chat: &str,
file_md5: &str,
create_time: i64,
kind: AttachmentKind,
) -> Option<PathBuf> {
let chat_hash = format!("{:x}", md5::compute(chat.as_bytes()));
let chat_dir = attach_root.join(&chat_hash);
if !chat_dir.is_dir() {
return None;
return match kind {
AttachmentKind::Voice => find_by_md5_recursive(attach_root, file_md5, kind),
_ => None,
};
}
// Step 1: try the create_time month plus one month before and after (3 candidate directories in total)
let candidates_ym: Vec<String> = three_month_candidates(create_time);
for ym in &candidates_ym {
let img_dir = chat_dir.join(ym).join("Img");
if let Some(p) = pick_best_in_img_dir(&img_dir, file_md5) {
if let Some(p) = pick_best_in_month_dir(&chat_dir.join(ym), file_md5, kind) {
return Some(p);
}
}
@ -189,12 +383,37 @@ pub fn find_dat_file(
// The 3 candidates already tried could be skipped, but the cost is tiny; keep the full scan
all_months.sort();
for month_dir in all_months {
let img_dir = month_dir.join("Img");
if let Some(p) = pick_best_in_img_dir(&img_dir, file_md5) {
if let Some(p) = pick_best_in_month_dir(&month_dir, file_md5, kind) {
return Some(p);
}
}
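// POC fallback: the voice path on Mac 4.x is not fully verified. If the directory names guessed above are wrong, still search by resource md5
// recursively under the attach tree once, so a `Voice`/`Audio` layout difference does not cause an outright failure.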
match kind {
AttachmentKind::Voice => find_by_md5_recursive(attach_root, file_md5, kind),
_ => None,
}
}
fn pick_best_in_month_dir(
month_dir: &Path,
file_md5: &str,
kind: AttachmentKind,
) -> Option<PathBuf> {
match kind {
AttachmentKind::Image => pick_best_in_img_dir(&month_dir.join("Img"), file_md5),
AttachmentKind::Voice => {
for subdir in ["Voice", "Audio", "Aud"] {
if let Some(p) = pick_best_media_file(&month_dir.join(subdir), file_md5, kind) {
return Some(p);
}
}
None
}
AttachmentKind::Video => pick_best_media_file(&month_dir.join("Video"), file_md5, kind),
AttachmentKind::File => pick_best_media_file(month_dir, file_md5, kind),
}
}
fn pick_best_in_img_dir(img_dir: &Path, file_md5: &str) -> Option<PathBuf> {
@ -216,6 +435,94 @@ fn pick_best_in_img_dir(img_dir: &Path, file_md5: &str) -> Option<PathBuf> {
None
}
fn pick_best_media_file(media_dir: &Path, file_md5: &str, kind: AttachmentKind) -> Option<PathBuf> {
if !media_dir.is_dir() {
return None;
}
for name in exact_media_names(file_md5, kind) {
let path = media_dir.join(name);
if path.is_file() {
return Some(path);
}
}
let mut candidates = media_dir
.read_dir()
.ok()?
.filter_map(|e| e.ok())
.map(|e| e.path())
.filter(|p| {
p.is_file()
&& p.file_name()
.and_then(|s| s.to_str())
.map(|name| name.starts_with(file_md5))
.unwrap_or(false)
})
.collect::<Vec<_>>();
candidates.sort_by_key(|p| {
let size = p.metadata().map(|m| m.len()).unwrap_or(0);
std::cmp::Reverse(size)
});
candidates.into_iter().next()
}
fn exact_media_names(file_md5: &str, kind: AttachmentKind) -> Vec<String> {
match kind {
AttachmentKind::Image => vec![
format!("{}.dat", file_md5),
format!("{}_h.dat", file_md5),
format!("{}_t.dat", file_md5),
],
AttachmentKind::Voice => ["", ".aud", ".amr", ".silk", ".wav", ".m4a", ".mp3", ".dat"]
.iter()
.map(|ext| format!("{}{}", file_md5, ext))
.collect(),
AttachmentKind::Video => [".mp4", ".mov", ".m4v", ".dat"]
.iter()
.map(|ext| format!("{}{}", file_md5, ext))
.collect(),
AttachmentKind::File => vec![file_md5.to_string()],
}
}
fn find_by_md5_recursive(root: &Path, file_md5: &str, kind: AttachmentKind) -> Option<PathBuf> {
if !root.is_dir() {
return None;
}
let mut stack = vec![root.to_path_buf()];
let mut matches = Vec::new();
while let Some(dir) = stack.pop() {
let Ok(entries) = std::fs::read_dir(&dir) else {
continue;
};
for entry in entries.filter_map(|e| e.ok()) {
let path = entry.path();
if path.is_dir() {
stack.push(path);
continue;
}
if !path.is_file() {
continue;
}
let Some(name) = path.file_name().and_then(|s| s.to_str()) else {
continue;
};
if name == file_md5
|| exact_media_names(file_md5, kind).iter().any(|n| n == name)
|| name.starts_with(file_md5)
{
matches.push(path);
}
}
}
matches.sort_by_key(|p| {
let size = p.metadata().map(|m| m.len()).unwrap_or(0);
std::cmp::Reverse(size)
});
matches.into_iter().next()
}
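Both `pick_best_media_file` and `find_by_md5_recursive` break ties the same way: among files whose names start with the md5, the largest wins. A minimal single-directory sketch of that rule, with `largest_with_prefix` as a hypothetical name and unreadable metadata assumed to count as size 0:

```rust
use std::path::{Path, PathBuf};

/// Among regular files in `dir` whose names start with `prefix`,
/// return the one with the greatest size on disk.
fn largest_with_prefix(dir: &Path, prefix: &str) -> Option<PathBuf> {
    let mut candidates: Vec<(u64, PathBuf)> = std::fs::read_dir(dir)
        .ok()?
        .filter_map(|entry| entry.ok())
        .map(|entry| entry.path())
        .filter(|path| {
            path.is_file()
                && path
                    .file_name()
                    .and_then(|name| name.to_str())
                    .map_or(false, |name| name.starts_with(prefix))
        })
        .map(|path| {
            // Treat files whose metadata cannot be read as empty.
            let size = path.metadata().map(|m| m.len()).unwrap_or(0);
            (size, path)
        })
        .collect();
    // Largest first.
    candidates.sort_by(|a, b| b.0.cmp(&a.0));
    candidates.into_iter().next().map(|(_size, path)| path)
}
```

Preferring the largest file is a heuristic: a truncated or thumbnail-like copy tends to be smaller than the full recording, but nothing guarantees the largest match is the intended one.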
fn three_month_candidates(unix_ts: i64) -> Vec<String> {
use chrono::{Datelike, Duration};
let dt = match chrono::Local.timestamp_opt(unix_ts, 0).single() {
@ -268,10 +575,12 @@ pub fn resolve_blocking(
)
})?;
let dat_path = find_dat_file(attach_root, &id.chat, &meta.md5, id.create_time).ok_or_else(
let dat_path =
find_media_file(attach_root, &id.chat, &meta.md5, id.create_time, id.kind).ok_or_else(
|| {
anyhow!(
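"local .dat not found (md5={} chat={} create_time={}): WeChat may not have downloaded the attachment yet, or it may have been cleaned up",
"local attachment file not found (kind={} md5={} chat={} create_time={}): WeChat may not have downloaded the attachment yet, or it may have been cleaned up",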
id.kind.as_str(),
meta.md5,
id.chat,
id.create_time
@ -280,7 +589,12 @@ pub fn resolve_blocking(
)?;
let size = std::fs::metadata(&dat_path).map(|m| m.len()).unwrap_or(0);
Ok(ResolvedAttachment { id: id.clone(), md5: meta.md5, dat_path, size })
Ok(ResolvedAttachment {
id: id.clone(),
md5: meta.md5,
dat_path,
size,
})
}
#[cfg(test)]
@ -334,10 +648,7 @@ mod tests {
let dir = tempdir_for_test();
let db_path = dir.join("message_resource.db");
let conn = Connection::open(&db_path).unwrap();
conn.execute(
"CREATE TABLE ChatName2Id (user_name TEXT)",
[],
)
conn.execute("CREATE TABLE ChatName2Id (user_name TEXT)", [])
.unwrap();
conn.execute(
"INSERT INTO ChatName2Id (rowid, user_name) VALUES (1, 'room@chatroom')",
@ -392,6 +703,208 @@ mod tests {
assert_eq!(new.md5, "22222222222222222222222222222222");
}
#[test]
fn lookup_voice_media_reads_chunks_from_media_db() {
let dir = tempdir_for_test();
let db_path = dir.join("media_0.db");
let conn = Connection::open(&db_path).unwrap();
conn.execute("CREATE TABLE Name2Id (user_name TEXT)", [])
.unwrap();
conn.execute(
"INSERT INTO Name2Id (rowid, user_name) VALUES (9, 'room@chatroom')",
[],
)
.unwrap();
conn.execute(
"CREATE TABLE VoiceInfo (
chat_name_id INTEGER,
create_time INTEGER,
local_id INTEGER,
svr_id INTEGER,
voice_data BLOB,
data_index TEXT DEFAULT '0'
)",
[],
)
.unwrap();
conn.execute(
"INSERT INTO VoiceInfo
(chat_name_id, create_time, local_id, svr_id, voice_data, data_index)
VALUES (?1, ?2, ?3, ?4, ?5, ?6)",
rusqlite::params![9i64, 2000i64, 7i64, 123i64, b"two", "2"],
)
.unwrap();
conn.execute(
"INSERT INTO VoiceInfo
(chat_name_id, create_time, local_id, svr_id, voice_data, data_index)
VALUES (?1, ?2, ?3, ?4, ?5, ?6)",
rusqlite::params![9i64, 2000i64, 7i64, 123i64, b"one", "1"],
)
.unwrap();
let media = lookup_voice_media_blocking(&db_path, "room@chatroom", 7, 2000)
.unwrap()
.unwrap();
assert_eq!(media.data, b"onetwo");
assert_eq!(media.chunks, 2);
assert_eq!(media.svr_id, Some(123));
}
#[test]
fn lookup_voice_media_keeps_rows_scoped_to_chat() {
let dir = tempdir_for_test();
let db_path = dir.join("media_0.db");
let conn = Connection::open(&db_path).unwrap();
conn.execute("CREATE TABLE Name2Id (user_name TEXT)", [])
.unwrap();
conn.execute(
"INSERT INTO Name2Id (rowid, user_name) VALUES (9, 'room@chatroom')",
[],
)
.unwrap();
conn.execute(
"INSERT INTO Name2Id (rowid, user_name) VALUES (10, 'other@chatroom')",
[],
)
.unwrap();
conn.execute(
"CREATE TABLE VoiceInfo (
chat_name_id INTEGER,
create_time INTEGER,
local_id INTEGER,
svr_id INTEGER,
voice_data BLOB,
data_index TEXT DEFAULT '0'
)",
[],
)
.unwrap();
for (chat_id, data) in [(10i64, b"wrong".as_slice()), (9i64, b"right".as_slice())] {
conn.execute(
"INSERT INTO VoiceInfo
(chat_name_id, create_time, local_id, svr_id, voice_data, data_index)
VALUES (?1, ?2, ?3, ?4, ?5, ?6)",
rusqlite::params![chat_id, 2000i64, 7i64, 123i64, data, "0"],
)
.unwrap();
}
let media = lookup_voice_media_blocking(&db_path, "room@chatroom", 7, 2000)
.unwrap()
.unwrap();
assert_eq!(media.data, b"right");
}
#[test]
fn lookup_voice_media_uses_create_time_to_disambiguate_reused_local_id() {
let dir = tempdir_for_test();
let db_path = dir.join("media_0.db");
let conn = Connection::open(&db_path).unwrap();
conn.execute("CREATE TABLE Name2Id (user_name TEXT)", [])
.unwrap();
conn.execute(
"INSERT INTO Name2Id (rowid, user_name) VALUES (9, 'room@chatroom')",
[],
)
.unwrap();
conn.execute(
"CREATE TABLE VoiceInfo (
chat_name_id INTEGER,
create_time INTEGER,
local_id INTEGER,
svr_id INTEGER,
voice_data BLOB,
data_index TEXT DEFAULT '0'
)",
[],
)
.unwrap();
for (create_time, data) in [(1000i64, b"old".as_slice()), (2000i64, b"new".as_slice())] {
conn.execute(
"INSERT INTO VoiceInfo
(chat_name_id, create_time, local_id, svr_id, voice_data, data_index)
VALUES (?1, ?2, ?3, ?4, ?5, ?6)",
rusqlite::params![9i64, create_time, 7i64, 123i64, data, "0"],
)
.unwrap();
}
let media = lookup_voice_media_blocking(&db_path, "room@chatroom", 7, 2000)
.unwrap()
.unwrap();
assert_eq!(media.data, b"new");
assert!(
lookup_voice_media_blocking(&db_path, "room@chatroom", 7, 3000)
.unwrap()
.is_none()
);
}
#[test]
fn lookup_voice_media_reads_legacy_schema_without_chunk_columns() {
let dir = tempdir_for_test();
let db_path = dir.join("media_0.db");
let conn = Connection::open(&db_path).unwrap();
conn.execute(
"CREATE TABLE VoiceInfo (
user_name TEXT,
msgid INTEGER,
msgtime INTEGER,
voice_data BLOB
)",
[],
)
.unwrap();
conn.execute(
"INSERT INTO VoiceInfo (user_name, msgid, msgtime, voice_data)
VALUES (?1, ?2, ?3, ?4)",
rusqlite::params!["room@chatroom", 7i64, 2000i64, b"voice"],
)
.unwrap();
let media = lookup_voice_media_blocking(&db_path, "room@chatroom", 7, 2000)
.unwrap()
.unwrap();
assert_eq!(media.data, b"voice");
assert_eq!(media.chunks, 1);
assert_eq!(media.svr_id, None);
}
#[test]
fn lookup_voice_media_legacy_schema_uses_msgtime_to_disambiguate_reused_msgid() {
let dir = tempdir_for_test();
let db_path = dir.join("media_0.db");
let conn = Connection::open(&db_path).unwrap();
conn.execute(
"CREATE TABLE VoiceInfo (
user_name TEXT,
msgid INTEGER,
msgtime INTEGER,
voice_data BLOB
)",
[],
)
.unwrap();
for (msgtime, data) in [(1000i64, b"old".as_slice()), (2000i64, b"new".as_slice())] {
conn.execute(
"INSERT INTO VoiceInfo (user_name, msgid, msgtime, voice_data)
VALUES (?1, ?2, ?3, ?4)",
rusqlite::params!["room@chatroom", 7i64, msgtime, data],
)
.unwrap();
}
let media = lookup_voice_media_blocking(&db_path, "room@chatroom", 7, 2000)
.unwrap()
.unwrap();
assert_eq!(media.data, b"new");
assert!(
lookup_voice_media_blocking(&db_path, "room@chatroom", 7, 3000)
.unwrap()
.is_none()
);
}
#[test]
fn three_month_candidates_includes_prev_curr_next() {
// 2025-08-15 (mid-month) → 2025-07, 2025-08, 2025-09
@ -415,17 +928,58 @@ mod tests {
std::fs::write(img.join(format!("{}_h.dat", md5)), b"hd").unwrap();
// with only _t / _h present, pick _h
assert_eq!(
pick_best_in_img_dir(&img, md5).unwrap().file_name().unwrap(),
pick_best_in_img_dir(&img, md5)
.unwrap()
.file_name()
.unwrap(),
format!("{}_h.dat", md5).as_str()
);
// once the full file is added, pick full
std::fs::write(img.join(format!("{}.dat", md5)), b"full").unwrap();
assert_eq!(
pick_best_in_img_dir(&img, md5).unwrap().file_name().unwrap(),
pick_best_in_img_dir(&img, md5)
.unwrap()
.file_name()
.unwrap(),
format!("{}.dat", md5).as_str()
);
}
#[test]
fn find_media_file_finds_voice_by_month_voice_dir() {
let tmp = tempdir_for_test();
let chat = "room@chatroom";
let chat_hash = format!("{:x}", md5::compute(chat.as_bytes()));
let ts = chrono::Local
.with_ymd_and_hms(2026, 6, 9, 12, 0, 0)
.unwrap()
.timestamp();
let voice_dir = tmp.join(chat_hash).join("2026-06").join("Voice");
std::fs::create_dir_all(&voice_dir).unwrap();
let md5 = "00112233445566778899aabbccddeeff";
std::fs::write(voice_dir.join(format!("{}.aud", md5)), b"voice").unwrap();
let found = find_media_file(&tmp, chat, md5, ts, AttachmentKind::Voice).unwrap();
assert_eq!(found.file_name().unwrap(), format!("{}.aud", md5).as_str());
}
#[test]
fn find_media_file_voice_recurses_when_layout_unknown() {
let tmp = tempdir_for_test();
let chat = "room@chatroom";
let ts = chrono::Local
.with_ymd_and_hms(2026, 6, 9, 12, 0, 0)
.unwrap()
.timestamp();
let odd_dir = tmp.join("somehash").join("2026-06").join("NotVoice");
std::fs::create_dir_all(&odd_dir).unwrap();
let md5 = "abcdefabcdefabcdefabcdefabcdefab";
std::fs::write(odd_dir.join(format!("{}.silk", md5)), b"voice").unwrap();
let found = find_media_file(&tmp, chat, md5, ts, AttachmentKind::Voice).unwrap();
assert_eq!(found.file_name().unwrap(), format!("{}.silk", md5).as_str());
}
fn tempdir_for_test() -> PathBuf {
let pid = std::process::id();
let nanos = std::time::SystemTime::now()

@ -8,7 +8,8 @@ use crate::ipc::Request;
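/// `wx attachments`: list the attachment messages of a given chat (default image; multiple kinds allowed)
///
/// Prints an `attachment_id` per entry; only when it is passed to `wx extract` are message_resource.db
/// and the local .dat actually read and decoded. This step only queries the `Msg_<chat>` table, so even a group chat with thousands of messages returns within a second.
/// and the local resource files actually read. In the POC, image is decoded and voice/audio is copied as-is; this step only queries the
/// `Msg_<chat>` table, so even a group chat with thousands of messages returns within a second.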
pub fn cmd_attachments(
chat: String,
kinds: Vec<String>,

@ -1,14 +1,14 @@
use anyhow::Result;
use crate::ipc::Request;
use super::output::{print_value, resolve};
use super::transport;
use crate::ipc::Request;
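/// `wx extract`: decrypt the resource behind a single `attachment_id` and write it to the given path.
/// `wx extract`: write the resource behind a single `attachment_id` to the given path.
///
/// Daemon side: parse the `attachment_id` → look up the file md5 in `message_resource.db` →
/// find the .dat under `<wxchat_base>/msg/attach/...` → dispatch to the v1/v2 decoder by magic →
/// write out the real image/file
/// find the resource file under `<wxchat_base>/msg/attach/...`. image is dispatched to the v1/v2
/// decoder by magic; voice/audio POC is copied as-is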
pub fn cmd_extract(
attachment_id: String,
output: String,

@ -16,12 +16,14 @@ pub mod sns_feed;
pub mod sns_notifications;
pub mod sns_search;
pub mod stats;
pub mod transcribe;
pub mod transport;
pub mod unread;
use self::output::OutputOpts;
use anyhow::Result;
use clap::{Parser, Subcommand};
use std::path::PathBuf;
/// wx: a CLI for local WeChat data
#[derive(Parser)]
@ -271,13 +273,13 @@ enum Commands {
#[arg(long)]
json: bool,
},
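/// List the image attachments of a chat and return opaque attachment_ids
/// List the attachments of a chat and return opaque attachment_ids
Attachments {
/// Chat name (contact display name / wxid / @chatroom username all work)
chat: String,
/// Kind (currently only image is supported)
/// Kind (the POC supports image / voice)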
#[arg(long = "kind", value_name = "KIND",
value_parser = ["image", "img"])]
value_parser = ["image", "img", "voice", "audio"])]
kinds: Vec<String>,
/// Number of entries to show
#[arg(short = 'n', long, default_value = "50")]
@ -295,11 +297,11 @@ enum Commands {
#[arg(long)]
json: bool,
},
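/// Decrypt the resource behind a single attachment_id and write it to the given file path
/// Write the resource behind a single attachment_id to the given file path
Extract {
/// Opaque ID printed by `wx attachments` (a base64url string)
attachment_id: String,
/// Output file path (absolute or relative to the current working directory; keeping an extension such as .jpg is recommended)
/// Output file path (.jpg/.png recommended for images; for the voice POC, keep the original extension for now)
#[arg(short = 'o', long)]
output: String,
/// Overwrite the target if it already exists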
@ -309,6 +311,32 @@ enum Commands {
#[arg(long)]
json: bool,
},
/// Transcribe a single voice attachment_id (SILK -> WAV -> whisper.cpp)
Transcribe {
/// Opaque ID printed by `wx attachments --kind voice` (a base64url string)
attachment_id: String,
/// Path to the whisper.cpp model; WX_WHISPER_MODEL also works
#[arg(long, value_name = "PATH")]
model: Option<PathBuf>,
/// Path to whisper.cpp's whisper-cli; defaults to WX_WHISPER_BIN or whisper-cli on PATH
#[arg(long = "whisper-bin", value_name = "PATH")]
whisper_bin: Option<PathBuf>,
/// Path to the SILK v3 decoder; defaults to WX_SILK_DECODER or silk-decoder/silk_v3_decoder/silk_decoder on PATH
#[arg(long = "silk-decoder", value_name = "PATH")]
silk_decoder: Option<PathBuf>,
/// Path to ffmpeg; defaults to WX_FFMPEG or ffmpeg on PATH
#[arg(long, value_name = "PATH")]
ffmpeg: Option<PathBuf>,
/// Speech language, passed to whisper.cpp as -l; zh is recommended for Mandarin, auto for automatic detection
#[arg(short = 'l', long = "language", default_value = "zh")]
language: String,
/// Keep the intermediate files (raw/silk/pcm/wav) for debugging transcoding quality; directory permissions stay at 0700
#[arg(long)]
keep_temp: bool,
/// Output JSON (default YAML)
#[arg(long)]
json: bool,
},
/// Manage wx-daemon
Daemon {
#[command(subcommand)]
@ -520,6 +548,25 @@ fn dispatch(cli: Cli) -> Result<()> {
overwrite,
json,
} => extract::cmd_extract(attachment_id, output, overwrite, json),
Commands::Transcribe {
attachment_id,
model,
whisper_bin,
silk_decoder,
ffmpeg,
language,
keep_temp,
json,
} => transcribe::cmd_transcribe(
attachment_id,
model,
whisper_bin,
silk_decoder,
ffmpeg,
language,
keep_temp,
json,
),
Commands::Daemon { cmd } => daemon_cmd::cmd_daemon(cmd),
}
}

@ -0,0 +1,467 @@
use anyhow::{anyhow, Context, Result};
use serde_json::{json, Value};
use std::ffi::OsStr;
use std::io::Write;
use std::path::{Path, PathBuf};
use std::process::Command;
use super::output::{print_value, resolve};
use super::transport;
use crate::ipc::Request;
/// `wx transcribe`: export the audio for a voice attachment_id and run local ASR on it.
///
/// Pipeline:
/// 1. daemon `Extract` exports the raw WeChat voice bytes
/// 2. SILK v3: normalize the `#!SILK` header → the decoder outputs s16le PCM
/// 3. ffmpeg converts to the 16k mono WAV that whisper.cpp needs
/// 4. whisper-cli runs local ASR
pub fn cmd_transcribe(
attachment_id: String,
model: Option<PathBuf>,
whisper_bin: Option<PathBuf>,
silk_decoder: Option<PathBuf>,
ffmpeg: Option<PathBuf>,
language: String,
keep_temp: bool,
json_out: bool,
) -> Result<()> {
let model = resolve_required_model(model)?;
let whisper_bin = resolve_tool(
whisper_bin,
"WX_WHISPER_BIN",
&["whisper-cli"],
"whisper.cpp's whisper-cli not found; pass its path with --whisper-bin or set WX_WHISPER_BIN",
)?;
let ffmpeg = resolve_tool(
ffmpeg,
"WX_FFMPEG",
&["ffmpeg"],
"ffmpeg not found; install ffmpeg or pass its path with --ffmpeg",
)?;
let work = WorkDir::new(keep_temp)?;
let raw_path = work.path.join("voice.aud");
let silk_path = work.path.join("voice.silk");
let pcm_path = work.path.join("voice.pcm");
let wav_path = work.path.join("voice.wav");
let extract_report = extract_voice(&attachment_id, &raw_path)?;
let kind = extract_report
.get("kind")
.and_then(Value::as_str)
.unwrap_or("");
if kind != "voice" {
return Err(anyhow!(
"attachment_id is not a voice resource (kind={}); get a voice ID first with `wx attachments CHAT --kind voice`",
kind
));
}
let raw_bytes = std::fs::read(&raw_path)
.with_context(|| format!("failed to read voice file: {}", raw_path.display()))?;
let format = detect_audio_format(
extract_report
.get("format")
.and_then(Value::as_str)
.unwrap_or_default(),
&raw_bytes,
&raw_path,
);
let mut silk_header_offset: Option<usize> = None;
let decode_stage = if format == "silk" {
let silk_decoder = resolve_tool(
silk_decoder,
"WX_SILK_DECODER",
&["silk-decoder", "silk_v3_decoder", "silk_decoder"],
"SILK v3 decoder not found; pass the path to kn007/silk-v3-decoder's silk/decoder with --silk-decoder, or set WX_SILK_DECODER",
)?;
silk_header_offset = Some(write_normalized_silk(&raw_bytes, &silk_path)?);
run_silk_decoder(&silk_decoder, &silk_path, &pcm_path)?;
run_ffmpeg_pcm_to_wav(&ffmpeg, &pcm_path, &wav_path)?;
json!({
"input_format": "silk",
"silk_header_offset": silk_header_offset,
"silk_decoder": silk_decoder.display().to_string(),
})
} else {
run_ffmpeg_audio_to_wav(&ffmpeg, &raw_path, &wav_path)?;
json!({
"input_format": format,
"silk_header_offset": silk_header_offset,
})
};
let whisper = run_whisper(&whisper_bin, &model, &wav_path, &language)?;
let transcript = clean_whisper_stdout(&whisper.stdout);
let mut report = json!({
"transcript": transcript,
"language": language,
"engine": "whisper.cpp",
"model": model.display().to_string(),
"whisper_bin": whisper_bin.display().to_string(),
"ffmpeg": ffmpeg.display().to_string(),
"audio": {
"source": extract_report.get("source").cloned(),
"format": format,
"decoder": extract_report.get("decoder").cloned(),
"output_size": extract_report.get("output_size").cloned(),
},
"decode": decode_stage,
"whisper": {
"stderr": whisper.stderr.trim(),
},
"kept_temp": keep_temp,
});
if keep_temp {
report["temp_dir"] = json!(work.path.display().to_string());
report["files"] = json!({
"raw": raw_path.display().to_string(),
"silk": if silk_path.exists() { Some(silk_path.display().to_string()) } else { None },
"pcm": if pcm_path.exists() { Some(pcm_path.display().to_string()) } else { None },
"wav": wav_path.display().to_string(),
});
}
print_value(&report, &resolve(json_out))
}
fn extract_voice(attachment_id: &str, raw_path: &Path) -> Result<Value> {
let resp = transport::send(Request::Extract {
attachment_id: attachment_id.to_string(),
output: raw_path.display().to_string(),
overwrite: true,
})?;
set_private_file_permissions(raw_path)?;
Ok(resp.data)
}
fn resolve_required_model(model: Option<PathBuf>) -> Result<PathBuf> {
if let Some(path) = model {
return require_existing_file(path, "--model");
}
if let Ok(path) = std::env::var("WX_WHISPER_MODEL") {
return require_existing_file(PathBuf::from(path), "WX_WHISPER_MODEL");
}
Err(anyhow!(
"missing whisper.cpp model path; pass --model /path/to/ggml-large-v3-turbo.bin or set WX_WHISPER_MODEL"
))
}
fn resolve_tool(
explicit: Option<PathBuf>,
env_name: &str,
candidates: &[&str],
missing_msg: &str,
) -> Result<PathBuf> {
if let Some(path) = explicit {
// Label the error as coming from the explicit path rather than the env var.
return require_existing_file(path, "the explicitly passed path");
}
if let Ok(path) = std::env::var(env_name) {
return require_existing_file(PathBuf::from(path), env_name);
}
for candidate in candidates {
if let Some(path) = find_in_path(candidate) {
return Ok(path);
}
}
Err(anyhow!(missing_msg.to_string()))
}
fn require_existing_file(path: PathBuf, label: &str) -> Result<PathBuf> {
if path.is_file() {
Ok(path)
} else {
Err(anyhow!("{} points to a file that does not exist: {}", label, path.display()))
}
}
fn find_in_path(name: &str) -> Option<PathBuf> {
let candidate = Path::new(name);
if candidate.components().count() > 1 && candidate.is_file() {
return Some(candidate.to_path_buf());
}
let paths = std::env::var_os("PATH")?;
for dir in std::env::split_paths(&paths) {
let path = dir.join(name);
if path.is_file() {
return Some(path);
}
}
None
}
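/// Sniffs the audio format from magic bytes first, then falls back to the
/// reported format, and finally to the file extension.
fn detect_audio_format<'a>(reported: &'a str, bytes: &[u8], path: &Path) -> &'a str {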
if find_subslice_prefix(bytes, b"#!SILK", 8).is_some() {
return "silk";
}
if bytes.starts_with(b"#!AMR") {
return "amr";
}
if bytes.len() >= 12 && &bytes[..4] == b"RIFF" && &bytes[8..12] == b"WAVE" {
return "wav";
}
if bytes.starts_with(b"ID3") || bytes.starts_with(&[0xFF, 0xFB]) {
return "mp3";
}
if bytes.len() >= 12 && &bytes[4..8] == b"ftyp" {
return "m4a";
}
if !reported.is_empty() && reported != "bin" && reported != "dat" {
return reported;
}
match path.extension().and_then(OsStr::to_str).unwrap_or_default() {
"amr" => "amr",
"wav" => "wav",
"m4a" => "m4a",
"mp3" => "mp3",
"silk" | "slk" => "silk",
_ => "bin",
}
}
fn write_normalized_silk(bytes: &[u8], silk_path: &Path) -> Result<usize> {
let offset = find_subslice_prefix(bytes, b"#!SILK", 8).ok_or_else(|| {
anyhow!("Voice was reported as SILK, but no #!SILK header was found at byte offsets 0 through 8; cannot invoke the SILK decoder")
})?;
write_private_file(silk_path, &bytes[offset..])
.with_context(|| format!("Failed to write intermediate SILK file: {}", silk_path.display()))?;
Ok(offset)
}
/// Returns the first offset in `0..=max_offset` at which `needle` occurs in `haystack`.
fn find_subslice_prefix(haystack: &[u8], needle: &[u8], max_offset: usize) -> Option<usize> {
if needle.is_empty() || haystack.len() < needle.len() {
return None;
}
let end = haystack.len().saturating_sub(needle.len()).min(max_offset);
(0..=end).find(|&idx| &haystack[idx..idx + needle.len()] == needle)
}
fn run_silk_decoder(decoder: &Path, silk_path: &Path, pcm_path: &Path) -> Result<()> {
let output = Command::new(decoder)
.arg(silk_path)
.arg(pcm_path)
.output()
.with_context(|| format!("Failed to start SILK decoder: {}", decoder.display()))?;
if !output.status.success() || !pcm_path.is_file() {
return Err(anyhow!(
"SILK decoder failed: {}\n{}",
output.status,
String::from_utf8_lossy(&output.stderr).trim()
));
}
set_private_file_permissions(pcm_path)?;
Ok(())
}
/// Treats the input as 24 kHz mono s16le PCM and resamples it to a 16 kHz mono WAV.
fn run_ffmpeg_pcm_to_wav(ffmpeg: &Path, pcm_path: &Path, wav_path: &Path) -> Result<()> {
run_command(
Command::new(ffmpeg)
.arg("-y")
.arg("-f")
.arg("s16le")
.arg("-ar")
.arg("24000")
.arg("-ac")
.arg("1")
.arg("-i")
.arg(pcm_path)
.arg("-ar")
.arg("16000")
.arg("-ac")
.arg("1")
.arg("-c:a")
.arg("pcm_s16le")
.arg(wav_path),
"ffmpeg PCM -> WAV",
)?;
set_private_file_permissions(wav_path)
}
fn run_ffmpeg_audio_to_wav(ffmpeg: &Path, input_path: &Path, wav_path: &Path) -> Result<()> {
run_command(
Command::new(ffmpeg)
.arg("-y")
.arg("-i")
.arg(input_path)
.arg("-ar")
.arg("16000")
.arg("-ac")
.arg("1")
.arg("-c:a")
.arg("pcm_s16le")
.arg(wav_path),
"ffmpeg audio -> WAV",
)?;
set_private_file_permissions(wav_path)
}
fn run_whisper(
whisper_bin: &Path,
model: &Path,
wav_path: &Path,
language: &str,
) -> Result<CommandOutput> {
let output = Command::new(whisper_bin)
.arg("-m")
.arg(model)
.arg("-f")
.arg(wav_path)
.arg("-l")
.arg(language)
.arg("-nt")
.arg("-np")
.output()
.with_context(|| format!("Failed to start whisper-cli: {}", whisper_bin.display()))?;
if !output.status.success() {
return Err(anyhow!(
"whisper-cli failed: {}\n{}",
output.status,
String::from_utf8_lossy(&output.stderr).trim()
));
}
Ok(CommandOutput {
stdout: String::from_utf8_lossy(&output.stdout).to_string(),
stderr: String::from_utf8_lossy(&output.stderr).to_string(),
})
}
fn run_command(cmd: &mut Command, stage: &str) -> Result<()> {
let output = cmd
.output()
.with_context(|| format!("Failed to start {}", stage))?;
if output.status.success() {
Ok(())
} else {
Err(anyhow!(
"{} failed: {}\n{}",
stage,
output.status,
String::from_utf8_lossy(&output.stderr).trim()
))
}
}
fn write_private_file(path: &Path, bytes: &[u8]) -> Result<()> {
let mut options = std::fs::OpenOptions::new();
options.write(true).create_new(true);
#[cfg(unix)]
{
use std::os::unix::fs::OpenOptionsExt;
options.mode(0o600);
}
let mut file = options
.open(path)
.with_context(|| format!("Failed to create private file: {}", path.display()))?;
file.write_all(bytes)
.with_context(|| format!("Failed to write private file: {}", path.display()))?;
set_private_file_permissions(path)
}
fn set_private_file_permissions(path: &Path) -> Result<()> {
#[cfg(unix)]
{
use std::os::unix::fs::PermissionsExt;
std::fs::set_permissions(path, std::fs::Permissions::from_mode(0o600))
.with_context(|| format!("Failed to set file permissions: {}", path.display()))?;
}
Ok(())
}
fn clean_whisper_stdout(stdout: &str) -> String {
stdout
.lines()
.map(str::trim)
.filter(|line| !line.is_empty())
.collect::<Vec<_>>()
.join("\n")
}
struct CommandOutput {
stdout: String,
stderr: String,
}
struct WorkDir {
path: PathBuf,
keep: bool,
}
impl WorkDir {
fn new(keep: bool) -> Result<Self> {
for attempt in 0..128u32 {
let nanos = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap_or_default()
.as_nanos();
let path = std::env::temp_dir().join(format!(
"wx-transcribe-{}-{}-{}",
std::process::id(),
nanos,
attempt
));
match create_private_dir(&path) {
Ok(()) => {
return Ok(Self { path, keep });
}
Err(e) if e.kind() == std::io::ErrorKind::AlreadyExists => continue,
Err(e) => {
return Err(e).with_context(|| format!("Failed to create temp directory: {}", path.display()));
}
}
}
Err(anyhow!("Failed to create temp directory: 128 consecutive name collisions"))
}
}
fn create_private_dir(path: &Path) -> std::io::Result<()> {
#[cfg(unix)]
{
use std::os::unix::fs::DirBuilderExt;
std::fs::DirBuilder::new().mode(0o700).create(path)
}
#[cfg(not(unix))]
{
std::fs::create_dir(path)
}
}
impl Drop for WorkDir {
fn drop(&mut self) {
if !self.keep {
let _ = std::fs::remove_dir_all(&self.path);
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn find_silk_header_after_wechat_prefix() {
assert_eq!(
find_subslice_prefix(b"\x02#!SILK_V3", b"#!SILK", 8),
Some(1)
);
assert_eq!(find_subslice_prefix(b"#!SILK_V3", b"#!SILK", 8), Some(0));
}
#[test]
fn clean_whisper_stdout_keeps_non_empty_lines() {
assert_eq!(clean_whisper_stdout("\n 你好 \n\n世界\n"), "你好\n世界");
}
#[cfg(unix)]
#[test]
fn workdir_is_private() {
use std::os::unix::fs::PermissionsExt;
let work = WorkDir::new(true).unwrap();
let mode = std::fs::metadata(&work.path).unwrap().permissions().mode() & 0o777;
assert_eq!(mode, 0o700);
std::fs::remove_dir_all(&work.path).unwrap();
}
}
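
Reviewer note: the header handling in `write_normalized_silk` can be sketched on its own. The helper names `find_prefix` and `normalize_silk` below are hypothetical and not part of this change. The sketch assumes only what the diff shows: the `#!SILK` marker may be preceded by a few bytes, and offsets 0 through 8 are searched.

```rust
/// Returns the first offset in `0..=max_offset` where `needle` starts.
/// (Hypothetical stand-in for `find_subslice_prefix` in the diff.)
fn find_prefix(haystack: &[u8], needle: &[u8], max_offset: usize) -> Option<usize> {
    if needle.is_empty() || haystack.len() < needle.len() {
        return None;
    }
    // The last valid start is bounded both by the data and by `max_offset`.
    let end = (haystack.len() - needle.len()).min(max_offset);
    (0..=end).find(|&i| haystack[i..].starts_with(needle))
}

/// Drops any bytes in front of the `#!SILK` marker, or returns `None`
/// when the marker does not start within the searched prefix.
/// (Hypothetical helper; the diff writes the slice to a file instead.)
fn normalize_silk(bytes: &[u8]) -> Option<&[u8]> {
    find_prefix(bytes, b"#!SILK", 8).map(|offset| &bytes[offset..])
}
```

Because the bound is inclusive, a marker that starts at offset 8 is still accepted, while one at offset 9 is rejected.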

View File

@ -925,7 +925,8 @@ fn query_messages(
let mut result = Vec::new();
for (local_id, local_type, ts, real_sender_id, content_bytes, ct) in rows {
let content = decompress_message(&content_bytes, ct);
let sender_username = sender_username(real_sender_id, &content, is_group, chat_username, &id2u);
let sender_username =
sender_username(real_sender_id, &content, is_group, chat_username, &id2u);
let sender = sender_label(
real_sender_id,
&content,
@ -946,7 +947,13 @@ fn query_messages(
"type": fmt_type(local_type),
"local_id": local_id,
});
add_sender_identity(&mut msg, is_group, &sender_username, names_map, group_nicknames);
add_sender_identity(
&mut msg,
is_group,
&sender_username,
names_map,
group_nicknames,
);
if let Some(u) = url {
msg["url"] = serde_json::Value::String(u);
}
@ -1032,7 +1039,8 @@ fn search_in_table(
let mut result = Vec::new();
for (local_id, local_type, ts, real_sender_id, content_bytes, ct) in rows {
let content = decompress_message(&content_bytes, ct);
let sender_username = sender_username(real_sender_id, &content, is_group, chat_username, &id2u);
let sender_username =
sender_username(real_sender_id, &content, is_group, chat_username, &id2u);
let sender = sender_label(
real_sender_id,
&content,
@ -1057,7 +1065,13 @@ fn search_in_table(
"content": text,
"type": fmt_type(local_type),
});
add_sender_identity(&mut msg, is_group, &sender_username, names_map, group_nicknames);
add_sender_identity(
&mut msg,
is_group,
&sender_username,
names_map,
group_nicknames,
);
if let Some(u) = url {
msg["url"] = serde_json::Value::String(u);
}
@ -1558,11 +1572,13 @@ fn add_sender_identity(
}
row["sender_username"] = Value::String(username.to_string());
row["sender_contact_display"] = Value::String(
names.get(username).cloned().unwrap_or_else(|| username.to_string())
);
row["sender_group_nickname"] = Value::String(
group_nicknames.get(username).cloned().unwrap_or_default()
names
.get(username)
.cloned()
.unwrap_or_else(|| username.to_string()),
);
row["sender_group_nickname"] =
Value::String(group_nicknames.get(username).cloned().unwrap_or_default());
}
fn sender_label(
@ -2193,14 +2209,7 @@ mod appmsg_tests {
.expect("create message table");
conn.execute(
"INSERT INTO Msg_test VALUES (?1, ?2, ?3, ?4, ?5, ?6)",
rusqlite::params![
1_i64,
1_i64,
1775146911_i64,
42_i64,
"hello",
0_i64
],
rusqlite::params![1_i64, 1_i64, 1775146911_i64, 42_i64, "hello", 0_i64],
)
.expect("insert text message");
}
@ -2227,7 +2236,10 @@ mod appmsg_tests {
assert_eq!(rows.len(), 1);
assert_eq!(rows[0]["sender"].as_str(), Some("同名"));
assert_eq!(rows[0]["sender_username"].as_str(), Some("wxid_alice"));
assert_eq!(rows[0]["sender_contact_display"].as_str(), Some("Alice Contact"));
assert_eq!(
rows[0]["sender_contact_display"].as_str(),
Some("Alice Contact")
);
assert_eq!(rows[0]["sender_group_nickname"].as_str(), Some("同名"));
}
@ -2284,7 +2296,10 @@ mod appmsg_tests {
assert_eq!(rows.len(), 1);
assert_eq!(rows[0]["sender"].as_str(), Some("同名"));
assert_eq!(rows[0]["sender_username"].as_str(), Some("wxid_alice"));
assert_eq!(rows[0]["sender_contact_display"].as_str(), Some("Alice Contact"));
assert_eq!(
rows[0]["sender_contact_display"].as_str(),
Some("Alice Contact")
);
assert_eq!(rows[0]["sender_group_nickname"].as_str(), Some("同名"));
}
@ -2314,7 +2329,10 @@ mod appmsg_tests {
add_sender_identity(&mut alice_row, true, "wxid_alice", &names, &group_nicknames);
assert_eq!(alice_row["sender"].as_str(), Some("同名"));
assert_eq!(alice_row["sender_username"].as_str(), Some("wxid_alice"));
assert_eq!(alice_row["sender_contact_display"].as_str(), Some("Alice Contact"));
assert_eq!(
alice_row["sender_contact_display"].as_str(),
Some("Alice Contact")
);
assert_eq!(alice_row["sender_group_nickname"].as_str(), Some("同名"));
let mut bob_row = json!({
@ -2336,7 +2354,13 @@ mod appmsg_tests {
// Non-group chats must not get identity fields (matching history/search/new-messages).
let mut private_row = json!({"attachment_id": "ghi", "sender": ""});
add_sender_identity(&mut private_row, false, "wxid_alice", &names, &group_nicknames);
add_sender_identity(
&mut private_row,
false,
"wxid_alice",
&names,
&group_nicknames,
);
assert!(private_row.get("sender_username").is_none());
assert!(private_row.get("sender_contact_display").is_none());
assert!(private_row.get("sender_group_nickname").is_none());
@ -2992,7 +3016,8 @@ pub async fn q_new_messages(
let mut result = Vec::new();
for (local_id, local_type, ts, real_sender_id, content_bytes, ct) in rows {
let content = decompress_message(&content_bytes, ct);
let sender_username = sender_username(real_sender_id, &content, is_group, &uname2, &id2u);
let sender_username =
sender_username(real_sender_id, &content, is_group, &uname2, &id2u);
let sender = sender_label(
real_sender_id,
&content,
@ -3015,7 +3040,13 @@ pub async fn q_new_messages(
"content": text,
"type": fmt_type(local_type),
});
add_sender_identity(&mut msg, is_group, &sender_username, &names_map, &group_nicknames2);
add_sender_identity(
&mut msg,
is_group,
&sender_username,
&names_map,
&group_nicknames2,
);
if let Some(u) = url {
msg["url"] = serde_json::Value::String(u);
}
@ -4393,18 +4424,21 @@ pub async fn q_attachments(
&names_map,
&group_nicknames2,
),
sender_username(
real_sender_id,
&content,
true,
&uname,
&id2u,
),
sender_username(real_sender_id, &content, true, &uname, &id2u),
)
} else {
(String::new(), String::new())
};
Ok((local_id, lo32, ts, real_sender_id, sender, sender_uname, ts, db_idx2))
Ok((
local_id,
lo32,
ts,
real_sender_id,
sender,
sender_uname,
ts,
db_idx2,
))
})?
.filter_map(|r| r.ok())
.collect();
@ -4449,7 +4483,13 @@ pub async fn q_attachments(
if is_group && !sender.is_empty() {
row["sender"] = Value::String(sender);
}
add_sender_identity(&mut row, is_group, &sender_uname, &names.map, &group_nicknames);
add_sender_identity(
&mut row,
is_group,
&sender_uname,
&names.map,
&group_nicknames,
);
results.push(row);
}
let unknown_shards = current_unknown_shards(db, names);
@ -4476,7 +4516,9 @@ pub async fn q_attachments(
}))
}
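/// Decode attachment_id → look up message_resource.db → find the local .dat → decrypt → write to disk.
/// Decode attachment_id → write out the attachment resource.
/// image: message_resource.db → local .dat → decode.
/// voice POC: prefer media_0.db::VoiceInfo → write the SILK/audio bytes unchanged; on a miss, fall back to resource files.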
pub async fn q_extract(
db: &DbCache,
_names: &Names,
@ -4487,7 +4529,7 @@ pub async fn q_extract(
use crate::attachment::{
attachment_id::AttachmentId,
decoder::{self, V2KeyMaterial},
image_key, resolver,
image_key, resolver, AttachmentKind,
};
let id = AttachmentId::decode(attachment_id)
@ -4508,6 +4550,44 @@ pub async fn q_extract(
}
}
if id.kind == AttachmentKind::Voice {
if let Some(media_path) = db.get("message/media_0.db").await? {
let id_for_task = id.clone();
let output_path2 = output_path.clone();
let report = tokio::task::spawn_blocking(move || -> Result<Option<Value>> {
let Some(voice) = resolver::lookup_voice_media_blocking(
&media_path,
&id_for_task.chat,
id_for_task.local_id,
id_for_task.create_time,
)?
else {
return Ok(None);
};
std::fs::write(&output_path2, &voice.data)
.with_context(|| format!("Failed to write file: {}", output_path2.display()))?;
Ok(Some(json!({
"kind": id_for_task.kind.as_str(),
"source": "message/media_0.db",
"local_id": id_for_task.local_id,
"create_time": id_for_task.create_time,
"chunks": voice.chunks,
"svr_id": voice.svr_id,
"output": output_path2.display().to_string(),
"output_size": voice.data.len(),
"format": raw_media_format(&output_path2, &voice.data),
"decoder": "media_0_voice_data",
"poc": true,
})))
})
.await??;
if let Some(report) = report {
return Ok(report);
}
}
}
// 1) Fetch message_resource.db
let resource_path = db
.get("message/message_resource.db")
@ -4535,6 +4615,22 @@ pub async fn q_extract(
let dat_bytes = std::fs::read(&resolved.dat_path)
.with_context(|| format!("Failed to read .dat: {}", resolved.dat_path.display()))?;
if id_for_task.kind != AttachmentKind::Image {
std::fs::write(&output_path2, &dat_bytes)
.with_context(|| format!("Failed to write file: {}", output_path2.display()))?;
return Ok(json!({
"kind": id_for_task.kind.as_str(),
"md5": resolved.md5,
"dat_path": resolved.dat_path.display().to_string(),
"dat_size": resolved.size,
"output": output_path2.display().to_string(),
"output_size": dat_bytes.len(),
"format": raw_media_format(&resolved.dat_path, &dat_bytes),
"decoder": "raw_copy",
"poc": true,
}));
}
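// V2 image key: platform-specific. `ImageKeyMaterial` supplies both aes_key and xor_key.
// xor_key must not be hard-coded to 0x88; on real macOS accounts it was observed to be derived from `uin & 0xff` (0xa2, for example),
// so the bridge here has to pass the provider's xor_key through to V2KeyMaterial.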
@ -4599,7 +4695,7 @@ pub async fn q_extract(
}
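/// Parse the `kinds` argument into a list of `(AttachmentKind, lo32_local_type)` pairs.
/// Only image is supported for now; the command keeps the name `attachments` so that adding other attachment types later does not break the CLI.
/// Defaults to image; voice/audio is a POC that can enumerate and raw-copy local voice files but does no transcoding or transcription.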
fn parse_attachment_kinds(
kinds: Option<&[String]>,
) -> Result<Vec<(crate::attachment::AttachmentKind, i64)>> {
@ -4613,12 +4709,13 @@ fn parse_attachment_kinds(
for k in raw {
let (kind, t): (AttachmentKind, i64) = match k.to_ascii_lowercase().as_str() {
"image" | "img" => (AttachmentKind::Image, 3),
"voice" | "audio" | "video" | "file" => {
"voice" | "audio" => (AttachmentKind::Voice, 34),
"video" | "file" => {
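anyhow::bail!(
"Only image extraction is supported for now; resource paths and decoders for video/file/voice are not wired up yet"
"Only image and the voice POC are supported for now; resource paths and decoders for video/file are not wired up yet"
)
}
other => anyhow::bail!("Unknown attachment kind: {} (currently supported: image)", other),
other => anyhow::bail!("Unknown attachment kind: {} (currently supported: image / voice POC)", other),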
};
if seen.insert(kind.as_str()) {
out.push((kind, t));
@ -4627,10 +4724,75 @@ fn parse_attachment_kinds(
Ok(out)
}
fn raw_media_format(path: &std::path::Path, bytes: &[u8]) -> &'static str {
// The first 8 windows cover offsets 0..=7, so offset 0 needs no separate check.
if bytes
.windows(b"#!SILK".len())
.take(8)
.any(|chunk| chunk == b"#!SILK")
{
return "silk";
}
if bytes.starts_with(b"#!AMR") {
return "amr";
}
if bytes.len() >= 12 && &bytes[..4] == b"RIFF" && &bytes[8..12] == b"WAVE" {
return "wav";
}
if bytes.starts_with(b"ID3") || bytes.starts_with(&[0xFF, 0xFB]) {
return "mp3";
}
if bytes.len() >= 12 && &bytes[4..8] == b"ftyp" {
return "m4a";
}
match path
.extension()
.and_then(|s| s.to_str())
.unwrap_or_default()
{
"aud" => "aud",
"amr" => "amr",
"silk" => "silk",
"wav" => "wav",
"m4a" => "m4a",
"mp3" => "mp3",
"dat" => "dat",
_ => "bin",
}
}
#[cfg(test)]
mod biz_tests {
use super::*;
#[test]
fn parse_attachment_kinds_accepts_voice_aliases() {
let kinds = vec!["voice".to_string(), "audio".to_string()];
let parsed = parse_attachment_kinds(Some(&kinds)).unwrap();
assert_eq!(parsed.len(), 1);
assert_eq!(parsed[0].0.as_str(), "voice");
assert_eq!(parsed[0].1, 34);
}
#[test]
fn raw_media_format_detects_common_audio_headers() {
assert_eq!(
raw_media_format(std::path::Path::new("x.bin"), b"#!SILK_V3"),
"silk"
);
assert_eq!(
raw_media_format(std::path::Path::new("x.aud"), b"\x02#!SILK_V3"),
"silk"
);
assert_eq!(
raw_media_format(std::path::Path::new("x.bin"), b"#!AMR\n"),
"amr"
);
let mut wav = b"RIFF0000WAVE".to_vec();
wav.extend_from_slice(&[0; 8]);
assert_eq!(raw_media_format(std::path::Path::new("x.bin"), &wav), "wav");
}
#[test]
fn extract_cdata_normal() {
let xml = "<title><![CDATA[TencentResearch]]></title>";
@ -4837,12 +4999,18 @@ mod group_nickname_tests {
assert_eq!(top.len(), 2);
assert_eq!(top[0]["sender"].as_str(), Some("同名"));
assert_eq!(top[0]["sender_username"].as_str(), Some("wxid_alice"));
assert_eq!(top[0]["sender_contact_display"].as_str(), Some("Alice Contact"));
assert_eq!(
top[0]["sender_contact_display"].as_str(),
Some("Alice Contact")
);
assert_eq!(top[0]["sender_group_nickname"].as_str(), Some("同名"));
assert_eq!(top[0]["count"].as_i64(), Some(7));
assert_eq!(top[1]["sender"].as_str(), Some("同名"));
assert_eq!(top[1]["sender_username"].as_str(), Some("wxid_bob"));
assert_eq!(top[1]["sender_contact_display"].as_str(), Some("Bob Contact"));
assert_eq!(
top[1]["sender_contact_display"].as_str(),
Some("Bob Contact")
);
assert_eq!(top[1]["sender_group_nickname"].as_str(), Some("同名"));
assert_eq!(top[1]["count"].as_i64(), Some(3));
}
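
Reviewer note: the alias handling that `parse_attachment_kinds_accepts_voice_aliases` exercises can be sketched standalone. `Kind` and `parse_kinds` below are hypothetical names, not the crate's types; the local type codes 3 and 34 are taken from the diff, and everything else is an assumption made for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for `AttachmentKind`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Kind {
    Image,
    Voice,
}

/// Maps user-supplied kind names to `(Kind, local_type)` pairs.
/// Aliases are case-insensitive, and a kind is emitted only once,
/// in the order it first appears.
fn parse_kinds(raw: &[&str]) -> Result<Vec<(Kind, i64)>, String> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for name in raw {
        let pair = match name.to_ascii_lowercase().as_str() {
            "image" | "img" => (Kind::Image, 3),
            "voice" | "audio" => (Kind::Voice, 34),
            other => return Err(format!("unsupported attachment kind: {other}")),
        };
        // `insert` returns false for a kind that was already collected.
        if seen.insert(pair.0) {
            out.push(pair);
        }
    }
    Ok(out)
}
```

Deduplicating on the kind rather than on the raw string is what makes `voice` and `audio` collapse into a single entry.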

View File

@ -155,11 +155,11 @@ pub enum Request {
},
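/// Reload config and keys (the daemon does not re-read them automatically after init --force)
ReloadConfig,
/// List the image attachments in a chat
/// List the attachments in a chat
/// Each output row carries an `attachment_id` (an opaque base64url handle); pass it to `Extract` to retrieve the payload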
Attachments {
chat: String,
/// Kind filter: only image is supported for now
/// Kind filter: defaults to image; voice/audio is supported as a POC
#[serde(default, skip_serializing_if = "Option::is_none")]
kinds: Option<Vec<String>>,
#[serde(default = "default_limit_50")]
@ -175,7 +175,7 @@ pub enum Request {
#[serde(default, skip_serializing_if = "is_false")]
debug_source: bool,
},
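/// Extract (decrypt) a single attachment's payload to the given path
/// Extract a single attachment's payload to the given path; images are decoded, voice (POC) is copied unchanged
Extract {
/// Opaque ID returned by `Attachments`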
attachment_id: String,