Commit Graph

10 Commits

Author SHA1 Message Date
nianzhibai 7e5e67697e feat: add GuangYaPan drive support
Implement a new GuangYaPan cloud drive integration across the backend, admin UI, playback proxy, and Spider91 migration flow.

Backend changes:
- Add a GuangYaPan drive driver with token refresh, QR/device login support, directory listing, stream link resolution, directory creation, rename/delete operations, OSS multipart upload, and upload task polling.
- Register GuangYaPan as a supported storage kind in configuration, catalog normalization, admin APIs, public drive labels, and 302 playback redirects.
- Allow Spider91 crawler uploads to target GuangYaPan through a dedicated migration adapter.
- Add scan, thumbnail, preview, and fingerprint cooldown handling for GuangYaPan based on explicit HTTP status codes, Retry-After values, and structured provider codes instead of natural-language message matching.
- Tighten existing provider cooldown detectors so OneDrive, Google Drive, 115, PikPak, 123pan, Wopan, and media workers avoid treating arbitrary response text as a rate-limit signal.
- Keep large videos eligible for preview generation unless the user disables preview generation.

Admin and tooling changes:
- Add GuangYaPan as a selectable drive type with QR login UI and token/root-path credential fields.
- Add crawler upload target support for GuangYaPan in the admin UI.
- Add drive branding, labels, metadata display, and docs/config examples for GuangYaPan.
- Include a standalone GuangYaPan QR login helper script for manual credential acquisition.

Tests:
- Add GuangYaPan driver, QR login, proxy, admin API, crawler upload target, fingerprint, cooldown, and form coverage.
- Update rate-limit tests to assert that message-only throttling text no longer starts cooldowns.
- Cover explicit HTTP status parsing through shared drive helper tests.
2026-06-14 15:44:50 +08:00
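nianzhibai 96e423b952 feat: improve crawler deduplication, upload progress, and source-file deletion
Add a candidate budget, duplicate-source records, and a default crawler tag to script crawlers so that duplicate videos do not use up the target number of new videos.

Add crawler upload-migration progress reporting and an upload card on the admin page so each crawler can show how uploads were handled in the current round.

Add an option to delete the source file on the cloud drive when deleting a video, complete the corresponding interactions on the playback and admin pages, and implement the Remove interface for several drive drivers.

Add related tests and update the crawler protocol documentation.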
2026-06-11 22:42:11 +08:00
nianzhibai a8ccc19e9e Fix script crawler migration to PikPak
Handle already-migrated crawler assets by binding local script crawler rows to equivalent files that already exist on the configured target drive. This keeps thumbnail, preview, and fingerprint readiness stable while removing local crawler videos once an equivalent target object is available.

Harden PikPak uploads by retrying failed upload sessions, requesting fresh resumable upload metadata between attempts, and using CNAME-style OSS requests for PikPak upload endpoints so the SDK does not generate invalid bucket-prefixed hosts such as vip-lixian-07.upload-a10b.mypikpak.com.

Add focused tests for duplicate target binding, retrying failed PikPak OSS uploads with a fresh session, and preserving the expected PikPak upload endpoint URL shape.
2026-06-11 14:03:37 +08:00
nianzhibai 6e87f88d53 feat: support spider91 uploads to OneDrive 2026-05-30 18:04:15 +08:00
nianzhibai c146ad50ed Fix PikPak captcha recovery 2026-05-29 14:49:47 +08:00
nianzhibai 7540371838 feat: restore tag classification and drive controls
Restore the previous fixed-tag classification flow, including startup backfill for existing videos and the 91porn spider tag.

Also commit the current drive scanning, preview scheduling, and admin drive-control updates present in the workspace.
2026-05-28 12:18:17 +08:00
nianzhibai 95bf67667a fix(spider91): cool down PikPak captcha migration failures 2026-05-27 10:59:12 +08:00
nianzhibai bd3f27d5b3 fix(pikpak): auto-recover from error_code=4002 captcha_token expired
When PikPak's cached captcha_token expires, Init() and runtime API
calls used to fail permanently with error_code=4002, leaving the drive
un-attached and blocking spider91 -> PikPak migration.

- refreshCaptchaToken: on 4002, clear cached token and retry once with
  empty captcha_token so the server issues a fresh one. Covers the
  driver-attach path during server startup.
- requestOnce: extend captcha-refresh-and-retry path from case 9 to also
  cover case 4002, clearing cache before refresh to avoid sending the
  same expired token again. Covers per-API-call recovery at runtime.
- Add captcha_recovery_test.go covering: recovery on 4002, no-loop
  guard when token already empty, request-level recovery, and
  single-retry boundary.

OpenList's upstream PikPak driver does not currently handle 4002 either,
so this is a strict improvement.
2026-05-25 16:33:41 +08:00
nianzhibai ada69fec87 feat(pikpak): 302 redirect playback + automatic migration of spider91 videos
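- Switch PikPak video playback from reverse proxying to a 302 redirect
  straight to the PikPak CDN (matching OpenList). The browser receives
  the signed link directly, so the backend no longer spends bandwidth
  relaying bytes. proxy.shouldRedirect becomes a switch, and pikpak is
  handled the same way as p115.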
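
A switch-based proxy.shouldRedirect might look like the sketch below. The signature and the kind strings are assumptions; only the function name and the pikpak/p115 pairing come from the commit message.

```go
package main

import "fmt"

// shouldRedirect reports whether playback for a drive kind is answered with
// a 302 to the provider's signed URL instead of being proxied by the backend.
// The string-typed kind parameter is an assumption for illustration.
func shouldRedirect(kind string) bool {
	switch kind {
	case "pikpak", "p115":
		return true
	default:
		return false
	}
}

func main() {
	for _, kind := range []string{"pikpak", "p115", "local"} {
		fmt.Println(kind, shouldRedirect(kind))
	}
}
```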

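- Implement PikPak Driver.Upload following the OpenList protocol: first
  compute the GCID (a custom SHA1-of-SHA1-blocks hash, the same one
  OpenList uses) and request an upload session; on an instant-upload
  hit, return the file id directly, otherwise upload through the
  S3-compatible path with PutObject from the vendored
  aliyun-oss-go-sdk. A single PutObject is limited to 5GiB-1.
  Also add PikPak.Rename (PATCH /drive/v1/files/<id>).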

- Add the internal/spider91migrate package: it periodically uploads
  videos crawled by spider91 to the designated PikPak drive, rewrites
  the catalog row transactionally (drive_id / file_id / file_name /
  content_hash), and deletes the local mp4 and thumb. The video ID stays
  spider91-<driveID>-<viewkey>, and video_tags / views / likes / the
  91porn tag are all preserved. catalog gains MigrateVideoToDrive +
  ListVideosByDriveID + ListSpider91Viewkeys.

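- Upload policy: keep the newest KeepLatestN=15 files locally and upload
  only the excess (older) files to PikPak. After the first crawl of 15,
  all stay local and nothing is uploaded; after the second crawl brings
  the total to 30, the oldest 15 are migrated. In steady state there are
  ≤15 newest videos locally and PikPak accumulates the full history.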

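- File name scheme B: uploads to PikPak are named
  <sanitized title>-<last 8 of viewkey>.<ext>, and catalog file_name is
  updated to match; at startup, backfillFileNames idempotently renames
  already-migrated videos from the old name (viewkey.ext) to the new
  format.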
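
The naming scheme can be sketched as follows. The commit message does not spell out the sanitization rules, so the character set replaced here is an assumption, as are the function names sanitizeTitle and targetName.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// sanitizeTitle replaces characters that are commonly rejected in file names
// and drops control characters. The exact rule set is an assumption.
func sanitizeTitle(title string) string {
	replaced := strings.Map(func(r rune) rune {
		switch r {
		case '/', '\\', ':', '*', '?', '"', '<', '>', '|':
			return '_'
		}
		if r < 0x20 {
			return -1 // drop control characters
		}
		return r
	}, title)
	return strings.TrimSpace(replaced)
}

// targetName builds "<sanitized title>-<last 8 of viewkey><ext>", taking the
// extension from the local file name. It assumes an ASCII viewkey.
func targetName(title, viewkey, localName string) string {
	suffix := viewkey
	if len(suffix) > 8 {
		suffix = suffix[len(suffix)-8:]
	}
	return sanitizeTitle(title) + "-" + suffix + filepath.Ext(localName)
}

func main() {
	fmt.Println(targetName("a/b: clip", "0123456789abcdef", "0123456789abcdef.mp4"))
}
```

Because the new name depends only on the title, viewkey, and extension, running the rename again over an already-renamed file yields the same result, which is what makes a backfill of this kind idempotent.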

- After the crawler finishes, it pings the migrator immediately instead
  of waiting for the 60s cycle.

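- Fix a bug where migration broke deduplication: the crawler looked up
  seen viewkeys by drive_id, but once a video is migrated to PikPak its
  drive_id is no longer spider91's. Use ListSpider91Viewkeys instead,
  which queries by the id prefix 'spider91-<driveID>-' and still
  recognizes videos after migration.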

- Add a global setting spider91_upload_drive_id (settings table) + admin
  GET/PUT API; when it is not set explicitly, the only PikPak drive is
  selected automatically.

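- While here, clean up the deprecated RemoteDir / preview
  write-back-to-drive code (teaser + cover have long been local-only,
  but the Config field, yaml example, extra NewWorker parameter, and
  extra catalog UpdatePreview parameter were left behind).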

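Tests:
- Add ~40 unit tests covering the GCID algorithm, the PikPak
  Upload/Rename schema, migrator scenarios (inside/outside the retention
  window, upload failure, no target configured, batch rate limiting,
  orphan cleanup, idempotent file name backfill), file name
  sanitization, and the PikPak 302 redirect.
- go test -count=1 passes for all packages.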

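Integration check, verified on the production instance: 17
already-migrated spider91 videos were all instant-uploaded to PikPak
within 6 seconds, the catalog was rewritten correctly, local files were
cleared, and PikPak video playback goes through a 302 straight to
dl-z01a-0043.mypikpak.net; a newly triggered crawl of 15 videos kept
them local without uploading; backfill renamed the old viewkey.mp4 names
to <title>-<last 8 of viewkey>.mp4.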
2026-05-23 02:01:36 +08:00
nianzhibai 3506328441 Add PikPak drive support
Add PikPak backend driver, fixed tag matching, cached transcode playback, fast cover handling, and LF normalization.
2026-05-10 23:55:04 +08:00