64 Commits

Author SHA1 Message Date
nianzhibai 0faeaf408f fix(crawlers): stabilize manual upload workflow
Add a manual crawler upload action in the admin UI and backend so users can retry uploads when automatic migration leaves local crawler videos behind.

Keep the button always clickable and return clear refusal messages when there are no local videos, no upload target, unfinished fingerprints/previews, failed generated assets, or active crawler work.

Simplify crawler cards by removing pipeline/status capsules, dropping the ready pill, and aligning the preview toggle with the existing action button style.

Avoid the small-/tmp upload bug by reusing seekable local files for PikPak GCID calculation/uploads and by routing fallback upload temp files for PikPak, 115, 123Pan, and WoPan into the application data upload-tmp directory.

Add regression coverage for crawler manual upload handling, frontend form expectations, configured upload temp dirs, and PikPak seekable-reader uploads.

Verification: npm run lint; npm test; npm run build; go test ./... -count=1.
2026-06-20 00:14:37 +08:00
nianzhibai f9351324c6 fix: show active preview generation status 2026-06-14 18:22:04 +08:00
nianzhibai bb83277d62 feat: add crawler preview generation toggle
Expose per-crawler teaser settings on crawler cards and persist them through the admin API.

When preview generation is disabled, crawler imports still create thumbnails and fingerprints while marking previews disabled and allowing migration without waiting for teaser files.

Preserve the latest teaser setting after crawler runs so stale crawl state cannot overwrite a user toggle.
2026-06-14 17:52:29 +08:00
nianzhibai 7e5e67697e feat: add GuangYaPan drive support
Implement a new GuangYaPan cloud drive integration across the backend, admin UI, playback proxy, and Spider91 migration flow.

Backend changes:
- Add a GuangYaPan drive driver with token refresh, QR/device login support, directory listing, stream link resolution, directory creation, rename/delete operations, OSS multipart upload, and upload task polling.
- Register GuangYaPan as a supported storage kind in configuration, catalog normalization, admin APIs, public drive labels, and 302 playback redirects.
- Allow Spider91 crawler uploads to target GuangYaPan through a dedicated migration adapter.
- Add scan, thumbnail, preview, and fingerprint cooldown handling for GuangYaPan based on explicit HTTP status codes, Retry-After values, and structured provider codes instead of natural-language message matching.
- Tighten existing provider cooldown detectors so OneDrive, Google Drive, 115, PikPak, 123pan, Wopan, and media workers avoid treating arbitrary response text as a rate-limit signal.
- Keep large videos eligible for preview generation unless the user disables preview generation.

Admin and tooling changes:
- Add GuangYaPan as a selectable drive type with QR login UI and token/root-path credential fields.
- Add crawler upload target support for GuangYaPan in the admin UI.
- Add drive branding, labels, metadata display, and docs/config examples for GuangYaPan.
- Include a standalone GuangYaPan QR login helper script for manual credential acquisition.

Tests:
- Add GuangYaPan driver, QR login, proxy, admin API, crawler upload target, fingerprint, cooldown, and form coverage.
- Update rate-limit tests to assert that message-only throttling text no longer starts cooldowns.
- Cover explicit HTTP status parsing through shared drive helper tests.
2026-06-14 15:44:50 +08:00
nianzhibai 9cc8e02bec feat: add sky theme and refresh themed UI
Add the sky theme across the frontend and backend theme APIs, including starfield assets and icon-only branding.

Refresh themed grid backgrounds, admin/login/sidebar styling, and theme-specific video/listing polish.
2026-06-14 11:53:07 +08:00
nianzhibai 738406162a feat: add video blacklist management
Add backend blacklist tombstone APIs and hidden-video migration support.

Update the admin video management UI with blacklist tabs, restore actions, alignment fixes, responsive layout polish, and regression coverage.
2026-06-13 14:34:00 +08:00
nianzhibai 0f111b846d feat: add opt-in toggle for local STRM targets outside the storage root
Local .strm files that pointed to a path outside the configured storage
root previously failed cover/preview/fingerprint generation and playback
with "strm target escapes root", breaking the common layout where the
strm library and the real media files (e.g. an rclone mount) live in
separate directories (issue #22 follow-up).

- localstorage driver gains STRMAllowOutsideRoot; when on, strm targets
  outside the root are allowed (still resolves symlinks and still rejects
  nested strm, so no new escape vector). Default off preserves the
  existing security boundary
- Toggle persisted as the strm_allow_outside_root credential; editing a
  localstorage drive now merges credentials per-key so leaving the path
  blank keeps the old value while flipping the toggle
- Saving a localstorage drive with the toggle on auto-re-enqueues
  previously-failed thumbnails/previews/fingerprints, so enabling it
  recovers without manually clicking the three retry buttons
- Drives API exposes strmAllowOutsideRoot for form echo-back; admin
  drive form adds a "允许指向目录外" ("allow targets outside the
  directory") select with a security warning
- Tests cover allow-outside-root on/off and that nested strm stays
  rejected even when the toggle is on

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 10:34:54 +08:00
nianzhibai 4dd9015bd7 feat: add per-storage manual transcode for browser-incompatible videos
Add a transcode control to each storage in the admin drives page,
modeled after the cover/preview generation controls:

- Manual start/stop button per storage; transcoding is off by default
  and never runs automatically (not triggered by scans or the nightly
  pipeline)
- New transcode worker probes candidates (non mp4/webm extensions)
  with ffprobe: already-compatible files are marked skipped; AVI with
  H.264 is remuxed losslessly; incompatible codecs (MPEG-4 Part 2,
  WMV, RMVB, HEVC...) are transcoded to H.264/AAC MP4 with +faststart
- Transcoded output is uploaded back to the same storage under a
  "91转码" directory which is auto-added to the drive's scan skip list
  so the scanner never re-imports the artifacts
- Playback source automatically prefers the transcoded file once
  ready, keeping the 302 direct-link mode for cloud drives
- videos table gains transcode_status/error/file_id/size columns via
  startup migration; counts and live task status surface in the
  admin drives API and generation panel UI
- Stop semantics: per-drive stop button, drive-level "stop all tasks"
  and global stop all include the transcode task; interrupted videos
  keep their candidate status and resume on next start

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 09:41:08 +08:00
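
The skip / remux / transcode decision from commit 4dd9015bd7 might look roughly like the sketch below. The function name, the action names, and the set of codecs treated as browser-compatible are assumptions. The commit only states that compatible files are skipped, AVI with H.264 is remuxed, and incompatible codecs are transcoded. The codec strings follow ffprobe's `codec_name` values.

```go
package main

import (
	"fmt"
	"strings"
)

type action string

const (
	actionSkip      action = "skipped"
	actionRemux     action = "remux"
	actionTranscode action = "transcode"
)

// browserCodecs is an assumed set of ffprobe codec_name values that browsers
// can usually decode. The repository's actual list is not shown in the log.
var browserCodecs = map[string]bool{"h264": true, "vp8": true, "vp9": true, "av1": true}

// decide is a hypothetical sketch of the per-candidate decision.
func decide(ext, videoCodec string) action {
	ext = strings.ToLower(strings.TrimPrefix(ext, "."))
	codec := strings.ToLower(videoCodec)
	switch {
	case !browserCodecs[codec]:
		return actionTranscode // e.g. mpeg4, wmv3, rv40, hevc
	case ext == "avi":
		return actionRemux // compatible codec in a container browsers cannot play
	default:
		return actionSkip
	}
}

func main() {
	fmt.Println(decide(".avi", "h264"))
	fmt.Println(decide(".wmv", "wmv3"))
	fmt.Println(decide(".mov", "h264"))
}
```

With ffmpeg, a lossless remux typically uses `-c copy`, and `-movflags +faststart` places the MP4 index at the front of the file so playback can begin before the download completes.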
nianzhibai ae324d3752 feat: add Unicom cloud drive source operations
Add Unicom cloud drive support for source-file deletion and crawler uploads.

- Implement source-file removal for Unicom cloud drive so deleting videos can also remove the original cloud-drive file when requested.

- Resolve Unicom cloud drive source identifiers across file FID, object ID, directory ID, rename, and delete flows.

- Add upload support for Spider91 crawler imports targeting Unicom cloud drive storage.

- Add Unicom cloud drive QR login backend APIs, frontend form support, and tests.

- Extend drive capability metadata, scanner behavior, proxy handling, preview handling, and migration coverage for cloud-drive source operations.

- Rename Chinese display labels from 联通沃盘 to 联通网盘 and from 123 云盘 to 123网盘 while keeping the root README aligned with origin/main.

- Add referrer-policy coverage for 302 video playback and update related frontend playback tests.
2026-06-12 15:49:15 +08:00
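nianzhibai 811d87cc27 feat: improve script crawler import and management experience
- Rework the add/edit crawler dialog layout, tuning the desktop width and the proportions of the script-source and test areas, and hide internal details such as the local path and import URL.

- Adjust the crawler management page copy and the mobile stats-card layout, show status cards in two columns consistently, and stop clicks on the backdrop from closing the dialog by accident.

- Automatically convert GitHub blob links to raw links, improving compatibility when importing scripts by URL.

- Add an ffprobe integrity check for script crawler downloads; on failure, delete the bad file and do not record it in seen, so it can be fetched again later.

- Remux .m3u8/HLS media into a local MP4 via ffmpeg, then continue through the fingerprint, thumbnail, preview, and upload-migration flow.

- Fix dry-run stderr logs occasionally being lost, and add tests for GitHub URLs, bad-video cleanup, HLS downloads, dialog interaction, and responsive layout.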
2026-06-12 10:53:18 +08:00
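
The GitHub blob-to-raw conversion in commit 811d87cc27 can be illustrated with a short sketch. The function name `githubBlobToRaw` is hypothetical. The URL shapes (`github.com/<owner>/<repo>/blob/<ref>/<path>` and `raw.githubusercontent.com/<owner>/<repo>/<ref>/<path>`) are GitHub's public ones. Refs that themselves contain a slash cannot be told apart from path segments this way, so the sketch assumes a single-segment ref.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// githubBlobToRaw is a hypothetical helper. It rewrites a github.com blob
// link to its raw.githubusercontent.com form and returns any other input
// unchanged.
func githubBlobToRaw(link string) string {
	u, err := url.Parse(link)
	if err != nil || u.Host != "github.com" {
		return link
	}
	// Expected path: /<owner>/<repo>/blob/<ref>/<file path...>
	parts := strings.SplitN(strings.TrimPrefix(u.EscapedPath(), "/"), "/", 5)
	if len(parts) < 5 || parts[2] != "blob" {
		return link
	}
	return "https://raw.githubusercontent.com/" + parts[0] + "/" + parts[1] + "/" + parts[3] + "/" + parts[4]
}

func main() {
	// The owner, repo, and file names are placeholders.
	fmt.Println(githubBlobToRaw("https://github.com/owner/repo/blob/main/crawlers/demo.py"))
	fmt.Println(githubBlobToRaw("https://example.com/demo.py"))
}
```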
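nianzhibai 96e423b952 feat: improve crawler dedup, upload progress, and source-file deletion
Add a candidate budget, duplicate-source records, and a default crawler tag to script crawlers, so duplicate videos cannot use up the target count of new videos.

Add upload-migration progress reporting for crawlers and an upload card on the management page, so each crawler can show how its current round of uploads is going.

Add an optional capability to delete the cloud-drive source file when deleting a video, complete the playback-page and admin-page interactions, and implement the Remove interface for several cloud-drive drivers.

Add related tests and update the crawler protocol docs.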
2026-06-11 22:42:11 +08:00
nianzhibai 7ddf33d726 Improve crawler asset stats and admin navigation
- Count crawler assets by crawler source ID prefix after cloud migration

- Add crawler API totals for cumulative, local, and migrated videos

- Let crawler thumbnail and preview readiness inherit equivalent canonical videos

- Show cumulative crawl data in crawler management cards

- Remove low-value expanded crawler metadata fields from the card body

- Move return-to-site into the main admin navigation with grouped sections

- Rename the content admin group to management and adjust footer icon sizing

- Update backend and frontend tests for crawler/admin behavior
2026-06-10 23:45:43 +08:00
nianzhibai c1355385e1 feat(crawler): simplify script crawler workflow
Redesign crawler management around imported Python scripts instead of built-in crawler storage. Crawler scripts now declare CRAWLER_NAME, imports validate metadata, crawler IDs are generated internally, and deleted crawler scripts are detached without deleting already imported videos.

Add backend support for file and URL script imports, dry-run testing, metadata parsing, safer job paths, original filename preservation, and crawler listing that ignores detached script records. Remove the legacy built-in Spider91 script path flow and hidden Python/config JSON fields from the crawler API.

Rework the admin crawler page into an independent crawler console with script import, dry-run testing, status metrics, spider iconography, and simplified controls. Update docs, examples, installer checks, Docker/release packaging, and tests for the new protocol.
2026-06-10 14:27:16 +08:00
nianzhibai ec5a01b6aa feat(crawler): redesign crawler scripts and admin workflow
- add generic scriptcrawler backend runner using the crawler.v1 JSONL protocol

- support crawler script upload and HTTP(S) URL import from the admin crawler page

- simplify the user-facing crawler contract to title, media_url, optional thumbnail_url and optional source_id

- convert Spider91 into a normal script crawler and reject new Spider91 storage-drive configs

- keep legacy Spider91 storage rows visible only for cleanup/deletion

- add crawler protocol docs, example script, admin UI, tests and migration coverage
2026-06-09 23:51:12 +08:00
nianzhibai 940e5dd76d feat: support spider91 uploads to google drive 2026-06-08 23:50:19 +08:00
nianzhibai 5fc8e9ebb7 Improve drive scan task coordination 2026-06-08 17:37:58 +08:00
nianzhibai c87208117e Fix scanner cancellation and shorts UI 2026-06-06 08:37:00 +00:00
nianzhibai 8dff0f07b9 feat: add admin video deletion and mobile UI polish
Adds tombstone-backed video deletion with generated asset cleanup, plus responsive video management actions and centered confirmation dialogs.
2026-06-04 16:10:26 +08:00
nianzhibai 5080203b7c feat: add drive task stop controls
Add per-drive and global admin controls to stop scan, preview, thumbnail, and fingerprint work.

Keep stopped pending generation resumable, wire cancellation through workers and nightly runs, and refine mobile drive-management UI/history behavior.
2026-06-03 23:42:54 +08:00
nianzhibai df6f0ebbbf feat: support spider91 upload to 123pan 2026-06-03 21:49:27 +08:00
nianzhibai 8f0d52aec4 fix: hash long local media asset filenames 2026-06-03 20:35:53 +08:00
nianzhibai 869c0d5f78 refactor: rename teaser UI copy to preview video 2026-06-03 19:45:15 +08:00
nianzhibai cada336e96 Add 123pan support and optimize storage deletion logic 2026-06-02 14:30:16 +08:00
nianzhibai e36a17f99d fix: improve 91Spider tagging and deduped tag filters 2026-06-01 18:51:56 +08:00
nianzhibai cf9de5b40a Add failed fingerprint retry controls 2026-06-01 13:42:32 +08:00
nianzhibai 4ba964b7e2 fix thumbnail status and frontend serving 2026-05-31 17:40:16 +08:00
nianzhibai cd3b3c6976 feat: use root id as drive scan root 2026-05-31 17:13:51 +08:00
nianzhibai a407312dfa fix: prevent duplicate scan-all jobs 2026-05-31 15:09:05 +08:00
nianzhibai 87d197496b Limit thumbnail transient retries 2026-05-31 12:02:49 +08:00
nianzhibai 0e3a5bd5cd Add Google Drive support 2026-05-31 11:14:03 +08:00
nianzhibai 66adf444ba fix: detect Docker image version for update checks 2026-05-31 09:55:15 +08:00
nianzhibai e57058db79 feat: prepare v0.0.4 storage release 2026-05-30 20:02:02 +08:00
nianzhibai 6ec61833f2 feat: probe video duration during thumbnail generation 2026-05-30 18:30:22 +08:00
nianzhibai 6e87f88d53 feat: support spider91 uploads to OneDrive 2026-05-30 18:04:15 +08:00
nianzhibai e78fa9d978 feat: improve media generation pipeline status 2026-05-30 17:37:31 +08:00
nianzhibai 039ec2a988 Improve fingerprint dedupe maintenance 2026-05-29 23:58:36 +08:00
nianzhibai da0683344e Add sampled fingerprint deduplication 2026-05-29 23:19:52 +08:00
nianzhibai 34b6fa8ea9 Release v0.0.3 improvements 2026-05-29 18:34:38 +08:00
nianzhibai f5c20f9594 Fix spider91 upload target and thumbnails 2026-05-29 06:28:18 +00:00
nianzhibai 137cfbcf82 feat: add prebuilt installer workflow 2026-05-28 19:13:41 +08:00
nianzhibai bb8818a55a feat: improve admin setup and drive management 2026-05-28 18:41:40 +08:00
nianzhibai d2d4db8062 fix: harden spider91 source matching 2026-05-28 16:10:20 +08:00
nianzhibai 7540371838 feat: restore tag classification and drive controls
Restore the previous fixed-tag classification flow, including startup backfill for existing videos and the 91porn spider tag.

Also commit the current drive scanning, preview scheduling, and admin drive-control updates present in the workspace.
2026-05-28 12:18:17 +08:00
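nianzhibai 39ef2defcc feat(spider91): streaming crawl + enqueue teasers once after completion + mark failed thumbnails as failed
Three related changes, all about the spider91 crawler flow.

1. Streaming crawl protocol (replaces the old "Python collects all 15, then hands them to Go" model)

  Python side (spider_91porn.py):
    - Add a --stream-output flag. When enabled, each time a direct video
      link is resolved, the entry is written to stdout as one line of JSON and flushed.
    - In stream mode log() writes to stderr, so it does not pollute the stdout JSONL protocol.
    - --output FILE still works and serves as an offline archive.

  Go side (crawler.go):
    - New startSpiderTargetNew() starts the cmd asynchronously and returns the stdout pipe.
    - RunOnce reads stdout line by line with bufio.Scanner and calls processOne
      right after parsing each line (download video + thumbnail + UpsertVideo).
      The old readSpiderOutput / whole-JSON-file parsing path is removed.
    - Python stderr is forwarded to the backend log with the prefix [spider91:py].

  Benefit: Python paging for the next viewkey overlaps in time with Go downloading
  the current video, making the most of each signed link's e= time window. Observed
  today: Python found and emitted all 15 viewkeys in 77 seconds. If, as in the old
  model, the next one could only start after Go finished downloading serially, the
  signatures of the later ones would easily expire (one root cause of the earlier 8/15 all-EOF failures).

2. Teasers are enqueued once after the crawler finishes (replaces enqueueing immediately on each insert)

  - main.go attachSpider91Crawler no longer injects the OnNewVideo callback.
  - main.go runSpider91Crawl calls enqueueDriveGeneration(driveID) once after
    Crawler.RunOnce completes, so all new videos enter the teaser worker together.
  - This aligns naturally with the nightly Phase 2 "wait for the teaser queue to go idle" semantics.
  - The download phase does not compete with ffmpeg for CPU/IO.

3. Explicitly set thumbnail_status='failed' when the site thumbnail download fails

  By design, the thumb worker does not process spider91 videos on a spider91 drive
  (the thumbnail should be the site's original image, saved directly). When the site
  thumbnail download fails, url='' + status='pending' makes
  waitForThumbnailsBeforePreview in enqueueDriveGeneration stall the teaser in its
  wait loop, because CountVideosNeedingThumbnail > 0.

  Fix: in crawler.go processOne, the thumb-failure branch explicitly sets status='failed'
  (the CountVideosNeedingThumbnail condition status != 'failed' excludes it).

  Observed today: the 187 MB video c2c04fc8602c5396d469 was stuck in the
  '[preview] waiting for 1 thumbnails before teaser generation'
  loop for 35 minutes.

Tests:
  - crawler_test.go is refactored around a buildFakeSpiderScript helper that
    generates a fake python (actually sh) supporting --stream-output and echoing JSON line by line.
  - TestCrawlerRunOnceFullFlow / TestCrawlerThumbDownloadFailureMarksStatusFailed
    verify the streaming protocol and the thumb-fail gate through the new helper.

go test ./... all green; a manually triggered spider91 crawl in production confirmed the streaming behavior is correct.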
2026-05-27 18:48:30 +08:00
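nianzhibai 1eeebbf305 refactor(scheduling): unify three schedulers into a NightlyJob pipeline
Replace the three parallel periodic loops (scanLoop / crawlerLoop / Migrator.Run) with a single nightly.Runner,
which runs one pipeline serially every day at cron_hour (default 01:00):

  Phase 1  Scan all non-spider91 / non-localupload drives
           → detect new videos + detect deleted videos (clean up catalog rows + local thumbnails/teasers)
           → enqueue thumbnails + teasers (per-drive teaser_enabled decides whether teasers are enqueued)
           → wait for all thumb / teaser worker queues to go idle
  Phase 2  Only if a spider91 drive exists: run the 91 crawler and enqueue teasers for new videos
           → wait for the teaser queue to go idle
  Phase 3  spider91 → cloud drive migration (one-shot sweep to PikPak/115)

Key properties:
  - 6h soft timeout (nightly.max_duration); when it expires the current phase finishes and later phases do not start
  - Same-day dedup: last_run_date is persisted in the settings table, so a crash and restart does not rerun the job
  - sync.Mutex.TryLock makes manual triggers and scheduled cron triggers mutually exclusive
  - ctx.Err is checked at every phase boundary; in-flight ffmpeg / uploads are not force-killed
  - The per-drive '重扫' (rescan) and spider91 '立即抓取' (crawl now) buttons are kept
  - The top bar gains a '立即跑全流程' (run full pipeline now) button (POST /admin/api/jobs/nightly/run)

Additional improvements:
  - preview.Worker / ThumbWorker gain WaitIdle(ctx) error, which nightly uses as a synchronization barrier
  - scanner gains a 30s heartbeat progress log, so long scans are no longer a black box
    Format: [scanner] drive=X progress: scanned=N added=K errors=E dirs=M elapsed=Ts at=<dir>
  - cleanupMissingDriveVideos is extended from PikPak-only to all cloud drive kinds
    (keeps the stats.Errors==0 gate to avoid accidental deletion when the API is flaky)
  - Migrator drops the periodic ticker / Trigger channel and becomes a standalone RunOnce
    (the captcha cooldown state machine is kept and persists for 5 minutes across RunOnce calls)

Deprecated (fields kept for compatibility with old yaml):
  - scanner.interval_seconds   (replaced by nightly.cron_hour scheduling)
  - the crawl_hour credential field on spider91 drives (last_crawl_at is only displayed in the admin UI)

Tests: go test ./... all green (including ~320 lines of unit tests in the nightly package); npm run build passes.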
2026-05-27 13:17:44 +08:00
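nianzhibai ebd6943a10 feat(spider91,drives): support uploads to 115 + per-drive teaser toggle
* Extend spider91 → cloud drive migration targets from PikPak only to PikPak ∪ 115:
  - The 115 driver gains UploadAndReportSha1 (buffer to a tmp file + sha1 +
    SDK RapidUploadOrByMultipart + look up the fileID by sha1 in the parent directory) and Rename
  - migrator introduces an uploadTarget interface + pikpakAdapter / p115Adapter,
    routed by drive Kind(); catalog rewrite / local cleanup / failure cooldown / file_name
    backfill behave the same for both target drives. The captcha cooldown still applies only to PikPak 4002/9
  - App.Spider91UploadDriveID validation is relaxed to pikpak ∪ p115; automatic selection
    refuses when candidates of both kinds exist (an explicit choice is required)
  - admin DrivesPage adds an "上传目标" (upload target) dropdown to the spider91 form; the copy adapts
    to the drive kinds actually mounted (with only PikPak mounted, 115 is not mentioned, and vice versa)

* Move the global teaser switch down to a per-drive toggle button:
  - The drives table gains teaser_enabled INTEGER NOT NULL DEFAULT 1
  - Remove App.PreviewEnabled / SetPreviewEnabled / loadPreviewEnabled
    and the settings.previewEnabled field; the frontend drops the PreviewToggle component
  - Add catalog.SetDriveTeaserEnabled + the POST /admin/api/drives/{id}/teaser-enabled
    endpoint; AdminServer gains an OnTeaserEnabledChanged hook that, on an off-to-on switch,
    immediately calls enqueueDriveGeneration to pick up pending teasers
  - The drive list "操作" (actions) column gains a Power / PowerOff toggle button with optimistic update + rollback on failure
  - One-time migration resetDriveTeaserEnabledToDefaultOnce: force existing drives back
    to enabled, guarded by a marker setting to prevent repeats (for compatibility with a
    short-lived intermediate version that synced global preview.enabled=0 into per-drive=0)
  - The thumbnail worker still always enqueues; the toggle controls only teasers, so it does not overstep its scope

Tests: go test ./... all green; npx tsc --noEmit / npm run build pass.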
2026-05-27 12:07:41 +08:00
nianzhibai d920943b58 fix(security): replace reflect-Origin CORS with allowlist (C-1)
Previously corsMiddleware reflected any Origin back into
Access-Control-Allow-Origin while emitting Allow-Credentials: true.
Combined with no CSRF token, this let any third-party site read or
write authenticated APIs cross-origin (full session takeover via
chained requests to /admin/api/drives etc).

Changes:
- config.Server.AllowedOrigins []string (default empty = same-origin only)
- corsMiddleware now only emits CORS headers for whitelisted Origins;
  unknown origins receive no Allow-Origin and 403 on preflight
- '*' entries are silently dropped to prevent regression
- Always set Vary: Origin to keep caches honest
- Drop the originOr() helper, no longer needed
- Add cors_test.go covering allow / reject / preflight / wildcard cases

Same-origin deployments (nginx fronting / and /api on the same domain)
keep working with no config change. Cross-origin deployments must add
their frontend Origin to server.allowed_origins.
2026-05-25 13:28:06 +08:00
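nianzhibai ada69fec87 feat(pikpak): 302 redirect playback + automatic migration of spider91 videos
- Switch PikPak playback from reverse proxying to a 302 redirect straight to the
  PikPak CDN (same as OpenList): the browser gets the signed link directly and the
  backend no longer spends bandwidth relaying bytes.
  proxy.shouldRedirect becomes a switch, treating pikpak and p115 the same.

- Implement PikPak Driver.Upload, following the OpenList protocol: first compute the GCID
  (a custom SHA1-of-SHA1-blocks hash, the same as OpenList's) and request an upload session;
  on an instant-upload hit return the file id directly, otherwise upload through the
  S3-compatible path with PutObject from the vendored aliyun-oss-go-sdk. A single PutObject is capped at 5GiB-1.
  Also add PikPak.Rename (PATCH /drive/v1/files/<id>).

- Add the internal/spider91migrate package: periodically upload spider91-crawled videos
  to the designated PikPak drive, transactionally rewrite the catalog row (drive_id / file_id /
  file_name / content_hash), and delete the local mp4+thumb. The video ID stays
  spider91-<driveID>-<viewkey>, and video_tags / views / likes /
  the 91porn tag are all preserved. catalog gains MigrateVideoToDrive +
  ListVideosByDriveID + ListSpider91Viewkeys.

- Upload policy: keep the latest KeepLatestN=15 files locally and upload only the excess
  (older) ones to PikPak. After the first crawl all 15 stay local and nothing is uploaded;
  after the second crawl, with 30 files, the oldest 15 are migrated. In steady state there
  are ≤15 of the latest videos locally and PikPak accumulates all history.

- Filename scheme B: uploads to PikPak are named <sanitized title>-<last 8 of viewkey>.<ext>,
  and catalog file_name is updated to match; at startup backfillFileNames idempotently
  renames already-migrated videos from the old name (viewkey.ext) to the new format.

- The crawler pings the migrator as soon as it finishes instead of waiting for the 60s cycle.

- Fix a bug where migration broke dedup: the crawler looked up seen viewkeys by drive_id,
  but after a video migrates to PikPak its drive_id is no longer spider91. Use
  ListSpider91Viewkeys instead, which queries by the id prefix 'spider91-<driveID>-' and
  still recognizes migrated videos.

- Add the global setting spider91_upload_drive_id (settings table) + admin GET/PUT API;
  when not set explicitly, the only PikPak drive is selected automatically.

- While here, clean up obsolete code for RemoteDir / writing previews back to the cloud drive
  (teasers and thumbnails have long been local-only, but Config fields, yaml examples, extra
  NewWorker parameters, and extra catalog UpdatePreview parameters were left behind).

Tests:
- Add ~40 unit tests covering the GCID algorithm, the PikPak Upload/Rename schema,
  migrator scenarios (inside/outside the retention window, upload failure, no target
  configured, batch rate limiting, orphan cleanup, idempotent filename backfill),
  filename sanitizing, and the PikPak 302 redirect.
- go test -count=1 passes for all packages.

Integration: verified on the production instance: all 17 migrated spider91 videos were
instant-uploaded to PikPak within 6 seconds, the catalog was rewritten correctly, local
files were cleared, and PikPak playback goes through a 302 straight to
dl-z01a-0043.mypikpak.net; a newly triggered crawl of 15 videos kept them local without
uploading; backfill renamed the old viewkey.mp4 names to <title>-<last 8 of viewkey>.mp4.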
2026-05-23 02:01:36 +08:00
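nianzhibai d424fc0553 feat(spider91): integrate the 91porn crawler as a new video source
Wrap 91VideoSpider/spider_91porn.py as a spider91 drive type. Every day in the
early morning it pages through 91porn's hottest-this-month list starting from page 1,
skips known viewkeys, and stops once it has collected N new videos; it downloads the
videos and thumbnails locally and plugs into the existing video list / detail
/ tag / teaser pipeline.

Main changes:
- Python script: add --target-new / --seen-viewkeys-file CLI arguments
- Backend: new drives/spider91 package (driver + crawler + tests)
- Backend: catalog.ListVideoFileIDsByDrive helper query
- Backend: crawlerLoop ticker (independent of the 02:00-07:00 drive scan loop)
- Backend: HTTP client honors the HTTPS_PROXY environment variable + optional per-drive proxy
- Backend: video file extension follows the actual extension of the direct-link URL (mp4/webm/mkv/flv, etc.)
- Backend: all spider91 videos are automatically tagged 91porn (source=system)
- API: new /p/spider91/{videoID} route serving local files with http.ServeFile
- Admin: the dropdown gains a "91 爬虫" (91 crawler) type; a few special cases are adapted
  (status shows "已就绪" (ready), action shows "立即抓取" (crawl now), the scan-root column
  shows "last crawled N hours ago", the form hides irrelevant fields such as root_id)
- Docs: README + plan section 16 document this in full

Tests: 20+ new cases covering driver path safety, crawler end to end (fake python +
httptest server), extension detection, and schedule-window checks.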
2026-05-22 21:13:26 +08:00
nianzhibai ce0512d19b refactor(playback): drop transcode/VLC layers, all videos use direct 302
Final decision after evaluating three approaches:
- VLC external player with vlc:// scheme: poor UX, protocol unreliable
- ffprobe + smart remux/transcode: 2-core box gets pinned by ffmpeg
- All-302: simplest and least resource intensive

Removed:
- /p/transcode/{id} routes and full ffmpeg pipeline
- /api/play-token + /p/play VLC bridge
- Server.FFmpegPath/FFprobePath/transcodeJobs fields
- needsBrowserTranscode helper
- VLC button + modal in VideoActions, related CSS
- VideoPlayer transcode polling

videoSource now returns:
- /p/upload/<id> for local uploads
- /p/stream/<driveID>/<fileID> for everything else (302 to CDN)

Trade-off: mkv/avi can no longer be played natively in <video>;
documented in plan section 14.7/14.8 as known limitation.
2026-05-22 00:52:01 +08:00
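
The two-way routing that commit ce0512d19b describes for videoSource can be sketched as follows. The parameter list is an assumption, since the log shows only the resulting paths, and the escaping of path segments is added here as a precaution rather than taken from the source.

```go
package main

import (
	"fmt"
	"net/url"
)

// videoSource is a hypothetical reconstruction: local uploads are served from
// /p/upload/<id>, everything else goes through /p/stream/<driveID>/<fileID>,
// which answers with a 302 to the CDN.
func videoSource(isLocalUpload bool, videoID, driveID, fileID string) string {
	if isLocalUpload {
		return "/p/upload/" + url.PathEscape(videoID)
	}
	return "/p/stream/" + url.PathEscape(driveID) + "/" + url.PathEscape(fileID)
}

func main() {
	// The IDs below are placeholders.
	fmt.Println(videoSource(true, "vid-1", "", ""))
	fmt.Println(videoSource(false, "vid-2", "drive-1", "file-9"))
}
```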