68 Commits

Author SHA1 Message Date
nianzhibai be759dec73 Merge PR #68 with auth and admin fixes
Manual no-ff merge of origin/pr/68 into main.

Includes follow-up fixes for vendored bcrypt/blowfish sources, admin/user management, permanent IP bans, proxy IP handling, mobile UI polish, persisted likes, and hot-video ordering.
2026-06-24 12:55:06 +08:00
风如歌 abc52610a0 Merge branch 'main' into main 2026-06-23 23:36:30 +08:00
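wind 08253bca8e feat(auth): add user authentication system and admin features
- Implement database-backed user login verification
- Add administrator role support
- Add a reset-user-password option to the install script
- Update the display logic of the admin navigation menu
- Add user logout
- Split the storage statistics implementation per platform
- Fix cross-platform compatibility of dry-run process management
- Remove the default admin credentials from the config file
- Create a default admin account on first setup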
2026-06-23 23:21:45 +08:00
nianzhibai 6884473dbf feat: remove video categories and refine mobile admin UI
Remove the legacy video category feature from the frontend, public API, admin API, scanner, crawler import flow, and catalog schema. Add startup migration handling that drops the old videos.category column, including a rebuild fallback for legacy databases with category-dependent indexes, and extend regression tests for existing deployments.

Keep tag-based workflows as the only classification surface by updating listing, promo, tag management, and video edit UI. Replace the old categories data module with promos and prevent category data from leaking through list or admin responses.

Improve mobile and desktop video management UX: align drive/search/refresh controls, keep bulk actions from shifting video cards, add blacklist drive filtering, center mobile modals, tighten drive picker cards, and fix edit-video tag options so short labels stay on one line.

Refine public video cards and detail navigation: truncate long author metadata, avoid empty author separators, and return users to their originating page after deleting a video instead of always navigating to /list.

Verified with go test ./..., npm run lint, npm test, npm run build, and git diff --check.
2026-06-23 16:06:12 +08:00
nianzhibai e32da9016b feat: unify crawler pipeline and duplicate maintenance
Remove the legacy spider91-specific storage, route, migration, and admin upload-target handling so crawler imports are treated as generic scriptcrawler drives.

Replace the spider91 migrator with crawlerupload and update the nightly pipeline to run generic crawler crawling, crawler uploads, and full-library duplicate video maintenance.

Add exact duplicate removal by size_bytes plus sampled_sha256 and near-duplicate removal by title similarity, duration, and thumbnail SSIM, keeping the larger source and deleting duplicate catalog rows with tombstones.
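
A minimal Go sketch of the exact-duplicate rule described above, where two rows count as the same file when both size_bytes and sampled_sha256 match. The Video struct and the ExactDuplicates helper are hypothetical names for illustration, not the project's actual types, and the sketch simply keeps the first row seen for each key.

```go
package main

import "fmt"

// Video is a hypothetical catalog row holding only the fields the sketch needs.
type Video struct {
	ID            string
	SizeBytes     int64
	SampledSHA256 string
}

// dedupeKey combines size and sampled hash so that both must match.
type dedupeKey struct {
	size int64
	hash string
}

// ExactDuplicates returns the IDs of rows whose (size_bytes, sampled_sha256)
// pair was already seen earlier in the slice. Rows without a fingerprint are
// skipped, since an empty hash cannot prove equality.
func ExactDuplicates(videos []Video) []string {
	seen := make(map[dedupeKey]string)
	var dups []string
	for _, v := range videos {
		if v.SampledSHA256 == "" {
			continue
		}
		k := dedupeKey{size: v.SizeBytes, hash: v.SampledSHA256}
		if _, ok := seen[k]; ok {
			dups = append(dups, v.ID)
			continue
		}
		seen[k] = v.ID
	}
	return dups
}

func main() {
	videos := []Video{
		{ID: "a", SizeBytes: 100, SampledSHA256: "h1"},
		{ID: "b", SizeBytes: 100, SampledSHA256: "h1"},
		{ID: "c", SizeBytes: 200, SampledSHA256: "h1"},
	}
	fmt.Println(ExactDuplicates(videos))
}
```

Requiring the size to match as well as the sampled hash guards against two different files that happen to share the sampled regions.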

Mark automatically deduped tombstones with reason=duplicate and show a compact 重复文件 pill in the admin blacklist table while leaving manual blacklist entries unmarked.

Add media similarity helpers, scriptcrawler near-duplicate checks, file_name-backed public search, crawler upload UI updates, and tests for the new behavior.

Remove the old /p/spider91 playback route and frontend special casing after the dedicated spider91 drive implementation was removed.

Verified with: go test ./... -count=1; npm test; npm run build.
2026-06-22 22:49:18 +08:00
nianzhibai 0faeaf408f fix(crawlers): stabilize manual upload workflow
Add a manual crawler upload action in the admin UI and backend so users can retry uploads when automatic migration leaves local crawler videos behind.

Keep the button always clickable and return clear refusal messages when there are no local videos, no upload target, unfinished fingerprints/previews, failed generated assets, or active crawler work.

Simplify crawler cards by removing pipeline/status capsules, dropping the ready pill, and aligning the preview toggle with the existing action button style.

Avoid the small-/tmp upload bug by reusing seekable local files for PikPak GCID calculation/uploads and by routing fallback upload temp files for PikPak, 115, 123Pan, and WoPan into the application data upload-tmp directory.

Add regression coverage for crawler manual upload handling, frontend form expectations, configured upload temp dirs, and PikPak seekable-reader uploads.

Verification: npm run lint; npm test; npm run build; go test ./... -count=1.
2026-06-20 00:14:37 +08:00
nianzhibai f9351324c6 fix: show active preview generation status 2026-06-14 18:22:04 +08:00
nianzhibai bb83277d62 feat: add crawler preview generation toggle
Expose per-crawler teaser settings on crawler cards and persist them through the admin API.

When preview generation is disabled, crawler imports still create thumbnails and fingerprints while marking previews disabled and allowing migration without waiting for teaser files.

Preserve the latest teaser setting after crawler runs so stale crawl state cannot overwrite a user toggle.
2026-06-14 17:52:29 +08:00
nianzhibai 7e5e67697e feat: add GuangYaPan drive support
Implement a new GuangYaPan cloud drive integration across the backend, admin UI, playback proxy, and Spider91 migration flow.

Backend changes:
- Add a GuangYaPan drive driver with token refresh, QR/device login support, directory listing, stream link resolution, directory creation, rename/delete operations, OSS multipart upload, and upload task polling.
- Register GuangYaPan as a supported storage kind in configuration, catalog normalization, admin APIs, public drive labels, and 302 playback redirects.
- Allow Spider91 crawler uploads to target GuangYaPan through a dedicated migration adapter.
- Add scan, thumbnail, preview, and fingerprint cooldown handling for GuangYaPan based on explicit HTTP status codes, Retry-After values, and structured provider codes instead of natural-language message matching.
- Tighten existing provider cooldown detectors so OneDrive, Google Drive, 115, PikPak, 123pan, Wopan, and media workers avoid treating arbitrary response text as a rate-limit signal.
- Keep large videos eligible for preview generation unless the user disables preview generation.
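
A rough Go sketch of the status-driven cooldown idea above: the decision uses only the HTTP status code and the Retry-After header, never the response text. The cooldownFor helper, its signature, and the choice of 429 and 503 as throttling statuses are assumptions for illustration; the project's real detectors also consult structured provider codes, which are not modeled here.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// cooldownFor is a hypothetical helper. It reports whether a response should
// start a cooldown and for how long, looking only at the status code and the
// raw Retry-After header value. The response body is deliberately not an input.
func cooldownFor(status int, retryAfter string, fallback time.Duration) (time.Duration, bool) {
	if status != http.StatusTooManyRequests && status != http.StatusServiceUnavailable {
		return 0, false
	}
	if retryAfter != "" {
		// Retry-After may be a number of seconds...
		if secs, err := strconv.Atoi(retryAfter); err == nil && secs >= 0 {
			return time.Duration(secs) * time.Second, true
		}
		// ...or an HTTP date.
		if at, err := http.ParseTime(retryAfter); err == nil {
			if d := time.Until(at); d > 0 {
				return d, true
			}
		}
	}
	return fallback, true
}

func main() {
	d, ok := cooldownFor(http.StatusTooManyRequests, "120", 30*time.Second)
	fmt.Println(d, ok)
	d, ok = cooldownFor(http.StatusOK, "120", 30*time.Second)
	fmt.Println(d, ok)
}
```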

Admin and tooling changes:
- Add GuangYaPan as a selectable drive type with QR login UI and token/root-path credential fields.
- Add crawler upload target support for GuangYaPan in the admin UI.
- Add drive branding, labels, metadata display, and docs/config examples for GuangYaPan.
- Include a standalone GuangYaPan QR login helper script for manual credential acquisition.

Tests:
- Add GuangYaPan driver, QR login, proxy, admin API, crawler upload target, fingerprint, cooldown, and form coverage.
- Update rate-limit tests to assert that message-only throttling text no longer starts cooldowns.
- Cover explicit HTTP status parsing through shared drive helper tests.
2026-06-14 15:44:50 +08:00
nianzhibai 9cc8e02bec feat: add sky theme and refresh themed UI
Add the sky theme across the frontend and backend theme APIs, including starfield assets and icon-only branding.

Refresh themed grid backgrounds, admin/login/sidebar styling, and theme-specific video/listing polish.
2026-06-14 11:53:07 +08:00
nianzhibai 738406162a feat: add video blacklist management
Add backend blacklist tombstone APIs and hidden-video migration support.

Update the admin video management UI with blacklist tabs, restore actions, alignment fixes, responsive layout polish, and regression coverage.
2026-06-13 14:34:00 +08:00
nianzhibai 0f111b846d feat: add opt-in toggle for local STRM targets outside the storage root
Local .strm files that pointed to a path outside the configured storage
root previously failed cover/preview/fingerprint generation and playback
with "strm target escapes root", breaking the common layout where the
strm library and the real media files (e.g. an rclone mount) live in
separate directories (issue #22 follow-up).

- localstorage driver gains STRMAllowOutsideRoot; when on, strm targets
  outside the root are allowed (still resolves symlinks and still rejects
  nested strm, so no new escape vector). Default off preserves the
  existing security boundary
- Toggle persisted as the strm_allow_outside_root credential; editing a
  localstorage drive now merges credentials per-key so leaving the path
  blank keeps the old value while flipping the toggle
- Saving a localstorage drive with the toggle on auto-re-enqueues
  previously-failed thumbnails/previews/fingerprints, so enabling it
  recovers without manually clicking the three retry buttons
- Drives API exposes strmAllowOutsideRoot for form echo-back; admin
  drive form adds a "允许指向目录外" select with a security warning
- Tests cover allow-outside-root on/off and that nested strm stays
  rejected even when the toggle is on
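
A simplified Go sketch of the boundary check described in the list above. The names escapesRoot and allowTarget are hypothetical, and the check here is purely lexical: the real driver is described as also resolving symlinks, which this sketch does not do.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// escapesRoot reports whether target lies outside root, comparing cleaned
// paths only. Symlinks are NOT resolved in this sketch.
func escapesRoot(root, target string) bool {
	rel, err := filepath.Rel(filepath.Clean(root), filepath.Clean(target))
	if err != nil {
		return true
	}
	return rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// allowTarget mirrors the described policy: nested .strm targets are always
// rejected, and targets outside the root are accepted only when the opt-in
// toggle is on.
func allowTarget(root, target string, allowOutsideRoot bool) bool {
	if strings.EqualFold(filepath.Ext(target), ".strm") {
		return false
	}
	return allowOutsideRoot || !escapesRoot(root, target)
}

func main() {
	root := "/srv/strm"
	fmt.Println(allowTarget(root, "/srv/strm/a/b.mp4", false))
	fmt.Println(allowTarget(root, "/mnt/rclone/a.mp4", false))
	fmt.Println(allowTarget(root, "/mnt/rclone/a.mp4", true))
	fmt.Println(allowTarget(root, "/mnt/rclone/b.strm", true))
}
```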

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 10:34:54 +08:00
nianzhibai 4dd9015bd7 feat: add per-storage manual transcode for browser-incompatible videos
Add a transcode control to each storage in the admin drives page,
modeled after the cover/preview generation controls:

- Manual start/stop button per storage; transcoding is off by default
  and never runs automatically (not triggered by scans or the nightly
  pipeline)
- New transcode worker probes candidates (non mp4/webm extensions)
  with ffprobe: already-compatible files are marked skipped; AVI with
  H.264 is remuxed losslessly; incompatible codecs (MPEG-4 Part 2,
  WMV, RMVB, HEVC...) are transcoded to H.264/AAC MP4 with +faststart
- Transcoded output is uploaded back to the same storage under a
  "91转码" directory which is auto-added to the drive's scan skip list
  so the scanner never re-imports the artifacts
- Playback source automatically prefers the transcoded file once
  ready, keeping the 302 direct-link mode for cloud drives
- videos table gains transcode_status/error/file_id/size columns via
  startup migration; counts and live task status surface in the
  admin drives API and generation panel UI
- Stop semantics: per-drive stop button, drive-level "stop all tasks"
  and global stop all include the transcode task; interrupted videos
  keep their candidate status and resume on next start

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 09:41:08 +08:00
nianzhibai ae324d3752 feat: add Unicom cloud drive source operations
Add Unicom cloud drive support for source-file deletion and crawler uploads.

- Implement source-file removal for Unicom cloud drive so deleting videos can also remove the original cloud-drive file when requested.

- Resolve Unicom cloud drive source identifiers across file FID, object ID, directory ID, rename, and delete flows.

- Add upload support for Spider91 crawler imports targeting Unicom cloud drive storage.

- Add Unicom cloud drive QR login backend APIs, frontend form support, and tests.

- Extend drive capability metadata, scanner behavior, proxy handling, preview handling, and migration coverage for cloud-drive source operations.

- Rename Chinese display labels from 联通沃盘 to 联通网盘 and from 123 云盘 to 123网盘 while keeping the root README aligned with origin/main.

- Add referrer-policy coverage for 302 video playback and update related frontend playback tests.
2026-06-12 15:49:15 +08:00
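nianzhibai 811d87cc27 feat: improve script crawler import and management experience
- Rewrite the add/edit crawler dialog layout, tune the desktop width and the proportions of the script-source and test areas, and hide internal details such as the local path and import URL.

- Adjust the crawler management page copy and the mobile stats card layout, show status cards in a consistent two-column layout, and keep dialogs from closing accidentally on overlay clicks.

- Automatically convert GitHub blob links to raw links, improving compatibility when importing scripts by URL.

- Add ffprobe integrity validation for script crawler downloads; on failure, delete the bad file and do not record it in seen, so it can be fetched again later.

- Support repackaging .m3u8/HLS media into local MP4 via ffmpeg, then continue through the fingerprint, thumbnail, preview, and upload migration flow.

- Fix occasionally lost dry-run stderr logs, and add tests for GitHub URLs, bad-video cleanup, HLS downloads, dialog interaction, and responsive layout.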
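
A small Go sketch of the GitHub blob-to-raw conversion mentioned in the list above. The function name toRawGitHubURL is hypothetical, and the sketch assumes the ref is a single path segment, so branch names containing a slash are not handled.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// toRawGitHubURL rewrites
//   https://github.com/<owner>/<repo>/blob/<ref>/<path>
// to
//   https://raw.githubusercontent.com/<owner>/<repo>/<ref>/<path>
// and returns any other URL unchanged.
func toRawGitHubURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil || u.Host != "github.com" {
		return raw
	}
	// Expected segments: owner, repo, "blob", ref, rest of the file path.
	parts := strings.SplitN(strings.TrimPrefix(u.Path, "/"), "/", 5)
	if len(parts) < 5 || parts[2] != "blob" {
		return raw
	}
	u.Host = "raw.githubusercontent.com"
	u.Path = "/" + strings.Join([]string{parts[0], parts[1], parts[3], parts[4]}, "/")
	u.RawPath = ""
	return u.String()
}

func main() {
	fmt.Println(toRawGitHubURL("https://github.com/owner/repo/blob/main/crawlers/example.py"))
}
```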
2026-06-12 10:53:18 +08:00
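nianzhibai 96e423b952 feat: improve crawler dedupe, upload progress, and source-file deletion
Add a candidate budget, duplicate source records, and a default crawler tag to script crawlers so duplicate videos cannot use up the target number of new videos.

Add crawler upload migration progress reporting and an upload card on the admin page so each crawler can show how the current round of uploads is progressing.

Add an optional ability to delete the cloud-drive source file when deleting a video, complete the playback-page and admin-page interactions, and implement the Remove interface for several cloud-drive drivers.

Add related tests and update the crawler protocol docs.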
2026-06-11 22:42:11 +08:00
nianzhibai 7ddf33d726 Improve crawler asset stats and admin navigation
- Count crawler assets by crawler source ID prefix after cloud migration

- Add crawler API totals for cumulative, local, and migrated videos

- Let crawler thumbnail and preview readiness inherit equivalent canonical videos

- Show cumulative crawl data in crawler management cards

- Remove low-value expanded crawler metadata fields from the card body

- Move return-to-site into the main admin navigation with grouped sections

- Rename the content admin group to management and adjust footer icon sizing

- Update backend and frontend tests for crawler/admin behavior
2026-06-10 23:45:43 +08:00
nianzhibai c1355385e1 feat(crawler): simplify script crawler workflow
Redesign crawler management around imported Python scripts instead of built-in crawler storage. Crawler scripts now declare CRAWLER_NAME, imports validate metadata, crawler IDs are generated internally, and deleted crawler scripts are detached without deleting already imported videos.

Add backend support for file and URL script imports, dry-run testing, metadata parsing, safer job paths, original filename preservation, and crawler listing that ignores detached script records. Remove the legacy built-in Spider91 script path flow and hidden Python/config JSON fields from the crawler API.

Rework the admin crawler page into an independent crawler console with script import, dry-run testing, status metrics, spider iconography, and simplified controls. Update docs, examples, installer checks, Docker/release packaging, and tests for the new protocol.
2026-06-10 14:27:16 +08:00
nianzhibai ec5a01b6aa feat(crawler): redesign crawler scripts and admin workflow
- add generic scriptcrawler backend runner using the crawler.v1 JSONL protocol

- support crawler script upload and HTTP(S) URL import from the admin crawler page

- simplify the user-facing crawler contract to title, media_url, optional thumbnail_url and optional source_id

- convert Spider91 into a normal script crawler and reject new Spider91 storage-drive configs

- keep legacy Spider91 storage rows visible only for cleanup/deletion

- add crawler protocol docs, example script, admin UI, tests and migration coverage
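
A hedged Go sketch of reading crawler output under the simplified contract listed above (title, media_url, optional thumbnail_url, optional source_id). It assumes each JSONL line is one flat JSON object with exactly those field names; the actual crawler.v1 framing may differ, and Item, readItems, and parseJSONL are hypothetical names.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Item is a hypothetical mirror of the user-facing crawler contract.
type Item struct {
	Title        string `json:"title"`
	MediaURL     string `json:"media_url"`
	ThumbnailURL string `json:"thumbnail_url,omitempty"`
	SourceID     string `json:"source_id,omitempty"`
}

// readItems parses one JSON object per line and rejects lines that lack the
// two required fields. Blank lines are ignored.
func readItems(r io.Reader) ([]Item, error) {
	var items []Item
	sc := bufio.NewScanner(r)
	line := 0
	for sc.Scan() {
		line++
		text := strings.TrimSpace(sc.Text())
		if text == "" {
			continue
		}
		var it Item
		if err := json.Unmarshal([]byte(text), &it); err != nil {
			return items, fmt.Errorf("line %d: %w", line, err)
		}
		if it.Title == "" || it.MediaURL == "" {
			return items, fmt.Errorf("line %d: title and media_url are required", line)
		}
		items = append(items, it)
	}
	return items, sc.Err()
}

// parseJSONL is a convenience wrapper for in-memory input.
func parseJSONL(data string) ([]Item, error) {
	return readItems(strings.NewReader(data))
}

func main() {
	items, err := parseJSONL(`{"title":"demo","media_url":"https://example.com/a.mp4"}`)
	fmt.Println(len(items), err)
}
```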
2026-06-09 23:51:12 +08:00
nianzhibai 940e5dd76d feat: support spider91 uploads to google drive 2026-06-08 23:50:19 +08:00
nianzhibai 5fc8e9ebb7 Improve drive scan task coordination 2026-06-08 17:37:58 +08:00
nianzhibai c87208117e Fix scanner cancellation and shorts UI 2026-06-06 08:37:00 +00:00
nianzhibai 8dff0f07b9 feat: add admin video deletion and mobile UI polish
Adds tombstone-backed video deletion with generated asset cleanup, plus responsive video management actions and centered confirmation dialogs.
2026-06-04 16:10:26 +08:00
nianzhibai 5080203b7c feat: add drive task stop controls
Add per-drive and global admin controls to stop scan, preview, thumbnail, and fingerprint work.

Keep stopped pending generation resumable, wire cancellation through workers and nightly runs, and refine mobile drive-management UI/history behavior.
2026-06-03 23:42:54 +08:00
nianzhibai df6f0ebbbf feat: support spider91 upload to 123pan 2026-06-03 21:49:27 +08:00
nianzhibai 8f0d52aec4 fix: hash long local media asset filenames 2026-06-03 20:35:53 +08:00
nianzhibai 869c0d5f78 refactor: rename teaser UI copy to preview video 2026-06-03 19:45:15 +08:00
nianzhibai cada336e96 Add 123Pan (123云盘) support and optimize storage deletion logic 2026-06-02 14:30:16 +08:00
nianzhibai e36a17f99d fix: improve 91Spider tagging and deduped tag filters 2026-06-01 18:51:56 +08:00
nianzhibai cf9de5b40a Add failed fingerprint retry controls 2026-06-01 13:42:32 +08:00
nianzhibai 4ba964b7e2 fix thumbnail status and frontend serving 2026-05-31 17:40:16 +08:00
nianzhibai cd3b3c6976 feat: use root id as drive scan root 2026-05-31 17:13:51 +08:00
nianzhibai a407312dfa fix: prevent duplicate scan-all jobs 2026-05-31 15:09:05 +08:00
nianzhibai 87d197496b Limit thumbnail transient retries 2026-05-31 12:02:49 +08:00
nianzhibai 0e3a5bd5cd Add Google Drive support 2026-05-31 11:14:03 +08:00
nianzhibai 66adf444ba fix: detect Docker image version for update checks 2026-05-31 09:55:15 +08:00
nianzhibai e57058db79 feat: prepare v0.0.4 storage release 2026-05-30 20:02:02 +08:00
nianzhibai 6ec61833f2 feat: probe video duration during thumbnail generation 2026-05-30 18:30:22 +08:00
nianzhibai 6e87f88d53 feat: support spider91 uploads to OneDrive 2026-05-30 18:04:15 +08:00
nianzhibai e78fa9d978 feat: improve media generation pipeline status 2026-05-30 17:37:31 +08:00
nianzhibai 039ec2a988 Improve fingerprint dedupe maintenance 2026-05-29 23:58:36 +08:00
nianzhibai da0683344e Add sampled fingerprint deduplication 2026-05-29 23:19:52 +08:00
nianzhibai 34b6fa8ea9 Release v0.0.3 improvements 2026-05-29 18:34:38 +08:00
nianzhibai f5c20f9594 Fix spider91 upload target and thumbnails 2026-05-29 06:28:18 +00:00
nianzhibai 137cfbcf82 feat: add prebuilt installer workflow 2026-05-28 19:13:41 +08:00
nianzhibai bb8818a55a feat: improve admin setup and drive management 2026-05-28 18:41:40 +08:00
nianzhibai d2d4db8062 fix: harden spider91 source matching 2026-05-28 16:10:20 +08:00
nianzhibai 7540371838 feat: restore tag classification and drive controls
Restore the previous fixed-tag classification flow, including startup backfill for existing videos and the 91porn spider tag.

Also commit the current drive scanning, preview scheduling, and admin drive-control updates present in the workspace.
2026-05-28 12:18:17 +08:00
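nianzhibai 39ef2defcc feat(spider91): streaming crawl + enqueue teasers once after completion + mark failed thumbnails as failed
Three related changes, all about the spider91 crawler flow.

1. Streaming crawl protocol (replaces the old "Python collects 15 items, then hands them to Go" model)

  Python side (spider_91porn.py):
    - Add a --stream-output flag. When enabled, every time a direct video
      link is resolved, the entry is written to stdout as one line of JSON and flushed.
    - In stream mode log() writes to stderr to avoid polluting the stdout JSONL protocol.
    - --output FILE still works and serves as an offline archive.

  Go side (crawler.go):
    - New startSpiderTargetNew() starts cmd asynchronously and returns the stdout pipe.
    - RunOnce reads stdout line by line with bufio.Scanner and calls processOne
      as soon as each line is parsed (download video + thumbnail + UpsertVideo).
      Remove the old readSpiderOutput / whole-JSON-file parsing path.
    - Python stderr is forwarded to the backend log with the prefix [spider91:py].

  Benefit: Python paging to find the next viewkey now overlaps in time with Go
  downloading the current video, making the most of each signed link's e= time
  window. Observed today: Python found and emitted all 15 viewkeys in 77 seconds.
  If, as in the old model, each item had to wait for Go to finish its serial
  downloads, the signatures of the later items could easily expire (one root
  cause of the earlier run where 8 of 15 downloads ended in EOF).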

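2. Teasers are enqueued once after the crawler finishes (replaces enqueueing immediately as each video is stored)

  - main.go attachSpider91Crawler no longer injects the OnNewVideo callback.
  - main.go runSpider91Crawl calls enqueueDriveGeneration(driveID) once after
    Crawler.RunOnce completes, so all new videos enter the teaser worker together.
  - This aligns naturally with the "wait for the teaser queue to go idle" semantics of nightly Phase 2.
  - The download stage no longer competes with ffmpeg for CPU/IO.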

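3. Explicitly mark thumbnail_status='failed' when the site thumbnail download fails

  By design, the thumb worker does not process spider91 videos on a spider91
  drive (the thumbnail should be the site's original image, saved directly).
  When the site thumbnail download fails, url='' + status='pending' makes
  waitForThumbnailsBeforePreview in enqueueDriveGeneration trap the teaser in
  a wait loop, because CountVideosNeedingThumbnail > 0.

  Fix: the thumb-failure branch in crawler.go processOne explicitly sets status='failed'
  (the CountVideosNeedingThumbnail condition status != 'failed' then excludes it).

  Symptom observed today: the 187 MB video c2c04fc8602c5396d469 was stuck in the
  '[preview] waiting for 1 thumbnails before teaser generation'
  loop for 35 minutes.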

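Tests:
  - crawler_test.go is refactored around a buildFakeSpiderScript helper that
    generates a fake python (actually sh) supporting --stream-output and echoing JSON line by line.
  - TestCrawlerRunOnceFullFlow / TestCrawlerThumbDownloadFailureMarksStatusFailed
    verify the streaming protocol + thumb-fail gate through the new helper.

go test ./... is all green; a manually triggered spider91 crawl in production confirmed the streaming behavior.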
2026-05-27 18:48:30 +08:00
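nianzhibai 1eeebbf305 refactor(scheduling): unify three scheduled loops into the NightlyJob pipeline
Replace the three parallel periodic loops scanLoop / crawlerLoop / Migrator.Run with a single nightly.Runner,
which runs one pipeline serially every day at cron_hour (default 01:00):

  Phase 1  Scan all non-spider91 / non-localupload drives
           → detect new videos + detect deleted videos (clean up catalog rows + local thumbnails/teasers)
           → enqueue thumbnails + teasers (per-drive teaser_enabled decides whether teasers are enqueued)
           → wait until all thumb / teaser worker queues are idle
  Phase 2  Only when a spider91 drive exists: run the 91 crawler and enqueue teasers for new videos
           → wait until the teaser queue is idle
  Phase 3  spider91 → cloud drive migration (one-shot PikPak/115 sweep)

Key properties:
  - 6h soft timeout (nightly.max_duration); when it expires the current phase finishes and later phases do not start
  - Same-day dedupe: last_run_date is persisted to the settings table, so a crash and restart does not rerun the pipeline
  - sync.Mutex.TryLock keeps manual triggers and the natural cron trigger mutually exclusive
  - ctx.Err is checked at every phase boundary; in-progress ffmpeg / uploads are not force-killed
  - The per-drive '重扫' (rescan) and spider91 '立即抓取' (crawl now) buttons are kept
  - The top bar gains a '立即跑全流程' (run full pipeline now) button (POST /admin/api/jobs/nightly/run)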
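
A minimal Go sketch of the sync.Mutex.TryLock mutual exclusion named in the key properties above. The Runner type, RunOnce method, and ErrAlreadyRunning error here are hypothetical stand-ins, not the actual nightly.Runner; the point is only that a second trigger returns immediately instead of queueing behind the running pipeline.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrAlreadyRunning is a hypothetical sentinel returned to the losing trigger.
var ErrAlreadyRunning = errors.New("nightly pipeline is already running")

// Runner is a hypothetical stand-in for a pipeline runner.
type Runner struct {
	mu sync.Mutex
}

// RunOnce runs the pipeline unless another run holds the lock. TryLock
// (available since Go 1.18) never blocks, so a manual trigger that arrives
// during a cron run is refused rather than delayed.
func (r *Runner) RunOnce(pipeline func()) error {
	if !r.mu.TryLock() {
		return ErrAlreadyRunning
	}
	defer r.mu.Unlock()
	pipeline()
	return nil
}

func main() {
	r := &Runner{}
	var nested error
	err := r.RunOnce(func() {
		// Simulates a manual trigger arriving while the pipeline is running.
		nested = r.RunOnce(func() {})
	})
	fmt.Println(err, nested)
}
```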

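Additional improvements:
  - preview.Worker / ThumbWorker gain WaitIdle(ctx) error, which nightly uses as a synchronization barrier
  - scanner gains a 30s heartbeat progress log so long scans are no longer a black box
    Format: [scanner] drive=X progress: scanned=N added=K errors=E dirs=M elapsed=Ts at=<dir>
  - cleanupMissingDriveVideos is extended from PikPak-only to all cloud drive kinds
    (the stats.Errors==0 gate is kept to avoid mistaken deletions when the API is flaky)
  - Migrator drops the periodic ticker / Trigger channel and becomes a standalone, callable RunOnce
    (the captcha cooldown state machine is kept and persists for 5 minutes across RunOnce calls)

Deprecated (fields kept for compatibility with old yaml):
  - scanner.interval_seconds   (replaced by nightly.cron_hour scheduling)
  - the spider91 drive's crawl_hour credential field (last_crawl_at is only shown in the admin UI)

Tests: go test ./... is all green (including ~320 lines of unit tests in the nightly package); npm run build passes.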
2026-05-27 13:17:44 +08:00