mirror of
https://github.com/nianzhibai/91.git
synced 2026-06-15 08:45:41 +08:00
refactor(scheduling): 统一三套定时调度为 NightlyJob 流水线
替代 scanLoop / crawlerLoop / Migrator.Run 三个并行的周期循环为单一 nightly.Runner,
每天 cron_hour(默认 01:00)串行跑一条流水线:
Phase 1 扫所有非 spider91 / 非 localupload 网盘
→ 检测新增视频 + 检测被删视频(清理 catalog 行 + 本地封面/teaser)
→ 入队封面 + teaser(per-drive teaser_enabled 决定 teaser 是否入队)
→ 等所有 thumb / teaser worker 队列 idle
Phase 2 仅当存在 spider91 drive:跑 91 爬虫,新视频入队 teaser
→ 等 teaser 队列 idle
Phase 3 spider91 → 云盘迁移(PikPak/115 一次性 sweep)
关键属性:
- 6h 软超时(nightly.max_duration);到点 phase 跑完,后续 phase 不启动
- 当天去重:last_run_date 持久化到 settings 表,进程崩溃重启不重复跑
- sync.Mutex.TryLock 保证手动触发与自然 cron 触发互斥
- 每 phase 边界检查 ctx.Err,不强 kill 进行中的 ffmpeg / 上传
- 单 drive '重扫' 和 spider91 '立即抓取' 按钮保留
- 顶栏新增 '立即跑全流程' 按钮 (POST /admin/api/jobs/nightly/run)
附带优化:
- preview.Worker / ThumbWorker 增加 WaitIdle(ctx) error,nightly 用作同步屏障
- scanner 增加 30s 心跳进度日志,避免长扫盘内部黑盒
格式: [scanner] drive=X progress: scanned=N added=K errors=E dirs=M elapsed=Ts at=<dir>
- cleanupMissingDriveVideos 从 PikPak-only 扩展到所有云盘 kind
(保留 stats.Errors==0 闸门避免 API 抖动误删)
- Migrator 移除周期 ticker / Trigger 通道,改成可单独调用的 RunOnce
(captcha cooldown 状态机仍保留,跨 RunOnce 持久 5 分钟)
废弃 (字段保留以兼容旧 yaml):
- scanner.interval_seconds (替代为 nightly.cron_hour 调度)
- spider91 drive 的 crawl_hour 凭证字段 (last_crawl_at 仅作 admin UI 显示)
测试:go test ./... 全绿 (含 nightly 包 ~320 行单元测试);npm run build 通过。
This commit is contained in:
@@ -5,7 +5,7 @@
|
||||
- 前端:React 18 + Vite + TypeScript
|
||||
- 后端:Go 1.23,SQLite(纯 Go 驱动,无 CGO),ffmpeg 生成 teaser 和封面
|
||||
- 网盘接入:夸克自研 + 115driver SDK + PikPak 自研(参考 OpenList)+ wopan-sdk-go SDK + OneDrive(OpenList 在线续期 + Microsoft Graph 文件接口)
|
||||
- 爬虫接入:91 爬虫(`91VideoSpider/spider_91porn.py`,每天凌晨拉一页视频 + 封面到本地)
|
||||
- 爬虫接入:91 爬虫(`91VideoSpider/spider_91porn.py`,由凌晨流水线触发,拉一页视频 + 封面到本地)
|
||||
|
||||
## 当前功能
|
||||
|
||||
@@ -279,7 +279,7 @@ OneDrive 当前采用 OpenList 在线 API 的续期方式,不要求用户提
|
||||
| 字段 | 默认值 | 说明 |
|
||||
|---|---|---|
|
||||
| `target_new` | `15` | 每次爬取的新视频数。从 page 1 起翻页,跳过已知 viewkey,凑够这么多个新视频后停止 |
|
||||
| `crawl_hour` | `0` | 0-23,整点触发的小时;默认 00:00-00:59 之间触发 |
|
||||
| `crawl_hour` | `0` | **已废弃**:旧版每天按这个整点触发爬取,现在统一由 `nightly.cron_hour` 调度(默认 01:00)。字段保留仅为兼容旧 yaml |
|
||||
| `proxy` | `(空)` | 下载代理 URL,如 `http://127.0.0.1:7890`;留空时回退到 backend 进程的 `HTTPS_PROXY` 环境变量 |
|
||||
| `python_path` | `python3` | 解释器路径,可填绝对路径 |
|
||||
| `script_path` | (自动定位) | 脚本绝对路径;不填时从仓库结构里推断 `91VideoSpider/spider_91porn.py` |
|
||||
@@ -299,9 +299,10 @@ backend/data/spider91/<driveID>/
|
||||
|
||||
**触发逻辑**:
|
||||
|
||||
- 每分钟轮询一次。命中 `crawl_hour` 小时窗口(默认 0:00-0:59)+ 距离上次成功爬取至少 12 小时 → 触发
|
||||
- 管理后台点 "立即抓取" 等同于立刻手动触发一次(不受时间窗约束)
|
||||
- 每个 `spider91` drive 独立调度;可以挂多个不同 `crawl_hour` 的实例
|
||||
- 每天 `nightly.cron_hour`(默认 01:00)作为整条流水线 Phase 2 串行触发;详见上文「凌晨流水线」章节
|
||||
- 管理后台点 spider91 drive 的「立即抓取」按钮:只跑该 drive 一次爬取(不连带 Phase 1 / Phase 3)
|
||||
- 顶栏「立即跑全流程」按钮:跑完整 Phase 1 + 2 + 3
|
||||
- `crawl_hour` 字段保留但已不参与调度判定;`last_crawl_at` 仅作 admin UI「上次抓取 N 小时前」显示用
|
||||
|
||||
**去重**:用 91porn 网站的 `viewkey` 作为唯一标识,配合 `videos.id = "spider91-<driveID>-<viewkey>"` 的拼接规则去重。每次爬取前 backend 会把 catalog 里已存在的 viewkey 列表写到 `.crawl/seen-<时间戳>.txt`,作为 `--seen-viewkeys-file` 传给 Python 脚本;脚本只会请求未见过 viewkey 的详情页。
|
||||
|
||||
@@ -331,13 +332,51 @@ backend/data/spider91/<driveID>/
|
||||
- **目标 PikPak drive 选择**:`spider91_upload_drive_id` 全局设置;admin 可通过 `PUT /admin/api/settings` 显式指定。**未设置时会自动选取唯一的 PikPak drive**;如果有多个 PikPak drive,必须在管理后台显式选定其中一个,否则迁移不会发生。
|
||||
- **PikPak 目录**:用该 PikPak drive 的 `rootId` 作为上传父目录。建议在 PikPak Web 端预先建一个空的子目录(比如 `/91Spider/`),把这个目录的 file ID 填到 PikPak drive 的 `rootId`,这样既能让自动迁移落到这个子目录,也能让该 PikPak drive 的扫描根只看这个子目录,不会和 115 等其它网盘内容重叠。
|
||||
- **PikPak 文件名**:上传时使用 `<视频标题>-<viewkey后8位>.<ext>` 格式(方案 B)。标题被 sanitize 过:去控制字符、非法字符 `/ \ : * ? " < > |` 替成空格、折叠空白、首尾去点号、按 unicode 截断 80 字符。catalog 的 `file_name` 同步更新成上传名,下次 PikPak 扫盘时按 `(file_name, size)` 也能匹配上。
|
||||
- **触发节奏**:迁移 worker 每 60 秒轮询一次;每次 spider91 爬虫完成后立刻额外触发一次(不必等周期)。但触发不等于上传 —— 是否上传由"本地是否超过 15 个"决定。
|
||||
- **触发节奏**:迁移作为凌晨流水线的 Phase 3 触发 —— Phase 2 spider91 爬取 + 所有新视频 teaser 生成完毕后才进入;详见上文「凌晨流水线」章节。每天最多触发一次(除非管理员点「立即跑全流程」)。触发不等于上传 —— 是否上传由"本地是否超过 15 个"决定。
|
||||
- **catalog 改写**:上传成功后事务性地把视频行的 `drive_id` / `file_id` / `file_name` / `content_hash` 改成 PikPak 的;视频自身的 `id`(`spider91-<driveID>-<viewkey>`)保持不变,所以 `video_tags`、`views`、`likes`、`91porn` 标签等关联数据全部保留。改写后再次扫盘时,scanner 通过 `(content_hash)` 或 `(file_name, size)` 现成的 `findDuplicate` 兜底逻辑认出来,不会重复入库。
|
||||
- **本地清理**:迁移成功立即删本地 mp4 + thumb(封面已复制到 `backend/data/previews/thumbs/`,前端展示不受影响)。每轮 worker 末尾还有一道防御性兜底 —— 扫所有本地文件,对 catalog 中 `drive_id` 已迁走但本地仍有残留的孤儿做清理(正常路径不会触发)。
|
||||
- **去重 seen 文件**:crawler 每次跑前会写一份 "已知 viewkey" 文件喂给 Python 脚本,让它跳过已爬过的详情页。这个列表按 `id LIKE 'spider91-<driveID>-%'` 查(不依赖 `drive_id`),所以 spider91 视频被迁到 PikPak 后还能被认出来,**不会重复爬**。
|
||||
- **失败处理**:上传失败时本地文件保留、catalog 行保持原样;下次轮询会重试。账户超额或永久错误目前没有特殊标记,watch 日志(`[spider91migrate]`)即可。
|
||||
- **不开 PikPak?** 不指定 `spider91_upload_drive_id` 也不挂 PikPak drive 时,spider91 视频继续从本地服务(`/p/spider91/<videoID>`),跟以前一样工作;磁盘会持续增长,需要手动管理。
|
||||
|
||||
## 凌晨流水线
|
||||
|
||||
所有定时任务统一进一条流水线,每天 `nightly.cron_hour`(默认 1 即 01:00)触发一次:
|
||||
|
||||
```
|
||||
01:00 ▶ NightlyJob 启动
|
||||
├─ Phase 1 扫所有非 spider91 / 非 localupload 网盘
|
||||
│ 为每个 drive 串行:scan → 检测新增视频 → 检测被删视频
|
||||
│ (本项目数据库 + 本地封面 / teaser 一并清理;扫描 errors > 0
|
||||
│ 时跳过删除清理避免误删)→ 入队封面 + teaser(如该 drive
|
||||
│ 的 teaser_enabled 为 true)
|
||||
│ 等所有 drive 的封面 + teaser 队列 idle
|
||||
│
|
||||
├─ Phase 2 仅当存在 spider91 drive:串行跑 91 爬虫,新视频入队 teaser
|
||||
│ 等 teaser 队列 idle
|
||||
│
|
||||
└─ Phase 3 spider91 → 云盘迁移(一次完整 sweep;本地保留最新 15 个,
|
||||
更旧的传到 spider91_upload_drive_id 指定的 PikPak 或 115)
|
||||
```
|
||||
|
||||
**关键属性**:
|
||||
|
||||
- **6 小时软超时**:`nightly.max_duration` 控制单轮总耗时上限。到点时正在跑的 phase 跑完,后续 phase 不再启动 —— 不强 kill 任何 ffmpeg / 上传。
|
||||
- **当天去重**:流水线启动后会把当天日期写入 `settings.nightly.last_run_date`;同一天 01:00 的下一次 tick 看到日期相同就跳过,进程崩溃 + 重启也不会重复跑。
|
||||
- **失败处理**:单个 drive 扫描失败、单条 teaser 生成失败、单条迁移失败都不会阻塞流水线;都通过日志可观测,下次流水线再试。teaser `failed` 状态需管理员手动「重生失败 teaser」恢复。
|
||||
- **每 drive 的 teaser 开关**:`drives.teaser_enabled` 字段在 Phase 1 / Phase 2 入队时被尊重;关闭时 teaser worker 不会被入队,封面仍然生成。
|
||||
- **手动触发**:`/admin/drives` 顶栏「立即跑全流程」按钮(POST `/admin/api/jobs/nightly/run`)忽略时间窗立即跑一遍,已在跑时被 `runMu.TryLock` 丢弃。
|
||||
|
||||
**已不存在的旧调度**(替代关系):
|
||||
|
||||
| 旧机制 | 替代为 |
|
||||
|---|---|
|
||||
| `scanLoop`:每 6h 在 02–07 点窗口扫一次 | Phase 1 |
|
||||
| `crawlerLoop`:spider91 drive 按 `crawl_hour` 每分钟轮询 | Phase 2 |
|
||||
| `Migrator.Run`:每 60 秒 + 爬完立即触发的迁移 worker | Phase 3 |
|
||||
|
||||
`config.yaml` 里 `scanner.interval_seconds` 字段保留但已不生效;spider91 drive 的 `crawl_hour` 凭证字段保留但已不参与调度,`last_crawl_at` 仅作 admin UI 显示「上次抓取 N 小时前」用。
|
||||
|
||||
## Teaser 和封面生成策略
|
||||
|
||||
- 封面:固定从第 5 秒抽一帧 jpg,不再为封面单独探测视频时长
|
||||
|
||||
+95
-137
@@ -29,6 +29,7 @@ import (
|
||||
"github.com/video-site/backend/internal/drives/quark"
|
||||
"github.com/video-site/backend/internal/drives/spider91"
|
||||
"github.com/video-site/backend/internal/drives/wopan"
|
||||
"github.com/video-site/backend/internal/nightly"
|
||||
"github.com/video-site/backend/internal/preview"
|
||||
"github.com/video-site/backend/internal/proxy"
|
||||
"github.com/video-site/backend/internal/scanner"
|
||||
@@ -167,6 +168,11 @@ func main() {
|
||||
SetSpider91UploadDriveID: func(id string) error {
|
||||
return app.SetSpider91UploadDriveID(ctx, id)
|
||||
},
|
||||
OnRunNightlyJob: func() {
|
||||
if app.nightlyRunner != nil {
|
||||
app.nightlyRunner.TriggerNow()
|
||||
}
|
||||
},
|
||||
}
|
||||
|
||||
r := chi.NewRouter()
|
||||
@@ -177,12 +183,23 @@ func main() {
|
||||
apiServer.RegisterRoutes(r, authr)
|
||||
adminServer.Register(r)
|
||||
|
||||
// 启动定时扫描
|
||||
go app.scanLoop(ctx)
|
||||
// spider91 爬虫专用的凌晨任务(默认 00:30 触发,避开网盘扫描窗口)
|
||||
go app.crawlerLoop(ctx)
|
||||
// spider91 → PikPak 上传 worker
|
||||
go app.spider91Migrator.Run(ctx)
|
||||
// 凌晨流水线:每天 cron_hour 触发一次,串行跑
|
||||
// Phase 1 扫所有非 spider91 / localupload 网盘 + 删除检测 + 入队封面/teaser
|
||||
// Phase 2 spider91 爬虫 + 入队 teaser
|
||||
// Phase 3 spider91 → 云盘迁移
|
||||
// 也响应 admin "立即跑全流程" 按钮(POST /admin/api/jobs/nightly/run → TriggerNow)。
|
||||
app.nightlyRunner = nightly.New(nightly.Config{
|
||||
Settings: cat,
|
||||
CronHour: cfg.Nightly.CronHour,
|
||||
MaxDuration: cfg.Nightly.MaxDuration,
|
||||
ListScanTargets: app.listScanTargetIDs,
|
||||
RunScan: app.runScan,
|
||||
ListSpider91Drives: app.listSpider91DriveIDs,
|
||||
RunSpider91Crawl: app.runSpider91Crawl,
|
||||
WaitPreviewQueuesIdle: app.waitAllPreviewQueuesIdle,
|
||||
RunMigration: app.spider91Migrator.RunOnce,
|
||||
})
|
||||
go app.nightlyRunner.Run(ctx)
|
||||
|
||||
srv := &http.Server{
|
||||
Addr: cfg.Server.Listen,
|
||||
@@ -228,6 +245,10 @@ type App struct {
|
||||
|
||||
// spider91Migrator 周期把 spider91 视频上传到目标 drive(PikPak 或 115)。
|
||||
spider91Migrator *spider91migrate.Migrator
|
||||
|
||||
// nightlyRunner 是凌晨流水线调度器:每天 cron_hour 串行跑扫盘 → 91 爬虫 → 迁移。
|
||||
// 也响应 admin 「立即跑全流程」按钮(TriggerNow)。
|
||||
nightlyRunner *nightly.Runner
|
||||
}
|
||||
|
||||
// teaserEnabledForDrive 查询某个 drive 当前的 per-drive teaser 开关。
|
||||
@@ -832,15 +853,22 @@ func (a *App) runScan(ctx context.Context, driveID string) {
|
||||
log.Printf("[cleanup] removed %d excluded 115 videos for drive=%s", removed, driveID)
|
||||
}
|
||||
}
|
||||
if drv.Kind() == "pikpak" {
|
||||
// 删除检测:扫描到的 file_ids 是当前云盘上的真实存在;catalog 里这个 drive
|
||||
// 名下、且其 parent_id 处在本次扫描走过的目录内(或本次是从根扫的)、却
|
||||
// 不在 SeenFileIDs 中的视频 → 视为已被删除。
|
||||
//
|
||||
// spider91 / localupload 走自己的生命周期管理,不应该参与扫描清理;
|
||||
// stats.Errors > 0 时(云盘 API 中途抖动)保守起见跳过这一轮,避免把
|
||||
// "暂时列不出来"误认成"被用户删了"。
|
||||
if drv.Kind() != spider91.Kind && drv.ID() != localupload.DriveID {
|
||||
if stats.Errors > 0 {
|
||||
log.Printf("[cleanup] skip stale PikPak cleanup for drive=%s: scan had %d directory errors", driveID, stats.Errors)
|
||||
log.Printf("[cleanup] skip stale cleanup for drive=%s kind=%s: scan had %d directory errors", driveID, drv.Kind(), stats.Errors)
|
||||
} else {
|
||||
removed, err := a.cleanupMissingDriveVideos(ctx, driveID, stats.SeenFileIDs, stats.VisitedDirIDs, startID == drv.RootID())
|
||||
if err != nil {
|
||||
log.Printf("[cleanup] stale PikPak cleanup drive=%s error: %v", driveID, err)
|
||||
log.Printf("[cleanup] stale cleanup drive=%s kind=%s error: %v", driveID, drv.Kind(), err)
|
||||
} else if removed > 0 {
|
||||
log.Printf("[cleanup] removed %d stale PikPak videos for drive=%s", removed, driveID)
|
||||
log.Printf("[cleanup] removed %d stale videos for drive=%s kind=%s", removed, driveID, drv.Kind())
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -1056,61 +1084,60 @@ func (a *App) regenFailedPreviews(ctx context.Context, driveID string) {
|
||||
log.Printf("[preview] enqueued failed videos for regen drive=%s queued=%d", driveID, queued)
|
||||
}
|
||||
|
||||
func (a *App) scanLoop(ctx context.Context) {
|
||||
if a.cfg.Scanner.IntervalSeconds <= 0 {
|
||||
return
|
||||
}
|
||||
interval := time.Duration(a.cfg.Scanner.IntervalSeconds) * time.Second
|
||||
var lastScheduledScan time.Time
|
||||
if a.scanAllOnceIfDue(ctx, time.Now(), lastScheduledScan, interval) {
|
||||
lastScheduledScan = time.Now()
|
||||
}
|
||||
|
||||
checkEvery := interval
|
||||
if checkEvery > time.Minute {
|
||||
checkEvery = time.Minute
|
||||
}
|
||||
ticker := time.NewTicker(checkEvery)
|
||||
defer ticker.Stop()
|
||||
for {
|
||||
select {
|
||||
case <-ctx.Done():
|
||||
return
|
||||
case now := <-ticker.C:
|
||||
if a.scanAllOnceIfDue(ctx, now, lastScheduledScan, interval) {
|
||||
lastScheduledScan = now
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func (a *App) scanAllOnceIfDue(ctx context.Context, now, lastScheduledScan time.Time, interval time.Duration) bool {
|
||||
if !scheduledScanDue(now, lastScheduledScan, interval) {
|
||||
return false
|
||||
}
|
||||
a.scanAllOnce(ctx)
|
||||
return true
|
||||
}
|
||||
|
||||
func scheduledScanDue(now, lastScheduledScan time.Time, interval time.Duration) bool {
|
||||
if interval <= 0 || !scheduledScanAllowed(now) {
|
||||
return false
|
||||
}
|
||||
return lastScheduledScan.IsZero() || now.Sub(lastScheduledScan) >= interval
|
||||
}
|
||||
|
||||
func scheduledScanAllowed(now time.Time) bool {
|
||||
hour := now.Hour()
|
||||
return hour >= 2 && hour < 7
|
||||
}
|
||||
|
||||
func (a *App) scanAllOnce(ctx context.Context) {
|
||||
for _, d := range a.registry.All() {
|
||||
// listScanTargetIDs 返回 nightly Phase 1 应扫描的所有 drive ID
|
||||
// (非 spider91、非 localupload)。顺序按 registry.All 给的稳定顺序。
|
||||
func (a *App) listScanTargetIDs(_ context.Context) []string {
|
||||
all := a.registry.All()
|
||||
out := make([]string, 0, len(all))
|
||||
for _, d := range all {
|
||||
if !shouldScanDrive(d) {
|
||||
continue
|
||||
}
|
||||
a.runScan(ctx, d.ID())
|
||||
out = append(out, d.ID())
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// listSpider91DriveIDs 返回 nightly Phase 2 应触发爬取的 spider91 drive ID 列表。
|
||||
func (a *App) listSpider91DriveIDs(_ context.Context) []string {
|
||||
a.mu.Lock()
|
||||
defer a.mu.Unlock()
|
||||
out := make([]string, 0, len(a.spider91Crawlers))
|
||||
for id := range a.spider91Crawlers {
|
||||
out = append(out, id)
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// waitAllPreviewQueuesIdle 阻塞直到所有 drive 的封面 worker 和 teaser worker
|
||||
// 队列都为空且无 in-flight 任务。
|
||||
//
|
||||
// 顺序:先等所有 thumb worker(因为 enqueueDriveGeneration 内部已经先等当前
|
||||
// drive 的封面再入队 teaser,但这里是跨 drive 的全局同步),再等所有 teaser。
|
||||
// 若 ctx 在等待中被取消(软超时 / shutdown),立即返回 ctx.Err。
|
||||
func (a *App) waitAllPreviewQueuesIdle(ctx context.Context) error {
|
||||
a.mu.Lock()
|
||||
thumbWorkers := make([]*preview.ThumbWorker, 0, len(a.thumbWorkers))
|
||||
previewWorkers := make([]*preview.Worker, 0, len(a.workers))
|
||||
for _, w := range a.thumbWorkers {
|
||||
thumbWorkers = append(thumbWorkers, w)
|
||||
}
|
||||
for _, w := range a.workers {
|
||||
previewWorkers = append(previewWorkers, w)
|
||||
}
|
||||
a.mu.Unlock()
|
||||
|
||||
for _, w := range thumbWorkers {
|
||||
if err := w.WaitIdle(ctx); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
for _, w := range previewWorkers {
|
||||
if err := w.WaitIdle(ctx); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func shouldScanDrive(d drives.Drive) bool {
|
||||
@@ -1124,78 +1151,13 @@ func shouldScanDrive(d drives.Drive) bool {
|
||||
return true
|
||||
}
|
||||
|
||||
// ---------- spider91 crawler loop ----------
|
||||
|
||||
const (
|
||||
// spider91DefaultCrawlHour 默认在凌晨 00 点的窗口内触发;可被每个 drive 的
|
||||
// credentials.crawl_hour 覆盖。
|
||||
spider91DefaultCrawlHour = 0
|
||||
// spider91MinIntervalHours 两次自动爬取之间的最小间隔,避免一天触发多次。
|
||||
spider91MinIntervalHours = 12
|
||||
)
|
||||
|
||||
// crawlerLoop 每分钟轮询一次,命中触发窗口的 spider91 drive 就启动一次爬取。
|
||||
// 每个 drive 单独跟踪 last_crawl_at(写在 credentials 里)。
|
||||
func (a *App) crawlerLoop(ctx context.Context) {
|
||||
ticker := time.NewTicker(time.Minute)
|
||||
defer ticker.Stop()
|
||||
// 启动后立刻检查一次(可能错过了今天的窗口)
|
||||
a.tickSpider91(ctx, time.Now())
|
||||
for {
|
||||
select {
|
||||
case <-ctx.Done():
|
||||
return
|
||||
case now := <-ticker.C:
|
||||
a.tickSpider91(ctx, now)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// tickSpider91 检查所有 spider91 drive,决定哪些该触发。
|
||||
func (a *App) tickSpider91(ctx context.Context, now time.Time) {
|
||||
a.mu.Lock()
|
||||
crawlerIDs := make([]string, 0, len(a.spider91Crawlers))
|
||||
for id := range a.spider91Crawlers {
|
||||
crawlerIDs = append(crawlerIDs, id)
|
||||
}
|
||||
a.mu.Unlock()
|
||||
|
||||
for _, id := range crawlerIDs {
|
||||
d, err := a.cat.GetDrive(ctx, id)
|
||||
if err != nil || d == nil {
|
||||
continue
|
||||
}
|
||||
if !spider91DueAt(now, d) {
|
||||
continue
|
||||
}
|
||||
go a.runSpider91Crawl(ctx, id)
|
||||
}
|
||||
}
|
||||
|
||||
// spider91DueAt 判断 drive 是否应该在 now 触发自动爬取。
|
||||
// - 当前小时必须等于 drive.crawl_hour(默认 0)
|
||||
// - 距离上次 last_crawl_at 至少 spider91MinIntervalHours 小时
|
||||
func spider91DueAt(now time.Time, d *catalog.Drive) bool {
|
||||
if d == nil {
|
||||
return false
|
||||
}
|
||||
hour := spider91IntCred(d, "crawl_hour", spider91DefaultCrawlHour)
|
||||
if hour < 0 || hour > 23 {
|
||||
hour = spider91DefaultCrawlHour
|
||||
}
|
||||
if now.Hour() != hour {
|
||||
return false
|
||||
}
|
||||
last := spider91IntCred(d, "last_crawl_at", 0)
|
||||
if last <= 0 {
|
||||
return true
|
||||
}
|
||||
delta := now.Sub(time.Unix(int64(last), 0))
|
||||
return delta >= time.Duration(spider91MinIntervalHours)*time.Hour
|
||||
}
|
||||
// ---------- spider91 crawl ----------
|
||||
|
||||
// runSpider91Crawl 运行一次完整爬取流程并把 last_crawl_at 写回 drive.credentials。
|
||||
// 即使爬取失败也会更新 last_crawl_at,避免一直在错误循环里反复触发;下次窗口仍会重试。
|
||||
//
|
||||
// 即使爬取失败也会更新 last_crawl_at,避免一直在错误循环里反复触发;下一次 nightly
|
||||
// 流水线重跑时仍会重试。该方法是阻塞的,被 nightly Phase 2 串行调用,以及被
|
||||
// admin "立即抓取" 单 drive 异步调用。
|
||||
func (a *App) runSpider91Crawl(ctx context.Context, driveID string) {
|
||||
a.mu.Lock()
|
||||
c := a.spider91Crawlers[driveID]
|
||||
@@ -1223,7 +1185,8 @@ func (a *App) runSpider91Crawl(ctx context.Context, driveID string) {
|
||||
driveID, res.TargetNew, res.TotalEntries, res.NewVideos, res.Skipped, res.Failed, res.SeenSnapshot)
|
||||
}
|
||||
|
||||
// 标记最后一次爬取时间
|
||||
// 标记最后一次爬取时间。这字段已不再用于调度判定(nightly 流水线统一调度),
|
||||
// 留着仅作为 admin UI 显示"上次抓取 N 小时前"用。
|
||||
if d.Credentials == nil {
|
||||
d.Credentials = make(map[string]string)
|
||||
}
|
||||
@@ -1238,11 +1201,6 @@ func (a *App) runSpider91Crawl(ctx context.Context, driveID string) {
|
||||
if err := a.cat.UpsertDrive(ctx, d); err != nil {
|
||||
log.Printf("[spider91] drive=%s update last_crawl_at: %v", driveID, err)
|
||||
}
|
||||
|
||||
// 爬完立刻 ping 一次迁移 worker,不等下一个周期。
|
||||
if a.spider91Migrator != nil {
|
||||
a.spider91Migrator.Trigger()
|
||||
}
|
||||
}
|
||||
|
||||
// spider91IntCred 解析 credentials 中的整数字段,缺省时返回 def。
|
||||
|
||||
@@ -1,85 +1,11 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"strconv"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/video-site/backend/internal/catalog"
|
||||
)
|
||||
|
||||
func TestSpider91DueAtTriggerWindow(t *testing.T) {
|
||||
d := &catalog.Drive{
|
||||
ID: "x",
|
||||
Credentials: map[string]string{"crawl_hour": "0"},
|
||||
}
|
||||
// 凌晨 0 点 → 触发
|
||||
now := time.Date(2026, 5, 22, 0, 30, 0, 0, time.Local)
|
||||
if !spider91DueAt(now, d) {
|
||||
t.Fatalf("expected due at 0:30 with hour=0")
|
||||
}
|
||||
// 1 点 → 不触发
|
||||
now2 := time.Date(2026, 5, 22, 1, 0, 0, 0, time.Local)
|
||||
if spider91DueAt(now2, d) {
|
||||
t.Fatalf("not expected due at 1:00 with hour=0")
|
||||
}
|
||||
}
|
||||
|
||||
func TestSpider91DueAtCustomHour(t *testing.T) {
|
||||
d := &catalog.Drive{
|
||||
ID: "x",
|
||||
Credentials: map[string]string{"crawl_hour": "3"},
|
||||
}
|
||||
due := time.Date(2026, 5, 22, 3, 5, 0, 0, time.Local)
|
||||
if !spider91DueAt(due, d) {
|
||||
t.Fatalf("expected due at 3:05 with hour=3")
|
||||
}
|
||||
notDue := time.Date(2026, 5, 22, 4, 5, 0, 0, time.Local)
|
||||
if spider91DueAt(notDue, d) {
|
||||
t.Fatalf("not expected due at 4:05 with hour=3")
|
||||
}
|
||||
}
|
||||
|
||||
func TestSpider91DueAtRespectsLastCrawl(t *testing.T) {
|
||||
now := time.Date(2026, 5, 22, 0, 30, 0, 0, time.Local)
|
||||
|
||||
// 上次刚跑过 1 小时 → 不触发
|
||||
d1 := &catalog.Drive{
|
||||
ID: "x",
|
||||
Credentials: map[string]string{
|
||||
"crawl_hour": "0",
|
||||
"last_crawl_at": strconv.FormatInt(now.Add(-1*time.Hour).Unix(), 10),
|
||||
},
|
||||
}
|
||||
if spider91DueAt(now, d1) {
|
||||
t.Fatalf("not expected due 1h after last_crawl_at")
|
||||
}
|
||||
|
||||
// 上次跑了 13 小时前(>=12h)→ 触发
|
||||
d2 := &catalog.Drive{
|
||||
ID: "x",
|
||||
Credentials: map[string]string{
|
||||
"crawl_hour": "0",
|
||||
"last_crawl_at": strconv.FormatInt(now.Add(-13*time.Hour).Unix(), 10),
|
||||
},
|
||||
}
|
||||
if !spider91DueAt(now, d2) {
|
||||
t.Fatalf("expected due 13h after last_crawl_at")
|
||||
}
|
||||
}
|
||||
|
||||
func TestSpider91DueAtFirstRun(t *testing.T) {
|
||||
d := &catalog.Drive{
|
||||
ID: "x",
|
||||
Credentials: map[string]string{"crawl_hour": "0"},
|
||||
}
|
||||
now := time.Date(2026, 5, 22, 0, 30, 0, 0, time.Local)
|
||||
// 没有 last_crawl_at → 视为首次运行,命中窗口就触发
|
||||
if !spider91DueAt(now, d) {
|
||||
t.Fatalf("expected due on first run within window")
|
||||
}
|
||||
}
|
||||
|
||||
func TestSpider91IntCredFallbacks(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
|
||||
@@ -375,46 +375,6 @@ func TestEnqueueUploadedVideoQueuesLocalPreviewWorker(t *testing.T) {
|
||||
t.Fatalf("preview status = %q, want ready", got.PreviewStatus)
|
||||
}
|
||||
|
||||
func TestScheduledScanWindowAllowsOnlyEarlyMorning(t *testing.T) {
|
||||
loc := time.FixedZone("CST", 8*60*60)
|
||||
cases := []struct {
|
||||
name string
|
||||
now time.Time
|
||||
want bool
|
||||
}{
|
||||
{name: "before window", now: time.Date(2026, 5, 12, 1, 59, 0, 0, loc), want: false},
|
||||
{name: "at start", now: time.Date(2026, 5, 12, 2, 0, 0, 0, loc), want: true},
|
||||
{name: "inside window", now: time.Date(2026, 5, 12, 6, 59, 0, 0, loc), want: true},
|
||||
{name: "at end", now: time.Date(2026, 5, 12, 7, 0, 0, 0, loc), want: false},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
if got := scheduledScanAllowed(tc.now); got != tc.want {
|
||||
t.Fatalf("scheduledScanAllowed(%s) = %v, want %v", tc.now.Format(time.RFC3339), got, tc.want)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestScheduledScanDueRespectsWindowAndInterval(t *testing.T) {
|
||||
loc := time.FixedZone("CST", 8*60*60)
|
||||
interval := 2 * time.Hour
|
||||
inside := time.Date(2026, 5, 12, 2, 0, 0, 0, loc)
|
||||
|
||||
if scheduledScanDue(time.Date(2026, 5, 12, 1, 59, 0, 0, loc), time.Time{}, interval) {
|
||||
t.Fatal("scheduled scan due outside window, want false")
|
||||
}
|
||||
if !scheduledScanDue(inside, time.Time{}, interval) {
|
||||
t.Fatal("first scheduled scan inside window = false, want true")
|
||||
}
|
||||
if scheduledScanDue(inside.Add(time.Hour), inside, interval) {
|
||||
t.Fatal("scheduled scan due before interval elapsed, want false")
|
||||
}
|
||||
if !scheduledScanDue(inside.Add(2*time.Hour), inside, interval) {
|
||||
t.Fatal("scheduled scan due after interval elapsed, want true")
|
||||
}
|
||||
}
|
||||
|
||||
func TestShouldScanDriveSkipsLocalUpload(t *testing.T) {
|
||||
if shouldScanDrive(&serverLocalUploadFakeDrive{}) {
|
||||
t.Fatal("local upload drive should not be scanned")
|
||||
|
||||
@@ -25,13 +25,25 @@ storage:
|
||||
local_preview_dir: "./data/previews"
|
||||
|
||||
scanner:
|
||||
# 自动扫盘最小间隔(秒);只在每天 02:00-07:00 触发,0 表示仅允许管理员手动重扫
|
||||
interval_seconds: 21600
|
||||
# 已废弃:旧版的"每天 02:00-07:00 按 IntervalSeconds 间隔重复扫盘"已被移除。
|
||||
# 现在所有定时任务统一由 nightly 块控制(每天 01:00 跑一条完整流水线)。
|
||||
# 字段保留仅为兼容旧 yaml,运行时被忽略。
|
||||
interval_seconds: 0
|
||||
# 单次扫描每家网盘目录递归层数上限
|
||||
max_depth: 5
|
||||
# 被扫描的扩展名
|
||||
video_extensions: [".mp4", ".mkv", ".mov", ".webm", ".avi"]
|
||||
|
||||
nightly:
|
||||
# 凌晨流水线触发整点(0-23),默认 1 即每天 01:00。流程:
|
||||
# Phase 1 扫所有非 spider91 / 非 localupload 网盘 → 检测新增 / 删除
|
||||
# → 入队封面和 teaser → 等所有队列 idle
|
||||
# Phase 2 spider91 爬虫(如配置)→ 入队 teaser → 等队列 idle
|
||||
# Phase 3 spider91 → 云盘迁移(一次性 sweep)
|
||||
cron_hour: 1
|
||||
# 单次流水线总耗时上限(软超时);超过后当前 phase 跑完不启动后续 phase。
|
||||
max_duration: 6h
|
||||
|
||||
preview:
|
||||
# 是否启用 ffmpeg 抽帧生成 teaser
|
||||
enabled: true
|
||||
|
||||
@@ -36,6 +36,10 @@ type AdminServer struct {
|
||||
// Spider91 → PikPak 上传目标 drive ID 读写
|
||||
GetSpider91UploadDriveID func() string
|
||||
SetSpider91UploadDriveID func(driveID string) error
|
||||
// OnRunNightlyJob 触发一次完整的凌晨流水线(Phase1 扫盘 + Phase2 91 爬虫 +
|
||||
// Phase3 迁移)。立即返回 —— 实际任务在后台跑,admin 在日志或下次状态查询里
|
||||
// 看进度。重复点击会被 Runner.TryLock 丢弃。
|
||||
OnRunNightlyJob func()
|
||||
}
|
||||
|
||||
type GenerationStatus struct {
|
||||
@@ -83,6 +87,9 @@ func (a *AdminServer) Register(r chi.Router) {
|
||||
// 运行时设置
|
||||
r.Get("/settings", a.handleGetSettings)
|
||||
r.Put("/settings", a.handlePutSettings)
|
||||
|
||||
// 运维任务
|
||||
r.Post("/jobs/nightly/run", a.handleRunNightlyJob)
|
||||
})
|
||||
})
|
||||
}
|
||||
@@ -309,6 +316,16 @@ func (a *AdminServer) handleRescan(w http.ResponseWriter, r *http.Request) {
|
||||
writeJSON(w, http.StatusAccepted, map[string]any{"ok": true})
|
||||
}
|
||||
|
||||
// handleRunNightlyJob 触发一次完整的凌晨流水线(不论当前时间,不论今日是否已跑)。
|
||||
// 立即返回 202;进度通过 backend 日志和下次 GET /admin/api/drives 的状态变化观察。
|
||||
// 流水线已在跑时 Runner 会丢弃此次触发并记日志。
|
||||
func (a *AdminServer) handleRunNightlyJob(w http.ResponseWriter, r *http.Request) {
|
||||
if a.OnRunNightlyJob != nil {
|
||||
a.OnRunNightlyJob()
|
||||
}
|
||||
writeJSON(w, http.StatusAccepted, map[string]any{"ok": true})
|
||||
}
|
||||
|
||||
// teaserEnabledReq 是 POST /admin/api/drives/{id}/teaser-enabled 的入参。
|
||||
type teaserEnabledReq struct {
|
||||
Enabled bool `json:"enabled"`
|
||||
|
||||
@@ -4,6 +4,7 @@ import (
|
||||
"fmt"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"time"
|
||||
|
||||
"gopkg.in/yaml.v3"
|
||||
)
|
||||
@@ -13,6 +14,7 @@ type Config struct {
|
||||
Storage Storage `yaml:"storage"`
|
||||
Scanner Scanner `yaml:"scanner"`
|
||||
Preview Preview `yaml:"preview"`
|
||||
Nightly Nightly `yaml:"nightly"`
|
||||
Drives []Drive `yaml:"drives"`
|
||||
}
|
||||
|
||||
@@ -38,6 +40,8 @@ type Storage struct {
|
||||
}
|
||||
|
||||
type Scanner struct {
|
||||
// IntervalSeconds 已废弃。旧版每天 02:00–07:00 窗口内按这个间隔重复扫盘;
|
||||
// 新版统一由 nightly.cron_hour 调度,此字段被忽略,保留仅为兼容旧 yaml。
|
||||
IntervalSeconds int `yaml:"interval_seconds"`
|
||||
MaxDepth int `yaml:"max_depth"`
|
||||
VideoExtensions []string `yaml:"video_extensions"`
|
||||
@@ -52,6 +56,18 @@ type Preview struct {
|
||||
Segments int `yaml:"segments"`
|
||||
}
|
||||
|
||||
// Nightly 是凌晨流水线(扫盘 → 91 爬虫 → 迁移)的调度配置。
|
||||
//
|
||||
// 一个进程只跑一条 nightly 流水线;该 cron 时间到达且当天还没跑过时触发,
|
||||
// 也可被管理后台「立即跑全流程」按钮手动触发。MaxDuration 是软超时,超过
|
||||
// 后当前 phase 完成、后续 phase 不再启动。
|
||||
type Nightly struct {
|
||||
// CronHour 是每日触发整点(0–23);默认 1 表示 01:00。
|
||||
CronHour int `yaml:"cron_hour"`
|
||||
// MaxDuration 是单次流水线总耗时上限;默认 6h。
|
||||
MaxDuration time.Duration `yaml:"max_duration"`
|
||||
}
|
||||
|
||||
// Drive 配置项中的敏感字段(Cookie / RefreshToken 等)最终由管理后台写入 DB
|
||||
// 这里保留 yaml 中的静态定义,用于启动时预置盘。生产建议只在 DB 里维护。
|
||||
type Drive struct {
|
||||
@@ -118,4 +134,15 @@ func (c *Config) applyDefaults() {
|
||||
if c.Preview.Segments == 0 {
|
||||
c.Preview.Segments = 3
|
||||
}
|
||||
// Nightly defaults。CronHour=0 是合法值(午夜),没法用 zero-value 单独
|
||||
// 区分"未设"和"显式 0"。把整个 nightly 块当 sentinel —— MaxDuration==0
|
||||
// 视为整个块缺失,重置成 (cron_hour=1, max_duration=6h)。代价:用户想配
|
||||
// CronHour=0(午夜)必须同时显式写 max_duration(任何 >0 的值即可)。
|
||||
// 收益:默认部署(yaml 没 nightly 块)得到 01:00 + 6h,与用户预期一致。
|
||||
if c.Nightly.MaxDuration <= 0 {
|
||||
c.Nightly.CronHour = 1
|
||||
c.Nightly.MaxDuration = 6 * time.Hour
|
||||
} else if c.Nightly.CronHour < 0 || c.Nightly.CronHour > 23 {
|
||||
c.Nightly.CronHour = 1
|
||||
}
|
||||
}
|
||||
|
||||
@@ -0,0 +1,312 @@
|
||||
// Package nightly orchestrates the single nightly maintenance pipeline that
|
||||
// replaces the legacy scanLoop / crawlerLoop / spider91 migrator periodic loop.
|
||||
//
|
||||
// Pipeline (fired once per day at cron_hour, also via TriggerNow for admin
|
||||
// "立即跑全流程"):
|
||||
//
|
||||
// Phase 1: for each non-spider91 cloud drive
|
||||
// scan + delete-detection + enqueue thumb + enqueue teaser
|
||||
// wait until all thumb / teaser queues are idle
|
||||
// Phase 2: if any spider91 drive configured
|
||||
// crawl + enqueue teaser for new videos
|
||||
// wait until teaser queues are idle
|
||||
// Phase 3: spider91 → cloud migration (single sweep, captcha cooldown still
|
||||
// honored within this call)
|
||||
//
|
||||
// A 6h soft deadline guards each pipeline run; phases check deadline at their
|
||||
// boundaries and exit cleanly if exceeded (no in-flight ffmpeg / upload is
|
||||
// killed mid-task).
|
||||
//
|
||||
// State persistence: the date string of the most recent successfully started
|
||||
// run is stored in catalog.settings under the key "nightly.last_run_date".
|
||||
// This survives restarts so a quick crash inside cron_hour won't trigger a
|
||||
// duplicate pipeline.
|
||||
package nightly
|
||||
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"log"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
const (
|
||||
// settingLastRunDate stores the YYYY-MM-DD of the last natural cron-triggered
|
||||
// pipeline run. Manual TriggerNow() also updates this to keep behavior consistent.
|
||||
settingLastRunDate = "nightly.last_run_date"
|
||||
// dateLayout matches catalog.GetSetting string semantics; using ISO-8601 date.
|
||||
dateLayout = "2006-01-02"
|
||||
// pollInterval is the heartbeat for the natural cron decision loop.
|
||||
pollInterval = time.Minute
|
||||
// minSafeMaxDuration prevents misconfiguration: anything below this would
|
||||
// almost guarantee Phase 2/3 are skipped.
|
||||
minSafeMaxDuration = 5 * time.Minute
|
||||
)
|
||||
|
||||
// SettingStore is the minimal catalog.Catalog surface we rely on.
|
||||
type SettingStore interface {
|
||||
GetSetting(ctx context.Context, key, defaultValue string) (string, error)
|
||||
SetSetting(ctx context.Context, key, value string) error
|
||||
}
|
||||
|
||||
// Config wires the runner to its dependencies. The function-callback shape
|
||||
// avoids importing main / drives / preview from this package, keeping the
|
||||
// dependency graph clean.
|
||||
type Config struct {
|
||||
Settings SettingStore
|
||||
CronHour int // 0-23; default 1 (01:00)
|
||||
MaxDuration time.Duration // soft deadline for one pipeline run; default 6h
|
||||
|
||||
// ListScanTargets returns the drive IDs to run Phase 1 on, in deterministic
|
||||
// order. Should exclude spider91 and localupload drives.
|
||||
ListScanTargets func(ctx context.Context) []string
|
||||
|
||||
// RunScan synchronously runs scan + cleanup + enqueueDriveGeneration for
|
||||
// one drive. Errors are expected to be logged inside, not surfaced.
|
||||
RunScan func(ctx context.Context, driveID string)
|
||||
|
||||
// ListSpider91Drives returns spider91 drive IDs to crawl in Phase 2.
|
||||
// Returns empty slice when no spider91 drive is configured.
|
||||
ListSpider91Drives func(ctx context.Context) []string
|
||||
|
||||
// RunSpider91Crawl synchronously runs one crawl cycle (downloads + thumbs +
|
||||
// teaser enqueue) for a single spider91 drive.
|
||||
RunSpider91Crawl func(ctx context.Context, driveID string)
|
||||
|
||||
// WaitPreviewQueuesIdle blocks until both the thumbnail and teaser queues
|
||||
// across all drives are drained (queue empty + no in-flight task). It must
|
||||
// honor ctx cancellation.
|
||||
WaitPreviewQueuesIdle func(ctx context.Context) error
|
||||
|
||||
// RunMigration runs spider91migrate.Migrator.RunOnce for Phase 3.
|
||||
RunMigration func(ctx context.Context) error
|
||||
|
||||
// Now is injected for tests; nil → time.Now.
|
||||
Now func() time.Time
|
||||
}
|
||||
|
||||
// Runner drives the nightly pipeline.
|
||||
type Runner struct {
|
||||
cfg Config
|
||||
trigger chan struct{} // buffered(1); manual "run now"
|
||||
runMu sync.Mutex // prevents overlapping pipeline runs
|
||||
}
|
||||
|
||||
// New constructs a Runner. cfg is shallow-copied; defaults are applied.
|
||||
func New(cfg Config) *Runner {
|
||||
if cfg.CronHour < 0 || cfg.CronHour > 23 {
|
||||
cfg.CronHour = 1
|
||||
}
|
||||
if cfg.MaxDuration <= 0 {
|
||||
cfg.MaxDuration = 6 * time.Hour
|
||||
}
|
||||
if cfg.MaxDuration < minSafeMaxDuration {
|
||||
cfg.MaxDuration = minSafeMaxDuration
|
||||
}
|
||||
if cfg.Now == nil {
|
||||
cfg.Now = time.Now
|
||||
}
|
||||
return &Runner{
|
||||
cfg: cfg,
|
||||
trigger: make(chan struct{}, 1),
|
||||
}
|
||||
}
|
||||
|
||||
// Run is a blocking loop until ctx is done. It wakes up once per minute and
|
||||
// either fires the natural cron-driven pipeline (when cron_hour matches and
|
||||
// today hasn't run) or honors a manual TriggerNow() request.
|
||||
func (r *Runner) Run(ctx context.Context) {
|
||||
t := time.NewTicker(pollInterval)
|
||||
defer t.Stop()
|
||||
log.Printf("[nightly] runner started; cron_hour=%d max_duration=%s", r.cfg.CronHour, r.cfg.MaxDuration)
|
||||
for {
|
||||
select {
|
||||
case <-ctx.Done():
|
||||
log.Printf("[nightly] runner stopping: %v", ctx.Err())
|
||||
return
|
||||
case <-t.C:
|
||||
r.tryNaturalRun(ctx)
|
||||
case <-r.trigger:
|
||||
log.Printf("[nightly] manual trigger received")
|
||||
r.runPipelineLocked(ctx, true)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// TriggerNow asks the running loop to fire a pipeline ASAP. If a pipeline is
|
||||
// already in progress (or another trigger is already pending), the request
|
||||
// is dropped — the in-progress run will absorb the intent.
|
||||
func (r *Runner) TriggerNow() {
|
||||
select {
|
||||
case r.trigger <- struct{}{}:
|
||||
default:
|
||||
}
|
||||
}
|
||||
|
||||
// tryNaturalRun checks the cron decision and runs the pipeline if due today.
|
||||
func (r *Runner) tryNaturalRun(ctx context.Context) {
|
||||
now := r.cfg.Now()
|
||||
if now.Hour() != r.cfg.CronHour {
|
||||
return
|
||||
}
|
||||
last, err := r.readLastRunDate(ctx)
|
||||
if err != nil {
|
||||
log.Printf("[nightly] read last_run_date: %v", err)
|
||||
return
|
||||
}
|
||||
if !shouldRun(now, last) {
|
||||
return
|
||||
}
|
||||
log.Printf("[nightly] natural cron trigger at %s", now.Format(time.RFC3339))
|
||||
r.runPipelineLocked(ctx, false)
|
||||
}
|
||||
|
||||
// shouldRun returns true when "today" (per now) hasn't already been processed.
|
||||
func shouldRun(now time.Time, lastRunDate string) bool {
|
||||
return lastRunDate != now.Format(dateLayout)
|
||||
}
|
||||
|
||||
// runPipelineLocked guards against overlapping runs. If another pipeline is
|
||||
// in progress, the call returns immediately (logged once). After completion
|
||||
// (regardless of success), today's date is recorded so subsequent triggers
|
||||
// the same calendar day are skipped.
|
||||
func (r *Runner) runPipelineLocked(ctx context.Context, manual bool) {
|
||||
if !r.runMu.TryLock() {
|
||||
log.Printf("[nightly] another pipeline is already running, skipping this trigger")
|
||||
return
|
||||
}
|
||||
defer r.runMu.Unlock()
|
||||
|
||||
started := r.cfg.Now()
|
||||
deadline := started.Add(r.cfg.MaxDuration)
|
||||
runCtx, cancel := context.WithDeadline(ctx, deadline)
|
||||
defer cancel()
|
||||
|
||||
mode := "scheduled"
|
||||
if manual {
|
||||
mode = "manual"
|
||||
}
|
||||
log.Printf("[nightly] pipeline (%s) start; soft deadline=%s", mode, deadline.Format(time.RFC3339))
|
||||
|
||||
r.runPipeline(runCtx)
|
||||
|
||||
finished := r.cfg.Now()
|
||||
log.Printf("[nightly] pipeline (%s) finish; took=%s", mode, finished.Sub(started).Round(time.Second))
|
||||
|
||||
// Mark today as processed regardless of success/error. This is intentional:
|
||||
// a partial / failing pipeline shouldn't trigger again the same day, the
|
||||
// admin can inspect logs and click "立即跑全流程" to retry explicitly.
|
||||
dateStr := started.Format(dateLayout)
|
||||
if err := r.cfg.Settings.SetSetting(ctx, settingLastRunDate, dateStr); err != nil {
|
||||
log.Printf("[nightly] persist last_run_date: %v", err)
|
||||
}
|
||||
}
|
||||
|
||||
// runPipeline executes the three phases. It returns when the pipeline finishes
|
||||
// OR ctx is done (deadline / cancel). Errors are logged but not propagated —
|
||||
// each phase is best-effort; downstream phases still attempt to run unless ctx
|
||||
// is dead.
|
||||
func (r *Runner) runPipeline(ctx context.Context) {
|
||||
// ---------- Phase 1 ----------
|
||||
if r.checkDeadline(ctx, "phase 1") {
|
||||
return
|
||||
}
|
||||
scanIDs := []string{}
|
||||
if r.cfg.ListScanTargets != nil {
|
||||
scanIDs = r.cfg.ListScanTargets(ctx)
|
||||
}
|
||||
if len(scanIDs) == 0 {
|
||||
log.Printf("[nightly] phase 1 skipped: no cloud drives to scan")
|
||||
} else {
|
||||
log.Printf("[nightly] phase 1: scanning %d drive(s)", len(scanIDs))
|
||||
for _, id := range scanIDs {
|
||||
if ctx.Err() != nil {
|
||||
log.Printf("[nightly] phase 1 aborted by ctx: %v", ctx.Err())
|
||||
return
|
||||
}
|
||||
log.Printf("[nightly] phase 1: scanning drive=%s", id)
|
||||
r.cfg.RunScan(ctx, id)
|
||||
}
|
||||
log.Printf("[nightly] phase 1: waiting for preview queues to drain")
|
||||
if err := r.waitIdle(ctx, "phase 1"); err != nil {
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
// ---------- Phase 2 ----------
|
||||
if r.checkDeadline(ctx, "phase 2") {
|
||||
return
|
||||
}
|
||||
spiderIDs := []string{}
|
||||
if r.cfg.ListSpider91Drives != nil {
|
||||
spiderIDs = r.cfg.ListSpider91Drives(ctx)
|
||||
}
|
||||
if len(spiderIDs) == 0 {
|
||||
log.Printf("[nightly] phase 2/3 skipped: no spider91 drive configured")
|
||||
return
|
||||
}
|
||||
log.Printf("[nightly] phase 2: crawling %d spider91 drive(s)", len(spiderIDs))
|
||||
for _, id := range spiderIDs {
|
||||
if ctx.Err() != nil {
|
||||
log.Printf("[nightly] phase 2 aborted by ctx: %v", ctx.Err())
|
||||
return
|
||||
}
|
||||
log.Printf("[nightly] phase 2: crawling drive=%s", id)
|
||||
r.cfg.RunSpider91Crawl(ctx, id)
|
||||
}
|
||||
log.Printf("[nightly] phase 2: waiting for teaser queue to drain")
|
||||
if err := r.waitIdle(ctx, "phase 2"); err != nil {
|
||||
return
|
||||
}
|
||||
|
||||
// ---------- Phase 3 ----------
|
||||
if r.checkDeadline(ctx, "phase 3") {
|
||||
return
|
||||
}
|
||||
log.Printf("[nightly] phase 3: spider91 migration")
|
||||
if r.cfg.RunMigration != nil {
|
||||
if err := r.cfg.RunMigration(ctx); err != nil {
|
||||
log.Printf("[nightly] phase 3 migration: %v", err)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// checkDeadline returns true when ctx is already done (i.e., the soft deadline
|
||||
// has been reached or the runner is shutting down) and the caller should bail.
|
||||
func (r *Runner) checkDeadline(ctx context.Context, phase string) bool {
|
||||
if err := ctx.Err(); err != nil {
|
||||
switch {
|
||||
case errors.Is(err, context.DeadlineExceeded):
|
||||
log.Printf("[nightly] %s: soft deadline reached, bailing out", phase)
|
||||
default:
|
||||
log.Printf("[nightly] %s: ctx done (%v), bailing out", phase, err)
|
||||
}
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// waitIdle calls the configured WaitPreviewQueuesIdle, logging the outcome.
|
||||
func (r *Runner) waitIdle(ctx context.Context, phase string) error {
|
||||
if r.cfg.WaitPreviewQueuesIdle == nil {
|
||||
return nil
|
||||
}
|
||||
if err := r.cfg.WaitPreviewQueuesIdle(ctx); err != nil {
|
||||
if errors.Is(err, context.DeadlineExceeded) {
|
||||
log.Printf("[nightly] %s: soft deadline reached while waiting for preview queues", phase)
|
||||
} else {
|
||||
log.Printf("[nightly] %s: wait preview queues: %v", phase, err)
|
||||
}
|
||||
return err
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// readLastRunDate reads the persisted last_run_date or returns "" when unset.
|
||||
func (r *Runner) readLastRunDate(ctx context.Context) (string, error) {
|
||||
if r.cfg.Settings == nil {
|
||||
return "", nil
|
||||
}
|
||||
return r.cfg.Settings.GetSetting(ctx, settingLastRunDate, "")
|
||||
}
|
||||
@@ -0,0 +1,323 @@
|
||||
package nightly
|
||||
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"sync"
|
||||
"sync/atomic"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
// stubSettings is an in-memory SettingStore for tests.
|
||||
type stubSettings struct {
|
||||
mu sync.Mutex
|
||||
kv map[string]string
|
||||
}
|
||||
|
||||
func newStubSettings() *stubSettings { return &stubSettings{kv: make(map[string]string)} }
|
||||
|
||||
func (s *stubSettings) GetSetting(_ context.Context, key, def string) (string, error) {
|
||||
s.mu.Lock()
|
||||
defer s.mu.Unlock()
|
||||
if v, ok := s.kv[key]; ok {
|
||||
return v, nil
|
||||
}
|
||||
return def, nil
|
||||
}
|
||||
|
||||
func (s *stubSettings) SetSetting(_ context.Context, key, value string) error {
|
||||
s.mu.Lock()
|
||||
defer s.mu.Unlock()
|
||||
s.kv[key] = value
|
||||
return nil
|
||||
}
|
||||
|
||||
func TestShouldRunChecksDate(t *testing.T) {
|
||||
now := time.Date(2026, 5, 27, 1, 30, 0, 0, time.UTC)
|
||||
if !shouldRun(now, "") {
|
||||
t.Fatal("first ever run with empty last_run_date should be due")
|
||||
}
|
||||
if !shouldRun(now, "2026-05-26") {
|
||||
t.Fatal("yesterday's run should not block today")
|
||||
}
|
||||
if shouldRun(now, "2026-05-27") {
|
||||
t.Fatal("today's run already recorded should block another natural run")
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewAppliesDefaults(t *testing.T) {
|
||||
r := New(Config{Settings: newStubSettings()})
|
||||
// CronHour=0 is a legitimate hour (midnight); New() only clamps out-of-range
|
||||
// values. The "default to 01:00" responsibility lives in the config layer.
|
||||
if r.cfg.CronHour != 0 {
|
||||
t.Errorf("CronHour zero-value should be preserved, got %d", r.cfg.CronHour)
|
||||
}
|
||||
if r.cfg.MaxDuration != 6*time.Hour {
|
||||
t.Errorf("MaxDuration default = %s, want 6h", r.cfg.MaxDuration)
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewRejectsInvalidCronHour(t *testing.T) {
|
||||
r := New(Config{CronHour: -1, Settings: newStubSettings()})
|
||||
if r.cfg.CronHour != 1 {
|
||||
t.Fatalf("invalid cron_hour fall back to 1, got %d", r.cfg.CronHour)
|
||||
}
|
||||
r2 := New(Config{CronHour: 25, Settings: newStubSettings()})
|
||||
if r2.cfg.CronHour != 1 {
|
||||
t.Fatalf("out-of-range cron_hour fall back to 1, got %d", r2.cfg.CronHour)
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewRaisesUnsafeMaxDuration(t *testing.T) {
|
||||
r := New(Config{MaxDuration: time.Second, Settings: newStubSettings()})
|
||||
if r.cfg.MaxDuration != minSafeMaxDuration {
|
||||
t.Fatalf("MaxDuration too small clamps to %s, got %s", minSafeMaxDuration, r.cfg.MaxDuration)
|
||||
}
|
||||
}
|
||||
|
||||
// recorder accumulates the order of phase invocations so tests can assert
|
||||
// orchestration semantics.
|
||||
type recorder struct {
|
||||
mu sync.Mutex
|
||||
calls []string
|
||||
}
|
||||
|
||||
func (r *recorder) push(s string) {
|
||||
r.mu.Lock()
|
||||
r.calls = append(r.calls, s)
|
||||
r.mu.Unlock()
|
||||
}
|
||||
|
||||
func (r *recorder) snapshot() []string {
|
||||
r.mu.Lock()
|
||||
defer r.mu.Unlock()
|
||||
out := make([]string, len(r.calls))
|
||||
copy(out, r.calls)
|
||||
return out
|
||||
}
|
||||
|
||||
func TestRunPipelineHonoursPhaseOrder(t *testing.T) {
|
||||
rec := &recorder{}
|
||||
settings := newStubSettings()
|
||||
|
||||
r := New(Config{
|
||||
Settings: settings,
|
||||
ListScanTargets: func(context.Context) []string {
|
||||
rec.push("list-scan")
|
||||
return []string{"drive-a", "drive-b"}
|
||||
},
|
||||
RunScan: func(_ context.Context, id string) {
|
||||
rec.push("scan:" + id)
|
||||
},
|
||||
ListSpider91Drives: func(context.Context) []string {
|
||||
rec.push("list-spider")
|
||||
return []string{"sp-1"}
|
||||
},
|
||||
RunSpider91Crawl: func(_ context.Context, id string) {
|
||||
rec.push("crawl:" + id)
|
||||
},
|
||||
WaitPreviewQueuesIdle: func(context.Context) error {
|
||||
rec.push("wait-idle")
|
||||
return nil
|
||||
},
|
||||
RunMigration: func(context.Context) error {
|
||||
rec.push("migrate")
|
||||
return nil
|
||||
},
|
||||
})
|
||||
|
||||
r.runPipeline(context.Background())
|
||||
|
||||
got := rec.snapshot()
|
||||
want := []string{
|
||||
"list-scan",
|
||||
"scan:drive-a",
|
||||
"scan:drive-b",
|
||||
"wait-idle", // after phase 1
|
||||
"list-spider",
|
||||
"crawl:sp-1",
|
||||
"wait-idle", // after phase 2
|
||||
"migrate",
|
||||
}
|
||||
if len(got) != len(want) {
|
||||
t.Fatalf("call sequence len = %d, want %d; got=%v", len(got), len(want), got)
|
||||
}
|
||||
for i := range want {
|
||||
if got[i] != want[i] {
|
||||
t.Fatalf("call[%d] = %q, want %q (full=%v)", i, got[i], want[i], got)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestRunPipelineSkipsMigrationWhenNoSpider91(t *testing.T) {
|
||||
rec := &recorder{}
|
||||
|
||||
r := New(Config{
|
||||
Settings: newStubSettings(),
|
||||
ListScanTargets: func(context.Context) []string { return []string{"drive-a"} },
|
||||
RunScan: func(_ context.Context, id string) { rec.push("scan:" + id) },
|
||||
ListSpider91Drives: func(context.Context) []string { return nil },
|
||||
RunSpider91Crawl: func(_ context.Context, id string) { rec.push("crawl:" + id) },
|
||||
WaitPreviewQueuesIdle: func(context.Context) error {
|
||||
rec.push("wait-idle")
|
||||
return nil
|
||||
},
|
||||
RunMigration: func(context.Context) error {
|
||||
rec.push("migrate")
|
||||
return nil
|
||||
},
|
||||
})
|
||||
|
||||
r.runPipeline(context.Background())
|
||||
|
||||
for _, c := range rec.snapshot() {
|
||||
if c == "migrate" || c == "crawl:sp-1" {
|
||||
t.Fatalf("phase 2/3 should be skipped when no spider91 drive, got call %q", c)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestRunPipelineExitsWhenContextCancelledMidPhase(t *testing.T) {
|
||||
rec := &recorder{}
|
||||
ctx, cancel := context.WithCancel(context.Background())
|
||||
|
||||
r := New(Config{
|
||||
Settings: newStubSettings(),
|
||||
ListScanTargets: func(context.Context) []string {
|
||||
return []string{"drive-a", "drive-b", "drive-c"}
|
||||
},
|
||||
RunScan: func(_ context.Context, id string) {
|
||||
rec.push("scan:" + id)
|
||||
if id == "drive-a" {
|
||||
cancel()
|
||||
}
|
||||
},
|
||||
ListSpider91Drives: func(context.Context) []string { return []string{"x"} },
|
||||
RunSpider91Crawl: func(context.Context, string) { rec.push("crawl") },
|
||||
WaitPreviewQueuesIdle: func(context.Context) error { rec.push("wait-idle"); return nil },
|
||||
RunMigration: func(context.Context) error { rec.push("migrate"); return nil },
|
||||
})
|
||||
|
||||
r.runPipeline(ctx)
|
||||
|
||||
got := rec.snapshot()
|
||||
for _, c := range got {
|
||||
if c == "scan:drive-c" || c == "scan:drive-b" {
|
||||
t.Fatalf("scan should bail out after cancel, got call %q (full=%v)", c, got)
|
||||
}
|
||||
}
|
||||
for _, c := range got {
|
||||
if c == "crawl" || c == "migrate" {
|
||||
t.Fatalf("subsequent phase should not run after cancel, got call %q", c)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestRunPipelineRecordsLastRunDateAfterCompletion(t *testing.T) {
|
||||
settings := newStubSettings()
|
||||
now := time.Date(2026, 5, 27, 1, 5, 0, 0, time.UTC)
|
||||
r := New(Config{
|
||||
Settings: settings,
|
||||
Now: func() time.Time { return now },
|
||||
ListScanTargets: func(context.Context) []string { return nil },
|
||||
WaitPreviewQueuesIdle: func(context.Context) error { return nil },
|
||||
})
|
||||
|
||||
r.runPipelineLocked(context.Background(), false)
|
||||
|
||||
got, _ := settings.GetSetting(context.Background(), settingLastRunDate, "")
|
||||
if got != "2026-05-27" {
|
||||
t.Fatalf("last_run_date = %q, want 2026-05-27", got)
|
||||
}
|
||||
}
|
||||
|
||||
func TestRunPipelineLockedDropsOverlappingTriggers(t *testing.T) {
|
||||
var (
|
||||
started atomic.Int32
|
||||
releaseFirst = make(chan struct{})
|
||||
)
|
||||
r := New(Config{
|
||||
Settings: newStubSettings(),
|
||||
ListScanTargets: func(context.Context) []string {
|
||||
started.Add(1)
|
||||
<-releaseFirst
|
||||
return nil
|
||||
},
|
||||
WaitPreviewQueuesIdle: func(context.Context) error { return nil },
|
||||
})
|
||||
|
||||
go r.runPipelineLocked(context.Background(), false)
|
||||
|
||||
// Wait for first to start
|
||||
for started.Load() == 0 {
|
||||
time.Sleep(5 * time.Millisecond)
|
||||
}
|
||||
// Second trigger should bail out without invoking ListScanTargets again
|
||||
r.runPipelineLocked(context.Background(), true)
|
||||
if started.Load() != 1 {
|
||||
t.Fatalf("overlapping run should be dropped; started=%d", started.Load())
|
||||
}
|
||||
close(releaseFirst)
|
||||
}
|
||||
|
||||
func TestSoftDeadlinePreventsLaterPhases(t *testing.T) {
|
||||
rec := &recorder{}
|
||||
settings := newStubSettings()
|
||||
|
||||
r := New(Config{
|
||||
Settings: settings,
|
||||
MaxDuration: minSafeMaxDuration, // smallest allowed
|
||||
ListScanTargets: func(context.Context) []string { return nil },
|
||||
WaitPreviewQueuesIdle: func(ctx context.Context) error {
|
||||
// Force the wait to see a deadline-exceeded ctx before we proceed
|
||||
ctxDone := make(chan struct{})
|
||||
go func() {
|
||||
<-ctx.Done()
|
||||
close(ctxDone)
|
||||
}()
|
||||
select {
|
||||
case <-ctxDone:
|
||||
return ctx.Err()
|
||||
case <-time.After(time.Second):
|
||||
return errors.New("test timeout")
|
||||
}
|
||||
},
|
||||
ListSpider91Drives: func(context.Context) []string {
|
||||
rec.push("list-spider")
|
||||
return []string{"x"}
|
||||
},
|
||||
RunSpider91Crawl: func(context.Context, string) { rec.push("crawl") },
|
||||
RunMigration: func(context.Context) error { rec.push("migrate"); return nil },
|
||||
})
|
||||
|
||||
// Build a runCtx whose deadline is already past to simulate timeout reached
|
||||
// at a phase boundary. We can't directly inject, but emulate via canceled.
|
||||
ctx, cancel := context.WithCancel(context.Background())
|
||||
cancel()
|
||||
|
||||
r.runPipeline(ctx)
|
||||
|
||||
for _, c := range rec.snapshot() {
|
||||
if c == "crawl" || c == "migrate" || c == "list-spider" {
|
||||
t.Fatalf("later phase should not run after ctx done; got %q", c)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestTriggerNowIsNonBlocking(t *testing.T) {
|
||||
r := New(Config{Settings: newStubSettings()})
|
||||
// fill the trigger channel
|
||||
r.TriggerNow()
|
||||
// Second call must not block
|
||||
done := make(chan struct{})
|
||||
go func() {
|
||||
r.TriggerNow()
|
||||
close(done)
|
||||
}()
|
||||
select {
|
||||
case <-done:
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatal("TriggerNow blocked when channel is full")
|
||||
}
|
||||
}
|
||||
@@ -1220,6 +1220,48 @@ func (w *ThumbWorker) Status() TaskStatus {
|
||||
return taskStatus(&w.activity, &w.rateLimit, w.queue.lengthExcluding(currentID))
|
||||
}
|
||||
|
||||
// WaitIdle 阻塞直到 worker 队列为空且当前没有正在处理的任务。
|
||||
//
|
||||
// "队列空"的判定基于 videoQueue —— 它在 Enqueue 时 reserve、processQueued
|
||||
// defer 里 release,因此 lengthExcluding("") == 0 同时覆盖:
|
||||
// - channel 中尚未被消费的项
|
||||
// - 当前正在 processQueued 的项(哪怕处于 cooldown 等待中)
|
||||
//
|
||||
// 调用方应通过 ctx 传入超时 / cancel;ctx 结束时返回 ctx.Err()。
|
||||
// 200ms 轮询:开销极低,凌晨流水线对几百毫秒级响应延迟不敏感。
|
||||
func (w *Worker) WaitIdle(ctx context.Context) error {
|
||||
if w == nil {
|
||||
return nil
|
||||
}
|
||||
return waitQueueIdle(ctx, &w.queue)
|
||||
}
|
||||
|
||||
// WaitIdle 见 Worker.WaitIdle 注释。
|
||||
func (w *ThumbWorker) WaitIdle(ctx context.Context) error {
|
||||
if w == nil {
|
||||
return nil
|
||||
}
|
||||
return waitQueueIdle(ctx, &w.queue)
|
||||
}
|
||||
|
||||
func waitQueueIdle(ctx context.Context, q *videoQueue) error {
|
||||
if q.lengthExcluding("") == 0 {
|
||||
return nil
|
||||
}
|
||||
ticker := time.NewTicker(200 * time.Millisecond)
|
||||
defer ticker.Stop()
|
||||
for {
|
||||
select {
|
||||
case <-ctx.Done():
|
||||
return ctx.Err()
|
||||
case <-ticker.C:
|
||||
if q.lengthExcluding("") == 0 {
|
||||
return nil
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func taskStatus(activity *taskActivity, rateLimit *rateLimitState, queueLength int) TaskStatus {
|
||||
if queueLength < 0 {
|
||||
queueLength = 0
|
||||
|
||||
@@ -567,3 +567,81 @@ func (d *previewFakeDrive) EnsureDir(context.Context, string) (string, error) {
|
||||
return "", drives.ErrNotSupported
|
||||
}
|
||||
func (d *previewFakeDrive) RootID() string { return "root" }
|
||||
|
||||
|
||||
func TestWorkerWaitIdleReturnsImmediatelyWhenQueueEmpty(t *testing.T) {
|
||||
worker := NewWorker(&fakeTeaserGenerator{}, nil, &previewFakeDrive{})
|
||||
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
|
||||
defer cancel()
|
||||
|
||||
start := time.Now()
|
||||
if err := worker.WaitIdle(ctx); err != nil {
|
||||
t.Fatalf("WaitIdle on empty queue: %v", err)
|
||||
}
|
||||
if took := time.Since(start); took > 50*time.Millisecond {
|
||||
t.Fatalf("WaitIdle on empty queue took %s, want immediate return", took)
|
||||
}
|
||||
}
|
||||
|
||||
func TestWorkerWaitIdleBlocksUntilQueueDrains(t *testing.T) {
|
||||
worker := NewWorker(&fakeTeaserGenerator{}, nil, &previewFakeDrive{})
|
||||
v := &catalog.Video{ID: "wait-idle-vid"}
|
||||
if !worker.queue.reserve(v) {
|
||||
t.Fatalf("reserve should succeed on fresh queue")
|
||||
}
|
||||
|
||||
go func() {
|
||||
time.Sleep(120 * time.Millisecond)
|
||||
worker.queue.release(v)
|
||||
}()
|
||||
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
|
||||
defer cancel()
|
||||
|
||||
start := time.Now()
|
||||
if err := worker.WaitIdle(ctx); err != nil {
|
||||
t.Fatalf("WaitIdle: %v", err)
|
||||
}
|
||||
took := time.Since(start)
|
||||
if took < 100*time.Millisecond {
|
||||
t.Fatalf("WaitIdle returned in %s, expected to wait until release", took)
|
||||
}
|
||||
if took > time.Second {
|
||||
t.Fatalf("WaitIdle took %s, expected to return shortly after release", took)
|
||||
}
|
||||
}
|
||||
|
||||
func TestWorkerWaitIdleHonoursContextCancel(t *testing.T) {
|
||||
worker := NewWorker(&fakeTeaserGenerator{}, nil, &previewFakeDrive{})
|
||||
v := &catalog.Video{ID: "ctx-cancel"}
|
||||
if !worker.queue.reserve(v) {
|
||||
t.Fatalf("reserve should succeed")
|
||||
}
|
||||
t.Cleanup(func() { worker.queue.release(v) })
|
||||
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
|
||||
defer cancel()
|
||||
|
||||
if err := worker.WaitIdle(ctx); err == nil {
|
||||
t.Fatalf("WaitIdle expected ctx.Err, got nil")
|
||||
}
|
||||
}
|
||||
|
||||
func TestThumbWorkerWaitIdleBlocksUntilQueueDrains(t *testing.T) {
|
||||
worker := NewThumbWorker(&fakeThumbGenerator{}, nil, &previewFakeDrive{})
|
||||
v := &catalog.Video{ID: "thumb-wait-idle"}
|
||||
if !worker.queue.reserve(v) {
|
||||
t.Fatalf("reserve should succeed")
|
||||
}
|
||||
|
||||
go func() {
|
||||
time.Sleep(80 * time.Millisecond)
|
||||
worker.queue.release(v)
|
||||
}()
|
||||
|
||||
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
|
||||
defer cancel()
|
||||
if err := worker.WaitIdle(ctx); err != nil {
|
||||
t.Fatalf("ThumbWorker.WaitIdle: %v", err)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -19,8 +19,15 @@ type Scanner struct {
|
||||
MaxDepth int
|
||||
// 回调:新视频被加入后触发 teaser 生成
|
||||
OnNewVideo func(v *catalog.Video)
|
||||
// ProgressInterval 控制扫描内部 heartbeat 的最小输出间隔。
|
||||
// 0 → 默认 30s;< 0 → 关闭 heartbeat(仅留外层 start / done 两行)。
|
||||
// heartbeat 单行格式:
|
||||
// [scanner] drive=X progress: scanned=N added=K errors=E dirs=M elapsed=Ts at=<dir>
|
||||
ProgressInterval time.Duration
|
||||
}
|
||||
|
||||
const defaultScanProgressInterval = 30 * time.Second
|
||||
|
||||
func New(cat *catalog.Catalog, drv drives.Drive, exts []string, maxDepth int, onNew func(v *catalog.Video)) *Scanner {
|
||||
m := make(map[string]bool, len(exts))
|
||||
for _, e := range exts {
|
||||
@@ -57,13 +64,44 @@ func (s *Scanner) Run(ctx context.Context, startDirID string) (Stats, error) {
|
||||
VisitedDirIDs: make(map[string]struct{}),
|
||||
ExcludedFileIDs: make(map[string]struct{}),
|
||||
}
|
||||
if err := s.walk(ctx, startDirID, "", 0, &stats); err != nil {
|
||||
|
||||
// heartbeat 闭包:进 / 退每个目录、每处理完一个文件后调一下,用一个时间戳节流。
|
||||
// 闭包持有的状态都是单 goroutine 顺序写读,不需要锁。
|
||||
interval := s.ProgressInterval
|
||||
if interval == 0 {
|
||||
interval = defaultScanProgressInterval
|
||||
}
|
||||
started := time.Now()
|
||||
lastBeat := started
|
||||
driveID := ""
|
||||
if s.Drive != nil {
|
||||
driveID = s.Drive.ID()
|
||||
}
|
||||
progress := func(currentDir string) {
|
||||
if interval < 0 {
|
||||
return
|
||||
}
|
||||
now := time.Now()
|
||||
if now.Sub(lastBeat) < interval {
|
||||
return
|
||||
}
|
||||
lastBeat = now
|
||||
shown := currentDir
|
||||
if shown == "" {
|
||||
shown = "(root)"
|
||||
}
|
||||
log.Printf("[scanner] drive=%s progress: scanned=%d added=%d errors=%d dirs=%d elapsed=%s at=%s",
|
||||
driveID, stats.Scanned, stats.Added, stats.Errors, len(stats.VisitedDirIDs),
|
||||
now.Sub(started).Round(time.Second), shown)
|
||||
}
|
||||
|
||||
if err := s.walk(ctx, startDirID, "", 0, &stats, progress); err != nil {
|
||||
return stats, err
|
||||
}
|
||||
return stats, nil
|
||||
}
|
||||
|
||||
func (s *Scanner) walk(ctx context.Context, dirID, dirName string, depth int, stats *Stats) error {
|
||||
func (s *Scanner) walk(ctx context.Context, dirID, dirName string, depth int, stats *Stats, progress func(string)) error {
|
||||
if depth >= s.MaxDepth {
|
||||
return nil
|
||||
}
|
||||
@@ -71,6 +109,7 @@ func (s *Scanner) walk(ctx context.Context, dirID, dirName string, depth int, st
|
||||
return err
|
||||
}
|
||||
stats.VisitedDirIDs[dirID] = struct{}{}
|
||||
progress(dirName) // 心跳:进入新目录前后是天然的节流点
|
||||
|
||||
entries, err := s.Drive.List(ctx, dirID)
|
||||
if err != nil {
|
||||
@@ -90,7 +129,7 @@ func (s *Scanner) walk(ctx context.Context, dirID, dirName string, depth int, st
|
||||
}
|
||||
continue
|
||||
}
|
||||
if err := s.walk(ctx, e.ID, e.Name, depth+1, stats); err != nil {
|
||||
if err := s.walk(ctx, e.ID, e.Name, depth+1, stats, progress); err != nil {
|
||||
stats.Errors++
|
||||
log.Printf("[scanner] walk %s error: %v", e.Name, err)
|
||||
}
|
||||
@@ -185,6 +224,10 @@ func (s *Scanner) walk(ctx context.Context, dirID, dirName string, depth int, st
|
||||
if s.OnNewVideo != nil {
|
||||
s.OnNewVideo(v)
|
||||
}
|
||||
// 兜底:如果某个目录里挤了几千个文件,仅靠"进目录心跳"会很久不响一下;
|
||||
// 在每条文件处理完之后再 ping 一次,progress 内部的 30s 节流会把绝大多数
|
||||
// 调用变成廉价的时间比较。
|
||||
progress(dirName)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
@@ -3,7 +3,10 @@ package scanner
|
||||
import (
|
||||
"context"
|
||||
"database/sql"
|
||||
"fmt"
|
||||
"io"
|
||||
"log"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
@@ -601,3 +604,79 @@ func (d *scannerTreeFakeDrive) EnsureDir(context.Context, string) (string, error
|
||||
return "", drives.ErrNotSupported
|
||||
}
|
||||
func (d *scannerTreeFakeDrive) RootID() string { return "root" }
|
||||
|
||||
|
||||
// captureLog 把 log 包默认输出引到一个 bytes.Buffer,便于断言进度日志被打印;
|
||||
// 测试结束自动恢复。
|
||||
func captureLog(t *testing.T) *strings.Builder {
|
||||
t.Helper()
|
||||
buf := &strings.Builder{}
|
||||
originalOutput := log.Writer()
|
||||
originalFlags := log.Flags()
|
||||
log.SetOutput(buf)
|
||||
log.SetFlags(0)
|
||||
t.Cleanup(func() {
|
||||
log.SetOutput(originalOutput)
|
||||
log.SetFlags(originalFlags)
|
||||
})
|
||||
return buf
|
||||
}
|
||||
|
||||
func TestScannerProgressHeartbeatEmits(t *testing.T) {
|
||||
ctx := context.Background()
|
||||
cat, err := catalog.Open(t.TempDir() + "/catalog.db")
|
||||
if err != nil {
|
||||
t.Fatalf("open catalog: %v", err)
|
||||
}
|
||||
t.Cleanup(func() { cat.Close() })
|
||||
|
||||
// 准备 5 个文件,足够触发循环结尾的 progress() 调用至少一次。
|
||||
entries := make([]drives.Entry, 5)
|
||||
for i := range entries {
|
||||
entries[i] = drives.Entry{
|
||||
ID: fmt.Sprintf("file-%d", i),
|
||||
Name: fmt.Sprintf("clip-%d.mp4", i),
|
||||
Size: 100,
|
||||
ModTime: time.Now(),
|
||||
}
|
||||
}
|
||||
drv := &scannerFakeDrive{entries: entries}
|
||||
|
||||
sc := New(cat, drv, []string{".mp4"}, 5, nil)
|
||||
// 极短间隔,确保至少一次 heartbeat 在 walk 内被触发
|
||||
sc.ProgressInterval = 1 * time.Microsecond
|
||||
|
||||
buf := captureLog(t)
|
||||
if _, err := sc.Run(ctx, ""); err != nil {
|
||||
t.Fatalf("scan: %v", err)
|
||||
}
|
||||
|
||||
out := buf.String()
|
||||
if !strings.Contains(out, "[scanner] drive=drive progress:") {
|
||||
t.Fatalf("expected progress heartbeat in log, got:\n%s", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestScannerProgressHeartbeatDisabled(t *testing.T) {
|
||||
ctx := context.Background()
|
||||
cat, err := catalog.Open(t.TempDir() + "/catalog.db")
|
||||
if err != nil {
|
||||
t.Fatalf("open catalog: %v", err)
|
||||
}
|
||||
t.Cleanup(func() { cat.Close() })
|
||||
|
||||
drv := &scannerFakeDrive{entries: []drives.Entry{
|
||||
{ID: "f-1", Name: "x.mp4", Size: 1, ModTime: time.Now()},
|
||||
}}
|
||||
sc := New(cat, drv, []string{".mp4"}, 5, nil)
|
||||
sc.ProgressInterval = -1 // 显式关闭
|
||||
|
||||
buf := captureLog(t)
|
||||
if _, err := sc.Run(ctx, ""); err != nil {
|
||||
t.Fatalf("scan: %v", err)
|
||||
}
|
||||
|
||||
if strings.Contains(buf.String(), "progress:") {
|
||||
t.Fatalf("progress heartbeat should be silenced when interval < 0, got:\n%s", buf.String())
|
||||
}
|
||||
}
|
||||
|
||||
@@ -129,8 +129,11 @@ type Config struct {
|
||||
Catalog *catalog.Catalog
|
||||
Registry Registry
|
||||
GetTargetDriveID func() string // 通常对应 App.Spider91UploadDriveID()
|
||||
Interval time.Duration // 0 时默认 60s
|
||||
BatchLimit int // 单轮最多迁多少个,0 时默认 50
|
||||
// Interval 已废弃 —— 旧版迁移 worker 是周期 ticker,新版只通过 nightly
|
||||
// pipeline 调用 RunOnce,不再有内置定时器。保留字段不删是为了兼容外
|
||||
// 部 yaml / 测试代码里仍传值的场景。
|
||||
Interval time.Duration
|
||||
BatchLimit int // 单轮最多迁多少个,0 时默认 50
|
||||
// KeepLatestN 是每个 spider91 drive 在本地保留的最新视频数。
|
||||
// 超过的部分中"已迁移"的会被清理;未迁移的不动。0 时默认 15;< 0 关闭清理。
|
||||
KeepLatestN int
|
||||
@@ -143,7 +146,6 @@ type Config struct {
|
||||
|
||||
type Migrator struct {
|
||||
cfg Config
|
||||
trigger chan struct{}
|
||||
mu sync.Mutex
|
||||
running bool
|
||||
|
||||
@@ -158,9 +160,6 @@ type Migrator struct {
|
||||
}
|
||||
|
||||
func New(cfg Config) *Migrator {
|
||||
if cfg.Interval == 0 {
|
||||
cfg.Interval = 60 * time.Second
|
||||
}
|
||||
if cfg.BatchLimit == 0 {
|
||||
cfg.BatchLimit = 50
|
||||
}
|
||||
@@ -171,8 +170,7 @@ func New(cfg Config) *Migrator {
|
||||
cfg.CaptchaCooldown = 5 * time.Minute
|
||||
}
|
||||
return &Migrator{
|
||||
cfg: cfg,
|
||||
trigger: make(chan struct{}, 1),
|
||||
cfg: cfg,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -228,67 +226,19 @@ func (m *Migrator) markCooldownLogged() bool {
|
||||
}
|
||||
|
||||
// Trigger 安排一次"立即跑"。多次调用会被合并成一次(channel buffer=1)。
|
||||
func (m *Migrator) Trigger() {
|
||||
select {
|
||||
case m.trigger <- struct{}{}:
|
||||
default:
|
||||
}
|
||||
}
|
||||
|
||||
// Run 是阻塞循环;ctx 取消时退出。
|
||||
func (m *Migrator) Run(ctx context.Context) {
|
||||
t := time.NewTicker(m.cfg.Interval)
|
||||
defer t.Stop()
|
||||
var cooldownTimer *time.Timer
|
||||
var cooldownC <-chan time.Time
|
||||
stopCooldownTimer := func() {
|
||||
if cooldownTimer == nil {
|
||||
return
|
||||
}
|
||||
if !cooldownTimer.Stop() {
|
||||
select {
|
||||
case <-cooldownTimer.C:
|
||||
default:
|
||||
}
|
||||
}
|
||||
cooldownTimer = nil
|
||||
cooldownC = nil
|
||||
}
|
||||
resetCooldownTimer := func() {
|
||||
stopCooldownTimer()
|
||||
active, until := m.inCooldown()
|
||||
if !active {
|
||||
return
|
||||
}
|
||||
delay := time.Until(until)
|
||||
if delay < 0 {
|
||||
delay = 0
|
||||
}
|
||||
cooldownTimer = time.NewTimer(delay)
|
||||
cooldownC = cooldownTimer.C
|
||||
}
|
||||
defer stopCooldownTimer()
|
||||
|
||||
// 启动后立刻跑一次(不等第一个 tick)
|
||||
// RunOnce 跑一次完整迁移:列出所有 spider91 drive,对每个超过 KeepLatestN 的旧
|
||||
// 视频上传到目标 drive,事务性改写 catalog 行,删本地文件。
|
||||
//
|
||||
// 这是上层 nightly 流水线 Phase 3 的入口;不再有周期 ticker / Trigger 通道。
|
||||
// captcha cooldown 状态在单次 RunOnce 内仍生效(多 drive 时遇到 4002 立即停整轮);
|
||||
// 跨调用持久 5 分钟,下次 RunOnce 命中冷却期会直接 noop。
|
||||
//
|
||||
// 当前实现不会向调用方返回 error —— 单条迁移失败已在内部记日志并跳过;
|
||||
// 整轮被 cooldown / context 取消时也通过日志可观测。保留 error 返回签名是为
|
||||
// 给未来需要把 nightly 失败状态展示给 admin 用。
|
||||
func (m *Migrator) RunOnce(ctx context.Context) error {
|
||||
m.runOnce(ctx)
|
||||
resetCooldownTimer()
|
||||
for {
|
||||
select {
|
||||
case <-ctx.Done():
|
||||
return
|
||||
case <-t.C:
|
||||
m.runOnce(ctx)
|
||||
resetCooldownTimer()
|
||||
case <-m.trigger:
|
||||
m.runOnce(ctx)
|
||||
resetCooldownTimer()
|
||||
case <-cooldownC:
|
||||
cooldownTimer = nil
|
||||
cooldownC = nil
|
||||
m.runOnce(ctx)
|
||||
resetCooldownTimer()
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// runOnce 单轮:扫所有 spider91 drive,对每条还有本地文件的视频做迁移。
|
||||
|
||||
@@ -324,8 +324,7 @@ func TestRunOnceMigratesSpider91VideosAndCleansLocalFiles(t *testing.T) {
|
||||
Catalog: cat,
|
||||
Registry: reg,
|
||||
GetTargetDriveID: func() string { return pp.ID() },
|
||||
Interval: time.Hour, // 测试不靠 ticker
|
||||
KeepLatestN: -1, // 关闭"保留最新 N 个",让 1 条也能立即上传
|
||||
KeepLatestN: -1, // 关闭"保留最新 N 个",让 1 条也能立即上传
|
||||
})
|
||||
m.runOnce(context.Background())
|
||||
|
||||
@@ -805,75 +804,6 @@ func TestRunOnceResumesAfterCooldownExpires(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunWakesWhenCooldownExpires 验证 Run 循环会在 cooldown 到点后主动唤醒
|
||||
// 一次迁移,而不是等下一个普通 interval tick。
|
||||
func TestRunWakesWhenCooldownExpires(t *testing.T) {
|
||||
cat := setupCatalog(t)
|
||||
src, _ := setupSpider91(t)
|
||||
pp := newFakePikPak("pikpak-target", "pikpak-root-id")
|
||||
|
||||
migrated := make(chan struct{}, 1)
|
||||
var failOnce sync.Once
|
||||
pp.uploadFunc = func(ctx context.Context, parentID, name string, r io.Reader, size int64) (UploadResult, error) {
|
||||
body, _ := io.ReadAll(r)
|
||||
var failed bool
|
||||
failOnce.Do(func() { failed = true })
|
||||
if failed {
|
||||
captcha := &pikpak.APIError{ErrorCode: 4002, ErrorMsg: "captcha_invalid"}
|
||||
return UploadResult{}, fmt.Errorf("pikpak upload: request session: %w", captcha)
|
||||
}
|
||||
pp.mu.Lock()
|
||||
pp.gotBodies[name] = body
|
||||
pp.mu.Unlock()
|
||||
return UploadResult{
|
||||
FileID: "remote-" + name,
|
||||
Hash: "FAKEHASH40CHARSXXXXXXXXXXXXXXXXXXXXXXXXX",
|
||||
Size: int64(len(body)),
|
||||
}, nil
|
||||
}
|
||||
reg := newFakeRegistry()
|
||||
reg.Add(src)
|
||||
reg.Add(pp)
|
||||
|
||||
now := time.Now()
|
||||
id := writeSpider91Video(t, cat, src, "vk-auto-resume", ".mp4", []byte("payload"), now)
|
||||
|
||||
m := New(Config{
|
||||
Catalog: cat,
|
||||
Registry: reg,
|
||||
GetTargetDriveID: func() string { return pp.ID() },
|
||||
Interval: time.Hour,
|
||||
KeepLatestN: -1,
|
||||
CaptchaCooldown: 30 * time.Millisecond,
|
||||
OnMigrated: func(videoID string) {
|
||||
if videoID == id {
|
||||
select {
|
||||
case migrated <- struct{}{}:
|
||||
default:
|
||||
}
|
||||
}
|
||||
},
|
||||
})
|
||||
|
||||
ctx, cancel := context.WithCancel(context.Background())
|
||||
defer cancel()
|
||||
go m.Run(ctx)
|
||||
|
||||
select {
|
||||
case <-migrated:
|
||||
case <-time.After(500 * time.Millisecond):
|
||||
t.Fatalf("Run did not resume migration after cooldown expired")
|
||||
}
|
||||
|
||||
got, err := cat.GetVideo(context.Background(), id)
|
||||
if err != nil {
|
||||
t.Fatalf("get video: %v", err)
|
||||
}
|
||||
if got.DriveID != pp.ID() {
|
||||
t.Fatalf("after auto resume, drive_id = %q, want PikPak", got.DriveID)
|
||||
}
|
||||
}
|
||||
|
||||
// TestNonCaptchaErrorDoesNotTriggerCooldown 验证非 captcha 类的上传错误(如
|
||||
// 网络抖动)不会让整个 worker 进冷却 —— 只跳过这一条,继续尝试 batch 里其它的。
|
||||
func TestNonCaptchaErrorDoesNotTriggerCooldown(t *testing.T) {
|
||||
@@ -936,7 +866,6 @@ func TestRunOnceMigratesToP115Target(t *testing.T) {
|
||||
Catalog: cat,
|
||||
Registry: reg,
|
||||
GetTargetDriveID: func() string { return target.ID() },
|
||||
Interval: time.Hour,
|
||||
KeepLatestN: -1,
|
||||
})
|
||||
m.runOnce(context.Background())
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
import { useEffect, useMemo, useState } from "react";
|
||||
import { Download, Plus, Power, PowerOff, RefreshCw, RotateCcw, Trash2 } from "lucide-react";
|
||||
import { Download, PlayCircle, Plus, Power, PowerOff, RefreshCw, RotateCcw, Trash2 } from "lucide-react";
|
||||
import * as api from "./api";
|
||||
import { useToast } from "./ToastContext";
|
||||
import { Modal } from "./Modal";
|
||||
@@ -199,6 +199,20 @@ export function DrivesPage() {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* 立即触发完整凌晨流水线(Phase1 扫所有云盘 → Phase2 spider91 爬虫 →
|
||||
* Phase3 spider91 → 云盘迁移)。后端立即返回 202;进度看 backend 日志。
|
||||
* 已在跑时后端会丢弃此次触发,按钮再点也不会重复进入。
|
||||
*/
|
||||
async function handleRunNightly() {
|
||||
try {
|
||||
await api.runNightlyJob();
|
||||
show("已触发完整流水线(扫盘 → 91 爬虫 → 迁移),耗时较长,可在 backend 日志观察进度", "success");
|
||||
} catch (e) {
|
||||
show(e instanceof Error ? e.message : "触发失败", "error");
|
||||
}
|
||||
}
|
||||
|
||||
async function handleRegenFailed(d: api.AdminDrive) {
|
||||
setRegenFailedId(d.id);
|
||||
try {
|
||||
@@ -253,9 +267,19 @@ export function DrivesPage() {
|
||||
<section>
|
||||
<header className="admin-page__header">
|
||||
<h1 className="admin-page__title">网盘管理</h1>
|
||||
<button className="admin-btn is-primary" onClick={openCreate}>
|
||||
<Plus size={14} /> 新建
|
||||
</button>
|
||||
<div style={{ display: "flex", gap: "8px" }}>
|
||||
<button
|
||||
type="button"
|
||||
className="admin-btn"
|
||||
onClick={handleRunNightly}
|
||||
title="立即跑一次完整流水线:扫所有云盘 → 91 爬虫 → spider91 视频迁移到云盘。耗时较长,期间不要重复触发。"
|
||||
>
|
||||
<PlayCircle size={14} /> 立即跑全流程
|
||||
</button>
|
||||
<button className="admin-btn is-primary" onClick={openCreate}>
|
||||
<Plus size={14} /> 新建
|
||||
</button>
|
||||
</div>
|
||||
</header>
|
||||
|
||||
{storage && <StorageSummary storage={storage} />}
|
||||
|
||||
@@ -273,3 +273,16 @@ export function updateSettings(body: Partial<Settings>) {
|
||||
body: JSON.stringify(body),
|
||||
});
|
||||
}
|
||||
|
||||
|
||||
// ---------- Jobs ----------
|
||||
|
||||
/**
|
||||
* 立即触发一次完整的凌晨流水线(Phase1 扫盘 + Phase2 91 爬虫 + Phase3 迁移),
|
||||
* 不论当前时间或今日是否已跑。立即返回 202;进度通过 backend 日志观察。
|
||||
*
|
||||
* 流水线已在跑时后端会丢弃此次触发。
|
||||
*/
|
||||
export function runNightlyJob() {
|
||||
return request<{ ok: boolean }>("/jobs/nightly/run", { method: "POST" });
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user