1488 Commits

Author SHA1 Message Date
admin 824c6d9133 feat: add Tencent Cloud Coding Plan support and handle coding plan URLs in OpenAI adaptor
Docker Build / Build and Push Docker Image (push) Failing after 2m43s
2026-06-12 17:40:53 +08:00
admin 97ddfeab59 chore: rebrand copyright from QuantumNous to modelstoken
Docker Build / Build and Push Docker Image (push) Failing after 1m2s
2026-06-12 01:05:46 +08:00
CaIon 59a93cf5c7 fix(openai): align image streaming relay governance
Route OpenAI image streaming through shared stream handling, split image/realtime/usage helpers for maintainability, and include the related image request and rate limit updates.
2026-06-10 17:47:37 +08:00
Benson Yan 867d8acfc3 fix: normalize kimi k2.6 temperature (#5390) 2026-06-10 17:19:57 +08:00
gaoren002 d2576ddcd3 fix(openai): support streaming image relay and image edit for images API (#4608)
* fix(openai): support streaming image relay

* fix(openai): keep image edit multipart body reusable

* test(openai): cover image stream usage details

* test(openai): cover image edit fallback stream field

* fix(openai): wrap image json fallback as stream

* fix(relay): support OpenAI image streaming

* fix(openai): record image stream upstream error events

* fix(openai): harden image stream relay

* fix(openai): return image JSON errors

* fix(relay): reset stream status per scanner run

* fix(relay): drop upstream credit passthrough

* fix(openai): keep image errors minimal

* fix(openai): keep image error status from response

---------

Co-authored-by: CaIon <i@caion.me>
2026-06-08 18:36:17 +08:00
xujiantop-crypto 32805849d6 fix: reuse stream scanner buffer in channel handlers (#5225) 2026-06-05 12:18:57 +08:00
Don Ganesh 01c2128e23 fix: 收窄 OpenAI o 系列模型适配范围 (#5293)
* fix: 收窄 OpenAI o 系列模型适配范围

* fix(openai): 限制 gpt-5 适配仅作用于 OpenAI 模型

* fix(openai): narrow o-series reasoning model detection

---------

Co-authored-by: Seefs <i@seefs.me>
2026-06-05 12:12:45 +08:00
Chen011214 83068d115e fix(relay): fix Anthropic-compatible compatibility for GLM (avoid chunked encoding) (#5307) 2026-06-05 11:31:20 +08:00
Rain 3aa113b5a3 fix(dify): initialize file pointer before remote-image field assignment (#5134) 2026-06-04 18:21:35 +08:00
GGXH 0c7aceb831 feat: add claude opus 4.8 support (#5177) 2026-05-31 13:50:52 +08:00
花月喵梦 465c5edab9 fix:gemini to claude tool_use err (#5041) 2026-05-25 23:14:01 +08:00
learner-i ff06067a18 fix: 移除 fcIdx -1 偏移,修复并发工具调用撞键问题 (#5095)
当 Claude 直接以多个 tool_use 块起始(无文本前导,index=0)时,
-1 偏移导致 index=0 和 index=1 同时映射到 fcIdx=0:
- index=0 的工具 args 先流完,发出一次合法调用 ✓
- index=1 的 args 追加到同一 map 槽位,污染后为非法 JSON,该工具丢失 ✗
- index=2 以后的工具各自独占唯一 fcIdx,正常发出 ✓

结果:每轮并发调用中第 2 个工具必然丢失,
模型收不到对应的工具结果后重试剩余工具,
产生雪球效应(10个→9个→8个...逐轮收缩)。

修复:直接使用 Claude 的 block index 作为 fcIdx,不做偏移。
fcIdx 仅作为本地 map 的 key,只需保证唯一性,无需从 0 开始。
2026-05-25 23:13:06 +08:00
真的非她不可 2a528d46cb fix(relay): correct image quality parameter handling (#5103) 2026-05-25 22:57:02 +08:00
CaIon fddf54ccc5 perf: reduce heap residency for large base64 relay requests
Three layered optimizations targeting Gemini-style 5MB base64 payloads where
RSS could balloon to tens of GB under concurrent load:

1. Byte-based param override (relay/common/override.go)
   - Switch legacy/operations hot paths from common.Marshal round-trips and
     map[string]any conversions to gjson/sjson on []byte directly.
   - Avoids cloning 5MB strings during each Set/Delete operation.

2. strings.Builder for Gemini response markdown (relay/channel/gemini/relay-gemini.go)
   - Replace string concatenation + strings.Join when assembling
     "![image](data:...;base64,DATA)" content for inline image responses.
   - Pre-allocates capacity from inline_data byte sizes.

3. Outbound BodyStorage + streaming Decoder (this commit's core)
   - New relay/common/outbound_body.go helper wraps marshaled upstream bodies
     in common.BodyStorage, allowing disk-cache mode to offload jsonData to
     a temp file while waiting for upstream TTFB. The original []byte can
     then be GC'd, removing ~5MB/req of heap residency during the longest
     window of a request.
   - All 7 relay handlers (gemini/claude/responses/embedding/image/compatible/
     rerank) plus chat_completions_via_responses adopt the helper with
     defer closer.Close() and explicit jsonData = nil.
   - relay/common/relay_info.go: new UpstreamRequestBodySize so
     relay/channel/api_request.go can populate req.ContentLength (lost when
     body becomes a type-erased io.Reader).
   - common/gin.go UnmarshalBodyReusable: when storage is disk-backed and
     content-type is JSON, decode via DecodeJson(storage) instead of
     storage.Bytes()+Unmarshal, removing one transient 5MB copy per request.
     memory mode and form/multipart paths unchanged.
2026-05-22 19:08:38 +08:00
Seefs ae6a03364d perf: optimize request metadata extraction and disabled field filtering (#5009)
* perf: optimize request metadata extraction and disabled field filtering

* perf: optimize stream usage estimation path
2026-05-22 10:32:11 +08:00
Seefs 0d4b25795a fix: expose param override audits for sensitive message fields (#4974) 2026-05-19 18:28:03 +08:00
Seefs 0936e25046 perf: avoid eager formatting in debug log calls (#4929) 2026-05-19 12:11:24 +08:00
CaIon aa56667b8f feat: track upstream request ID and prevent response header override
When proxying through another new-api instance, the upstream
X-Oneapi-Request-Id was overwriting the local one in client responses.
This adds a new `upstream_request_id` field to the logs table, captures
the upstream ID during relay, and filters it from being copied back to
the client. Frontend gains search/filter and detail display support.
2026-05-12 21:53:54 +08:00
Seefs 38a3314b9b fix: preserve OpenAI image edit reference fields (#4646)
* fix: preserve OpenAI image edit reference fields

* feat: support json image edit requests
2026-05-06 21:27:47 +08:00
Calcium-Ion 5114ad0677 Merge pull request #4200 from yyhhyyyyyy/fix/vertex-gateway-base-url
fix(vertex): honor custom base_url as gateway prefix
2026-04-30 20:11:17 +08:00
yyhhyyyyyy 987b7ecd22 fix(vertex): honor custom base_url as gateway prefix 2026-04-30 15:08:10 +08:00
heimoshuiyu 8ca103342d fix: Message.ReasoningContent/Reasoning 改为 *string,修复空思考内容在请求转发时被静默丢弃的问题
问题:
在非 passThrough 模式下,客户端发送的 reasoning_content: "" 经过
Go struct 反序列化再序列化后,因 string + omitempty 无法区分空串和
字段缺失,导致空的思考内容被静默丢弃。

根因:
dto.Message.ReasoningContent 和 Message.Reasoning 使用 string(非指针)
加 omitempty,违反 AGENTS.md Rule 6(可选标量字段必须用指针类型)。

修复:
1. Message.ReasoningContent/Reasoning 类型从 string 改为 *string
   - nil = 字段缺失 → JSON 省略
   - &"" = 显式空串 → JSON 保留 reasoning_content: ""
2. 新增 Message.GetReasoningContent() 辅助方法
3. 更新所有读写处:relay-openai, relay-claude, relay-gemini, ollama
4. 新增测试覆盖空串保留、字段省略、getter 回退逻辑
2026-04-29 13:43:26 +08:00
Calcium-Ion d604f48c06 Merge pull request #4469 from seefs001/fix/tool-arguments-object
fix: support raw JSON response tool arguments
2026-04-26 20:20:03 +08:00
Seefs db89b57e1c fix: support raw JSON response tool arguments 2026-04-26 13:47:37 +08:00
Seefs 62d4b63fc3 feat: configure native messages model matching 2026-04-26 13:37:59 +08:00
CaIon f2f3410dcf feat: add len variable for tier conditions and LLM prompt helper 2026-04-25 13:24:20 +08:00
Calcium-Ion 8993386743 feat: support DeepSeek V4 reasoning suffix handling (#4428) 2026-04-24 17:06:59 +08:00
HynoR 435d7ae0dd feat: support DeepSeek V4 reasoning suffix handling 2026-04-24 16:50:35 +08:00
CaIon 3a2138ba61 refactor: rename and relocate HasModelBillingConfig function for clarity 2026-04-24 16:39:12 +08:00
yyhhyyyyyy e3d64cb76d Merge pull request #4431 from yyhhyyyyyy/fix/tiered-billing-model-list
fix: include tiered billing models in model listing
2026-04-24 16:24:36 +08:00
Xyfacai 69ba18d392 fix(image): only price image model use N ratio 2026-04-24 01:24:14 +08:00
CaIon eab478bdc8 fix: miscellaneous quick fixes from CodeRabbit review
- log_info_generate.go: add nil guard in InjectTieredBillingInfo
- billing_expr_request.go: merge headers instead of replacing
- go.mod: remove incorrect // indirect on expr-lang/expr
- ToolPriceSettings.jsx: add null check in syncToVisual
- tool_billing.go: fix PricePer1K for image_generation (per-call, not per-1K)
- utils.jsx: add minute() to time condition regex
- useUsageLogsData.jsx: pass displayMode to renderTieredModelPrice
- AGENTS.md, CLAUDE.md: fix Rule 6/7 ordering
- relay-gemini.go: add TEXT modality case in CandidatesTokensDetails
2026-04-24 00:34:06 +08:00
CaIon 3e5f2ee1d6 fix(billing): correct tiered billing settlement and edge cases
- quota.go: add missing SettleBilling call in PostWssConsumeQuota
- text_quota.go: gate InjectTieredBillingInfo on tieredBillingApplied bool
  instead of tieredResult != nil, so fallback billing still logs metadata
- price.go: remove quotaBeforeGroup == 0 from freeModel condition to avoid
  bypassing settlement for output-only expressions
- tiered_settle.go: split cc/cc1h subtraction using UsageSemantic to
  distinguish OpenAI vs Claude cache creation token formats
- pricing.go: only set BillingMode when a non-empty expression exists
- useModelPricingEditorState.js: only write billing_mode when
  finalBillingExpr is non-empty
2026-04-24 00:33:54 +08:00
CaIon 6bde1a9c8d Merge origin/main into nightly
Resolve conflicts:
- .gitignore: keep nightly additions (.test, skills-lock.json)
- relay/helper/price.go: keep both billingexpr and model imports
- en.json / zh-CN.json: keep nightly's superset of i18n entries
- service/billing_session.go: add missing 3rd arg to DecreaseUserQuota
- en.json / zh-CN.json: deduplicate 129+320 duplicate i18n keys
2026-04-23 21:37:03 +08:00
papersnake 47d7bca268 feat: support claude-opus-4-7 (#4293)
* feat: support claude-opus-4-7

* feat: summarized display for opus 4.7
2026-04-17 13:52:34 +08:00
CaIon 3cad6b9d7f fix(claude): improve handling of empty string content in OpenAI to Claude message conversion 2026-04-16 17:44:38 +08:00
Seefs f7adf02eb4 feat(claude): add cache_control and speed passthrough controls (#4247) 2026-04-15 20:55:01 +08:00
woan1136 3ab65a8221 fix: add Azure channel support for /v1/responses/compact URL routing (#4149)
The Azure channel's GetRequestURL method only handled RelayModeResponses
but missed RelayModeResponsesCompact. This caused compact requests to
fall through to the generic deployments URL pattern, producing an
incorrect path that Azure returns 404 for.

This fix extends the existing responses API special handling to also
cover the compact mode, appending /compact to the subUrl when the relay
mode is ResponsesCompact.

Affected URLs (before → after):
- Normal Azure: /openai/deployments/{model}/responses/compact → /openai/v1/responses/compact
- cognitiveservices: same pattern → /openai/responses/compact
- Custom AzureResponsesVersion: properly respected for compact too

Co-authored-by: 彭俊杰 <pengjunjie@onero.com>
2026-04-13 15:23:38 +08:00
CaIon 8b22161527 fix: set TopP to nil in Claude request configuration 2026-04-13 14:36:22 +08:00
skynono b4df9955f4 fix: isStream status in error logs instead of hardcoded false (#4195) 2026-04-12 17:41:26 +08:00
CaIon ed7f839911 feat: improve model price error UX with role-aware messages and cleaner UI
- Backend: differentiate error messages for admin vs regular users in price.go
- Backend: include error_code in channel test response for structured error handling
- Frontend: render model_price_error as a styled card in Playground with admin nav button
- Frontend: show inline error details and settings link in channel test modal
- Frontend: parse error codes from both SSE and non-streaming API responses
- i18n: remove redundant "Settings" suffix from setting tab translations (en/fr/ru/ja/vi)
- i18n: update "Group & Model Pricing" translations across all locales
2026-04-11 17:19:38 +08:00
CaIon 4d2993e4cc Merge remote-tracking branch 'origin/main' into nightly
# Conflicts:
#	web/src/helpers/render.jsx
#	web/src/hooks/usage-logs/useUsageLogsData.jsx
#	web/src/i18n/locales/en.json
2026-04-09 17:12:21 +08:00
yyhhyyyyyy 0220df8429 fix(channel-test): support tiered billing model tests (#4145)
Pre-fill BillingRequestInput from dto.Request before ModelPriceHelper,
so tiered_expr billing resolves param() from the structured request
instead of reading HTTP body (which is empty in channel-test context).

- attachTestBillingRequestInput: marshal dto.Request → RequestInput
- ResolveIncomingBillingExprRequestInput: early-return when pre-filled
- settleTestQuota / buildTestLogOther: align test settlement & logging
  with production TryTieredSettle / InjectTieredBillingInfo paths
2026-04-09 17:08:52 +08:00
Calcium-Ion b07f0b9626 Merge pull request #4154 from seefs001/feature/vllm-extensions-params
feat: fill in some custom fields for vllm-omini
2026-04-09 14:35:05 +08:00
Calcium-Ion 53cf37a469 fix(ali): accept string usage values in task polling (#4155) 2026-04-09 14:34:44 +08:00
NyaMisty 160cb28572 fix(zhipu_4v): use correct endpoint for coding plan image generation (#4146) 2026-04-09 14:33:48 +08:00
Seefs 274307b0a9 fix(ali): accept string usage values in task polling 2026-04-09 12:48:17 +08:00
Seefs a19a63b98c feat: fill in some custom fields for vllm-omini. 2026-04-09 12:41:51 +08:00
forsakenyang c734db34e8 feat: add minimax image generation relay support (#4103) 2026-04-08 16:57:44 +08:00
Calcium-Ion 9ffb85a36b Merge pull request #4068 from feitianbubu/seedance-support-duration
Seedance support duration
2026-04-08 15:01:25 +08:00