POST /v1beta/models/{model}:{operator}
from google import genai

client = genai.Client(
    api_key="<COMETAPI_KEY>",
    http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
)

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain how AI works in a few words",
)

print(response.text)
{
  "candidates": [
    {
      "content": {
        "role": "<string>",
        "parts": [
          {
            "text": "<string>",
            "functionCall": {
              "name": "<string>",
              "args": {}
            },
            "inlineData": {
              "mimeType": "<string>",
              "data": "<string>"
            },
            "thought": true
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "<string>",
          "probability": "<string>",
          "blocked": true
        }
      ],
      "citationMetadata": {
        "citationSources": [
          {
            "startIndex": 123,
            "endIndex": 123,
            "uri": "<string>",
            "license": "<string>"
          }
        ]
      },
      "tokenCount": 123,
      "avgLogprobs": 123,
      "groundingMetadata": {
        "groundingChunks": [
          {
            "web": {
              "uri": "<string>",
              "title": "<string>"
            }
          }
        ],
        "groundingSupports": [
          {
            "groundingChunkIndices": [
              123
            ],
            "confidenceScores": [
              123
            ],
            "segment": {
              "startIndex": 123,
              "endIndex": 123,
              "text": "<string>"
            }
          }
        ],
        "webSearchQueries": [
          "<string>"
        ]
      },
      "index": 123
    }
  ],
  "promptFeedback": {
    "blockReason": "SAFETY",
    "safetyRatings": [
      {
        "category": "<string>",
        "probability": "<string>",
        "blocked": true
      }
    ]
  },
  "usageMetadata": {
    "promptTokenCount": 123,
    "candidatesTokenCount": 123,
    "totalTokenCount": 123,
    "trafficType": "<string>",
    "thoughtsTokenCount": 123,
    "promptTokensDetails": [
      {
        "modality": "<string>",
        "tokenCount": 123
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "<string>",
        "tokenCount": 123
      }
    ]
  },
  "modelVersion": "<string>",
  "createTime": "<string>",
  "responseId": "<string>"
}

Overview

CometAPI supports the native Gemini API format, giving you full access to Gemini-specific features such as thinking control, Google Search grounding, and native image-generation modalities. Use this endpoint when you need capabilities the OpenAI-compatible chat endpoint does not provide.

Quick Start

Point any Gemini SDK or HTTP client at CometAPI by replacing the base URL and API key as follows:

| Setting  | Google default                    | CometAPI         |
| -------- | --------------------------------- | ---------------- |
| Base URL | generativelanguage.googleapis.com | api.cometapi.com |
| API key  | $GEMINI_API_KEY                   | $COMETAPI_KEY    |

Both the x-goog-api-key and Authorization: Bearer headers are supported for authentication.
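Both header styles can be built programmatically; a minimal Python sketch (the key value is a placeholder, and the helper name is illustrative):

```python
# Two equivalent ways to authenticate against CometAPI's Gemini
# endpoint; pick whichever your HTTP client setup prefers.
def auth_headers(api_key: str, style: str = "goog") -> dict:
    """Build the auth header in either supported style."""
    if style == "goog":
        return {"x-goog-api-key": api_key}
    return {"Authorization": f"Bearer {api_key}"}

print(auth_headers("<COMETAPI_KEY>"))
# {'x-goog-api-key': '<COMETAPI_KEY>'}
```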

Thinking (Reasoning)

Gemini models can reason internally before producing a response. How you control this depends on the model generation.
Gemini 3 models use thinkingLevel to control reasoning depth. Available levels: MINIMAL, LOW, MEDIUM, HIGH.
curl "https://api.cometapi.com/v1beta/models/gemini-3.1-pro-preview:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Explain quantum physics simply."}]}],
    "generationConfig": {
      "thinkingConfig": {"thinkingLevel": "LOW"}
    }
  }'
Using thinkingLevel with Gemini 2.5 models (or thinkingBudget with Gemini 3 models) may cause errors. Use the parameter that matches your model generation.
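One way to avoid mixing the two parameters is to branch on the model ID when building generationConfig. A sketch, assuming model IDs keep their "gemini-3"/"gemini-2.5" prefixes; the numeric budgets below are illustrative values, not documented defaults:

```python
def thinking_config(model: str, effort: str = "LOW") -> dict:
    """Return a generationConfig fragment with the thinking parameter
    appropriate for the model generation."""
    if model.startswith("gemini-3"):
        # Gemini 3 expects a named level: MINIMAL, LOW, MEDIUM, HIGH
        return {"thinkingConfig": {"thinkingLevel": effort}}
    # Gemini 2.5 expects a numeric token budget instead; these
    # mappings are illustrative assumptions, tune them yourself.
    budgets = {"MINIMAL": 0, "LOW": 1024, "MEDIUM": 4096, "HIGH": 8192}
    return {"thinkingConfig": {"thinkingBudget": budgets[effort]}}

print(thinking_config("gemini-3.1-pro-preview", "HIGH"))
# {'thinkingConfig': {'thinkingLevel': 'HIGH'}}
```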

Streaming

Use streamGenerateContent?alt=sse as the operator to receive Server-Sent Events as the model generates content. Each SSE event contains a data: line carrying a JSON-encoded GenerateContentResponse object.
curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  --no-buffer \
  -d '{
    "contents": [{"parts": [{"text": "Write a short poem about the stars"}]}]
  }'
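The data: lines can be consumed without an SDK. A minimal parser sketch in Python; the sample payload below is illustrative, not a captured response:

```python
import json

def parse_sse_chunks(raw: str):
    """Yield GenerateContentResponse dicts from raw SSE text.
    Each event carries one 'data: {...}' line."""
    for line in raw.splitlines():
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())

sample = 'data: {"candidates": [{"content": {"parts": [{"text": "Hi"}]}}]}\n\n'
chunks = list(parse_sse_chunks(sample))
print(chunks[0]["candidates"][0]["content"]["parts"][0]["text"])
# Hi
```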

System Instructions

Use systemInstruction to steer the model's behavior across the entire conversation:
curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "What is 2+2?"}]}],
    "systemInstruction": {
      "parts": [{"text": "You are a math tutor. Always show your work."}]
    }
  }'

JSON Mode

Use responseMimeType to force structured JSON output. Optionally provide a responseSchema for strict schema validation:
curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "List 3 planets with their distances from the sun"}]}],
    "generationConfig": {
      "responseMimeType": "application/json"
    }
  }'
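The request above omits a schema. A sketch of what a responseSchema for this prompt might look like, built as a Python dict; the type names follow Gemini's uppercase OpenAPI-subset convention, and the field names here are illustrative:

```python
import json

# Hypothetical schema: an array of planet objects with two fields.
generation_config = {
    "responseMimeType": "application/json",
    "responseSchema": {
        "type": "ARRAY",
        "items": {
            "type": "OBJECT",
            "properties": {
                "name": {"type": "STRING"},
                "distance_km": {"type": "NUMBER"},
            },
            "required": ["name", "distance_km"],
        },
    },
}

body = {
    "contents": [{"parts": [{"text": "List 3 planets with their distances from the sun"}]}],
    "generationConfig": generation_config,
}
print(json.dumps(body, indent=2))
```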

Google Search Grounding

Enable real-time web search by adding the googleSearch tool:
curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Who won the euro 2024?"}]}],
    "tools": [{"google_search": {}}]
  }'
The response includes groundingMetadata with source URLs and confidence scores.
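Pulling the cited source URLs out of a response is a short walk over groundingMetadata. A sketch, using the field names from the response schema above; the sample dict is illustrative:

```python
def extract_sources(response: dict) -> list:
    """Collect web source URIs from each candidate's groundingMetadata."""
    uris = []
    for cand in response.get("candidates", []):
        meta = cand.get("groundingMetadata", {})
        for chunk in meta.get("groundingChunks", []):
            uri = chunk.get("web", {}).get("uri")
            if uri:
                uris.append(uri)
    return uris

sample = {
    "candidates": [{
        "groundingMetadata": {
            "groundingChunks": [
                {"web": {"uri": "https://example.com/euro2024", "title": "Euro 2024"}}
            ]
        }
    }]
}
sources = extract_sources(sample)
print(sources)
# ['https://example.com/euro2024']
```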

Response Example

A typical response from CometAPI's Gemini endpoint:
{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [{"text": "Hello"}]
      },
      "finishReason": "STOP",
      "avgLogprobs": -0.0023
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 5,
    "candidatesTokenCount": 1,
    "totalTokenCount": 30,
    "trafficType": "ON_DEMAND",
    "thoughtsTokenCount": 24,
    "promptTokensDetails": [{"modality": "TEXT", "tokenCount": 5}],
    "candidatesTokensDetails": [{"modality": "TEXT", "tokenCount": 1}]
  },
  "modelVersion": "gemini-2.5-flash",
  "createTime": "2026-03-25T04:21:43.756483Z",
  "responseId": "CeynaY3LDtvG4_UP0qaCuQY"
}
The thoughtsTokenCount field in usageMetadata reports how many tokens the model spent on internal reasoning, even when no thinking output is included in the response.
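In the sample response above, the accounting balances: the gap between totalTokenCount and prompt + candidates tokens is exactly the thinking spend. A quick check in Python, using the numbers from that response:

```python
# usageMetadata values copied from the response example above.
usage = {
    "promptTokenCount": 5,
    "candidatesTokenCount": 1,
    "totalTokenCount": 30,
    "thoughtsTokenCount": 24,
}

# Tokens not explained by prompt + visible output are thinking tokens.
hidden = usage["totalTokenCount"] - usage["promptTokenCount"] - usage["candidatesTokenCount"]
print(hidden)
# 24
```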

Key Differences from the OpenAI-Compatible Endpoint

| Feature                   | Gemini native (/v1beta/models/...)                 | OpenAI-compatible (/v1/chat/completions) |
| ------------------------- | -------------------------------------------------- | ---------------------------------------- |
| Thinking control          | thinkingConfig with thinkingLevel / thinkingBudget | Not available                            |
| Google Search grounding   | tools: [{"google_search": {}}]                     | Not available                            |
| Google Maps grounding     | tools: [{"googleMaps": {}}]                        | Not available                            |
| Image generation modality | responseModalities: ["IMAGE"]                      | Not available                            |
| Auth header               | x-goog-api-key or Bearer                           | Bearer                                   |
| Response format           | Gemini native (candidates, parts)                  | OpenAI format (choices, message)         |

Authorization

x-goog-api-key
string
header
Required

Your CometAPI key passed via the x-goog-api-key header. Bearer token authentication (Authorization: Bearer <key>) is also supported.

Path Parameters

model
string
Required

The Gemini model ID to use. See the Models page for current Gemini model IDs.

Example:

"gemini-2.5-flash"

operator
enum<string>
Required

The operation to perform. Use generateContent for synchronous responses, or streamGenerateContent?alt=sse for Server-Sent Events streaming.

Available options:
generateContent,
streamGenerateContent?alt=sse
Example:

"generateContent"
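The two path parameters compose into the request URL as shown at the top of this page. A small builder sketch; the function name is illustrative:

```python
BASE_URL = "https://api.cometapi.com/v1beta/models"

def endpoint(model: str, operator: str) -> str:
    """Compose the request URL from the model and operator path
    parameters (operator may carry a query string, as with
    streamGenerateContent?alt=sse)."""
    return f"{BASE_URL}/{model}:{operator}"

print(endpoint("gemini-2.5-flash", "generateContent"))
# https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent
```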

Request Body

application/json
contents
object[]
Required

The conversation history and current input. For single-turn queries, provide a single item. For multi-turn conversations, include all previous turns.
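For the multi-turn case, a sketch of what the contents array looks like; the roles alternate between "user" and "model", ending with the new user input (the turns themselves are made up for illustration):

```python
# A multi-turn `contents` array in the Gemini format: prior turns
# first, the new user message last.
contents = [
    {"role": "user", "parts": [{"text": "What is the capital of France?"}]},
    {"role": "model", "parts": [{"text": "Paris."}]},
    {"role": "user", "parts": [{"text": "And roughly how many people live there?"}]},
]

print([turn["role"] for turn in contents])
# ['user', 'model', 'user']
```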

systemInstruction
object

System instructions that guide the model's behavior across the entire conversation. Text only.

tools
object[]

Tools the model may use to generate responses. Supports function declarations, Google Search, Google Maps, and code execution.

toolConfig
object

Configuration for tool usage, such as function calling mode.

safetySettings
object[]

Safety filter settings. Override default thresholds for specific harm categories.

generationConfig
object

Configuration for model generation behavior including temperature, output length, and response format.

cachedContent
string

The name of cached content to use as context. Format: cachedContents/{id}. See the Gemini context caching documentation for details.

Response

200 - application/json

Successful response. For streaming requests, the response is a stream of SSE events, each containing a GenerateContentResponse JSON object prefixed with data:.

candidates
object[]

The generated response candidates.

promptFeedback
object

Feedback on the prompt, including safety blocking information.

usageMetadata
object

Token usage statistics for the request.

modelVersion
string

The model version that generated this response.

createTime
string

The timestamp when this response was created (ISO 8601 format).

responseId
string

Unique identifier for this response.