Gemini 콘텐츠 생성

POST

v1beta

models

{model}

{operator}

from google import genai

client = genai.Client(
    api_key="<COMETAPI_KEY>",
    http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
)

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain how AI works in a few words",
)

print(response.text)

{
  "candidates": [
    {
      "content": {
        "role": "<string>",
        "parts": [
          {
            "text": "<string>",
            "functionCall": {
              "name": "<string>",
              "args": {}
            },
            "inlineData": {
              "mimeType": "<string>",
              "data": "<string>"
            },
            "thought": true
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "<string>",
          "probability": "<string>",
          "blocked": true
        }
      ],
      "citationMetadata": {
        "citationSources": [
          {
            "startIndex": 123,
            "endIndex": 123,
            "uri": "<string>",
            "license": "<string>"
          }
        ]
      },
      "tokenCount": 123,
      "avgLogprobs": 123,
      "groundingMetadata": {
        "groundingChunks": [
          {
            "web": {
              "uri": "<string>",
              "title": "<string>"
            }
          }
        ],
        "groundingSupports": [
          {
            "groundingChunkIndices": [
              123
            ],
            "confidenceScores": [
              123
            ],
            "segment": {
              "startIndex": 123,
              "endIndex": 123,
              "text": "<string>"
            }
          }
        ],
        "webSearchQueries": [
          "<string>"
        ]
      },
      "index": 123
    }
  ],
  "promptFeedback": {
    "blockReason": "SAFETY",
    "safetyRatings": [
      {
        "category": "<string>",
        "probability": "<string>",
        "blocked": true
      }
    ]
  },
  "usageMetadata": {
    "promptTokenCount": 123,
    "candidatesTokenCount": 123,
    "totalTokenCount": 123,
    "trafficType": "<string>",
    "thoughtsTokenCount": 123,
    "promptTokensDetails": [
      {
        "modality": "<string>",
        "tokenCount": 123
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "<string>",
        "tokenCount": 123
      }
    ]
  },
  "modelVersion": "<string>",
  "createTime": "<string>",
  "responseId": "<string>"
}

개요

CometAPI는 Gemini 네이티브 API 형식을 지원하여 사고 제어, Google Search grounding, 네이티브 이미지 생성 modalities 등 Gemini 고유 기능에 완전히 접근할 수 있게 해줍니다. OpenAI 호환 채팅 엔드포인트에서 제공되지 않는 기능이 필요할 때 이 엔드포인트를 사용하세요.

빠른 시작

모든 Gemini SDK 또는 HTTP 클라이언트에서 base URL과 API 키를 교체하세요:

설정	Google 기본값	CometAPI
Base URL	`generativelanguage.googleapis.com`	`api.cometapi.com`
API Key	`$GEMINI_API_KEY`	`$COMETAPI_KEY`

인증에는 x-goog-api-key와 Authorization: Bearer 헤더를 모두 사용할 수 있습니다.

사고(추론)

Gemini 모델은 응답을 생성하기 전에 내부 추론을 수행할 수 있습니다. 제어 방식은 모델 세대에 따라 달라집니다.

Gemini 3 (thinkingLevel)
Gemini 2.5 (thinkingBudget)

Gemini 3 모델은 thinkingLevel을 사용해 추론 깊이를 제어합니다. 사용 가능한 수준: MINIMAL, LOW, MEDIUM, HIGH.

curl "https://api.cometapi.com/v1beta/models/gemini-3.1-pro-preview:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Explain quantum physics simply."}]}],
    "generationConfig": {
      "thinkingConfig": {"thinkingLevel": "LOW"}
    }
  }'

Gemini 2.5 모델은 세밀한 토큰(Token) 단위 제어를 위해 thinkingBudget을 사용합니다:

0 — 사고 비활성화
-1 — 동적(모델이 결정, 기본값)
> 0 — 특정 토큰(Token) 예산(예: 1024, 2048)

curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Solve this logic puzzle step by step."}]}],
    "generationConfig": {
      "thinkingConfig": {"thinkingBudget": 2048}
    }
  }'

Gemini 2.5 모델에서 thinkingLevel을 사용하거나(Gemini 3 모델에서 thinkingBudget을 사용하는 경우도 포함) 오류가 발생할 수 있습니다. 모델 버전에 맞는 올바른 파라미터를 사용하세요.

스트리밍(Streaming)

콘텐츠를 생성하는 동안 Server-Sent Events를 받으려면 operator로 streamGenerateContent?alt=sse를 사용하세요. 각 SSE 이벤트에는 JSON GenerateContentResponse 객체가 들어 있는 data: 줄이 포함됩니다.

curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  --no-buffer \
  -d '{
    "contents": [{"parts": [{"text": "Write a short poem about the stars"}]}]
  }'

시스템 지침

systemInstruction으로 전체 대화에 걸친 모델의 동작을 안내할 수 있습니다:

curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "What is 2+2?"}]}],
    "systemInstruction": {
      "parts": [{"text": "You are a math tutor. Always show your work."}]
    }
  }'

JSON 모드

responseMimeType를 사용해 구조화된 JSON 출력을 강제할 수 있습니다. 엄격한 스키마 검증이 필요하다면 responseSchema를 선택적으로 제공할 수 있습니다:

curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "List 3 planets with their distances from the sun"}]}],
    "generationConfig": {
      "responseMimeType": "application/json"
    }
  }'

Google Search Grounding

googleSearch 도구를 추가해 실시간 웹 검색을 활성화할 수 있습니다:

curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Who won the euro 2024?"}]}],
    "tools": [{"google_search": {}}]
  }'

응답에는 소스 URL과 신뢰도 점수가 포함된 groundingMetadata가 포함됩니다.

응답 예시

CometAPI의 Gemini 엔드포인트에서 반환되는 일반적인 응답 예시입니다:

{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [{"text": "Hello"}]
      },
      "finishReason": "STOP",
      "avgLogprobs": -0.0023
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 5,
    "candidatesTokenCount": 1,
    "totalTokenCount": 30,
    "trafficType": "ON_DEMAND",
    "thoughtsTokenCount": 24,
    "promptTokensDetails": [{"modality": "TEXT", "tokenCount": 5}],
    "candidatesTokensDetails": [{"modality": "TEXT", "tokenCount": 1}]
  },
  "modelVersion": "gemini-2.5-flash",
  "createTime": "2026-03-25T04:21:43.756483Z",
  "responseId": "CeynaY3LDtvG4_UP0qaCuQY"
}

usageMetadata의 thoughtsTokenCount 필드는 응답에 thinking 출력이 포함되지 않더라도, 모델이 내부 추론에 사용한 토큰 수를 보여줍니다.

OpenAI 호환 엔드포인트와의 주요 차이점

기능	Gemini 네이티브 (`/v1beta/models/...`)	OpenAI 호환 (`/v1/chat/completions`)
thinking 제어	`thinkingConfig`에서 `thinkingLevel` / `thinkingBudget` 사용	지원되지 않음
Google Search grounding	`tools: [\{"google_search": \{\}\}]`	지원되지 않음
Google Maps grounding	`tools: [\{"googleMaps": \{\}\}]`	지원되지 않음
이미지 생성 modality	`responseModalities: ["IMAGE"]`	지원되지 않음
인증 헤더	`x-goog-api-key` 또는 `Bearer`	`Bearer`만 지원
응답 형식	Gemini 네이티브 (`candidates`, `parts`)	OpenAI 형식 (`choices`, `message`)

인증

x-goog-api-key

string

header

필수

Your CometAPI key passed via the x-goog-api-key header. Bearer token authentication (Authorization: Bearer <key>) is also supported.

경로 매개변수

model

string

필수

The Gemini model ID to use. See the Models page for current Gemini model IDs.

예시:

"gemini-2.5-flash"

operator

enum<string>

필수

The operation to perform. Use generateContent for synchronous responses, or streamGenerateContent?alt=sse for Server-Sent Events streaming.

사용 가능한 옵션:

generateContent,

streamGenerateContent?alt=sse

예시:

"generateContent"

본문

application/json

contents

object[]

필수

The conversation history and current input. For single-turn queries, provide a single item. For multi-turn conversations, include all previous turns.

Show child attributes

systemInstruction

object

System instructions that guide the model's behavior across the entire conversation. Text only.

Show child attributes

tools

object[]

Tools the model may use to generate responses. Supports function declarations, Google Search, Google Maps, and code execution.

Show child attributes

toolConfig

object

Configuration for tool usage, such as function calling mode.

Show child attributes

safetySettings

object[]

Safety filter settings. Override default thresholds for specific harm categories.

Show child attributes

generationConfig

object

Configuration for model generation behavior including temperature, output length, and response format.

Show child attributes

cachedContent

string

The name of cached content to use as context. Format: cachedContents/{id}. See the Gemini context caching documentation for details.

응답

200 - application/json

Successful response. For streaming requests, the response is a stream of SSE events, each containing a GenerateContentResponse JSON object prefixed with data: .

candidates

object[]

The generated response candidates.

Show child attributes

promptFeedback

object

Feedback on the prompt, including safety blocking information.

Show child attributes

usageMetadata

object

Token usage statistics for the request.

Show child attributes

modelVersion

string

The model version that generated this response.

createTime

string

The timestamp when this response was created (ISO 8601 format).

responseId

string

Unique identifier for this response.

Anthropic 메시지

임베딩

from google import genai

client = genai.Client(
    api_key="<COMETAPI_KEY>",
    http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
)

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain how AI works in a few words",
)

print(response.text)

{
  "candidates": [
    {
      "content": {
        "role": "<string>",
        "parts": [
          {
            "text": "<string>",
            "functionCall": {
              "name": "<string>",
              "args": {}
            },
            "inlineData": {
              "mimeType": "<string>",
              "data": "<string>"
            },
            "thought": true
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "<string>",
          "probability": "<string>",
          "blocked": true
        }
      ],
      "citationMetadata": {
        "citationSources": [
          {
            "startIndex": 123,
            "endIndex": 123,
            "uri": "<string>",
            "license": "<string>"
          }
        ]
      },
      "tokenCount": 123,
      "avgLogprobs": 123,
      "groundingMetadata": {
        "groundingChunks": [
          {
            "web": {
              "uri": "<string>",
              "title": "<string>"
            }
          }
        ],
        "groundingSupports": [
          {
            "groundingChunkIndices": [
              123
            ],
            "confidenceScores": [
              123
            ],
            "segment": {
              "startIndex": 123,
              "endIndex": 123,
              "text": "<string>"
            }
          }
        ],
        "webSearchQueries": [
          "<string>"
        ]
      },
      "index": 123
    }
  ],
  "promptFeedback": {
    "blockReason": "SAFETY",
    "safetyRatings": [
      {
        "category": "<string>",
        "probability": "<string>",
        "blocked": true
      }
    ]
  },
  "usageMetadata": {
    "promptTokenCount": 123,
    "candidatesTokenCount": 123,
    "totalTokenCount": 123,
    "trafficType": "<string>",
    "thoughtsTokenCount": 123,
    "promptTokensDetails": [
      {
        "modality": "<string>",
        "tokenCount": 123
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "<string>",
        "tokenCount": 123
      }
    ]
  },
  "modelVersion": "<string>",
  "createTime": "<string>",
  "responseId": "<string>"
}

개요

API 레퍼런스

통합 가이드

오류

요금 및 결제

지원

개요

빠른 시작

사고(추론)

스트리밍(Streaming)

시스템 지침

JSON 모드

Google Search Grounding

응답 예시

OpenAI 호환 엔드포인트와의 주요 차이점

인증

경로 매개변수

본문

응답

개요

API 레퍼런스

통합 가이드

오류

요금 및 결제

지원

​개요

​빠른 시작

​사고(추론)

​스트리밍(Streaming)

​시스템 지침

​JSON 모드

​Google Search Grounding

​응답 예시

​OpenAI 호환 엔드포인트와의 주요 차이점

인증

경로 매개변수

본문

응답

개요

빠른 시작

사고(추론)

스트리밍(Streaming)

시스템 지침

JSON 모드

Google Search Grounding

응답 예시

OpenAI 호환 엔드포인트와의 주요 차이점