Embeddings

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cometapi.com/v1",
    api_key="<COMETAPI_KEY>",
)

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The food was delicious and the waiter was friendly.",
)

print(response.data[0].embedding[:5])  # First 5 dimensions
print(f"Dimensions: {len(response.data[0].embedding)}")

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [-0.0021, -0.0491, 0.0209, 0.0314, -0.0453]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 2,
    "total_tokens": 2
  }
}

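Overview

The Embeddings API generates vector representations of text that capture its meaning. These vectors can be used for semantic search, clustering, classification, anomaly detection, and retrieval-augmented generation (RAG).

CometAPI supports embedding models from multiple providers. Pass one or more text strings, and the API returns numeric vectors that you can store in a vector database or use directly in similarity calculations.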
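To make "use directly in similarity calculations" concrete, here is a minimal sketch of cosine similarity over two embedding vectors. The helper names `cosine_similarity` and `rank_by_similarity` are illustrative and not part of the API; the vectors would come from the `embedding` field of a response.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[int]:
    """Return candidate indices ordered from most to least similar to query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
```

For a semantic search, `query` would be the embedding of the search text and `candidates` the stored document embeddings. A vector database typically performs this ranking at scale.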

Available models

Model	Dimensions	Max Tokens	Best For
`text-embedding-3-large`	3,072 (adjustable)	8,191	Highest-quality embeddings
`text-embedding-3-small`	1,536 (adjustable)	8,191	Cost-efficient and fast
`text-embedding-ada-002`	1,536 (fixed)	8,191	Legacy compatibility

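See the model list for all available embedding models and pricing.

Important notes

Dimension reduction: The text-embedding-3-* models support the dimensions parameter, which shortens the embedding vector without a large loss of accuracy. This can cut storage costs by up to 75% while preserving most of the semantic information.

Batch input: Pass an array of strings in the input parameter to embed multiple texts in a single request. This is far more efficient than sending a separate request for each text.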
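As a rough illustration of where a 75% figure can come from, the sketch below compares storage for full-size and reduced vectors. It assumes each value is stored as a 4-byte float32 and picks 384 as the reduced size; both are assumptions for illustration, not values stated on this page.

```python
def storage_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw storage for the vectors alone, ignoring any index overhead."""
    return num_vectors * dimensions * bytes_per_value


def savings_fraction(full_dimensions: int, reduced_dimensions: int) -> float:
    """Fraction of storage saved by shrinking each vector."""
    return 1 - reduced_dimensions / full_dimensions


# text-embedding-3-small: 1,536 dimensions by default, 384 requested (assumed).
full = storage_bytes(1_000_000, 1536)
reduced = storage_bytes(1_000_000, 384)
saved = savings_fraction(1536, 384)

# Request arguments that ask the API for the shorter vector.
request_kwargs = {
    "model": "text-embedding-3-small",
    "input": "The food was delicious and the waiter was friendly.",
    "dimensions": 384,
}
```

With the client from the request example, these arguments would be passed as `client.embeddings.create(**request_kwargs)`. The right target size depends on how much retrieval quality the application can trade for storage.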
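One way to apply batching to a large corpus is to split the texts into fixed-size groups and send one request per group. The sketch below takes the request function as a parameter so it stays independent of any client library. The names `chunked` and `embed_all` and the default batch size of 100 are illustrative assumptions, not documented limits.

```python
from typing import Callable, Iterator, List, Sequence

Vector = List[float]


def chunked(texts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of texts with at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


def embed_all(
    embed_batch: Callable[[List[str]], List[Vector]],
    texts: Sequence[str],
    batch_size: int = 100,  # assumed value; tune to your payload sizes
) -> List[Vector]:
    """Embed every text, one request per batch, preserving input order."""
    vectors: List[Vector] = []
    for batch in chunked(texts, batch_size):
        result = embed_batch(batch)
        if len(result) != len(batch):
            raise ValueError("expected one vector per input text")
        vectors.extend(result)
    return vectors
```

With the client from the request example, `embed_batch` could be `lambda batch: [item.embedding for item in client.embeddings.create(model="text-embedding-3-small", input=batch).data]`.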

Authentication

Authorization (string, header, required)

Bearer token authentication. Use your CometAPI key.
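For callers not using an SDK, the sketch below shows how the bearer token might be attached to a raw HTTP request using only the standard library. The endpoint path is an assumption: it is the base URL from the request example with `/embeddings` appended, which is what OpenAI-compatible clients normally call. `build_embeddings_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumed endpoint: base URL from the request example plus "/embeddings".
EMBEDDINGS_URL = "https://api.cometapi.com/v1/embeddings"


def build_embeddings_request(api_key: str, model: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the bearer token."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and parsing the JSON body of the reply.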

Body

application/json

model (string, required)

The embedding model to use. See the Models page for current embedding model IDs.

Example: "text-embedding-3-small"

input (required)

The text to embed. Can be a single string, an array of strings, or an array of token arrays. Each input must not exceed the model's maximum token limit (8,191 tokens for text-embedding-3-* models).

encoding_format (enum<string>, default: float)

The format of the returned embedding vectors. float returns an array of floating-point numbers. base64 returns a base64-encoded string representation, which can reduce response size for large batches.

Available options: float, base64
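When base64 is requested, the client has to turn the string back into numbers. The sketch below assumes the payload is a packed array of little-endian 32-bit floats, which is how OpenAI-compatible APIs commonly encode it. This page does not state the byte layout, so verify it against a real response before relying on it. Both helper names are illustrative.

```python
import base64
import struct
from typing import List, Sequence


def decode_base64_embedding(encoded: str) -> List[float]:
    """Decode a base64 string into floats (assumes little-endian float32)."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


def encode_base64_embedding(values: Sequence[float]) -> str:
    """Inverse of decode_base64_embedding, handy for local round-trip checks."""
    packed = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(packed).decode("ascii")
```

Decoded values are float32, so they may differ in the last digits from the same embedding returned in float format.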

dimensions (integer)

The number of dimensions for the output embedding vector. Only supported by text-embedding-3-* models. Reducing dimensions can lower storage costs while maintaining most of the embedding's utility.

Required range: x >= 1

user (string)

A unique identifier for your end-user, which can help monitor and detect abuse.

Response

200 - application/json

A list of embedding vectors for the input text(s).

object (enum<string>)

The object type, always list.

Available options: list

Example: "list"

data (object[])

An array of embedding objects, one per input text. When multiple inputs are provided, results are returned in the same order as the input.
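Because each embedding object carries an index, a batch response can be matched back to its inputs explicitly. The sketch below works on the parsed JSON response as a plain dict, using the field names from the sample response on this page. Sorting by index is a defensive step, since the API already returns results in input order. `pair_inputs_with_embeddings` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Sequence, Tuple


def pair_inputs_with_embeddings(
    inputs: Sequence[str], response: Dict[str, Any]
) -> List[Tuple[str, List[float]]]:
    """Return (text, embedding) pairs aligned by each item's index field."""
    data = sorted(response["data"], key=lambda item: item["index"])
    if len(data) != len(inputs):
        raise ValueError("response does not contain one embedding per input")
    return [(text, item["embedding"]) for text, item in zip(inputs, data)]
```

The resulting pairs can be written straight to a vector store together with any document identifiers the application tracks.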

model (string)

The model used to generate the embeddings.

Example: "text-embedding-3-small"

usage (object)

Token usage statistics for this request.
