Use the CometAPI POST /v1/audio/transcriptions endpoint to transcribe audio into text in its original language. It supports Whisper models and multiple output formats.
from openai import OpenAI

# Point the OpenAI SDK at the CometAPI base URL.
client = OpenAI(
    api_key="<COMETAPI_KEY>",
    base_url="https://api.cometapi.com/v1",
)

# Open the audio in binary mode; the context manager closes it after the upload.
with open("audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(transcription.text)

{
  "text": "Hello, welcome to CometAPI."
}

Authorization: Bearer token authentication. Use your CometAPI key.
file: The audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.
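Checking the extension locally avoids uploading a file the endpoint is likely to reject. The following is a minimal sketch, assuming the extension reflects the actual container format; `SUPPORTED_EXTENSIONS` and `is_supported_audio` are hypothetical names introduced here, not part of CometAPI or the OpenAI SDK.

```python
from pathlib import Path

# Extensions copied from the supported-formats list above.
SUPPORTED_EXTENSIONS = {
    "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one of the listed formats.

    This only inspects the file name; it does not verify the file contents.
    """
    suffix = Path(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS
```

For example, `is_supported_audio("audio.mp3")` is true, while a path such as `notes.txt` or one without an extension is rejected before any request is made.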
model: The speech-to-text model to use. Choose a current speech model from the Models page.
language: The language of the input audio in ISO-639-1 format (e.g., en, zh, ja). Supplying the language improves accuracy and latency.
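A language hint can be normalized before it is sent. This sketch only checks the two-letter shape of an ISO-639-1 code; it does not confirm that the code is actually assigned. `normalize_language` is a hypothetical helper, and the commented usage assumes the request accepts the hint through the OpenAI SDK's `language` argument.

```python
def normalize_language(code: str) -> str:
    """Lowercase a language hint and check it has the two-letter ISO-639-1 shape."""
    cleaned = code.strip().lower()
    if len(cleaned) != 2 or not (cleaned.isascii() and cleaned.isalpha()):
        raise ValueError(f"expected a two-letter ISO-639-1 code, got {code!r}")
    return cleaned


# Hypothetical usage, assuming `client` is an OpenAI client configured for CometAPI:
# with open("audio.mp3", "rb") as audio_file:
#     transcription = client.audio.transcriptions.create(
#         model="whisper-1",
#         file=audio_file,
#         language=normalize_language("EN"),
#     )
```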
prompt: Optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
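One way to use the prompt for continuation is to feed the end of each segment's transcript into the request for the next segment. The sketch below assumes the audio has already been split into ordered segments and that the SDK's `prompt` argument carries the text. `transcribe_segments` and the 200-character cutoff are illustrative choices made here, not values documented by CometAPI.

```python
def transcribe_segments(client, segments, model="whisper-1", tail_chars=200):
    """Transcribe ordered audio segments, passing earlier text as the prompt.

    `client` is an OpenAI-style client; `segments` is an iterable of binary
    file objects. Returns the list of transcribed texts, one per segment.
    """
    texts = []
    previous = ""
    for audio in segments:
        kwargs = {"model": model, "file": audio}
        if previous:
            # Only the tail of the previous transcript is sent as context.
            kwargs["prompt"] = previous[-tail_chars:]
        result = client.audio.transcriptions.create(**kwargs)
        texts.append(result.text)
        previous = result.text
    return texts
```

The first segment is sent without a prompt; every later request carries the tail of the transcript that came just before it.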
response_format: The output format for the transcription. Options: json, text, srt, verbose_json, vtt.
temperature: Sampling temperature between 0 and 1. Higher values produce more random output; lower values are more focused. When set to 0, the model adjusts the temperature automatically using log probability. Range: 0 <= temperature <= 1.
Response: The transcription result.
text: The transcribed text.
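The output format and temperature constraints above can be checked on the client side before a request is sent. This is a minimal sketch, assuming the options are passed through the OpenAI SDK's `response_format` and `temperature` arguments; `RESPONSE_FORMATS` and `transcription_options` are hypothetical names used only for illustration.

```python
# Output formats copied from the list above.
RESPONSE_FORMATS = ("json", "text", "srt", "verbose_json", "vtt")


def transcription_options(response_format="json", temperature=0.0):
    """Validate the optional settings and return them as keyword arguments."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0 <= temperature <= 1:
        raise ValueError(f"temperature must be between 0 and 1, got {temperature!r}")
    return {"response_format": response_format, "temperature": temperature}
```

The returned dictionary can be spread into the call, for example `client.audio.transcriptions.create(model="whisper-1", file=audio_file, **transcription_options("srt", 0.2))`, so that invalid values fail locally instead of on the server.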