Chat Completions

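Overview

The Chat Completions API is an endpoint that generates AI model responses from conversation messages. It follows a specification compatible with the OpenAI Chat Completions API, so existing OpenAI-based applications can be migrated easily and the OpenAI SDK can be used as-is.

Endpoint: POST https://api.clovastudio.go.kr/api/v1/chat/completions
Authentication: send the header Authorization: Bearer YOUR_API_KEY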
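
For clients that do not use the OpenAI SDK, the endpoint and header above can be combined into a plain HTTP request. The following is a minimal sketch using only Python's standard library. `build_request` is a hypothetical helper name, the Content-Type header is an assumption (usual for JSON bodies, but not stated on this page), and the request is only constructed here, not sent.

```python
import json
import urllib.request

ENDPOINT = "https://api.clovastudio.go.kr/api/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Chat Completions endpoint."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: JSON content type, as is usual for OpenAI-compatible APIs.
            "Content-Type": "application/json",
        },
    )


req = build_request(
    "YOUR_API_KEY",
    {"model": "HCX-GOV", "messages": [{"role": "user", "content": "안녕하세요!"}]},
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually send the request.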

For the supported models and an overview of each, see the Language Model Types page.

Common Parameters

Request Example (Request Form)

{
  "model": "HCX-GOV-THINK",
  "messages": [
    {"role": "system", "content": "당신은 친절한 AI 어시스턴트입니다."},
    {"role": "user", "content": "안녕하세요!"}
  ],
  "temperature": 0.0,
  "top_p": 1.0,
  "max_tokens": 1024,
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "현재 날씨를 조회합니다.",
        "parameters": {
          "type": "object",
          "required": ["location"],
          "properties": {
            "location": {"type": "string", "description": "도시명"}
          }
        }
      }
    }
  ],
  "tool_choice": "auto",
  "stop": ["<|im_end|><|endofturn|>", "<|im_end|><|stop|>"],
  "frequency_penalty": 0.0,
  "presence_penalty": 0.0,
  "skip_special_tokens": false
}

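Parameter Descriptions

model: ID of the model to use.
messages: Array of conversation messages.
- system: Defines the model's role and rules.
- user: The user's question or request.
- assistant: Carries previous responses when passing multi-turn history.
temperature: Controls the diversity of the output. (0.0 to 2.0, default 1.0)
- Lower values give more consistent responses; higher values give more creative ones.
top_p: Cumulative-probability sampling threshold. (0.0 to 1.0, default 1.0)
max_tokens: Maximum number of tokens to generate in the response.
stream: When set to true, the response is streamed in SSE format.
tools: Function-calling tool definitions (JSON Schema).
tool_choice: How tools are selected. (auto, none, or a specific function)
stop: Token generation stops when any of the specified strings is generated.
frequency_penalty: Strength of repetition suppression. (-2.0 to 2.0, default 0.0)
presence_penalty: Strength of the push toward new topics. (-2.0 to 2.0, default 0.0)
skip_special_tokens: Whether to exclude special tokens. (default false)

When stream is true, the response is delivered sequentially as delta chunks, and a data: [DONE] event is sent at the end.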
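
The documented ranges can be checked on the client before a request is sent. The sketch below is one possible way to do that; `build_payload` and `RANGES` are hypothetical names, and whether the server itself rejects out-of-range values is not stated on this page.

```python
# Ranges copied from the parameter descriptions above.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def build_payload(model: str, messages: list, **params) -> dict:
    """Return a request body, raising ValueError for out-of-range values."""
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return {"model": model, "messages": messages, **params}


payload = build_payload(
    "HCX-GOV",
    [{"role": "user", "content": "안녕하세요!"}],
    temperature=0.0,
    max_tokens=1024,
)
print(payload["temperature"], payload["max_tokens"])
```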

Model-Specific Guides

HCX-GOV-THINK / HCX-GOV

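These are reasoning-capable language models based on NAVER's HyperCLOVAX-007.

HCX-GOV-THINK: Reasoning is enabled by default. Suited to complex problem solving, logical analysis, and code debugging.
HCX-GOV: Reasoning is disabled by default. Suited to general conversation, document writing, and simple queries where a fast response matters.

For both models, reasoning behavior can be controlled explicitly with chat_template_kwargs.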

chat_template_kwargs

Parameter	Type	Description
`force_reasoning`	boolean	Forces reasoning on. The thought process is returned in the `reasoning_content` field.
`skip_reasoning`	boolean	Forces reasoning off. Only `content` is returned, immediately.

Do not set force_reasoning and skip_reasoning to true at the same time.
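
One way to make the conflicting combination impossible is to derive both flags from a single switch. This is a minimal sketch; `reasoning_kwargs` is a hypothetical helper, and passing its result through the SDK's `extra_body` follows the code examples on this page.

```python
from typing import Optional


def reasoning_kwargs(reasoning: Optional[bool]) -> dict:
    """Map a tri-state switch to an extra_body fragment.

    None  -> {} (keep the model's default behavior)
    True  -> force_reasoning
    False -> skip_reasoning
    """
    if reasoning is None:
        return {}
    key = "force_reasoning" if reasoning else "skip_reasoning"
    # Only one key is ever emitted, so both flags cannot be true together.
    return {"chat_template_kwargs": {key: True}}


print(reasoning_kwargs(True))
print(reasoning_kwargs(False))
```

A call would then look like `client.chat.completions.create(..., extra_body=reasoning_kwargs(False))`.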

Note for multi-turn conversations: do not include the reasoning_content of a previous response in the messages of the next turn. Include only the content field.
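
The multi-turn rule can be applied with a small filter before a previous answer is appended to the history. The sketch below assumes the assistant message is available as a plain dict with the fields used in this document's response examples; `to_history_message` is a hypothetical helper and covers plain text answers only, not tool-call turns.

```python
def to_history_message(assistant_message: dict) -> dict:
    """Keep only role and content; reasoning_content is intentionally dropped."""
    return {"role": "assistant", "content": assistant_message.get("content") or ""}


# Illustrative previous answer, shaped like the response examples in this document.
previous = {
    "role": "assistant",
    "reasoning_content": "(thought process)",
    "content": "안녕하세요! 오늘 어떻게 도와드릴까요? 😊",
    "tool_calls": [],
}

messages = [{"role": "user", "content": "안녕!"}]
messages.append(to_history_message(previous))
messages.append({"role": "user", "content": "서울의 현재 날씨를 알려주세요."})
print([m["role"] for m in messages])
```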

Code Example: HCX-GOV-THINK (default: reasoning enabled)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.clovastudio.go.kr/api/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="HCX-GOV-THINK",
    messages=[
        {
            "role": "system",
            "content": "for maximum 1024 tokens, and do not say or answer your used token for thinking"
        },
        {
            "role": "user",
            "content": "안녕!"
        }
    ],
    max_tokens=16384,
    temperature=0,
    stop=["<|im_end|><|endofturn|>", "<|im_end|><|stop|>"],
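    # Parameters outside the OpenAI SDK signature are passed through extra_body.
    extra_body={
        "skip_special_tokens": False,
        "chat_template_kwargs": {"force_reasoning": True},
    },
)

# reasoning_content holds the thought process; content holds the final answer.
print("Reasoning:", response.choices[0].message.reasoning_content)
print("Final answer:", response.choices[0].message.content)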

Response Example: HCX-GOV-THINK (stream: false)

When reasoning mode is enabled, the thought process is returned in reasoning_content and the final answer in content.

{
  "id": "chatcmpl-bfc3d6c4e2724109a707e66368abefe4",
  "object": "chat.completion",
  "created": 1776910952,
  "model": "HCX-GOV-THINK",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "reasoning_content": "오늘 사용자가 \"안녕!\"이라고 인사했어. 한국어로 응답해야 하니까 \"안녕하세요!\"라고 답하는 게 좋겠지. 짧고 친절하게.",
        "content": "안녕하세요! 오늘 어떻게 도와드릴까요? 😊",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": "<|im_end|><|endofturn|>"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 185,
    "total_tokens": 227
  }
}

Response Example: HCX-GOV-THINK (stream: true)

When streaming, delta.reasoning_content chunks are sent first, followed by delta.content chunks.

data: {"id":"chatcmpl-e8ee4858266b48128b02d7b8813ce349","object":"chat.completion.chunk","created":1776911065,"model":"HCX-GOV-THINK","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-e8ee4858266b48128b02d7b8813ce349","object":"chat.completion.chunk","created":1776911065,"model":"HCX-GOV-THINK","choices":[{"index":0,"delta":{"reasoning_content":"오늘 사용자가 \"안녕!\"이라고 인사했어."},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-e8ee4858266b48128b02d7b8813ce349","object":"chat.completion.chunk","created":1776911065,"model":"HCX-GOV-THINK","choices":[{"index":0,"delta":{"reasoning_content":" 짧고 친절하게 답변해야 해."},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-e8ee4858266b48128b02d7b8813ce349","object":"chat.completion.chunk","created":1776911065,"model":"HCX-GOV-THINK","choices":[{"index":0,"delta":{"content":"안녕하세요!"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-e8ee4858266b48128b02d7b8813ce349","object":"chat.completion.chunk","created":1776911065,"model":"HCX-GOV-THINK","choices":[{"index":0,"delta":{"content":" 오늘 어떻게 도와드릴까요? 😊"},"logprobs":null,"finish_reason":"stop","stop_reason":"<|im_end|><|endofturn|>"}]}

data: [DONE]
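
Without the SDK, a stream of this shape can be consumed by reading the `data:` lines one at a time. The following is a minimal sketch that works on already-received lines; `collect_stream` is a hypothetical helper, and the sample lines are abbreviated stand-ins for the chunks above (fields such as `id` and `model` are omitted).

```python
import json


def collect_stream(lines):
    """Accumulate reasoning text, answer text, and finish_reason from SSE lines."""
    reasoning, content, finish_reason = [], [], None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip the blank separator lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        for choice in json.loads(data).get("choices", []):
            delta = choice.get("delta", {})
            reasoning.append(delta.get("reasoning_content") or "")
            content.append(delta.get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(reasoning), "".join(content), finish_reason


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"reasoning_content":"thinking..."},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
print(collect_stream(sample))
```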

Request / Response Example: HCX-GOV-THINK (Tool Call)

When a tool call is needed, finish_reason is returned as "tool_calls" and the tool_calls array contains the call information.

# Reuses the `client` created in the first code example.
response = client.chat.completions.create(
    model="HCX-GOV-THINK",
    messages=[
        {"role": "system", "content": "도구가 필요할 때는 tool_calls만 반환하세요."},
        {"role": "user", "content": "서울의 현재 날씨를 알려주세요."}
    ],
    max_tokens=512,
    temperature=0,
    stop=["<|im_end|><|endofturn|>", "<|im_end|><|stop|>"],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "특정 도시의 현재 날씨를 조회합니다.",
                "parameters": {
                    "type": "object",
                    "required": ["location"],
                    "properties": {
                        "location": {"type": "string", "description": "날씨를 조회할 도시 이름"}
                    }
                }
            }
        }
    ],
    tool_choice="auto",
    extra_body={"skip_special_tokens": False},
)

{
  "id": "chatcmpl-a3c53bb075a94eba91694b50b14d66e1",
  "object": "chat.completion",
  "created": 1776911148,
  "model": "HCX-GOV-THINK",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "reasoning_content": null,
        "content": null,
        "tool_calls": [
          {
            "id": "chatcmpl-tool-e352682269174fbca0addbad8fb9bef2",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"서울\"}"
            }
          }
        ]
      },
      "logprobs": null,
      "finish_reason": "tool_calls",
      "stop_reason": "<|im_end|><|stop|>"
    }
  ],
  "usage": {
    "prompt_tokens": 99,
    "completion_tokens": 25,
    "total_tokens": 124
  }
}
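
After a response like this arrives, the caller typically runs the named function locally and sends the result back in a follow-up request. The sketch below shows that step under stated assumptions: `get_weather` here is a hypothetical local stub returning canned data, and the follow-up message shape (role "tool" with `tool_call_id`) follows the OpenAI convention, which this page does not document explicitly.

```python
import json


def get_weather(location: str) -> str:
    """Hypothetical local implementation; returns canned data for illustration."""
    return json.dumps({"location": location, "condition": "unknown"}, ensure_ascii=False)


TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(message: dict) -> list:
    """Execute each tool call and build the follow-up messages.

    Assumption: results are returned as role="tool" messages keyed by
    tool_call_id, mirroring the OpenAI Chat Completions convention.
    """
    results = []
    for call in message.get("tool_calls") or []:
        func = call["function"]
        args = json.loads(func["arguments"])  # arguments arrive as a JSON string
        output = TOOL_REGISTRY[func["name"]](**args)
        results.append({"role": "tool", "tool_call_id": call["id"], "content": output})
    return results


# Assistant message copied from the response example above.
message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "chatcmpl-tool-e352682269174fbca0addbad8fb9bef2",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"location": "서울"}'},
        }
    ],
}
tool_messages = run_tool_calls(message)
print(tool_messages[0]["content"])
```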

Streaming Response Example: HCX-GOV-THINK (Tool Call, stream: true)

In streaming tool calls, HCX-GOV-THINK delivers only arguments in delta.tool_calls[], without the call id or the function name. The id and name are available only in non-streaming responses.

data: {"id":"chatcmpl-f44dc0a3056340f38c29e890dc85f4c0","object":"chat.completion.chunk","created":1776912639,"model":"HCX-GOV-THINK","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-f44dc0a3056340f38c29e890dc85f4c0","object":"chat.completion.chunk","created":1776912639,"model":"HCX-GOV-THINK","choices":[{"index":0,"delta":{"content":null},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-f44dc0a3056340f38c29e890dc85f4c0","object":"chat.completion.chunk","created":1776912639,"model":"HCX-GOV-THINK","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"location\": \"서울\"}"}}]},"logprobs":null,"finish_reason":"tool_calls","stop_reason":"<|im_end|><|stop|>"}]}

data: [DONE]
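
On the client side, the streamed `arguments` strings can be collected per tool-call index and parsed once the stream ends. This is a sketch over already-parsed chunk dicts; `collect_tool_arguments` is a hypothetical helper. In the example above the arguments arrive in a single chunk, so the two-fragment sample below is an assumption made only to show the concatenation.

```python
import json


def collect_tool_arguments(chunks):
    """Concatenate streamed `arguments` fragments per tool-call index, then parse."""
    buffers = {}
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            for call in choice.get("delta", {}).get("tool_calls") or []:
                fragment = call.get("function", {}).get("arguments") or ""
                buffers[call["index"]] = buffers.get(call["index"], "") + fragment
    return {index: json.loads(text) for index, text in buffers.items()}


# Hypothetical chunks: the arguments are split in two to exercise the buffering.
chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '{"location": '}}]}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '"서울"}'}}]}, "finish_reason": "tool_calls"}]},
]
print(collect_tool_arguments(chunks))
```

Because the function name is not streamed, the parsed arguments alone do not say which tool was called; a caller that needs the id or name would use a non-streaming request, as noted above.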

Code Example: HCX-GOV (default: fast response)

# Reuses the `client` created in the first code example.
response = client.chat.completions.create(
    model="HCX-GOV",
    messages=[
        {"role": "user", "content": "안녕!"}
    ],
    max_tokens=16384,
    temperature=0,
)
print(response.choices[0].message.content)

Response Example: HCX-GOV (stream: false)

When reasoning is disabled, reasoning_content is returned as null.

{
  "id": "chatcmpl-74db3a45479044748cc015b7f891b7c0",
  "object": "chat.completion",
  "created": 1776911261,
  "model": "HCX-GOV",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "reasoning_content": null,
        "content": "안녕하세요! 무엇을 도와드릴까요? 😊",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": "<|im_end|><|endofturn|>"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 7,
    "total_tokens": 20
  }
}

Response Example: HCX-GOV (stream: true)

Because there is no reasoning, only delta.content chunks are sent, in order.

data: {"id":"chatcmpl-038a2a12108647d5aa47742a3e311281","object":"chat.completion.chunk","created":1776911088,"model":"HCX-GOV","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-038a2a12108647d5aa47742a3e311281","object":"chat.completion.chunk","created":1776911088,"model":"HCX-GOV","choices":[{"index":0,"delta":{"content":"안녕하세요"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-038a2a12108647d5aa47742a3e311281","object":"chat.completion.chunk","created":1776911088,"model":"HCX-GOV","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-038a2a12108647d5aa47742a3e311281","object":"chat.completion.chunk","created":1776911088,"model":"HCX-GOV","choices":[{"index":0,"delta":{"content":" 무엇을 도와드릴까요?"},"logprobs":null,"finish_reason":"stop","stop_reason":"<|im_end|><|endofturn|>"}]}

data: [DONE]

usage.completion_tokens is the combined token count of reasoning_content and content. Token consumption can increase substantially when reasoning mode is used.

Streaming

Setting stream: true streams the response in real time.

Reasoning Delta Keys by Model

Each reasoning model uses a different field name to deliver its thought process during streaming.

Model	Reasoning delta key	Notes
HCX-GOV-THINK	`delta.reasoning_content`
HCX-GOV	(none)	No reasoning; only `delta.content` is sent
HCX-GOV-THINK-V1-32B	`delta.reasoning` + `delta.reasoning_content`	Both fields carry the same content
K-Exaone-236B	`delta.reasoning`	No `delta.reasoning_content`
LLM42-Gemma4	`delta.reasoning`	No `delta.reasoning_content`
openai/gpt-oss-120b	`delta.reasoning_content`	The final chunk ends with an empty `delta: {}`
LLM42	(none)	No reasoning; only `delta.content` is sent
LLM42-Translate	(none)	No reasoning; only `delta.content` is sent

from openai import OpenAI

client = OpenAI(
    base_url="https://api.clovastudio.go.kr/api/v1",
    api_key="YOUR_API_KEY",
)

stream = client.chat.completions.create(
    model="HCX-GOV-THINK",
    messages=[{"role": "user", "content": "안녕하세요!"}],
    max_tokens=2048,
    stop=["<|im_end|><|endofturn|>", "<|im_end|><|stop|>"],
    stream=True,
    extra_body={"skip_special_tokens": False},
)

reasoning_content = ""
content = ""

for chunk in stream:
    delta = chunk.choices[0].delta
    # reasoning_content is not a standard SDK field, so read it defensively.
    reasoning_piece = getattr(delta, "reasoning_content", None)
    if reasoning_piece:
        reasoning_content += reasoning_piece
    if delta.content:
        content += delta.content
        # Print the answer text as it arrives.
        print(delta.content, end="", flush=True)

With reasoning models, the reasoning delta is streamed first, followed by content. The reasoning delta key differs by model, so handle it according to the table above.
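
The per-model differences can be captured in a lookup so that one loop serves every model. The sketch below transcribes the table above into a dict and reads deltas as plain dicts; `REASONING_DELTA_KEYS` and `reasoning_delta` are hypothetical names. With SDK objects, `getattr(delta, key, None)` would take the place of `delta.get(key)`.

```python
# Transcribed from the table above; keys are tried in the order listed.
REASONING_DELTA_KEYS = {
    "HCX-GOV-THINK": ("reasoning_content",),
    "HCX-GOV": (),
    "HCX-GOV-THINK-V1-32B": ("reasoning", "reasoning_content"),
    "K-Exaone-236B": ("reasoning",),
    "LLM42-Gemma4": ("reasoning",),
    "openai/gpt-oss-120b": ("reasoning_content",),
    "LLM42": (),
    "LLM42-Translate": (),
}


def reasoning_delta(model: str, delta: dict) -> str:
    """Return the reasoning text carried by one delta, or "" if there is none.

    Only the first non-empty key is used, so a model that sends the same text
    in two fields (HCX-GOV-THINK-V1-32B) is not counted twice.
    """
    for key in REASONING_DELTA_KEYS.get(model, ()):
        value = delta.get(key)
        if value:
            return value
    return ""


print(reasoning_delta("K-Exaone-236B", {"reasoning": "step 1"}))
print(reasoning_delta("HCX-GOV-THINK-V1-32B", {"reasoning": "a", "reasoning_content": "a"}))
```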

Response Format

Field	Type	Description
`id`	string	Unique request identifier
`object`	string	`"chat.completion"`
`created`	integer	Unix timestamp
`model`	string	Name of the model used
`choices[].message.role`	string	Always `"assistant"`
`choices[].message.content`	string | null	Final answer. Empty when the response contains only a tool call (the Tool Call example above returns `null`)
`choices[].message.reasoning_content`	string | null	Reasoning process. Returned only by HCX-GOV-THINK family models
`choices[].message.reasoning`	string | null	Reasoning process. Returned by K-Exaone-236B and LLM42-Gemma4
`choices[].message.tool_calls`	array	List of parsed tool calls
`choices[].finish_reason`	string	`"stop"`, `"length"`, `"tool_calls"`, etc.
`usage.prompt_tokens`	integer	Number of input tokens
`usage.completion_tokens`	integer	Number of generated tokens (reasoning + answer combined)
`usage.total_tokens`	integer	Total number of tokens
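
As an illustration of how these fields fit together, the sketch below pulls the commonly used ones out of a response body that has already been parsed into a dict. `summarize_completion` is a hypothetical helper, and the sample is a trimmed copy of the HCX-GOV response example.

```python
def summarize_completion(response: dict) -> dict:
    """Pull the commonly used fields out of a non-streaming response body."""
    choice = response["choices"][0]
    message = choice["message"]
    return {
        "answer": message.get("content") or "",
        # Models differ in which field carries the thought process.
        "reasoning": message.get("reasoning_content") or message.get("reasoning"),
        "tool_calls": message.get("tool_calls") or [],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": response["usage"]["total_tokens"],
    }


sample_response = {
    "id": "chatcmpl-74db3a45479044748cc015b7f891b7c0",
    "object": "chat.completion",
    "model": "HCX-GOV",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "reasoning_content": None,
                "content": "안녕하세요! 무엇을 도와드릴까요? 😊",
                "tool_calls": [],
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 13, "completion_tokens": 7, "total_tokens": 20},
}
summary = summarize_completion(sample_response)
print(summary["answer"], summary["finish_reason"])
```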

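Error Responses

The main error codes, per the OpenAPI spec, are as follows.

HTTP status code	Meaning	What to check
`400`	Bad request	Check `model`, `messages`, and the keys/types of model-specific `chat_template_kwargs`
`401`	Authentication failed	Check the `Authorization: Bearer <API_KEY>` header
`500`	Internal server error	Retry the same request; if the error persists, contact the operations staff

An example of the error body format is shown below.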

{
  "error": {
    "message": "Invalid request",
    "type": "invalid_request_error",
    "code": "400"
  }
}
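
The table and error body above suggest a simple client-side policy: report 400 and 401 immediately, and retry on 500. The following is a sketch of that policy, not a prescribed implementation. `describe_error` and `call_with_retry` are hypothetical helpers, `send` stands for any callable that performs the request and returns `(status, body)`, and the attempt count and delay are arbitrary assumptions.

```python
import time

# Hints paraphrased from the error table above.
HINTS = {
    400: "check model, messages, and chat_template_kwargs keys/types",
    401: "check the Authorization: Bearer <API_KEY> header",
    500: "server error; retry, then contact the operations staff",
}


def describe_error(status: int, body: dict) -> str:
    """Format an error body of the shape shown above into one line."""
    error = body.get("error", {})
    hint = HINTS.get(status, "unexpected status")
    return f"{status} {error.get('type', 'unknown')}: {error.get('message', '')} ({hint})"


def call_with_retry(send, max_attempts: int = 3, delay: float = 0.0):
    """Call send() until it stops returning 500 or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 500 or attempt == max_attempts:
            return status, body
        time.sleep(delay)  # fixed pause between attempts (assumed policy)


body = {"error": {"message": "Invalid request", "type": "invalid_request_error", "code": "400"}}
print(describe_error(400, body))
```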
