
LLM API Integration Guide: Start Calling GPT, Claude, and Kimi

Tags: API Integration, LLM, GPT, Claude, Developer Guide


If you are building AI features, API integration is the fastest path to production. This guide covers the core concepts, the major protocol choices, and production-ready examples so you can ship quickly.

1) Core concepts you must know

  • API Key: your credential for authentication and billing.
  • Base URL: the endpoint root (switching providers is often just changing this value).
  • Token: the unit in which model input and output are measured and billed (roughly a few characters of English text each).

2) Main protocol options

OpenAI-compatible (/v1/chat/completions)

The most widely supported format. Most SDKs and tools can use it immediately.

Claude-native (/v1/messages)

Best when you need Anthropic-specific capabilities and semantics.

Responses API (/v1/responses)

Useful for tool-calling and agent workflows.
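To make the Claude-native format concrete, here is a sketch of a /v1/messages request body and headers. The endpoint, header names, and "anthropic-version" value follow Anthropic's public API shape; the key and model id are placeholders, so adjust them to your provider.

```python
import json

API_KEY = "YOUR_KEY"  # placeholder
url = "https://api.anthropic.com/v1/messages"

headers = {
    "x-api-key": API_KEY,               # Claude-native auth header (not "Authorization: Bearer")
    "anthropic-version": "2023-06-01",  # required API version header
    "content-type": "application/json",
}

body = {
    "model": "claude-3-5-sonnet-latest",  # example model id; check your provider's list
    "max_tokens": 256,                    # required in the messages API
    "messages": [{"role": "user", "content": "Explain REST API in one sentence."}],
}

payload = json.dumps(body)
print(payload[:60])
```

Note the differences from the OpenAI-compatible format: the auth header is `x-api-key` rather than a Bearer token, and `max_tokens` is required rather than optional.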

3) Integration workflow

  1. Create an account and generate an API key.
  2. Store keys in environment variables.
  3. Send a first request with curl.
  4. Move to SDK integration in Python or Node.js.
  5. Add retries, timeouts, and streaming.
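Steps 2 and 3 can be sketched with Python's standard library instead of curl: read the key from an environment variable and build an OpenAI-compatible chat request. The base URL and model name below are placeholders for whichever provider you sign up with.

```python
import json
import os
import urllib.request

# Step 2: key comes from the environment, never from source code.
api_key = os.environ.get("LLM_API_KEY", "YOUR_KEY")

req = urllib.request.Request(
    "https://api.example.com/v1/chat/completions",
    data=json.dumps({
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello!"}],
    }).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {api_key}",  # OpenAI-compatible auth header
        "Content-Type": "application/json",
    },
    method="POST",
)

# Step 3: send it (uncomment once your key and base URL are real).
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```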

4) Quick code examples

Use OpenAI SDK with a custom base URL:

```python
from openai import OpenAI

# Point the official SDK at any OpenAI-compatible endpoint by overriding base_url.
client = OpenAI(api_key="YOUR_KEY", base_url="https://api.example.com/v1")

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain REST API in one sentence."}],
)
print(resp.choices[0].message.content)
```
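For streaming (step 5 of the workflow), the OpenAI SDK accepts `stream=True` and yields chunks whose text lives in `chunk.choices[0].delta.content`, which is `None` on some chunks. The consumer below is a sketch with fake chunk objects that mirror that shape, so the accumulation logic is visible without a network call.

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Accumulate text deltas from an OpenAI-style streaming response."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # skip role-only / empty delta chunks
            parts.append(delta)
    return "".join(parts)

def fake_chunk(text):
    # Mimics the nested chunk.choices[0].delta.content shape of SDK chunks.
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

demo = [fake_chunk(None), fake_chunk("Hel"), fake_chunk("lo")]
print(collect_stream(demo))  # → Hello
```

In production you would pass the iterator returned by `client.chat.completions.create(..., stream=True)` straight into `collect_stream` (or print each delta as it arrives).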

5) Why an aggregation platform helps

  • One key for many models
  • Unified protocol for GPT, Claude, Kimi, DeepSeek, Qwen, and more
  • Centralized billing and usage tracking
  • Faster model switching with minimal code changes
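When every model speaks one protocol, "model switching" reduces to picking a different config entry. A minimal routing sketch, with all URLs, model ids, and env-var names as placeholders:

```python
# Placeholder routing table: model name -> connection settings.
MODEL_ROUTES = {
    "gpt-4o":     {"base_url": "https://api.example.com/v1", "key_env": "OPENAI_KEY"},
    "kimi-k2":    {"base_url": "https://api.example.com/v1", "key_env": "AGGREGATOR_KEY"},
    "claude-3-5": {"base_url": "https://api.example.com/v1", "key_env": "AGGREGATOR_KEY"},
}

def route(model: str) -> dict:
    """Return the connection settings for a model, or raise if unknown."""
    try:
        return {"model": model, **MODEL_ROUTES[model]}
    except KeyError:
        raise ValueError(f"no route configured for {model!r}") from None

print(route("kimi-k2")["base_url"])
```

The returned dict maps directly onto the `base_url` / `api_key` arguments of an OpenAI-compatible client, so changing models touches configuration, not call sites.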

6) Common errors and fixes

  • 401: invalid or missing key, or a malformed auth header
  • 402: insufficient balance; top up or check billing
  • 429: rate limit reached; back off exponentially and retry
  • Timeout: increase the client timeout and prefer streaming for long outputs
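A sketch of exponential backoff with jitter for 429-style errors. The `RateLimited` exception here is a stand-in for whatever your SDK actually raises (e.g. `openai.RateLimitError`); the demo function fails twice before succeeding.

```python
import random
import time

class RateLimited(Exception):
    """Stand-in for an SDK's rate-limit exception."""

def with_backoff(call, max_retries=5, base_delay=0.5):
    """Retry `call` on rate limits, doubling the delay each attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # Delay doubles each attempt; jitter avoids synchronized retries.
            time.sleep(base_delay * (2 ** attempt) * (0.5 + random.random() / 2))

# Demo: fails twice with a rate limit, then succeeds on the third call.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited()
    return "ok"

print(with_backoff(flaky, base_delay=0.01))  # → ok
```

The same wrapper works unchanged around any provider call once you substitute the real rate-limit exception type.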

Conclusion

Start with a minimal request, confirm quality and cost, then scale safely with retries, monitoring, and model routing. With a good integration baseline, expanding across providers becomes mostly configuration work.