API Basics: Your First Programmatic Request
Session 3.5 · ~5 min read
An API call is a structured message. You are filling out a form, not having a conversation. The form says: who you are (authentication), what you want (prompt and parameters), and where to send it (endpoint). The response is another structured message: here is what the model generated, here is how many tokens it used, and here is why it stopped.
This session walks through the anatomy of a single API request and response, structure first, with a few small sketches along the way.
The Request: What You Send
Every API request has three parts: the URL you are sending to, the headers (metadata about the request), and the body (the actual content of your request).
At a glance, the three parts with example values:

- URL (endpoint): POST https://api.anthropic.com/v1/messages
- Headers (authentication + content type): x-api-key: sk-ant-... and content-type: application/json
- Body (model + system prompt + messages + parameters): model: claude-sonnet-4-..., system: your instructions, messages: the user prompt, max_tokens: 1024
The URL (Endpoint)
The URL tells your request where to go. For text generation, each provider has one main endpoint, and you send a POST request to it. POST means "I am sending data to you and I want something back." This is different from GET, which means "just give me information." Every LLM text generation request uses POST because you are sending a prompt and receiving generated text.
The Headers
Headers are key-value pairs that travel with your request but are not part of the prompt. They contain metadata: who you are (your API key), what format you are sending (JSON), and what format you expect back (JSON). Headers are like the envelope around a letter. The recipient reads the envelope to know how to handle the letter before opening it.
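As a sketch, the headers can be built as a plain dictionary. The header names below follow Anthropic's Messages API (x-api-key, anthropic-version); other providers differ, for example OpenAI uses an Authorization: Bearer header. The key value is a placeholder, not a real credential.

```python
# Sketch: headers for an Anthropic-style request (placeholder key).
API_KEY = "sk-ant-placeholder"

headers = {
    "x-api-key": API_KEY,                # who you are (authentication)
    "anthropic-version": "2023-06-01",   # which API revision you are targeting
    "content-type": "application/json",  # what format the body is in
}
```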
The Body
The body is the actual content of your request. For LLM APIs, the body is a JSON object containing your model selection, system prompt, conversation messages, and generation parameters.
| Body Field | Purpose | Example Value |
|---|---|---|
| model | Which model to use | "claude-sonnet-4-20250514" |
| system | Persistent instructions | "You are a technical writer. No hedging. No filler." |
| messages | The conversation (user prompts + any prior responses) | [{"role": "user", "content": "Write a product description for..."}] |
| max_tokens | Maximum response length | 1024 |
| temperature | Randomness level | 0.3 |
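The table above maps directly onto a JSON object. A minimal sketch (the prompt text is illustrative, and the exact field names vary by provider):

```python
import json

# Sketch: the request body as a Python dict, serialized to JSON for sending.
body = {
    "model": "claude-sonnet-4-20250514",
    "system": "You are a technical writer. No hedging. No filler.",
    "messages": [
        {"role": "user", "content": "Write a product description for a desk lamp."}
    ],
    "max_tokens": 1024,
    "temperature": 0.3,
}

payload = json.dumps(body)  # this string becomes the HTTP request body
```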
The Response: What You Get Back
The server processes your request and sends back a response. The response also has structure: a status code (did it work?), headers (metadata), and a body (the actual generated content plus usage data).
Status Codes
The status code is a three-digit number that tells you what happened. You need to know four:
| Code | Meaning | What To Do |
|---|---|---|
| 200 | Success | Parse the response body for your content |
| 400 | Bad request (your input was wrong) | Check your request body for errors |
| 401 | Unauthorized (bad API key) | Check your API key |
| 429 | Rate limited (too many requests) | Wait and retry, or reduce concurrency |
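The table translates directly into a dispatch function. A sketch (the action names are illustrative, not provider guidance):

```python
def next_action(status_code: int) -> str:
    """Map an HTTP status code to the action from the table above."""
    if status_code == 200:
        return "parse"           # success: read the response body
    if status_code == 400:
        return "fix_request"     # bad request: inspect your body for errors
    if status_code == 401:
        return "fix_api_key"     # unauthorized: check your key
    if status_code == 429:
        return "wait_and_retry"  # rate limited: back off, then try again
    return "inspect"             # anything else: read the error message
```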
The Response Body
On success, the response body contains the generated text and metadata about the generation. The key fields are the content (what the model generated), the usage statistics (how many tokens were consumed), and the stop reason (why the model stopped generating).
| Response Field | What It Contains |
|---|---|
| content | The generated text |
| usage | Token counts, e.g. input_tokens: 350, output_tokens: 512 |
| stop_reason | Why generation stopped: "end_turn" or "max_tokens" |
| model | The exact model version used |
The stop reason matters. "end_turn" means the model finished naturally. "max_tokens" means it was cut off because it hit your token limit. If your outputs are being truncated, increase max_tokens.
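Pulling those fields out of a parsed response is a few lines of dictionary access. A sketch, where the sample dict mirrors the Anthropic Messages API shape (other providers nest the text differently) and the text itself is illustrative:

```python
# Sketch: a parsed response body (Anthropic Messages API shape, sample values).
response = {
    "content": [{"type": "text", "text": "Here is your product description..."}],
    "usage": {"input_tokens": 350, "output_tokens": 512},
    "stop_reason": "end_turn",
    "model": "claude-sonnet-4-20250514",
}

text = response["content"][0]["text"]  # the generated text
tokens_used = response["usage"]["input_tokens"] + response["usage"]["output_tokens"]
truncated = response["stop_reason"] == "max_tokens"  # True means raise max_tokens
```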
The Complete Flow
Putting it all together: you construct a request with your API key in the headers and your prompt in the body. You send it as a POST to the provider's endpoint. The server processes it and returns a JSON response with the generated text, usage data, and metadata. Your script parses the response, extracts the content, logs the usage, and passes the content to the next stage of your pipeline.
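The flow above can be sketched with only the standard library. Building the request and parsing the response are separated from sending, so the mechanics can be followed without an API key. The endpoint, header names, and field names follow the Anthropic shapes discussed earlier; treat them as assumptions to check against your provider's documentation.

```python
import json
import urllib.request

ENDPOINT = "https://api.anthropic.com/v1/messages"

def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Assemble the POST request: endpoint, headers, and JSON body."""
    body = json.dumps({
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")

def parse_response(raw: bytes) -> dict:
    """Extract the fields your pipeline cares about from the raw JSON."""
    data = json.loads(raw)
    return {
        "text": data["content"][0]["text"],
        "usage": data["usage"],
        "stop_reason": data["stop_reason"],
    }

# Sending it for real requires a valid key:
#   with urllib.request.urlopen(build_request(key, "Say hello.")) as resp:
#       result = parse_response(resp.read())
```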
That is the entire mechanism. Every AI content generation, from a single product description to a batch of 500 articles, follows this same request-response pattern. The complexity is not in the API call itself. It is in what you put into the request (your prompts, your system instructions, your parameters) and what you do with the response (your quality checks, your formatting, your routing).
Further Reading
- Understanding API Requests and Responses (Being Technical Writer)
- What Are REST API Headers? A Complete Guide (BrowserStack)
- Messages API Reference (Anthropic Documentation)
- Create Chat Completion (OpenAI API Reference)
Assignment
- Using your browser, a tool like Postman, or a simple curl command, make one API call to any LLM API. It can be the simplest possible request: "Say hello."
- The goal is to successfully send a request and receive a response outside of a web chat interface. If you do not have an API key yet, sign up for one (most providers offer free trial credits).
- Document every step: what URL did you send to? What headers did you include? What was in the body? What status code did you get back? What was in the response? This documentation is your proof that you understand the mechanism.