
You've built a database. You've shaped your data. Now you need to talk to the outside world.
Every day, your Python applications need to fetch weather data, submit orders, pull tweets, push notifications, or sync with third-party services. That's where HTTP requests come in. REST APIs are the lingua franca of modern web services, and if you're serious about Python, you need to master how to consume them.
We'll start with the industry standard, the requests library, then explore the newer httpx alternative for async work. By the end, you'll know how to authenticate, handle errors, validate responses, and retry failed requests like a pro.
Table of Contents
- Why API Consumption Is a Core Python Skill
- HTTP Protocol Essentials
- The Basics: GET and POST with Requests
- GET: Fetching Data
- POST: Sending Data
- Headers: Adding Auth and Metadata
- API Key Authentication
- Basic Authentication
- The Full Spectrum: PUT, PATCH, DELETE
- Handling Responses Like a Professional
- Status Code Checking
- Response Validation with Pydantic
- Sessions: Connection Pooling and Efficiency
- Authentication Patterns
- Bearer Token (OAuth 2.0)
- API Keys in Headers vs. Query Params
- Custom Headers and Request Tracing
- Error Handling and Retries
- Timeouts
- Exponential Backoff Retries
- Rate Limit Handling
- Real-World Example: Consuming a Paginated API
- Streaming Large Responses
- Async Requests with HTTPX
- When to Use Async
- Advanced HTTPX: Timeouts, Limits, and Monitoring
- Common API Mistakes
- Testing API Code Without Hitting the Network
- Using unittest.mock
- Using responses Library
- Testing with Fixtures
- Summary
Why API Consumption Is a Core Python Skill
Think about how modern software actually works. Your application rarely does everything itself. It reaches out to Stripe to charge a card, to Twilio to send an SMS, to OpenAI to generate text, to GitHub to read repository data. Every one of those interactions happens over HTTP, using the REST architectural style. If you want to build anything meaningful in Python, automation scripts, data pipelines, web apps, AI integrations, you will be consuming APIs constantly.
The good news is that Python has arguably the best ecosystem for this work. The requests library alone has been downloaded billions of times, and for good reason: it takes what is genuinely a complicated network communication problem and reduces it to a handful of intuitive function calls. But "easy to start" doesn't mean "nothing to learn." The gap between a script that works on your laptop under ideal conditions and a production system that handles flaky networks, expired tokens, rate limits, and malformed responses gracefully is enormous. That gap is exactly what this article closes.
We'll cover the full spectrum: from your first requests.get() call all the way through async concurrency with httpx, structured error hierarchies, token management, pagination, streaming, and testing without ever hitting a real network. Whether you're automating a side project or building a service that your company depends on, these patterns will serve you. By the time you finish, you'll understand not just how to make HTTP requests in Python, but why each technique exists and when to reach for it.
HTTP Protocol Essentials
Before you write a single line of requests code, it pays to understand what's actually happening under the hood. HTTP, HyperText Transfer Protocol, is a request/response protocol that runs over TCP/IP. Your client sends a request message; the server sends back a response. Every HTTP request has three fundamental parts: a method (GET, POST, PUT, PATCH, DELETE), a URL that identifies the resource, and optional headers that carry metadata like authentication tokens or content type descriptions. Optionally, requests also have a body, a payload of data sent to the server, common with POST and PUT operations.
The server's response mirrors this structure: a status code that summarizes what happened, response headers, and an optional body containing the actual data. The status code is your first signal of success or failure. Codes in the 200s mean success (200 OK, 201 Created, 204 No Content). Codes in the 400s indicate client errors, something wrong with your request (400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests). Codes in the 500s indicate server errors, something went wrong on the other end (500 Internal Server Error, 503 Service Unavailable). Understanding these ranges intuitively makes you far better at diagnosing problems when they arise. The requests library surfaces all of this cleanly, but you need the mental model to use it well.
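Those ranges are easy to encode in a tiny helper. The sketch below is illustrative only (status_category is our name, not part of any library), but it makes the mental model concrete:

```python
def status_category(code: int) -> str:
    """Map an HTTP status code to the coarse category it falls into."""
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "other"  # 1xx informational, 3xx redirects, etc.

print(status_category(201))  # success
print(status_category(404))  # client error
print(status_category(503))  # server error
```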
REST, Representational State Transfer, is an architectural style that uses these HTTP primitives to model operations on resources. A "user" at /users/123 is a resource. GET retrieves it. PUT replaces it. PATCH partially updates it. DELETE removes it. POST to /users creates a new one. This uniform interface is what makes REST APIs so learnable: once you understand the pattern, you can consume almost any API with minimal documentation.
The Basics: GET and POST with Requests
The requests library is so simple and intuitive that it feels almost suspicious. But that simplicity comes from thoughtful API design.
Install it first:
pip install requests

Now let's make your first request.
GET: Fetching Data
The GET method is what your browser uses every time you navigate to a URL. In Python, translating that same action into code takes one line. Here is the simplest possible form:
import requests
# The simplest possible request
response = requests.get('https://api.example.com/users')
# Check if it worked
if response.status_code == 200:
    users = response.json()
    print(users)
else:
    print(f"Failed with status {response.status_code}")

That's it. requests.get() blocks until the server responds, then gives you back a Response object packed with everything you need.
The Response object is your window into everything the server told you. Spend a moment learning its most important attributes before moving on:
- status_code: HTTP status (200 = success, 404 = not found, 500 = server error)
- json(): Parse response body as JSON (throws if not valid JSON)
- text: Raw response as a string
- headers: Dictionary of response headers
- url: The actual URL visited (useful after redirects)
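One way to explore these attributes without touching the network is to build a Response by hand, assigning the fields requests would normally fill in for you. Setting _content is a private-attribute trick used here purely for demonstration, not something to do in real code:

```python
import requests

# Construct a Response manually to inspect its attributes offline.
# (Normally requests builds this object from the server's reply.)
resp = requests.Response()
resp.status_code = 200
resp.headers['Content-Type'] = 'application/json'
resp.url = 'https://api.example.com/users/1'
resp.encoding = 'utf-8'
resp._content = b'{"id": 1, "name": "Alice"}'  # private attribute, demo only

print(resp.status_code)                # 200
print(resp.headers['content-type'])    # application/json (lookup is case-insensitive)
print(resp.text)                       # the raw body as a string
print(resp.json()['name'])             # Alice
```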
Most APIs accept query parameters to filter, sort, or paginate results. The requests library handles URL encoding for you automatically, which saves you from a class of subtle bugs:
# Without requests (ugh)
url = 'https://api.example.com/users?page=2&limit=50'
response = requests.get(url)
# With requests (clean)
params = {'page': 2, 'limit': 50}
response = requests.get('https://api.example.com/users', params=params)
# Both hit the same URL, but the second is maintainable
print(response.url)  # Shows the full URL with params

When you pass params, requests URL-encodes them for you. No string concatenation, no manual escaping. It handles edge cases, None values are skipped, lists become key=val1&key=val2, everything is properly quoted. This matters more than it sounds: hand-building query strings is a common source of bugs that only manifest with unusual inputs.
POST: Sending Data
While GET retrieves data, POST creates it. The key difference is that POST requests carry a body, the data you are sending to the server. The json= parameter in requests.post() is one of the library's best-designed features:
# Create a new user
payload = {
    'name': 'Alice',
    'email': 'alice@example.com',
    'age': 30
}
response = requests.post(
    'https://api.example.com/users',
    json=payload  # Automatically serializes to JSON + sets Content-Type header
)
if response.status_code == 201:  # 201 = Created
    new_user = response.json()
    print(f"Created user with ID {new_user['id']}")
else:
    print(response.text)  # Server error, see the message

Notice json=payload. This does two things:
- Serializes your Python dict to JSON
- Sets Content-Type: application/json automatically
If you already have JSON as a string, or need different Content-Type headers, use data= instead. This is less common but occasionally necessary when working with legacy systems or APIs that expect form-encoded data:
import json
json_string = json.dumps(payload)
response = requests.post(
    'https://api.example.com/users',
    data=json_string,
    headers={'Content-Type': 'application/json'}
)

But 99% of the time, just use json=.
Headers: Adding Auth and Metadata
APIs rarely accept anonymous requests. They need auth tokens, API keys, or custom headers.
API Key Authentication
The most common authentication pattern you will encounter is a Bearer token passed in the Authorization header. Here is the pattern, first the wrong way and then the right way:
headers = {
    'Authorization': 'Bearer YOUR_API_KEY_HERE',
    'User-Agent': 'MyApp/1.0'
}
response = requests.get(
    'https://api.example.com/data',
    headers=headers
)

Important: Never hardcode keys in your source files. Use environment variables. If your API key ever ends up in a git commit, consider it compromised immediately: even private repositories get breached, and git history is permanent. The correct pattern reads the key from the environment at runtime:
import os
api_key = os.getenv('API_KEY')
if not api_key:
    raise ValueError("API_KEY environment variable not set")
headers = {'Authorization': f'Bearer {api_key}'}
response = requests.get('https://api.example.com/data', headers=headers)

Then run export API_KEY=sk_live_abc123... before your script.
Basic Authentication
Some APIs use old-school username/password (HTTP Basic Auth). requests has a shortcut:
from requests.auth import HTTPBasicAuth
response = requests.get(
    'https://api.example.com/secure',
    auth=HTTPBasicAuth('username', 'password')
)
# Even shorter
response = requests.get(
    'https://api.example.com/secure',
    auth=('username', 'password')
)

Internally, requests base64-encodes your credentials and adds them to the Authorization header. Always use HTTPS with Basic Auth; plain HTTP leaks credentials. Base64 encoding is not encryption; it is trivially reversible.
The Full Spectrum: PUT, PATCH, DELETE
REST APIs use different HTTP verbs for different operations.
- GET: Retrieve data (idempotent, safe)
- POST: Create new resource
- PUT: Replace entire resource
- PATCH: Partial update
- DELETE: Remove resource
Understanding the distinction between PUT and PATCH trips up many developers. PUT semantics mean "replace the entire resource with what I'm sending." If the current user has 10 fields and you PUT an object with 3 fields, the server stores an object with 3 fields, the other 7 disappear. PATCH semantics mean "update only the fields I'm sending." This is why most modern APIs prefer PATCH. Here is a complete CRUD example to make these concrete:
import requests
BASE_URL = 'https://api.example.com/posts'
# CREATE
new_post = {
    'title': 'My First Post',
    'content': 'Hello world',
    'author_id': 1
}
create_response = requests.post(BASE_URL, json=new_post)
post_id = create_response.json()['id']
print(f"Created post {post_id}")
# READ
read_response = requests.get(f'{BASE_URL}/{post_id}')
post_data = read_response.json()
print(f"Title: {post_data['title']}")
# UPDATE (partial)
updates = {'title': 'My Updated Post'}
update_response = requests.patch(f'{BASE_URL}/{post_id}', json=updates)
print(f"Updated: {update_response.status_code}")
# DELETE
delete_response = requests.delete(f'{BASE_URL}/{post_id}')
if delete_response.status_code == 204:  # 204 = No Content
    print("Post deleted")

This four-operation pattern, Create, Read, Update, Delete, maps directly to the four HTTP methods most APIs expose. Notice that DELETE typically returns 204 No Content, meaning the request succeeded but there is nothing to return. A 200 with an empty body is also valid; check the API documentation to know which to expect.
Handling Responses Like a Professional
A successful HTTP status (2xx) doesn't mean your data is valid. A 200 response could contain garbage JSON or missing required fields.
Status Code Checking
The temptation for beginners is to call .json() immediately on every response and hope for the best. This works fine in development but falls apart in production when APIs return errors, maintenance pages, or malformed data. Build the habit of checking status codes explicitly:
response = requests.get('https://api.example.com/data')
# Bad: Blindly assume success
data = response.json()  # Crashes if status is 404
# Good: Check first
if response.status_code == 200:
    data = response.json()
elif response.status_code == 404:
    print("Resource not found")
elif response.status_code == 500:
    print("Server error")
else:
    print(f"Unexpected status: {response.status_code}")
# Better: Use raise_for_status()
response = requests.get('https://api.example.com/data')
response.raise_for_status()  # Raises HTTPError if status >= 400
data = response.json()

raise_for_status() converts 4xx/5xx responses into exceptions. Catch them explicitly so you can log intelligently and respond appropriately to different failure modes:
import requests
try:
    response = requests.get('https://api.example.com/data')
    response.raise_for_status()
    data = response.json()
except requests.exceptions.HTTPError as e:
    print(f"HTTP error: {e}")
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")
except ValueError as e:
    print(f"Invalid JSON response: {e}")

This three-layer exception handling covers the three main failure modes: the server returned an error status, the network itself failed, and the response body was not valid JSON. Each requires a different response from your code.
Response Validation with Pydantic
JSON responses need validation. Python dicts are flexible, but your code expects specific fields with specific types. That's where Pydantic shines.
The problem with raw dictionaries is that they fail silently. If the API returns {"id": "abc"} when you expected {"id": 123}, you might not discover the problem until much later in your code when you try to use user["id"] as a number. Pydantic catches this at the boundary, exactly where you want it caught:
from pydantic import BaseModel, ValidationError
class User(BaseModel):
    id: int
    name: str
    email: str
    age: int | None = None  # Optional field

# Parse and validate in one step
response = requests.get('https://api.example.com/users/1')
response.raise_for_status()
try:
    user = User(**response.json())
    print(f"User {user.name} is {user.age} years old")
except ValidationError as e:
    print(f"Invalid response structure: {e}")

If the API returns {"id": "not_a_number", "name": "Bob"}, Pydantic catches it immediately. No silent data corruption, no TypeError three functions deep.
For lists of items, the approach is just as clean:
response = requests.get('https://api.example.com/users')
response.raise_for_status()
try:
    users = [User(**item) for item in response.json()]
    for user in users:
        print(f"{user.name}: {user.email}")
except ValidationError as e:
    print(f"Invalid response: {e}")

Pydantic also gives you free documentation of what you expect from the API; the model class itself is a specification. If the API changes and starts returning different fields, your Pydantic model immediately surfaces the discrepancy rather than letting bad data propagate through your system silently.
Sessions: Connection Pooling and Efficiency
When you call requests.get() repeatedly, each request opens a new TCP connection and closes it. For many requests, this is wasteful. Session objects reuse connections.
The cost of opening a TCP connection is not trivial. It requires a three-way handshake between client and server, and if you are using HTTPS (which you always should be in production), there is also a TLS handshake on top of that. For a handful of requests, this overhead is negligible. For hundreds or thousands of requests, it becomes the dominant cost. Sessions solve this with connection pooling:
import requests
# Without sessions (creates new connection each time)
for i in range(100):
    response = requests.get(f'https://api.example.com/items/{i}')
    print(response.status_code)

# With sessions (reuses connection)
session = requests.Session()
for i in range(100):
    response = session.get(f'https://api.example.com/items/{i}')
    print(response.status_code)

The second version is much faster. Session keeps the underlying TCP connection alive between requests, reducing overhead.
Sessions also let you configure persistent headers and auth, which eliminates repetitive code and reduces the chance of accidentally sending a request without authentication:
session = requests.Session()
session.headers.update({'Authorization': f'Bearer {api_key}'})
# Every request now includes the auth header
response1 = session.get('https://api.example.com/users')
response2 = session.get('https://api.example.com/posts')
# Clean up when done
session.close()
# Or use as a context manager
with requests.Session() as session:
    session.headers.update({'Authorization': f'Bearer {api_key}'})
    response = session.get('https://api.example.com/data')
    # Automatically closes

The context manager pattern (with statement) is cleaner: Python closes the session automatically when you exit the with block, even if an exception occurs. This prevents connection leaks in error paths.
Authentication Patterns
Different APIs demand different authentication approaches. Understanding the landscape lets you recognize what an API requires just from its documentation and implement it correctly the first time.
Bearer Token (OAuth 2.0)
Modern APIs use Bearer tokens, often obtained through OAuth 2.0 flows. The token is included in the Authorization header and typically has an expiration time. Handling expiration correctly is what separates robust clients from fragile ones:
import requests
import os
def get_bearer_token():
    """
    In a real app, you'd obtain this from an OAuth provider.
    Here we retrieve it from environment or a token store.
    """
    token = os.getenv('BEARER_TOKEN')
    if not token:
        raise ValueError("BEARER_TOKEN not configured")
    return token

def fetch_protected_resource():
    token = get_bearer_token()
    headers = {'Authorization': f'Bearer {token}'}
    response = requests.get(
        'https://api.example.com/protected',
        headers=headers,
        timeout=5
    )
    response.raise_for_status()
    return response.json()

For applications where tokens expire and need refreshing, a token manager class centralizes that logic so the rest of your code never has to think about it:
import time
class TokenManager:
    def __init__(self, refresh_url: str, client_id: str, client_secret: str):
        self.refresh_url = refresh_url
        self.client_id = client_id
        self.client_secret = client_secret
        self.token = None
        self.expires_at = 0

    def get_token(self) -> str:
        """Get valid token, refreshing if necessary."""
        current_time = time.time()
        # Refresh if expired or within 60 seconds of expiry
        if current_time >= (self.expires_at - 60):
            self._refresh_token()
        return self.token

    def _refresh_token(self):
        """Obtain new token from refresh endpoint."""
        response = requests.post(
            self.refresh_url,
            auth=(self.client_id, self.client_secret),
            timeout=5
        )
        response.raise_for_status()
        data = response.json()
        self.token = data['access_token']
        self.expires_at = time.time() + data.get('expires_in', 3600)

# Usage
token_manager = TokenManager(
    refresh_url='https://auth.example.com/token',
    client_id=os.getenv('CLIENT_ID'),
    client_secret=os.getenv('CLIENT_SECRET')
)
headers = {'Authorization': f'Bearer {token_manager.get_token()}'}
response = requests.get('https://api.example.com/data', headers=headers)

This automatically refreshes tokens before they expire, so your requests never fail due to stale authentication. The 60-second buffer before expiry ensures that a token does not expire mid-request during high-latency operations.
API Keys in Headers vs. Query Params
Some APIs expect the key as a header (as shown above), others expect it as a query parameter, and some support both. Always prefer the header approach, query parameters end up in server logs, browser history, and anywhere else the URL gets stored. A header value is far less likely to be accidentally leaked:
# Less secure: key in URL (avoid when possible)
response = requests.get(
    'https://api.example.com/data',
    params={'api_key': api_key}
)
# More secure: key in header (prefer this)
response = requests.get(
    'https://api.example.com/data',
    headers={'X-API-Key': api_key}
)

Custom Headers and Request Tracing
For production systems, adding a request ID to every outbound call is a best practice that makes debugging dramatically easier. When something goes wrong and you contact the API provider, they can search their logs for your request ID:
import uuid
headers = {
    'Authorization': f'Bearer {api_key}',
    'User-Agent': 'MyApp/1.0 (+http://example.com)',
    'X-API-Version': '2024-02',
    'X-Request-ID': str(uuid.uuid4())  # For request tracing
}
response = requests.get('https://api.example.com/data', headers=headers)

The X-Request-ID header is a best practice: it helps the API provider correlate your request with server logs if debugging is needed. Generate a new UUID for each request so each one is uniquely identifiable.
Error Handling and Retries
Network calls fail. Servers go down, connections drop, routers lose packets. Rate limits get hit. Tokens expire at the worst moment. Professional code handles all of these gracefully, and the key insight is that different failures warrant different responses.
Some errors are transient: a timeout, a momentary network blip, a server restart. These are worth retrying because the next attempt may succeed. Other errors are permanent: a 401 Unauthorized means your credentials are wrong, retrying immediately will just get you another 401. A 400 Bad Request means your data is malformed, retrying will get you another 400. Mixing these up by retrying everything is a common mistake that wastes time and can even make things worse (hammering a rate-limited endpoint with retries will just extend how long you are blocked).
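One way to encode that distinction is an explicit allow-list of retryable status codes. This is a sketch with names of our own choosing (is_retryable, TRANSIENT_STATUSES), and the membership, 408, 429, and the 5xx family, is a reasonable default rather than a universal rule:

```python
# Status codes where a retry has a realistic chance of succeeding
TRANSIENT_STATUSES = {408, 429, 500, 502, 503, 504}

def is_retryable(status_code: int) -> bool:
    """Only retry failures that may resolve on their own."""
    return status_code in TRANSIENT_STATUSES

print(is_retryable(503))  # True  - the server may recover
print(is_retryable(401))  # False - fix your credentials instead
print(is_retryable(400))  # False - fix your request instead
```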
Timeouts
# Without timeout (bad for production)
response = requests.get('https://slow-api.example.com/data')
# Hangs forever if server doesn't respond
# With timeout (good)
response = requests.get('https://slow-api.example.com/data', timeout=5)
# Raises requests.Timeout if no response within 5 seconds

You can specify separate connect and read timeouts. The connect timeout controls how long to wait while establishing the TCP connection; the read timeout controls how long to wait for data after the connection is established:
# (connect_timeout, read_timeout)
response = requests.get(
    'https://api.example.com/data',
    timeout=(3, 10)  # 3 seconds to connect, 10 to receive
)

Always set a reasonable timeout for production code. No timeout is a security vulnerability: it allows attackers to hang your application by sending slow responses. It also means a single slow dependency can block your entire application indefinitely.
Exponential Backoff Retries
When a request fails transiently (network fluke, server temporarily down), retry with escalating delays. The exponential backoff pattern prevents all clients from hammering a recovering server at the same time:
import time
import requests
def fetch_with_retry(url, max_attempts=3, base_delay=1):
    """Fetch with exponential backoff, retrying only transient failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            response = requests.get(url, timeout=5)
            response.raise_for_status()
            return response
        except requests.exceptions.HTTPError as e:
            # Permanent client errors (4xx other than 429) won't improve on retry
            status = e.response.status_code
            if 400 <= status < 500 and status != 429:
                raise
            if attempt == max_attempts:
                raise  # Give up after final attempt
            delay = base_delay * (2 ** (attempt - 1))  # 1s, 2s, 4s
            print(f"Attempt {attempt} failed: {e}. Retrying in {delay}s...")
            time.sleep(delay)
        except requests.exceptions.RequestException as e:
            if attempt == max_attempts:
                raise  # Give up after final attempt
            delay = base_delay * (2 ** (attempt - 1))
            print(f"Attempt {attempt} failed: {e}. Retrying in {delay}s...")
            time.sleep(delay)

# Usage
try:
    response = fetch_with_retry('https://flaky-api.example.com/data')
    data = response.json()
except requests.exceptions.RequestException as e:
    print(f"Failed after retries: {e}")

This retries on transient errors (timeouts, dropped connections, 5xx responses, 429s) but not on permanent ones (400 Bad Request, 401 Unauthorized), where retrying would just repeat the same failure. For production, use the tenacity library which gives you declarative retry logic with far less boilerplate:
pip install tenacity

from tenacity import retry, stop_after_attempt, wait_exponential
import requests
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=1, max=10)
)
def fetch_data(url):
    response = requests.get(url, timeout=5)
    response.raise_for_status()
    return response.json()

# Automatically retries with exponential backoff
data = fetch_data('https://api.example.com/data')

tenacity handles the exponential backoff math for you and has hooks for logging, callbacks, and conditional retries (only retry on certain error codes). The wait_exponential configuration here means the library will wait between 1 and 10 seconds between retries, doubling each time.
Rate Limit Handling
Many APIs enforce rate limits and return a 429 Too Many Requests status when you exceed them. The response usually includes a Retry-After header telling you how long to wait. Respecting this is both polite and practical, continuing to send requests while rate-limited achieves nothing and may result in your access being revoked:
import time
def rate_limit_aware_request(session, url, **kwargs):
    """Make request, respecting rate limit responses."""
    response = session.get(url, **kwargs)
    if response.status_code == 429:
        # Retry-After is usually seconds; some servers send an HTTP date instead
        retry_after = int(response.headers.get('Retry-After', 60))
        print(f"Rate limited. Waiting {retry_after} seconds...")
        time.sleep(retry_after)
        # Retry once after waiting
        response = session.get(url, **kwargs)
    response.raise_for_status()
    return response

For comprehensive error handling that distinguishes between all the failure modes, a structured client class is the cleanest approach:
import requests
from requests.exceptions import (
ConnectionError,
Timeout,
HTTPError,
RequestException
)
class APIClient:
    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url
        self.api_key = api_key
        self.session = requests.Session()
        self.session.headers.update({
            'Authorization': f'Bearer {api_key}'
        })

    def request(self, method: str, endpoint: str, **kwargs):
        """Make a request with comprehensive error handling."""
        url = f"{self.base_url}/{endpoint}"
        try:
            response = self.session.request(
                method,
                url,
                timeout=10,
                **kwargs
            )
            # Distinguish error types
            if response.status_code == 429:
                raise RateLimitError(f"Rate limited: {response.headers.get('Retry-After')}")
            elif response.status_code == 401:
                raise AuthenticationError("Invalid or expired API key")
            elif response.status_code == 403:
                raise AuthorizationError("Access denied")
            elif response.status_code >= 400:
                raise APIError(f"API error: {response.status_code} - {response.text}")
            return response.json()
        except Timeout:
            raise APIError("Request timeout, server not responding")
        except ConnectionError:
            raise APIError("Connection failed, check network")
        except HTTPError as e:
            raise APIError(f"HTTP error: {e}")
        except ValueError:
            raise APIError("Invalid JSON response from API")

    def get(self, endpoint: str, **kwargs):
        return self.request('GET', endpoint, **kwargs)

    def post(self, endpoint: str, **kwargs):
        return self.request('POST', endpoint, **kwargs)

    def close(self):
        self.session.close()

# Custom exception classes
class APIError(Exception):
    pass

class RateLimitError(APIError):
    pass

class AuthenticationError(APIError):
    pass

class AuthorizationError(APIError):
    pass

# Usage with explicit error handling
client = APIClient(
    base_url='https://api.example.com',
    api_key=os.getenv('API_KEY')
)
try:
    data = client.get('users/123')
except AuthenticationError:
    print("Re-authenticate and try again")
except RateLimitError as e:
    print(f"Rate limited, back off before retrying: {e}")
except APIError as e:
    print(f"Permanent failure: {e}")
finally:
    client.close()

This pattern separates transient errors (network hiccups) from permanent ones (auth failure). You retry transient errors; permanent errors need human intervention.
Real-World Example: Consuming a Paginated API
Most APIs paginate large result sets. Here's how to handle pagination:
import requests
from typing import List, Dict, Any
def fetch_all_posts(base_url: str, api_key: str) -> List[Dict[str, Any]]:
    """
    Fetch all posts from a paginated API.
    Assumes: 'page' query param, 'per_page=50', response has 'data' and 'total' keys.
    """
    session = requests.Session()
    session.headers.update({'Authorization': f'Bearer {api_key}'})
    all_posts = []
    page = 1
    per_page = 50
    while True:
        try:
            response = session.get(
                base_url,
                params={'page': page, 'per_page': per_page},
                timeout=10
            )
            response.raise_for_status()
            data = response.json()
            posts = data.get('data', [])
            if not posts:
                break  # No more posts
            all_posts.extend(posts)
            # Check if there are more pages
            total = data.get('total', 0)
            if len(all_posts) >= total:
                break
            page += 1
        except requests.exceptions.RequestException as e:
            print(f"Error fetching page {page}: {e}")
            break  # Or retry with backoff
    session.close()
    return all_posts

# Usage
posts = fetch_all_posts(
    'https://api.example.com/posts',
    api_key='your-api-key'
)
print(f"Fetched {len(posts)} posts")

This pattern:

- Maintains a session for efficiency
- Handles pagination via the page parameter
- Gracefully stops when no more data arrives
- Catches errors without crashing
Cursor-Based Pagination: Some APIs use cursors (opaque tokens) instead of page numbers. They're more efficient for large datasets:
def fetch_all_with_cursor(base_url: str, api_key: str):
    """Fetch all items using cursor-based pagination."""
    session = requests.Session()
    session.headers.update({'Authorization': f'Bearer {api_key}'})
    all_items = []
    cursor = None
    while True:
        try:
            params = {'limit': 100}
            if cursor:
                params['cursor'] = cursor
            response = session.get(base_url, params=params, timeout=10)
            response.raise_for_status()
            data = response.json()
            items = data.get('items', [])
            if not items:
                break
            all_items.extend(items)
            # Check for next cursor
            cursor = data.get('next_cursor')
            if not cursor:
                break  # No more pages
        except requests.exceptions.RequestException as e:
            print(f"Error during pagination: {e}")
            break
    session.close()
    return all_items

Cursor-based pagination is more robust: it handles insertions and deletions without skipping items.
Streaming Large Responses
When downloading large files or streaming responses, don't load everything into memory. Use streaming:
import requests
def download_large_file(url: str, output_path: str, api_key: str):
"""Download large file without loading into memory."""
headers = {'Authorization': f'Bearer {api_key}'}
# stream=True returns bytes as they arrive
with requests.get(url, headers=headers, stream=True, timeout=30) as response:
response.raise_for_status()
# Get total size from headers
total_size = int(response.headers.get('Content-Length', 0))
# Write in chunks
with open(output_path, 'wb') as f:
downloaded = 0
for chunk in response.iter_content(chunk_size=8192):
if chunk: # Filter keep-alive chunks
f.write(chunk)
downloaded += len(chunk)
percent = (downloaded / total_size * 100) if total_size else 0
print(f"Downloaded: {percent:.1f}%")
download_large_file(
'https://api.example.com/files/data.csv',
'local_data.csv',
api_key='your-key'
)
The chunk_size=8192 means you process 8KB at a time, not the entire megabyte or gigabyte at once. This keeps memory usage constant regardless of file size: a 10MB file and a 10GB file use the same amount of RAM to download with this approach.
For JSON streaming (common in logs or search APIs):
import requests
import json
def stream_json_lines(url: str, api_key: str):
"""Stream JSON lines (newline-delimited JSON)."""
headers = {'Authorization': f'Bearer {api_key}'}
with requests.get(url, headers=headers, stream=True) as response:
response.raise_for_status()
for line in response.iter_lines():
if line:
obj = json.loads(line)
yield obj # Process one line at a time
# Usage
for log_entry in stream_json_lines('https://api.example.com/logs', api_key='key'):
print(f"Event: {log_entry['event_type']}")
This processes a 100MB log file without loading it all into memory. The generator pattern here is elegant: yield obj turns this function into a lazy iterator, so the caller processes each record as it arrives rather than waiting for the entire file to download first.
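The lazy-iteration idea is independent of HTTP. This self-contained sketch applies the same generator pattern to any iterable of JSON lines (a list, a file object, or `response.iter_lines()`), so the parsing logic can be tested without a network:

```python
import json
from typing import Iterable, Iterator

def parse_json_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Lazily parse newline-delimited JSON, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:  # Skip blanks, like the keep-alive filtering above
            yield json.loads(line)

# Usage with an in-memory line source
sample = ['{"event_type": "login"}', '', '{"event_type": "logout"}']
events = [e['event_type'] for e in parse_json_lines(sample)]
print(events)  # ['login', 'logout']
```

Because it accepts any iterable of strings, the same function works unchanged whether the lines come from a local file or a streaming response.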
Async Requests with HTTPX
If you're making hundreds of concurrent requests, synchronous code (with requests) becomes a bottleneck. Each request blocks the thread. For I/O-bound work, async is transformative.
Enter httpx, the spiritual successor to requests:
pip install httpx
The beauty of httpx is that its API is nearly identical to requests. If you already know requests, you already know httpx. The only meaningful additions are the await keyword and the async with context manager. Here is a concrete comparison that shows just how similar they are:
import httpx
async def fetch_user(client, user_id):
"""Fetch single user asynchronously."""
response = await client.get(f'https://api.example.com/users/{user_id}')
response.raise_for_status()
return response.json()
async def main():
"""Fetch multiple users concurrently."""
async with httpx.AsyncClient() as client:
# Launch all requests concurrently
tasks = [
fetch_user(client, user_id)
for user_id in range(1, 11)
]
users = await asyncio.gather(*tasks)
for user in users:
print(f"{user['name']}: {user['email']}")
import asyncio
asyncio.run(main())
This fetches 10 users concurrently in the time one synchronous request would take. The asyncio.gather(*tasks) call launches all 10 requests simultaneously and waits for all of them to complete. If each request takes 500ms, the synchronous version takes 5 seconds; the async version takes about 500ms.
Why HTTPX over requests for async?
- requests doesn't support async natively (it's designed for sync)
- httpx has an identical API to requests but with async/await support
- Drop-in replacement if you already know requests
Comparison:
# Requests (synchronous)
import requests
response = requests.get('https://api.example.com/users/1')
data = response.json()
# HTTPX (asynchronous)
import httpx
response = await client.get('https://api.example.com/users/1')
data = response.json()  # Same!
Headers, auth, timeouts, status code checking: all identical. The only difference is await.
When to Use Async
Use async when:
- Many concurrent requests: 10+ simultaneous API calls
- I/O-bound bottleneck: Waiting for network is your slowest part
- Web servers/frameworks: FastAPI, Quart, Starlette all expect async
Avoid async if:
- Few requests: Single or double-digit requests per operation
- CPU-bound work: Data processing dominates, not network
- Synchronous dependencies: Calling non-async libraries forces sync
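You can see the I/O-bound speedup without calling any real API. This stdlib-only sketch simulates network latency with asyncio.sleep: ten "requests" of 0.1s each complete in roughly 0.1s total rather than 1s, because they all wait concurrently (fake_fetch is a stand-in, not a real HTTP call):

```python
import asyncio
import time

async def fake_fetch(user_id: int) -> dict:
    """Simulate a network call with 0.1s of latency."""
    await asyncio.sleep(0.1)
    return {'id': user_id}

async def main() -> list:
    # All ten coroutines sleep concurrently, not one after another
    return await asyncio.gather(*(fake_fetch(i) for i in range(10)))

start = time.perf_counter()
users = asyncio.run(main())
elapsed = time.perf_counter() - start
print(f"Fetched {len(users)} users in {elapsed:.2f}s")  # typically ~0.1s
```

Swap fake_fetch for an httpx call and the timing argument carries over directly: the speedup comes from overlapping the waits, which is exactly why CPU-bound work sees no benefit.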
Advanced HTTPX: Timeouts, Limits, and Monitoring
Just like requests, httpx supports timeouts and has session-like pooling. Connection limits are especially important in async code, because it is easy to accidentally open hundreds of simultaneous connections to the same server, which can get your IP blocked:
import httpx
import asyncio
async def fetch_with_limits():
"""
HTTPX with connection pooling limits.
"""
limits = httpx.Limits(
max_connections=10, # Total concurrent connections
max_keepalive_connections=5 # Reuse limit
)
# Configure once, reuse for all requests
async with httpx.AsyncClient(
limits=limits,
timeout=10.0
) as client:
# These requests share the connection pool
tasks = [
client.get(f'https://api.example.com/users/{i}')
for i in range(50)
]
responses = await asyncio.gather(*tasks)
return [r.json() for r in responses]
# Run it
data = asyncio.run(fetch_with_limits())
HTTPX automatically manages the connection pool and respects the limits you set. This prevents overwhelming the server or exhausting your system's file descriptors.
For monitoring and debugging:
import httpx
import logging
logging.basicConfig(level=logging.DEBUG)
async def fetch_with_monitoring():
async with httpx.AsyncClient() as client:
# httpx logs all requests when DEBUG is enabled
response = await client.get('https://api.example.com/data')
print(f"Status: {response.status_code}")
print(f"Size: {len(response.content)} bytes")
print(f"Elapsed: {response.elapsed.total_seconds()}s")
return response.json()The response.elapsed attribute tells you how long the request took, useful for performance monitoring.
Common API Mistakes
Even experienced developers fall into patterns that cause subtle bugs and fragile code. Recognizing these mistakes is the fastest path to writing production-quality API consumers.
Mistake 1: Not checking the status code. The most common beginner error is calling .json() directly on a response without checking whether the request succeeded. A 404 or 500 response often returns a JSON body too, just not the JSON you wanted. Always call raise_for_status() or check the status explicitly before parsing the body.
Mistake 2: Hardcoding credentials. API keys, tokens, and passwords should never appear in source code, not even in private repositories. Use environment variables or a secrets manager. If you accidentally commit a credential, rotate it immediately; do not assume that reverting the commit makes you safe, because git history is permanent and may already have been cloned or cached elsewhere.
Mistake 3: Omitting timeouts. Without explicit timeouts, your application will hang indefinitely waiting for a server that has crashed or become unreachable. In a web server context, this can exhaust your thread pool and take down your entire service because of one slow dependency. Always set both connect and read timeouts.
Mistake 4: Retrying non-retryable errors. Retrying a 400 Bad Request is pointless: the request is malformed and will fail every time. Retrying a 401 Unauthorized is counterproductive: you are just burning through your rate limit. Only retry on network errors and 5xx server errors (and 429 rate limit errors after the specified wait). Be explicit about which errors trigger a retry.
Mistake 5: Creating new sessions for each request in a loop. Each top-level requests.get() call creates and tears down a connection. Inside a loop making dozens of calls, this is a significant performance penalty. Always use a Session when making multiple requests to the same host; reusing connections can be 2-5x faster in real workloads.
Here is the full set of patterns side by side:
# Mistake 1: No status check
data = requests.get('https://api.example.com/data').json()
# Fix:
response = requests.get('https://api.example.com/data')
response.raise_for_status()
data = response.json()
# Mistake 2: Hardcoded key
headers = {'Authorization': 'Bearer sk_live_abc123def456'}
# Fix:
import os
headers = {'Authorization': f'Bearer {os.getenv("API_KEY")}'}
# Mistake 3: No timeout
requests.get('https://api.example.com/slow')
# Fix:
requests.get('https://api.example.com/slow', timeout=5)
# Mistake 4: Retrying everything
for attempt in range(3):
response = requests.get(url) # Will retry 400s pointlessly
# Fix: Only retry on transient errors
# Mistake 5: No session in loops
for i in range(100):
requests.get(f'https://api.example.com/item/{i}')
# Fix:
with requests.Session() as session:
for i in range(100):
session.get(f'https://api.example.com/item/{i}')
Avoiding these five mistakes eliminates the majority of real-world API consumption bugs before they occur.
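The "only retry transient errors" rule from Mistake 4 can be captured in two tiny helpers: a predicate that classifies status codes and an exponential backoff delay. This is a stdlib-only sketch of the policy, not a drop-in for any particular retry library:

```python
def is_retryable(status_code: int) -> bool:
    """Retry only on rate limits (429) and server-side failures (5xx)."""
    return status_code == 429 or 500 <= status_code <= 599

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at 30s."""
    return min(base * (2 ** attempt), cap)

print(is_retryable(503))  # True: transient server error
print(is_retryable(400))  # False: malformed request, never retry
print(backoff_delay(3))   # 8.0
```

A retry loop then becomes: on a retryable status, sleep for backoff_delay(attempt) and try again; on anything else, raise immediately. Keeping the policy in pure functions like these also makes it easy to unit test.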
Testing API Code Without Hitting the Network
Unit testing API consumers is tricky: you can't rely on external APIs during tests. Mock them instead.
Tests that make real network calls are slow, flaky (what happens when the API is down?), and can have side effects (what if the test creates real records?). Mocking solves all three problems at once. The goal is to test your code's behavior given particular API responses, not to test that the API itself works.
Using unittest.mock
import unittest
from unittest.mock import patch, MagicMock
import requests
from my_api_client import fetch_user_data
class TestAPIClient(unittest.TestCase):
@patch('requests.get')
def test_fetch_user_success(self, mock_get):
"""Test successful API call."""
# Mock the response
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
'id': 1,
'name': 'Alice',
'email': 'alice@example.com'
}
mock_get.return_value = mock_response
# Call your function
user = fetch_user_data(1)
# Assert it worked
self.assertEqual(user['name'], 'Alice')
mock_get.assert_called_once()
@patch('requests.get')
def test_fetch_user_not_found(self, mock_get):
"""Test 404 handling."""
mock_response = MagicMock()
mock_response.status_code = 404
mock_response.raise_for_status.side_effect = requests.HTTPError()
mock_get.return_value = mock_response
with self.assertRaises(requests.HTTPError):
fetch_user_data(999)
The @patch decorator replaces the real requests.get with a mock object. You control what it returns, so tests run instantly without network calls.
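Patching works, but an even simpler route is to design the client so the HTTP function is injectable; then tests pass a plain fake with no mocking library at all. A sketch of that design (this fetch_user_data signature is hypothetical, not the one from my_api_client):

```python
def fetch_user_data(user_id: int, get=None) -> dict:
    """Fetch a user; `get` is injectable so tests can supply a fake."""
    if get is None:
        import requests  # Real HTTP only when no fake is supplied
        get = requests.get
    response = get(f'https://api.example.com/users/{user_id}', timeout=5)
    response.raise_for_status()
    return response.json()

# In tests: a tiny stand-in object replaces the network entirely
class FakeResponse:
    def raise_for_status(self):
        pass
    def json(self):
        return {'id': 1, 'name': 'Alice'}

user = fetch_user_data(1, get=lambda url, timeout: FakeResponse())
print(user['name'])  # Alice
```

Dependency injection and @patch solve the same problem; injection just moves the seam into your function signature instead of into the test file.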
Using responses Library
For more realistic mocking, use the responses library:
pip install responses
import responses
import requests
@responses.activate
def test_fetch_users():
"""Mock API responses realistically."""
# Register mock response
responses.add(
responses.GET,
'https://api.example.com/users',
json=[
{'id': 1, 'name': 'Alice'},
{'id': 2, 'name': 'Bob'}
],
status=200
)
# Your code makes the request
response = requests.get('https://api.example.com/users')
users = response.json()
# Test assertions
assert len(users) == 2
assert users[0]['name'] == 'Alice'
responses intercepts all HTTP calls to registered URLs and returns your mock data. It's more readable than unittest.mock for API testing.
Testing with Fixtures
For pytest, create reusable fixtures:
import pytest
import responses
@pytest.fixture
def mock_api():
"""Context manager for mocked API calls."""
with responses.RequestsMock() as rsps:
rsps.add(
responses.GET,
'https://api.example.com/users/1',
json={'id': 1, 'name': 'Alice'},
status=200
)
yield rsps
def test_with_fixture(mock_api):
"""Use the fixture in tests."""
import requests
response = requests.get('https://api.example.com/users/1')
assert response.json()['name'] == 'Alice'
Fixtures let you compose complex test setups from small, reusable pieces. A single mock_api fixture can be shared across dozens of test functions, making your test suite both thorough and maintainable.
Summary
You now understand the full lifecycle of HTTP requests in Python:
- requests library: Simple, synchronous API consumption
- GET/POST/PUT/PATCH/DELETE: CRUD operations over HTTP
- Headers and Auth: API keys, Bearer tokens, Basic auth, token refresh
- Response Validation: Status codes and Pydantic models
- Sessions: Connection pooling for efficiency
- Error Handling and Retries: Transient vs. permanent failures, exponential backoff
- HTTPX: Async alternative for high-concurrency scenarios
- Common Mistakes: The five antipatterns that break most API integrations
- Testing: Mocking with unittest.mock, responses, and pytest fixtures
The progression here is deliberate. Start with requests for simple scripts; it handles the fundamentals with almost no ceremony. Graduate to Session for repeated calls to the same host, and add timeouts for production safety. Layer in structured error handling and retry logic when your script needs to run unattended. Add Pydantic validation when you need to trust the data your code processes. Reach for httpx when concurrency becomes the bottleneck.
The internet is not a reliable place. Networks partition, APIs throttle, servers crash, tokens expire, and JSON schemas drift. But with the patterns in this article, your Python code can handle all of it gracefully: logging useful errors, retrying what is worth retrying, failing fast on what is not, and staying secure throughout. That is the difference between code that works in a demo and code that runs in production for years.