July 22, 2025
Python APIs HTTP Requests

HTTP Requests in Python: Consuming REST APIs

You've built a database. You've shaped your data. Now you need to talk to the outside world.

Every day, your Python applications need to fetch weather data, submit orders, pull tweets, push notifications, or sync with third-party services. That's where HTTP requests come in. REST APIs are the lingua franca of modern web services, and if you're serious about Python, you need to master how to consume them.

We'll start with the industry standard, the requests library, then explore the newer httpx alternative for async work. By the end, you'll know how to authenticate, handle errors, validate responses, and retry failed requests like a pro.

Table of Contents
  1. Why API Consumption Is a Core Python Skill
  2. HTTP Protocol Essentials
  3. The Basics: GET and POST with Requests
  4. GET: Fetching Data
  5. POST: Sending Data
  6. Headers: Adding Auth and Metadata
  7. API Key Authentication
  8. Basic Authentication
  9. The Full Spectrum: PUT, PATCH, DELETE
  10. Handling Responses Like a Professional
  11. Status Code Checking
  12. Response Validation with Pydantic
  13. Sessions: Connection Pooling and Efficiency
  14. Authentication Patterns
  15. Bearer Token (OAuth 2.0)
  16. API Keys in Headers vs. Query Params
  17. Custom Headers and Request Tracing
  18. Error Handling and Retries
  19. Timeouts
  20. Exponential Backoff Retries
  21. Rate Limit Handling
  22. Real-World Example: Consuming a Paginated API
  23. Streaming Large Responses
  24. Async Requests with HTTPX
  25. When to Use Async
  26. Advanced HTTPX: Timeouts, Limits, and Monitoring
  27. Common API Mistakes
  28. Testing API Code Without Hitting the Network
  29. Using unittest.mock
  30. Using responses Library
  31. Testing with Fixtures
  32. Summary

Why API Consumption Is a Core Python Skill

Think about how modern software actually works. Your application rarely does everything itself. It reaches out to Stripe to charge a card, to Twilio to send an SMS, to OpenAI to generate text, to GitHub to read repository data. Every one of those interactions happens over HTTP, using the REST architectural style. If you want to build anything meaningful in Python (automation scripts, data pipelines, web apps, AI integrations), you will be consuming APIs constantly.

The good news is that Python has arguably the best ecosystem for this work. The requests library alone has been downloaded billions of times, and for good reason: it takes what is genuinely a complicated network communication problem and reduces it to a handful of intuitive function calls. But "easy to start" doesn't mean "nothing to learn." The gap between a script that works on your laptop under ideal conditions and a production system that handles flaky networks, expired tokens, rate limits, and malformed responses gracefully is enormous. That gap is exactly what this article closes.

We'll cover the full spectrum: from your first requests.get() call all the way through async concurrency with httpx, structured error hierarchies, token management, pagination, streaming, and testing without ever hitting a real network. Whether you're automating a side project or building a service that your company depends on, these patterns will serve you. By the time you finish, you'll understand not just how to make HTTP requests in Python, but why each technique exists and when to reach for it.

HTTP Protocol Essentials

Before you write a single line of requests code, it pays to understand what's actually happening under the hood. HTTP, the HyperText Transfer Protocol, is a request/response protocol that runs over TCP/IP. Your client sends a request message; the server sends back a response. Every HTTP request has three fundamental parts: a method (GET, POST, PUT, PATCH, DELETE), a URL that identifies the resource, and optional headers that carry metadata like authentication tokens or content type descriptions. Requests may also carry a body: a payload of data sent to the server, common with POST and PUT operations.

The server's response mirrors this structure: a status code that summarizes what happened, response headers, and an optional body containing the actual data. The status code is your first signal of success or failure. Codes in the 200s mean success (200 OK, 201 Created, 204 No Content). Codes in the 400s indicate client errors: something wrong with your request (400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests). Codes in the 500s indicate server errors: something went wrong on the other end (500 Internal Server Error, 503 Service Unavailable). Understanding these ranges intuitively makes you far better at diagnosing problems when they arise. The requests library surfaces all of this cleanly, but you need the mental model to use it well.
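These ranges can be captured in a tiny helper, a sketch for illustration (the function name is my own, not part of any library):

```python
def classify_status(code: int) -> str:
    """Map an HTTP status code to its broad category."""
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "other"  # 1xx informational and anything non-standard

print(classify_status(201))  # success
print(classify_status(429))  # client error
print(classify_status(503))  # server error
```

The key habit is reasoning by range first (is this my fault or the server's?) and by specific code second.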

REST, Representational State Transfer, is an architectural style that uses these HTTP primitives to model operations on resources. A "user" at /users/123 is a resource. GET retrieves it. PUT replaces it. PATCH partially updates it. DELETE removes it. POST to /users creates a new one. This uniform interface is what makes REST APIs so learnable: once you understand the pattern, you can consume almost any API with minimal documentation.

The Basics: GET and POST with Requests

The requests library is so simple and intuitive that it feels almost suspicious. But that simplicity comes from thoughtful API design.

Install it first:

bash
pip install requests

Now let's make your first request.

GET: Fetching Data

The GET method is what your browser uses every time you navigate to a URL. In Python, translating that same action into code takes one line. Here is the simplest possible form:

python
import requests
 
# The simplest possible request
response = requests.get('https://api.example.com/users')
 
# Check if it worked
if response.status_code == 200:
    users = response.json()
    print(users)
else:
    print(f"Failed with status {response.status_code}")

That's it. requests.get() blocks until the server responds, then gives you back a Response object packed with everything you need.

The Response object is your window into everything the server told you. Spend a moment learning its most important attributes before moving on:

  • status_code: HTTP status (200 = success, 404 = not found, 500 = server error)
  • json(): Parse response body as JSON (raises ValueError if the body is not valid JSON)
  • text: Raw response as a string
  • headers: Dictionary of response headers
  • url: The actual URL visited (useful after redirects)

Most APIs accept query parameters to filter, sort, or paginate results. The requests library handles URL encoding for you automatically, which saves you from a class of subtle bugs:

python
# Without requests (ugh)
url = 'https://api.example.com/users?page=2&limit=50'
response = requests.get(url)
 
# With requests (clean)
params = {'page': 2, 'limit': 50}
response = requests.get('https://api.example.com/users', params=params)
 
# Both hit the same URL, but the second is maintainable
print(response.url)  # Shows the full URL with params

When you pass params, requests URL-encodes them for you. No string concatenation, no manual escaping. It handles the edge cases too: None values are skipped, lists become key=val1&key=val2, and everything is properly quoted. This matters more than it sounds: hand-building query strings is a common source of bugs that only manifest with unusual inputs.
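You can watch this encoding happen without touching the network by preparing a request and inspecting the final URL, a quick offline illustration (the URL and params are made up):

```python
import requests

# Build (but don't send) a request to inspect how params are encoded
req = requests.Request(
    'GET',
    'https://api.example.com/search',
    params={'q': 'hello world', 'tags': ['python', 'http'], 'empty': None}
)
prepared = req.prepare()

print(prepared.url)
# https://api.example.com/search?q=hello+world&tags=python&tags=http
```

The space is quoted, the list expands into repeated keys, and the None value disappears entirely, all without sending a single byte.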

POST: Sending Data

While GET retrieves data, POST creates it. The key difference is that POST requests carry a body, the data you are sending to the server. The json= parameter in requests.post() is one of the library's best-designed features:

python
# Create a new user
payload = {
    'name': 'Alice',
    'email': 'alice@example.com',
    'age': 30
}
 
response = requests.post(
    'https://api.example.com/users',
    json=payload  # Automatically serializes to JSON + sets Content-Type header
)
 
if response.status_code == 201:  # 201 = Created
    new_user = response.json()
    print(f"Created user with ID {new_user['id']}")
else:
    print(response.text)  # Server error, see the message

Notice json=payload. This does two things:

  1. Serializes your Python dict to JSON
  2. Sets Content-Type: application/json automatically

If you already have JSON as a string, or need different Content-Type headers, use data= instead. This is less common but occasionally necessary when working with legacy systems or APIs that expect form-encoded data:

python
import json
 
json_string = json.dumps(payload)
response = requests.post(
    'https://api.example.com/users',
    data=json_string,
    headers={'Content-Type': 'application/json'}
)

But 99% of the time, just use json=.

Headers: Adding Auth and Metadata

APIs rarely accept anonymous requests. They need auth tokens, API keys, or custom headers.

API Key Authentication

The most common authentication pattern you will encounter is a Bearer token passed in the Authorization header. Here is the pattern, first the wrong way and then the right way:

python
headers = {
    'Authorization': 'Bearer YOUR_API_KEY_HERE',
    'User-Agent': 'MyApp/1.0'
}
 
response = requests.get(
    'https://api.example.com/data',
    headers=headers
)

Important: Never hardcode keys in your source files. Use environment variables. If your API key ever ends up in a git commit, consider it compromised immediately: even private repositories get breached, and git history is permanent. The correct pattern reads the key from the environment at runtime:

python
import os
 
api_key = os.getenv('API_KEY')
if not api_key:
    raise ValueError("API_KEY environment variable not set")
 
headers = {'Authorization': f'Bearer {api_key}'}
response = requests.get('https://api.example.com/data', headers=headers)

Then run: export API_KEY=sk_live_abc123... before your script.

Basic Authentication

Some APIs use old-school username/password (HTTP Basic Auth). requests has a shortcut:

python
from requests.auth import HTTPBasicAuth
 
response = requests.get(
    'https://api.example.com/secure',
    auth=HTTPBasicAuth('username', 'password')
)
 
# Even shorter
response = requests.get(
    'https://api.example.com/secure',
    auth=('username', 'password')
)

Internally, requests base64-encodes your credentials and adds them to the Authorization header. Always use HTTPS with Basic Auth; plain HTTP leaks credentials. Base64 encoding is not encryption; it is trivially reversible.
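Two standard-library calls are enough to see that reversibility for yourself:

```python
import base64

# This is exactly the kind of value that ends up in the Authorization header
encoded = base64.b64encode(b'username:password').decode()
print(encoded)                    # dXNlcm5hbWU6cGFzc3dvcmQ=

# Anyone who can read the header can decode it just as easily
print(base64.b64decode(encoded))  # b'username:password'
```

No key, no secret: decoding requires nothing but the encoded string. That is why the transport layer (HTTPS) has to provide the actual confidentiality.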

The Full Spectrum: PUT, PATCH, DELETE

REST APIs use different HTTP verbs for different operations.

  • GET: Retrieve data (idempotent, safe)
  • POST: Create new resource
  • PUT: Replace entire resource
  • PATCH: Partial update
  • DELETE: Remove resource

The distinction between PUT and PATCH trips up many developers. PUT semantics mean "replace the entire resource with what I'm sending." If the current user has 10 fields and you PUT an object with 3 fields, the server stores an object with 3 fields; the other 7 disappear. PATCH semantics mean "update only the fields I'm sending." This is why most modern APIs prefer PATCH. Here is a complete CRUD example to make these semantics concrete:

python
import requests
 
BASE_URL = 'https://api.example.com/posts'
 
# CREATE
new_post = {
    'title': 'My First Post',
    'content': 'Hello world',
    'author_id': 1
}
create_response = requests.post(BASE_URL, json=new_post)
post_id = create_response.json()['id']
print(f"Created post {post_id}")
 
# READ
read_response = requests.get(f'{BASE_URL}/{post_id}')
post_data = read_response.json()
print(f"Title: {post_data['title']}")
 
# UPDATE (partial)
updates = {'title': 'My Updated Post'}
update_response = requests.patch(f'{BASE_URL}/{post_id}', json=updates)
print(f"Updated: {update_response.status_code}")
 
# DELETE
delete_response = requests.delete(f'{BASE_URL}/{post_id}')
if delete_response.status_code == 204:  # 204 = No Content
    print("Post deleted")

This four-operation pattern (Create, Read, Update, Delete) maps directly onto the HTTP methods most APIs expose. Notice that DELETE typically returns 204 No Content, meaning the request succeeded but there is nothing to return. A 200 with an empty body is also valid; check the API documentation to know which to expect.

Handling Responses Like a Professional

A successful HTTP status (2xx) doesn't mean your data is valid. A 200 response could contain garbage JSON or missing required fields.

Status Code Checking

The temptation for beginners is to call .json() immediately on every response and hope for the best. This works fine in development but falls apart in production when APIs return errors, maintenance pages, or malformed data. Build the habit of checking status codes explicitly:

python
response = requests.get('https://api.example.com/data')
 
# Bad: Blindly assume success
data = response.json()  # Crashes if status is 404
 
# Good: Check first
if response.status_code == 200:
    data = response.json()
elif response.status_code == 404:
    print("Resource not found")
elif response.status_code == 500:
    print("Server error")
else:
    print(f"Unexpected status: {response.status_code}")
 
# Better: Use raise_for_status()
response = requests.get('https://api.example.com/data')
response.raise_for_status()  # Raises HTTPError if status >= 400
data = response.json()

raise_for_status() converts 4xx/5xx responses into exceptions. Catch them explicitly so you can log intelligently and respond appropriately to different failure modes:

python
import requests
 
try:
    response = requests.get('https://api.example.com/data')
    response.raise_for_status()
    data = response.json()
except requests.exceptions.HTTPError as e:
    print(f"HTTP error: {e}")
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")
except ValueError as e:
    print(f"Invalid JSON response: {e}")

This three-layer exception handling covers the three main failure modes: the server returned an error status, the network itself failed, and the response body was not valid JSON. Each requires a different response from your code.

Response Validation with Pydantic

JSON responses need validation. Python dicts are flexible, but your code expects specific fields with specific types. That's where Pydantic shines.

The problem with raw dictionaries is that they fail silently. If the API returns {"id": "abc"} when you expected {"id": 123}, you might not discover the problem until much later in your code when you try to use user["id"] as a number. Pydantic catches this at the boundary, exactly where you want it caught:

python
from pydantic import BaseModel, ValidationError
 
class User(BaseModel):
    id: int
    name: str
    email: str
    age: int | None = None  # Optional field
 
# Parse and validate in one step
response = requests.get('https://api.example.com/users/1')
response.raise_for_status()
 
try:
    user = User(**response.json())
    print(f"User {user.name} is {user.age} years old")
except ValidationError as e:
    print(f"Invalid response structure: {e}")

If the API returns {"id": "not_a_number", "name": "Bob"}, Pydantic catches it immediately. No silent data corruption, no TypeError three functions deep.

For lists of items, the approach is just as clean:

python
from typing import List
 
response = requests.get('https://api.example.com/users')
response.raise_for_status()
 
try:
    users = [User(**item) for item in response.json()]
    for user in users:
        print(f"{user.name}: {user.email}")
except ValidationError as e:
    print(f"Invalid response: {e}")

Pydantic also gives you free documentation of what you expect from the API: the model class itself is a specification. If the API changes and starts returning different fields, your Pydantic model immediately surfaces the discrepancy rather than letting bad data propagate through your system silently.

Sessions: Connection Pooling and Efficiency

When you call requests.get() repeatedly, each request opens a new TCP connection and closes it. For many requests, this is wasteful. Session objects reuse connections.

The cost of opening a TCP connection is not trivial. It requires a three-way handshake between client and server, and if you are using HTTPS (which you always should be in production), there is also a TLS handshake on top of that. For a handful of requests, this overhead is negligible. For hundreds or thousands of requests, it becomes the dominant cost. Sessions solve this with connection pooling:

python
import requests
 
# Without sessions (creates new connection each time)
for i in range(100):
    response = requests.get(f'https://api.example.com/items/{i}')
    print(response.status_code)
 
# With sessions (reuses connection)
session = requests.Session()
for i in range(100):
    response = session.get(f'https://api.example.com/items/{i}')
    print(response.status_code)

The second version is much faster. Session keeps the underlying TCP connection alive between requests, reducing overhead.

Sessions also let you configure persistent headers and auth, which eliminates repetitive code and reduces the chance of accidentally sending a request without authentication:

python
session = requests.Session()
session.headers.update({'Authorization': f'Bearer {api_key}'})
 
# Every request now includes the auth header
response1 = session.get('https://api.example.com/users')
response2 = session.get('https://api.example.com/posts')
 
# Clean up when done
session.close()
 
# Or use as a context manager
with requests.Session() as session:
    session.headers.update({'Authorization': f'Bearer {api_key}'})
    response = session.get('https://api.example.com/data')
    # Automatically closes

The context manager pattern (the with statement) is cleaner: Python closes the session automatically when you exit the with block, even if an exception occurs. This prevents connection leaks in error paths.

Authentication Patterns

Different APIs demand different authentication approaches. Understanding the landscape lets you recognize what an API requires just from its documentation and implement it correctly the first time.

Bearer Token (OAuth 2.0)

Modern APIs use Bearer tokens, often obtained through OAuth 2.0 flows. The token is included in the Authorization header and typically has an expiration time. Handling expiration correctly is what separates robust clients from fragile ones:

python
import requests
import os
 
def get_bearer_token():
    """
    In a real app, you'd obtain this from an OAuth provider.
    Here we retrieve it from environment or a token store.
    """
    token = os.getenv('BEARER_TOKEN')
    if not token:
        raise ValueError("BEARER_TOKEN not configured")
    return token
 
def fetch_protected_resource():
    token = get_bearer_token()
    headers = {'Authorization': f'Bearer {token}'}
 
    response = requests.get(
        'https://api.example.com/protected',
        headers=headers,
        timeout=5
    )
    response.raise_for_status()
    return response.json()

For applications where tokens expire and need refreshing, a token manager class centralizes that logic so the rest of your code never has to think about it:

python
import os
import time

import requests
 
class TokenManager:
    def __init__(self, refresh_url: str, client_id: str, client_secret: str):
        self.refresh_url = refresh_url
        self.client_id = client_id
        self.client_secret = client_secret
        self.token = None
        self.expires_at = 0
 
    def get_token(self) -> str:
        """Get valid token, refreshing if necessary."""
        current_time = time.time()
 
        # Refresh if expired or within 60 seconds of expiry
        if current_time >= (self.expires_at - 60):
            self._refresh_token()
 
        return self.token
 
    def _refresh_token(self):
        """Obtain new token from refresh endpoint."""
        response = requests.post(
            self.refresh_url,
            auth=(self.client_id, self.client_secret),
            timeout=5
        )
        response.raise_for_status()
 
        data = response.json()
        self.token = data['access_token']
        self.expires_at = time.time() + data.get('expires_in', 3600)
 
# Usage
token_manager = TokenManager(
    refresh_url='https://auth.example.com/token',
    client_id=os.getenv('CLIENT_ID'),
    client_secret=os.getenv('CLIENT_SECRET')
)
 
headers = {'Authorization': f'Bearer {token_manager.get_token()}'}
response = requests.get('https://api.example.com/data', headers=headers)

This automatically refreshes tokens before they expire, so your requests never fail due to stale authentication. The 60-second buffer before expiry ensures that a token does not expire mid-request during high-latency operations.

API Keys in Headers vs. Query Params

Some APIs expect the key as a header (as shown above), others expect it as a query parameter, and some support both. Always prefer the header approach: query parameters end up in server logs, browser history, and anywhere else the URL gets stored. A header value is far less likely to be accidentally leaked:

python
# Less secure: key in URL (avoid when possible)
response = requests.get(
    'https://api.example.com/data',
    params={'api_key': api_key}
)
 
# More secure: key in header (prefer this)
response = requests.get(
    'https://api.example.com/data',
    headers={'X-API-Key': api_key}
)

Custom Headers and Request Tracing

For production systems, adding a request ID to every outbound call is a best practice that makes debugging dramatically easier. When something goes wrong and you contact the API provider, they can search their logs for your request ID:

python
import uuid
 
headers = {
    'Authorization': f'Bearer {api_key}',
    'User-Agent': 'MyApp/1.0 (+http://example.com)',
    'X-API-Version': '2024-02',
    'X-Request-ID': str(uuid.uuid4())  # For request tracing
}
 
response = requests.get('https://api.example.com/data', headers=headers)

Generate a new UUID for each request so every call is uniquely identifiable; the provider can then correlate that exact ID with their server logs when debugging is needed.

Error Handling and Retries

Network calls fail. Servers go down, connections drop, routers lose packets. Rate limits get hit. Tokens expire at the worst moment. Professional code handles all of these gracefully, and the key insight is that different failures warrant different responses.

Some errors are transient: a timeout, a momentary network blip, a server restart. These are worth retrying because the next attempt may succeed. Other errors are permanent: a 401 Unauthorized means your credentials are wrong, and retrying immediately will just get you another 401. A 400 Bad Request means your data is malformed, and retrying will get you another 400. Mixing these up by retrying everything is a common mistake that wastes time and can even make things worse: hammering a rate-limited endpoint with retries will just extend how long you are blocked.
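One way to make that distinction explicit is a small predicate over status codes; the sets and function name below are illustrative, not from any library:

```python
# Transient failures: the next attempt may succeed
TRANSIENT = {408, 429, 500, 502, 503, 504}
# Permanent failures: retrying returns the same answer
PERMANENT = {400, 401, 403, 404, 422}

def is_retryable(status_code: int) -> bool:
    """Decide whether a failed request is worth retrying."""
    if status_code in TRANSIENT:
        return True
    if status_code in PERMANENT:
        return False
    # Default: retry unknown 5xx codes, never retry other 4xx codes
    return 500 <= status_code < 600

print(is_retryable(503))  # True  -- the server may recover
print(is_retryable(401))  # False -- bad credentials won't fix themselves
```

A predicate like this becomes the gatekeeper for whatever retry loop you build around it.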

Timeouts

python
# Without timeout (bad for production)
response = requests.get('https://slow-api.example.com/data')
# Hangs forever if server doesn't respond
 
# With timeout (good)
response = requests.get('https://slow-api.example.com/data', timeout=5)
# Raises requests.Timeout if no response within 5 seconds

You can specify separate connect and read timeouts. The connect timeout controls how long to wait while establishing the TCP connection; the read timeout controls how long to wait for data after the connection is established:

python
# (connect_timeout, read_timeout)
response = requests.get(
    'https://api.example.com/data',
    timeout=(3, 10)  # 3 seconds to connect, 10 to receive
)

Always set a reasonable timeout in production code. Omitting it is a liability: a malicious or misbehaving server can hang your application indefinitely with a slow response, and a single slow dependency can block your entire application.

Exponential Backoff Retries

When a request fails transiently (network fluke, server temporarily down), retry with escalating delays. The exponential backoff pattern prevents all clients from hammering a recovering server at the same time:

python
import time
import requests
 
def fetch_with_retry(url, max_attempts=3, base_delay=1):
    """Fetch with exponential backoff on transient failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            response = requests.get(url, timeout=5)
            response.raise_for_status()
            return response
        except requests.exceptions.RequestException as e:
            # Client errors (4xx) are permanent: retrying won't change the outcome
            if isinstance(e, requests.exceptions.HTTPError) and e.response.status_code < 500:
                raise
            if attempt == max_attempts:
                raise  # Give up after final attempt

            # Calculate backoff: 1s, 2s, 4s
            delay = base_delay * (2 ** (attempt - 1))
            print(f"Attempt {attempt} failed: {e}. Retrying in {delay}s...")
            time.sleep(delay)
 
# Usage
try:
    response = fetch_with_retry('https://flaky-api.example.com/data')
    data = response.json()
except requests.exceptions.RequestException as e:
    print(f"Failed after retries: {e}")

This retries on transient errors (connection timeout, temporarily unreachable) but not on permanent ones (400 Bad Request, 401 Unauthorized). For production, use the tenacity library, which gives you declarative retry logic with far less boilerplate:

bash
pip install tenacity

python
from tenacity import retry, stop_after_attempt, wait_exponential
import requests
 
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=1, max=10)
)
def fetch_data(url):
    response = requests.get(url, timeout=5)
    response.raise_for_status()
    return response.json()
 
# Automatically retries with exponential backoff
data = fetch_data('https://api.example.com/data')

tenacity handles the exponential backoff math for you and has hooks for logging, callbacks, and conditional retries (only retry on certain error codes). The wait_exponential configuration here means the library will wait between 1 and 10 seconds between retries, doubling each time.

Rate Limit Handling

Many APIs enforce rate limits and return a 429 Too Many Requests status when you exceed them. The response usually includes a Retry-After header telling you how long to wait. Respecting it is both polite and practical: continuing to send requests while rate-limited achieves nothing and may get your access revoked:

python
import time
 
def rate_limit_aware_request(session, url, **kwargs):
    """Make request, respecting rate limit responses."""
    response = session.get(url, **kwargs)
 
    if response.status_code == 429:
        retry_after = int(response.headers.get('Retry-After', 60))
        print(f"Rate limited. Waiting {retry_after} seconds...")
        time.sleep(retry_after)
        # Retry once after waiting
        response = session.get(url, **kwargs)
 
    response.raise_for_status()
    return response

For comprehensive error handling that distinguishes between all the failure modes, a structured client class is the cleanest approach:

python
import requests
from requests.exceptions import (
    ConnectionError,
    Timeout,
    HTTPError,
    RequestException
)
 
class APIClient:
    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url
        self.api_key = api_key
        self.session = requests.Session()
        self.session.headers.update({
            'Authorization': f'Bearer {api_key}'
        })
 
    def request(self, method: str, endpoint: str, **kwargs):
        """Make a request with comprehensive error handling."""
        url = f"{self.base_url}/{endpoint}"
 
        try:
            response = self.session.request(
                method,
                url,
                timeout=10,
                **kwargs
            )
 
            # Distinguish error types
            if response.status_code == 429:
                raise RateLimitError(f"Rate limited: {response.headers.get('Retry-After')}")
            elif response.status_code == 401:
                raise AuthenticationError("Invalid or expired API key")
            elif response.status_code == 403:
                raise AuthorizationError("Access denied")
            elif response.status_code >= 400:
                raise APIError(f"API error: {response.status_code} - {response.text}")
 
            return response.json()
 
        except Timeout:
            raise APIError("Request timeout, server not responding")
        except ConnectionError:
            raise APIError("Connection failed, check network")
        except HTTPError as e:
            raise APIError(f"HTTP error: {e}")
        except ValueError:
            raise APIError("Invalid JSON response from API")
 
    def get(self, endpoint: str, **kwargs):
        return self.request('GET', endpoint, **kwargs)
 
    def post(self, endpoint: str, **kwargs):
        return self.request('POST', endpoint, **kwargs)
 
    def close(self):
        self.session.close()
 
# Custom exception classes
class APIError(Exception):
    pass
 
class RateLimitError(APIError):
    pass
 
class AuthenticationError(APIError):
    pass
 
class AuthorizationError(APIError):
    pass
 
# Usage with explicit error handling
client = APIClient(
    base_url='https://api.example.com',
    api_key=os.getenv('API_KEY')
)
 
try:
    data = client.get('users/123')
except AuthenticationError:
    print("Re-authenticate and try again")
except RateLimitError as e:
    print(f"Rate limited: {e}")
except APIError as e:
    print(f"Permanent failure: {e}")
finally:
    client.close()

This pattern separates transient errors (network hiccups) from permanent ones (auth failure). You retry transient errors; permanent errors need human intervention.

Real-World Example: Consuming a Paginated API

Most APIs paginate large result sets. Here's how to handle pagination:

python
import requests
from typing import List, Dict, Any
 
def fetch_all_posts(base_url: str, api_key: str) -> List[Dict[str, Any]]:
    """
    Fetch all posts from a paginated API.
    Assumes: 'page' query param, 'per_page=50', response has 'data' and 'total' keys.
    """
    session = requests.Session()
    session.headers.update({'Authorization': f'Bearer {api_key}'})
 
    all_posts = []
    page = 1
    per_page = 50
 
    while True:
        try:
            response = session.get(
                base_url,
                params={'page': page, 'per_page': per_page},
                timeout=10
            )
            response.raise_for_status()
 
            data = response.json()
            posts = data.get('data', [])
 
            if not posts:
                break  # No more posts
 
            all_posts.extend(posts)
 
            # Check if there are more pages
            total = data.get('total', 0)
            if len(all_posts) >= total:
                break
 
            page += 1
 
        except requests.exceptions.RequestException as e:
            print(f"Error fetching page {page}: {e}")
            break  # Or retry with backoff
 
    session.close()
    return all_posts
 
# Usage
posts = fetch_all_posts(
    'https://api.example.com/posts',
    api_key='your-api-key'
)
print(f"Fetched {len(posts)} posts")

This pattern:

  1. Maintains a session for efficiency
  2. Handles pagination via page parameter
  3. Gracefully stops when no more data
  4. Catches errors without crashing

Cursor-Based Pagination: Some APIs use cursors (opaque tokens) instead of page numbers. They're more efficient for large datasets:

python
def fetch_all_with_cursor(base_url: str, api_key: str):
    """Fetch all items using cursor-based pagination."""
    session = requests.Session()
    session.headers.update({'Authorization': f'Bearer {api_key}'})
 
    all_items = []
    cursor = None
 
    while True:
        try:
            params = {'limit': 100}
            if cursor:
                params['cursor'] = cursor
 
            response = session.get(base_url, params=params, timeout=10)
            response.raise_for_status()
 
            data = response.json()
            items = data.get('items', [])
 
            if not items:
                break
 
            all_items.extend(items)
 
            # Check for next cursor
            cursor = data.get('next_cursor')
            if not cursor:
                break  # No more pages
 
        except requests.exceptions.RequestException as e:
            print(f"Error during pagination: {e}")
            break
 
    session.close()
    return all_items

Cursor-based pagination is more robust: it handles insertions and deletions without skipping items.
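The cursor loop generalizes nicely into a generator that is easy to unit test, because the HTTP call is injected as a function. This is a sketch under the same assumed response shape ('items' and 'next_cursor' keys); in production, fetch_page would wrap session.get():

python
from typing import Any, Callable, Dict, Iterator, Optional

def paginate(fetch_page: Callable[[Optional[str]], Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield items across all pages; fetch_page(cursor) returns one page dict."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        items = page.get('items', [])
        if not items:
            return
        yield from items
        cursor = page.get('next_cursor')
        if not cursor:
            return  # Last page

# A fake page store stands in for the network during testing
pages = {
    None: {'items': [{'id': 1}, {'id': 2}], 'next_cursor': 'abc'},
    'abc': {'items': [{'id': 3}], 'next_cursor': None},
}
items = list(paginate(lambda cursor: pages[cursor]))
print(len(items))  # 3

Separating the pagination logic from the transport also means callers can stop early: iterating `paginate(...)` lazily fetches only the pages actually consumed.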

Streaming Large Responses

When downloading large files or streaming responses, don't load everything into memory. Use streaming:

python
import requests
 
def download_large_file(url: str, output_path: str, api_key: str):
    """Download large file without loading into memory."""
    headers = {'Authorization': f'Bearer {api_key}'}
 
    # stream=True returns bytes as they arrive
    with requests.get(url, headers=headers, stream=True, timeout=30) as response:
        response.raise_for_status()
 
        # Get total size from headers
        total_size = int(response.headers.get('Content-Length', 0))
 
        # Write in chunks
        with open(output_path, 'wb') as f:
            downloaded = 0
            for chunk in response.iter_content(chunk_size=8192):
                if chunk:  # Filter keep-alive chunks
                    f.write(chunk)
                    downloaded += len(chunk)
                    percent = (downloaded / total_size * 100) if total_size else 0
                    print(f"Downloaded: {percent:.1f}%")
 
download_large_file(
    'https://api.example.com/files/data.csv',
    'local_data.csv',
    api_key='your-key'
)

The chunk_size=8192 means you process 8KB at a time instead of holding the entire file in memory at once. Memory usage stays constant regardless of file size: a 10MB file and a 10GB file use the same amount of RAM to download with this approach.
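The chunking arithmetic works with any file-like object, which makes it easy to see (and test) without a network. A sketch using io.BytesIO as a stand-in for the response body:

python
import io

def copy_in_chunks(src, dst, chunk_size=8192):
    """Copy src to dst chunk_size bytes at a time; returns total bytes copied."""
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # Empty read means end of stream
            break
        dst.write(chunk)
        copied += len(chunk)
    return copied

src = io.BytesIO(b'x' * 20000)   # Stand-in for the streamed response body
dst = io.BytesIO()
print(copy_in_chunks(src, dst))  # 20000, moved in 8192 + 8192 + 3616 byte reads

At no point does the loop hold more than one chunk in memory, which is exactly the property iter_content gives you over a real HTTP response.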

For JSON streaming (common in logs or search APIs):

python
import requests
import json
 
def stream_json_lines(url: str, api_key: str):
    """Stream JSON lines (newline-delimited JSON)."""
    headers = {'Authorization': f'Bearer {api_key}'}
 
    with requests.get(url, headers=headers, stream=True) as response:
        response.raise_for_status()
 
        for line in response.iter_lines():
            if line:
                obj = json.loads(line)
                yield obj  # Process one line at a time
 
# Usage
for log_entry in stream_json_lines('https://api.example.com/logs', api_key='key'):
    print(f"Event: {log_entry['event_type']}")

This processes a 100MB log file without loading it all into memory. The generator pattern here is elegant: yield obj turns this function into a lazy iterator, so the caller processes each record as it arrives rather than waiting for the entire file to download first.
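Because the function is a generator, you can feed it straight into aggregation code that also never materializes the full stream. A sketch, using a plain list of byte strings as a stand-in for response.iter_lines():

python
import json
from collections import Counter

def count_events(lines):
    """Tally event_type across newline-delimited JSON without storing records."""
    counts = Counter()
    for line in lines:
        if line:  # Skip keep-alive blank lines
            counts[json.loads(line)['event_type']] += 1
    return counts

# Stand-in for response.iter_lines() in a real streaming call
fake_stream = [
    b'{"event_type": "login"}',
    b'{"event_type": "click"}',
    b'{"event_type": "login"}',
]
print(count_events(fake_stream))  # Counter({'login': 2, 'click': 1})

Swap fake_stream for the stream_json_lines generator above and the same tallying code works over a live API, one record at a time.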

Async Requests with HTTPX

If you're making hundreds of concurrent requests, synchronous code (with requests) becomes a bottleneck. Each request blocks the thread. For I/O-bound work, async is transformative.

Enter httpx, the spiritual successor to requests:

bash
pip install httpx

The beauty of httpx is that its API is nearly identical to requests. If you already know requests, you already know httpx. The only meaningful addition is the await keyword and the async with context manager. Here is a concrete comparison that shows just how similar they are:

python
import asyncio
import httpx
 
async def fetch_user(client, user_id):
    """Fetch a single user asynchronously."""
    response = await client.get(f'https://api.example.com/users/{user_id}')
    response.raise_for_status()
    return response.json()
 
async def main():
    """Fetch multiple users concurrently."""
    async with httpx.AsyncClient() as client:
        # Launch all requests concurrently
        tasks = [
            fetch_user(client, user_id)
            for user_id in range(1, 11)
        ]
        users = await asyncio.gather(*tasks)
 
        for user in users:
            print(f"{user['name']}: {user['email']}")
 
asyncio.run(main())

This fetches 10 users concurrently in the time one synchronous request would take. The asyncio.gather(*tasks) call launches all 10 requests simultaneously and waits for all of them to complete. If each request takes 500ms, the synchronous version takes 5 seconds; the async version takes about 500ms.

Why HTTPX over requests for async?

  • requests doesn't support async natively (it's designed for sync)
  • httpx has identical API to requests but with async/await support
  • Drop-in replacement if you already know requests

Comparison:

python
# Requests (synchronous)
import requests
response = requests.get('https://api.example.com/users/1')
data = response.json()
 
# HTTPX (asynchronous)
import httpx
response = await client.get('https://api.example.com/users/1')
data = response.json()  # Same!

Headers, auth, timeouts, status code checking: all identical. The only difference is await.

When to Use Async

Use async when:

  • Many concurrent requests: 10+ simultaneous API calls
  • I/O-bound bottleneck: Waiting for network is your slowest part
  • Web servers/frameworks: FastAPI, Quart, Starlette all expect async

Avoid async if:

  • Few requests: Single or double-digit requests per operation
  • CPU-bound work: Data processing dominates, not network
  • Synchronous dependencies: Calling non-async libraries forces sync
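You can see the payoff with nothing but asyncio.sleep standing in for network latency. Ten simulated 50ms requests finish in roughly 50ms when gathered concurrently, versus 500ms run back to back:

python
import asyncio
import time

async def fake_request(i):
    await asyncio.sleep(0.05)  # Stand-in for waiting on the network
    return i

async def main():
    start = time.perf_counter()
    # All ten coroutines wait concurrently, so total time ~ one request
    results = await asyncio.gather(*(fake_request(i) for i in range(10)))
    elapsed = time.perf_counter() - start
    print(f"{len(results)} requests in {elapsed:.2f}s")
    return elapsed

elapsed = asyncio.run(main())

The same shape holds for real HTTP calls with httpx: the speedup comes from overlapping the waiting, which is why CPU-bound work sees no benefit.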

Advanced HTTPX: Timeouts, Limits, and Monitoring

Just like requests, httpx supports timeouts and has session-like pooling. Connection limits are especially important in async code, because it is easy to accidentally open hundreds of simultaneous connections to the same server, which can get your IP blocked:

python
import httpx
import asyncio
 
async def fetch_with_limits():
    """
    HTTPX with connection pooling limits.
    """
    limits = httpx.Limits(
        max_connections=10,      # Total concurrent connections
        max_keepalive_connections=5  # Reuse limit
    )
 
    # Configure once, reuse for all requests
    async with httpx.AsyncClient(
        limits=limits,
        timeout=10.0
    ) as client:
        # These requests share the connection pool
        tasks = [
            client.get(f'https://api.example.com/users/{i}')
            for i in range(50)
        ]
        responses = await asyncio.gather(*tasks)
        return [r.json() for r in responses]
 
# Run it
data = asyncio.run(fetch_with_limits())

HTTPX automatically manages the connection pool and respects the limits you set. This prevents overwhelming the server or exhausting your system's file descriptors.

For monitoring and debugging:

python
import httpx
import logging
 
logging.basicConfig(level=logging.DEBUG)
 
async def fetch_with_monitoring():
    async with httpx.AsyncClient() as client:
        # httpx logs all requests when DEBUG is enabled
        response = await client.get('https://api.example.com/data')
        print(f"Status: {response.status_code}")
        print(f"Size: {len(response.content)} bytes")
        print(f"Elapsed: {response.elapsed.total_seconds()}s")
        return response.json()

The response.elapsed attribute tells you how long the request took, useful for performance monitoring.

Common API Mistakes

Even experienced developers fall into patterns that cause subtle bugs and fragile code. Recognizing these mistakes is the fastest path to writing production-quality API consumers.

Mistake 1: Not checking the status code. The most common beginner error is calling .json() directly on a response without checking whether the request succeeded. A 404 or 500 response often returns a JSON body too, just not the JSON you wanted. Always call raise_for_status() or check the status explicitly before parsing the body.

Mistake 2: Hardcoding credentials. API keys, tokens, and passwords should never appear in source code. Not even in private repositories. Use environment variables or a secrets manager. If you accidentally commit a credential, rotate it immediately; do not assume that reverting the commit makes you safe, because the commit remains in git history and may already have been cloned or cached elsewhere.

Mistake 3: Omitting timeouts. Without explicit timeouts, your application will hang indefinitely waiting for a server that has crashed or become unreachable. In a web server context, this can exhaust your thread pool and take down your entire service because of one slow dependency. Always set both connect and read timeouts.

Mistake 4: Retrying non-retryable errors. Retrying a 400 Bad Request is pointless: the request is malformed and will fail every time. Retrying a 401 Unauthorized is counterproductive: you are just burning through your rate limit. Only retry on network errors and 5xx server errors (and 429 rate limit errors after the specified wait). Be explicit about which errors trigger a retry.

Mistake 5: Creating new connections for each request in a loop. Each top-level requests.get() call creates and tears down a connection. Inside a loop making dozens of calls, this is a significant performance penalty. Always use a Session when making multiple requests to the same host; reusing connections can be 2-5x faster in real workloads.

Here is the full set of patterns side by side:

python
# Mistake 1: No status check
data = requests.get('https://api.example.com/data').json()
# Fix:
response = requests.get('https://api.example.com/data')
response.raise_for_status()
data = response.json()
 
# Mistake 2: Hardcoded key
headers = {'Authorization': 'Bearer sk_live_abc123def456'}
# Fix:
import os
headers = {'Authorization': f'Bearer {os.getenv("API_KEY")}'}
 
# Mistake 3: No timeout
requests.get('https://api.example.com/slow')
# Fix:
requests.get('https://api.example.com/slow', timeout=5)
 
# Mistake 4: Retrying everything
for attempt in range(3):
    response = requests.get(url)  # Will retry 400s pointlessly
# Fix: Only retry transient errors (network failures, 5xx)
for attempt in range(3):
    response = requests.get(url, timeout=5)
    if response.status_code < 500:
        break  # Success or client error: retrying won't help
 
# Mistake 5: No session in loops
for i in range(100):
    requests.get(f'https://api.example.com/item/{i}')
# Fix:
with requests.Session() as session:
    for i in range(100):
        session.get(f'https://api.example.com/item/{i}')

Avoiding these five mistakes eliminates the majority of real-world API consumption bugs before they occur.

Testing API Code Without Hitting the Network

Unit testing API consumers is tricky: you can't rely on external APIs during tests. Mock them instead.

Tests that make real network calls are slow, flaky (what happens when the API is down?), and can have side effects (what if the test creates real records?). Mocking solves all three problems at once. The goal is to test your code's behavior given particular API responses, not to test that the API itself works.

Using unittest.mock

python
import unittest
from unittest.mock import patch, MagicMock
import requests
from my_api_client import fetch_user_data
 
class TestAPIClient(unittest.TestCase):
    @patch('requests.get')
    def test_fetch_user_success(self, mock_get):
        """Test successful API call."""
        # Mock the response
        mock_response = MagicMock()
        mock_response.status_code = 200
        mock_response.json.return_value = {
            'id': 1,
            'name': 'Alice',
            'email': 'alice@example.com'
        }
        mock_get.return_value = mock_response
 
        # Call your function
        user = fetch_user_data(1)
 
        # Assert it worked
        self.assertEqual(user['name'], 'Alice')
        mock_get.assert_called_once()
 
    @patch('requests.get')
    def test_fetch_user_not_found(self, mock_get):
        """Test 404 handling."""
        mock_response = MagicMock()
        mock_response.status_code = 404
        mock_response.raise_for_status.side_effect = requests.HTTPError()
        mock_get.return_value = mock_response
 
        with self.assertRaises(requests.HTTPError):
            fetch_user_data(999)

The @patch decorator replaces the real requests.get with a mock object. You control what it returns, so tests run instantly without network calls.
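side_effect is also how you simulate failures that never produce a response at all: timeouts and connection errors. A sketch, with a hypothetical fetch_user_data defined inline so the failure path is testable end to end (the real client from the examples above would behave the same way):

python
from unittest.mock import patch
import requests

def fetch_user_data(user_id):
    """Hypothetical client under test: lets requests exceptions propagate."""
    response = requests.get(f'https://api.example.com/users/{user_id}', timeout=5)
    response.raise_for_status()
    return response.json()

# The mock raises instead of returning, as if the TCP connection failed
with patch('requests.get', side_effect=requests.exceptions.ConnectionError("refused")):
    try:
        fetch_user_data(1)
    except requests.exceptions.ConnectionError as e:
        print(f"Caught: {e}")  # Your retry/fallback logic would trigger here

Tests like this are the only practical way to exercise retry and fallback branches, since you can't ask a real API to fail on demand.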

Using responses Library

For more realistic mocking, use the responses library:

bash
pip install responses
python
import responses
import requests
 
@responses.activate
def test_fetch_users():
    """Mock API responses realistically."""
    # Register mock response
    responses.add(
        responses.GET,
        'https://api.example.com/users',
        json=[
            {'id': 1, 'name': 'Alice'},
            {'id': 2, 'name': 'Bob'}
        ],
        status=200
    )
 
    # Your code makes the request
    response = requests.get('https://api.example.com/users')
    users = response.json()
 
    # Test assertions
    assert len(users) == 2
    assert users[0]['name'] == 'Alice'

responses intercepts HTTP calls made through requests, returning your mock data for registered URLs and raising a ConnectionError for anything unregistered. It's more readable than unittest.mock for API testing.

Testing with Fixtures

For pytest, create reusable fixtures:

python
import pytest
import responses
 
@pytest.fixture
def mock_api():
    """Context manager for mocked API calls."""
    with responses.RequestsMock() as rsps:
        rsps.add(
            responses.GET,
            'https://api.example.com/users/1',
            json={'id': 1, 'name': 'Alice'},
            status=200
        )
        yield rsps
 
def test_with_fixture(mock_api):
    """Use the fixture in tests."""
    import requests
    response = requests.get('https://api.example.com/users/1')
    assert response.json()['name'] == 'Alice'

Fixtures let you compose complex test setups from small, reusable pieces. A single mock_api fixture can be shared across dozens of test functions, making your test suite both thorough and maintainable.

Summary

You now understand the full lifecycle of HTTP requests in Python:

  • requests library: Simple, synchronous API consumption
  • GET/POST/PUT/PATCH/DELETE: CRUD operations over HTTP
  • Headers and Auth: API keys, Bearer tokens, Basic auth, token refresh
  • Response Validation: Status codes and Pydantic models
  • Sessions: Connection pooling for efficiency
  • Error Handling and Retries: Transient vs. permanent failures, exponential backoff
  • HTTPX: Async alternative for high-concurrency scenarios
  • Common Mistakes: The five antipatterns that break most API integrations
  • Testing: Mocking with unittest.mock, responses, and pytest fixtures

The progression here is deliberate. Start with requests for simple scripts; it handles the fundamentals with almost no ceremony. Graduate to Session for repeated calls to the same host, and add timeouts for production safety. Layer in structured error handling and retry logic when your script needs to run unattended. Add Pydantic validation when you need to trust the data your code processes. Reach for httpx when concurrency becomes the bottleneck.

The internet is not a reliable place. Networks partition, APIs throttle, servers crash, tokens expire, and JSON schemas drift. But with the patterns in this article, your Python code can handle all of it gracefully, logging useful errors, retrying what is worth retrying, failing fast on what is not, and staying secure throughout. That is the difference between code that works in a demo and code that runs in production for years.
