Python Modules and Imports: Organizing Your Code

You've written some solid Python code by now. But here's the problem: as your projects grow, keeping everything in a single .py file becomes chaos. Functions pile up. Related code scatters across hundreds of lines. You're searching for where you defined that utility function. And don't even get me started on code reuse across multiple projects.
That's where modules and imports come in. They're Python's solution to code organization, reusability, and maintainability. Once you understand how they work, you'll write cleaner projects, collaborate better, and stop reinventing the wheel.
Table of Contents
- Why Code Organization Matters
- What Is a Module?
- Import Syntax: Three Ways to Bring in Code
- Method 1: `import module_name`
- Method 2: `from module import name`
- Method 3: `import module as alias`
- How Python Finds Modules
- Packages: Organizing Modules into Folders
- Package Structure Patterns
- The `__name__ == '__main__'` Guard: When to Run Code
- Relative vs. Absolute Imports in Packages
- Absolute Import
- Relative Import
- Circular Import Solutions
- Common Import Mistakes
- Controlling a Module's Public API with `__all__`
- Putting It All Together: A Real Project Structure
- Common Import Gotchas
- Gotcha 1: Forgetting `__init__.py`
- Gotcha 2: Running from the Wrong Directory
- Gotcha 3: `from module import *` (Generally Avoid)
- Summary
Why Code Organization Matters
Before we dive into the mechanics, let's talk about why this matters so much, especially as you progress toward AI and machine learning work. When you first start learning Python, every script you write lives in a single file. That's perfectly fine when you're experimenting with a few dozen lines. But real-world projects (data pipelines, machine learning models, web scrapers, or even small utilities you share with teammates) quickly grow into hundreds or thousands of lines of code. Without structure, that codebase becomes a maintenance nightmare.
Think about it from the perspective of someone reading your code six months from now. If everything is in one massive file, they have to read the whole thing to understand any part of it. There's no separation between your data fetching logic and your data processing logic. There's no clear boundary between what's configuration and what's computation. Making a change in one area accidentally breaks something in another. Testing becomes painful because you can't isolate individual pieces. The codebase resists change because every part is tangled up with every other part.
Good code organization solves all of this. By splitting your code into logical modules, you create clear boundaries. Each file has a single responsibility. You can swap out one piece without touching the rest. You can test each module independently. New contributors understand the codebase faster because the structure communicates intent. And perhaps most importantly for your journey into AI/ML: every major Python library you'll use (NumPy, Pandas, Scikit-learn, TensorFlow, PyTorch) is itself a carefully organized collection of modules and packages. Understanding how Python's module system works means you'll understand how those libraries are structured, why you import things the way you do, and how to structure your own ML projects professionally.
Python's module system is the foundation of every large Python project ever written. Mastering it now pays dividends forever.
Let's dig in.
What Is a Module?
Here's the beautiful part: you already understand modules. You just don't know it yet.
A module is any .py file. That's it.
When you create a file called calculator.py, you've created a module. When you write math_utils.py, you've created another module. Each one contains Python code, functions, classes, variables, whatever, waiting to be imported and used elsewhere.
The magic happens when you import that module into another file. Python loads the code, executes it, and makes everything defined in that module available to you. This is how code reuse works at its most fundamental level: you write something once, put it in a module, and import it wherever you need it.
Let's see this in action. Create two files in the same directory:
calculator.py:
```python
def add(a, b):
    """Return the sum of two numbers."""
    return a + b

def multiply(a, b):
    """Return the product of two numbers."""
    return a * b

PI = 3.14159
```

Notice that this file is self-contained: it defines functions and a constant, but it doesn't run anything on its own. That's intentional. A good module does one job and does it well.
main.py:
```python
import calculator

result = calculator.add(5, 3)
print(f"5 + 3 = {result}")

product = calculator.multiply(4, 7)
print(f"4 × 7 = {product}")

print(f"PI value: {calculator.PI}")
```

Here's the key line: `import calculator` tells Python to find calculator.py, execute all the code in it, and bind the result to the name `calculator` in your current file. Everything defined in that module is now accessible through dot notation.
Run main.py:

```
5 + 3 = 8
4 × 7 = 28
PI value: 3.14159
```
See? When you write import calculator, Python finds the calculator.py file, executes it, and gives you access to everything defined in it. Functions, classes, and variables, all available via the module name. This is the most basic but most important concept in Python's module system. Everything else we cover builds on top of this foundation.
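Because a module is just an object, you can inspect it like any other value. Here's a quick sketch of that idea, using the standard library's `math` module (rather than the calculator.py example) so it runs anywhere:

```python
# Modules are ordinary objects: they have a type, a name, and
# attributes you can list with dir().
import math

print(type(math))        # <class 'module'>
print(math.__name__)     # 'math'

# dir() lists everything the module defines; filter out dunder names
# to see its public surface.
public = [n for n in dir(math) if not n.startswith('_')]
print('sqrt' in public)  # True: sqrt is one of the names math exposes
```

This is a handy habit when exploring an unfamiliar library: import it, then `dir()` it to see what it offers.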
Import Syntax: Three Ways to Bring in Code
Python gives you flexibility in how you import. Each has a purpose, and knowing when to use each one will make your code clearer and more idiomatic.
Method 1: import module_name
This is the straightforward approach. You import the entire module and access its contents via dot notation. It's the most explicit form of import: anyone reading your code immediately knows exactly where math.pi or math.sqrt() comes from.
```python
import math

radius = 5
area = math.pi * radius ** 2
print(f"Area of circle: {area}")
```

Output:

```
Area of circle: 78.53981633974483
```
Here, math is the module (part of Python's standard library), and you access its constants and functions through it: math.pi, math.sqrt(), etc. The dot notation is your constant reminder of the source.
Pro: Clear namespace. You always know where something comes from.
Con: More typing if you use the module heavily.
Method 2: from module import name
When you only need specific things from a module, import just those. This is useful when you're using a handful of functions frequently and don't want to repeat the module name every time. It makes the code read more naturally, almost like English.
```python
from math import pi, sqrt

radius = 5
area = pi * radius ** 2
diagonal = sqrt(area)
print(f"Area: {area}, Diagonal: {diagonal}")
```

Output:

```
Area: 78.53981633974483, Diagonal: 8.862938119652561
```
Now you use pi and sqrt() directly, no module prefix needed. The tradeoff is that someone skimming your code might not immediately know where pi came from without scrolling back to the imports at the top.
Pro: Cleaner, less repetitive code.
Con: Less clear where pi comes from if you're skimming the code.
Method 3: import module as alias
Sometimes a module name is long or conflicts with a name you're using. Give it a nickname. This is especially common in the data science and AI/ML world, where community conventions for aliases have become standardized.
```python
import numpy as np
import pandas as pd

array = np.array([1, 2, 3, 4, 5])
print(f"Array: {array}")
```

Output:

```
Array: [1 2 3 4 5]
```
You'll see this constantly in data science code. import numpy as np is the de facto standard. When you're deep in an ML notebook and writing np.array, np.mean, and np.reshape dozens of times, that two-letter alias saves a lot of keystrokes, and more importantly, it signals to other data scientists that you know the conventions.
Pro: Shorter names, consistency with community conventions.
Con: Requires explanation for people unfamiliar with the alias.
You can also combine methods to get the best of both worlds, importing specific names from a module and giving them aliases at the same time:
```python
from math import pi as PI_VALUE, sqrt

print(PI_VALUE)
print(sqrt(16))
```

Output:

```
3.141592653589793
4.0
```
This pattern is handy when you want the brevity of a direct import but need to avoid a naming collision with something already in your namespace.
How Python Finds Modules
When you write import calculator, where does Python actually look for that file? This question matters more than you might think, because the answer determines whether your imports succeed or fail, and understanding it helps you diagnose mysterious ModuleNotFoundError messages.
Python searches through a list of directories called the module search path. You can inspect it at any time:
```python
import sys
print(sys.path)
```

Output (on a typical system):

```
['/Users/yourname/projects',
 '/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11',
 '/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages',
 ...]
```
Python searches these directories in order, stopping as soon as it finds a match. Understanding this order explains why some imports work and others fail:
- The script's directory (highest priority). If you run `python main.py` from `/projects/myapp`, Python first looks in `/projects/myapp`. This is why you can import a module that lives in the same folder as your script.
- PYTHONPATH environment variable (if set). A special list you can configure at the OS level to add directories to the search path.
- Standard library locations. Where built-in modules like `math`, `os`, and `sys` live.
- site-packages. Where third-party packages installed via `pip install` go. When you run `pip install numpy`, it lands here.
This is why you can do import calculator when calculator.py is in the same folder, it's first on the search path. And it explains why import numpy works after you pip-install it but not before: the installation places files in site-packages, which Python checks during the module search.
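You can even probe the search machinery without importing anything. The standard library's `importlib.util.find_spec` asks "where would this module be loaded from?" and returns `None` if the answer is nowhere, which makes it a useful diagnostic sketch:

```python
# find_spec walks the same search path the import statement uses,
# but only reports what it finds instead of executing the module.
import importlib.util

spec = importlib.util.find_spec("json")
print(spec.name)    # 'json'
print(spec.origin)  # path to the stdlib's json/__init__.py

# A name that exists nowhere on sys.path simply yields None.
missing = importlib.util.find_spec("no_such_module_hopefully")
print(missing)      # None
```

This is exactly the check behind every ModuleNotFoundError: if `find_spec` returns `None`, the import statement would fail too.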
If you need Python to find modules in a directory that isn't on the path by default, you can add it programmatically:
```python
import sys
sys.path.insert(0, '/path/to/my/modules')
```

Inserting at position 0 gives that directory the highest priority, so Python searches it before anything else. This technique is useful for development setups and specialized project structures, but most day-to-day work won't require it; the default path usually covers everything you need.
Packages: Organizing Modules into Folders
A module is a single .py file. But what if you want to organize multiple related modules into a folder? Enter packages.
A package is a directory containing modules and a special file called __init__.py. The presence of __init__.py is what tells Python "treat this directory as a package, not just a folder full of files." Without it, Python won't recognize the directory as importable.
Let's build a real-world example. Create this structure:
```
myproject/
├── main.py
└── accounting/
    ├── __init__.py
    ├── income.py
    ├── expenses.py
    └── reports.py
```
accounting/__init__.py:

```python
# This file makes the accounting folder a package
# It can be empty, or contain initialization code
print("Initializing accounting package")
```

The __init__.py file runs once when you first import anything from the package. You can leave it completely empty (which is totally valid and common) or use it to set up the package: importing names you want to expose, running initialization code, or setting package-level variables.
accounting/income.py:

```python
def record_salary(amount):
    """Record salary income."""
    return f"Salary recorded: ${amount}"

def record_bonus(amount):
    """Record bonus income."""
    return f"Bonus recorded: ${amount}"
```

accounting/expenses.py:

```python
def record_rent(amount):
    """Record rent expense."""
    return f"Rent recorded: ${amount}"

def record_utilities(amount):
    """Record utilities expense."""
    return f"Utilities recorded: ${amount}"
```

main.py:

```python
from accounting.income import record_salary
from accounting.expenses import record_rent

print(record_salary(50000))
print(record_rent(1500))
```

Run it:

```
Initializing accounting package
Salary recorded: $50000
Rent recorded: $1500
```
Notice a few things worth highlighting:
- You use dot notation to navigate the package hierarchy: `accounting.income` refers to the income.py module inside the `accounting` package.
- The __init__.py file executed when you imported from the package (it printed "Initializing...").
- You can import specific functions directly: `from accounting.income import record_salary`.
The __init__.py file is the package's initialization script. It can be empty (which just tells Python "this is a package"), or it can contain setup code. It runs when anyone imports anything from that package, and it runs only once per Python session regardless of how many times you import from that package.
You can also import the module itself and then call functions through it:
```python
import accounting.income as income

print(income.record_salary(50000))
```

Output:

```
Initializing accounting package
Salary recorded: $50000
```
Both styles work. Use whichever makes your code clearest in context.
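Because a package is nothing more than a directory with an __init__.py in it, you can even build one on the fly and watch these mechanics work. The following throwaway sketch (all names here, like `demo_pkg`, are made up for the demo) writes a tiny package into a temporary directory, puts that directory on sys.path, and imports from it:

```python
import os
import sys
import tempfile

# Build a tiny throwaway package on disk.
tmp = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp, "demo_pkg")
os.makedirs(pkg_dir)

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("INITIALIZED = True\n")   # runs on first import of the package
with open(os.path.join(pkg_dir, "mod.py"), "w") as f:
    f.write("def answer():\n    return 42\n")

sys.path.insert(0, tmp)               # make the temp dir searchable

import demo_pkg                       # executes demo_pkg/__init__.py
from demo_pkg.mod import answer       # imports a module inside the package

print(demo_pkg.INITIALIZED)           # True
print(answer())                       # 42
```

Nothing here is special-cased: the same search path, __init__.py execution, and dot notation you use with real projects apply to a package conjured up in a temp directory.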
Package Structure Patterns
As your projects grow, how you structure your packages starts to matter as much as how you write your code. There are a few patterns that experienced Python developers rely on, and knowing them will help you build projects that scale.
The flat package pattern works best for small to medium projects. All your modules live at the same level inside a single package directory, and __init__.py re-exports the most important names. Users of your package can do from mypackage import the_thing without needing to know which submodule the_thing lives in. This is how many popular libraries expose their API, NumPy, for example, lets you write np.array() instead of np.core.multiarray.array().
The domain-driven structure works well for larger applications. You create separate packages for each domain of your application: auth/, billing/, notifications/, reporting/. Each package is self-contained with its own models, utilities, and logic. This makes it easy to understand where code lives and who owns what.
The layered structure separates your code by technical concern rather than business domain: models/ for data structures, services/ for business logic, utils/ for shared utilities, api/ for external interfaces. This is common in web applications and ML projects where the pipeline stages are distinct.
For ML projects specifically, a pattern that works well is organizing by pipeline stage: data/ for data loading and preprocessing, features/ for feature engineering, models/ for model definitions, training/ for training scripts, and evaluation/ for metrics and visualization. Each stage is isolated, testable, and replaceable independently. When you need to try a new preprocessing approach, you change only the data/ package. When you want to experiment with a different model architecture, you change only models/. The rest stays stable.
The right structure depends on your project's size, team, and how the code is likely to evolve. But any structure is better than no structure, pick one and be consistent.
The __name__ == '__main__' Guard: When to Run Code
Here's a situation you'll face constantly: you create a module with useful functions, but you also write some test code to verify they work.
Create utils.py:
```python
def greet(name):
    """Greet someone."""
    return f"Hello, {name}!"

# Test the function
print(greet("Alice"))
print(greet("Bob"))
```

Now create app.py:

```python
from utils import greet

print(greet("Charlie"))
```

Run app.py:

```
Hello, Alice!
Hello, Bob!
Hello, Charlie!
```
Wait, you didn't want to run those test prints. You only wanted to use the greet function. But importing the module ran all its code. This is a fundamental aspect of how Python modules work: when you import a module, Python executes every line of it at import time. That's usually what you want for function and class definitions, but not for "run this now" code like print statements or script logic.
This is where the __name__ guard saves you.
When Python runs a file directly, it sets a special variable __name__ to '__main__'. When a file is imported as a module, __name__ is set to the module's name instead. You can use this difference to separate "code that defines things" from "code that runs things."
Rewrite utils.py:
```python
def greet(name):
    """Greet someone."""
    return f"Hello, {name}!"

if __name__ == '__main__':
    # This only runs if you execute utils.py directly
    print(greet("Alice"))
    print(greet("Bob"))
```

Now run app.py:

```
Hello, Charlie!
```
Perfect. The test code didn't run because utils.py was imported, not executed directly. Python saw that __name__ was 'utils' (the module name) rather than '__main__', so it skipped the guarded block entirely.
But run `python utils.py` directly:

```
Hello, Alice!
Hello, Bob!
```
The guard code runs. This pattern is essential for writing reusable modules. Every module should guard its test or demo code with if __name__ == '__main__':.
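You can observe both sides of this mechanism directly. An imported module's `__name__` is simply its own name, while the file Python executes directly gets the special name `'__main__'`. A quick sketch:

```python
# An imported module sees its own name in __name__, never '__main__'.
import json

print(json.__name__)            # 'json'

# The executing file, by contrast, is renamed to '__main__' when run
# directly, which is exactly the condition the guard tests.
print(__name__ == '__main__')
```

The guard, then, is just an ordinary `if` statement comparing two strings; there's no magic beyond the renaming Python performs at startup.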
Here's a more realistic example, a configuration module that can verify itself when run directly:
config.py:
```python
DATABASE_URL = "postgresql://localhost/myapp"
SECRET_KEY = "your-secret-key"
DEBUG = True

if __name__ == '__main__':
    print(f"Database: {DATABASE_URL}")
    print(f"Debug mode: {DEBUG}")
```

app.py:

```python
from config import DATABASE_URL, DEBUG

print(f"Connecting to {DATABASE_URL}")
print(f"Debug enabled: {DEBUG}")
```

Run app.py:

```
Connecting to postgresql://localhost/myapp
Debug enabled: True
```
The config guard code didn't run. You imported the values you needed cleanly. And any developer on the team can run python config.py directly to see what configuration is in effect, a handy built-in debug tool.
Relative vs. Absolute Imports in Packages
When you're inside a package and want to import from another module in the same package, you have two options. Both work, and understanding when to use each one helps you write imports that stay correct even as your project structure evolves.
Consider this structure:
```
myapp/
├── core/
│   ├── __init__.py
│   ├── database.py
│   └── helpers.py
└── main.py
```
Absolute Import
Use the full path from the project root:
core/database.py:
```python
from core.helpers import format_connection_string

def connect():
    return format_connection_string("localhost")
```

This always works, regardless of where you run your script from. It's the recommended approach for most projects because the import statement tells you exactly where the code comes from: there's no ambiguity, no matter how deep the nesting goes.
Relative Import
Use dots to refer to other modules in the same package:
core/database.py:
```python
from .helpers import format_connection_string

def connect():
    return format_connection_string("localhost")
```

The single dot (.) means "from the current package." Two dots (..) mean "go up one level." This is more concise, and it has one key advantage: if you rename or move the entire package, the relative imports inside it remain correct because they're expressed relative to the package itself, not to the project root.
Relative imports work only within packages, and they're handy for deep package structures. Here are all three variants in action:
```python
# Import from the same package
from .helpers import format_connection_string

# Import from the parent package
from ..config import DATABASE_URL

# Import from a sibling package
from ..auth.models import User
```

The first dot refers to the current package; each additional dot climbs one level up the hierarchy. So the first example stays within core/. The second goes up to myapp/ and imports from config. The third goes up to myapp/ and then down into the auth/ subpackage.
When to use:
- Absolute imports: Default choice. Clear and unambiguous. Easy to search for in an IDE. Works when running scripts directly.
- Relative imports: When building a library or package meant to be installed and used elsewhere. Keeps internal paths stable if the package is renamed.
For beginners, stick with absolute imports. They're simpler to reason about, and most Python style guides (including the official PEP 8) recommend them as the default.
Circular Import Solutions
Here's a tricky situation that trips up developers regularly: Module A imports from Module B, and Module B imports from Module A. When Python tries to load either one, it gets stuck in a loop.
user.py:
```python
from profile import get_profile

def create_user(name):
    print(f"Creating user: {name}")
    profile = get_profile(name)
    return profile
```

profile.py:

```python
from user import create_user

def get_profile(name):
    print(f"Getting profile for: {name}")
    user = create_user(name)
    return user
```

Try running either file:

```
ImportError: cannot import name 'get_profile' from partially initialized module 'profile'
```
This is a circular import. Python tries to import user.py, which tries to import from profile.py, which tries to import from user.py again. Python's import system detects the loop and fails with a partially initialized module error. The fix isn't always obvious, but there are three reliable approaches.
Option 1: Restructure the code
The cleanest solution is to eliminate the circular dependency by extracting shared functionality into a third module. If two modules need each other, that's often a sign that some logic belongs in a shared location:
common.py:
```python
def create_user_and_profile(name):
    """Shared logic here."""
    return f"User and profile for {name}"
```

user.py:

```python
from common import create_user_and_profile

def create_user(name):
    return create_user_and_profile(name)
```

profile.py:

```python
from common import create_user_and_profile

def get_profile(name):
    return create_user_and_profile(name)
```

Now both modules import from common, not from each other. The dependency graph is a tree again, not a cycle. This is the preferred solution because it improves the design, not just the import mechanics.
Option 2: Import inside the function
Instead of importing at the top of the file, import only when you need it, inside the function body. This defers the import until the function is called, by which time both modules are fully loaded:
user.py:
```python
def create_user(name):
    from profile import get_profile  # deferred import breaks the cycle
    print(f"Creating user: {name}")
    profile = get_profile(name)
    return profile
```

profile.py:

```python
def get_profile(name):
    print(f"Getting profile for: {name}")
    # Don't import create_user here
    return f"Profile for {name}"
```

By importing inside the function, you delay the import until the function is called, avoiding the circular dependency. This works but has a small performance cost (the import machinery runs on each call, though Python caches modules, so the penalty is minimal after the first call). Use it as a last resort when restructuring isn't practical.
Option 3: Use type hints with TYPE_CHECKING
For type hints when you have circular dependencies, which is common when you're annotating function signatures that reference types from other modules:
user.py:
```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from profile import Profile

def create_user(name) -> 'Profile':
    from profile import get_profile
    return get_profile(name)
```

The TYPE_CHECKING block only executes during static type checking (when a tool like mypy analyzes your code), not at runtime. This lets you use type hints without creating a circular import. Note the string 'Profile' in the return annotation: that's a forward reference, telling the type checker what type this is without actually importing it at runtime.
Best practice: Restructure your code to avoid circular imports in the first place. If modules depend on each other, they probably belong in the same module, or you need a shared third module that both can import from. Circular imports are a code smell that usually signals a design problem worth fixing.
Common Import Mistakes
Even experienced developers occasionally stumble on these. Knowing them upfront saves you debugging sessions.
The most frustrating mistake is the shadowing problem: you name your file the same as a standard library module. Create a file called math.py in your project, and suddenly import math imports your file instead of Python's built-in math library. Every call to math.sqrt() fails mysteriously. The fix is simple: don't name your files after standard library modules. Avoid names like os.py, sys.py, math.py, random.py, datetime.py, or any other name you plan to import. This is easy to miss when you're just experimenting, but it causes genuine confusion.
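When you suspect shadowing, check the module's `__file__` attribute: it tells you which file Python actually loaded. A quick diagnostic sketch (using `json`, a pure-Python stdlib package that carries a `__file__`, rather than the built-in `math`, which may not):

```python
import json

# __file__ reveals exactly which file Python imported. If this path
# pointed into your own project instead of the standard library,
# you'd know a local file is shadowing the stdlib module.
print(json.__file__)

# The stdlib json package loads from its json/__init__.py
print(json.__file__.endswith("__init__.py"))   # True
```

The moment `__file__` points at something in your project directory when you expected a library, you've found your shadowing bug.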
The missing __init__.py mistake is another common one, especially since Python 3.3 introduced "namespace packages" that can sometimes work without __init__.py, creating inconsistent behavior. When in doubt, always add __init__.py to any directory you want to treat as a package. An empty file is fine.
Wildcard imports (from module import *) are tempting for brevity but create invisible problems. They dump potentially dozens of names into your namespace without you knowing what they are. Later in the file, you call a function that you think you defined locally, but it was actually overwritten by the wildcard import. Debugging this is maddening. Be explicit: name every import.
Running scripts from the wrong directory breaks imports that work fine from the right directory. If your project expects you to run `python main.py` from the project root, and you instead launch Python from a subdirectory, the module search path changes and imports fail. Get in the habit of always running Python scripts from the project root, and use `python -m module_name` to run modules as scripts: it sets up the path correctly regardless of where the module file lives.
Finally, stale .pyc files can cause confusing behavior. Python caches compiled bytecode in __pycache__/ directories. Normally this is transparent, but if you move or rename modules, sometimes the old cached versions cause unexpected behavior. If imports are acting strangely after a refactor, try deleting the __pycache__/ directories and running again.
Controlling a Module's Public API with __all__
Sometimes you want to hide internal implementation details and expose only certain functions or classes. Every module has things that are implementation details, helper functions, internal constants, private utilities, and things that are meant to be used by the outside world. The __all__ variable is how you communicate that distinction.
Use the __all__ variable:
utils.py:
```python
def _internal_helper():
    """Private function, not meant for external use."""
    return "internal"

def public_function():
    """This is public."""
    return "public"

def another_public():
    """This is also public."""
    return "also public"

__all__ = ['public_function', 'another_public']
```

The leading underscore on _internal_helper is a naming convention signaling "this is private," but it's just convention. __all__ is what actually enforces the distinction when someone uses wildcard imports.
Now when someone does from utils import *, they get only what's in __all__:
```python
from utils import *

print(public_function())
print(another_public())
print(_internal_helper())  # NameError: not in __all__
```

Output:

```
public
also public
NameError: name '_internal_helper' is not defined
```
Even though _internal_helper is defined, it's not in __all__, so it's not imported with *. This is how you protect the users of your code from accidentally depending on implementation details that you might change or remove later.
Note: __all__ is a convention. Someone determined enough can still import _internal_helper directly: from utils import _internal_helper. But __all__ signals intent, "this is an implementation detail, don't use it." And for wildcard imports, it's enforced.
For documentation purposes, it's also incredibly helpful. When other developers read your module, they immediately see what's intended for public use:
```python
__all__ = [
    'public_function',
    'another_public',
    'UsefulClass'
]
```

Many popular libraries define __all__ explicitly in every module. It makes the public API crystal clear and helps IDEs and documentation generators understand what to surface to users.
Putting It All Together: A Real Project Structure
Let's build a simple project to tie everything together. This example shows a weather app, but the patterns apply to any project, including the ML projects you'll build later in this series.
```
weather_app/
├── main.py
├── config.py
├── weather/
│   ├── __init__.py
│   ├── api.py
│   ├── parser.py
│   └── cache.py
└── utils/
    ├── __init__.py
    ├── validators.py
    └── formatters.py
```
config.py:

```python
API_KEY = "your-api-key-here"
CACHE_DIR = "./cache"
TIMEOUT = 10

if __name__ == '__main__':
    print(f"API Key: {API_KEY}")
```

utils/__init__.py:

```python
# Empty or minimal initialization
```

utils/validators.py:
```python
def is_valid_city(city):
    """Check if city name is valid."""
    return isinstance(city, str) and len(city) > 0

__all__ = ['is_valid_city']
```

utils/formatters.py:
```python
def format_temperature(temp, unit='C'):
    """Format temperature for display."""
    return f"{temp}°{unit}"

__all__ = ['format_temperature']
```

weather/__init__.py:
```python
# Initialize the weather package
from .api import get_weather
from .parser import parse_response

__all__ = ['get_weather', 'parse_response']
```

weather/api.py:
```python
import requests

from config import API_KEY, TIMEOUT
from .parser import parse_response

def get_weather(city):
    """Fetch weather for a city."""
    url = f"https://api.example.com/weather?city={city}&key={API_KEY}"
    response = requests.get(url, timeout=TIMEOUT)
    return parse_response(response.json())

if __name__ == '__main__':
    # Because of the relative import above, run this as a module
    # from the project root: python -m weather.api
    print(get_weather("London"))
```

weather/parser.py:
```python
def parse_response(data):
    """Parse weather API response."""
    return {
        'temp': data.get('main', {}).get('temp'),
        'condition': data.get('weather', [{}])[0].get('main')
    }

if __name__ == '__main__':
    sample = {'main': {'temp': 20}, 'weather': [{'main': 'Sunny'}]}
    print(parse_response(sample))
```

weather/cache.py:
```python
from config import CACHE_DIR

def cache_result(key, value):
    """Cache a weather result."""
    # Implement caching logic
    pass

def get_cached(key):
    """Retrieve cached result."""
    pass
```

main.py:
```python
from weather import get_weather
from utils.validators import is_valid_city
from utils.formatters import format_temperature

city = "London"
if is_valid_city(city):
    weather = get_weather(city)
    temp = weather.get('temp')
    formatted = format_temperature(temp)
    print(f"Weather in {city}: {formatted}, {weather.get('condition')}")
else:
    print("Invalid city name")
```

This structure shows every concept from this article working together in a realistic scenario. Notice how main.py imports from weather (not weather.api): that's the __init__.py doing its job of re-exporting the most important names. Notice how weather/api.py uses a relative import for parser.py (`.parser`) but an absolute import for config. Notice every module has __all__ or `if __name__ == '__main__'` guards where appropriate. This is what professional Python code looks like.
This structure shows:
- Packages organize related modules (`weather/`, `utils/`).
- `__init__.py` controls what's public in each package.
- Absolute imports are used throughout (`from weather import`, `from config import`).
- `__name__ == '__main__'` guards test code in each module.
- `__all__` clarifies the public API.
- Configuration is centralized (`config.py`).
Common Import Gotchas
Gotcha 1: Forgetting __init__.py
You create a folder and put .py files in it, but imports fail:
```
mypackage/
├── module1.py
└── module2.py
```

```python
from mypackage.module1 import something  # ModuleNotFoundError
```

Add mypackage/__init__.py (even if empty):

```
mypackage/
├── __init__.py
├── module1.py
└── module2.py
```

Now it works. The __init__.py file doesn't need to contain anything; its mere existence is enough to tell Python that this directory is a package.
Gotcha 2: Running from the Wrong Directory
You have:

```
project/
├── main.py
└── subdir/
    └── utils.py
```

In main.py, you write:

```python
import subdir.utils
```

Run `python main.py` from inside project/ and it works: Python puts the script's directory (project/) at the front of sys.path, and subdir/ is right there. But launch Python from somewhere else, say an interactive session started in another folder, or a script that lives inside subdir/ itself, and the same import fails with ModuleNotFoundError, because subdir/ is no longer beneath any directory on the search path. Always run from the project root:

```
python main.py
```

Alternatively, use `python -m` to run a module as a script, which sets up the path relative to where you run the command rather than where the script file lives.
Gotcha 3: from module import * (Generally Avoid)
This imports everything from a module:

```python
from os import *
```

It's convenient but pollutes your namespace. Unless __all__ is defined, you import every public name, including internals. Names can collide with things you've defined. It becomes impossible to tell where any particular name came from. Better to be explicit:

```python
from os import path, getcwd
```

The few characters you save with wildcard imports cost hours in debugging and confusion later. Explicit is always better than implicit in Python, and that's not just a preference, it's literally in the language's design philosophy.
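To see the pollution concretely, you can count what a wildcard import dumps into your namespace. A small sketch (this must run at module level, since wildcard imports aren't allowed inside functions):

```python
before = set(dir())      # snapshot the namespace

from os import *         # dumps every name in os.__all__ in here

# Everything new (minus our own 'before' variable) came from os.
added = set(dir()) - before - {"before"}
print(len(added) > 20)   # True: dozens of names appeared at once
print("getcwd" in added) # True: and you never typed 'getcwd'
```

Every one of those names is now a potential collision with something you define later in the file, which is exactly why explicit imports are worth the extra keystrokes.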
Summary
You now understand Python's module system, why it matters, and how to use it well. Let's pull it all together.
Modules are .py files, the most fundamental unit of code organization. Import them to reuse code across your project without copying and pasting. Imports have three main forms: import module for clarity and namespace safety, from module import name for frequently-used specific items, and import module as alias for community conventions and name conflicts. Each form has its place.
How Python finds modules depends on sys.path, a list of directories searched in order. The script's directory comes first, then any PYTHONPATH directories, then the standard library, then site-packages where pip-installed packages live. Understanding this explains every ModuleNotFoundError you'll ever encounter.
Packages are directories with __init__.py files, organizing multiple related modules. Package structure patterns (flat, domain-driven, or layered) help you scale as projects grow. The __name__ == '__main__' guard lets modules serve double duty as both importable libraries and runnable scripts.
Circular imports happen when modules depend on each other; the cleanest fix is restructuring into a shared third module. Common import mistakes (shadowing standard library names, wildcard imports, running from the wrong directory) are all avoidable once you know about them. __all__ declares a module's public API, protecting users from depending on implementation details and communicating intent clearly.
Structure matters: good organization makes projects maintainable, scalable, and understandable. Every major Python library and framework you'll use, especially in AI/ML, is built on exactly these principles. When you write import torch or from sklearn.ensemble import RandomForestClassifier, you're using the exact same module system we just covered. Now you know what's happening under the hood.
With modules and imports mastered, you can now organize real projects, share code across files, and collaborate with others without everything breaking.