Python Project Structures
Updated on: 2023-07-24.
Python project structures are confusing. Let’s make them less so.
Terminology
First, a few standard Python terminology definitions:
- a module is a single Python script,
- a package is a collection of modules1.
A project may be an application or a library:
- an application is a program that is meant to be deployed, such as a script for numerical calculations, a server, or a Discord bot,
- a library is a package that will be imported by other libraries or applications.
Project Structures
There are a few common, recommended Python project structures2. This is the one I prefer:
1project_root/
2 docs/
3 conf.py
4 index.md
5 package/
6 __init__.py
7 subpackageA/
8 __init__.py
9 moduleA.py
10 module.py
11 tests/
12 __init__.py
13 test_module.py
14 requirements.txt
15 runner.py
16 README.md
17 LICENSE
18 pyproject.toml
- The
runner.pyscript is the entry point for the application, such as a CLI (if you’re building a library, you can omit this file). - The
packagefolder is the package that will be imported by other libraries or applications. - The
testsfolder contains unit tests for the package. - The
docsfolder contains documentation for the package.
The application is run with python runner.py from the top-level directory.
Imports
Imports in each file can be handled as follows:
1# absolute import in runner.py
2from package import module
3
4# absolute import in package/module.py
5from package.subpackageA import moduleA
6# relative import in package/module.py
7from .subpackageA import moduleA
8
9# absolute import in package/subpackageA/moduleA.py
10from package import module
11# relative import in package/subpackageA/moduleA.py
12from .. import module
13
14# absolute import in tests/test_module.py
15from package import module
16from package.subpackageA import moduleA
Note that scripts can’t import relative to each other, so you can’t do from . import module in runner.py (this is because the __name__ variable for runner.py is __main__ when run as python runner.py and therefore Python can’t do module name manipulation, see more here).
If you run python runner.py from the top-level directory, the project_root will be in your PYTHONPATH environment variable, so you can import package and its submodules from anywhere in the project.
If you run python package/module.py from the top-level directory, you will get an error because PYTHONPATH will not contain project_root.
You can test this by showing the Python path in runner.py and package/module.py (via the sys.path variable):
1# runner.py
2import sys
3print(sys.path)
4
5# package/module.py
6import sys
7print(sys.path)
8
9# output of `python runner.py`:
10['/path/to/project_root', ...]
11
12# output of `python package/module.py`:
13['/path/to/project_root/package', ...]
Alternatively, you can run a package script from the top-level directory by using -m flag (see more on the -m flag here):
1python -m package.module
For testing, pytest . should work from any directory3.
Managing Pythons, Dependencies, and Environments
When it comes to Python management, I like to keep it simple.
I use pyenv to manage Python versions and python -m venv venv to create virtual environments in my project.
My standard requirements.txt file looks like this:
1black
2ruff
3pytest
4typer[all]
The modern place to store tool configuration settings is pyproject.toml. Here is a sample pyproject.toml I use for my projects.
(See here for an introductory blog post by Brett Cannon, one of the authors of PEP-518 and PEP-621.)
Footnotes
These definitions are 90% true. Package is an overloaded term in Python. ↩︎
This structure is recommended by the popular Hitchhiker’s Guide to Python. It is also default for
poetryprojects, a common Python workflow tool. ↩︎pytestwill find your project root by traversing up the directory hierarchy, see more here ↩︎