Tutorial:PracticalPython/9 Packages
Packages
We conclude the course with a few details on how to organize your code into a package structure. We’ll also discuss the installation of third party packages and preparing to give your own code away to others.
The subject of packaging is an ever-evolving, overly complex part of Python development. Rather than focus on specific tools, the main focus of this section is on some general code organization principles that will prove useful no matter what tools you later use to give code away or manage dependencies.
Packages
If you are writing a larger program, you don’t really want to organize it as a large collection of standalone files at the top level. This section introduces the concept of a package.
Modules
Any Python source file is a module.
# foo.py
def grok(a):
    ...
def spam(b):
    ...
An import statement loads and executes a module.
# program.py
import foo

a = foo.grok(2)
b = foo.spam('Hello')
...
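To make "loads and executes" concrete, here is a minimal sketch. It writes a throwaway module to a temporary directory and imports it twice. The module name foo_demo and its contents are made up for this illustration; only the caching behavior of import is the point.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module source; the name foo_demo is invented for this sketch.
SOURCE = '''\
LOADED = "top-level code ran"

def grok(a):
    return a * 2
'''

def import_twice():
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "foo_demo.py").write_text(SOURCE)
        sys.path.insert(0, tmp)           # make the directory importable
        importlib.invalidate_caches()
        try:
            first = importlib.import_module("foo_demo")   # executes the file
            second = importlib.import_module("foo_demo")  # reuses sys.modules
            return first.LOADED, first.grok(2), first is second
        finally:
            # Undo the changes so the sketch leaves no trace behind.
            sys.path.remove(tmp)
            sys.modules.pop("foo_demo", None)

print(import_twice())
```

The second import does not run the file again; it returns the module object already stored in sys.modules.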
Packages vs Modules
For larger collections of code, it is common to organize modules into a package.
# From this
pcost.py
report.py
fileparse.py

# To this
porty/
    __init__.py
    pcost.py
    report.py
    fileparse.py
You pick a name and make a top-level directory: porty in the example above (clearly, picking this name is the most important first step).
Add an __init__.py file to the directory. It may be empty.
Put your source files into the directory.
Using a Package
A package serves as a namespace for imports.
This means that there are now multilevel imports.
import porty.report
port = porty.report.read_portfolio('port.csv')
There are other variations of import statements.
from porty import report
port = report.read_portfolio('portfolio.csv')

from porty.report import read_portfolio
port = read_portfolio('portfolio.csv')
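The three import forms all reach the same function. The following sketch checks that, assuming a stand-in package called porty_demo that is created in a temporary directory; its read_portfolio is a placeholder, not the course's real function.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def import_variations():
    with tempfile.TemporaryDirectory() as tmp:
        # Build a hypothetical package: porty_demo/__init__.py and report.py
        pkg = Path(tmp, "porty_demo")
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "report.py").write_text(
            "def read_portfolio(filename):\n"
            "    return f'would read {filename}'\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            import porty_demo.report                      # multilevel name
            from porty_demo import report                 # module from package
            from porty_demo.report import read_portfolio  # function from module
            return (
                porty_demo.report is report,
                report.read_portfolio is read_portfolio,
                read_portfolio("port.csv"),
            )
        finally:
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.split(".")[0] == "porty_demo"]:
                del sys.modules[name]

print(import_variations())
```

Which form to use is mostly a matter of how much of the dotted name you want to repeat at each call site.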
Two problems
There are two main problems with this approach.
- Imports between files in the same package break.
- Main scripts placed inside the package break.
So, basically everything breaks. But, other than that, it works.
Problem: Imports
Imports between files in the same package must now include the package name in the import. Remember the structure.
porty/
    __init__.py
    pcost.py
    report.py
    fileparse.py
Modified import example.
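# report.py
from porty import fileparse

def read_portfolio(filename):
    return fileparse.parse_csv(...)
All imports are absolute, not relative. A plain import fileparse is looked up along sys.path, not in the package’s own directory, so it no longer finds the sibling file.
# report.py
import fileparse    # BREAKS. fileparse not found
...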
Relative Imports
Instead of directly using the package name, you can use a single dot (.) to refer to the current package.
# report.py
from . import fileparse

def read_portfolio(filename):
    return fileparse.parse_csv(...)
Syntax:
from . import modname
This makes it easy to rename the package.
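As a rough check of the difference, the sketch below builds a stand-in package with two variants of the same module: one uses the relative form and one uses a plain import. All names (relpkg_demo, fileparse_demo, report_rel, report_plain) are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def relative_vs_plain():
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp, "relpkg_demo")
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "fileparse_demo.py").write_text(
            "def parse_csv(name):\n    return [name]\n"
        )
        (pkg / "report_rel.py").write_text("from . import fileparse_demo\n")
        (pkg / "report_plain.py").write_text("import fileparse_demo\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            rel = importlib.import_module("relpkg_demo.report_rel")
            relative_ok = rel.fileparse_demo.parse_csv("x.csv") == ["x.csv"]
            try:
                importlib.import_module("relpkg_demo.report_plain")
                plain_ok = True
            except ModuleNotFoundError:
                # The sibling module is not on sys.path, only its package is.
                plain_ok = False
            return relative_ok, plain_ok
        finally:
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.split(".")[0] == "relpkg_demo"]:
                del sys.modules[name]

print(relative_vs_plain())
```

The relative form keeps working if the package directory is renamed, because the package name never appears in the import.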
Problem: Main Scripts
Running a package submodule as a main script breaks.
bash $ python porty/pcost.py   # BREAKS
...
Reason: You are running Python on a single file and Python doesn’t see the rest of the package structure correctly (sys.path is wrong).
All imports break. To fix this, you need to run your program in a different way, using the -m option.
bash $ python -m porty.pcost   # WORKS
...
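The two ways of launching a submodule can be compared from Python itself. This is a sketch under stated assumptions: the package runpkg_demo and its files are invented, and the two commands are run as child processes from the directory that contains the package.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def run_both_ways():
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical package with one module that imports a sibling.
        pkg = Path(tmp, "runpkg_demo")
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "fileparse_demo.py").write_text("NAME = 'fileparse_demo'\n")
        (pkg / "pcost.py").write_text(
            "from runpkg_demo import fileparse_demo\n"
            "print('imported', fileparse_demo.NAME)\n"
        )
        # Run the file directly: sys.path starts at runpkg_demo/ itself,
        # so the package name cannot be found.
        as_file = subprocess.run(
            [sys.executable, str(pkg / "pcost.py")],
            cwd=tmp, capture_output=True, text=True,
        )
        # Run it as a module: sys.path starts at the current directory,
        # which is the parent of the package.
        as_module = subprocess.run(
            [sys.executable, "-m", "runpkg_demo.pcost"],
            cwd=tmp, capture_output=True, text=True,
        )
        return as_file.returncode, as_module.returncode, as_module.stdout.strip()

print(run_both_ways())
```

A nonzero first return code and a zero second one correspond to the BREAKS and WORKS cases above.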
__init__.py files
The primary purpose of these files is to stitch modules together.
Example: consolidating functions
# porty/__init__.py
from .pcost import portfolio_cost
from .report import portfolio_report
This makes names appear at the top-level when importing.
from porty import portfolio_cost
portfolio_cost('portfolio.csv')
Instead of using the multilevel imports.
from porty import pcost
pcost.portfolio_cost('portfolio.csv')
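The effect of such an __init__.py can be inspected directly. The sketch below assumes a made-up package initpkg_demo whose __init__.py re-exports one placeholder function, then lists the public names the package ends up with.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def consolidated_import():
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp, "initpkg_demo")   # hypothetical package name
        pkg.mkdir()
        (pkg / "pcost.py").write_text(
            "def portfolio_cost(filename):\n"
            "    return 0.0  # placeholder result\n"
        )
        # The __init__.py stitches the function into the package namespace.
        (pkg / "__init__.py").write_text("from .pcost import portfolio_cost\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            package = importlib.import_module("initpkg_demo")
            same_object = package.portfolio_cost is package.pcost.portfolio_cost
            public = sorted(n for n in vars(package) if not n.startswith("_"))
            return same_object, public
        finally:
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.split(".")[0] == "initpkg_demo"]:
                del sys.modules[name]

print(consolidated_import())
```

Both the submodule and the re-exported function appear as attributes of the package, and they refer to the same function object.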
Another solution for scripts
As noted, you now need to use -m package.module to run scripts within your package.
bash % python3 -m porty.pcost portfolio.csv
There is another alternative: Write a new top-level script.
#!/usr/bin/env python3
# pcost.py
import porty.pcost
import sys
porty.pcost.main(sys.argv)
This script lives outside the package. For example, looking at the directory structure:
pcost.py           # top-level-script
porty/             # package directory
    __init__.py
    pcost.py
    ...
Application Structure
Code organization and file structure are key to the maintainability of an application.
There is no “one-size fits all” approach for Python. However, one structure that works for a lot of problems is something like this.
porty-app/
    README.txt
    script.py          # SCRIPT
    porty/             # LIBRARY CODE
        __init__.py
        pcost.py
        report.py
        fileparse.py
The top-level porty-app is a container for everything else: documentation, top-level scripts, examples, etc.
Again, top-level scripts (if any) need to exist outside the code package. One level up.
#!/usr/bin/env python3
# porty-app/script.py
import sys
import porty.report    # import the submodule explicitly; a bare "import porty" does not load it
porty.report.main(sys.argv)
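The top-level scripts above only work if the library module exposes a main() that accepts an argument list. A minimal sketch of that shape follows. The body of portfolio_report and the usage message are assumptions made for illustration, not the course's actual code.

```python
def portfolio_report(portfile, pricefile, fmt="txt"):
    # Stand-in for the real report function; it only echoes its arguments.
    return f"report on {portfile} and {pricefile} as {fmt}"

def main(argv):
    # argv mirrors sys.argv: argv[0] is the script name.
    if len(argv) != 4:
        raise SystemExit(f"Usage: {argv[0]} portfile pricefile format")
    result = portfolio_report(argv[1], argv[2], argv[3])
    print(result)
    return result

# A top-level script would pass sys.argv; an explicit list is used here
# so the sketch runs without command-line arguments.
main(["print-report.py", "portfolio.csv", "prices.csv", "txt"])
```

Taking the argument list as a parameter, instead of reading sys.argv inside the function, is what lets both python -m and a separate script call the same entry point.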
Exercises
At this point, you have a directory with several programs:
pcost.py          # computes portfolio cost
report.py         # Makes a report
ticker.py         # Produce a real-time stock ticker
There are a variety of supporting modules with other functionality:
stock.py            # Stock class
portfolio.py        # Portfolio class
fileparse.py        # CSV parsing
tableformat.py      # Formatted tables
follow.py           # Follow a log file
typedproperty.py    # Typed class properties
In this exercise, we’re going to clean up the code and put it into a common package.
Exercise 9.1: Making a simple package
Make a directory called porty/ and put all of the above Python files into it. Additionally create an empty __init__.py file and put it in the directory. You should have a directory of files like this:
porty/
    __init__.py
    fileparse.py
    follow.py
    pcost.py
    portfolio.py
    report.py
    stock.py
    tableformat.py
    ticker.py
    typedproperty.py
Remove the __pycache__ directory that’s sitting in your directory. It contains pre-compiled Python modules from before. We want to start fresh.
Try importing some of the package modules:
>>> import porty.report
>>> import porty.pcost
>>> import porty.ticker
If these imports fail, go into the appropriate file and fix the module imports to include a package-relative import. For example, a statement such as import fileparse might change to the following:
# report.py
from . import fileparse
...
If you have a statement such as from fileparse import parse_csv, change the code to the following:
# report.py
from .fileparse import parse_csv
...
Exercise 9.2: Making an application directory
Putting all of your code into a “package” often isn’t enough for an application. Sometimes there are supporting files, documentation, scripts, and other things. These files need to exist OUTSIDE of the porty/ directory you made above.
Create a new directory called porty-app. Move the porty directory you created in Exercise 9.1 into that directory. Copy the Data/portfolio.csv and Data/prices.csv test files into this directory. Additionally create a README.txt file with some information about yourself. Your code should now be organized as follows:
porty-app/
    portfolio.csv
    prices.csv
    README.txt
    porty/
        __init__.py
        fileparse.py
        follow.py
        pcost.py
        portfolio.py
        report.py
        stock.py
        tableformat.py
        ticker.py
        typedproperty.py
To run your code, you need to make sure you are working in the top-level porty-app/ directory. For example, from the terminal:
shell % cd porty-app
shell % python3
>>> import porty.report
>>>
Try running some of your prior scripts as a main program:
shell % cd porty-app
shell % python3 -m porty.report portfolio.csv prices.csv txt
      Name     Shares      Price     Change
---------- ---------- ---------- ----------
        AA        100       9.22     -22.98
       IBM         50     106.28      15.18
       CAT        150      35.46     -47.98
      MSFT        200      20.89     -30.34
        GE         95      13.48     -26.89
      MSFT         50      20.89     -44.21
       IBM        100     106.28      35.84
shell %
Exercise 9.3: Top-level Scripts
Using the python -m command is often a bit weird. You may want to write a top level script that simply deals with the oddities of packages. Create a script print-report.py that produces the above report:
#!/usr/bin/env python3
# print-report.py
import sys
from porty.report import main
main(sys.argv)
Put this script in the top-level porty-app/ directory. Make sure you can run it in that location:
shell % cd porty-app
shell % python3 print-report.py portfolio.csv prices.csv txt
      Name     Shares      Price     Change
---------- ---------- ---------- ----------
        AA        100       9.22     -22.98
       IBM         50     106.28      15.18
       CAT        150      35.46     -47.98
      MSFT        200      20.89     -30.34
        GE         95      13.48     -26.89
      MSFT         50      20.89     -44.21
       IBM        100     106.28      35.84
shell %
Your final code should now be structured something like this:
porty-app/
    portfolio.csv
    prices.csv
    print-report.py
    README.txt
    porty/
        __init__.py
        fileparse.py
        follow.py
        pcost.py
        portfolio.py
        report.py
        stock.py
        tableformat.py
        ticker.py
        typedproperty.py
Third Party Modules
Python has a large library of built-in modules (batteries included).
There are even more third party modules. Check them out in the Python Package Index (PyPI). Or just do a Google search for a specific topic.
How to handle third-party dependencies is an ever-evolving topic with Python. This section merely covers the basics to help you wrap your brain around how it works.
The Module Search Path
sys.path is a list of all the directories checked by the import statement. Look at it:
>>> import sys
>>> sys.path
... look at the result ...
>>>
If you import something and it’s not located in one of those directories, you will get an ImportError exception.
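The link between sys.path and ImportError can be demonstrated with a throwaway module. In this sketch the module name pathdemo_mod is invented; the same import fails before its directory is on sys.path and succeeds afterwards.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def can_import(name):
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False

def search_path_demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module placed in a directory Python does not search yet.
        Path(tmp, "pathdemo_mod.py").write_text("VALUE = 42\n")
        before = can_import("pathdemo_mod")
        sys.path.append(tmp)              # now the directory is searched
        importlib.invalidate_caches()
        try:
            after = can_import("pathdemo_mod")
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("pathdemo_mod", None)
        return before, after

print(search_path_demo())
```

Editing sys.path by hand like this is mainly useful for experiments; it is shown here only to illustrate how the search works.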
Standard Library Modules
Modules from Python’s standard library usually come from a location such as /usr/local/lib/python3.6. You can find out for certain by trying a short test:
>>> import re
>>> re
<module 're' from '/usr/local/lib/python3.6/re.py'>
>>>
Simply looking at a module in the REPL is a good debugging tip to know about. It will show you the location of the file.
Third-party Modules
Third party modules are usually located in a dedicated site-packages directory. You’ll see it if you perform the same steps as above:
>>> import numpy
>>> numpy
<module 'numpy' from '/usr/local/lib/python3.6/site-packages/numpy/__init__.py'>
>>>
Again, looking at a module is a good debugging tip if you’re trying to figure out why something related to import isn’t working as expected.
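The same lookup can be done from a script instead of the REPL. The helper below is a small sketch (the function name module_location is made up) built on importlib.util.find_spec, which reports where a top-level module would be loaded from without importing it.

```python
import importlib.util

def module_location(name):
    """Return where the named top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None          # not found on the module search path
    return spec.origin       # a file path, or a label such as 'built-in'

for name in ("json", "sys", "no_such_module_for_this_demo"):
    print(name, "->", module_location(name))
```

The exact paths printed depend on your Python installation, which is the information you are after when debugging an import.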
Installing Modules
The most common technique for installing a third-party module is to use pip. For example:
bash % python3 -m pip install packagename
This command will download the package and install it in the site-packages directory.
Problems
- You may be using an installation of Python that you don’t directly control.
  - A corporate approved installation
  - You’re using the Python version that comes with the OS.
- You might not have permission to install global packages on the computer.
- There might be other dependencies.
Virtual Environments
A common solution to package installation issues is to create a so-called “virtual environment” for yourself. Naturally, there is no “one way” to do this; in fact, there are several competing tools and techniques. However, if you are using a standard Python installation, you can try typing this:
bash % python -m venv mypython
bash %
After a few moments of waiting, you will have a new directory mypython that’s your own little Python install. Within that directory you’ll find a bin/ directory (Unix) or a Scripts/ directory (Windows). If you run the activate script found there, it will “activate” this version of Python, making it the default python command for the shell. For example:
bash % source mypython/bin/activate
(mypython) bash %
From here, you can now start installing Python packages for yourself. For example:
(mypython) bash % python -m pip install pandas ...
For the purposes of experimenting and trying out different packages, a virtual environment will usually work fine. If, on the other hand, you’re creating an application and it has specific package dependencies, that is a slightly different problem.
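If you are unsure whether the interpreter you are running belongs to a virtual environment, a quick check is to compare sys.prefix with sys.base_prefix. The helper name below is made up for this sketch.

```python
import sys

def in_virtual_environment():
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix

print("interpreter:", sys.executable)
print("virtual environment active:", in_virtual_environment())
```

This check applies to environments created with the standard venv module; other tools may behave differently.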
Handling Third-Party Dependencies in Your Application
If you have written an application and it has specific third-party dependencies, one challenge concerns the creation and preservation of the environment that includes your code and the dependencies. Sadly, this has been an area of great confusion and frequent change over Python’s lifetime. It continues to evolve even now.
Rather than provide information that’s bound to be out of date soon, I refer you to the Python Packaging User Guide.
Exercises
Exercise 9.4: Creating a Virtual Environment
See if you can recreate the steps of making a virtual environment and installing pandas into it as shown above.
Distribution
At some point you might want to give your code to someone else, possibly just a co-worker. This section gives the most basic technique of doing that. For more detailed information, you’ll need to consult the Python Packaging User Guide.
Creating a setup.py file
Add a setup.py file to the top-level of your project directory.
# setup.py
import setuptools

setuptools.setup(
    name="porty",
    version="0.0.1",
    author="Your Name",
    author_email="you@example.com",
    description="Practical Python Code",
    packages=setuptools.find_packages(),
)
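The find_packages() call is what ties setup.py back to the __init__.py files discussed earlier. The following is only a rough standard-library approximation of the idea, not setuptools' actual implementation: it reports every directory that contains an __init__.py, descending only through directories that are themselves packages. The function name and the demo layout are invented.

```python
import tempfile
from pathlib import Path

def find_packages_sketch(root, prefix=""):
    """Approximate package discovery: directories that contain __init__.py."""
    found = []
    for child in sorted(Path(root).iterdir()):
        if child.is_dir() and (child / "__init__.py").is_file():
            name = prefix + child.name
            found.append(name)
            # Look for subpackages inside this package only.
            found.extend(find_packages_sketch(child, name + "."))
    return found

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical project layout for the demonstration.
        for directory in ("porty", "porty/sub", "docs"):
            Path(tmp, directory).mkdir()
        Path(tmp, "porty", "__init__.py").write_text("")
        Path(tmp, "porty", "sub", "__init__.py").write_text("")
        return find_packages_sketch(tmp)

print(demo())
```

A directory without an __init__.py, such as docs in this layout, is skipped, which is one reason the empty file matters.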
Creating MANIFEST.in
If there are additional files associated with your project, specify them with a MANIFEST.in file. For example:
# MANIFEST.in
include *.csv
Put the MANIFEST.in file in the same directory as setup.py.
Creating a source distribution
To create a distribution of your code, use the setup.py file. For example:
bash % python setup.py sdist
This will create a .tar.gz or .zip file in the directory dist/. That file is something that you can now give away to others.
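Before handing a .tar.gz file to someone, it can be worth listing what is inside it, for instance to confirm that your data files were included. The sketch below uses the standard tarfile module. Since no real distribution exists here, it first writes a tiny stand-in archive whose file names are made up; with a real project you would pass the path of the file in dist/ to list_archive instead.

```python
import io
import tarfile
import tempfile
from pathlib import Path

def list_archive(path):
    """Return the sorted member names of a gzip-compressed tar archive."""
    with tarfile.open(path, "r:gz") as archive:
        return sorted(archive.getnames())

def build_stand_in(path):
    # Writes a made-up archive with empty files, for demonstration only.
    with tarfile.open(path, "w:gz") as archive:
        for name in ("porty-0.0.1/setup.py", "porty-0.0.1/porty/__init__.py"):
            archive.addfile(tarfile.TarInfo(name), io.BytesIO(b""))

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp, "porty-0.0.1.tar.gz")
        build_stand_in(path)
        return list_archive(path)

print(demo())
```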
Installing your code
Others can install your Python code using pip in the same way that they do for other packages. They simply need to supply the file created in the previous step. For example:
bash % python -m pip install porty-0.0.1.tar.gz
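After installation, one way to confirm that the package is visible to the current interpreter is to ask for its installed version through importlib.metadata (available in Python 3.8 and later). The helper name installed_version is made up for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

# Prints a version string if porty has been installed into this
# environment, and None otherwise.
print(installed_version("porty"))
```

Note that the name passed in is the distribution name given in setup.py, which may differ from the name you import.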
Commentary
The steps above describe the absolute most minimal basics of creating a package of Python code that you can give to another person. In reality, it can be much more complicated depending on third-party dependencies, whether or not your application includes foreign code (i.e., C/C++), and so forth. Covering that is outside the scope of this course. We’ve only taken a tiny first step.
Exercises
Exercise 9.5: Make a package
Take the porty-app/ code you created for Exercise 9.3 and see if you can recreate the steps described here. Specifically, add a setup.py file and a MANIFEST.in file to the top-level directory. Create a source distribution file by running python setup.py sdist.
As a final step, see if you can install your package into a Python virtual environment.
The End!
You’ve made it to the end of the course. Thanks for your time and your attention. May your future Python hacking be fun and productive!
I’m always happy to get feedback. You can find me at https://dabeaz.com or on Twitter at [@dabeaz](https://twitter.com/dabeaz). - David Beazley.