Tutorial:PracticalPython/8 Testing debugging

From HandWiki


Testing. Overview

This section introduces a few basic topics related to testing, logging, and debugging.

Testing

Testing Rocks, Debugging Sucks

The dynamic nature of Python makes testing critically important to most applications. There is no compiler to find your bugs. The only way to find bugs is to run the code and make sure you try out all of its features.

Assertions

The assert statement is an internal check for the program. If an expression is not true, it raises a AssertionError exception.

assert statement syntax.

assert <expression> [, 'Diagnostic message']

For example.

assert isinstance(10, int), 'Expected int'

It shouldn’t be used to check the user-input (i.e., data entered on a web form or something). It’s purpose is more for internal checks and invariants (conditions that should always be true).

Contract Programming

Also known as Design By Contract, liberal use of assertions is an approach for designing software. It prescribes that software designers should define precise interface specifications for the components of the software.

For example, you might put assertions on all inputs of a function.

def add(x, y):
    assert isinstance(x, int), 'Expected int'
    assert isinstance(y, int), 'Expected int'
    return x + y

Checking inputs will immediately catch callers who aren’t using appropriate arguments.

>>> add(2, 3)
5
>>> add('2', '3')
Traceback (most recent call last):
...
AssertionError: Expected int
>>>

Inline Tests

Assertions can also be used for simple tests.

def add(x, y):
    return x + y

assert add(2,2) == 4

This way you are including the test in the same module as your code.

Benefit: If the code is obviously broken, attempts to import the module will crash.

This is not recommended for exhaustive testing. It’s more of a basic “smoke test”. Does the function work on any example at all? If not, then something is definitely broken.

unittest Module

Suppose you have some code.

# simple.py

def add(x, y):
    return x + y

Now, suppose you want to test it. Create a separate testing file like this.

# test_simple.py

import simple
import unittest

Then define a testing class.

# test_simple.py

import simple
import unittest

# Notice that it inherits from unittest.TestCase
class TestAdd(unittest.TestCase):
    ...

The testing class must inherit from unittest.TestCase.

In the testing class, you define the testing methods.

# test_simple.py

import simple
import unittest

# Notice that it inherits from unittest.TestCase
class TestAdd(unittest.TestCase):
    def test_simple(self):
        # Test with simple integer arguments
        r = simple.add(2, 2)
        self.assertEqual(r, 5)
    def test_str(self):
        # Test with strings
        r = simple.add('hello', 'world')
        self.assertEqual(r, 'helloworld')
  • Important: Each method must start with test.

Using unittest

There are several built in assertions that come with unittest. Each of them asserts a different thing.

# Assert that expr is True
self.assertTrue(expr)

# Assert that x == y
self.assertEqual(x,y)

# Assert that x != y
self.assertNotEqual(x,y)

# Assert that x is near y
self.assertAlmostEqual(x,y,places)

# Assert that callable(arg1,arg2,...) raises exc
self.assertRaises(exc, callable, arg1, arg2, ...)

This is not an exhaustive list. There are other assertions in the module.

Running unittest

To run the tests, turn the code into a script.

# test_simple.py

...

if __name__ == '__main__':
    unittest.main()

Then run Python on the test file.

bash % python3 test_simple.py
F.
========================================================
FAIL: test_simple (__main__.TestAdd)
--------------------------------------------------------
Traceback (most recent call last):
  File "testsimple.py", line 8, in test_simple
    self.assertEqual(r, 5)
AssertionError: 4 != 5
--------------------------------------------------------
Ran 2 tests in 0.000s
FAILED (failures=1)

Commentary

Effective unit testing is an art and it can grow to be quite complicated for large applications.

The unittest module has a huge number of options related to test runners, collection of results and other aspects of testing. Consult the documentation for details.

Third Party Test Tools

The built-in unittest module has the advantage of being available everywhere–it’s part of Python. However, many programmers also find it to be quite verbose. A popular alternative is pytest. With pytest, your testing file simplifies to something like the following:

# test_simple.py
import simple

def test_simple():
    assert simple.add(2,2) == 4

def test_str():
    assert simple.add('hello','world') == 'helloworld'

To run the tests, you simply type a command such as python -m pytest. It will discover all of the tests and run them.

There’s a lot more to pytest than this example, but it’s usually pretty easy to get started should you decide to try it out.

Exercises

In this exercise, you will explore the basic mechanics of using Python’s unittest module.

In earlier exercises, you wrote a file stock.py that contained a Stock class. For this exercise, it assumed that you’re using the code written for [[../07_Advanced_Topics/03_Returning_functions|Exercise 7.9]] involving typed-properties. If, for some reason, that’s not working, you might want to copy the solution from Solutions/7_9 to your working directory.

Exercise 8.1: Writing Unit Tests

In a separate file test_stock.py, write a set a unit tests for the Stock class. To get you started, here is a small fragment of code that tests instance creation:

# test_stock.py

import unittest
import stock

class TestStock(unittest.TestCase):
    def test_create(self):
        s = stock.Stock('GOOG', 100, 490.1)
        self.assertEqual(s.name, 'GOOG')
        self.assertEqual(s.shares, 100)
        self.assertEqual(s.price, 490.1)

if __name__ == '__main__':
    unittest.main()

Run your unit tests. You should get some output that looks like this:

.
----------------------------------------------------------------------
Ran 1 tests in 0.000s

OK

Once you’re satisifed that it works, write additional unit tests that check for the following:

  • Make sure the s.cost property returns the correct value (49010.0)
  • Make sure the s.sell() method works correctly. It should decrement the value of s.shares accordingly.
  • Make sure that the s.shares attribute can’t be set to a non-integer value.

For the last part, you’re going to need to check that an exception is raised. An easy way to do that is with code like this:

class TestStock(unittest.TestCase):
    ...
    def test_bad_shares(self):
         s = stock.Stock('GOOG', 100, 490.1)
         with self.assertRaises(TypeError):
 
= Logging =

This section briefly introduces the logging module.

=== logging Module ===

The <code>logging</code> module is a standard library module for recording diagnostic information. It’s also a very large module with a lot of sophisticated functionality. We will show a simple example to illustrate its usefulness.

=== Exceptions Revisited ===

In the exercises, we wrote a function <code>parse()</code> that looked something like this:

<source lang="python"># fileparse.py
def parse(f, types=None, names=None, delimiter=None):
    records = []
    for line in f:
        line = line.strip()
        if not line: continue
        try:
            records.append(split(line,types,names,delimiter))
        except ValueError as e:
            print("Couldn't parse :", line)
            print("Reason :", e)
    return records

Focus on the try-except statement. What should you do in the except block?

Should you print a warning message?

try:
    records.append(split(line,types,names,delimiter))
except ValueError as e:
    print("Couldn't parse :", line)
    print("Reason :", e)

Or do you silently ignore it?

try:
    records.append(split(line,types,names,delimiter))
except ValueError as e:
    pass

Neither solution is satisfactory because you often want both behaviors (user selectable).

Using logging

The logging module can address this.

# fileparse.py
import logging
log = logging.getLogger(__name__)

def parse(f,types=None,names=None,delimiter=None):
    ...
    try:
        records.append(split(line,types,names,delimiter))
    except ValueError as e:
        log.warning("Couldn't parse : %s", line)
        log.debug("Reason : %s", e)

The code is modified to issue warning messages or a special Logger object. The one created with logging.getLogger(__name__).

Logging Basics

Create a logger object.

log = logging.getLogger(name)   # name is a string

Issuing log messages.

log.critical(message [, args])
log.error(message [, args])
log.warning(message [, args])
log.info(message [, args])
log.debug(message [, args])

Each method represents a different level of severity.

All of them create a formatted log message. args is used with the % operator to create the message.

logmsg = message % args # Written to the log

Logging Configuration

The logging behavior is configured separately.

# main.py

...

if __name__ == '__main__':
    import logging
    logging.basicConfig(
        filename  = 'app.log',      # Log output file
        level     = logging.INFO,   # Output level
    )

Typically, this is a one-time configuration at program startup. The configuration is separate from the code that makes the logging calls.

Comments

Logging is highly configurable. You can adjust every aspect of it: output files, levels, message formats, etc. However, the code that uses logging doesn’t have to worry about that.

Exercises

Exercise 8.2: Adding logging to a module

In fileparse.py, there is some error handling related to exceptions caused by bad input. It looks like this:

# fileparse.py
import csv

def parse_csv(lines, select=None, types=None, has_headers=True, delimiter=',', silence_errors=False):
    '''
    Parse a CSV file into a list of records with type conversion.
    '''
    if select and not has_headers:
        raise RuntimeError('select requires column headers')

    rows = csv.reader(lines, delimiter=delimiter)

    # Read the file headers (if any)
    headers = next(rows) if has_headers else []

    # If specific columns have been selected, make indices for filtering and set output columns
    if select:
        indices = [ headers.index(colname) for colname in select ]
        headers = select

    records = []
    for rowno, row in enumerate(rows, 1):
        if not row:     # Skip rows with no data
            continue

        # If specific column indices are selected, pick them out
        if select:
            row = [ row[index] for index in indices]

        # Apply type conversion to the row
        if types:
            try:
                row = [func(val) for func, val in zip(types, row)]
            except ValueError as e:
                if not silence_errors:
                    print(f"Row {rowno}: Couldn't convert {row}")
                    print(f"Row {rowno}: Reason {e}")
                continue

        # Make a dictionary or a tuple
        if headers:
            record = dict(zip(headers, row))
        else:
            record = tuple(row)
        records.append(record)

    return records

Notice the print statements that issue diagnostic messages. Replacing those prints with logging operations is relatively simple. Change the code like this:

# fileparse.py
import csv
import logging
log = logging.getLogger(__name__)

def parse_csv(lines, select=None, types=None, has_headers=True, delimiter=',', silence_errors=False):
    '''
    Parse a CSV file into a list of records with type conversion.
    '''
    if select and not has_headers:
        raise RuntimeError('select requires column headers')

    rows = csv.reader(lines, delimiter=delimiter)

    # Read the file headers (if any)
    headers = next(rows) if has_headers else []

    # If specific columns have been selected, make indices for filtering and set output columns
    if select:
        indices = [ headers.index(colname) for colname in select ]
        headers = select

    records = []
    for rowno, row in enumerate(rows, 1):
        if not row:     # Skip rows with no data
            continue

        # If specific column indices are selected, pick them out
        if select:
            row = [ row[index] for index in indices]

        # Apply type conversion to the row
        if types:
            try:
                row = [func(val) for func, val in zip(types, row)]
            except ValueError as e:
                if not silence_errors:
                    log.warning("Row %d: Couldn't convert %s", rowno, row)
                    log.debug("Row %d: Reason %s", rowno, e)
                continue

        # Make a dictionary or a tuple
        if headers:
            record = dict(zip(headers, row))
        else:
            record = tuple(row)
        records.append(record)

    return records

Now that you’ve made these changes, try using some of your code on bad data.

>>> import report
>>> a = report.read_portfolio('Data/missing.csv')
Row 4: Bad row: ['MSFT', '', '51.23']
Row 7: Bad row: ['IBM', '', '70.44']
>>>

If you do nothing, you’ll only get logging messages for the WARNING level and above. The output will look like simple print statements. However, if you configure the logging module, you’ll get additional information about the logging levels, module, and more. Type these steps to see that:

>>> import logging
>>> logging.basicConfig()
>>> a = report.read_portfolio('Data/missing.csv')
WARNING:fileparse:Row 4: Bad row: ['MSFT', '', '51.23']
WARNING:fileparse:Row 7: Bad row: ['IBM', '', '70.44']
>>>

You will notice that you don’t see the output from the log.debug() operation. Type this to change the level.

>>> logging.getLogger('fileparse').level = logging.DEBUG
>>> a = report.read_portfolio('Data/missing.csv')
WARNING:fileparse:Row 4: Bad row: ['MSFT', '', '51.23']
DEBUG:fileparse:Row 4: Reason: invalid literal for int() with base 10: ''
WARNING:fileparse:Row 7: Bad row: ['IBM', '', '70.44']
DEBUG:fileparse:Row 7: Reason: invalid literal for int() with base 10: ''
>>>

Turn off all, but the most critical logging messages:

>>> logging.getLogger('fileparse').level=logging.CRITICAL
>>> a = report.read_portfolio('Data/missing.csv')
>>>

Exercise 8.3: Adding Logging to a Program

To add logging to an application, you need to have some mechanism to initialize the logging module in the main module. One way to do this is to include some setup code that looks like this:

# This file sets up basic configuration of the logging module.
# Change settings here to adjust logging output as needed.
import logging
logging.basicConfig(
    filename = 'app.log',            # Name of the log file (omit to use stderr)
    filemode = 'w',                  # File mode (use 'a' to append)
    level    = logging.WARNING,      # Logging level (DEBUG, INFO, WARNING, ERROR, or CRITICAL)
)

Again, you’d need to put this someplace in the startup steps of your program. For example, where would you put this in your report.py program?

Debugging

Debugging Tips

So, you’re program has crashed…

bash % python3 blah.py
Traceback (most recent call last):
  File "blah.py", line 13, in ?
    foo()
  File "blah.py", line 10, in foo
    bar()
  File "blah.py", line 7, in bar
    spam()
  File "blah.py", 4, in spam
    line x.append(3)
AttributeError: 'int' object has no attribute 'append'

Now what?!

Reading Tracebacks

The last line is the specific cause of the crash.

bash % python3 blah.py
Traceback (most recent call last):
  File "blah.py", line 13, in ?
    foo()
  File "blah.py", line 10, in foo
    bar()
  File "blah.py", line 7, in bar
    spam()
  File "blah.py", 4, in spam
    line x.append(3)
# Cause of the crash
AttributeError: 'int' object has no attribute 'append'

However, it’s not always easy to read or understand.

PRO TIP: Paste the whole traceback into Google.

Using the REPL

Use the option -i to keep Python alive when executing a script.

bash % python3 -i blah.py
Traceback (most recent call last):
  File "blah.py", line 13, in ?
    foo()
  File "blah.py", line 10, in foo
    bar()
  File "blah.py", line 7, in bar
    spam()
  File "blah.py", 4, in spam
    line x.append(3)
AttributeError: 'int' object has no attribute 'append'
>>>

It preserves the interpreter state. That means that you can go poking around after the crash. Checking variable values and other state.

Debugging with Print

print() debugging is quite common.

Tip: Make sure you use repr()

def spam(x):
    print('DEBUG:', repr(x))
    ...

repr() shows you an accurate representation of a value. Not the nice printing output.

>>> from decimal import Decimal
>>> x = Decimal('3.4')
# NO `repr`
>>> print(x)
3.4
# WITH `repr`
>>> print(repr(x))
Decimal('3.4')
>>>

The Python Debugger

You can manually launch the debugger inside a program.

def some_function():
    ...
    breakpoint()      # Enter the debugger (Python 3.7+)
    ...

This starts the debugger at the breakpoint() call.

In earlier Python versions, you did this. You’ll sometimes see this mentioned in other debugging guides.

import pdb
...
pdb.set_trace()       # Instead of `breakpoint()`
...

Run under debugger

You can also run an entire program under debugger.

bash % python3 -m pdb someprogram.py

It will automatically enter the debugger before the first statement. Allowing you to set breakpoints and change the configuration.

Common debugger commands:

(Pdb) help            # Get help
(Pdb) w(here)         # Print stack trace
(Pdb) d(own)          # Move down one stack level
(Pdb) u(p)            # Move up one stack level
(Pdb) b(reak) loc     # Set a breakpoint
(Pdb) s(tep)          # Execute one instruction
(Pdb) c(ontinue)      # Continue execution
(Pdb) l(ist)          # List source code
(Pdb) a(rgs)          # Print args of current function
(Pdb) !statement      # Execute statement

For breakpoints location is one of the following.

(Pdb) b 45            # Line 45 in current file
(Pdb) b file.py:45    # Line 34 in file.py
(Pdb) b foo           # Function foo() in current file
(Pdb) b module.foo    # Function foo() in a module

Exercises

Exercise 8.4: Bugs? What Bugs?

It runs. Ship it!