
The ultimate guide to debugging in Python

Even if you write clear and readable code, even if you cover it with tests, even if you are a very experienced developer, weird bugs will inevitably appear and you will need to debug them somehow. Lots of people resort to just using a bunch of print statements to see what's happening in their code. This approach is far from ideal, and there are much better ways to find out what's wrong with your code, some of which we will explore in this article.


Logging is a Must

If you write an application without some sort of logging setup, you will eventually come to regret it. Not having any logs from your application can make it very difficult to troubleshoot bugs. Luckily, in Python, setting up a basic logger is very simple:


import logging

logging.basicConfig(

    filename='application.log',

    level=logging.WARNING,

    format='[%(asctime)s] {%(pathname)s:%(lineno)d} %(levelname)s - %(message)s',

    datefmt='%H:%M:%S'

)


logging.error("Some serious error occurred.")

logging.warning('Function you are using is deprecated.')


This is all you need to start writing logs to a file, which will look something like this (you can find the path to the file using logging.getLoggerClass().root.handlers[0].baseFilename):


[12:52:35] {<stdin>:1} ERROR - Some serious error occurred.

[12:52:35] {<stdin>:1} WARNING - Function you are using is deprecated.


This setup might seem good enough (and often it is), but having well-configured, formatted, readable logs can make your life so much easier. One way to improve and expand the config is to use an .ini or .yaml file that gets read by the logger. Here is an example of what you could do in such a config:


version: 1

disable_existing_loggers: true


formatters:

  standard:

    format: "[%(asctime)s] {%(pathname)s:%(lineno)d} %(levelname)s - %(message)s"

    datefmt: '%H:%M:%S'


handlers:

  console:  # handler which will log into stdout

    class: logging.StreamHandler

    level: DEBUG

    formatter: standard  # Use formatter defined above

    stream: ext://sys.stdout

  file:  # handler which will log into file

    class: logging.handlers.RotatingFileHandler

    level: WARNING

    formatter: standard  # Use formatter defined above

    filename: /tmp/warnings.log

    maxBytes: 10485760 # 10MB

    backupCount: 10

    encoding: utf8


root:  # Loggers are organized in hierarchy - this is the root logger config

  level: ERROR

  handlers: [console, file]  # Attaches both handlers defined above


loggers:  # Defines descendants of root logger

  mymodule:  # Logger for "mymodule"

    level: INFO

    handlers: [file]  # Will only use "file" handler defined above

    propagate: no  # Will not propagate logs to "root" logger


Having this kind of extensive config inside your Python code would be hard to navigate, edit and maintain. Keeping things in a YAML file makes it much easier to set up and tweak multiple loggers with very specific settings like the ones above.

Now that the config lives in a file, we need to load it somehow. The simplest way to do so with YAML files:


import yaml

from logging import config


with open("config.yaml", 'rt') as f:

    config_data = yaml.safe_load(f.read())

    config.dictConfig(config_data)


Python's logging module doesn't actually support YAML files directly, but it does support dictionary configs, which can easily be created from YAML using yaml.safe_load. If you are inclined to rather use old-style .ini files, then I just want to point out that using dictionary configs is the recommended approach for new applications, as per the docs. For more examples, check out the logging cookbook.
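To see how this fits together without any file on disk: dictConfig accepts the same nested structure that yaml.safe_load would produce. A minimal, self-contained sketch (this config is a trimmed-down, hypothetical equivalent of the YAML above):

```python
import logging
from logging import config

# Same shape as the YAML config, just written as a Python dict.
config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {"format": "[%(asctime)s] %(levelname)s - %(name)s - %(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "formatter": "standard",
        },
    },
    "root": {"level": "INFO", "handlers": ["console"]},
})

# Loggers are looked up by name; "mymodule" becomes a descendant of root.
logger = logging.getLogger("mymodule")
logger.info("Configured via dictConfig")
```

The takeaway is that the YAML file is just a convenient serialization; once loaded, everything goes through dictConfig either way.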


Logging Decorators


Continuing with the previous logging tip, you might get into a situation where you need to log calls of some buggy function. Instead of modifying the body of said function, you could employ a logging decorator that logs every function call with a specific log level and an optional message. Let's look at the decorator:


from functools import wraps, partial

import logging


def attach_wrapper(obj, func=None):  # Helper that attaches a function as an attribute of an object

    if func is None:

        return partial(attach_wrapper, obj)

    setattr(obj, func.__name__, func)

    return func


def log(level, message):  # Actual decorator

    def decorate(func):

        logger = logging.getLogger(func.__module__)  # Setup logger

        formatter = logging.Formatter(

            '%(asctime)s - %(name)s - %(levelname)s - %(message)s')

        handler = logging.StreamHandler()

        handler.setFormatter(formatter)

        logger.addHandler(handler)

        log_message = f"{func.__name__} - {message}"


        @wraps(func)

    def wrapper(*args, **kwargs):  # Logs the message before executing the decorated function

            logger.log(level, log_message)

            return func(*args, **kwargs)


        @attach_wrapper(wrapper)  # Attaches "set_level" to "wrapper" as attribute

        def set_level(new_level):  # Function that allows us to set log level

            nonlocal level

            level = new_level


        @attach_wrapper(wrapper)  # Attaches "set_message" to "wrapper" as attribute

        def set_message(new_message):  # Function that allows us to set message

            nonlocal log_message

            log_message = f"{func.__name__} - {new_message}"


        return wrapper

    return decorate


# Example Usage

@log(logging.WARNING, "example-param")

def somefunc(args):

    return args


somefunc("some args")


somefunc.set_level(logging.CRITICAL)  # Change log level by accessing internal decorator function

somefunc.set_message("new-message")  # Change log message by accessing internal decorator function

somefunc("some args")


Not gonna lie, this one might take a bit to wrap your head around (you might want to just copy-paste it and use it). The idea here is that the log function takes its arguments and makes them available to the inner wrapper function. These arguments are then made adjustable through the accessor functions, which are attached to the wrapper as attributes. As for the functools.wraps decorator - if we didn't use it here, the name of the function (func.__name__) would get overwritten by the name of the wrapper. That's a problem, because we want to print the real name. functools.wraps solves this by copying the function name, docstring and argument list onto the wrapper function.

Anyway, this is the output of the above code. Pretty neat, right?


2020-05-01 14:42:10,289 - __main__ - WARNING - somefunc - example-param

2020-05-01 14:42:10,289 - __main__ - CRITICAL - somefunc - new-message
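To see concretely what functools.wraps buys you, here is a minimal side-by-side sketch (plain_decorator and wraps_decorator are made-up names for illustration):

```python
from functools import wraps

def plain_decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def wraps_decorator(func):
    @wraps(func)  # Copies __name__, __doc__, etc. from func onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain_decorator
def foo():
    pass

@wraps_decorator
def bar():
    pass

print(foo.__name__)  # 'wrapper' - the original name is lost
print(bar.__name__)  # 'bar' - preserved by functools.wraps
```

Without wraps, every decorated function would log as "wrapper", which defeats the purpose of logging the function name.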


__repr__ For More Readable Logs


An easy improvement to your code that makes it more debuggable is adding a __repr__ method to your classes. In case you're not familiar with this method, all it does is return a string representation of an instance of a class. The best practice for __repr__ is to output text that could be used to recreate the instance. For example:


class Circle:

    def __init__(self, x, y, radius):

        self.x = x

        self.y = y

        self.radius = radius


    def __repr__(self):

        return f"Circle({self.x}, {self.y}, {self.radius})"


...

c = Circle(100, 80, 30)

repr(c)

# Circle(100, 80, 30)


If representing an object as shown above is not desirable or not possible, a good alternative is the <...> representation, e.g. <_io.TextIOWrapper name='somefile.txt' mode='w' encoding='UTF-8'>.

Apart from __repr__, it's also a good idea to implement the __str__ method, which is used by default when print(instance) is called. With these two methods you can get lots of information just by printing your variables.
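To illustrate how the two methods divide the work, here's a minimal sketch extending the Circle example with a hypothetical __str__:

```python
class Circle:
    def __init__(self, x, y, radius):
        self.x = x
        self.y = y
        self.radius = radius

    def __repr__(self):
        # Unambiguous, recreatable form - used by the debugger, logs and containers
        return f"Circle({self.x}, {self.y}, {self.radius})"

    def __str__(self):
        # Human-friendly form - used by print() and str()
        return f"circle at ({self.x}, {self.y}) with radius {self.radius}"


c = Circle(100, 80, 30)
print(repr(c))  # Circle(100, 80, 30)
print(c)        # circle at (100, 80) with radius 30
print([c])      # [Circle(100, 80, 30)] - containers always use __repr__
```

Note that containers like lists always fall back to __repr__ for their elements, which is another reason to implement it first.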


__missing__ Dunder Method For Dictionaries


If you for whatever reason need to implement a custom dictionary class, you can expect some bugs arising from KeyErrors when you try to access a key that doesn't actually exist. To avoid having to poke around in the code to see which key is missing, you can implement the special __missing__ method, which dict.__getitem__ calls whenever a key is not found.


import logging

class MyDict(dict):

    def __missing__(self, key):

        message = f'{key} not present in the dictionary!'

        logging.warning(message)

        return message  # Or raise some error instead


The implementation above is very simple and only logs and returns a message with the missing key, but you could also log other valuable information to give you more context about what went wrong in the code.
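Here's a quick usage sketch of the class above; one thing worth knowing is that dict.get() bypasses __missing__, so only bracket access triggers it:

```python
import logging

class MyDict(dict):
    def __missing__(self, key):
        # Called by dict.__getitem__ whenever the key is absent
        message = f'{key} not present in the dictionary!'
        logging.warning(message)
        return message  # Or raise some error instead


d = MyDict(a=1)
print(d['a'])      # 1 - existing key, __missing__ is not called
print(d['b'])      # 'b not present in the dictionary!' - plus a logged warning
print(d.get('b'))  # None - .get() does not go through __missing__
```

Returning a value here swallows the error; in a real application you would more likely raise a richer exception carrying the missing key and surrounding context.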

Debugging Crashing Application

If your application crashes before you get a chance to see what is going on in it, you might find this trick quite useful.


Running the application with the -i argument (python3 -i app.py) causes it to start an interactive shell as soon as the program exits. At that point you can inspect variables and call functions.


If that's not good enough, you can bring a bigger hammer - pdb, the Python Debugger. pdb has quite a few features that would warrant an article of their own, but here is an example and a rundown of the most important bits. Let's first look at our little crashing script:


# crashing_app.py

SOME_VAR = 42


class SomeError(Exception):

    pass


def func():

    raise SomeError("Something went wrong...")


func()


Now, if we run it with the -i argument, we get a chance to debug it:


# Run crashing application

~ $ python3 -i crashing_app.py

Traceback (most recent call last):

  File "crashing_app.py", line 9, in <module>

    func()

  File "crashing_app.py", line 7, in func

    raise SomeError("Something went wrong...")

__main__.SomeError: Something went wrong...

>>> # We are in the interactive shell

>>> import pdb

>>> pdb.pm()  # start Post-Mortem debugger

> .../crashing_app.py(7)func()

-> raise SomeError("Something went wrong...")

(Pdb) # Now we are in debugger and can poke around and run some commands:

(Pdb) p SOME_VAR  # Print value of variable

42

(Pdb) l  # List surrounding code we are working with

  2

  3   class SomeError(Exception):

  4       pass

  5

  6   def func():

  7  ->     raise SomeError("Something went wrong...")

  8

  9   func()

[EOF]

(Pdb)  # Continue debugging... set breakpoints, step through the code, etc.


The debugging session above shows very briefly what you can do with pdb. After the program terminates, we enter an interactive debugging session. First, we import pdb and start the post-mortem debugger. At that point we can use all the pdb commands. In the example above, we print a variable using the p command and list the surrounding code using the l command. Most of the time you will want to set a breakpoint, which you can do with b LINE_NO, run the program until the breakpoint is hit (c), then continue stepping through the function with s, optionally printing the stack trace with w. For a full listing of commands, head over to the pdb docs.
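If you would rather stop inside a running function than debug post-mortem, you can also plant a breakpoint directly in the code. A minimal sketch (buggy_function is a made-up example; the breakpoint() built-in exists since Python 3.7, and import pdb; pdb.set_trace() is the equivalent on older versions):

```python
import os

def buggy_function(data):
    result = [x * 2 for x in data]
    breakpoint()  # Execution stops here and opens a (Pdb) prompt (Python 3.7+)
    return result

# Setting PYTHONBREAKPOINT=0 turns every breakpoint() call into a no-op,
# which lets you run the same code non-interactively without editing it.
os.environ["PYTHONBREAKPOINT"] = "0"
print(buggy_function([1, 2, 3]))  # [2, 4, 6]
```

The PYTHONBREAKPOINT environment variable can also point at a different debugger entry point, so the same breakpoint() line works with whichever debugger you prefer.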


Inspecting Stack Traces


Let's say your code is, for example, a Flask or Django application running on a remote server where you can't get an interactive debugging session. In that case you can use the traceback and sys packages to get more insight into what's failing in your code:


import traceback

import sys


class SomeError(Exception):
    pass

def func():
    try:
        raise SomeError("Something went wrong...")
    except SomeError:
        traceback.print_exc(file=sys.stderr)


When run, the code above will print the last exception that was raised. Apart from printing exceptions, you can also use the traceback package to print the stack trace (traceback.print_stack()) or extract a raw stack frame, format it and inspect it further (traceback.format_list(traceback.extract_stack())).
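On a server you usually want the traceback in your logs rather than just on stderr. traceback.format_exc() returns it as a string you can store or ship anywhere, and logger.exception() appends it to a log record automatically. A small sketch (the function and message names here are illustrative):

```python
import logging
import traceback

logging.basicConfig(level=logging.ERROR)
logger = logging.getLogger(__name__)

def func():
    try:
        raise ValueError("Something went wrong...")
    except ValueError:
        # format_exc() captures the current traceback as a plain string
        tb_text = traceback.format_exc()
        # logger.exception logs at ERROR level and attaches the traceback itself
        logger.exception("Unhandled error in func")
        return tb_text

print(func())
```

In practice logger.exception() alone is often enough; format_exc() is useful when the traceback needs to go somewhere other than the logging pipeline, e.g. into an error-reporting API payload.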

Reloading Modules During Debugging

Sometimes you might be debugging or experimenting with some function in an interactive shell and making frequent changes to it. To make the cycle of running, testing and modifying easier, you can run importlib.reload(module) to avoid having to restart the interactive session after every change:


>>> import module
>>> module.func()
"This is result..."

# Make some changes to "func" in module.py
>>> module.func()
"This is result..."  # Outdated result
>>> from importlib import reload; reload(module)  # Reload "module" after changes made to "func"
>>> module.func()
"New result..."


This tip is more about efficiency than debugging. It's always nice to be able to skip a few unnecessary steps and make your workflow faster and more efficient. In general, reloading modules from time to time is a good idea, as it can save you from trying to debug code that has already been modified a bunch of times in the meantime.




Conclusion


Most of the time, what programming really is - is just a lot of trial and error. Debugging, on the other hand, is - in my opinion - an art, and becoming good at it takes time and experience: the more you know the libraries and frameworks you use, the easier it gets. The tips and tricks listed above can make your debugging a bit faster and more efficient, but apart from these Python-specific tools you might want to familiarize yourself with general approaches to debugging - for example The Art of Debugging by Remy Sharp.
