A better software architecture for SaaS startups and consumer apps
Source: https://alexkrupp.typepad.com/sensemaking/2021/06/django-for-startup-founders-a-better-software-architecture-for-saas-startups-and-consumer-apps.html
The code snippets in this guide are written using Python, Django, and DRF, but they're purposely designed to be understandable even if you don't know much Python or Django. I've created a full app using these code snippets to ensure they work, which is published here. The reason for choosing Python is that it's generally the best language for startups with less than $100M in ARR; as the so-called second-best language for everything, Python gives startups the most optionality for experimenting with different product functionality and monetization models. It's also easy to hire for. I do discuss some issues specific to Django, but most of this advice is applicable regardless of what language and web framework you're using.
Predictability
Everyone knows that creating software is insanely expensive, and yet almost everyone still vastly underestimates its total cost of ownership.
According to various academic and industry estimates, 60% - 80% of the cost associated with any given line of code is incurred as maintenance after that line is originally written. This is due to bugs, changing feature requirements, updates to dependencies, unmaintained dependencies that need to be replaced or maintained in-house, etc. As such, it should be easy to understand why the best type of code is usually the code that never gets written in the first place.
What about the second-best type of code though?
For context, it's important to understand that in most codebases, the bulk of maintenance costs aren't inevitable, but rather are because the existing code is written in such a way that makes it time consuming to read and understand. When developers new to a codebase are tasked with making even a basic change, it often takes days or weeks of research before the first lines of code get written or edited. This is true even for competent and experienced developers, so it's for this reason that the next best type of code is code that's utterly predictable.
What do we mean by this?
Essentially, each REST endpoint should perform the same basic steps in the same order. This way if a person has enough knowledge to understand how any one endpoint works, they should have enough knowledge to understand how every endpoint works. And if they're tasked with fixing a bug or adding a feature, there should be a simple step-by-step process they can use each time to find exactly where in the codebase they need to modify or extend the code.
Rule #1: Every endpoint should tell a story
Consider how every newspaper article follows the same basic inverted pyramid format, and answers the same basic questions of who, what, where, when, why, and how. This convention doesn't in any way limit what topics journalists can write about or put any limit on the sophistication of their ideas; it just makes newspaper articles super easy to read even for folks at a middle school reading level, and makes it easy to remember the main ideas well enough to share with your friends.
This sort of clarity and lucidity is not just something we can — and should — aspire to within our code. Rather, it's something that we can actually get 85% of the way toward achieving just by making each of our endpoints follow the same predictable pattern.
What pattern is this?
The key insight is that there are up to seven basic steps that any given REST endpoint will perform. For many endpoints, there are multiple different orders in which you could perform these steps, but there also happens to exist one specific order that always works for every endpoint. And as it turns out, there are substantial benefits to always performing these seven steps, or at least the subset that are necessary for any given endpoint, in the same way and in the same order:
- Specify permissions: Who is allowed to access this endpoint?
- Copy input to local variables: What parameters (query params or body params) does this endpoint take?
- Sanitize user input: User input must always be sanitized before further processing.
- Validate user input: Ensure the user has supplied all of the required parameters for this endpoint in the correct format. If there are errors, aggregate all of the input validation errors into a dictionary-style response as described below.
- Enforce business requirements: Check for cases where the user is allowed to access the endpoint and submitted all of the required parameters correctly, but isn't allowed to perform a specific action due to the business logic of the application, e.g. creating an account with a username that has already been taken. If there is an error, return the first error that occurs, as described below.
- Perform business logic: Do whatever this endpoint is actually supposed to do, e.g. altering state in the database, returning data to the API consumer, sending data to a third-party processor, etc.
- Return HTTP response: Return any data necessary for the API consumer(s), along with a status code.
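To make the difference between the "validate user input" and "enforce business requirements" steps concrete, here is a minimal, framework-free sketch. Validation collects every problem into a dictionary keyed by field, while a business-requirement check stops at the first failure. All names here (validate_account_fields, enforce_username_available, the in-memory TAKEN_USERNAMES set, and the specific rules) are hypothetical stand-ins, not the validator or database lookup used elsewhere in this guide.

```python
import re

# Hypothetical stand-in for a database lookup.
TAKEN_USERNAMES = {"alice"}


class ValidationError(Exception):
    """Carries every validation problem at once, keyed by field name."""

    def __init__(self, errors):
        super().__init__("invalid input")
        self.errors = errors


class UsernameAlreadyExistsError(Exception):
    """A business-requirement error: well-formed input, disallowed action."""


def validate_account_fields(username, email_address, password):
    # Collect all format problems so the client can fix them in one pass.
    errors = {}
    if not username:
        errors["username"] = ["This field is required."]
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email_address):
        errors["email_address"] = ["Enter a valid email address."]
    if len(password) < 8:
        errors["password"] = ["Must be at least 8 characters."]
    if errors:
        raise ValidationError(errors)


def enforce_username_available(username):
    # Business requirements stop at the first failure.
    if username.casefold() in TAKEN_USERNAMES:
        raise UsernameAlreadyExistsError(username)


try:
    validate_account_fields("", "not-an-email", "short")
except ValidationError as e:
    print(sorted(e.errors))  # ['email_address', 'password', 'username']
```

Aggregating validation errors lets a sign-up form highlight every bad field at once, whereas a business-requirement failure such as a taken username is reported on its own.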
This might strike some as being overly prescriptive and formulaic. But in much the same way that no one would want to go back to the age of heroic medicine, embracing a subtler, more prosaic style comes with many benefits — vastly reducing the total cost of ownership of your codebase and increasing developer velocity more than anything else you could possibly implement. How? Quite simply, it's because:
- New code should never go live without first being audited for correctness and security, among other things.
- It's much faster to determine whether things have been done correctly if they always happen in the same order. For example, if you see that the user input hasn't been sanitized right at the beginning of a view, you don't need to read through all the methods that implement the business logic to see whether it's been done later — you already know that it's been done incorrectly and that there is a security problem.
- In the future, fixing bugs, changing business logic, and doing maintenance will all require knowing where in the code to make changes. And if the implementation of each endpoint follows the same pattern, there's no need to waste hours or days reading through code that isn't relevant to what you need to accomplish.
As with our newspaper example above, ensuring that each endpoint conforms to this pattern will vastly reduce the time (and cost) required each time a reader needs to uncover the Five Ws and How of that endpoint, e.g. in order to make any necessary changes. Let's look at an example code snippet for creating a new user account, written using Django and Django Rest Framework, and then walk through it step-by-step:
class User(APIView):
    permission_classes = ( AccountCreation, )

    # Create account
    def post(self, request):
        unsafe_username = request.data.get("username", "")
        unsafe_email_address = request.data.get("email_address", "")
        unsafe_terms_of_service_accepted = request.data.get("terms_of_service_accepted", None)
        unsafe_password = request.data.get("password", "")

        sanitized_username = sanitization_utils.strip_xss(unsafe_username)
        sanitized_email_address = sanitization_utils.strip_xss(unsafe_email_address)
        sanitized_terms_of_service_accepted = sanitization_utils.string_to_boolean(unsafe_terms_of_service_accepted)

        try:
            user_model, auth_token = account_management_service.create_account(
                sanitized_username,
                sanitized_email_address,
                unsafe_password,
                sanitized_terms_of_service_accepted,
            )

        except marshmallow.exceptions.ValidationError as e:
            return get_validation_error_response(validation_error=e, http_status_code=422)
        except custom_errors.UsernameAlreadyExistsError as e:
            return get_business_requirement_error_response(business_logic_error=e, http_status_code=409)
        except custom_errors.EmailAddressAlreadyExistsError as e:
            return get_business_requirement_error_response(business_logic_error=e, http_status_code=409)
        except custom_errors.TermsNotAcceptedError as e:
            return get_business_requirement_error_response(business_logic_error=e, http_status_code=429)

        resp = { "data": { "auth_token": auth_token } }
        return Response(data=resp, status=201)
So what exactly is happening here? Let's go through this line-by-line and discuss what makes this code snippet good.
- Line 2: Specify who has access to each endpoint in the view. The important thing here is that the permissions for each endpoint are explicitly specified at the top of each view, not implicitly inherited from your Django settings file. This makes it easy to understand what's happening, audit the codebase for security issues, and make any changes if necessary in the future.
- Lines 6 - 9: Copy each input parameter into a new variable. This makes it easy for anyone, including front-end developers and product managers, to figure out what input each endpoint is expecting, even before any documentation has been written. We prefix each variable with unsafe_ to draw attention to the fact that they haven't yet been sanitized.
- Lines 11 - 13: Always sanitize user input for security vulnerabilities.[4] The one common exception to this rule is with passwords; they're only stored as hashes so there isn't an XSS vector, and accidentally stripping complexity from a password would be its own security risk. But aside from this one exception, user input should always be sanitized before doing anything else with it: validation, business logic, storage, presentation, etc. The reasons for this are explained in Rule #4.
- Lines 16-22: What the endpoint actually does; in this case, creating a user account. This is the method we would need to dig into if we wanted to understand the implementation details of how accounts are actually created. One quick thing to note is that we're calling account_management_service.create_account(...) rather than importing the service method create_account from account_management_service and calling it directly. This lets the reader immediately know which file they'd need to open to see the method's implementation, without scrolling to the top of the current file or using an IDE. And when functions are called via their modules, the filenames serve as advertisements for the rest of the codebase. Getting constant reminders of where each function lives makes it more likely that developers will add new code to the right places, rather than putting it in the wrong place or, even worse, duplicating code that already exists.
- Lines 23 - 30: All of the possible errors the endpoint can return. These are discussed in more detail under Rule #10.
- Lines 32 - 33: What the endpoint returns.
The reason for writing our views this way is that it makes it easy to answer the Five Ws for each endpoint. That is, we can immediately see who has access to the endpoint, what input the endpoint accepts, how that input is sanitized, where the business logic is performed, what errors get returned when things go sideways, and the why of the endpoint — what a successful response looks like.
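The helpers in sanitization_utils are not shown in this guide. As a rough idea of what string_to_boolean might do with a value like terms_of_service_accepted, here is a hypothetical sketch. The accepted spellings and the choice to return None for anything unrecognized are assumptions made for illustration, not the author's implementation.

```python
def string_to_boolean(unsafe_value):
    """Coerce a JSON or form value to True, False, or None (unrecognized)."""
    if isinstance(unsafe_value, bool):
        return unsafe_value
    if isinstance(unsafe_value, str):
        normalized = unsafe_value.strip().casefold()
        if normalized in {"true", "1", "yes"}:
            return True
        if normalized in {"false", "0", "no"}:
            return False
    # Missing or unexpected input (None, numbers, lists, ...) is not guessed at.
    return None


print(string_to_boolean(" TRUE "))  # True
print(string_to_boolean("no"))      # False
print(string_to_boolean(None))      # None
```

Returning None instead of False for unrecognized input leaves it to the validation step to decide whether a missing value is an error.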
Let's contrast this to an example that's basically straight from the tutorial for Django Rest Framework:
class User(APIView):
    def post(self, request):
        serializer = UserSerializer(data=request.data)
        if serializer.is_valid():
            serializer.save()
            return Response(serializer.data, status=status.HTTP_201_CREATED)
        return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)
There's no way to know by looking at this who is allowed to access the endpoint, what input the endpoint accepts, whether that input is properly sanitized, what data it returns, what the possible errors are, etc. And it's not even just a matter of looking at the code for UserSerializer to figure this out; for any real-world application the answers to these questions may well be buried multiple levels deep, and getting adequate answers may take not just hours but days.
If using GenericViews, the situation is even more opaque:
class User(generics.CreateAPIView):
    queryset = User.objects.all()
    serializer_class = UserSerializer
Call me old-fashioned, but I believe that:
- Code should be a step-by-step set of instructions that's meant to be read in order, like a recipe.
- You should be able to understand what code is doing by reading it, or at least have pretty good intuition about what's happening.
Here you're effectively just using a weird DSL to specify behavior by setting class properties and overriding methods. This makes sense for something like the Django Admin, where you're just configuring a GUI. But for actual application development, GenericViews are the exact opposite of what good code should be like.
There may be some legitimate use cases for writing code like this, e.g. for a hackathon, or a client demo, or any other rapid prototype that won't be deployed publicly. The problem is that a lot of developers seem to take the fact that this tooling exists as an implicit recommendation that they should be writing production code this way. This issue is then compounded by all the tutorials and Stack Overflow answers contributed by folks who may be using their web frameworks merely as fun technical toys, or at least in ways that are wildly different from anything grounded in any sort of business context. Basically, GenericViews are best avoided, and overall I think the Django community would benefit from moving them from core Django into a contrib package.
Rule #2: Keep business logic in services
There are three different places where you can put business logic in Django: in models or model managers, in forms or serializers, and in services. I'm pretty strongly of the opinion that business logic should only ever go in services. I'll explain why, but first let's explain what services are and then walk through an example.
A 'service' is just a file with a bunch of functions that contain all the business logic related to some part of your app. E.g. for FWD:Everyone, some services we have are account_management_service.py, thread_upload_service.py, content_discovery_service.py, etc. Each service file usually corresponds closely with a views file, e.g. account_management_views.py or thread_upload_views.py. Services live in between the views and the models, so that the request-response lifecycle for each endpoint looks like:
- An HTTP request to a URL gets routed to a view.
- The view copies any query parameters or body data into local variables, sanitizes these local variables for security issues, then passes the sanitized variables to a service.
- The service validates user input (e.g. using serializers), enforces business requirements, performs all business logic, and then returns any output or errors to the views.
- The view then returns an HTTP response to the API consumer.
As an example, let's look at one possible implementation for account_management_service.create_account, the service method from the previous section:
def create_account(sanitized_username, sanitized_email_address, unsafe_password,
                   sanitized_terms_of_service_accepted):

    fields_to_validate_dict = {
        "username": sanitized_username,
        "email_address": sanitized_email_address,
        "password": unsafe_password,
        "terms_of_service": sanitized_terms_of_service_accepted,
    }

    AccountCreationValidator().load(fields_to_validate_dict)

    nfc_username = unicodedata.normalize("NFC", sanitized_username)
    nfkc_username = unicodedata.normalize("NFKC", sanitized_username).casefold()

    nfc_email_address = unicodedata.normalize("NFC", sanitized_email_address)
    nfkc_email_address = unicodedata.normalize("NFKC", sanitized_email_address).casefold()

    if User.objects.filter(nfkc_username=nfkc_username).exists():
        raise custom_errors.UsernameAlreadyExistsError()

    if EmailAddress.objects.filter(nfkc_email_address=nfkc_email_address, is_verified=True).exists():
        raise custom_errors.EmailAddressAlreadyExistsError()

    if not sanitized_terms_of_service_accepted:
        raise custom_errors.TermsNotAcceptedError()

    with transaction.atomic():
        user_model = User.objects.create_user(
            nfkc_username=nfkc_username,
            nfc_username=nfc_username,
            nfkc_primary_email_address=nfkc_email_address,
            password=unsafe_password,
        )
        user_model.full_clean()
        user_model.save()

        update_or_create_email_address(
            user_model,
            nfc_email_address,
            nfkc_email_address,
            is_primary=True,
            is_verified=False,
        )

    communication_service.send_user_account_activation_email(user_model=user_model)

    # Return an auth token so that the front end doesn't need to do another round trip to log in the user.
    auth_token = token_utils.manually_generate_auth_token(user_model)

    return ( user_model, auth_token, )
At first this looks pretty long, but it's actually only doing a handful of things:
- Lines 4 - 11: Validate that user input is in the correct format. E.g. ensure that the password has enough characters, the email address looks at least something like an email address, etc. In this case the validator is a Marshmallow schema rather than a DRF serializer (which is why the view catches marshmallow.exceptions.ValidationError). I'll discuss the pros and cons of using serializers later. If there is at least one validation error, the validator raises an error whose value is a dictionary containing all the validation errors.
- Lines 13 - 17: Normalize any Unicode text that needs to be normalized, e.g. so that things like usernames and email addresses can be stored as both NFC and NFKC. Basically NFC is how you normalize text that's meant to be displayed to a user on the web, and NFKC is how you normalize text that's used for searching and guaranteeing uniqueness.
- Lines 19 - 26: Raise the first business logic error that occurs, if there are any. You can wrap the atomic transaction block in a try statement if you care about delivering a proper error message in the event of a race condition; in most cases though, it's better to just make your peace with delivering non-specific error messages during race conditions, in favor of easier-to-read code.
- Lines 28 - 44: Persist the data to the database.
- Line 51: Return a successful response to the view.
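The NFC versus NFKC distinction from the normalization step is easy to see with the standard library alone. The sketch below uses made-up sample strings: NFC leaves visually distinct "compatibility" characters (here, fullwidth letters) alone, while NFKC plus casefold() collapses them into a single key suitable for uniqueness checks. The helper names nfc and nfkc_key are hypothetical.

```python
import unicodedata


def nfc(text):
    # Canonical composition: suitable for text that is displayed back to users.
    return unicodedata.normalize("NFC", text)


def nfkc_key(text):
    # Compatibility composition plus case folding: a key for uniqueness checks.
    return unicodedata.normalize("NFKC", text).casefold()


decomposed = "Jose\u0301"                # "e" followed by a combining acute accent
precomposed = "Jos\u00e9"                # the single code point for "é"
fullwidth = "\uff2a\uff4f\uff53\u00e9"   # fullwidth "Jos" followed by "é"

print(nfc(decomposed) == precomposed)                # True
print(nfc(fullwidth) == precomposed)                 # False: NFC keeps fullwidth letters
print(nfkc_key(fullwidth) == nfkc_key(decomposed))   # True: both become "josé"
```

Without the NFKC key, the fullwidth spelling could be registered as a separate username that looks nearly identical to an existing one.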
The reason I like putting the business logic into services is that it maximizes for lucidity. As with our views, every service method should tell a story about the most important things that are happening. There are some places where listing out the full step-by-step minutiae would take away from the story by making it more difficult to understand; we prevent this from happening by using helper methods to encapsulate the details. An example of this is the private helper method _is_account_being_created_automatically, where we purposely excluded the details from the main service method to help make the big picture clearer. Note that it gets exponentially harder to understand code that descends through function after function, so try to avoid having your helper functions call other helper functions. For endpoints with thousands of lines of business logic, this is sometimes unavoidable, or at least the least-worst way forward. But if the percentage of endpoints with deeply nested logic starts to get non-trivial, there may be a serious architecture problem that should be remediated.
So why use services rather than putting the business logic into fat models, or model managers?
As you can see above, very little of the business logic has to do with the database. In theory we could put only that part of the logic on the user model, but we shouldn't — that's the most important part of the story, and we don't want to hide it inside another function! Why not copy-paste this entire function onto the model? For a few reasons:
- The way you configure Django models is by setting properties and overriding methods. When you mix business logic with Django configuration, it gets super difficult to know which is which. E.g. when you're looking at a property on the class, is that a configuration option for the model that comes from Django, or just some variable that the application is using to hold state? Mixing application code with Django code is confusing, especially for anyone without multiple years of Django experience.
- There's no good way to interact with more than one model in a single endpoint. Here is a must-read explanation of this problem. Consider that in the code above, not only are we creating the user model, but we're also storing the email address in a second model, and storing the notification preferences in a third model. The way we wrote the business logic in our service is crisp and logical, whereas trying to do the same thing using a "fat models" approach would immediately break encapsulation, lead to tons of duplicated code, and just generally make the codebase much more difficult to understand and work with.
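As a rough illustration of a single service function coordinating writes to more than one table, here is a sketch that uses the standard library's sqlite3 module in place of the Django ORM so that it runs on its own. The table names, columns, and the simplified create_account signature are hypothetical. It also shows the optional try wrapper for translating a race-condition failure into a domain error.

```python
import sqlite3


class UsernameAlreadyExistsError(Exception):
    pass


def create_schema(conn):
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE email_addresses ("
        "id INTEGER PRIMARY KEY, "
        "user_id INTEGER NOT NULL REFERENCES users(id), "
        "email_address TEXT NOT NULL)"
    )


def create_account(conn, username, email_address):
    """Hypothetical service function that writes to two tables atomically."""
    try:
        # Used as a context manager, the connection commits on success and
        # rolls back if the block raises, so both inserts land or neither does.
        with conn:
            cursor = conn.execute(
                "INSERT INTO users (username) VALUES (?)", (username,)
            )
            conn.execute(
                "INSERT INTO email_addresses (user_id, email_address) VALUES (?, ?)",
                (cursor.lastrowid, email_address),
            )
    except sqlite3.IntegrityError as e:
        # The UNIQUE constraint on username is the expected cause here; a real
        # implementation would inspect the error before translating it.
        raise UsernameAlreadyExistsError(username) from e
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")
create_schema(conn)
print(create_account(conn, "alice", "alice@example.com"))  # 1
```

The caller sees one function and one domain error, and neither table is touched directly from the view.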
Putting the business logic in serializers (called forms in vanilla Django) has more-or-less the same problems as putting business logic in models, but even worse. First, DRF serializers are more difficult to understand than models to begin with just due to having a weird API; adding business logic into the mix makes it that much harder to understand both what the serializer is doing and what the business logic is doing. You also lose most of the potential for your business logic to be reusable by coupling it with the code that validates the user input for a specific endpoint, as explained in the next section.
This Hacker News comment summarizes the situation well:
The "Fat Models" recommendation is one of the most destructive in my opinion, along with Django Rest Framework "Model Serializers". A JSON serializer that talks directly to the database is just madness.
Rule #3: Make services the locus of reusability
Service methods should be reused across your project wherever the same functionality is needed — e.g. in other views, services, and the Django Admin.
For example, in account_management_service.create_account, there was a call to the communication_service.set_default_communication_preferences method. For most SaaS startups, there are going to be many places and ways that communication preferences can be created or updated, and lots of business logic related to sending out emails, SMS, RCS, etc. We don't want developers to duplicate code each time they need to perform one of these business actions, nor should they need to understand the implementation details of the communications infrastructure in order to update preferences, send messages, etc.
Given these considerations, it makes sense to encapsulate all of this logic in one place, e.g. communication_service.py. Once we've made the decision to encapsulate our communication logic this way, we wouldn't then want to directly interact with the CommunicationPreference model from account_management_service.create_account, because this would break that encapsulation. Rather, we treat each service as a public interface that hides the implementation details.
Note that I'm definitely not saying that each model's state should only ever be read or modified by one service file. To give a real example of a case where one model is updated by many different services, in the code for FWD:Everyone there are several places where an EmailThreadModel is retrieved or updated:
- In upload_thread_service.py, which contains the business logic involved with uploading a new email thread.
- In read_thread_service.py, which contains the business logic for retrieving a thread from the database and displaying it to the user.
- In content_discovery_service.py, which recommends email threads that users might be interested in reading.
The reason this works is that each of these services is purposely encapsulating some subset of the business logic for interacting with an EmailThreadModel, rather than breaking encapsulation.
What you don't ever want to happen, and what services help us avoid, is needlessly duplicating large chunks of business logic multiple times. E.g. if your app also allows user accounts to be created using the Django admin, we wouldn't want to do this by recreating some version of the logic from create_user. Instead, users should be created using exactly the same code, so as to keep the code maintainable, not introduce bugs, and not require testing in multiple places. If there are some slight differences in the required business logic for when a user account is created via the website vs. using the admin, then the create_user method should take some extra keyword arguments to account for this.
Similarly, if there were no communication_service.py then it would be fine to put the methods for setting communications preferences in account_management_service.py; but once we've already created communication_service.py then that's where this business logic should live, and we shouldn't needlessly duplicate it elsewhere.
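The idea of one service method serving both the website and the admin, with keyword arguments covering the small differences, can be sketched without any framework. Everything below is hypothetical: the in-memory USERS and SENT_EMAILS stores, the created_by_admin flag, and the rule that admin-created accounts skip the activation email are assumptions made for illustration.

```python
# Hypothetical in-memory stand-ins for the database and the email backend.
USERS = {}
SENT_EMAILS = []


def send_user_account_activation_email(username):
    SENT_EMAILS.append(username)


def create_account(username, email_address, *, created_by_admin=False):
    """Single code path for account creation, whoever the caller is."""
    if username in USERS:
        raise ValueError("username already exists")
    USERS[username] = {
        "email_address": email_address,
        # Assumption for illustration: admin-created accounts are pre-verified.
        "is_verified": created_by_admin,
    }
    if not created_by_admin:
        send_user_account_activation_email(username)
    return USERS[username]


# Called from the public sign-up view:
create_account("alice", "alice@example.com")
# Called from an admin action, reusing exactly the same logic:
create_account("bob", "bob@example.com", created_by_admin=True)

print(SENT_EMAILS)  # ['alice']
```

Because both callers go through the same function, a fix to the duplicate-username check or the email logic only has to be made and tested once.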
Rule #4: Always sanitize user input, sometimes save raw input, always escape output
Always sanitize user input before doing anything else with it — validation, business logic, storage, presentation, etc.
What exactly does this mean? Let's start by going over some basic definitions for terms relating to cleaning unsafe data: filtering, escaping, sanitization, and validation.
By way of example, as a kid I was always told that it's never safe to put metal in the microwave. But as an adult with entirely too much Internet access, I now know that, strictly speaking, this isn't true; it's ok to put steel bowls in the microwave if they don't have sharp edges, they aren't touching the sides, and they're filled with water or food. Some definitions:
- Filtering removes unwanted components from user input to prevent them from entering a system. In our microwave example, a filtering strategy might be to check each item going into the microwave to ensure that it isn't metal. In a web application, an example of filtering would be using a security library like bleach to remove any HTML tags and attributes that haven't been allowlisted. Filtering is not reversible; once we've discarded user input, it's lost for good, except for where we've opted to save raw input before filtering.
- Escaping means to encode user input to make it safe to use within a given context. Granted, encoding is kind of a tricky concept to understand on its own, but essentially it's a way to mitigate the risk of dangerous content that wasn't removed via filtering. With respect to safely using a microwave, an escaping strategy might be to always ensure that containers don't have sharp edges, that they're filled with water or food, and that they aren't touching the sides of the microwave regardless of what they're made of. In the context of web applications, escaping most often means encoding user data so that it's safe to insert into the contents of HTML elements. Escaping is reversible, so there's no risk of data loss.
- Sanitization is a combination of filtering and escaping. Specifically, it means to first filter user input and then escape it. It's not reversible, since it involves filtering.
- Validation refers to enforcing business rules that are unrelated to security. E.g. we might have a rule that the microwave should only be used to heat up hot chocolate, but this is unrelated to microwave safety. Similarly, in web applications we might validate that usernames should only be comprised of letters and numbers. Validation rules often prevent certain types of security problems, but even in cases where they would prevent an attack from succeeding, there are good reasons (explained below) why validation shouldn't be used as a replacement for sanitization.
There's a lot of nuance here that you need to understand; start with the XSS Prevention Cheat Sheet. But hopefully you now have some basic understanding of what these terms mean.
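The definitions above can be demonstrated with the standard library. This is only a sketch to show the concepts: the tag-stripping filter built on html.parser is a deliberately blunt stand-in chosen so the example runs without dependencies, and it is not a substitute for a maintained library such as bleach. The function names filter_tags, escape, and sanitize are hypothetical.

```python
import html
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and drops every tag (a deliberately blunt filter)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def filter_tags(unsafe_text):
    # Filtering: discard markup entirely. The original tags cannot be recovered.
    parser = _TextOnly()
    parser.feed(unsafe_text)
    parser.close()
    return "".join(parser.chunks)


def escape(text):
    # Escaping: encode special characters for use inside HTML content.
    return html.escape(text, quote=True)


def sanitize(unsafe_text):
    # Sanitization: filter first, then escape whatever survived.
    return escape(filter_tags(unsafe_text))


print(filter_tags("<b>Hello</b> world"))      # Hello world
print(escape('Fish & "Chips"'))               # Fish &amp; &quot;Chips&quot;
print(html.unescape(escape('Fish & "Chips"')))  # Fish & "Chips"
```

Escaping round-trips through html.unescape, while two different inputs such as "<b>Hello</b>" and "<i>Hello</i>" filter to the same string, which is why filtering (and therefore sanitization) is lossy.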
So, why is it so important to sanitize input before doing anything else? For three reasons:
- So that you don't forget to sanitize it later.
- It's much faster (read: costs less money) to audit the codebase for XSS and other security vulnerabilities — if the data isn't sanitized in the first couple lines of code then we can just say it's a security issue that needs to be remediated, no need to read any further. Reading code is slow (expensive), so aggressively minimizing the amount of code we need to read to know whether user input has been sanitized correctly can save a ton of money over the lifetime of our projects.
- If there's ever a mistake where someone uses unsanitized data in a context where it needs to be sanitized, it will be immediately obvious because of our convention of prepending unsafe_ to all variables storing user input before that data has been sanitized. Joel Spolsky discusses the value of this convention in his blog post Making Wrong Code Look Wrong.
Some people argue that you should escape output rather than sanitizing input. The basic arguments for this position are that:
- Ensuring user input is safe only makes sense with respect to where that data is used. For example, text being displayed within an HTML element needs to be sanitized differently than text being used within a command-line program. Because of this, data should be escaped as close to where it is used as possible.
- Sanitizing data is lossy, and you should never get rid of user data in case you make a mistake or want to use it differently later.
Both of these arguments are correct, so there's no doubt that in a perfect world it would make more sense to escape output rather than sanitizing input. But in the real world, I don't find these benefits compelling enough to outweigh the risk of forgetting to clean data or the increased costs of auditing code; for startups, and for web applications specifically, the most common areas where user data gets used (and needs to be sanitized) are:
- HTML element content
- JavaScript data values
- CSS property values
- URL parameter values
- HTML attribute values
- Command-line tools
- Email headers and bodies
But for most startups, 99% of the time you'll only be doing the first two, so just sanitizing data to be used within HTML elements and in between script tags is enough. In the exceedingly rare cases where we might want to allow a user to customize their CSS or the subject of an email or whatever, user data should be escaped separately for that purpose. Perhaps there is some increased risk of ending up with Mojibake in your output due to accidentally double escaping your data. But as per this blog post, "double-escaping is embarrassing, but not escaping at all can be deadly."
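To illustrate why escaping depends on the context where data is used, and what double escaping looks like, here is a small standard-library sketch. The sample string is made up, and the three contexts shown (HTML content, a URL parameter value, a POSIX shell argument) are just a subset of the list above.

```python
import html
import shlex
from urllib.parse import quote

unsafe_text = "Tom & Jerry's <script>"

as_html_content = html.escape(unsafe_text)      # for HTML element content
as_url_parameter = quote(unsafe_text, safe="")  # for a URL parameter value
as_shell_argument = shlex.quote(unsafe_text)    # for a POSIX shell command line

print(as_html_content)   # Tom &amp; Jerry&#x27;s &lt;script&gt;
print(as_url_parameter)  # Tom%20%26%20Jerry%27s%20%3Cscript%3E

# Escaping twice corrupts the displayed text: "&" ends up as "&amp;amp;".
print(html.escape(html.escape("a & b")))  # a &amp;amp; b
```

Each context has its own encoding, which is why the rarer cases need their own escaping step rather than reusing the HTML one.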
And with respect to the risk of losing user data due to sanitizing it incorrectly or later needing to sanitize it for a different context where the original data would have been useful, go ahead and save raw input in the specific cases where it's likely to be needed; I'd just recommend against saving it in every case by default. But do so in a way that preserves the ability to cheaply audit your codebase, e.g. by storing it in a different table that can't be read from in production.
Again, the one common exception to this rule is with passwords; they're only stored as hashes so there isn't any possible XSS vector, now or in the future, and accidentally stripping complexity from a password would be its own security risk.
Rule #5: Don't split files by default & never split your URLs file
For those who don't use Django, all of the code that shares the same configuration settings is called a project. This usually means your entire website or API, or else multiple different websites or APIs that are hosted together on the same servers and use the same configuration. Within a project, one approach to managing your codebase is to split your source code into multiple apps, which are basically just folders dedicated to different chunks of functionality — e.g. user account management, sending emails and SMS, completing a purchase, etc.
Here is an example of what that might look like:
project/
    config/
        ...
    users/
        __init__.py
        account_management_views.py
        user_models.py
        account_management_service.py
        user_admin.py
        user_urls.py
        user_tasks.py
        test_account_management_views.py
    communications/
        __init__.py
        communication_views.py
        communication_models.py
        communication_service.py
        communication_admin.py
        communication_urls.py
        communication_tasks.py
        test_communication_views.py
    checkout/
        __init__.py
        checkout_views.py
        checkout_models.py
        checkout_service.py
        checkout_admin.py
        checkout_urls.py
        checkout_tasks.py
        test_checkout_views.py
    utils.py
The second approach is to just put all of your Django code into one big app that has different subfolders for views, services, tests, and whatever other types of file types (e.g. models, admin, etc.) get long enough to justify splitting into multiple files, like so:
project/
    config/
        ...
    app/
        views/
            __init__.py
            account_management_views.py
            communication_views.py
            checkout_views.py
        services/
            __init__.py
            account_management_service.py
            communication_service.py
            checkout_service.py
        tests/
            __init__.py
            test_account_management_views.py
            test_communication_views.py
            test_checkout_views.py
        models.py
        admin.py
        urls.py
        tasks.py
    utils.py
The idea here is that views/account_management_views.py contains the views that would otherwise be in users/account_management_views.py, services/account_management_service.py contains the service methods that would otherwise be in users/account_management_service.py, etc. From a technical perspective, there is zero difference between these two approaches.
So is one approach better, and does it matter which one you choose?
The answer is that either approach can be definitively better, depending on the size and complexity of the codebase. For brand new apps, I usually recommend putting all your code into one big app. The reason is that structuring a new startup into multiple apps right from the beginning results in dozens of files that each have little or no code. Dividing code across multiple files for no good reason wastes enormous amounts of developer time and confers no benefits — not only is it harder to find the code you're looking for, it's also harder to keep everything in your head when every task requires remembering what's in each of many different open tabs. It's a literal memory leak.
As an example of how this wastes developer time, consider the perspective of a developer who has just been tasked with solving an API bug in a codebase they've never worked with. The first thing they need to do is to locate the source of the bug. The process for doing so usually looks something like:
- Go to the page on your website that has the bug.
- Open up the Network tab in Chrome's Developer Tools.
- Find the endpoint that's returning incorrect results.
- Locate that URL in your backend code.
- Find the view that corresponds with that URL.
If you just have one file with all the URLs for your entire project then this process should be seamless, and you are hopefully able to find the right URL almost immediately. Whereas if a project's URLs are spread across dozens of files that each have only a handful of URLs, then at best there is no good way to get an overview of the entire project, and at worst it may end up taking hours looking through different files just to find the right view. Even if you have hundreds of endpoints, there is rarely a good reason to split your URLs across multiple files; there is no advantage and it just ends up wasting tons of developer time over the course of a year. Instead, just have one urls.py file for the entire project, with a blank line between the URLs for each views file, e.g.:
view1/endpoint1
view1/endpoint2
view1/endpoint3

view2/endpoint1
view2/endpoint2
view2/endpoint3

view3/endpoint1
view3/endpoint2
view3/endpoint3
For brand new projects, put all your Django code into one big app, and split your views, services, tests, etc., across multiple files when necessary.
So, now that we've seen the downsides of prematurely dividing your project into multiple apps, what are the benefits of eventually doing so? Grouping large chunks of functionality into apps can make it easier to:
- Understand how complex codebases work
- Find all the files related to an app's functionality
- Have discussions and make decisions about encapsulation and reusability
It's important to understand though that just because you split a project into multiple apps, it doesn't mean that these apps are automatically encapsulated and reusable. Rather, this is something you need to purposely design and work toward, not something that comes automatically from typing python manage.py startapp. This is a common misconception, and it's important to call out so that we make well-informed decisions based on each approach's actual pros and cons.
So when should you start to split up your project into multiple apps? I think the best proxy to use is your models.py file. Once this is big enough that you need to start splitting out chunks of your models into their own files, at that point it might make sense to start moving the views, services, and tests associated with those models into their own app. This isn't an absolute rule, rather it's just an observation that may be useful to keep in mind. Another situation where splitting code into its own apps is useful is when you have copy-pasted code with slight variations for different customers. In this case, it can be useful to isolate any forked models and functions that will now be maintained independently, in order to minimize confusion.
But even when you do eventually divide your project into multiple apps, it's almost never a good idea to split your URLs file.
Readability
When development velocity grinds to a halt because of "technical debt," this is almost always just a euphemism for developers not being able to read each other's code.
In fact this is by far the biggest technical problem that most startups face, one that most style guides at best don't sufficiently address, and at worst actively encourage. To quote Google's style guide: "Assume the person reading the code knows Python better than you do."[5] This might be good advice if your company is literally Google, but for most startups this kind of thinking is a surefire recipe for running out of money before achieving product-market fit. Even if every developer working on your project right now is a "world's-best" developer, this may well not be true of the folks that inherit the codebase in a few years.
I'd strongly recommend that if anything, startups should follow the exact opposite approach to coding: Code should be written for the least technical person in the office. Assume the next person to read your code will be a junior developer, it's their first day in the office, and they have more-or-less no idea what the business actually does.
Because very often, this is actually the case.
The best way to prevent development velocity from slowly grinding to a halt is taking an intellectually honest inventory of all the little things that can make code slow to work with, and then preventing these things from happening within your codebase.
Rule #6: Each variable's type or kind should be obvious from its name
When you have a variable with a name like user_profiles, how do you know whether it's a list, a dictionary, a set, an object, or something else entirely?
The answer is you can't. Determining the kind of thing a variable contains should be the easiest and most trivial task in the world. But for code that isn't well-architected, there's no upper limit on the number of billable developer hours this can waste. In the best-case scenario, you just need to scroll up to see where in the function that variable is defined, or look at the type hint in the function signature. Even though this is easy, it still wastes time and makes the code much more difficult to read by needlessly adding cognitive overhead.
And that's the best-case scenario.
The problem gets much worse when a variable wasn't declared locally, but was instead passed into the function as a parameter with no type hint. The first thing to try is dropping a debugger into the code and running the tests. But if that code doesn't have test coverage, you then need to figure out where the function you're reading is being called from; this is often difficult, and sometimes needs to be done recursively. The problem compounds when you have lots of variables whose types aren't clear or intuitive, because it's easy to get caught in a loop of forgetting variable types you've already traced out while trying to make sense of the next one.
The point is that, even in the best case you're needlessly adding cognitive overhead. And in the worst case, you're adding hours or even days to what might otherwise be trivial tasks.
You can ameliorate this situation by using type hints, but these are still half-baked; right now mypy doesn't have official stubs for most libraries, and using community-contributed stubs risks making it difficult or impossible to later update your dependencies. And if you don't use something like mypy and are just using Python type hints without validating them, you risk writing them incorrectly. This is something you might not realize until years later when you've already made a mess of your codebase. And Python type hints aren't even a great solution. They're nice for simple built-in types like string or int, but as soon as you need to start passing around instances of more complicated objects and classes the code can get super messy and unreadable.
Much better to just entirely avoid this problem in the first place. The solution that works well in my experience is just enforcing some basic naming rules for variables. These rules use a version of Hungarian Notation for the most-commonly used types and kinds in web development. Joel Spolsky has a more in-depth explanation of types and kinds, as well as the benefits of Hungarian notation, but as a quick overview:
Types — primitives and data structures that correspond directly to concepts from mathematics and computer science. The most-common examples in Python are the built-in primitives (e.g. int, string) and the built-in data structures (e.g. lists, dicts). Of course types don't have to be built-ins, e.g. a trie would be a type, even though tries aren't part of the Python standard library.
Kinds — objects that encapsulate related functionality or business logic. You should create naming conventions for the kinds within your codebase whenever A) you have many objects that have the same or similar functionality B) these objects are all instances of the same class, or else their classes are all subclasses of some common ancestor class. For example, commonly used kinds in Django include models and querysets.
When a variable is a commonly-used kind, the kind should be obvious from that variable's name. And when a variable is a type (but not a kind), its type should be obvious from its name.
Rules for naming kinds
- All Django model instances should end in _model, for example user_model.
    - If you have a function with multiple model instances of the same type, just name each one something logical that follows the same general scheme. For example, for a social network endpoint that allows users to follow one another, good variable name choices might be something like following_user_model and followed_user_model.
- All variables containing a Django model id should end with _model_id. For example, user_model_id.
- All Django querysets should end in _queryset, for example active_user_queryset.
- Instances of serializers should end in _serializer, for example email_address_serializer.
- Date fields on models should start with date_, for example date_user_last_logged_in. Datetime fields on models should start with datetime_.
Rules for naming types
- Python lists should end in _list. This should be prepended by the type of variable the list contains, for example user_model_list.
- Python sets should end in _set.
- Python dicts should end in _dict. This should be prepended by what the dict is mapping, for example username_to_user_profile_dict.
    - If you can't think of a logical name for your dict, consider choosing something different for your keys or values, even if it means passing around some unused data.
- Booleans (and functions that return booleans) should start with some form of the verb to be, for example is_public or has_disposable_email_address.
For one-off classes that aren't types or kinds, instances of MyPythonClass should be called my_python_class_instance.
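To make the rules above concrete, here's a minimal sketch of what they look like in practice. The User dataclass is just a hypothetical stand-in for a Django model so that the snippet runs on its own, and the function names are made up for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical stand-in for a Django model (an assumption for this sketch)."""
    id: int
    username: str
    is_public: bool


def get_username_to_user_model_dict(*, user_model_list):
    # The dict's name says what maps to what: username -> user model.
    return {user_model.username: user_model for user_model in user_model_list}


def get_public_username_set(*, user_model_list):
    # A set of usernames, so the variable that receives it ends in _set.
    return {
        user_model.username
        for user_model in user_model_list
        if user_model.is_public
    }


user_model_list = [
    User(id=1, username="ada", is_public=True),
    User(id=2, username="grace", is_public=False),
]
username_to_user_model_dict = get_username_to_user_model_dict(
    user_model_list=user_model_list
)
public_username_set = get_public_username_set(user_model_list=user_model_list)

# Two instances of the same model in one scope get descriptive prefixes.
following_user_model = username_to_user_model_dict["ada"]
followed_user_model = username_to_user_model_dict["grace"]
followed_user_model_id = followed_user_model.id
```

Every name tells you what it holds without scrolling up to find out, which is the whole point.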
These naming rules aren't meant to cover 100% of cases, rather they're meant to make the majority of your code easier to read. For the variables whose names aren't covered by these rules or where it's ambiguous, just use your best judgment. The goal is to be obsequious to your future self and your coworkers by making your code as easy as possible to understand.
I find that it's rarely difficult to figure out what's a string, int, or float, since those usually have names like username or DAYS_PER_WEEK; because they aren't pluralized collections, there isn't nearly as much room for ambiguity. But for cases where it's not obvious, follow the same pattern and append _str, _int, or _float. This is often useful in cases where the colloquial name for some concept is actively trying to trick you. For example, because serial numbers are generally strings rather than numbers, consider using serial_number_str instead of serial_number. It might be weird seeing variables like phone_number_str or confirmation_number_str in your codebase, but it's better than having the next person waste a bunch of time trying to figure out why the code is failing when they pass it an integer.
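As a quick illustration of why the suffix earns its keep, here's a small sketch. The format_serial_number_label function and the sample value are hypothetical; the point is just that a "number" with leading zeros can't survive a round trip through int:

```python
def format_serial_number_label(*, serial_number_str):
    # Relies on string behavior: strips whitespace and keeps leading zeros.
    return "SN-" + serial_number_str.strip()


serial_number_str = "00417"
serial_number_label_str = format_serial_number_label(
    serial_number_str=serial_number_str
)

# Treating the serial number as an int silently drops the leading zeros.
serial_number_int = int(serial_number_str)
is_round_trip_lossless = str(serial_number_int) == serial_number_str
```

Here is_round_trip_lossless ends up False, which is exactly the kind of bug the _str suffix warns the next developer about.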
Although most variables can (and should) be named in a very straightforward way, there's more room for creativity when naming features, apps, services, methods, models, model fields, etc. If you're going to fight about anything as a team, you should fight about what things should be called. Different words have different meanings and connotations to different people, and if what people think something does is different than what it actually does then that costs money.
It's always worth having everyone give their thoughts on names for high-level features during architecture discussions. And then during code reviews, have the reviewer(s) just flag everything they disagree with; then in cases where the code author disagrees with the reviewer, have a meeting every week or two to go through all of the names that people have disagreed about in person. Spending the nominal amount of time it takes to make sure things are named in a way that actually means something to people is one of the highest-ROI activities you can do together as a team.
Rule #7: Assign unique names to files, classes, and functions
With the exception of local variables inside functions and methods, each named entity in your codebase should have a unique name.
Why?
Let's explain by way of some examples. First, imagine your project has a users app that contains a file called test_account_management_views.py. If you want to run the tests in that file (super common), you can do so from the command line by typing python manage.py test test_account_management_views.
Easy, right?
Now let's contrast that workflow with what performing the same task might look like if each of your apps has a file named test_views.py:
- Scroll through several different open tabs at the top of your text editor called test_views.py while looking for the right one.
- Hover your mouse over the tab so that you can see the full path to the file.
- Tab back and forth between your text editor and the command line a couple times while trying to type out the full path to the file so that you can run the tests: python manage.py test users.tests.test_views
Different IDEs sometimes attempt to ameliorate the problem of having multiple tabs with the same name, but each approach has some inherent drawbacks:
- If your IDE does nothing and each tab shows only the filename, you'll have a bunch of open tabs with the same name. This means that either A) you don't know which tab is which or B) you need to waste mental bandwidth remembering.
- Whereas if your IDE prepends part of the file path to give each tab a unique name, e.g. users.tests.views.py and communications.tests.views.py, then these ridiculously long names cause your other open tabs to get truncated or hidden.
Many text editors do let you manually rename tabs, but this wastes time and mental bandwidth each time you open your files.
Either way, it's a lose-lose situation. Much better to just give your files names like user_tests.py and communication_tests.py, so that way you have complete control over how they appear in your text editor. This might seem weird if you've split your project into multiple apps, since each filename then starts by repeating the name of its parent folder. But it's better than wasting a ton of time each day looking through irrelevant code because your tabs aren't properly named, needing to type out the full path of a file just to run its tests, etc.
The reason for not having classes with the same name is greppability. Whenever I encounter a new class in a codebase that I'm not familiar with, the first thing I want to see is a list of all the places where that class is being used. Not only is this essential for knowing whether there might be an opportunity for refactoring, just being able to see the different contexts where a class is used is often helpful for understanding what it does in the first place. And as I'll explain later (rule #9), classes in Python aren't properly encapsulated. So being able to easily trace the chain-of-custody for an object, from its creation to wherever you first encounter it in the code, is essential for understanding its state and properties.
When I do Cmd-Shift-F in Sublime and type in the name of a class, the only things I want to see are where that class is defined and then all the places where it's used. Having ten other classes with the same name, each defined in different files, makes this much harder; it adds a little complexity to finding where the class you want to look up is defined, and then a whole lot of complexity to figuring out where it's used versus where you're just seeing other different classes with the same name in the search results.
Identically named functions are even worse. Not only can you have functions with identical names defined in different files, but in some languages you can even have multiple functions with the same name in a single file, just with slightly different function signatures. Especially as a junior developer, it was often bewildering to look at overloaded function calls and then try to remember all the rules governing the interactions between different argument types, in order to figure out which of the functions in question was being called. You can't always just drop a print function or a debugger in the code to figure it out. And once you do figure it out, that's now just one more thing you need to keep in memory.
Frankly, even variables, classes, and functions with similar names can be confusing. After all, who among us hasn't gotten into an awkward situation after forgetting the difference between androgyny, anadromy, andragogy, and a dromedary? This is one of the reasons that for pluralized variables, I like names like user_model_list rather than user_models, because the latter is too similar to user_model.
The big picture is that once you need to start devoting mental bandwidth to sorting out these situations in the codebase, this quickly replaces the underlying business problem as the thing you're now trying to solve.
There are certain situations where it does make sense for class methods to have the same name; for example, if there are multiple child classes that inherit from a common parent and share a common interface. And with things like __init__.py, you don't have any choice. But for the most part, even if you're purposely trying to draw attention to some aspect of parallelism in the codebase there are better ways to do it, and enforcing unique names for files, classes, methods, and functions results in a cost-free improvement in productivity.
Rule #8: Avoid *args and **kwargs in user code
Whenever calling a function, prefer named keyword arguments over positional arguments, and prefer positional arguments over dynamic arguments. As an example of what I mean, consider three possible ways to call a service method:
Good function call: account_management_service.get_user_profile(include_profile_image=True, include_email_address=False)
Bad function call: account_management_service.get_user_profile(True, False)
Terrible function call: account_management_service.get_user_profile(**request.data)
The second and third function calls are not only less readable, but also less secure; if you accidentally mix up the two boolean parameters, it's impossible to see the mistake from where the function is called. This makes it much more likely that private data will accidentally get exposed. In contrast, if we see a call that looks like my_service.my_function(include_private_information=True), and we know that private information isn't supposed to be getting returned in this context, then anyone can immediately see that there's a problem just by glancing at the code.
As a general rule, always pass function parameters via named keyword arguments, with the one possible exception being cases where doing so would make the code less readable. And when defining functions that have parameters with privacy or security implications, always use Python's keyword-only feature to enforce that these parameters can only be specified as keyword arguments.
The reasons you should generally prefer keyword arguments are that they:
- Make it clear what variables are being passed to the function. When this isn't clear from reading only the function call, your developers need to waste time figuring it out and then waste cognitive capacity keeping this knowledge in working memory, both of which substantially slow down development velocity.
- Make it easier to see whether parameters have been sanitized for XSS and other vulnerabilities.
- Maximize the likelihood of your tests breaking if the wrong parameters get passed to the function, e.g. if a bug gets introduced on the front end or if the implementation of the function changes such that it now requires different parameters. Remember, if your application is broken then you want your tests to fail, that's literally what they're there for.
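Here's a minimal sketch of the keyword-only feature mentioned above. The body of get_user_profile is a made-up placeholder (the real service method would query the database); the part that matters is the bare * in the signature, which makes Python reject positional calls:

```python
def get_user_profile(*, user_model_id, include_profile_image, include_email_address):
    # Everything after the bare * can only be passed by keyword.
    # The returned data is placeholder content for this sketch.
    user_profile_dict = {"user_model_id": user_model_id}
    if include_profile_image:
        user_profile_dict["profile_image_url"] = f"/images/{user_model_id}.png"
    if include_email_address:
        user_profile_dict["email_address"] = f"user{user_model_id}@example.com"
    return user_profile_dict


# The readable call: every flag is spelled out at the call site.
user_profile_dict = get_user_profile(
    user_model_id=7,
    include_profile_image=True,
    include_email_address=False,
)

# The positional form raises TypeError before the function body ever runs.
try:
    get_user_profile(7, True, False)
    is_positional_call_rejected = False
except TypeError:
    is_positional_call_rejected = True
```

With this in place, nobody can write the "bad function call" from above even by accident, so a mixed-up boolean has nowhere to hide.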
That said, there are some settings where it's entirely appropriate to use *args and **kwargs. One example is when you're actually doing something dynamic, such as dynamically setting model attributes:
def _save_model_attributes_dynamically(model_instance, **kwargs):
    for kwarg, value in kwargs.items():
        setattr(model_instance, kwarg, value)
    model_instance.save()
    return model_instance
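As a rough usage sketch for the helper above: FakeUserModel is a hypothetical stand-in for a Django model whose save() only counts calls, so the snippet can run without a database. The helper itself is repeated unchanged so the example is self-contained:

```python
class FakeUserModel:
    """Hypothetical stand-in for a Django model; save() only counts calls."""

    def __init__(self):
        self.username = ""
        self.is_public = False
        self.save_call_count = 0

    def save(self):
        self.save_call_count += 1


def _save_model_attributes_dynamically(model_instance, **kwargs):
    # Set each keyword argument as an attribute, then save once at the end.
    for kwarg, value in kwargs.items():
        setattr(model_instance, kwarg, value)
    model_instance.save()
    return model_instance


user_model = _save_model_attributes_dynamically(
    FakeUserModel(),
    username="ada",
    is_public=True,
)
```

Note that the set of attributes really is dynamic here, which is what justifies **kwargs; if the caller always passes the same two fields, plain keyword parameters would be clearer.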
More good use cases for *args and **kwargs include when subclassing an object in a library to change some behavior, or when building a library that allows for behavior to be customized by end users. These are powerful features of Python and it's great that they're there, but if you're using them then make sure you have a real reason; don't just use them everywhere because you think it makes the code look cool. Using these advanced features unnecessarily wastes a ton of time and money, both in terms of slowing down development velocity in the short term and in the time it will take to remediate the problem later.
See also items 19 and 21 in Effective Python, for some additional rules and perspective.
Rule #9: Use functions, not classes
Unlike languages like Java and Swift, which were designed from the ground up around object-oriented programming, in Python OOP feels like more of an afterthought. All of the things that make Python a great language make working with classes and objects in Python bad.
Why?
In non-OOP contexts, Python's approach of being strongly, dynamically typed works pretty well. Clearly the language would have been better had there been support for optional static typing from the beginning, rather than having it added as a kludgy hack decades later. But even if we don't get the full benefits of static typing (more useful IDEs, better performance, eliminating certain classes of errors, etc.), at least we mostly don't have to deal with the insane errors that you get with implicit type coercion in JavaScript.
In Python, most of the wat issues that exist have to do with scope and mutability; a good example is the well-known gotcha involving the use of mutable default arguments in function definitions.
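For anyone who hasn't run into the mutable default argument gotcha, here's a minimal sketch (the function names are made up for illustration). The default list is created once, when the function is defined, and then shared across every call that doesn't pass its own list:

```python
def append_tag_buggy(tag_str, tag_list=[]):
    # The default list is created once, at definition time, and then reused.
    tag_list.append(tag_str)
    return tag_list


def append_tag_fixed(tag_str, tag_list=None):
    # Use None as a sentinel and create a fresh list on every call instead.
    if tag_list is None:
        tag_list = []
    tag_list.append(tag_str)
    return tag_list


first_buggy_list = append_tag_buggy("a")
second_buggy_list = append_tag_buggy("b")  # same list object as the first call

first_fixed_list = append_tag_fixed("a")
second_fixed_list = append_tag_fixed("b")  # independent list
```

After this runs, both buggy variables point at the same list containing "a" and "b", while the fixed versions each hold only their own tag.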
So what exactly does this have to do with classes?
In Python, classes and objects are too dynamic and mutable for their own good. For example, consider this snippet:
>>> class Foo: pass
...
>>> Foo.field = 42
>>> x = Foo()
>>> x.field
42
>>> Foo.field2 = 99
>>> x.field2
99
In this case it's easy to understand what's going on. But that's because each of these state changes happens in only one place in the codebase, immediately after the class is defined and the object is instantiated, so the entirety of the code involving this class and its objects can be read and understood in a simple linear order. That stops being true once you start adding in confounding factors like:
- Multiple inheritance, multilevel inheritance, and multiple multilevel inheritance.
- Objects that are instantiated in one module and then imported into other modules.
- Classes or instances being monkey patched, including invisibly via the import process, audit hooks, the exec function, etc.
- Quirks introduced by @classmethod, @staticmethod, @property, magic methods, etc.
When someone inherits a large codebase that can't be understood by reading the code in a linear, step-by-step way, then that combined with the fact that classes and object state can be modified arbitrarily leaves us with no clear way to grep the code to find the source of state changes; this often makes it time-consuming to find the bugs, change functionality, add features, etc. The design of OOP in Python makes it very easy to get into a goat-stuck-in-the-tree situation, where it can be much easier to make a mess of something than to fix it. Unfortunately it's super easy to write Python code that's impossible for anyone else to understand, because the relationships between all the files and functions and objects and classes are too complicated to fit into anyone's working memory.
The Django documentation on testing is rather telling:
You may reverse the execution order inside groups using the test --reverse option. This can help with ensuring your tests are independent from each other.
If you actually stop to think about it, the fact that this even needs to exist is pretty messed up. Especially when you realize that most experienced Django developers have probably run into this issue and have had to use this feature, and that it may even be considered a best practice to run your tests both forwards and backwards each time on CI.
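To see how this kind of order dependence creeps in, here's a deliberately tiny sketch. It doesn't use Django's test runner at all; SignupCounter and the two check functions are hypothetical, and run_checks is just a stand-in for running a test suite in a given order:

```python
class SignupCounter:
    # Class attribute: shared by everything that touches this class.
    signup_count = 0


def check_first_signup():
    # Mutates shared class state as a side effect.
    SignupCounter.signup_count += 1
    return SignupCounter.signup_count == 1


def check_counter_starts_at_zero():
    # Only passes if nothing has touched the shared state yet.
    return SignupCounter.signup_count == 0


def run_checks(check_function_list):
    # Reset once per run, then execute the checks in the order given.
    SignupCounter.signup_count = 0
    return [check_function() for check_function in check_function_list]


forward_result_list = run_checks([check_counter_starts_at_zero, check_first_signup])
reverse_result_list = run_checks([check_first_signup, check_counter_starts_at_zero])
```

Both checks pass in the forward order, but the second one fails in the reverse order, purely because of state leaking through the class.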
Thinking about it, many of the problems that people have noted with using global variables also apply to Python classes:
- Non-locality — Source code is easiest to understand when the scope of its individual elements are limited. Global variables can be read or modified by any part of the program, making it difficult to remember or reason about every possible use.
- No Access Control or Constraint Checking — A global variable can be get or set by any part of the program, and any rules regarding its use can be easily broken or forgotten.
- Testing and Confinement — source that utilizes globals is somewhat more difficult to test because one cannot readily set up a 'clean' environment between runs.
If there were a way to reliably prevent people from using Python classes in crazy ways, even just some shared cultural expectation, then this would certainly help. But frankly there isn't.
So what should we do?
Just use functions. If you stick to writing all your business logic using functions then you can't have any of these problems. There are no weird inheritance issues, no way to change their internal state, etc. And because they inherently read in a linear, step-by-step way, no matter how badly they're written there is generally at least some upper bound on how hard they are to understand and refactor.
There are some situations where it's completely appropriate to adopt an object-oriented style, like if you're creating a library or a framework. But hopefully those with enough experience to create popular libraries and frameworks also have good enough judgment not to completely abuse the tools. Whereas within the context of startups, you need experienced contributors to create guardrails to prevent these kinds of problems from happening. The other time it may make sense to adopt OOP is if you have several dozen developers working on the same codebase; as I said though this is a guide for startups, and usually at around $100M ARR is when companies start thinking about rewriting some of their app's core components using Golang or one of the JVM languages.
If you want to read more criticisms of OOP generally (i.e. unrelated to Python), there is no shortage of opinions on this, e.g.:
What I'll say is that, to whatever extent there are benefits to OOP, those benefits seem to rarely manifest themselves when using Python.
Rule #10: There are exactly 4 types of errors
There are exactly four types of errors that any REST API can return:
- Upstream errors: Errors that happen upstream of our user code, e.g. in the authentication package we're using, or in our other middleware. The main thing to know about these errors is that you should leave them alone. Don't try to rewrite errors from upstream middleware to make them look like the kind of errors we throw in our user code, because that will make it very difficult to update our external dependencies later.
- Validation errors: Errors that occur when a user has permission to access an endpoint, but they supply syntactically invalid input. For example, if they enter a username longer than the maximum allowed length or don't enter an email address. The way to handle these is that if there is at least one field with an error, return a dictionary with all validation errors for that endpoint, e.g.:

    {
        'errors': {
            'display_error': 'Error message 1',
            'field_errors': {
                'field1': ['Error message 2'],
                'field2': ['Error message 3', 'Error message 4'],
            }
        }
    }

  The idea behind display_error is that often APIs will be accessed by a user using a form on a webpage, and sometimes you want to display an overarching error for the entire form that's separate from the errors associated with any individual field. And because sticking with this webform metaphor makes it easier to intuit what each component of the error response data will be used for, we refer to any parameter errors as 'field_errors' regardless of whether or not the specific input parameters being sanitized are associated with form fields.
- Business requirement errors: Errors that occur when a user has permission to access an endpoint and they enter input in the correct format, but there are business requirements that prevent us from doing the thing they want. For example, if there is a business requirement to not allow users to access private information belonging to other users, or to not allow users to create multiple accounts using the same email address, then these would be business requirements errors. Return the first business requirement error that occurs, there is no need to check for or return multiple errors.[6] Business requirements errors can take the form of:

    {
        'errors': {
            'display_error': 'Error message 1',
            'internal_error_code': 'XYZ01'
        }
    }

  The internal_error_code is a unique 5-digit number associated with each business logic error, whose first three digits are always the same as the HTTP status code. The idea is that developers looking at errors returned to the front end can use this unique number to quickly Command-Shift-F to find the exact line in the backend where the error is raised, which eliminates the time that would otherwise be wasted trying to figure out where in the backend an error is occurring.
- 500 errors: Errors that happen when your application randomly bombs out due to a bug in the code. Just let these happen, don't try to catch them and re-raise them as other types of error because it's important to be able to find these errors when (and where) they're happening in whatever APM software you're using to monitor the health of your application.
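As a rough sketch of how the two application-level error shapes above might be built, here's a plain-Python version with no framework involved. The function names, the MAX_USERNAME_LENGTH limit, the error messages, and the 40001 code are all assumptions made up for illustration:

```python
MAX_USERNAME_LENGTH = 30  # hypothetical limit, chosen for this sketch


def get_signup_validation_error_response(*, username, email_address):
    # Collect every field error so the caller sees them all at once.
    field_errors_dict = {}
    if len(username) > MAX_USERNAME_LENGTH:
        field_errors_dict["username"] = [
            f"Username must be at most {MAX_USERNAME_LENGTH} characters."
        ]
    if not email_address:
        field_errors_dict["email_address"] = ["Email address is required."]
    if not field_errors_dict:
        return None  # no validation errors
    return {
        "errors": {
            "display_error": "Please fix the errors below.",
            "field_errors": field_errors_dict,
        }
    }


def get_business_requirement_error_response(*, display_error, internal_error_code):
    # Only the first business requirement error is returned, so no list here.
    return {
        "errors": {
            "display_error": display_error,
            "internal_error_code": internal_error_code,
        }
    }


validation_error_response_dict = get_signup_validation_error_response(
    username="a" * 31,
    email_address="",
)
duplicate_email_error_response_dict = get_business_requirement_error_response(
    display_error="An account with this email address already exists.",
    # Assumed convention: HTTP status 400 followed by a two-digit suffix.
    internal_error_code="40001",
)
```

Because each internal_error_code appears exactly once in the backend, searching the codebase for 40001 lands on the one line that raised the error.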
As a general piece of life advice, avoid using the same HTTP status codes as the errors thrown by your authentication middleware; doing so makes it harder to write reliable front-end middleware to do things like redirecting logged-out users to the landing page. So, for example, if your auth middleware returns 401 and 403 errors, don't create errors with these status codes in your application code.
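As a rough illustration of the business requirement error format described above, here is a minimal sketch in plain Python. The names BusinessRequirementError and to_response are hypothetical (they are not part of Django or DRF), and the sketch uses an integer code where the example above uses a placeholder string. It only shows how the 5-digit code, the HTTP status, and the response body can be kept in sync.

```python
class BusinessRequirementError(Exception):
    """Hypothetical exception carrying a display message and a unique 5-digit code."""

    def __init__(self, display_error, internal_error_code):
        code = str(internal_error_code)
        if len(code) != 5 or not code.isdigit():
            raise ValueError("internal_error_code must be exactly 5 digits")
        super().__init__(display_error)
        self.display_error = display_error
        self.internal_error_code = int(code)
        # The first three digits double as the HTTP status code.
        self.status_code = int(code[:3])

    def to_response(self):
        """Return (status_code, body) in the shape described above."""
        body = {
            "errors": {
                "display_error": self.display_error,
                "internal_error_code": self.internal_error_code,
            }
        }
        return self.status_code, body


# Searching the backend for "40901" should lead straight to this line.
error = BusinessRequirementError("An account with this username already exists!", 40901)
status_code, body = error.to_response()
```

Deriving the status code from the error code means the two can never drift apart, which is one possible way to enforce the "first three digits" convention.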
Simplicity
If on average it takes several months for a new engineer to get reasonably close to full productivity, they work at your company for two years, and they spend their last couple months training their replacement, then what does this mean for your business?
It means that onboarding and knowledge transfer constitute perhaps 30% - 40% of what you're paying each engineer to do, an absolutely enormous expense. Training and knowledge transfer are obviously essential, but it certainly raises the question of whether there might be more cost-efficient ways to get the same benefits. This problem is clearly under-addressed, given that if you Google "knowledge transfer software" there's nothing relevant that comes up. This lack of adequate knowledge transfer software is one of the reasons we built FWD:Everyone, now the leading platform for sharing and publishing email conversations. While doing consulting we realized that many of the most important discussions about software architecture and business requirements were had via email, but that as employees left our clients' companies, all of this valuable knowledge was being lost — so we created a way to make this knowledge permanently accessible, searchable, and shareable.
That said, software isn't the only way to make knowledge transfer more efficient, and it certainly isn't the best way. The best way to reduce the costs of onboarding is to minimize the amount of knowledge it takes to productively contribute in the first place.
What does this mean in practice?
Minimize the pages of documentation a new developer needs to read before having a good working knowledge of all your frameworks, libraries, third-party packages, etc. Don't add dependencies unless the value they add significantly outweighs the costs of both your current and future developers needing to read whatever documentation they would need to read in order to understand them. Be aware of the cognitive overhead and productivity slippage that's added by introducing each new dependency. When it does make sense to add a dependency, don't just automatically use all the features; for each dependency, document not only the features that your team uses, but also the ones your team doesn't — and shouldn't — use, so that way new developers don't waste time reading documentation for functionality the team isn't even using. Be smart; don't use every last feature of Python, Django, Rest Framework, etc., just because they're there. And maintain documentation to serve as guardrails to prevent non-purposeful dependency creep.
Knowledge isn't inherently valuable, and is rarely free; the costs of acquiring required knowledge should never outweigh the utility.
The best way to minimize requisite knowledge is through standardization. When it comes to things like sanitizing input, writing business logic, testing code, etc., there should be one way of doing things. And that should generally be the simplest way that meets our needs.
Rule #11: URL parameters are a scam
One of the concepts behind REST is that each time you create a new API endpoint, the URL for that endpoint should correspond to a unique resource that's being created, read, updated, or deleted. The definition of a resource is pretty loose: "any information that can be named." Resources sometimes correspond to a single table in your database, sometimes correspond to multiple tables, and sometimes correspond to no tables at all. (You could think of the resources in the second and third cases as being "virtual resources.") The key takeaway is that REST provides broad latitude to define resources in whatever way makes sense for your app.
When interacting with API endpoints, you often need to pass data to your views in order to:
- Specify the unique resource that you're interacting with, e.g. if you're interacting with a message that has a unique message ID.
- Perform the necessary business logic.
Data can be passed in one of three ways:
- Via query parameters, like GET https://api.fwdeveryone.com/user/profile?username=alex3917
- Via URL parameters, like GET https://api.fwdeveryone.com/user/profile/<str:username>/
- Via request body data, like POST https://api.fwdeveryone.com/account { 'username': 'alex3917', 'password': 'hunter2' }
So how should you pass data to endpoints? Easy.
For GET and DELETE actions, use only query parameters. For POST and PUT actions, use only request body data. And never use URL parameters.
Why not?
Some people like using URL parameters to specify resources that can be modeled hierarchically, like GET /email_thread/<str:thread_id>/email_message/<str:message_id>/. The problem is that not all resource relationships can be modeled hierarchically. For example, imagine requesting an outfit. It wouldn't make sense to request GET /clothing/shirt/<str:shirt_id>/pants/<str:pants_id>/, because pants aren't children of shirts; having a URL structure like that would be confusing and misleading. Instead, you would want to use something like GET /clothing?shirt_id=1&pants_id=2.
Even for resources that can be nicely represented by hierarchical URLs, you'll still need query parameters to configure the output. For example, when retrieving a user's profile, there might be a query parameter that allows you to specify whether or not you want to include the user's profile image, e.g.: GET /user/<str:username>/profile?include_profile_image=true.
Why does this matter?
Since some requests involve non-hierarchical resources, we can't use URL parameters for every endpoint, and so we shouldn't use them at all.
Why? Because having multiple ways of doing things adds cognitive overhead. And since every endpoint can just as easily accept data via only query parameters or only body parameters, there's no reason to incur the overhead of learning and remembering a third way of accepting data. What happens when you have a codebase that uses URL parameters is that there ends up being three different ways to:
- Create URLs
- Write integration tests
- Call URLs from the front end
- Trigger endpoints using Postman
- Mock API calls in your front-end tests
And so on.
Two different ways (query parameters and body data) are entirely sufficient, and are already complicated enough to learn and remember.
Having URLs with parameters also makes it more time-consuming to go from knowing the URL of an endpoint to actually finding the view that the URL corresponds with. Even though each of these things individually isn't hard per se, cumulatively this is a constant source of bugs and a huge waste of time for something that confers no benefits. Limiting yourself to only using query parameters will easily save at least one full day per developer per year, and probably more. These are exactly the sort of cost-free wins that we should take wherever we can get them.
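To make the point about fixed paths concrete, here is a small sketch using only Python's standard library. The ROUTES table and the build_url and resolve helpers are hypothetical stand-ins for a real URL configuration, not Django APIs. The idea it illustrates: when no data lives in the path, every path is a constant string, so going from a URL to its view is a plain lookup.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical routing table: with no URL parameters, every path is a fixed
# string, so finding the view for a URL is a plain lookup (or a plain search).
ROUTES = {
    "/clothing": "outfit_view",
    "/user/profile": "user_profile_view",
}


def build_url(path, **query_params):
    """Build a URL whose path never changes and whose data is in the query string."""
    query = urlencode(query_params)
    return f"{path}?{query}" if query else path


def resolve(url):
    """Return (view_name, query_params) for a URL built by build_url."""
    parts = urlsplit(url)
    # parse_qs returns a list per key; keep the last value for simplicity.
    params = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return ROUTES[parts.path], params


outfit_url = build_url("/clothing", shirt_id=1, pants_id=2)
view_name, params = resolve(outfit_url)
```

Note that query parameter values arrive as strings, so converting shirt_id to an integer is still a job for input validation.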
Rule #12: Write tests. Not too many. Mostly integration.
In my first 18 months as a professional software developer, I wrote exactly zero tests.
Despite taking an entire MOOC on software testing, I couldn't figure out how to actually use any of the test frameworks we had installed. And besides, I didn't really understand the point. After all, the mid-level developer who sat next to me was constantly taking down the entire website because he'd deploy his code as soon as his tests passed without even looking at the product to see if it was working. Whereas I always tested the site manually before and after each deploy, so I almost never made the same mistakes. This meant that my approach must clearly have been better, right?
Well, not exactly.
Software testing is often sold as something that gives us confidence that our code is working correctly, which never made a lot of sense to me. After all, if we can look at the product to see if it's working correctly, then deploying to production just on the basis of seeing the tests pass seemed frankly irresponsible. But after years of doing this for a living and watching real products break in countless different ways, I've come to see some common patterns in how this happens. And as part of this, I've come to understand that whether the code is working as intended is largely orthogonal to whether the product is working correctly; no amount of test coverage, at least that any startup can afford, can ever instill complete confidence that the product itself is working. But knowing that your code is still working as intended is hugely valuable in and of itself, for reasons largely unrelated to product availability.
So, what exactly are the benefits of testing? There are many potential benefits, which depend on the type of tests one writes, the level of test coverage, and a bunch of other factors. So let's work backwards; in the context of startups, the most important benefits that testing can confer are as follows:
Development speed — At the most basic level, an integration test is just a snippet of code that passes some data to whatever endpoint you're testing, and then compares that endpoint's response against some hard-coded expected value. The most immediate benefit to writing a test case, as opposed to just triggering whatever API code you're writing via your web browser, is that running a test is much faster. You can literally configure your development environment so that the relevant tests are run after every single line of code you write, and within a second or two you'll know if you're on the right path or if you've made some mistake. That's much better than writing out twenty or thirty lines of code each time in between manually checking to see if the endpoint is working, and then wasting a bunch of time trying to figure out where the inevitable errors are happening.
Security — There are many reasons why, when it comes to SaaS startups, having a SPA front end that's powered by a REST API is better than having a monolith where the front end is a mix of templates and JavaScript. Of these, I think the biggest benefit is that there's a clear demarcation line between the front end and the backend; the backend starts where the API receives an HTTP request, and it ends where the JSON response is returned. This split architecture makes REST APIs the perfect match for testing almost entirely via integration tests on your views. If you get back the right JSON response for a given request, your API worked correctly. If you get back the wrong JSON response, it didn't. This makes it easy to ensure that you're never leaking private data, just by always asserting exactly what the JSON response should look like for any given input. As the sign on President Truman's desk said, "The buck stops here!".
Documentation — One of the most important functions of tests is to serve as documentation. Not only is this necessary for efficiently onboarding future developers, it's also important just for ensuring the productivity of our future selves. This is so important that the best practice is to write the code for our test cases in a way that's completely different than the way in which we write all of our other code, as described in this blog post by Michael Lynch. The reason we do all these things is to make the tests as readable as humanly possible. Tests are at their most valuable when they break, and when they break it's important to immediately be able to understand exactly what's broken, even if the code that's broken is something we previously knew nothing about. Ideally it should be possible to know what's broken just by reading the name of the test case, without even having to look at the code for that test case. Conversely, any test where you can't understand what's being tested, even after looking at its code, isn't conferring any value and should probably be deleted.
So in the context of these benefits, what kind of tests confer the best ROI?
Integration tests. On your views.
Since the line between unit testing and integration testing can be blurry, I'll give my definition: tests where you make a request to a view and get back a JSON response. These tests should execute all of the business logic for the endpoint, including making real requests to your database, caches, etc. The only things that should be mocked are requests to third-party services, e.g. sending emails via an email service provider (ESP).
Here are a few examples of integration tests written using DRF:
# TestCase and APIRequestFactory come from Django and DRF; the remaining names
# (app, User, EmailAddress, UserFactory) are imported from the project itself.
from django.test import TestCase
from rest_framework.test import APIRequestFactory


class UserTestCase(TestCase):
    def setUp(self):
        self.factory = APIRequestFactory()
        self.view = app.views.account_management_views.User.as_view()
        self.maxDiff = None

    ########
    # POST #
    ########

    def test_valid_signup(self):
        post_request_data = {
            "username": "aoeu",
            "email_address": "example@example.com",
            "password": "hunter2!",
            "terms_of_service_accepted": True,
        }
        request = self.factory.post("/user", post_request_data)
        with self.assertNumQueries(18):
            resp = self.view(request)
        self.assertEqual(201, resp.status_code)
        self.assertIn("auth_token", resp.data['data'])
        self.assertEqual(1, User.objects.count())
        self.assertEqual(1, EmailAddress.objects.count())

    def test_usernames_cant_be_longer_than_15_chars(self):
        post_request_data = {
            "username": "aoeuaoeuaoeuaoeu",
            "email_address": "example@example.com",
            "password": "hunter2!",
            "terms_of_service_accepted": True,
        }
        request = self.factory.post("/user", post_request_data)
        # Validation fails before the database is ever touched.
        with self.assertNumQueries(0):
            resp = self.view(request)
        expected_resp = {
            "errors": {
                "display_error": "",
                "field_errors": {
                    "username": [
                        "Usernames must be less than or equal to 15 characters."
                    ]
                },
            }
        }
        self.assertEqual(422, resp.status_code)
        self.assertEqual(expected_resp, resp.data)
        self.assertEqual(0, User.objects.count())
        self.assertEqual(0, EmailAddress.objects.count())

    def test_user_cannot_create_an_account_with_a_username_thats_already_taken(self):
        UserFactory(username="aoeu", email_address="example@example.com", password="hunter2!")
        post_request_data = {
            "username": "aoeu",
            "email_address": "example2@example.com",
            "password": "hunter2!",
            "terms_of_service_accepted": True,
        }
        request = self.factory.post("/user", post_request_data)
        with self.assertNumQueries(1):
            resp = self.view(request)
        expected_resp = {
            "errors": {
                "display_error": "An account with this username already exists!",
                "internal_error_code": 40901,
            }
        }
        self.assertEqual(409, resp.status_code)
        self.assertEqual(expected_resp, resp.data)
        # Only the account created by UserFactory should exist.
        self.assertEqual(1, User.objects.count())
        self.assertEqual(1, EmailAddress.objects.count())
Pretty straightforward. The one thing to note is that you should always use Django's assertNumQueries context manager when calling your views. This will make it immediately obvious if you accidentally do something that causes the ORM to make unnecessary database calls.
In general, every endpoint should perform the same tests in the same order:
- Permissions — Always test the permissions for each endpoint. For example, if a view should only allow authenticated users, the first test should be ensuring that the endpoint throws an error for unauthenticated users.
- Validation errors — Have one test for each validation error. For example, if a username can't be more than 15 characters, test that it returns the correct error message. Also test multiple validation errors at once to ensure that the error messages are aggregated correctly.
- Business requirement errors — Have at least one test that triggers each business requirement error. For example, if an endpoint requires that a user's account be at least two weeks old to submit content, have a test for that.
- Success conditions — Have at least one test for each way you can call an endpoint and receive an HTTP 2xx or 3xx response.
That's it.
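The ordering above can be sketched without Django at all. In the example below, submit_content_view is a toy stand-in for a real view, and its status codes, messages, and the 40001 error code are all made up for illustration. The only point is the shape of the test case: permissions first, then validation errors, then business requirement errors, then success conditions.

```python
import unittest

TWO_WEEKS_IN_DAYS = 14


def submit_content_view(request):
    """Toy stand-in for a real view; returns (status_code, response_data)."""
    user = request.get("user")
    if user is None:
        # In a real app the auth middleware would reject this request.
        return 401, {"errors": {"display_error": "Authentication required."}}

    title = request.get("title", "")
    if not 1 <= len(title) <= 100:
        return 422, {"errors": {"display_error": "", "field_errors": {
            "title": ["Titles must be between 1 and 100 characters."]}}}

    if user["account_age_days"] < TWO_WEEKS_IN_DAYS:
        return 400, {"errors": {
            "display_error": "Accounts must be two weeks old to submit content.",
            "internal_error_code": 40001}}

    return 201, {"data": {"title": title}}


class SubmitContentTestCase(unittest.TestCase):
    # 1. Permissions
    def test_unauthenticated_users_cannot_submit_content(self):
        status_code, _ = submit_content_view({"user": None, "title": "Hello"})
        self.assertEqual(401, status_code)

    # 2. Validation errors
    def test_titles_cant_be_longer_than_100_chars(self):
        request = {"user": {"account_age_days": 30}, "title": "a" * 101}
        status_code, data = submit_content_view(request)
        self.assertEqual(422, status_code)
        self.assertIn("title", data["errors"]["field_errors"])

    # 3. Business requirement errors
    def test_accounts_newer_than_two_weeks_cannot_submit_content(self):
        request = {"user": {"account_age_days": 3}, "title": "Hello"}
        status_code, data = submit_content_view(request)
        self.assertEqual(400, status_code)
        self.assertEqual(40001, data["errors"]["internal_error_code"])

    # 4. Success conditions
    def test_valid_submission(self):
        request = {"user": {"account_age_days": 30}, "title": "Hello"}
        status_code, data = submit_content_view(request)
        self.assertEqual(201, status_code)
        self.assertEqual({"data": {"title": "Hello"}}, data)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SubmitContentTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The test runner does not execute the methods in file order, so the ordering is purely for the benefit of whoever reads the file next.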
If you work for a large company or make airplanes or something then obviously none of the above advice applies, because as companies get bigger the value of unit testing increases relative to the value of integration testing. I discuss this more in the section below.
See also: Kent C. Dodds' eponymous blog post.
Rule #13: Treat unit tests as a specialist tool
In contrast to integration tests, I'll define a unit test as a test that's testing just one function or method within your app, e.g. a single service method, helper method, or utility function. Like integration tests, unit tests also mock calls to third-party service providers, like ESPs. Additionally, unit tests sometimes also mock calls to internal services, such as databases or caches, depending on what's being tested.
For large companies, unit tests are the bread and butter of test coverage. This is because when you have hundreds of engineers working on complex systems, most individual engineers aren't working across the entire request-response cycle, and often don't understand the whole system. Developers at large companies often work at the level of individual classes and methods, so they're primarily responsible for testing the small components they create to ensure they're working to spec.
In contrast, for your typical SaaS startup, unit tests are more of a specialist tool. Whereas integration tests do the bulk of the work to ensure your app or service is working as intended, unit tests are used in specific situations where integration tests would be clunky, so something more lightweight is needed. Some examples:
- Testing code that processes user-generated data, or that processes data from a third-party service. For example, testing a function that linkifies plaintext hyperlinks in email messages.
- Testing code with privacy or security implications, especially where there are lots of variables that can potentially interact in ways that need to be accounted for.
- Testing endpoints with lots of validation rules. For example, if your system only supports English then you might only have 20 or 30 validation rules for your account creation endpoint, each of which can be tested with an integration test. But if you support dozens of languages then you might potentially have hundreds of different validation rules, which all need to be tested.
The basic theme here is that unit tests are most appropriate when:
- You have a specific algorithm that needs to be tested against lots of different input, but where there are only small differences in the input for each test case
- Testing the full request-response cycle for each minor variation of input would add lots of boilerplate code and/or substantially slow down the run time of the test suite, but you wouldn't be getting much additional confidence that the system is working correctly.
The important thing to understand is that, at least in the context of startups, unit tests and integration tests are non-fungible; asking why you might need unit tests if you already have integration tests is like, as Michael Lynch observes, "asking why you need a neurosurgeon if your general practitioner is competent. Unit tests and integration tests have different benefits and drawbacks, so you should use them for different purposes."
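As an illustration of the "one algorithm, many small input variations" case, here is a sketch of a table-driven unit test. The linkify function below is a deliberately simplified, hypothetical implementation written for this example (it is not the one used by any real product), and the URL pattern is an assumption that ignores many edge cases.

```python
import html
import re
import unittest

# Simplified assumption: a URL is http(s):// followed by anything that isn't
# whitespace, an angle bracket, or a quote.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def linkify(text):
    """Escape plaintext and wrap bare http(s) URLs in anchor tags."""
    pieces = []
    position = 0
    for match in URL_PATTERN.finditer(text):
        # Trailing punctuation is rarely part of the URL itself.
        url = match.group(0).rstrip(".,!?)")
        pieces.append(html.escape(text[position:match.start()]))
        escaped_url = html.escape(url)
        pieces.append(f'<a href="{escaped_url}">{escaped_url}</a>')
        position = match.start() + len(url)
    pieces.append(html.escape(text[position:]))
    return "".join(pieces)


class LinkifyTestCase(unittest.TestCase):
    def test_linkify(self):
        cases = [
            ("no links here", "no links here"),
            ("see https://example.com",
             'see <a href="https://example.com">https://example.com</a>'),
            ("see https://example.com.",
             'see <a href="https://example.com">https://example.com</a>.'),
            ("<b>hi</b>", "&lt;b&gt;hi&lt;/b&gt;"),
        ]
        for plaintext, expected in cases:
            # subTest reports each failing row separately.
            with self.subTest(plaintext=plaintext):
                self.assertEqual(expected, linkify(plaintext))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(LinkifyTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Adding a new edge case here is one extra row in the table, whereas covering the same case through the full request-response cycle would mean another complete integration test.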
So, why not just always write unit tests for every endpoint, in addition to integration tests?
Because they aren't free. They come with a real cost, and not just in terms of writing them; having too many doesn't just result in diminishing returns, it's arguably worse than having no tests at all. I've seen multiple companies fail or come close to failing due to over-testing. The biggest risks are:
- Spending months or years writing tests for a product no one is using as a way to avoid launching or talking with prospective customers.
- Having thousands of test cases where it's not easy to understand what's being tested, so when they break they don't confer any benefit. But because everyone is afraid to just delete them, developers waste enormous amounts of time fixing broken tests that they don't even understand, often when the actual product isn't even broken.
- Having a slow test suite. As an example, I worked at a startup where each time someone was ready to merge a branch, it took 45 minutes for the CI tests to run. And since the CI tests sometimes failed even when the tests passed locally, it could easily take a full day or more to merge even the smallest changes into the codebase.
A colleague had a good metaphor that sums up the problem with over-testing. Imagine you have a door with a lock. Adding a second lock makes your apartment more secure. But adding 100 locks doesn't make it any more secure than having two locks, because at that point a robber would just kick down the door or go through the window. But it does make it take 100 times longer for the actual owner to get into their apartment.
Basically, if you're going to unit test a piece of code, make sure the unit tests are A) doing something of value, and B) doing something that wouldn't be better done via integration tests.
Some miscellaneous final thoughts on testing:
- Don't write model tests. The only reason you would ever need to test your models is if you have business logic in them, which you shouldn't.
- Smoke test your templates, e.g. if you're using Django templates to send out transactional emails. You don't need to go crazy here, just make sure they render without throwing any errors.
- So as not to be hypocritical, if you're a junior developer with less than two years of experience and you're not working on something that's security-sensitive, you get a free pass for not writing any tests. If you know how to write tests then that's great, but if not then just build your app first. It's hard enough just learning how to build features, so it's fine to focus on that and test your code manually, and then go back and learn how to write tests later. You're not going to write a ton of code in your first couple of years anyway, so it's not going to take a ton of time to go back later and fix things.
- Make sure your test suite passes when the WiFi is turned off. If your tests are making network requests, something wasn't mocked correctly. Cf. this guide to mocking in Python.
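One way to approximate the "WiFi turned off" check automatically is to make outbound connections fail loudly during tests. The sketch below is an assumption about how this could be done with the standard library, not a technique from the original guide; NoNetworkTestCase and ALLOWED_HOSTS are hypothetical names. It only guards Python-level calls to socket.connect, so libraries that do their networking in C, or that use connect_ex, are not covered.

```python
import socket
import unittest
from unittest import mock

# Loopback stays open so local services (databases, caches) keep working.
ALLOWED_HOSTS = {"127.0.0.1", "::1", "localhost"}
_real_connect = socket.socket.connect


def _guarded_connect(self, address):
    """Refuse outbound connections; pass loopback and Unix sockets through."""
    if isinstance(address, tuple) and address[0] not in ALLOWED_HOSTS:
        raise RuntimeError(f"Unmocked network call to {address[0]!r} during tests.")
    return _real_connect(self, address)


class NoNetworkTestCase(unittest.TestCase):
    """Hypothetical base class that blocks non-local connections in every test."""

    def setUp(self):
        patcher = mock.patch.object(socket.socket, "connect", _guarded_connect)
        patcher.start()
        self.addCleanup(patcher.stop)


class ExampleTestCase(NoNetworkTestCase):
    def test_unmocked_outbound_call_fails_loudly(self):
        sock = socket.socket()
        self.addCleanup(sock.close)
        with self.assertRaises(RuntimeError):
            # 203.0.113.1 is a reserved documentation address (TEST-NET-3).
            sock.connect(("203.0.113.1", 80))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

With a guard like this, a forgotten mock shows up as an immediate, clearly worded failure instead of a slow or flaky test.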
Rule #14: Use serializers responsibly, or not at all
In computing, serialization is the process of converting a more complex object into a simpler object. The two most common examples in Django development are:
- Converting Django models into Python dictionaries
- Converting Python dictionaries into stringified JSON
In contrast, deserialization is the process of converting simpler objects into more complex objects. The most common examples in Django are:
- Converting stringified JSON into Python dictionaries
- Converting Python dictionaries into Django models
The reason this is relevant is that there are many third-party libraries in Python that offer something called Serializers.
What exactly are they?
Essentially they're classes that A) offer tools that can be configured to perform common tasks that relate to serialization and deserialization, and B) provide a way to organize whatever additional code you need to write for things relating to serialization and deserialization that these classes themselves don't provide for.
The biggest benefit of Serializers (called forms in vanilla Django) is that they provide a nice way to organize the code for raising and coalescing validation errors, e.g. when a user's password doesn't have enough characters or their email address is invalid. As an example, consider how you might otherwise return input-related errors to the front end without serializers:
from django.core.exceptions import ValidationError

try:
    my_model.full_clean()
except ValidationError as e:
    # Do something based on the errors contained in e.message_dict.
    # Display them to a user, or handle them programmatically.
    pass
Django's full_clean method validates each model field, validates the model as a whole, and validates the uniqueness constraints of model fields.
By adding error handling around full_clean, we can generate validation errors to return to the front end by using the message_dict that's attached to the ValidationError. We can specify which validation errors the full_clean method should raise by using Django's built-in validators, combined with our own custom validators on any given model field. The full_clean method will then add all of these errors to the message_dict on the ValidationError.
This pattern works reasonably well if:
- Each of your models is only changed by one endpoint.
- Each of your endpoints only changes one model.
- Each of the validation errors you want to raise is related to a model or a model field.
When these assumptions stop being true, things quickly start to get messy. E.g. consider the hypothetical validation error handling component of this partially implemented function to create user accounts:
import unicodedata

from django.core.exceptions import ValidationError
from django.db import transaction


def create_account(sanitized_username, sanitized_email_address, unsafe_password,
                   sanitized_terms_of_service_accepted):
    validation_field_error_dict = {}

    # Keep both the display form (NFC) and the case-folded lookup form (NFKC).
    nfc_username = unicodedata.normalize("NFC", sanitized_username)
    nfkc_username = unicodedata.normalize("NFKC", sanitized_username).casefold()
    nfc_email_address = unicodedata.normalize("NFC", sanitized_email_address)
    nfkc_email_address = unicodedata.normalize("NFKC", sanitized_email_address).casefold()
    ...
    with transaction.atomic():
        try:
            user_model = User.objects.create_user(
                nfkc_username=nfkc_username,
                nfc_username=nfc_username,
                nfkc_primary_email_address=nfkc_email_address,
                password=unsafe_password,
                sanitized_terms_of_service_accepted=sanitized_terms_of_service_accepted
            )
            user_model.full_clean()
            user_model.save()
        except ValidationError as e:
            coalesce_validation_errors(validation_field_error_dict, e.message_dict)

        try:
            update_or_create_email_address(user_model, nfc_email_address, nfkc_email_address, is_primary=True,
                                           is_verified=False)
        except ValidationError as e:
            coalesce_validation_errors(validation_field_error_dict, e.message_dict)

        # Don't commit the transaction if there are errors
        if validation_field_error_dict:
            raise CoalescedValidationError(validation_field_error_dict)
    ...
    return user_model
I've written a lot of Django error handling code this way, and it's not a terrible pattern by any means; if our models are supposed to be the single source of truth, then intuitively it makes sense that validation should be happening as close to this source of truth as possible. The downside is that if you have endpoints that update four or five different models, or if you need to raise validation errors that aren't related to any specific model, things get messy because you need to coalesce errors coming from lots of different sources. The pattern also leaves your error handling tightly coupled to your models, which makes your models more difficult to understand. It also adds complexity in cases where you have multiple endpoints that each have their own rules for persisting state on a single model.
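The function above calls coalesce_validation_errors and raises CoalescedValidationError without showing either one. As a guess at what such helpers might look like (the original implementations are not shown, so everything below is an assumption), here is a minimal sketch that merges dictionaries shaped like Django's message_dict, i.e. field names mapped to lists of messages:

```python
class CoalescedValidationError(Exception):
    """Hypothetical exception that carries every field error collected so far."""

    def __init__(self, field_errors):
        super().__init__("One or more fields failed validation.")
        self.field_errors = field_errors


def coalesce_validation_errors(accumulated, message_dict):
    """Merge one message_dict ({field: [messages]}) into the accumulated dict in place."""
    for field, messages in message_dict.items():
        existing = accumulated.setdefault(field, [])
        for message in messages:
            # Skip duplicates raised by more than one model.
            if message not in existing:
                existing.append(message)
    return accumulated


errors = {}
coalesce_validation_errors(errors, {"username": ["This username is too long."]})
coalesce_validation_errors(errors, {
    "username": ["This username is too long."],
    "email_address": ["Enter a valid email address."],
})
```

Even in this small form, the helper hints at the bookkeeping that grows as more models and error sources get involved.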
Serializers ostensibly make your code cleaner and easier to understand by A) tying error handling to each endpoint rather than each model, and B) encapsulating the process of raising and coalescing errors.
That said, I've found that the serializers built into DRF have many downsides of their own:
- They add almost 20,000 words of documentation that each new developer needs to read as part of their onboarding
- It often takes hours to figure out how to do things that you could otherwise do in 30 seconds with a single line of Python.
- They can get unreadable very fast, especially when:
  - Using nested serializers
  - Mixing code for serialization with code for deserialization
  - Performing business logic unrelated to serialization
- They can introduce security and performance issues that developers would be unlikely to otherwise introduce
DRF also encourages developers to put all of the business logic for an entire endpoint within the serializers, not just the logic related to serialization; this includes the logic for creating and updating Django models. Unfortunately, almost inevitably the business logic gets buried under several layers of unnecessary misdirection, such that the endpoint no longer tells a story and it's much easier for bugs and security vulnerabilities to get introduced and then go unnoticed.
In my experience, these issues often make code written using DRF serializers significantly worse than code that doesn't use serializers at all.
Fortunately, there are now some very good alternatives to the DRF serializers. Specifically, I'd recommend using Marshmallow. Compared with DRF serializers, it has: clearer and more concise documentation, a more intuitive API, and doesn't encourage the use of serializers to store unrelated business logic or to perform CRUD operations. As a nice bonus, it also has significantly better performance.
Currently, I think the cleanest way to approach error handling is using Marshmallow to validate incoming user data, and then manually serializing responses using pure Python. Here is an example of using Marshmallow to validate user input:
from django.conf import settings
from marshmallow import Schema, fields, validate


class AccountCreationValidator(Schema):
    username = fields.Str(required=True, load_only=True, validate=[
        validate.Length(1, 15, error="Usernames must be less than or equal to 15 characters."),
        validate.Regexp("^[a-zA-Z][a-zA-Z0-9_]*$", error="Username must start with a letter, and "
                        "contain only letters, numbers, and underscores."),
    ])
    email_address = fields.Email(required=True, load_only=True)
    # MIN_PASSWORD_LENGTH is a project-specific setting, not a Django built-in.
    password = fields.Str(required=True, load_only=True, validate=[validate.Length(settings.MIN_PASSWORD_LENGTH, None)])
    # Field name matches the key posted by the integration tests above.
    terms_of_service_accepted = fields.Boolean(required=True, load_only=True)
And then manually serializing output using pure Python might look something like:
from django.forms.models import model_to_dict


def get_user_profile_from_user_model(user_model):
    user_model_dict = model_to_dict(user_model)
    # Datetimes aren't JSON serializable, so convert to an ISO 8601 string.
    user_model_dict['date_joined'] = user_model_dict['date_joined'].isoformat()
    allowlisted_keys = ['nfc_name', 'nfkc_name', 'nfc_username', 'nfkc_username', 'nfkc_primary_email_address',
                        'date_joined']
    # Drop every key that isn't explicitly allowlisted.
    for key in list(user_model_dict.keys()):
        if key not in allowlisted_keys:
            user_model_dict.pop(key)
    return user_model_dict
This pattern gets you the benefit of having error handling that's nicely encapsulated and isn't tightly coupled to your models, while avoiding the problem where serializers get super difficult to create and understand when serialization and deserialization code is mixed together. And, ironically, serializers don't really confer any benefits for serialization; the pure Python function we have here is much faster to write, and much easier to understand.
With respect to using model validators to safeguard your source of truth, you can use validators on the models in addition to in the serializers; the validators on the serializers are responsible for generating the user-facing errors, and then you just need to make your peace with losing proper error messages for race conditions. This is a good compromise because it ensures data integrity, and the downside is negligible because having proper error handling on race conditions isn't going to meaningfully affect your conversion numbers.
So while I won't claim that this is the perfect design pattern by any means, I think it's pretty good, and it's probably the best out of all alternative ways that we could write the above code instead.
Rule #15: Write admin functionality as API endpoints
Django comes with a powerful set of tools that make it easy to build an internal admin site for your project. This admin site allows superusers and staff members to directly view the contents of the database, make changes as necessary, and perform any other tasks related to running a website — for example, sending a user a password reset email.
Setting it up for a given model takes three steps:
- Subclass one of the Admin classes, usually ModelAdmin. So if your project has a model called EmailAddress, you would make class EmailAddressAdmin(admin.ModelAdmin).
- Register the ModelAdmin subclass in a way that ties it to the model; in this case, via admin.site.register(EmailAddress, EmailAddressAdmin). This gives you a basic interface to view the contents of the model and perform basic CRUD actions.
- Configure the presentation and behavior of that model's admin interface by overriding any combination of around 150 different options (class variables) and methods.
This system is extraordinarily powerful and efficient, and is one of the biggest benefits of using Django. It works right out of the box with just two lines of code, so even if you know you're going to make more customizations later, you're not forced to do that work upfront before it's needed.
The one caveat is that if you start adding lots of business logic directly to your admin classes, then things tend to go off the rails super fast; it quickly gets difficult to tell which methods and variables are directly interacting with Django in some prescribed way, versus which are user-defined properties and helper methods that are indirectly interacting with some other predefined hook. This becomes even more of an issue if you're using third-party packages to add more customization options to the admin, or if you're replacing or overriding templates.
So, is there anything we can do to keep our admin code from becoming a complete nightmare?
The trick is something you've almost certainly heard before: "don't mix business logic with presentation." We normally think of this as front-end advice, e.g. as the reason why styles should be defined in CSS rather than in HTML. But as it turns out, this is also good advice when writing Django admin code.
So how should we structure the business logic in our admin code?
The answer is simple: exactly the same way that we structure all of our other code.
For each action that you want a superuser or staff member to be able to perform, just make an API endpoint with the appropriate DRF permissions, e.g. IsAdminUser. Then write your views and service methods the same way you would for any other endpoint. You should expose these actions as API endpoints even if you intend to only trigger the functionality in question via the Django Admin.
Why?
The benefit is simplicity.
Again, one of our key goals is that "if a person has enough knowledge to understand how any one endpoint works, they should have enough knowledge to understand how every endpoint works." By expressing our admin functionality through views and services and tests, the same way that we write all of our other views and services and tests, we avoid introducing a second way of doing things and doubling the amount of knowledge one needs to understand the codebase and productively contribute.
So how do we integrate these admin-scoped API endpoints with the Django Admin?
Easy. Just take the service methods you've created for these endpoints, and reuse them to perform any business logic inside the Admin Actions in your admin code. This way these services can be tested exactly the same way as the tests for every other endpoint; there isn't any need to learn a new style or syntax for writing tests. When you write your admin business logic the same way as all your other business logic, you don't need to worry about developers not writing tests because doing so would require figuring out how to write tests in an unfamiliar way, nor do you need to worry about new developers not being able to understand pre-existing admin code or its test cases.
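As a rough illustration of this layering, here is a framework-free sketch. Every name in it (send_password_reset, admin_reset_endpoint, reset_passwords_action, the User dataclass, and the outbox list) is a hypothetical stand-in rather than code from the example app. In a real project the endpoint would be a DRF view guarded by IsAdminUser, and the action would be registered on a ModelAdmin subclass.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical stand-in for a Django user model instance.
    email: str
    is_staff: bool = False


def send_password_reset(user, outbox):
    """Service method: the single home for this piece of business logic."""
    message = f"Password reset link sent to {user.email}"
    outbox.append(message)  # stand-in for actually sending an email
    return message


def admin_reset_endpoint(requesting_user, target_user, outbox):
    """Stand-in for an API view; the staff check plays the role of IsAdminUser."""
    if not requesting_user.is_staff:
        return {"status": 403, "error": "Admin access required."}
    send_password_reset(target_user, outbox)
    return {"status": 200}


def reset_passwords_action(selected_users, outbox):
    """Stand-in for a Django admin action: it loops and delegates, nothing more."""
    for user in selected_users:
        send_password_reset(user, outbox)
    return len(selected_users)
```

Because both callers are thin wrappers around the same service method, a test written against send_password_reset covers the behavior regardless of whether it's triggered from the endpoint or from the admin.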
Wondering why it's so important to test your admin functionality when most likely only a handful of folks will have access to it? Think of it this way: your staff are going to have access to vastly more powerful tools than your regular website users, so if anything it's even more critical that this functionality be thoroughly tested than it is for your user-facing endpoints.
On that note, a common pattern for startups is having the dev team maintain their own test suite with white-box tests, and then having a separate QA team that uses something like Postman to do black-box testing. Exposing your admin logic via endpoints will allow an external QA team to test this functionality as well. Even if your startup isn't big enough to have a QA team right now, this practice doesn't cost anything, so it's a free option for adding one in the future.
Upgradability
If only creating software were like building a megalith, where once all the stones have been quarried and arranged into their fated alignments, they can just stay there, unaltered, silently watching civilizations rise and fall for the next ten thousand years.
If only.
In the best case, creating software is more like building a bridge or a public water system, where if it's not constantly maintained then it quickly stops working optimally and eventually just stops working entirely. And on bad days, keeping software up and running feels more like the scene of Lucy and Ethel eating chocolate off the assembly line.
In some ways, it's a good problem. Each year hardware gets faster, computer scientists make new foundational discoveries, and software libraries get updated to take advantage of both. As long as you keep updating your dependencies each year, you reap enormous benefits over time from the hard work of others outside of your organization.
But these benefits aren't free.
Just keeping your dependencies up-to-date can easily take 25% of the year. And as soon as you decide to stop updating them, or you no longer have enough resources to do so, the grim reaper is never far behind. First your dependencies develop minor bugs, then unpatched security issues, and eventually they just stop working entirely. And when they do, any apps built on top of them die as well.
Rule #16: Your app lives until your dependencies die
The reason codebases eventually get thrown out usually isn't that the business logic has gotten so convoluted that no one can understand any individual endpoint, but rather that there are one or more keystone dependencies that can no longer be updated.
A good example comes from a consulting project I worked on recently; this startup had 3,500 tests that were written using a test framework that had been abandoned for over five years. There wasn't any good path to upgrade from Python 2 to Python 3. And since most libraries were dropping support for Python 2, this meant that we'd no longer be able to upgrade to the latest versions of Django and all other dependencies.
In short, we'd soon be out of security compliance, we'd no longer be able to benefit from all the hard work that people outside our organization had been doing to improve the tools that we used on a daily basis, and we risked that at some point the app would stop working entirely and there would be no way to fix it.
The decision was made to rewrite the codebase from scratch, with everyone being fully aware that big rewrites fail 90% of the time. In this case the rewrite was successful, fixed a bunch of other architectural problems, and left the startup in a much better place overall. But even though in this specific case it worked out well, for most startups doing a rewrite is an existential risk; given the narrow pathway to success, it's not a great position to be in.
The key insight is that your codebase is only as upgradable as your least-maintained dependency. Each of your core dependencies is going to interact with the others in thousands of places throughout the codebase, so as soon as one can't be upgraded, it won't be long until the rest can't be upgraded either. And if it gets to the point where the only way forward is a rewrite, then unless you have the resources to attempt this multiple times, there's a 90% chance your startup is going out of business.
The stakes that come with choosing the right dependencies can't be overstated.
So what can we do? Is there some process we can put in place to ensure that this kind of thing doesn't happen to our startup?
There aren't any guarantees, but here are some guidelines that require zero technical knowledge to implement. For junior developers and product managers, these are an excellent place to start:
- Look for libraries with lots of stars on Github, downloads on PyPI, recent commits, etc.
- Openbase is a neat tool for quantitatively evaluating JavaScript libraries; hopefully there will be support for Python shortly.
- Make sure any Python dependencies are compatible with the most recent version of Python, and any Django dependencies are compatible with the most recent version of Django. It's normal if it takes two or three months after a new version of Python or Django is released for packages to add support, but if it's been six months or more then this is a major red flag.
- Always start new projects using the most recent versions of Python and Django that are stable enough for production, and update them as soon as is feasible, so as to purposely make it difficult for anyone to introduce dependencies that aren't being actively maintained.
- Look at the distribution of commits. If all the recent code in a project is being contributed by a tiny handful of people, then the project has a low bus factor, meaning that losing just one or two contributors could stall it. This is especially risky if the dependency hasn't yet been widely adopted by commercially successful companies.
- Make sure your dependencies have clear, easy-to-read documentation. Not only because this will reduce the onboarding cost for each future developer you hire and decrease the amount of time it will take to build each new feature, but also because projects with good documentation tend to become more popular over time.
Don't add dependencies unless they're either likely to be maintained for at least a decade, or they're small enough and well-written enough that they'll be easy to fork and maintain internally.
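The "six months" rule of thumb from the list above can be turned into a small check. This is only a sketch under stated assumptions: flag_stale_dependencies, the shape of its input, and the sample package names and dates are all made up for illustration, and you'd need to collect the real dates yourself.

```python
from datetime import date

RED_FLAG_DAYS = 183  # roughly six months, per the rule of thumb above


def flag_stale_dependencies(framework_released, supported_since, today):
    """Return packages that took (or are taking) more than RED_FLAG_DAYS
    to support a given Python or Django release.

    supported_since maps package name -> date support was added,
    or None if support has not been added yet.
    """
    stale = []
    for package, added_on in supported_since.items():
        # Measure up to the day support landed, or up to today if it hasn't.
        waited = ((added_on or today) - framework_released).days
        if waited > RED_FLAG_DAYS:
            stale.append(package)
    return sorted(stale)


# Hypothetical data, for illustration only.
released = date(2021, 1, 1)
packages = {
    "quick-package": date(2021, 2, 1),
    "slow-package": date(2021, 9, 1),
    "abandoned-package": None,
}
print(flag_stale_dependencies(released, packages, today=date(2022, 1, 1)))
# prints ['abandoned-package', 'slow-package']
```

A check like this only covers one of the guidelines; stars, commit distribution, and documentation quality still need a human look.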
For more experienced developers, these are all still good rules to follow. But at some point we should also be able to use our technical skills and domain expertise to identify good dependencies earlier in the product adoption lifecycle, rather than just looking for signal from others. For example, it's always good when the creators of some library have taken the time to deeply understand any prior art, and have taken a thoughtful approach when making improvements. An example of this that's pretty easy to see is if you read the documentation for Vue.js. However, this often takes some experience to spot because critiques of earlier libraries are often subtle and indirect — it's generally considered poor form to criticize others who are giving away their work for free, and even people who make competing libraries depend on there being a core group of motivated open source contributors creating software for their chosen language. But once you've used a bunch of the previous solutions, it's a lot easier to pick up on the significance of certain design decisions and to understand whether or not some new thing is a real improvement that's likely to get adoption.
The other thing more experienced developers should be able to do is actually, you know, read the code. Whereas for junior developers the difference between good and bad code might be understanding mostly nothing vs. understanding absolutely nothing, more experienced developers should hopefully be able to say something about the code quality, how easy it would be to contribute new features and maintain existing ones, etc.
Rule #17: Keep logic out of the front end
If this guide had been written 15 years ago, a lot of the advice here would probably revolve around mistakes in database schema design. Back then the average database could do less than 30 HDD queries per second, so developers went to great lengths to make the most of each query by putting all sorts of completely unrelated data in the same tables. When done right this was a necessary evil, but when done wrong it was a huge source of technical debt that could take down entire companies. Of course, thanks to solid-state disks, even large databases can now handle tens of thousands of queries per second. So for codebases started in the last ten years, having to deal with questionable denormalization decisions is much less common.
These days, the biggest problem in web development is that JavaScript frameworks become obsolete so fast that as soon as you're done building your front end, you pretty much need to throw it away and start over. Whereas any Python code you write today will probably run just fine in ten years with only a nominal amount of maintenance, you'd be lucky to get three years from your SPA before whatever framework you used becomes deprecated and then unmaintained.
Some people say that the reason for the constant churn in JS frameworks is because we're seeing substantial improvements each year in computer hardware, browser technology, and networking protocols, that these improvements can only be fully realized if front-end frameworks are rebuilt from the ground up, and that this framework churn is necessary because any website that doesn't incorporate these fundamental improvements into their products will cease to be competitive. Other people say it's because front-end developers are a bunch of shiftless ne'er-do-wells.
I'm not here to take sides.
What I would say though is that if we're going to have to rewrite our SPAs every few years for the foreseeable future, we should do everything we can to minimize the time and cost of doing so by keeping as much business logic as possible out of the front end. How do we do this? By putting it in the backend, even if it doesn't logically belong there. Think of this as the new database denormalization.
A good example of something that normally would belong to the front end is date formatting. The best practice should always be just returning ISO 8601 dates to the front end, since the same dates are likely to be formatted differently depending on where in the app they're being used, whether the user is on desktop or mobile, etc. In the ideal world, doing this formatting should always be the job of the front end. But in practice, given the choice between rewriting all the date formatting logic in three years or just doing it on the backend, in many cases the latter is probably the least bad option. This same logic also applies to sorting, string formatting, error messages, etc.
In terms of what this should actually look like, I think the solution is still returning the same front-end agnostic response that you "should" be returning, but then also appending an extra JSON object with pre-digested values based on the current needs of your specific front end. This might look something like:
{
    'data': { ... },
    'front_end': { ... }
}
It feels wrong, and it is kind of wrong, but it's better than having it take an extra year to rewrite your front end.
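To make the shape of that response concrete, here is a minimal sketch for the date formatting case. The function name build_response, the date_joined_display key, and the display format are assumptions chosen for illustration; only the 'data' and 'front_end' keys come from the structure above.

```python
from datetime import datetime, timezone


def build_response(date_joined):
    """Return the front-end-agnostic payload plus pre-digested display values."""
    return {
        # What the API "should" return: an ISO 8601 timestamp.
        "data": {"date_joined": date_joined.isoformat()},
        # Values shaped for the current front end; cheap to change or drop
        # when the front end gets rewritten.
        "front_end": {"date_joined_display": date_joined.strftime("%m/%d/%Y")},
    }


response = build_response(datetime(2021, 6, 7, 14, 30, tzinfo=timezone.utc))
print(response["data"]["date_joined"])               # 2021-06-07T14:30:00+00:00
print(response["front_end"]["date_joined_display"])  # 06/07/2021
```

Keeping the two objects separate means a future front end can ignore 'front_end' entirely and still get a clean, standard payload from 'data'.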
Rule #18: Don't break core dependencies
Avoid breaking functionality in your core dependencies.
What do I mean by this?
As an example, using SQLAlchemy instead of Django's ORM, at the expense of no longer being able to (easily) use Django for user management, access control, building an admin dashboard, etc. Similarly, using Django with MongoDB is technically possible, but at the expense of losing all the SQL- and Postgres-specific functionality that Django provides. (And being stuck with Mongo.)
There are even cases where using functionality within a library or framework can break other parts of the same library or framework. An excellent example is GenericForeignKey, which is widely considered harmful: using that one part of Django gives up some of the data-integrity guarantees provided by Postgres, makes certain ORM queries impossible, causes performance issues, breaks the admin, etc.
This issue doesn't come up all the time, but when it does it can cause problems that are catastrophically expensive to fix. Regardless of the problem you're trying to solve, breaking core dependencies is rarely a good trade-off in the long run. Over time you tend to lose out on increasingly more functionality in whatever dependency you broke, often new functionality that ameliorates whatever issue you were trying to fix in the first place. As more functionality from third-party packages gets baked into core dependencies, what happens is that these peripheral solutions stop being maintained. But by this point you usually can't just rip them out and replace them with something else; now you have the same problem as with any other unmaintained dependency, except usually worse because you also have weird patterns in your code as a result of needing to work around the functionality in your core dependencies that you broke. Usually the end result here is just needing to do a complete rewrite, which again is something that you never want to have to do.
Why make coding easier?
The above rules reflect the patterns and anti-patterns I've observed at startups over the years that have had the biggest impact on productivity. To the extent that there's a common theme connecting them, it's what pg describes in his Great Hackers essay:
Several friends mentioned hackers' ability to concentrate— their ability, as one put it, to "tune out everything outside their own heads." I've certainly noticed this. And I've heard several hackers say that after drinking even half a beer they can't program at all. So maybe hacking does require some special ability to focus. Perhaps great hackers can load a large amount of context into their head, so that when they look at a line of code, they see not just that line but the whole program around it. John McPhee wrote that Bill Bradley's success as a basketball player was due partly to his extraordinary peripheral vision. "Perfect" eyesight means about 47 degrees of vertical peripheral vision. Bill Bradley had 70; he could see the basket when he was looking at the floor. Maybe great hackers have some similar inborn ability. (I cheat by using a very dense language, which shrinks the court.)
Paul Graham is right. Figuring out how to get the absolute maximum out of your working memory is the key to being a good developer. His personal method for doing this is using Lisp. But interestingly, he now recommends to would-be technical founders that they learn Python or Ruby instead, presumably at least in part due to seeing that many folks who attempted to learn Lisp as a first language and use it to build a startup weren't sufficiently successful.
For most entrepreneurs, the fact that Python is easy-to-read and easy-to-learn and has a huge open-source ecosystem makes it a clear win over using more powerful languages that lack these advantages.
That said, maximizing the amount of context we can fit into our heads is still critically important. But choosing a very dense language isn't the only way to do this; we can get most of the same benefits by purposely choosing software architecture patterns that are aligned with how our minds actually work.
When you design your software such that it's predictable, readable, simple, and upgradeable, there are countless benefits that accrue. And some of the most important benefits aren't entirely tangible; when developers don't have problems reading each other's code and doing their assigned tasks, people are less defensive about their own work and less abrasive toward others. People are less stressed, their days are better, the culture benefits, and recruiting gets easier.
But, you might be asking, how does all this fit into my team hitting the metrics we need to raise our next round? Let's look at the impact this praxis will have on: velocity, optionality, security, and diversity.
Velocity
Your developers are wasting millions of dollars every year by writing code that other people on your team can't read, and it's slowly destroying your business. And it's not because their line lengths are longer than 80 characters, or their class names used the wrong type of camel casing, or they didn't indent their docstrings properly.[7]
When velocity grinds to a halt, it's because your developers are wasting their time and cognitive bandwidth:
- Reading irrelevant code unnecessarily. This is like missing the index on a database lookup and needing to do a full table scan, except that it wastes hours or days. This happens because:
  - There isn't a repeatable, step-by-step process that developers can follow to locate the sections of code that are relevant to the problems they need to solve.
  - There isn't enough documentation, or the documentation is out of date.
  - There's dead code in the codebase, but it's not obvious which code is dead.
  - The names of variables, functions, files, and/or classes don't accurately reflect what they contain or do.
  - Complex sections of code haven't been properly encapsulated to make it easy to skim over them with only a high-level understanding of what's happening.
- Making sense of multiple solutions to the same problem, when it would have been more appropriate for there to have been one solution that was just reused.
- Trying to remember what's in each of their open tabs, because what would otherwise be easy-to-understand code has been needlessly split across many different files.
- Learning, remembering, and debugging multiple ways of doing the same thing, when there aren't any benefits that accrue from doing so.
- Needing to learn, work with, and maintain lots of dependencies that aren't actually doing that much.
- Trying to find the source of behavior that's happening implicitly.
- Being unable to easily upgrade software or install new libraries.
- Having no tests, or having too many tests.
That's it.
It's absolutely worth reading the Google style guide, Effective Python, Two Scoops of Django, etc. Even the sections I don't agree with, I'm vastly better off for having read.
But none of these resources focus specifically on the needs of startups, nor do they get to the heart of what causes development velocity to grind to a halt or how to prevent that from happening. That's why I've documented the patterns presented above; each has proved its worth in the companies where I've seen them implemented or have implemented them myself.
Optionality
The hardest part of starting a new business is making something people want. Why?
Because everyone is lying to you.
But even when following customer development best practices (read the book!), it's almost impossible to know which functionality users will gravitate toward until it's actually built. There are two basic solutions to this:
- Find a highly profitable consulting niche, and then find a way to productize that consulting. As this talk explains, this is actually how many of the world's most valuable SaaS companies got their start.
- Build some core piece of deep technology, and then have twenty or thirty different business hypotheses you want to test out using that technology. Talk with potential customers in advance and try to get some letters of intent, but also accept that this only gets you so far.
A good example of the latter strategy is Uber, where their core bet was on the ability of mobile phones to unlock a frozen labor market. When black cars didn't work they quickly shifted to taxis, and if that hadn't worked they would have just pivoted into food delivery or some other business using the same technology.
With both strategies, optionality is essential. But in the second model, where the entire bet is on the value of some sort of frozen asset that you're unlocking rather than on a specific market need, optionality is everything.
For software teams, the best way to think about optionality is like the subway. When cities are building a new subway line, they usually extend it a few dozen yards past the last station in whatever directions they think they might want to expand in the future. This way they aren't incurring any significant cost upfront, but if they eventually do decide to expand then they don't have to pay a ton of extra money to completely rebuild the stations at the end of the line.
This planning method allows cities to preserve optionality without paying extra for it, which is the same mentality that's required for startups. Don't write all the software needed to test each hypothesis upfront; rather, write software in a way that minimizes the friction of transitioning from one hypothesis to the next.
For startups, keeping the architecture simple and readable is a cost-free tactic for creating optionality. Don't guess at what business logic might need to be abstracted in the future before knowing the business requirements. Keeping things simple will make it easier to take advantage of new opportunities in the future.
Security
Security isn't everything; it's the only thing.
I'd say that it doesn't matter whether or not your app is working correctly if it isn't secure, but that's actually not true. If your app isn't secure, it's actually much better if it's not working.
At all.
It's tricky because in most jobs being reasonably competent and well-meaning is good enough, but this isn't most jobs. In software development, it's unfortunately super easy for folks who are reasonably good at their jobs and well-intentioned to do enormous amounts of damage extremely quickly.
Figuring out how to deal with this dynamic is one of the more interesting management challenges. It's probably why there's a famous tech industry quote to the effect of, "If the person who made your computer weren't an asshole, it probably wouldn't even turn on." That's obviously terrible management advice, but the truth is that most developers aren't security experts, and most tech managers don't have unlimited time to review pull requests for security issues or the finesse necessary to turn security incidents into teachable moments while simultaneously giving appropriate emphasis to their gravity.
That's why big companies have security teams responsible for teaching security, auditing code, monitoring for intrusions, and putting guardrails in place to make certain classes of mistakes impossible.
For smaller startups, the best we can do is to make our code as readable as possible, to write it in a way so that security mistakes are as visible as possible, and to purposely maximize the chances of our tests breaking if there is ever a problem.
The first rule of security is that security is a process, not a product. In other words, it's not about paying for security audits or firewalls or what have you, but rather it's about the systems and cultural expectations you put in place that govern how everyday tasks get done. You can — and should — go out and read the books and blog posts that have been written about this; it's impossible to give much general advice because the security processes that make sense for each company are going to be largely determined by the nature and value of the assets under protection, and so even for a single company the best practices are going to shift significantly over time.
But regardless, no matter what company you work for and what assets you're protecting, the basic foundation is always going to be the same. Write your software so that:
- It's as easy as possible to understand what the software is doing.
- Security mistakes are as visible as possible.
- There are systems in place, starting with integration tests, to ensure that private data isn't accidentally exposed.
There are lots of systems you'll need to put in place outside of the code, everything from code reviews to break-glass procedures. But adopting the sorts of clear style conventions outlined above is the foundation of any good security process. It's the most impactful step any company can take, and from a dollar-for-dollar perspective, the most cost-effective.
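As one possible starting point for the third item in the list above, here is a sketch of a test that breaks if private data ever leaks into a response. The serializer, the field names in the sample user, and the test class name are hypothetical; a real integration test would call the actual endpoint instead of a local function.

```python
import unittest

# Keys that are safe to expose, mirroring the allowlist approach used earlier.
ALLOWLISTED_KEYS = {"nfc_name", "nfc_username", "date_joined"}


def serialize_user(user_model_dict):
    """Hypothetical serializer: keep only allowlisted keys."""
    return {k: v for k, v in user_model_dict.items() if k in ALLOWLISTED_KEYS}


class PrivateDataExposureTest(unittest.TestCase):
    def test_response_contains_only_allowlisted_keys(self):
        user = {
            "nfc_name": "Ada",
            "nfc_username": "ada",
            "date_joined": "2021-06-07",
            "password_hash": "not-for-clients",  # must never leave the server
        }
        exposed = set(serialize_user(user))
        # Fails if the serializer ever starts passing through extra fields.
        self.assertEqual(exposed - ALLOWLISTED_KEYS, set())
        self.assertNotIn("password_hash", exposed)
```

Asserting against an allowlist rather than a denylist means that a newly added private field is excluded by default, so forgetting to update the test can't silently expose it.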
Diversity
Of the 60,000 tree species that exist in the world, I can identify barely more than 0.1%. And yet if you put me in any local forest, I can easily identify upwards of 99% of trees on sight.
How is this possible?
It's because species aren't evenly distributed. For example, 30% of all trees in New England are red maples. If the only thing you can say about a tree is that it doesn't look like a pine tree, the odds of it being a red maple are over 50%. As with most things in life, the distribution of species roughly follows Zipf's Law. That is, the most common tree species is about 10x as common as the 10th most common tree species, 100x as common as the 100th most common tree species, etc.[8]
In contrast, if you put me in any botanical garden or arboretum, where there are only a couple of each species and they're sourced from all over the world, the only way I'm going to be able to identify more than 0% of trees correctly is if it's winter so the shrubs and ferns aren't yet covering up the metal signs.
What's the point?
Your codebase should be like a forest, not a botanical garden.
With any new project or business, you're starting with some market need you want to address or some technical problem you want to solve. To get there, you're going to need some set of tools, abstractions, patterns, and so on. So it's completely normal for your tooling and architecture choices to roughly follow Zipf's Law. That is, easy-to-use and powerful tools and patterns should be ubiquitous, whereas those that are difficult to learn or understand should appear rarely, and only when necessary for implementing specific niche functionality.
But all too often, as the system gets bigger, organizational dysfunction causes the barrier to entry for contributing to rise over time. At the code level, this often happens because:
- Lots of dependencies get added to the project without strong justification. At best because there is no clear process for deciding when to add dependencies, and at worst because devs just want to add random technologies to their resumes and no one is paying attention.
- Basic functionality that already exists in the codebase is reimplemented in multiple places, due to inadequate documentation, knowledge transfer, and code organization.
- The same types of tasks are done with wildly different architectural patterns and code styles.
- People have implemented functionality and abstractions that weren't needed, resulting in both unnecessary complexity and dead code.
And so on.
If there weren't any consequences then these things wouldn't especially matter, but there's actually an enormous amount of money to be made by minimizing the technical barriers to contributing.[9] For example, the key insight behind the Low-Code and No-Code movements is that:
Making things easier has nonlinear effects. Making something 10x easier can cause 1000x more of that thing to happen. Hence the explosion of online creativity you see on YouTube, with chess, Minecraft, math videos, Khan Academy, Twitch streams, Soundcloud, etc; you remove a small bit of friction and get a large result.
The aspirational promise is that the majority of the world's most creative and entrepreneurial people don't know how to code, and so by radically lowering the barrier to entry we're going to get a new generation of products and services that are more innovative and more successful than anything seen to date.
And while I completely agree with the premise, I'm not convinced that a purely visual approach is realistic. I'd suggest instead that perhaps it's possible to unlock this tidal wave of entrepreneurship and creativity by just making coding itself 10x easier.
Maybe the future isn't low-code or no-code, but rather it's just readable code.
We'll know we've succeeded when it's taken for granted that everyone will be successful at the programming part of their job, so we only evaluate candidates on their skills that are unrelated to programming. Much like how today every analyst at every venture capital firm needs to be good at working with spreadsheets, but no one hires their analyst primarily based on this ability.
Rather than hiring lots of folks with deep Python skills and experience using tracemalloc to debug memory leaks, startups would be much better off if they minimized the use of patterns and features associated with memory leaks in the first place, and instead just hired developers who were also experts at things like marketing, sales, and design.
An explosion of productivity and creativity is going to happen regardless of whether your company reduces its barriers to contributing or not; the only difference is whether the wealth created is going to accumulate within your company or elsewhere. If the sorts of practices I'm recommending made it possible for people who couldn't otherwise get hired to become 1x developers, but at the expense of your most talented developers, then clearly none of this would be worth it. But that's not what happens. Instead, everyone at every ability level becomes vastly more productive.
The important thing to understand here is that there's no inherent conflict between velocity and TCO; software written this way is no more expensive to write, and is no less useful, or powerful, or performant.
It's just better.
Alex Krupp is the co-founder & CEO of FWD:Everyone, a platform for sharing and publishing email conversations. He intermittently takes on software consulting engagements, and has spent several years developing software for Fortune 500 companies, pre-seed startups, high-growth venture-backed startups, and everything in between.
[1] This architecture is also very well-suited for LOB apps, which are the internal-facing software applications that tend to make up the bulk of the software written within large companies. Their defining feature is that the bulk of their complexity tends to be in the business logic, as opposed to things like developer tools where the complexity is often more algorithmic in nature. Both SaaS startups and LOB apps tend to be around the same size, use the same architecture, and have the same general concerns, which is why the recommendations here are good for both.
[2] For Django specifically, this problem is compounded by the fact that much of the Django-specific architecture advice and tooling has been created by, and for the benefit of, dev shops — who inherently have a principal-agent conflict with their clients.
[3] Most series A startups I've seen tend to have 20 - 30 KLOC of Python, not including blank lines, tests, or migrations. If you're curious about how many lines of code your codebase contains, check out CLOC.
[4] Some people prefer sanitizing for things like XSS in middleware, on the grounds that A) relying on developers to explicitly sanitize input is risky, and B) middleware makes it easy to return an error immediately if any potentially malicious input is detected, which in some cases is better than proceeding with sanitized input. I think the risk of writing sanitization middleware incorrectly is greater than the risk of forgetting to sanitize input. But if someone wants to add sanitization middleware to a project, I'm not opposed as long as explicit sanitization is also required.
[5] https://github.com/google/styleguide/blob/gh-pages/pyguide.md. Arguably this admonition only applies to leaving comments explaining basic Python syntax, in which case this quote is unfairly taken out of context. But regardless, assuming that others know more Python than you is generally bad advice when taken as a broader principle.
[6] There might be several reasons why a user isn't allowed to perform an action, but enumerating every reason would at best serve no purpose and at worst could make it easier for a malicious user to circumvent a restriction.
[7] You should absolutely still do these things, I'm just saying that they're not the things that have the biggest impact on development velocity.
[8] This quirk of the universe makes learning to identify your local trees, plants, and mushrooms a lot easier than you might otherwise suspect.
[9] In this interview, Jon Stewart talks about how The Daily Show didn't hit its creative peak until they started focusing on removing the barriers to contributing as part of their diversity, equity, and inclusion strategy.