Domain Driven Design For All
Originally posted on circonus.com
Domain Driven Design (DDD) is usually associated with microservice architectures. As microservice architectures have come to be perceived as burdensome and overly complex, organizations have started to question the relevance of DDD initiatives as well. The argument usually goes that unless an organization reaches a mega-scale that requires eventing to keep up and microservices to scale horizontally, such architectures are overkill. If they are overkill, then there is no need to model the organization, because everything lives in a handful of monoliths that own everything. Therefore, focusing on DDD is something that can wait as well.
However, when the godfather of DDD, Eric Evans, came up with this concept, microservices didn’t even exist! We had barely even scratched the surface of Service Oriented Architectures with Enterprise Service Buses. In reality, the concepts evolved from attempting to design systems with object oriented programming. The systems built from those objects at the time happened to be monoliths. Thus, we can benefit from applying DDD principles regardless of deployment decisions (monolith vs. microservice, etc.) and regardless of organizational size and maturity.
When you have a clear domain model, software quality improves, reliability improves, architectural decisions are more efficient, and data engineering teams are able to produce a far higher quality output.
What is Domain Driven Design
Even though DDD has been around for about two decades, it is not something many universities teach (let alone in applied fashion), and so unless you’ve had mentors and direct experience, chances are DDD is a rather fuzzy concept. So let's talk about it from two perspectives: first, how it relates to the business and organizational design; second, how that applies to software.
Apples are Apples; Oranges are Oranges
From a business perspective, DDD brings clarity through a “Common Ubiquitous Language (CUL),” which the DDD literature usually just calls the Ubiquitous Language. This simply means that across business departments, we all agree on what we will call things. Critically, we agree that exactly one name will be applied to a given object and that the name will be unique within a given scope or boundary. That scope or boundary is called a “Bounded Context.”
I saw the importance of this firsthand when I was a consultant working with companies of every size. Even in the best-run organizations there is always some ambiguity around what things are called. This was a while ago, when designing canonical data schemas was in vogue. For example, we’d start by looking at the eCommerce platform, which had an “Orders” endpoint. However, the language got more confusing when integrating with the finance system, which had “Sales Order,” “Return Order,” and “Purchase Order.” The warehouse management system that was being replaced was ambiguous about the term “Purchase Order,” depending on whether you were referring to an inventory purchase or to a Purchase Order that a corporate customer sent to bulk order a product from the company.
In every conversation we had to do the mental gymnastics of working out which type of order we were talking about, and then, in the service bus, map it to yet another definitive or canonical definition of the same concept. During early phases of implementation, engineers would have to map the canonical definition back to the legacy systems that were still in operation. It got even worse when the data that made up a sales order was spread across two or more objects in the legacy system.
When we have a CUL, we know exactly what an apple is and exactly what an orange is. It may be that translation is still necessary outside of a bounded context, but it is much easier to reason about when it is an entirely separate line of business and entirely separate mental context.
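To make the order example concrete, here is a minimal sketch of what "one name per concept, translated once at the boundary" could look like in code. The class names, fields, and the shape of the legacy payload are all assumptions made for illustration; they are not taken from any of the systems described above.

```python
from dataclasses import dataclass
from decimal import Decimal


# Sales context: a SalesOrder is something a customer buys from us.
@dataclass(frozen=True)
class SalesOrder:
    order_id: str
    customer_id: str
    total: Decimal


# Procurement context: a PurchaseOrder is something we buy from a supplier.
@dataclass(frozen=True)
class PurchaseOrder:
    order_id: str
    supplier_id: str
    total: Decimal


def to_sales_order(legacy_order: dict) -> SalesOrder:
    """Translate a generic legacy "order" payload (hypothetical shape)
    into the agreed name, once, at the edge of the bounded context."""
    return SalesOrder(
        order_id=str(legacy_order["id"]),
        customer_id=str(legacy_order["customer"]),
        total=Decimal(str(legacy_order["amount"])),
    )


order = to_sales_order({"id": 1001, "customer": "C-42", "amount": "19.99"})
print(order)
```

Because the two kinds of order are separate types, code inside either context never has to guess which "order" it was handed; only the translation function deals with the ambiguous legacy shape.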
Another huge benefit comes during the operations phase of a system. If we have a common language between customer service, the platform team, and engineering teams, it becomes significantly easier to triage, understand, and resolve incidents and bug reports. The teams can reference a common glossary and not have to ask clarifying questions. It's just straightforward.
Software Quality
A domain can be thought of as an object in object oriented programming, or as a logical unit of separation within a system. A domain has certain rules or properties that should be followed, similar to software engineering principles like encapsulation and DRY code.
A Domain Has Ownership of its Data; Only it Can Modify the Data or Access the Datastore Directly
When I’ve had to decompose a monolith, one of the biggest challenges has been understanding how data is mutated.
A software engineer may be well intentioned but not realize that an ORM entity relying on a foreign key to a table owned by another service starts to cross boundaries. When developers update their own entity, they don’t realize they are accidentally setting properties on a table owned by that other domain.
When another developer then implements a feature in that other service, there are two possible issues stemming from a new tight coupling:
- The new code expects the changes from the other service, creating a tight coupling that easily goes unnoticed until the original developer changes their code and introduces a bug.
- The new code directly introduces a bug in the original developer’s code, which goes unnoticed because the developer of the new feature was unaware of the coupling.
As a system evolves, the tendency is to not fix these side effects, but to “handle” them. The system is then more brittle, and deployments frequently break things. Only then is there motivation to decompose the application, but the difficulty has magnified because it's impossible to identify how all of the side effects have come to impact unknown parts of the application.
Establishing ownership and a single writer makes this problem go away, because all changes are consolidated in one place.
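A rough sketch of the single-writer idea, under some assumptions: `CustomerDomain` and `OrderDomain` are hypothetical names, a plain dictionary stands in for the owning domain's private datastore, and other domains only ever receive read-only snapshots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Read-only snapshot handed to other domains."""
    customer_id: str
    email: str


class CustomerDomain:
    """Hypothetical owner of customer data; the only code that writes it."""

    def __init__(self) -> None:
        self._rows = {}  # stands in for the domain's private datastore

    def register(self, customer_id: str, email: str) -> Customer:
        customer = Customer(customer_id, email)
        self._rows[customer_id] = customer
        return customer

    def change_email(self, customer_id: str, new_email: str) -> Customer:
        # Every mutation passes through here, so the rule lives in one place.
        if "@" not in new_email:
            raise ValueError("invalid email")
        updated = Customer(customer_id, new_email)
        self._rows[customer_id] = updated
        return updated

    def get(self, customer_id: str) -> Customer:
        return self._rows[customer_id]


class OrderDomain:
    """Keeps only the customer's ID and asks the owner for current data."""

    def __init__(self, customers: CustomerDomain) -> None:
        self._customers = customers

    def confirmation_address(self, customer_id: str) -> str:
        return self._customers.get(customer_id).email


customers = CustomerDomain()
orders = OrderDomain(customers)
customers.register("C-42", "old@example.com")
customers.change_email("C-42", "new@example.com")
print(orders.confirmation_address("C-42"))  # prints new@example.com
```

The snapshot is a frozen dataclass, so an accidental `customer.email = ...` in the order code raises an error instead of silently changing data the order domain does not own.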
Domain Functionality Is Exposed Through Well-Defined Interfaces
Domains are hard boundaries of encapsulation. Within a given domain, developers have great freedom to use the right tools and patterns to accomplish the task. However, when one domain interacts with another domain’s services, it can only do so through well-defined interfaces, usually lumped into the category of an API. This can be an API or event schema in the case of a distributed system, or a class or service interface in a monolithic solution.
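For the monolithic case, a service interface could be sketched roughly as follows. `InventoryService`, `InMemoryInventory`, and `place_order` are illustrative names, not part of any real system; the point is only that the ordering code depends on the interface and never on the class behind it.

```python
from typing import Protocol


class InventoryService(Protocol):
    """Hypothetical public contract of an inventory domain."""

    def reserve(self, sku: str, quantity: int) -> bool:
        ...


class InMemoryInventory:
    """One possible implementation; its internals are free to change."""

    def __init__(self, stock: dict) -> None:
        self._stock = dict(stock)

    def reserve(self, sku: str, quantity: int) -> bool:
        available = self._stock.get(sku, 0)
        if quantity <= 0 or quantity > available:
            return False
        self._stock[sku] = available - quantity
        return True


def place_order(inventory: InventoryService, sku: str, quantity: int) -> str:
    """Ordering code sees only the interface, never the stock table."""
    return "confirmed" if inventory.reserve(sku, quantity) else "rejected"


inventory = InMemoryInventory({"apple": 3})
first = place_order(inventory, "apple", 2)
second = place_order(inventory, "apple", 2)
print(first, second)  # prints: confirmed rejected
```

Swapping `InMemoryInventory` for a database-backed or remote implementation would not require touching `place_order`, as long as the new class honors the same `reserve` signature.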
This supports the single-writer concept, but also enables DevOps practices that lead to more reliable and faster releases. We can’t rapidly deploy if we can't test what has changed.
Running a manual or long-running regression test suite prevents rapid deployment. It simply takes too long. Not to mention, do you have high confidence that those tests will catch everything?
Instead, we want to have a high level of confidence we can release part of a system and know we won’t break upstream or downstream dependencies. We minimize side effects through proper encapsulation. Imagine the proverbial ball-of-yarn where you pull a string in one place, and it makes a knot on the other side. You had no idea those strings were attached, but they were. Clear boundaries and well defined interfaces ensure this doesn’t happen with your code.
Any issue that does arise is probably related to that single domain and easier to detect. It also makes it easier to identify issues and restore service quickly because the location of the issue is already identified and the cause (the deployment) can quickly be reverted.
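One way to get that confidence without a long regression suite is a small contract check that runs against a single domain's interface before that domain is released. The sketch below is only an illustration under assumed names (`PriceLookup`, `TablePriceLookup`, `check_price_contract`) and an assumed set of promises; a real contract would list whatever the callers actually rely on.

```python
from typing import Callable, Protocol


class PriceLookup(Protocol):
    """Hypothetical interface that a pricing domain exposes to others."""

    def price_cents(self, sku: str) -> int:
        ...


class TablePriceLookup:
    """Current implementation inside the pricing domain."""

    def __init__(self) -> None:
        self._prices = {"apple": 50, "orange": 75}

    def price_cents(self, sku: str) -> int:
        if sku not in self._prices:
            raise KeyError(sku)
        return self._prices[sku]


def check_price_contract(make_lookup: Callable[[], PriceLookup]) -> None:
    """Promises that callers rely on; run before releasing the domain."""
    lookup = make_lookup()
    price = lookup.price_cents("apple")
    assert isinstance(price, int), "prices must be whole cents"
    assert price >= 0, "prices must not be negative"
    try:
        lookup.price_cents("unknown-sku")
    except KeyError:
        pass
    else:
        raise AssertionError("unknown SKUs must raise KeyError")


check_price_contract(TablePriceLookup)
contract_ok = True
print("pricing contract holds")
```

If the pricing domain is later rewritten, the same check can be pointed at the new implementation, and a failure points directly at the domain that was just deployed.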
Bringing it Together
The key principles of well encapsulated software design are only made possible when we have a correct understanding of the business CUL and consequently its data. Encapsulation will start to break down if there is a problem with how the domains were defined. All of a sudden, new functionality cannot be implemented unless it has access to mutate data in another service. For a while hoops will be jumped through, until eventually all the hoops slow velocity to a crawl. Because of external factors and the need to ship fast, an architecture that once was key to speed becomes the cause of delays. The system, microservice or monolith, gets labeled “poor architecture” and we’re back to the drawing board to figure out how to get out of the mess.
If we look deeper, the root cause stems back to not having a clear understanding of how the business models and uses its information. This is why putting the effort in up front and continually revisiting the domain model is relevant to all organizations. Assume you will get it wrong, be willing to revise, and keep a technical roadmap up to date so strategic refactoring and boundary adjustment can occur before panic sets in and a solid system gets tanked by tech debt.