The Sieve of Eratosthenes approach to achieving bug-free code
An approach any developer will understand
Introduction
Every software engineer knows (or at least should know) what the Sieve of Eratosthenes is. It is one of the fundamental algorithms taught in Computer Science classes and a favorite Leetcode or coding-interview question. It is essentially an algorithm to obtain prime numbers, and this is the key:
You obtain prime numbers by iteratively removing non-primes. The non-primes are all the multiples of the primes.
You start with a running list of numbers 2, 3, 4 … n. We know 2 is a prime, so we remove all its multiples from the list, i.e. 4, 6, 8 …. The next number is 3, and we remove its multiples, i.e. (6 is already removed), 9, (12 is already removed), 15 … and so on. You do so until you reach the end of the list (in practice, you can stop once you pass √n) and what remains is the final list of primes.
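In code, the sieve looks something like this (a minimal sketch in Python, stopping at √n as noted above):

```python
def sieve_of_eratosthenes(n):
    """Return all primes up to and including n (n >= 2)."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False  # 0 and 1 are not prime
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # Remove all multiples of p; those below p*p are already removed
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [p for p, prime in enumerate(is_prime) if prime]

print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```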
How do we apply this to the software development process?
The idea is simple: make the code pass through several filters to remove (i) the possibility of bugs, (ii) code that could potentially cause bugs, and (iii) actual bugs. In other words:
You obtain good code by gradually removing bad code or the possibility of bad code.
Here is a possible (not exhaustive) set of filters:
- Test planning: translation of product or feature requirements into a test plan and test cases.
- Code Linting
- Static Code Analysis
- Unit Tests
- Code Reviews
- Module level tests
- Module-Module Integration Tests
- End-to-end tests
- Tests for non-functional requirements, e.g. security, performance
- Post deploy tests
- Synthetic Monitoring
In the following sections I’ll outline why each layer (or filter) plays an important part in removing bugs or the possibility of bugs. Needless to say, each section only briefly touches on the subject but I hope it will give the reader a taste of it and the motivation to find out more.
Test Planning
I cannot stress enough the importance of this step. This is a task where, if there is an experienced Quality/Test engineer or test-minded engineer on the team, they can bring a lot of value to the project. The reason is that until the project deliverables are expressed as test cases, it is possible the requirements are not clear.
In the section "Toward a theory of error", authors Heusser and Larsen [1] outline 8 possible root causes of bugs. Of these 8, 6 are related to requirements. They range from missed requirements to unintended consequences of interactions between 2 or more requirements. The key problem with requirement specifications is that they are often not expressed in a rigorous and disciplined manner. Hence, when put under scrutiny during the process of creating the test plan, questions almost always arise. This is the value of that process: it forces the product manager to clarify her ideas about the feature to be delivered.
Another benefit of designing a test plan is that it clarifies to the developer what she is expected to build. This is at the very heart of Test Driven Development or TDD. Until a feature is expressed as a test, what is expected can be ambiguous.
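As a minimal sketch of this idea (the registration module, function, and exception below are hypothetical), a requirement such as "registration must reject an already-registered email" becomes unambiguous once written as a test:

```python
import unittest

# Hypothetical module under development: until this test passes, the
# requirement has not been met. This is the TDD rhythm of writing the
# test before the code.
from registration import register_user, DuplicateEmailError

class TestRegistration(unittest.TestCase):
    def test_rejects_duplicate_email(self):
        register_user("alice@example.com", "s3cret!")
        with self.assertRaises(DuplicateEmailError):
            register_user("alice@example.com", "another-password")

if __name__ == "__main__":
    unittest.main()
```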
This is even more pronounced on complex projects with multiple teams working on various aspects. Unless there is a clear specification of what the complete product is supposed to look like and how each team contributes to the overall user experience, what is likely to happen is that, come integration time, teams will realize they misunderstood the requirements. Again, an overarching test plan can play a key role in forcing conversations between teams to better express the "contract" between them. This is the very essence of Contract-Based Testing.
Code Linting
This is the most basic form of static code analysis (see next section). It is the process of using a tool to automatically analyze source code for potential errors, stylistic issues, and violations of coding conventions or best practices. Stylistic and coding conventions matter because a developer works as part of a team: her code needs to be understood by other members of the team in order to be properly maintained, and hence not introduce bugs down the road.
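To make this concrete, here is a small Python snippet annotated with the kind of findings a linter such as flake8 would report (the codes in the comments are flake8's; exact messages vary by tool and configuration):

```python
import os                          # F401: 'os' imported but unused

def total(items):
    unused = []                    # F841: local variable assigned but never used
    result = 0
    for i in items: result += i    # E701: statement on the same line as the colon
    return result
```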
In [2] I paint the nightmare scenario: “you get a critical production bug on a Friday night at 2am. You’ve had a long week but the SLA requires you to fix this bug ASAP. You go look at the code to see what’s the issue and you curse and swear because the last developer who had worked on this part of the code had made it difficult to understand. Worst of all, he left the company last week and you know that he’s currently taking some time off in the Amazon jungle and thus is not contactable.”
Lastly, sloppily written code usually also points to code that is not well thought through or was rushed; such code is definitely worth the time of a deeper review.
Static Code Analysis
Static code analysis is a software development technique where source code is examined without execution. It involves analyzing code at the source level to identify bugs, security vulnerabilities, and adherence to coding standards. It is essentially automated code inspection. This proactive process detects issues early in the development lifecycle, ensuring cleaner, more maintainable code.
One of the more popular static code analysis tools, SonarQube (or its cloud product, SonarCloud), covers the following key analyses:
- Cyclomatic & Cognitive Complexity
- Code smells
- Code coverage
- Identifying security vulnerabilities
Cyclomatic & Cognitive Complexity
See [3] for a more comprehensive explanation of what these are. It suffices to say that both are measures of how complex the code is, with the latter being the more meaningful for modern structured programming languages, e.g. Java. Understanding which parts of the code (e.g. classes, functions, methods) are more complex helps advise the developer on which areas to "bubble wrap" with unit tests.
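As a rough illustration (the function and numbers are made up), every extra branch below adds a path through the code; the table-driven refactor has a single decision point and reads more simply, which is roughly what a lower cognitive complexity score captures:

```python
# Branch-heavy version: cyclomatic complexity grows with each elif.
def shipping_cost_branchy(region):
    if region == "US":
        return 5.0
    elif region == "EU":
        return 7.5
    elif region == "APAC":
        return 9.0
    else:
        raise ValueError(f"unknown region: {region}")

# Table-driven refactor: one decision point, easier to extend and test.
SHIPPING_COST = {"US": 5.0, "EU": 7.5, "APAC": 9.0}

def shipping_cost(region):
    try:
        return SHIPPING_COST[region]
    except KeyError:
        raise ValueError(f"unknown region: {region}")
```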
Code Smells
Code smells are indicators in source code that suggest the presence of deeper problems or potential issues in software design or implementation. They are not bugs themselves but rather symptoms that may lead to maintainability, readability, or performance problems. Code smells often result from poor coding practices or design choices. Examples include duplicated code, long methods, excessive commenting, and inappropriate coupling between classes. Identifying and addressing code smells through refactoring helps improve code quality, making it more maintainable, understandable, and adaptable. In essence, badly written code is a sign of poor code quality and has a higher tendency to lead to bugs.
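For example, duplicated code is one of the most common smells; extracting the shared logic removes it (a minimal sketch with hypothetical validation rules):

```python
# Smell: the same validation logic is duplicated in two functions, so a
# future rule change must be made (and remembered) in both places.
def create_order(email, qty):
    if "@" not in email:
        raise ValueError("invalid email")
    if qty <= 0:
        raise ValueError("quantity must be positive")
    # ... create the order ...

def update_order(email, qty):
    if "@" not in email:
        raise ValueError("invalid email")
    if qty <= 0:
        raise ValueError("quantity must be positive")
    # ... update the order ...

# Refactored: the duplication is extracted into one helper that both
# functions call, so the rules live in exactly one place.
def validate_order(email, qty):
    if "@" not in email:
        raise ValueError("invalid email")
    if qty <= 0:
        raise ValueError("quantity must be positive")
```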
Code Coverage
Code coverage is a metric that quantifies the percentage of lines, branches, statements, or other structural elements of the code that have been exercised by the test suite (usually unit tests). It provides an assessment of the thoroughness of the tests and identifies areas of code that remain untested. SonarQube provides metrics for (i) line, (ii) branch, and (iii) function/method coverage.
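The distinction matters: a test suite can execute every line yet still miss a branch. A sketch (using pytest with the pytest-cov plugin, assuming it is installed):

```python
def apply_discount(price, is_member):
    if is_member:
        price = price * 0.9
    return price

def test_member_discount():
    # This single test executes every line above (100% line coverage),
    # but the is_member == False branch is never exercised. Branch
    # coverage exposes the gap:
    #   pytest --cov=mypkg --cov-branch
    assert apply_discount(100, True) == 90.0
```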
Identifying Security Vulnerabilities
By analyzing the code, tools like SonarQube match code patterns that point to possible security vulnerabilities, such as those in the OWASP Top 10. This helps prevent possible security exploits.
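A classic pattern such tools flag is SQL built by string interpolation, the injection category of the OWASP Top 10. A sketch of the vulnerable form and its fix:

```python
import sqlite3

def find_user_vulnerable(conn, username):
    # Flagged: user input is interpolated directly into the SQL string,
    # so a username like "x' OR '1'='1" changes the query's meaning.
    return conn.execute(
        f"SELECT * FROM users WHERE name = '{username}'"
    ).fetchall()

def find_user_safe(conn, username):
    # Fixed: a parameterized query keeps the input as data, not SQL.
    return conn.execute(
        "SELECT * FROM users WHERE name = ?", (username,)
    ).fetchall()
```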
Unit Tests & Code Reviews
Unit Tests and Code Reviews are actually 2 separate "filters" (since we are using that metaphor in this article). However, as I have explained in Unit Tests and Code Reviews: the bedrock of software quality [2], they form a sort of reinforcing loop: unit test results make great input for code reviews, and code reviewers benefit from the fact that passing unit tests are very effective documentation of the code under test.
Documentation will become obsolete but code is always the source of truth.
Module (or Component) level tests
The key idea here is to test in isolation. A software module (or component) is a self-contained and independent unit of software that performs a specific set of functions or tasks within a larger software system. This could be a library, microservice, or web service. It exposes a software contract, i.e. how it is designed to be used and what it is for, e.g. a microservice that performs all user registrations.
The module on its own should be testable without (much) interaction with other modules. If it does interact with other modules, it makes sense to create mocks or stubs that simulate the behavior of those external modules, so that the module under test can be verified to work as defined and so that assumptions made about the external modules are captured in the mocks, stubs, and tests.
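A minimal sketch using Python's unittest.mock (the checkout service and payment gateway here are hypothetical):

```python
from unittest.mock import Mock

class CheckoutService:
    """Module under test: depends on an external payment gateway."""
    def __init__(self, gateway):
        self.gateway = gateway

    def checkout(self, amount):
        receipt = self.gateway.charge(amount)
        return {"status": "paid", "receipt": receipt}

def test_checkout_in_isolation():
    # The stub captures our assumption about the external module's
    # contract: charge() returns a receipt id on success.
    gateway = Mock()
    gateway.charge.return_value = "rcpt-123"

    result = CheckoutService(gateway).checkout(42.0)

    assert result == {"status": "paid", "receipt": "rcpt-123"}
    gateway.charge.assert_called_once_with(42.0)
```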
Module-Module integration tests
These are tests of the interactions and the contracts between modules. There are several possible approaches that limit the scope so that this does not essentially become an end-to-end or system-level test:
- Pair-wise
- Logical groups
Pair-wise
In this approach, if a module A interacts with 2 other modules (B and C), then the pair-wise integration tests for A would be A-B (with a mock for C) and A-C (with a mock for B), as sketched below. As you can see from this example, the number of possible integration tests can easily grow combinatorially, hence it makes sense to do this only for the most critical integrations.
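Sketched in code (all module names are placeholders), the A-B pair-wise test uses the real B with a mock standing in for C:

```python
from unittest.mock import Mock

class PricingService:                # module C (mocked in this test)
    def quote(self, sku): ...

class InventoryService:              # module B (used for real)
    def reserve(self, sku):
        return f"res-{sku}"

class OrderService:                  # module A, the one under test
    def __init__(self, inventory, pricing):
        self.inventory, self.pricing = inventory, pricing

    def place(self, sku):
        return {"reservation": self.inventory.reserve(sku),
                "price": self.pricing.quote(sku)}

def test_a_b_pairwise():
    mock_c = Mock(spec=PricingService)
    mock_c.quote.return_value = 10.0
    order = OrderService(inventory=InventoryService(), pricing=mock_c)
    assert order.place("sku-1") == {"reservation": "res-sku-1", "price": 10.0}
```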
Logical groups
I find this more practical, as groups of modules are usually (a) managed by the same pod or team and (b) there are a lot of ways of exploiting cohesion among modules, e.g. all the modules that support the Billing business function.
End-to-end tests
The goal of end-to-end (or E2E) tests is to simulate real user scenarios and ensure that all components of a system, including the frontend user interface, backend services, databases, and external integrations, work together as intended. These contrast with module- or unit-level tests in that the focus is on how the individual components work together to provide value for the user. Each of these tests is designed to perform a specific user function, e.g. purchasing a product from an ecommerce site or performing a user registration.
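A sketch using a browser-automation tool such as Playwright (the URL and selectors are placeholders for a real application):

```python
from playwright.sync_api import sync_playwright

# E2E sketch: a simulated user registers and lands on a welcome page.
# UI, backend services, and database all have to cooperate for this to pass.
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://shop.example.com/register")
    page.fill("#email", "e2e-test@example.com")
    page.fill("#password", "correct horse battery staple")
    page.click("button[type=submit]")
    assert "Welcome" in page.text_content("h1")
    browser.close()
```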
Tests for Non-functional requirements
Non-functional requirements are characteristics that describe how a system should perform its functions rather than specifying what functions the system should perform. Examples of such requirements include: reliability, performance, scalability, usability, regulatory requirements.
There are tests for reliability, performance, and scalability, and these usually fall into the realm of Performance Engineering. They test the module or subsystem's performance, scalability, and behavior under load. The oft-cited example is an ecommerce site preparing for Black Friday. In such a situation, management will want to know how the site will perform if traffic spikes to 10x or even 100x, and what contingencies need to be put in place.
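A load-test sketch using a tool like Locust (the endpoints, weights, and user counts are illustrative):

```python
from locust import HttpUser, task, between

class ShopUser(HttpUser):
    # Each simulated user waits 1-3 seconds between requests.
    wait_time = between(1, 3)

    @task(3)  # browsing is weighted as 3x more common than buying
    def browse_catalog(self):
        self.client.get("/products")

    @task(1)
    def checkout(self):
        self.client.post("/cart/checkout", json={"sku": "sku-1", "qty": 1})

# Run against a staging host, e.g.:
#   locust -f loadtest.py --host https://staging.example.com \
#          --users 1000 --spawn-rate 50
```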
Post deploy tests and checks
These are tests or checks that are performed immediately after a production deploy. The reason is simple: bugs most often show up in production right after a deploy.
Sometimes, and for good reasons, some tests cannot be performed on the production environment because the cost would be prohibitive or there are security implications, e.g. performing a high-cost purchase or one requiring credit checks. In such situations, checking for issues post deploy leans more heavily on observability, i.e. instrumenting the code to provide signals of key events, e.g. failed transactions.
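A post-deploy smoke check can be as simple as probing a few cheap, side-effect-free endpoints right after the deploy (a sketch; the host and endpoints are hypothetical):

```python
import sys
import requests

BASE = "https://shop.example.com"                # hypothetical production host
CHECKS = ["/healthz", "/api/products?limit=1"]   # safe, read-only probes

def smoke_test():
    for path in CHECKS:
        resp = requests.get(BASE + path, timeout=5)
        if resp.status_code != 200:
            print(f"FAIL {path}: HTTP {resp.status_code}")
            return 1
        print(f"OK   {path}")
    return 0

if __name__ == "__main__":
    sys.exit(smoke_test())  # a non-zero exit can gate or roll back the deploy
```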
Another approach to post deploy checks is to utilize synthetic monitoring.
Synthetic monitoring
Synthetic monitoring involves the creation of simulated transactions or interactions with a system to evaluate its performance and user experience. For example, such a test could involve a simulated user performing a registration and then proceeding to purchase a product.
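At its simplest, this is a scheduled probe that times a scripted transaction and raises an alert when it fails or slows down (a sketch; the endpoint, status code, and threshold are assumptions):

```python
import time
import requests

def alert(message):
    print("ALERT:", message)  # stand-in for a pager or chat integration

def registration_probe():
    """One synthetic transaction, run on a schedule (e.g. every minute)."""
    start = time.monotonic()
    resp = requests.post(
        "https://shop.example.com/api/register",   # hypothetical endpoint
        json={"email": f"probe+{int(start)}@example.com", "password": "x"},
        timeout=10,
    )
    elapsed = time.monotonic() - start
    if resp.status_code != 201 or elapsed > 2.0:   # assumed SLO: 2 seconds
        alert(f"registration probe: HTTP {resp.status_code} in {elapsed:.2f}s")
```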
If implemented comprehensively covering key/critical functionality of the site, Synthetic Monitoring can become a company’s Iron Dome (the Israeli air-defense system).
Conclusion
In this article I have presented a metaphor for achieving bug-free code, that of the Sieve of Eratosthenes or a sequence of filters, along with brief explanations of each filter. One may think this would be very resource-heavy and manual, but if the filters are automated as part of a CI/CD pipeline, the process can be very efficient.
However, achieving absolutely bug-free code may be impractical. There are many other factors at play when shipping code, e.g. time to market, the cost of testing, etc.
If the customer wants a bicycle, do you build a Porsche? Can the customer afford to own a Porsche?
How much quality can a company or organization afford? That really depends on the stage of growth a company is at. And that is the subject of another article.
References
[1] “Software Testing Strategies: A testing guide for the 2020s”, Heusser M & Larsen M, Packt Publishing, 2023
[2] “Unit Tests and Code Reviews: the bedrock of software quality”, Heemeng Foo, Medium, Sept 2020, https://medium.com/dev-genius/unit-tests-and-code-reviews-the-bedrock-of-software-quality-9a23cd24558b
[3] “Cyclomatic & Cognitive Complexity”, Heemeng Foo, Slideshare, Apr 2020, https://www.slideshare.net/HeemengFoo/cyclomatic-and-cognitive-complexity