Sun Tzu and the art of Software Test Management

Feb 8, 2023

Know yourself and your enemy

The Art of War written in bamboo (credit: vlasta2, bluefootedbooby on flickr.com — https://www.flickr.com/photos/bluefootedbooby/370458424/)

Introduction

As a test/quality engineering leader, I am often asked what my test management philosophy is. Essentially, the question centers on how to decide on the approach, strategy and execution for various test projects. The thing is this: every company is different, every project is different, every product/service is different and so is every product-engineering team. This is not fast food where one size fits all. Moreover, the resources available for test (or any aspect of engineering, for that matter) are finite. You need to pick your battles and focus on the things that matter. How do you do that?

I recently reflected on this and, as a student of the Chinese language (I was a terrible student, by the way), the proverb “know yourself and your enemy and you will win every battle” (知己知彼，百战百胜) came to mind. This proverb is attributed to Sun Tzu, author of the famous Art of War, and I immediately wondered if it could be applied to test management. More on that later.

The proverb

In its entirety, the proverb attributed to Sun Tzu goes like this:

If you know yourself and your enemy, you will win every battle. If you know yourself but not your enemy, it is 50–50. If you know neither yourself nor your enemy, you will lose every battle.

Since we are on the topic of test management, what does knowing yourself and your enemy mean here?

Applying this to Test Management

As I meditated on the proverb and looked back at all the projects I have led, I came up with the following list:

1. Understand the user
2. Understand the product
3. Understand the architecture the product is built on
4. Understand where the software has been, who’s worked on it. This helps you to understand the origins of its complexity
5. Understand the engineering team
6. Understand the test team’s capabilities and resources
7. Understand the product team
8. Understand the business

How do we map this to “knowing yourself and knowing your enemy”? Well, in “You can’t fix quality just by catching bugs” [1], I explained that the pursuit of quality is a team sport. It involves Engineering, Test, Product, Customer Support, etc. Hence (2), (5), (6) and (7) fall into the “knowing yourself” category.

What then is the “enemy”? The enemy here is complexity. Bugs are the result of complexity. All things being equal, no engineer wants bugs. Bugs mean lost sleep, lost family time, lost weekends, missed ship dates, etc. Yet, due to time-to-market requirements and the increasingly complex nature of systems, bugs, tech debt and dark debt show up (see [2]). Hence, (1), (3), (4) and (8) fall into the complexity camp, and complexity is what we seek to reduce or manage.

On a side note, in “Who should do software testing? Dev or Test?” [3], I wrote about the unhealthy adversarial relationship between the dev and test teams. This conflict stems from the mindset that bugs are caused by “human error” (see [4]), rather than treating bugs as symptoms of excessive or poorly managed complexity. The result is that test folks blame devs for allowing bugs to enter the code base and devs blame test for allowing the bugs to get into production. This is ultimately unhealthy. At the end of the day, both dev and test are on the same team. We want to produce quality software.

In the following sections, I briefly explain why I feel each of the 8 “understands” helps us get a better picture of how to approach test management.

Understand the user

Most of the time, the test/quality engineering team is asked to be the “voice of the user”, so it makes sense to have a good idea of what value the app or service brings to the user, how it is used, when it is used, etc. This also helps in test planning and prioritization, since it allows you to focus on what is most important to the user at the time of release. For example, I was once involved in leading testing in digital agriculture. In that domain, there are 2 key time periods: planting and harvest. Those are the periods when the apps/services are most used. Hence, during planting season in the northern hemisphere, the focus of testing is on the planting activities (or rather the software that supports them).

Understand the product

Is the product a consumer or enterprise product? Who pays for the product? In subscription or usage-based products, it is the user who pays. Free-to-use products are usually paid for by ads. In such cases, it’s the advertiser who’s the customer. Make sure you know who the customer is.

What is its key value proposition, i.e. what is the key value that users derive from your product? This really helps you to assess the impact of bugs. For most e-commerce products, e.g. Etsy, the value proposition is frictionless sale and fulfillment. For most social media sites, e.g. Instagram, it is, on the one hand, FOMO (fear of missing out), envy or alleviation of boredom for the user and, on the other hand, an understanding of consumer intent and desires and the ability to serve the right ad to the right user for the advertiser.

Understand the architecture

I was once in charge of testing a portion of mobile search. In a typical search architecture, there are “domains” with trigger keywords. In a large search setup, different domains are handled by different teams and subsystems. One area of brittleness in the architecture lay in how sets of keywords were indexed. To address that, I wrote some synthetic monitoring scripts (it wasn’t called that then) to periodically perform searches on the sets of keywords that my team was in charge of. It was a simple script, but it allowed us to bring our MTTR down to about 2 hours.
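To make the idea concrete, here is a minimal sketch of what a synthetic monitoring probe of this kind might look like in Python. It is not the original script. The `search` callable, the notion that a search returns a list of triggered domain names, and the `alert` hook are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable

# Assumption: a search client takes a query string and returns the names
# of the domains that were triggered for it. Swap in your real client.
SearchFn = Callable[[str], list]


@dataclass
class ProbeResult:
    keyword: str
    ok: bool
    detail: str


def probe_keywords(search: SearchFn, keywords: Iterable[str],
                   expected_domain: str) -> list:
    """Run one search per keyword and check that the expected domain triggered."""
    results = []
    for keyword in keywords:
        try:
            domains = search(keyword)
        except Exception as exc:  # a failed request is itself a finding
            results.append(ProbeResult(keyword, False, f"search failed: {exc}"))
            continue
        if expected_domain in domains:
            results.append(ProbeResult(keyword, True, "ok"))
        else:
            results.append(ProbeResult(keyword, False, f"got domains {domains}"))
    return results


def run_forever(search, keywords, expected_domain, alert, interval_seconds=300):
    """Probe on a fixed interval and call the (hypothetical) alert hook on failures."""
    while True:
        failures = [r for r in probe_keywords(search, keywords, expected_domain)
                    if not r.ok]
        if failures:
            alert(failures)
        time.sleep(interval_seconds)
```

The probe interval puts a rough upper bound on how long a broken keyword set can go unnoticed, which is why even a simple script like this can shorten recovery time.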

So the key takeaway is this: get a senior engineer or architect to walk you through the system architecture and ask probing questions. That usually leads you to identify weaknesses and hence potential causes of issues.

Also take an interest in common system design patterns. There are plenty of YouTube channels devoted to this topic. Remember:

If you don’t know how something is built, you don’t know how it should be tested

Understand where the software has been

I was once put in charge of testing a Fantasy Sports iOS app. This was only a few years after apps on the iPhone and iPad became popular. Not only were the devs new to the platform but, due to the uncertain state of the company at that time, there was also a fair amount of churn in the code. This led to unused or underused code, code duplication and lots of tech debt. Needless to say, the code quality was bad and hence so was the product quality.

In such situations, you can’t expect testable code, since a major refactoring is in order. The best you can do is either write automated E2E tests or test manually until the dev team finds the resources and budget to overhaul the app or rebuild it from scratch.

Understand the engineering team

Each engineering (or development) team is different. What are their strengths and weaknesses? Who in the team is most likely to be the source of bugs? How mature are their software engineering processes? One key thing I look out for is how good the code review process is and how much the team as a whole participates in code reviews. Are there a lot of reviews with only an “LGTM” in them? Another area to look at is unit test coverage. In “Unit Tests and Code Reviews: the bedrock of software quality” [5], I explain how unit tests and code reviews together form a strong foundation for achieving good code and how they reinforce one another.
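The “LGTM only” question can be turned into a rough number. The sketch below assumes you have already exported each review as a plain list of comment strings (how you get them out of your code review tool is up to you), and the set of phrases treated as a rubber stamp is my own guess, so adjust it to your team’s habits.

```python
# Assumption: phrases that count as a rubber stamp when they are the only
# thing said in a review.
RUBBER_STAMP_PHRASES = {"lgtm", "looks good to me"}


def is_rubber_stamp(comments):
    """True when a review has no comments, or nothing but LGTM-style comments."""
    normalized = [c.strip().lower().rstrip(".!") for c in comments if c.strip()]
    return all(c in RUBBER_STAMP_PHRASES for c in normalized)


def rubber_stamp_ratio(reviews):
    """Fraction of reviews that are rubber stamps; 0.0 when there are no reviews."""
    if not reviews:
        return 0.0
    return sum(is_rubber_stamp(r) for r in reviews) / len(reviews)
```

A high ratio is a prompt for a conversation, not a verdict: small, low-risk changes legitimately attract little discussion.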

Of course, a very obvious data point is the number of bugs a team produces. Some other very telling metrics are: (a) the aging profile of the team’s bugs, (b) how many production incidents were attributed to the team’s code and (c) how many bugs reported by customer support get attributed to the team.
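As an illustration of metric (a), an aging profile is simply a count of open bugs grouped by how long they have been open. The bucket boundaries below are assumptions for the sake of the example, not a standard, and the input is assumed to be the opened dates of a team’s currently open bugs.

```python
from collections import Counter
from datetime import date

# Assumption: bucket upper bounds in days; pick whatever suits your release cadence.
AGE_BUCKETS = [(7, "0-7 days"), (30, "8-30 days"), (90, "31-90 days")]
OLDEST_BUCKET = "over 90 days"


def age_bucket(opened, today):
    """Return the label of the bucket an open bug falls into."""
    age_days = (today - opened).days
    for upper_bound, label in AGE_BUCKETS:
        if age_days <= upper_bound:
            return label
    return OLDEST_BUCKET


def aging_profile(open_bug_dates, today):
    """Count open bugs per age bucket, youngest bucket first."""
    counts = Counter(age_bucket(d, today) for d in open_bug_dates)
    labels = [label for _, label in AGE_BUCKETS] + [OLDEST_BUCKET]
    return {label: counts.get(label, 0) for label in labels}
```

A profile that is heavy in the oldest buckets suggests bugs are being parked rather than fixed, which is worth raising with the team.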

The less mature the development team is, the more valuable E2E tests are. As the dev team matures into a more contract-based approach to developing software, the test team can then pivot to a more contract testing approach.
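For readers new to contract testing, here is a stripped-down sketch of the core idea: a consumer writes down the fields it relies on, and a test checks that the provider’s response still honors them. The `ORDER_CONTRACT` fields are invented for illustration, and real projects would typically use a dedicated contract testing tool rather than a hand-rolled check like this one.

```python
# Hypothetical contract: the fields (and types) a consumer relies on
# in an order response.
ORDER_CONTRACT = {"order_id": str, "total_cents": int, "items": list}


def contract_violations(response, contract):
    """List every way the response breaks the contract; empty means compliant."""
    violations = []
    for field, expected_type in contract.items():
        if field not in response:
            violations.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            actual = type(response[field]).__name__
            violations.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    return violations
```

Note that extra fields in the response are deliberately ignored: the provider is free to add things, as long as what the consumer depends on stays intact. That is what lets such tests run against one service at a time instead of a full E2E environment.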

Understand the test team’s capabilities and resources

Let’s face it: you will never have all the resources you need. At heart, the QA or test organization will always be a cost center. Take the time to understand what talent and skillsets each member of the test team brings to the table. Augment their skillsets with test vendors and tools that act as “force multipliers”.

Understand the product team

As a test manager, you need to develop a strong relationship with your product counterpart. Understand what the product or feature brings to the business. Get a good grasp of what the product team’s priorities are and how they look to move the needle with feature changes and overhauls. This will enable you to better plan the resourcing of your team for projects down the road.

Also, quality is a team sport. Get your product counterpart involved in the quality efforts. They too want a product that delights users. They ultimately want a product they can be proud of.

Understand the business

I cannot stress enough how important this is: understand the business. At the end of the day, it is the business that pays the bills. Align your test efforts with what is most important to the people who sign your (and your team’s) paychecks. Endeavor to understand the complexities that the very nature of the business brings to the product.

Develop a deep understanding of the Cost of Quality. In other words, what is the opportunity cost of not having the quality (or test) organization in the company? What would likely drop, and what would it cost the business if that were to happen? That gives you an idea of how much value the test organization brings to the table.

Conclusion

In this article I’ve laid out some ideas on how to approach test management. It starts with understanding the key challenge in software development: taming complexity. To do so, you need to understand both the sources of complexity and the teams and people involved in tackling it. Quality is a team effort and the test manager is a facilitator and orchestrator of the efforts to contain and manage said complexity. If this is done well, it provides a lot of value to the company and the business.

References

[1] “You can’t fix quality just by catching bugs”, Oct 2020, Heemeng Foo, https://medium.com/swlh/you-cant-fix-quality-just-by-catching-bugs-ddc01d900474

[2] “Meltdown: what plane crashes, oil spills and dumb business decisions can teach us about how to succeed in work and at home”, 2018, Chris Clearfield & Andras Tilcsik, Penguin Books

[3] “Who should do software testing? Dev or Test?”, Jun 2020, Heemeng Foo, https://medium.com/dev-genius/who-should-do-software-testing-dev-or-test-41c7ea39ee83

[4] “The Field Guide to Understanding ‘Human Error’”, 2014, Sidney Dekker, CRC Press

[5] “Unit Tests and Code Reviews: the bedrock of software quality”, Sep 2020, Heemeng Foo, https://medium.com/dev-genius/unit-tests-and-code-reviews-the-bedrock-of-software-quality-9a23cd24558b