What happens when things break? What happens when software fails? We regard it as a normal and personal inconvenience when apps crash or servers become unavailable, but what are the implications beyond the individual user? Is software reliability simply a business decision or does it have economic, social and cultural consequences? What are the moral and practical implications for software developers? And when we talk of ‘systems’, are we part of the ‘system’? What about the bugs on our side of the keyboard?
JAXenter: What do we mean when we talk about software failures?
Kevlin Henney: We can consider a software failure as an unhandled error condition that upsets not only the software but also something in the world around it, whether that is someone’s temper, someone’s bank account or someone’s social reality.
JAXenter: How can we become more reliable? Should we?
Kevlin Henney: With every passing moment there is more and more software being developed. There are more people doing it and there are more spaces that software is covering, from in-your-palm apps to industrial control systems, from the convenient to the life critical.
There is an implied responsibility. Is our software reliable enough for such great responsibility? Given the number of error screens in public places that people send me, the number of complaints about time spent — or, rather, lost — in fixing defects and the amount of money businesses hemorrhage through software failures, whether high-profile or not, it seems clear that if reliability is a destination, not only are we not there yet, but we’ve got some way to go.
How do we do it? This is more a question of application than a question of possibility — most reliability issues are solved and solvable problems. Certainly, we don’t know how to address all problems but we know a lot more than we are putting into practice.
So, we take what we know and apply it. We make the business case for it. We make the moral case for it. We embed it in practice and attitude. And we make it part of the conversation, part of the culture.
JAXenter: “Move fast and break things” is a famous motto. Is this a good or a bad thing?
Kevlin Henney: It is a context-specific thing. When applied within the appropriate context, it can be considered a good thing, an invitation to experiment freely and without restraint, to discover new ways of working and thinking, to break out of an overly comfortable or stuck place. On the other hand, when applied outside the appropriate context it can undermine people and their work, can come across as irresponsible, arrogant and lacking in self-awareness. I would consider this a bad thing.
JAXenter: Is software reliability simply a business decision or does it have other consequences as well? What are the implications for software developers?
Kevlin Henney: The case for reliability is human and economic. Whether dealing with individuals or companies or society as a whole, a lack of reliability consumes people’s trust, time and money. In the worst cases, it can cost lives and livelihood. In other words, this becomes a moral question.
In some domains, the question of reliability threatens to become a legal one, so there is an obvious incentive there. However, in other application areas, where users have a choice and also have a desire for convergence of application features and UX, reliability can prove to be a differentiator. Whichever way you look at it — ethics, laws or markets — there is a strong case for increasing reliability.
Development, and therefore developers, need to account for the context in which a component or a software system operates. What are the implications of failure in that context? Development needs to go further than the technical stack; the full stack includes the world and people around the software.
JAXenter: More often than not, architecture is seen as a separate concern from the development process. Why is that and what should the relationship between the two be?
Kevlin Henney: Why roles — such as project manager and technical lead — and disciplines — such as development process and architecture — that are ultimately focused on the same thing — in this case, software and its development — end up separating has a long history with many causes and reinforcements. There is a long history of believing that technical work is of lower status than managerial work, which leads to hierarchy and a vertical separation. There is also a long history of horizontal role specialisation in both modern business and in the constantly expanding world of technology — the more there is, the more you need to know, the more you need to develop expertise or draw on expertise, and so on.
It’s true that you can’t know everything or be equally good at everything you do, but it also turns out that role overspecialisation and separation bring a narrowness to development work that is itself a problem. We find many breakthroughs in science, technology and the arts come from synthesis and crossover, from breaking down silo walls and glass ceilings, from walking across strict separations.
The idea that such separation promotes expertise through focus ignores the fundamental communication overheads, mismatched frames of knowledge and practical challenges that come with separating two entangled points of view on the same thing. Separating how you build from what you build is a naïve way to approach building, and yet such a separation has captured the imagination — or perhaps lack of imagination? — and orthodoxy of development for too long.
What should the relationship be? It should be intimate. And, like any close relationship, it should be attentive, caring and respectful.
JAXenter: What’s the biggest error developers make when trying to create a specific enterprise architecture?
Kevlin Henney: That they’re trying to create a specific enterprise architecture. They need to work more loosely, to recognise that the creation of any successful architecture is a fluid activity involving ongoing changes and emerging understanding of both the problem domain and the solution domain.
JAXenter: Could you name three anti-patterns of agile adoption?
Kevlin Henney: Yes.
- Changing the labels but not the actual roles and practices, e.g., phase becomes iteration or sprint, project manager becomes ScrumMaster, status meeting becomes daily stand-up.
- Churning out functionality without paying attention to technical and team practices. You don’t go faster just because you put your foot down harder on the accelerator; you also have to remember to release the brake, to have fuel and to be aware of the road and the route.
- Simple as it is, and as old as it is, missing both the subtle and obvious implications of the values of the Agile Manifesto, such as obsessing over processes and tools at the expense of people, or following a plan — whether schedule or architectural — regardless of contraindicating change or feedback.
JAXenter: What is your favorite agile tool and why?
Kevlin Henney: The whiteboard. With the exception of my (often poor) font choice, it’s open to possibilities and participation and does not imply an overly strong sense of commitment, i.e., it doesn’t freeze ideas too early or straitjacket thinking to the limitations of code, editors or other software tools.
JAXenter: What can attendees get out of your keynote?
Kevlin Henney: Some good stories, a bigger picture, motivation and a clearer way of reasoning about the relationship between software, its behaviour and its unexpected consequences on the world around it.
Fill in the blanks
Dev and Ops work best together if … they are considered together.
The biggest obstacle for DevOps is … that it is treated as a separate role or activity.
What promotes employee satisfaction is … a sense of progress, a sense of ownership and a sense that the organization around them cares.
The biggest advantage of autonomously-working teams is … risk reduction through increased group intelligence.
It is important for a positive company culture to … recognize that culture is dynamic and its assumptions can become embedded or change subtly over time.