Are you Well Architected?

Photo by Shane Rounce on Unsplash

Whether your cloud infrastructure runs a multi-million pound industry or a start-up aiming to grow, you should be asking yourself: “How good is my cloud set-up?”

This can be a hard question to answer. You may have multiple accounts with your cloud service provider; you may be running multiple applications that use multiple cloud services in various combinations. You may have policies defining the access and use of cloud services by your employees and contractors, but do you know if these policies are being applied or monitored?

More generally, the question itself is vague. What does “good” mean anyway? How can you define “good”? These are the sorts of questions that keep managers awake at night.

I specialise in Amazon Web Services (AWS), so I will try to answer the question for that specific domain. Other cloud providers have different ways of approaching the problem.

AWS is by far the biggest provider of cloud services in the world. They currently offer well over 100 different products and services that you can use for your business, from general compute power and storage capacity, to really specialised services like Blockchain or Internet of Things (IoT) capabilities. It is an often bewildering array of options that you can self-serve into any combination you like. And it keeps growing, in depth and breadth.

They recognised that the question of how good a specific combination of these services was could turn quite complex quite quickly. So they came up with the Well Architected Framework.

In essence, this framework tries to answer the question by breaking the problem down into five key areas, or “pillars” in the AWS jargon:

  • Security – How good is your set-up at protecting information, systems and assets?
  • Reliability – How good is your set-up at recovering from infrastructure or service disruptions? Can it grow or shrink to meet demand?
  • Operational Excellence – How good is your set-up at running and monitoring your systems?
  • Performance Efficiency – How good is your set-up at using resources efficiently and at adapting to technological change?
  • Cost Optimisation – How good is your set-up at running in the most cost-effective way?

Interestingly, your “set-up” is not simply your infrastructure and code. It is also, crucially, the processes and procedures that you have in place to implement, monitor and adapt. Your system could be perfect on day one, but things can easily and quickly degrade in any of the above areas if you don’t have the mechanisms in place to stay up to date.

So the Well Architected Framework takes you through a series of questions that, in effect, force you to take a good, hard look at everything that you are doing and identify gaps in your set-up when compared with current architectural best practice.

For example, in the Security section it asks “How do you manage credentials and authentication?” A simple question that leads to many more: Do you enforce multi-factor authentication (MFA) on log-in? Do you force people to rotate passwords regularly? Do you have a written identity and access management policy? 

As you answer these questions, the gaps become easily apparent, as do the paths to remediation. Enforce MFA. Enforce password rotation. Write down your policies.
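To make the idea concrete, here is a minimal sketch of how answers to those follow-up questions turn into a to-do list. It is only an illustration: the `Practice` class, the `find_gaps` function and the sample answers are all hypothetical, and none of this is part of the AWS tooling.

```python
from dataclasses import dataclass


@dataclass
class Practice:
    question: str     # the follow-up question asked in the review
    in_place: bool    # the answer the team gave (hypothetical data)
    remediation: str  # what to do if the answer is "no"


def find_gaps(practices):
    """Return the remediation for every practice that is not in place."""
    return [p.remediation for p in practices if not p.in_place]


# Hypothetical answers for an imaginary company.
credentials_review = [
    Practice("Do you enforce MFA on log-in?", False, "Enforce MFA"),
    Practice("Do you force people to rotate passwords regularly?", True,
             "Enforce password rotation"),
    Practice("Do you have a written identity and access management policy?",
             False, "Write down your policies"),
]

for action in find_gaps(credentials_review):
    print("To do:", action)
```

In a real review the answers are rarely a clean yes or no, so treat the boolean as a simplification.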

Or when it comes to Reliability, it asks: “How do you monitor your resources?” Again, a simple question that leads to others (“Are you even monitoring the default metrics?”, “Are you sending notifications based on the monitoring?”, “Do you perform automated responses on events?”), which force you to look at that area of your infrastructure. Now, you may decide that even though you are sending notifications, you are not performing any automated responses based on events (so a human has to decide what to do). This may be OK in your case. But the review forces you to understand that and decide what levels of risk you are prepared to take. There are often no binary answers, just trade-offs.
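The difference between "notify only" and "notify and respond automatically" can be sketched in a few lines. This is a toy model under my own assumptions, not how any AWS service works: the `handle_metric` and `add_capacity` names, the metric name and the threshold are all made up for illustration.

```python
def handle_metric(name, value, threshold, auto_response=None):
    """Decide what happens when a monitored metric is read.

    Returns the list of actions taken, in order.
    """
    actions = []
    if value <= threshold:
        return actions  # within limits, nothing to report
    actions.append(f"notify: {name} is {value}, above {threshold}")
    if auto_response is not None:
        actions.append(auto_response(name))  # automated response on the event
    else:
        actions.append("wait for a human to decide")
    return actions


def add_capacity(name):
    # Hypothetical automated response; a real one would call your provider.
    return f"auto: add capacity in response to {name}"


# Notification only: a person has to pick it up.
print(handle_metric("cpu_percent", 92, 80))
# Notification plus an automated response.
print(handle_metric("cpu_percent", 92, 80, auto_response=add_capacity))
```

Neither call is "right". The first is simpler and keeps a human in the loop; the second reacts faster but needs more care to build and test. That is the trade-off the review asks you to make consciously.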

It sounds dry, and it can be, but it is one of the most fascinating and eye-opening pieces of work that I do. For one thing, it allows you a privileged look inside other people’s operations. You have to dig around in detail so you learn a lot, both good and bad, about how people go about building things and making decisions.

And for another, your work can be really helpful. We did one Well Architected Review for a food delivery start-up with ambitions to grow and doubts about the suitability of their existing set-up. We were able, among other things, to find security vulnerabilities (like databases accessible from the open internet) as well as to provide guidance on how to create separation between development and production infrastructure (have two separate AWS accounts under one Organisation) that was more amenable to expansion and easier to control.
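As an illustration of the kind of check that surfaces a database open to the internet, here is a small sketch. It assumes firewall rules have already been exported into simple dictionaries; the rule format, the rule names and the list of ports are my own assumptions for the example, not the shape of any AWS API response.

```python
import ipaddress

# Common default ports: MySQL, PostgreSQL, SQL Server, MongoDB.
DATABASE_PORTS = {3306, 5432, 1433, 27017}


def is_open_to_internet(cidr):
    """True if the CIDR block covers every address (0.0.0.0/0 or ::/0)."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen == 0


def exposed_database_rules(rules):
    """Return the inbound rules that expose a database port to everyone."""
    return [
        rule for rule in rules
        if rule["port"] in DATABASE_PORTS and is_open_to_internet(rule["cidr"])
    ]


# Hypothetical inbound rules for an imaginary set-up.
rules = [
    {"name": "web", "port": 443, "cidr": "0.0.0.0/0"},
    {"name": "db-public", "port": 5432, "cidr": "0.0.0.0/0"},
    {"name": "db-internal", "port": 5432, "cidr": "10.0.0.0/16"},
]

for rule in exposed_database_rules(rules):
    print("Exposed database rule:", rule["name"])
```

A web server listening on port 443 for the whole internet is usually intentional, which is why the sketch only flags database ports. A real check would also need to consider port ranges and non-default ports.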

But more importantly, perhaps, we were able to offer some peace of mind. Yes, there were some things they needed to think about and change to go into a growth phase, but they didn’t need to rip everything up and start again. That in itself was probably worth the consultancy fee!

When your business, and your AWS footprint, is bigger and more mature, the Well Architected reviews will involve more people and take longer (in one case, with a FinTech company, it would sometimes take days and trails of emails and meetings to answer single questions). But the principle is the same and the outcomes can be just as revealing. Of course, in those cases, there then comes the problem of implementing and communicating change, but that is for another blog post!

It is worth pointing out that AWS offers a free online tool to run through the Well Architected Review. So you don’t necessarily need to hire consultants who will give you lots of voodoo about how it needs to be done and tell you about the required black arts. It is all there in black and white and it is definitely not rocket science!

What we find is that normally in an organisation most people are occupied with their day job and do not have the mind-space to dedicate to a job like this, which needs to be done in a methodical fashion and within a reasonable amount of time. And often the required cloud expertise is distributed among multiple people in the organisation. So getting someone in who has done it before, who can see it through and who understands the questions that need to be asked and the level of answer that is required can end up being a good investment.

So if you are losing sleep over how good your set-up is, get in touch!