Wednesday, October 7, 2009

Architecture Review Process

If you are lucky enough to still have the engineers who know the current system around, then conducting a formal "Architecture Review Process" is probably the most efficient way to understand how the existing system works.

However, if these people have already left the company, then a different reverse engineering effort needs to be undertaken instead.

The review usually involves a number of key people:
  • A facilitator who orchestrates the whole review process and controls the level of depth at different stages
  • A recorder who documents key points, ideas, observations and outstanding issues throughout the review process
  • Key architects and developers who collectively understand the details of the legacy system and are able to get down to any level of detail (even a code walk-through) if necessary.
  • A domain expert who understands every detail of how people use the system, what their current pain points are and which features will help them most. The domain expert also helps set business priorities between conflicting goals.

The architecture review process has the following steps.

Use Cases and Actors
With the business domain expert, we focus on the "actors" (who uses the system) as well as the "use cases" (how they use it). We also look at the key metrics that measure the efficiency of their tasks (e.g. how long it takes to complete a use case).

Activity Analysis
Drilling down from each use case, we identify the activities that each actor performs. We look at what data needs to be captured at each activity and how actors interact with each other as well as with the system.

At this point, we should have established a good external view of the system. Now we dig into its internals ...

Technology Stack
The purpose here is to understand the building blocks that underlie the system and to get a good sense of whether the build vs buy combination is correct. Things like which programming language (Java vs .NET), which app server, which DB, ORM vs direct SQL, XML vs JSON, which IoC or AOP container, which messaging framework, etc. need to be discussed. We also need to distinguish the core features (which you mostly want to build) from the non-core features (where you mostly want to leverage 3rd party code). By the end of this exercise, we'll have a very good understanding of the foundation on which we write our code, and perhaps we can also identify certain areas where we can swap in 3rd party code.

Component Analysis
This is where most of the time is spent. Here we dissect the whole system into components. It starts with the architect highlighting the major components of the current system. For each component, we look at
  • The responsibility of the component
  • The persistent data owned by the component and the life cycle of maintaining this data
  • The interface of the component
  • The threading model executing the logic of the component (i.e. caller thread vs listening thread vs newly spawned thread) as well as any concurrent access implications
  • What are the potential bottlenecks of this component, and how can we remove a bottleneck when it occurs?
  • How does the component scale as different dimensions grow (e.g. more users, more data, higher traffic rate, etc.)?
  • What is the impact if this component crashes? How does recovery happen?
It is important to establish whether the component communicates across VM boundaries. If so,
  • What is the authentication and authorization mechanism?
  • What is the format of the messages being exchanged?
  • Is the data transferred in clear text or encrypted? And where is the secret key stored?
  • What is the handshaking sequence (protocol)?
  • Is the component stateful or stateless?
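One way the recorder might keep track of these questions is a simple per-component record. The sketch below is only an illustration: the class name `ComponentReview`, its field names and the helper `open_questions` are all hypothetical, chosen to mirror the two checklists above.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ComponentReview:
    """Hypothetical review record; None means the question is still open."""
    name: str
    responsibility: Optional[str] = None
    persistent_data: Optional[str] = None
    interface: Optional[str] = None
    thread_model: Optional[str] = None
    bottlenecks: Optional[str] = None
    scaling: Optional[str] = None
    crash_recovery: Optional[str] = None
    # The remaining questions only apply across VM boundaries.
    crosses_vm_boundary: bool = False
    auth_mechanism: Optional[str] = None
    message_format: Optional[str] = None
    encryption_and_key_storage: Optional[str] = None
    handshake_protocol: Optional[str] = None
    stateful: Optional[bool] = None


# Fields that are skipped for components that never leave their own VM.
REMOTE_ONLY = {
    "auth_mechanism",
    "message_format",
    "encryption_and_key_storage",
    "handshake_protocol",
    "stateful",
}


def open_questions(review: ComponentReview) -> List[str]:
    """Return the names of checklist items that nobody has answered yet."""
    pending = []
    for f in fields(review):
        if f.name in ("name", "crosses_vm_boundary"):
            continue
        if f.name in REMOTE_ONLY and not review.crosses_vm_boundary:
            continue
        if getattr(review, f.name) is None:
            pending.append(f.name)
    return pending


# Example: a made-up "billing" component with a few answers filled in.
billing = ComponentReview(
    name="billing",
    responsibility="Generates invoices",
    interface="REST",
    crosses_vm_boundary=True,
    message_format="JSON",
)
print(open_questions(billing))
```

The list of open items gives the facilitator an easy way to see which components still need a deeper session.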
Since we have already dived deep into the architecture, I usually go a step further and drill into the code by asking the developers to walk me through the code of some key components. This usually gives me a good sense of the code quality: whether the code is easy to read, whether the logic is easy to follow, whether methods are too long, whether duplicated logic is scattered around, etc., and more importantly, whether there are sufficient unit tests.

Maintainability Analysis
This focuses on the ongoing maintenance of the architecture, things like ...
  • Is the system sufficiently instrumented for monitoring purposes?
  • When a problem happens, is there enough trace information to quickly identify what went wrong?
  • Can the system continue working (with some tolerable degradation) when some components fail?
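As a rough illustration of what the last two questions look for in code, here is a minimal sketch of a call that leaves a trace and degrades instead of failing outright. The helper `call_with_fallback` and the two example functions are hypothetical and not taken from any particular system.

```python
import logging

logger = logging.getLogger("review.sketch")


def call_with_fallback(primary, fallback, component_name):
    """Run primary(); if it raises, log the stack trace and use fallback()."""
    try:
        return primary()
    except Exception:
        # logger.exception records the message plus the full traceback,
        # which is the kind of trace the review should look for.
        logger.exception("component %s failed; serving degraded result",
                         component_name)
        return fallback()


def recommendations():
    # Stand-in for a component that is currently down.
    raise ConnectionError("recommendation service unreachable")


def bestsellers():
    # Cheaper, less personalized answer used while degraded.
    return ["item-1", "item-2"]


result = call_with_fallback(recommendations, bestsellers, "recommendations")
print(result)
```

During the review, the interesting part is whether such fallbacks exist at all and whether the logged trace is enough to tell which component failed and why.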

Extensibility Analysis
Understand the parameters that affect the behavior of the system. When different kinds of change happen, how much code needs to be changed to accommodate them? Or can the system still serve by just changing configurable parameters? For example, does the system hard-code business rules or use some kind of rule engine? Does the system hard-code the business flow or use some kind of workflow system? What if the system needs to serve a different UI device (like mobile devices)?
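To make the hard-coded vs configurable distinction concrete, here is a small sketch under invented assumptions: the discount rule, the order fields and the JSON layout are all made up for illustration, and a real rule engine would be far richer than this.

```python
import json
import operator


def hard_coded_discount(order):
    """Business rule baked into code: any change needs a new release."""
    if order["total"] >= 1000:
        return 10
    if order["customer_tier"] == "gold":
        return 5
    return 0


# The same rule expressed as data that could live in a config file.
RULES_JSON = """
[
  {"field": "total", "op": ">=", "value": 1000, "percent": 10},
  {"field": "customer_tier", "op": "==", "value": "gold", "percent": 5}
]
"""

# Comparison operators the (hypothetical) rule format supports.
OPERATORS = {">=": operator.ge, "==": operator.eq}


def configurable_discount(order, rules):
    """Return the percent of the first matching rule, or 0 if none match."""
    for rule in rules:
        compare = OPERATORS[rule["op"]]
        if compare(order[rule["field"]], rule["value"]):
            return rule["percent"]
    return 0


rules = json.loads(RULES_JSON)
order = {"total": 1200, "customer_tier": "silver"}
print(hard_coded_discount(order), configurable_discount(order, rules))
```

With the second form, changing the threshold or adding a tier is an edit to the rule data rather than to the code, which is the property this part of the review tries to measure.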


The output of an architecture review process is typically
  • A set of documents/diagrams of the key abstractions that give a better view of how the overall system works. These documents should help a newcomer get up to speed more quickly as well as communicate the architecture to a broader audience.
  • A set of recommended action plans on what can be done to improve the current architecture.
