Agility and Structural Modularity – part III

The first post in this series explored the fundamental relationship between Structural Modularity and Agility. In the second post we learnt how highly agile, and so highly maintainable, software systems are achievable through the use of OSGi.

This third post is based upon a presentation entitled ‘Workflow for Development, Release and Versioning with OSGi / Bndtools: Real World Challenges’ (http://www.osgi.org/CommunityEvent2012/Schedule), in which Siemens AG’s Research & Development engineers discussed the business drivers for, and the approach subsequently taken to realise, a highly agile OSGi-based Continuous Integration environment.

The Requirement

Siemens Corporate Technology Research has a diverse engineering team with skills spanning computer science, mathematics, physics, mechanical engineering and electrical engineering. The group provides solutions to Siemens business units based on neural network technologies and other machine learning algorithms. As Siemens’ business units require working examples rather than paper concepts, Siemens Corporate Technology Research engineers are required to rapidly prototype potential solutions for their business units.

Figure 1: Siemens’ Product Repository

To achieve rapid prototyping the ideal solution would be repository-centric, allowing the Siemens research team to rapidly release new capabilities, and also allowing Siemens business units to rapidly compose new product offerings.

To achieve this a solution must meet the following high level objectives:

  1. Build Repeatability: The solution must ensure that old versions of products can always be rebuilt from exactly the same set of sources and dependencies, even many years in the future. This would allow Siemens to continue supporting multiple versions of released software that have gone out to different customers.
  2. Reliable Versioning: Siemens need to be able to quickly and reliably assemble a set of components (their own software, third party and open source) and have a high degree of confidence that they will all work together.
  3. Full Traceability: The software artifacts that are released must always be exactly the same artifacts that were tested by QA, and must be traceable back to their original sources and dependencies. There should be no need to rebuild in order to advance from the testing state into the released state.

Finally, the individual software artifacts, and the resultant composite products, must have a consistent approach to application launching, life-cycle and configuration.

The Approach

OSGi was chosen as the enabling modularity framework. This decision was based upon the maturity of OSGi technology, the open industry specifications which underpin OSGi implementations, and the technology governance provided by the OSGi Alliance. The envisaged Continuous Integration solution was based upon the use of Development and Release/Production OSGi Bundle Repositories (OBR). As OSGi artefacts are fully self-describing (via Requirements and Capabilities metadata), specific business functionality could be dynamically determined via automated dependency resolution and subsequent loading of the required OSGi bundles from the relevant repositories.
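To make the idea of automated dependency resolution a little more concrete, here is a minimal sketch in plain Java. It is not the OSGi Resolver or OBR API, and it is not how Siemens or Bndtools implement resolution; the `Bundle` class, the `resolve` method and all bundle and capability names are hypothetical, invented for illustration. Versions are ignored entirely. The sketch simply walks the Requirements of a root artefact and pulls in whichever repository entries advertise matching Capabilities.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Illustrative sketch only: not the OSGi Resolver or OBR API. */
final class ResolverSketch {

    /** A self-describing artefact: what it offers and what it needs. */
    static final class Bundle {
        final String name;
        final List<String> capabilities;
        final List<String> requirements;

        Bundle(String name, List<String> capabilities, List<String> requirements) {
            this.name = name;
            this.capabilities = capabilities;
            this.requirements = requirements;
        }
    }

    /** Returns root plus every bundle needed to satisfy its requirements, transitively. */
    static List<Bundle> resolve(Bundle root, List<Bundle> repository) {
        List<Bundle> selected = new ArrayList<>();
        Deque<Bundle> pending = new ArrayDeque<>();
        selected.add(root);
        pending.add(root);
        while (!pending.isEmpty()) {
            Bundle current = pending.remove();
            for (String requirement : current.requirements) {
                if (provider(requirement, selected) != null) {
                    continue; // already satisfied by a bundle we have selected
                }
                Bundle found = provider(requirement, repository);
                if (found == null) {
                    throw new IllegalStateException(
                        "No provider for '" + requirement + "' required by " + current.name);
                }
                selected.add(found);
                pending.add(found);
            }
        }
        return selected;
    }

    /** First candidate offering the capability, or null if there is none. */
    private static Bundle provider(String capability, List<Bundle> candidates) {
        for (Bundle candidate : candidates) {
            if (candidate.capabilities.contains(capability)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical bundle and capability names, invented for this example.
        List<Bundle> repository = Arrays.asList(
            new Bundle("neural.net", Arrays.asList("forecasting"), Arrays.asList("linear.algebra")),
            new Bundle("math.core", Arrays.asList("linear.algebra"), Collections.<String>emptyList()),
            new Bundle("unused.ui", Arrays.asList("dashboard"), Collections.<String>emptyList()));
        Bundle product = new Bundle("product",
            Collections.<String>emptyList(), Arrays.asList("forecasting"));

        for (Bundle bundle : resolve(product, repository)) {
            System.out.println(bundle.name);
        }
    }
}
```

Running `main` selects `product`, `neural.net` and `math.core`, and leaves `unused.ui` in the repository because nothing requires its Capability. A real resolver also has to deal with version ranges, multiple candidate providers and optional requirements, none of which is attempted here.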

The Siemens AG team also wanted to apply WYTIWYR best practices (What You Test Is What You Release). Software artefacts should not be rebuilt after testing to generate the release artefacts, because the build environment may have changed between the start and end of the test cycle. Many organisations do rebuild software artefacts as part of the release process (e.g. 1.0.0.BETA -> 1.0.0.RELEASE); this unfortunate but common practice is caused by dependency management based on artefact name.

Finally from a technical perspective the solution needed to have the following attributes:

  • Work with standard developer tooling i.e. Java with Eclipse.
  • Have strong support for OSGi.
  • Support the concept of multiple repositories.
  • Support automated Semantic Versioning (i.e. automatic calculation of Import Ranges and incrementing of Export Versions), as this is too hard for human beings!

For these reasons Bndtools was selected.

The Solution

The following sequence of diagrams explains the key attributes of Siemens AG’s solution.

Figure 2:  Repository centric, rapid iteration and version re-use within development.

Bndtools is a repository-centric tool allowing developers to consume OSGi bundles from one or more OSGi Bundle Repositories (a.k.a. OBR). In addition to the local read-write DEV OSGi bundle repository, developers may also consume OSGi bundles from other managed read-only repositories; for example, any combination of corporate Open Source repositories, corporate proprietary code repositories and approved 3rd Party repositories. A developer simply selects the desired repository from the list of authorised repositories, then the desired artefact in that repository, and drags it into the Bndtools workspace.

Developers check code from their local workspaces into their SVN repository. The SVN repository only contains work in progress (WIP). The Jenkins Continuous Integration server builds, tests and pushes the resultant OSGi artifacts to a shared read-only Development OBR. These artefacts are then immediately accessible by all Developers via Bndtools.

As developers rapidly evolve software artefacts, running many builds each day, it would be unmanageable – indeed meaningless – to increment versions for every development build. For this reason, version re-use is permitted in the Development environment.

Figure 3:  Release.

When ready, a software artefact may be released by the development team to a read-only QA Repository.

Figure 4:  Locked.

Once an artefact has been released to QA it is read-only in the development repository. Any attempt to modify and re-build the artefact will fail. To proceed, the Developer must now increment the version of the released artefact.
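The locking rule can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: `DevRepositorySketch`, its methods and the artefact name used below are hypothetical and are not part of Bndtools, Jenkins or any OBR implementation. The sketch shows version re-use being allowed during development, and the same version being rejected once it has been released to QA.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only: a development repository that locks released versions. */
final class DevRepositorySketch {

    private final Map<String, byte[]> artefacts = new HashMap<>();
    private final Set<String> released = new HashSet<>();

    private static String key(String name, String version) {
        return name + ":" + version;
    }

    /** Stores a build; overwriting is allowed until that version has been released. */
    void put(String name, String version, byte[] content) {
        String k = key(name, version);
        if (released.contains(k)) {
            throw new IllegalStateException(k + " has been released; increment the version");
        }
        artefacts.put(k, content.clone());
    }

    /** Marks a version as released and returns the exact bytes that were built. */
    byte[] release(String name, String version) {
        String k = key(name, version);
        byte[] content = artefacts.get(k);
        if (content == null) {
            throw new IllegalArgumentException("Unknown artefact " + k);
        }
        released.add(k);
        return content.clone(); // handed to QA as-is, with no rebuild
    }

    public static void main(String[] args) {
        DevRepositorySketch dev = new DevRepositorySketch();
        dev.put("example.bundle", "1.0.0", new byte[] {1}); // first development build
        dev.put("example.bundle", "1.0.0", new byte[] {2}); // version re-use is fine
        dev.release("example.bundle", "1.0.0");             // now locked
        try {
            dev.put("example.bundle", "1.0.0", new byte[] {3});
        } catch (IllegalStateException expected) {
            System.out.println(expected.getMessage());
        }
        dev.put("example.bundle", "1.0.1", new byte[] {3}); // accepted after increment
    }
}
```

Because `release` returns the stored bytes rather than triggering a new build, the sketch also reflects the WYTIWYR principle: what QA receives is what was built.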

Figure 5:  Increment.

Bndtools’ automatic semantic versioning can now be used by the developer to ensure that the correct version increment is applied to express the nature of the difference between the current WIP version and its released predecessor. Following the Semantic Versioning rules discussed in previous posts:

  • 1.0.0 => 1.0.1 … “bug fix”
  • 1.0.0 => 1.1.0 … “new feature”
  • 1.0.0 => 2.0.0 … “breaking change”

we can see that the new version (1.0.1) of the artifact is a “bug fix”.
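The three rules above are mechanical enough to write down as code. The following is a minimal sketch, assuming plain major.minor.micro strings with no qualifier; `VersionBumpSketch`, `Change`, `next` and `classify` are hypothetical names of my own. Note the important simplification: Bndtools works out the kind of change by comparing the bundle with its released predecessor, whereas here the kind of change is simply supplied by the caller.

```java
/** Illustrative sketch only: applies the three semantic versioning rules. */
final class VersionBumpSketch {

    enum Change { BUG_FIX, NEW_FEATURE, BREAKING_CHANGE }

    /** Returns the next major.minor.micro version for the given kind of change. */
    static String next(String released, Change change) {
        String[] parts = released.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        int micro = Integer.parseInt(parts[2]);
        switch (change) {
            case BREAKING_CHANGE:
                return (major + 1) + ".0.0";
            case NEW_FEATURE:
                return major + "." + (minor + 1) + ".0";
            default:
                return major + "." + minor + "." + (micro + 1);
        }
    }

    /** Reads the kind of change back from two versions; assumes candidate is newer. */
    static Change classify(String released, String candidate) {
        String[] before = released.split("\\.");
        String[] after = candidate.split("\\.");
        if (Integer.parseInt(before[0]) != Integer.parseInt(after[0])) {
            return Change.BREAKING_CHANGE;
        }
        if (Integer.parseInt(before[1]) != Integer.parseInt(after[1])) {
            return Change.NEW_FEATURE;
        }
        return Change.BUG_FIX;
    }

    public static void main(String[] args) {
        System.out.println(next("1.0.0", Change.BUG_FIX));         // 1.0.1
        System.out.println(next("1.0.0", Change.NEW_FEATURE));     // 1.1.0
        System.out.println(next("1.0.0", Change.BREAKING_CHANGE)); // 2.0.0
        System.out.println(classify("1.0.0", "1.0.1"));            // BUG_FIX
    }
}
```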

The Agility Maturity Model – Briefly Revisited

In the previous post we introduced the concept of the Agility Maturity Model. Assessing Siemens’ solution against this model verifies that all the necessary characteristics required of a highly Agile environment have been achieved.

  • Devolution: Enabled via Bndtools’ flexible approach to the use of OSGi repositories.
  • Modularity & Services: Integral to the solution. Part and parcel of the decision to adopt an OSGi centric approach.

As discussed by Kirk Knoernschild in his DEVOXX 2012 presentation ‘Architecture All the Way Down’, while the Agile Movement has focused extensively on the Social and Process aspects of achieving Agile development, the fundamental enabler – ‘Structural Modularity’ – has received little attention. Those of you that have attempted to realise ‘Agile’ with a monolithic code base will be all too aware of the challenges. Siemens’ decision to pursue Agile through structural modularity, using OSGi, provides the bedrock upon which Siemens’ Agile aspirations, including the Social and Process aspects of Agile development, can be fully realised.

Bndtools was a key enabler for Siemens’ Agile aspirations. In return, Siemens’ business requirements helped accelerate and shape key Bndtools capabilities. At this point I would like to take the opportunity to thank Siemens AG for allowing their work to be referenced by Paremus and the OSGi Alliance.

More about Bndtools

Built upon Peter Kriens’ bnd project, the industry’s de facto tool for the creation of OSGi bundles, the Bndtools GitHub project was created by Neil Bartlett in early 2009. Bndtools’ roots include tooling that Neil developed to assist students attending his OSGi training course, and the Paremus SIGIL project.

Bndtools’ objectives have been stated by Neil Bartlett on numerous occasions. The goal, quite simply, is to make it easier to develop Agile, Modular Java applications than not. As demonstrated by the Siemens project, Bndtools is rapidly achieving this fundamental objective. Bndtools is backed by an increasingly vibrant open source community with increasing support from a number of software vendors, including long term commitment from Paremus. Current Bndtools community activities include support for OSGi Blueprint, stronger integration with Maven and the ability to simply load runtime release adaptors for OSGi Cloud environments like the Paremus Service Fabric.

Further detail on the rationale for building Java Continuous Integration build / release chains on OSGi / Bndtools can be found in the following presentation given by Neil Bartlett to the Japan OSGi User Forum, May 2013: NeilBartlett-OSGiUserForumJapan-20130529. For those interested in pursuing a Java / OSGi Agile strategy, Paremus provide in-depth engineering consultancy services to help you realise this objective. Paremus can also provide in-depth on-site OSGi training for your in-house engineering teams. If you are interested in either consulting or training please contact us.

The Final Episode

In the final post in this Agility and Structural Modularity series I will discuss Agility and Runtime Platforms. Agile runtime platforms are the area in which Paremus has specialised since the earliest versions of our Service Fabric product in 2004 (then referred to as Infiniflow). The pursuit of runtime Agility prompted our adoption of OSGi in 2005 and our membership of the OSGi Alliance in 2009.

However, as will be discussed, not all OSGi runtime environments are alike. While OSGi is a fundamental enabler for Agile runtimes, the use of OSGi is not in itself sufficient to guarantee runtime Agility. It is quite possible to build ‘brittle’ systems using OSGi. ‘Next generation’ modular dynamic platforms like the Paremus Service Fabric must not only leverage OSGi, but must also leverage the same fundamental design principles upon which OSGi is itself based.

Agility and Structural Modularity – part II

In this second Agility and Structural Modularity post we explore the importance of OSGi™; the central role that OSGi plays in realising Java™ structural modularity and the natural synergy between OSGi and the aims of popular Agile methodologies.

But we are already Modular!

Most developers appreciate that applications should be modular. However, whereas the need for logical modularity was rapidly embraced in the early years of Object Orientated programming (see http://en.wikipedia.org/wiki/Design_Patterns), it has taken significantly longer for the software industry to appreciate the importance of structural modularity; especially the fundamental importance of structural modularity with respect to increasing application maintainability and controlling / reducing  environmental complexity.

Just a Bunch of JARs

In Java Application Architecture, Kirk Knoernschild explores structural modularity and develops a set of best practice structural design patterns. As Knoernschild explains, no modularity framework is required to develop in a modular fashion; for Java the JAR is sufficient.

Indeed, it is not uncommon for ‘Agile’ development teams to break an application into a number of smaller JARs as the code-base grows. As JAR artifacts increase in size, they are broken down into collections of smaller JARs. From a code perspective, especially if Knoernschild’s structural design patterns have been followed, one would correctly conclude that – at one structural layer – the application is modular.

But is it ‘Agile’?

From the perspective of the team that created the application, and who are subsequently responsible for its on-going maintenance, the application is more Agile. The team understand the dependencies and the impact of change. However, this knowledge is not explicitly associated with the components. Should team members leave the company, the application and the business are immediately compromised. Also, for a third party (e.g. a different team within the same organisation), the application may as well have remained a monolithic code-base.

While the application has one layer of structural modularity – it is not self-describing. The metadata that describes the inter-relationship between the components is absent; the resultant business system is intrinsically fragile.

What about Maven?

Maven artifacts (Project Object Model – POM) also express dependencies between components. These dependencies are expressed in terms of the component names.

A Maven based modular application can be simply assembled by any third party. However, as we already know from the first post in this series, the value of name based dependencies is severely limited. As the dependencies between the components are not expressed in terms of Requirements and Capabilities,  third parties are unable to deduce why the dependencies exist and what might be substitutable.

It is debatable whether Maven makes any additional tangible contribution to our goal of application Agility.

The need for OSGi

As Knoernschild demonstrates in his book Java Application Architecture, once structural modularity is achieved, it is trivially easy to move to OSGi – the modularity standard for Java. 

Not only does OSGi help us enforce structural modularity, it provides the necessary metadata to ensure that the Modular Structures we create are also Agile structures.

OSGi expresses dependencies in terms of Requirements and Capabilities. It is therefore immediately apparent to a third party which components may be interchanged. As OSGi also uses semantic versioning, it is immediately apparent to a third party whether a change to a component is potentially a breaking change.

OSGi also has a key part to play with respect to structural hierarchy.

At one end of the modularity spectrum we have Service Oriented Architectures; at the other end we have Java Packages and Classes. However, as explained by Knoernschild, essential layers are missing between these two extremes.

Figure 1: Structural Hierarchy: The Missing Middle (Kirk Knoernschild – 2012).

The problem, this missing middle, is directly addressed by OSGi.

Figure 2: Structural Hierarchy: OSGi Services and Bundles

As explained by Knoernschild the modularity layers provided by OSGi address a number of critical considerations:

  • Code Re-Use: Via the concept of the OSGi Bundle, OSGi enables code re-use.
  • Unit of Intra / Inter Process Re-Use: OSGi Services are light-weight Services that are able to dynamically find and bind to each other. OSGi Services may be collocated within the same JVM or, via an implementation of OSGi’s remote service specification, distributed across JVMs separated by a network. Coarse grained business applications may be composed from a number of finer grained OSGi Services.
  • Unit of Deployment: OSGi bundles provide the basis for a natural unit of deployment, update & patch.
  • Unit of Composition: OSGi bundles and Services are essential elements in the composition hierarchy.
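The ‘find and bind’ behaviour of Services mentioned in the list above can be illustrated with a toy registry. This is a sketch in plain Java and deliberately not the OSGi service API; `ServiceRegistrySketch`, `Greeter` and the method names are hypothetical. What it tries to show is the shape of the interaction: a provider registers an implementation against a contract, a consumer looks the contract up at the moment of use, and the consumer has to cope with the Service being absent.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Illustrative sketch only: a toy registry, not the OSGi service registry. */
final class ServiceRegistrySketch {

    /** A hypothetical service contract; consumers depend only on this interface. */
    interface Greeter {
        String greet(String name);
    }

    private final Map<Class<?>, List<Object>> services = new ConcurrentHashMap<>();

    /** Publishes an implementation under the contract it fulfils. */
    <T> void register(Class<T> contract, T implementation) {
        services.computeIfAbsent(contract, k -> new CopyOnWriteArrayList<>())
                .add(implementation);
    }

    /** Withdraws an implementation, for example when its provider is stopped. */
    <T> void unregister(Class<T> contract, T implementation) {
        List<Object> providers = services.get(contract);
        if (providers != null) {
            providers.remove(implementation);
        }
    }

    /** Finds any current provider of the contract, if one exists right now. */
    <T> Optional<T> find(Class<T> contract) {
        List<Object> providers = services.get(contract);
        if (providers != null) {
            for (Object provider : providers) {
                return Optional.of(contract.cast(provider));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ServiceRegistrySketch registry = new ServiceRegistrySketch();
        Greeter formal = name -> "Good day, " + name;

        registry.register(Greeter.class, formal);
        System.out.println(registry.find(Greeter.class)
            .map(greeter -> greeter.greet("Tom"))
            .orElse("no Greeter available"));

        registry.unregister(Greeter.class, formal);
        System.out.println(registry.find(Greeter.class)
            .map(greeter -> greeter.greet("Tom"))
            .orElse("no Greeter available"));
    }
}
```

The first lookup prints a greeting and the second prints the fallback message, because the provider has gone away in between. The consumer never names the implementing class, which is the loose coupling that Services are meant to encourage.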

Hence OSGi bundles and services, backed by OSGi Alliance’s open specifications, provide Java with essential – and previously missing – layers of structural modularity. In principle, OSGi technologies enable Java based business systems to be ‘Agile – All the Way Down!’.

As we will now see, the OSGi structures (bundles and services) map well to, and help enable, popular Agile Methodologies.

Embracing Agile

The Agile Movement focuses on the ‘Processes’ required to achieve Agile product development and delivery. While a spectrum of Lean & Agile methodologies exists, each tends to be a variant of, a blend of, or an extension to the two best known methodologies, namely Scrum and Kanban (see http://en.wikipedia.org/wiki/Lean_software_development).

To be effective each of these approaches requires some degree of structural modularity.

Scrum

Customers change their minds. Scrum acknowledges the existence of ‘requirement churn’ and adopts an empirical (http://en.wikipedia.org/wiki/Empirical) approach to software delivery. Accepting that the problem cannot be fully understood or defined up front, Scrum focuses instead on maximising the team’s ability to deliver quickly and respond to emerging requirements.

Scrum is an iterative and incremental process, with the ‘Sprint’ being the basic unit of development. Each Sprint is a “time-boxed” (http://en.wikipedia.org/wiki/Timeboxing) effort, i.e. it is restricted to a specific duration. The duration is fixed in advance for each Sprint and is normally between one week and one month. A Sprint is preceded by a planning meeting, where the tasks for the Sprint are identified and an estimated commitment for the Sprint goal is made. The Sprint is followed by a review or retrospective meeting, where progress is reviewed and lessons for the next Sprint are identified.

During each Sprint, the team creates finished portions of a product. The set of features that go into a Sprint come from the product backlog, which is an ordered list of requirements (http://en.wikipedia.org/wiki/Requirement).

Scrum attempts to encourage the creation of self-organizing teams, typically by co-location of all team members, and verbal communication between all team members.

Kanban

‘Kanban’ originates from the Japanese word for “signboard” and traces back to Toyota, the Japanese automobile manufacturer, in the late 1940s (see http://en.wikipedia.org/wiki/Kanban). Kanban encourages teams to have a shared understanding of work, workflow, process, and risk; so enabling the team to build a shared comprehension of a problem and suggest improvements which can be agreed by consensus.

From the perspective of structural modularity, Kanban’s focus on work-in-progress (WIP), limited pull and feedback are probably the most interesting aspects of the methodology:

  1. Work-In-Process (WIP) should be limited at each step of a multi-stage workflow. Work items are “pulled” to the next stage only when there is sufficient capacity within the local WIP limit.
  2. The flow of work through each workflow stage is monitored, measured and reported. By actively managing ‘flow’, the positive or negative impact of continuous, incremental and evolutionary changes to a System can be evaluated.

Hence Kanban encourages small continuous, incremental and evolutionary changes. As the degree of structural modularity increases, pull based flow rates also increase while each smaller artifact spends correspondingly less time in a WIP state.

 

An Agile Maturity Model

Both Scrum and Kanban’s objectives become easier to realize as the level of structural modularity increases. Fashioned after the Capability Maturity Model (see http://en.wikipedia.org/wiki/Capability_Maturity_Model), which allows organisations or projects to measure improvements in a software development process, the Modularity Maturity Model is an attempt to describe how far along the modularity path an organisation or project might be; it was proposed by Dr Graham Charters at the OSGi Community Event 2011. We now extend this concept further, mapping an organisation’s level of Modularity Maturity to its Agility.

Keeping in step with the Modularity Maturity Model we refer to the following six levels.

Ad Hoc – No formal modularity exists. Dependencies are unknown. Java applications have no, or limited, structure. In such environments it is likely that Agile Management Processes will fail to realise business objectives.

Modules – Instead of classes (or JARs of classes), named modules are used with explicit versioning. Dependencies are expressed in terms of module identity (including version). Maven, Ivy and RPM are examples of modularity solutions where dependencies are managed by versioned identities. Organizations will usually have some form of artifact repository; however the value is compromised by the fact that the artifacts are not self-describing in terms of their Capabilities and Requirements.

This level of modularity is perhaps typical for many of today’s in-house development teams. Agile processes such as Scrum are possible, and do deliver some business benefit. Ultimately, however, the effectiveness & scalability of the Scrum management processes remain limited by deficiencies in structural modularity; for example, Requirements and Capabilities between the Modules are usually communicated verbally. The ability to realize Continuous Integration (CI) is again limited by ill-defined structural dependencies.

Modularity – Module identity is not the same as true modularity. As we’ve seen, Module dependencies should be expressed via contracts (i.e. Capabilities and Requirements), not via artifact names. At this point, dependency resolution of Capabilities and Requirements becomes the basis of a dynamic software construction mechanism. At this level of structural modularity dependencies will also be semantically versioned.

With the adoption of a modularity framework like OSGi the scalability issues associated with the Scrum process are addressed. By enforcing encapsulation and defining dependencies in terms of Capabilities and Requirements, OSGi enables many small development teams to work efficiently, independently and in parallel. The efficiency of Scrum management processes correspondingly increases. Sprints can be clearly associated with one or more well defined structural entities, i.e. development or refactoring of OSGi bundles. Meanwhile Semantic versioning enables the impact of refactoring to be efficiently communicated across team boundaries. As the OSGi bundle provides strong modularity and isolation, parallel teams can safely Sprint on different structural areas of the same application.

Services – Services-based collaboration hides the construction details of services from the users of those services; so allowing clients to be decoupled from the implementations of the providers. Hence, Services encourage loose-coupling. OSGi Services’ dynamic find and bind behaviours directly enable loose-coupling, enabling the dynamic formation, or assembly, of composite applications. Perhaps of greater import, Services are the basis upon which runtime Agility may be realised; including rapid enhancements to business functionality, or automatic adaptation to environmental changes.

Having achieved this level of structural modularity an organization may simply and naturally apply Kanban principles and achieve the objective of Continuous Integration.

Devolution – Artifact ownership is devolved to modularity-aware repositories which encourage collaboration and enable governance. Assets may be selected on their stated Capabilities. Advantages include:

  • Greater awareness of existing modules
  • Reduced duplication and increased quality
  • Collaboration and empowerment
  • Quality and operational control

As software artifacts are described in terms of a coherent set of Requirements and Capabilities, developers can communicate changes (breaking and non-breaking) to third parties through the use of semantic versioning. Devolution allows development teams to rapidly find third-party artifacts that meet their Requirements. Hence Devolution enables significant flexibility with respect to how artifacts are created, allowing distributed parties to interact in a more effective and efficient manner. Artifacts may be produced by other teams within the same organization, or consumed from external third parties. The Devolution stage promotes code re-use and efficient, low risk out-sourcing, crowd-sourcing and in-sourcing of the artifact creation process.

Dynamism – This level builds upon Modularity, Services & Devolution and is the culmination of our Agile journey.

  • Business applications are rapidly assembled from modular components.
  • As strong structural modularity is enforced (isolation by the OSGi bundle boundary), components may be efficiently and effectively created and maintained by a number of small on-shore, near-shore or off-shore development teams.
  • As each application is self-describing, even the most sophisticated of business systems is simple to understand, to maintain, to enhance.
  • As semantic versioning is used; the impact of change is efficiently communicated to all interested parties, including Governance & Change Control processes.
  • Software fixes may be hot-deployed into production – without the need to restart the business system.
  • Application capabilities may be rapidly extended, also without needing to restart the business system.

Finally, as the dynamic assembly process is aware of the Capabilities of the hosting runtime environment, application structure and behavior may automatically adapt to location; allowing transparent deployment and optimization for public Cloud or traditional private datacentre environments.

Figure 3: Modularity Maturity Model

An organization’s Modularisation Migration strategy will be defined by the approach taken to traversing these Modularity levels. Most organizations will have already moved from an initial Ad Hoc phase to Modules. Meanwhile organizations that value a high degree of Agility will wish to reach the endpoint, i.e. Dynamism. Each organisation may traverse from Modules to Dynamism via several paths, adapting its migration strategy as necessary.

  • To achieve maximum benefit as soon as possible, an organization may choose to move directly to Modularity by refactoring the existing code base into OSGi bundles. The benefits of Devolution and Services naturally follow. This is also the obvious strategy for new greenfield applications.
  • For legacy applications an alternative may be to pursue a Services first approach; first expressing coarse grained software components as OSGi Services; then driving code level modularity (i.e. OSGi bundles) on a Service by Service basis. This approach may be easier to initiate within large organizations with extensive legacy environments.
  • Finally, one might move first to limited Devolution by adopting OSGi metadata for existing artifacts. Adoption of Requirements and Capabilities, and the use of semantic versioning, will clarify the existing structure and the impact of change to third parties. While structural modularity has not increased, the move to Devolution positions the organisation for subsequent migration to the Modularity and Services levels.

A diverse set of choices, and the ability to pursue these choices as appropriate, is exactly what one would hope for, and expect from, an increasingly Agile environment!

Agility and Structural Modularity – part I

Introduction

Agile development methodologies are increasingly popular. Yet most ‘Agile’ experts and analysts discuss agility in isolation. This oversight is surprising given that ‘Agility’ is an emergent characteristic; that is, a property of the underlying entity. For an entity to be ‘Agile’ it must have a high degree of structural modularity.

Perhaps as a result of this, many organisations attempt to invest in ‘Agile’ processes without ever considering the structure of their applications. Alongside the question, ‘How might one realise an Agile system?’, one must also ask, ‘How might one build systems with high degrees of structural modularity?’.

We start this series of blog articles by exploring the relationship between structural modularity and agility.

 

Structure, Modularity & Agility 

Business Managers and Application Developers face many of the same fundamental challenges. Whether a business, or a software application serving a business, the entity must be cost effective to create and maintain. If the entity is to endure, it must also be able to rapidly adapt to unforeseen changes in a cost effective manner.

If we hope to effectively manage a System, we must first understand the System. Once we understand a System, manageable Change and directed Evolution are possible.

Yet we do not need to understand all of the fundamental constituents of the System; we only need to understand the relevant attributes and behaviors for the level of the hierarchy we are responsible for managing.

Services should be Opaque

From an external perspective, we are interested in the exposed behavior; the type of Service provided, and the properties of that Service. For example is the Service reliable? Is it competitively priced relative to alternative options?

 

Figure 1: A consumer of a Service.

As a consumer of the Service I have no interest in how these characteristics are achieved. I am only interested in the advertised Capabilities, which may or may not meet my Requirements.

To Manage I need to understand Structure

Unlike the consumer, the implementation of the Service is of fundamental importance to the Service provider. To achieve an understanding, we create a conceptual model by breaking the System responsible for providing the Service into a set of smaller interconnected pieces. This graph of components may represent an ‘Organization Chart’ , if the entity is a business, or a mapping of the components used,  if the entity is a software application.

A first simple attempt to understand our abstract System is shown below.

 

Figure 2: The Service provider / System Maintainer

From this simple representation we immediately know the following:

  • The System is composed of 15 Components.
  • The names of the Components.
  • The dependencies that exist between these Components; though we don’t know why those dependencies exist.
  • While we do not know the responsibilities of the individual Components, from the degree of inter-connectedness we can infer that component ‘Tom’ is probably more important than ‘Dick’.

It is important to note that we may not have created these Components, and we may have no understanding of their internal construction. Just as the consumers of our Service are interested in the Capabilities offered, we, as a consumer of these components, simply Require their Capabilities.

Requirements & Capabilities

At present, we have no idea why the dependencies exist between the Components, just that those dependencies exist. Also, this is a time independent view. What about change over time?

One might initially resort to using versions or version ranges with the named entities; changes in the structure indicated by version changes on the constituents. However, as shown in figure 3, versioned names, while indicating change, fail to explain why Susan 1.0 can work with Tom 2.1, but Susan 2.0 cannot!

Why is this?

Figure 3: How do we track structural change over time? The earlier System functioned correctly; the later System – with an upgraded Component – fails. Why is this?

It is only when we look at the Capabilities and Requirements of the entities involved that we understand the issue. Tom 2.1 Requires a Manager Capability, a capability that can be provided by Susan 1.0. However, at the later point in time  Susan 2.0, having reflected upon her career, decided to retrain. Susan 2.0  now no longer advertises a Manager Capability, but instead advertises a  Plumber 1.0 Capability.

This simple illustration demonstrates that dependencies need to be expressed in terms of Requirements and Capabilities of the participating entities and not their names.

These descriptions should also be intrinsic to the entities; i.e. components should be self-describing.
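The Tom and Susan story can be replayed in a few lines of Java. This is a minimal sketch under stated assumptions: the `Component` class and the `unmet` method are hypothetical names of my own, capability versions are ignored, and Tom’s own Capabilities are left empty because the example above does not state them. Each component carries its own description, and a dependency is checked by matching what is Required against what is Provided rather than by looking at names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative sketch only: dependencies matched on capabilities, not names. */
final class SelfDescribingSketch {

    /** A component that describes itself by what it provides and requires. */
    static final class Component {
        final String name;
        final List<String> provides;
        final List<String> requires;

        Component(String name, List<String> provides, List<String> requires) {
            this.name = name;
            this.provides = provides;
            this.requires = requires;
        }
    }

    /** Lists the requirements of the consumer that none of its peers can satisfy. */
    static List<String> unmet(Component consumer, List<Component> peers) {
        return consumer.requires.stream()
            .filter(requirement -> peers.stream()
                .noneMatch(peer -> peer.provides.contains(requirement)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Component tom = new Component("Tom 2.1",
            Collections.<String>emptyList(), Arrays.asList("Manager"));
        Component susan1 = new Component("Susan 1.0",
            Arrays.asList("Manager"), Collections.<String>emptyList());
        Component susan2 = new Component("Susan 2.0",
            Arrays.asList("Plumber"), Collections.<String>emptyList());

        System.out.println(unmet(tom, Arrays.asList(susan1))); // []
        System.out.println(unmet(tom, Arrays.asList(susan2))); // [Manager]
    }
}
```

With Susan 1.0 nothing is missing; with Susan 2.0 the output names the missing Manager Capability, which is precisely the explanation that versioned names alone could not give us.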

Figure 4: An Organizational Structure: Defined in terms of Capabilities & Requirements with the use of Semantic versioning.

As shown, we can completely describe the System in terms of Requirements and Capabilities, without referencing specific named entities.

Evolution and the role of Semantic Versioning

Capabilities and Requirements are now the primary means via which we understand the structure of our System. However, we are still left with the problem of understanding change over time.

  • In an organization chart, to what degree are the dependencies still valid if an employee is promoted (Capabilities enhanced)?
  • In a graph of interconnected software components, to what degree are the dependencies still valid if we refactor one of the components (changing / not changing a public interface)?

By applying simple versioning we can see that changes have occurred, but we do not understand the impact of these changes. If semantic versioning is used instead (see http://www.osgi.org/wiki/uploads/Links/SemanticVersioning.pdf), the potential impact of a change can be communicated.

This is achieved in the following manner:

  • Capabilities are versioned with a major.minor.micro versioning scheme. In addition, we collectively agree that minor or micro version changes represent non-breaking changes, e.g. 2.7.1 → 2.8.7. In contrast, major version changes, e.g. 2.7.1 → 3.0.0, represent breaking changes which may affect the users of our component.
  • Requirements are now specified in terms of a range of acceptable Capabilities. Square brackets ‘[‘ and ‘]‘ are used to indicate inclusive bounds and parentheses ‘(‘ and ‘)‘ to indicate exclusive bounds. Hence a range [2.7.1, 3.0.0) means any Capability with version at or above 2.7.1 is acceptable up to, but not including, 3.0.0.

Using this approach we can see that if Joe is substituted for Helen, Tom’s Requirements are still met. However, Harry, while having a Manager Capability, cannot meet Tom’s Requirements as Harry’s 1.7 skill set is outside of the acceptable range for Tom, i.e. [2,3).

Via the use of semantic versioning the impact of change can be communicated. Used in conjunction with Requirements and Capabilities we now have sufficient information to be able to substitute components while ensuring that all the structural dependencies continue to be met.
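The range notation described above is easy to check mechanically. The following Python sketch is an illustration only: `parse_version` and `VersionRange` are hypothetical helpers written for this post, not the OSGi framework classes, and we assume that missing version segments count as zero and that there is no qualifier segment.

```python
from dataclasses import dataclass


def parse_version(text):
    """Turn 'major.minor.micro' into a tuple of ints, padding with zeros."""
    parts = [int(p) for p in text.strip().split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


@dataclass(frozen=True)
class VersionRange:
    low: tuple
    high: tuple
    low_inclusive: bool
    high_inclusive: bool

    @classmethod
    def parse(cls, text):
        """Parse ranges such as '[2.7.1, 3.0.0)' or '[2,3)'."""
        text = text.strip()
        low, high = text[1:-1].split(",")
        return cls(
            parse_version(low),
            parse_version(high),
            low_inclusive=text[0] == "[",
            high_inclusive=text[-1] == "]",
        )

    def includes(self, version):
        """True if the given Capability version satisfies this Requirement."""
        v = parse_version(version)
        above = v >= self.low if self.low_inclusive else v > self.low
        below = v <= self.high if self.high_inclusive else v < self.high
        return above and below


toms_requirement = VersionRange.parse("[2,3)")
print(toms_requirement.includes("1.7"))    # Harry's Manager Capability
print(toms_requirement.includes("2.8.7"))  # a non-breaking upgrade within 2.x
```

Because versions are compared as tuples, 2.10.0 correctly sorts after 2.9.0, which a plain string comparison would get wrong.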

Our job is almost done. Our simple System is Agile & Maintainable!

 

Agile – All the Way Down

The final challenge concerns complexity. What happens when the size and sophistication of the System increases, with an increased number of components and a large increase in inter-dependencies? The reader, having already noticed a degree of self-similarity arising in the previous examples, may have already guessed the answer.

The Consumer of our Service selected our Service because the advertised Capabilities met the consumer’s Requirements (see figure 1). The implementation of the System which provided this Service is masked from the consumer. This pattern is once again repeated one layer down. The System’s structure is itself described in terms of the Capabilities and Requirements of the participating components (see figure 4). This time, the internal structure of the components is masked from the System. As shown in figure 5, this pattern may be repeated at many logical layers.

Figure 5: An Agile Hierarchy: Each layer only exposes the necessary information. Each layer is composite with the dependencies between the participating components expressed in terms of their Requirements and Capabilities.

All truly Agile systems are built this way, consisting of a hierarchy of structural layers. Within each structural layer the components are self-describing: self-describing in terms of information relevant to that layer, with unnecessary detail from lower layers masked.
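One way to picture this layering is a composite that is itself described only by Requirements and Capabilities. The Python sketch below is a hypothetical illustration: the `Part` and `Composite` classes and every component and capability name in it are invented for this example. A composite advertises only the Capabilities it chooses to export, and Requirements that are satisfied internally never surface to the layer above.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """A leaf component described only by what it provides and requires."""
    name: str
    provides: frozenset = frozenset()
    requires: frozenset = frozenset()


@dataclass
class Composite:
    """A layer built from parts; it looks like a single part from above."""
    name: str
    parts: list
    exports: frozenset = frozenset()

    @property
    def provides(self):
        # Only exported Capabilities are visible; the rest is internal detail.
        offered = frozenset().union(*(p.provides for p in self.parts))
        return self.exports & offered

    @property
    def requires(self):
        # Requirements met inside the composite are masked from the outside.
        offered = frozenset().union(*(p.provides for p in self.parts))
        needed = frozenset().union(*(p.requires for p in self.parts))
        return needed - offered


parser = Part("Parser", provides=frozenset({"Parse"}), requires=frozenset({"Store"}))
store = Part("Store", provides=frozenset({"Store"}))
ingest = Composite("Ingest", [parser, store], exports=frozenset({"Parse"}))

reporter = Part("Reporter", provides=frozenset({"Report"}),
                requires=frozenset({"Parse", "Audit"}))
reporting = Composite("Reporting", [ingest, reporter], exports=frozenset({"Report"}))

print(sorted(ingest.provides), sorted(ingest.requires))
print(sorted(reporting.provides), sorted(reporting.requires))
```

Since a `Composite` exposes the same two properties as a `Part`, composites can be nested to any depth without the upper layers knowing.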

This pattern is repeated again and again throughout natural and man-made systems. Natural ecosystems build massive structures from nested hierarchies of modular components:

  • The Organism
  • The Organ
  • The Tissue
  • The Cell

For good reason, commercial organizations attempt the same structures:

  • The Organization
  • The Division
  • The Team
  • The Individual

Hence we might expect complex Agile software systems to also mirror these best practices:

  • The Business Service
  • Coarse grained business components.
  • Fine grained micro-Services.
  • Code level modularity.

This process started in the mid/late 1990s as organizations started to adopt coarse grain modularity as embodied by Service Oriented Architectures (SOA) and Enterprise Service Buses (ESBs). These approaches allowed business applications to be loosely coupled, interacting via well defined service interfaces or message types. SOA advocates promised more ‘Agile’ IT environments as business systems would be easier to upgrade and/or replace.

However, in many cases the core applications never actually changed. Rather the existing application interfaces were simply exposed as SOA Services. When viewed in this light it is not surprising that SOA failed to deliver the promised cost savings and business agility: http://apsblog.burtongroup.com/2009/01/soa-is-dead-long-live-services.html.

Because of the lack of internal modularity, each post-SOA application was as inflexible as its pre-SOA predecessor.

 

To be Agile?

We conclude this section with a brief summary of the arguments developed so far.

To be ‘Agile’ a System will exhibit the following characteristics:

  • A Hierarchical Structure: The System will be hierarchical. Each layer composed from components from the next lower layer.
  • Isolation: For each structural layer; strong isolation will ensure that the internal composition of each participating component will be masked.
  • Abstraction: For each layer; the behavior of participating components is exposed via stated Requirements and Capabilities.
  • Self-Describing: Within each layer the relationship between the participating components will be self-describing; i.e. dependencies will be defined in terms of published Requirements and Capabilities.
  • Impact of Change: Via semantic versioning the impact of a change on dependencies can be expressed.

Systems built upon these principles are:

  • Understandable: The System’s structure may be understood at each layer in the structural hierarchy.
  • Adaptable: At each layer in the hierarchy, structural modularity ensures that changes remain localized to the affected components; the boundaries created by strong structural modularity shield the rest of the System from these changes.
  • Evolvable: Components within each layer may be substituted; the System supports diversity and is therefore evolvable.

The System achieves Agility through structural modularity.

In the next post in this series we will discover how OSGi™ – the Java™ Modularity framework – meets the requirements of structural modularity, and thereby provides the necessary foundations for popular Agile Methodologies and ultimately, Agile businesses.

Complex Systems and Failure

Tim Harford’s ADAPT was one of those spontaneous airport bookshop purchases.

In summary, a good read with a relevant message. Short-termism stifles true innovation. It is only by attempting novel high risk activities that we can hope to make substantive changes and ultimately succeed.

ADAPT provides some advice for putting this philosophy into practice:

  • All interesting systems (ecological, economic, social, political) are Complex.
  • ‘Complexity’ is not the issue:  Tight coupling is the issue.
  • Tight coupling propagates failure; tight coupling must be avoided.
  • Information has context. Lose the context and much of the value of the information is lost.
  • Avoid overly centralised command and control. Rather, delegate the decision making process.
  • Where possible, act locally.

Those interested in ‘Complex Adaptive Systems’ will be aware of the substantive body of background research that underpins ADAPT’s arguments.

Why Complex Systems Fail

And yet these principles are rarely put into practice by the software industry.

Response to Failure: A tightly-coupled system

This is perplexing as the ‘fail fast‘ mantra is not new: it just seems to have been largely ignored. While Berkeley’s Recovery Oriented Computing program demonstrated these ideas almost a decade ago, we see little evidence of them being incorporated in the latest ‘Cloud’ & ‘Virtualisation’ platform offerings from the dominant software vendors. Indeed, peel beneath the marketing covers, and the usual suspects continue to pursue ‘High Availability’ or ‘Fault-Tolerant‘ approaches.

The folly of this is nicely explained by ‘How Complex Systems Fail‘ (University of Chicago’s Cognitive Technologies Laboratory). This paper covers some of the same ground as ADAPT but explains the problem from an IT Operational perspective. Fault-Tolerance masks component failure, and so, paradoxically, such systems are more vulnerable to severe cascading or systemic failures.

When these cascading events finally do occur, Operations are the only defence: Operations pick up the pieces!

The author (Richard I. Cook) argues a number of points; but the following two are to my mind the most important.

  • Safety is a characteristic of systems and not of their components.
  • Failure free operations require experience with failure.

Surely such fundamental principles should be at the core of modern ‘cloud’ platform runtimes? Surely failure recovery must be an integral part of any overall solution? These behaviours exercised as part of normal ongoing runtime activities and not as responses to rare Black Swan events? Finally, surely cloud environments should aim to be truly loosely coupled environments?!

Note: In my book, reliance on centralised message brokers and naive use of rigid ZooKeeper type lock services are part of the problem, not the solution.

 

Stopping Complex Systems from Failing?

Markov Chain Analysis of a loosely-coupled ‘Target State’ driven platform

Thanks to early exposure to adaptive SOA frameworks like Jini, Paremus developed a strong intuition with respect to the requirements for mission critical ‘cloud’ environments. To provide concrete theoretical foundations, Paremus in 2005 used Markov Chain analysis to simulate the availability of traditional HA clusters, and contrasted these with alternative architectures we internally referred to as ‘No Frame of Reference (NFoR)‘. An ‘NFoR‘ architecture had no static control points and could continuously re-allocate software components as required.

To achieve this:

  • Component failure was visible within the runtime environment.
  • Loose coupling at all structural layers ensured that failure was effectively isolated.
  • As the architecture was extremely modular, only the smallest units needed to be replaced & recovery was rapid.
  • Sophisticated  ‘Target State Driven‘ dependency management automatically replaced the failed units.
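The ‘Target State Driven‘ idea in the list above can be sketched as a reconciliation step: compare what should be running with what is actually running, and emit only the actions needed to close the gap. This is a hypothetical illustration and not the Paremus implementation; the `reconcile` function, the action tuples and the component names are all invented here.

```python
def reconcile(target, running):
    """Return the actions that move the observed state towards the target.

    target:  dict mapping component name -> desired number of instances
    running: dict mapping component name -> list of nodes with a live instance
    """
    actions = []
    for component, wanted in sorted(target.items()):
        live = len(running.get(component, []))
        if live < wanted:
            actions.append(("start", component, wanted - live))
        elif live > wanted:
            actions.append(("stop", component, live - wanted))
    # Anything running that is not part of the target state is removed.
    for component in sorted(set(running) - set(target)):
        actions.append(("stop", component, len(running[component])))
    return actions


target_state = {"pricing": 2, "quotes": 1}

# One 'pricing' instance has just failed: only that unit is replaced.
observed = {"pricing": ["node-a"], "quotes": ["node-b"]}
print(reconcile(target_state, observed))

# Once the observed state matches the target, there is nothing left to do.
healthy = {"pricing": ["node-a", "node-c"], "quotes": ["node-b"]}
print(reconcile(target_state, healthy))
```

Running the same step repeatedly is what makes failure handling part of normal operation: a failure simply shows up as a difference between the two states on the next pass.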

The results of the simulations were clear. A ‘No Frame of Reference‘ runtime platform embodying fail fast and automated repair and recovery behaviours significantly outperformed traditional static high availability alternatives.

 

Even in the most volatile of environments with multiple failures being rapidly injected; such platforms always settled back into a functional state.
Such solutions, because of their extreme agility, could also be rapidly reconfigured, shutdown and re-started by Operations.
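To give a feel for why rapid automated repair matters, here is a toy two-state Markov chain in Python. It is emphatically not the 2005 Paremus model: it describes a single unit that is either ‘up’ or ‘down’, the `availability` function is a hypothetical helper, and the failure and repair probabilities are invented numbers chosen only to show the trend.

```python
def availability(p_fail, p_repair, steps=10_000):
    """Long-run probability that a unit is 'up' in a two-state Markov chain.

    p_fail:   probability per time step that an 'up' unit fails
    p_repair: probability per time step that a 'down' unit is repaired
    """
    up = 1.0  # start in the 'up' state
    for _ in range(steps):
        # Either the unit stays up, or it was down and has been repaired.
        up = up * (1.0 - p_fail) + (1.0 - up) * p_repair
    return up


# Same failure rate, different recovery speeds (illustrative values only).
slow_repair = availability(p_fail=0.01, p_repair=0.05)
fast_repair = availability(p_fail=0.01, p_repair=0.50)

print(f"slow repair: {slow_repair:.4f}")
print(f"fast repair: {fast_repair:.4f}")
```

The iteration converges to p_repair / (p_fail + p_repair), so for a fixed failure rate the availability is driven by how quickly a failed unit is replaced. A single-unit chain says nothing about cascading failure, which is where loose coupling comes in.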

Hence, our own experiences were consistent with the advice offered by ADAPT and  Richard I. Cook’s paper.

The following are fundamental requirements.

  • A high degree of structural modularity, as Modular Systems are Maintainable Systems.
  • Loose coupling between interactive software components (locally or network distributed).
  • Loose coupling between components and the underlying resources (physical or virtual).

With the following implications:

  • Structural modularity requires powerful dependency management.
  • Resource abstraction requires sophisticated ‘Target State‘ provisioning / re-provisioning capabilities.

And for Paremus, the OSGi software modularity framework provided a compelling set of industry standards via which these capabilities might be achieved.

It is worth emphasising that such capabilities are not a function of the programming language used. Choice of language does not in itself provide an answer: just the notation you might use to realise an answer. For this reason I see the increasing adoption of the OSGi modularity system as far more significant than recent Java developments or even the emergence of languages like Scala.

Nor is resource ‘virtualisation’ relevant in achieving this goal. Virtualisation is an orthogonal and secondary concern! If you need to partition physical resource – by all means use virtual machines. If you need to partition a data-centre pursue a SDC (Software defined Data Centre) strategy. But tread carefully! These solutions do not address your fundamental issues and risk the introduction of yet another complex tightly coupled management layer.

That’s all for today!

If you are interested in further detail on Service Fabric concepts: see Paremus Service Fabric Concepts and Terminology.

Why modularity matters more than virtualization.

Ten years ago it all seemed so simple! Increase utilization of existing compute resource by hosting multiple virtual machines per physical platform, so consolidating applications onto fewer physical machines. As the virtual machine ‘shields’ its hosted application from the underlying physical environment, this is achieved without changes to the application. As applications may now move runtime location without re-configuration, the idea of virtual machine based ‘Cloud Computing’ was inevitable.

However, there are downsides.

Virtual machine image sprawl is now a well-known phrase. If the virtual machine image is the unit of deployment, any software upgrade or configuration change, no matter how small, generates a new image. With a typical size of ~1 Gbyte (see table 2 – http://www.ssrc.ucsc.edu/Papers/ssrctr-10-01.pdf), this soon adds up! Large virtual environments rapidly consume expensive on-line and off-line data storage resource. This in turn has driven the use of de-duplication technologies, so increasing storage cost and / or operational complexity.

Once constructed, virtual machine images must be propagated, perhaps many times across the network, to the physical hosts. A small configuration change that results in a new virtual machine image, which then needs to be deployed to many nodes, can generate hundreds of Gbytes of network traffic.
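A back-of-the-envelope calculation makes the point. In this Python sketch the ~1 Gbyte image size comes from the figure quoted above, while the node count of 300 and the 1 Mbyte size of a changed software artifact are assumptions picked purely for illustration.

```python
def push_traffic_mb(artifact_size_mb, node_count):
    """Network traffic, in megabytes, for sending one artifact to every node."""
    return artifact_size_mb * node_count


NODES = 300            # assumed size of the target environment
IMAGE_SIZE_MB = 1000   # ~1 Gbyte virtual machine image
CHANGED_PART_MB = 1    # assumed size of the artifact that actually changed

image_push_mb = push_traffic_mb(IMAGE_SIZE_MB, NODES)
delta_push_mb = push_traffic_mb(CHANGED_PART_MB, NODES)

print(f"whole image : {image_push_mb / 1000:.1f} Gbytes")
print(f"changed part: {delta_push_mb / 1000:.1f} Gbytes")
```

With these assumed numbers, re-deploying the whole image for a one-line change moves a thousand times more data than shipping only the part that changed.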

When used as the unit of application deployment, virtualization increases operational complexity and increases the consumption of expensive physical network and storage resources, both of which are, ironically, probably more expensive than the compute resource whose use virtualization is attempting to optimize.

We’re not finished!

  • Some categories of application simply cannot be de-coupled from the physical environment. Network latency is NOT zero, network bandwidth is NOT infinite and locality of data DOES matter.
  • Virtualization complicates and obscures runtime dependencies. If a physical node fails, which virtual machines were lost? More importantly, which services were lost, which business applications were operationally impacted? Companies are now building monitoring systems that attempt to unravel these questions: further operational band-aids!
  • Centralized VM management solutions introduce new and operationally significant points of failure.
  • As the operational complexity of virtual environments is higher than that of their physical predecessors, there is an increased likelihood of catastrophic cascading failure caused by simple human error.

Feeling comfortable with your virtualization strategy?

 

 

For all these reasons, the idea of re-balancing critical production loads by dynamically migrating virtual machine images is, I suggest, a popular Marketing Myth. Many analysts, software vendors, investors and end users continue to see virtualization as the ultimate silver bullet! They are, I believe, deluded.

The move to the ‘virtual enterprise’ has not been without significant cost. The move to the ‘virtual enterprise’ has not addressed fundamental IT issues. Nor will moving to public or private Cloud solutions based on virtualization.

 

 

And so the Story Evolves

Acknowledging these issues, a discernible trend has started in the Cloud Computing community. Increasingly the virtual machine image is rejected as the deployment artifact. Rather:

  • Virtual machines are used to partition physical resource.
  • Software is dynamically installed and configured.
  • In more sophisticated solutions, each resource target has a local agent which can act upon an installation command. This agent is able to:
    • Resolve runtime installation dependencies implied by the install command.
    • Download only the required software artifacts.
    • Install, configure and start required ‘services’.
  • Should subsequent re-configure or update commands be received; the agent will only download the changed software component, and / or re-configure artifacts that are already cached locally.
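The agent behaviour described in the list above boils down to two steps: work out everything an install command implies, then fetch only what is not already cached. The Python sketch below is a hypothetical illustration; the `resolve` and `plan_downloads` functions, the dependency table and the artifact names are all invented for this example and do not describe any particular product.

```python
def resolve(roots, requires):
    """Return every artifact implied by installing the given root artifacts."""
    needed, pending = set(), list(roots)
    while pending:
        artifact = pending.pop()
        if artifact not in needed:
            needed.add(artifact)
            pending.extend(requires.get(artifact, ()))
    return needed


def plan_downloads(roots, requires, cache):
    """List the artifacts the agent must fetch; cached ones are skipped."""
    return sorted(resolve(roots, requires) - set(cache))


# Hypothetical dependency metadata: artifact -> artifacts it requires.
requires = {
    "pricing-app-1.0": ["pricing-core-1.0", "logging-1.2"],
    "pricing-app-1.1": ["pricing-core-1.0", "logging-1.2"],
    "pricing-core-1.0": ["math-utils-3.4", "logging-1.2"],
}

# First install on a node that already caches some shared artifacts.
cache = {"logging-1.2", "math-utils-3.4"}
first_install = plan_downloads(["pricing-app-1.0"], requires, cache)
print(first_install)

# After installation everything resolved is cached; an update to 1.1
# then only needs the one artifact that changed.
cache |= resolve(["pricing-app-1.0"], requires)
update = plan_downloads(["pricing-app-1.1"], requires, cache)
print(update)
```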

Sort of makes sense, doesn’t it!?

The Elephant in the Room

Dynamic deployment and configuration of software artifacts certainly makes more sense than pushing around virtual machine images. But have we actually addressed the fundamental issues that organisations face?

Not really.

As I’ve referenced on many occasions; Gartner research indicates that software maintenance dominates IT OPEX (http://www.soasymposium.com/pdf_berlin/Anne_Thomas_Manes_Proving_the.pdf). In comparison hardware costs are only ~10% of this OPEX figure.

 

 

“Our virtual cloud strategy sounds awesome: but what are the business benefits again??”

 

To put this into perspective; a large organisation’s annual IT OPEX may be ~$2 billion. Gartner’s research implies that, of this, $1.6 billion will be concerned with the management and maintenance of legacy applications. Indeed, one organization recently explained that each line of code changed in an application generated a downstream cost of >$1 million!

The issue isn’t resolved by virtualisation, nor Cloud. Indeed, software vendors, IT end users, IT investors and IT industry analysts have spent the last decade trying to optimize an increasingly insignificant part of the OPEX equation; while at the same time ignoring the elephant in the room.

 

 

Modular Systems are Maintainable Systems

If one is to address application maintainability – then modularity is THE fundamental requirement.

Luckily for organizations that are predominantly Java based, help is at hand in the form of OSGi. OSGi specifications and corresponding OSGi implementations provide the industry standards upon which an organisation can begin to modularise their portfolio of in-house Java applications, thereby containing the on-going cost of application maintenance. For further detail on the business benefits of OSGi based business systems, see http://www.osgi.org/wiki/uploads/Links/OSGiAndTheEnterpriseBusinessWhitepaper.pdf.

But what are the essential characteristics of a ‘modular Cloud runtime’: characteristics that will ensure a successful OSGi strategy? These may be simply deduced from the following principles:

  • The unit of maintenance and the unit of re-use are the same as the unit of deployment. Hence the unit of deployment should be the ‘bundle’.
  • Modularity reduces application maintenance for developers. However, this must not be at the expense of increasing runtime complexity for operations. The unit of operational management should be the ‘business system’.

Aren’t these requirements inconsistent? No, not if the ‘business system’ is dynamically assembled from the required ‘bundles’ at runtime. Operations deploy, re-configure, scale and update ‘business systems’. The runtime internally maps these activities to the deployment and re-configuration of the required OSGi bundles.

Simple.

In addition to these essential characteristics:

  • We would still like to leverage the resource partitioning capabilities of  virtual machines. But the virtual machine image is no-longer the unit of application deployment. As the runtime dynamically manages the mapping of services to virtual and physical resources; operations need no longer be concerned with this level of detail. From an operational perspective, it is sufficient to know that the ‘business system’ is functional and meeting its SLA.
  • Finally, it takes time to achieve a modular enterprise. It would be great if the runtime supported traditional software artifacts including WARs, simple POJO deployments and even non-Java artifacts!

Are there any runtime solutions that have such characteristics? Yes, one: the Paremus Service Fabric. A modular Cloud runtime – designed from the ground-up using OSGi; for OSGi based ‘business systems’. The Service Fabric’s unique adaptive, agile and self-assembling runtime behaviors minimize operational management whilst increasing service robustness. To get you started – the Service Fabric also supports non OSGi artefacts.

A final note: even Paremus occasionally bends to IT fashion :-/ Our imminent Service Fabric 1.8 release will support deployment of virtual machine images: though if you are reading this blog hopefully you will not be too interested in using that capability!