SOTA Archives

February 4, 2008

Phased Out

The recent announcement by Intel regarding advancements in phase change memory heralds a profound shift in how storage and memory systems on computers will work in the near future. The use of material phase states to store data means that phase change memory will blur the distinction between the long and short term memory requirements of current systems. Instead of having fast volatile RAM in systems and secondary storage for data persistence, only a single memory store will be required. The need to load data "into memory" in order to participate in computations will be rendered obsolete. Instead, all persistent data will be as easily accessible as data in memory. In essence, we will have large capacity in-memory data stores. This type of memory will unify volatile and non-volatile memory, rendering the distinction irrelevant. One of the obvious advantages will be the "instant on" capability of devices which use this type of memory. Computing devices will begin to behave more like appliances, without the boot times associated with most computers today.

The long range implication of this advancement is the application of phase change materials to processors themselves. Instead of a lithographed realization of a processor on a silicon die, processors themselves will be reconfigurable to take advantage of new designs. The general purpose CPU will be replaced by processors which can behave as Field Programmable Gate Arrays. The ability to reconfigure the processor depending on the computational task will allow for more efficient execution of instructions within an execution context. Dynamic hardware parallelism will allow computationally expensive operations to be optimized at the processor level.

Keep an eye out for advancements in material sciences which will allow this type of processor morphism.

February 6, 2008

Solid State Santa Rosa

No spin: Ars reviews the MacBook Air with solid state drive

A lot of reviews of the MacBook Air are starting to trickle in, and I think one of the most obvious gripes with this machine is the lackadaisical disk performance. Given the space constraints of the Air form factor, it seems obvious to leverage chipset functionality to improve I/O performance. The Santa Rosa chipset already supports "Robson", an external flash-based cache for hard drive operations. If this technology were leveraged to cache writes to the disk, you could have the best of both worlds. Random access speed would be matched by sequential write performance.

Of course, this will probably be mitigated by second generation consumer level solid state drives. But until this inevitability materializes and the cost of SSDs descends from the stratosphere, this seems like a reasonable compromise.

April 1, 2008

Apple Delphic Prediction

There seems to be a groundswell regarding the next-gen iPhone, but given the recent Intel announcement regarding Atom, I suspect that something else is being developed in the Apple Skunk Works. The growing popularity of the Eee PC from Asus points to a trend of ultra-mobile computing that Apple would be foolish to ignore. This trend, combined with the apparent success of the Kindle, points to a hand-held device based on a 7-9 inch touch screen with wireless capabilities. I think that Apple acknowledges that there are display limitations inherent in the current iPhone/Touch form factors that can be addressed by a somewhat larger form factor. Think of it as the big brother / premium version of the Touch targeted at media distribution. As well as sporting the ubiquitous music/podcast functionality, it could potentially have the following capabilities:

  • act as a light weight browser application platform with integrated off-line capability (via Air or Google Gears)
  • function as a truly portable e-book reader with the same functionality as Kindle with books accessible via the iTunes store through the wireless connection
  • act as remote viewer for television shows and movies
  • allow for wireless syncing via 802.11n / 3G wireless

In addition to these advantages, it would solve the current keyboard dilemma by offering more screen real estate to display the current software keyboard. Combine this with a fuller syntax of gesture based control and you have a winner. It would fill the product niche between the Air and the Touch.

Think I'm crazy? Remember that the iPhone was in development for years before the actual launch, but the signs were there in the market. Apple had only to develop for the obvious void.

August 8, 2011

The Big Picture : Part 1 - Plan for Change

There are a few misconceptions regarding Selenium / WebDriver that should be clarified from the start. When I first began to use Selenium (pre 1.0) the mechanism for automation relied on the technique of Proxy Injection. While this worked well for applications which rendered largely static content, the wide-spread use of AJAX made this technique very unreliable from an automation point-of-view. Timing was difficult to manage and required client-side javascript to notify automation of the application's ready state. In addition, it was impossible to account for browser events that were not represented in the DOM (such as javascript alerts).

Enter Selenium 2 and WebDriver. Instead of using Proxy Injection, WebDriver automates actions using a browser's native events. This allows for more reliable timing mechanisms as well as the possibility of catching native browser events. Overall, this is a great leap forward for automation.

Beware Selenium-IDE

There is just one other caveat with using Selenium from my perspective which should be avoided at all costs - the Selenium IDE. While useful for small, relatively static sites, Selenese tests result in massive duplication, brittle abstractions, and tests that do not encourage reuse. Capture and record tools such as the Selenium IDE are seductively simple to use, but extremely difficult to maintain. Changes to the application require changes to tests themselves. This quickly becomes unmanageable due to the violation of the DRY principle. As such, tools such as Selenium IDE can be classified as semi-automatic as they require a lot of manual intervention when the application changes. Ultimately, semi-automated testing methodologies are doomed to fail simply because the underlying system they depend upon to function ultimately changes. Any system which does not account for application evolution can only capture the requirements at a given point in time.

WebDriver is an API

By favoring programmatic automation, you can leverage general programming principles to create a framework which accounts for the natural evolution of the AUT.

Which brings us to the crux of the matter - WebDriver is an API, not the solution. What does this mean? WebDriver is the means by which automation is achieved, but by itself it does not necessarily give structure to the solution of automating applications in general. The relationships between application components are loosely defined and have no inherent structure in WebDriver. In some cases, the pre-packaged abstractions are insufficient to reflect the complex relationships between application components.

This is the role of the framework: to support automation using reusable abstractions against an evolving application.

Automation as a goal does not happen in a vacuum. Applications must be constructed to support automation. As such, coordination with application architects is crucial to support any effort. The framework must be flexible enough to support clearly defined tests whose implementations may change as the application evolves, but whose intent remains the same. To this end, I will describe some of the design decisions I made when creating my testing framework.

Web Application Automation Concepts

The Reusable Element

Most applications reuse GUI controls; this must be reflected in any testing framework. While WebDriver supports finding generic WebElements and manipulating them via additional actions, this requires test code which relies on these methods to be aware of implementation details. In addition, this introduces duplication as soon as you have more than one of a single type of element. By creating an Element abstraction, you can define similar application components by their locators as well as the types of operations they can support for automation. For example, a Text Box is different from a Dropdown. One cannot select items from a Text Box, but both may allow for text input. Common behaviors need to be defined in a single representation, while type specific behaviors need to be differentiated.

Having a reusable Element abstraction allows you to do nifty things like automatic validation of a control based on its type. This is particularly useful for smoke tests. As well, changes to an Element's behavior can easily be propagated throughout the entire testing framework if it is expressed canonically in a single representation. Elements also allow for functional composition. By taking two or more fundamental Elements, you can compose testable aggregate Elements with increasingly sophisticated behavior which still behave as a single functional unit.

You can also localize procedural abstractions such as when an Element is resolved. Ideally, you want to resolve any given Element just before it is used. This minimizes DOM inconsistencies which arise in applications which re-render output based on post-backs.
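As a rough illustration of the Element idea, here is a minimal sketch. The Browser interface and the RecordingBrowser fake are hypothetical stand-ins invented for this example (they are not WebDriver types), and the class names are my own. The point is only that shared behavior lives in one place, type specific behavior is added per control, and the locator is handed to the driver at the moment of use rather than resolved up front.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the slice of browser automation the framework needs.
// It is NOT the WebDriver API; a real implementation would delegate to it.
interface Browser {
    void type(String locator, String text);
    void select(String locator, String option);
}

// Behavior shared by every control: a locator that is only used when an
// operation is actually performed (resolve just before use).
abstract class Element {
    protected final Browser browser;
    protected final String locator;

    Element(Browser browser, String locator) {
        this.browser = browser;
        this.locator = locator;
    }

    // Both text boxes and dropdowns may accept typed input.
    void enterText(String text) {
        browser.type(locator, text);
    }
}

final class TextBox extends Element {
    TextBox(Browser browser, String locator) {
        super(browser, locator);
    }
}

final class Dropdown extends Element {
    Dropdown(Browser browser, String locator) {
        super(browser, locator);
    }

    // Only a Dropdown can select an option; TextBox has no such method.
    void choose(String option) {
        browser.select(locator, option);
    }
}

// In-memory fake that records calls, so the sketch runs without a browser.
final class RecordingBrowser implements Browser {
    final List<String> log = new ArrayList<>();

    public void type(String locator, String text) {
        log.add("type " + locator + " " + text);
    }

    public void select(String locator, String option) {
        log.add("select " + locator + " " + option);
    }
}
```

Because TextBox simply has no choose method, a test that tries to select from a text box fails at compile time instead of at run time.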

XPath is Regex for the DOM

Location Strategies are determined by application structure. Ideally, every element has a consistent element ID that is unique to the page where it is located and the same between application invocations. A lot of applications, however, do not meet these criteria. Although there are a number of different location strategies supported by WebDriver, the most powerful by far is XPath.

Tools such as XPather for Firefox allow you to select elements via XPath, but unfortunately use only positional element expressions (such as table[1]/tr[3]/td[2]). Not only are these expressions difficult to read, but they are heavily reliant on the ordering of the DOM. This makes them brittle.

What is needed is a way to specify DOM path expressions which are rooted in the application's vernacular and disambiguate Elements effectively. By leveraging the expressiveness of XPath, you have the ability to specify Elements relative to other Elements. This is useful when labels are distinct from the components they decorate. In addition, the annotation method of supplying locators for Elements in WebDriver precludes the ability to use templates to find Elements of a given type which vary only by their identifying characteristic. XPath allows the use of simple String templates. This allows for parameterized Element locators.
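A small sketch of a parameterized locator, assuming a form where each input directly follows its label. The template, the class name LabeledInputLocator and the sample markup are all made up for illustration. To keep the example runnable without a browser, it evaluates the expression with the JDK's own javax.xml.xpath against a string of markup; in a real framework the same expression string would be handed to the driver's XPath locator instead.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class LabeledInputLocator {
    // Template: the id of the first input that follows a label with the given text.
    // The label text is the only part that varies between Elements of this type.
    private static final String TEMPLATE =
        "//label[normalize-space(text())='%s']/following-sibling::input[1]/@id";

    // Builds the XPath for one label. Note: a label containing a single quote
    // would need escaping, which this sketch does not attempt.
    static String xpathFor(String labelText) {
        return String.format(TEMPLATE, labelText);
    }

    // Evaluates the expression against well-formed markup and returns the id
    // (an empty string if nothing matches).
    static String resolveId(String markup, String labelText) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(xpathFor(labelText),
                                  new InputSource(new StringReader(markup)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad XPath for label: " + labelText, e);
        }
    }
}
```

The expression is anchored on the label text the user sees rather than on position in the DOM, so inserting another row above the field does not break the locator.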

The expressiveness of XPath comes at a price. Generally, using XPath for element location is slower than other methods. In addition, not all browsers support XPath natively (Internet Explorer for example). That being said, XPath provides a strategy for taking an existing application and making it amenable to automation. As the application evolves to support unique persistent IDs, these changes can be made globally at the Element abstraction.

For an excellent tutorial on XPath, see the documentation at ZVON.org.

The Reusable Page

Essentially a Page is a container for Elements which are manipulated via automation. Ideally, Elements associated with a Page should be lazy-initialized on use. In addition, a Page serves as a navigational component. To test something, you need to know where it is and how to get to it. Relationships between pages which define navigational structure depend on the type of application you are trying to automate.

The Hierarchical Application

In a Hierarchical Application, each page is located only once in the navigational structure. This type of structure is amenable to programmatic page traversal. In addition, if the application constructs page hierarchies programmatically (as it should), this information can be extracted from the application and the Page relationships can be created via code generation. The Generation Gap pattern is particularly useful in this regard. C#'s Partial Classes, together with the ability to nest Classes, make it well suited to solve this problem.

The Process Oriented Application

By far the more difficult to automate, the Process Oriented Application has no clear notion of location; Page relationships are defined in the context of a given process. A wizard-based application is the stereotypical Process Oriented Application. This type of application is not well suited to programmatic page traversal simply due to the fact that Pages may have circular dependencies. In this case, it may be difficult to automate the creation of Page relationships.

The Case Against WebDriver.PageFactory

PageFactory relies on defaults for the Element lookup strategy that may not be appropriate for the AUT. The use of the @FindBy annotation also makes it difficult to create dynamic Element lookups which are parameterized. The modification of annotations requires the use of reflection, which is both cumbersome and expensive. In addition, it is questionable whether caching WebElements via @CacheLookup is useful given the possibility of StaleElementReferenceExceptions.

Instead of the PageFactory, Pages should express their dependencies explicitly in their constructors and hold a lazily-initialized dictionary of Elements with keys based on the language present in the application. If used in combination with the Element abstraction described previously, Element initialization is delegated back to the Element when accessed, not to the Page. The Page's element dictionary provides a mechanism for finding Elements; nothing more. Pages constructed in this manner can be invoked directly or through the use of a dependency injector.
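One way this could look is sketched below. The generic Page class, its define and element methods, and the type parameters D (whatever driver-like dependency the Page is given) and E (the Element type) are all hypothetical names chosen for the example. The Page receives its dependency through the constructor, registers Elements under the names used in the application, and builds each one only on first access.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// D is the driver-like dependency, E is the framework's Element type.
// Both are left generic so the sketch does not depend on any real library.
class Page<D, E> {
    private final D driver;
    private final Map<String, Function<D, E>> definitions = new HashMap<>();
    private final Map<String, E> elements = new HashMap<>();

    // The dependency is explicit: no annotations, no reflection.
    Page(D driver) {
        this.driver = driver;
    }

    // Registers how to build an Element, keyed by the name users see on screen.
    Page<D, E> define(String key, Function<D, E> factory) {
        definitions.put(key, factory);
        return this;
    }

    // Builds the Element on first access and reuses it afterwards.
    E element(String key) {
        Function<D, E> factory = definitions.get(key);
        if (factory == null) {
            throw new IllegalArgumentException("No element named '" + key + "' on this page");
        }
        return elements.computeIfAbsent(key, k -> factory.apply(driver));
    }
}
```

Because the factory can be any function, parameterized lookups (for example, a locator built from a template) need no special treatment, which is the main thing the annotation approach makes awkward.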

Putting it All Together

The Case For the Use of Fluent Interfaces

From a programming perspective, it is useful to think of the automation framework as serving different clients. There will be programmers responsible for wiring up the framework to the application as opposed to those responsible for wiring up tests to the framework.

These are two different tasks whose difficulty can be mitigated through the use of Fluent Interfaces. Page and Element definitions clearly express their requirements. Page navigation and Element access read descriptively rather than as a series of programmatic operations on application component primitives.
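To give a feel for the test-facing side, here is a toy sketch of a fluent chain. The Journey class and its onPage, enter and click methods are invented for this example and only record what would happen; a real framework would route each call to the corresponding Page and Element.

```java
import java.util.ArrayList;
import java.util.List;

// Toy fluent interface: every method returns the Journey itself, so a test
// reads as one sentence instead of a list of driver calls.
final class Journey {
    private final List<String> steps = new ArrayList<>();
    private String page = "";

    // Navigates to a page, named as the application names it.
    Journey onPage(String name) {
        page = name;
        steps.add("open " + name);
        return this;
    }

    // Types text into a named Element on the current page.
    Journey enter(String element, String text) {
        steps.add(page + ": type '" + text + "' into " + element);
        return this;
    }

    // Clicks a named Element on the current page.
    Journey click(String element) {
        steps.add(page + ": click " + element);
        return this;
    }

    // The recorded steps, in order; useful for reporting or assertions.
    List<String> steps() {
        return steps;
    }
}
```

A test written against it might read: new Journey().onPage("Login").enter("User Name", "ada").click("Sign In"). The names in the chain are the application's own vocabulary, not locators.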

Degrees of Freedom

In the words of Einstein:

"Everything should be made as simple as possible, but not simpler."

All of these abstractions are not designed to introduce unnecessary complexity, but to manage the inherent complexity of automating an application. Application testing must be able to respond to various degrees of freedom which have the ability to destabilize test outcomes. The ultimate goal is reproducibility of the test's intent in the face of change. The following are the different changes which a framework must be resilient to.

The Application Changes : An Element is added/modified/removed

Adding an element for automation simply requires associating it with a given Page. When an element is modified (such as when it is superseded by a new control with more advanced functionality), it need only be changed in a single location. Changes cascade throughout the entire framework with little work.

The Application Changes : A Page is added/modified/removed

If you programmatically determine page relationships, then simply running the code-generation component will create a stub for the new page or remove associated references to a deleted page. Most applications, however, undergo a more frequent kind of evolution: pages are modified. Elements are added to or removed from pages; this should happen independently of Element evolution. Pages should simply bind Elements that are part of their scope of responsibility.

The Automation Framework Changes

While the solution presented here hinges on the use of WebDriver, there is a case to be made for framework isolation. All software evolves, and WebDriver is no exception. An automation framework built on WebDriver should also isolate changes to WebDriver itself. Leaking implementation details into tests by directly referencing WebDriver primitives results in fragility when the WebDriver API changes. Ideally, Page or Element bindings should not be directly impacted by changes to the framework itself.

The concepts of interface inheritance and implementation delegation to wrap primitive framework calls work well to isolate the automation framework from WebDriver changes. In essence, the Element object behaves much like a WebDriver WebElement without exposing any internal implementation details. This allows extension of the original WebDriver API with custom helper methods/interfaces.
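A minimal sketch of that wrapping, under stated assumptions: VendorElement is a hypothetical stand-in for the third-party element type (it is not the real WebElement), and HasText, TextInput and WrappedTextBox are names made up for the example. Tests see only the framework-owned interfaces; the vendor type is touched in exactly one class.

```java
// Hypothetical stand-in for a third-party element type whose API may change.
final class VendorElement {
    private final StringBuilder value = new StringBuilder();

    void clear() {
        value.setLength(0);
    }

    void sendKeys(String keys) {
        value.append(keys);
    }

    String currentValue() {
        return value.toString();
    }
}

// Framework-owned interfaces: the only types tests are allowed to reference.
interface HasText {
    String text();
}

// Interface inheritance: a text input is readable and also writable.
interface TextInput extends HasText {
    void setText(String text);
}

// Implementation delegation: every call is forwarded to the vendor object,
// so a vendor API change is absorbed here and nowhere else.
final class WrappedTextBox implements TextInput {
    private final VendorElement delegate;

    WrappedTextBox(VendorElement delegate) {
        this.delegate = delegate;
    }

    // Helper the vendor type lacks: replace the content in one call.
    @Override
    public void setText(String text) {
        delegate.clear();
        delegate.sendKeys(text);
    }

    @Override
    public String text() {
        return delegate.currentValue();
    }
}
```

If the vendor renames sendKeys tomorrow, only WrappedTextBox changes; every test written against TextInput keeps compiling.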

The Target Web Browsers Change

There's a good chance that at some point, you will have to test your application against different browsers. To prepare for this eventuality, tests should be created in a web browser agnostic fashion. No test should depend on a specific browser; all automation operations should be done through the RemoteWebDriver/WebElement. By doing so, not only will you be able to run your tests against other browsers, but you will also be able to accommodate future browser updates as support for them is added to WebDriver.

What's Next?

Despite having the ability to automate testing, it is infeasible to test everything. Not all tests are equal. The most valuable tests reflect actual application usage. This is the role of specification testing. In the next article, I'll talk about how to use JBehave to fill this role.

August 10, 2011

The Big Picture : Part 2 - Testing for the Common Man

What Are We Testing?

To be clear, the focus of this article is to discuss tests which involve various client web browsers which connect to the application. There are other kinds of tests that we will not cover here - unit tests and integration tests (at the service level). There are other tools which are suitable for those tests and they fall under the responsibility of the application developers themselves. More importantly, these types of tests are non-client facing.

What is important, however, is that tests cover the behavior of different versions and kinds of web browsers. While WebDriver supports HtmlUnit via the WebDriver interface, we focus primarily on browsers supported by the RemoteWebDriver interface. This includes the following browsers:

  • Firefox
  • Chrome
  • Internet Explorer
  • Opera

The only unsupported major browser is Safari. Writing tests against the RemoteWebDriver allows us to keep them browser agnostic. In addition, Grid 2 currently supports only RemoteWebDriver clients.

Everyone seems to have a different definition for particular kinds of tests. For the purpose of this discussion, I'd like to offer my own definitions; others, however, will likely disagree with these categorizations.

Acceptance Tests

All tests which ensure that the application behaves as expected can be considered Acceptance Tests. Within this categorization are other sub-categories.

Smoke Tests

Smoke Tests ensure that the application is technically functional. Generally, these types of tests ensure navigational and Element behaviors are consistent throughout the application. Validation at the Element level could be done programmatically as part of these tests. Smoke Tests are capable of verifying technical requirements, but generally at the expense of providing user context to the operations. These types of tests are useful for programmers, but not so much for end users.

Specification Tests

Specification Tests use the language of the user domain to express application behavior. The value of using specifications is that these tests create a consistent vocabulary which is used to communicate end-user expectations to programmers. Essentially, these specifications form the foundation of acceptance criteria which are codified in a readable manner. There are many different specification frameworks, but I chose JBehave for the simple fact that specifications are written in plain text. This makes it easy for non-technical end users to contribute to the creation of specifications. Features as well as bugs can be described via Stories and Scenarios.

Regression Tests

Regression Tests typically express defects within the application to which fixes have been applied. The difficulty with this type of test is usually a question of reproducibility. By using the same language used for Specification Tests, preconditions, steps to simulate the behavior, and expected outcomes are clearly expressed.

Performance Tests

Performance Tests are generally designed to exert load on the application back-end. There are tools such as JMeter which apply the record-playback principle for generating load, but such tools have limitations. If your application uses dynamically generated assets, these types of tests are difficult to use. In addition, it is questionable whether tests based on JMeter measure actual application behavior against a client or merely the ability of the application server to respond to concurrent requests.

Synthetic browser tools remove a critical part of the equation - the browser itself. While HtmlUnit can be used in this scenario as well, it is not a browser that end users are likely to use. Different browsers introduce variability (such as different JavaScript engines) that must be accounted for. This is more important for applications that are AJAX aware.

One and the Same

Code reuse is a fundamental principle in managing complexity in large software projects. As test coverage expands to cover more aspects of the AUT, leveraging this principle becomes crucial to manage the tests themselves.

Ideally, Specification and Regression Tests can be re-used for measuring performance. This allows for load testing using a single framework. This can be accomplished through the use of a concurrency demultiplexer that can take a single Specification or Regression Test and dynamically create multiple concurrent instances of it (the number determined at runtime). JUnit natively supports concurrency through the RunnerScheduler. You can effectively simulate group tests using this method.
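The fan-out idea can be sketched with nothing more than the JDK's ExecutorService. This is a simplification and not the JUnit RunnerScheduler mechanism mentioned above; the Demultiplexer class and its fanOut method are hypothetical names, and the "test" is any Callable. It takes one test, runs the requested number of copies concurrently, and collects one result per copy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class Demultiplexer {
    // Runs `copies` concurrent instances of the same test and returns their
    // results in submission order. `copies` can be decided at runtime.
    static <T> List<T> fanOut(Callable<T> test, int copies) {
        ExecutorService pool = Executors.newFixedThreadPool(copies);
        try {
            List<Callable<T>> tasks = new ArrayList<>();
            for (int i = 0; i < copies; i++) {
                tasks.add(test);
            }
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every copy has finished.
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running tests", e);
        } catch (ExecutionException e) {
            // A failing copy fails the whole run in this sketch.
            throw new IllegalStateException("A test instance failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In practice each copy would need its own browser session, so the Callable should create (and close) its driver inside call rather than share one across threads.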

This solution, however, requires a number of machines to run the tests. Virtualization can be used to provision client browser machines, but has its own costs. One must balance the costs of machine resources with human resources.

Domain-specific Languages (DSL) To The Rescue

Behavior Driven Development engages end users in defining application behavior. Instead of simple technical requirements driving application development, the perspective of the end user is given a central role. Specifications are written using the language of their domain; this provides the necessary context for considering the success or failure of a behavior. This is the role of the Domain-specific Language; JBehave is particularly useful in this regard.

JBehave allows the following:

  1. Specifications can be written in plain text.
  2. Specifications use a structured syntax that clearly expresses pre-conditions, steps, and expected outcomes.
  3. Specifications encourage stories to be written from the point of view of a User of an application. This allows specifications to be defined for different perspectives given a User's role within an application.
  4. Instead of expressing operations at the component (Element) level, tests describe functionality in the context of individual user operations (Scenarios) grouped by functionality (Stories).
  5. Stories are reusable as pre-conditions of other stories.
  6. Parameterized steps allow for table-driven input. This effectively removes the need for tools like Fitnesse. Both input and output can be specified with the test.

Caveat Emptor

While JBehave is generally useful for BDD, it is not without its faults. My initial attempt to use the GuiceAnnotatedEmbedderRunner was a complete failure. The documentation surrounding its use was insufficient to diagnose my issues. Barring a source-code evaluation of JBehave's implementation of this feature, I would recommend avoiding it.

In addition, I would suggest avoiding the JBehave-Selenium module. There are design flaws in its implementation which tightly couple test code to WebDriver. Pages created in the manner suggested by JBehave-Selenium inherit directly from WebDriver, thus breaking encapsulation. Changes in WebDriver APIs have a direct impact on the Page implementation. All a Page actually requires is for its dependencies to be injected, either via an injector framework or manually via constructor injection. As per the Gang of Four, favor composition over inheritance.

In addition, having a typed WebDriverProvider exposes the browser implementation details into the test framework; this should be safely abstracted away into a Configuration object.

The Grid

Grid allows tests to be distributed across machines for parallel execution. This is particularly useful for load testing as well as providing concurrent testing against various browser types and versions. When combined with different operating systems, this method provides an effective means of testing an application across a number of different platforms simultaneously. A single virtual machine can have multiple grid nodes which can consist of different browser types. The number of nodes per machine is limited only by memory. By having a central repository of available browser resources, scheduling concurrent test execution becomes manageable.

The structure of the automation environment plays a direct part in how Grid is used. I'll cover this in more detail in Part 4.

What's Next?

The number of tests against a mature application only increases over time. This requires a method for organizing the tests themselves. The next article deals with issues of test organization and how to introduce modularity and versioning to your tests.

About SOTA

This page contains an archive of all entries posted to Z1R0 in the SOTA category. They are listed from oldest to newest.


Colophon

Creative Commons License
This weblog is licensed under a Creative Commons License.