Monday, June 6, 2016

Decomposing Page Objects

This month has been heavily dedicated to defining a more robust approach to the page object pattern, which I have been calling Page Modeling. Page Modeling is not going away. However, a recent 'twitter war' with Marcel de Vries forced me to really think about what the classic Page Object pattern brings to the table that Page Modeling is 'missing', and to highlight where Page Modeling fits into the picture.

This post is dedicated to decomposing the page object pattern into three distinct layers of abstraction. I believe that Page Objects have too many responsibilities. This leads to ambiguity about how to build a Page Object, and it forces multiple changes to the Page Object when only simple changes are made to the UI. I will explain what this means below.

To start, I would like to identify three distinct aspects of the Page Object pattern which I feel should be decoupled.
  1. Page Modeling
    Finding UI elements on screen and exposing the behaviors and observations of those UI elements
  2. Orchestration
    Given the UI capabilities, what are the interesting things we can do
  3. Scenarios
    Given an orchestration, implements scenarios based on real user use of the application
I realized that page modeling alone does not provide all of the utility of page objects, but the additional utility can easily be added. That additional utility, however, doesn't really have a standard abstraction and would be custom to your application, because the higher level abstractions depend on the business concerns around how the application is used rather than representing the UI. Below, I cover each aspect and how they can be assembled to implement the page object pattern in a far more robust and maintainable manner.

Page Modeling

I have covered Page Modeling in depth in previous posts as well as on the CodedUI Examples website, so I'll just summarize it here. Martin Fowler states:
"The basic rule of thumb for a page object is that it should allow a software client to do anything and see anything that a human can. It should also provide an interface that's easy to program to and hides the underlying widgetry in the window."
At some point, you actually do need to interrogate that widgetry to test UI state and verify that the controls are behaving properly (formatting phone numbers, prefixing money with $, etc.). This is what makes the traditional page object pattern unmanageable, as I will try to highlight below. Page Modeling hides the implementation of the widgetry so that the client doesn't care whether it is using a TextBox, TextArea, MyCustomTextControl, etc. Instead, Page Modeling exposes the observations and behaviors of a UI element. So what are observations and behaviors?

Behaviors

Behaviors are what the user can do with the UI, for instance setting the value of a TextBox or clicking a button. The result of a behavior is typically a Page Model representing the next most-likely thing with which the user will interact. This allows for a fluent syntax where the result of an action returns the next thing to work with.

Page Modeling


interface ILoginPage : IPageModel
{
    // expose the components that allow a login to happen, rather than a single Login(username, password) method
    IReadWriteValuePageModel<string, ILoginPage> Username {get;}
    IReadWriteValuePageModel<string, ILoginPage> Password {get;}
    ISelectionablePageModel<ILoginPage> RememberMe {get;}
    IClickablePageModel<IAccountSettings> Login {get;}
}

interface IAccountSettings : IPageModel
{
    IReadWriteValuePageModel<string, IAccountSettings> FirstName {get;} // setting the value returns the account settings page
    IClickablePageModel<IAccountSettings> Save {get;}
}

public void LoginAndSetFirstName()
{
    ILoginPage loginPage = new LoginPage(browserWindow);// get a reference to the login page

    Assert.IsFalse(loginPage.Login.IsActionable()); // IsActionable <=> enabled and visible
    
    Assert.IsTrue(loginPage.Username.SetValue("MyUserName") // set the username which returns reference to the login page
                           .Password.SetValue("MyPassword") // set the password which returns reference to the login page
                           .Login.IsActionable());

    IAccountSettings accountSettings = loginPage.Login.Click(); // click login which returns a reference to account settings page

    // from here, I could do more, but the above logic would probably be refactored out into a Scenario
    // IAccountSettings accountSettings = new LoginScenario(loginPage).LoginStandardUser();

    Assert.IsFalse(loginPage.IsRendered());
    Assert.IsTrue(accountSettings.IsRendered());

    string myName = "MyName";
    string savedName = accountSettings.FirstName.SetValue(myName) // set first name returns reference to account settings page
                                      .Save.Click() // click save which returns a reference to account settings page
                                      .FirstName.Value; // get the current value in the first name field after the page refreshes from the POST request

    Assert.IsTrue(myName.Equals(savedName)); // compare against what the page reports, not against itself
}
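
The generic interfaces in the example above are used without showing their definitions. As a rough illustration only, here is a minimal sketch of how such interfaces might be shaped and why they chain fluently. Everything in it is an assumption made for this post: the member lists are guesses based on how the interfaces are used above, and FakeTextBox, FakeButton and FakeLoginPage are hypothetical in-memory stand-ins rather than anything that drives a real browser. To keep it short, the fake Login button simply stays on the same page.

```csharp
using System;

// Hypothetical shapes for the generic page-model interfaces; the real ones may differ.
public interface IPageModel
{
    bool IsRendered();
}

// A behavior: clicking yields the page model the user most likely works with next.
public interface IClickablePageModel<TNext> where TNext : IPageModel
{
    bool IsActionable();
    TNext Click();
}

// A behavior (SetValue) plus an observation (Value) for any value-holding element.
public interface IReadWriteValuePageModel<TValue, TNext> where TNext : IPageModel
{
    TValue Value { get; }
    TNext SetValue(TValue value);
}

// In-memory stand-in for a text box; a real implementation would drive the UI.
public class FakeTextBox<TNext> : IReadWriteValuePageModel<string, TNext> where TNext : IPageModel
{
    private readonly Func<TNext> next;
    public FakeTextBox(Func<TNext> next) { this.next = next; }
    public string Value { get; private set; } = "";
    public TNext SetValue(string value) { this.Value = value; return this.next(); }
}

// In-memory stand-in for a button whose enabled state is computed on demand.
public class FakeButton<TNext> : IClickablePageModel<TNext> where TNext : IPageModel
{
    private readonly Func<bool> actionable;
    private readonly Func<TNext> next;
    public FakeButton(Func<bool> actionable, Func<TNext> next)
    {
        this.actionable = actionable;
        this.next = next;
    }
    public bool IsActionable() => this.actionable();
    public TNext Click()
    {
        if (!this.IsActionable()) throw new InvalidOperationException("Button is not actionable.");
        return this.next();
    }
}

// One property per UI element; each element knows which page model comes next.
public class FakeLoginPage : IPageModel
{
    public FakeLoginPage()
    {
        this.Username = new FakeTextBox<FakeLoginPage>(() => this);
        this.Password = new FakeTextBox<FakeLoginPage>(() => this);
        this.Login = new FakeButton<FakeLoginPage>(
            () => this.Username.Value != "" && this.Password.Value != "",
            () => this);
    }
    public bool IsRendered() => true;
    public IReadWriteValuePageModel<string, FakeLoginPage> Username { get; }
    public IReadWriteValuePageModel<string, FakeLoginPage> Password { get; }
    public IClickablePageModel<FakeLoginPage> Login { get; }
}
```

With these stand-ins, page.Username.SetValue("MyUserName").Password.SetValue("MyPassword").Login.IsActionable() reads the same way as the test above. The second type parameter is what makes the chain possible: each element declares which page model the user lands on after interacting with it.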

This fluent syntax is highly expressive of what the user is doing. In Page Objects, you would have a ton of methods like Is{control}{state}(). In Page Modeling, you have one property for each UI element {control} with methods like Is{state}(). Consider the difference below.

Page Objects


class LoginPage : Page
{
    AccountSettings Login(string username, string password);
    bool IsLoginButtonActionable();
}

class AccountSettings : Page
{
    // I can create one method for each property
    // AccountSettings UpdateFirstNameAndSave(string firstName); // oh boy... do not go this route!

    // or use optional parameters for what I want to set
    // which means every time fields are added or removed, this method has to change
    // and the logic inside is kinda crappy if(!String.IsNullOrWhiteSpace(firstName)) {/*set first name*/} ...
    AccountSettings SetUserInfoAndSave(string firstName = null, string lastName = null, DateTime? birthDate = null);

    // what about reading?  either need a class/struct to hold all the values
    AccountInfo GetUserInfo();

    // or one for each; again, no clear answer
    string GetFirstName();
}

public void LoginAndSetFirstName()
{
    var loginPage = new LoginPage(browserWindow);
    Assert.IsFalse(loginPage.IsLoginButtonActionable());

    // If I login, I can't assert anything about the button, so let me go update my class...
    loginPage.Login("myUser", "myPass");
}

// updating LoginPage
class LoginPage : Page
{
    AccountSettings Login(string username, string password);
    bool IsLoginButtonActionable();
    LoginPage SetUsernameAndPassword(string username, string password);
}

// updating Test
public void LoginAndSetFirstName()
{
    var loginPage = new LoginPage(browserWindow);
    Assert.IsFalse(loginPage.IsLoginButtonActionable());

    loginPage.SetUsernameAndPassword("myUser", "myPass");
    Assert.IsTrue(loginPage.IsLoginButtonActionable());

    // now, I want to login, but I only need click the login button
    // there is no way to do that, I have to either do a .Login call
    // to (again) set the username and password, or update my class!
}

// updating LoginPage
class LoginPage : Page
{
    AccountSettings Login(string username, string password); // no enforcement that this calls SetUsernameAndPassword which may have special logic for setting
    bool IsLoginButtonActionable();
    LoginPage SetUsernameAndPassword(string username, string password);
    AccountSettings ClickLogin(); // assuming only use this method in conjunction with SetUsernameAndPassword method
}

// updating Test
public void LoginAndSetFirstName()
{
    var loginPage = new LoginPage(browserWindow);
    Assert.IsFalse(loginPage.IsLoginButtonActionable());

    loginPage.SetUsernameAndPassword("myUser", "myPass");
    Assert.IsTrue(loginPage.IsLoginButtonActionable());

    var name = "myName";
    AccountSettings accountSettings = loginPage.ClickLogin();
    accountSettings.SetUserInfoAndSave(firstName: name);
    Assert.IsTrue(name.Equals(accountSettings.GetFirstName()));
    Assert.IsTrue(name.Equals(accountSettings.GetUserInfo().FirstName));
}

Not only are there a bunch of methods, but there are overlapping concerns and an ambiguous development strategy. There are now two ways to log in: Login(), or SetUsernameAndPassword() followed by ClickLogin(). Along the way, multiple methods were added just to test something about the UI. In Page Modeling, there is no such ambiguity: there is one property per UI element, and it exposes what that element can do and what can be observed about it.

Observations

Observations are what the user can observe about your UI. For instance: what is the current value of the text box, is the element visible, does it exist on the screen, is it enabled? Observations should not have side effects and should not require an action from the user whenever possible. Sometimes this is unavoidable and the observation becomes more similar to a behavior. However, as long as the side effect or user action doesn't require manipulation of state outside of the UI element's control, it is typically safe. An example of this would be a TextBox whose value is obscured until you click the eyeball in the textbox. An observation that reads the text by first clicking the eyeball if needed, and then resets the state to obscured, would be OK. For rigorous testing, there should also be a way to tell whether the box is currently obscured or plain.
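
To make the eyeball example concrete, here is a small sketch of an observation that has to perform an action but leaves the element as it found it. FakePasswordBox and all of its members are hypothetical names invented for this illustration, and the class is an in-memory stand-in, not a real UI control.

```csharp
using System;

// Hypothetical in-memory password box with a "reveal" toggle (the eyeball).
public class FakePasswordBox
{
    private string text = "";

    // Observation with no side effects: is the box currently masking its text?
    public bool IsObscured { get; private set; } = true;

    // Behavior: type a value into the box.
    public void SetValue(string value) { this.text = value; }

    // Behavior: click the eyeball.
    public void ToggleReveal() { this.IsObscured = !this.IsObscured; }

    // Observation with no side effects: what is literally shown on screen.
    public string DisplayedText =>
        this.IsObscured ? new string('*', this.text.Length) : this.text;

    // Observation that needs an action: reveal if needed, read, then restore.
    // The only state it touches belongs to this element, and it puts it back.
    public string Value
    {
        get
        {
            bool wasObscured = this.IsObscured;
            if (wasObscured) this.ToggleReveal();
            try
            {
                return this.DisplayedText;
            }
            finally
            {
                if (wasObscured) this.ToggleReveal();
            }
        }
    }
}
```

Reading Value on an obscured box returns the plain text and leaves IsObscured true afterwards, while DisplayedText and IsObscured let a test check the masking itself.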

Orchestration

Orchestration is simply defining meaningful strings of actions against a page model and exposing only the dependencies of that string of actions to the client. Using the login page above, an orchestration method might be Login(string username, string password). The orchestration takes a reference to the page model it orchestrates and uses the exposed observations and behaviors to create a meaningful set of actions.

interface ILoginActions
{
   IAccountSettings Login(string username, string password);
}

public class LoginActions : ILoginActions
{
    public readonly ILoginPage loginPage;
    public LoginActions(ILoginPage loginPage)
    {
       this.loginPage = loginPage;
    }

    public IAccountSettings Login(string username, string password)
    {
       // the orchestrator does not typically need to make assertions,
       // and can assume that there are tests for Login actions
       return
       this.loginPage
           .Username.SetValue(username)
           .Password.SetValue(password)
           .Login.Click();
    }
}

// using the orchestration in a test
public void LoginAndSetFirstName()
{
    var loginPage = new LoginPage(browserWindow);

    // do not care how to actually login
    IAccountSettings accountSettings = new LoginActions(loginPage).Login("myUsername", "myPassword");

    // perform the interesting work of setting name and asserting
    accountSettings.FirstName.SetValue("myName").Save.Click();
    Assert.IsTrue("myName".Equals(accountSettings.FirstName.Value));
}

// could even create extension methods
public static class LoginActionExtensions
{
    public static IAccountSettings Login(this ILoginPage loginPage, string username, string password)
    {
        return new LoginActions(loginPage).Login(username, password);
    }
}

// using the extension in the test seems more natural
public void LoginAndSetFirstName()
{
    // no need to new up some orchestrator class, just get the page and use the extension
    IAccountSettings accountSettings = new LoginPage(browserWindow).Login("username", "password");

    // perform the interesting work of setting name and asserting
    accountSettings.FirstName.SetValue("myName").Save.Click();
    Assert.IsTrue("myName".Equals(accountSettings.FirstName.Value));
}

Using the orchestration classes, tests that depend on some previous page model (e.g., must first log in to get to the desired page) can ignore how the given action is performed, knowing that the details are thoroughly tested elsewhere. Commonly used orchestrations would typically become scenarios, which are described next.

Scenarios

Scenarios are an even higher level abstraction than orchestrations: they perform a real-world use of the application. Consider three types of users: Basic, Premium, and Administrator. Each type has different login credentials, and those credentials, or even the way login is performed, could change; the Scenario shields tests from these changes. Methods of a Scenario should typically not expose any dependencies to the client, which is why they take no parameters.

interface ILoginScenarios
{
    IAccountSettings LoginBasicUser();
    IAccountSettings LoginPremiumUser();
    IAdminDashboard LoginAdminUser(); // notice, we're going somewhere else after login; this would be annoying to handle in a test
}

public class LoginScenarios : ILoginScenarios
{
    public readonly ILoginActions loginActions;
    protected readonly BrowserWindow window;

    public LoginScenarios(ILoginPage loginPage, BrowserWindow window) : this(new LoginActions(loginPage), window) { }

    public LoginScenarios(ILoginActions loginActions, BrowserWindow window)
    {
        this.loginActions = loginActions;
        this.window = window;
    }

    public IAccountSettings LoginBasicUser()
    {
       return this.loginActions.Login("basicUsername", "basicPassword");
    }

    public IAccountSettings LoginPremiumUser()
    {
       return this.loginActions.Login("premiumUsername", "premiumPassword");
    }

    public IAdminDashboard LoginAdminUser()
    {
       // don't return as it's not the right model
       this.loginActions.Login("adminUsername", "adminPassword");
       return new AdminDashboard(this.window);
    }
}

// and possibly extensions here as well
public static class LoginScenarioExtensions
{
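     // LoginScenarios needs the BrowserWindow to build the admin dashboard, so pass it through
     public static IAccountSettings LoginBasicUser(this ILoginPage loginPage, BrowserWindow window)
     {
        return new LoginScenarios(loginPage, window).LoginBasicUser();
     }

     public static IAdminDashboard LoginAdminUser(this ILoginPage loginPage, BrowserWindow window)
     {
        return new LoginScenarios(loginPage, window).LoginAdminUser();
     }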
}



Of course, the downside of the extension method approach is that the extension method class cannot implement the interface it reflects, but you could do something more elegant. Let's combine the power of all three layers into a single Facade.

public class LoginFacade : ILoginPage, ILoginActions, ILoginScenarios
{
    public readonly ILoginPage LoginPage;
    public readonly ILoginActions LoginActions;
    public readonly ILoginScenarios LoginScenarios;

    // the interfaces above do not expose the layer beneath them,
    // so build each layer here and keep a reference to all three
    public LoginFacade(ILoginPage loginPage, BrowserWindow window)
    {
        this.LoginPage = loginPage;
        this.LoginActions = new LoginActions(loginPage);
        this.LoginScenarios = new LoginScenarios(this.LoginActions, window);
    }

    // delegate the page model; IPageModel members such as IsRendered() would be delegated the same way
    public IReadWriteValuePageModel<string, ILoginPage> Username => this.LoginPage.Username;
    public IReadWriteValuePageModel<string, ILoginPage> Password => this.LoginPage.Password;
    public ISelectionablePageModel<ILoginPage> RememberMe => this.LoginPage.RememberMe;

    // explicit implementation, because the Login button shares its name with the Login(...) orchestration
    IClickablePageModel<IAccountSettings> ILoginPage.Login => this.LoginPage.Login;

    // delegate the orchestration
    public IAccountSettings Login(string username, string password) => this.LoginActions.Login(username, password);

    // delegate the scenarios
    public IAccountSettings LoginBasicUser() => this.LoginScenarios.LoginBasicUser();
    public IAccountSettings LoginPremiumUser() => this.LoginScenarios.LoginPremiumUser();
    public IAdminDashboard LoginAdminUser() => this.LoginScenarios.LoginAdminUser();
}



And we have come full circle. There is now a single object that exposes all three layers to the client. Each layer remains available, so tests can exercise the granular widgetry if needed or simply use an orchestration or scenario to navigate past the already tested workflows of the application.

Hopefully I've convinced you that decoupling the Page Object pattern is worth the effort and reduces ambiguity while increasing consistency of the testing strategy.
