
Imagrium: A framework for automating cross-platform testing of mobile applications
The company I work for develops custom software, including mobile applications for Android and iOS. Competition in this market segment is fierce, so testers are responsible not only for making the final product comply with the specification and the client's expectations, but also for doing so within a tight testing budget and schedule. This pushes us to explore new tools and methods that reduce testing costs and improve product quality.
Imagrium is the result of one of these explorations. Technically, it is a Jython framework, written by our company, for cross-platform testing of Android and iOS mobile applications using image recognition. It is distributed as a working PyDev project that you can change to fit your needs. The code is released under the MIT license and is available on GitHub. In this article I will describe the principles behind the framework and its structure.
Work principles
The framework has been around for two years, during which it has grown and absorbed the experience of being used on production projects. Its basic principles, however, have hardly changed. They are as follows:
Use the same tests on different platforms
Our customers usually order an application for several platforms at once, most often Android and iOS. There is one specification and the functional tests are the same, so from our point of view the most effective approach is to keep a single test base for all platforms. In other words, the same test must pass on different platforms.
Separate resources from logic
Tools that allow cross-platform testing can be divided into two types:
- Tools that provide a common API and an intermediary service that translates common calls into OS-specific ones (MonkeyTalk, Appium, Robotium).
- Tools that use pattern (image) recognition (Borland Silk Mobile, eggPlant).
We are big fans of the first approach, but at the time some of these tools did not exist yet and others were not stable enough, which pushed us to try writing an alternative. At the same time, we did not want yet another record-and-replay tool, because tests produced by such tools quickly become very hard to maintain. Why? Because in them the resources are tied to the logic. For example, if some primitive operation changes, you have to not only re-capture the images in all the tests, but also change the corresponding steps in every affected test. We wanted to avoid this pointless and costly work by using the popular PageObject pattern.
Support continuous integration and easy debugging
All our projects are built automatically by Jenkins, so from the very first line of the framework's code we wanted the application tests to run automatically as well. At first this was not easy. For working with images we chose Sikuli (as the most popular, best documented and free solution), which at the time offered no easy way to separate the library from the IDE and supported only Jython 2.5. That version, incidentally, had no json support yet, and its unittest lacked a number of useful features such as test auto-discovery. Over time we overcame these difficulties: test results are now available in jUnit format, and Ant turns them into a nice page with statistics.
As for debugging, if you have seen the Sikuli IDE, you know that writing anything more serious than a 10-line script in it is painful, if only because it has no debugger. That was enough for us to switch to PyDev in Eclipse, which is familiar to programmers and offers many auxiliary features that speed up development.
Ensure the same initial system state for all tests
We borrowed the idea of tests being independent of each other from jUnit, since it makes it possible to run tests in parallel or to perform spot checks. Another motivation was tests failing in the middle of execution: we needed a system in which the failure of one test would not affect the other tests. As a result, we decided to use emulator snapshots on Android and the simulator's reset function on iOS.
Emulator startup and response times should not critically delay test execution
We really liked the speed of the iOS simulator, which cannot be said of the stock Android emulator. We had hopes for HAXM and the x86 images supplied by Intel, but the trouble is that before version 4.4 these images did not include the Google APIs, which most of our applications use (that is, the applications simply could not be installed on these images). The 4.4 image, which does include the APIs, was unstable for us (for example, it could crash when the application was reinstalled). Therefore, we chose Genymotion and VirtualBox to create and manage snapshots.
If you share these principles and are looking for a testing framework, we suggest you consider ours as one of the alternatives.
Environment requirements
Imagrium runs on Java 7 x64 with Windows 7 x64 (the win branch of the repository) or MacOS 10.9 (the ios branch of the repository).
Historically, only Android was tested under Windows and only iOS under MacOS.
Work demonstration
Before we briefly go over the rules for writing tests in Imagrium, we want to show a video of the framework in action (callouts in English). The video shows a test for the HopHop app running on iOS and Android.
How to write tests
Code written with Imagrium falls into two groups: pages and tests. A test is a sequence of operations carried out on different pages, together with transitions between pages. For instance:
authPage = AuthPage.load(AppLauncher.box, self.settings)
fbAuthPage = authPage.signUpFb()
fbAuthPage.fillEmail(self.settings.get("Facebook", "email"))
fbAuthPage.fillPassword(self.settings.get("Facebook", "password"))
fbConfirmPage = fbAuthPage.login()
lobbyPage = fbConfirmPage.confirm()
As you can guess from the code, the test first loads AuthPage, moves from it to fbAuthPage, fills in the required data and submits the form, confirms the user and arrives at lobbyPage. In other words, the test walks through the pages and performs operations that make sense from the tester's point of view, leaving the implementation of those operations inside the pages. That is, tests are the simpler group, and to learn how to write them we only need to learn how to write pages. This is also quite simple.
Writing pages
A page is a Jython representation of an application's screen / activity / page. Technically, it is a class with fields and methods that operate on these fields. In the most complex case, it looks like this:
class FbAuthPage(Page):
    # Each field is bound to one or more graphic resources (here: English and Russian variants).
    email = ResourceLoader([Resource.fbEmailFieldiOS, Resource.fbEmailFieldiOS_ru])
    password = ResourceLoader([Resource.fbPasswordFieldiOS, Resource.fbPasswordFieldiOS_ru])
    actionLogin = ResourceLoader([Resource.fbLoginBtniOS, Resource.fbLoginBtniOS_ru])

    def __init__(self, box, settings):
        super(FbAuthPage, self).__init__(box, settings)
        # Limit the search area of every field to the emulator region.
        self.email = self.box
        self.password = self.box
        self.actionLogin = self.box
        self.settings = settings
        self.waitPageLoad()
        # Fail early if the expected fields are not on the screen.
        self.checkIfLoaded(['email', 'password'])

    def fillEmail(self, text):
        self.email.click()
        self.waitPageLoad()
        self.inputText(text)
This snippet uses most of Imagrium's features, so let's go through it in detail from the point of view of what the framework offers.
Field Definition and Localization
Let's start with this line:
email = ResourceLoader([Resource.fbEmailFieldiOS, Resource.fbEmailFieldiOS_ru])
This line binds the page's email field to a graphic resource (an image or a text string). Here the field is bound to two resources at once (for the English and Russian locales). When the page accesses the email field, the system tries to find one of these images and returns the region associated with the first image it finds.
By convention, we store graphic resources in the res directory. To define a resource, pass ResourceLoader the path to the resource or a text string, for example:
email = ResourceLoader("res/pages/ios/fb_auth/fbEmailFieldiOS.png", "Password")
However, to keep the code flexible and easy to maintain, we store the resource paths in attributes of the src.core.r.Resource object.
Summary: to work with an image or a text string, add a ResourceLoader declaration to the corresponding page.
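To make the lookup order concrete, here is a minimal, self-contained sketch of a field that tries several localized resources in turn. It is not Imagrium's actual ResourceLoader: the names LocalizedField and FakeScreen, the image file names, and the choice to return None when nothing is found are all assumptions made for this example.

```python
class FakeScreen(object):
    """Hypothetical stand-in for the screen: maps visible images to regions."""

    def __init__(self, visible):
        self.visible = dict(visible)  # image name -> (x, y, width, height)

    def find(self, image):
        # Return the region of the image, or None if it is not on screen.
        return self.visible.get(image)


class LocalizedField(object):
    """Hypothetical descriptor: reading the field searches each resource in order."""

    def __init__(self, resources):
        self.resources = list(resources)

    def __get__(self, page, owner=None):
        if page is None:
            return self  # accessed on the class, not on a page instance
        for resource in self.resources:
            region = page.screen.find(resource)
            if region is not None:
                return region  # the first resource that is found wins
        return None  # assumption: the real framework may behave differently


class LoginPage(object):
    # One field, two resources: English and Russian variants of the same control.
    email = LocalizedField(["email_en.png", "email_ru.png"])

    def __init__(self, screen):
        self.screen = screen


# Only the Russian variant is visible on this fake screen.
russian_screen = FakeScreen({"email_ru.png": (10, 20, 200, 40)})
found = LoginPage(russian_screen).email
print(found)
```

The point of the design is that page methods only ever refer to the field name (email), so adding a locale means adding one more resource to the list rather than touching any test.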
Page initialization and field validation
A field declaration only sets up the binding; to use it, we need to access the field. This happens either when an operation is performed on the field (for example, dragging or clicking) or when the page is initialized. Initialization is especially important because it lets us set the search area for the fields (usually we want to narrow it to the bounds of the emulator) and check that we are on the page we expect. So let's take a closer look at the page initialization code, using FbAuthPage as the example.
First, a page should always inherit from the src.core.page.Page class:
class FbAuthPage(Page):
This gives us access to the common page management methods, for example the method that waits for the page to load completely. We also need to call the parent constructor when the page is initialized:
    def __init__(self, box, settings):
        super(FbAuthPage, self).__init__(box, settings)
The last feature of initialization is the ability to set the search area for a field. Usually this is the area occupied by the emulator, and to search only within it, we write:
self.email = self.box
self.password = self.box
The box parameter is computed once, after the emulator snapshot is taken and before the application is launched for the first time, and is then passed from page to page. In more detail, two reference lines (one vertical and one horizontal on the emulator) are taken from, for example, res/pages/android/hdpi/core; the system detects them on the screen and builds the emulator area from them. If nothing is assigned to a field, the system searches for it across the whole screen (which usually hurts search quality), although this is sometimes needed for something exotic, for example pressing a hardware button on the emulator.
Usually, when initializing a page, we check that the fields we need are on it, and we do it this way:
self.checkIfLoaded(['email', 'password'])
Some fields may not be visible during initialization, so list only the visible ones. If the page cannot find the fields, it raises an AssertionError, which fails the test, and reports the details to stdout.
Summary: if you want the system to search for fields within the emulator, assign self.box to them. Check that the page's fields are present with self.checkIfLoaded().
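The two ideas in this section, restricting a field's search to a region and validating the page on load, can be sketched as follows. This is an illustration under stated assumptions, not the framework's code: Field, FakeScreen and LoginPage are hypothetical names, coordinates are made up, and only the method name checkIfLoaded and the AssertionError behaviour come from the text above.

```python
class FakeScreen(object):
    """Hypothetical screen: maps image names to the point where they appear."""

    def __init__(self, visible):
        self.visible = dict(visible)  # image name -> (x, y)

    def find(self, image, region=None):
        point = self.visible.get(image)
        if point is None or region is None:
            return point  # no region given: search the whole screen
        x, y, width, height = region
        inside = x <= point[0] < x + width and y <= point[1] < y + height
        return point if inside else None


class Field(object):
    """Hypothetical field: assignment stores the search region, reading searches in it."""

    def __init__(self, image):
        self.image = image
        self.slot = "_region_" + image

    def __set__(self, page, region):
        page.__dict__[self.slot] = region

    def __get__(self, page, owner=None):
        if page is None:
            return self
        return page.screen.find(self.image, page.__dict__.get(self.slot))


class Page(object):
    def __init__(self, screen, box):
        self.screen = screen
        self.box = box

    def checkIfLoaded(self, field_names):
        missing = [name for name in field_names if getattr(self, name) is None]
        if missing:
            raise AssertionError("fields not found: " + ", ".join(missing))


class LoginPage(Page):
    email = Field("email.png")
    password = Field("password.png")

    def __init__(self, screen, box):
        super(LoginPage, self).__init__(screen, box)
        self.email = self.box      # search only inside the emulator area
        self.password = self.box
        self.checkIfLoaded(["email", "password"])


box = (0, 0, 320, 480)
page = LoginPage(FakeScreen({"email.png": (50, 60), "password.png": (50, 120)}), box)
print(page.email)
```

If an image is visible only outside the box, the sketch treats the field as missing, so the page constructor fails before any test step runs against the wrong screen.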
What can be done with fields
In our FbAuthPage code example, we called the click() method on the email field:
self.email.click()
Each field is actually represented by a Sikuli Match object, so you can do with it everything described in the Sikuli documentation (drag, click, pinch, release, enter text, and so on).
Configuration access
Configuration is a very important part of running tests. In particular, it determines which platform the tests run on, which tests run, and which application they run against. You can also define your own variables in the configuration and use them in tests or in page code.
In our example FbAuthPage class, you can see the line:
self.settings = settings
Here the settings attribute is a ConfigParser instance bound to the current configuration file, so you can use all the methods from the official documentation when working with it. Imagrium also adds settings to each test, so you can use the configuration directly in tests.
An example of working with a configuration:
self.settings.get("Facebook", "email")
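For readers unfamiliar with ConfigParser, here is a small standalone sketch of how such a settings object behaves. The configuration text, the Timeouts section and all values are invented for the example; only the Facebook section and its email and password keys appear in the article. The sketch targets Python 3, where the module is called configparser (under Jython 2.5 it is ConfigParser and has no read_string method).

```python
import configparser  # the module is named ConfigParser on Python 2 / Jython

# Hypothetical configuration text; a real project would keep this in a .conf file.
SAMPLE_CONF = """
[Facebook]
email = tester@example.com
password = not-a-real-password

[Timeouts]
page_load = 15
"""

settings = configparser.ConfigParser()
settings.read_string(SAMPLE_CONF)

# Values are addressed by section and option, as in the test code above.
email = settings.get("Facebook", "email")

# Custom variables can be read with type conversion.
page_load = settings.getint("Timeouts", "page_load")

# Options can be checked before use.
has_token = settings.has_option("Facebook", "token")

print(email, page_load, has_token)
```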
OS-dependent methods
Sometimes you need OS-specific functionality, for example pressing the Back button (Android only) or entering text (Sikuli cannot do this reliably because it enters text asynchronously, so some characters are often dropped). To provide these capabilities, the framework includes classes that give your page OS-specific functionality.
In practice, it looks like this (multiple inheritance!):
class FbAuthPageiOS(FbAuthPage, iOSPage):
or so:
class FbAuthPageAndroidHdpi(FbAuthPage, AndroidPage):
Summary: if we need OS-specific functions, we inherit from the corresponding class.
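The effect of this multiple inheritance can be shown with a toy example. The class names follow the article, but the bodies are invented stand-ins: here the OS classes just record what they would do, whereas the real iOSPage and AndroidPage presumably talk to the simulator or emulator. Treat the method back() and the recorded strings as assumptions.

```python
class Page(object):
    """Toy base page that records actions instead of driving a device."""

    def __init__(self):
        self.actions = []


class AndroidPage(object):
    """Toy stand-in for the Android-specific helper class."""

    def inputText(self, text):
        self.actions.append("android-type:" + text)

    def back(self):  # hypothetical name for pressing the Back button
        self.actions.append("android-back")


class iOSPage(object):
    """Toy stand-in for the iOS-specific helper class."""

    def inputText(self, text):
        self.actions.append("ios-type:" + text)


class FbAuthPage(Page):
    """General page: the logic is shared, inputText comes from the OS class."""

    def fillEmail(self, text):
        self.inputText(text)


class FbAuthPageiOS(FbAuthPage, iOSPage):
    pass


class FbAuthPageAndroidHdpi(FbAuthPage, AndroidPage):
    pass


android = FbAuthPageAndroidHdpi()
android.fillEmail("tester@example.com")
android.back()

ios = FbAuthPageiOS()
ios.fillEmail("tester@example.com")

print(android.actions)
print(ios.actions)
```

The general page never needs to know which OS it runs on: the same fillEmail call resolves to a different inputText depending on which platform class was mixed in.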
Organization of pages
In the previous sections, we first talked about FbAuthPage and then jumped to FbAuthPageiOS and FbAuthPageAndroidHdpi. In this section, we discuss what these classes are, why they are needed, and how they relate to FbAuthPage.
We said at the start that we wanted to run the same test on different platforms, but even within a single platform the graphic resources can differ significantly. For example, resources for hdpi may differ from those for xhdpi, and resources for iOS may differ from those for Android. However, only the resources differ; the methods that work with them stay the same (or almost the same, allowing for platform guidelines). We needed a way to override resources per platform, and we used ordinary inheritance. In other words, our pages fall into two levels: general and platform-specific.
- The general page contains the general logic for working with the resources on the page. In our example, this is FbAuthPage. Its methods do the same thing on both iOS and Android.
- Platform-dependent pages usually contain only the resources that override those of the general page. They look something like this:
class FbAuthPageAndroidHdpi(FbAuthPage, AndroidPage):
    email = ResourceLoader([Resource.fbEmailFieldAndroidHdpi, Resource.fbEmailFieldAndroidHdpi_ru])
With so many page variants, we would not want the test or the page itself to decide which page class to load. In Imagrium, the system is responsible for reading the configuration and loading the right page whenever the load() method is called. For the system to find the right classes, they must follow a naming convention:
- iOS pages must be named [general page] + "iOS", for example FbAuthPageiOS. In addition, this page must be a descendant of FbAuthPage.
- Android pages have more options: the OS version and the screen density. The system first tries to load the class [general page] + "_" + "[major version]" + "_" + "[minor version]" + "_" + "Android" + "[density]", for example FbAuthPage_4_2_AndroidHdpi. If no such class exists, it tries [general page] + "Android" + "[density]", for example FbAuthPageAndroidHdpi.
If the class is still not found, the system raises an AssertionError with a corresponding description, which fails the test.
In practice, it looks like this. In the test, we write:
fbAuthPage = authPage.signUpFb()
The general AuthPage page has this method:
def signUpFb(self):
    self.actionAgreeTermsBtniOS.click()
    self.actionSignUpFb.click()
    return FbAuthPage.load(self.box, self.settings)
This method calls load(), during which the system decides which page class to instantiate (for example, FbAuthPageiOS or FbAuthPageAndroidHdpi).
Summary: we implement the general logic of the page and then, if another density or platform requires it, add the modified resources to the corresponding page classes. Based on the configuration, the system decides which page to load in each case.
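The class naming rules from this section can be expressed as a short lookup routine. This is a sketch of the convention as described in the article, not the body of Imagrium's load() method: the function names, the registry dictionary and the way platform, density and version are passed in are all assumptions.

```python
def candidate_class_names(base, platform, density=None, version=None):
    """Return class names to try, most specific first (hypothetical helper)."""
    if platform == "iOS":
        return [base + "iOS"]
    if platform == "Android":
        names = []
        if version is not None:
            major, minor = version
            names.append("%s_%d_%d_Android%s" % (base, major, minor, density))
        names.append("%sAndroid%s" % (base, density))
        return names
    raise ValueError("unsupported platform: %r" % (platform,))


def resolve_page_class(registry, base, platform, density=None, version=None):
    """Pick the first candidate present in the registry, else fail the test."""
    tried = candidate_class_names(base, platform, density, version)
    for name in tried:
        if name in registry:
            return registry[name]
    raise AssertionError("no page class found, tried: " + ", ".join(tried))


# Toy page classes standing in for real pages.
class FbAuthPage(object):
    pass


class FbAuthPageiOS(FbAuthPage):
    pass


class FbAuthPageAndroidHdpi(FbAuthPage):
    pass


registry = {cls.__name__: cls for cls in (FbAuthPageiOS, FbAuthPageAndroidHdpi)}

# No FbAuthPage_4_2_AndroidHdpi exists, so the density-only class is chosen.
chosen = resolve_page_class(registry, "FbAuthPage", "Android", "Hdpi", (4, 2))
print(chosen.__name__)
```

Trying the version-specific name first means a single OS release with a different look can be handled by adding one class, without touching the pages used for all other versions.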
How to run tests
We assume that you have cloned the project from GitHub, opened it in Eclipse with PyDev, removed what you do not need, added your own pages and tests, and now want to run them. To do so, install everything the framework needs (see the installation instructions) and then run from the project root:
ant
In a little more detail, this call launches run.py with a configuration file as its only argument (for example, conf/android_settings.conf), adds the necessary paths to the PATH and CLASSPATH variables, and then generates a page with the test results.
In general, when testing starts, the emulator is launched and the application is reinstalled on it, after which a snapshot of the emulator is taken. Then, for each test, this same snapshot is restored and the test runs on it.
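The snapshot-per-test flow can be illustrated with plain unittest. The sketch below uses a fake in-memory emulator, so FakeEmulator and its methods are hypothetical and say nothing about how Imagrium drives Genymotion, VirtualBox or the iOS simulator; it only shows why restoring a snapshot before each test keeps tests independent.

```python
import io
import unittest


class FakeEmulator(object):
    """Hypothetical in-memory emulator with snapshot support."""

    def __init__(self):
        self.state = {"app_installed": False, "logged_in": False}
        self._snapshot = None

    def install_app(self):
        self.state["app_installed"] = True

    def take_snapshot(self):
        self._snapshot = dict(self.state)

    def restore_snapshot(self):
        self.state = dict(self._snapshot)


class IsolatedTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Done once: start the emulator, install the app, take the snapshot.
        cls.emulator = FakeEmulator()
        cls.emulator.install_app()
        cls.emulator.take_snapshot()

    def setUp(self):
        # Done before every test: return to the clean snapshot.
        self.emulator.restore_snapshot()

    def test_1_login_changes_state(self):
        self.emulator.state["logged_in"] = True
        self.assertTrue(self.emulator.state["logged_in"])

    def test_2_next_test_starts_clean(self):
        self.assertTrue(self.emulator.state["app_installed"])
        self.assertFalse(self.emulator.state["logged_in"])


def run_suite():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(IsolatedTests)
    # Write the runner's report to a buffer to keep the example quiet.
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_suite()
print(result.testsRun, result.wasSuccessful())
```

Without the restore in setUp, the second test would see the login left behind by the first one and fail, which is exactly the kind of cross-test interference the snapshot approach is meant to prevent.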
Conclusion
We recommend using Imagrium when you need to test several platforms at once or when you do not have easy access to the application code to add locators for GUI elements. Those who have programmed in Python will learn the framework fastest, although the simplicity of the language's syntax should help anyone get up to speed quickly. This article covers only the basics of working with Imagrium; if it proves of interest to the community, I will describe the framework's configuration and capabilities (for example, multi-user scenarios) in detail in subsequent articles. You can also read the official documentation on the project's GitHub page and look at the Hello, World example.