
Mobile application automation with Appium
- Tutorial

Posted by Anton Sirota (QA, Automation)
In this article, based on a lecture that I recently gave, we will look at the Appium framework. This is introductory material meant to explain how mobile application automation works in principle, what you need for it, where to start and what difficulties you will encounter.
Automation of mobile applications is a relatively new field, but demand for it is constantly growing. Appium still has some rough edges, although in general the automation process built on it is already well established.
Contents
Environment for mobile automation
Search and work with elements
Working with a driver
Working with contexts
An emulator or a real device?
Possible problems / difficulties
Mobile automation process
Cloud services
Types of mobile applications:
Native.
Web.
Hybrid.
Before talking about automation, it is worth considering the types of mobile applications themselves.
Native apps. The main type of mobile application: they are installed on the phone and run directly on it. Native applications do not require a browser or the Internet, and if they do use them, they do not depend on their availability.
Web applications. Nowadays, responsive designs are common: they adapt a web application from its desktop version to a mobile one. You need to test such applications in conditions as close as possible to the real environment - on emulators or devices.
Hybrid applications. Native applications with a built-in ability to open web pages, so that access to the web goes through the native application.
Environment for automation
iOS Automation | Android Automation |
---|---|
Mac OS | Mac OS / Windows / Linux |
Xcode | Android SDK |
Node.js | Emulator setup |
Appium | Appium |
Application | Application |
 | HAXM driver (for SDK emulator) |
Before starting the automation process, it is important to understand which environment will need to be configured. And, of course, the two main operating systems that you have to deal with are iOS and Android.
What does iOS require to start? Apple's ecosystem is closed, so if tomorrow you need to automate an iOS application, you will need a machine running Mac OS; one option is to deploy everything on a Mac mini. Why? Because we need Xcode, which runs only on Mac OS.
The next tool is Appium. There are two ways to launch Appium: through the UI or from the console. To install and run the console version, we additionally need Node.js. The UI version can be downloaded from the official site and is ready to use immediately.
If we are talking about automating web applications, we will need to make sure that a browser is installed on the emulator. For Android, it will be Google Chrome, for iOS it will be Safari (which by default is always already in the emulator).
If we are testing a native application that the developers gave us, we need to make sure that it is on our machine and specify the path to it in Appium.
For Android automation, the choice of operating system is not so critical - everything can be configured under Windows, Linux or Mac OS. Note that Android automation cannot always be deployed inside a virtual machine: without a graphics adapter we simply cannot launch the Android emulator. To start, we will need the Android SDK, i.e., the development kit for Android, which already has a built-in emulator.
The next step is to configure the emulator. There you can create and configure an emulator and emulate almost any Android device.
We will also need Appium and the application's .apk file itself, or Chrome.apk for mobile web testing on Android. It is important to know that the standard Android SDK emulator has one big problem: it is very slow. To speed it up, check the “Use host GPU” checkbox when creating the emulator and install the HAXM driver (mentioned in the list). Then the emulator runs at a more or less decent speed. Of course, there are more stable and faster tools such as Genymotion - it is shareware, so if you want to use it in your project, you will have to purchase a paid license.
Search and work with elements
Tools:
Appium Inspector
UI Automator Viewer (Android)
UI Automation (iOS)
Locators:
• Xpath
• Id
• Class
• Name
• UI Automation id
• Css (mobile web only)
• Accessibility id (ios only)
If you work with a mobile web application, you can simply open it in a desktop browser, shrink the window to the size of the desired device and find the necessary elements.
If the application is native or hybrid, you will need the Appium Inspector - the inspector built into Appium works with both Android and iOS. That is, if you are running the UI version of Appium, you can click Inspect, after which a tree of application elements will appear. There are also separate programs: UI Automator Viewer for Android and UI Automation for iOS. These are auxiliary tools that let you see elements and find them, so that you can use them later in automation.
What locators exist? Even for native applications, we can use the same query language as for web applications - XPath. You can search by ID if the elements have IDs. You can also use Class or Name. There is also a UI Automation ID - it can come in handy if an element is not visible in the standard tree, that is, when we can see the element on the device or emulator but it is missing from the element tree and the XML - this also happens. UI Automation lets you record your actions and then generate code, parts of which can be used in automation. You can use CSS, but keep in mind that CSS locators only work when automating mobile web applications.
An example of finding locators in native applications
After opening UI Automator Viewer, you will see a screenshot on the left and, on the right, a tree of the elements in the application. We can walk through the tree and build an XPath. The locator starts with two slashes (//) and includes the elements that describe the path to the one we need.
We have a tree and details for each element of the application. That is, we take an element and can see its class, index, text, id, etc. All of this can be used when searching for an element. When constructing a locator, we can find an element by class if the class is unique. An element may also have a resource-id that contains an id value; such elements can be found directly by id. The Appium Inspector is very similar to UI Automator Viewer, which comes with the Android SDK (the tools folder contains a uiautomatorviewer.bat file that launches the viewer and lets you browse the tree of elements). With it, you can see elements not only on the emulator - you can find the same elements on a real device if it is connected by cable.
In UI Automator Viewer, you can click the device screenshot button; it then connects to the emulator over ADB and receives a screenshot and the XML of the open application. With the screenshot and the tree of elements in hand, we can check whether the elements we need are there and which attributes can make up a unique locator. This is how elements are found in native applications.
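As a rough illustration of turning the attributes shown in the viewer into a locator, the sketch below builds XPath strings from a class name and one attribute. The class `XPathLocators`, its method names and the sample values (`android.widget.Button`, `Sign in`, `com.example.app:id/login`) are hypothetical, and values that contain both kinds of quotes are not handled.

```java
// Hypothetical helper: builds XPath strings from attributes seen in the element tree.
final class XPathLocators {

    private XPathLocators() {
    }

    // Builds //className[@attribute='value'], e.g. //android.widget.Button[@text='Sign in'].
    static String byAttribute(String className, String attribute, String value) {
        return "//" + className + "[@" + attribute + "=" + quote(value) + "]";
    }

    // Builds //*[@resource-id='...'] for elements that expose a resource-id.
    static String byResourceId(String resourceId) {
        return "//*[@resource-id=" + quote(resourceId) + "]";
    }

    // Wraps the value in single quotes, or in double quotes if it contains a single quote.
    private static String quote(String value) {
        return value.contains("'") ? "\"" + value + "\"" : "'" + value + "'";
    }

    public static void main(String[] args) {
        System.out.println(byAttribute("android.widget.Button", "text", "Sign in"));
        System.out.println(byResourceId("com.example.app:id/login"));
    }
}
```

The resulting strings are what you would hand to an XPath-based search in your test code; whether a class or attribute is unique enough still has to be checked in the viewer.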
Work with the driver
WebDriver -> RemoteWebDriver -> AppiumDriver -> IOSDriver / AndroidDriver
Before working with the driver, you need to configure the environment, the emulator, install the application on the emulator, open and make sure that the application elements are available in the element tree. That is, check that everything is ready for further automation.
After that, you can build the automation itself. It is also worth mentioning the layers involved: at the top is your application, installed on the device; one level below is the device itself; and Appium, which is installed separately, interacts with the device. Commands are sent to Appium from your code, which in general looks almost the same as when automating web applications with Selenium. That is, you use the same Page Object pattern and still work with elements, only they are MobileElements rather than WebElements. The code itself interacts with Appium through the driver.
WebDriver is an interface that defines the set of possible actions. Next comes RemoteWebDriver, which implements WebDriver. Next is AppiumDriver, which is needed for automating native mobile applications (you can use RemoteWebDriver to automate mobile web applications). IOSDriver and AndroidDriver in turn inherit from AppiumDriver and implement certain actions differently for each operating system.
If in more detail:
WebDriver - the basic interface.
RemoteWebDriver - often used for automation using Selenium Grid (web application automation).
AppiumDriver is a general abstract class for automating mobile applications.
IOSDriver - Used for iOS Automation.
AndroidDriver - used for Android Automation.
Using RemoteWebDriver to automate mobile web applications is not always convenient, because sometimes you need to reach certain native parts. For example, if you automate the mobile web version of Facebook, an offer to remember the password pops up after you enter the login and password. And after the update, a native pop-up appears, which blocks the site until you tap “Ok”. To handle it, you need an AppiumDriver (or IOSDriver / AndroidDriver), which can work with the native context, because RemoteWebDriver can only work with web contexts.
AppiumDriver is the general abstract class for mobile automation on both Android and iOS. Depending on the device, you use IOSDriver or AndroidDriver, both of which inherit from AppiumDriver.
Initialization Example

This example is taken from a real project. Here we can specify the capabilities we need. Once the automation environment is ready, you can implement everything in code, which will later send the necessary requests to Appium. To do this, you need to specify the URL of the Appium server and the required capabilities.
The initialization example shows the settings for Mobile Chrome and Mobile Safari. The browserName capability indicates whether it will be Chrome or Safari. Further capabilities select the device and platform. There is also autoAcceptAlerts, which keeps native alerts from popping up, and newCommandTimeout, which tells the driver how long to wait for a new command before ending the session.
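For reference, here is a minimal sketch of what such capability sets might look like, written as plain maps so it runs without any client library. The class name, the server address (the usual default of a local Appium 1.x server) and all concrete values such as device names and timeouts are assumptions for illustration, not the settings of the original project.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of capability sets for mobile web sessions; all concrete values are placeholders.
final class MobileWebCapabilities {

    // Assumed address of a locally running Appium 1.x server.
    static final String APPIUM_URL = "http://127.0.0.1:4723/wd/hub";

    private MobileWebCapabilities() {
    }

    // Capabilities shared by both browsers.
    private static Map<String, Object> base(String platform, String version,
                                            String device, String browser) {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("platformName", platform);
        caps.put("platformVersion", version);
        caps.put("deviceName", device);
        caps.put("browserName", browser);
        // Seconds Appium waits for the next command before it ends the session.
        caps.put("newCommandTimeout", 120);
        return caps;
    }

    static Map<String, Object> mobileChrome() {
        return base("Android", "6.0", "Android Emulator", "Chrome");
    }

    static Map<String, Object> mobileSafari() {
        Map<String, Object> caps = base("iOS", "9.3", "iPhone 6", "Safari");
        // Accept native iOS alerts automatically so they do not block the page.
        caps.put("autoAcceptAlerts", true);
        return caps;
    }

    public static void main(String[] args) {
        // With the Selenium/Appium client on the classpath, a map like this would typically be
        // wrapped in DesiredCapabilities and passed, together with the URL, to the driver.
        System.out.println(APPIUM_URL + " <- " + mobileChrome());
        System.out.println(APPIUM_URL + " <- " + mobileSafari());
    }
}
```

Keeping the shared settings in one place makes it easy to add another device or platform version later without copying the whole set.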
Work with contexts
Get all contexts:
getDriver().getContextHandles();
Switch context:
getDriver().context("WEBVIEW");
getDriver().context("NATIVE_APP");
getDriver().context("CONTEXT_NAME");
Contexts matter most for hybrid applications. When automating native applications, either the Appium Inspector or UI Automator Viewer will help us. In a hybrid application, after the native application opens and certain steps are taken, a web page can open inside the same application, or part of one can be displayed.

In this example, the Appium Inspector is open. This is what it looks like when an iOS application is open. The tree of elements here opens level by level: you select an element and, if it contains more elements, the next level of the tree is displayed. Here you can also see the contexts, that is, you can choose NATIVE_APP - the native context, or WEBVIEW - a web context.
It also happens that the application has a native header followed by an embedded web page. In such cases, you have several contexts, and you can switch between them to find the necessary elements.
How does it work? In code, you have a driver that can return all contexts through the getContextHandles() method.
You also have the Appium Inspector to confirm that all contexts are available; the code can list all contexts that are currently available, after which you can switch to any of them.
That is, the main thing you will need is the getContextHandles() method, which returns all the contexts present on the currently open page of the application.
If you need to switch between contexts: the test opens the native application and goes to its web part, after which you need to switch from the native context to the web one. To do this, call the context() method, which is in AppiumDriver, and specify the context you want to switch to - for example, WEBVIEW or NATIVE_APP.
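The selection step can be sketched without a running device. In practice the web context name usually carries a suffix (for example WEBVIEW_1), so the hypothetical helper below picks the first handle that starts with WEBVIEW and falls back to the native context. The class and method names are illustrative, and the sample handle set is made up.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper for choosing a context name from the handles a driver reports.
final class ContextPicker {

    static final String NATIVE = "NATIVE_APP";

    private ContextPicker() {
    }

    // Returns the first handle that looks like a web context (e.g. WEBVIEW_1), if any.
    static Optional<String> firstWebView(Set<String> handles) {
        return handles.stream()
                .filter(name -> name.startsWith("WEBVIEW"))
                .findFirst();
    }

    public static void main(String[] args) {
        // In a real test this set would come from getDriver().getContextHandles().
        Set<String> handles = new LinkedHashSet<>(Arrays.asList("NATIVE_APP", "WEBVIEW_1"));

        // The chosen name is what would be passed to getDriver().context(...).
        String target = firstWebView(handles).orElse(NATIVE);
        System.out.println("Switch to: " + target);
    }
}
```

Returning an Optional makes the "no web context yet" case explicit, which is useful when the web part of a hybrid application takes a moment to load.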
Devices or emulators
 | Real device | Emulator |
---|---|---|
Ease of setup | Android: quick to set up. iOS: takes some digging | Android: there are pitfalls during setup. iOS: requires Xcode and minimal settings |
Run speed | Fast | Android: the speed depends on the emulator. iOS: slow startup, fast run |
Stability | Relatively stable | Android: some instability. iOS: problems can be solved with additional bash scripts |
Behavior | May vary depending on OS version | May vary depending on OS version |
Element availability | In a WebView, elements can be identified as native | WebView elements are available as web elements |
We worked with real devices and with emulators. In each case, the pros and cons were found.
A real Android device can generally be set up pretty quickly. You need to make sure that developer mode is enabled on the device. After connecting the device, confirm the connection on the phone itself by tapping “Ok” in the pop-up window, and that's it - with minimal effort you are ready to automate mobile applications on Android.
With iOS we have to dig deeper - connecting a real device is harder. It can be made to work with Appium, but configuring it takes time.
As for the Android emulator, the main pitfall is performance. The standard emulator is very slow, but installing the HAXM driver and choosing Use Host GPU when configuring the emulator speeds it up.
For web automation, it is important to install the correct build of the browser. For example, if your emulator uses the x86 platform, you need to install the x86 build of Google Chrome as well.
The iOS emulator is generally not so complicated - it needs minimal settings. To run Appium, Xcode must have an emulator of the desired version. For example, to run a test on the iPhone 6 under iOS 9.3, you just need to make sure that this version is present in Xcode.
In general, real devices are faster than emulators. For Android emulators, the speed depends on the emulator itself. As an alternative to the standard Android emulator, there is Genymotion, which installs quite quickly and works faster and more stably than the standard one. With Genymotion, you can run tests in several threads on the same machine, which can significantly shorten the test run.
With the iOS emulator, we get a high run speed but a slow test startup. If we run each test in a new session (and most often we need it to work on a clean application), the emulator is reopened every time, which takes time. The tests themselves run pretty fast.
Android emulators may show some instability, while real devices are relatively stable. In general, unstable runs under the standard emulator can be partially fixed with additional batch scripts - they kill all leftover processes (which might have remained from the previous run) and start a clean Appium and emulator session before the test group runs.
Sometimes the behavior of an application differs depending on the version of the operating system: some pop-up windows may not appear at all or, on the contrary, pop up constantly. These are trifles, but they happen, and such cases also have to be handled somehow.
Element availability
So far, we have seen this in only one project: a web page opens inside the native application, and the elements on this page are identified differently on the emulator and on a real device. It happens that elements on a real device are identified as native, while in the emulator everything works like a WebView, that is, like a regular website, where we can find elements even by CSS locator.
In that project, we could find an element by a CSS locator that worked fine on the emulator, but when we ran the same test on real devices, it found nothing at all, not even the web context - everything was seen as the native context.
Possible problems
Some elements may not be available.
Not all standard methods work correctly (e.g. scroll / swipe).
You need to keep track of updates.
It is necessary to monitor version compatibility (OS, Appium, emulator).
It is important to configure the emulator correctly.
Automation of mobile applications is still not as polished as web automation, but in general, all the problems that we examined can be solved. You just have to spend some time on them.
As I mentioned earlier, some elements may not be available. We had a case where a validation message was visible on the page but was missing from the tree of elements. You can call getPageSource(), as in web automation, and it will return the XML of the currently open page, that is, everything that is visible on the page at the moment. But most often, if an element is not available in the inspector, the XML will not reveal anything new either. In this case, there are two solutions: either leave this case for manual testing, or talk with the developers and ask them to add an ID or other additional attributes.
Not all standard methods work correctly. Here you need to understand that Appium is a relatively new framework compared with, say, Selenium. This is noticeable with actions such as scroll / swipe, when we need to scroll left / right or up / down in a native application. AppiumDriver has a scrollToText() method that can scroll through a page to a specific text, but, unfortunately, it does not always work stably on iOS. Be prepared to write your own custom scroll - there are solutions on the Internet that you can use.
AppiumDriver also has a swipe method, but it takes a lot of parameters. Therefore, to use it in tests, you will most often need to write your own wrapper that moves the screen left or right by a given percentage or number of pixels. That is, in general the issue can be resolved, but you will have to face the problem.
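The coordinate arithmetic behind such a wrapper might look like the sketch below. It only computes the start and end points from screen size and fractions of the width; the actual driver call is left out, because its exact signature depends on the client version. The class `SwipeMath`, its method names and the 80% / 20% defaults are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper: converts fractions of the screen into pixel coordinates for a swipe.
final class SwipeMath {

    private SwipeMath() {
    }

    // Returns {startX, startY, endX, endY} for a horizontal swipe along the vertical middle
    // of the screen. fromFraction and toFraction are fractions of the screen width (0.0 to 1.0).
    static int[] horizontal(int screenWidth, int screenHeight,
                            double fromFraction, double toFraction) {
        if (fromFraction < 0 || fromFraction > 1 || toFraction < 0 || toFraction > 1) {
            throw new IllegalArgumentException("Fractions must be between 0.0 and 1.0");
        }
        int startX = (int) Math.round(screenWidth * fromFraction);
        int endX = (int) Math.round(screenWidth * toFraction);
        int y = screenHeight / 2;
        return new int[] {startX, y, endX, y};
    }

    // Swipe left: the finger moves from 80% of the width to 20%.
    static int[] left(int screenWidth, int screenHeight) {
        return horizontal(screenWidth, screenHeight, 0.8, 0.2);
    }

    // Swipe right: the finger moves from 20% of the width to 80%.
    static int[] right(int screenWidth, int screenHeight) {
        return horizontal(screenWidth, screenHeight, 0.2, 0.8);
    }

    public static void main(String[] args) {
        // These four numbers, plus a duration, are what a driver-level swipe call would receive.
        System.out.println(Arrays.toString(left(1080, 1920)));
    }
}
```

Working in fractions rather than fixed pixels keeps the same test usable across devices with different screen sizes.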
It is necessary to keep track of updates and version compatibility - some things that previously did not work in Appium work fine in new versions. That is, we have to update both the Appium application and the Appium API used in the code. In both cases, we should try to stay on the latest version, because it is regularly improved and refined. However, if you have the latest version of Appium and an old Xcode with an old emulator, the combination may not work.
Mobile Automation Workflow

This is how everything should ideally work. A native application is a file that we need to get from somewhere and then use when running tests. So, for the automation process to be set up correctly, you need a build job that assembles the mobile application - for example, an automatic build of an Android application that produces an .apk file. Then a Jenkins job can be launched automatically on a trigger; it receives the location of the built .apk and runs the necessary tests against it. You can configure the build and the test run to happen every night, so that we always have up-to-date test results, can analyze them every morning and identify bugs as quickly as possible.
Cloud services

To automate iOS tests, we needed a Mac mini. But what if we need multithreading? Suppose there are 300 tests that take 12 hours to run in a single thread, and we need the results in just an hour. Things get much harder: each thread needs a separate Mac machine. At the same time, you need to constantly keep track of Appium, Xcode and OS version updates.
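The arithmetic behind this example is simple but worth spelling out. The sketch below estimates how many threads (and therefore, for local iOS runs, how many Mac machines) are needed to hit a target time. It assumes that all tests take roughly the same time and are split evenly across threads, which real suites only approximate; the class and method names are hypothetical.

```java
// Back-of-the-envelope estimate; assumes equally long tests split evenly across threads.
final class ParallelRunEstimate {

    private ParallelRunEstimate() {
    }

    // Smallest number of threads that brings a sequential run down to the target time.
    static int threadsNeeded(int sequentialMinutes, int targetMinutes) {
        if (sequentialMinutes <= 0 || targetMinutes <= 0) {
            throw new IllegalArgumentException("Durations must be positive");
        }
        return (sequentialMinutes + targetMinutes - 1) / targetMinutes; // integer ceiling
    }

    // Estimated wall-clock minutes when the same run is split across the given threads.
    static int minutesWithThreads(int sequentialMinutes, int threads) {
        if (sequentialMinutes <= 0 || threads <= 0) {
            throw new IllegalArgumentException("Arguments must be positive");
        }
        return (sequentialMinutes + threads - 1) / threads; // integer ceiling
    }

    public static void main(String[] args) {
        int sequential = 12 * 60; // the 12-hour single-thread run from the example
        System.out.println("Threads for a 1-hour run: " + threadsNeeded(sequential, 60));
        System.out.println("Minutes with 3 threads: " + minutesWithThreads(sequential, 3));
    }
}
```

Under these assumptions, a 12-hour run needs 12 threads to finish in about an hour, which for local iOS testing means 12 separate Mac machines.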
With cloud services, you pay a relatively small amount for unlimited use and can choose the number of threads you need. With the right approach, you can convince the customer that the capabilities gained and the time saved on supporting local test environments will save money rather than cost extra.
Take BrowserStack. This vendor allocates virtual machines, and we simply specify in the code the RemoteUrl on which the tests will run. If we ask for a run in three threads, it is automatically distributed across three machines on the BrowserStack side. In addition, they update all applications on time and carefully monitor compatibility. So getting started is much easier. The most significant advantage of cloud services is that you do not spend a lot of time setting up and maintaining environments for automated testing.
Benefits of Cloud Services:
• No time wasted setting up the environment.
• No time wasted supporting the environment.
• Stable work.
• Higher speed of passing tests (most often).
• Simple implementation of multithreading.
Cloud services also provide video recording of the run - this applies to both mobile and web applications. And if you need to test under a specific system, for example Windows XP or Internet Explorer 8, you can easily set the parameters for the test run, and the cloud service will automatically run the test in the desired environment. By the way, BrowserStack also offers free use for manual testing.
You no longer waste time setting up and supporting the environment: we just tell the test which environment to run in, and we always know that all the latest updates are already installed there.
Stable operation of cloud services means that you do not need to write additional scripts that keep the run stable and start the tests in a clean session. Tests in cloud services usually run faster, although an absolutely top-end and very expensive Mac machine can win on speed. But on a two-year-old Mac mini, even one with specs that were good two years ago, the tests will not run as fast as on most cloud services.
Moreover, a cloud service lets us run one job in several threads, so your 9-hour test run can finish in an hour or two.