It often happens that someone creating automated testing programs using Selenium or a similar framework that controls the UI (User Interface) eventually finds it necessary to address a non-Selenium API (Application Programming Interface) of some sort. You might need to create, update, or delete test data. You might need to modify a system setting. You might need to determine the current state of the system. You might need to have your test send a signal to, or receive a signal from, some other part of the deployment process. Often the only way to do these things is through an API.
API Defined
An API is a relationship between a client and a server. The server controls some aspect of the system, and the client wants to see or change some aspect of the system that the server controls.
Selenium-webdriver itself operates as an API. The server for the Selenium API is called WebDriver, and WebDriver exists only on the browsers that Selenium is automating. Your script controls the Selenium client that communicates with WebDriver in the browser instance. Selenium is a tool that automates browsers, nothing more. Working with APIs beyond WebDriver requires an understanding of what those broader APIs provide, what the test code requires, and how to make those connections. Knowing these approaches to APIs other than Selenium/WebDriver is important work in test automation.
There are many kinds of APIs in the world of application programming, and you as a developer will have to understand which APIs you need to address, and how to address them. For example, the most common type of API a tester will encounter is a ReST (Representational State Transfer) API, which sends data, usually in JSON or XML format, over an HTTP connection between the API host and the consuming client. Less popular today than ReST is SOAP, a kind of API that tends to offer a wider range of possible actions than other kinds of APIs.
Just as there are generic ReST and SOAP APIs, there are also application-specific APIs: Wikipedia has its own API; Salesforce has its own API. In some cases the API is not a single monolithic API, but a loose collection of separate APIs for different services.
Designing Browser Tests Using APIs
API calls from an end-to-end test script answer different needs and serve different contexts. One context might be that your test needs to know something about the state of the system before it can proceed: is a service down? Is a setting in place?
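One way to act on a state check like this is to poll until the system reports that it is ready, and give up after a time limit. The following is a minimal sketch, not a prescribed implementation: the name wait_until and its parameters are hypothetical, and the check you pass in would wrap whatever API call answers your question (is the service up, is the setting in place).

```python
import time


def wait_until(check, timeout=30.0, interval=1.0):
    """Call check() repeatedly until it returns a truthy value or time runs out.

    `check` is any zero-argument callable, for example one that asks a
    (hypothetical) status endpoint whether a service is up.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

A browser test could call wait_until at the start of its setup and only launch the browser once the check passes, which turns a confusing mid-test failure into a clear precondition failure.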
Or your API calls might make tests run more quickly or make tests less prone to failure. For example, the API call might set up a known set of test data for a test and then tear down that data after the test runs. A typical example is to create a user account with specific properties that you do not want to create via some tedious and irrelevant browser-based registration.
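The set-up-then-tear-down pattern can be sketched as a Python context manager. This is an illustration under assumptions: the api object and its create_user and delete_user methods are hypothetical stand-ins for whatever client your system actually provides, and the user record is assumed to carry an "id" field.

```python
from contextlib import contextmanager


@contextmanager
def temporary_user(api, **properties):
    """Create a user through the API, hand it to the test, then delete it.

    `api` is a hypothetical client object; create_user and delete_user
    are assumed method names, not part of any real library.
    """
    user = api.create_user(properties)
    try:
        yield user
    finally:
        # Runs even if the test body raises, so test data is not left behind.
        api.delete_user(user["id"])
```

A test would then wrap its browser steps in "with temporary_user(api, role="admin") as user:" and log in as that user, without ever driving the registration pages.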
Implementing API Client Calls From Within Test Code
The code you will use to address an API has nothing to do with the code you use to drive the UI. Using the ReST example, if you are working in a language like C# or Java, you may have to create your own instance of a ReST API client using your language's HTTP library and your language's JSON or XML parser. Dynamic languages like Python and Ruby tend to have shared libraries that make this easier, but there is no skipping the work involved in understanding how these API clients function in the context of your particular language and operating system.
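As a rough illustration of what "your language's HTTP library and your language's JSON parser" can look like, here is a minimal ReST client sketch using only the Python standard library (urllib.request and json). The function names, the base URL, and the paths are assumptions made for the example; a real API will dictate its own paths, headers, and error handling.

```python
import json
import urllib.request


def build_request(base_url, path, method="GET", payload=None):
    """Build an HTTP request that sends and accepts JSON."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call_api(base_url, path, method="GET", payload=None, timeout=10):
    """Send the request and return the parsed JSON body, or None if empty.

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses,
    so a failed call surfaces as an exception in the test.
    """
    request = build_request(base_url, path, method, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body) if body else None
```

For example, call_api("http://localhost:8080", "/users", method="POST", payload={"name": "test"}) would post a JSON document to a hypothetical local test server and return its parsed reply.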
There is another approach to addressing an API besides using the native clients in your programming language. There are any number of command-line options for addressing APIs, from the basic curl utility to the more comprehensive clients offered by tools such as Postman. Almost every programming language offers the ability to step outside the language and address the command shell directly. (This is often called "shelling out".) If creating an API client in your own language is too troublesome, it may be worth invoking an API client on the local host from within your program, such as a curl command, and then capturing the results of the shell-based API call. This has the benefit of being relatively easy in almost every programming language, and may provide a useful set of tools beyond just browser testing. The drawback is that you have to rely on the utilities you need to exist and to be configured properly in the environment in which you run your test scripts.
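Shelling out to curl might look like the sketch below, which uses Python's subprocess module. It assumes curl is installed and on the PATH of the machine running the tests, and that the API replies with JSON; the function names and the URL in the usage note are hypothetical.

```python
import json
import subprocess


def curl_command(url, method="GET", payload=None):
    """Build the argument list for a curl call that fails on HTTP errors."""
    command = ["curl", "--silent", "--show-error", "--fail", "--request", method]
    if payload is not None:
        command += ["--header", "Content-Type: application/json",
                    "--data", json.dumps(payload)]
    command.append(url)
    return command


def run_and_parse(command, timeout=30):
    """Run a command, capture its standard output, and parse it as JSON.

    check=True raises CalledProcessError on a non-zero exit status, which
    curl returns when --fail is set and the server answers with an error.
    """
    completed = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout, check=True
    )
    return json.loads(completed.stdout)
```

A test would call run_and_parse(curl_command("http://localhost:8080/status")). Passing the command as a list rather than a single string avoids shell quoting problems, at the cost of not being able to paste the command straight into a terminal.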
Finally, consider unusual approaches. Every test environment is unique, and your test environment may offer opportunities to address your API that are not obvious. For example, one of the authors of this paper occasionally needed to do a simple operation by way of the API. The API server required an extensive security regimen involving passwords, tokens, and third-party authentication: really an extremely secure transaction. Instead of investing in programming that overhead into the test framework, the author, knowing that the API offered a Swagger interface on a web page on a local server, simply used Selenium to navigate to the Swagger web page, log in as a regular user, and then carry out the API transaction on the Swagger web page with Selenium automation. So sometimes you actually CAN address an API with Selenium automation!
Which brings us to our last consideration: authentication and security. In some cases, security for test environments may be relaxed and authentication may be simple or not required at all. But that may not be the case. Besides understanding what information your API provides and how your API client can address that information, you also need to understand and honor whatever security measures may be in place.
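One common way to honor such measures without writing credentials into the test code is to read them from the environment at run time. The sketch below assumes the API accepts a bearer token in an Authorization header, which may not match your API at all; the variable name API_TOKEN and the function name are hypothetical.

```python
import os


def auth_headers(env_var="API_TOKEN", environ=None):
    """Return an Authorization header built from a token in the environment.

    Assumes a bearer-token scheme; substitute whatever your API requires.
    """
    environ = os.environ if environ is None else environ
    token = environ.get(env_var)
    if not token:
        # Fail early and clearly instead of sending an unauthenticated call.
        raise RuntimeError(f"environment variable {env_var} is not set")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be merged into the headers of whatever HTTP client the test uses, and the token itself stays in the build system's secret storage rather than in the repository.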
Using APIs Gives Tests Power
Selenium itself is an API client: it controls the browser. Using other API calls, tests can observe and control the state of the system being tested. Tests can create, read, update, and delete test data. Using APIs, tests can reach into other parts of the build and deploy systems, in the DevOps sense. This power over system state, test data, and interprocess communication gives your browser tests the flexibility and reliability to be an integral part of a Testing Observability And DevOps (TOAD) environment.
Feb 2021
- Benjamin Hofmann
- Tim Western
- Chris McMahon
- Thanks to The Testing Observability And DevOps (#TOAD) API Working Group
This work is published under the Creative Commons “CC BY-SA” Attribution-ShareAlike license.