WebDriver

From AutoIt Wiki

The W3C WebDriver API is a platform- and language-neutral interface and wire protocol that allows programs or scripts to control the behavior of a web browser.

Introduction

WebDriver API

WebDriver enables developers to create automated tests that simulate user interaction. This is different from JavaScript unit tests because WebDriver has access to functionality and information that JavaScript running in the browser doesn't, and it can more accurately simulate user events or OS-level events. WebDriver can also manage testing across multiple windows, tabs and webpages in a single test session.

WebDriver UDF

The WebDriver UDF allows you to interact with any browser that supports the W3C WebDriver specification. Supporting multiple browsers from the same code base is now possible with just a few configuration settings.
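As a rough sketch of what "a few configuration settings" can mean in practice, the helper below switches the driver executable, port and capabilities string per browser. The function name _ExampleSetup is hypothetical. The ports are the drivers' usual defaults, and the capabilities strings are illustrative assumptions that you may need to adapt to your setup.

```autoit
#include "wd_core.au3"

; Hypothetical helper (not part of the UDF): configure the UDF for one browser
; and return a capabilities string for _WD_CreateSession.
; Assumption: the driver executables can be found by name (e.g. in the script directory).
Func _ExampleSetup($sBrowser)
	Switch StringLower($sBrowser)
		Case "firefox"
			_WD_Option("Driver", "geckodriver.exe")
			_WD_Option("Port", 4444) ; geckodriver's usual default port
			Return '{"capabilities": {"alwaysMatch": {"browserName": "firefox", "acceptInsecureCerts": true}}}'
		Case "chrome"
			_WD_Option("Driver", "chromedriver.exe")
			_WD_Option("Port", 9515) ; chromedriver's usual default port
			Return '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true}}}}'
		Case "edge"
			_WD_Option("Driver", "msedgedriver.exe")
			_WD_Option("Port", 9515) ; msedgedriver's usual default port
			Return '{"capabilities": {"alwaysMatch": {"ms:edgeOptions": {"excludeSwitches": ["enable-automation"]}}}}'
	EndSwitch

	; Unknown browser name
	Return SetError(1, 0, "")
EndFunc
```

The rest of a script (starting the driver, creating the session, navigating) can then stay the same regardless of which browser name is passed in.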

Requirements

(Last modified: 2021/06/18)

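The following UDFs need to be installed, regardless of the browser you automate: the JSON UDF, the WebDriver UDF and the WinHttp UDF (the individual files are listed in the "Installation" section).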

One of the following drivers needs to be installed, depending on the browser type and version you automate:

Browser | Download Link | Latest Version / Date | Comments
Chrome | Google | 92.0.4515.43 / 2021.06.11 | Follow this link to select the correct version depending on the Chrome version you run!
Edge | Microsoft | 91.0.864.53 / 2021.06.17 |
Firefox | GitHub | 0.29.1 / 2021.04.09 | Firefox version ≥ 60 is recommended. Note: You must still have the Microsoft Visual Studio redistributable runtime installed on your system for the binary to run. This is a known bug in version 0.26 which the authors weren't able to fix for this release.
Opera | GitHub | 91.0.4472.77 / 2021.06.09 | The versioning of OperaDriver matches the Chromium version on which the Opera browser is based.

Limitations

(Last modified: 2020/01/28)
Not all WebDriver functions have been implemented by every browser. To check the status, go to the corresponding website below:

  • Chrome
  • Edge
  • Firefox
  • Opera: "OperaChromiumDriver is a WebDriver implementation derived from ChromeDriver and adapted by Opera". That's why I think it has at least the same limitations as ChromeDriver.

Big Picture

How the browser independent and browser dependent parts fit together:

[Figure: Big Picture - How everything fits together]

Used Terms

(Last modified: 2021/07/28)

You will find the following terms when using WebDriver. We try to shed some light onto this subject here:

CDP (Chrome DevTools Protocol)
CDP is a protocol that allows tools to instrument, inspect, debug and profile Chromium, Chrome and other Blink-based browsers.

Marionette
Marionette is an automation driver for Mozilla’s Gecko engine. It can remotely control either the UI or the internal JavaScript of a Gecko platform, such as Firefox. It can control both the chrome (i.e. menus and functions) and the content (the webpage loaded inside the browsing context), giving a high level of control and the ability to replicate user actions. In addition to performing actions on the browser, Marionette can also read the properties and attributes of the DOM.
Marionette consists of two parts: a server which takes requests and executes them in Gecko (the Marionette server ships with Firefox), and a client (the Marionette client ships with the GeckoDriver exe). The client sends commands to the server and the server executes the command inside the browser.
For details please visit this site.

ShadowRoot
The ShadowRoot interface of the Shadow DOM API is the root node of a DOM subtree that is rendered separately from a document's main DOM tree.
For details please visit this site.

Installation

(Last modified: 2020/09/27)

To automate your browser the following installation steps are needed:

  • Download the files listed in section "Requirements"
  • Move the UDFs to a directory where SciTE and AutoIt can find them:
    • Json.au3 and BinaryCall.au3 from the JSON UDF
    • wd_Core.au3 and wd_helper.au3 from the WebDriver UDF
    • WinHttp.au3 and WinHttpConstants.au3 from the WinHttp UDF
  • Move the browser-dependent WebDriver executable to the same directory as WD_Demo.au3:
    • chromedriver.exe (Chrome)
    • geckodriver.exe (Firefox)
    • msedgedriver.exe (Edge - Chromium) or MicrosoftWebDriver.exe (Edge - EdgeHTML)
  • Run WD_Demo.au3 and select "DemoNavigation" to validate the installation.
    The result (for Firefox) displayed in the DOS window should be similar to the following:
1577745813519   geckodriver     DEBUG   Listening on 127.0.0.1:4444
1577745813744   webdriver::server       DEBUG   -> POST /session {"capabilities": {"alwaysMatch": {"browserName": "firefox", "acceptInsecureCerts":true}}}
1577745813746   geckodriver::capabilities       DEBUG   Trying to read firefox version from ini files
1577745813747   geckodriver::capabilities       DEBUG   Found version 71.0
1577745813757   mozrunner::runner       INFO    Running command: "C:\\Program Files\\Mozilla Firefox\\firefox.exe" "-marionette" "-foreground" "-no-remote" "-profile" "C:\\ ...
1577745813783   geckodriver::marionette DEBUG   Waiting 60s to connect to browser on 127.0.0.1:55184
1577745817392   geckodriver::marionette DEBUG   Connection to Marionette established on 127.0.0.1:55184.
1577745817464   webdriver::server       DEBUG   <- 200 OK {"value":{"sessionId":"925641bf-6c5d-4fe2-a985-02de9b1c7c74","capabilities":"acceptInsecureCerts":true,"browserName":"firefox", ...
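
If you prefer a stand-alone check, a minimal script along the following lines should produce a similar driver log. This is only a sketch: it assumes geckodriver.exe sits in the script directory and listens on port 4444, the capabilities string is the one visible in the log above, and the URL is a placeholder.

```autoit
#include "wd_core.au3"

; Assumption: geckodriver.exe is located in the script directory
_WD_Option("Driver", @ScriptDir & "\geckodriver.exe")
_WD_Option("Port", 4444)

; Same capabilities as in the POST /session request shown in the log
Local $sCapabilities = '{"capabilities": {"alwaysMatch": {"browserName": "firefox", "acceptInsecureCerts": true}}}'

; Launch the driver console app
_WD_Startup()
If @error <> $_WD_ERROR_Success Then Exit MsgBox(16, "WebDriver", "Could not start the driver.")

; Ask the driver for a new browser session
Local $sSession = _WD_CreateSession($sCapabilities)
If @error = $_WD_ERROR_Success Then
	_WD_Navigate($sSession, "https://www.autoitscript.com/") ; placeholder URL
	ConsoleWrite("Title: " & _WD_Action($sSession, "title") & @CRLF)
	_WD_DeleteSession($sSession) ; close the session again
EndIf

; Stop the driver console app
_WD_Shutdown()
```

If the session cannot be created, the driver log usually shows the rejected request, which is a good starting point for the Troubleshooting section.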

Function reference

(Last modified: 2021/07/29 - based on version 0.4.1.0)

WD_CORE

The WD_Core.au3 file holds the functions that implement the W3C WebDriver specification.

Function | Description | Comment
_WD_Action | Perform various interactions with the web driver session | Use one of the following values for parameter $sCommand: actions, back, forward, refresh, title, url. For command actions: pass the actions to be set using parameter $sOption. If $sOption is empty, the set actions are removed.
_WD_Alert | Respond to user prompt | Use one of the following actions to respond to the user prompt: accept, dismiss, gettext, sendtext, status.
_WD_Cookies | Gets, sets, or deletes the session's cookies | Use one of the following commands: get (Gets a single cookie), getall (Gets all cookies), add (Sets a single cookie), delete (Deletes a single cookie).
_WD_CreateSession | Request new session from web driver | Define the capabilities of the browser with this function
_WD_DeleteSession | Delete existing session | Closes the session created by _WD_CreateSession
_WD_ElementAction | Perform action on designated element | Use one of the following actions: Active (Get active element), Attribute (Get element's attribute), Clear (Clear element's value), Click (Click element), CompLabel (Get element's computed label), CompRole (Get element's computed role), CSS (Get element's CSS value), Displayed (Get element's visibility), Enabled (Get element's enabled status), Name (Get element's tag name), Property (Get element's property), Rect (Get element's dimensions / coordinates), Screenshot (Take element screenshot), Selected (Get element's selected status), Shadow (Get element's shadow root), Text (Get element's rendered text), Value (Get or set element's value). Clear, Click and Value use POST to modify or process the element. All other actions use GET to retrieve data from the element.
_WD_ExecuteScript | Execute JavaScript commands |
_WD_FindElement | Find element(s) by designated strategy | You can specify whether the function should return only the first match or all matches
_WD_GetSource | Get page source |
_WD_Navigate | Navigate to the designated URL |
_WD_Option | Set and get options for the web driver UDF | The following options can be used: BaseURL (IP address used for web driver communication), BinaryFormat (format used to store binary data), Console (define destination for console output), DebugTrim (length of response text written to the debug console), DefaultTimeout (default timeout (in milliseconds) used by other functions if no other value is supplied), Driver (set the full path name to web driver executable), DriverClose (close prior driver instances before launching new one (Boolean)), DriverDetect (use existing driver instance if it exists (Boolean)), DriverParams (parameters to pass to web driver executable), HTTPTimeouts (set WinHTTP timeouts on each Get, Post, Delete request (Boolean)), Port (port used for web driver communication). If no value is passed, the current value is returned.
_WD_Shutdown | Kill the web driver console app |
_WD_Startup | Launch the designated web driver console app | The PID for the WebDriver console is returned
_WD_Status | Get current web driver state | Returns a raw JSON response from the web driver
_WD_Timeouts | Set or retrieve the session timeout parameters | Specify the type and value of the timeout like this: '{"type":value}'. Example: '{"pageLoad":2000}'
_WD_Window | Perform interactions related to the current window | Use one of the following actions: close (close current tab), frame (switch to frame), fullscreen (set window to fullscreen), handles (get all window handles), maximize (maximize window), minimize (minimize window), parent (switch to parent frame), rect (get or set the window's size & position), screenshot (take screenshot of window), switch (switch to designated tab), window (get current tab's window handle).
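
To illustrate how a few of the core functions listed above combine, here is a hedged sketch. The function name _ExampleFillField and both XPath selectors are hypothetical placeholders. It assumes $sSession was returned by _WD_CreateSession and that a page containing a matching input field is already loaded.

```autoit
#include "wd_core.au3"

; Hypothetical helper (not part of the UDF): fill a text field and count the links on the page.
; $sSession is the session ID returned by _WD_CreateSession.
Func _ExampleFillField($sSession)
	; Set the page load timeout, using the '{"type":value}' format
	_WD_Timeouts($sSession, '{"pageLoad":2000}')

	; Return only the first match (placeholder selector)
	Local $sElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//input[@name='q']")
	If @error <> $_WD_ERROR_Success Then Return SetError(1, 0, 0)

	; Clear and Value modify the element
	_WD_ElementAction($sSession, $sElement, "clear")
	_WD_ElementAction($sSession, $sElement, "value", "AutoIt")

	; Name only retrieves data from the element
	ConsoleWrite("Tag name: " & _WD_ElementAction($sSession, $sElement, "name") & @CRLF)

	; Return all matches as an array of element IDs
	Local $aLinks = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//a", "", True)
	If @error <> $_WD_ERROR_Success Then Return SetError(2, 0, 0)

	Return UBound($aLinks)
EndFunc
```

Checking @error after each call, as sketched here, makes it easier to see which step failed when a selector does not match.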

WD_HELPER

The WD_Helper.au3 file holds functions to help you automate a web site.

Function | Description | Comment
_WD_Attach | Attach to existing browser tab | Use one of the following search modes: HTML, Title, URL.
_WD_ConsoleVisible | Control visibility of the webdriver console app |
_WD_DownloadFile | Download file and save to disk |
_WD_ElementActionEx | Perform advanced action on designated element | Use one of the following commands: childcount, clickandhold, doubleclick, hide, hover, modifierclick, rightclick, show.
_WD_ElementOptionSelect | Find and click on an option from a Select element |
_WD_ElementSelectAction | Perform action on designated Select element | Use one of the following commands: options, value.
_WD_FrameEnter | Will enter the specified frame for subsequent WebDriver operations |
_WD_FrameLeave | Will leave the current frame, to its parent, not necessarily the Top, for subsequent WebDriver operations |
_WD_GetBrowserVersion | Get version number of specified browser |
_WD_GetElementById | Locate element by id |
_WD_GetElementByName | Locate element by name |
_WD_GetElementFromPoint | Retrieves reference to element described by x/y coordinate |
_WD_GetFrameCount | Returns how many frames/iframes are in your current window/frame | It will not traverse to nested frames
_WD_GetMouseElement | Retrieves reference to element below mouse pointer |
_WD_GetSession | Get details on existing session |
_WD_GetShadowRoot | Retrieves the shadow root of an element |
_WD_GetTable | Return all elements of a table |
_WD_GetWebDriverVersion | Get version number of specified webdriver |
_WD_HighlightElement | Will highlight the specified element |
_WD_HighlightElements | Will highlight multiple elements passed as an array |
_WD_IsFullScreen | Return a boolean indicating if the session is in full screen mode |
_WD_IsLatestRelease | Compares local UDF version to latest release on Github |
_WD_IsWindowTop | Returns a boolean of the session being at the top level, or in a frame(s) |
_WD_jQuerify | Inject jQuery library into current session |
_WD_LastHTTPResult | Return the result of the last WinHTTP request |
_WD_LinkClickByText | Simulate a mouse click on a link with text matching the provided string |
_WD_LoadWait | Wait for a browser page load to complete before returning |
_WD_NewTab | Create new tab using JavaScript |
_WD_PrintToPDF | Print the current tab in paginated PDF format |
_WD_Screenshot | Will return a screenshot of the browser window or a specified element |
_WD_SelectFiles | Select files for uploading to a website |
_WD_SetElementValue | Set value of designated element |
_WD_SetTimeouts | User friendly function to set webdriver session timeouts |
_WD_UpdateDriver | Replace web driver with newer version, if available |
_WD_WaitElement | Wait for an element to be found in the current tab before returning |
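
The helper functions are typically combined with the core functions. The following sketch waits for a page and an element before highlighting it. The function name _ExampleOpenAndHighlight and its parameters are hypothetical, and the optional delay/timeout parameters of the wait functions are left at their defaults because they may differ between UDF versions.

```autoit
#include "wd_core.au3"
#include "wd_helper.au3"

; Hypothetical helper (not part of the UDF): open a page, wait until an element
; matching $sXPath can be found, then highlight it.
Func _ExampleOpenAndHighlight($sSession, $sURL, $sXPath)
	_WD_Navigate($sSession, $sURL)

	; Wait for the page load to complete
	_WD_LoadWait($sSession)
	If @error <> $_WD_ERROR_Success Then Return SetError(1, 0, False)

	; Wait until the element can be found in the current tab
	_WD_WaitElement($sSession, $_WD_LOCATOR_ByXPath, $sXPath)
	If @error <> $_WD_ERROR_Success Then Return SetError(2, 0, False)

	; Retrieve the element ID and highlight the element
	Local $sElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, $sXPath)
	If @error <> $_WD_ERROR_Success Then Return SetError(3, 0, False)

	_WD_HighlightElement($sSession, $sElement)
	Return True
EndFunc
```

Waiting explicitly before calling _WD_FindElement tends to be more reliable than fixed Sleep calls on pages that load content dynamically.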

WD_CDP

The WD_CDP.au3 file holds functions to help you automate the Chrome DevTools Protocol (CDP).

Function | Description | Comment
_WD_ExecuteCdpCommand | ChromeDriver specific function to execute "Chrome DevTools Protocol" (CDP) commands |
_WD_GetCDPSettings | Retrieve CDP related settings from the browser | Use one of the following options: debugger (returns the websocket target originally returned by _WD_CreateSession), list (returns an array containing websocket targets), version (returns an array containing version metadata).

Browser related functionality

Google Chrome

ChromeDriver supports "Chrome DevTools Protocol" (CDP) commands (for an explanation of the term and related links, please see the Used Terms section).
At the moment the WebDriver UDF supports the following commands:

Function | Description | Comment
_WD_ExecuteCDPCommand | Execute CDP command |
_WD_GetCDPSettings | Retrieve CDP related settings from the browser |

Troubleshooting

Debug the WebDriver setup

(Last modified: 2020/07/06)

WinHTTP UDF

Make sure that you are running at least version 1.6.4.2 (currently unreleased, but can be obtained here).

Chrome

Chrome does not start and the DOS window for chromedriver does not get displayed
Problem: When running WD_Demo.au3, Chrome does not start and the DOS window for chromedriver is not displayed. When you run chromedriver manually in a DOS window, you get the message "[0.023][SEVERE]: CreatePlatformSocket() returned an error: An invalid argument was supplied."
Solution: This could be caused by missing execution permission for the network drive. Please ask your IT admin for "Applocker" or "application directory whitelisting". Alternatively, run chromedriver from a local HDD and call _WD_Option to set the location of the webdriver executable. Example: _WD_Option("Driver", "C:\Local\WebDriver\chromedriver.exe")
Reference: Stackoverflow

Firefox

Firefox does not start and the DOS window for geckodriver does not get displayed
Problem: When running WD_Demo.au3, Firefox does not start and the DOS window for geckodriver is not displayed. When you run geckodriver manually in a DOS window, you get the message "geckodriver: error: An invalid argument was supplied. (os error 10022)"
Solution: This could be caused by missing execution permission for the network drive. Please ask your IT admin for "Applocker" or "application directory whitelisting". Alternatively, run geckodriver from a local HDD and call _WD_Option to set the location of the webdriver executable. Example: _WD_Option("Driver", "C:\Local\WebDriver\geckodriver.exe")
Reference: Stackoverflow

Debug your Script

FAQ

(Last modified: 2020/10/11)

1. How to connect to a running browser instance
Q: How can I connect to a running browser instance?
A: That's described (for Firefox, but it should work similarly for other browsers) in this post.
2. How to hide the webdriver console
Q: How can I hide the webdriver console?
A: The console can be completely hidden from the start by adding the following line near the beginning of your script:
$_WD_DEBUG = $_WD_DEBUG_None ; You could also use $_WD_DEBUG_Error
You can also control the visibility of the console with the function _WD_ConsoleVisible.
3. How to utilize an existing user profile
Q: Can I use an existing user profile instead of the default behavior of using a new one?
A: This is controlled by your "capabilities" declaration; each browser implements it differently. Here are some examples:

Chrome
$sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "args":["--user-data-dir=C:\\Users\\' & @UserName & '\\AppData\\Local\\Google\\Chrome\\User Data\\", "--profile-directory=Default"]}}}}'
Firefox
$sDesiredCapabilities = '{"capabilities":{"alwaysMatch": {"moz:firefoxOptions": {"args": ["-profile", "' & GetDefaultFFProfile() & '"],"log": {"level": "trace"}}}}}'

Func GetDefaultFFProfile()
	Local $sDefault, $sProfilePath = ''

	Local $sProfilesPath = StringReplace(@AppDataDir, '\', '/') & "/Mozilla/Firefox/"
	Local $sFilename = $sProfilesPath & "profiles.ini"
	Local $aSections = IniReadSectionNames ($sFilename)

	If Not @error Then
		For $i = 1 To $aSections[0]
			$sDefault = IniRead($sFilename, $aSections[$i], 'Default', '0')

			If $sDefault = '1' Then
				$sProfilePath = $sProfilesPath & IniRead($sFilename, $aSections[$i], "Path", "")
				ExitLoop
			EndIf
		Next
	EndIf

	Return $sProfilePath
EndFunc
You will also likely need to specify the marionette port:
_WD_Option('DriverParams', '--marionette-port 2828')
4. How to specify location of browser executable
Q: Is it possible to launch a browser installed in a non-standard location?
A: This is controlled by your "capabilities" declaration. Here are some examples:

Chrome
$sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "binary":"C:\\Path\\To\\Alternate\\Browser\\chrome.exe" }}}}'
Firefox
$sDesiredCapabilities = '{"desiredCapabilities":{"javascriptEnabled":true,"nativeEvents":true,"acceptInsecureCerts":true,"moz:firefoxOptions":{"binary":"C:\\Path\\To\\Alternate\\Browser\\firefox.exe"}}}'
Alternate Firefox method:
_WD_Option('DriverParams', '--binary "C:\Program Files\Mozilla Firefox\firefox.exe" --log trace ')
5. How to maximize the browser window
Q: Is it possible to maximize the browser window?
A: Simply call the following function:
_WD_Window($sSession, "Maximize")
Make sure to call _WD_Window after the session has been created with _WD_CreateSession.
6. How to specify location of WebDriver executable
Q: Is it possible to launch the WebDriver executable from a specific location?
A: This is controlled by the "Driver" option of the function _WD_Option. Example:
_WD_Option("Driver", "C:\local\WebDriver\WebDriver.exe")
7. How to retrieve the values of a drop-down list
Q: How to retrieve the values of a drop-down list (<Select> tag)?
A1: Here's a simple way to do it:
$sElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//select[@name='xxx']")
$sText = _WD_ElementAction($sSession, $sElement, 'property', 'innerText')
$aOptions = StringSplit ( $sText, @LF,  $STR_NOCOUNT)
_ArrayDisplay($aOptions)

'xxx' is the name of the drop-down list.


A2: This can now be accomplished using the function _WD_ElementSelectAction:
$sElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//select[@name='xxx']")
$aOptions = _WD_ElementSelectAction ($sSession, $sElement, 'options')
_ArrayDisplay($aOptions)
8. How to run the browser in headless mode
Q: How do I run the browser in "headless" mode?
A: This is controlled by the Capabilities string that is passed to _WD_CreateSession. Example:
$sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "args": ["--headless", "--allow-running-insecure-content"] }}}}'
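
The example above is specific to Chrome. For other browsers the same idea should apply through their vendor-specific options. The strings below are a sketch under that assumption: Firefox takes the -headless command-line switch via "moz:firefoxOptions", and Chromium-based Edge takes --headless via "ms:edgeOptions". The variable names are illustrative only.

```autoit
; Firefox: geckodriver passes the "args" entries to the browser's command line
Local $sFirefoxHeadless = '{"capabilities": {"alwaysMatch": {"moz:firefoxOptions": {"args": ["-headless"]}}}}'

; Edge (Chromium): same approach as Chrome, but using the "ms:edgeOptions" key
Local $sEdgeHeadless = '{"capabilities": {"alwaysMatch": {"ms:edgeOptions": {"args": ["--headless"]}}}}'

; Pass the string matching your configured driver to _WD_CreateSession, e.g.:
; $sSession = _WD_CreateSession($sFirefoxHeadless)
```

In headless mode no browser window is shown, so _WD_Screenshot or the page title can be handy to confirm that navigation actually worked.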

Tools

The following tools will help you to automate your browser:

  • ChroPath plugin: Makes finding an element by XPath, ID or CSS incredibly easy (Chrome, Firefox, Opera)
  • SelectorsHub plugin: Next Gen XPath tool to generate, write and verify the XPath and cssSelectors (All browsers)

References

(Last modified: 2020/03/01)

Further information sources: