Automation API Documentation
Complete reference for automating Android devices. Build powerful automation scripts with full access to device controls, screen content, file operations, and more.
Guide
Learn how to build automations with step-by-step tutorials and best practices.
Reference
Complete API reference for all interfaces, methods, types, and constants.
Quick Links
Agent Actions
All automation actions: tap, swipe, screenshot, screenContent, launchApp, and more
AndroidNode
Working with the accessibility tree - properties and methods for finding and interacting with UI elements
AndroidNodeFilter
Builder pattern for complex node queries - isButton(), hasText(), isClickable(), and more
File Operations
Read, write, list, and manage files on the device
Getting Started
The automation API is accessed through the global agent object. The following example reads the current screen, finds a button labeled "Submit", and taps its center:
// Get the current screen content
const screen = await agent.actions.screenContent();

// Find a button with text "Submit"
const submitBtn = screen.findTextOne("Submit");

// Tap the button at the center of its on-screen bounds
if (submitBtn) {
  const { left, top, right, bottom } = submitBtn.boundsInScreen;
  await agent.actions.tap(
    (left + right) / 2,
    (top + bottom) / 2
  );
}
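The find-then-tap pattern above tends to repeat across scripts, so it can be convenient to wrap it. The sketch below is one possible way to do that, not part of the documented API: centerOf and tapText are hypothetical helper names, and rounding the coordinates to whole pixels is an assumption rather than a stated requirement of tap. It relies only on the calls shown in the example above (screenContent, findTextOne, boundsInScreen, tap).

```javascript
// Hypothetical helper (not part of the documented API): returns the center
// point of a rectangle shaped like node.boundsInScreen.
// Assumption: whole-pixel coordinates are preferred, so values are rounded.
function centerOf(bounds) {
  const { left, top, right, bottom } = bounds;
  return {
    x: Math.round((left + right) / 2),
    y: Math.round((top + bottom) / 2),
  };
}

// Hypothetical helper: finds the first node matching `text` and taps its center.
// The agent is passed in as a parameter so the function can be exercised with a stub.
// Resolves to true if a node was found and tapped, false otherwise.
async function tapText(agent, text) {
  const screen = await agent.actions.screenContent();
  const node = screen.findTextOne(text);
  if (!node) {
    return false;
  }
  const { x, y } = centerOf(node.boundsInScreen);
  await agent.actions.tap(x, y);
  return true;
}
```

With these helpers in scope, the earlier example reduces to a single call such as `const tapped = await tapText(agent, "Submit");`, and the boolean result lets a script decide what to do when the button is not on screen.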