UI-Genie
Multimodal natural-language UI automation on Ember Cloud
UI-Genie is Ember Cloud’s core capability: you describe the intent in natural language (NL), and the platform carries it out on cloud devices using multimodal models.
Use cases
- Exploratory checks on new features
- Sampled regression testing driven by NL instructions
- Reduced maintenance of traditional test scripts
- Knowledge-base-assisted understanding of product UI
Workflow
- Pick the target app and a cloud device in Workspace
- Write an NL instruction, or pick one from the instruction repo
- UI-Genie reads screenshots and UI state, then executes the instruction step by step
- Review logs, screenshots, and failed steps
- Save proven instructions to the instruction repo or to task templates
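The execution and review steps above can be pictured as an observe, plan, act loop. The following is a minimal sketch under stated assumptions, not UI-Genie's actual implementation: `run_instruction`, `StepResult`, `RunReport`, and the `observe`, `plan_next`, and `act` callables are hypothetical names standing in for the cloud device and the multimodal model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    index: int
    action: str
    ok: bool
    screenshot: str  # identifier of the screenshot taken before the step


@dataclass
class RunReport:
    instruction: str
    steps: List[StepResult] = field(default_factory=list)

    @property
    def failed_steps(self) -> List[StepResult]:
        return [s for s in self.steps if not s.ok]


def run_instruction(
    instruction: str,
    observe: Callable[[], Tuple[str, dict]],
    plan_next: Callable[[str, str, dict, List[StepResult]], Optional[str]],
    act: Callable[[str], bool],
    max_steps: int = 20,
) -> RunReport:
    """Hypothetical loop: observe the screen, plan one action, execute it.

    Stops when the planner returns None, a step fails, or max_steps is hit.
    """
    report = RunReport(instruction)
    for index in range(max_steps):
        screenshot, ui_state = observe()
        action = plan_next(instruction, screenshot, ui_state, report.steps)
        if action is None:
            break  # planner considers the instruction complete
        ok = act(action)
        report.steps.append(StepResult(index, action, ok, screenshot))
        if not ok:
            break  # leave the failed step in the report for review
    return report


# Stand-ins for a cloud device and a model, for illustration only.
planned = ["open Settings", "tap Notifications", "toggle Sound"]
shots = iter(range(100))


def observe() -> Tuple[str, dict]:
    return f"shot-{next(shots)}.png", {"screen": "unknown"}


def plan_next(instruction, screenshot, ui_state, history) -> Optional[str]:
    return planned[len(history)] if len(history) < len(planned) else None


def act(action: str) -> bool:
    return action != "toggle Sound"  # simulate a failure on the last step


report = run_instruction("Turn off the notification sound", observe, plan_next, act)
for step in report.failed_steps:
    print(f"step {step.index} failed: {step.action} (see {step.screenshot})")
```

Recording the screenshot identifier alongside each step is what makes the review step practical: a failed step can be traced back to what the screen looked like when the action was chosen.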
UI-Genie vs. desktop mobile workspace
| | UI-Genie (cloud) | Desktop mobile workspace |
|---|---|---|
| Driver | Multimodal NL | Step-based / Agent Apps |
| Devices | Cloud pool | Local USB |
| Collaboration | Multi-tenant, CI tokens | Personal workspace |
Admin setup
- Platform multimodal models & uiagent services
- Knowledge base (terms, screen docs)
- Device pools & Workspace RBAC
- AccessToken for CI
For local development, keep the lmweb VITE_UI_GENIE_* variables and EMBER_DEBUG_TEMPLATE_ID aligned with the ember-mcp configuration.
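A small preflight check can make missing local settings visible before a debug session. This is a sketch only: `summarize_env` is a hypothetical helper, and the variable names in the sample (`VITE_UI_GENIE_BASE_URL`, `VITE_UI_GENIE_MODE`) are invented for illustration. Only the `VITE_UI_GENIE_` prefix and `EMBER_DEBUG_TEMPLATE_ID` come from the text above. It reports which variables are present and non-empty; it does not know which values ember-mcp expects.

```python
from typing import Dict, Mapping

PREFIX = "VITE_UI_GENIE_"
TEMPLATE_KEY = "EMBER_DEBUG_TEMPLATE_ID"


def summarize_env(env: Mapping[str, str]) -> Dict[str, object]:
    """List UI-Genie variables found in env and flag the empty ones."""
    genie_keys = sorted(k for k in env if k.startswith(PREFIX))
    empty_keys = [k for k in genie_keys if not env[k].strip()]
    return {
        "genie_keys": genie_keys,
        "empty_keys": empty_keys,
        "template_id_set": bool(env.get(TEMPLATE_KEY, "").strip()),
    }


# Sample input; the two VITE_UI_GENIE_* names below are hypothetical.
sample = {
    "VITE_UI_GENIE_BASE_URL": "http://localhost:8080",
    "VITE_UI_GENIE_MODE": "",
    "PATH": "/usr/bin",
}
summary = summarize_env(sample)
print(summary)
```

To check the current shell instead of the sample, pass `os.environ` (after `import os`) as the `env` argument.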
Limits
- Screens with heavy animations or automation-hostile UI may require splitting one instruction into several smaller ones
- Humans still own sign-off on critical assertions