Desktop Control via CUA Server
This skill allows OpenClaw to control the desktop using the CUA computer server API.
Prerequisites
- CUA computer server running on port 8000
- Access to localhost:8000 (or the configured CUA_SERVER_URL)
Installation
To control your host desktop with OpenClaw, you need to install and run the CUA computer server on your machine.
Quick Start (Python SDK)
The easiest way to install the CUA computer server on your host:
CODEBLOCK0
Alternative: Install from Source
If you prefer to install from source:
CODEBLOCK1
Running as a Background Service
For always-on desktop control, set up as a system service:
macOS (launchd):
CODEBLOCK2
Linux (systemd):
CODEBLOCK3
Windows (Task Scheduler):
CODEBLOCK4
Configuration Options
Configure the server for your needs:
CODEBLOCK5
Security Considerations
⚠️ Important: By default, the server only listens on localhost (127.0.0.1) for security. This means only processes on your machine can connect to it.
- Local only (default): Safe for personal use with OpenClaw
- Network exposure: Only use --bind 0.0.0.0 with proper firewall rules AND authentication
- Authentication: Always use --auth-token if the server is accessible from the network
Verification
After installation, verify the server is working:
CODEBLOCK6
If you see a screenshot of your current desktop, the server is working correctly!
Troubleshooting
Port Already in Use:
CODEBLOCK7
Permission Denied (Linux):
CODEBLOCK8
Display Not Found (Linux):
CODEBLOCK9
Server Not Responding:
CODEBLOCK10
Available Commands
Take Screenshot
Capture the current screen:
CODEBLOCK11
Click at Coordinates
Click at specific x,y coordinates:
CODEBLOCK12
Right Click
CODEBLOCK13
Double Click
CODEBLOCK14
Type Text
Type text at the current cursor position:
CODEBLOCK15
Press Hotkey
Press a key combination:
CODEBLOCK16
Press Single Key
Press a single key:
CODEBLOCK17
Move Cursor
Move cursor to specific position:
CODEBLOCK18
Scroll
Scroll up or down:
CODEBLOCK19
Launch Application
Launch an application by name:
CODEBLOCK20
Open File or URL
Open a file or URL with default application:
CODEBLOCK21
Get Window Information
Get current window ID:
CODEBLOCK22
Window Control
Maximize window:
CODEBLOCK23
Minimize window:
CODEBLOCK24
Demo Workflows
Browser Navigation Demo
Open Firefox and navigate to a website:
CODEBLOCK25
Text Editor Demo
Open text editor and type content:
CODEBLOCK26
Form Filling Demo
Fill out a web form:
CODEBLOCK27
Helper Functions
Check Server Status
CODEBLOCK28
List All Available Commands
CODEBLOCK29
Get Screen Size
CODEBLOCK30
Get Cursor Position
CODEBLOCK31
Environment Variables
- CUA_SERVER_URL: Base URL for the CUA server (default: http://localhost:8000)
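As a minimal sketch of how a script might honor this variable, the helper below falls back to the documented default when CUA_SERVER_URL is unset or empty. The function name cua_base_url is hypothetical, and stripping a trailing slash is a convenience assumption rather than documented server behavior.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the CUA tooling): print the server base URL.
cua_base_url() {
  # Use CUA_SERVER_URL when set and non-empty, else the documented default.
  local url="${CUA_SERVER_URL:-http://localhost:8000}"
  # Strip one trailing slash so "$(cua_base_url)/status" never contains "//".
  printf '%s\n' "${url%/}"
}

# Example (needs a running server):
#   curl "$(cua_base_url)/status"
```

Resolving the URL in one place keeps every later curl call consistent if the port or host changes.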
Tips
- Wait Between Commands: Add sleep between commands to allow the UI to update
- Check Coordinates: Screen is 1280x720, center is at (640, 360)
- Screenshot for Debugging: Take screenshots before and after actions to verify
- Use Variables: Store coordinates and text in variables for reusability
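The tips above can be sketched in a few lines of shell. The helper name screen_center and the variable names are hypothetical; the 1280x720 size is the one stated in this document and may differ on your machine.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the center "x y" of a WIDTH x HEIGHT screen.
screen_center() {
  local width="$1" height="$2"
  printf '%d %d\n' "$(( width / 2 ))" "$(( height / 2 ))"
}

# Store values in variables so later commands can reuse them.
SCREEN_W=1280
SCREEN_H=720
CENTER=$(screen_center "$SCREEN_W" "$SCREEN_H")
CENTER_X=${CENTER% *}   # text before the space
CENTER_Y=${CENTER#* }   # text after the space
PAUSE=1                 # seconds to wait between commands so the UI can update

# Example sequence (shown as comments because it needs a running server):
#   <send a click at $CENTER_X,$CENTER_Y>
#   sleep "$PAUSE"
#   <take a screenshot to verify the result>
```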
Example OpenClaw Usage
Once this skill is loaded, you can use it in OpenClaw conversations:
CODEBLOCK32
Troubleshooting
- Connection Refused: Make sure the CUA server is running on port 8000
- No Response: Check whether you are inside the container or have an SSH tunnel set up
- Commands Not Working: Verify with INLINECODE5
- Wrong Coordinates: Remember screen is 1280x720, adjust coordinates accordingly
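For the connection problems listed above, a small retry loop can tell a server that is still starting from one that is not running at all. This is a sketch under assumptions: cua_wait_ready and cua_probe are hypothetical names, and the probe relies only on the /status endpoint mentioned in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry a probe command until it succeeds.
# Usage: cua_wait_ready ATTEMPTS DELAY_SECONDS PROBE [ARGS...]
cua_wait_ready() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0                      # probe succeeded
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"                # wait before the next attempt
    fi
    i=$(( i + 1 ))
  done
  return 1                          # every attempt failed
}

# Probe the status endpoint; -f makes curl fail on HTTP error codes.
cua_probe() {
  curl -fsS --max-time 2 "${CUA_SERVER_URL:-http://localhost:8000}/status"
}

# Example:
#   cua_wait_ready 5 1 cua_probe || echo "CUA server is not responding" >&2
```

Passing the probe as an argument keeps the retry logic independent of curl, so it can be exercised without a server.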
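Desktop Control via CUA Server
This skill allows OpenClaw to control the desktop using the CUA computer server API.
Prerequisites
- CUA computer server running on port 8000
- Access to localhost:8000 (or the configured CUA_SERVER_URL)
Installation
To control your host desktop with OpenClaw, you need to install and run the CUA computer server on your machine.
Quick Start (Python SDK)
The easiest way to install the CUA computer server on your host:
bash
# Install the computer SDK
pip install cua-computer-sdk
# Start the server (it will control your current desktop)
cua-server start --port 8000
# Or, if you need to specify the display (Linux/Unix)
DISPLAY=:0 cua-server start --port 8000
# Verify the server is running
curl http://localhost:8000/status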
Alternative: Install from Source
If you prefer to install from source:
bash
# Clone the repository
git clone https://github.com/trycua/cua-computer-server
cd cua-computer-server
# Install dependencies
pip install -r requirements.txt
# Run the server
python -m cua_server --port 8000
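Running as a Background Service
For always-on desktop control, set the server up as a system service:
macOS (launchd):
bash
# Create the plist file
cat > ~/Library/LaunchAgents/com.cua.server.plist <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.cua.server</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/cua-server</string>
        <string>start</string>
        <string>--port</string>
        <string>8000</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
</dict>
</plist>
EOF
# Load the service
launchctl load ~/Library/LaunchAgents/com.cua.server.plist
# Start the service
launchctl start com.cua.server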
Linux (systemd):
bash
# Create the service file ($USER is expanded by the shell when the file is written)
sudo tee /etc/systemd/system/cua-server.service > /dev/null <<EOF
[Unit]
Description=CUA Computer Server
After=network.target
[Service]
Type=simple
User=$USER
Environment=DISPLAY=:0
Environment=XAUTHORITY=/home/$USER/.Xauthority
ExecStart=/usr/local/bin/cua-server start --port 8000
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
# Enable and start the service
sudo systemctl daemon-reload
sudo systemctl enable cua-server
sudo systemctl start cua-server
# Check status
sudo systemctl status cua-server
Windows (Task Scheduler):
powershell
# Create a scheduled task that runs at startup
$action = New-ScheduledTaskAction -Execute "cua-server.exe" -Argument "start --port 8000"
$trigger = New-ScheduledTaskTrigger -AtStartup
$principal = New-ScheduledTaskPrincipal -UserId $env:USERNAME -LogonType Interactive
$settings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries
Register-ScheduledTask -TaskName "CUA Server" -Action $action -Trigger $trigger -Principal $principal -Settings $settings
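Configuration Options
Configure the server for your needs:
bash
# Start with default settings
cua-server start
# Custom port
cua-server start --port 8001
# With an authentication token (recommended if exposed to the network)
cua-server start --port 8000 --auth-token your-secret-token
# Specify the display (Linux/Unix)
DISPLAY=:1 cua-server start --port 8000
# Bind to all interfaces (WARNING: exposes the server to the network!)
cua-server start --bind 0.0.0.0 --port 8000 --auth-token required-if-exposed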
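Security Considerations
⚠️ Important: By default, the server only listens on localhost (127.0.0.1) for security. This means only processes on your machine can connect to it.
- Local only (default): Safe for personal use with OpenClaw
- Network exposure: Only use --bind 0.0.0.0 with proper firewall rules AND authentication
- Authentication: Always use --auth-token if the server is accessible from the network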
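On the client side, a script can apply the same rule before it sends any command. The sketch below is illustrative only: cua_is_loopback and cua_check_exposure are hypothetical helpers, and CUA_AUTH_TOKEN is an assumed variable name that the server itself does not define.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if URL points at localhost or 127.0.0.1.
# Handles plain hostnames and IPv4 only; IPv6 literals and user:pass@ forms
# are not parsed and are treated as non-local.
cua_is_loopback() {
  local host="${1#*://}"   # drop the scheme
  host="${host%%/*}"       # drop any path
  host="${host%%:*}"       # drop any port
  case "$host" in
    localhost|127.0.0.1) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical guard: refuse a non-local server when no token is configured.
cua_check_exposure() {
  local url="${CUA_SERVER_URL:-http://localhost:8000}"
  if ! cua_is_loopback "$url" && [ -z "${CUA_AUTH_TOKEN:-}" ]; then
    echo "Refusing $url: non-local server without an auth token" >&2
    return 1
  fi
}
```

Matching the exact host, rather than a prefix, avoids treating a name such as localhost.example.com as local.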
Verification
After installation, verify the server is working:
bash
# Check server status
curl http://localhost:8000/status
# List available commands
curl http://localhost:8000/commands | jq
# Take a test screenshot of the desktop
curl -X POST http://localhost:8000/cmd \
  -H "Content-Type: application/json" \
  -d '{"command": "screenshot"}' \
  | jq -r '.result.base64' \
  | base64 -d > test-screenshot.png
# View the screenshot
open test-screenshot.png  # macOS
xdg-open test-screenshot.png  # Linux
start test-screenshot.png  # Windows
If you see a screenshot of your current desktop, the server is working correctly!
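When no image viewer is available, for example over SSH, the saved file can be checked from the shell instead. The sketch below assumes the server returns PNG data, as the .png filename suggests; is_png is a hypothetical helper that only compares the first eight bytes with the standard PNG signature.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if FILE starts with the 8-byte PNG signature
# (89 50 4e 47 0d 0a 1a 0a).
is_png() {
  local sig
  # Dump the first 8 bytes as hex and remove all whitespace from od's output.
  sig=$(head -c 8 "$1" 2>/dev/null | od -An -tx1 | tr -d '[:space:]')
  [ "$sig" = "89504e470d0a1a0a" ]
}

# Example:
#   is_png test-screenshot.png && echo "screenshot looks like a valid PNG"
```

A failed check usually means the response was an error message rather than image data, so inspecting the raw JSON is a reasonable next step.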
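Troubleshooting
Port Already in Use:
bash
# Check what is using port 8000
lsof -i :8000  # macOS/Linux
netstat -ano | findstr :8000  # Windows
# Solution: use a different port
cua-server start --port 8001
Permission Denied (Linux):
bash
# You may need to add your user to the input group to control the keyboard and mouse
sudo usermod -a -G input $USER
# Log out and back in for the change to take effect
Display Not Found (Linux):
bash
# Check the display variable
echo $DISPLAY
# Set it explicitly
DISPLAY=:0 cua-server start --port 8000
Server Not Responding:
bash
# Check whether the process is running
ps aux | grep cua-server  # Linux/macOS
tasklist | findstr cua-server  # Windows
# Try running in the foreground to see errors
cua-server start --port 8000 --debug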
Available Commands
Take Screenshot
Capture the current screen:
bash
curl -X POST http://localhost:8000/cmd \
  -H "Content-Type: application/json" \
  -d '{"command": "screenshot"}' \
  | jq -r '.result.base64' \
  | base64 -d > screenshot.png
Click at Coordinates
Click at specific x,y coordinates:
bash
# Click at the center of a 1280x720 screen
curl -X POST http://localhost:8000/cmd \
  -H "Content-Type: application/json" \
  -d '{"command": "left_click", "params": {"x": 640, "y": 360}}'
Right Click
bash
curl -X POST http://localhost:8000/cmd \
  -H "Content-Type: application/json" \
  -d '{"command": "right_click", "params": {"x": 640, "y": 360}}'
Double Click
bash
curl -X POST http://localhost:8000/cmd \
  -H "Content-Type: application/json" \
  -d '{"command": "double_click", "params": {"x": 640, "y": 360}}'
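The three click commands above differ only in the command name, so the request body can be generated rather than retyped. This is a minimal sketch: cua_click_payload is a hypothetical helper, and the JSON shape simply mirrors the examples in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the JSON body for a click command.
# Usage: cua_click_payload COMMAND X Y
cua_click_payload() {
  local cmd="$1" x="$2" y="$3" n
  case "$cmd" in
    left_click|right_click|double_click) ;;
    *) echo "unknown click command: $cmd" >&2; return 1 ;;
  esac
  # Accept only non-empty strings of digits as coordinates.
  for n in "$x" "$y"; do
    case "$n" in
      ''|*[!0-9]*) echo "coordinates must be non-negative integers" >&2; return 1 ;;
    esac
  done
  printf '{"command": "%s", "params": {"x": %s, "y": %s}}\n' "$cmd" "$x" "$y"
}

# Example (needs a running server):
#   curl -X POST http://localhost:8000/cmd \
#     -H "Content-Type: application/json" \
#     -d "$(cua_click_payload right_click 640 360)"
```

Validating the inputs first means a typo fails locally instead of producing malformed JSON.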
Type Text
Type text at the current cursor position:
bash
curl -X POST http://localhost:8000/cmd \
  -H "Content-Type: application/json" \
  -d '{"command": "type_text", "params": {"text": "Hello, World!"}}'
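Text that contains quotes, backslashes, or newlines has to be escaped before it is placed inside the JSON body. The sketch below shows one way to do that in bash; json_escape and cua_type_payload are hypothetical helpers, and only the most common escapes are handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Covers backslash, double quote, newline, carriage return, and tab only;
# other control characters are passed through unchanged.
json_escape() {
  local s="$1"
  s=${s//\\/\\\\}       # backslash first, so later escapes are not doubled
  s=${s//\"/\\\"}       # double quote
  s=${s//$'\n'/\\n}     # newline
  s=${s//$'\r'/\\r}     # carriage return
  s=${s//$'\t'/\\t}     # tab
  printf '%s' "$s"
}

# Hypothetical helper: print the JSON body for a type_text command.
cua_type_payload() {
  printf '{"command": "type_text", "params": {"text": "%s"}}\n' "$(json_escape "$1")"
}

# Example (needs a running server):
#   curl -X POST http://localhost:8000/cmd \
#     -H "Content-Type: application/json" \
#     -d "$(cua_type_payload 'She said "hi"')"
```

If jq is installed, building the body with jq is a more thorough alternative; this version avoids the extra dependency.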
Press Hotkey
Press a key combination:
bash
# Ctrl+C
curl -X POST http://localhost:8000/cmd \
  -H "Content-Type: application/json" \
  -d '{"command": "hotkey", "params": {"keys": ["ctrl", "c"]}}'
# Ctrl+Alt+T (open a terminal)
curl -X POST http://localhost:8000/cmd \
  -H "Content-Type: application/json" \
  -d '{"command": "hotkey", "params": {"keys": ["ctrl", "alt", "t"]}}'
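Because the keys field is a JSON array, a helper that builds it from its arguments keeps longer combinations readable. This is a sketch only: cua_hotkey_payload is a hypothetical name, and restricting key names to letters, digits, and underscores is an assumption, not a rule stated by the server.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the JSON body for a hotkey command.
# Usage: cua_hotkey_payload KEY [KEY...]
cua_hotkey_payload() {
  local key keys="" sep=""
  if [ "$#" -eq 0 ]; then
    echo "at least one key is required" >&2
    return 1
  fi
  for key in "$@"; do
    # Assumed restriction: letters, digits, and underscores only.
    case "$key" in
      ''|*[!A-Za-z0-9_]*) echo "unsupported key name: $key" >&2; return 1 ;;
    esac
    keys="$keys$sep\"$key\""   # append the quoted key
    sep=", "                   # separator for every key after the first
  done
  printf '{"command": "hotkey", "params": {"keys": [%s]}}\n' "$keys"
}

# Example (needs a running server):
#   curl -X POST http://localhost:8000/cmd \
#     -H "Content-Type: application/json" \
#     -d "$(cua_hotkey_payload ctrl alt t)"
```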
Press Single Key
Press a single key:
bash
# Press Enter
curl -X POST http://localhost:8000/cmd \
  -H "Content-Type: application/json" \