mac-control

# Mac Control Automate Mac UI interactions using cliclick (mouse/keyboard) and system tools. ## Tools - **cliclick**: `/opt/homebrew/bin/cliclick` - mouse/keyboard control - **screencapture**: Built-in screenshot tool - **magick**: ImageMagick for image analysis - **osascript**: AppleScript for window info ## Coordinate System (Eason's Mac Mini) **Current setup**: 1920x1080 display, **1:1 scaling** (no conversion needed!) - Screenshot coords = cliclick coords - If screenshot shows element at (800, 500), click at (800, 500) ### For Retina Displays (2x) If screenshot is 2x the logical resolution: ```bash # Convert: cliclick_coords = screenshot_coords / 2 cliclick c:$((screenshot_x / 2)),$((screenshot_y / 2)) ``` ### Calibration Script Run to verify your scale factor: ```bash /Users/eason/clawd/scripts/calibrate-cursor.sh ``` ## cliclick Commands ```bash # Click at coordinates /opt/homebrew/bin/cliclick c:500,300 # Move mouse (no click) - Note: may not visually update cursor /opt/homebrew/bin/cliclick m:500,300 # Double-click /opt/homebrew/bin/cliclick dc:500,300 # Right-click /opt/homebrew/bin/cliclick rc:500,300 # Click and drag /opt/homebrew/bin/cliclick dd:100,100 du:200,200 # Type text /opt/homebrew/bin/cliclick t:"hello world" # Press key (Return, Escape, Tab, etc.) /opt/homebrew/bin/cliclick kp:return /opt/homebrew/bin/cliclick kp:escape # Key with modifier (cmd+w to close window) /opt/homebrew/bin/cliclick kd:cmd t:w ku:cmd # Get current mouse position /opt/homebrew/bin/cliclick p # Wait before action (ms) /opt/homebrew/bin/cliclick -w 100 c:500,300 ``` ## Screenshots ```bash # Full screen (silent) /usr/sbin/screencapture -x /tmp/screenshot.png # With cursor (may not work for custom cursor colors) /usr/sbin/screencapture -C -x /tmp/screenshot.png # Interactive region selection screencapture -i region.png # Delayed capture screencapture -T 3 -x delayed.png # 3 second delay ``` ## Workflow: Screenshot → Analyze → Click **Best practice for reliable clicking:** 1. **Take screenshot** ```bash /usr/sbin/screencapture -x /tmp/screen.png ``` 2. **View screenshot** (Read tool) to find target coordinates 3. **Click at those coordinates** (1:1 on 1920x1080) ```bash /opt/homebrew/bin/cliclick c:X,Y ``` 4. **Verify** by taking another screenshot ### Example: Click a button ```bash # 1. Screenshot /usr/sbin/screencapture -x /tmp/before.png # 2. View image, find button at (850, 450) # (Use Read tool on /tmp/before.png) # 3. Click /opt/homebrew/bin/cliclick c:850,450 # 4. Verify /usr/sbin/screencapture -x /tmp/after.png ``` ## Window Bounds ```bash # Get Chrome window bounds osascript -e 'tell application "Google Chrome" to get bounds of front window' # Returns: 0, 38, 1920, 1080 (left, top, right, bottom) ``` ## Common Patterns ### Chrome Extension Icon (Browser Relay) Use AppleScript to find exact button position: ```bash # Find Clawdbot extension button position osascript -e ' tell application "System Events" tell process "Google Chrome" set toolbarGroup to group 2 of group 3 of toolbar 1 of group 1 of group 1 of group 1 of group 1 of group 1 of window 1 set allButtons to every pop up button of toolbarGroup repeat with btn in allButtons if description of btn contains "Clawdbot" then return position of btn & size of btn end if end repeat end tell end tell ' # Output: 1755, 71, 34, 34 (x, y, width, height) # Click center of button # center_x = x + width/2 = 1755 + 17 = 1772 # center_y = y + height/2 = 71 + 17 = 88 /opt/homebrew/bin/cliclick c:1772,88 ``` ### Clicking by Color Detection If you need to find a specific colored element: ```bash # Find red (#FF0000) pixels in screenshot magick /tmp/screen.png txt:- | grep "#FF0000" | head -5 # Calculate center of colored region magick /tmp/screen.png txt:- | grep "#FF0000" | awk -F'[,:]' ' BEGIN{sx=0;sy=0;c=0} {sx+=$1;sy+=$2;c++} END{printf "Center: (%d, %d)\n", sx/c, sy/c}' ``` ### Dialog Button Click 1. Screenshot the dialog 2. Find button coordinates visually 3. Click (no scaling on 1920x1080) ```bash # Example: Click "OK" button at (960, 540) /opt/homebrew/bin/cliclick c:960,540 ``` ### Type in Text Field ```bash # Click to focus, then type /opt/homebrew/bin/cliclick c:500,300 sleep 0.2 /opt/homebrew/bin/cliclick t:"Hello world" /opt/homebrew/bin/cliclick kp:return ``` ## Helper Scripts Located in `/Users/eason/clawd/scripts/`: - `calibrate-cursor.sh` - Calibrate coordinate scaling - `click-at-visual.sh` - Click at screenshot coordinates - `get-cursor-pos.sh` - Get current cursor position - `attach-browser-relay.sh` - Auto-click Browser Relay extension ## Keyboard Navigation (When Clicks Fail) **Google OAuth and protected pages block synthetic mouse clicks!** Use keyboard navigation: ```bash # Tab to navigate between elements osascript -e 'tell application "System Events" to keystroke tab' # Shift+Tab to go backwards osascript -e 'tell application "System Events" to key code 48 using shift down' # Enter to activate focused element osascript -e 'tell application "System Events" to keystroke return' # Full workflow: Tab 3 times then Enter osascript -e ' tell application "System Events" keystroke tab delay 0.15 keystroke tab delay 0.15 keystroke tab delay 0.15 keystroke return end tell ' ``` **When to use keyboard instead of mouse:** - Google OAuth / login pages (anti-automation protection) - Popup dialogs with focus trapping - When mouse clicks consistently fail after verification ## Chrome Browser Relay & Multiple Windows **Problem**: Browser Relay may list tabs from multiple Chrome windows, causing `snapshot` to fail on the desired tab. **Solution**: 1. Close extra Chrome windows before automation 2. Or ensure only the target window has relay attached **Check tabs visible to relay**: ```bash # In agent code browser action=tabs profile=chrome ``` If target tab missing from list → wrong window attached. **Verify single window**: ```bash osascript -e 'tell application "Google Chrome" to return count of windows' ``` ## Verify-Before-Click Workflow **Critical**: Always verify coordinates BEFORE clicking important buttons. ```bash # 1. Take screenshot osascript -e 'do shell script "/usr/sbin/screencapture -x /tmp/before.png"' # 2. View screenshot (Read tool), note target position # 3. Move mouse to verify position (optional) python3 -c "import pyautogui; pyautogui.moveTo(X, Y)" osascript -e 'do shell script "/usr/sbin/screencapture -C -x /tmp/verify.png"' # 4. Check cursor is on target, THEN click /opt/homebrew/bin/cliclick c:X,Y # 5. Take screenshot to confirm action worked osascript -e 'do shell script "/usr/sbin/screencapture -x /tmp/after.png"' ``` ## Troubleshooting **Click lands wrong**: Verify scale factor with calibration script **cliclick m: doesn't move cursor visually**: Use `c:` (click) instead, or check with `cliclick p` to confirm position changed **Permission denied**: System Settings → Privacy & Security → Accessibility → Add `/opt/homebrew/bin/node` **Window not found**: Check exact app name: ```bash osascript -e 'tell application "System Events" to get name of every process whose background only is false' ``` **Clicks ignored on OAuth/protected pages**: These pages block synthetic events. Use keyboard navigation (Tab + Enter) instead. **pyautogui vs cliclick coordinates differ**: Stick with cliclick for consistency. pyautogui may have different coordinate mapping. **Quartz CGEvent clicks don't work**: Some pages (Google OAuth) block low-level mouse events too. Keyboard is the only reliable method.

mac-control

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

mac-control

mac-control

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement