System Load Monitor
Core Functions
Monitors the server's CPU and memory load, automatically controls the execution of system tasks, and prevents the server from going down under excessive load.
When to Use This Skill
Use this skill when the user mentions the following situations:
- The server has a low configuration (e.g., 2 cores, 2 GB RAM) and is prone to downtime
- Need to execute resource-intensive tasks
- Previous downtime caused by excessive load
- Need to intelligently control the rhythm of task execution
- Need to monitor server status in real time
Configuration Parameters
| Parameter | Default Value | Description |
|---|---|---|
| cpu_threshold | 90 | CPU load threshold (percentage) |
| memory_threshold | 90 | Memory usage threshold (percentage) |
| check_interval | 30 | Check interval (seconds) |
| cool_down | 60 | Cool-down time after excessive load (seconds) |
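As a rough illustration, the four parameters above could be grouped into one configuration object. `MonitorConfig` and `to_cli_args` are hypothetical names, not part of the skill. Only the `--cpu-threshold` and `--memory-threshold` flags appear in this document, so the sketch emits only those two.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MonitorConfig:
    """Hypothetical holder for the documented parameters and their defaults."""
    cpu_threshold: float = 90      # CPU load threshold (percent)
    memory_threshold: float = 90   # memory usage threshold (percent)
    check_interval: int = 30       # seconds between checks
    cool_down: int = 60            # seconds to wait after excessive load

    def to_cli_args(self) -> List[str]:
        """Build only the two threshold flags that this document shows."""
        return [
            "--cpu-threshold", f"{self.cpu_threshold:g}",
            "--memory-threshold", f"{self.memory_threshold:g}",
        ]


# Example: stricter limits for a small server.
low_spec = MonitorConfig(cpu_threshold=75, memory_threshold=80)
print(low_spec.to_cli_args())
```

Keeping the defaults in one place makes it harder for the check interval and cool-down used in a loop to drift away from the documented values.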
Usage Methods
1. Check Current System Status
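```bash
# Quick check
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py

# View detailed JSON output
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --json

# Custom thresholds
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --cpu-threshold 80 --memory-threshold 85
```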
2. Load Check Process Before Task Execution
Before executing any resource-consuming tasks:
1. Run the load check:

```bash
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --json
```

2. Parse the returned result:
   - status: "ok" / "warning" / "critical"
   - recommendation: "CONTINUE" / "PAUSE"
   - cpu.load_percent: CPU load percentage
   - memory.used_percent: Memory usage percentage
3. Decide based on the status:
   - ok: Continue executing the task
   - warning: Execute cautiously and consider batch processing
   - critical: Pause the task and retry after cooling down
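The parse-and-decide steps can be sketched as a small function. This is only an illustration: the action names (`run`, `run_in_batches`, `pause_and_retry`) and the `decide` helper are hypothetical, while the `status` field and its three values come from the description above.

```python
import json

# Hypothetical action names; the status values are the documented ones.
ACTIONS = {
    "ok": "run",
    "warning": "run_in_batches",
    "critical": "pause_and_retry",
}


def decide(report_json: str) -> str:
    """Map the `status` field of a JSON report to an action name."""
    report = json.loads(report_json)
    status = report["status"]
    if status not in ACTIONS:
        raise ValueError(f"unexpected status: {status!r}")
    return ACTIONS[status]


sample = '{"status": "critical", "recommendation": "PAUSE"}'
print(decide(sample))
```

Raising on an unknown status, rather than defaulting to "run", keeps a malformed report from being treated as permission to proceed.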
3. Monitoring Loop for Long-Running Tasks
For long-running tasks, use the following pattern:
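```python
import json
import os
import subprocess
import time

# subprocess does not expand "~", so resolve the path up front.
SCRIPT = os.path.expanduser(
    "~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py"
)


def check_load(cpu_threshold=90, memory_threshold=90):
    """Run check_load.py and return its parsed JSON report."""
    result = subprocess.run(
        ["python3", SCRIPT, "--json",
         "--cpu-threshold", str(cpu_threshold),
         "--memory-threshold", str(memory_threshold)],
        capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def run_with_load_monitor(task_func, cpu_threshold=90, memory_threshold=90):
    """Wait until the load is no longer critical, then run the task once."""
    while True:
        status = check_load(cpu_threshold, memory_threshold)
        if status["status"] == "critical":
            print("⚠️ Load too high, pausing task...")
            print(f"CPU: {status['cpu']['load_percent']}%, "
                  f"Memory: {status['memory']['used_percent']}%")
            time.sleep(60)  # wait 60 seconds (the default cool_down)
            continue
        # Load is acceptable: run the task
        task_func()
        break
```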
Status Code Explanation
| Exit Code | Status | Meaning |
|---|---|---|
| 0 | ok | Load is normal; safe to continue |
| 1 | warning | Load is relatively high; proceed with caution |
| 2 | critical | Load is excessively high; must pause |
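When the JSON details are not needed, the exit code alone is enough to classify the result. The sketch below assumes the mapping in the table above. `classify_exit_code` and `run_and_classify` are hypothetical helpers, and the demo uses a stand-in child process that exits with code 2 rather than the real script.

```python
import subprocess
import sys

# Exit codes as listed in the table above.
EXIT_STATUS = {0: "ok", 1: "warning", 2: "critical"}


def classify_exit_code(code: int) -> str:
    """Translate an exit code into a status name ("unknown" if unlisted)."""
    return EXIT_STATUS.get(code, "unknown")


def run_and_classify(command):
    """Run a command and classify it by its exit code."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return classify_exit_code(completed.returncode)


# Stand-in for check_load.py: a child process that simply exits with 2.
demo = [sys.executable, "-c", "import sys; sys.exit(2)"]
print(run_and_classify(demo))
```

Any other exit code (for example, from a crash or a missing interpreter) maps to "unknown" so that callers can treat it separately from a genuine load reading.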
Recommendations for Low-Configured Servers (2 Cores 2GB)
For your 2-core 2GB server:
1. Lower the thresholds: use 70-80% as the warning line

```bash
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --cpu-threshold 75 --memory-threshold 80
```

2. Execute in batches: split large tasks into small batches
3. Avoid concurrency: run only one task at a time
4. Check regularly: for long-running tasks, check the load every 30 seconds
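One way to combine the batching and no-concurrency advice is to process batches sequentially and check the load before each one. This is a minimal sketch, not the skill's own code: `chunked`, `run_in_batches`, and the `get_status` callback are hypothetical, and `get_status` is assumed to return one of "ok", "warning", or "critical".

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence, size: int) -> Iterator[List]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_in_batches(
    items: Sequence,
    handle_batch: Callable[[List], None],
    get_status: Callable[[], str],
    batch_size: int = 10,
    cool_down: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Process one batch at a time, waiting while the load is critical."""
    for batch in chunked(items, batch_size):
        while get_status() == "critical":
            sleep(cool_down)  # back off before checking again
        handle_batch(batch)


# Demo with canned statuses and a recording sleep, so nothing really waits.
statuses = iter(["ok", "critical", "ok", "ok"])
processed, waits = [], []
run_in_batches(list(range(5)), processed.append, lambda: next(statuses),
               batch_size=2, sleep=waits.append)
print(processed, waits)
```

Passing `sleep` as a parameter keeps the loop testable; in real use the default `time.sleep` applies.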
Alert Notifications
When a critical status is detected, you should:
1. Immediately pause the current task
2. Notify the user (via Feishu message)
3. Retry after the cool-down period
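The pause, notify, and retry sequence might look like the sketch below. Everything here is an assumption for illustration: `run_with_alerts` is a hypothetical helper, `notify` is a caller-supplied callback (the Feishu delivery itself is not shown), and the retry cap is an addition that the skill does not specify.

```python
import time
from typing import Callable


def run_with_alerts(
    task: Callable[[], None],
    get_status: Callable[[], str],
    notify: Callable[[str], None],
    cool_down: int = 60,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `task` once the load is not critical; alert on each pause.

    Returns True if the task ran, False if every attempt saw critical load.
    """
    for attempt in range(1, max_retries + 2):
        if get_status() != "critical":
            task()
            return True
        notify(f"Load critical, pausing {cool_down}s (attempt {attempt})")
        sleep(cool_down)
    return False


# Demo: two critical readings, then ok. Messages are collected in a list.
readings = iter(["critical", "critical", "ok"])
messages, ran = [], []
ok = run_with_alerts(lambda: ran.append("done"), lambda: next(readings),
                     messages.append, sleep=lambda seconds: None)
print(ok, len(messages), ran)
```

Capping the retries avoids waiting forever on a server that never recovers; the caller can decide what to do when `False` comes back.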
Script Output Example
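```json
{
  "status": "critical",
  "cpu": {
    "load_avg_1m": 3.8,
    "cpu_count": 2,
    "load_percent": 190.0
  },
  "memory": {
    "total_mb": 2048,
    "used_mb": 1843,
    "available_mb": 205,
    "used_percent": 90.0
  },
  "top_processes": [
    {"user": "node", "cpu_percent": 45.2, "mem_percent": 32.1, "command": "node /usr/bin/openclaw"}
  ],
  "thresholds": {"cpu": 90, "memory": 90},
  "recommendation": "PAUSE"
}
```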
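The numbers in the script's sample output are consistent with simple ratios: a 1-minute load average of 3.8 on 2 cores gives 190.0%, and 1843 MB used out of 2048 MB rounds to 90.0%. The formulas below are inferred from those sample values, not taken from the script itself, so treat them as an assumption about how it computes the percentages.

```python
def load_percent(load_avg_1m: float, cpu_count: int) -> float:
    """Load average relative to core count, as a percentage (can exceed 100)."""
    return round(load_avg_1m / cpu_count * 100, 1)


def used_percent(used_mb: float, total_mb: float) -> float:
    """Share of total memory in use, as a percentage."""
    return round(used_mb / total_mb * 100, 1)


print(load_percent(3.8, 2))      # sample CPU figures
print(used_percent(1843, 2048))  # sample memory figures
```

Because the CPU figure is load average divided by core count, it is not capped at 100: a value of 190% means roughly twice as much runnable work as the two cores can serve.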
Notes
1. This skill is an independent monitoring tool and does not rely on Fairy's built-in judgment
2. Invoke the check before executing any important task
3. For long-running tasks, set up a monitoring loop
4. Threshold parameters can be adjusted to suit actual conditions
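System Load Monitor
Core Functions
Monitors the server's CPU and memory load, automatically controls the execution of system tasks, and prevents the server from going down under excessive load.
When to Use This Skill
Use this skill when the user mentions the following situations:
- The server has a low configuration (e.g., 2 cores, 2 GB RAM) and is prone to downtime
- Need to execute resource-intensive tasks
- Previous downtime caused by excessive load
- Need to intelligently control the rhythm of task execution
- Need to monitor server status in real time
Configuration Parameters
| Parameter | Default Value | Description |
|---|---|---|
| cpu_threshold | 90 | CPU load threshold (percentage) |
| memory_threshold | 90 | Memory usage threshold (percentage) |
| check_interval | 30 | Check interval (seconds) |
| cool_down | 60 | Cool-down time after excessive load (seconds) |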
Usage Methods
1. Check Current System Status

```bash
# Quick check
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py

# View detailed JSON output
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --json

# Custom thresholds
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --cpu-threshold 80 --memory-threshold 85
```

2. Load Check Process Before Task Execution
Before executing any resource-consuming task:
1. Run the load check:

```bash
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --json
```

2. Parse the returned result:
   - status: ok / warning / critical
   - recommendation: CONTINUE / PAUSE
   - cpu.load_percent: CPU load percentage
   - memory.used_percent: Memory usage percentage
3. Decide based on the status:
   - ok: Continue executing the task
   - warning: Execute cautiously and consider batch processing
   - critical: Pause the task and retry after cooling down
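3. Monitoring Loop for Long-Running Tasks
For long-running tasks, use the following pattern:

```python
import json
import os
import subprocess
import time

# subprocess does not expand "~", so resolve the path up front.
SCRIPT = os.path.expanduser(
    "~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py"
)


def check_load(cpu_threshold=90, memory_threshold=90):
    """Run check_load.py and return its parsed JSON report."""
    result = subprocess.run(
        ["python3", SCRIPT, "--json",
         "--cpu-threshold", str(cpu_threshold),
         "--memory-threshold", str(memory_threshold)],
        capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def run_with_load_monitor(task_func, cpu_threshold=90, memory_threshold=90):
    """Wait until the load is no longer critical, then run the task once."""
    while True:
        status = check_load(cpu_threshold, memory_threshold)
        if status["status"] == "critical":
            print("⚠️ Load too high, pausing task...")
            print(f"CPU: {status['cpu']['load_percent']}%, "
                  f"Memory: {status['memory']['used_percent']}%")
            time.sleep(60)  # wait 60 seconds (the default cool_down)
            continue
        # Load is acceptable: run the task
        task_func()
        break
```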
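Status Code Explanation
| Exit Code | Status | Meaning |
|---|---|---|
| 0 | ok | Load is normal; safe to continue |
| 1 | warning | Load is relatively high; proceed with caution |
| 2 | critical | Load is excessively high; must pause |
Recommendations for Low-Configured Servers (2 Cores 2GB)
For your 2-core 2GB server:
1. Lower the thresholds: use 70-80% as the warning line

```bash
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --cpu-threshold 75 --memory-threshold 80
```

2. Execute in batches: split large tasks into small batches
3. Avoid concurrency: run only one task at a time
4. Check regularly: for long-running tasks, check the load every 30 seconds
Alert Notifications
When a critical status is detected, you should:
1. Immediately pause the current task
2. Notify the user (via Feishu message)
3. Retry after the cool-down period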
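Script Output Example

```json
{
  "status": "critical",
  "cpu": {
    "load_avg_1m": 3.8,
    "cpu_count": 2,
    "load_percent": 190.0
  },
  "memory": {
    "total_mb": 2048,
    "used_mb": 1843,
    "available_mb": 205,
    "used_percent": 90.0
  },
  "top_processes": [
    {"user": "node", "cpu_percent": 45.2, "mem_percent": 32.1, "command": "node /usr/bin/openclaw"}
  ],
  "thresholds": {"cpu": 90, "memory": 90},
  "recommendation": "PAUSE"
}
```

Notes
1. This skill is an independent monitoring tool and does not rely on Fairy's built-in judgment
2. Invoke the check before executing any important task
3. For long-running tasks, set up a monitoring loop
4. Threshold parameters can be adjusted to suit actual conditions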