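System Load Monitor

Core Functions

Monitors the server's CPU and memory load and automatically throttles task execution so that the server does not go down under excessive load.

When to Use This Skill

Use this skill when the user mentions any of the following:

- The server is low-spec (e.g., 2 cores, 2 GB RAM) and prone to going down
- Resource-intensive tasks need to be run
- The server has previously gone down because of excessive load
- The pace of task execution needs to be controlled intelligently
- Server status needs to be monitored in real time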

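Configuration Parameters

Parameter	Default Value	Description
cpu_threshold	90	CPU load threshold (percentage)
memory_threshold	90	Memory usage threshold (percentage)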

Usage

1. Check Current System Status

```bash
# Quick check
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py

# Show detailed JSON output
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --json

# Custom thresholds
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --cpu-threshold 80 --memory-threshold 85
```
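When the check is launched from Python rather than from a shell, the same invocations can be assembled as an argument list. The following is only a sketch: `build_check_command` is a hypothetical helper, and it assumes nothing beyond the script path and the `--json`, `--cpu-threshold`, and `--memory-threshold` flags shown above.

```python
import os

# Script location as given in this document. "~" is expanded explicitly
# because subprocess does not perform shell-style tilde expansion.
SCRIPT_PATH = os.path.expanduser(
    "~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py"
)


def build_check_command(cpu_threshold=None, memory_threshold=None, as_json=True):
    """Return an argv list for check_load.py (hypothetical helper)."""
    command = ["python3", SCRIPT_PATH]
    if as_json:
        command.append("--json")
    # Thresholds are passed only when given, so the script's defaults apply otherwise.
    if cpu_threshold is not None:
        command += ["--cpu-threshold", str(cpu_threshold)]
    if memory_threshold is not None:
        command += ["--memory-threshold", str(memory_threshold)]
    return command
```

The returned list can be passed to `subprocess.run(command, capture_output=True, text=True)`.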

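2. Load Check Before Running a Task

Before running any resource-intensive task:

1. Run the load check

```bash
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --json
```

2. Parse the returned result

- status: ok / warning / critical
- recommendation: CONTINUE / PAUSE
- cpu.load_percent: CPU load as a percentage
- memory.used_percent: memory usage as a percentage

3. Decide based on the status

- ok: continue running the task
- warning: proceed with caution and consider processing in batches
- critical: pause the task and retry after a cool-down period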
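The decision step above can be sketched as a small lookup. The action labels (`run`, `batch`, `pause`) and the `decide` function are hypothetical names chosen for illustration; only the three status values come from this document. Treating an unrecognized status as critical is an assumption made here to err on the safe side.

```python
import json

# Hypothetical action labels for the three documented status values.
ACTIONS = {
    "ok": "run",          # continue running the task
    "warning": "batch",   # proceed cautiously, in smaller batches
    "critical": "pause",  # pause and retry after a cool-down period
}


def decide(report_json: str) -> str:
    """Map the JSON text printed by check_load.py --json to an action label."""
    report = json.loads(report_json)
    # Unknown or missing status falls back to "pause" (assumed safe default).
    return ACTIONS.get(report.get("status"), "pause")
```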

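3. Monitoring Loop for Long-Running Tasks

For long-running tasks, use the following pattern:

```python
import json
import os
import subprocess
import time

# subprocess does not expand "~", so resolve the script path explicitly.
SCRIPT = os.path.expanduser(
    "~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py"
)


def check_load(cpu_threshold=90, memory_threshold=90):
    """Run check_load.py and return its JSON report as a dict."""
    result = subprocess.run(
        ["python3", SCRIPT, "--json",
         "--cpu-threshold", str(cpu_threshold),
         "--memory-threshold", str(memory_threshold)],
        capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def run_with_load_monitor(task_func, cpu_threshold=90, memory_threshold=90):
    """Wait while the load is critical, then run the task once."""
    while True:
        status = check_load(cpu_threshold, memory_threshold)

        if status["status"] == "critical":
            print("⚠️ Load too high, pausing task...")
            print(f"CPU: {status['cpu']['load_percent']}%, "
                  f"Memory: {status['memory']['used_percent']}%")
            time.sleep(60)  # wait 60 seconds before checking again
            continue

        # Load is acceptable: run the task
        task_func()
        break
```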
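The loop above keeps waiting for as long as the load stays critical. On a small server it may be preferable to give up after a fixed number of checks. The following is a minimal sketch of such a bounded wait. `wait_for_safe_load`, `max_attempts`, and the injectable `check` and `sleep` callables are assumptions introduced here for illustration and are not part of the skill.

```python
import time
from typing import Callable


def wait_for_safe_load(check: Callable[[], dict],
                       cooldown: float = 60.0,
                       max_attempts: int = 10,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() until the status is no longer critical.

    Returns True as soon as the load is acceptable, or False once
    max_attempts checks have all reported a critical status.
    """
    for attempt in range(max_attempts):
        if check()["status"] != "critical":
            return True
        if attempt < max_attempts - 1:
            sleep(cooldown)  # cool down before the next check
    return False
```

A caller could run the task only when this returns True and report a failure otherwise. Passing `check` and `sleep` as parameters keeps the function testable without invoking the real script.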

Exit Codes

Exit Code	Status	Meaning
0	ok	Load is normal; the task can continue
1

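Recommendations for Low-Spec Servers (2 Cores, 2 GB)

For your 2-core, 2 GB server:

1. Lower the thresholds: use 70-80% as the warning threshold

```bash
python3 ~/.openclaw/workspace/skills/system-load-monitor/scripts/check_load.py --cpu-threshold 75 --memory-threshold 80
```

2. Run in batches: split large tasks into small batches

3. Avoid concurrency: run only one task at a time

4. Check regularly: for long-running tasks, check the load every 30 seconds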
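Recommendations 2 to 4 can be combined into one pattern: process one batch at a time and re-check the load before each batch. The sketch below assumes a `check` callable that returns a report dict with a `status` key (for example, a wrapper around `check_load.py --json`). The function names and the `batch_size` parameter are hypothetical.

```python
import time
from typing import Iterator, Sequence


def split_into_batches(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_in_batches(items, batch_size, handle_batch, check,
                       interval=30.0, sleep=time.sleep):
    """Handle one batch at a time, re-checking the load before each batch.

    Returns the number of items processed.
    """
    processed = 0
    for batch in split_into_batches(items, batch_size):
        # Wait in 30-second steps (by default) while the load is critical.
        while check()["status"] == "critical":
            sleep(interval)
        handle_batch(batch)  # batches run sequentially, never concurrently
        processed += len(batch)
    return processed
```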

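Alert Notifications

When a critical status is detected, you should:

1. Pause the current task immediately
2. Notify the user (via a Feishu message)
3. Retry after the cool-down period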
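The three steps above can be sketched as one handler. Everything named here is hypothetical: `pause_task`, `notify`, and `resume_task` are caller-supplied callables, and `notify` stands in for whatever sends the Feishu message, which this document does not specify. The 60-second default cool-down is also an assumption.

```python
import time


def handle_critical(report, pause_task, notify, resume_task,
                    cooldown=60.0, sleep=time.sleep):
    """Pause, notify, cool down, then retry when the report is critical.

    Returns True if the critical path was taken, False otherwise.
    """
    if report.get("status") != "critical":
        return False
    pause_task()                                          # 1. pause immediately
    cpu = report.get("cpu", {}).get("load_percent")
    mem = report.get("memory", {}).get("used_percent")
    notify(f"Load critical: CPU {cpu}%, memory {mem}%")   # 2. notify the user
    sleep(cooldown)                                       # 3. wait out the cool-down
    resume_task()                                         #    then retry the task
    return True
```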

Script Output Example

```json
{
  "status": "critical",
  "cpu": {
    "loadavg1m": 3.8,
    "cpu_count": 2,
    "load_percent": 190.0
  },
  "memory": {
    "total_mb": 2048,
    "used_mb": 1843,
    "available_mb": 205,
    "used_percent": 90.0
  },
  "top_processes": [
    {"user": "node", "cpu_percent": 45.2, "mem_percent": 32.1, "command": "node /usr/bin/openclaw"}
  ],
  "thresholds": {"cpu": 90, "memory": 90},
  "recommendation": "PAUSE"
}
```
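The figures in this sample are consistent with the CPU load percentage being the 1-minute load average divided by the number of cores, which would explain why it can exceed 100%. Likewise, the memory percentage matches used memory divided by total memory. This is an inference from the numbers, not a statement of how check_load.py computes them. The function names below are hypothetical.

```python
def load_percent(load_average: float, cpu_count: int) -> float:
    """Load average expressed as a percentage of the available cores."""
    return round(load_average / cpu_count * 100, 1)


def used_percent(used_mb: float, total_mb: float) -> float:
    """Memory in use as a percentage of total memory."""
    return round(used_mb / total_mb * 100, 1)


# Values taken from the sample output above.
print(load_percent(3.8, 2))      # 190.0
print(used_percent(1843, 2048))  # 90.0
```

Both values meet or exceed the sample's thresholds of 90, which fits the reported critical status and PAUSE recommendation.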

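Notes

1. This skill is an independent monitoring tool and does not rely on Fairy's built-in judgment
2. Run the check before executing any important task
3. For long-running tasks, set up a monitoring loop
4. Threshold parameters can be adjusted to suit actual conditions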
