Image Generation

Use this skill when you need to create one or more image files from a text prompt, or edit one or more existing images with Gemini.

Requirements

- ~/.openclaw/openclaw.json must set $.skills.entries["gemini-image-generation"].enabled to true.
- ~/.openclaw/openclaw.json must include $.skills.entries["gemini-image-generation"].env with the following keys:
  - GEMINI_API_KEY (required)
  - GEMINI_MODEL_ID (required)
  - GEMINI_BASE_URL (optional)

- Example ~/.openclaw/openclaw.json:

{
  ......,
  "skills": {
    "entries": {
      "gemini-image-generation": {
        "enabled": true,
        "env": {
          "GEMINI_API_KEY": "sk-xxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
          "GEMINI_MODEL_ID": "gemini-3.1-flash-image-preview",
          "GEMINI_BASE_URL": "https://custom-endpoint.com"
        }
      }
    }
  },
  ......
}

- Node.js must be installed in the workspace environment.
- Install dependencies once with npm install from the skill root.
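
The configuration requirements above can be checked before the skill runs. The following is a minimal sketch, not part of the skill itself: checkSkillConfig is a hypothetical helper that takes an already parsed openclaw.json object and returns a list of problems. The URL check on GEMINI_BASE_URL is an added assumption; the requirements only say the key is optional.

```javascript
// Hypothetical helper (not shipped with the skill): checks a parsed
// openclaw.json object against the requirements listed above.
// Returns an array of problem descriptions; an empty array means OK.
function checkSkillConfig(config) {
  const entry = config?.skills?.entries?.["gemini-image-generation"];
  if (!entry) {
    return ['missing skills.entries["gemini-image-generation"]'];
  }

  const problems = [];
  if (entry.enabled !== true) {
    problems.push("enabled must be true");
  }

  const env = entry.env ?? {};
  for (const key of ["GEMINI_API_KEY", "GEMINI_MODEL_ID"]) {
    if (typeof env[key] !== "string" || env[key].length === 0) {
      problems.push(`env.${key} is required`);
    }
  }

  // GEMINI_BASE_URL is optional; validating its syntax is an assumption
  // made for this sketch, and only happens when the key is present.
  if (env.GEMINI_BASE_URL !== undefined) {
    try {
      new URL(env.GEMINI_BASE_URL);
    } catch {
      problems.push("env.GEMINI_BASE_URL is not a valid URL");
    }
  }
  return problems;
}

// Example with placeholder values.
const sampleConfig = {
  skills: {
    entries: {
      "gemini-image-generation": {
        enabled: true,
        env: {
          GEMINI_API_KEY: "sk-xxxx",
          GEMINI_MODEL_ID: "gemini-3.1-flash-image-preview",
        },
      },
    },
  },
};
console.log(checkSkillConfig(sampleConfig)); // prints []
```

Returning a list rather than throwing on the first failure lets a caller report every missing setting in one pass.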

When To Use

- The user asks to generate a new image from a text prompt.
- The user asks to modify, restyle, extend, or otherwise edit one or more existing images.
- The user wants the generated image saved to a workspace file.
- The task should be handled through a reusable OpenClaw skill instead of ad hoc SDK code.

Procedure

1. Convert the user request into a single clear image prompt.
2. If the user supplied source images, choose or confirm the input file path or paths inside the workspace.
3. If the user specified a target aspect ratio or size, pass them through as --aspectRatio and --imageSize.
4. Choose an output path inside the workspace unless the user already provided one.
5. For text-to-image, run generate-image.mjs with --prompt, --output, and optional image config arguments.
6. For image editing, run edit-image.mjs with --prompt, one or more --input values, --output, and optional image config arguments.
7. Read the API key from GEMINI_API_KEY and the model ID from GEMINI_MODEL_ID in the environment.
8. Optionally, read the base URL from GEMINI_BASE_URL in the environment for custom endpoints.
9. Return the saved image path or paths to the user.
10. After returning each image path, also output MEDIA:<image_path> (e.g. MEDIA:outputs/gemini-native-image.png) so the image is displayed inline in the conversation.
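
Steps 2 through 6 amount to assembling an argument list for one of the two scripts. The sketch below shows one way to do that. buildImageCommand is a hypothetical helper, and the script directory is assumed to be ./skills/gemini-image-generation/scripts relative to the workspace root. The flags used are only the ones named in the procedure.

```javascript
// Hypothetical helper: builds the argument array for `node <script> ...`.
// The script directory is an assumption for this sketch.
function buildImageCommand({ prompt, output, inputs = [], aspectRatio, imageSize }) {
  const scriptDir = "./skills/gemini-image-generation/scripts";
  // With source images the request is an edit; otherwise it is text-to-image.
  const script = inputs.length > 0 ? "edit-image.mjs" : "generate-image.mjs";

  const args = [`${scriptDir}/${script}`, "--prompt", prompt];
  for (const input of inputs) {
    args.push("--input", input); // one --input flag per source image
  }
  args.push("--output", output);

  // Image config flags are passed through only when the user specified them.
  if (aspectRatio) args.push("--aspectRatio", aspectRatio);
  if (imageSize) args.push("--imageSize", imageSize);
  return args;
}

// Example: an edit request with one source image and an aspect ratio.
const editArgs = buildImageCommand({
  prompt: "Turn this cat into a watercolor illustration",
  inputs: ["inputs/cat.png"],
  output: "outputs/cat-watercolor.png",
  aspectRatio: "5:4",
});
console.log(editArgs.join(" "));
```

Keeping the arguments as an array, for example to pass to Node's child_process.execFile("node", editArgs, callback), avoids shell quoting problems when the prompt contains spaces or quotes.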

Commands

The following examples use PowerShell syntax. The prompt is quoted so that it is passed as a single argument.

node ./skills/gemini-image-generation/scripts/generate-image.mjs --prompt "Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme" --output outputs/gemini-native-image.png

node ./skills/gemini-image-generation/scripts/generate-image.mjs --prompt "Create a wide cinematic food photo of a nano banana dish in a fancy restaurant with a Gemini theme" --output outputs/gemini-wide.png --aspectRatio 16:9 --imageSize 2K

node ./skills/gemini-image-generation/scripts/edit-image.mjs --prompt "Turn this cat into a watercolor illustration eating a nano-banana in a fancy restaurant under the Gemini constellation" --input inputs/cat.png --output outputs/cat-watercolor.png --aspectRatio 5:4 --imageSize 2K

node ./skills/gemini-image-generation/scripts/edit-image.mjs --prompt "Create an office group photo of these people making funny faces" --input inputs/person-1.jpg --input inputs/person-2.jpg --input inputs/person-3.jpg --output outputs/group-photo.png

Notes

- The scripts print TEXT: lines for model text and IMAGE: lines for each saved file.
- After the skill finishes, always present every generated image to the user by outputting MEDIA:<path> for each saved image path. This ensures the image is rendered inline in the conversation alongside the file path.
- The final JSON summary only includes generated image paths and optional image config, so prompts, model IDs, and source image paths are not echoed back into logs.
- Saved file extensions follow the returned image MIME type. If the requested output path uses a different suffix, the scripts keep the base name and write the file with the returned type instead.
- If the model returns multiple images, the scripts save them as name-1.png, name-2.png, and so on.
- edit-image.mjs supports repeated --input flags. You can also pass a comma-separated list to a single --input value.
- edit-image.mjs infers the source MIME type from .png, .jpg, .jpeg, or .webp. Use one --mime-type for all inputs, or repeat --mime-type so it lines up with each --input.
- Both scripts accept --aspectRatio and --imageSize. They also accept the kebab-case forms --aspect-ratio and --image-size.
- The scripts only send config.imageConfig when at least one of those parameters is provided.