Deep HJB Solver

Before writing any loss or config code, read references/repo-conventions.md — it contains the exact constructor signatures and return-type contracts that must be followed.
Before writing any CSV or plotting code, read references/training-output-contract.md.

Never modify DGMTrainer, DGMNet, or any file under src/trainers/ or src/models/ unless explicitly asked.

Output layout

Every new HJB problem is created as a self-contained folder named after the problem slug. Nothing is written outside it.

CODEBLOCK0

Workflow

Use the following steps in order. In every template below, replace:

- <slug> → snake_case problem name, e.g. INLINECODE7
- INLINECODE8 → CamelCase version, e.g. INLINECODE9
- INLINECODE10 → spatial dimension (1 or 2)
- INLINECODE13 → number of control variables
- INLINECODE14 → Python list literal of control names, e.g. INLINECODE15
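The substitutions above can be derived mechanically from the slug. The following is a minimal sketch, assuming the CamelCase name is simply the slug's underscore-separated words capitalised. The helper names `slug_to_camel` and `template_values`, and the dictionary keys, are hypothetical and not part of the skill.

```python
import re


def slug_to_camel(slug: str) -> str:
    """Convert a snake_case slug to CamelCase (hypothetical helper)."""
    if not re.fullmatch(r"[a-z][a-z0-9]*(_[a-z0-9]+)*", slug):
        raise ValueError(f"not a snake_case slug: {slug!r}")
    return "".join(word.capitalize() for word in slug.split("_"))


def template_values(slug: str, dim: int, control_names: list) -> dict:
    """Collect the values substituted into the templates (illustrative keys)."""
    if dim not in (1, 2):
        raise ValueError("spatial dimension must be 1 or 2")
    return {
        "slug": slug,
        "camel": slug_to_camel(slug),
        "dim": dim,
        "num_controls": len(control_names),
        # repr() yields a Python list literal that can be pasted into a template.
        "control_names": repr(list(control_names)),
    }


values = template_values("linear_control", 1, ["Z", "W"])
print(values["camel"], values["num_controls"], values["control_names"])
```

Validating the slug up front keeps an invalid folder or class name from propagating into every generated file.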

Step 1 — Copy the DGM framework into `<slug>/src/`

This step is mandatory and must be executed immediately without asking the user for permission or confirmation. Do not say "should I copy the assets?" — just do it.

Run the following shell commands to copy the bundled framework. Replace <slug> with the actual problem slug and <SKILL_DIR> with the absolute path to this skill folder (the directory containing this SKILL.md):

CODEBLOCK1

Do not proceed to Step 2 until the copy commands have completed successfully.

Step 2 — Create `<slug>/src/configs/<slug>_config.py`

1D domain (<dim> = 1):

CODEBLOCK2

2D domain (<dim> = 2) — replace the bounds lines with:

CODEBLOCK3

Step 3 — Create `<slug>/src/problems/<slug>_problem.py`

CODEBLOCK4

Step 4 — Create `<slug>/src/losses/<slug>_loss.py`

CODEBLOCK5

Step 5 — Register new classes in `<slug>/src/configs/__init__.py`, `<slug>/src/problems/__init__.py`, `<slug>/src/losses/__init__.py`

Append one import line to each file:

CODEBLOCK6

Step 6 — Create `<slug>/examples/<slug>_train.py`

All imports are relative to <slug>/ (the script is run from inside that folder).

CODEBLOCK7

Step 7 — Fill the real PDE

In <slug>/src/losses/<slug>_loss.py, replace the placeholder body of compute_value_loss with the actual HJB residual. The return signature must not change: return L1, L3, diff_V, diff_terminal.

In <slug>/src/problems/<slug>_problem.py, implement terminal_utility_<slug> with the correct payoff.

See "How to translate an HJB equation into loss functions" below for the exact decomposition rule.

How to translate an HJB equation into loss functions

Given an HJB equation with an inf (or sup) operator, split it into two losses following this fixed rule.

The rule

| Loss | What to put in it |
|---|---|
| INLINECODE34 → INLINECODE35 | Drop the inf/sup symbol; substitute the control network output for u; sum all remaining terms into a residual; take the mean of the squared residual. |
| INLINECODE36 → INLINECODE37 | Keep only the terms inside inf{…}; evaluate them with the control network; take the mean directly (no square). For sup{…}, negate first. |

The intuition: L1 trains the value network so the PDE residual → 0 (equation is satisfied). L2 trains the control network to actually minimise (or maximise) the Hamiltonian — it IS a gradient-descent step on the inf objective, so no squaring.
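To make the rule concrete, here is a small pure-Python sketch for a toy equation chosen only for illustration, V_t + x² + inf_u { u·V_x + ½u² } = 0. It is not a problem from this skill. The sample numbers stand in for network outputs and derivatives, and the function names are hypothetical.

```python
def hamiltonian_terms(u, V_x):
    """Terms inside inf{...} for the toy problem: u*V_x + u^2/2."""
    return u * V_x + 0.5 * u * u


def value_loss(V_t, V_x, x, u):
    """L1: drop the inf, plug in u, square the full residual, then average."""
    residuals = [
        vt + xi * xi + hamiltonian_terms(ui, vx)
        for vt, vx, xi, ui in zip(V_t, V_x, x, u)
    ]
    return sum(r * r for r in residuals) / len(residuals)


def control_loss(V_x, u):
    """L2: only the terms inside inf{...}, averaged without squaring."""
    terms = [hamiltonian_terms(ui, vx) for vx, ui in zip(V_x, u)]
    return sum(terms) / len(terms)


# Made-up sample values standing in for network outputs and derivatives.
V_t = [0.5, 1.0]
V_x = [1.0, -2.0]
x = [0.0, 1.0]
u_opt = [-vx for vx in V_x]  # analytic minimiser of the toy Hamiltonian: u* = -V_x
u_bad = [0.0, 0.0]           # an arbitrary non-optimal control

print("L2 at u*:", control_loss(V_x, u_opt), " L2 at u=0:", control_loss(V_x, u_bad))
print("L1 at u*:", value_loss(V_t, V_x, x, u_opt), " L1 at u=0:", value_loss(V_t, V_x, x, u_bad))
```

With these numbers the minimising control gives the lower L2, and L1 is zero only when that control is plugged in. That is the division of labour the rule describes. For a sup{…} problem the same `control_loss` would be negated before averaging.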

Running the training

CODEBLOCK8

Results are saved to <slug>/results/.

Common pitfall: GradientTape reuse

Rule

Always use persistent=True in both compute_value_loss and compute_control_loss. Even if you think you only need one gradient today, problems that include V_xx (second-order / diffusion terms) require a nested tape inside compute_control_loss, where the outer tape usually ends up serving more than one gradient call. A non-persistent tape allows only one such call. Using persistent=True consistently prevents hard-to-diagnose errors; delete the tape with del once the gradients are computed so its resources are released.

| Method | Minimum gradients | Must use `persistent=True`? |
|---|---|---|
| INLINECODE48 | INLINECODE49 + `V_x` (2 gradients) | Yes |
| INLINECODE51 | `V_x` + possibly `V_xx` via nested tape | Yes |

`compute_value_loss` — always `persistent=True`

CODEBLOCK9

`compute_control_loss` — always `persistent=True`

CODEBLOCK10

Omitting persistent=True when computing two or more gradients raises:

RuntimeError: A non-persistent GradientTape can only be used to compute one set of gradients
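The behaviour can be reproduced outside the framework. The following is a minimal sketch, assuming TensorFlow 2.x with eager execution. `toy_value` is a hypothetical stand-in for the value network, not the DGMNet model.

```python
import tensorflow as tf

t = tf.constant([[0.0], [0.5]])
x = tf.constant([[1.0], [2.0]])


def toy_value(t, x):
    """Stand-in for the value network: V(t, x) = t * x^3."""
    return t * x * x * x


# Non-persistent tape: the first gradient call works, the second one raises.
with tf.GradientTape() as tape:
    tape.watch(t)
    tape.watch(x)
    V = toy_value(t, x)
V_t = tape.gradient(V, t)
try:
    tape.gradient(V, x)
    second_call_failed = False
except RuntimeError:
    second_call_failed = True

# Persistent outer tape with a nested inner tape for the second derivative.
with tf.GradientTape(persistent=True) as outer:
    outer.watch(t)
    outer.watch(x)
    with tf.GradientTape() as inner:
        inner.watch(x)
        V = toy_value(t, x)
    V_x = inner.gradient(V, x)   # 3 t x^2, computed inside `outer` so it is recorded
V_t = outer.gradient(V, t)       # x^3
V_xx = outer.gradient(V_x, x)    # 6 t x
del outer                        # release the persistent tape

print(second_call_failed, V_t.numpy().ravel(), V_xx.numpy().ravel())
```

The inner gradient is taken while the outer tape is still recording. This is what lets the outer tape differentiate V_x again to obtain V_xx.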

Guardrails

- Never modify DGMTrainer, DGMNet, or sampler internals.
- Keep all tensors as tf.float32.
- Never hardcode CSV column names in plotting code; detect control columns dynamically.
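One way to detect control columns dynamically is to treat every column that is not a known bookkeeping or metric column as a control. The sketch below assumes exactly that. The column names in the sample data and the helper `detect_control_columns` are hypothetical. The authoritative column layout is the one described in references/training-output-contract.md.

```python
import csv
import io


def detect_control_columns(fieldnames, non_control_columns):
    """Return the CSV columns not listed as non-control, preserving file order."""
    known = set(non_control_columns)
    return [name for name in fieldnames if name not in known]


# Hypothetical CSV content; real column names come from the output contract.
sample_csv = io.StringIO(
    "epoch,loss_value,loss_control,Z,W\n"
    "0,1.5,0.7,0.10,0.20\n"
    "1,0.9,0.4,0.12,0.18\n"
)
reader = csv.DictReader(sample_csv)
rows = list(reader)
controls = detect_control_columns(
    reader.fieldnames,
    non_control_columns=["epoch", "loss_value", "loss_control"],
)
# One series per detected control, ready to hand to a plotting routine.
series = {name: [float(row[name]) for row in rows] for name in controls}
print(controls)
```

Because the control names are read from the file header, the same plotting code keeps working when a problem has a different number of controls.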

Environment setup (before running)

If the user wants to run the training script, make sure the Python environment is correctly configured first.

1. Install dependencies

CODEBLOCK12

INLINECODE62 includes: tensorflow, numpy, matplotlib, tqdm, pandas.

2. Verify TensorFlow can see the GPU (optional but recommended)

CODEBLOCK13

If a GPU is available but not listed, install the CUDA/cuDNN versions that match your TensorFlow release (the separate tensorflow-gpu package applies only to older TensorFlow versions).

3. Run from the correct directory

The training script uses relative imports rooted at <slug>/. Always cd into the problem folder first:

CODEBLOCK14

Running from the workspace root will cause ModuleNotFoundError for src.*.
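A quick pre-flight check can catch the wrong working directory before the import fails. This is a minimal sketch that relies only on the output layout described earlier, where a problem folder contains src/ and examples/. The helper names are hypothetical and not part of the skill.

```python
from pathlib import Path


def is_problem_root(folder) -> bool:
    """True if `folder` looks like a generated problem folder (has src/ and examples/)."""
    folder = Path(folder)
    return (folder / "src").is_dir() and (folder / "examples").is_dir()


def require_problem_root(folder=".") -> Path:
    """Return the resolved folder, or raise with a hint about where to cd."""
    folder = Path(folder).resolve()
    if not is_problem_root(folder):
        raise RuntimeError(
            f"{folder} has no src/ and examples/ subfolders; "
            "cd into the problem folder before running the training script"
        )
    return folder


print(is_problem_root("."))
```

Calling `require_problem_root()` from a wrapper or notebook cell turns the later ModuleNotFoundError into an immediate, more descriptive error.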

4. Results location

- Output is written to <slug>/results/ (configurable via CommonConfig.save_dir).
- CSV training history: <slug>/results/<saveName>_training_history.csv
- Saved models: <slug>/results/<saveName>_value_model/ and INLINECODE79

Deep HJB Solver

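Before writing any loss or config code, read references/repo-conventions.md: it contains the exact constructor signatures and return-type contracts that must be followed.
Before writing any CSV or plotting code, read references/training-output-contract.md.

Unless explicitly asked, do not modify DGMTrainer, DGMNet, or any file under src/trainers/ or src/models/.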

Output layout

Every new HJB problem is created as a self-contained folder named after the problem slug. Everything is written inside that folder.

    <slug>/
    ├── src/
    │   ├── __init__.py
    │   ├── configs/
    │   │   ├── __init__.py
    │   │   ├── common_config.py       ← copied from assets
    │   │   └── <slug>_config.py       ← generated
    │   ├── models/
    │   │   ├── __init__.py
    │   │   └── dgm_net.py             ← copied from assets
    │   ├── problems/
    │   │   ├── __init__.py
    │   │   ├── base_problem.py        ← copied from assets
    │   │   └── <slug>_problem.py      ← generated
    │   ├── losses/
    │   │   ├── __init__.py
    │   │   └── <slug>_loss.py         ← generated
    │   ├── samplers/
    │   │   ├── __init__.py
    │   │   ├── base_sampler.py        ← copied from assets
    │   │   ├── uniform_sampler.py     ← copied from assets
    │   │   └── uniform_sampler_2d.py  ← copied from assets
    │   ├── trainers/
    │   │   ├── __init__.py
    │   │   └── dgm_trainer.py         ← copied from assets
    │   └── utils/
    │       ├── __init__.py
    │       └── visualization.py       ← copied from assets
    ├── examples/
    │   └── <slug>_train.py            ← generated
    ├── plot_training_csv.py           ← copied from assets
    └── requirements.txt               ← copied from assets

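Workflow

Run the following steps in order. In every template, replace:

- <slug> → snake_case problem name, e.g. linear_control
- INLINECODE8 → CamelCase version, e.g. LinearControl
- <dim> → spatial dimension (1 or 2)
- INLINECODE13 → number of control variables
- INLINECODE14 → Python list literal of control names, e.g. ["Z", "W"]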

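Step 1 — Copy the DGM framework into `<slug>/src/`

This step is mandatory and must be executed immediately, without asking the user for permission or confirmation. Do not say "Should I copy the assets?"; just do it.

Run the following shell commands to copy the bundled framework. Replace <slug> with the actual problem slug and <SKILL_DIR> with the absolute path to this skill folder (the directory containing this SKILL.md):

    mkdir -p <slug>/src
    cp -r <SKILL_DIR>/assets/src/. <slug>/src/
    cp <SKILL_DIR>/assets/plot_training_csv.py <slug>/plot_training_csv.py
    cp <SKILL_DIR>/assets/requirements.txt <slug>/requirements.txt

Do not proceed to Step 2 until the copy commands have completed successfully.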

Step 2 — Create `<slug>/src/configs/<slug>_config.py`

1D domain (<dim> = 1):

    """Config for <slug>."""

    from dataclasses import dataclass, field
    from .common_config import CommonConfig

    @dataclass
    class INLINECODE8Config(CommonConfig):

        dimension: int = 1
        T: float = 1.0
        t_low: float = 0.0
        X_low: float = 0.0
        X_high: float = 1.0

        num_controls: int = INLINECODE13
        control_names: list = field(default_factory=lambda: INLINECODE14)
        metrics_config: list = field(default_factory=lambda: ["max_diff_V", "max_diff_terminal"])
        extra_info_mapping: dict = field(default_factory=dict)
        early_stop_metric: str = "max_diff_V"
        early_stop_threshold: float = 1e-4
        problem_params_keys: list = field(default_factory=list)

        saveName: str = "<slug>"

2D domain (<dim> = 2) — replace the bounds lines with:

        dimension: int = 2
        X_low: list = field(default_factory=lambda: [0.0, 0.0])
        X_high: list = field(default_factory=lambda: [1.0, 1.0])

Step 3 — Create `<slug>/src/problems/<slug>_problem.py`

    """Problem definition for <slug>."""

    from .base_problem import BaseProblem

    def terminal_utility_<slug>(x):
        """TODO: implement the terminal payoff g(x). x shape: (batch, dim)."""
        return -x[:, :1]

    class INLINECODE8Problem(BaseProblem):

        def get_terminal_condition(self, x):
            return terminal_utility_<slug>(x)

Step 4 — Create `<slug>/src/losses/<slug>_loss.py`

    """Loss functions for <slug>."""

    import tensorflow as tf

    class INLINECODE8Loss:

        def __init__(self, problem):
            self.problem = problem

        def compute_value_loss(self, model, control, t_interior, X_interior, t_terminal, X_terminal):
            # TODO: replace with the real HJB PDE residual.
            # IMPORTANT: if more than one gradient is needed from the same tape (e.g.
            # V_t and V_x), use persistent=True and delete the tape afterwards.
            # A non-persistent tape raises RuntimeError on the second call.
            with tf.GradientTape(persistent=True, watch_accessed_variables=False) as gt:
                gt.watch(t_interior)
                gt.watch(X_interior)
                V = model(t_interior, X_interior)
            V_t = gt.gradient(V, t_interior)  # ∂V/∂t
            V_x = gt.gradient(V, X_interior)  # ∂V/∂x (use it if the HJB needs it)
            del gt  # release the persistent tape right after use

            ctrl = control(t_interior, X_interior)  # u from the control network, substituted into the HJB
            residual = V_t  # TODO: replace with the actual HJB residual, e.g. V_t + ctrl * V_x + ...
            L1 = tf.reduce_mean(tf.square(residual))

            target_terminal = self.problem.get_terminal_condition(X_terminal)
            fitted_terminal = model(t_terminal, X_terminal)
            diff_terminal = fitted_terminal - target_terminal
            L3 = tf.reduce_mean(tf.square(diff_terminal))

            # diff_V can be either:
            #   (a) a plain tensor, the HJB residual (e.g. diff_V = residual), or
            #   (b) a dict of debug tensors that must contain the "residual" key
            #       (e.g. {"residual": residual, "V": V, "V_t": V_t, "V_x": V_x})
            # base_problem.extract_metrics handles both forms automatically.
            # Return exactly this 4-tuple: DGMTrainer unpacks it directly.
            diff_V = residual  # option (a): simplest form
            # diff_V = {"residual": residual, "V": V, "V_t": V_t, "V_x": V_x}  # option (b)
            return L1, L3, diff_V, diff_terminal

        def compute_control_loss(self, model, control, t_interior, X_interior, t_terminal, X_terminal):
            # TODO: implement the FOC / control objective.
            # Always use persistent=True: V_x is needed at minimum, and some problems
            # also need V_xx (second order), which requires nesting another tape inside this one.
            # persistent=True lets the outer tape be reused safely several times.
            with tf.GradientTape(persistent=True, watch_accessed_variables=False) as gt:
                gt.watch(X_interior)
                V = model(t_interior, X_interior)
            V_x = gt.gradient(V, X_interior)  # ∂V/∂x
            del gt
            ctrl = control(t_interior, X_interior)  # u from the control network
            # TODO: use ctrl