From 213cd6fba58092f0b3b98a5e5980e6843f12caa3 Mon Sep 17 00:00:00 2001
From: Davide Grilli
Date: Fri, 15 May 2026 15:26:20 +0200
Subject: [PATCH] docs: update CLAUDE.md with inverse PINN implementation plan and simplified architecture

Co-Authored-By: Claude Sonnet 4.6
---
 CLAUDE.md | 147 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 75 insertions(+), 72 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 8f45d33..6e8444a 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -6,97 +6,100 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 Write all git commit messages in Italian.
 
-## Scope of Work
-
-Work across the whole repository: `fdm/`, `model.py`, `engine.py`, `visualizer.py`, `app.py`, `config.py`. Help with improvements, bug fixes, and optimizations in any file of the project.
-
-## Project Overview
-
-**Heat Equation PINN** — A Physics-Informed Neural Network that solves the 1D time-varying heat equation with an internal point heat source:
-
-```
-∂T/∂t = α ∂²T/∂x² + (α/k) Q(t) δ(x − X_SRC)     x ∈ [0, L],  t ∈ [0, T_END]
-```
-
-where `Q(t) = Q_VAL` if `t ≥ T_STEP` else `0`.
-
-- **IC:** `T(x, 0) = T0` (uniform)
-- **BC x=0:** Robin — convection: `−k ∂T/∂x = h (T − T_AMB)`
-- **BC x=L:** Robin — convection: `−k ∂T/∂x = h (T − T_AMB)`
-
-No experimental data is needed. A `fdm/` module provides a reference numerical solution (FTCS explicit scheme) used for evaluation and visualization comparison.
-
-All physical and numerical parameters live in `config.py`.
-
-## Running
-
-Always activate the virtual environment first:
+## Commands
 
 ```bash
-source .venv/bin/activate
+source .venv/bin/activate        # always first
+
+python app.py                    # forward PINN: train / evaluate / visualize
+python fdm/app.py                # FDM reference solver menu
+python -m inverse.app            # inverse PINN menu (to be implemented)
+
+pytest                           # all tests
+pytest -m "not slow"             # skip full-training tests
+pytest tests/test_model.py       # single file
 ```
 
-**PINN:**
-```bash
-python app.py        # Train / Evaluate (L2 vs FDM) / Visualize
-```
-
-**FDM reference solver:**
-```bash
-python fdm/app.py    # Solve / Heatmap / Animation / Time-series
-```
-
-Saved artifacts (git-ignored): `models/best_heat_pinn_model.pth`, HTML plots in `animations/` and `animations/fdm/`.
-
-To retrain from scratch: `rm models/best_heat_pinn_model.pth` before running option 1.
-
-## Dependencies
-
-`requirements.txt` exists. Key packages: `torch`, `numpy`, `plotly`. No `pandas` or `scikit-learn` needed.
-
-GPU is auto-detected (`cuda` → `mps` → `cpu`) in `engine.py:_get_device()`.
+Delete `models/best_heat_pinn_model.pth` to retrain from scratch.
 
 ## Architecture
 
 ```
-config.py          ← all physical + numerical parameters (edit here to change the problem)
-app.py             ← PINN CLI menu
-model.py           ← HeatPINN + heat_pinn_loss()
-engine.py          ← data sampling, Adam+L-BFGS training, evaluation vs FDM, visualization call
-visualizer.py      ← PINN vs FDM: heatmap, animated T(x), time-series at fixed points
-fdm/
-  solver.py        ← FTCS explicit scheme, Robin on both ends, point source at X_SRC
-  visualizer.py    ← same 3 plot types for FDM-only output
-  app.py           ← FDM CLI menu
+config.py          ← all physical + numerical parameters
+model.py           ← HeatPINN (5-layer FC) + heat_pinn_loss()
+engine.py          ← prepare_data(), train_model(), evaluate_model()
+app.py             ← forward PINN CLI
+visualizer.py      ← PINN vs FDM plots (Plotly HTML)
+fdm/solver.py      ← FTCS explicit scheme, returns T_matrix[NX, NT]
+inverse/           ← inverse PINN (to implement — see plan below)
+tests/             ← pytest suite (42 tests); conftest.py has device, small_data, pinn_model fixtures
 ```
 
-### Neural Network (`model.py`)
+## Key design decisions
 
-`HeatPINN`: 5-layer fully connected, input `(x, t)` → output `T`.
-
-**Output scaling** — the network predicts a dimensionless perturbation; the `forward()` applies:
+**Output scaling** (`model.py:forward`):
 ```
-T = T_AMB + (Q_VAL · L / K) · net(x, t)
+T = T_AMB + (Q_VAL · L / K) · net(x_norm, t_norm)
 ```
-This keeps `net` outputs in `[0, 1]` range and ensures gradients `∂T/∂x` are O(1) for the network to learn. Do not remove this scaling.
+This keeps net outputs in [0,1] and ∂T/∂x at O(1). Do not remove.
 
-`heat_pinn_loss()` normalizes all four loss terms to O(1) using `T_char = Q_VAL·L/K` and `grad_char = (Q_VAL/K)²`. The PDE residual includes the Gaussian-smoothed source term (σ=0.02) as a continuous approximation to δ(x − X_SRC). Changing physical parameters in `config.py` does not require re-tuning loss weights.
+**Loss normalization** (`model.py:heat_pinn_loss`): all four terms are scaled to O(1) via `_T_char = Q_VAL·L/K` and `_bc_scale`. Changing physical params in `config.py` does not require retuning weights.
 
-### Training (`engine.py`)
+**Collocation clustering** (`engine.py:prepare_data`): 25% extra points near `X_SRC` (source gradient) and `T_STEP` (flux discontinuity). First lever to pull if accuracy is poor: increase `N_F`.
 
-`prepare_data()` samples collocation points with **deliberate clustering**: extra points near `x=X_SRC` (steep gradient at source) and around `t=T_STEP` (flux step discontinuity). Increasing `N_f` / `N_bc` here is the first lever to pull if accuracy is low.
+**Training sequence**: Adam (early stopping + ReduceLROnPlateau) → L-BFGS fine-tuning. L-BFGS uses a `_last` closure dict to capture loss components without double-calling the loss outside a grad context.
 
-`train_model()` runs **Adam first, then L-BFGS fine-tuning**. L-BFGS uses a closure that captures loss components in `_last` dict (avoids calling `heat_pinn_loss` outside an active grad context).
+**FDM Robin BCs** (`fdm/solver.py`): implicit-like update `T[0] = (T[1] + robin_coeff·T_amb) / (1 + robin_coeff)`. Point source added after BCs: `T[i_src] += Q·α·dt/(k·dx)`.
 
-`evaluate_model()` runs the FDM solver and downsamples its `(NX, NT)` output to the PINN prediction grid `(100, 100)` for L2 comparison.
+---
 
-### FDM Solver (`fdm/solver.py`)
+## Inverse PINN — implementation plan
 
-Returns `(T_matrix[NX, NT], x_vals, t_vals)`. Uses:
-- Robin BC on both ends: `T[0] = (T[1] + robin_coeff·T_amb) / (1 + robin_coeff)`
-- Point source injected at node `i_src = argmin|x - X_SRC|` after BCs: `T[i_src] += Q·α·dt/(k·dx)`
-- CFL check at startup (warns, does not crash)
+Goal: identify unknown physical parameters (`ALPHA`, `K`, `H_CONV`) from sparse noisy temperature measurements. The network learns T(x,t) and the physics parameters simultaneously.
 
-### Loss Scaling Notes
+### Files to create (in order)
 
-If you change `Q_VAL`, `K`, `H_CONV`, or `L` in `config.py`, the normalization in `heat_pinn_loss()` adjusts automatically. If losses diverge, check that `T_char = Q_VAL·L/K` is not near zero.
+**`inverse/config_inverse.py`**
+- `N_SENSORS`, `SENSOR_POSITIONS` (list of x positions)
+- `NOISE_STD` — Gaussian noise std on measurements [°C]
+- `IDENTIFY = ['alpha', 'k', 'h_conv']`
+- `ALPHA_INIT`, `K_INIT`, `H_CONV_INIT` — initial guesses (2–5× off from true values)
+- `EPOCHS_INV`, `LR_ADAM_INV`, `W_DATA = 10.0`
+- `MODELS_DIR`, `DATA_PATH`
+
+**`inverse/data.py`**
+- `generate_measurements(noise_std, sensor_positions)`: call `fdm.solver.solve()` with true params from `config.py`, sample at nearest FDM nodes, add noise, save to `inverse/data/measurements.csv` (columns: `x, t, T`)
+- `load_measurements(device)`: load CSV → tensors `(x_s, t_s, T_meas)` on device
+
+**`inverse/model.py`** — `InverseHeatPINN(nn.Module)`
+- Same 5-layer architecture as `HeatPINN`
+- Unknown params as log-space `nn.Parameter` (guarantees positivity without constraints):
+  ```python
+  self.log_alpha = nn.Parameter(torch.log(torch.tensor(ALPHA_INIT)))
+  self.log_k = nn.Parameter(torch.log(torch.tensor(K_INIT)))
+  self.log_h_conv = nn.Parameter(torch.log(torch.tensor(H_CONV_INIT)))
+  ```
+- Properties `alpha`, `k`, `h_conv` that return `exp(log_*)`
+- `forward()` uses same output scaling as `HeatPINN` but with `self.k` and `self.alpha`
+- Never `.detach()` the learned params inside the loss — gradients must flow through them
+
+**`inverse/loss.py`** — `inverse_heat_pinn_loss(..., x_s, t_s, T_meas)`
+- Same PDE/IC/BC structure as `heat_pinn_loss()` but uses `model.alpha`, `model.k`, `model.h_conv`
+- Normalization scales must be computed from the **current learned params** (not config constants), otherwise there is no gradient signal toward the physics params
+- Adds data fit term: `L_data = mean((T_pred(x_s, t_s) − T_meas)²) / T_char²`
+- Total: `w_pde·L_pde + w_ic·L_ic + w_bc·L_bc + w_data·L_data`
+
+**`inverse/engine.py`**
+- `prepare_data_inverse()`: same clustering strategy as `engine.prepare_data()`
+- `train_inverse(data, measurements)`: **Adam only** (no L-BFGS — unstable when physics params are learnable because loss curvature differs by orders of magnitude between network weights and physics params); print identified param values every 100 epochs
+- `evaluate_inverse(model)`: print table of true vs identified params with relative error %; also compute L2 error of T field vs FDM
+
+**`inverse/app.py`** — CLI menu: (1) Generate measurements, (2) Train, (3) Evaluate, (0) Exit
+
+**`inverse/__init__.py`** — empty
+
+### Pitfalls
+
+- If `W_DATA` is too high, BC/IC are ignored and the net overfits measurements (physics collapses)
+- Sensors far from x=0 and x=L → poor identification of `H_CONV` (weak boundary signal)
+- Do not resample sensor points each epoch — `(x_s, t_s, T_meas)` are fixed throughout training
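
The log-space parameterization planned for `inverse/model.py` can be illustrated without PyTorch. The sketch below is not part of the patch and is not the repository's implementation: it uses a hypothetical scalar model `y = k * x`, a made-up true value `K_TRUE`, and a hypothetical helper `fit_log_space`, only to show that optimizing `log(k)` keeps `k` positive while gradient descent recovers it from an initial guess that is 4× off.

```python
import math


def fit_log_space(xs, ys, k_init, lr=1e-3, steps=2000):
    """Gradient descent on log(k) for the toy model y = k * x (illustrative only)."""
    log_k = math.log(k_init)  # the optimized variable is log(k), so k = exp(log_k) > 0
    n = len(xs)
    for _ in range(steps):
        k = math.exp(log_k)
        # Derivative of the mean squared error mean((k*x - y)^2) with respect to k
        dloss_dk = sum(2.0 * (k * x - y) * x for x, y in zip(xs, ys)) / n
        # Chain rule: d(loss)/d(log_k) = d(loss)/dk * dk/d(log_k), and dk/d(log_k) = k
        log_k -= lr * dloss_dk * k
    return math.exp(log_k)


K_TRUE = 2.5                    # hypothetical "true" parameter, for illustration
xs = [1.0, 2.0, 3.0]            # hypothetical sample points
ys = [K_TRUE * x for x in xs]   # noise-free synthetic measurements

k_hat = fit_log_space(xs, ys, k_init=4.0 * K_TRUE)  # start 4x away from the true value
print(f"identified k = {k_hat:.6f}")
```

In the planned `InverseHeatPINN`, the same idea would presumably be expressed with `nn.Parameter` and autograd instead of a hand-written chain rule; the toy version only makes the positivity argument concrete.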