docs: update CLAUDE.md with inverse PINN implementation plan and simplified architecture

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 15:26:20 +02:00
parent 51bb8e5fc8
commit 213cd6fba5
+75 -72
@@ -6,97 +6,100 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
Write all git commit messages in Italian.
## Scope of Work
Work on the entire repository: `fdm/`, `model.py`, `engine.py`, `visualizer.py`, `app.py`, `config.py`. Help with improvements, bug fixes, and optimizations in any file of the project.
## Project Overview
**Heat Equation PINN** — A Physics-Informed Neural Network that solves the 1D time-varying heat equation with an internal point heat source:
```
∂T/∂t = α ∂²T/∂x² + (α/k) Q(t) δ(x − X_SRC),    x ∈ [0, L], t ∈ [0, T_END]
```
where `Q(t) = Q_VAL` if `t ≥ T_STEP` else `0`.
- **IC:** `T(x, 0) = T0` (uniform)
- **BC x=0:** Robin (convection): `k ∂T/∂x = h (T − T_AMB)`
- **BC x=L:** Robin (convection): `−k ∂T/∂x = h (T − T_AMB)`
No experimental data is needed. A `fdm/` module provides a reference numerical solution (FTCS explicit scheme) used for evaluation and visualization comparison.
All physical and numerical parameters live in `config.py`.
## Running
Always activate the virtual environment first:
## Commands
```bash
source .venv/bin/activate
source .venv/bin/activate # always first
python app.py # forward PINN: train / evaluate / visualize
python fdm/app.py # FDM reference solver menu
python -m inverse.app # inverse PINN menu (to be implemented)
pytest # all tests
pytest -m "not slow" # skip full-training tests
pytest tests/test_model.py # single file
```
**PINN:**
```bash
python app.py # Train / Evaluate (L2 vs FDM) / Visualize
```
**FDM reference solver:**
```bash
python fdm/app.py # Solve / Heatmap / Animation / Time-series
```
Saved artifacts (git-ignored): `models/best_heat_pinn_model.pth`, HTML plots in `animations/` and `animations/fdm/`.
To retrain from scratch: `rm models/best_heat_pinn_model.pth` before running option 1.
## Dependencies
`requirements.txt` exists. Key packages: `torch`, `numpy`, `plotly`. No `pandas` or `scikit-learn` needed.
GPU is auto-detected (`cuda` → `mps` → `cpu`) in `engine.py:_get_device()`.
Delete `models/best_heat_pinn_model.pth` to retrain from scratch.
## Architecture
```
config.py ← all physical + numerical parameters (edit here to change the problem)
app.py ← PINN CLI menu
model.py ← HeatPINN + heat_pinn_loss()
engine.py ← data sampling, Adam+L-BFGS training, evaluation vs FDM, visualization call
visualizer.py ← PINN vs FDM: heatmap, animated T(x), time-series at fixed points
fdm/
solver.py ← FTCS explicit scheme, Robin on both ends, point source at X_SRC
visualizer.py ← same 3 plot types for FDM-only output
app.py ← FDM CLI menu
config.py ← all physical + numerical parameters
model.py ← HeatPINN (5-layer FC) + heat_pinn_loss()
engine.py ← prepare_data(), train_model(), evaluate_model()
app.py ← forward PINN CLI
visualizer.py ← PINN vs FDM plots (Plotly HTML)
fdm/solver.py ← FTCS explicit scheme, returns T_matrix[NX, NT]
inverse/ ← inverse PINN (to implement — see plan below)
tests/ ← pytest suite (42 tests); conftest.py has device, small_data, pinn_model fixtures
```
### Neural Network (`model.py`)
## Key design decisions
`HeatPINN`: 5-layer fully connected, input `(x, t)` → output `T`.
**Output scaling** — the network predicts a dimensionless perturbation; the `forward()` applies:
**Output scaling** (`model.py:forward`):
```
T = T_AMB + (Q_VAL · L / K) · net(x, t)
T = T_AMB + (Q_VAL · L / K) · net(x_norm, t_norm)
```
This keeps `net` outputs in `[0, 1]` range and ensures gradients `∂T/∂x` are O(1) for the network to learn. Do not remove this scaling.
This keeps net outputs in [0,1] and ∂T/∂x at O(1). Do not remove.
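As a rough illustration of the output scaling above, the mapping between the dimensionless network output and the physical temperature can be sketched in plain Python. The numeric values are placeholders chosen for this example and are not taken from `config.py`; `scale_output` and `unscale_temperature` are hypothetical helper names.

```python
# Placeholder values standing in for the config.py constants (assumed, not real).
T_AMB = 20.0    # ambient temperature
Q_VAL = 1000.0  # source strength
L = 0.1         # domain length
K = 50.0        # thermal conductivity

# Characteristic temperature rise used by the scaling: Q_VAL * L / K.
T_CHAR = Q_VAL * L / K


def scale_output(net_out: float) -> float:
    """Map the dimensionless network output to a physical temperature."""
    return T_AMB + T_CHAR * net_out


def unscale_temperature(temperature: float) -> float:
    """Inverse mapping: physical temperature back to the network's O(1) range."""
    return (temperature - T_AMB) / T_CHAR


T_low = scale_output(0.0)   # net output 0 corresponds to ambient temperature
T_high = scale_output(1.0)  # net output 1 corresponds to T_AMB + T_CHAR
```

With this mapping, a temperature rise of order `T_CHAR` corresponds to a network output of order 1, which is the property the scaling is meant to preserve.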
`heat_pinn_loss()` normalizes all four loss terms to O(1) using `T_char = Q_VAL·L/K` and `grad_char = (Q_VAL/K)²`. The PDE residual includes the Gaussian-smoothed source term (σ=0.02) as a continuous approximation to δ(x − X_SRC). Changing physical parameters in `config.py` does not require re-tuning loss weights.
**Loss normalization** (`model.py:heat_pinn_loss`): all four terms are scaled to O(1) via `_T_char = Q_VAL·L/K` and `_bc_scale`. Changing physical params in `config.py` does not require retuning weights.
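The Gaussian-smoothed source term with σ=0.02 can be checked numerically: a normalized Gaussian centered at `X_SRC` should integrate to roughly 1 over the domain, like the delta it replaces. The sketch below is dependency-free; `X_SRC = 0.5` and `L = 1.0` are assumed values, and whether σ is expressed in physical or normalized coordinates in `model.py` is not stated here, so treat the units as an assumption too.

```python
import math

SIGMA = 0.02  # smoothing width quoted in the text
X_SRC = 0.5   # assumed source position
L = 1.0       # assumed domain length


def gaussian_delta(x: float, x_src: float = X_SRC, sigma: float = SIGMA) -> float:
    """Normalized Gaussian that approximates delta(x - x_src) as sigma -> 0."""
    z = (x - x_src) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate_midpoint(f, a: float, b: float, n: int = 10_000) -> float:
    """Composite midpoint rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Should be close to 1 as long as the source sits many sigmas from both ends.
area = integrate_midpoint(gaussian_delta, 0.0, L)
```

If the source were placed within a few σ of a boundary, part of the Gaussian would fall outside `[0, L]` and the injected heat would be underestimated.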
### Training (`engine.py`)
**Collocation clustering** (`engine.py:prepare_data`): 25% extra points near `X_SRC` (source gradient) and `T_STEP` (flux discontinuity). First lever to pull if accuracy is poor: increase `N_F`.
`prepare_data()` samples collocation points with **deliberate clustering**: extra points near `x=X_SRC` (steep gradient at source) and around `t=T_STEP` (flux step discontinuity). Increasing `N_f` / `N_bc` here is the first lever to pull if accuracy is low.
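One possible way to implement the clustering described above is sketched here with the standard library only. The even split of the extra 25% between the source and the step time, the Gaussian spread (`width_frac`), and the function name `sample_collocation` are assumptions made for this example; the real `prepare_data()` may distribute points differently.

```python
import random


def _clamp(value: float, low: float, high: float) -> float:
    return min(max(value, low), high)


def sample_collocation(n_f, length, t_end, x_src, t_step,
                       extra_frac=0.25, width_frac=0.05, seed=0):
    """Uniform collocation points plus extra points near x_src and t_step."""
    rng = random.Random(seed)
    # Base set: uniform over the whole space-time domain.
    points = [(rng.uniform(0.0, length), rng.uniform(0.0, t_end))
              for _ in range(n_f)]

    n_extra = int(extra_frac * n_f)
    n_near_source = n_extra // 2  # assumed 50/50 split of the extra budget

    # Extra points clustered in x around the source, uniform in t.
    for _ in range(n_near_source):
        x = _clamp(rng.gauss(x_src, width_frac * length), 0.0, length)
        points.append((x, rng.uniform(0.0, t_end)))

    # Extra points clustered in t around the step, uniform in x.
    for _ in range(n_extra - n_near_source):
        t = _clamp(rng.gauss(t_step, width_frac * t_end), 0.0, t_end)
        points.append((rng.uniform(0.0, length), t))

    return points


points = sample_collocation(n_f=1000, length=1.0, t_end=10.0, x_src=0.3, t_step=2.0)
```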
**Training sequence**: Adam (early stopping + ReduceLROnPlateau) → L-BFGS fine-tuning. L-BFGS uses a `_last` closure dict to capture loss components without double-calling the loss outside a grad context.
`train_model()` runs **Adam first, then L-BFGS fine-tuning**. L-BFGS uses a closure that captures the loss components in a `_last` dict (this avoids calling `heat_pinn_loss` outside an active grad context).
**FDM Robin BCs** (`fdm/solver.py`): implicit-like update `T[0] = (T[1] + robin_coeff·T_amb) / (1 + robin_coeff)`. Point source added after BCs: `T[i_src] += Q·α·dt/(k·dx)`.
`evaluate_model()` runs the FDM solver and downsamples its `(NX, NT)` output to the PINN prediction grid `(100, 100)` for L2 comparison.
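A minimal sketch of the downsampling and L2 comparison, assuming nearest-index selection on an evenly spaced grid (the actual `evaluate_model()` might interpolate instead). It uses nested lists so it runs without dependencies; `downsample`, `downsample_indices` and `relative_l2` are hypothetical helper names.

```python
import math


def downsample_indices(n_src: int, n_dst: int) -> list:
    """Evenly spaced integer indices into a length-n_src axis, endpoints included."""
    return [round(i * (n_src - 1) / (n_dst - 1)) for i in range(n_dst)]


def downsample(T_matrix, n_x: int = 100, n_t: int = 100):
    """Pick an (n_x, n_t) sub-grid out of a [NX][NT] nested list."""
    ix = downsample_indices(len(T_matrix), n_x)
    it = downsample_indices(len(T_matrix[0]), n_t)
    return [[T_matrix[i][j] for j in it] for i in ix]


def relative_l2(pred, ref) -> float:
    """Relative L2 error ||pred - ref|| / ||ref|| over a 2D grid."""
    num = sum((p - r) ** 2
              for row_p, row_r in zip(pred, ref)
              for p, r in zip(row_p, row_r))
    den = sum(r ** 2 for row in ref for r in row)
    return math.sqrt(num / den)


# Synthetic field with NX=201, NT=501, only to exercise the helpers.
T_fdm = [[i + 0.001 * j for j in range(501)] for i in range(201)]
T_coarse = downsample(T_fdm)
```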
---
### FDM Solver (`fdm/solver.py`)
## Inverse PINN — implementation plan
Returns `(T_matrix[NX, NT], x_vals, t_vals)`. Uses:
- Robin BC on both ends: `T[0] = (T[1] + robin_coeff·T_amb) / (1 + robin_coeff)`
- Point source injected at node `i_src = argmin|x - X_SRC|` after BCs: `T[i_src] += Q·α·dt/(k·dx)`
- CFL check at startup (warns, does not crash)
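The three ingredients listed above (Robin update, point-source injection, stability check) fit together roughly as in the following dependency-free sketch. It is not the code in `fdm/solver.py`: the function name and signature are invented, `robin_coeff = h·dx/k` is the value implied by a one-sided difference of the boundary condition, and the result is returned time-major (`history[n][i]`), so it would need transposing to match `T_matrix[NX, NT]`.

```python
import warnings


def ftcs_solve(nx, nt, length, t_end, alpha, k, h, t_amb, t0,
               q_val, t_step, x_src):
    """Explicit FTCS march for the 1D heat equation with Robin ends."""
    dx = length / (nx - 1)
    dt = t_end / (nt - 1)
    r = alpha * dt / dx ** 2

    # Stability check: warn but keep going, as described in the list above.
    if r > 0.5:
        warnings.warn(f"FTCS unstable: alpha*dt/dx^2 = {r:.3g} > 0.5")

    robin_coeff = h * dx / k  # assumed definition, see lead-in
    i_src = min(range(nx), key=lambda i: abs(i * dx - x_src))

    T = [t0] * nx
    history = [T[:]]
    for n in range(1, nt):
        new = T[:]
        # Interior nodes: forward in time, centered in space.
        for i in range(1, nx - 1):
            new[i] = T[i] + r * (T[i + 1] - 2.0 * T[i] + T[i - 1])
        # Robin (convective) update on both ends.
        new[0] = (new[1] + robin_coeff * t_amb) / (1.0 + robin_coeff)
        new[-1] = (new[-2] + robin_coeff * t_amb) / (1.0 + robin_coeff)
        # Point source injected after the BCs, active once t >= t_step.
        if n * dt >= t_step:
            new[i_src] += q_val * alpha * dt / (k * dx)
        T = new
        history.append(T[:])
    return history


history = ftcs_solve(nx=21, nt=200, length=1.0, t_end=1.0, alpha=1e-3, k=1.0,
                     h=10.0, t_amb=20.0, t0=20.0, q_val=100.0, t_step=0.5,
                     x_src=0.5)
```

All numeric arguments in the call are placeholders chosen to keep `alpha*dt/dx²` well below 0.5.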
Goal: identify unknown physical parameters (`ALPHA`, `K`, `H_CONV`) from sparse noisy temperature measurements. The network learns T(x,t) and the physics parameters simultaneously.
### Loss Scaling Notes
### Files to create (in order)
If you change `Q_VAL`, `K`, `H_CONV`, or `L` in `config.py`, the normalization in `heat_pinn_loss()` adjusts automatically. If losses diverge, check that `T_char = Q_VAL·L/K` is not near zero.
**`inverse/config_inverse.py`**
- `N_SENSORS`, `SENSOR_POSITIONS` (list of x positions)
- `NOISE_STD` — Gaussian noise std on measurements [°C]
- `IDENTIFY = ['alpha', 'k', 'h_conv']`
- `ALPHA_INIT`, `K_INIT`, `H_CONV_INIT` — initial guesses (25× off from true values)
- `EPOCHS_INV`, `LR_ADAM_INV`, `W_DATA = 10.0`
- `MODELS_DIR`, `DATA_PATH`
**`inverse/data.py`**
- `generate_measurements(noise_std, sensor_positions)`: call `fdm.solver.solve()` with true params from `config.py`, sample at nearest FDM nodes, add noise, save to `inverse/data/measurements.csv` (columns: `x, t, T`)
- `load_measurements(device)`: load CSV → tensors `(x_s, t_s, T_meas)` on device
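The sampling step of `generate_measurements` could look roughly like the sketch below. To stay self-contained it takes an already computed `T_matrix`, `x_vals` and `t_vals` as arguments instead of calling `fdm.solver.solve()`, whose exact signature is not shown here; `nearest_index` and `write_csv` are hypothetical helpers, and the CSV is written to any file-like handle.

```python
import csv
import io
import random


def nearest_index(values, target):
    """Index of the grid node closest to target."""
    return min(range(len(values)), key=lambda i: abs(values[i] - target))


def generate_measurements(T_matrix, x_vals, t_vals, sensor_positions,
                          noise_std, seed=0):
    """Sample T_matrix[NX][NT] at the nearest nodes and add Gaussian noise."""
    rng = random.Random(seed)
    rows = []
    for x_sensor in sensor_positions:
        i = nearest_index(x_vals, x_sensor)
        for j, t in enumerate(t_vals):
            noisy = T_matrix[i][j] + rng.gauss(0.0, noise_std)
            rows.append((x_vals[i], t, noisy))
    return rows


def write_csv(rows, handle):
    """Write rows with the column layout x, t, T."""
    writer = csv.writer(handle)
    writer.writerow(["x", "t", "T"])
    writer.writerows(rows)


# Tiny synthetic field, only to exercise the helpers.
x_vals = [0.0, 0.25, 0.5, 0.75, 1.0]
t_vals = [0.0, 1.0, 2.0]
T_matrix = [[10.0 * i + j for j in range(3)] for i in range(5)]

rows = generate_measurements(T_matrix, x_vals, t_vals, [0.3, 0.9], noise_std=0.0)
buffer = io.StringIO()
write_csv(rows, buffer)
```

Fixing the seed keeps the noisy measurements reproducible across runs, which matters because the same `(x_s, t_s, T_meas)` set is reused for the whole training.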
**`inverse/model.py`** — `InverseHeatPINN(nn.Module)`
- Same 5-layer architecture as `HeatPINN`
- Unknown params as log-space `nn.Parameter` (guarantees positivity without constraints):
```python
self.log_alpha = nn.Parameter(torch.log(torch.tensor(ALPHA_INIT)))
self.log_k = nn.Parameter(torch.log(torch.tensor(K_INIT)))
self.log_h_conv = nn.Parameter(torch.log(torch.tensor(H_CONV_INIT)))
```
- Properties `alpha`, `k`, `h_conv` that return `exp(log_*)`
- `forward()` uses same output scaling as `HeatPINN` but with `self.k` and `self.alpha`
- Never `.detach()` the learned params inside the loss — gradients must flow through them
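Why the log-space parameterization guarantees positivity can be seen in a dependency-free toy: the optimizer updates `log_k` freely, while `k = exp(log_k)` can never cross zero. In the sketch below the gradient is written out by hand via the chain rule instead of using autograd, the quadratic loss is a stand-in for the PINN loss, and `K_TRUE` and `K_INIT` are invented values.

```python
import math

K_TRUE = 2.0   # hypothetical true parameter
K_INIT = 10.0  # hypothetical initial guess, 5x off


def loss_and_grad(log_k: float):
    """Toy loss (k - K_TRUE)^2 and its gradient with respect to log_k."""
    k = math.exp(log_k)
    loss = (k - K_TRUE) ** 2
    dloss_dk = 2.0 * (k - K_TRUE)
    # Chain rule: dk/dlog_k = k, so the gradient flows through exp().
    return loss, dloss_dk * k


log_k = math.log(K_INIT)
learning_rate = 0.01
trajectory = []
for _ in range(2000):
    _, grad = loss_and_grad(log_k)
    log_k -= learning_rate * grad
    trajectory.append(math.exp(log_k))

k_identified = math.exp(log_k)
```

Every value in `trajectory` stays strictly positive regardless of step size, which is the property the `log_alpha`, `log_k` and `log_h_conv` parameters rely on.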
**`inverse/loss.py`** — `inverse_heat_pinn_loss(..., x_s, t_s, T_meas)`
- Same PDE/IC/BC structure as `heat_pinn_loss()` but uses `model.alpha`, `model.k`, `model.h_conv`
- Normalization scales must be computed from the **current learned params** (not config constants), otherwise there is no gradient signal toward the physics params
- Adds data fit term: `L_data = mean((T_pred(x_s, t_s) − T_meas)²) / T_char²`
- Total: `w_pde·L_pde + w_ic·L_ic + w_bc·L_bc + w_data·L_data`
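The data term and the weighted total can be written out as a small sketch. It works on plain floats so it runs without PyTorch; in the real loss these would be tensors with gradients attached. The default `w_data=10.0` follows `W_DATA` in the plan above, while the other weights default to 1.0 purely as an assumption, and `t_char`, `data_loss`, `total_loss` are hypothetical names.

```python
def t_char(q_val: float, length: float, k: float) -> float:
    """Characteristic temperature, recomputed from the current estimate of k."""
    return q_val * length / k


def data_loss(T_pred, T_meas, T_char: float) -> float:
    """Mean squared misfit at the sensors, normalized by T_char^2."""
    n = len(T_meas)
    return sum((p - m) ** 2 for p, m in zip(T_pred, T_meas)) / n / T_char ** 2


def total_loss(L_pde, L_ic, L_bc, L_data,
               w_pde=1.0, w_ic=1.0, w_bc=1.0, w_data=10.0) -> float:
    """Weighted sum of the four normalized loss terms."""
    return w_pde * L_pde + w_ic * L_ic + w_bc * L_bc + w_data * L_data


# Illustrative numbers only.
scale = t_char(q_val=1000.0, length=0.1, k=50.0)
L_data = data_loss([21.0, 23.0], [20.0, 22.0], scale)
total = total_loss(0.1, 0.2, 0.3, L_data)
```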
**`inverse/engine.py`**
- `prepare_data_inverse()`: same clustering strategy as `engine.prepare_data()`
- `train_inverse(data, measurements)`: **Adam only** (no L-BFGS — unstable when physics params are learnable because loss curvature differs by orders of magnitude between network weights and physics params); print identified param values every 100 epochs
- `evaluate_inverse(model)`: print table of true vs identified params with relative error %; also compute L2 error of T field vs FDM
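The true-versus-identified table from `evaluate_inverse` might be formatted along these lines. The column layout, the helper names, and the numbers in the usage example are all invented for illustration; they are not results from the project.

```python
def relative_error_pct(true: float, identified: float) -> float:
    """Relative error in percent with respect to the true value."""
    return 100.0 * abs(identified - true) / abs(true)


def format_param_table(true_params: dict, identified_params: dict) -> str:
    """Plain-text table: one row per parameter, relative error in the last column."""
    lines = [f"{'param':<8}{'true':>12}{'identified':>14}{'rel. err %':>12}"]
    for name, true in true_params.items():
        ident = identified_params[name]
        err = relative_error_pct(true, ident)
        lines.append(f"{name:<8}{true:>12.4g}{ident:>14.4g}{err:>12.2f}")
    return "\n".join(lines)


# Made-up values, only to show the output shape.
true_params = {"alpha": 1.0e-4, "k": 50.0, "h_conv": 25.0}
identified_params = {"alpha": 1.1e-4, "k": 48.0, "h_conv": 30.0}
table = format_param_table(true_params, identified_params)
print(table)
```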
**`inverse/app.py`** — CLI menu: (1) Generate measurements, (2) Train, (3) Evaluate, (0) Exit
**`inverse/__init__.py`** — empty
### Pitfalls
- If `W_DATA` is too high, BC/IC are ignored and the net overfits measurements (physics collapses)
- Sensors far from x=0 and x=L → poor identification of `H_CONV` (weak boundary signal)
- Do not resample sensor points each epoch — `(x_s, t_s, T_meas)` are fixed throughout training