docs: update CLAUDE.md with inverse PINN implementation plan and simplified architecture
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -6,97 +6,100 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

Write all git commit messages in Italian.

## Scope of Work

Work across the whole repository: `fdm/`, `model.py`, `engine.py`, `visualizer.py`, `app.py`, `config.py`. Help with improvements, bug fixes, and optimizations in any file of the project.

## Project Overview

**Heat Equation PINN**: a Physics-Informed Neural Network that solves the 1D time-varying heat equation with an internal point heat source:

```
∂T/∂t = α ∂²T/∂x² + (α/k) Q(t) δ(x − X_SRC)    x ∈ [0, L], t ∈ [0, T_END]
```

where `Q(t) = Q_VAL` if `t ≥ T_STEP` else `0`.

- **IC:** `T(x, 0) = T0` (uniform)
- **BC x=0:** Robin (convection): `k ∂T/∂x = h (T − T_AMB)` (the outward normal points in −x, so the sign flips relative to x=L)
- **BC x=L:** Robin (convection): `−k ∂T/∂x = h (T − T_AMB)`

No experimental data is needed. A `fdm/` module provides a reference numerical solution (FTCS explicit scheme) used for evaluation and visualization comparison.

All physical and numerical parameters live in `config.py`.

## Running

Always activate the virtual environment first:
## Commands

```bash
source .venv/bin/activate
source .venv/bin/activate        # always first

python app.py                    # forward PINN: train / evaluate / visualize
python fdm/app.py                # FDM reference solver menu
python -m inverse.app            # inverse PINN menu (to be implemented)

pytest                           # all tests
pytest -m "not slow"             # skip full-training tests
pytest tests/test_model.py       # single file
```

**PINN:**
```bash
python app.py          # Train / Evaluate (L2 vs FDM) / Visualize
```

**FDM reference solver:**
```bash
python fdm/app.py      # Solve / Heatmap / Animation / Time-series
```

Saved artifacts (git-ignored): `models/best_heat_pinn_model.pth`, HTML plots in `animations/` and `animations/fdm/`.

To retrain from scratch: `rm models/best_heat_pinn_model.pth` before running option 1.

## Dependencies

`requirements.txt` exists. Key packages: `torch`, `numpy`, `plotly`. No `pandas` or `scikit-learn` needed.

GPU is auto-detected (`cuda` → `mps` → `cpu`) in `engine.py:_get_device()`.
Delete `models/best_heat_pinn_model.pth` to retrain from scratch.
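
The `cuda` → `mps` → `cpu` preference order mentioned above could be sketched roughly as follows. `pick_device` is a hypothetical helper, not the repository's `_get_device()`, and the fallback to `cpu` when `torch` cannot be imported is an assumption added so the snippet runs anywhere.

```python
def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu', in that order of preference."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch installed, report CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend only exists in recent PyTorch builds, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```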

## Architecture

```
config.py      ← all physical + numerical parameters (edit here to change the problem)
app.py         ← PINN CLI menu
model.py       ← HeatPINN + heat_pinn_loss()
engine.py      ← data sampling, Adam+L-BFGS training, evaluation vs FDM, visualization call
visualizer.py  ← PINN vs FDM: heatmap, animated T(x), time-series at fixed points
fdm/
  solver.py      ← FTCS explicit scheme, Robin on both ends, point source at X_SRC
  visualizer.py  ← same 3 plot types for FDM-only output
  app.py         ← FDM CLI menu
config.py      ← all physical + numerical parameters
model.py       ← HeatPINN (5-layer FC) + heat_pinn_loss()
engine.py      ← prepare_data(), train_model(), evaluate_model()
app.py         ← forward PINN CLI
visualizer.py  ← PINN vs FDM plots (Plotly HTML)
fdm/solver.py  ← FTCS explicit scheme, returns T_matrix[NX, NT]
inverse/       ← inverse PINN (to implement; see plan below)
tests/         ← pytest suite (42 tests); conftest.py has device, small_data, pinn_model fixtures
```

### Neural Network (`model.py`)
## Key design decisions

`HeatPINN`: 5-layer fully connected, input `(x, t)` → output `T`.

**Output scaling**: the network predicts a dimensionless perturbation; `forward()` applies:
**Output scaling** (`model.py:forward`):
```
T = T_AMB + (Q_VAL · L / K) · net(x, t)
T = T_AMB + (Q_VAL · L / K) · net(x_norm, t_norm)
```
This keeps `net` outputs in the `[0, 1]` range and ensures gradients `∂T/∂x` are O(1) for the network to learn. Do not remove this scaling.
This keeps net outputs in [0,1] and ∂T/∂x at O(1). Do not remove.
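
As a rough illustration of the output scaling above, the mapping between the dimensionless network output and the physical temperature might look like this. The numeric values are hypothetical placeholders (the real ones live in `config.py`), and `to_temperature` / `to_dimensionless` are names invented for this sketch.

```python
# Hypothetical values, chosen only so the arithmetic is easy to follow.
T_AMB, Q_VAL, L, K = 25.0, 1000.0, 0.1, 50.0

T_CHAR = Q_VAL * L / K  # characteristic temperature rise


def to_temperature(net_out: float) -> float:
    """Map a dimensionless network output to a temperature."""
    return T_AMB + T_CHAR * net_out


def to_dimensionless(T: float) -> float:
    """Inverse map, handy for checking that targets are O(1)."""
    return (T - T_AMB) / T_CHAR


print(to_temperature(0.5))
```

With these placeholder values `T_CHAR` is 2.0, so a network output of 0.5 corresponds to a one-degree rise above ambient.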

`heat_pinn_loss()` normalizes all four loss terms to O(1) using `T_char = Q_VAL·L/K` and `grad_char = (Q_VAL/K)²`. The PDE residual includes the Gaussian-smoothed source term (σ=0.02) as a continuous approximation to δ(x − X_SRC). Changing physical parameters in `config.py` does not require re-tuning loss weights.
**Loss normalization** (`model.py:heat_pinn_loss`): all four terms are scaled to O(1) via `_T_char = Q_VAL·L/K` and `_bc_scale`. Changing physical params in `config.py` does not require retuning weights.
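
The Gaussian-smoothed source mentioned above can be pictured with a small stand-alone sketch. It assumes a unit-length domain and a source at `x = 0.5` (both hypothetical), and it is not taken from `model.py`; whether σ=0.02 is expressed in physical or normalized units there is not stated here.

```python
import math

SIGMA = 0.02   # smoothing width quoted in the text
X_SRC = 0.5    # hypothetical source position on a unit-length domain


def smoothed_delta(x: float, x_src: float = X_SRC, sigma: float = SIGMA) -> float:
    """Normalized Gaussian that approximates delta(x - x_src)."""
    z = (x - x_src) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Midpoint-rule check that the kernel integrates to roughly 1 over [0, 1].
n = 10_000
dx = 1.0 / n
integral = sum(smoothed_delta((i + 0.5) * dx) for i in range(n)) * dx
print(integral)
```

Because the kernel carries unit area, the total injected heat matches that of the ideal point source while the PDE residual stays differentiable.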

### Training (`engine.py`)
**Collocation clustering** (`engine.py:prepare_data`): 25% extra points near `X_SRC` (source gradient) and `T_STEP` (flux discontinuity). First lever to pull if accuracy is poor: increase `N_F`.

`prepare_data()` samples collocation points with **deliberate clustering**: extra points near `x=X_SRC` (steep gradient at source) and around `t=T_STEP` (flux step discontinuity). Increasing `N_f` / `N_bc` here is the first lever to pull if accuracy is low.
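
A minimal sketch of the clustering idea, using only the standard library. The function name, the Gaussian spread (`width`), and the even split of the extra 25% between the source position and the switch-on time are all assumptions for illustration; the real `prepare_data()` may distribute points differently.

```python
import random


def sample_collocation(n_f, L, t_end, x_src, t_step,
                       extra_frac=0.25, width=0.05, seed=0):
    """Uniform (x, t) points plus extra points clustered near x_src and t_step."""
    rng = random.Random(seed)

    def clamp(v, hi):
        return min(max(v, 0.0), hi)

    pts = [(rng.uniform(0.0, L), rng.uniform(0.0, t_end)) for _ in range(n_f)]
    n_extra = int(extra_frac * n_f)
    # Half of the extra points cluster around the source position in x ...
    for _ in range(n_extra // 2):
        pts.append((clamp(rng.gauss(x_src, width * L), L), rng.uniform(0.0, t_end)))
    # ... the rest cluster around the switch-on time in t.
    for _ in range(n_extra - n_extra // 2):
        pts.append((rng.uniform(0.0, L), clamp(rng.gauss(t_step, width * t_end), t_end)))
    return pts


points = sample_collocation(n_f=1000, L=1.0, t_end=10.0, x_src=0.5, t_step=2.0)
print(len(points))
```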

**Training sequence**: Adam (early stopping + ReduceLROnPlateau) → L-BFGS fine-tuning. L-BFGS uses a `_last` closure dict to capture loss components without double-calling the loss outside a grad context.

`train_model()` runs **Adam first, then L-BFGS fine-tuning**. L-BFGS uses a closure that captures loss components in a `_last` dict (avoids calling `heat_pinn_loss` outside an active grad context).
**FDM Robin BCs** (`fdm/solver.py`): implicit-like update `T[0] = (T[1] + robin_coeff·T_amb) / (1 + robin_coeff)`. Point source added after BCs: `T[i_src] += Q·α·dt/(k·dx)`.

`evaluate_model()` runs the FDM solver and downsamples its `(NX, NT)` output to the PINN prediction grid `(100, 100)` for L2 comparison.

---

### FDM Solver (`fdm/solver.py`)
## Inverse PINN: implementation plan

Returns `(T_matrix[NX, NT], x_vals, t_vals)`. Uses:
- Robin BC on both ends: `T[0] = (T[1] + robin_coeff·T_amb) / (1 + robin_coeff)`
- Point source injected at node `i_src = argmin|x - X_SRC|` after BCs: `T[i_src] += Q·α·dt/(k·dx)`
- CFL check at startup (warns, does not crash)
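
The three ingredients listed above (Robin update, point-source injection, stability warning) can be combined into one explicit time step. This is a pure-Python sketch rather than the code in `fdm/solver.py`; in particular `robin_coeff = h·dx/k` is an assumed definition, and the warning threshold uses the standard FTCS stability limit `α·dt/dx² ≤ 1/2`.

```python
def ftcs_step(T, alpha, k, h, T_amb, dx, dt, q, i_src):
    """Advance the temperature list T by one explicit FTCS step."""
    r = alpha * dt / dx**2
    if r > 0.5:
        # Warn instead of raising, mirroring the "warns, does not crash" behaviour.
        print(f"warning: alpha*dt/dx^2 = {r:.3f} > 0.5, scheme may be unstable")
    new = T[:]
    # Interior nodes: forward in time, centered in space.
    for i in range(1, len(T) - 1):
        new[i] = T[i] + r * (T[i + 1] - 2.0 * T[i] + T[i - 1])
    # Robin (convective) boundaries; c = h*dx/k is an assumed robin_coeff.
    c = h * dx / k
    new[0] = (new[1] + c * T_amb) / (1.0 + c)
    new[-1] = (new[-2] + c * T_amb) / (1.0 + c)
    # Point source injected after the boundary update.
    new[i_src] += q * alpha * dt / (k * dx)
    return new


T0 = [20.0] * 11
params = dict(alpha=1e-4, k=50.0, h=10.0, T_amb=20.0, dx=0.01, dt=0.1, i_src=5)
T_off = ftcs_step(T0, q=0.0, **params)      # source off: field stays at ambient
T_on = ftcs_step(T0, q=1000.0, **params)    # source on: only node i_src warms up
```

With the source switched off and the rod at ambient temperature, nothing changes; with it on, the first step raises only the source node, by `q·α·dt/(k·dx)`.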

Goal: identify unknown physical parameters (`ALPHA`, `K`, `H_CONV`) from sparse noisy temperature measurements. The network learns T(x,t) and the physics parameters simultaneously.

### Loss Scaling Notes
### Files to create (in order)

If you change `Q_VAL`, `K`, `H_CONV`, or `L` in `config.py`, the normalization in `heat_pinn_loss()` adjusts automatically. If losses diverge, check that `T_char = Q_VAL·L/K` is not near zero.

**`inverse/config_inverse.py`**
- `N_SENSORS`, `SENSOR_POSITIONS` (list of x positions)
- `NOISE_STD`: Gaussian noise std on measurements [°C]
- `IDENTIFY = ['alpha', 'k', 'h_conv']`
- `ALPHA_INIT`, `K_INIT`, `H_CONV_INIT`: initial guesses (2–5× off from true values)
- `EPOCHS_INV`, `LR_ADAM_INV`, `W_DATA = 10.0`
- `MODELS_DIR`, `DATA_PATH`

**`inverse/data.py`**
- `generate_measurements(noise_std, sensor_positions)`: call `fdm.solver.solve()` with true params from `config.py`, sample at nearest FDM nodes, add noise, save to `inverse/data/measurements.csv` (columns: `x, t, T`)
- `load_measurements(device)`: load CSV → tensors `(x_s, t_s, T_meas)` on device
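
The measurement step planned for `inverse/data.py` (nearest-node sampling, additive Gaussian noise, `x, t, T` columns) might be prototyped along these lines. The helper names and the tiny hand-written temperature field are placeholders; a real run would take `T`, `x_vals`, and `t_vals` from the FDM solver and write to disk instead of to a string.

```python
import csv
import io
import random


def sample_measurements(T, x_vals, t_vals, sensor_positions, noise_std, seed=0):
    """Sample T[i][j] (space i, time j) at the node nearest each sensor, plus noise."""
    rng = random.Random(seed)
    rows = []
    for xs in sensor_positions:
        i = min(range(len(x_vals)), key=lambda n: abs(x_vals[n] - xs))
        for j, t in enumerate(t_vals):
            rows.append((x_vals[i], t, T[i][j] + rng.gauss(0.0, noise_std)))
    return rows


def to_csv(rows):
    """Serialize rows using the x, t, T column layout."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["x", "t", "T"])
    writer.writerows(rows)
    return buf.getvalue()


# Tiny stand-in for the FDM output: 3 nodes, 2 time levels.
x_vals, t_vals = [0.0, 0.5, 1.0], [0.0, 1.0]
T = [[20.0, 21.0], [20.0, 25.0], [20.0, 21.0]]
rows = sample_measurements(T, x_vals, t_vals, sensor_positions=[0.45], noise_std=0.0)
print(to_csv(rows))
```

Fixing the random seed keeps the synthetic data set reproducible, which matters because the sensor samples are meant to stay fixed during training.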

**`inverse/model.py`**: `InverseHeatPINN(nn.Module)`
- Same 5-layer architecture as `HeatPINN`
- Unknown params as log-space `nn.Parameter` (guarantees positivity without constraints):
  ```python
  self.log_alpha  = nn.Parameter(torch.log(torch.tensor(ALPHA_INIT)))
  self.log_k      = nn.Parameter(torch.log(torch.tensor(K_INIT)))
  self.log_h_conv = nn.Parameter(torch.log(torch.tensor(H_CONV_INIT)))
  ```
- Properties `alpha`, `k`, `h_conv` that return `exp(log_*)`
- `forward()` uses same output scaling as `HeatPINN` but with `self.k` and `self.alpha`
- Never `.detach()` the learned params inside the loss: gradients must flow through them
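
To see why the log-space parameterization keeps a parameter positive, here is a toy gradient descent without PyTorch. The quadratic loss, learning rate, and starting guess are made up for the illustration; the point is only that updates act on `theta = log(alpha)` and the chain rule contributes a factor of `alpha`.

```python
import math


def fit_log_space(alpha_init, alpha_true, lr=0.05, steps=500):
    """Gradient descent on theta = log(alpha) for the toy loss (alpha - alpha_true)^2."""
    theta = math.log(alpha_init)
    for _ in range(steps):
        alpha = math.exp(theta)            # always strictly positive
        grad_alpha = 2.0 * (alpha - alpha_true)
        theta -= lr * grad_alpha * alpha   # chain rule: d(alpha)/d(theta) = alpha
    return math.exp(theta)


# Start 5x away from the target, in the spirit of the 2-5x initial guesses above.
alpha_hat = fit_log_space(alpha_init=5.0, alpha_true=1.0)
print(alpha_hat)
```

However large a step is, `exp(theta)` cannot cross zero, so no clamping or projection is needed.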

**`inverse/loss.py`**: `inverse_heat_pinn_loss(..., x_s, t_s, T_meas)`
- Same PDE/IC/BC structure as `heat_pinn_loss()` but uses `model.alpha`, `model.k`, `model.h_conv`
- Normalization scales must be computed from the **current learned params** (not config constants), otherwise there is no gradient signal toward the physics params
- Adds data fit term: `L_data = mean((T_pred(x_s, t_s) − T_meas)²) / T_char²`
- Total: `w_pde·L_pde + w_ic·L_ic + w_bc·L_bc + w_data·L_data`
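
The data-fit term and the weighted total from the list above reduce to a few lines of arithmetic. This sketch works on plain Python lists rather than tensors; the default weights of `1.0` for the PDE, IC, and BC terms are assumptions, while `w_data=10.0` mirrors `W_DATA` from the plan.

```python
def data_loss(T_pred, T_meas, T_char):
    """Mean squared misfit at the sensors, divided by T_char^2 to make it dimensionless."""
    n = len(T_meas)
    return sum((p - m) ** 2 for p, m in zip(T_pred, T_meas)) / n / T_char**2


def total_loss(l_pde, l_ic, l_bc, l_data,
               w_pde=1.0, w_ic=1.0, w_bc=1.0, w_data=10.0):
    """Weighted sum of the four loss terms."""
    return w_pde * l_pde + w_ic * l_ic + w_bc * l_bc + w_data * l_data


l_data = data_loss([21.0, 24.0], [20.0, 26.0], T_char=2.0)
total = total_loss(0.1, 0.2, 0.3, l_data)
print(l_data, total)
```

In the inverse setting `T_char` would be recomputed from the current learned parameters on every call, as the plan requires.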

**`inverse/engine.py`**
- `prepare_data_inverse()`: same clustering strategy as `engine.prepare_data()`
- `train_inverse(data, measurements)`: **Adam only** (no L-BFGS: unstable when physics params are learnable because loss curvature differs by orders of magnitude between network weights and physics params); print identified param values every 100 epochs
- `evaluate_inverse(model)`: print table of true vs identified params with relative error %; also compute L2 error of T field vs FDM
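
The true-vs-identified table that `evaluate_inverse` is meant to print could be formatted roughly as below. The helper names are hypothetical and the numbers are made up purely to exercise the formatting; they are not results from any run.

```python
def relative_errors(true_params, identified):
    """Relative error in percent for each identified parameter."""
    return {name: 100.0 * abs(identified[name] - true) / abs(true)
            for name, true in true_params.items()}


def format_table(true_params, identified):
    """Build a fixed-width table of true vs identified values."""
    errs = relative_errors(true_params, identified)
    lines = [f"{'param':<8}{'true':>12}{'identified':>14}{'rel.err %':>12}"]
    for name, true in true_params.items():
        lines.append(
            f"{name:<8}{true:>12.4g}{identified[name]:>14.4g}{errs[name]:>12.2f}"
        )
    return "\n".join(lines)


# Illustrative placeholder values only.
true_params = {"alpha": 1.0e-4, "k": 50.0, "h_conv": 10.0}
identified = {"alpha": 1.1e-4, "k": 48.0, "h_conv": 12.5}
errors = relative_errors(true_params, identified)
print(format_table(true_params, identified))
```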

**`inverse/app.py`**: CLI menu: (1) Generate measurements, (2) Train, (3) Evaluate, (0) Exit

**`inverse/__init__.py`**: empty

### Pitfalls

- If `W_DATA` is too high, BC/IC are ignored and the net overfits measurements (physics collapses)
- Sensors far from x=0 and x=L → poor identification of `H_CONV` (weak boundary signal)
- Do not resample sensor points each epoch: `(x_s, t_s, T_meas)` are fixed throughout training