# Neural Operators

This guide explains the operator-learning models under `ai4plasma.operator`. It focuses on the intuition, data organization, and training workflow for neural operators in AI4Plasma.

```{contents}
:local:
:depth: 2
```

## What is an Operator?

In many plasma problems, we want a model that maps an *entire input function* to an *output function*, not just a single input to a single output. For example:

- Input: a spatially varying source term, boundary profile, or material property field
- Output: a temperature field, potential field, or density field across a domain

Neural operators approximate this mapping directly, enabling fast evaluation of parametric PDEs without solving the PDE from scratch each time.

## DeepONet

DeepONet learns an operator $\mathcal{G}$ that maps an input function $u$ to an output function $\mathcal{G}(u)(y)$.

In AI4Plasma, `ai4plasma.operator.deeponet.DeepONet` is implemented as a **branch network** + **trunk network**:

- Branch network: encodes the *input function / parameterization* (e.g., forcing term amplitude, boundary condition parameters, or a discretized field).
- Trunk network: encodes the *query coordinates* (e.g., $x$ for 1D, or $(x,y)$ for 2D).

The outputs are combined via Einstein summation:

$$
\text{out}_{b,n} = \sum_i \text{branch}_{b,i}\,\text{trunk}_{n,i} + \text{bias}
$$

### Input / output shapes

The implementation supports two typical branch modes:

- **FNN branch**: `branch_inputs` is 2D: `(batch_size, features)`
- **CNN branch**: `branch_inputs` is 4D: `(batch_size, channels, height, width)`

`trunk_inputs` is typically `(num_points, coord_dim)`.

The output has shape `(batch_size, num_points)`.

### Data organization

DeepONet training data typically contains:

- `branch_inputs`: sampled input functions, either as vectors (FNN) or images (CNN)
- `trunk_inputs`: coordinates where the output is evaluated
- `targets`: ground-truth output values at those coordinates

In many applications, the same `trunk_inputs` grid is shared across all samples, while `branch_inputs` varies across cases.

### Training wrapper

`ai4plasma.operator.deeponet.DeepONetModel` provides a ready-to-use training loop with:

- configurable optimizer / scheduler
- TensorBoard logging
- checkpointing and resuming

See also: the training guide in `guides/training.md`.

### Practical tips

- Normalize both inputs and outputs when possible (especially for multi-physics datasets).
- Ensure the branch and trunk output dimensions match the chosen basis dimension.
- If the output field is smooth, use `tanh` activations and moderate depth; for sharp features, consider deeper networks or richer basis dimension.

## DeepCSNet

DeepCSNet (`ai4plasma.operator.deepcsnet.DeepCSNet`) is a specialized operator-like architecture for cross-section prediction.

It separates inputs into **coefficient subnets** and a **trunk subnet**:

- Molecule Net (optional): molecular descriptors (multi-molecule mode)
- Energy Net (optional): incident energy features
- Trunk Net (required): output coordinates (angles, ejected energies, etc.)

The final prediction is computed via tensor contraction (einsum), similar to DeepONet.

### Modes

- **SMC** (single-molecule configuration): Energy Net + Trunk Net
- **MMC** (multi-molecule configuration): Molecule Net (+ optional Energy Net) + Trunk Net

### Data organization

DeepCSNet datasets often use a shared coordinate grid (angles, ejected energies) across samples. Organize the inputs so that the coefficient subnets capture *case-specific* variation while the trunk subnet captures *evaluation coordinates*.

### Practical notes

- Ensure the hidden dimension of trunk outputs matches the branch output dimension (or matches the concatenated dimension in MMC).
- When building datasets, the coordinate grid (trunk input) is often shared across all cases.
- For MMC, be careful to align molecular descriptors with the corresponding output samples.

## Losses and Metrics

Operator learning in AI4Plasma is typically supervised with mean-squared error on the output field values:

$$
\mathcal{L}_{data} = \frac{1}{N}\sum_{i=1}^N |\hat{u}(y_i) - u(y_i)|^2
$$

Common evaluation metrics include:

- Relative $L_2$ error: $\|\hat{u}-u\|_2 / \|u\|_2$
- Mean absolute error (MAE)
- Task-specific physics diagnostics (e.g., integral quantities, extrema, or conservation checks)

If you want physics constraints during operator training, consider combining operator models with PINN-style residual losses for hybrid supervision.

## Minimal working example (DeepONet)

This mirrors the scripts in `app/operator/deeponet/`:

```python
import numpy as np
import torch.nn as nn

from ai4plasma.config import DEVICE, REAL
from ai4plasma.utils.device import check_gpu
from ai4plasma.utils.common import set_seed, numpy2torch
from ai4plasma.core.network import FNN
from ai4plasma.operator.deeponet import DeepONet, DeepONetModel

set_seed(2023)
DEVICE.set_device(0 if check_gpu() else -1)

branch_net = FNN([1, 32, 32, 32], act_fun=nn.Tanh())
trunk_net = FNN([1, 32, 32, 32], act_fun=nn.Tanh())

net = DeepONet(branch_net, trunk_net)
model = DeepONetModel(net)

v = np.linspace(1.0, 10.0, 10, dtype=REAL()).reshape(-1, 1)
x = np.linspace(-1, 1, 64, dtype=REAL()).reshape(-1, 1)
u = v * np.sin(np.pi * x.T)

model.prepare_train_data(numpy2torch(v), numpy2torch(x), numpy2torch(u))
model.train(num_epochs=10000, lr=1e-4)
```

## Common pitfalls

- **Mismatched dimensions**: ensure branch output dimension equals trunk output dimension (or the concatenation in multi-branch setups).
- **Overfitting on small datasets**: use early stopping, weight decay, or data augmentation.
- **Inconsistent scaling**: apply consistent normalization across train/validation/test splits.

## Where to go next

- See the training workflow in [guides/training.md](training.md)
- Explore PINN-based alternatives in [guides/piml.md](piml.md)
- Check scripts in `app/operator/` for practical end-to-end examples