2. Neural Operators

This guide explains the operator-learning models under ai4plasma.operator. It focuses on the intuition, data organization, and training workflow for neural operators in AI4Plasma.

2.1. What is an Operator?

In many plasma problems, we want a model that maps an entire input function to an output function, rather than a single input vector to a single output value. For example:

Input: a spatially varying source term, boundary profile, or material property field
Output: a temperature field, potential field, or density field across a domain

Neural operators approximate this mapping directly, enabling fast evaluation of parametric PDE solutions without re-solving the PDE for each new input.
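
To make the idea concrete, the sketch below implements a conventional operator by hand: a finite-difference solver for the 1D steady heat equation -u'' = f with zero boundary values, which maps a sampled source term f to a sampled temperature field u. The equation, the grid, and the name solve_heat_1d are illustrative assumptions and not part of AI4Plasma. A neural operator would be trained to approximate a map of this kind so that the linear solve can be skipped at evaluation time.

```python
import numpy as np

def solve_heat_1d(f_interior, h):
    """Map a sampled source term f to the field u solving -u'' = f, u(0) = u(1) = 0."""
    n = f_interior.shape[0]
    # Standard second-order finite-difference Laplacian on the interior nodes.
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    return np.linalg.solve(A, f_interior)

x = np.linspace(0.0, 1.0, 101)
h = x[1] - x[0]
x_in = x[1:-1]  # interior nodes only; boundary values are fixed to zero

# Two different input functions give two different output functions.
u_a = solve_heat_1d(np.pi**2 * np.sin(np.pi * x_in), h)  # exact solution: sin(pi x)
u_b = solve_heat_1d(np.ones_like(x_in), h)               # exact solution: x (1 - x) / 2
```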

2.2. DeepONet 

DeepONet learns an operator \(\mathcal{G}\) that maps an input function \(u\) to an output function \(\mathcal{G}(u)\), whose value at a query coordinate \(y\) is \(\mathcal{G}(u)(y)\).

In AI4Plasma, ai4plasma.operator.deeponet.DeepONet is implemented as a branch network plus a trunk network:

Branch network: encodes the input function / parameterization (e.g., forcing term amplitude, boundary condition parameters, or a discretized field).
Trunk network: encodes the query coordinates (e.g., \(x\) for 1D, or \((x,y)\) for 2D).

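The outputs are combined via Einstein summation over the shared basis index \(i\), where \(b\) indexes the samples in the batch and \(n\) indexes the query points: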

\[ \text{out}_{b,n} = \sum_i \text{branch}_{b,i}\,\text{trunk}_{n,i} + \text{bias} \]
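
The contraction above can be reproduced in a few lines of NumPy. This is only a sketch: random arrays stand in for the branch and trunk network outputs, and the sizes and the scalar bias value are illustrative assumptions. AI4Plasma itself works with PyTorch tensors, as the numpy2torch calls in Section 2.5 indicate.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, num_points, basis_dim = 4, 64, 32

branch_out = rng.standard_normal((batch_size, basis_dim))  # one row per input function
trunk_out = rng.standard_normal((num_points, basis_dim))   # one row per query coordinate
bias = 0.1                                                  # scalar bias (assumed value)

# out[b, n] = sum_i branch_out[b, i] * trunk_out[n, i] + bias
out = np.einsum("bi,ni->bn", branch_out, trunk_out) + bias
```

The same result is given by branch_out @ trunk_out.T + bias; the einsum form simply makes the summed index explicit.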

2.2.1. Input / output shapes 

The implementation supports two typical branch modes:

FNN branch: branch_inputs is 2D: (batch_size, features)
CNN branch: branch_inputs is 4D: (batch_size, channels, height, width)

trunk_inputs typically has shape (num_points, coord_dim).

The output has shape (batch_size, num_points).

2.2.2. Data organization 

DeepONet training data typically contains:

branch_inputs: sampled input functions, either as vectors (FNN) or images (CNN)
trunk_inputs: coordinates where the output is evaluated
targets: ground-truth output values at those coordinates

In many applications, the same trunk_inputs grid is shared across all samples, while branch_inputs varies across cases.
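
As an illustration of this layout, the sketch below builds a small synthetic dataset for the antiderivative operator, with each input function sampled at fixed sensor locations as it would be for an FNN branch. The operator, the sensor count, and the family of input functions are assumptions chosen for the example; only the resulting array shapes follow the conventions described above.

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples, num_sensors, num_points = 100, 50, 64

sensors = np.linspace(0.0, 1.0, num_sensors)          # where each input function is sampled
y = np.linspace(0.0, 1.0, num_points).reshape(-1, 1)  # shared query grid

# Input functions u(x) = a * cos(w x) with random amplitude a and frequency w.
a = rng.uniform(0.5, 2.0, size=(num_samples, 1))
w = rng.uniform(1.0, 5.0, size=(num_samples, 1))

branch_inputs = a * np.cos(w * sensors)   # (num_samples, num_sensors)
trunk_inputs = y                          # (num_points, 1), identical for every sample
targets = (a / w) * np.sin(w * y.T)       # (num_samples, num_points): integral of u from 0 to y
```

Row k of targets holds the output function for row k of branch_inputs, evaluated on the shared grid in trunk_inputs.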

2.2.3. Training wrapper 

ai4plasma.operator.deeponet.DeepONetModel provides a ready-to-use training loop with:

configurable optimizer / scheduler
TensorBoard logging
checkpointing and resuming

See also: the training guide in guides/training.md.

2.2.4. Practical tips 

Normalize both inputs and outputs when possible (especially for multi-physics datasets).
Ensure the branch and trunk output dimensions match the chosen basis dimension.
If the output field is smooth, use tanh activations and moderate depth; for sharp features, consider deeper networks or a larger basis dimension.
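
One common way to normalize is z-score scaling, with statistics computed on the training split only and reused for every other split. The Standardizer class below is a hypothetical helper, not an AI4Plasma API. It is a minimal sketch of the idea, including the inverse transform needed to map scaled values back to physical units.

```python
import numpy as np

class Standardizer:
    """Z-score scaler fitted on training data (hypothetical helper)."""

    def __init__(self, data, eps=1e-8):
        self.mean = data.mean(axis=0, keepdims=True)
        self.std = data.std(axis=0, keepdims=True) + eps  # eps avoids division by zero

    def transform(self, data):
        return (data - self.mean) / self.std

    def inverse(self, data):
        return data * self.std + self.mean

rng = np.random.default_rng(0)
train_inputs = rng.uniform(1.0, 10.0, size=(80, 3))
test_inputs = rng.uniform(1.0, 10.0, size=(20, 3))

scaler = Standardizer(train_inputs)           # statistics from the training split only
train_scaled = scaler.transform(train_inputs)
test_scaled = scaler.transform(test_inputs)   # same statistics reused for the test split
```

A second Standardizer fitted on the training targets can be used in the same way for the outputs, with inverse applied to model predictions.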

2.3. DeepCSNet 

DeepCSNet (ai4plasma.operator.deepcsnet.DeepCSNet) is a specialized operator-like architecture for cross-section prediction.

It separates inputs into coefficient subnets and a trunk subnet:

Molecule Net (optional): molecular descriptors (multi-molecule mode)
Energy Net (optional): incident energy features
Trunk Net (required): output coordinates (angles, ejected energies, etc.)

The final prediction is computed via tensor contraction (einsum), similar to DeepONet.

2.3.1. Modes 

SMC (single-molecule configuration): Energy Net + Trunk Net
MMC (multi-molecule configuration): Molecule Net (+ optional Energy Net) + Trunk Net
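
The two modes can be pictured with plain arrays. The sketch below uses random NumPy arrays in place of the subnet outputs and assumes that, in MMC, the Molecule Net and Energy Net outputs are concatenated along the feature axis before being contracted with the trunk output. That combination rule and all sizes are assumptions for illustration and may differ from the actual DeepCSNet implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_cases, num_coords = 5, 90

# SMC: Energy Net coefficients contracted with Trunk Net features.
energy_out = rng.standard_normal((num_cases, 16))
trunk_smc = rng.standard_normal((num_coords, 16))
pred_smc = np.einsum("bi,ni->bn", energy_out, trunk_smc)

# MMC (assumed rule): concatenate Molecule Net and Energy Net outputs, so the
# trunk width must equal the concatenated width 24 + 16 = 40.
molecule_out = rng.standard_normal((num_cases, 24))
coeff = np.concatenate([molecule_out, energy_out], axis=1)
trunk_mmc = rng.standard_normal((num_coords, coeff.shape[1]))
pred_mmc = np.einsum("bi,ni->bn", coeff, trunk_mmc)
```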

2.3.2. Data organization

DeepCSNet datasets often use a shared coordinate grid (angles, ejected energies) across samples. Organize the inputs so that the coefficient subnets capture case-specific variation while the trunk subnet captures evaluation coordinates.

2.3.3. Practical notes 

Ensure the output dimension of the trunk subnet matches the output dimension of the coefficient subnet (or the concatenated dimension in MMC).
When building datasets, the coordinate grid (trunk input) is often shared across all cases.
For MMC, be careful to align molecular descriptors with the corresponding output samples.

2.4. Losses and Metrics 

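Operator learning in AI4Plasma is typically supervised with mean-squared error on the output field values, where \(\hat{u}(y_i)\) is the prediction and \(u(y_i)\) the ground truth at query point \(y_i\) (here \(u\) denotes the output field, not the input function of Section 2.2):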

\[ \mathcal{L}_{data} = \frac{1}{N}\sum_{i=1}^N |\hat{u}(y_i) - u(y_i)|^2 \]

Common evaluation metrics include:

Relative \(L_2\) error: \(\|\hat{u}-u\|_2 / \|u\|_2\)
Mean absolute error (MAE)
Task-specific physics diagnostics (e.g., integral quantities, extrema, or conservation checks)
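
The first two metrics are straightforward to compute with NumPy. The function names below are illustrative and are not taken from AI4Plasma; the test field is a synthetic sine profile with a uniform 1% overshoot in the prediction.

```python
import numpy as np

def relative_l2_error(pred, true):
    """||pred - true||_2 / ||true||_2, taken over all entries."""
    return np.linalg.norm(pred - true) / np.linalg.norm(true)

def mean_absolute_error(pred, true):
    """Average of |pred - true| over all entries."""
    return np.mean(np.abs(pred - true))

x = np.linspace(-1.0, 1.0, 64)
true = np.sin(np.pi * x)
pred = 1.01 * true  # prediction with a uniform 1% overshoot

rel_l2 = relative_l2_error(pred, true)  # approximately 0.01 for this example
mae = mean_absolute_error(pred, true)
```

The relative error is scale-free, which makes it easier to compare across fields with different magnitudes than MAE.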

If you want physics constraints during operator training, consider combining operator models with PINN-style residual losses for hybrid supervision.

2.5. Minimal working example (DeepONet)

This mirrors the scripts in app/operator/deeponet/:

import numpy as np
import torch.nn as nn

from ai4plasma.config import DEVICE, REAL
from ai4plasma.utils.device import check_gpu
from ai4plasma.utils.common import set_seed, numpy2torch
from ai4plasma.core.network import FNN
from ai4plasma.operator.deeponet import DeepONet, DeepONetModel

# Fix the random seed; use device 0 when check_gpu() reports a GPU, otherwise pass -1
set_seed(2023)
DEVICE.set_device(0 if check_gpu() else -1)

# Both networks end in width 32, so their outputs can be contracted
branch_net = FNN([1, 32, 32, 32], act_fun=nn.Tanh())
trunk_net = FNN([1, 32, 32, 32], act_fun=nn.Tanh())

net = DeepONet(branch_net, trunk_net)
model = DeepONetModel(net)

# Branch inputs: 10 amplitudes v, shape (10, 1)
v = np.linspace(1.0, 10.0, 10, dtype=REAL()).reshape(-1, 1)
# Trunk inputs: 64 query points x in [-1, 1], shape (64, 1)
x = np.linspace(-1, 1, 64, dtype=REAL()).reshape(-1, 1)
# Targets u = v * sin(pi * x), shape (10, 64) by broadcasting
u = v * np.sin(np.pi * x.T)

model.prepare_train_data(numpy2torch(v), numpy2torch(x), numpy2torch(u))
model.train(num_epochs=10000, lr=1e-4)

2.6. Common pitfalls 

Mismatched dimensions: ensure branch output dimension equals trunk output dimension (or the concatenation in multi-branch setups).
Overfitting on small datasets: use early stopping, weight decay, or data augmentation.
Inconsistent scaling: apply consistent normalization across train/validation/test splits.
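
As a sketch of the early-stopping idea mentioned above, the helper below tracks the best validation loss and signals a stop after a fixed number of checks without improvement. EarlyStopping is a hypothetical, framework-independent class; this guide does not state whether DeepONetModel provides an equivalent option, so treat it as a generic pattern and not as an AI4Plasma API.

```python
class EarlyStopping:
    """Stop when validation loss has not improved for `patience` checks (hypothetical helper)."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_checks = 0

    def step(self, val_loss):
        """Record a validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience

# Synthetic validation curve that bottoms out and then rises (overfitting).
val_losses = [1.0, 0.6, 0.4, 0.35, 0.36, 0.38, 0.41, 0.45]
stopper = EarlyStopping(patience=3)
stop_epoch = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(loss):
        stop_epoch = epoch
        break
```

In practice the model weights from the epoch with the best validation loss would be restored after stopping.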

2.7. Where to go next 

See the training workflow in guides/training.md
Explore PINN-based alternatives in guides/piml.md
Check scripts in app/operator/ for practical end-to-end examples