3. Basic Concepts

AI4Plasma is the world’s first AI library specifically designed for plasma physics simulation, combining state-of-the-art machine learning techniques with rigorous physics-based modeling. This guide introduces the fundamental concepts and architectural components that power AI4Plasma.

3.1. Core Architecture

3.1.1. Network Components

AI4Plasma provides flexible neural network building blocks optimized for scientific computing:

3.1.1.1. FNN (Fully Connected Neural Network)

Dense multi-layer perceptrons with:

Customizable depth and width
Optional batch normalization
Multiple weight initialization strategies (Xavier, zero initialization)
Tanh activation by default (suitable for smooth physics solutions)
Precise floating-point control (REAL precision)

Structure:

Input → Linear → [BN?] → Activation → ... → Linear (output)

3.1.1.2. CNN (Convolutional Neural Network)

1D/2D/3D convolution-based architectures with:

Flexible backbone with optional fully connected head
Adaptive global pooling strategies
Batch normalization and max/avg pooling options
Automatic dimension detection
Lazy initialization based on actual feature sizes

Structure:

Input → [Conv → BN? → Activation → Pool?] × N
     → Global Pool or Flatten
     → [FC → Activation?] × M → Output

3.1.1.3. RelaxFNN (Neural Architecture Search)

Relaxed architecture for automatic network design:

Soft selection over different architectural choices
Learnable architecture parameters optimized jointly with weights
Efficient search for problem-specific network structures
Discrete architecture extraction after search

3.1.2. Geometry System

The geometry system provides a unified interface for domain and boundary sampling across different problem types:

3.1.2.1. Base Classes

Geometry: Abstract base class defining the interface for all geometric domains
GeoTime: Temporal domain \([t_s, t_e]\)
Geo1D: 1D spatial domain \([x_l, x_u]\)
Geo1DTime: Space-time domain for 1D problems
GeoPoly2D: 2D polygonal domains
GeoRect2D: 2D rectangular domains
GeoPoly2DTime: Space-time domain for 2D problems

3.1.2.2. Sampling Strategies

Uniform: Evenly spaced grid sampling
Random: Uniform random distribution
LHS: Latin Hypercube Sampling (for efficient space-filling designs)

Example Usage:

from ai4plasma.piml.geo import Geo1DTime, SamplingMode

# Create a space-time domain
geo = Geo1DTime()
geo.create_domain(xl=0.0, xu=1.0, ts=0.0, te=1.0)

# Sample interior points
X_interior = geo.sample_domain(nx=100, nt=50, mode=SamplingMode.UNIFORM)

# Sample initial condition
X_ic = geo.sample_ic(nx=100, mode=SamplingMode.RANDOM)

3.2. Physics-Informed Machine Learning (PIML)

3.2.1. Physics-Informed Neural Networks (PINNs)

PINNs embed physical laws directly into neural network training through residual-based loss functions. Instead of requiring large labeled datasets, PINNs learn solutions by minimizing physics equation residuals.

3.2.1.1. Mathematical Formulation

For a PDE of the form:

\[\mathcal{F}[u](x, t) = 0, \quad x \in \Omega, t \in [0, T]\]

with boundary conditions \(\mathcal{B}[u] = 0\) and initial conditions \(u(x, 0) = u_0(x)\), a PINN minimizes:

\[\mathcal{L} = w_{pde}\mathcal{L}_{pde} + w_{bc}\mathcal{L}_{bc} + w_{ic}\mathcal{L}_{ic}\]

where:

\(\mathcal{L}_{pde} = \frac{1}{N_{pde}}\sum_{i=1}^{N_{pde}} |\mathcal{F}[u_\theta](x_i, t_i)|^2\)
\(\mathcal{L}_{bc} = \frac{1}{N_{bc}}\sum_{i=1}^{N_{bc}} |\mathcal{B}[u_\theta](x_i, t_i)|^2\)
\(\mathcal{L}_{ic} = \frac{1}{N_{ic}}\sum_{i=1}^{N_{ic}} |u_\theta(x_i, 0) - u_0(x_i)|^2\)

3.2.1.2. Key Features

The PINN framework in AI4Plasma provides:

Multi-Physics Support: Handle arbitrary coupled PDEs through EquationTerm abstraction
Automatic Differentiation: Compute spatial/temporal derivatives via PyTorch autograd
Adaptive Loss Weighting: Automatically balance competing physics constraints
Batch Training: Support for large datasets via DataLoader integration
Visualization Callbacks: Real-time monitoring with custom visualization functions
Checkpoint Management: Save and resume training with full state recovery
TensorBoard Integration: Comprehensive logging and monitoring

Example:

from ai4plasma.piml.pinn import PINN, EquationTerm

# Define PDE residual
def pde_residual(model, X):
    x, t = X[:, 0:1], X[:, 1:2]
    u = model(X)
    u_t = model.df_dt(X, u)
    u_xx = model.df_dxx(X, u, x_idx=0)
    return u_t - 0.01 * u_xx  # Heat equation

# Create equation term
pde_term = EquationTerm(
    name="pde",
    residual_fn=pde_residual,
    data=X_pde,
    weight=1.0
)

# Build and train PINN
pinn = PINN(model=net, equation_terms=[pde_term])
pinn.train(epochs=10000, lr=1e-3)

3.2.2. PINN Variants

AI4Plasma includes several advanced PINN variants for specialized applications:

3.2.2.1. CS-PINN (Coefficient-Subnet PINN)

Specialized for problems with complex material properties and temperature-dependent coefficients:

Automatic Boundary Enforcement: Network architecture guarantees boundary conditions by construction
Spline Interpolation: Handles temperature-dependent properties (thermal conductivity, heat capacity)
Gauss-Legendre Quadrature: Accurate computation of integral terms
Application: Arc discharge simulations with temperature-dependent plasma properties

Network Construction:

\[T(r) = (r - R) \cdot N_\theta(r) + T_b\]

This structure automatically satisfies \(T(R) = T_b\), reducing training complexity.

3.2.2.2. Meta-PINN

Enables rapid adaptation to new physics tasks through meta-learning:

Task Abstraction: Support/query split for few-shot learning
MAML Framework: Model-Agnostic Meta-Learning for PINN
Fast Adaptation: Quickly fine-tune to new parameters (current, geometry)
Application: Multi-current arc discharge, multi-geometry problems

Meta-Learning Workflow:

Meta-training: Sample task batch → adapt on support set → update on query set
Meta-testing: Initialize with meta-parameters → fine-tune on new task (few steps)

3.2.2.3. RK-PINN (Runge-Kutta PINN)

Incorporates Runge-Kutta time-stepping for improved temporal accuracy:

High-Order Time Integration: 4th-order Runge-Kutta scheme
Stage-by-Stage Training: Learn intermediate RK stages
Temporal Accuracy: Better handling of time-dependent dynamics
Application: Transient plasma phenomena, corona discharge

3.2.2.4. NAS-PINN (Neural Architecture Search PINN)

Automatically discovers optimal network architectures for specific physics problems:

Differentiable Architecture Search: Learn architecture parameters via gradient descent
Relaxed Architectures: Soft selection over architectural choices
Problem-Specific Optimization: Find best width/depth for target PDE
Application: Automated PINN design without manual hyperparameter tuning

3.3. Neural Operators

Neural operators learn mappings between infinite-dimensional function spaces, enabling fast evaluation of parametric PDEs and reducing computational cost of repeated simulations.

3.3.1. DeepONet (Deep Operator Network)

DeepONet learns nonlinear operators \(G: \mathcal{U} \to \mathcal{V}\) mapping input functions to output functions.

3.3.1.1. Architecture

DeepONet consists of two sub-networks:

Branch Network: Processes input functions \(u(x)\) (supports FNN and CNN)
Trunk Network: Processes evaluation coordinates \(y\)

Mathematical Formulation:

\[G(u)(y) \approx \sum_{i=1}^p b_i(u) \cdot t_i(y) + \text{bias}\]

where:

\(b_i(u)\): \(i\)-th basis function from branch network
\(t_i(y)\): \(i\)-th basis function from trunk network
\(p\): latent dimension

3.3.1.2. Key Features

Automatic Architecture Detection: FNN for 2D data, CNN for 4D image-like data
Flexible Data Splitting: By branch samples or trunk evaluation points
Distributed Training: Full DataLoader support for large-scale problems
Checkpoint Management: Resume training with full state recovery

Applications:

Parametric PDE solving (e.g., Poisson equation with varying boundary conditions)
Fast surrogate models for expensive simulations
Real-time physics predictions

3.3.1.3. Example:

from ai4plasma.operator.deeponet import DeepONetModel

model = DeepONetModel(
    branch_sizes=[100, 128, 128, 128],  # Branch network
    trunk_sizes=[2, 128, 128, 128],      # Trunk network
    basis_dim=128,                       # Latent dimension
    bias_output=True
)

model.train_model(
    train_loader=train_loader,
    epochs=1000,
    lr=1e-3
)

3.3.2. DeepCSNet (Deep Cross Section Network)

Specialized neural operator for predicting electron-impact cross sections in plasma physics.

3.3.2.1. Architecture

DeepCSNet employs a modular coefficient-subnet structure:

Molecule Net: Processes molecular features (for multi-molecule mode)
Energy Net: Processes incident electron energy
Trunk Net: Processes scattering angles and kinematics

3.3.2.2. Operation Modes

SMC (Single-Molecule Configuration): Energy Net + Trunk Net
- For single molecular species
MMC (Multi-Molecule Configuration): Molecule Net + Energy Net + Trunk Net
- For multiple molecular species simultaneously

3.3.2.3. Applications

Predicting doubly differential ionization cross sections (DDCS)
Total ionization cross sections
Fast cross section lookup for plasma kinetic simulations

Physical Relevance: Cross sections are fundamental to plasma modeling, determining collision rates, energy transfer, and species production. DeepCSNet provides orders-of-magnitude speedup compared to first-principles calculations.

3.4. Equation Terms and Loss Components

The EquationTerm class provides a flexible abstraction for physics constraints:

class EquationTerm:
    """Encapsulates a single physics constraint.
    
    Attributes
    ----------
    name : str
        Identifier for the constraint (e.g., 'pde', 'bc', 'ic')
    residual_fn : callable
        Function computing the physics residual
    data : torch.Tensor
        Collocation points for evaluating residual
    weight : float
        Loss weight for balancing multiple constraints
    """

Benefits:

Modular constraint definition
Dynamic weight updates during training
Easy addition/removal of physics terms
Support for data batching via DataLoader

3.5. Visualization and Monitoring

AI4Plasma provides comprehensive tools for monitoring training progress:

3.5.1. TensorBoard Integration

Automatic logging of:

Loss components (PDE, BC, IC losses)
Total loss evolution
Learning rate schedules
Custom metrics
Solution visualizations

3.5.2. Visualization Callbacks

Abstract base class VisualizationCallback enables custom real-time monitoring:

class MyCallback(VisualizationCallback):
    def __call__(self, model, epoch, writer):
        # Custom visualization logic
        fig = plot_solution(model)
        writer.add_figure('solution', fig, epoch)

Features:

Automatic figure logging to TensorBoard
Configurable callback frequency
Multiple independent visualizations
No modification to core training loop

3.6. Training Utilities

3.6.1. Checkpoint Management

Full state saving and resumption:

# Save checkpoint
model.save_checkpoint('checkpoint.pth', epoch=1000)

# Resume training
model = PINN.load_checkpoint('checkpoint.pth', model=net)
model.train(epochs=2000, lr=1e-4)  # Continue from epoch 1000

3.6.2. Adaptive Loss Weighting

Automatically balance competing loss terms to prevent one constraint from dominating:

pinn = PINN(
    model=net,
    equation_terms=terms,
    use_adaptive_weights=True  # Enable adaptive weighting
)

3.6.3. Learning Rate Scheduling

Support for PyTorch schedulers:

pinn.train(
    epochs=10000,
    lr=1e-3,
    scheduler=torch.optim.lr_scheduler.StepLR(optimizer, step_size=1000, gamma=0.9)
)

3.7. Plasma Physics Applications

AI4Plasma is specifically designed for plasma simulation challenges:

3.7.1. Arc Discharge Modeling

Steady-state and transient 1D cylindrical arc
Temperature-dependent properties (conductivity, heat capacity)
Ohmic heating, radiation losses, convection
Automatic boundary condition enforcement

3.7.2. Cross Section Prediction

Electron-impact ionization cross sections
Multi-molecule species support
Fast lookup for kinetic simulations

3.7.3. Corona Discharge

Time-dependent ionization dynamics
Runge-Kutta time integration
Photoionization and recombination

3.7.4. Parametric Studies

Meta-learning for fast parameter sweeps
Neural operators for real-time predictions
Uncertainty quantification

3.8. Typical Workflow

A typical AI4Plasma workflow consists of:

Problem Definition
- Define governing PDEs
- Specify boundary/initial conditions
- Set up computational domain (geometry)
Network Design
- Choose network architecture (FNN, CNN, custom)
- Set hyperparameters (depth, width, activation)
- Optionally use NAS-PINN for automatic design
Physics Encoding
- Implement residual functions
- Create equation terms with appropriate weights
- Set up sampling points (collocation points)
Training
- Initialize PINN/operator model
- Configure optimizer and scheduler
- Add visualization callbacks
- Train with TensorBoard monitoring
Validation & Deployment
- Compare with reference solutions or experiments
- Compute error metrics (L2 error, relative error)
- Export model for inference
- Use in larger simulation pipelines

3.9. References

PINNs: M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational Physics, vol. 378, pp. 686-707, 2019.
DeepONet: L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis, “Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators,” Nature Machine Intelligence, vol. 3, no. 3, pp. 218-229, 2021.
CS-PINN: L. Zhong, B. Wu, and Y. Wang, “Low-temperature plasma simulation based on physics-informed neural networks: Frameworks and preliminary applications,” Physics of Fluids, vol. 34, no. 8, p. 087116, 2022.
RK-PINN: L. Zhong, B. Wu, and Y. Wang, “Low-temperature plasma simulation based on physics-informed neural networks: Frameworks and preliminary applications,” Physics of Fluids, vol. 34, no. 8, p. 087116, 2022.
Meta-PINN: L. Zhong, B. Wu, and Y. Wang, “Accelerating physics-informed neural network based 1D arc simulation by meta learning,” Journal of Physics D: Applied Physics, vol. 56, p. 074006, 2023.
NAS-PINN: Y. Wang and L. Zhong, “NAS-PINN: Neural architecture search-guided physics-informed neural network for solving PDEs,” Journal of Computational Physics, vol. 496, p. 112603, 2024.
DeepCSNet: Y. Wang and L. Zhong, “DeepCSNet: a deep learning method for predicting electron-impact doubly differential ionization cross sections,” Plasma Sources Science and Technology, vol. 33, no. 10, p. 105012, 2024.

3.10. Next Steps

Explore the API Reference for detailed class and function documentation
Check out Examples for practical tutorials
Read the Developer Guide to understand the internals
Try the example scripts in the app/ directory