3. Basic Concepts

AI4Plasma is the world’s first AI library specifically designed for plasma physics simulation, combining state-of-the-art machine learning techniques with rigorous physics-based modeling. This guide introduces the fundamental concepts and architectural components that power AI4Plasma.

3.1. Core Architecture

3.1.1. Network Components

AI4Plasma provides flexible neural network building blocks optimized for scientific computing:

3.1.1.1. FNN (Fully Connected Neural Network)

Dense multi-layer perceptrons with:

  • Customizable depth and width

  • Optional batch normalization

  • Multiple weight initialization strategies (Xavier, zero initialization)

  • Tanh activation by default (suitable for smooth physics solutions)

  • Precise floating-point control (REAL precision)

Structure:

Input → Linear → [BN?] → Activation → ... → Linear (output)

3.1.1.2. CNN (Convolutional Neural Network)

1D/2D/3D convolution-based architectures with:

  • Flexible backbone with optional fully connected head

  • Adaptive global pooling strategies

  • Batch normalization and max/avg pooling options

  • Automatic dimension detection

  • Lazy initialization based on actual feature sizes

Structure:

Input → [Conv → BN? → Activation → Pool?] × N
     → Global Pool or Flatten
     → [FC → Activation?] × M → Output

3.1.2. Geometry System

The geometry system provides a unified interface for domain and boundary sampling across different problem types:

3.1.2.1. Base Classes

  • Geometry: Abstract base class defining the interface for all geometric domains

  • GeoTime: Temporal domain \([t_s, t_e]\)

  • Geo1D: 1D spatial domain \([x_l, x_u]\)

  • Geo1DTime: Space-time domain for 1D problems

  • GeoPoly2D: 2D polygonal domains

  • GeoRect2D: 2D rectangular domains

  • GeoPoly2DTime: Space-time domain for 2D problems

3.1.2.2. Sampling Strategies

  • Uniform: Evenly spaced grid sampling

  • Random: Uniform random distribution

  • LHS: Latin Hypercube Sampling (for efficient space-filling designs)

Example Usage:

from ai4plasma.piml.geo import Geo1DTime, SamplingMode

# Create a space-time domain
geo = Geo1DTime()
geo.create_domain(xl=0.0, xu=1.0, ts=0.0, te=1.0)

# Sample interior points
X_interior = geo.sample_domain(nx=100, nt=50, mode=SamplingMode.UNIFORM)

# Sample initial condition
X_ic = geo.sample_ic(nx=100, mode=SamplingMode.RANDOM)

3.2. Physics-Informed Machine Learning (PIML)

3.2.1. Physics-Informed Neural Networks (PINNs)

PINNs embed physical laws directly into neural network training through residual-based loss functions. Instead of requiring large labeled datasets, PINNs learn solutions by minimizing physics equation residuals.

3.2.1.1. Mathematical Formulation

For a PDE of the form:

\[\mathcal{F}[u](x, t) = 0, \quad x \in \Omega, t \in [0, T]\]

with boundary conditions \(\mathcal{B}[u] = 0\) and initial conditions \(u(x, 0) = u_0(x)\), a PINN minimizes:

\[\mathcal{L} = w_{pde}\mathcal{L}_{pde} + w_{bc}\mathcal{L}_{bc} + w_{ic}\mathcal{L}_{ic}\]

where:

  • \(\mathcal{L}_{pde} = \frac{1}{N_{pde}}\sum_{i=1}^{N_{pde}} |\mathcal{F}[u_\theta](x_i, t_i)|^2\)

  • \(\mathcal{L}_{bc} = \frac{1}{N_{bc}}\sum_{i=1}^{N_{bc}} |\mathcal{B}[u_\theta](x_i, t_i)|^2\)

  • \(\mathcal{L}_{ic} = \frac{1}{N_{ic}}\sum_{i=1}^{N_{ic}} |u_\theta(x_i, 0) - u_0(x_i)|^2\)

3.2.1.2. Key Features

The PINN framework in AI4Plasma provides:

  • Multi-Physics Support: Handle arbitrary coupled PDEs through EquationTerm abstraction

  • Automatic Differentiation: Compute spatial/temporal derivatives via PyTorch autograd

  • Adaptive Loss Weighting: Automatically balance competing physics constraints

  • Batch Training: Support for large datasets via DataLoader integration

  • Visualization Callbacks: Real-time monitoring with custom visualization functions

  • Checkpoint Management: Save and resume training with full state recovery

  • TensorBoard Integration: Comprehensive logging and monitoring

Example:

from ai4plasma.piml.pinn import PINN, EquationTerm

# Define PDE residual
def pde_residual(model, X):
    x, t = X[:, 0:1], X[:, 1:2]
    u = model(X)
    u_t = model.df_dt(X, u)
    u_xx = model.df_dxx(X, u, x_idx=0)
    return u_t - 0.01 * u_xx  # Heat equation

# Create equation term
pde_term = EquationTerm(
    name="pde",
    residual_fn=pde_residual,
    data=X_pde,
    weight=1.0
)

# Build and train PINN
pinn = PINN(model=net, equation_terms=[pde_term])
pinn.train(epochs=10000, lr=1e-3)

3.2.2. PINN Variants

AI4Plasma includes several advanced PINN variants for specialized applications:

3.2.2.1. CS-PINN (Coefficient-Subnet PINN)

Specialized for problems with complex material properties and temperature-dependent coefficients:

  • Automatic Boundary Enforcement: Network architecture guarantees boundary conditions by construction

  • Spline Interpolation: Handles temperature-dependent properties (thermal conductivity, heat capacity)

  • Gauss-Legendre Quadrature: Accurate computation of integral terms

  • Application: Arc discharge simulations with temperature-dependent plasma properties

Network Construction:

\[T(r) = (r - R) \cdot N_\theta(r) + T_b\]

This structure automatically satisfies \(T(R) = T_b\), reducing training complexity.

3.2.2.2. Meta-PINN

Enables rapid adaptation to new physics tasks through meta-learning:

  • Task Abstraction: Support/query split for few-shot learning

  • MAML Framework: Model-Agnostic Meta-Learning for PINN

  • Fast Adaptation: Quickly fine-tune to new parameters (current, geometry)

  • Application: Multi-current arc discharge, multi-geometry problems

Meta-Learning Workflow:

  1. Meta-training: Sample task batch → adapt on support set → update on query set

  2. Meta-testing: Initialize with meta-parameters → fine-tune on new task (few steps)

3.2.2.3. RK-PINN (Runge-Kutta PINN)

Incorporates Runge-Kutta time-stepping for improved temporal accuracy:

  • High-Order Time Integration: 4th-order Runge-Kutta scheme

  • Stage-by-Stage Training: Learn intermediate RK stages

  • Temporal Accuracy: Better handling of time-dependent dynamics

  • Application: Transient plasma phenomena, corona discharge

3.2.2.4. NAS-PINN (Neural Architecture Search PINN)

Automatically discovers optimal network architectures for specific physics problems:

  • Differentiable Architecture Search: Learn architecture parameters via gradient descent

  • Relaxed Architectures: Soft selection over architectural choices

  • Problem-Specific Optimization: Find best width/depth for target PDE

  • Application: Automated PINN design without manual hyperparameter tuning

3.3. Neural Operators

Neural operators learn mappings between infinite-dimensional function spaces, enabling fast evaluation of parametric PDEs and reducing computational cost of repeated simulations.

3.3.1. DeepONet (Deep Operator Network)

DeepONet learns nonlinear operators \(G: \mathcal{U} \to \mathcal{V}\) mapping input functions to output functions.

3.3.1.1. Architecture

DeepONet consists of two sub-networks:

  • Branch Network: Processes input functions \(u(x)\) (supports FNN and CNN)

  • Trunk Network: Processes evaluation coordinates \(y\)

Mathematical Formulation:

\[G(u)(y) \approx \sum_{i=1}^p b_i(u) \cdot t_i(y) + \text{bias}\]

where:

  • \(b_i(u)\): \(i\)-th basis function from branch network

  • \(t_i(y)\): \(i\)-th basis function from trunk network

  • \(p\): latent dimension

3.3.1.2. Key Features

  • Automatic Architecture Detection: FNN for 2D data, CNN for 4D image-like data

  • Flexible Data Splitting: By branch samples or trunk evaluation points

  • Distributed Training: Full DataLoader support for large-scale problems

  • Checkpoint Management: Resume training with full state recovery

Applications:

  • Parametric PDE solving (e.g., Poisson equation with varying boundary conditions)

  • Fast surrogate models for expensive simulations

  • Real-time physics predictions

3.3.1.3. Example:

from ai4plasma.operator.deeponet import DeepONetModel

model = DeepONetModel(
    branch_sizes=[100, 128, 128, 128],  # Branch network
    trunk_sizes=[2, 128, 128, 128],      # Trunk network
    basis_dim=128,                       # Latent dimension
    bias_output=True
)

model.train_model(
    train_loader=train_loader,
    epochs=1000,
    lr=1e-3
)

3.3.2. DeepCSNet (Deep Cross Section Network)

Specialized neural operator for predicting electron-impact cross sections in plasma physics.

3.3.2.1. Architecture

DeepCSNet employs a modular coefficient-subnet structure:

  • Molecule Net: Processes molecular features (for multi-molecule mode)

  • Energy Net: Processes incident electron energy

  • Trunk Net: Processes scattering angles and kinematics

3.3.2.2. Operation Modes

  • SMC (Single-Molecule Configuration): Energy Net + Trunk Net

    • For single molecular species

  • MMC (Multi-Molecule Configuration): Molecule Net + Energy Net + Trunk Net

    • For multiple molecular species simultaneously

3.3.2.3. Applications

  • Predicting doubly differential ionization cross sections (DDCS)

  • Total ionization cross sections

  • Fast cross section lookup for plasma kinetic simulations

Physical Relevance: Cross sections are fundamental to plasma modeling, determining collision rates, energy transfer, and species production. DeepCSNet provides orders-of-magnitude speedup compared to first-principles calculations.

3.4. Equation Terms and Loss Components

The EquationTerm class provides a flexible abstraction for physics constraints:

class EquationTerm:
    """Encapsulates a single physics constraint.
    
    Attributes
    ----------
    name : str
        Identifier for the constraint (e.g., 'pde', 'bc', 'ic')
    residual_fn : callable
        Function computing the physics residual
    data : torch.Tensor
        Collocation points for evaluating residual
    weight : float
        Loss weight for balancing multiple constraints
    """

Benefits:

  • Modular constraint definition

  • Dynamic weight updates during training

  • Easy addition/removal of physics terms

  • Support for data batching via DataLoader

3.5. Visualization and Monitoring

AI4Plasma provides comprehensive tools for monitoring training progress:

3.5.1. TensorBoard Integration

Automatic logging of:

  • Loss components (PDE, BC, IC losses)

  • Total loss evolution

  • Learning rate schedules

  • Custom metrics

  • Solution visualizations

3.5.2. Visualization Callbacks

Abstract base class VisualizationCallback enables custom real-time monitoring:

class MyCallback(VisualizationCallback):
    def __call__(self, model, epoch, writer):
        # Custom visualization logic
        fig = plot_solution(model)
        writer.add_figure('solution', fig, epoch)

Features:

  • Automatic figure logging to TensorBoard

  • Configurable callback frequency

  • Multiple independent visualizations

  • No modification to core training loop

3.6. Training Utilities

3.6.1. Checkpoint Management

Full state saving and resumption:

# Save checkpoint
model.save_checkpoint('checkpoint.pth', epoch=1000)

# Resume training
model = PINN.load_checkpoint('checkpoint.pth', model=net)
model.train(epochs=2000, lr=1e-4)  # Continue from epoch 1000

3.6.2. Adaptive Loss Weighting

Automatically balance competing loss terms to prevent one constraint from dominating:

pinn = PINN(
    model=net,
    equation_terms=terms,
    use_adaptive_weights=True  # Enable adaptive weighting
)

3.6.3. Learning Rate Scheduling

Support for PyTorch schedulers:

pinn.train(
    epochs=10000,
    lr=1e-3,
    scheduler=torch.optim.lr_scheduler.StepLR(optimizer, step_size=1000, gamma=0.9)
)

3.7. Plasma Physics Applications

AI4Plasma is specifically designed for plasma simulation challenges:

3.7.1. Arc Discharge Modeling

  • Steady-state and transient 1D cylindrical arc

  • Temperature-dependent properties (conductivity, heat capacity)

  • Ohmic heating, radiation losses, convection

  • Automatic boundary condition enforcement

3.7.2. Cross Section Prediction

  • Electron-impact ionization cross sections

  • Multi-molecule species support

  • Fast lookup for kinetic simulations

3.7.3. Corona Discharge

  • Time-dependent ionization dynamics

  • Runge-Kutta time integration

  • Photoionization and recombination

3.7.4. Parametric Studies

  • Meta-learning for fast parameter sweeps

  • Neural operators for real-time predictions

  • Uncertainty quantification

3.8. Typical Workflow

A typical AI4Plasma workflow consists of:

  1. Problem Definition

    • Define governing PDEs

    • Specify boundary/initial conditions

    • Set up computational domain (geometry)

  2. Network Design

    • Choose network architecture (FNN, CNN, custom)

    • Set hyperparameters (depth, width, activation)

    • Optionally use NAS-PINN for automatic design

  3. Physics Encoding

    • Implement residual functions

    • Create equation terms with appropriate weights

    • Set up sampling points (collocation points)

  4. Training

    • Initialize PINN/operator model

    • Configure optimizer and scheduler

    • Add visualization callbacks

    • Train with TensorBoard monitoring

  5. Validation & Deployment

    • Compare with reference solutions or experiments

    • Compute error metrics (L2 error, relative error)

    • Export model for inference

    • Use in larger simulation pipelines

3.9. References

  1. PINNs: M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational Physics, vol. 378, pp. 686-707, 2019.

  2. DeepONet: L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis, “Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators,” Nature Machine Intelligence, vol. 3, no. 3, pp. 218-229, 2021.

  3. CS-PINN: L. Zhong, B. Wu, and Y. Wang, “Low-temperature plasma simulation based on physics-informed neural networks: Frameworks and preliminary applications,” Physics of Fluids, vol. 34, no. 8, p. 087116, 2022.

  4. RK-PINN: L. Zhong, B. Wu, and Y. Wang, “Low-temperature plasma simulation based on physics-informed neural networks: Frameworks and preliminary applications,” Physics of Fluids, vol. 34, no. 8, p. 087116, 2022.

  5. Meta-PINN: L. Zhong, B. Wu, and Y. Wang, “Accelerating physics-informed neural network based 1D arc simulation by meta learning,” Journal of Physics D: Applied Physics, vol. 56, p. 074006, 2023.

  6. NAS-PINN: Y. Wang and L. Zhong, “NAS-PINN: Neural architecture search-guided physics-informed neural network for solving PDEs,” Journal of Computational Physics, vol. 496, p. 112603, 2024.

  7. DeepCSNet: Y. Wang and L. Zhong, “DeepCSNet: a deep learning method for predicting electron-impact doubly differential ionization cross sections,” Plasma Sources Science and Technology, vol. 33, no. 10, p. 105012, 2024.

3.10. Next Steps

  • Explore the API Reference for detailed class and function documentation

  • Check out Examples for practical tutorials

  • Read the Developer Guide to understand the internals

  • Try the example scripts in the app/ directory