3. Basic Concepts
AI4Plasma is the world’s first AI library specifically designed for plasma physics simulation, combining state-of-the-art machine learning techniques with rigorous physics-based modeling. This guide introduces the fundamental concepts and architectural components that power AI4Plasma.
3.1. Core Architecture
3.1.1. Network Components
AI4Plasma provides flexible neural network building blocks optimized for scientific computing:
3.1.1.1. FNN (Fully Connected Neural Network)
Dense multi-layer perceptrons with:
Customizable depth and width
Optional batch normalization
Multiple weight initialization strategies (Xavier, zero initialization)
Tanh activation by default (suitable for smooth physics solutions)
Precise floating-point control (REAL precision)
Structure:
Input → Linear → [BN?] → Activation → ... → Linear (output)
3.1.1.2. CNN (Convolutional Neural Network)
1D/2D/3D convolution-based architectures with:
Flexible backbone with optional fully connected head
Adaptive global pooling strategies
Batch normalization and max/avg pooling options
Automatic dimension detection
Lazy initialization based on actual feature sizes
Structure:
Input → [Conv → BN? → Activation → Pool?] × N
→ Global Pool or Flatten
→ [FC → Activation?] × M → Output
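Lazy initialization matters because the input width of the first fully connected layer depends on how every preceding convolution and pooling layer shrinks the feature map. The sketch below is not AI4Plasma code; it only applies the standard output-size formula used by PyTorch convolution and pooling layers, and the helper names `conv_out_len` and `flattened_features` are hypothetical.

```python
def conv_out_len(n, kernel, stride=1, padding=0, dilation=1):
    """Output length along one axis of a convolution or pooling layer."""
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def flattened_features(size, channels, layers):
    """Track a square 2D input through a list of (kernel, stride, padding) layers."""
    for kernel, stride, padding in layers:
        size = conv_out_len(size, kernel, stride, padding)
    return channels * size * size


# 32x32 input -> conv 3x3 (pad 1) -> pool 2x2 -> conv 3x3 (pad 1) -> pool 2x2
layers = [(3, 1, 1), (2, 2, 0), (3, 1, 1), (2, 2, 0)]
n_features = flattened_features(32, channels=16, layers=layers)
print(n_features)
```

With a Flatten head, `n_features` is the number of inputs the first FC layer would need; a global pooling head would reduce this to the channel count instead.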
3.1.1.3. RelaxFNN (Neural Architecture Search)
Relaxed architecture for automatic network design:
Soft selection over different architectural choices
Learnable architecture parameters optimized jointly with weights
Efficient search for problem-specific network structures
Discrete architecture extraction after search
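The idea of soft selection followed by discrete extraction can be sketched in a few lines. This is an illustrative, DARTS-style continuous relaxation written in plain Python, not the RelaxFNN implementation; the candidate set and the names `mixed_op` and `extract` are assumptions made for the example.

```python
import math


def softmax(alphas):
    """Numerically stable softmax over a list of architecture parameters."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


# Candidate operations competing for one slot in the network (hypothetical set).
CANDIDATES = {
    "tanh": math.tanh,
    "identity": lambda z: z,
    "zero": lambda z: 0.0,
}


def mixed_op(z, alphas):
    """Soft selection: softmax-weighted sum of every candidate's output."""
    weights = softmax(alphas)
    return sum(w * op(z) for w, op in zip(weights, CANDIDATES.values()))


def extract(alphas):
    """Discrete extraction: keep the candidate with the largest parameter."""
    names = list(CANDIDATES)
    return names[max(range(len(alphas)), key=lambda i: alphas[i])]


alphas = [2.0, 0.5, -1.0]  # architecture parameters (learned in practice)
soft_out = mixed_op(0.3, alphas)
chosen = extract(alphas)
print(soft_out, chosen)
```

During the search the architecture parameters would receive gradients alongside the network weights; afterwards only the operation returned by `extract` is kept.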
3.1.2. Geometry System
The geometry system provides a unified interface for domain and boundary sampling across different problem types:
3.1.2.1. Base Classes
Geometry: Abstract base class defining the interface for all geometric domains
GeoTime: Temporal domain \([t_s, t_e]\)
Geo1D: 1D spatial domain \([x_l, x_u]\)
Geo1DTime: Space-time domain for 1D problems
GeoPoly2D: 2D polygonal domains
GeoRect2D: 2D rectangular domains
GeoPoly2DTime: Space-time domain for 2D problems
3.1.2.2. Sampling Strategies
Uniform: Evenly spaced grid sampling
Random: Uniform random distribution
LHS: Latin Hypercube Sampling (for efficient space-filling designs)
Example Usage:
from ai4plasma.piml.geo import Geo1DTime, SamplingMode
# Create a space-time domain
geo = Geo1DTime()
geo.create_domain(xl=0.0, xu=1.0, ts=0.0, te=1.0)
# Sample interior points
X_interior = geo.sample_domain(nx=100, nt=50, mode=SamplingMode.UNIFORM)
# Sample initial condition
X_ic = geo.sample_ic(nx=100, mode=SamplingMode.RANDOM)
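To make the difference between the three sampling strategies concrete, the following library-independent sketch draws points on a 1D interval with each of them. It does not use or reproduce the `ai4plasma` geometry classes; the function names are hypothetical, and the Latin Hypercube version is the simple one-dimensional case (one point per equal-width stratum).

```python
import random


def uniform_grid(lo, hi, n):
    """n evenly spaced points on [lo, hi], endpoints included."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def random_uniform(lo, hi, n, rng):
    """n independent draws from the uniform distribution on [lo, hi]."""
    return [rng.uniform(lo, hi) for _ in range(n)]


def latin_hypercube(lo, hi, n, rng):
    """One draw per equal-width stratum, returned in shuffled order."""
    width = (hi - lo) / n
    pts = [lo + (i + rng.random()) * width for i in range(n)]
    rng.shuffle(pts)
    return pts


rng = random.Random(0)  # fixed seed so the example is reproducible
grid = uniform_grid(0.0, 1.0, 5)
rand = random_uniform(0.0, 1.0, 5, rng)
lhs = latin_hypercube(0.0, 1.0, 5, rng)
print(grid)
print(sorted(rand))
print(sorted(lhs))
```

Random sampling can leave gaps or clusters, whereas the stratified draw guarantees exactly one point in each fifth of the interval. In several dimensions, Latin Hypercube Sampling stratifies and permutes each coordinate independently.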
3.2. Physics-Informed Machine Learning (PIML)
3.2.1. Physics-Informed Neural Networks (PINNs)
PINNs embed physical laws directly into neural network training through residual-based loss functions. Instead of requiring large labeled datasets, PINNs learn solutions by minimizing physics equation residuals.
3.2.1.1. Mathematical Formulation
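For a PDE of the form:
\(\mathcal{F}[u](x, t) = 0\)
with boundary conditions \(\mathcal{B}[u] = 0\) and initial conditions \(u(x, 0) = u_0(x)\), a PINN minimizes:
\(\mathcal{L} = w_{pde}\mathcal{L}_{pde} + w_{bc}\mathcal{L}_{bc} + w_{ic}\mathcal{L}_{ic}\)
where \(w_{pde}\), \(w_{bc}\), and \(w_{ic}\) are the loss weights of the individual terms and: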
\(\mathcal{L}_{pde} = \frac{1}{N_{pde}}\sum_{i=1}^{N_{pde}} |\mathcal{F}[u_\theta](x_i, t_i)|^2\)
\(\mathcal{L}_{bc} = \frac{1}{N_{bc}}\sum_{i=1}^{N_{bc}} |\mathcal{B}[u_\theta](x_i, t_i)|^2\)
\(\mathcal{L}_{ic} = \frac{1}{N_{ic}}\sum_{i=1}^{N_{ic}} |u_\theta(x_i, 0) - u_0(x_i)|^2\)
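The three loss components can be evaluated by hand for a simple case. The sketch below assumes the 1D heat equation \(u_t = \alpha u_{xx}\) on \([0, 1]\) with zero boundary values and \(u_0(x) = \sin(\pi x)\), and it replaces automatic differentiation with central finite differences so that it runs without PyTorch. All function names are hypothetical; this is an illustration of the loss definition, not the library's training code.

```python
import math

ALPHA = 0.01  # diffusivity (assumed value for this illustration)


def u_exact(x, t):
    """Closed-form solution for u(0,t) = u(1,t) = 0 and u(x,0) = sin(pi x)."""
    return math.exp(-ALPHA * math.pi ** 2 * t) * math.sin(math.pi * x)


def pde_residual(u, x, t, h=1e-3):
    """F[u] = u_t - ALPHA * u_xx, with derivatives from central differences."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h ** 2
    return u_t - ALPHA * u_xx


def mse(values):
    """Mean of squared residuals, as in each loss term above."""
    return sum(v * v for v in values) / len(values)


def pinn_loss(u, pde_pts, bc_pts, ic_pts, w=(1.0, 1.0, 1.0)):
    """Weighted sum of the PDE, boundary, and initial-condition losses."""
    l_pde = mse([pde_residual(u, x, t) for x, t in pde_pts])
    l_bc = mse([u(x, t) for x, t in bc_pts])  # B[u] = u on the boundary
    l_ic = mse([u(x, 0.0) - math.sin(math.pi * x) for x in ic_pts])
    return w[0] * l_pde + w[1] * l_bc + w[2] * l_ic


pde_pts = [(0.1 * i, 0.1 * j) for i in range(1, 10) for j in range(1, 10)]
bc_pts = [(x, 0.1 * j) for x in (0.0, 1.0) for j in range(11)]
ic_pts = [0.05 * i for i in range(21)]

loss_exact = pinn_loss(u_exact, pde_pts, bc_pts, ic_pts)
loss_wrong = pinn_loss(lambda x, t: x * (1 - x), pde_pts, bc_pts, ic_pts)
print(loss_exact, loss_wrong)
```

The exact solution drives the total loss to nearly zero, while the candidate \(x(1 - x)\) satisfies the boundary conditions but is penalized through the PDE and initial-condition terms.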
3.2.1.2. Key Features
The PINN framework in AI4Plasma provides:
Multi-Physics Support: Handle arbitrary coupled PDEs through the EquationTerm abstraction
Automatic Differentiation: Compute spatial/temporal derivatives via PyTorch autograd
Adaptive Loss Weighting: Automatically balance competing physics constraints
Batch Training: Support for large datasets via DataLoader integration
Visualization Callbacks: Real-time monitoring with custom visualization functions
Checkpoint Management: Save and resume training with full state recovery
TensorBoard Integration: Comprehensive logging and monitoring
Example:
from ai4plasma.piml.pinn import PINN, EquationTerm
# Define PDE residual
def pde_residual(model, X):
    x, t = X[:, 0:1], X[:, 1:2]
    u = model(X)
    u_t = model.df_dt(X, u)
    u_xx = model.df_dxx(X, u, x_idx=0)
    return u_t - 0.01 * u_xx  # Heat equation
# Create equation term
pde_term = EquationTerm(
    name="pde",
    residual_fn=pde_residual,
    data=X_pde,
    weight=1.0
)
# Build and train PINN
pinn = PINN(model=net, equation_terms=[pde_term])
pinn.train(epochs=10000, lr=1e-3)
3.2.2. PINN Variants
AI4Plasma includes several advanced PINN variants for specialized applications:
3.2.2.1. CS-PINN (Coefficient-Subnet PINN)
Specialized for problems with complex material properties and temperature-dependent coefficients:
Automatic Boundary Enforcement: Network architecture guarantees boundary conditions by construction
Spline Interpolation: Handles temperature-dependent properties (thermal conductivity, heat capacity)
Gauss-Legendre Quadrature: Accurate computation of integral terms
Application: Arc discharge simulations with temperature-dependent plasma properties
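As a reminder of what Gauss-Legendre quadrature does, the following standalone sketch applies the three-point rule to an integral over \([a, b]\). The nodes and weights are the standard ones; the function name and the test integrands are chosen only for illustration and say nothing about which order or integrands CS-PINN uses.

```python
import math

# Three-point Gauss-Legendre nodes and weights on [-1, 1].
NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss_legendre(f, a, b):
    """Approximate the integral of f over [a, b] with the 3-point rule."""
    half, mid = 0.5 * (b - a), 0.5 * (a + b)
    return half * sum(w * f(mid + half * x) for x, w in zip(NODES, WEIGHTS))


# The rule is exact for polynomials up to degree 5: integral of r**5 on [0, 1] is 1/6.
poly = gauss_legendre(lambda r: r ** 5, 0.0, 1.0)
# A smooth non-polynomial integrand: integral of r*exp(-r) on [0, 1] is 1 - 2/e.
smooth = gauss_legendre(lambda r: r * math.exp(-r), 0.0, 1.0)
print(poly, smooth)
```

Only three function evaluations are needed per integral, which is why this family of rules is attractive when the integrand is itself a network output.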
Network Construction:
This structure automatically satisfies \(T(R) = T_b\), reducing training complexity.
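The exact construction used by CS-PINN is not reproduced in this section. One common way to satisfy a Dirichlet condition by construction, assumed here purely for illustration, is the ansatz \(T(r) = T_b + (R - r)\,N(r)\), where \(N\) is the unconstrained network output. The values of `R` and `T_B` and the stand-in function `raw_network` are hypothetical.

```python
import math

R = 1.0      # arc radius (illustrative value)
T_B = 300.0  # wall temperature T_b (illustrative value)


def raw_network(r):
    """Stand-in for an unconstrained network output N(r); any function works."""
    return 5000.0 * math.cos(2.0 * r) + 100.0


def temperature(r):
    """Hard-constrained ansatz: T(r) = T_b + (R - r) * N(r)."""
    return T_B + (R - r) * raw_network(r)


print(temperature(R))    # boundary value, independent of raw_network
print(temperature(0.0))  # interior value, determined by raw_network
```

Because the factor \((R - r)\) vanishes at the wall, the boundary condition holds for any network parameters, so no boundary loss term is needed for it.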
3.2.2.2. Meta-PINN
Enables rapid adaptation to new physics tasks through meta-learning:
Task Abstraction: Support/query split for few-shot learning
MAML Framework: Model-Agnostic Meta-Learning for PINN
Fast Adaptation: Quickly fine-tune to new parameters (current, geometry)
Application: Multi-current arc discharge, multi-geometry problems
Meta-Learning Workflow:
Meta-training: Sample task batch → adapt on support set → update on query set
Meta-testing: Initialize with meta-parameters → fine-tune on new task (few steps)
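The two phases above can be traced on a toy problem. In this sketch each "task" is a scalar target `c` with loss \((\theta - c)^2\), and the support and query losses coincide, which is a simplification: in a PINN setting they would be evaluated on different point sets. The code follows the MAML update with one inner step; all names are hypothetical and nothing here reflects the Meta-PINN API.

```python
def loss(theta, c):
    """Task loss for target c."""
    return (theta - c) ** 2


def grad(theta, c):
    """Derivative of the task loss with respect to theta."""
    return 2.0 * (theta - c)


def adapt(theta, c, inner_lr, steps=1):
    """Inner loop: a few gradient steps on the task's support loss."""
    for _ in range(steps):
        theta -= inner_lr * grad(theta, c)
    return theta


def meta_train(tasks, theta=0.0, inner_lr=0.1, outer_lr=0.05, iters=500):
    """Outer loop: update theta using the query loss of the adapted parameters."""
    for _ in range(iters):
        meta_grad = 0.0
        for c in tasks:
            adapted = adapt(theta, c, inner_lr)
            # Chain rule through one inner step: d(adapted)/d(theta) = 1 - 2*inner_lr
            meta_grad += grad(adapted, c) * (1.0 - 2.0 * inner_lr)
        theta -= outer_lr * meta_grad / len(tasks)
    return theta


tasks = [1.0, 2.0, 3.0]
theta_meta = meta_train(tasks)

# Meta-testing on an unseen task with only three adaptation steps.
new_task = 2.5
few_shot = adapt(theta_meta, new_task, inner_lr=0.1, steps=3)
from_scratch = adapt(0.0, new_task, inner_lr=0.1, steps=3)
print(theta_meta, loss(few_shot, new_task), loss(from_scratch, new_task))
```

Starting from the meta-learned initialization, the same three steps reach a much lower loss than starting from zero, which is the effect the fast-adaptation bullet describes.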
3.2.2.3. RK-PINN (Runge-Kutta PINN)
Incorporates Runge-Kutta time-stepping for improved temporal accuracy:
High-Order Time Integration: 4th-order Runge-Kutta scheme
Stage-by-Stage Training: Learn intermediate RK stages
Temporal Accuracy: Better handling of time-dependent dynamics
Application: Transient plasma phenomena, corona discharge
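To make the notion of Runge-Kutta stages concrete, the sketch below implements one step of the classical explicit fourth-order scheme for an ODE \(du/dt = f(t, u)\). How RK-PINN couples these stages to the network is not specified in this guide, so the code shows only the time-stepping scheme itself; the function names are hypothetical.

```python
import math


def rk4_step(f, t, u, dt):
    """Advance du/dt = f(t, u) by one classical 4th-order Runge-Kutta step."""
    k1 = f(t, u)
    k2 = f(t + 0.5 * dt, u + 0.5 * dt * k1)
    k3 = f(t + 0.5 * dt, u + 0.5 * dt * k2)
    k4 = f(t + dt, u + dt * k3)
    return u + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def integrate(f, u0, t_end, n_steps):
    """Apply n_steps equal RK4 steps from t = 0 to t = t_end."""
    t, u, dt = 0.0, u0, t_end / n_steps
    for _ in range(n_steps):
        u = rk4_step(f, t, u, dt)
        t += dt
    return u


def decay(t, u):
    """Right-hand side of du/dt = -u, whose exact solution is exp(-t)."""
    return -u


err_coarse = abs(integrate(decay, 1.0, 1.0, 10) - math.exp(-1.0))
err_fine = abs(integrate(decay, 1.0, 1.0, 20) - math.exp(-1.0))
print(err_coarse, err_fine)
```

Halving the step size reduces the error by roughly a factor of 16, the signature of fourth-order accuracy in time.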
3.2.2.4. NAS-PINN (Neural Architecture Search PINN)
Automatically discovers optimal network architectures for specific physics problems:
Differentiable Architecture Search: Learn architecture parameters via gradient descent
Relaxed Architectures: Soft selection over architectural choices
Problem-Specific Optimization: Find best width/depth for target PDE
Application: Automated PINN design without manual hyperparameter tuning
3.3. Neural Operators
Neural operators learn mappings between infinite-dimensional function spaces. Once trained, they evaluate parametric PDE solutions quickly, which reduces the computational cost of repeated simulations.
3.3.1. DeepONet (Deep Operator Network)
DeepONet learns nonlinear operators \(G: \mathcal{U} \to \mathcal{V}\) mapping input functions to output functions.
3.3.1.1. Architecture
DeepONet consists of two sub-networks:
Branch Network: Processes input functions \(u(x)\) (supports FNN and CNN)
Trunk Network: Processes evaluation coordinates \(y\)
Mathematical Formulation:
\(G(u)(y) \approx \sum_{i=1}^{p} b_i(u)\, t_i(y)\)
where:
\(b_i(u)\): \(i\)-th basis function from branch network
\(t_i(y)\): \(i\)-th basis function from trunk network
\(p\): latent dimension
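The branch-trunk combination is simply an inner product over the latent dimension. The following sketch uses hand-written stand-ins for the two sub-networks to show the data flow; `branch`, `trunk`, and the optional `bias` argument are hypothetical and do not correspond to the `DeepONetModel` interface.

```python
import math

P = 3  # latent dimension p (tiny, for illustration)


def branch(u_samples):
    """Stand-in branch net: maps sensor values of u to p coefficients b_i(u)."""
    mean = sum(u_samples) / len(u_samples)
    return [mean, u_samples[0], u_samples[-1]]


def trunk(y):
    """Stand-in trunk net: maps a query coordinate y to p basis values t_i(y)."""
    return [1.0, math.cos(y), math.sin(y)]


def deeponet(u_samples, y, bias=0.0):
    """G(u)(y) ~ sum_i b_i(u) * t_i(y) + bias."""
    b, t = branch(u_samples), trunk(y)
    assert len(b) == len(t) == P
    return sum(bi * ti for bi, ti in zip(b, t)) + bias


u_samples = [0.0, 1.0, 2.0]  # input function u sampled at three sensors
value = deeponet(u_samples, 0.0)
print(value)
```

The branch output is computed once per input function and can be reused for any number of query coordinates, which is what makes evaluation cheap after training.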
3.3.1.2. Key Features
Automatic Architecture Detection: FNN for 2D data, CNN for 4D image-like data
Flexible Data Splitting: By branch samples or trunk evaluation points
Distributed Training: Full DataLoader support for large-scale problems
Checkpoint Management: Resume training with full state recovery
Applications:
Parametric PDE solving (e.g., Poisson equation with varying boundary conditions)
Fast surrogate models for expensive simulations
Real-time physics predictions
3.3.1.3. Example:
from ai4plasma.operator.deeponet import DeepONetModel
model = DeepONetModel(
    branch_sizes=[100, 128, 128, 128],  # Branch network
    trunk_sizes=[2, 128, 128, 128],     # Trunk network
    basis_dim=128,                      # Latent dimension
    bias_output=True
)
model.train_model(
    train_loader=train_loader,
    epochs=1000,
    lr=1e-3
)
3.3.2. DeepCSNet (Deep Cross Section Network)
Specialized neural operator for predicting electron-impact cross sections in plasma physics.
3.3.2.1. Architecture
DeepCSNet employs a modular coefficient-subnet structure:
Molecule Net: Processes molecular features (for multi-molecule mode)
Energy Net: Processes incident electron energy
Trunk Net: Processes scattering angles and kinematics
3.3.2.2. Operation Modes
SMC (Single-Molecule Configuration): Energy Net + Trunk Net
For single molecular species
MMC (Multi-Molecule Configuration): Molecule Net + Energy Net + Trunk Net
For multiple molecular species simultaneously
3.3.2.3. Applications
Predicting doubly differential ionization cross sections (DDCS)
Total ionization cross sections
Fast cross section lookup for plasma kinetic simulations
Physical Relevance: Cross sections are fundamental to plasma modeling, determining collision rates, energy transfer, and species production. DeepCSNet provides orders-of-magnitude speedup compared to first-principles calculations.
3.4. Equation Terms and Loss Components
The EquationTerm class provides a flexible abstraction for physics constraints:
class EquationTerm:
    """Encapsulates a single physics constraint.

    Attributes
    ----------
    name : str
        Identifier for the constraint (e.g., 'pde', 'bc', 'ic')
    residual_fn : callable
        Function computing the physics residual
    data : torch.Tensor
        Collocation points for evaluating residual
    weight : float
        Loss weight for balancing multiple constraints
    """
Benefits:
Modular constraint definition
Dynamic weight updates during training
Easy addition/removal of physics terms
Support for data batching via DataLoader
3.5. Visualization and Monitoring
AI4Plasma provides comprehensive tools for monitoring training progress:
3.5.1. TensorBoard Integration
Automatic logging of:
Loss components (PDE, BC, IC losses)
Total loss evolution
Learning rate schedules
Custom metrics
Solution visualizations
3.5.2. Visualization Callbacks
Abstract base class VisualizationCallback enables custom real-time monitoring:
class MyCallback(VisualizationCallback):
    def __call__(self, model, epoch, writer):
        # Custom visualization logic
        fig = plot_solution(model)
        writer.add_figure('solution', fig, epoch)
Features:
Automatic figure logging to TensorBoard
Configurable callback frequency
Multiple independent visualizations
No modification to core training loop
3.6. Training Utilities
3.6.1. Checkpoint Management
Full state saving and resumption:
# Save checkpoint
model.save_checkpoint('checkpoint.pth', epoch=1000)
# Resume training
model = PINN.load_checkpoint('checkpoint.pth', model=net)
model.train(epochs=2000, lr=1e-4) # Continue from epoch 1000
3.6.2. Adaptive Loss Weighting
Automatically balance competing loss terms to prevent one constraint from dominating:
pinn = PINN(
    model=net,
    equation_terms=terms,
    use_adaptive_weights=True  # Enable adaptive weighting
)
3.6.3. Learning Rate Scheduling
Support for PyTorch schedulers:
pinn.train(
    epochs=10000,
    lr=1e-3,
    scheduler=torch.optim.lr_scheduler.StepLR(optimizer, step_size=1000, gamma=0.9)
)
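For reference, a StepLR schedule multiplies the learning rate by `gamma` once every `step_size` epochs. The closed form can be checked without PyTorch; the helper name `step_lr` is hypothetical, and the values mirror the `step_size=1000`, `gamma=0.9` settings shown above.

```python
def step_lr(base_lr, epoch, step_size, gamma):
    """Learning rate after `epoch` scheduler steps under a StepLR-style decay."""
    return base_lr * gamma ** (epoch // step_size)


schedule = {
    epoch: step_lr(1e-3, epoch, step_size=1000, gamma=0.9)
    for epoch in (0, 999, 1000, 5000, 9999)
}
for epoch, lr in schedule.items():
    print(epoch, lr)
```

Over 10000 epochs the rate is decayed nine times, so training ends at about 39% of the initial learning rate.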
3.7. Plasma Physics Applications
AI4Plasma is specifically designed for plasma simulation challenges:
3.7.1. Arc Discharge Modeling
Steady-state and transient 1D cylindrical arc
Temperature-dependent properties (conductivity, heat capacity)
Ohmic heating, radiation losses, convection
Automatic boundary condition enforcement
3.7.2. Cross Section Prediction
Electron-impact ionization cross sections
Multi-molecule species support
Fast lookup for kinetic simulations
3.7.3. Corona Discharge
Time-dependent ionization dynamics
Runge-Kutta time integration
Photoionization and recombination
3.7.4. Parametric Studies
Meta-learning for fast parameter sweeps
Neural operators for real-time predictions
Uncertainty quantification
3.8. Typical Workflow
A typical AI4Plasma workflow consists of:
1. Problem Definition
   - Define governing PDEs
   - Specify boundary/initial conditions
   - Set up computational domain (geometry)
2. Network Design
   - Choose network architecture (FNN, CNN, custom)
   - Set hyperparameters (depth, width, activation)
   - Optionally use NAS-PINN for automatic design
3. Physics Encoding
   - Implement residual functions
   - Create equation terms with appropriate weights
   - Set up sampling points (collocation points)
4. Training
   - Initialize PINN/operator model
   - Configure optimizer and scheduler
   - Add visualization callbacks
   - Train with TensorBoard monitoring
5. Validation & Deployment
   - Compare with reference solutions or experiments
   - Compute error metrics (L2 error, relative error)
   - Export model for inference
   - Use in larger simulation pipelines
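The error metrics named in the validation step are straightforward to compute once predictions and reference values are available on the same points. A minimal sketch, with hypothetical function names and made-up numbers chosen so the result is easy to verify by hand:

```python
import math


def l2_error(pred, ref):
    """Discrete L2 norm of the pointwise error."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)))


def relative_l2_error(pred, ref):
    """L2 error normalized by the L2 norm of the reference solution."""
    return l2_error(pred, ref) / math.sqrt(sum(r ** 2 for r in ref))


ref = [3.0, 4.0]   # reference solution values (norm 5)
pred = [3.0, 4.5]  # model predictions at the same points
abs_err = l2_error(pred, ref)
rel_err = relative_l2_error(pred, ref)
print(abs_err, rel_err)
```

The relative form is usually the more informative one for plasma quantities, whose magnitudes can differ by orders of magnitude between problems.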
3.9. References
PINNs: M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational Physics, vol. 378, pp. 686-707, 2019.
DeepONet: L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis, “Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators,” Nature Machine Intelligence, vol. 3, no. 3, pp. 218-229, 2021.
CS-PINN: L. Zhong, B. Wu, and Y. Wang, “Low-temperature plasma simulation based on physics-informed neural networks: Frameworks and preliminary applications,” Physics of Fluids, vol. 34, no. 8, p. 087116, 2022.
RK-PINN: L. Zhong, B. Wu, and Y. Wang, “Low-temperature plasma simulation based on physics-informed neural networks: Frameworks and preliminary applications,” Physics of Fluids, vol. 34, no. 8, p. 087116, 2022.
Meta-PINN: L. Zhong, B. Wu, and Y. Wang, “Accelerating physics-informed neural network based 1D arc simulation by meta learning,” Journal of Physics D: Applied Physics, vol. 56, p. 074006, 2023.
NAS-PINN: Y. Wang and L. Zhong, “NAS-PINN: Neural architecture search-guided physics-informed neural network for solving PDEs,” Journal of Computational Physics, vol. 496, p. 112603, 2024.
DeepCSNet: Y. Wang and L. Zhong, “DeepCSNet: a deep learning method for predicting electron-impact doubly differential ionization cross sections,” Plasma Sources Science and Technology, vol. 33, no. 10, p. 105012, 2024.
3.10. Next Steps
Explore the API Reference for detailed class and function documentation
Check out Examples for practical tutorials
Read the Developer Guide to understand the internals
Try the example scripts in the app/ directory