Simulation Methods: CMC, Line Sampling, and Subset Simulation#

This notebook compares three simulation-based reliability methods on two high-dimensional benchmark problems chosen to highlight the strengths and limitations of each method:

Method	Class	Key idea
Crude Monte Carlo	`CrudeMonteCarlo`	Direct sampling baseline
Line Sampling	`LineSampling`	Variance reduction along FORM important direction
Subset Simulation	`SubsetSimulation`	Decompose rare event into nested intermediate events

Example 1 — Hypersphere: demonstrates that FORM fails badly on a strongly nonlinear problem, and exposes a fundamental limitation of Line Sampling when the failure region is bilateral [Schueller2007].

Example 2 — Parabolic limit state: the classic LS benchmark from [Breitung1984], where the failure region is one-sided and LS performs as designed.

Example 1: High-Dimensional Hypersphere#

Problem Description#

We use the n-dimensional hypersphere limit state, a standard benchmark for high-dimensional reliability methods [Schueller2007]:

\[g(\mathbf{x}) = r^2 - \sum_{i=1}^{n} x_i^2, \quad x_i \sim \mathcal{N}(0, 1), \quad i = 1, \ldots, n\]

Failure occurs when \(g \leq 0\), i.e., when \(\mathbf{x}\) lies outside the hypersphere of radius \(r\). Because \(\sum x_i^2 \sim \chi^2(n)\), the exact failure probability is

\[p_f = P\!\left(\sum_{i=1}^n x_i^2 > r^2\right) = 1 - F_{\chi^2(n)}(r^2)\]

We choose \(n = 10\) and \(r = 5\). The key challenge is that FORM fails badly for this problem: the strongly curved failure surface causes a large discrepancy between the first-order approximation and the truth.

[1]:

import numpy as np
import pystra as ra
from scipy.stats import chi2, norm

Problem Setup#

[2]:

n = 10   # number of standard-normal random variables
r = 5.0  # hypersphere radius

# Exact solution via chi-squared survival function
Pf_exact   = float(chi2.sf(r**2, df=n))
beta_exact = float(-norm.ppf(Pf_exact))
print(f"Exact Pf   = {Pf_exact:.4e}")
print(f"Exact beta = {beta_exact:.4f}")

Exact Pf   = 5.3455e-03
Exact beta = 2.5526

[3]:

def lsf(**kwargs):
    """Hypersphere limit state: failure when ||x||^2 > r^2.

    Accepts any number of N(0,1) variables via **kwargs, so the
    problem dimension n can be changed without touching this function.
    """
    return r**2 - sum(v**2 for v in kwargs.values())


def make_model():
    """Build the stochastic model (n independent N(0,1) variables)."""
    model = ra.StochasticModel()
    for i in range(1, n + 1):
        model.addVariable(ra.Normal(f"X{i}", 0.0, 1.0))
    limit_state = ra.LimitState(lsf)
    options = ra.AnalysisOptions()
    options.setPrintOutput(False)
    return model, limit_state, options

FORM — First-Order Reliability Method#

FORM linearises the limit state at the design point. For the hypersphere, every point at distance \(r\) from the origin is a design point, so \(\beta_\text{FORM} = r = 5\). However, the strongly curved failure surface means the first-order approximation severely underestimates the failure probability.

[4]:

model, limit_state, options = make_model()
form = ra.Form(
    analysis_options=options,
    stochastic_model=model,
    limit_state=limit_state,
)
form.run()

beta_form = form.getBeta()
Pf_form   = form.getFailure()
print(f"FORM:  beta = {beta_form:.4f},  Pf = {Pf_form:.4e}")
print(f"Exact: beta = {beta_exact:.4f},  Pf = {Pf_exact:.4e}")
print(f"Error ratio Pf_exact / Pf_FORM = {Pf_exact / Pf_form:.0f}x")

FORM:  beta = 5.0003,  Pf = 2.8615e-07
Exact: beta = 2.5526,  Pf = 5.3455e-03
Error ratio Pf_exact / Pf_FORM = 18681x

The error confirms that FORM is unreliable for this highly nonlinear problem. Despite the inaccurate \(\beta\), the important direction \(\boldsymbol{\alpha}\) found by FORM points radially outward from the origin — which, by symmetry of the sphere, is still a valid direction for Line Sampling.

Crude Monte Carlo (CMC)#

CMC draws \(N\) independent samples from the joint distribution and counts failures. The estimator is unbiased but requires \(N \gg 1/p_f\) for acceptable variance. With \(p_f \approx 5 \times 10^{-3}\) we use \(N = 20\,000\) samples, giving roughly 100 failures and a coefficient of variation of ~10 %.

[5]:

np.random.seed(1)
model, limit_state, options = make_model()
options.setSamples(20_000)

cmc = ra.CrudeMonteCarlo(
    analysis_options=options,
    stochastic_model=model,
    limit_state=limit_state,
)
cmc.run()

beta_cmc = cmc.getBeta()
Pf_cmc   = cmc.getFailure()
print(f"CMC:  beta = {beta_cmc:.4f},  Pf = {Pf_cmc:.4e}")

CMC:  beta = 2.5491,  Pf = 5.4000e-03

Line Sampling (LS)#

Line Sampling projects \(N\) standard-normal samples onto the \((n-1)\)-dimensional hyperplane perpendicular to the FORM important direction \(\boldsymbol{\alpha}\), then solves a one-dimensional root-finding problem along each line to locate the limit-state surface. The failure probability estimate is

\[\hat{p}_f = \frac{1}{N}\sum_{i=1}^{N} \Phi(-c_i)\]

where \(c_i\) is the signed distance to the limit-state surface along \(\boldsymbol{\alpha}\) for line \(i\). Only \(N = 200\) lines are needed for high accuracy.

[6]:

np.random.seed(2)
model, limit_state, options = make_model()
options.setSamples(200)

ls = ra.LineSampling(
    analysis_options=options,
    stochastic_model=model,
    limit_state=limit_state,
    form=form,   # re-use the FORM result for the important direction
)
ls.run()

beta_ls = ls.getBeta()
Pf_ls   = ls.getFailure()
print(f"LS:  beta = {beta_ls:.4f},  Pf = {Pf_ls:.4e},  CoV = {ls.cov:.3f}")

LS:  beta = 3.0925,  Pf = 9.9239e-04,  CoV = 0.299

Subset Simulation (SS)#

Subset Simulation decomposes the rare failure event \(F = \{g(\mathbf{x}) \le 0\}\) into a sequence of nested intermediate events \(F_1 \supset F_2 \supset \cdots \supset F_m = F\):

\[p_f = P(F_1)\prod_{j=2}^{m} P(F_j \mid F_{j-1})\]

Conditional samples at each level are generated via Modified Metropolis–Hastings (MMH), keeping the conditional probability per level close to the target \(p_0 = 0.1\).

[7]:

np.random.seed(3)
model, limit_state, options = make_model()
options.setSamples(2_000)

ss = ra.SubsetSimulation(
    analysis_options=options,
    stochastic_model=model,
    limit_state=limit_state,
    p0=0.1,
)
ss.run()

beta_ss = ss.getBeta()
Pf_ss   = ss.getFailure()
print(f"SS:  beta = {beta_ss:.4f},  Pf = {Pf_ss:.4e},  CoV = {ss.cov:.3f}")
print(f"     levels = {ss.n_levels},  thresholds = {[round(t, 2) for t in ss.thresholds]}")

SS:  beta = 2.5669,  Pf = 5.1300e-03,  CoV = 0.097
     levels = 3,  thresholds = [9.51, 2.21, 0.0]

Summary#

[8]:

import pandas as pd

rows = [
    ("Exact",          beta_exact, Pf_exact, "—"),
    ("FORM",           beta_form,  Pf_form,  f"{form.model.getCallFunction()} LSF calls"),
    ("CMC",            beta_cmc,   Pf_cmc,   "20 000 samples"),
    ("Line Sampling",  beta_ls,    Pf_ls,    f"{ls.n_samples} lines"),
    ("Subset Sim.",    beta_ss,    Pf_ss,    f"{ss.n_levels} × 2 000 samples"),
]

df = pd.DataFrame(rows, columns=["Method", "beta", "Pf", "Cost"])
df["beta"] = df["beta"].map("{:.4f}".format)
df["Pf"]   = df["Pf"].map("{:.4e}".format)
df

[8]:

	Method	beta	Pf	Cost
0	Exact	2.5526	5.3455e-03	—
1	FORM	5.0003	2.8615e-07	164 LSF calls
2	CMC	2.5491	5.4000e-03	20 000 samples
3	Line Sampling	3.0925	9.9239e-04	200 lines
4	Subset Sim.	2.5669	5.1300e-03	3 × 2 000 samples

Key observations#

FORM gives \(\beta = r = 5\), but the true reliability index is \(\approx 2.56\). The curved failure surface in 10 dimensions causes FORM to underestimate \(p_f\) by several orders of magnitude.
CMC recovers the correct failure probability but requires 20 000 samples for only modest precision (~10 % CoV).
Line Sampling underestimates \(p_f\) for this problem because the hypersphere failure region is bilateral: for any chosen direction \(\boldsymbol{\alpha}\), failure occurs on both sides (\(c > c^*\) and \(c < -c^*\)). LS finds only the positive (safe→failure) crossing and contributes \(\Phi(-c^*)\), silently missing the symmetric contribution from the other tail. LS is most effective when the failure region is one-sided relative to \(\boldsymbol{\alpha}\), as in linear or mildly nonlinear limit states.
Subset Simulation adaptively finds the intermediate thresholds and reaches the failure region without knowing \(p_f\) in advance, using a small number of samples per level.

Example 2: Parabolic Limit State#

Problem Description#

The parabolic limit state is the canonical LS benchmark [Breitung1984]:

\[g(\mathbf{u}) = \beta_0 - u_1 + \kappa \sum_{i=2}^{n} u_i^2, \quad u_i \sim \mathcal{N}(0,1)\]

Failure is one-sided: it is driven by large \(u_1\), with the secondary variables introducing a parabolic correction. For \(\kappa < 0\) the failure surface curves toward the origin, making the true \(p_f\) larger than the first-order estimate.

The FORM design point is \(\mathbf{u}^* = (\beta_0, 0, \ldots, 0)\) with important direction \(\boldsymbol{\alpha} = \mathbf{e}_1\), giving \(\beta_\text{FORM} = \beta_0\) exactly. The Breitung SORM approximation corrects for the \(n-1\) equal principal curvatures \(\kappa_i = 2\kappa\):

\[p_f^\text{SORM} = \Phi(-\beta_0)\,(1 + 2\kappa\beta_0)^{-(n-1)/2}\]

This serves as the reference solution.

[9]:

beta0 = 3.0
kappa = -0.05   # negative → failure surface curves toward origin
# n = 10 (reused from Example 1)

# SORM (Breitung) reference  — principal curvatures kappa_i = 2*kappa
Pf_sorm  = float(norm.cdf(-beta0) * (1 + 2*kappa*beta0)**(-(n-1)/2))
beta_sorm = float(-norm.ppf(Pf_sorm))
Pf_form2  = float(norm.cdf(-beta0))

print(f"FORM (exact): beta = {beta0:.4f},  Pf = {Pf_form2:.4e}")
print(f"SORM ref:     beta = {beta_sorm:.4f},  Pf = {Pf_sorm:.4e}")
print(f"Correction factor: {Pf_sorm / Pf_form2:.1f}x")

FORM (exact): beta = 3.0000,  Pf = 1.3499e-03
SORM ref:     beta = 2.4719,  Pf = 6.7199e-03
Correction factor: 5.0x

Problem Setup#

[10]:

def lsf_parabolic(**kwargs):
    """Parabolic limit state. Scalable via **kwargs — works for any n."""
    vals = list(kwargs.values())
    u1 = vals[0]
    return beta0 - u1 + kappa * sum(v**2 for v in vals[1:])


def make_parabolic_model():
    model = ra.StochasticModel()
    for i in range(1, n + 1):
        model.addVariable(ra.Normal(f"X{i}", 0.0, 1.0))
    limit_state = ra.LimitState(lsf_parabolic)
    options = ra.AnalysisOptions()
    options.setPrintOutput(False)
    return model, limit_state, options

FORM Solution#

[11]:

# FORM — design point is (beta0, 0,...,0) by construction
model_p, limit_state_p, options_p = make_parabolic_model()
form_p = ra.Form(analysis_options=options_p, stochastic_model=model_p,
                 limit_state=limit_state_p)
form_p.run()
beta_form_p = form_p.getBeta()
Pf_form_p   = form_p.getFailure()
print(f"FORM:  beta = {beta_form_p:.4f},  Pf = {Pf_form_p:.4e}")

FORM:  beta = 3.0000,  Pf = 1.3499e-03

CMC Solution#

[12]:

np.random.seed(4)
model_p, limit_state_p, options_p = make_parabolic_model()
options_p.setSamples(20_000)
cmc_p = ra.CrudeMonteCarlo(analysis_options=options_p, stochastic_model=model_p,
                            limit_state=limit_state_p)
cmc_p.run()
beta_cmc_p = cmc_p.getBeta()
Pf_cmc_p   = cmc_p.getFailure()
print(f"CMC:   beta = {beta_cmc_p:.4f},  Pf = {Pf_cmc_p:.4e}")

CMC:   beta = 2.5006,  Pf = 6.2000e-03

LineSampling#

[13]:

np.random.seed(5)
model_p, limit_state_p, options_p = make_parabolic_model()
options_p.setSamples(200)
ls_p = ra.LineSampling(analysis_options=options_p, stochastic_model=model_p,
                       limit_state=limit_state_p, form=form_p)
ls_p.run()
beta_ls_p = ls_p.getBeta()
Pf_ls_p   = ls_p.getFailure()
print(f"LS:    beta = {beta_ls_p:.4f},  Pf = {Pf_ls_p:.4e},  CoV = {ls_p.cov:.3f}")

LS:    beta = 2.4757,  Pf = 6.6490e-03,  CoV = 0.068

Subset Simulation#

[14]:

np.random.seed(6)
model_p, limit_state_p, options_p = make_parabolic_model()
options_p.setSamples(2_000)
ss_p = ra.SubsetSimulation(analysis_options=options_p, stochastic_model=model_p,
                            limit_state=limit_state_p, p0=0.1)
ss_p.run()
beta_ss_p = ss_p.getBeta()
Pf_ss_p   = ss_p.getFailure()
print(f"SS:    beta = {beta_ss_p:.4f},  Pf = {Pf_ss_p:.4e},  CoV = {ss_p.cov:.3f}")
print(f"       levels = {ss_p.n_levels},  thresholds = {[round(t,2) for t in ss_p.thresholds]}")

SS:    beta = 2.5669,  Pf = 5.1300e-03,  CoV = 0.097
       levels = 3,  thresholds = [1.26, 0.22, 0.0]

Summary#

[15]:

rows_p = [
    ("SORM ref.",       beta_sorm,    Pf_sorm,    "analytical"),
    ("FORM",            beta_form_p,  Pf_form_p,  f"{form_p.model.getCallFunction()} LSF calls"),
    ("CMC",             beta_cmc_p,   Pf_cmc_p,   "20 000 samples"),
    ("Line Sampling",   beta_ls_p,    Pf_ls_p,    f"{ls_p.n_samples} lines"),
    ("Subset Sim.",     beta_ss_p,    Pf_ss_p,    f"{ss_p.n_levels} × 2 000 samples"),
]
df_p = pd.DataFrame(rows_p, columns=["Method", "beta", "Pf", "Cost"])
df_p["beta"] = df_p["beta"].map("{:.4f}".format)
df_p["Pf"]   = df_p["Pf"].map("{:.4e}".format)
df_p

[15]:

	Method	beta	Pf	Cost
0	SORM ref.	2.4719	6.7199e-03	analytical
1	FORM	3.0000	1.3499e-03	28 LSF calls
2	CMC	2.5006	6.2000e-03	20 000 samples
3	Line Sampling	2.4757	6.6490e-03	200 lines
4	Subset Sim.	2.5669	5.1300e-03	3 × 2 000 samples

Key observations#

FORM recovers the design point exactly (\(\beta = \beta_0 = 3\)) but ignores the parabolic curvature, underestimating \(p_f\) by ~5×.
CMC gives an unbiased estimate but requires many samples for adequate precision at this failure probability level.
Line Sampling corrects the FORM result accurately with only 200 function evaluations. Because failure is one-sided (driven by large \(u_1\) along \(\boldsymbol{\alpha} = \mathbf{e}_1\)), each line crosses the failure surface exactly once on the positive side, and the \(\Phi(-c_i^*)\) contribution is correct. This is LS working as designed.
Subset Simulation also gives an accurate result, adaptively discovering the intermediate thresholds without prior knowledge of \(p_f\).

References#

Breitung, K. (1984). Asymptotic approximations for multinormal integrals. Journal of Engineering Mechanics, 110(3), 357–366.

Au, S. K., & Beck, J. L. (2001). Estimation of small failure probabilities in high dimensions by subset simulation. Probabilistic Engineering Mechanics, 16(4), 263–277.