Bootstrapped confidence intervals in routine analysis

How to obtain bootstrapped confidence intervals for simple routine operations.

Unless specified otherwise, the function fit will return moment-based confidence intervals based on the covariance matrix of the objective function used to fit the data. These are quick to calculate and therefore very comfortable for quick estimates of the uncertainty during routine analysis or testing.

However, for publication-level analysis, these confidence intervals might be inaccurate. It is strongly recommended to employ bootstrapped confidence intervals to get accurate estimates of the uncertainty. Conviniently, fit integrates bootstrapping to make it accessible via the keyword argument bootstrap which specifies the number of samples to analyze to estimate the uncertainty. The larger this number, the more accurate the confidence intervals but the longer the analysis will be. The standard for publication is typically 1000 samples.

import numpy as np
import matplotlib.pyplot as plt
import deerlab as dl
# File location
path = '../data/'
file = 'example_4pdeer_1.DTA'

# Experimental parameters
tau1 = 0.3      # First inter-pulse delay, μs
tau2 = 4.0      # Second inter-pulse delay, μs
tmin = 0.1      # Start time, μs

# Load the experimental data
t,Vexp = dl.deerload(path + file)

# Pre-processing
Vexp = dl.correctphase(Vexp) # Phase correction
Vexp = Vexp/np.max(Vexp)     # Rescaling (aesthetic)
t = t - t[0]                 # Account for zerotime
t = t + tmin

# Distance vector
r = np.linspace(2,5,100) # nm

# Construct the model
Vmodel = dl.dipolarmodel(t,r, experiment = dl.ex_4pdeer(tau1,tau2, pathways=[1]))

# Fit the model to the data
results = dl.fit(Vmodel,Vexp,bootstrap=20)

# In this example, just for the sake of time, we will just use 20 bootstrap samples.

# Print results summary
print(results)
Bootstrap analysis with 1 cores:

  0%|          | 0/20 [00:00<?, ?it/s]
  5%|▌         | 1/20 [00:09<02:55,  9.23s/it]
 10%|█         | 2/20 [00:18<02:43,  9.06s/it]
 15%|█▌        | 3/20 [00:27<02:35,  9.12s/it]
 20%|██        | 4/20 [00:36<02:26,  9.14s/it]
 25%|██▌       | 5/20 [00:43<02:11,  8.75s/it]
 30%|███       | 6/20 [00:50<01:58,  8.49s/it]
 35%|███▌      | 7/20 [00:59<01:50,  8.52s/it]
 40%|████      | 8/20 [01:08<01:42,  8.53s/it]
 45%|████▌     | 9/20 [01:17<01:34,  8.59s/it]
 50%|█████     | 10/20 [01:24<01:24,  8.49s/it]
 55%|█████▌    | 11/20 [01:34<01:16,  8.55s/it]
 60%|██████    | 12/20 [01:43<01:08,  8.61s/it]
 65%|██████▌   | 13/20 [01:51<01:00,  8.57s/it]
 70%|███████   | 14/20 [02:00<00:51,  8.64s/it]
 75%|███████▌  | 15/20 [02:10<00:43,  8.68s/it]
 80%|████████  | 16/20 [02:19<00:34,  8.71s/it]
 85%|████████▌ | 17/20 [02:29<00:26,  8.80s/it]
 90%|█████████ | 18/20 [02:36<00:17,  8.71s/it]
 95%|█████████▌| 19/20 [02:45<00:08,  8.73s/it]
100%|██████████| 20/20 [02:56<00:00,  8.80s/it]
100%|██████████| 20/20 [02:56<00:00,  8.80s/it]
100%|██████████| 20/20 [02:56<00:00,  8.80s/it]
Goodness-of-fit:
========= ============= ============= ===================== =======
 Dataset   Noise level   Reduced 𝛘2    Residual autocorr.    RMSD
========= ============= ============= ===================== =======
   #1         0.005         0.812             0.124          0.004
========= ============= ============= ===================== =======
Model hyperparameters:
==========================
 Regularization parameter
==========================
          0.053
==========================
Model parameters:
=========== ========= ========================= ====== ======================================
 Parameter   Value     95%-Confidence interval   Unit   Description
=========== ========= ========================= ====== ======================================
 mod         0.304     (0.300,0.307)                    Modulation depth
 reftime     0.299     (0.298,0.301)              μs    Refocusing time
 conc        147.220   (142.578,152.158)          μM    Spin concentration
 P           ...       (...,...)                 nm⁻¹   Non-parametric distance distribution
 P_scale     1.006     (0.988,1.046)             None   Normalization factor of P
=========== ========= ========================= ====== ======================================
# Extract fitted dipolar signal
Vfit = results.model
Vci = results.propagate(Vmodel).ci(95)

# Extract fitted distance distribution
Pfit = results.P
Pci95 = results.PUncert.ci(95)
Pci50 = results.PUncert.ci(50)

# Extract the unmodulated contribution
Bfcn = lambda mod,conc,reftime: results.P_scale*(1-mod)*dl.bg_hom3d(t-reftime,conc,mod)
Bfit = results.evaluate(Bfcn)
Bci = results.propagate(Bfcn).ci(95)

plt.figure(figsize=[6,7])
violet = '#4550e6'
plt.subplot(211)
# Plot experimental data
plt.plot(t,Vexp,'.',color='grey',label='Data')
# Plot the fitted signal
plt.plot(t,Vfit,linewidth=3,label='Bootstrap median',color=violet)
plt.plot(t,Bfit,'--',linewidth=3,color=violet,label='Unmodulated contribution')
plt.legend(frameon=False,loc='best')
plt.xlabel('Time $t$ (μs)')
plt.ylabel('$V(t)$ (arb.u.)')
# Plot the distance distribution
plt.subplot(212)
plt.plot(r,Pfit,linewidth=3,label='Bootstrap median',color=violet)
plt.fill_between(r,Pci95[:,0],Pci95[:,1],alpha=0.3,color=violet,label='95%-Conf. Inter.',linewidth=0)
plt.fill_between(r,Pci50[:,0],Pci50[:,1],alpha=0.5,color=violet,label='50%-Conf. Inter.',linewidth=0)
plt.legend(frameon=False,loc='best')
plt.autoscale(enable=True, axis='both', tight=True)
plt.xlabel('Distance $r$ (nm)')
plt.ylabel('$P(r)$ (nm$^{-1}$)')
plt.tight_layout()
plt.show()
ex bootstrapping

Total running time of the script: (4 minutes 1.554 seconds)

Gallery generated by Sphinx-Gallery