Skip to contents

The R package pointapply contains the code and data to reconstruct the publication: Martin Schobben, Michiel Kienhuis, and Lubos Polerecky. 2021. New methods to detect isotopic heterogeneity with Secondary Ion Mass Spectrometry, preprint on Eartharxiv.

Introduction

This vignette shows how ion count data of a Secondary Ion Mass Spectrometry (SIMS) isotope analysis is simulated, which forms the basis to test the performance of th intra- and inter-analysis isotope test as introduced in the paper (Section 4.1 Simulated data). The accompanying vignette Visualise test performance (vignette("performance")) provides the code to plot the synthetic data produced here (Figs 4 and 5).

The following packages are used for the simulation.

library(point) # regression diagnostics
library(pointapply) # load package

The performance of the intra- and inter-analysis isotope tests bundled in the point package are tested with the simulated (or synthetic) data.

Simulate intra-analysis isotope variability data

The first simulation is designed to assess the sensitivity of the intra-analysis isotope variability test for two continuous variables; 1) ionization efficiency and, 2) the isotope offset between two components. The ionization efficiency is a systematic fluctuation in the emitted secondary ions, that should normally be mirrored in strength among both isotopes of the same element (Fitzsimons, Harte, and Clark 2000) (and see the paper). The ionization effect will have a bearing on the intra-analysis isotope variability test, as it is a regression-based method, where variance of the independent variable yields more precise estimation of the coefficients in the linear model. See the figure below to see how the difference in ionization efficiency affects how data points cluster, where a dense cluster signifies the absence of variability in in ionization efficiency and an elongated cluster with increased variance of ionization.

Conceptual sketches for ionization efficiency.

Conceptual sketches for ionization efficiency.

In the above representation, the label “predicted” and “true” refer to a band of two times the predicted and descriptive standard deviation of \(X^a\) around the mean, respectively. The difference in those two error bars during fluctuating ionization is referred to in the paper as “excess ionization”.

The isotope offset encompasses two end-member components that cause intra-analysis isotope variability in the analysis, and this is one of the foremost targets in this study. Hence, a range of 22.0‰ is chosen to mimic the divergent isotope composition of carbonate and organic carbon inclusion hosted in a predominant carbonate matrix as a realistic scenario for a SIMS isotope analyses. Whereas, the range in ionization efficiency is varied from 0 to 120%, or 0 to 34% excess ionization. To exemplify this, the following figure shows a scenario of an organic component with a \(\delta\)13C of 32.0‰ in a predominant carbonate matrix with a \(\delta\)13C of 0‰.

Conceptual sketches for isotopic offset.

Conceptual sketches for isotopic offset.

Parameters

For the sensitivity test (test_sensitivity), excess ionization and the isotopic offset are crossed for all possible combination with the tidyr (Wickham and Girlich 2022) function crossing(). In addition, a 10 times repetition is included with a specified seed for each of the repetitions to ensure that the study is reproducible.

# varying linear trends in the ionization efficiency (percent common isotope)
var_T <- seq(0, 120, length.out = 12)
# varying isotope offset (delta per mille)
var_I <- seq(0, -22, length.out = 12)

Conceptual models of intra-analysis isotope variability

Besides the isotope-offset introduced above, two scenarios of intra-analysis isotope variability are tested (Supplementary Section 1.2: Synthetic data of the paper). The first scenario, referred to as “asymmetric” intra-analysis isotope variation, would approximate, for example, a situation in which the primary ion beams cuts through an organic inclusion (component A) with depth within a predominant carbonate matrix (component B), and thus both the matrix and isotopic composition deviate between components A and B. In a second scenario of “symmetric” intra-analysis isotope variation, a simulated gradient in the isotopic composition traverses the complete analyses.

Conceptual sketches for intra-analysis R variation.

Conceptual sketches for intra-analysis R variation.

Execute intra-analysis isotope variability simulation

The function sensitivity_test is especially designed to simulate SIMS isotope count data. The default arguments .ion1 = "13C", ion2 = "12C" and .reference = "VPDB" define the isotope system and reference scale used to simulate the data. Together with the previous defined ranges for the sensitivity test (excess ionization and the isotopic offset) this produces the simulated data-set with intra-analysis isotope variability. The execution time of this function benefits from parallelization and the number of used workers can be set with the argument mc_cores. The generated data in the appropriate directory (data directory of the package).

# sensitivity
test_sensitivity(var_T, var_I, reps = 10, mc_cores = 4, save = TRUE)

Sensitivity intra-analysis isotope variability test

The function diag_R() of the point package are the core functions for assessing intra-analysis isotope variability (as introduced in the paper). To gauge the performance of this test is analysed by applying the function to the synthetic data.

Execute intra-analysis isotope variability test

Two types of outlier detection are available; \(\sigma_R\)-rejection method (default method in the Cameca software) and the Cook’s Distance measure (see paper for details).

Cooks D

The Cook’s D method is the default method of the point function diag_R(), and is therefore simple to execute.

# performace Cooks D diagnostics
test_sensitivity(var_T, var_I, reps = 10, diag = "CooksD", mc_cores = 4, 
                 save = TRUE)

Cameca

The \(\sigma_R\)-rejection method requires the .method of diag_R() to be set to "Cameca". In addition, to faithfully mimic the behaviour of the Cameca software the grouping structure needs to be adjusted slightly, where this software uses an intermediate (but artificial) subdivision of a single analysis; the so-called “block” (variable: bl.nm), which consists generally of ~50 measurements. To accommodate this, the diag_R() and eval_diag() functions are successively applied with a respective grouping structure for “blocks” and then “single analysis”. Note, that the .output of diag_R() is set to "diagnostic" to only produce outlier detection results. The application of eval_diag() produces now mean values for the single blocks, which are irrelevant for the subsequent study, as such the tibble (Müller and Wickham 2021) is filtered for distinct analysis only by using distinct() (Wickham et al. 2022). This functionality has all been wrapped in the function test_sensitivity, which, furthermore, allows parallel execution (by setting mc_cores to an appropriate number) of this data intensive exercize.

# performance Cameca diagnostics
test_sensitivity(var_T, var_I, reps = 10, diag = "Cameca", mc_cores = 4, 
                 save = TRUE)

Simulate inter-analysis isotope variability data

The simulated data to assess the performance of the inter-analysis isotope variability test encompasses the same rang as for the intra-analysis isotope variability test. However, for this simulation, no intra-analysis isotope variance is included (instead the ideal linear R model is adopted), and only the starting value is varied over a range of 12.0‰ with again the VPDB isotope scale as a reference framework.

parameters

The same procedure, but with a somewhat different parameter-set, is used to cross all possible combination with again 10 repetitions.

# varying linear trends in the ionization efficiency (percent common isotope)
var_T <- seq(0, 120, length.out = 12)
# varying isotope base (delta per mille)
var_B <- seq(0, -11, length.out = 12)

Execute inter-analysis isotope simulation

Execution of the simulation (test_sensitivity()) for inter-analysis isotope data follows the same procedure as previously outlined for the intra-analysis isotope data simulation.

Post-simulation processing

The protocol for simulating the inter-analysis isotope dataset differs from the intra-analysis isotope variability protocol in that it includes a post-simulation processing step. Here, the \(m\)-series of analyses (a typical study along a transect) is recombined by randomly replacing one out of the 10 samples of the 0‰ (VPDB) set with an analysis that has an isotope value along the predefined range of -12–0‰. This replacement is repeated for each of the initial isotope values as defined in the starting parameters, and repeated 10 times, which produces 10 times 10-series of analyses with one anomalous isotope value, for each of the starting isotope values. This functionality is wrapped in test_sensitivity() , which again, directly writes the results to the correct directory.

Sensitivity intra-analysis isotope variability test

The intra-analysis isotope variability test is also encapsulated in the sequential usage of diag_R() of the point package. However, for this sensitivity run the grouping is a little different as it has to point to the \(m\)-series of analyses, instead of single measurements (in the \(n\)-series of measurements. One note of caution, running this code requires a high workload from your computer.

# performance inter-analysis diagnostics
test_sensitivity(var_T, var_B, reps = 10, mc_cores = 4, type = "inter", 
                 save = TRUE)

References

Fitzsimons, I. C. W., B. Harte, and R. M. Clark. 2000. “SIMS stable isotope measurement: counting statistics and analytical precision.” Mineralogical Magazine 64 (01): 59–83. https://doi.org/10.1180/002646100549139.

Müller, Kirill, and Hadley Wickham. 2021. Tibble: Simple Data Frames. https://CRAN.R-project.org/package=tibble.

Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2022. Dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.

Wickham, Hadley, and Maximilian Girlich. 2022. Tidyr: Tidy Messy Data. https://CRAN.R-project.org/package=tidyr.