1 Precision of Ion Count Data

The emission of secondary ions from an analytical substrate (e.g. a rock sample) during sputtering is a random process that can be described by Poisson statistics. Under ideal circumstances, Poisson statistics can therefore be used to predict the precision of pulsed ion counts (e.g. measurements with a Cameca NanoSIMS 50L). More specifically, the expected variation can be deduced from the total counts of secondary ions. These predictive values can be compared with the descriptive statistics, which are estimates of the true population location (e.g. mean) and spread (e.g. variance). This comparison requires the assumption that the sample is taken from an infinite population. The package point provides tools that perform these statistical tests on raw ion count data, with appropriate error propagation in the case of isotope ratios. Working with raw ion count data has the benefit that parts of an analysis can be subset after checking for anomalous measurements. This is described in more detail in the vignette IC-diagnostics, which relies heavily on the functions outlined here.

library(point) # load package

The following packages are used in the examples that follow.

library(dplyr) # manipulating data
library(purrr) # functional programming
library(stringr) # manipulating strings

1.1 Nomenclature

  • Sample: sample of the true population
  • Analytical substrate: physical sample measured during SIMS analysis
  • Event: single event of an ion hitting the detector
  • Measurement: single count cycle \(N_i\)
  • Analysis: \(n\)-series of measurements \(N_{(i)} = M_j\)
  • Study: \(m\)-series of analyses \(M_{(j)}\), constituting the different spots on the analytical substrate

1.2 Example dataset

Example datasets can be accessed with the function read_IC() as follows (more information on reading raw ion count data can be found in the vignette IC-read).

# Use point_example() to access the examples bundled with this package 

# Carry-out the routine point work-flow
# Raw data containing 13C and 12C counts on carbonate
tb_rw <- read_IC(point_example("2018-01-19-GLENDON"), meta = TRUE, hide = FALSE)

# Vectors of isotope ratios 
ion1 <-  c("13C", "12C 13C", "13C 14N", "12C 14N", "12C")
ion2 <-  c("12C", "12C2", "12C 14N", "40Ca 16O", "40Ca 16O")

# Call function over vectors
tb_rw <- map2(ion1, ion2, ~zeroCt(tb_rw, .x, .y, file.nm, sample.nm, .N = N.rw)) 
# Combine but remove duplicate observations (related isotope pairs)  
tb_rw <- reduce(tb_rw, union)  

The ion counts obtained from the 2018-01-19-GLENDON dataset include the species \({}^{12}\mathrm{C}_{}\), \({}^{13}\mathrm{C}_{}\), \({}^{12}\mathrm{C}_{}\)\({}^{14}\mathrm{N}_{}\), and \({}^{40}\mathrm{Ca}_{}\)\({}^{16}\mathrm{O}_{}\), some of which are polyatomic.

As a first step, the counts of a single count cycle (\(N_i\)) are normalised against the time it took to complete the cycle (\(0.541\) s). This accounts for differences in the count times of different isotopes during stable isotope SIMS analysis. Hence, for the time period (\(t\)) over which the counts of isotope species \(a\) accumulated during measurement \(i\), the count rate is given by

\[\begin{equation} X_i^{a} = N_i^{a} / t_i^{a} \tag{1.1} \end{equation}\]

The function cor_IC() can perform this transformation.

# Processing raw ion count data
tb_pr <- cor_IC(tb_rw)

This function can also correct the ion counts for effects associated with the machine setup, such as artifacts induced by the type of ion detector. These settings mostly affect the accuracy of the analysis. For more information on this topic, see the vignette IC-process.
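The normalisation of Eq. (1.1) can be illustrated with a minimal base R sketch that does not depend on the package point. The count values and the object names (N_13C, N_12C, t_13C, t_12C) are hypothetical and only serve as an illustration; equal count times for both species are an assumption.

```r
# Hypothetical counts for two species over three measurement cycles
N_13C <- c(180, 175, 190)
N_12C <- c(16500, 16620, 16480)

# Count time per cycle in seconds (0.541 s, as quoted in the text);
# assumed to be identical for both species in this sketch
t_13C <- 0.541
t_12C <- 0.541

# Eq. (1.1): count rate = counts / count time
X_13C <- N_13C / t_13C
X_12C <- N_12C / t_12C

X_13C
X_12C
```

If the count times of the two species differ, the count rates rather than the raw counts are the quantities that can be compared directly.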

1.3 Internal precision of ion count data

Internal analytical precision is a consistency check of a series of analytical results. The package point contains several functions to obtain the descriptive and predictive (Poisson) statistics needed to assess the internal precision of count data for single ions as well as isotope ratios. In the examples outlined here, the internal consistency of ion count data generated with a Cameca NanoSIMS 50L is validated, and the underlying principles of the statistical treatment are explained.

1.3.1 Descriptive and predictive statistics for single ions

The function stat_X() can be applied to the previously processed dataset and gives descriptive and predictive statistics for all individual ions. The function requires the argument .IC, a tibble containing the processed ion counts (N.pr by default), the ion count rates (X.pr by default), the ion species names (species.nm by default), and the time increments of the measurements (t.nm by default). The dots ... should be used to define grouping variables that identify, for example, individual analyses (here the variables sample.nm and file.nm of the loaded data). In addition, the argument .label controls whether the variable names are rendered with \(\LaTeX\), and the argument .output controls whether the function returns a tibble that contains only the statistics as a summary table ("sum"), a tibble with the same number of observations as the input dataset ("stat"), or a tibble with both the statistics and the original dataset ("complete"). Lastly, a subset of the statistics can be selected with .stat. An overview of the available statistics can be found in names_stat_X.

# Single ion descriptive and predictive statistics for all measured ions
tb_X <- stat_X(tb_pr, sample.nm, file.nm, .stat = c("tot", "M", "S", "SeM"), 
               .label = "webtex")

Table 1.1: Summary statistics for internal precision of single ions with stat_X().

| sample.nm | file.nm | species.nm | \(N_{tot}\) | \(\bar{X}\) | \(s_{X}\) | \(s_{\bar{X}}\) |
|---|---|---|---|---|---|---|
| Belemnite,Indium | 2018-01-19-GLENDON_1_1 | \({}^{12}\mathrm{C}_{}\) | 41718475 | 30616 | 2738.1 | 43.85 |
| Belemnite,Indium | 2018-01-19-GLENDON_1_1 | \({}^{12}\mathrm{C}_{}\)\({}^{14}\mathrm{N}_{}\) | 511093 | 375 | 220.4 | 3.53 |
| Belemnite,Indium | 2018-01-19-GLENDON_1_1 | \({}^{13}\mathrm{C}_{}\) | 458139 | 336 | 42.5 | 0.68 |
| Belemnite,Indium | 2018-01-19-GLENDON_1_1 | \({}^{40}\mathrm{Ca}_{}\)\({}^{16}\mathrm{O}_{}\) | 18082538 | 13270 | 1856.0 | 29.72 |
| Belemnite,Indium | 2018-01-19-GLENDON_1_2 | \({}^{12}\mathrm{C}_{}\) | 72956119 | 53541 | 4072.9 | 65.22 |
| Belemnite,Indium | 2018-01-19-GLENDON_1_2 | \({}^{12}\mathrm{C}_{}\)\({}^{14}\mathrm{N}_{}\) | 362709 | 266 | 114.2 | 1.83 |

The underlying principles of the statistics calculated with the function stat_X() are described in detail below.

1.3.1.1 Arithmetic mean

The sample mean (\(\bar{X}^a\)) of chemical species \(a\) over a single analysis is given by:

\[\begin{equation} \bar{X}^a = \frac{1}{n} \sum_{i=1}^{n} X_i^a \tag{1.2} \end{equation}\]

To validate the internal consistency of the ion count data, it is necessary to define the internal precision of the analysis. This can be done with the standard deviation (\(s_X\)), which gives the spread of the sample, and the standard error of the mean (\(s_{\bar{X}}\)), which defines how well \(\bar{X}\) approximates the true population mean (\(\mu\)). These statistics rely on the assumption that the underlying probability distribution is normal (Gaussian).

1.3.1.2 Standard deviation

The standard deviation for a limited sample of the population gives a measure of how individual measurements are spread about the mean in one analysis, and is given by:

\[\begin{equation} s_{X^a} = \sqrt{\sum_{i=1}^{n} \frac{(X_{i}^a-\bar{X}^a)^2}{n-1}} \tag{1.3} \end{equation}\]

where \(n\) is the number of measurement cycles in the analysis and \(X_i^a\) is the count rate of the \(i\)-th measurement cycle. The number of measurements is reduced by one (\(n - 1\)) because only \(n - 1\) of the squared deviations \((X_{i}^a-\bar{X}^a)^2\) are independent. The sample standard deviation indicates how confident one can be that a single measurement falls within a given range of the sample mean.

1.3.1.3 Standard error of the mean

The standard error of the mean (\(s_{\bar{X}^a}\)) provides a measure of how well the mean of a limited sample (i.e., analysis) approximates the actual population mean. This measure can be used to gauge the precision of the analysis with \(n\) measurement cycles. This value is dependent on the number of measurements (\(n\)) and thus becomes smaller with increasing measurement numbers (i.e. \(\bar{X}\) becomes more precise). The standard error of the mean is given by the following equation.

\[\begin{equation} s_{\bar{X}^a} = \frac{s_{X^a}}{\sqrt{n}} \tag{1.4} \end{equation}\]
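Eqs. (1.2) to (1.4) can be written out in a few lines of base R. This is a minimal sketch on simulated counts, not the implementation used by stat_X(); the object names (M_X, S_X, SeM_X), the Poisson rate of 180 counts per cycle, and the number of cycles are assumptions chosen for illustration.

```r
set.seed(1)

n    <- 4000                    # number of measurement cycles (assumed)
t_ct <- 0.541                   # count time per cycle in seconds
N    <- rpois(n, lambda = 180)  # simulated counts per cycle (assumed rate)
X    <- N / t_ct                # count rates, Eq. (1.1)

# Eq. (1.2): arithmetic mean
M_X <- sum(X) / n

# Eq. (1.3): sample standard deviation with n - 1 in the denominator
S_X <- sqrt(sum((X - M_X)^2) / (n - 1))

# Eq. (1.4): standard error of the mean
SeM_X <- S_X / sqrt(n)

# The explicit formulas agree with the built-in functions mean() and sd()
isTRUE(all.equal(M_X, mean(X)))
isTRUE(all.equal(S_X, sd(X)))
```

Because the standard error of the mean scales with \(1/\sqrt{n}\), quadrupling the number of measurement cycles halves it, all else being equal.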

1.3.1.4 Predicted standard deviation

Ion count measurements have an inherent, fundamental imprecision, which is dictated by the random nature of secondary ion production. This restricts the precision of the analysis to a certain analytical threshold. The amplitude of this inherent variation can be gauged with Poisson statistics. The Poisson distribution describes the likelihood of random events occurring over a defined (and fixed) time period. Two further conditions must be satisfied for the assumption of a Poisson distribution to be valid: the event must be able to occur on a large number of occasions, and the probability of the event occurring on any particular occasion must be small but constant. In the case of SIMS measurements, \(N_i\) is the number of secondary ions counted by the detector during a single measurement cycle (see Fitzsimons, Harte, and Clark 2000).

The predicted standard deviation of a whole analysis is directly related to the population mean of \(N_{(i)}\) (\(\mu_{N}\)) by the equation:

\[\begin{equation} \sigma = \sqrt{\mu_{N}} \tag{1.5} \end{equation}\]

In this formulation the population mean of \(N_{(i)}\) (\(\mu_{N}\)) can be substituted by the mean number of events (i.e. secondary ion counts) per time unit, or \(\bar{N}\). The predicted standard deviation can therefore be deduced from the mean number of counts for that particular ion per analysis, as follows

\[\begin{equation} \hat{s}_{N^a} = \sqrt{\bar{N}^a} \tag{1.6} \end{equation}\]

where:

\[\begin{equation} \bar{N}^a = \frac{1}{n}\sum_{i=1}^{n}N_i^a \tag{1.7} \end{equation}\]

In this formulation, the hat on \(\hat{s}_N\) denotes that the statistic is predictive, in contrast to \(s_X\), which is an observed value. What the two measures have in common, however, is that both are estimates of the true population \(\sigma\).

1.3.1.5 Predicted standard error of the mean

In a similar fashion, the standard error of the mean for Poisson statistics depends on the number of measurements (\(n\)), and can be formulated as follows:

\[\begin{equation} \hat{s}_{\bar{N}^a} = \sqrt{\left( \frac{ \bar{N}^a}{n}\right)} \tag{1.8} \end{equation}\]
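The predicted statistics of Eqs. (1.6) to (1.8) can be compared with the observed spread in a minimal base R sketch. The counts are simulated from an ideal Poisson process, so the observed and predicted values should lie close together; the object names and the rate of 180 counts per cycle are assumptions for illustration only.

```r
set.seed(2)

n <- 4000                    # number of measurement cycles (assumed)
N <- rpois(n, lambda = 180)  # simulated counts per cycle (assumed rate)

# Eq. (1.7): mean number of counts per cycle
M_N <- mean(N)

# Eq. (1.6): predicted standard deviation
hat_S_N <- sqrt(M_N)

# Eq. (1.8): predicted standard error of the mean
hat_SeM_N <- sqrt(M_N / n)

# Observed (descriptive) counterpart of the predicted standard deviation
S_N <- sd(N)

c(observed = S_N, predicted = hat_S_N)
```

For real measurements, an observed standard deviation that is clearly larger than the predicted one points to sources of variation beyond counting statistics.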

1.3.2 Descriptive and predictive statistics for isotope ratios

The function stat_R() can be used on the previously processed dataset and gives descriptive and predictive statistics for a pre-specified isotope ratio (\(R\)), e.g. \(^{13}\)C/\(^{12}\)C. Note that only isotope pairs give sensible statistical results, as the ionization potentials of two isotopes of one element should be relatively similar. The function requires the ion count data (.IC) in a tibble as outlined for stat_X(), together with .ion1, a character string representing the rare isotope (e.g. "13C"), and .ion2, a character string representing the common isotope (e.g. "12C"). The dots ... should again be used to define grouping variables for an analysis (here the sample and file names of the loaded data). The arguments .label and .output tailor the generated tibble to specific needs and follow the same definitions as outlined for stat_X() (see above). This function has an additional argument named .zero (default is TRUE), which removes analyses that contain measurements with zero counts and generates a warning to inform about this operation. Setting .zero to TRUE prevents the generation of NaN in the output statistics caused by division by zero.

# Descriptive and predictive statistics for 13C/12C ratios
tb_R <- stat_R(tb_pr, "13C", "12C", sample.nm, file.nm, .label = "webtex", 
               .stat = c("M", "RS", "RSeM", "hat_RS", "hat_RSeM", "chi2"))

Table 1.2: Summary statistics for internal precision of isotope ratios with stat_R().

| sample.nm | file.nm | ratio.nm | \(\bar{R}\) | \(\epsilon_{R}\) (‰) | \(\epsilon_{\bar{R}}\) (‰) | \(\hat{\epsilon}_{R}\) (‰) | \(\hat{\epsilon}_{\bar{R}}\) (‰) | \(\chi^{2}_{R}\) |
|---|---|---|---|---|---|---|---|---|
| Belemnite,Indium | 2018-01-19-GLENDON_1_1 | 13C/12C | 0.011 | 93.0 | 1.49 | 92.8 | 1.49 | 1.00 |
| Belemnite,Indium | 2018-01-19-GLENDON_1_2 | 13C/12C | 0.011 | 70.8 | 1.13 | 70.1 | 1.12 | 1.02 |
| Belemnite,Indium | 2018-01-19-GLENDON_1_3 | 13C/12C | 0.011 | 66.5 | 1.06 | 65.5 | 1.05 | 1.03 |

The underlying principles are again described in detail below.

1.3.2.1 Descriptive statistics with error propagation for isotope ratios

The mean isotope ratio (\(\bar{R}\)) can be calculated from the mean values of the specific ions of the complete analysis.

\[\begin{equation} \bar{R} = \frac{\frac{1}{n}\sum_{i = 1}^{n} X_i^{b}}{\frac{1}{n}\sum_{i = 1}^{n} X_i^{a}} \tag{1.9} \end{equation}\]

and this value can be considered an estimate of the true isotopic value (\(\mu_R\)). The uncertainties associated with the pulsed ion count rates of the individual variables \(X^{b}\) (e.g. 13C) and \(X^{a}\) (e.g. 12C) need to be combined. This can be achieved by applying the formula for propagation of error (Ku 1966), written here for a function \(F\) of \(p\) input variables \(z_1, \ldots, z_p\):

\[\begin{equation} s_F^{2} \approx \sum_{i = 1}^{p} \left[ \left( \frac{\partial F}{\partial z_i} \right)^{2} s_i^{2} \right] + 2 \sum_{j = 1}^{p-1} \sum_{k = j+1}^{p} \left[ \left( \frac{\partial F}{\partial z_j} \right) \left( \frac{\partial F}{\partial z_k} \right) s_j s_k r_{(z_j, z_k)} \right] \tag{1.10} \end{equation}\]

which ensures proper propagation of the error. In this formulation, \(r_{(z_j, z_k)}\) (abbreviated to \(r_{jk}\)) stands for the correlation coefficient of the variables \(z_j\) and \(z_k\), as defined by

\[\begin{equation} r_{jk} = \frac{1}{\left(n-1\right) s_j s_k} \sum_{i=1}^n{ \left[ \left(z_{j}\right)_i - \bar{z}_j \right] \left[ \left(z_{k}\right)_i - \bar{z}_k \right]} \tag{1.11} \end{equation}\]

and yields an estimate of the sample correlation coefficient, where \(n\) is the number of paired measurements. Its values range between \(-1\) and \(+1\), recording an inverse or a positive linear correlation between the variables, respectively, and no linear correlation if \(r\) falls close to zero. The product of \(r_{(z_j, z_k)}\), \(s_j\), and \(s_k\) is the covariance of the two input variables, so that this product simplifies to \(s_{jk}\). For this calculation, the function cov() from the stats package is used, with the argument method set to "pearson" and the argument use set to "everything".
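The relation between Eq. (1.11) and the covariance can be checked numerically. The following base R sketch uses two simulated, partly correlated variables; the names z_j and z_k follow the notation of the text, whereas the simulated values themselves are arbitrary assumptions.

```r
set.seed(8)

n   <- 200
z_j <- rnorm(n, mean = 100, sd = 5)
z_k <- 0.5 * z_j + rnorm(n, mean = 0, sd = 3)  # partly correlated with z_j

s_j <- sd(z_j)
s_k <- sd(z_k)

# Eq. (1.11): sample correlation coefficient written out explicitly
r_jk <- sum((z_j - mean(z_j)) * (z_k - mean(z_k))) / ((n - 1) * s_j * s_k)

# Covariance as computed by stats::cov() with the settings named in the text
s_jk <- cov(z_j, z_k, method = "pearson", use = "everything")

# The explicit formula agrees with cor(), and r_jk * s_j * s_k equals s_jk
isTRUE(all.equal(r_jk, cor(z_j, z_k)))
isTRUE(all.equal(r_jk * s_j * s_k, s_jk))
```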

Recasting Eq. (1.10) for when \(F(...)\) is \(R\), and with the variables \(\bar{X}^{b}\) (e.g. 13C) and \(\bar{X}^{a}\) (e.g. 12C), yields the following equation:

\[\begin{equation} s_{R} = \sqrt{ \left( \frac{ s_{X^{b}}}{\bar{X}^{b}} \right)^2 + \left( \frac{ s_{X^{a}}}{\bar{X}^{a}} \right)^2 - 2 \frac{s_{X^{a} X^{b}}}{\bar{X}^{b}\bar{X}^{a}}} \times \bar{R} \tag{1.12} \end{equation}\]

The standard error of the mean isotope value \({\bar{R}}\) is obtained by dividing \(s_{R}\) by \(\sqrt{n}\). In addition, both the standard deviation and the standard error of the mean of the isotope value can conveniently be expressed as relative values in ‰ by dividing them by \(\bar{R}\) and multiplying by \(1000\).
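The descriptive statistics for an isotope ratio, including the covariance term of Eq. (1.12), can be sketched in base R as follows. The count rates are simulated from independent Poisson processes with assumed rates that loosely resemble a 13C/12C measurement; the object names are hypothetical and the code is not the implementation used by stat_R().

```r
set.seed(3)

n    <- 4000                                # number of measurement cycles (assumed)
t_ct <- 0.541                               # count time per cycle in seconds
X_a  <- rpois(n, lambda = 16500) / t_ct     # common isotope, e.g. 12C (assumed rate)
X_b  <- rpois(n, lambda = 180) / t_ct       # rare isotope, e.g. 13C (assumed rate)

M_a <- mean(X_a)
M_b <- mean(X_b)

# Eq. (1.9): mean isotope ratio
M_R <- M_b / M_a

# Covariance of the two count rates
s_ab <- cov(X_a, X_b, method = "pearson", use = "everything")

# Eq. (1.12): standard deviation of the ratio with error propagation
S_R <- sqrt((sd(X_b) / M_b)^2 + (sd(X_a) / M_a)^2 -
              2 * s_ab / (M_a * M_b)) * M_R

# Standard error of the mean ratio
SeM_R <- S_R / sqrt(n)

# Relative values in per mille
RS_R   <- S_R / M_R * 1000
RSeM_R <- SeM_R / M_R * 1000

c(RS_R = RS_R, RSeM_R = RSeM_R)
```

In this simulation the two species are independent, so the covariance term is close to zero; for real data, correlated fluctuations of both count rates (e.g. due to drift of the primary beam) would reduce \(s_R\) through this term.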

1.3.2.2 Predictive statistics with error propagation for isotope ratios

Isotope analysis based on pulsed ion count data requires counts of at least two different isotopes, so that a count ratio can be obtained, as defined by Eq. (1.9), where \(X_i\) is a time-normalised count, or count rate. Satisfying this requirement provides the count-rate ratio \(R\) for measurement \(i\) of the isotopes \(a\) and \(b\), and the mean \(\bar{R}\) of the completed analysis is taken as the estimate of the true isotope value \(\mu_R\). As the predicted \(\hat{s}_N\) can be calculated for single ions, the uncertainty of the isotope measurement can be predicted as well (\(\hat{s}_R\)). Again, this requires proper error propagation to incorporate the cumulative errors on the counts of both isotopes, \(N^{a}\) and \(N^{b}\), over one analysis (Fitzsimons, Harte, and Clark 2000). Since the count-rate ratio \(R\) is a linear function of the count ratio, it is possible to use the standard deviation of the count ratio \(\hat{s}_{N^{b}/N^{a}}\) instead of \(\hat{s}_{R}\), following:

\[\begin{equation} \hat{s}_{R} \approx \left(\frac{t^{a}}{t^{b}} \right) \hat{s}_{N^{b}/N^{a}} \tag{1.13} \end{equation}\]

This makes it possible to express \(\hat{s}_{N^{b}/N^{a}}\) in terms of the standard deviations of the individual counts, which, by using Eq. (1.10), yields the following equation:

\[\begin{equation} \hat{s}_{N^{b}/N^{a}} \approx \sqrt{ \left( \frac{\hat{s}_{N^{b}}}{\bar{N}^{b}} \right)^2 + \left( \frac{\hat{s}_{N^{a}}}{\bar{N}^{a}} \right)^2 - 2\frac{r_{N^{b}N^{a}} \hat{s}_{N^{b}} \hat{s}_{N^{a}}}{\bar{N}^{b}\bar{N}^{a}} }\times \frac{\bar{N}^{b}}{\bar{N}^{a}} \tag{1.14} \end{equation}\]

The correlation coefficient (\(r\)) becomes zero, as the count statistics of both isotopes are independent. The predicted standard deviations of \(N^{b}\) and \(N^{a}\) can be approximated by the square root of the respective mean counts, according to Eq. (1.6), thereby transforming Eq. (1.14) into:

\[\begin{equation} \hat{s}_{N^{b}/N^{a}} \approx \sqrt{\frac{1}{ \bar{N}^{b}} + \frac{1}{ \bar{N}^{a}}} \times \frac{\bar{N}^{b}}{\bar{N}^{a}} \tag{1.15} \end{equation}\]

Substituting Eq. (1.15) into Eq. (1.13) gives

\[\begin{equation} \hat{s}_{R} \approx \sqrt{\frac{1}{ \bar{N}^{b}} + \frac{1}{ \bar{N}^{a}}} \times \frac{\bar{N}^{b}}{\bar{N}^{a}} \left( \frac{t^{a}}{t^{b}} \right) \tag{1.16} \end{equation}\]

which, because \(\bar{R} = (\bar{N}^{b} / \bar{N}^{a}) (t^{a} / t^{b})\) for a constant count time per species, is equivalent to

\[\begin{equation} \hat{s}_{R} \approx \sqrt{\frac{1}{ \bar{N}^{b}} + \frac{1}{ \bar{N}^{a}}} \times \bar{R} \tag{1.17} \end{equation}\]

In Eq. (1.17), we can substitute Eq. (1.7) for \(\bar{N}^{a}\) and \(\bar{N}^{b}\), respectively:

\[\begin{equation} \hat{s}_{R} = \sqrt{ \left( \frac{1}{\sum_{i = 1}^{n}{N_i^a}} \right) + \left( \frac{1}{\sum_{i = 1}^{n}{N_i^b}} \right)} \times \bar{R} \sqrt{n} \tag{1.18} \end{equation}\]
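The factor \(\sqrt{n}\) in Eq. (1.18) follows directly from Eq. (1.7), as the reciprocal of a mean count equals \(n\) divided by the total count. Written out as an intermediate step for both species:

\[ \frac{1}{\bar{N}^{a}} = \frac{n}{\sum_{i = 1}^{n} N_i^{a}}, \qquad \frac{1}{\bar{N}^{b}} = \frac{n}{\sum_{i = 1}^{n} N_i^{b}}, \qquad \text{so that} \qquad \sqrt{\frac{1}{\bar{N}^{b}} + \frac{1}{\bar{N}^{a}}} = \sqrt{n} \, \sqrt{\frac{1}{\sum_{i = 1}^{n} N_i^{a}} + \frac{1}{\sum_{i = 1}^{n} N_i^{b}}} \]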

The predicted standard error of the mean of a repeated set of measurements in one analysis is then:

\[\begin{equation} \hat{s}_{\bar{R}} = \sqrt{ \left( \frac{1}{\sum_{i = 1}^{n}{N_i^a}} \right) + \left( \frac{1}{\sum_{i = 1}^{n}{N_i^b}} \right)} \times \bar{R} \tag{1.19} \end{equation}\]

The predicted standard deviation (Eq. (1.18)) and standard error of the mean (Eq. (1.19)) can again be expressed as relative uncertainties in ‰, following the same transformation as for the descriptive statistics.
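Eqs. (1.17) to (1.19) can be evaluated with a short base R sketch. The counts are simulated with assumed Poisson rates and equal count times for both species; the object names are hypothetical and the sketch is not the implementation used by stat_R().

```r
set.seed(4)

n   <- 4000                        # number of measurement cycles (assumed)
N_a <- rpois(n, lambda = 16500)    # common isotope, e.g. 12C (assumed rate)
N_b <- rpois(n, lambda = 180)      # rare isotope, e.g. 13C (assumed rate)
t_a <- 0.541                       # count times in seconds, assumed equal
t_b <- 0.541

# Mean count-rate ratio, Eq. (1.9)
M_R <- (mean(N_b) / t_b) / (mean(N_a) / t_a)

# Eq. (1.18): predicted standard deviation of the ratio
hat_S_R <- sqrt(1 / sum(N_a) + 1 / sum(N_b)) * M_R * sqrt(n)

# Eq. (1.19): predicted standard error of the mean ratio
hat_SeM_R <- sqrt(1 / sum(N_a) + 1 / sum(N_b)) * M_R

# Eq. (1.17) expressed with mean counts gives the same predicted standard deviation
hat_S_R_alt <- sqrt(1 / mean(N_b) + 1 / mean(N_a)) * M_R
isTRUE(all.equal(hat_S_R, hat_S_R_alt))

# Relative values in per mille
hat_RS_R   <- hat_S_R / M_R * 1000
hat_RSeM_R <- hat_SeM_R / M_R * 1000

c(hat_RS_R = hat_RS_R, hat_RSeM_R = hat_RSeM_R)
```

The rare isotope dominates the predicted uncertainty, because its total count is much smaller and its reciprocal therefore much larger.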

1.3.2.3 Comparing predicted and descriptive statistics

The reduced \(\chi^2\) can be used to assess the machine performance as it cross-validates the observed error estimate with the theoretical Poisson-based precision. For example, the reduced \(\chi^2\) of an isotope ratio equates to:

\[\begin{equation} \chi^2 = \left( \frac{s_{\bar{R}}} {\hat{s}_{\bar{R}}} \right)^2 \tag{1.20} \end{equation}\]

where values close to \(1\) suggest good agreement between the actual measurement and the predicted value (Kilburn and Wacey 2015). Values lower than \(1\) suggest that the analysis was better than predicted, and values higher than \(1\) indicate that the analysis was worse than predicted by Poisson statistics.
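The behaviour of the reduced \(\chi^2\) of Eq. (1.20) can be explored with simulated counts. The helper chi2_R() below is a hypothetical function written for this sketch only (it is not part of the package point) and assumes equal count times for both species, so that relative uncertainties of counts and count rates are identical. Two cases are compared: ideal Poisson counts, and counts of the rare isotope with an assumed linear drift in the count rate.

```r
set.seed(5)

# Hypothetical helper: reduced chi-squared of an isotope ratio, Eq. (1.20)
chi2_R <- function(N_a, N_b) {
  n   <- length(N_a)
  M_a <- mean(N_a)
  M_b <- mean(N_b)
  # Observed relative standard error of the mean (Eq. (1.12) divided by sqrt(n))
  rel_SeM <- sqrt((sd(N_b) / M_b)^2 + (sd(N_a) / M_a)^2 -
                    2 * cov(N_a, N_b) / (M_a * M_b)) / sqrt(n)
  # Predicted relative standard error of the mean (Eq. (1.19))
  hat_rel_SeM <- sqrt(1 / sum(N_a) + 1 / sum(N_b))
  (rel_SeM / hat_rel_SeM)^2
}

n   <- 4000
N_a <- rpois(n, lambda = 16500)

# Case 1: ideal Poisson counts for the rare isotope
N_b_ideal <- rpois(n, lambda = 180)
chi2_ideal <- chi2_R(N_a, N_b_ideal)

# Case 2: the same mean rate, but drifting linearly from 150 to 210 counts
N_b_drift <- rpois(n, lambda = seq(150, 210, length.out = n))
chi2_drift <- chi2_R(N_a, N_b_drift)

c(ideal = chi2_ideal, drift = chi2_drift)
```

The drift adds variance that is not accounted for by counting statistics, which raises the reduced \(\chi^2\) above \(1\).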

1.4 External precision of ion count data

A study usually comprises a series of several ion count analyses. The consistency of such an \(m\)-series of analyses is usually gauged with a homogeneous reference material, and is referred to as the external precision, repeatability or reproducibility of the study (Fitzsimons, Harte, and Clark 2000). This value is usually reported as the standard deviation of the \(m\)-series of analyses on the reference material. In calculating this statistic, the \(n\) in Eq. (1.3) is replaced by \(m\). The standard deviation is reported because we are interested in how a single analysis relates to the variability of the study, where the variability combines the random nature of counting statistics, machine performance and homogeneity of the analytical substrate. Conversely, we are not interested in how precisely we can approach the mean of an \(m\)-series of analyses (i.e. the mean of the study).

Similarly, the predicted standard deviation of an \(m\)-series of analyses can be calculated following the conventions outlined above. Here, we substitute \(\bar{N}^a\) of a chemical species \(a\) by \(\bar{M}^a\) (i.e. the mean counts for an \(m\)-series of analyses) in Eq. (1.6) for single ions and in Eq. (1.14) (and subsequent derivations of that equation) for isotope ratios. Note that these formulations consider counts per analysis and not counts per measurement.
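The observed and predicted external precision can be sketched in base R for a simulated study. The number of analyses, the Poisson rates and the object names are assumptions made for illustration; equal count times are assumed, so that the mean ratio of an analysis equals the ratio of its total counts.

```r
set.seed(6)

m <- 10     # number of analyses in the study (assumed)
n <- 4000   # number of measurement cycles per analysis (assumed)

# Simulated counts: one column per analysis, one row per measurement cycle
N_a <- matrix(rpois(n * m, lambda = 16500), nrow = n)  # common isotope
N_b <- matrix(rpois(n * m, lambda = 180), nrow = n)    # rare isotope

# Total counts per analysis
M_a <- colSums(N_a)
M_b <- colSums(N_b)

# Mean isotope ratio of each analysis
R_j <- M_b / M_a

# Observed external relative standard deviation in per mille
# (Eq. (1.3) with n replaced by m)
RS_ext <- sd(R_j) / mean(R_j) * 1000

# Predicted external relative standard deviation in per mille
# (Eq. (1.15) with mean counts per analysis)
hat_RS_ext <- sqrt(1 / mean(M_a) + 1 / mean(M_b)) * 1000

c(observed = RS_ext, predicted = hat_RS_ext)
```

With only \(m = 10\) analyses, the observed value scatters considerably around the predicted value, even for ideal Poisson counts.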

To calculate the external precision for ion ratios with the package point, use the .nest argument of stat_R() to define groupings of analyses (e.g. a raw ion count dataset containing replicate measurements on a reference material).

# external precision of the dataset 
tb_R_ext  <- stat_R(tb_pr, "13C", "12C", sample.nm, file.nm, .nest = file.nm, 
                    .stat =  c("M", "RS", "RSeM", "hat_RS", "hat_RSeM", "chi2"), 
                    .label = "webtex")

Table 1.3: Summary statistics for external precision of isotope ratios.

| sample.nm | ratio.nm | \(\bar{\bar{R}}\) | \(\epsilon_{\bar{R}}\) (‰) | \(\epsilon_{\bar{\bar{R}}}\) (‰) | \(\hat{\epsilon}_{\bar{R}}\) (‰) | \(\hat{\epsilon}_{\bar{\bar{R}}}\) (‰) | \(\chi^{2}_{\bar{R}}\) |
|---|---|---|---|---|---|---|---|
| Belemnite,Indium | 13C/12C | 0.011 | 1.85 | 1.07 | 1.18 | 0.681 | 2.45 |

The number of counts of any single analysis \(j\) (\(M_j\)) is equal to the sum of all counts (\(N_i\)) in an analysis of \(n\) measurements, and thus, for analyses with comparable total counts:

\[\begin{equation} \sum_{j=1}^m M_j^a = m \sum_{i=1}^n N_i^a \tag{1.21} \end{equation}\]

Given the previous relationship, the average predicted standard error of the mean of the single analyses should approximate the predicted standard deviation of the \(m\)-series of analyses (see Fitzsimons, Harte, and Clark 2000).

We can validate this derivation as follows:

# Check the previous statement with the example dataset
tb_R_int <- stat_R(tb_pr, "13C", "12C", sample.nm,  file.nm, .stat = "hat_RSeM",
                   .zero = TRUE)

tb_R_ext <- stat_R(tb_pr, "13C", "12C", sample.nm, file.nm, .nest = file.nm, 
                   .stat = "hat_RS",.zero = TRUE)

# The average of the internal relative predicted standard error of the mean 
# (per mille)
filter(tb_R_int, str_detect(sample.nm, "Belemnite")) %>%  
  pull(hat_RSeM_R_N.pr) %>% 
  mean() %>% 
  round(1)
#> [1] 1.2

# The external predicted standard deviation (per mille)
round(tb_R_ext$hat_RS_R_tot_N.pr, 1)
#> [1] 1.2

The values match and confirm the previous relationship for this study.
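The same relationship can be reproduced without the example dataset. The following base R sketch simulates a hypothetical study with assumed Poisson rates and compares the average internal predicted standard error of the mean (Eq. (1.19), relative to \(\bar{R}\)) with the external predicted standard deviation; all object names are assumptions.

```r
set.seed(7)

m <- 10     # number of analyses in the study (assumed)
n <- 4000   # number of measurement cycles per analysis (assumed)

# Total counts per analysis for the common and the rare isotope
M_a <- colSums(matrix(rpois(n * m, lambda = 16500), nrow = n))
M_b <- colSums(matrix(rpois(n * m, lambda = 180), nrow = n))

# Internal: predicted relative standard error of the mean per analysis (per mille)
hat_RSeM_int <- sqrt(1 / M_a + 1 / M_b) * 1000

# External: predicted relative standard deviation of the m-series (per mille)
hat_RS_ext <- sqrt(1 / mean(M_a) + 1 / mean(M_b)) * 1000

round(mean(hat_RSeM_int), 2)
round(hat_RS_ext, 2)
```

The two values agree closely as long as the total counts of the individual analyses are similar; strongly differing totals between analyses would weaken the agreement.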

References

Fitzsimons, I. C. W., B. Harte, and R. M. Clark. 2000. “SIMS stable isotope measurement: counting statistics and analytical precision.” Mineralogical Magazine 64 (01): 59–83. https://doi.org/10.1180/002646100549139.

Kilburn, Matt R., and David Wacey. 2015. “Nanoscale secondary ion mass spectrometry (NanoSIMS) as an analytical tool in the geosciences.” RSC Detection Science 2015-Janua (4): 1–34. https://doi.org/10.1039/9781782625025-00001.

Ku, H. H. 1966. “Notes on the use of propagation of error formulas.” Journal of Research of the National Bureau of Standards, Section C: Engineering and Instrumentation 70C (4): 263. https://doi.org/10.6028/jres.070c.025.