`library(splithalfr)`

This vignette describes a scoring method similar to Mogg and Bradley (1999): the difference in mean reaction times (RTs) between the probe-at-test and probe-at-control conditions, for correct responses only, after removing RTs below 200 ms and above 520 ms. The method is applied to Visual Probe Task (VPT) data.

Load the included VPT dataset and inspect its documentation.

```
data("ds_vpt", package = "splithalfr")
?ds_vpt
```

The columns used in this example are:

- `UserID`, which identifies participants
- `block_type`, in order to select assessment blocks only
- `patt`, in order to compare trials in which the probe is at the test or at the control stimulus
- `response`, in order to select correct responses only
- `rt`, in order to drop RTs outside of the range [200, 520] and calculate means per level of `patt`
- `thor`, which is the horizontal position of the test stimulus
- `keep`, which is whether the probe was superimposed on the stimuli or replaced the stimuli

Select only trials from assessment blocks.

`ds_vpt <- subset(ds_vpt, block_type == "assess")`

The variables `patt`, `thor`, and `keep` were counterbalanced. Below we illustrate this for the first participant.

```
ds_1 <- subset(ds_vpt, UserID == 1)
table(ds_1$patt, ds_1$thor, ds_1$keep)
```
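To show what a counterbalanced three-way table looks like, here is a minimal sketch on a hypothetical design rather than the `ds_vpt` data. Each combination of `patt`, `thor`, and `keep` occurs equally often, so every cell of the table holds the same count. The level names and the number of repetitions are assumptions made for illustration.

```r
# Hypothetical fully crossed design; the level names are illustrative only
design <- expand.grid(
  patt = c("yes", "no"),
  thor = c("left", "right"),
  keep = c("yes", "no"),
  stringsAsFactors = FALSE
)
# Repeat each of the 8 combinations 5 times (40 trials in total)
ds_toy <- design[rep(seq_len(nrow(design)), times = 5), ]
# Three-way cross table, as in the call above
counts <- table(ds_toy$patt, ds_toy$thor, ds_toy$keep)
# In a counterbalanced design every cell has the same count
all(counts == counts[1])
```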

The scoring function calculates the score of a single participant as follows:

- select only correct responses
- drop responses with RTs outside of the range [200, 520]
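- calculate the mean RT of the remaining responses for each level of `patt`
- subtract the mean RT for `patt == "yes"` from the mean RT for `patt == "no"`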

```
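fn_score <- function(ds) {
  # Keep correct responses with RTs within [200, 520] ms
  ds_keep <- ds[ds$response == 1 & ds$rt >= 200 & ds$rt <= 520, ]
  # Mean RT per level of patt
  rt_yes <- mean(ds_keep[ds_keep$patt == "yes", ]$rt)
  rt_no <- mean(ds_keep[ds_keep$patt == "no", ]$rt)
  # Score: mean RT for patt == "no" minus mean RT for patt == "yes"
  return(rt_no - rt_yes)
}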
```
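As a quick check of the scoring logic, the sketch below applies the same function to a small hand-made data frame. The six trials and their values are invented for illustration; `fn_score` is repeated so the block runs on its own.

```r
# Same scoring function as above, repeated so this block is self-contained
fn_score <- function(ds) {
  ds_keep <- ds[ds$response == 1 & ds$rt >= 200 & ds$rt <= 520, ]
  rt_yes <- mean(ds_keep[ds_keep$patt == "yes", ]$rt)
  rt_no <- mean(ds_keep[ds_keep$patt == "no", ]$rt)
  return(rt_no - rt_yes)
}

# Invented trials: one incorrect response and one RT above 520 ms
ds_toy <- data.frame(
  patt     = c("yes", "yes", "yes", "no", "no", "no"),
  response = c(1, 1, 0, 1, 1, 1),
  rt       = c(300, 400, 350, 450, 500, 600)
)

# Kept trials: yes = {300, 400}, no = {450, 500}
toy_score <- fn_score(ds_toy)
toy_score  # 475 - 350 = 125
```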

Let’s calculate the VPT score for the participant with UserID 23. NB - This score has also been calculated manually via Excel in the splithalfr repository.

`fn_score(subset(ds_vpt, UserID == 23))`

To calculate the VPT score for each participant, we will use R’s native `by` function and convert the result to a data frame.

```
scores <- by(
  ds_vpt,
  ds_vpt$UserID,
  fn_score
)
data.frame(
  UserID = names(scores),
  score = as.vector(scores)
)
```

To calculate split-half scores for each participant, use the function `by_split`. The first three arguments of this function are the same as for `by`. An additional set of arguments allows you to specify how to split the data and how often. In this vignette we will calculate scores for 1000 permuted splits. Because the trial properties `patt`, `thor`, and `keep` were counterbalanced in the VPT design, we will stratify splits by these properties. See the vignette on splitting methods for more ways to split the data.
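To make the idea of a stratified split concrete, here is a minimal base-R sketch for a single hypothetical participant. It is not the implementation used by `by_split`; the helper `split_stratified` and the toy trial table are assumptions for illustration. Within each stratum, trials are shuffled and then dealt alternately to the two halves, so each half receives the same number of trials from every combination of `patt`, `thor`, and `keep` whenever the stratum size is even.

```r
# Hypothetical helper: assign each trial to half 1 or 2, balanced within strata
split_stratified <- function(strata) {
  half <- integer(length(strata))
  for (idx in split(seq_along(strata), strata)) {
    # Shuffle the trial indices of this stratum (safe for strata of length 1)
    shuffled <- idx[sample.int(length(idx))]
    # Deal the shuffled trials alternately to half 1 and half 2
    half[shuffled] <- rep_len(1:2, length(shuffled))
  }
  half
}

# Toy participant: 8 strata (2 x 2 x 2) with 4 trials each; levels are invented
ds_toy <- expand.grid(
  patt = c("yes", "no"),
  thor = c("left", "right"),
  keep = c("yes", "no"),
  trial = 1:4,
  stringsAsFactors = FALSE
)
strata <- paste(ds_toy$patt, ds_toy$thor, ds_toy$keep)

set.seed(1)  # only to make the illustration reproducible
half <- split_stratified(strata)

# Each stratum contributes 2 trials to each half
balance <- table(strata, half)
balance
```

A score function can then be applied separately to the trials with `half == 1` and `half == 2`, which gives one pair of split scores per replication.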

The `by_split` function returns a data frame with the following columns:

- `participant`, which identifies participants
- `replication`, which counts replications
- `score_1` and `score_2`, which are the scores calculated for each of the split datasets

*Calculating the split scores may take a while. By default, `by_split` uses all available CPU cores, but no progress bar is displayed. Setting `ncores = 1` will display a progress bar, but processing will be slower.*

```
split_scores <- by_split(
  ds_vpt,
  ds_vpt$UserID,
  fn_score,
  replications = 1000,
  stratification = paste(ds_vpt$patt, ds_vpt$thor, ds_vpt$keep)
)
```

Next, the output of `by_split` can be analyzed in order to estimate reliability. The package provides functions that calculate Spearman-Brown adjusted Pearson correlations (`spearman_brown`), Flanagan-Rulon (`flanagan_rulon`), Angoff-Feldt (`angoff_feldt`), and Intraclass Correlation (`short_icc`) coefficients. Each of these coefficient functions can be used with `split_coefs` to calculate the corresponding coefficient per split; the resulting coefficients can then be plotted or averaged via a simple `mean`. A bias-corrected and accelerated bootstrap confidence interval can be calculated via `split_ci`. Note that estimating the confidence interval is computationally intensive, so it can take a long time to complete.

```
# Spearman-Brown adjusted Pearson correlations per replication
coefs <- split_coefs(split_scores, spearman_brown)
# Distribution of coefficients
hist(coefs)
# Mean of coefficients
mean(coefs)
# Confidence interval of coefficients
split_ci(split_scores, spearman_brown)
```