---
title: "Managing multiple testing challenges with sequential inference using the onlineFDR package"
author: "David S. Robertson, Aaditya Ramdas and Natasha A. Karp"
date: "2019-10-23"
output:
  html_document:
    toc: yes
  pdf_document:
    toc: yes
vignette: >
  %\VignetteIndexEntry{Using the onlineFDR package}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
---
## Introduction
Multiple hypothesis testing is a fundamental problem in statistical inference.
The failure to manage the multiple testing problem has been highlighted as one
of the elements contributing to the replicability crisis in science (Ioannidis
2005). Methodologies have been developed for a family of hypotheses to adjust
the significance levels to manage the multiple testing situation by controlling
error metrics such as the familywise error rate (FWER) or the false discovery
rate (FDR).
Frequently, modern data analysis problems have the further complexity that the
hypotheses arrive sequentially in a stream. This introduces the challenge that
at each step the investigator must decide whether to reject the current null
hypothesis without having access to the future p-values or the total number of
hypotheses to be tested, but with knowledge of the historic decisions to
date. The International Mouse Phenotyping Consortium (Koscielny *et al.*, 2013)
provides a concrete example of such a scenario. Here the dataset is constantly
growing as new knockout mice lines are generated and phenotyping data uploaded
to a database.
Javanmard and Montanari proposed two procedures, LOND and LORD, to control the
FDR in an online manner (Javanmard and Montanari (2015, 2018)), with the latter
extended by Ramdas *et al.* (2017). The LOND procedure sets the adjusted
significance thresholds based on the number of discoveries made so far, while
LORD sets them according to the time of the most recent discovery. Ramdas
*et al.* (2018) then proposed the SAFFRON procedure, which provides an adaptive
method of online FDR control. They also proposed a variant of the
Alpha-investing algorithm of Foster and Stine (2008) that guarantees FDR
control, using SAFFRON's update rule.
Subsequently, Zrnic *et al.* (2018) proposed procedures to control the modified
FDR (mFDR) in the context of *asynchronous* testing, i.e. where each hypothesis
test can itself be a sequential process and the tests can overlap in time. They
presented asynchronous versions of the LOND, LORD and SAFFRON procedures for a
variety of trial settings. For both synchronous and asynchronous testing, Tian
& Ramdas (2019a) proposed the ADDIS algorithms which compensate for the loss
in power in the presence of conservative nulls by adaptively 'discarding' these
p-values.
Finally, Tian & Ramdas (2019b) proposed procedures that provide online control
of the FWER. One procedure, online fallback, gives a uniform improvement to the
naive Alpha-spending procedure (see below). The ADDIS-spending procedure
compensates for the power loss of these procedures by including both adaptivity
in the fraction of null hypotheses and the conservativeness of nulls.
The onlineFDR package implements all these procedures and provides wrapper
functions to apply them to a historic or growing dataset. As a comparison, we
have also provided a function for implementation of the Alpha-spending
procedure, which is based on the Bonferroni procedure adapted to the online
scenario. This vignette explains the use of the package and demonstrates a
typical workflow.
---
## Overview of the process
1. A dataset with three columns (an identifier ('id'), date ('date') and
p-value ('pval')) is passed to one of the onlineFDR wrapper functions.
Alternatively, a vector of p-values can be provided, in which case step 2 is
skipped.
2. The function orders the information by date. If there are multiple p-values
with the same date (i.e. the same batch), the order of the p-values within each
batch is randomised by default. In order for the randomisation of the p-values
to be reproducible, it is necessary to set a seed (via the `set.seed`
function) before calling the wrapper function (see also step 6).
3. For each hypothesis test, an adjusted significance threshold (`alphai`) is
calculated, which gives the threshold at which the corresponding p-value would
be declared significant.
4. Using the p-values provided and the alphai, an indicator of discoveries (`R`)
is calculated, where `R[i] = 1` corresponds to hypothesis i being rejected
(and `R[i] = 0` otherwise).
5. A dataframe is returned, reordered by batch, with the original data and the
newly calculated `alphai` and `R`.
6. For simplicity, as the dataset grows the new larger dataset should be passed
to the wrapper function and the values recalculated repeating steps 1 to 5. In
order for the randomisation of the data within the previous batches to remain
the same (and hence to allow for reproducibility of the results),
*the same seed should be used for all analyses*.
### Outline of the functions available
1: **LOND**
Implements the LOND procedure for online FDR control, where LOND stands for
(significance) Levels based On Number of Discoveries, as presented by Javanmard
and Montanari (2015). The procedure controls the FDR for independent or
positively dependent (PRDS) p-values, with an option (`dep = TRUE`) that
guarantees control for arbitrarily dependent p-values.
2: **LORD**
Implements the LORD procedure for online FDR control, where LORD stands for
(significance) Levels based On Recent Discovery, as presented by Javanmard and
Montanari (2018), Ramdas *et al.* (2017) and Tian & Ramdas (2019a). The
function provides different versions of the procedure valid for independent
p-values, see 'Theory behind onlineFDR'. There is also a version ('dep') that
guarantees control for dependent p-values.
3: **SAFFRON**
Implements the SAFFRON procedure for online FDR control, where SAFFRON stands
for Serial estimate of the Alpha Fraction that is Futilely Rationed On true Null
hypotheses, as presented by Ramdas *et al.* (2018). The procedure provides an
adaptive method of online FDR control.
4: **ADDIS**
Implements the ADDIS algorithm for online FDR control, where ADDIS stands for
an ADaptive algorithm that DIScards conservative nulls, as presented by Tian
& Ramdas (2019a). The algorithm compensates for the power loss of SAFFRON
with conservative nulls, by including both adaptivity in the fraction of
null hypotheses (like SAFFRON) and the conservativeness of nulls (unlike
SAFFRON). This procedure controls the FDR for independent p-values.
5: **Alpha_investing**
Implements a variant of the Alpha-investing algorithm of Foster and Stine (2008)
that guarantees FDR control, as proposed by Ramdas et al. (2018). This procedure
uses a variant of SAFFRON's update rule. This procedure controls the FDR for
independent p-values.
6: **LONDstar**
Implements the LOND algorithm for asynchronous online testing, as presented
by Zrnic *et al.* (2018). This algorithm controls the mFDR.
7: **LORDstar**
Implements LORD algorithms for asynchronous online testing, as presented by
Zrnic *et al.* (2018). These algorithms control the mFDR.
8: **SAFFRONstar**
Implements the SAFFRON algorithm for asynchronous online testing, as
presented by Zrnic *et al.* (2018). This algorithm controls the mFDR.
9: **Alpha_spending**
Implements online FWER control using a Bonferroni-like test. Alpha-spending
provides strong FWER control for arbitrarily dependent p-values.
10: **online_fallback**
Implements the online fallback algorithm for FWER control, as proposed by
Tian & Ramdas (2019b). Online fallback is a uniformly more powerful method than
Alpha-spending, as it saves the significance level of a previous rejection.
Online fallback strongly controls the FWER for arbitrarily dependent p-values.
11: **ADDIS-spending**
Implements the ADDIS-spending algorithm for online FWER control, as proposed by
Tian & Ramdas (2019b). The algorithm compensates for the power loss of
Alpha-spending, by including both adaptivity in the fraction of null hypotheses
and the conservativeness of nulls. ADDIS-spending provides strong FWER control
for independent p-values. Tian & Ramdas (2019b) also presented a version for
handling local dependence.
### Quick start
Here we show the steps to achieve online FDR control of a growing dataset. First
load a dataframe with the three columns: an identifier ('id'), date ('date')
and p-value ('pval'), and then call the wrapper function of interest. In order
for the results to be reproducible, we also set a seed using the `set.seed`
function.
```{r}
library(onlineFDR)
sample.df <- data.frame(
  id = c('A15432', 'B90969', 'C18705', 'B49731', 'E99902',
         'C38292', 'A30619', 'D46627', 'E29198', 'A41418',
         'D51456', 'C88669', 'E03673', 'A63155', 'B66033'),
  date = as.Date(c(rep("2014-12-01", 3),
                   rep("2015-09-21", 5),
                   rep("2016-05-19", 2),
                   "2016-11-12",
                   rep("2017-03-27", 4))),
  pval = c(2.90e-14, 0.06743, 0.01514, 0.08174, 0.00171,
           3.61e-05, 0.79149, 0.27201, 0.28295, 7.59e-08,
           0.69274, 0.30443, 0.000487, 0.72342, 0.54757))
set.seed(1)
results <- LORD(sample.df)
results
```
### Input data
A dataset with three columns (an identifier ('id'), date ('date') and p-value
('pval')). All p-values generated should be passed to the function (and not
just the significant p-values). An exception to this would be if you have
implemented an orthogonal filter to reduce the dataset size, as discussed in
Bourgon *et al.* (2010).
Alternatively, just the vector of p-values can be passed to the function. In
this case, the p-values are treated as being ordered sequentially with no
batches.
### Understanding the output
For each hypothesis test, the functions calculate the adjusted significance
thresholds (`alphai`) at which the corresponding p-value would be declared
statistically significant.
Also calculated is an indicator function of discoveries (`R`), where `R[i] = 1`
corresponds to hypothesis i being rejected, otherwise `R[i] = 0`.
A dataframe is returned, reordered by batch, with the original data and the
newly calculated `alphai` and `R`.
---
## How to get help for onlineFDR
All questions regarding onlineFDR should be posted to the
**Bioconductor support site**, which serves as a searchable knowledge base of
questions and answers:
https://support.bioconductor.org
Posting a question and tagging with "onlineFDR" will automatically send an alert
to the package authors to respond on the support site.
---
## Theory behind onlineFDR
### Online hypothesis testing
Consider a sequence of hypotheses $H_1, H_2, H_3, \ldots$ that arrive
sequentially in a stream, with corresponding $p$-values
$(p_1, p_2, p_3, \ldots)$. A testing procedure provides a sequence of adjusted
significance thresholds $\alpha_i$, with corresponding decision rule:
\[ R_i = \begin{cases}
1 & \text{if } p_i \leq \alpha_i & (\text{reject } H_i)\\
0 & \text{otherwise} & (\text{accept } H_i)
\end{cases} \]
In *online* testing, the significance thresholds can only be functions of
the prior decisions, i.e. $\alpha_i = \alpha_i(R_1, R_2, \ldots, R_{i-1})$.
Javanmard and Montanari (2015, 2018) proposed two procedures for online control.
The first is LOND, which stands for (significance) Levels based On Number of
Discoveries. The second is LORD, which stands for (significance) Levels based On
Recent Discovery. LORD was subsequently extended by Ramdas *et al.* (2017).
Ramdas *et al.* (2018) also proposed the SAFFRON procedure, an adaptive method
of online FDR control that includes a variant of Alpha-investing. Finally,
Tian & Ramdas (2019a) proposed the ADDIS procedure as an improvement of SAFFRON
in the presence of conservative nulls.
### LOND {#LOND}
The LOND procedure controls the FDR for independent or positively dependent
(PRDS) $p$-values. Given an overall significance level $\alpha$, we choose a
sequence of non-negative numbers $\beta = (\beta_i)_{i \in \mathbb{N}}$ such
that they sum to $\alpha$. The values of the adjusted significance thresholds
$\alpha_i$ are chosen as follows: \[ \alpha_i = \beta_i (D(i-1) + 1) \] where
$D(n) = \sum_{i=1}^n R_i$ denotes the number of discoveries (i.e. rejections) in
the first $n$ hypotheses tested.
LOND can be adjusted to also control FDR under arbitrarily dependent $p$-values.
To do so, it is modified with $\tilde{\beta}_i = \beta_i/H(i)$ in place of
$\beta_i$, where $H(i) = \sum_{j=1}^i \frac{1}{j}$ is the $i$-th harmonic
number. Note that this leads to a substantial loss in power compared to the
unadjusted LOND procedure. The correction factor is similar to the classical one
used by Benjamini and Yekutieli (2001), except that in this case the $i$-th
hypothesis among $N$ is penalised by a factor of $H(i)$ to give consistent
results across time (as compared to a factor $H(N)$ for the Benjamini and
Yekutieli method).
The default sequence of $\beta$ is given by
\[\beta_j = C \alpha \frac{\log(\max(j, 2))}{j e^{\sqrt{\log j}}}\] where
$C \approx 0.07720838$, as proposed by Javanmard and Montanari (2018)
equation 31.
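The LOND update rule above can be sketched in a few lines of base R. This is a
simplified illustration only (it assumes independent, already-ordered p-values);
the package's `LOND()` function is the reference implementation, and the helper
name `lond_sketch` is ours:

```{r}
# Sketch of LOND for independent p-values: alphai = beta_i * (D(i-1) + 1)
lond_sketch <- function(pval, alpha = 0.05) {
  j <- seq_along(pval)
  C <- 0.07720838  # normalising constant for the default beta sequence
  beta <- C * alpha * log(pmax(j, 2)) / (j * exp(sqrt(log(j))))
  alphai <- R <- numeric(length(pval))
  D <- 0  # running number of discoveries
  for (i in j) {
    alphai[i] <- beta[i] * (D + 1)
    R[i] <- as.numeric(pval[i] <= alphai[i])
    D <- D + R[i]
  }
  data.frame(pval = pval, alphai = alphai, R = R)
}

lond_sketch(c(1e-04, 0.3, 0.001))
```

Each discovery inflates all subsequent test levels by the factor $(D(i-1)+1)$,
which is what distinguishes LOND from plain Alpha-spending.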
### LORD {#LORD}
The LORD procedure controls the FDR for independent $p$-values. We first fix a
sequence of non-negative numbers $\gamma = (\gamma_i)_{i \in \mathbb{N}}$ such
that $\gamma_i \geq \gamma_j$ for
$i \leq j$ and $\sum_{i=1}^{\infty} \gamma_i = 1$. At each time $i$, let
$\tau_i$ be the last time a discovery was made before $i$: \[
\tau_i = \max \left\{ l \in \{1, \ldots, i-1\} : R_l = 1\right\}
\]
LORD depends on constants $w_0$ and $b_0$, where $w_0 \geq 0$ represents the
initial 'wealth' of the procedure and $b_0 > 0$ represents the 'payout' for
rejecting a hypothesis. We require $w_0+b_0 \leq \alpha$ for FDR control to
hold.
Javanmard and Montanari (2018) presented three different versions of LORD, which
have different definitions of the adjusted significance thresholds $\alpha_i$.
Versions 1 and 2 have since been superseded by the LORD++ procedure of
Ramdas *et al.* (2017), so we do not describe them here.
* **LORD++**: The significance thresholds for LORD++ are chosen as follows: \[
\alpha_i = \gamma_i w_0 + (\alpha - w_0) \gamma_{i-\tau_1} +
\alpha \sum_{j : \tau_j < i, \tau_j \neq \tau_1} \gamma_{i - \tau_j}
\]
* **LORD 3**: The significance thresholds depend on the time of the last
discovery and the wealth accumulated at that time, with
\[
\alpha_i = \gamma_{i - \tau_i} W(\tau_i)
\]
where $\tau_1 = 0$. Here $\{W(j)\}_{j \geq 0}$ represents the 'wealth' available
at time $j$, and is defined recursively: \[
\begin{aligned}
W(0) & = w_0 \\
W(j) & = W(j-1) - \alpha_j + b_0 R_j
\end{aligned}
\]
* **D-LORD**: This is equivalent to the LORD++ procedure with discarding. Given
a discarding threshold $\tau \in (0,1)$ and initial wealth $w_0 \leq \tau\alpha$,
the significance thresholds are chosen as follows: \[
\alpha_t = \min\{\tau, \tilde{\alpha}_t\}
\] where \[
\tilde{\alpha}_t = w_0 \gamma_{S^t} +
(\tau\alpha - w_0)\gamma_{S^t - \kappa_1^*} +
\tau\alpha \sum_{j \geq 2} \gamma_{S^t - \kappa_j^*}
\] and \[
\kappa_j = \min\{i \in [t-1] : \sum_{k \leq i}
1 \{p_k \leq \alpha_k\} \geq j\}, \;
\kappa_j^* = \sum_{i \leq \kappa_j} 1 \{p_i \leq \tau \}, \;
S^t = \sum_{i < t} 1 \{p_i \leq \tau \}
\]
LORD++ is an instance of a monotone rule, and provably controls the FDR for
independent p-values provided $w_0 \leq \alpha$. LORD 3 is a non-monotone rule,
and FDR control is only demonstrated empirically. In some scenarios with large
$N$, LORD 3 will have a slightly higher power than LORD++ (see Robertson *et
al.*, 2018), but since it is a non-monotone rule we would recommend using LORD++
(which is the default), especially since it also has a provable guarantee of FDR
control.
In all versions, the default sequence of $\gamma$ is
given by \[\gamma_j = C \frac{\log(\max(j, 2))}{j e^{\sqrt{\log j}}}\]
where $C \approx 0.07720838$, as proposed by Javanmard and Montanari (2018)
equation 31.
Javanmard and Montanari (2018) also proposed an adjusted version of LORD that
is valid for arbitrarily *dependent* p-values. Similarly to LORD 3, the adjusted
significance thresholds are set equal to \[ \alpha_i = \xi_i W(\tau_i)\] where
(assuming $w_0 \leq b_0$) the sequence $\xi$ satisfies
$\sum_{j=1}^{\infty} \xi_j (1 + \log(j)) \leq \alpha / b_0$.
The default sequence of $\xi$ is given by
\[ \xi_j = \frac{C \alpha }{b_0 j \log(\max(j, 2))^3}\]
where $C \approx 0.139307$.
Note that allowing for dependent p-values can lead to a substantial loss in
power compared with the LORD procedures described above.
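To make the LORD++ update concrete, here is a base-R sketch of the rule (a
simplified illustration assuming independent, already-ordered p-values;
`lordpp_sketch` is our name, and the package's `LORD()` function is the
reference implementation):

```{r}
# Sketch of LORD++: wealth from the first discovery is recycled via
# (alpha - w0), and each later discovery adds an alpha * gamma term.
lordpp_sketch <- function(pval, alpha = 0.05, w0 = alpha / 10) {
  j <- seq_along(pval)
  C <- 0.07720838  # normalising constant for the default gamma sequence
  gammai <- C * log(pmax(j, 2)) / (j * exp(sqrt(log(j))))
  alphai <- R <- numeric(length(pval))
  tau <- integer(0)  # times of the discoveries made so far
  for (i in j) {
    a <- gammai[i] * w0
    if (length(tau) >= 1) a <- a + (alpha - w0) * gammai[i - tau[1]]
    if (length(tau) >= 2) a <- a + alpha * sum(gammai[i - tau[-1]])
    alphai[i] <- a
    R[i] <- as.numeric(pval[i] <= a)
    if (R[i] == 1) tau <- c(tau, i)
  }
  data.frame(pval = pval, alphai = alphai, R = R)
}

lordpp_sketch(c(1e-05, 0.8, 0.01))
```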
### SAFFRON {#SAFFRON}
The SAFFRON procedure controls the FDR for independent p-values, and was
proposed by Ramdas *et al.* (2018). The algorithm is based on an estimate of the
proportion of true null hypotheses. More precisely, SAFFRON sets the adjusted
test levels based on an estimate of the amount of alpha-wealth that is allocated
to testing the true null hypotheses.
SAFFRON depends on constants $w_0$ and $\lambda$, where $w_0$ satisfies
$0 \leq w_0 < (1 - \lambda)\alpha$ and represents the initial 'wealth' of the
procedure, and $\lambda \in (0,1)$ represents the threshold for a 'candidate'
hypothesis. A 'candidate' refers to p-values smaller than $\lambda$, since
SAFFRON will never reject a p-value larger than $\lambda$. These candidates can
be thought of as the hypotheses that are a-priori more likely to be non-null.
The SAFFRON procedure runs as follows:
1. Set the initial alpha-wealth $w_0 < (1-\lambda)\alpha$
2. At each time $i$, define the number of candidates after the $k$-th rejection
as \[ C_{k+} = C_{k+}(i) = \sum_{j = \tau_k + 1}^{i-1} C_j\]
where $C_j = 1\{p_j \leq \lambda \}$ is the indicator for candidacy.
3. SAFFRON starts with $\alpha_1 = \min\{\gamma_1 w_0, \lambda\}$. Subsequent
test levels are chosen as $\alpha_i = \min\{ \lambda, \tilde{\alpha}_i\}$, where
\[
\tilde{\alpha}_i = w_0 \gamma_{i-C_{0+}} +
((1-\lambda)\alpha - w_0)\gamma_{i-\tau_1-C_{1+}} +
(1-\lambda)\alpha \sum_{j : \tau_j < i, \tau_j \neq \tau_1}
\gamma_{i - \tau_j- C_{j+}}
\]
The default sequence of $\gamma$ for SAFFRON is
given by $\gamma_j \propto j^{-1.6}$.
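For illustration, the default $\gamma$ sequence can be computed directly in
base R; the infinite normalising sum is approximated here by a large finite
truncation:

```{r}
# Default SAFFRON gamma sequence: gamma_j proportional to j^(-1.6),
# normalised so that the infinite sequence sums to 1 (truncated here).
norm.const <- sum(seq_len(1e6)^(-1.6))
gammai <- seq_len(10)^(-1.6) / norm.const
round(gammai, 4)
```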
### Alpha-investing {#AlphaInvesting}
Ramdas *et al.* (2018) proposed a variant of the Alpha-investing algorithm of
Foster and Stine (2008) that guarantees FDR control for independent p-values.
This procedure uses SAFFRON's update rule with the constant $\lambda$
replaced by a sequence $\lambda_i = \alpha_i$. This is also equivalent to using
the ADDIS algorithm (see below) with $\tau = 1$ and $\lambda_i = \alpha_i$.
### ADDIS {#ADDIS}
The ADDIS procedure controls the FDR for independent p-values, and was proposed
by Tian & Ramdas (2019a). The algorithm compensates for the power loss of
SAFFRON with conservative nulls, by including both adaptivity in the fraction of
null hypotheses (like SAFFRON) and the conservativeness of nulls (unlike
SAFFRON).
ADDIS depends on constants $w_0, \lambda$ and $\tau$. $w_0$ represents the
initial 'wealth' of the procedure and satisfies $0 \leq w_0 \leq \tau \lambda
\alpha$. $\tau \in (0,1)$ represents the threshold for a hypothesis to be
selected for testing: p-values greater than $\tau$ are implicitly 'discarded' by
the procedure. Finally, $\lambda \in (0,1)$ sets the threshold for a p-value to
be a candidate for rejection: ADDIS will never reject a p-value larger than
$\tau \lambda$.
The significance thresholds for ADDIS are chosen as follows: \[
\alpha_t = \min\{\tau\lambda, \tilde{\alpha}_t\}
\] where \[
\tilde{\alpha}_t = w_0 \gamma_{S^t-C_{0+}} +
(\tau(1-\lambda)\alpha - w_0)\gamma_{S^t - \kappa_1^*-C_{1+}} +
\tau(1-\lambda)\alpha \sum_{j \geq 2} \gamma_{S^t - \kappa_j^* - C_{j+}}
\] and \[
\kappa_j = \min\{i \in [t-1] : \sum_{k \leq i}
1 \{p_k \leq \alpha_k\} \geq j\}, \;
\kappa_j^* = \sum_{i \leq \kappa_j} 1 \{p_i \leq \tau \}, \;
S^t = \sum_{i < t} 1 \{p_i \leq \tau \}, \;
C_{j+} = \sum_{i = \kappa_j + 1}^{t-1} 1\{p_i \leq \tau\}
\]
The default sequence of $\gamma$ for ADDIS is the same as for SAFFRON, given
in the [SAFFRON section](#SAFFRON).
### Alpha-spending {#Alpha-spending}
The Alpha-spending procedure controls the FWER for a potentially infinite stream
of p-values using a Bonferroni-like test. Given an overall significance level
$\alpha$, the significance thresholds are chosen as
\[\alpha_i = \alpha \gamma_i\]
where $\sum_{i=1}^{\infty} \gamma_i = 1$ and $\gamma_i \geq 0$. The procedure
strongly controls the FWER for arbitrarily dependent p-values.
Note that the procedure also controls the generalised familywise error rate
(k-FWER) for $k > 1$ if $\alpha$ is replaced by $\min(1,k\alpha)$.
The default sequence of $\gamma$ is the same as that for $\xi$ for LORD, given
in the [LORD section](#LORD).
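The rule $\alpha_i = \alpha\gamma_i$ is simple enough to sketch directly in
base R (an illustration only; `Alpha_spending()` in the package is the
reference implementation, and `alpha_spending_sketch` is our name):

```{r}
# Sketch of Alpha-spending (an online Bonferroni test).
alpha_spending_sketch <- function(pval, alpha = 0.05) {
  j <- seq_along(pval)
  C <- 0.139307  # normalising constant so the gamma sequence sums to 1
  gammai <- C / (j * log(pmax(j, 2))^3)
  alphai <- alpha * gammai
  data.frame(pval = pval, alphai = alphai, R = as.numeric(pval <= alphai))
}

alpha_spending_sketch(c(0.001, 0.1, 1e-04))
```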
### Online Fallback {#onlineFallback}
The online fallback procedure of Tian & Ramdas (2019b) provides a uniformly more
powerful method than Alpha-spending, by saving the significance level of a
previous rejection. More specifically, online fallback tests hypothesis $H_i$ at
level \[\alpha_i = \alpha \gamma_i + R_{i-1} \alpha_{i-1}\] where $R_i = 1\{p_i
\leq \alpha_i\}$ denotes a rejected hypothesis. The procedure strongly controls
the FWER for arbitrarily dependent p-values.
The default sequence of $\gamma$ is the same as that for $\xi$ for LORD, given
in the [LORD section](#LORD).
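The carry-over of a rejected level can be sketched in base R as follows (a
simplified illustration; `online_fallback()` in the package is the reference
implementation, and `online_fallback_sketch` is our name):

```{r}
# Sketch of online fallback: alphai = alpha * gammai + carried-over level,
# where the level of a rejected test is reclaimed for the next test.
online_fallback_sketch <- function(pval, alpha = 0.05) {
  j <- seq_along(pval)
  C <- 0.139307  # same default gamma sequence as Alpha-spending
  gammai <- C / (j * log(pmax(j, 2))^3)
  alphai <- R <- numeric(length(pval))
  carry <- 0  # significance level reclaimed from the previous rejection
  for (i in j) {
    alphai[i] <- alpha * gammai[i] + carry
    R[i] <- as.numeric(pval[i] <= alphai[i])
    carry <- R[i] * alphai[i]
  }
  data.frame(pval = pval, alphai = alphai, R = R)
}

online_fallback_sketch(c(0.001, 0.02, 0.5))
```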
### ADDIS-spending {#ADDIS-spending}
The ADDIS-spending procedure strongly controls the FWER for independent
p-values, and was proposed by Tian & Ramdas (2019b). The procedure compensates
for the power loss of Alpha-spending, by including both adaptivity in the
fraction of null hypotheses and the conservativeness of nulls.
ADDIS-spending depends on constants $\lambda$ and $\tau$, where
$\lambda < \tau$. Here $\tau \in (0,1)$ represents the threshold for a
hypothesis to be selected for testing: p-values greater than $\tau$ are
implicitly 'discarded' by the procedure, while $\lambda \in (0,1)$ sets the
threshold for a p-value to be a candidate for rejection: ADDIS-spending will
never reject a p-value larger than $\lambda$.
Note that the procedure controls the generalised familywise error rate (k-FWER)
for $k > 1$ if $\alpha$ is replaced by $\min(1,k\alpha)$. Tian and Ramdas
(2019b) also presented a version for handling local dependence, see the Section
on Asynchronous testing below.
The default sequence of $\gamma$ for ADDIS-spending is the same as for SAFFRON,
given in the [SAFFRON section](#SAFFRON).
### Accounting for dependent p-values
As noted above, the LORD, SAFFRON, ADDIS and ADDIS-spending procedures assume
independent p-values, while the LOND procedure is also valid under positive
dependence (like the Benjamini-Hochberg method, see below). Adjusted versions
of LOND and LORD are available for arbitrarily dependent p-values.
Alpha-spending and online fallback also control the FWER (and hence the FDR)
for arbitrarily dependent p-values.
By way of comparison, the usual Benjamini-Hochberg method for controlling
the FDR assumes that the p-values are positively dependent (PRDS). As an
example, the PRDS condition is satisfied for multivariate normal test
statistics with a positive correlation matrix. See Benjamini & Yekutieli (2001)
for further technical details.
---
## Asynchronous testing
Zrnic *et al.* (2018) proposed procedures to control the modified
FDR (mFDR) in the context of *asynchronous* testing, i.e. where each hypothesis
test can itself be a sequential process and the tests can overlap in time. They
presented asynchronous versions of the LOND, LORD and SAFFRON procedures for a
variety of trial settings, including the following:
1: **Asynchronous online mFDR control**:
This is for an asynchronous testing process, consisting of tests that start and
finish at (potentially) random times. The discretised finish times of the tests
correspond to the decision times.
2: **Online mFDR control under local dependence**:
For any $t>0$ we allow the p-value $p_t$ to have arbitrary dependence on the
previous $L_t$ p-values. The fixed sequence $L_t$ is referred to as the 'lags'.
3: **mFDR control in asynchronous mini-batch testing**:
A mini-batch represents a grouping of tests run asynchronously which
result in dependent p-values. Once a mini-batch of tests is fully completed,
a new one can start, testing hypotheses independent of the previous batch.
---
## Variations to the default options
In the following section, we consider the arguments that a typical user might
consider amending for their analysis.
### Common arguments
As a default, the `alpha` argument is set to 0.05, where `alpha` sets the
overall significance level of the FDR or FWER controlling procedure. By
convention, the standard significance level utilised is 5%. However, there
are applications where an alternative threshold could be considered. For
example, a more stringent threshold might be appropriate when there are limited
resources to follow up significant findings. A less stringent threshold might be
appropriate when the downstream analysis is a global analysis which can tolerate
a higher proportion of false positives.
To ensure correct interpretation of the dates provided there is a
`date.format` argument. As a default, dates are expected in the year-month-day
format (e.g. "2015-09-21"). The following website provides clear guidance
on the symbols used to specify date formats:
https://www.statmethods.net/input/dates.html
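As a quick check of the format symbols, base R's `as.Date()` uses the same
codes (the examples below are plain base R, not onlineFDR-specific):

```{r}
# "%Y-%m-%d" corresponds to the default year-month-day format;
# other input formats can be declared with the same symbols.
as.Date("2015-09-21", format = "%Y-%m-%d")
as.Date("21/09/2015", format = "%d/%m/%Y")  # day/month/year input
```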
As a default, the `random` argument is set to `TRUE`. In this situation, the
order of the p-values within each batch (i.e. those with the same date) is
randomised. This is to avoid the risk of p-values being ordered post-hoc, which
can lead to an inflation of the FDR. As the dataset grows, the data is
reprocessed. To ensure the consistency of the output (with the randomisation
within the previous batches remaining the same), it is necessary to set the
same seed for all analyses.
The user also has the option to turn off the randomisation step, by setting the
`random` argument to `FALSE`. This approach would be appropriate if the user
has both a date *and* a time stamp for the p-values, in which case the data
should be ordered by date and time beforehand and then passed to a wrapper
function. Another scenario would be when p-values within the batches are
ordered using *independent* side information, so that hypotheses most likely to
be rejected come first, which would potentially increase the power of the
procedure (see Javanmard and Montanari (2018) and Li and Barber (2017)).
### LOND
As a default, the `dep` argument is set to `FALSE`. Alternatively, this can be
set to `TRUE` and will implement the LOND procedure to guarantee FDR control for
arbitrarily dependent p-values. This method will in general be more
conservative.
```{r}
set.seed(1); results.indep <- LOND(sample.df) # for independent p-values
set.seed(1); results.dep <- LOND(sample.df, dep=TRUE) # for dependent p-values
# compare adjusted significance thresholds
cbind(independent = results.indep$alphai, dependent = results.dep$alphai)
```
The vector `betai` is supplied by default, but can optionally be specified by
the user (as described above, see the formula for $\beta_j$ in the
[LOND section](#LOND)).
### LORD
The default version of LORD used is version '++', but the user can optionally
specify versions 3, 'discard' and 'dep' using the `version` argument (see
[here](#LORD) for further details about the different versions).
```{r}
set.seed(1); results.LORD.plus <- LORD(sample.df)
set.seed(1); results.LORD3 <- LORD(sample.df, version=3)
set.seed(1); results.LORD.discard <- LORD(sample.df, version='discard')
set.seed(1); results.LORD.dep <- LORD(sample.df, version='dep')
# compare adjusted significance thresholds
cbind(LORD.plus = results.LORD.plus$alphai,
      LORD3 = results.LORD3$alphai,
      LORD.discard = results.LORD.discard$alphai,
      LORD.dep = results.LORD.dep$alphai)
```
By default $w_0 = \alpha/10$ and (for LORD 3 and LORD dep) $b_0 = \alpha - w_0$,
but these parameters can optionally be specified by the user subject to the
requirements that $0 \leq w_0 \leq \alpha$, $b_0 > 0$ and $w_0+b_0 \leq \alpha$.
The value of `gammai` is also supplied by default, but can optionally be
specified by the user (as described above; see the [LORD section](#LORD) for
the formula for $\xi_j$ used when `version='dep'` and for $\gamma_j$ used by
all other versions of LORD).
### SAFFRON
By default $w_0 = \alpha/2$ and $\lambda = 0.5$, but these parameters can
optionally be specified by the user subject to the requirements that
$0 \leq w_0 < \alpha$ and $0 < \lambda < 1$. The values of `gammai` are also
supplied by default, but can optionally be specified by the user (as described
above, see the formula for $\gamma_j$ in the [SAFFRON section](#SAFFRON)).
### ADDIS
By default $w_0 = \tau \lambda \alpha/2$ and $\lambda = \tau = 0.5$, but these
parameters can optionally be specified by the user subject to the requirements
that $0 \leq w_0 < \tau \lambda \alpha$, $0 < \lambda < 1$ and $0 < \tau < 1$.
The values of `gammai` are also supplied by default, but can optionally be
specified by the user.
### Alpha-spending and online fallback
The values of `gammai` are supplied by default, but can optionally
be specified by the user.
### ADDIS-spending
By default $\lambda = 0.25$ and $\tau = 0.5$, but these
parameters can optionally be specified by the user subject to the requirements
that $\lambda < \tau$, $0 < \lambda < 1$ and $0 < \tau < 1$.
The values of `gammai` are also supplied by default, but can optionally be
specified by the user.
---
## Acknowledgements
We would like to thank the IMPC team (via Jeremy Mason and Hamed Haseli
Mashhadi) for useful discussions during the development of the package.
---
## References
Aharoni, E. and Rosset, S. (2014). Generalized $\alpha$-investing: definitions,
optimality results and applications to public databases.
*Journal of the Royal Statistical Society (Series B)*, 76(4):771-794.
Benjamini, Y., and Yekutieli, D. (2001). The control of the false discovery rate
in multiple testing under dependency. *The Annals of Statistics*,
29(4):1165-1188.
Bourgon, R., Gentleman, R., and Huber, W. (2010). Independent filtering
increases detection power for high-throughput experiments.
*Proceedings of the National Academy of Sciences*, 107(21), 9546-9551.
Foster, D. and Stine R. (2008). $\alpha$-investing: a procedure for sequential
control of expected false discoveries.
*Journal of the Royal Statistical Society (Series B)*, 29(4):429-444.
Ioannidis, J.P.A. (2005). Why most published research findings are false.
*PLoS Medicine*, 2.8:e124.
Javanmard, A., and Montanari, A. (2015). On Online Control of False
Discovery Rate. *arXiv preprint*, https://arxiv.org/abs/1502.06197.
Javanmard, A., and Montanari, A. (2018). Online Rules for Control of False
Discovery Rate and False Discovery Exceedance. *Annals of Statistics*,
46(2):526-554.
Koscielny, G., *et al*. (2013). The International Mouse Phenotyping Consortium
Web Portal, a unified point of access for knockout mice and related phenotyping
data. *Nucleic Acids Research*, 42.D1:D802-D809.
Li, A., and Barber, R.F. (2017). Accumulation Tests for FDR Control in Ordered
Hypothesis Testing. *Journal of the American Statistical Association*,
112(518):837-849.
Ramdas, A., Yang, F., Wainwright M.J. and Jordan, M.I. (2017). Online control
of the false discovery rate with decaying memory.
*Advances in Neural Information Processing Systems 30*, 5650-5659.
Ramdas, A., Zrnic, T., Wainwright M.J. and Jordan, M.I. (2018). SAFFRON: an
adaptive algorithm for online control of the false discovery rate.
*Proceedings of the 35th International Conference in Machine Learning*,
80:4286-4294.
Robertson, D.S. and Wason, J.M.S. (2018). Online control of the false discovery
rate in biomedical research. *arXiv preprint*, https://arxiv.org/abs/1809.07292.
Robertson, D.S., Wildenhain, J., Javanmard, A. and Karp, N.A. (2019).
Online control of the false discovery rate in biomedical research.
*Bioinformatics*, 35:4196-4199, https://doi.org/10.1093/bioinformatics/btz191.
Tian, J. and Ramdas, A. (2019a). ADDIS: an adaptive discarding algorithm for
online FDR control with conservative nulls. *arXiv preprint*,
https://arxiv.org/abs/1905.11465.
Tian, J. and Ramdas, A. (2019b). Online control of the familywise error rate.
*arXiv preprint*, https://arxiv.org/abs/1910.04900.
Zrnic, T., Ramdas, A. and Jordan, M.I. (2018). Asynchronous Online Testing of
Multiple Hypotheses. *arXiv preprint*, https://arxiv.org/abs/1812.05068.