# A tibble: 2 x 3
  group               n smokers
1 Cancer patients   86.     82.
2 Control group     86.     72.

3.1 Two Sample Binomial Model

In implementing this model, we have just two data points (cancer patients and control group) and a binomial sampling model, in which the population proportions of smokers in each group appear as parameters. Quantities of interest such as the difference in the population proportions and the log of the odds ratio are computed in the generated quantities section. Uniform priors on the population proportions are used in this example.

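\[ \begin{aligned}[t] r_i &\sim \mathsf{Binomial}(n_i, \pi_i), \quad i = 1, 2 \end{aligned} \] Here \(r_i\) is the number of smokers among the \(n_i\) subjects in group \(i\), with the cancer patients as group 1 and the control group as group 2. Two derived quantities are also of interest: the difference, \[ \delta = \pi_1 - \pi_2 , \] and the log-odds ratio, \[ \lambda = \log\left(\frac{\pi_1}{1 - \pi_1}\right) - \log \left( \frac{\pi_2}{1 - \pi_2} \right) . \]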

Uniform priors, expressed as \(\mathsf{Beta}(1, 1)\) distributions, are placed on each \(\pi_i\), \[ \begin{aligned} \pi_i &\sim \mathsf{Beta}(1, 1) \end{aligned} \]

The difference \(\delta\) and the log-odds ratio \(\lambda\) are defined in the generated quantities block.

The Stan model for this is:

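data {
  // number of smokers and group sizes (1 = cancer patients, 2 = controls)
  // (newer Stan releases write these as: array[2] int<lower=0> r;)
  int<lower=0> r[2];
  int<lower=0> n[2];
  // shape parameters of the beta priors on p
  vector<lower=0>[2] p_a;
  vector<lower=0>[2] p_b;
}
parameters {
  // population proportions of smokers; constrained to [0, 1]
  vector<lower=0, upper=1>[2] p;
}
model {
  p ~ beta(p_a, p_b);
  r ~ binomial(n, p);
}
generated quantities {
  // difference in proportions, and indicator that it is positive
  real delta;
  int delta_up;
  // log-odds ratio, and indicator that it is positive
  real lambda;
  int lambda_up;

  delta = p[1] - p[2];
  delta_up = delta > 0;
  lambda = logit(p[1]) - logit(p[2]);
  lambda_up = lambda > 0;
}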

Now estimate the model:
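As a cross-check that needs no Stan installation: with the uniform \(\mathsf{Beta}(1, 1)\) priors used here, the Beta prior is conjugate to the binomial likelihood, so each posterior is \(\pi_i \mid r_i \sim \mathsf{Beta}(1 + r_i, 1 + n_i - r_i)\) and the generated quantities can be approximated by direct Monte Carlo. The sketch below is in Python with only the standard library. The function name `simulate_posterior`, the seed, and the number of draws are illustrative assumptions, not part of the original analysis.

```python
import math
import random


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def simulate_posterior(r, n, p_a=(1.0, 1.0), p_b=(1.0, 1.0), draws=20000, seed=1):
    """Summarise delta and lambda from draws of the conjugate Beta posteriors.

    Hypothetical helper: it mirrors the generated quantities block of the
    Stan model, but samples the closed-form posterior instead of using MCMC.
    """
    rng = random.Random(seed)
    delta_sum = 0.0
    lambda_sum = 0.0
    delta_up = 0
    lambda_up = 0
    for _ in range(draws):
        # Posterior for group i is Beta(p_a[i] + r[i], p_b[i] + n[i] - r[i]).
        p1 = rng.betavariate(p_a[0] + r[0], p_b[0] + n[0] - r[0])
        p2 = rng.betavariate(p_a[1] + r[1], p_b[1] + n[1] - r[1])
        delta = p1 - p2
        lam = logit(p1) - logit(p2)
        delta_sum += delta
        lambda_sum += lam
        delta_up += delta > 0
        lambda_up += lam > 0
    return {
        "delta_mean": delta_sum / draws,
        "pr_delta_up": delta_up / draws,
        "lambda_mean": lambda_sum / draws,
        "pr_lambda_up": lambda_up / draws,
    }


# Data from the table: smokers (r) out of group size (n),
# cancer patients first, control group second.
summary = simulate_posterior(r=[82, 72], n=[86, 86])
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Because the logit is monotone, \(\delta > 0\) and \(\lambda > 0\) describe the same event, so the two estimated probabilities should agree. Estimates from the Stan fit should be close to these values up to Monte Carlo error.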

3.2 Binomial Logit Model of the Difference

An alternative parameterization directly models the difference between the two groups on the log-odds scale.

\[ \begin{aligned}[t] r_i &\sim \mathsf{Binomial}(n_i, \pi_i) \\ \pi_1 &= \frac{1}{1 + \exp(-(\alpha + \beta))} \\ \pi_2 &= \frac{1}{1 + \exp(-\alpha)} \end{aligned} \] The parameters \(\alpha\) and \(\beta\) are given weakly informative priors on the log-odds scale, \[ \begin{aligned} \alpha &\sim N(0, 10)\\ \beta &\sim N(0, 2.5) \end{aligned} \]
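A short derivation, using only the definitions of \(\pi_1\) and \(\pi_2\) in this model, shows why the parameterization models the difference directly. Since the inverse-logit and logit functions undo one another, \(\log\left(\pi_1 / (1 - \pi_1)\right) = \alpha + \beta\) and \(\log\left(\pi_2 / (1 - \pi_2)\right) = \alpha\), so \[ \begin{aligned} \lambda &= \log\left(\frac{\pi_1}{1 - \pi_1}\right) - \log\left(\frac{\pi_2}{1 - \pi_2}\right) \\ &= (\alpha + \beta) - \alpha \\ &= \beta . \end{aligned} \] In other words, \(\alpha\) is the log-odds of smoking in the control group, \(\beta\) is the log-odds ratio, and the prior on \(\beta\) is a prior on \(\lambda\) itself.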

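data {
  // number of smokers and group sizes (1 = cancer patients, 2 = controls)
  // (newer Stan releases write these as: array[2] int<lower=0> r;)
  int<lower=0> r[2];
  int<lower=0> n[2];
  // location and scale of the normal priors on a and b
  real a_loc;
  real<lower=0> a_scale;
  real b_loc;
  real<lower=0> b_scale;
}
parameters {
  // a: log-odds in the control group; b: log-odds ratio
  real a;
  real b;
}
transformed parameters {
  vector<lower=0, upper=1>[2] p;
  p[1] = inv_logit(a + b);
  p[2] = inv_logit(a);
}
model {
  a ~ normal(a_loc, a_scale);
  b ~ normal(b_loc, b_scale);
  r ~ binomial(n, p);
}
generated quantities {
  // difference in proportions, and indicator that it is positive
  real delta;
  int delta_up;
  // log-odds ratio, and indicator that it is positive
  real lambda;
  int lambda_up;

  delta = p[1] - p[2];
  delta_up = delta > 0;
  lambda = logit(p[1]) - logit(p[2]);
  lambda_up = lambda > 0;
}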

Re-use r and n values from cancer_data, but add the appropriate values for the prior distributions.

Sample from the model:
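A minimal sketch of the data for this model, in Python with only the standard library. The `r` and `n` values come from the table, and the prior locations and scales follow the \(N(0, 10)\) and \(N(0, 2.5)\) priors given for \(\alpha\) and \(\beta\). The name `logit_data`, the helper functions, and the grid limits are illustrative assumptions. Because the model has only two parameters, the posterior of `b` can be approximated on a grid, which gives a rough check on the Stan output rather than a replacement for sampling.

```python
import math

# Data for the logit model: r and n as in the table (cancer patients first),
# plus the location and scale of the normal priors on a and b.
logit_data = {
    "r": [82, 72],
    "n": [86, 86],
    "a_loc": 0.0,
    "a_scale": 10.0,
    "b_loc": 0.0,
    "b_scale": 2.5,
}


def inv_logit(x):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def log_posterior(a, b, d):
    """Unnormalised log posterior of (a, b), matching the Stan model block."""
    lp = -0.5 * ((a - d["a_loc"]) / d["a_scale"]) ** 2
    lp += -0.5 * ((b - d["b_loc"]) / d["b_scale"]) ** 2
    probs = (inv_logit(a + b), inv_logit(a))
    for r, n, p in zip(d["r"], d["n"], probs):
        lp += r * math.log(p) + (n - r) * math.log(1.0 - p)
    return lp


def grid_summary(d, step=0.02, n_a=301, n_b=451):
    """Posterior mean of b and Pr(b > 0) on a grid (hypothetical helper).

    With the defaults, a ranges over [-1, 5] and b over [-3, 6]; these
    limits are assumptions chosen to cover the bulk of the posterior.
    """
    a_grid = [-1.0 + step * i for i in range(n_a)]
    b_grid = [-3.0 + step * j for j in range(n_b)]
    cells = [(b, log_posterior(a, b, d)) for a in a_grid for b in b_grid]
    top = max(lp for _, lp in cells)  # subtract the maximum to avoid underflow
    total = 0.0
    b_sum = 0.0
    b_up = 0.0
    for b, lp in cells:
        w = math.exp(lp - top)
        total += w
        b_sum += w * b
        if b > 0:
            b_up += w
    return {"b_mean": b_sum / total, "pr_b_up": b_up / total}


result = grid_summary(logit_data)
print(result)
```

The same dictionary keys match the names declared in the Stan `data` block, so the values can be passed to whichever Stan interface is in use.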

3.3 Questions

  1. Express the binomial logit model of the difference as a regression.
  2. What numbers of successes and failures is a Beta(1, 1) prior equivalent to?