6 MLDA Difference-in-Difference

Difference-in-difference estimates of the effect of the minimum legal drinking age (MLDA) on mortality (Mouchel, Williams, and Zador 1987; Norberg, Bierut, and Grucza 2009). This replicates the analyses in Tables 5.2 and 5.3 in Mastering ’Metrics.

Load necessary libraries.

library("tidyverse")
library("haven")
library("rlang")
library("broom")
library("clubSandwich")
data("deaths", package = "masteringmetrics")

In these regressions, we will use both indicator variables for year as well as a trend, so make a factor version of the year variable.

deaths <- mutate(deaths, year_fct = factor(year))

6.1 Table 5.2

Regression DD Estimates of MLDA-Induced Deaths among 18-20 year-olds, from 1970-1983

dtypes <- c("all" = "All deaths",
            "MVA" = "Motor vehicle accidents",
            "suicide" = "Suicide",
            "internal" = "All internal causes")

Estimate the DD for MLDA for all causes of death in 18-20 year olds. Run the regression with lm and calculate the cluster robust standard errors using sandwich::vcovCL. Subset the data.

data <- filter(deaths, year <= 1983, agegr == "18-20 yrs", dtype == "all")

Run the OLS model.

mod <- lm(mrate ~ 0 + legal + state + year_fct, data = data)

Calculate cluster robust coefficients. These are calculated using a different method than Stata uses, and thus will be slightly different than those reported in the book.

vcov <- vcovCR(mod, cluster = data[["state"]],
               type = "CR2")
coef_test(mod, vcov = vcov) %>%
  rownames_to_column(var = "term") %>%
  as_tibble() %>%
  select(term, estimate = beta, std.error = SE) %>%
  filter(term == "legal") %>%
  knitr::kable(digits = 2)
term estimate std.error
legal 10.8 4.48

Function to calculate clustered standard errors and return a tidy data frame of the coefficients and standard errors.

cluster_se <- function(mod, cluster, type = "CR2") {
  vcov <- vcovCR(mod, cluster = cluster, type = "CR2")
  coef_test(mod, vcov = vcov) %>%
    rownames_to_column(var = "term") %>%
    as_tibble() %>%
    select(term, estimate = beta, std.error = SE)
}
run_mlda_dd <- function(i) {
  data <- filter(deaths, year <= 1983, agegr == "18-20 yrs", dtype == i) # nolint
  mods <- tribble(
    ~ name, ~ model,
    "No trends, no weights",
    lm(mrate ~ 0 + legal + state + year_fct, data = data),
    "Time trends, no weights",
    lm(mrate ~ 0 + legal + year_fct + state + state:year, data = data),
    "No trends, weights",
    lm(mrate ~ 0 + legal + year_fct + state, data = data, weights = pop),
    # nolint start
    # "Time trends, weights",
    #   lm(mrate ~ 0 + legal + year_fct + state + state:year,
    #      data = data, weights = pop)
    # nolint end
  ) %>%
    mutate(coefs = map(model, ~ cluster_se(.x, cluster = data[["state"]],
                                           type = "CR2"))) %>%
    unnest(coefs) %>%
    filter(term == "legal") %>%
    mutate(response = i) %>%
    select(name, response, estimate, std.error)
}
mlda_dd <- map_df(names(dtypes), run_mlda_dd)
mlda_dd %>%
  knitr::kable(digits = 2)
name response estimate std.error
No trends, no weights all 10.80 4.48
Time trends, no weights all 8.47 4.74
No trends, weights all 12.41 4.78
No trends, no weights MVA 7.59 2.43
Time trends, no weights MVA 6.64 2.47
No trends, weights MVA 7.50 2.30
No trends, no weights suicide 0.59 0.57
Time trends, no weights suicide 0.47 0.74
No trends, weights suicide 1.49 0.92
No trends, no weights internal 1.33 1.53
Time trends, no weights internal 0.08 1.80
No trends, weights internal 1.89 1.83

6.2 Table 5.3

Regression DD Estimates of MLDA-Induced Deaths among 18-20 year-olds, from 1970-1983, controlling for Beer Taxes. This is the analysis presented in Angrist and Pischke (2014) Table 5.3.

run_beertax <- function(i) {
  data <- filter(deaths, year <= 1983, agegr == "18-20 yrs",
                 dtype == i, !is.na(beertaxa))
  out <- tribble(
    ~ name, ~ model,
    "No time trends",
    lm(mrate ~ 0 + legal + beertaxa + year_fct + state, data = data),
    "Time trends",
    lm(mrate ~ 0 + legal + beertaxa + year_fct + state + state:year,
       data = data)
  ) %>%
    # calc culstered standard errors
    mutate(coefs = map(model, ~ cluster_se(.x, data[["state"]]))) %>%
    unnest(coefs) %>%
    filter(term %in% c("legal", "beertaxa")) %>%
    mutate(response = i) %>%
    select(response, name, term, estimate, std.error)
}
beertax <- map_df(names(dtypes), run_beertax)
beertax %>%
  knitr::kable(digits = 2)
response name term estimate std.error
all No time trends legal 10.98 4.60
all No time trends beertaxa 1.51 9.02
all Time trends legal 10.03 4.57
all Time trends beertaxa -5.52 30.40
MVA No time trends legal 7.59 2.51
MVA No time trends beertaxa 3.82 5.27
MVA Time trends legal 6.89 2.47
MVA Time trends beertaxa 26.88 18.76
suicide No time trends legal 0.45 0.58
suicide No time trends beertaxa -3.05 1.61
suicide Time trends legal 0.38 0.72
suicide Time trends beertaxa -12.13 8.28
internal No time trends legal 1.46 1.56
internal No time trends beertaxa -1.36 3.02
internal Time trends legal 0.88 1.68
internal Time trends beertaxa -10.31 10.90

Note: I had trouble getting sandwich::vcovCL to estimate clustered standard errors for this regression.