6 MLDA Difference-in-Difference

This section computes difference-in-difference (DD) estimates of the effect of the minimum legal drinking age (MLDA) on mortality (Mouchel, Williams, and Zador 1987; Norberg, Bierut, and Grucza 2009). It replicates the analyses in Tables 5.2 and 5.3 of Mastering ’Metrics.

Load necessary libraries.

library("tidyverse")
library("haven")
library("rlang")
library("broom")
library("clubSandwich")

data("deaths", package = "masteringmetrics")

These regressions use year in two ways, as a set of indicator variables and as a linear trend, so create a factor version of the year variable alongside the numeric one.

deaths <- mutate(deaths, year_fct = factor(year))
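To see why both versions of year are needed, the following minimal sketch builds design matrices on a small made-up panel (the `toy` data frame is hypothetical and not part of the `deaths` data). A factor year expands into one indicator column per year, while interacting `state` with the numeric year gives one linear trend column per state.

```r
# Hypothetical toy panel: 2 states observed over 3 years (illustration only)
toy <- expand.grid(state = factor(c("A", "B")), year = 1970:1972)
toy$year_fct <- factor(toy$year)

# Year as a factor: one indicator column per year
X_fe <- model.matrix(~ 0 + year_fct, data = toy)
colnames(X_fe)

# state interacted with numeric year: one linear trend column per state
X_trend <- model.matrix(~ 0 + state:year, data = toy)
colnames(X_trend)
```

In the models below, `year_fct` supplies the year effects and `state:year` supplies the state-specific trends.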

6.1 Table 5.2

Regression DD Estimates of MLDA-Induced Deaths among 18-20 year-olds, from 1970-1983

dtypes <- c("all" = "All deaths",
            "MVA" = "Motor vehicle accidents",
            "suicide" = "Suicide",
            "internal" = "All internal causes")

Estimate the DD for MLDA for all causes of death among 18-20 year-olds. Run the regression with lm and calculate cluster-robust standard errors using clubSandwich::vcovCR. First, subset the data.

data <- filter(deaths, year <= 1983, agegr == "18-20 yrs", dtype == "all")

Run the OLS model.

mod <- lm(mrate ~ 0 + legal + state + year_fct, data = data)
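As a reading aid, the formula above corresponds to a two-way fixed effects DD regression. The notation below is an assumption chosen for illustration (state $s$, year $t$, with $\mathit{STATE}_{ks}$ and $\mathit{YEAR}_{jt}$ denoting state and year indicators); it is a sketch of the specification implied by the `lm` call, not a quotation from the book.

```latex
Y_{st} = \delta_{DD}\,\mathit{LEGAL}_{st}
       + \sum_{k} \beta_k\,\mathit{STATE}_{ks}
       + \sum_{j} \gamma_j\,\mathit{YEAR}_{jt}
       + e_{st}
```

Because the formula uses `0 +`, there is no separate intercept: every state gets its own indicator, and one year indicator is dropped as the reference category. The coefficient on `legal` is the DD estimate $\delta_{DD}$.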

Calculate cluster-robust standard errors, clustering by state. The CR2 adjustment used here is a different method from the one Stata uses, so the standard errors will differ slightly from those reported in the book.

vcov <- vcovCR(mod, cluster = data[["state"]],
               type = "CR2")
coef_test(mod, vcov = vcov) %>%
  rownames_to_column(var = "term") %>%
  as_tibble() %>%
  select(term, estimate = beta, std.error = SE) %>%
  filter(term == "legal") %>%
  knitr::kable(digits = 2)

term	estimate	std.error
legal	10.8	4.48

Define a function that calculates clustered standard errors and returns a tidy data frame of the coefficients and their standard errors.

cluster_se <- function(mod, cluster, type = "CR2") {
  # pass the type argument through instead of hard-coding "CR2"
  vcov <- vcovCR(mod, cluster = cluster, type = type)
  coef_test(mod, vcov = vcov) %>%
    rownames_to_column(var = "term") %>%
    as_tibble() %>%
    select(term, estimate = beta, std.error = SE)
}
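For intuition about what a cluster-robust variance estimate does, here is a minimal base R sketch on simulated data. It computes the simplest variant, CR0, by hand. This is deliberately not the CR2 estimator that `cluster_se` requests from clubSandwich, which adds a small-sample adjustment, so the numbers would not match `vcovCR(..., type = "CR2")`. All object names and the simulated data are assumptions made for illustration.

```r
set.seed(1)

# Hypothetical toy data: 10 clusters ("states") with 8 observations each,
# and a shared cluster-level shock that induces within-cluster correlation
g <- rep(1:10, each = 8)
x <- rnorm(80)
y <- 1 + 0.5 * x + rnorm(10)[g] + rnorm(80)
fit <- lm(y ~ x)

X <- model.matrix(fit)
u <- residuals(fit)

# "Bread": (X'X)^{-1}
bread <- solve(crossprod(X))

# "Meat": sum over clusters of (X_g' u_g)(X_g' u_g)'
scores <- rowsum(X * u, group = g)  # one row of summed scores per cluster
meat <- crossprod(scores)

# CR0 sandwich estimator and the implied standard errors
vcov_cr0 <- bread %*% meat %*% bread
se_cr0 <- sqrt(diag(vcov_cr0))

# Classical OLS standard errors for comparison
se_ols <- sqrt(diag(vcov(fit)))
```

The key step is summing the scores within each cluster before taking outer products, which lets errors be arbitrarily correlated within a state while treating states as independent.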

run_mlda_dd <- function(i) {
  data <- filter(deaths, year <= 1983, agegr == "18-20 yrs", dtype == i) # nolint
  mods <- tribble(
    ~ name, ~ model,
    "No trends, no weights",
    lm(mrate ~ 0 + legal + state + year_fct, data = data),
    "Time trends, no weights",
    lm(mrate ~ 0 + legal + year_fct + state + state:year, data = data),
    "No trends, weights",
    lm(mrate ~ 0 + legal + year_fct + state, data = data, weights = pop),
    # nolint start
    # "Time trends, weights",
    #   lm(mrate ~ 0 + legal + year_fct + state + state:year,
    #      data = data, weights = pop)
    # nolint end
  ) %>%
    mutate(coefs = map(model, ~ cluster_se(.x, cluster = data[["state"]],
                                           type = "CR2"))) %>%
    unnest(coefs) %>%
    filter(term == "legal") %>%
    mutate(response = i) %>%
    select(name, response, estimate, std.error)
}

mlda_dd <- map_df(names(dtypes), run_mlda_dd)

mlda_dd %>%
  knitr::kable(digits = 2)

name	response	estimate	std.error
No trends, no weights	all	10.80	4.48
Time trends, no weights	all	8.47	4.74
No trends, weights	all	12.41	4.78
No trends, no weights	MVA	7.59	2.43
Time trends, no weights	MVA	6.64	2.47
No trends, weights	MVA	7.50	2.30
No trends, no weights	suicide	0.59	0.57
Time trends, no weights	suicide	0.47	0.74
No trends, weights	suicide	1.49	0.92
No trends, no weights	internal	1.33	1.53
Time trends, no weights	internal	0.08	1.80
No trends, weights	internal	1.89	1.83

6.2 Table 5.3

Regression DD Estimates of MLDA-Induced Deaths among 18-20 year-olds, from 1970-1983, controlling for Beer Taxes. This is the analysis presented in Angrist and Pischke (2014) Table 5.3.

run_beertax <- function(i) {
  data <- filter(deaths, year <= 1983, agegr == "18-20 yrs",
                 dtype == i, !is.na(beertaxa))
  out <- tribble(
    ~ name, ~ model,
    "No time trends",
    lm(mrate ~ 0 + legal + beertaxa + year_fct + state, data = data),
    "Time trends",
    lm(mrate ~ 0 + legal + beertaxa + year_fct + state + state:year,
       data = data)
  ) %>%
    # calculate clustered standard errors
    mutate(coefs = map(model, ~ cluster_se(.x, data[["state"]]))) %>%
    unnest(coefs) %>%
    filter(term %in% c("legal", "beertaxa")) %>%
    mutate(response = i) %>%
    select(response, name, term, estimate, std.error)
}

beertax <- map_df(names(dtypes), run_beertax)

beertax %>%
  knitr::kable(digits = 2)

response	name	term	estimate	std.error
all	No time trends	legal	10.98	4.60
all	No time trends	beertaxa	1.51	9.02
all	Time trends	legal	10.03	4.57
all	Time trends	beertaxa	-5.52	30.40
MVA	No time trends	legal	7.59	2.51
MVA	No time trends	beertaxa	3.82	5.27
MVA	Time trends	legal	6.89	2.47
MVA	Time trends	beertaxa	26.88	18.76
suicide	No time trends	legal	0.45	0.58
suicide	No time trends	beertaxa	-3.05	1.61
suicide	Time trends	legal	0.38	0.72
suicide	Time trends	beertaxa	-12.13	8.28
internal	No time trends	legal	1.46	1.56
internal	No time trends	beertaxa	-1.36	3.02
internal	Time trends	legal	0.88	1.68
internal	Time trends	beertaxa	-10.31	10.90

Note: I had trouble getting sandwich::vcovCL to estimate clustered standard errors for this regression; the code above uses clubSandwich::vcovCR.
