
# Chapter 12 Formatting Tables

## 12.1 Overview of Packages

R has multiple packages and functions for directly producing formatted tables for LaTeX, HTML, and other output formats.

See the Reproducible Research Task View for an overview of various options.

• xtable is a general purpose package for creating LaTeX, HTML, or plain text tables in R.

• texreg is more specifically geared to regression tables. It outputs results in LaTeX (texreg), HTML (htmlreg), and plain text (screenreg).

The packages stargazer and apsrtable are other popular packages for formatting regression output. However, they are less well maintained and have less functionality than texreg. For example, apsrtable hasn’t been updated since 2012, and stargazer since 2015.

The texreg vignette is a good introduction to texreg. These blog posts by Will Lowe cover many of the options.

Additionally, for simple tables, knitr, the package which provides the heavy lifting for R markdown, has a function kable. knitr also has the ability to customize how R objects are printed with the knit_print function.
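As a rough sketch of the knit_print mechanism, the method below follows the pattern from knitr's documentation on custom print methods: it makes data frames render through kable when they are printed inside a knitr document. Choosing data frames as the class to customize is only an illustration.

```r
library(knitr)

# A knit_print method for data frames: format the object with kable() and
# mark the result as "asis" so knitr writes it out as raw markdown.
knit_print.data.frame <- function(x, ...) {
  res <- paste(c("", "", kable(x)), collapse = "\n")
  asis_output(res)
}

# Register the method so that knit_print() dispatches to it.
registerS3method("knit_print", "data.frame", knit_print.data.frame)
```

With this in a setup chunk, a chunk that simply evaluates a data frame should produce a markdown table rather than console-style output.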

Other notable packages are:

• pander creates output in markdown for export to other formats.
• tables uses a formula syntax to define tables.
• ReporteRs has the most complete support for creating Word documents, but is likely more than most users need.

For a political science perspective on why automating the research process is important see:

## 12.2 Summary Statistic Table Example

The xtable package has methods to convert many types of R objects to tables.
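The code that builds the summary table printed next is not shown here. A minimal sketch of one way to build it, assuming the data are the `gapminder` data frame from the gapminder package (the variable names and the 1,704 observations suggest this, but it is an assumption) and that dplyr and tidyr are installed. The object name `gapminder_summary` is hypothetical.

```r
library(dplyr)
library(tidyr)
library(gapminder)  # assumed data source

gapminder_summary <- gapminder %>%
  # keep only the numeric variables
  select(year, lifeExp, pop, gdpPercap) %>%
  # one row per (variable, value) pair
  pivot_longer(everything(), names_to = "variable", values_to = "value") %>%
  group_by(variable) %>%
  # one row of summary statistics per variable
  summarise(
    n = n(),
    Mean = mean(value),
    `Std. Dev.` = sd(value),
    Median = median(value),
    `Min.` = min(value),
    `Max.` = max(value)
  )

gapminder_summary
```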

```
## # A tibble: 4 x 7
##   variable      n       Mean Std. Dev.    Median    Min.         Max.
##   <chr>     <int>      <dbl>       <dbl>     <dbl>   <dbl>        <dbl>
## 1 gdpPercap  1704     7215.       9857.     3532.    241.      113523.
## 2 lifeExp    1704       59.5        12.9      60.7    23.6         82.6
## 3 pop        1704 29601212.  106157897.  7023596.  60011.  1318683096.
## 4 year       1704     1980.         17.3    1980.   1952.        2007.
```

Now that we have a data frame with the table we want, use xtable to create it:

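| variable | n | Mean | Std. Dev. | Median | Min. | Max. |
|:--|--:|--:|--:|--:|--:|--:|
| gdpPercap | 1,704 | 7,215 | 9,857 | 3,532 | 241 | 113,523 |
| lifeExp | 1,704 | 59 | 13 | 61 | 24 | 83 |
| pop | 1,704 | 29,601,212 | 106,157,897 | 7,023,596 | 60,011 | 1,318,683,096 |
| year | 1,704 | 1,980 | 17 | 1,980 | 1,952 | 2,007 |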

Note that two functions were needed to get HTML. The function xtable creates an xtable R object, and the function print.xtable (called as print()) prints the xtable object as HTML (or LaTeX). The default HTML does not look nice, and would need to be formatted with CSS. If you are copying and pasting it into Word, you would do some post-processing cleanup anyway.
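The two-step pattern can be sketched as follows. The small data frame is a stand-in that copies two rows of the summary statistics shown earlier, so that the example runs on its own; the object names are hypothetical, and the formatting options (no decimals, comma separators, no row names) are my guess at what produces output like the table above.

```r
library(xtable)

# Stand-in for the summary data frame (values copied from the output above)
summary_df <- data.frame(
  variable = c("gdpPercap", "lifeExp"),
  n = c(1704L, 1704L),
  Mean = c(7215.327, 59.47444),
  `Std. Dev.` = c(9857.455, 12.91711),
  check.names = FALSE
)

# Step 1: xtable() creates the xtable object; digits = 0 rounds all columns
summary_xtable <- xtable(summary_df, digits = 0)

# Step 2: print() (the print.xtable method) renders it as HTML
print(
  summary_xtable,
  type = "html",
  include.rownames = FALSE,
  format.args = list(big.mark = ",")
)
```

Switching `type = "html"` to `type = "latex"` gives the LaTeX version of the same table.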

Another alternative is the kable function in the knitr package, which outputs R markdown tables.

| variable | n | Mean | Std. Dev. | Median | Min. | Max. |
|:--|--:|--:|--:|--:|--:|--:|
| gdpPercap | 1704 | 7.215327e+03 | 9.857455e+03 | 3531.8470 | 241.1659 | 1.135231e+05 |
| lifeExp | 1704 | 5.947444e+01 | 1.291711e+01 | 60.7125 | 23.5990 | 8.260300e+01 |
| pop | 1704 | 2.960121e+07 | 1.061579e+08 | 7023595.5000 | 60011.0000 | 1.318683e+09 |
| year | 1704 | 1.979500e+03 | 1.726533e+01 | 1979.5000 | 1952.0000 | 2.007000e+03 |

This is useful for producing quick tables.
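A minimal sketch of a kable call, again using a small stand-in data frame with values copied from the summary table so the example is self-contained. The `digits` and `format.args` arguments are optional; I add them here only to show how the scientific notation in the default output could be avoided.

```r
library(knitr)

# Stand-in for the summary data frame (values copied from the output above)
summary_df <- data.frame(
  variable = c("gdpPercap", "lifeExp"),
  n = c(1704L, 1704L),
  Mean = c(7215.327, 59.47444),
  `Std. Dev.` = c(9857.455, 12.91711),
  check.names = FALSE
)

# Round to one decimal place and add thousands separators
quick_table <- kable(summary_df, digits = 1, format.args = list(big.mark = ","))
quick_table
```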

Finally, the htmlTable package unsurprisingly produces HTML tables.

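| | variable | n | Mean | Std. Dev. | Median | Min. | Max. |
|:--|:--|--:|--:|--:|--:|--:|--:|
| 1 | gdpPercap | 1704 | 7 | 10 | 3532 | 241 | 1 |
| 2 | lifeExp | 1704 | 6 | 1 | 61 | 24 | 8 |
| 3 | pop | 1704 | 3 | 1 | 7023596 | 60011 | 1 |
| 4 | year | 1704 | 2 | 2 | 1980 | 1952 | 2 |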

It has more features for producing HTML tables than xtable, but does not output LaTeX.
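A minimal sketch of an htmlTable call, using a stand-in data frame with values copied from the summary table. Rounding the numbers with base R before passing them in, and dropping row names with `rnames = FALSE`, are my choices for this illustration rather than what produced the table above.

```r
library(htmlTable)

# Stand-in for the summary data frame (values copied from the output above)
summary_df <- data.frame(
  variable = c("gdpPercap", "lifeExp"),
  n = c(1704L, 1704L),
  Mean = round(c(7215.327, 59.47444), 1),
  `Std. Dev.` = round(c(9857.455, 12.91711), 1),
  check.names = FALSE
)

# rnames = FALSE suppresses the column of row names
summary_html <- htmlTable(summary_df, rnames = FALSE)
summary_html
```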

## 12.3 Regression Table Example

We will run several regression models with the Duncan data.

Since I’m running several regressions, I will save them to a list. If you know that you will be creating multiple objects, and programming with them, always put them in a list.

First, create a list of the regression formulas,

Write a function to run a single model. Now use map to run a regression with each of these formulas, and save the results to a list,
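These steps can be sketched as follows. The four formulas are inferred from the coefficients in the regression tables in this section, and the lm call matches the one echoed in the model output. Loading Duncan from the carData package and the object names `formulae` and `prestige_mods` are assumptions.

```r
library(purrr)

# Duncan's occupational prestige data (assumed to come from carData)
data("Duncan", package = "carData")

# A list of regression formulas, one per model
formulae <- list(
  prestige ~ type,
  prestige ~ income,
  prestige ~ education,
  prestige ~ type + education + income
)

# Run one regression per formula; the result is a list of lm objects
prestige_mods <- map(formulae, ~ lm(.x, data = Duncan, model = FALSE))

# Check the class of each element, then look at the first model
map(prestige_mods, class)
prestige_mods[[1]]
```

Keeping the models in a list means later steps, such as building a regression table, can take the whole list as a single argument.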

This is a list of lm objects,

```
## [[1]]
## [1] "lm"
##
## [[2]]
## [1] "lm"
##
## [[3]]
## [1] "lm"
##
## [[4]]
## [1] "lm"
```

We can look at the first model,

```
##
## Call:
## lm(formula = .x, data = Duncan, model = FALSE)
##
## Coefficients:
## (Intercept)     typeprof       typewc
##       22.76        57.68        13.90
```

Now we can format the regression table in HTML using htmlreg. The first argument of htmlreg is a list of models:

Statistical models

| | Model 1 | Model 2 | Model 3 | Model 4 |
|:--|--:|--:|--:|--:|
| (Intercept) | 22.76*** | 2.46 | 0.28 | -0.19 |
| | (3.47) | (5.19) | (5.09) | (3.71) |
| typeprof | 57.68*** | | | 16.66* |
| | (5.10) | | | (6.99) |
| typewc | 13.90 | | | -14.66* |
| | (7.35) | | | (6.11) |
| income | | 1.08*** | | 0.60*** |
| | | (0.11) | | (0.09) |
| education | | | 0.90*** | 0.35** |
| | | | (0.08) | (0.11) |
| $R^2$ | 0.76 | 0.70 | 0.73 | 0.91 |
| Adj. $R^2$ | 0.75 | 0.69 | 0.72 | 0.90 |
| Num. obs. | 45 | 45 | 45 | 45 |
| RMSE | 15.88 | 17.40 | 16.69 | 9.74 |

***p < 0.001, **p < 0.01, *p < 0.05

By default, htmlreg() prints out HTML, which is exactly what I want in an R markdown document. To save the output to a file, specify a non-null file argument. For example, to save the table to the file prestige.html,
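A minimal sketch of both uses, refitting two of the models so that the example stands alone. Loading Duncan from carData is an assumption, and I write the file into a temporary directory here only to avoid cluttering the working directory; passing `file = "prestige.html"` would write it next to your document instead.

```r
library(texreg)

# Duncan's occupational prestige data (assumed to come from carData)
data("Duncan", package = "carData")

mods <- list(
  lm(prestige ~ type, data = Duncan),
  lm(prestige ~ type + education + income, data = Duncan)
)

# With no file argument, the HTML for the table is printed
htmlreg(mods)

# With a file argument, the HTML is written to that file instead
out_file <- file.path(tempdir(), "prestige.html")
htmlreg(mods, file = out_file)
```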

Since this function outputs HTML directly to the console, it can be hard to tell what’s going on. If you want to preview the table in RStudio while working on it, this snippet of code uses the htmltools package to do so:
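The original snippet is not reproduced here, so the following is a sketch of one way to do it with `HTML()` and `browsable()` from htmltools. It assumes a version of texreg in which htmlreg() returns the table as a character object when no file is given, and it loads Duncan from carData, which is also an assumption.

```r
library(texreg)
library(htmltools)

# Duncan's occupational prestige data (assumed to come from carData)
data("Duncan", package = "carData")
mod <- lm(prestige ~ income + education, data = Duncan)

# Get the table as a plain character string of HTML
table_html <- as.character(htmlreg(list(mod), doctype = FALSE))

# Mark the string as HTML and as browsable, so that printing it
# opens the rendered table in the viewer instead of dumping markup
preview <- browsable(HTML(table_html))
if (interactive()) print(preview)
```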

The htmlreg function has many options to adjust the table formatting. Below, I clean up the table.

• I remove stars using stars = NULL. It is a growing convention to avoid the use of stars indicating significance in regression tables (see AJPS and Political Analysis guidelines).

• The arguments doctype, html.tag, head.tag, body.tag control what sort of HTML is created. Generally all these functions (whether LaTeX or HTML output) have some arguments that determine whether it is creating a standalone, complete document, or a fragment that will be copied into another document.

• The arguments include.rsquared, include.adjrs, and include.nobs are passed to the function extract(), which determines what information the texreg package extracts from a model to put into the table. I get rid of $R^2$, but keep adjusted $R^2$ and the number of observations.
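Putting these options together, a call along the following lines should give a table like the cleaned-up one in this section. This is a sketch rather than the original code: the use of `custom.coef.map` to rename coefficients and drop the intercept, `include.rmse = FALSE`, and loading Duncan from carData are my assumptions. I also write `stars = numeric(0)`, the form given in texreg's documentation for suppressing stars, which is intended to have the same effect as the `stars = NULL` mentioned above.

```r
library(texreg)

# Duncan's occupational prestige data (assumed to come from carData)
data("Duncan", package = "carData")
mods <- list(
  lm(prestige ~ type, data = Duncan),
  lm(prestige ~ income, data = Duncan),
  lm(prestige ~ education, data = Duncan),
  lm(prestige ~ type + education + income, data = Duncan)
)

clean_table <- htmlreg(
  mods,
  stars = numeric(0),                      # no significance stars
  custom.model.names = c("(1)", "(2)", "(3)", "(4)"),
  # rename coefficients; anything not listed (the intercept) is dropped
  custom.coef.map = list(
    typeprof = "Professional",
    typewc = "Working Class",
    income = "Income",
    education = "Education"
  ),
  caption = "Regressions of Occupational Prestige",
  caption.above = TRUE,
  custom.note = "Note: OLS regressions with prestige as the response variable.",
  # produce an HTML fragment, not a standalone document
  doctype = FALSE, html.tag = FALSE, head.tag = FALSE, body.tag = FALSE,
  # passed on to extract(): which fit statistics to report
  include.rsquared = FALSE, include.adjrs = TRUE,
  include.nobs = TRUE, include.rmse = FALSE
)
clean_table
```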

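Regressions of Occupational Prestige

| | (1) | (2) | (3) | (4) |
|:--|--:|--:|--:|--:|
| Professional | 57.68 | | | 16.66 |
| | (5.10) | | | (6.99) |
| Working Class | 13.90 | | | -14.66 |
| | (7.35) | | | (6.11) |
| Income | | 1.08 | | 0.60 |
| | | (0.11) | | (0.09) |
| Education | | | 0.90 | 0.35 |
| | | | (0.08) | (0.11) |
| Adj. $R^2$ | 0.75 | 0.69 | 0.72 | 0.90 |
| Num. obs. | 45 | 45 | 45 | 45 |

Note: OLS regressions with prestige as the response variable.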

Once you find a set of options that are common across your tables, make a function so you do not need to retype them.

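Statistical models

| | Model 1 | Model 2 | Model 3 | Model 4 |
|:--|--:|--:|--:|--:|
| (Intercept) | 22.76 | 2.46 | 0.28 | -0.19 |
| | (3.47) | (5.19) | (5.09) | (3.71) |
| typeprof | 57.68 | | | 16.66 |
| | (5.10) | | | (6.99) |
| typewc | 13.90 | | | -14.66 |
| | (7.35) | | | (6.11) |
| income | | 1.08 | | 0.60 |
| | | (0.11) | | (0.09) |
| education | | | 0.90 | 0.35 |
| | | | (0.08) | (0.11) |
| $R^2$ | 0.76 | 0.70 | 0.73 | 0.91 |
| Adj. $R^2$ | 0.75 | 0.69 | 0.72 | 0.90 |
| Num. obs. | 45 | 45 | 45 | 45 |
| RMSE | 15.88 | 17.40 | 16.69 | 9.74 |

Note: OLS regressions with prestige as the response variable.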

Note that I didn’t include every option in my_reg_table, only those arguments that will be common across tables. I use ... to pass arguments to htmlreg. Then when I call my_reg_table the only arguments are those specific to the content of the table, not the formatting, making it easier to understand what each table is saying.
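The definition of my_reg_table is not reproduced here. A sketch of what such a wrapper could look like, judging from the table it produced (no stars, an HTML fragment, a custom note): the `note` parameter and the particular defaults are my guesses, and Duncan is assumed to come from carData.

```r
library(texreg)

# Wrapper that fixes the formatting options shared by all tables and
# forwards everything else to htmlreg() through ...
my_reg_table <- function(mods, ..., note = NULL) {
  htmlreg(
    mods,
    stars = numeric(0),
    custom.note = if (is.null(note)) NULL else paste("Note:", note),
    doctype = FALSE, html.tag = FALSE, head.tag = FALSE, body.tag = FALSE,
    ...
  )
}

# Usage: only content-specific arguments appear in the call
data("Duncan", package = "carData")
mods <- list(
  lm(prestige ~ income, data = Duncan),
  lm(prestige ~ type + education + income, data = Duncan)
)
reg_table <- my_reg_table(
  mods,
  note = "OLS regressions with prestige as the response variable."
)
reg_table
```

Because of the `...`, any other htmlreg argument, such as `caption` or `custom.model.names`, can still be supplied for an individual table.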

Of course, texreg also produces LaTeX output, with the function texreg. Almost all the options are the same as htmlreg.
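As a brief sketch, the LaTeX version of a table looks much like the HTML call. The caption text is taken from the table above, while the `label` value and loading Duncan from carData are assumptions made for this illustration.

```r
library(texreg)

# Duncan's occupational prestige data (assumed to come from carData)
data("Duncan", package = "carData")
mods <- list(
  lm(prestige ~ income, data = Duncan),
  lm(prestige ~ type + education + income, data = Duncan)
)

# Same models and options as for htmlreg(), but LaTeX output
latex_table <- texreg(
  mods,
  stars = numeric(0),
  caption = "Regressions of Occupational Prestige",
  label = "tab:prestige"   # hypothetical label for cross-references
)
latex_table
```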