Example 4 — Table of t test results


Description

In this example, we demonstrate how to use collect to store the results of mean-comparison tests (t tests) for levels of a categorical variable in a collection and how to create a customized table with these results.

Remarks and examples

Remarks are presented under the following headings:

Collecting statistics

Customizing the table

Collecting statistics

Below, we use data from the Second National Health and Nutrition Examination Survey (NHANES II) (McDowell et al. 1981). We wish to test whether the mean systolic blood pressure (bpsystol) is the same across males and females in each category of race. To perform the test for each level of race, we use the by prefix. We first create a new collection named ex4 and then use the collect prefix to collect the results from each ttest command and store them in the new collection. All results that ttest returns in r() will be collected, but only the ones we have specified will be automatically included in our table.

. use https://www.stata-press.com/data/r18/nhanes2l

(Second National Health and Nutrition Examination Survey)

. collect create ex4

(current collection is ex4)

. quietly: collect r(N_1) r(mu_1) r(N_2) r(mu_2) r(p):

> by race, sort: ttest bpsystol, by(sex)
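To see where the collected values come from, it may help to run one of the tests by hand and list what ttest leaves in r(). The sketch below assumes the dataset loaded above and that the value 1 of race is labeled White; restricting to that level with if is our own illustration, not part of the original example.

```stata
* Run the t test for a single level of race (assumed coding: 1 = White)
ttest bpsystol if race == 1, by(sex)

* List every result ttest returned in r(); N_1, mu_1, N_2, mu_2, and p
* are the ones named after the collect prefix above
return list
```

The collect prefix gathers the same r() values once for each level of race.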

These results are stored in the current collection. We can then use collect layout to arrange the items from the collection into a table. We place the levels of race on the rows and the results (result) on the columns.

. collect layout (race) (result)

Collection: ex4

Rows: race

Columns: result

Table 1: 3 x 5

(output omitted )

The labels for these statistics are automatically included in the table, which makes it very wide. Therefore, we omit the table preview from the output. In the following section, we will format the table to make it ready for publication.
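Although the wide table is not shown, the contents of the collection can still be inspected. The following is a minimal sketch, assuming ex4 is still the current collection: collect dims lists the dimensions, and collect levelsof lists the levels of a single dimension.

```stata
* Dimensions available in the current collection (assumed to be ex4)
collect dims

* Levels of the result dimension, that is, the statistics ttest returned
collect levelsof result

* Levels of race contributed by the by prefix
collect levelsof race
```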


Customizing the table

To finalize our table from the previous section, we will want to label which statistics are for males and females, shorten the labels for the statistics, and display the results with two digits to the right of the decimal.

First, let’s work on the labels. The statistics are part of the dimension result. We list the labels for the levels of this dimension:

. collect label list result

Collection: ex4

Dimension: result

Label: Result

Level labels:
         N_1  Sample size n
         N_2  Sample size n
        df_t  Degrees of freedom
       level  Confidence level
        mu_1  x mean for population 1
        mu_2  x mean for population 2
           p  Two-sided p-value
         p_l  Lower one-sided p-value
         p_u  Upper one-sided p-value
          sd  Combined std. dev.
        sd_1  Standard deviation for first variable
        sd_2  Standard deviation for second variable
          se  Std. error
           t  t statistic

We would like to remap the statistics for males to their own dimension and similarly for females. This will allow us to categorize the results under the labels Males and Females. The levels N_1 and mu_1 correspond to males, and the levels N_2 and mu_2 correspond to females. We also remap the p-values to their own dimension called Difference.

. collect remap result[N_1 mu_1] = Males

(6 items remapped in collection ex4)

. collect remap result[N_2 mu_2] = Females

(6 items remapped in collection ex4)

. collect remap result[p] = Difference

(3 items remapped in collection ex4)
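As an optional check, not part of the original example, the dimensions can be listed again after remapping. Assuming the three collect remap commands above ran without error, Males, Females, and Difference should now appear in the list.

```stata
* Confirm that the new dimensions exist in the current collection
collect dims

* Levels that were moved into one of the new dimensions
collect levelsof Males
```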

Then, we use collect style header to specify that we want to display the title for the specified dimensions. These titles are suppressed by default. Then, we arrange our items once more with the new dimension names. Again, we place the levels of race on the rows, but now we place the dimensions Males, Females, and Difference on the columns.

. collect style header Males Females Difference, title(name)

. collect layout (race) (Males Females Difference)

Collection: ex4

Rows: race

Columns: Males Females Difference

Table 1: 3 x 5

             Males     Males   Females   Females   Difference
               N_1      mu_1       N_2      mu_2            p
White         4312  132.8476      4753  128.5264     1.78e-19
Black          500    133.69       586  133.8481     .9217363
Other          103  130.6699        97  126.7216     .3098674


Our table looks much better. Next, we will add labels to the statistics. The statistics are levels of the new dimensions that we remapped them to. To modify labels for levels of a dimension, we use collect label levels.

. collect label levels Males N_1 "N" mu_1 "Mean BP"

. collect label levels Females N_2 "N" mu_2 "Mean BP"

. collect label levels Difference p "p-value"
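The new labels can be verified with the same collect label list command used earlier on result. This is a small sketch that assumes the three collect label levels commands above have been run in the current collection.

```stata
* Show the dimension label and level labels for each remapped dimension
collect label list Males
collect label list Females
collect label list Difference
```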

Previously, we saw the column headers Males and Females being repeated. We would like to display these only once and center them horizontally. We can use collect style column to make this change. We also set the columns to have the same width. Then, we center-align all the cells in the table. With collect style cell, we can modify all cells in the table or specific cells. For example, we wish to format the means and p-values to display two digits to the right of the decimal. Therefore, we specify the levels of the dimensions we want to apply this format to. Then, we get a preview of our table.

. collect style column, dups(center) width(equal)

. collect style cell, halign(center)

. collect style cell Males[mu_1] Females[mu_2] Difference[p], nformat(%5.2f)

. collect preview

               Males             Females        Difference
               N   Mean BP       N   Mean BP      p-value
White       4312    132.85    4753    128.53         0.00
Black        500    133.69     586    133.85         0.92
Other        103    130.67      97    126.72         0.31

Finally, we will modify the borders in the table by using collect style cell. First, we remove the vertical border. Because we do not want any vertical borders, we do not list any levels of the dimension border_block when we specify the border(right, pattern(nil)) option. Our next collect style cell command requires a bit more explanation. With it, we add a horizontal border below Males to indicate that the first N and Mean BP are for males. To target this very specific border, we specify cell_type[column-header]#Males. Here cell_type refers to cells in different parts of the table. We want to make a change only in the column header. We also want to make this change only for the Males dimension. By specifying the # between the tags, we direct the change only at the dimension Males within the column headers. We can also target the border under Females by specifying cell_type[column-header]#Females. To this command, we add the border(bottom, pattern(single)) option to place a single border on the bottom of these cells.


. collect style cell border_block, border(right, pattern(nil))

. collect style cell cell_type[column-header]#Males

> cell_type[column-header]#Females, border(bottom, pattern(single))

. collect preview

               Males             Females        Difference
               N   Mean BP       N   Mean BP      p-value
White       4312    132.85    4753    128.53         0.00
Black        500    133.69     586    133.85         0.92
Other        103    130.67      97    126.72         0.31

After finalizing our table of results, we can export it to another format with collect export.
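As a brief sketch of that last step, the commands below write the table to two common formats and save the collection itself for later use. The file names are hypothetical, and the output format is determined by the file suffix.

```stata
* Export the finished table; ex4.docx and ex4.html are illustrative names
collect export ex4.docx, replace
collect export ex4.html, replace

* Optionally save the collection so the table can be rebuilt later
collect save ex4, replace
```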

Reference

McDowell, A., A. Engel, J. T. Massey, and K. Maurer. 1981. Plan and operation of the Second National Health and Nutrition Examination Survey, 1976–1980. Vital and Health Statistics 1(15): 1–144.

Also see

[TABLES] collect remap — Remap tags in a collection

[TABLES] collect style column — Collection styles for column headers

[TABLES] collect style header — Collection styles for hiding and showing header components
