Excel Spreadsheet Data Entry Tips Page 12
Prepared by the Center for Biostatistics (http://www.biostatistics.osu.edu) Revised: 04/15/2015
Example: Spreadsheet with Data Issues – FIXED!
The following hypothetical spreadsheet is now ready for analysis by a statistician.
site patient_ID group race gender comorbid_1 comorbid_2 comorbid_3 WBC LOS date complications
1 1 1 C M 0 1 0 9.6 3 3/26/2014 1
1 2 1 C M 0 0 1 6.7 6 11/24/2013 1
1 3 1 AA M 1 0 0 12.2 13 2/8/2014 0
1 4 2 AA F 1 1 0 7.8 8 3/19/2014 1
1 5 2 O M 0 0 1 8.3 21
1
1 6 3 AA F 0 1 0 6.4 13 2/1/2014 0
2 7 2 AA F 1 0 1 4.9 9 4/13/2014 0
2 8 2 C M 0 1 0 4.9 2 9/14/2013 0
2 9 2 O
0 1 0 10.4 6 12/5/2013 0
2 10 1 C M 1 0 0 8.3 8 3/2/2014 0
2 11 3 C M 0 1 0 4.9 35 4/2/2014 1
2 12 3 C F 0 1 1 11.2 7 10/30/2013 1
2 13 3 AA F 0 0 1 7.3 1 1/19/2014 1
2 14 3 AA F 0 1 0 10.4 2 1/5/2014 1
2 15 1 C M 0 0 1 8.7 6 2/27/2014 1
2 16 1 C F 0 0 1 9.6 18 9/17/2013 1
2 17 2 C M 1 0 0 5.5 15 11/8/2013 0
3 18 2 O F 1 1 1 8.8 6 10/18/2013 0
3 19 2 O M 0 0 1 5.7 4 2/19/2014 0
3 20 2 AA F 0 1 0 9.7 9 1/24/2014 1
cells, each patient has
site listed individually.
entries are now
consistent (for
example, ‘C’ alone
denotes Caucasian).
A blank cell is used
to indicate missing
responses in one variable,
created separate 0/1
variables for each possible
response (0=does not have
comorbidity, 1 = has
comorbidity).
from numeric
variables. Replaced
‘<5’ with ‘4.9’ after
discussions between
investigator and
coding with a
new variable to
identify group.
of the form
MM/DD/YYYY.
Replaced
‘UNKNOWN’ with
blank cell.
variable instead of text.
Patients without
complications are now
denoted as ‘0’ to avoid
being confused with
missing data.
names with no spaces
or special characters.
Use of underscore (‘_’)
to separate words.