infile (fixed format) — Read text data in fixed format with a dictionary
Dictionary directives
* marks comment lines. Wherever you wish to place a comment, begin the line with a *. Comments
can appear many times in the same dictionary.
_lrecl(#) is used only for reading datasets that do not have end-of-line delimiters (carriage return,
line feed, or some combination of these). Such files are often produced by mainframe computers
and are either coded in EBCDIC or have been translated from EBCDIC into ASCII. _lrecl() specifies
the logical record length: it requests that infile act as if a line ends every # characters.
_lrecl() appears only once, and typically not at all, in a dictionary.
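As an illustration, a dictionary for a file without end-of-line delimiters might look like the sketch below. The filename mainframe.raw, the record length of 80, and the variable layout are all hypothetical. The directive is written with the leading underscore that dictionary files use.

```stata
infile dictionary using mainframe.raw {
* hypothetical file: 80-character records with no end-of-line delimiters
    _lrecl(80)
    _column(1)   long   id      %6f  "ID"
    _column(7)   int    age     %2f  "Age"
    _column(9)   float  income  %8f  "Income"
}
```

If this dictionary were saved as mainframe.dct, typing infile using mainframe would read the data.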
_firstlineoffile(#) (abbreviation _first()) is also rarely specified. It states the line of the file
where the data begin. You do not need to specify _first() when the data follow the dictionary;
Stata can figure that out for itself. However, you might specify _first() when reading data from
another file in which the first line does not contain data because of headers or other markers.
_first() appears only once, and typically not at all, in a dictionary.
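A minimal sketch of _first(), assuming a separate data file whose first two lines are headers. The filename survey.raw and the variables are hypothetical.

```stata
infile dictionary using survey.raw {
* hypothetical file: lines 1 and 2 are headers, so the data begin on line 3
    _first(3)
    _column(1)   long  id     %6f  "ID"
    _column(7)   int   score  %3f  "Score"
}
```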
_lines(#) states the number of lines per observation in the file. Simple datasets typically have
_lines(1). Large datasets often have many lines (sometimes called records) per observation.
_lines() is optional, even when there is more than one line per observation, because infile
can sometimes figure it out for itself. Still, if _lines(1) is not right for your data, it is best to
specify the correct number through _lines(#).
_lines() appears only once in a dictionary.
_line(#) tells infile to jump to line # of the observation. _line() is not the same as _lines().
Consider a file with _lines(4), meaning four lines per observation. _line(2) says to jump to
the second line of the observation. _line(4) says to jump to the fourth line of the observation.
You may jump forward or backward. infile does not care, and there is no inefficiency in going
forward to _line(3), reading a few variables, jumping back to _line(1), reading another
variable, and jumping forward again to _line(3).
You need not ensure that, at the end of your dictionary, you are on the last line of the observation.
infile knows how to get to the next observation because it knows where you are and it knows
_lines(), the total number of lines per observation.
_line() may appear many times in a dictionary.
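The interplay of _lines() and _line() can be sketched as follows, assuming a hypothetical file panel.raw with three lines per observation. The dictionary deliberately reads the lines out of order and finishes on line 2 rather than line 3.

```stata
infile dictionary using panel.raw {
* hypothetical layout: each observation occupies 3 lines
    _lines(3)
* start on the third line of the observation
    _line(3)
    _column(1)   float  income  %8f   "Income"
* jump back to the first line
    _line(1)
    _column(1)   long   id      %6f   "ID"
    _column(7)   int    age     %2f   "Age"
* finish on the second line; infile still finds the next observation
    _line(2)
    _column(1)   str20  name    %20s  "Name"
}
```

The variables are created in the order in which they appear in the dictionary, here income, id, age, and name.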
_newline[(#)] is an alternative to _line(). _newline(1), which may be abbreviated _newline,
goes forward one line. _newline(2) goes forward two lines. We do not recommend using
_newline() because _line() is better. If you are currently on line 2 of an observation and want
to get to line 6, you could type _newline(4), but your meaning is clearer if you type _line(6).
_newline() may appear many times in a dictionary.
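The line 2 to line 6 case above might be written as in this sketch. The file visits.raw and its six-line layout are hypothetical.

```stata
infile dictionary using visits.raw {
* hypothetical layout: 6 lines per observation
    _lines(6)
    _line(2)
    _column(1)   int    visits  %3f  "Visits"
* _newline(4) would also move from line 2 to line 6,
* but _line(6) names the target line directly
    _line(6)
    _column(1)   float  cost    %8f  "Cost"
}
```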
_column(#) jumps to column # on the current line. You may jump forward or backward within a
line. _column() may appear many times in a dictionary.
_skip[(#)] jumps forward # columns on the current line. _skip() is just an alternative to _column().
_skip() may appear many times in a dictionary.
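A small sketch contrasting the two directives, assuming a hypothetical file scores.raw with an ID in columns 1 to 4, three unused columns, and a score in columns 8 to 10:

```stata
infile dictionary using scores.raw {
* read the ID from columns 1-4; infile is then on column 5
    _column(1)   int  id     %4f  "ID"
* move forward 3 columns to column 8; _column(8) would do the same
    _skip(3)
    int  score  %3f  "Score"
}
```

_column() states an absolute position, so it stays correct if an earlier variable's width changes, whereas _skip() is relative to wherever the previous read ended.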
[type] varname [:lblname] [%infmt] ["variable label"] instructs infile to read a variable. The simplest
form of this instruction is the variable name itself: varname.
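The optional pieces of this directive can be combined in several ways, as in the sketch below. The file people.raw, the column positions, and the value label name sexlbl are hypothetical.

```stata
infile dictionary using people.raw {
* simplest form: the variable name alone
    _column(1)   id
* storage type and input format
    _column(8)   int    age   %2f
* storage type, value label name, input format, and variable label
    _column(10)  byte   sex   :sexlbl  %1f  "Sex of respondent"
* string variable with an input format and a variable label
    _column(11)  str20  name  %20s  "Full name"
}
```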
At all times, infile is on some column of some line of an observation. infile starts on column
1 of line 1, so pretend that is where we are. Given the simplest directive, ‘varname’, infile goes
through the following logic: