Processing grammatical and ungrammatical center embeddings
in English and German: A computational model
Felix Engelmann ([email protected])
Department of Linguistics, Karl-Liebknecht Str. 24-25,
14476 Potsdam, Germany
Shravan Vasishth ([email protected])
Department of Linguistics, Karl-Liebknecht Str. 24-25,
14476 Potsdam, Germany
Abstract
Previous work has shown that in English ungrammatical center embeddings are more acceptable and easier to process than their grammatical counterparts (Frazier, 1985; Gibson & Thomas, 1999). A well-known explanation for this preference for ungrammatical structures is based on working-memory overload: the claim is that the prediction for an upcoming verb phrase is forgotten due to memory overload, leading to an illusion of grammaticality (Gibson & Thomas, 1999). However, this memory-overload account cannot explain the recent finding by Vasishth, Suckow, Lewis, and Kern (2008) that in German no illusion of grammaticality occurs. We present a simple recurrent network model that can explain both the presence of the grammaticality illusion in English and its absence in German. We argue that the grammaticality illusion emerges as a function of experience with language-specific structures, not from working-memory constraints as argued in Gibson and Thomas (1999).
Keywords: sentence comprehension; center embeddings; illusion of grammaticality; working-memory models; connectionist models
Introduction
Consider the contrast in (1), discussed first by Frazier (1985)
(the original observation is attributed by Frazier to Janet
Fodor). Although the rules of English grammar allow a sentence like (1a), such a complex structure is perceived by native English speakers to be less acceptable than its ungrammatical counterpart (1b), in which the middle verb phrase, was cleaning every week, is missing.
(1) a. The apartment that the maid who the service had sent over was cleaning every week was well decorated.
b. *The apartment that the maid who the service had sent over was well decorated.
The first published study involving this contrast was an offline questionnaire-based experiment by Gibson and Thomas (1999). Their main finding was that ungrammatical sentences such as (1b) were rated no worse than grammatical ones such as (1a). In related work, Christiansen and MacDonald (2009) showed that ungrammatical sentences were rated significantly better than grammatical ones. We will refer to this surprising finding as the grammaticality illusion.
At least two competing explanations exist for this illusion. One is due to Gibson and Thomas (1999), who argue that the prediction for the middle verb phrase is forgotten if memory cost exceeds a certain threshold; this explanation relies on the assumption that working-memory overload leads to forgetting. The second explanation is due to Christiansen and Chater (1999) and Christiansen and MacDonald (2009), who attribute the illusion to experience (exposure to particular regularities in the syntax of a language) as encoded in a connectionist network. They trained a simple recurrent network (SRN) on right-branching and center-embedding structures and then assessed the output node activations after seeing the ungrammatical sequence NNNVV (i.e., sentences like 1b). The activations showed a clear preference for ungrammatical structures, consistent with empirical data from English speakers.
An important theoretical question is whether these two explanations, the memory-overload account and the experience-based account, can be distinguished. Although the English data are consistent with both explanations, recent work by Vasishth et al. (2008) provides revealing new evidence regarding the grammaticality illusion. Vasishth and colleagues carried out several self-paced reading and eyetracking studies demonstrating that although the English grammaticality illusion can be replicated in online measures such as reading time, in German the pattern reverses: readers find the ungrammatical sentence (1b) harder to process than its grammatical counterpart (1a). In other words, German readers do not experience the grammaticality illusion.
Specifically, for English, Vasishth and colleagues found (across several experiments) longer reading times in the grammatical condition (1a) either at the final verb or at the word immediately following it (or in both regions), whereas for German they reported shorter re-reading times in the grammatical condition either at the final verb region or the region following it (or in both).
The absence of the grammaticality illusion in German is interesting because it cannot be explained by the memory-based forgetting account as stated in Gibson and Thomas (1999). The explanation due to Christiansen and Chater (1999), however, may be able to explain the German results (in addition to the patterns seen in English): since German relative clauses are always head-final, German readers are exposed to head-final center embeddings much more often than English speakers. This greater exposure to head-final structures could be the reason why German speakers are able to identify the missing verb while English speakers are unable to do so.
In this paper, we extend the connectionist model of
Christiansen and Chater (1999) to generate predictions for
both the English and German structures, and demonstrate that
this experience-based account provides a better explanation
for the English and German data than an account based on
language-independent working-memory constraints.
The Model
Network architecture, grammar, and corpora
We used a simple recurrent network (Elman, 1990) to model the effect of experience on forgetting. SRNs have been used previously to model the effect of structural properties of a language on comprehension performance (Christiansen & Chater, 1999; MacDonald & Christiansen, 2002). Since the predictions of an SRN are sensitive to probabilistic constraints in the input structure, they serve well to assess the effect of language-specific properties on learning. Furthermore, the architectural limitations of an SRN and its gradient nature give rise to human-like processing properties that symbolic models have explained in terms of working-memory capacity limitations and decay. Our claim is that the grammaticality illusion depends on experience with the word-order regularities of the language in question. To show this, we used a simple artificial language comprising simple sentences and subject- and object-extracted relative clauses. We also held the number of subject and object relatives equal in the corpus. In doing so, we made sure that the only varying factor between the two training languages was whether their relative clauses were head-final or not.
The corpora were generated from probabilistic context-free grammars (PCFGs) originally designed by Lars Konieczny (English) and Daniel Müller and Lars Konieczny (German).¹ Corpora and likelihood predictions were generated with the Simple Language Generator (Rohde, 1999). Every training corpus consisted of 10,000 randomly generated sentences. Test corpora were generated for every condition, each consisting of 10 test sentences. The networks described below were built, trained, and tested in the Tlearn simulator (Elman, 1992) on a Windows platform.
Training and Testing Procedure
Prior to training, all networks were initialized with random connection weights in the range [-0.15, 0.15], and the hidden units received an initial bias activation of 0.5. Each training run included 10 individually initialized networks trained on 10 different corpora, respectively. The networks were trained for three epochs, where one epoch corresponded to a full run through a corpus.
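The initialization and training regime described above can be sketched as follows. This is a hypothetical NumPy stand-in for the Tlearn setup, not the original implementation: the learning rule is simplified to an output-layer delta rule (full backpropagation through the hidden layer is omitted), and the toy corpus is illustrative.

```python
# Sketch of an Elman-style SRN: weights in [-0.15, 0.15], hidden units
# initialized to 0.5, three epochs of word-by-word continuation prediction.
import numpy as np

rng = np.random.default_rng(0)

class SRN:
    def __init__(self, n_words=31, n_hidden=60):
        self.W_ih = rng.uniform(-0.15, 0.15, (n_hidden, n_words))
        self.W_hh = rng.uniform(-0.15, 0.15, (n_hidden, n_hidden))
        self.W_ho = rng.uniform(-0.15, 0.15, (n_words, n_hidden))
        self.n_words, self.h = n_words, np.full(n_hidden, 0.5)

    def step(self, word_id):
        x = np.zeros(self.n_words); x[word_id] = 1.0           # one-hot input unit
        self.h = 1 / (1 + np.exp(-(self.W_ih @ x + self.W_hh @ self.h)))
        z = np.exp(self.W_ho @ self.h)
        return z / z.sum()                                      # softmax: activations sum to 1

    def train(self, corpus, epochs=3, lr=0.1):
        for _ in range(epochs):                                 # one epoch = full corpus pass
            for sent in corpus:
                self.h = np.full(len(self.h), 0.5)              # reset context per sentence
                for w, w_next in zip(sent, sent[1:]):
                    p = self.step(w)
                    t = np.zeros(self.n_words); t[w_next] = 1.0
                    # cross-entropy + softmax gradient for the output layer: (p - t) h^T
                    self.W_ho -= lr * np.outer(p - t, self.h)

net = SRN()
net.train([[0, 1, 2, 3], [0, 2, 1, 3]])   # toy corpus of word-id sequences
p = net.step(0)                            # probability distribution over next words
```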
The SRNs were trained on word-by-word continuation prediction. Each input word produced an activation distribution over the output nodes, which represented lexical entries.
¹Both grammars can be found at http://cognition.iig.uni-freiburg.de/teaching/veranstaltungen/ws03/projekt.htm.
In combination with a cross-entropy error calculation (all output activations sum to 1), the activation distribution was comparable to a probability distribution over words.
The SRN's predictions were assessed using the grammatical prediction error (GPE; Christiansen & Chater, 1999). The GPE algorithm is based on the numerical differences between the PCFG probabilities and the actual output. The GPE value is a difficulty measure for every word in the sentence, which can be used as a reading-time predictor (MacDonald & Christiansen, 2002).
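The idea behind the GPE can be sketched as follows. This is a simplified illustration, not Christiansen and Chater's exact algorithm: the hit/false-alarm/miss decomposition follows their description, but the example probabilities are hypothetical.

```python
# Simplified GPE-like score: compare the network's output distribution
# against the PCFG's legal continuations for the current position.
def gpe(activations, targets):
    """activations: dict word -> network output probability.
    targets: dict word -> PCFG probability of legal next words
    (illegal continuations are simply absent from targets)."""
    hits = sum(a for w, a in activations.items() if w in targets)
    false_alarms = sum(a for w, a in activations.items() if w not in targets)
    # miss: legal continuation predicted with less mass than the grammar assigns
    misses = sum(max(0.0, t - activations.get(w, 0.0)) for w, t in targets.items())
    return 1.0 - hits / (hits + false_alarms + misses)

# network spreads mass over one legal and one illegal continuation:
print(round(gpe({"verb": 0.6, "noun": 0.4}, {"verb": 1.0}), 3))  # → 0.571
```

A perfect prediction (all activation on legal words, at or above their target probabilities) yields a GPE of 0, i.e., no difficulty.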
Modeling the grammaticality illusion
The SRN trained on English sentences had 31 input and output units and 60 hidden units. Each input and output unit stood for one lexical entry in the lexicon. The lexicon consisted of five nouns, four intransitive and four transitive verbs in singular, plural, and past-tense forms, and one end-of-sentence marker (EOS). At every NP the probability of an RC embedding was 0.1.² An RC could be realized as a subject relative clause (SRC) or an object relative clause (ORC) with equal probability.³ Probabilities for transitivity and number status were also equal. The longest sentence in the English corpus had 18 words. The German lexicon contained 21 words, including four verbs and nouns in singular and plural forms, the respective determiners in nominative and accusative, the comma, and the EOS marker. In consequence, the SRN trained on German had only 21 input and output units. The longest corpus sentence had 41 words, including the obligatory commas in German relative clauses. Both the English and German grammars included number agreement between subjects and their predicates. In German, number and case agreement between determiner and noun was also included.
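For concreteness, a toy sampler in the spirit of these PCFG corpora might look like the following. The lexicon and rules are invented stand-ins, not Konieczny's grammar; agreement features and tense are omitted.

```python
# Toy PCFG-style corpus sampler: every NP embeds a relative clause with
# probability 0.1, realized as SRC or ORC with equal probability.
import random

random.seed(1)
NOUNS = ["judge", "reporter", "senator", "lawyer", "maid"]
VERBS = ["attacks", "praises", "understands", "sees"]

def np_(p_rc=0.1):
    head = ["the", random.choice(NOUNS)]
    if random.random() < p_rc:              # embed a relative clause?
        if random.random() < 0.5:           # subject relative (SRC): N that V NP
            head += ["that", random.choice(VERBS)] + np_(p_rc)
        else:                               # object relative (ORC): N that NP V
            head += ["that"] + np_(p_rc) + [random.choice(VERBS)]
    return head

def sentence():
    return " ".join(np_() + [random.choice(VERBS)] + np_() + ["."])

corpus = [sentence() for _ in range(10000)]  # 10,000 sentences per training corpus
```

Because the embedding probability is small, deeply nested sentences are rare but do occur, mirroring the skew toward simple sentences described above.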
Christiansen and Chater (1999) reported node activations for the region after an NNNVV sequence. For better comparison with empirical data, we extended their study to obtain GPE values for both conditions in all regions after the missing verb. Consider, for example, the error values on seeing V1 after the sequence 'N1 N2 N3 V3', which is ungrammatical because V2 is missing. If the network is not aware of the ungrammaticality, this should be reflected in similar GPE values for the grammatical and the ungrammatical condition at V1. To model that, we set the target probability at V1 to the same value as in the grammatical condition; that is, the target distribution is conditioned on the assumption that V2 has actually been seen. In consequence, an expectation of a V2 at this point would increase the GPE.
So, in the ungrammatical condition, an SRN with a more accurate grammar representation would produce a higher prediction error than an SRN wrongly predicting V1 instead of V2.
²These are the probabilities used by Konieczny in his grammar; MacDonald and Christiansen have used 0.05. The precise number is arbitrary; the essential point is that relative clauses should be less frequent than simple sentences.
³We did not encode the well-known difference in probability of occurrence between SRCs and ORCs because we were not modeling this difference; this assumption does not affect the results presented here.
For the English case, the GPE values would be lower in the ungrammatical condition. This effectively means that the SRN is unable to make correct predictions based on long-distance dependencies, but bases its predictions on locally consistent sequences. For example, after seeing V3, the network predicts only one more verb because the observation of N1 is too weakly encoded in the hidden representations to influence the predictions. Consequently, at V1 the error for the ungrammatical condition should be lower, because in the grammatical condition V1 is the third verb, which is inconsistent with the SRN's predictions. The preference for the ungrammatical structure should continue in the post-V1 regions because a locally coherent context with two verbs is easier to handle than a context of three verbs.
We first tested whether the SRN makes the same predictions as previous work on the English grammatical and ungrammatical structures (Christiansen & MacDonald, 2009).
Simulation 1: English
The SRN trained on the English corpus was tested on the grammatical and the ungrammatical condition after one, two, and three epochs.
The grammar we used was more complex than Christiansen and Chater's, but structurally compatible. We therefore expected to replicate their findings for English: the GPE values for the V1 and post-V1 regions should be lower in the ungrammatical condition (see corpus example 2b).
(2) a. The judge that the reporters that the senators understand praise attacked the lawyers.
b. *The judge that the reporters that the senators understand attacked the lawyers.
Results for simulation 1 To compare the results with the English self-paced reading and eyetracking experiments in Vasishth et al. (2008), the assessed regions in the simulation were the three verbs V3, V2, and V1 and the post-V1 region. The V2 region contains no datapoint in the ungrammatical condition because the verb is dropped in the testing stimuli. Figure 1 shows GPE values for the SRNs trained and tested on the English grammar after one, two, and three epochs of training. The pattern corresponded to the empirical results: the SRNs predicted an advantage for ungrammatical structures at V1 and post-V1. No effect was predicted at V3 because there is no difference in stimuli or probability between the conditions at this point.
Simulation 2: German
We turn next to the simulations for German center embeddings. German relative clauses differ from English in at least two respects (a third difference is the morphology of the relative pronoun, but we do not discuss this difference here due to space constraints). First, German relative clauses are obligatorily head-final; second, commas are obligatory in German relative clauses (see 3 for an example). We return to the role of commas later in the paper.
[Figure 1 here: GPE by region (V3, V2, V1, Post-V1), English without commas.]
Figure 1: Simulation 1. English double-embedded object relative clauses. The figure shows the GPE values (for three epochs) for the three verbs and the subsequent region of the grammatical and ungrammatical conditions. The dotted line shows the ungrammatical condition. Epochs 3, 2, and 1 are colored black, dark grey, and light grey, respectively.
(3) a. Der Polizist, den der Mensch, den der Passant verspottet, ruft, trifft den Jungen.
('The policeman whom the man whom the passerby mocks calls meets the boy.')
b. *Der Polizist, den der Mensch, den der Passant verspottet, trifft den Jungen.
Results of simulation 2 Figure 2 summarizes the findings. First, in regions V2 and V1, the GPEs were lower compared to the English sentences. Second, in contrast to the English case, the comparison by condition did not reveal any difference at the main verb (V1). Finally, a small but significant preference for the grammatical structure was found in the post-V1 region (p < 0.001).
Discussion
The English and German center-embedding simulations suggest that experience with head-final structures may furnish a better explanation for the grammaticality illusion in English (and its absence in German) than working-memory-based accounts such as Gibson and Thomas'. Both the English and German reading patterns found in the literature can be modeled by the SRN, whereas the working-memory-based explanation can only explain the English results.
Our results do not imply that working memory plays no role in these constructions; rather, our claim is that experience plays the dominant role. A plausible way to reconcile the two accounts into one composite theory would have experience modulating working-memory overload. These details are orthogonal to our main finding, which is that experience determines whether English and German readers can correctly maintain predictions for upcoming verbs.
[Figure 2 here: GPE by region (V3, V2, V1, Post-V1), German with commas.]
Figure 2: Simulation 2. German double-embedded object relative clauses. The figure shows the GPE values (for three epochs) for the three verbs and the subsequent region of the grammatical and ungrammatical conditions.
The role of commas in processing English
center embeddings
One objection to this experience-based explanation for the grammaticality illusion (and its absence) is that the difference between English and German center embeddings could be related to the obligatory presence of commas in German. The commas in German relative clauses could lead to a strategy that is not available in the English structures previously studied. For example, readers could simply be counting the number of commas in German, and this could make it easier for them to detect ungrammaticality.
If commas alone (and not the head-final nature of relative clauses) are responsible for the patterns observed in German, then two straightforward predictions follow: (a) adding commas to English relative clauses should result in a German-like pattern for English sentences; and (b) removing commas from German relative clauses should result in an English-like pattern for German sentences.
Prediction (a) can be evaluated empirically, but prediction (b) cannot because, as mentioned earlier, commas are obligatory in German relative clauses. As it turns out, Vasishth et al. (2008) tested the prediction for English and found that the presence of commas in English does not change the pattern; the grammaticality illusion persists.
The question we address next is: what does the SRN model predict for English RCs when commas are present?
Simulation 3: English with commas
For this simulation we extended the English grammar with appropriate comma insertions and trained the SRNs on the resulting corpora. In English non-restrictive object relative clauses (ORCs), commas appear after the nouns at the beginning of the sentence and after the verbs at the end. In a double-embedded ORC there would be a comma after V3 and V2. Thus, the grammatical/ungrammatical sequence pair is N,N,NV,V,V vs. N,N,NV,V. See (4) for examples.
For the SRN, the comma effectively appears as a word category with only one token, which attaches to nouns or verbs and is not involved in long-distance dependencies. Hence, the activation pattern representing it should not be too complex. In fact, learning comma usage in ORCs can be reduced to a counting-recursion problem of the pattern aabb instead of abba. As discussed in Christiansen and Chater (1999), counting recursion is the easiest of the three recursion types for both humans and connectionist networks. Thus, it is very likely that the inclusion of commas facilitates processing in the grammatical condition, lowering the respective GPE values.
(4) a. The lawyer, who the senator, who the judges attack, understands, praises the reporters.
b. *The lawyer, who the senator, who the judges attack, praises the reporters.
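The difference between the two recursion types can be made concrete with a toy illustration (the 'a'/'b' and N/V labels are hypothetical stand-ins: 'a'/'b' for comma-delimited openers and closers, N/V for the nouns and verbs in a center embedding):

```python
# Counting recursion (a^n b^n): elements pair up by count alone,
# the pattern that comma placement follows in these ORCs.
def counting(n):
    return ["a"] * n + ["b"] * n

# Mirror recursion (center embedding): the first opener pairs with the
# last closer, as the noun-verb dependencies do (N1 ... V1 outermost).
def mirror(n):
    return [f"N{i}" for i in range(1, n + 1)] + [f"V{i}" for i in range(n, 0, -1)]

print(counting(2))   # ['a', 'a', 'b', 'b']
print(mirror(3))     # ['N1', 'N2', 'N3', 'V3', 'V2', 'V1']
```

Tracking the counting pattern only requires remembering how many items are open, whereas the mirror pattern requires remembering which item was opened at each depth, which is what the SRN finds harder.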
Results for simulation 3 See Figure 3 for the results after one, two, and three epochs. Compared to simulation 1, there was a global improvement for both conditions, i.e., the GPEs were lower in each region. At V1, training had more effect in the ungrammatical than in the grammatical condition, resulting in a preference for the ungrammatical structure at V1 (as in simulation 1). At post-V1, training affected the grammatical condition more, without, however, resulting in a grammaticality preference.
In summary, the SRN model suggests that although the insertion of commas in English helps to make better predictions overall, training effects seem to be driven by rather local consistency (Tabor, Galantucci, & Richardson, 2004; Konieczny & Mueller, 2007), affecting the ungrammatical condition more than the grammatical one.
Importantly, the grammaticality illusion persists for English even when commas are present. This is consistent with the empirical findings for non-restrictive English relative clauses: Vasishth et al. (2008) also found in a self-paced reading study that the comma cue did not affect the grammaticality illusion in English.
The above findings raise an interesting question for German: is the reversal of the grammaticality illusion in German due only to the head-final nature of relative clauses, or do commas also play a role in determining the outcome? The only way to empirically disentangle the effects of head-finality and commas in German would be to examine a language such as Hindi, which also has head-final relative clauses but does not require commas.
[Figure 3 here: GPE by region (V3, V2, V1, Post-V1), English with commas.]
Figure 3: Simulation 3. The figure shows the GPEs (for the three epochs) of English center embeddings with commas.
Until such empirical evidence becomes available, we cannot definitively answer the question about the role of commas, head-finality, and their interaction with experience. The SRN model can, however, generate predictions regarding the role of commas versus head-finality in German. We simulated the acquisition of experience with German head-final relative clauses without any commas; in effect, we simulated the learning of Hindi-type relative clauses in German. If commas are (partly) responsible for the reversal of the grammaticality illusion in German, then we should see an English-like pattern; if head-finality alone is the critical factor, then we should see a preference for grammatical structures even when commas are absent. This simulation is presented next.
Simulation 4: German without commas
In German, the presence of commas could have a facilitating effect because the counting-recursion pattern aabb is applicable not only in the ORC, as in English, but also in the SRC (both are head-final structures in German, unlike in English). Consequently, the SRN trained on the German corpus should be very skilled at center-embedding recursion and comma counting-recursion, and hence should have much lower error rates for the grammatical condition.
Thus, in German the removal of commas should make the SRN's predictions more error-prone. The verb-finality regularity in German, however, could still result in better predictions for the grammatical condition in German than in English. To test these predictions, simulation 4 tested SRNs trained on a comma-free German grammar.
Results of simulation 4 The GPE values of the simulation involving German without commas (Figure 4) show a pattern similar to English without commas. In the first epoch, an ungrammaticality preference appeared as a small effect at V1 and a very pronounced effect in the region following it. After completion of training, V1 and post-V1 showed a similarly sized preference for the ungrammatical structure. Surprisingly, the regularity of verb-final structures does not seem to support correct predictions in German any more than in English. Rather, it is the more regular use of commas in German that has a strongly facilitating effect on both conditions, slightly more so on the grammatical one.
[Figure 4 here: GPE by region (V3, V2, V1, Post-V1), German without commas.]
Figure 4: Simulation 4. The GPEs for German center embeddings without commas.
General Discussion
The results of simulation 1 (English without commas) and simulation 2 (German with commas) were consistent with existing empirical data from both offline studies and online (self-paced reading and eyetracking) studies (Gibson & Thomas, 1999; Christiansen & MacDonald, 1999; Vasishth et al., 2008; Christiansen & MacDonald, 2009): the grammaticality illusion occurs in English but not in German.
These simulations demonstrate that the inherent architectural constraints of SRNs correctly predict both the grammaticality illusion in English double-embedded ORCs and the absence of the illusion in German. In addition, the SRN model makes the correct predictions regarding the effect of commas in English relative clauses: although commas reduce the GPEs, the grammaticality illusion persists in English. This is consistent with the evidence presented by Vasishth et al. (2008). Finally, we showed that in German head-finality alone does not explain the absence of the grammaticality illusion; commas appear to be crucial for the patterns observed.
Conclusion
This paper investigated the explanatory power of a particular implementation of the experience-based account of the grammaticality illusion. The well-known SRN modeling approach of MacDonald and Christiansen (2002) and Christiansen and MacDonald (2009) was adopted to test its predictions regarding the forgetting effect in complex center embeddings.
The grammaticality illusion was predicted for English but not for German, consistent with the human data. However, further simulations revealed comma insertion as an important factor behind the German pattern.
A caveat is necessary here. An SRN trained on a simple grammar obviously does not learn exactly the same constraints as humans do. These simulations are rather approximations that are suggestive of the role that experience plays in modulating memory processes. An important issue with the SRNs' predictions is their dependency on local coherence. Interestingly, however, there is evidence that even human readers rely on local coherence in certain structures (Tabor et al., 2004). Another finding is that the simulations reported by Christiansen and Chater (1999), as well as the comma simulations presented here, showed that the SRN handles counting recursion better than the other recursion types. That may be the reason for the strong facilitating effect of comma insertion compared to head-finality.
More broadly, this work argues in favor of a uniform account of language-specific differences that are grounded in experience and that emerge as a consequence of architectural constraints. This account is broadly consistent with a range of recent work that characterizes processing as modulated by experience (Hale, 2001). At the same time, it is clear that working-memory-centered accounts capture a great deal of the empirical base that purely experience-based accounts cannot explain. Some examples are the presence of both similarity-based interference and similarity-based facilitation effects (Logačev & Vasishth, 2009), the interaction of interference with locality (Van Dyke & Lewis, 2003), and with antilocality (Vasishth & Lewis, 2006). Thus, it appears that a principled composition of experience-based as well as working-memory constraints is necessary to explain the range of empirical phenomena in sentence processing.
Acknowledgements
We are very grateful to Lars Konieczny for permission to use
the grammar developed in his lab.
References
Christiansen, M. H., & Chater, N. (1999). Toward a connectionist model of recursion in human linguistic performance. Cognitive Science, 23(2), 157–205.
Christiansen, M. H., & MacDonald, M. C. (2009). A usage-based approach to recursion in sentence processing. (Submitted)
Christiansen, M. H., & MacDonald, M. C. (1999). Processing
of recursive sentence structure: Testing predictions from a
connectionist model. (Manuscript in preparation)
Elman, J. L. (1990). Finding structure in time. Cognitive
Science, 14(2), 179–211.
Elman, J. L. (1992). Tlearn simulator.
Software available at: http://crl.ucsd.edu/innate/tlearn.html.
Frazier, L. (1985). Syntactic complexity. In D. R. Dowty, L. Karttunen, & A. M. Zwicky (Eds.), Natural language parsing: Psychological, computational, and theoretical perspectives (pp. 129–189). Cambridge University Press.
Gibson, E., & Thomas, J. (1999). Memory limitations and structural forgetting: The perception of complex ungrammatical sentences as grammatical. Language and Cognitive Processes, 14(3), 225–248.
Hale, J. (2001). A probabilistic Earley parser as a psycholinguistic model. In Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics (pp. 1–8).
Konieczny, L., & Mueller, D. (2007). Local coherence interpretation in written and spoken language. In Proceedings of the 20th annual CUNY conference on human sentence processing. La Jolla, CA.
Logačev, P., & Vasishth, S. (2009). Morphological ambiguity and working memory. In P. de Swart & M. Lamers (Eds.), Case, word order, and prominence: Psycholinguistic and theoretical approaches to argument structure. Springer.
MacDonald, M. C., & Christiansen, M. H. (2002). Reassessing working memory: Comment on Just and Carpenter (1992) and Waters and Caplan (1996). Psychological Review, 109(1), 35–54.
Rohde, D. L. T. (1999). The simple language generator: Encoding complex languages with simple grammars (Tech. Rep. No. CMU-CS-99-123). Carnegie Mellon University, Department of Computer Science.
Tabor, W., Galantucci, B., & Richardson, D. (2004). Effects of merely local syntactic coherence on sentence processing. Journal of Memory and Language, 50(4), 355–370.
Van Dyke, J. A., & Lewis, R. L. (2003). Distinguishing effects of structure and decay on attachment and repair: A cue-based parsing account of recovery from misanalyzed ambiguities. Journal of Memory and Language, 49(3), 285–316.
Vasishth, S., & Lewis, R. L. (2006). Argument-head distance and processing complexity: Explaining both locality and antilocality effects. Language, 82(4), 767–794.
Vasishth, S., Suckow, K., Lewis, R., & Kern, S. (2008). Short-term forgetting in sentence comprehension: Crosslinguistic evidence from head-final structures. (Submitted to Language and Cognitive Processes)