| How Visual Encoding Learning Enables Fluent Reading |
| MY RESEARCH |
| HOW VISUAL ENCODING LEARNING CONTRIBUTES TO FLUENT READING |
| While I was training neural nets to recognize handwritten characters, I was also teaching my 3-year-old daughter to read. She had no problem learning to identify single letters, and with a bit more effort she learned to associate each letter with its corresponding sound or sounds, but she didn't begin to read words until two years later. What caused this delay? I knew that neural net training gets significantly more difficult as image size increases, so I wondered whether her slow progress might stem from the same computational problem my nets encountered in learning to classify larger character images: the curse of dimensionality. Perhaps one reason learning to read is difficult is that it requires learning to classify fixated text images, each of which is rather large, spanning many characters. |
| Fluent readers of English correctly classify an average of 7-8 letters in each fixated text image, and at least partially classify as many as 14. Letter classification within an image seems to occur in parallel, within the first 150 ms of the fixation. Furthermore, classification of any single character in the text image benefits from that character being embedded within the familiar context of a word, suggesting that children learn to classify a familiar letter sequence in parallel. However, this skill isn't acquired overnight. Even under the best of circumstances, it requires extensive practice from childhood into adulthood. Increasing the letter recognition span speeds reading by enabling readers to cover a line of text using fewer fixations. I developed a computational model, called Encoder, to discover how people might minimize the curse of dimensionality through various constraints. |
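
The link between recognition span and fixation count can be illustrated with simple arithmetic. The sketch below is not part of Encoder. It assumes that successive fixations cover non-overlapping spans and that a line is 70 character positions long. Both are simplifications chosen for illustration, and `fixations_needed` is a hypothetical helper name.

```python
import math


def fixations_needed(line_length: int, span: int) -> int:
    """Fixations required to cover a line when each fixation encodes
    `span` letters and successive spans do not overlap (an assumption)."""
    if line_length < 0 or span <= 0:
        raise ValueError("line_length must be >= 0 and span must be > 0")
    return math.ceil(line_length / span)


LINE_LENGTH = 70  # hypothetical line length, in character positions

# Compare letter-by-letter reading with progressively wider spans.
for span in (1, 4, 8, 14):
    print(f"span={span:2d} letters -> {fixations_needed(LINE_LENGTH, span)} fixations")
```

Real eye movements include regressions, skipped words, and overlapping spans, so these counts only indicate the direction of the effect: a wider span means fewer fixations per line.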
| Encoder is a feedforward neural network with two hidden layers, trained by backpropagation. Its inputs consist of text images spanning about 14 letters. The text images were generated using the complete text of the book The Wizard of Oz by L. Frank Baum. Three different type fonts and both upper- and lower-case letters were used in producing the text images. As predicted by the curse of dimensionality, Encoder fails at learning to classify the 14-letter-wide images unless learning is constrained in the following ways. |
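
To make the basic architecture concrete, here is a minimal sketch of a forward pass through a feedforward net with two hidden layers that outputs one letter classification per position in a 14-letter image. Only the two hidden layers and the 14-letter span come from the description above. The layer sizes, the pixels per letter, the tanh and softmax functions, and the 26-class output are assumptions for illustration, and the layers here are fully connected, without any learning constraints. Training by backpropagation is omitted.

```python
import math
import random

N_LETTERS = 14          # span of the input image, from the text
N_CLASSES = 26          # assumption: one class per letter, case ignored
PIXELS_PER_LETTER = 20  # assumption: e.g. a 4x5 pixel patch per letter
H1, H2 = 12, 10         # assumption: hidden layer sizes


def make_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Weighted sum of the inputs plus bias for every unit in the layer."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(image, layers):
    """Input image -> hidden 1 -> hidden 2 -> per-position letter probabilities."""
    h1 = [math.tanh(v) for v in dense(image, layers[0])]
    h2 = [math.tanh(v) for v in dense(h1, layers[1])]
    logits = dense(h2, layers[2])
    # One softmax over the letter classes for each of the 14 positions.
    return [softmax(logits[i * N_CLASSES:(i + 1) * N_CLASSES])
            for i in range(N_LETTERS)]


rng = random.Random(0)
n_inputs = N_LETTERS * PIXELS_PER_LETTER
layers = [make_layer(n_inputs, H1, rng),
          make_layer(H1, H2, rng),
          make_layer(H2, N_LETTERS * N_CLASSES, rng)]
image = [rng.random() for _ in range(n_inputs)]  # stand-in for a text image
probs = forward(image, layers)
print(len(probs), "letter positions,", len(probs[0]), "classes each")
```

Note how the input size, and with it the number of first-layer weights, grows in proportion to the number of letters in the image. That growth is the dimensionality problem the constraints are meant to tame.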
| For more information on the model, see: G. Martin (2004) Encoder: A Connectionist Model of How Learning to Visually Encode Fixated Text Images Improves Reading Fluency. PDF |
| G. Martin (1997) From Image to Word: A Computational Model of Word Recognition in Reading. PDF |
| THE ENCODER MODEL |
| Each hidden node has a local, shared receptive field, such that the network learns to represent images in terms of a limited number of learned, local features that can occur anywhere in the input image. Network learning starts small, such that the net first learns to classify the leftmost character in each image, then learning is extended to the next character to the right, and so on. The fixated text images the net learns to encode are generated using consistent fixation positions falling just to the left of the center of the fixated word. |
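
Two of these constraints can be sketched in a few lines. The code below is an illustration under stated assumptions, not Encoder's implementation: it uses a one-dimensional input instead of a two-dimensional image, and it realizes "starting small" as a mask over per-letter errors, which is only one possible mechanism. The names `shared_feature_map`, `starts_small_mask`, and `masked_error` are hypothetical.

```python
def shared_feature_map(signal, kernel):
    """Apply one local receptive field, with the same (shared) weights,
    at every position of the input. 1-D here for brevity (an assumption)."""
    k = len(kernel)
    return [sum(w * signal[i + j] for j, w in enumerate(kernel))
            for i in range(len(signal) - k + 1)]


def starts_small_mask(n_letters, stage):
    """Letter positions that contribute to the training error at a stage.
    Stage 1 trains only the leftmost letter; each later stage adds the
    next letter to the right."""
    active = min(max(stage, 0), n_letters)
    return [1 if i < active else 0 for i in range(n_letters)]


def masked_error(per_letter_errors, mask):
    """Total error over the letter positions currently being trained."""
    return sum(e * m for e, m in zip(per_letter_errors, mask))


# The same two weights detect the same local feature wherever it occurs.
kernel = [1, -1]
print(shared_feature_map([0, 0, 1, 1, 0], kernel))     # feature near the left
print(shared_feature_map([0, 0, 0, 0, 1, 1, 0], kernel))  # same feature, shifted right

# Error coverage widens from left to right as training proceeds.
for stage in (1, 2, 3):
    print(stage, starts_small_mask(14, stage))
```

Because the weights are shared, the number of parameters for a feature depends on the size of the receptive field, not on the width of the image. The mask, in turn, keeps the early learning problem small before it is extended across the full span.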
| Once trained to encode fixated text images, Encoder spontaneously exhibits the following human-like behaviors with respect to reading words and word-like stimuli. |
| Word frequency effects; pseudo-word superiority effects; word length x frequency effects; trigram frequency effects; word superiority effects; minimal effects of printing words in aLtErNaTiNg cases. |
| Encoder differs from previous models of reading proficiency by demonstrating that reading fluency can stem, to a significant degree, from visual encoding learning that widens the span over which letters are encoded, thereby reducing the number of fixations used in reading a segment of text. Previous models of reading fluency have focused on learning that occurs at more abstract levels of processing, such as reducing activation thresholds for representations of frequently occurring words (Morton's logogen model) or establishing interactive links between letters that tend to co-occur within a word (McClelland & Rumelhart's Interactive Activation Model). See: |
| Morton, J. (1969) Interaction of information in word recognition. Psychological Review, 76, 165-178. |
| McClelland, J. L., & Rumelhart, D. E. (1981) An interactive activation model of context effects in letter perception, Part 1: An account of basic findings. Psychological Review, 88, 375-405. |
| Encoder also differs from other backpropagation neural net models of reading, developed by McClelland, Seidenberg, Plaut, and others, in three ways: the inputs Encoder processes are two-dimensional images, each of which spans a sequence of 14 letters; Encoder does not involve interactive activation; and it focuses on understanding how people overcome the significant computational difficulties associated with learning to classify large images. In contrast, previous connectionist models have focused on modeling the roles played by phonological and semantic coding, as well as interactivity, in the development of reading skills. See: |
| ALTERNATIVE COMPUTATIONAL MODELS OF READING |
| Seidenberg, M. S., & McClelland, J. L. (1989) A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568. |
| Plaut, D. C., McClelland, J. L., Seidenberg, M. S., & Patterson, K. (1996) Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review |
| The Encoder model claims that learning to read fluently requires extensive practice in learning to visually encode fixated text images. This claim is supported by recent brain imaging research indicating that adults with a history of developmental dyslexia show reduced activation in an area of the left posterior occipitotemporal region of the brain sometimes referred to as the visual word form area. Activation in this area during reading becomes more pronounced as the reader develops fluent reading skills. Furthermore, injury to this region is associated with so-called "letter-by-letter reading," in which the span over which the reader can rapidly identify the letters of a word is significantly reduced, which in turn slows reading considerably. See: |
| RELEVANT NEUROSCIENCE RESEARCH |