In this project a novel approach was taken for performing a natural language task. The task requires a neural network to predict the grammatical category of the next word in a stream of sentences. There are two main reasons why this task is interesting. In natural language processing, it is sometimes very difficult to determine the grammatical category of a word in a sentence when that word could belong to different grammatical categories depending on the context. For example, the word "run" can either be a noun or a verb in a certain sentence. The ability to correctly determine the category of the word can help a computer process natural language. In addition, the approach taken here to solve this task can lead to insights about the way the human brain learns and/or understands language. A Genetic Algorithm, which is conceptually based on simple principles known from Genetics, was developed and utilized to evolve neural networks that were used to perform the task. Genetic Algorithms have been used with remarkable success to solve complex problems in a number of fields but not for this type of problem. In addition, networks were trained via a classic learning algorithm, called back-propagation, to perform the same task. Since a Genetic Algorithm has not been used for this type of task, an implicit goal of this project was to show that it can be used. One of the other main questions addressed is whether learning (as in the case of training a neural network via back-propagation) and a search for an optimal solution (as in the case of the use of a Genetic Algorithm to evolve neural networks) differ and if so, how. Also, the underlying properties of the two different types of networks (depending on the approach taken to obtain them) were compared. Finally, issues about the computational complexity of the Genetic Algorithm were studied and discussed. These issues included the relationship between the input size (for ex. 10000 sentences) and the perfonnance of the network developed via the Genetic Algorithm approach, as well as the way the network must change as the input changes in size and the task changes in complexity (i.e. as the grammar and lexicon change) while the optimal parameters (of the Genetic Algorithm) are used.


Computer Sciences