Because the domain is inherently unconstrained, natural language processing systems benefit from being able to learn new words encountered during analysis. The paper briefly touches on the FOULUP program, which used scripts to learn words from context, and discusses in depth an algorithmic “incremental learning” approach implemented in the system POLITICS. POLITICS uses “goal-driven” inference methods during natural language analysis, and this is a key factor in its ability to provide the contextual information needed to induce the meaning of unknown words. As an example, the paper discusses how POLITICS analyzed the input text “Russia sent massive arms shipments to the MPLA in Angola,” in which the acronym MPLA was unknown to the system. The POLITICS algorithm for learning a new word, the “project-and-integrate method,” uses syntactic and semantic analysis, contextual enrichment, and belief revision to arrive at the best semantic fit for an unknown word. The paper then looks at ways in which word senses may be abstracted to a lowest common denominator of sorts so that the words are semantically relevant in context. It concludes by enumerating a few strategies for developing the abilities of a natural language system and gives brief insight into each.
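To make the project-and-integrate idea concrete, here is a toy sketch of the three steps named above applied to the MPLA sentence. It is not the POLITICS implementation: the case frame, the feature names, the `WordHypothesis` class, and the "later evidence" used in the revision step are all assumptions invented for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical case frame for "send": the semantic features each slot
# expects of its filler. Feature names are invented for illustration.
SEND_FRAME = {
    "actor": {"institution"},
    "object": {"physical-object"},
    "recipient": {"institution"},
}


@dataclass
class WordHypothesis:
    """Tentative meaning of an unknown word, held as a set of features."""
    word: str
    features: set = field(default_factory=set)

    def project(self, frame, slot):
        # Project: the slot the word fills imposes its expectations on the word.
        self.features |= frame[slot]

    def integrate(self, context_features):
        # Integrate: add what the surrounding context suggests.
        self.features |= set(context_features)

    def revise(self, contradicted, replacement=()):
        # Belief revision: retract features that later evidence contradicts
        # and add whatever replaces them.
        self.features -= set(contradicted)
        self.features |= set(replacement)


# "Russia sent massive arms shipments to the MPLA in Angola."
mpla = WordHypothesis("MPLA")
mpla.project(SEND_FRAME, "recipient")                # MPLA fills the recipient slot
mpla.integrate({"located-in-Angola", "wants-arms"})  # assumed contextual inferences
mpla.revise({"institution"}, {"political-faction"})  # assumed later evidence

print(sorted(mpla.features))
```

Holding the hypothesis as a plain set of features keeps each step a single set operation, which is enough to show how a definition can accumulate and be corrected purely from context, with no dictionary lookup.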
Although the paper indicates an interest in generally improving the “language understanding capabilities” of natural language systems, it deals primarily with a system’s ability to learn unknown words from context. Insofar as the project-and-integrate method is a true algorithm for generating definitions of unknown words, it is similar to the CVA strategy. That is, no miracles occur. The paper is concerned with what the author calls “incremental learning,” which does not diverge from the CVA project’s aspiration to allow the SNePS system to enrich its own vocabulary solely from context. Both approaches work without outside assistance such as dictionaries.
Letters are the primary unit of natural language processing.
The axioms...
The letter C represents a Complete cycle. The letters b and B represent the analyses of the relevant cycle. They birth in binary 3 and produce an association composed from the relation of the superclass to the two relevant subclasses and all regular expression matches in the entire graph space that, in any way, can be structurally controlled by this association...
Monday, October 8, 2007