A formal description of English identifies commonly occurring grammatical structures to make the position of linguistic elements easier to understand. This grammar can be used as an aid for constructing sentences mechanically.
A description of the rules for spelling out numbers in English, which is useful when writing checks.
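As a rough illustration of how such rules can be applied mechanically, here is a minimal Python sketch. It assumes the American convention (no "and" after "hundred", hyphenated compounds such as "twenty-one") and handles only whole numbers below one trillion. The function names are made up for this example and are not taken from the description itself.

```python
# Minimal sketch: spell out non-negative integers in English.
# Assumption: American style ("one hundred five", not "one hundred and five").
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
SCALES = [(1_000_000_000, "billion"), (1_000_000, "million"), (1_000, "thousand")]


def spell_below_thousand(n):
    """Spell out 1..999; returns an empty string for 0."""
    parts = []
    if n >= 100:
        parts.append(ONES[n // 100] + " hundred")
        n %= 100
    if n >= 20:
        word = TENS[n // 10]
        if n % 10:
            word += "-" + ONES[n % 10]  # compounds 21..99 take a hyphen
        parts.append(word)
    elif n > 0:
        parts.append(ONES[n])
    return " ".join(parts)


def spell_number(n):
    """Spell out an integer in the range 0 <= n < 10**12."""
    if n < 0 or n >= 10**12:
        raise ValueError("only 0 <= n < 10**12 is supported in this sketch")
    if n == 0:
        return "zero"
    parts = []
    for value, name in SCALES:
        if n >= value:
            parts.append(spell_below_thousand(n // value) + " " + name)
            n %= value
    if n:
        parts.append(spell_below_thousand(n))
    return " ".join(parts)


print(spell_number(1234))  # one thousand two hundred thirty-four
```

A check amount would typically add the cents as a fraction (for example "and 50/100"), which this sketch leaves out.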
This online inflectional morphology interface lets you conjugate any verb in the English Dictionary described below, with nothing to download. You can also look up the singular and plural forms of nouns and use the spelling aid.
This program generates correctly structured English sentences even if you don't know English. It includes a help file with a Basic English grammar; by experimenting with the program and reading the help file, you can learn the basic sentence patterns of English. The program is intended for people learning English as a second language and for those interested in developing computer programs with natural language interfaces.
This program may be used by a non-English speaker who wants to learn English vocabulary or by a young child learning to read. It pairs English words with pictures representing them, and you can hear the pronunciation of a word by clicking on its picture. Topics covered include the alphabet, numbers, animals, fruits, and vegetables.
See also the Basic English Vocabulary.
DICTGET accesses an English dictionary containing over 100,000 words and has a spelling aid feature that helps you find similar words. The program also displays parts of speech, verb conjugations, and the singular and plural forms of nouns. One important feature of DICTGET is that it gives you the root form of a word regardless of which form you enter. For example, the word "was" retrieves the verb "be". Each form of the word is displayed with its grammatical attributes. An additional 66,000-word medical dictionary is also included.
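The root-form lookup and spelling aid described here can be pictured with a small Python sketch. This is not DICTGET's implementation or data format: the toy lexicon, the function names, and the use of the standard library's `difflib` for similar-word suggestions are all assumptions made for illustration.

```python
import difflib

# Toy lexicon (hypothetical entries): inflected form -> (root, part of speech, attributes)
LEXICON = {
    "be": ("be", "verb", "infinitive"),
    "is": ("be", "verb", "present, third person singular"),
    "was": ("be", "verb", "past, singular"),
    "were": ("be", "verb", "past, plural"),
    "mouse": ("mouse", "noun", "singular"),
    "mice": ("mouse", "noun", "plural"),
}


def root_form(word):
    """Return the root form of any listed inflected form, or None if unknown."""
    entry = LEXICON.get(word.lower())
    return entry[0] if entry else None


def forms_of(root):
    """List every form in the lexicon that shares the given root."""
    return sorted(form for form, (r, _, _) in LEXICON.items() if r == root)


def similar_words(word, limit=3):
    """Spelling aid: suggest lexicon entries that resemble the input."""
    return difflib.get_close_matches(word.lower(), list(LEXICON), n=limit)


print(root_form("was"))       # be
print(forms_of("be"))         # ['be', 'is', 'was', 'were']
print(similar_words("mous"))
```

Storing every inflected form as its own key keeps the lookup a single dictionary access, at the cost of a larger table than a rule-based approach would need.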
The archetype of interactive natural language interfaces was a program called ELIZA, written by Joseph Weizenbaum in the mid-1960s. The idea of the program was to engage a person in a conversation that appeared to be with another human rather than with a computer. The original domain of ELIZA was psychiatry, and the mode of interaction was guided by the practices of psychologists at the time. Although the field has advanced substantially, no chatbot today has passed the "Turing test", which requires fooling a human into thinking that there is another human on the other side of the interface.
With a few hours of work you can enhance your Windows/Java applications to support spelling verification and spelling aid using DICTGET English Dictionary components. You also get access to inflectional morphology to create linguistic applications such as parsers, educational packages, etc.
TAGGER is a natural language parser that assigns parts of speech to English text and displays phrase markings. This program is the basis for automatic indexing and data mining projects. The program has an option to output the tagged text in XML format. Here is an example of marked output:
Albert Einstein was one of the greatest scientists of all time.
N      N        X   N#  R  T   J        N          R  D   N
--------------- === --- ++++++++++++++++++++++++++ +++++++++++
PHRASER is an experimental noun phrase parser for English text. After invoking the part-of-speech tagger, PHRASER isolates noun phrases and outputs them as potential index phrases. Using the Derivational Morphology dictionary, the program can apply verb nominalization rules to convert verb phrases and their complements into noun phrases. For example, "John translates books" will generate the index term "translation of books". The program has an option to output the text in XML format.
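The nominalization step can be sketched in a few lines of Python. This is only an illustration of the idea, not PHRASER's code: the two small tables stand in for the Derivational Morphology dictionary, their entries are hypothetical, and the sketch assumes the verb and its complement have already been isolated by the tagger.

```python
# Hypothetical stand-in for a derivational morphology dictionary: verb root -> noun
NOMINALIZATIONS = {
    "translate": "translation",
    "develop": "development",
    "analyze": "analysis",
}

# Hypothetical inflection table: inflected verb -> root
VERB_ROOTS = {
    "translates": "translate",
    "translated": "translate",
    "develops": "develop",
    "analyzes": "analyze",
}


def nominalize(verb, complement):
    """Turn a verb and its complement into a noun-phrase index term.

    Returns None when no nominalization is known for the verb.
    """
    root = VERB_ROOTS.get(verb.lower(), verb.lower())
    noun = NOMINALIZATIONS.get(root)
    if noun is None:
        return None
    return f"{noun} of {complement}"


# "John translates books" -> verb "translates", complement "books"
print(nominalize("translates", "books"))  # translation of books
```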
Stemming is a technique for generating truncated terms that increase recall in information retrieval systems. The stem RECEIV, for example, will match against RECEIVE, RECEIVES, and RECEIVING. The Paice/Husk stemmer was modified to improve error diagnostics in the rules, allow interactive testing, provide more precise stems, and add some flexibility for implementing finite state automata. A new rule set was developed to generate stems that are as precise as possible for information retrieval in large databases. Click on the following link to download source code and executable programs for Windows or to try a web-based implementation.
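To make the idea concrete, here is a minimal Python sketch of an iterative, rule-driven suffix stripper in the general spirit of the Paice/Husk approach. The three rules, the minimum stem length, and the function names are assumptions chosen so that the RECEIV example works. They are not the rule set or the code offered for download.

```python
# Each rule: (ending, characters to remove, text to append, continue stemming?)
# Hypothetical rules for illustration only.
RULES = [
    ("ING", 3, "", False),
    ("S", 1, "", True),
    ("E", 1, "", False),
]
MIN_STEM = 3  # assumed guard: never shorten a word below three letters


def stem(word):
    """Repeatedly apply the first matching rule until a rule says stop
    or no rule applies."""
    current = word.upper()
    while True:
        for ending, remove, append, keep_going in RULES:
            if current.endswith(ending):
                candidate = current[:len(current) - remove] + append
                if len(candidate) >= MIN_STEM:
                    current = candidate
                    break
        else:
            return current  # no rule could be applied
        if not keep_going:
            return current


def matches(query_stem, term):
    """A term matches a query stem when the term reduces to that stem."""
    return stem(term) == query_stem


for term in ("RECEIVE", "RECEIVES", "RECEIVING"):
    print(term, "->", stem(term))  # each reduces to RECEIV
```

Matching on computed stems, rather than on raw string prefixes, avoids retrieving unrelated words that merely start with the same letters.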