XtraSpellChecker: Dictionaries

WinForms Team Blog
08 October 2007

Summary

The XtraSpellChecker requires at least one dictionary type be defined. Which dictionary do you use? Where do you find the best dictionaries? Read below to find out.

The SpellChecker component uses special spell checking algorithms which require a dictionary to provide the list of words and/or rules. At a minimum, one of the four dictionary types (Simple, ISpell, OpenOffice, or Custom) must be defined for the SpellChecker. Therefore, to add spell checking to your project, you need to:

  • Choose the most appropriate dictionary type (Simple, ISpell, or OpenOffice).
  • Add it to your application for every language (culture) you want to support.
  • Add a custom dictionary for every language if you want to allow end-users to add custom words.

Dictionaries can be shared by several SpellChecker components. Add them to the SharedDictionaryStorage and then set SpellChecker.UseSharedDictionaries to true. This way, each dictionary is loaded only once rather than once per component, which avoids repeating a time-consuming operation, and you maintain a single custom dictionary for all SpellChecker components in your application.
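The shared setup described above might look roughly like the following sketch. It assumes that the storage exposes a Dictionaries collection with an Add method, and that SharedDictionaryStorage and SpellChecker live in the DevExpress.XtraSpellChecker namespace; the helper class, method name, and file paths are hypothetical, so check them against your installed version.

```csharp
using System.Globalization;
using DevExpress.XtraSpellChecker; // assumed namespace for the components below

static class SharedDictionarySetup
{
    // Hypothetical helper: registers one dictionary in the shared storage
    // and switches every given SpellChecker to shared mode.
    public static void Configure(SharedDictionaryStorage storage,
                                 params SpellChecker[] checkers)
    {
        SpellCheckerOpenOfficeDictionary english = new SpellCheckerOpenOfficeDictionary();
        english.Culture = new CultureInfo("en-US");
        english.DictionaryPath = @"\Dicts\OpenOffice\en_US.dic"; // illustrative path
        english.GrammarPath = @"\Dicts\OpenOffice\en_US.aff";    // illustrative path

        // Assumption: the storage exposes a Dictionaries collection.
        storage.Dictionaries.Add(english);

        foreach (SpellChecker checker in checkers)
        {
            // Property named in the text above.
            checker.UseSharedDictionaries = true;
        }
    }
}
```

Calling Configure once (for example, from the form's Load handler) would be enough for all components passed in.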

You can try all the samples below for yourself. Just download the following file and open this project in Visual Studio 2005: SpellCheckerExample.Zip


Simple Dictionary

This dictionary type lists words in a plain text file where each line contains only one word. To use this dictionary, manually prepare a text file (or export any existing dictionary to a plain text file).
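If your words come from another source, a small helper can produce the one-word-per-line file. The sketch below is not part of XtraSpellChecker; the class name WordListExporter and its behavior (trimming, dropping blanks and duplicates, sorting) are illustrative choices, and the text encoding your dictionary expects is not addressed here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class WordListExporter
{
    // Writes the given words to a plain text file, one word per line.
    // Words are trimmed; empty entries and exact duplicates are skipped.
    // Returns the number of words written.
    public static int Export(IEnumerable<string> words, string path)
    {
        HashSet<string> seen = new HashSet<string>(StringComparer.Ordinal);
        List<string> lines = new List<string>();
        foreach (string raw in words)
        {
            string word = (raw ?? string.Empty).Trim();
            if (word.Length > 0 && seen.Add(word))
            {
                lines.Add(word);
            }
        }
        lines.Sort(StringComparer.Ordinal);
        File.WriteAllLines(path, lines.ToArray());
        return lines.Count;
    }
}
```

The resulting file can then be assigned to the dictionary's DictionaryPath.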

The following code demonstrates how to create the SpellCheckerDictionary component at runtime. It also shows how to use an alphabet file containing all letters in your preferred language:

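// CultureInfo requires: using System.Globalization;
// standardDictionary is assumed to be declared elsewhere as a SpellCheckerDictionary.
standardDictionary = new SpellCheckerDictionary();
standardDictionary.Culture = new CultureInfo("en-US");
// Alphabet file: all letters of the language.
standardDictionary.AlphabetPath = @"\Dicts\Standard\EnglishAlphabet.txt";
// Word list: one word per line.
standardDictionary.DictionaryPath = @"\Dicts\Standard\American.txt";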

ISpell Dictionary

An ideal dictionary would contain all the words of a given language. However, such a list is very large, and it is more efficient to split the dictionary into several parts (depending on the language). For example, in several Indo-European languages, including English, words are derived from a base word by adding affixes (prefixes or suffixes). The size of the dictionary can therefore be greatly reduced if the base words, the affixes, and the rules for adding affixes to base words are placed into separate files. The complete list of words can then be built on the fly when necessary. This technique is especially effective for synthetic languages (rich in verbal and inflective forms), such as Lithuanian or Russian.
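To make the idea concrete, here is a toy sketch of expanding a base word with suffix rules. It does not use the ISpell or OpenOffice affix file formats and is not how XtraSpellChecker is implemented; the AffixDemo class and the rule shape (condition, number of characters to strip, text to add) are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;

static class AffixDemo
{
    // A toy suffix rule: if the base word ends with Condition,
    // remove Strip characters from its end and append Add.
    public sealed class SuffixRule
    {
        public readonly string Condition;
        public readonly int Strip;
        public readonly string Add;

        public SuffixRule(string condition, int strip, string add)
        {
            Condition = condition;
            Strip = strip;
            Add = add;
        }
    }

    // Returns the base word followed by every form the rules produce for it.
    public static List<string> Expand(string baseWord, IEnumerable<SuffixRule> rules)
    {
        List<string> forms = new List<string>();
        forms.Add(baseWord);
        foreach (SuffixRule rule in rules)
        {
            bool applies = baseWord.Length >= rule.Strip
                && baseWord.EndsWith(rule.Condition, StringComparison.Ordinal);
            if (applies)
            {
                string stem = baseWord.Substring(0, baseWord.Length - rule.Strip);
                forms.Add(stem + rule.Add);
            }
        }
        return forms;
    }
}
```

With the three rules ("", 0, "s"), ("e", 1, "ing") and ("e", 0, "d"), the single base word "create" expands to create, creates, creating and created, so only one entry has to be stored for four word forms.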

The ISpell dictionary takes this approach: it stores the base words and the affixes separately. Physically, it consists of an alphabet file (*.txt), an affix file (*.aff), and a base words file (*.xlg or *.hash). Note: XtraSpellChecker doesn't support compressed *.hash files, so use *.xlg files instead.

ISpell dictionaries are mostly developed by enthusiasts all over the world, and you can find them on the Web for many languages. For example, the English dictionary can be found at ftp://ftp.tue.nl/pub/tex/GB95/ispell-english.zip. Because they are developed by different people, each ISpell dictionary may be redistributed under its own license agreement; most of them, however, are free.

You can get more information about ISpell dictionaries on the following on-line resources:
http://en.wikipedia.org/wiki/Ispell
http://www.lasr.cs.ucla.edu/geoff/ispell.html

The following code demonstrates how to create a SpellCheckerISpellDictionary at runtime:

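// CultureInfo requires: using System.Globalization;
// iSpellDictionary is assumed to be declared elsewhere as a SpellCheckerISpellDictionary.
iSpellDictionary = new SpellCheckerISpellDictionary();
iSpellDictionary.Culture = new CultureInfo("en-US");
iSpellDictionary.AlphabetPath = @"\Dicts\ISpell\EnglishAlphabet.txt";
// Base words file (*.xlg).
iSpellDictionary.DictionaryPath = @"\Dicts\ISpell\american.xlg";
// Affix file (*.aff).
iSpellDictionary.GrammarPath = @"\Dicts\ISpell\english.aff";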

Open Office Dictionary

The Open Office dictionary is similar to ISpell, since it also generates the entire word list based on the Affix file - *.aff, and the Base Words file - *.dic. But note that this standard provides different rules for the Affix file than ISpell, so you can't use the same Affix files for both dictionaries.

In general, Open Office dictionaries are word rich and contain fewer mistakes than ISpell dictionaries, since they are developed by more people. The dictionaries and affix files are part of the OpenOffice.org project and can be downloaded from its Dictionaries page.

The following code demonstrates how to create the SpellCheckerOpenOfficeDictionary component at runtime:

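// CultureInfo requires: using System.Globalization;
// openOfficeDictionary is assumed to be declared elsewhere as a SpellCheckerOpenOfficeDictionary.
openOfficeDictionary = new SpellCheckerOpenOfficeDictionary();
openOfficeDictionary.Culture = new CultureInfo("en-US");
// Base words file (*.dic).
openOfficeDictionary.DictionaryPath = @"\Dicts\OpenOffice\en_US.dic";
// Affix file (*.aff), in the Open Office format rather than the ISpell one.
openOfficeDictionary.GrammarPath = @"\Dicts\OpenOffice\en_US.aff";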

Custom Dictionary

Custom dictionaries are intended to store user additions (words that users consider "correct"). For example, once a SpellChecker has a custom dictionary for the current Culture, an end-user can click the Add button to add the flagged word to that custom dictionary. The other dictionary types don't support this feature.

Adding a word to a custom dictionary 

An end-user can also edit the words in a custom dictionary manually by opening the Custom Dictionary dialog via the Edit button on the Spelling Options form.

Modifying a custom dictionary

The following code demonstrates how to create the SpellCheckerCustomDictionary component at runtime:

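// CultureInfo requires: using System.Globalization;
// customDictionary is assumed to be declared elsewhere as a SpellCheckerCustomDictionary.
customDictionary = new SpellCheckerCustomDictionary();
customDictionary.Culture = new CultureInfo("en-US");
customDictionary.AlphabetPath = @"\Dicts\Custom\EnglishAlphabet.txt";
// This file is rewritten whenever words are added, so it must be writable.
customDictionary.DictionaryPath = @"\Dicts\Custom\CustomEnglish.dic";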

Note: A custom dictionary is overwritten by the XtraSpellChecker every time a new word is added to it, or it's manually changed by an end-user. Therefore this file must not be set to read-only.
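One way to guard against the read-only problem is to check the file's attributes before assigning the path. The helper below is a sketch using only standard System.IO calls; the class and method names are hypothetical, and it deliberately does nothing when the file does not exist.

```csharp
using System.IO;

static class CustomDictionaryFile
{
    // Clears the read-only attribute on an existing file.
    // Returns true if the attribute was set and has been cleared,
    // false if the file is missing or was already writable.
    public static bool ClearReadOnly(string path)
    {
        if (!File.Exists(path))
        {
            return false;
        }

        FileAttributes attributes = File.GetAttributes(path);
        if ((attributes & FileAttributes.ReadOnly) == 0)
        {
            return false;
        }

        File.SetAttributes(path, attributes & ~FileAttributes.ReadOnly);
        return true;
    }
}
```

You could call ClearReadOnly with the same path you assign to customDictionary.DictionaryPath, for instance at application startup.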

All dictionaries are added to the Dictionaries collection of the SpellChecker. You can add multiple dictionaries of the same type to this collection. For example, you may add different dictionaries for different languages to a single SpellChecker, and then the SpellChecker chooses appropriate dictionaries according to its current Culture.
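Putting this together, a multi-language setup could be sketched as follows. The Dictionaries collection and the Culture property are the ones named above; the DevExpress.XtraSpellChecker namespace, the helper class, the German culture, and the de_DE file paths are assumptions for illustration only.

```csharp
using System.Globalization;
using DevExpress.XtraSpellChecker; // assumed namespace for the types below

static class MultiLanguageSetup
{
    // Hypothetical helper: registers dictionaries for two cultures
    // and selects which one the SpellChecker should use.
    public static void Configure(SpellChecker spellChecker)
    {
        SpellCheckerOpenOfficeDictionary english = new SpellCheckerOpenOfficeDictionary();
        english.Culture = new CultureInfo("en-US");
        english.DictionaryPath = @"\Dicts\OpenOffice\en_US.dic";
        english.GrammarPath = @"\Dicts\OpenOffice\en_US.aff";

        SpellCheckerOpenOfficeDictionary german = new SpellCheckerOpenOfficeDictionary();
        german.Culture = new CultureInfo("de-DE");
        german.DictionaryPath = @"\Dicts\OpenOffice\de_DE.dic"; // illustrative path
        german.GrammarPath = @"\Dicts\OpenOffice\de_DE.aff";    // illustrative path

        // Custom dictionary so end-users can add English words.
        SpellCheckerCustomDictionary custom = new SpellCheckerCustomDictionary();
        custom.Culture = new CultureInfo("en-US");
        custom.AlphabetPath = @"\Dicts\Custom\EnglishAlphabet.txt";
        custom.DictionaryPath = @"\Dicts\Custom\CustomEnglish.dic";

        spellChecker.Dictionaries.Add(english);
        spellChecker.Dictionaries.Add(german);
        spellChecker.Dictionaries.Add(custom);

        // Dictionaries whose Culture matches this value are the ones used.
        spellChecker.Culture = new CultureInfo("en-US");
    }
}
```

Switching spellChecker.Culture to "de-DE" later would make the component pick the German dictionary instead, without reloading the collection.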

Have any other questions about dictionaries? Need help finding one? Please drop by the XtraSpellChecker forum to ask or just to learn more.
