User Dictionary specifications for some languages
Unlike English, where noun cases are expressed mostly with prepositions, many languages decline words by other rules, for example by changing part of the word itself.
For these languages (Russian, Ukrainian, Belarusian and many other Eastern European languages), the same word has to be added to the dictionary dozens of times, even though many words are built and inflected according to the same rules.
Here is an incomplete example for two Ukrainian words ("dissertation" and "compilation").
ди~~сер~~та~~ція, ком~~пі~~ля~~ція
ди~~сер~~та~~цію, ком~~пі~~ля~~цію
ди~~сер~~та~~ції, ком~~пі~~ля~~ції
ди~~сер~~та~~цією, ком~~пі~~ля~~цією
ди~~сер~~та~~цій~~ний, ком~~пі~~ля~~цій~~ний
ди~~сер~~та~~цій~~на, ком~~пі~~ля~~цій~~на
ди~~сер~~та~~цій~~ні, ком~~пі~~ля~~цій~~ні
ди~~сер~~та~~цій~~ним, ком~~пі~~ля~~цій~~ним
ди~~сер~~та~~цій~~ній, ком~~пі~~ля~~цій~~ній
ди~~сер~~та~~цій~~ну, ком~~пі~~ля~~цій~~ну
ди~~сер~~та~~цій~~ному, ком~~пі~~ля~~цій~~ному
ди~~сер~~та~~цій~~ного, ком~~пі~~ля~~цій~~ного
ди~~сер~~та~~цій~~не, ком~~пі~~ля~~цій~~не
...
Notice that the initial parts of these words appear in the dictionary dozens of times, while the same endings apply to thousands of other words.
What if users could add not only whole words to the dictionary, but also parts of words?
Then each initial part would appear only once, and so would each ending:
ди~~сер~~та~~|
ком~~пі~~ля~~|
|ція
|цію
|ції
|цією
|цій~~ний
|цій~~на
|цій~~ні
|цій~~ним
|цій~~ній
|цій~~ну
|цій~~ному
|цій~~ного
|цій~~не
Here "|" marks the boundary between parts of a word.
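To make the idea concrete, here is a minimal Python sketch of how such entries could be expanded back into full words. The function name expand and the rule "join every beginning with every ending" are assumptions made for illustration only; the proposal itself does not say how beginnings and endings should be paired.

```python
# Hypothetical sketch: "xxx|" is a word beginning, "|yyy" is a word ending.
MARK = "|"  # part-boundary marker proposed in the text


def expand(entries):
    """Join every beginning ("xxx|") with every ending ("|yyy")."""
    beginnings = [e[:-1] for e in entries
                  if e.endswith(MARK) and not e.startswith(MARK)]
    endings = [e[1:] for e in entries
               if e.startswith(MARK) and not e.endswith(MARK)]
    return [b + e for b in beginnings for e in endings]


entries = ["ди~~сер~~та~~|", "ком~~пі~~ля~~|", "|ція", "|цію", "|цій~~ний"]
words = expand(entries)
for w in words:
    print(w)
```

Five entries yield six full forms here. Pairing everything with everything would also produce words that do not exist once unrelated stems share the dictionary, which is harmless if the dictionary is only used to look up hyphenation points of words that actually occur in the text.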
Some words have prefixes as well as endings, so a word can consist of more than two parts. A part taken from the middle of a word then both begins and ends with the "|" sign in the dictionary:
пе~~ре~~| |роб~~лю~~| |ва~~ти
пе~~ре~~| |пи~~су~~| |ва~~ти
пе~~ре~~| |со~~ву~~| |ва~~ти
пе~~ре~~| |див~~ля~~| |тись
пе~~ре~~| |су~~ва~~н~~| |ня
...
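The lookup for three-part words could work roughly as in the Python sketch below. It is only an illustration under assumptions: the names plain, classify and hyphenate are hypothetical, and the sketch handles exactly one prefix, one middle part and one ending, which the proposal does not prescribe.

```python
MARK = "|"   # part-boundary marker proposed in the text
HYPH = "~~"  # hyphenation-point marker used in the examples


def plain(part):
    """Return the part without hyphenation markers."""
    return part.replace(HYPH, "")


def classify(entries):
    """Sort entries into prefixes ("xxx|"), middles ("|xxx|"), endings ("|xxx")."""
    parts = {"prefix": [], "middle": [], "ending": []}
    for e in dict.fromkeys(entries):  # drop duplicates, keep order
        starts, ends = e.startswith(MARK), e.endswith(MARK)
        if starts and ends:
            parts["middle"].append(e[1:-1])
        elif ends:
            parts["prefix"].append(e[:-1])
        elif starts:
            parts["ending"].append(e[1:])
    return parts


def hyphenate(word, parts):
    """Return the word with hyphenation points if it splits as
    prefix + middle + ending, otherwise None."""
    for p in parts["prefix"]:
        if not word.startswith(plain(p)):
            continue
        rest = word[len(plain(p)):]
        for m in parts["middle"]:
            if not rest.startswith(plain(m)):
                continue
            tail = rest[len(plain(m)):]
            for s in parts["ending"]:
                if tail == plain(s):
                    return p + m + s
    return None


lines = ["пе~~ре~~| |пи~~су~~| |ва~~ти", "пе~~ре~~| |див~~ля~~| |тись"]
parts = classify(e for line in lines for e in line.split())
print(hyphenate("переписувати", parts))
```

A word that cannot be assembled from known parts returns None, so the caller could fall back to the existing whole-word entries or to the built-in hyphenation algorithm.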
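Such an extension of the rules would not complicate the existing algorithm much, and it would greatly reduce the size of the dictionary: in the first example, the 26 full forms listed shrink to 15 entries (2 beginnings plus 13 endings).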