  1. *spell.txt* Nvim
  2. VIM REFERENCE MANUAL by Bram Moolenaar
  3. Spell checking *spell*
  4. Type |gO| to see the table of contents.
  5. ==============================================================================
  6. 1. Quick start *spell-quickstart* *E756*
  7. This command switches on spell checking: >
  8. :setlocal spell spelllang=en_us
  9. This switches on the 'spell' option and specifies to check for US English.
  10. The words that are not recognized are highlighted with one of these:
  11. SpellBad     word not recognized                  |hl-SpellBad|
  12. SpellCap     word not capitalised                 |hl-SpellCap|
  13. SpellRare    rare word                            |hl-SpellRare|
  14. SpellLocal   wrong spelling for selected region   |hl-SpellLocal|
  15. Vim only checks words for spelling; there is no grammar check.
  16. If the 'mousemodel' option is set to "popup" and the cursor is on a badly
  17. spelled word or it is "popup_setpos" and the mouse pointer is on a badly
  18. spelled word, then the popup menu will contain a submenu to replace the bad
  19. word. Note: this slows down the appearance of the popup menu.
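As a sketch of how this could be made automatic, the autocommand group below
enables spell checking only for a few prose-like filetypes.  The group name
"spell_prose" and the list of filetypes are assumptions for illustration, not
part of the distribution: >
	augroup spell_prose
	  autocmd!
	  " Enable US English spell checking in buffers of these filetypes only.
	  autocmd FileType markdown,gitcommit,text setlocal spell spelllang=en_us
	augroup END
<Using ":setlocal" keeps the setting out of other buffers, such as source code.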
  20. To search for the next misspelled word:
  21. *]s*
  22. ]s Move to next misspelled word after the cursor.
  23. A count before the command can be used to repeat.
  24. 'wrapscan' applies.
  25. *[s*
  26. [s Like "]s" but search backwards, find the misspelled
  27. word before the cursor. Doesn't recognize words
  28. split over two lines, thus may stop at words that are
  29. not highlighted as bad. Does not stop at word with
  30. missing capital at the start of a line.
  31. *]S*
  32. ]S Like "]s" but only stop at bad words, not at rare
  33. words or words for another region.
  34. *[S*
  35. [S Like "]S" but search backwards.
  36. *]r*
  37. ]r Move to next "rare" word after the cursor.
  38. A count before the command can be used to repeat.
  39. 'wrapscan' applies.
  40. *[r*
  41. [r Like "]r" but search backwards, find the "rare"
  42. word before the cursor. Doesn't recognize words
  43. split over two lines, thus may stop at words that are
  44. not highlighted as rare.
  45. To add words to your own word list:
  46. *zg*
  47. zg Add word under the cursor as a good word to the first
  48. name in 'spellfile'. A count may precede the command
  49. to indicate the entry in 'spellfile' to be used. A
  50. count of two uses the second entry.
  51. In Visual mode the selected characters are added as a
  52. word (including white space!).
  53. When the cursor is on text that is marked as badly
  54. spelled then the marked text is used.
  55. Otherwise the word under the cursor, separated by
  56. non-word characters, is used.
  57. If the word is explicitly marked as bad word in
  58. another spell file the result is unpredictable.
  59. *zG*
  60. zG Like "zg" but add the word to the internal word list
  61. |internal-wordlist|.
  62. *zw*
  63. zw Like "zg" but mark the word as a wrong (bad) word.
  64. If the word already appears in 'spellfile' it is
  65. turned into a comment line. See |spellfile-cleanup|
  66. for getting rid of those.
  67. *zW*
  68. zW Like "zw" but add the word to the internal word list
  69. |internal-wordlist|.
  70. zuw *zug* *zuw*
  71. zug Undo |zw| and |zg|, remove the word from the entry in
  72. 'spellfile'. Count used as with |zg|.
  73. zuW *zuG* *zuW*
  74. zuG Undo |zW| and |zG|, remove the word from the internal
  75. word list. Count used as with |zg|.
  76. *:spe* *:spellgood* *E1280*
  77. :[count]spe[llgood] {word}
  78. Add {word} as a good word to 'spellfile', like with
  79. |zg|. Without count the first name is used, with a
  80. count of two the second entry, etc.
  81. :spe[llgood]! {word} Add {word} as a good word to the internal word list,
  82. like with |zG|.
  83. *:spellw* *:spellwrong*
  84. :[count]spellw[rong] {word}
  85. Add {word} as a wrong (bad) word to 'spellfile', as
  86. with |zw|. Without count the first name is used, with
  87. a count of two the second entry, etc.
  88. :spellw[rong]! {word} Add {word} as a wrong (bad) word to the internal word
  89. list, like with |zW|.
  90. *:spellra* *:spellrare*
  91. :[count]spellra[re] {word}
  92. Add {word} as a rare word to 'spellfile', similar to
  93. |zw|. Without count the first name is used, with
  94. a count of two the second entry, etc.
  95. There are no normal mode commands to mark words as
  96. rare as this is a fairly uncommon command and all
  97. intuitive commands for this are already taken. If you
  98. want you can add mappings with e.g.: >
  99. nnoremap z? :exe ':spellrare ' .. expand('<cWORD>')<CR>
  100. nnoremap z/ :exe ':spellrare! ' .. expand('<cWORD>')<CR>
  101. < |:spellundo|, |zuw|, or |zuW| can be used to undo this.
  102. :spellra[re]! {word} Add {word} as a rare word to the internal word
  103. list, similar to |zW|.
  104. :[count]spellu[ndo] {word} *:spellu* *:spellundo*
  105. Like |zuw|. [count] used as with |:spellgood|.
  106. :spellu[ndo]! {word} Like |zuW|. [count] used as with |:spellgood|.
  107. After adding a word to 'spellfile' with the above commands its associated
  108. ".spl" file will automatically be updated and reloaded. If you change
  109. 'spellfile' manually you need to use the |:mkspell| command. This sequence of
  110. commands mostly works well: >
  111. :edit <file in 'spellfile'>
  112. < (make changes to the spell file) >
  113. :mkspell! %
  114. More details about the 'spellfile' format below |spell-wordlist-format|.
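To illustrate how the count selects an entry in 'spellfile', here is a small
sketch.  The two file names and the example words are hypothetical; it assumes
'encoding' is "utf-8" and that the directory exists and is writable: >
	" Two word lists: a personal one (entry 1) and a project one (entry 2).
	:setlocal spellfile=~/.config/nvim/spell/en.utf-8.add,~/.config/nvim/spell/project.utf-8.add
	" No count: the word goes to the first entry.
	:spellgood Neovim
	" Count of two: the word goes to the second entry.
	:2spellgood frobnicate
	" Remove it from the second entry again.
	:2spellundo frobnicate
<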
  115. *internal-wordlist*
  116. The internal word list is used for all buffers where 'spell' is set. It is
  117. not stored, it is lost when you exit Vim. It is also cleared when 'encoding'
  118. is set.
  119. Finding suggestions for bad words:
  120. *z=*
  121. z= For the word under/after the cursor suggest correctly
  122. spelled words. This also works to find alternatives
  123. for a word that is not highlighted as a bad word,
  124. e.g., when the word after it is bad.
  125. In Visual mode the highlighted text is taken as the
  126. word to be replaced.
  127. The results are sorted on similarity to the word being
  128. replaced.
  129. This may take a long time. Hit CTRL-C when you get
  130. bored.
  131. If the command is used without a count the
  132. alternatives are listed and you can enter the number
  133. of your choice or press <Enter> if you don't want to
  134. replace. You can also use the mouse to click on your
  135. choice (only works if the mouse can be used in Normal
  136. mode and when there are no line wraps). Click on the
  137. first line (the header) to cancel.
  138. The suggestions listed normally replace a highlighted
  139. bad word. Sometimes they include other text, in that
  140. case the replaced text is also listed after a "<".
  141. If a count is used that suggestion is used, without
  142. prompting. For example, "1z=" always takes the first
  143. suggestion.
  144. If 'verbose' is non-zero a score will be displayed
  145. with the suggestions to indicate how close each is to the
  146. badly spelled word (the higher the score the more
  147. different).
  148. When a word was replaced the redo command "." will
  149. repeat the word replacement. This works like "ciw",
  150. the good word and <Esc>. This does NOT work for Thai
  151. and other languages without spaces between words.
  152. *:spellr* *:spellrepall* *E752* *E753*
  153. :spellr[epall] Repeat the replacement done by |z=| for all matches
  154. with the replaced word in the current window.
  155. In Insert mode, when the cursor is after a badly spelled word, you can use
  156. CTRL-X s to find suggestions. This works like Insert mode completion. Use
  157. CTRL-N to use the next suggestion, CTRL-P to go back. |i_CTRL-X_s|
  158. The 'spellsuggest' option influences how the list of suggestions is generated
  159. and sorted. See |'spellsuggest'|.
  160. The 'spellcapcheck' option is used to check the first word of a sentence
  161. starts with a capital. This doesn't work for the first word in the file.
  162. When there is a line break right after a sentence the highlighting of the next
  163. line may be postponed. Use |CTRL-L| when needed. Also see |set-spc-auto| for
  164. how it can be set automatically when 'spelllang' is set.
  165. The 'spelloptions' option has a few more flags that influence the way spell
  166. checking works. For example, "camel" splits CamelCased words so that each
  167. part of the word is spell-checked separately.
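For instance, a buffer with source code identifiers might be set up as in this
sketch (the identifier in the comment is a made-up example): >
	" Check each part of a CamelCased word separately, so that in
	" "parseUserNmae" only the "Nmae" part is expected to be flagged.
	:setlocal spell spelloptions+=camel
<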
  168. Vim counts the number of times a good word is encountered. This is used to
  169. sort the suggestions: words that have been seen before get a small bonus,
  170. words that have been seen often get a bigger bonus. The COMMON item in the
  171. affix file can be used to define common words, so that this mechanism also
  172. works in a new or short file |spell-COMMON|.
  173. ==============================================================================
  174. 2. Remarks on spell checking *spell-remarks*
  175. PERFORMANCE
  176. Vim does on-the-fly spell checking. To make this work fast the word list is
  177. loaded in memory. Thus this uses a lot of memory (1 Mbyte or more). There
  178. might also be a noticeable delay when the word list is loaded, which happens
  179. when 'spell' is set and when 'spelllang' is set while 'spell' was already set.
  180. To minimize the delay each word list is only loaded once, it is not deleted
  181. when 'spelllang' is made empty or 'spell' is reset. When 'encoding' is set
  182. all the word lists are reloaded, thus you may notice a delay then too.
  183. REGIONS
  184. A word may be spelled differently in various regions. For example, English
  185. comes in (at least) these variants:
  186. en      all regions
  187. en_au   Australia
  188. en_ca   Canada
  189. en_gb   Great Britain
  190. en_nz   New Zealand
  191. en_us   USA
  192. Words that are not used in one region but are used in another region are
  193. highlighted with SpellLocal |hl-SpellLocal|.
  194. Always use lowercase letters for the language and region names.
  195. When adding a word with |zg| or another command it's always added for all
  196. regions. You can change that by manually editing the 'spellfile'. See
  197. |spell-wordlist-format|. Note that the regions as specified in the files in
  198. 'spellfile' are only used when all entries in 'spelllang' specify the same
  199. region (not counting files specified by their .spl name).
  200. *spell-german*
  201. Specific exception: For German these special regions are used:
  202. de      all German words accepted
  203. de_de   old and new spelling
  204. de_19   old spelling
  205. de_20   new spelling
  206. de_at   Austria
  207. de_ch   Switzerland
  208. *spell-russian*
  209. Specific exception: For Russian these special regions are used:
  210. ru      all Russian words accepted
  211. ru_ru   "IE" letter spelling
  212. ru_yo   "YO" letter spelling
  213. *spell-yiddish*
  214. Yiddish requires using "utf-8" encoding, because of the special characters
  215. used. If you are using latin1 Vim will use transliterated (romanized) Yiddish
  216. instead. If you want to use transliterated Yiddish with utf-8 use "yi-tr".
  217. In a table:
  218. 'encoding'   'spelllang'
  219. utf-8        yi            Yiddish
  220. latin1       yi            transliterated Yiddish
  221. utf-8        yi-tr         transliterated Yiddish
  222. *spell-cjk*
  223. Chinese, Japanese and other East Asian characters are normally marked as
  224. errors, because spell checking of these characters is not supported. If
  225. 'spelllang' includes "cjk", these characters are not marked as errors. This
  226. is useful when editing text with spell checking while some Asian words are
  227. present.
  228. SPELL FILES *spell-load*
  229. Vim searches for spell files in the "spell" subdirectory of the directories in
  230. 'runtimepath'. The name is: LL.EEE.spl, where:
  231. LL the language name
  232. EEE the value of 'encoding'
  233. The value for "LL" comes from 'spelllang', but excludes the region name.
  234. Examples:
  235. 'spelllang'   LL ~
  236. en_us         en
  237. en-rare       en-rare
  238. medical_ca    medical
  239. Only the first file is loaded, the one that is first in 'runtimepath'. If
  240. this succeeds then additionally files with the name LL.EEE.add.spl are loaded.
  241. All the ones that are found are used.
  242. If no spell file is found the |SpellFileMissing| autocommand event is
  243. triggered. This may trigger the |spellfile.vim| plugin to offer you
  244. downloading the spell file.
  245. Additionally, the files related to the names in 'spellfile' are loaded. These
  246. are the files that |zg| and |zw| add good and wrong words to.
  247. Exceptions:
  248. - Vim uses "latin1" when 'encoding' is "iso-8859-15". The euro sign doesn't
  249. matter for spelling.
  250. - When no spell file for 'encoding' is found "ascii" is tried. This only
  251. works for languages where nearly all words are ASCII, such as English. It
  252. helps when 'encoding' is not "latin1", such as iso-8859-2, and English text
  253. is being edited. For the ".add" files the same name as the found main
  254. spell file is used.
  255. For example, with these values:
  256. 'runtimepath' is "~/.config/nvim,/usr/share/nvim/runtime/,~/.config/nvim/after"
  257. 'encoding' is "iso-8859-2"
  258. 'spelllang' is "pl"
  259. Vim will look for:
  260. 1. ~/.config/nvim/spell/pl.iso-8859-2.spl
  261. 2. /usr/share/nvim/runtime/spell/pl.iso-8859-2.spl
  262. 3. ~/.config/nvim/spell/pl.iso-8859-2.add.spl
  263. 4. /usr/share/nvim/runtime/spell/pl.iso-8859-2.add.spl
  264. 5. ~/.config/nvim/after/spell/pl.iso-8859-2.add.spl
  265. This assumes 1. is not found and 2. is found.
  266. If 'encoding' is "latin1" Vim will look for:
  267. 1. ~/.config/nvim/spell/pl.latin1.spl
  268. 2. /usr/share/nvim/runtime/spell/pl.latin1.spl
  269. 3. ~/.config/nvim/after/spell/pl.latin1.spl
  270. 4. ~/.config/nvim/spell/pl.ascii.spl
  271. 5. /usr/share/nvim/runtime/spell/pl.ascii.spl
  272. 6. ~/.config/nvim/after/spell/pl.ascii.spl
  273. This assumes none of them are found (Polish doesn't make sense when leaving
  274. out the non-ASCII characters).
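To see which spell files are actually present for a language, the search can
be approximated from a script.  This is only a sketch: the function name is
hypothetical, and it lists every matching file rather than reproducing the
exact order and fallback rules described above: >
	function! ListSpellFiles(lang) abort
	  " Look in the "spell" subdirectory of each 'runtimepath' entry for
	  " LL.EEE.spl and LL.EEE.add.spl files; the final 1 returns a List.
	  return globpath(&runtimepath, 'spell/' .. a:lang .. '.*.spl', 0, 1)
	endfunction
<For example, ":echo ListSpellFiles('en')" shows the English files found.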
  275. A spell file might not be available in the current 'encoding'. See
  276. |spell-mkspell| about how to create a spell file. Converting a spell file
  277. with "iconv" will NOT work!
  278. *spell-sug-file* *E781*
  279. If there is a file with exactly the same name as the ".spl" file but ending in
  280. ".sug", that file will be used for giving better suggestions. It isn't loaded
  281. before suggestions are made to reduce memory use.
  282. *E758* *E759* *E778* *E779* *E780* *E782*
  283. When loading a spell file Vim checks that it is properly formatted. If you
  284. get an error the file may be truncated, modified or intended for another Vim
  285. version.
  286. SPELLFILE CLEANUP *spellfile-cleanup*
  287. The |zw| command turns existing entries in 'spellfile' into comment lines.
  288. This avoids having to write a new file every time, but results in the file
  289. only getting longer, never shorter. To clean up the comment lines in all
  290. ".add" spell files do this: >
  291. :runtime spell/cleanadd.vim
  292. This deletes all comment lines, except the ones that start with "##". Use
  293. "##" lines to add comments that you want to keep.
  294. You can invoke this script as often as you like. A variable is provided to
  295. skip updating files that have been changed recently. Set it to the number of
  296. seconds that must have passed since a file was last changed before it is cleaned.
  297. For example, to clean only files that were not changed in the last hour: >
  298. let g:spell_clean_limit = 60 * 60
  299. The default is one second.
  300. WORDS
  301. Vim uses a fixed method to recognize a word. This is independent of
  302. 'iskeyword', so that it also works in help files and for languages that
  303. include characters like '-' in 'iskeyword'. The word characters do depend on
  304. 'encoding'.
  305. The table with word characters is stored in the main .spl file. Therefore it
  306. matters what the current locale is when generating it! A .add.spl file does
  307. not contain a word table though.
  308. For a word that starts with a digit the digit is ignored, unless the word as a
  309. whole is recognized. Thus if "3D" is a word and "D" is not then "3D" is
  310. recognized as a word, but if "3D" is not a word then only the "D" is marked as
  311. bad. Hex numbers in the form 0x12ab and 0X12AB are recognized.
  312. WORD COMBINATIONS
  313. It is possible to spell-check words that include a space. This is used to
  314. recognize words that are invalid when used by themselves, e.g. for "et al.".
  315. It can also be used to recognize "the the" and highlight it.
  316. The number of spaces is irrelevant. In most cases a line break may also
  317. appear. However, this makes it difficult to find out where to start checking
  318. for spelling mistakes. When you make a change to one line and only that line
  319. is redrawn Vim won't look in the previous line, thus when "et" is at the end
  320. of the previous line "al." will be flagged as an error. And when you type
  321. "the<CR>the" the highlighting doesn't appear until the first line is redrawn.
  322. Use |CTRL-L| to redraw right away. "[s" will also stop at a word combination
  323. with a line break.
  324. When encountering a line break Vim skips characters such as "*", '>' and '"',
  325. so that comments in C, shell and Vim code can be spell checked.
  326. SYNTAX HIGHLIGHTING *spell-syntax*
  327. Files that use syntax highlighting can specify where spell checking should be
  328. done:
  329. 1. everywhere                       default
  330. 2. in specific items                use "contains=@Spell"
  331. 3. everywhere but specific items    use "contains=@NoSpell"
  332. For the second method adding the @NoSpell cluster will disable spell checking
  333. again. This can be used, for example, to add @Spell to the comments of a
  334. program, and add @NoSpell for items that shouldn't be checked.
  335. Also see |:syn-spell| for text that is not in a syntax item.
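A minimal sketch of the second method for a hypothetical syntax file, where
the group names "demoComment" and "demoUrl" are made up for illustration: >
	" Spell-check the text of comments...
	syntax match demoComment "#.*$" contains=@Spell,demoUrl
	" ...but not URLs that appear inside a comment.
	syntax match demoUrl "https\?://\S\+" contained contains=@NoSpell
<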
  336. VIM SCRIPTS
  337. If you want to write a Vim script that does something with spelling, you may
  338. find these functions useful:
  339. spellbadword()   find badly spelled word at the cursor
  340. spellsuggest()   get list of spelling suggestions
  341. soundfold()      get the sound-a-like version of a word
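As a sketch of how the first two functions can be combined, the hypothetical
helper below returns suggestions for the bad word under or after the cursor.
It assumes 'spell' is set in the current window; note that spellbadword()
moves the cursor to the start of the bad word it finds: >
	function! SuggestForCursorWord(max) abort
	  " spellbadword() returns [word, type]; word is empty when no badly
	  " spelled word was found in the cursor line.
	  let [l:word, l:type] = spellbadword()
	  if empty(l:word)
	    return []
	  endif
	  " Ask for at most a:max suggestions for that word.
	  return spellsuggest(l:word, a:max)
	endfunction
<For example, ":echo SuggestForCursorWord(5)" lists up to five alternatives.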
  342. SETTING 'spellcapcheck' AUTOMATICALLY *set-spc-auto*
  343. After the 'spelllang' option has been set successfully, Vim will source the
  344. files "spell/LANG.vim" and "spell/LANG.lua" in 'runtimepath'. "LANG" is the
  345. value of 'spelllang' up to the first comma, dot or underscore. This can be
  346. used to set options specifically for the language, especially 'spellcapcheck'.
  347. The distribution includes a few of these files. Use this command to see what
  348. they do: >
  349. :next $VIMRUNTIME/spell/*.vim
  350. Note that the default scripts don't set 'spellcapcheck' if it was changed from
  351. the default value. This assumes the user prefers another value then.
  352. DOUBLE SCORING *spell-double-scoring*
  353. The 'spellsuggest' option can be used to select "double" scoring. This
  354. mechanism is based on the principle that there are two kinds of spelling
  355. mistakes:
  356. 1. You know how to spell the word, but mistype something. This results in a
  357. small editing distance (character swapped/omitted/inserted) and possibly a
  358. word that sounds completely different.
  359. 2. You don't know how to spell the word and type something that sounds right.
  360. The edit distance can be big but the word is similar after sound-folding.
  361. Since scores for these two mistakes will be very different we use a list
  362. for each and mix them.
  363. The sound-folding is slow and people who know the language won't make the
  364. second kind of mistakes. Therefore 'spellsuggest' can be set to select the
  365. preferred method for scoring the suggestions.
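As an example of selecting the method (the limit of 10 suggestions is an
arbitrary choice): >
	" Use "double" scoring and list at most 10 suggestions.
	:set spellsuggest=double,10
	" Go back to the default method.
	:set spellsuggest=best
<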
  366. ==============================================================================
  367. 3. Generating a spell file *spell-mkspell*
  368. Vim uses a binary file format for spelling. This greatly speeds up loading
  369. the word list and keeps it small.
  370. *.aff* *.dic* *Myspell*
  371. You can create a Vim spell file from the .aff and .dic files that Myspell
  372. uses. Myspell is used by OpenOffice.org and Mozilla. The OpenOffice .oxt
  373. files are zip files which contain the .aff and .dic files. You should be able
  374. to find them here:
  375. https://extensions.services.openoffice.org/dictionary
  376. The older, OpenOffice 2 files may be used if this doesn't work:
  377. http://wiki.services.openoffice.org/wiki/Dictionaries
  378. You can also use a plain word list. The results are the same, the choice
  379. depends on what word lists you can find.
  380. If you install Aap (from www.a-a-p.org) you can use the recipes in the
  381. runtime/spell/??/ directories. Aap will take care of downloading the files,
  382. apply patches needed for Vim and build the .spl file.
  383. Make sure your current locale is set properly, otherwise Vim doesn't know what
  384. characters are upper/lower case letters. If the locale isn't available (e.g.,
  385. when using an MS-Windows codepage on Unix) add tables to the .aff file
  386. |spell-affix-chars|. If the .aff file doesn't define a table then the word
  387. table of the currently active spelling is used. If spelling is not active
  388. then Vim will try to guess.
  389. *:mksp* *:mkspell*
  390. :mksp[ell][!] [-ascii] {outname} {inname} ...
  391. Generate a Vim spell file from word lists. Example: >
  392. :mkspell /tmp/nl nl_NL.words
  393. < *E751*
  394. When {outname} ends in ".spl" it is used as the output
  395. file name. Otherwise it should be a language name,
  396. such as "en", without the region name. The file
  397. written will be "{outname}.{encoding}.spl", where
  398. {encoding} is the value of the 'encoding' option.
  399. When the output file already exists [!] must be used
  400. to overwrite it.
  401. When the [-ascii] argument is present, words with
  402. non-ascii characters are skipped. The resulting file
  403. ends in "ascii.spl".
  404. The input can be the Myspell format files {inname}.aff
  405. and {inname}.dic. If {inname}.aff does not exist then
  406. {inname} is used as the file name of a plain word
  407. list.
  408. Multiple {inname} arguments can be given to combine
  409. regions into one Vim spell file. Example: >
  410. :mkspell ~/.config/nvim/spell/en /tmp/en_US /tmp/en_CA /tmp/en_AU
  411. < This combines the English word lists for US, CA and AU
  412. into one en.spl file.
  413. Up to eight regions can be combined. *E754* *E755*
  414. The REP and SAL items of the first .aff file where
  415. they appear are used. |spell-REP| |spell-SAL|
  416. *E845*
  417. This command uses a lot of memory, required to find
  418. the optimal word tree (Polish, Italian and Hungarian
  419. require several hundred Mbyte). The final result will
  420. be much smaller, because compression is used. To
  421. avoid running out of memory compression will be done
  422. now and then. This can be tuned with the 'mkspellmem'
  423. option.
  424. After the spell file was written and it was being used
  425. in a buffer it will be reloaded automatically.
  426. :mksp[ell] [-ascii] {name}.{enc}.add
  427. Like ":mkspell" above, using {name}.{enc}.add as the
  428. input file and producing an output file in the same
  429. directory that has ".spl" appended.
  430. :mksp[ell] [-ascii] {name}
  431. Like ":mkspell" above, using {name} as the input file
  432. and producing an output file in the same directory
  433. that has ".{enc}.spl" appended.
  434. Vim will report the number of duplicate words. This might be a mistake in the
  435. list of words. But sometimes it is used to have different prefixes and
  436. suffixes for the same basic word to avoid them combining (e.g. Czech uses
  437. this). If you want Vim to report all duplicate words set the 'verbose'
  438. option.
  439. Since you might want to change a Myspell word list for use with Vim the
  440. following procedure is recommended:
  441. 1. Obtain the xx_YY.aff and xx_YY.dic files from Myspell.
  442. 2. Make a copy of these files to xx_YY.orig.aff and xx_YY.orig.dic.
  443. 3. Change the xx_YY.aff and xx_YY.dic files to remove bad words, add missing
  444. words, define word characters with FOL/LOW/UPP, etc. The distributed
  445. "*.diff" files can be used.
  446. 4. Start Vim with the right locale and use |:mkspell| to generate the Vim
  447. spell file.
  448. 5. Try out the spell file with ":set spell spelllang=xx" if you wrote it in
  449. a spell directory in 'runtimepath', or ":set spelllang=xx.enc.spl" if you
  450. wrote it somewhere else.
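Steps 4 and 5 might look like this, assuming the edited xx_YY.aff and
xx_YY.dic files are in the current directory and ~/.config/nvim is in
'runtimepath' ("xx" and "xx_YY" are placeholders, as above): >
	" Writes ~/.config/nvim/spell/xx.{encoding}.spl; the "!" overwrites an
	" existing file.
	:mkspell! ~/.config/nvim/spell/xx xx_YY
	" Try out the new spell file in the current buffer.
	:setlocal spell spelllang=xx
<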
  451. When the Myspell files are updated you can merge the differences:
  452. 1. Obtain the new Myspell files as xx_YY.new.aff and xx_YY.new.dic.
  453. 2. Use |diff-mode| to see what changed: >
  454. nvim -d xx_YY.orig.dic xx_YY.new.dic
  455. 3. Take over the changes you like in xx_YY.dic.
  456. You may also need to change xx_YY.aff.
  457. 4. Rename xx_YY.new.dic to xx_YY.orig.dic and xx_YY.new.aff to xx_YY.orig.aff.
  458. SPELL FILE VERSIONS *E770* *E771* *E772*
  459. Spell checking is a relatively new feature in Vim, thus it's possible that the
  460. .spl file format will be changed to support more languages. Vim will check
  461. the validity of the spell file and report anything wrong.
  462. E771: Old spell file, needs to be updated ~
  463. This spell file is older than your Vim. You need to update the .spl file.
  464. E772: Spell file is for newer version of Vim ~
  465. This means the spell file was made for a later version of Vim. You need to
  466. update Vim.
  467. E770: Unsupported section in spell file ~
  468. This means the spell file was made for a later version of Vim and contains a
  469. section that is required for the spell file to work. In this case it's
  470. probably a good idea to upgrade your Vim.
  471. SPELL FILE DUMP
  472. If for some reason you want to check what words are supported by the currently
  473. used spelling files, use this command:
  474. *:spelldump* *:spelld*
  475. :spelld[ump] Open a new window and fill it with all currently valid
  476. words. Compound words are not included.
  477. Note: For some languages the result may be enormous,
  478. causing Vim to run out of memory.
  479. :spelld[ump]! Like ":spelldump" and include the word count. This is
  480. the number of times the word was found while
  481. updating the screen. Words that are in COMMON items
  482. get a starting count of 10.
  483. The format used is the straight word list format |spell-wordlist-format|. You should be
  484. able to read it with ":mkspell" to generate one .spl file that includes all
  485. the words.
  486. When all entries to 'spelllang' use the same regions or no regions at all then
  487. the region information is included in the dumped words. Otherwise only words
  488. for the current region are included and no "/regions" line is generated.
  489. Comment lines with the name of the .spl file are used as a header above the
  490. words that were generated from that .spl file.
  491. SPELL FILE MISSING *spell-SpellFileMissing* *spellfile.vim*
  492. If the spell file for the language you are using is not available, you will
  493. get an error message. But if the "spellfile.vim" plugin is active it will
  494. offer to download the spell file. Just follow the instructions; it will
  495. ask you where to write the file (there must be a writable directory in
  496. 'runtimepath' for this).
  497. The plugin has a default place where to look for spell files, on the Vim ftp
  498. server. The protocol used is SSL (https://) for security. If you want to use
  499. another location or another protocol, set the g:spellfile_URL variable to the
  500. directory that holds the spell files. You can use http:// or ftp://, but you
  501. are taking a security risk then. The |netrw| plugin is used for getting the
  502. file, look there for the specific syntax of the URL. Example: >
  503. let g:spellfile_URL = 'https://ftp.nluug.nl/vim/runtime/spell'
  504. You may need to escape special characters.
  505. The plugin will only ask about downloading a language once. If you want to
  506. try again anyway restart Vim, or set g:spellfile_URL to another value (e.g.,
  507. prepend a space).
  508. To avoid using the "spellfile.vim" plugin do this in your vimrc file: >
  509. let loaded_spellfile_plugin = 1
  510. Instead of using the plugin you can define a |SpellFileMissing| autocommand to
  511. handle the missing file yourself. You can use it like this: >
  512. :au SpellFileMissing * call Download_spell_file(expand('<amatch>'))
  513. Thus the <amatch> item contains the name of the language. Another important
  514. value is 'encoding', since every encoding has its own spell file, with two
  515. exceptions:
  516. - For ISO-8859-15 (latin9) the name "latin1" is used (the encodings only
  517. differ in characters not used in dictionary words).
  518. - The name "ascii" may also be used for some languages where the words use
  519. only ASCII letters for most of the words.
  520. The default "spellfile.vim" plugin uses this autocommand. If you define your
  521. autocommand afterwards you may want to use ":au! SpellFileMissing" to overrule
  522. it. If you define your autocommand before the plugin is loaded it will notice
  523. this and not do anything.
  524. *E797*
  525. Note that the SpellFileMissing autocommand must not change or destroy the
  526. buffer the user was editing.
  527. ==============================================================================
  528. 4. Spell file format *spell-file-format*
  529. This is the format of the files that are used by the person who creates and
  530. maintains a word list.
  531. Note that we avoid the word "dictionary" here. That is because the goal of
  532. spell checking differs from writing a dictionary (as in the book). For
  533. spelling we need a list of words that are OK, thus should not be highlighted.
  534. Person and company names will not appear in a dictionary, but do appear in a
  535. word list. And some old words are rarely used while they are common
  536. misspellings. These do appear in a dictionary but not in a word list.
  537. There are two formats: A straight list of words and a list using affix
  538. compression. The files with affix compression are used by Myspell (Mozilla
  539. and OpenOffice.org). This requires two files, one with .aff and one with .dic
  540. extension.
  541. FORMAT OF STRAIGHT WORD LIST *spell-wordlist-format*
  542. The words must appear one per line. That is all that is required.
  543. Additionally the following items are recognized:
  544. - Empty and blank lines are ignored.
  545. # comment ~
  546. - Lines starting with a # are ignored (comment lines).
  547. /encoding=utf-8 ~
  548. - A line starting with "/encoding=", before any word, specifies the encoding
  549. of the file. After the '=' comes an encoding name. This tells Vim
  550. to set up conversion from the specified encoding to 'encoding'. Thus you can
  551. use one word list for several target encodings.
  552. /regions=usca ~
  553. - A line starting with "/regions=" specifies the region names that are
  554. supported. Each region name must be two ASCII letters. The first one is
  555. region 1. Thus "/regions=usca" has region 1 "us" and region 2 "ca".
  556. In an addition word list the region names should be equal to the main word
  557. list!
  558. - Other lines starting with '/' are reserved for future use. The ones that
  559. are not recognized are ignored. You do get a warning message, so that you
  560. know something won't work.
  561. - A "/" may follow the word with the following items:
  562.     =         Case must match exactly.
  563.     ?         Rare word.
  564.     !         Bad (wrong) word.
  565.     1 to 9    A region in which the word is valid. If no regions are
  566.               specified the word is valid in all regions.
  567. Example:
  568. # This is an example word list    comment
  569. /encoding=latin1                  encoding of the file
  570. /regions=uscagb                   regions "us", "ca" and "gb"
  571. example                           word for all regions
  572. blah/12                           word for regions "us" and "ca"
  573. vim/!                             bad word
  574. Campbell/?3                       rare word in region 3 "gb"
  575. 's mornings/=                     keep-case word
  576. Note that when "/=" is used the same word with all upper-case letters is not
  577. accepted. This is different from a word with mixed case that is automatically
  578. marked as keep-case, those words may appear in all upper-case letters.
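The rules above can be illustrated with a small reader for a straight word
list. This is only a sketch in Python, not the code Vim uses: the function
name parse_wordlist and the layout of the returned entries are made up for
the illustration, and it assumes the words themselves contain no slash.

```python
def parse_wordlist(lines):
    """Parse a straight word list into (encoding, regions, entries).

    Hypothetical helper for illustration; Vim's own reader is written in C.
    """
    encoding = None
    regions = []
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # empty, blank and comment lines are ignored
        if line.startswith("/encoding="):
            encoding = line[len("/encoding="):]
            continue
        if line.startswith("/regions="):
            names = line[len("/regions="):]
            # each region name is two ASCII letters; the first one is region 1
            regions = [names[i:i + 2] for i in range(0, len(names), 2)]
            continue
        if line.startswith("/"):
            continue  # reserved for future use (Vim gives a warning here)
        word, _, items = line.partition("/")
        entries.append({
            "word": word,
            "keep_case": "=" in items,
            "rare": "?" in items,
            "bad": "!" in items,
            # region numbers 1 to 9; an empty list means "all regions"
            "regions": [int(c) for c in items if c in "123456789"],
        })
    return encoding, regions, entries
```

Feeding it the example list (without the explanatory text in the right-hand
column) gives the encoding "latin1", the regions "us", "ca" and "gb", and one
entry per word with its items decoded.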
  579. FORMAT WITH .AFF AND .DIC FILES *aff-dic-format*
  580. There are two files: the basic word list and an affix file. The affix file
  581. specifies settings for the language and can contain affixes. The affixes are
  582. used to modify the basic words to get the full word list. This significantly
  583. reduces the number of words, especially for a language like Polish. This is
  584. called affix compression.
  585. The basic word list and the affix file are combined with the ":mkspell"
  586. command and results in a binary spell file. All the preprocessing has been
  587. done, thus this file loads fast. The binary spell file format is described in
  588. the source code (src/spell.c). But only developers need to know about it.
  589. The preprocessing also allows us to take the Myspell language files and modify
  590. them before the Vim word list is made. The tools for this can be found in the
  591. "src/spell" directory.
  592. The format for the affix and word list files is based on what Myspell uses
  593. (the spell checker of Mozilla and OpenOffice.org). A description can be found
  594. here:
  595. https://lingucomponent.openoffice.org/affix.readme
  596. Note that affixes are case sensitive, this isn't obvious from the description.
  597. Vim supports quite a few extras. They are described below |spell-affix-vim|.
  598. Attempts have been made to keep this compatible with other spell checkers, so
  599. that the same files can often be used. One other project that offers more
  600. than Myspell is Hunspell ( https://hunspell.github.io ).
  601. WORD LIST FORMAT *spell-dic-format*
  602. A short example, with line numbers:
  603. 1 1234 ~
  604. 2 aan ~
  605. 3 Als ~
  606. 4 Etten-Leur ~
  607. 5 et al. ~
  608. 6 's-Gravenhage ~
  609. 7 's-Gravenhaags ~
  610. 8 # word that differs between regions ~
  611. 9 kado/1 ~
  612. 10 cadeau/2 ~
  613. 11 TCP,IP ~
  614. 12 /the S affix may add a 's' ~
  615. 13 bedel/S ~
  616. The first line contains the number of words. Vim ignores it, but you do get
  617. an error message if it's not there. *E760*
  618. What follows is one word per line. White space at the end of the line is
  619. ignored, all other white space matters. The encoding is specified in the
  620. affix file |spell-SET|.
  621. Comment lines start with '#' or '/'. See the example lines 8 and 12. Note
  622. that putting a comment after a word is NOT allowed:
  623. someword # comment that causes an error! ~
  624. After the word there is an optional slash and flags. Most of these flags are
  625. letters that indicate the affixes that can be used with this word. These are
  626. specified with SFX and PFX lines in the .aff file, see |spell-SFX| and
  627. |spell-PFX|. Vim allows using other flag types with the FLAG item in the
  628. affix file |spell-FLAG|.
  629. When the word only has lower-case letters it will also match with the word
  630. starting with an upper-case letter.
  631. When the word includes an upper-case letter, this means the upper-case letter
  632. is required at this position. The same word with a lower-case letter at this
  633. position will not match. When some of the other letters are upper-case it will
  634. not match either.
  635. The word with all upper-case characters will always be OK:
  636. word list    matches        does not match ~
  637. als          als Als ALS    ALs AlS aLs aLS
  638. Als          Als ALS        als ALs AlS aLs aLS
  639. ALS          ALS            als Als ALs AlS aLs aLS
  640. AlS          AlS ALS        als Als ALs aLs aLS
  641. The KEEPCASE affix ID can be used to specifically match a word with identical
  642. case only, see below |spell-KEEPCASE|.
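The case rules in the table above can be written down as a short check. The
following Python sketch is an illustration only: the function name
case_matches is hypothetical, it ignores KEEPCASE, and it relies on Python's
own upper/lower-case conversion rather than on FOL/LOW/UPP tables.

```python
def case_matches(entry, text):
    """Return True when text is an accepted spelling of the word list entry."""
    if text == entry:
        return True  # identical case always matches
    if text == entry.upper():
        return True  # the word in all upper-case characters is always OK
    # An entry with only lower-case letters also matches with the first
    # letter in upper case.
    if entry == entry.lower() and text == entry[:1].upper() + entry[1:]:
        return True
    return False
```

For example case_matches("als", "Als") is accepted, while
case_matches("Als", "als") and case_matches("AlS", "Als") are not.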
  643. Note: in lines 5 to 7 non-word characters are used. You can include any
  644. character in a word. When checking the text a word still only matches when it
  645. appears with a non-word character before and after it. For Myspell a word
  646. starting with a non-word character probably won't work.
  647. In line 11 the word "TCP/IP" is defined. Since the slash has a special
  648. meaning the comma is used instead. This is defined with the SLASH item in the
  649. affix file, see |spell-SLASH|. Note that without this SLASH item the word
  650. will be "TCP,IP".
  651. AFFIX FILE FORMAT *spell-aff-format* *spell-affix-vim*
  652. *spell-affix-comment*
  653. Comment lines in the .aff file start with a '#':
  654. # comment line ~
  655. Items with a fixed number of arguments can be followed by a comment. But only
  656. if none of the arguments can contain white space. The comment must start with
  657. a "#" character. Example:
  658. KEEPCASE = # fix case for words with this flag ~
  659. ENCODING *spell-SET*
  660. The affix file can be in any encoding that is supported by "iconv". However,
  661. in some cases the current locale should also be set properly at the time
  662. |:mkspell| is invoked. Adding FOL/LOW/UPP lines removes this requirement
  663. |spell-FOL|.
  664. The encoding should be specified before anything where the encoding matters.
  665. The encoding applies both to the affix file and the dictionary file. It is
  666. done with a SET line:
  667. SET utf-8 ~
  668. The encoding can be different from the value of the 'encoding' option at the
  669. time ":mkspell" is used. Vim will then convert everything to 'encoding' and
  670. generate a spell file for 'encoding'. If some of the used characters do not
  671. fit in 'encoding' you will get an error message.
  672. *spell-affix-mbyte*
  673. When using a multibyte encoding it's possible to use many more distinct affix
  674. flags. But Myspell doesn't support that, thus you may not want to use it
  675. anyway. For compatibility use an 8-bit encoding.
  676. INFORMATION
  677. These entries in the affix file can be used to add information to the spell
  678. file. There are no restrictions on the format, but they should be in the
  679. right encoding.
  680. *spell-NAME* *spell-VERSION* *spell-HOME*
  681. *spell-AUTHOR* *spell-EMAIL* *spell-COPYRIGHT*
  682. NAME Name of the language
  683. VERSION 1.0.1 with fixes
  684. HOME https://www.example.com
  685. AUTHOR John Doe
  686. EMAIL john AT Doe DOT net
  687. COPYRIGHT LGPL
  688. These fields are put in the .spl file as-is. The |:spellinfo| command can be
  689. used to view the info.
  690. *:spellinfo* *:spelli*
  691. :spelli[nfo] Display the information for the spell file(s) used for
  692. the current buffer.
  693. CHARACTER TABLES
  694. *spell-affix-chars*
  695. When using an 8-bit encoding the affix file should define what characters are
  696. word characters. This is because the system where ":mkspell" is used may not
  697. support a locale with this encoding and isalpha() won't work. For example
  698. when using "cp1250" on Unix.
  699. *E761* *E762* *spell-FOL*
  700. *spell-LOW* *spell-UPP*
  701. Three lines in the affix file are needed. Simplistic example:
  702. FOL áëñ ~
  703. LOW áëñ ~
  704. UPP ÁËÑ ~
  705. All three lines must have exactly the same number of characters.
  706. The "FOL" line specifies the case-folded characters. These are used to
  707. compare words while ignoring case. For most encodings this is identical to
  708. the lower case line.
  709. The "LOW" line specifies the characters in lower-case. Mostly it's equal to
  710. the "FOL" line.
  711. The "UPP" line specifies the characters with upper-case. That is, a character
  712. is upper-case where it's different from the character at the same position in
  713. "FOL".
  714. An exception is made for the German sharp s ß. The upper-case version is
  715. "SS". In the FOL/LOW/UPP lines it should be included, so that it's recognized
  716. as a word character, but use the ß character in all three.
  717. ASCII characters should be omitted, Vim always handles these in the same way.
  718. When the encoding is UTF-8 no word characters need to be specified.
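How the three lines relate can be sketched in Python. This is not Vim's
implementation; the name build_case_tables and the returned structures are
assumptions made for the illustration. It checks the length requirement and
derives a case-folding table plus the set of upper-case characters.

```python
def build_case_tables(fol, low, upp):
    """Build (fold_table, upper_chars) from the FOL, LOW and UPP strings."""
    if not (len(fol) == len(low) == len(upp)):
        raise ValueError("FOL, LOW and UPP must have the same number of characters")
    fold_table = {}
    upper_chars = set()
    for f, l, u in zip(fol, low, upp):
        # every listed character is a word character and folds to its FOL form
        fold_table[f] = f
        fold_table[l] = f
        fold_table[u] = f
        # a character is upper-case where it differs from the FOL character
        if u != f:
            upper_chars.add(u)
    return fold_table, upper_chars
```

With the lines from the simplistic example extended by ß in all three
positions, "Á" folds to "á", and ß is a word character that does not count as
upper-case.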
  719. *E763*
  720. Vim allows you to use spell checking for several languages in the same file.
  721. You can list them in the 'spelllang' option. As a consequence all spell files
  722. for the same encoding must use the same word characters, otherwise they can't
  723. be combined without errors.
  724. If you get an E763 warning that the word tables differ you need to update your
  725. ".spl" spell files. If you downloaded the files, get the latest version of
  726. all spell files you use. If you are only using one, e.g., German, then also
  727. download the recent English spell files. Otherwise generate the .spl file
  728. again with |:mkspell|. If you still get errors check the FOL, LOW and UPP
  729. lines in the used .aff files.
  730. The XX.ascii.spl spell file generated with the "-ascii" argument will not
  731. contain the table with characters, so that it can be combined with spell files
  732. for any encoding. The .add.spl files also do not contain the table.
  733. MID-WORD CHARACTERS
  734. *spell-midword*
  735. Some characters are only to be considered word characters if they are used in
  736. between two ordinary word characters. An example is the single quote: It is
  737. often used to put text in quotes, thus it can't be recognized as a word
  738. character, but when it appears in between word characters it must be part of
  739. the word. This is needed to detect a spelling error such as they'are. That
  740. should be they're, but since "they" and "are" are words themselves that would
  741. go unnoticed.
  742. These characters are defined with MIDWORD in the .aff file. Example:
  743. MIDWORD '- ~
  744. FLAG TYPES *spell-FLAG*
  745. Flags are used to specify the affixes that can be used with a word and for
  746. other properties of the word. Normally single-character flags are used. This
  747. limits the number of possible flags, especially for 8-bit encodings. The FLAG
  748. item can be used if more affixes are to be used. Possible values:
  749. FLAG long use two-character flags
  750. FLAG num use numbers, from 1 up to 65000
  751. FLAG caplong use one-character flags without A-Z and two-character
  752. flags that start with A-Z
  753. With "FLAG num" the numbers in a list of affixes need to be separated with a
  754. comma: "234,2143,1435". This method is inefficient, but useful if the file is
  755. generated with a program.
  756. When using "caplong" the two-character flags all start with a capital: "Aa",
  757. "B1", "BB", etc. This is useful to use one-character flags for the most
  758. common items and two-character flags for uncommon items.
  759. Note: When using utf-8 only characters up to 65000 may be used for flags.
  760. Note: even when using "num" or "long" the number of flags available to
  761. compounding and prefixes is limited to about 250.
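To make the flag types concrete, here is a Python sketch that splits the flag
text after the slash into separate flags. The function name split_flags and
the type name "char" for the default single-character flags are invented for
this illustration; they are not names Vim uses.

```python
def split_flags(text, flag_type="char"):
    """Split the flags of a word according to the FLAG type."""
    if flag_type == "num":
        # numbers separated with a comma: "234,2143,1435"
        return [part for part in text.split(",") if part]
    if flag_type == "long":
        if len(text) % 2:
            raise ValueError("two-character flags need an even length")
        return [text[i:i + 2] for i in range(0, len(text), 2)]
    if flag_type == "caplong":
        flags = []
        i = 0
        while i < len(text):
            if "A" <= text[i] <= "Z":
                # a capital starts a two-character flag
                if i + 1 >= len(text):
                    raise ValueError("incomplete two-character flag")
                flags.append(text[i:i + 2])
                i += 2
            else:
                flags.append(text[i])
                i += 1
        return flags
    return list(text)  # default: one character per flag
```

For example, with "caplong" the text "aAaB1" splits into the flags "a", "Aa"
and "B1".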
  762. AFFIXES *spell-PFX* *spell-SFX*
  763. The usual PFX (prefix) and SFX (suffix) lines are supported (see the Myspell
  764. documentation or the Aspell manual:
  765. http://aspell.net/man-html/Affix-Compression.html).
  766. Summary:
  767. SFX L Y 2 ~
  768. SFX L 0 re [^x] ~
  769. SFX L 0 ro x ~
  770. The first line is a header and has four fields:
  771. SFX {flag} {combine} {count}
  772. {flag} The name used for the suffix. Mostly it's a single letter,
  773. but other characters can be used, see |spell-FLAG|.
  774. {combine} Can be 'Y' or 'N'. When 'Y' then the word plus suffix can
  775. also have a prefix. When 'N' then a prefix is not allowed.
  776. {count} The number of lines following. If this is wrong you will get
  777. an error message.
  778. For PFX the fields are exactly the same.
  779. The basic format for the following lines is:
  780. SFX {flag} {strip} {add} {condition} {extra}
  781. {flag} Must be the same as the {flag} used in the first line.
  782. {strip} Characters removed from the basic word. There is no check if
  783. the characters are actually there, only the length is used (in
  784. bytes). The {strip} text should match the {condition}, otherwise strange
  785. things may happen. If the {strip} length is equal to or
  786. longer than the basic word the suffix won't be used.
  787. When {strip} is 0 (zero) then nothing is stripped.
  788. {add} Characters added to the basic word, after removing {strip}.
  789. Optionally there is a '/' followed by flags. The flags apply
  790. to the word plus affix. See |spell-affix-flags|
  791. {condition} A simplistic pattern. Only when this matches with a basic
  792. word will the suffix be used for that word. This is normally
  793. for using one suffix letter with different {add} and {strip}
  794. fields for words with different endings.
  795. When {condition} is a . (dot) there is no condition.
  796. The pattern may contain:
  797. - Literal characters.
  798. - A set of characters in []. [abc] matches a, b and c.
  799. A dash is allowed for a range [a-c], but this is
  800. Vim-specific.
  801. - A set of characters that starts with a ^, meaning the
  802. complement of the specified characters. [^abc] matches any
  803. character but a, b and c.
  804. {extra} Optional extra text:
  805. # comment Comment is ignored
  806. - Hunspell uses this, ignored
  807. For PFX the fields are the same, but the {strip}, {add} and {condition} apply
  808. to the start of the word.
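The effect of one suffix line on a basic word can be sketched as follows.
This is an illustration in Python under several assumptions, not the code Vim
uses: the name apply_suffix is hypothetical, the {strip} length is counted in
characters instead of bytes, any flags after a '/' in {add} are not handled,
and the {condition} is handed to Python's regexp engine, which only gives the
intended result for literal characters and [] sets.

```python
import re

def apply_suffix(word, strip, add, condition):
    """Return the word with the suffix applied, or None if it is not used."""
    # A "." condition means there is no condition; otherwise the pattern
    # must match at the end of the basic word.
    if condition != "." and re.search("(?:" + condition + ")$", word) is None:
        return None
    # Only the length of {strip} is used; "0" means nothing is stripped.
    strip_len = 0 if strip == "0" else len(strip)
    if strip_len >= len(word):
        return None  # {strip} is as long as the word or longer
    return word[:len(word) - strip_len] + add
```

Thus apply_suffix("Spion", "0", "in", "[^i]n") gives "Spionin", while the
same rule is not used for "Bauerin" because the condition does not match.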
  809. Note: Myspell ignores any extra text after the relevant info. Vim requires
  810. this text to start with a "#" so that mistakes don't go unnoticed. Example:
  811. SFX F 0 in [^i]n # Spion > Spionin ~
  812. SFX F 0 nen in # Bauerin > Bauerinnen ~
  813. However, to avoid lots of errors in affix files written for Myspell, you can
  814. add the IGNOREEXTRA flag.
  815. Apparently Myspell allows an affix name to appear more than once. Since this
  816. might also be a mistake, Vim checks for an extra "S". The affix files for
  817. Myspell that use this feature apparently have this flag. Example:
  818. SFX a Y 1 S ~
  819. SFX a 0 an . ~
  820. SFX a Y 2 S ~
  821. SFX a 0 en . ~
  822. SFX a 0 on . ~
  823. AFFIX FLAGS *spell-affix-flags*
  824. This is a feature that comes from Hunspell: The affix may specify flags. This
  825. works similarly to flags specified on a basic word. The flags apply to the
  826. basic word plus the affix (but there are restrictions). Example:
  827. SFX S Y 1 ~
  828. SFX S 0 s . ~
  829. SFX A Y 1 ~
  830. SFX A 0 able/S . ~
  831. When the dictionary file contains "drink/AS" then these words are possible:
  832. drink
  833. drinks uses S suffix
  834. drinkable uses A suffix
  835. drinkables uses A suffix and then S suffix
  836. Generally the flags of the suffix are added to the flags of the basic word,
  837. both are used for the word plus suffix. But the flags of the basic word are
  838. only used once for affixes, except that both one prefix and one suffix can be
  839. used when both support combining.
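The "drink/AS" example can be reproduced with a few lines of Python. This is
a simplified sketch, not Vim's algorithm: the table SUFFIXES and the function
expand are made up for the illustration, every suffix is assumed to have a
{strip} of 0 and a "." condition, and prefixes are left out.

```python
# Suffix table for the example above: flag -> list of {add} fields.
SUFFIXES = {
    "S": ["s"],
    "A": ["able/S"],
}

def expand(word, flags):
    """List the basic word and all forms reachable through suffix flags."""
    forms = [word]
    for flag in flags:
        for add in SUFFIXES.get(flag, []):
            text, _, extra_flags = add.partition("/")
            derived = word + text
            forms.append(derived)
            # Flags given on the suffix allow one more suffix on top of it.
            # The flags of the basic word are not used a second time.
            for extra in extra_flags:
                for add2 in SUFFIXES.get(extra, []):
                    forms.append(derived + add2.partition("/")[0])
    return forms
```

Calling expand("drink", "AS") yields drink, drinkable, drinkables and drinks,
the same four words as listed above.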
  840. Specifically, the affix flags can be used for:
  841. - Suffixes on suffixes, as in the example above. This works once, thus you
  842. can have two suffixes on a word (plus one prefix).
  843. - Making the word with the affix rare, by using the |spell-RARE| flag.
  844. - Exclude the word with the affix from compounding, by using the
  845. |spell-COMPOUNDFORBIDFLAG| flag.
  846. - Allow the word with the affix to be part of a compound word on the side of
  847. the affix with the |spell-COMPOUNDPERMITFLAG|.
  848. - Use the NEEDCOMPOUND flag: word plus affix can only be used as part of a
  849. compound word. |spell-NEEDCOMPOUND|
  850. - Compound flags: word plus affix can be part of a compound word at the end,
  851. middle, start, etc. The flags are combined with the flags of the basic
  852. word. |spell-compound|
  853. - NEEDAFFIX: another affix is needed to make a valid word.
  854. - CIRCUMFIX, as explained just below.
  855. IGNOREEXTRA *spell-IGNOREEXTRA*
  856. Normally Vim gives an error for an extra field that does not start with '#'.
  857. This avoids errors going unnoticed. However, some files created for Myspell
  858. or Hunspell may contain many entries with an extra field. Use the IGNOREEXTRA
  859. flag to avoid lots of errors.
  860. CIRCUMFIX *spell-CIRCUMFIX*
  861. The CIRCUMFIX flag means a prefix and suffix must be added at the same time.
  862. If a prefix has the CIRCUMFIX flag then only suffixes with the CIRCUMFIX flag
  863. can be added, and the other way around.
  864. An alternative is to only specify the suffix, and give that suffix two flags:
  865. the required prefix and the NEEDAFFIX flag. |spell-NEEDAFFIX|
  866. PFXPOSTPONE *spell-PFXPOSTPONE*
  867. When an affix file has very many prefixes that apply to many words it's not
  868. possible to build the whole word list in memory. This applies to Hebrew (a
  869. list with all words is over a Gbyte). In that case applying prefixes must be
  870. postponed. This makes spell checking slower. It is indicated by this keyword
  871. in the .aff file:
  872. PFXPOSTPONE ~
  873. Only prefixes without a chop string and without flags can be postponed.
  874. Prefixes with a chop string or with flags will still be included in the word
  875. list. An exception is when the chop string is one character and equal to the last
  876. character of the added string, but in lower case. That is the case when the chop string
  877. is used to allow the following word to start with an upper case letter.
  878. WORDS WITH A SLASH *spell-SLASH*
  879. The slash is used in the .dic file to separate the basic word from the affix
  880. letters and other flags. Unfortunately, this means you cannot use a slash in
  881. a word. Thus "TCP/IP" is not a word but "TCP" with the flags "IP". To include
  882. a slash in the word put a backslash before it: "TCP\/IP". In the rare case
  883. you want to use a backslash inside a word you need to use two backslashes.
  884. Any other use of the backslash is reserved for future expansion.
  885. KEEP-CASE WORDS *spell-KEEPCASE*
  886. In the affix file a KEEPCASE line can be used to define the affix name used
  887. for keep-case words. Example:
  888. KEEPCASE = ~
  889. This flag is not supported by Myspell. It has the meaning that case matters.
  890. This can be used if the word does not have the first letter in upper case at
  891. the start of a sentence. Example:
  892. word list       matches                  does not match ~
  893. 's morgens/=    's morgens               'S morgens 's Morgens 'S MORGENS
  894. 's Morgens      's Morgens 'S MORGENS    'S morgens 's morgens
  895. The flag can also be used to avoid that the word matches when it is in all
  896. upper-case letters.
  897. RARE WORDS *spell-RARE*
  898. In the affix file a RARE line can be used to define the affix name used for
  899. rare words. Example:
  900. RARE ? ~
  901. Rare words are highlighted differently from bad words. This is to be used for
  902. words that are correct for the language, but are hardly ever used and could be
  903. a typing mistake anyway.
  904. This flag can also be used on an affix, so that a basic word is not rare but
  905. the basic word plus affix is rare |spell-affix-flags|. However, if the word
  906. also appears as a good word in another way (e.g., in another region) it won't
  907. be marked as rare.
  908. BAD WORDS *spell-BAD*
  909. In the affix file a BAD line can be used to define the affix name used for
  910. bad words. Example:
  911. BAD ! ~
  912. This can be used to exclude words that would otherwise be good. For example
  913. "the the" in the .dic file:
  914. the the/! ~
  915. Once a word has been marked as bad it won't be undone by encountering the same
  916. word as good.
  917. The flag also applies to the word with affixes, thus this can be used to mark
  918. a whole bunch of related words as bad.
  919. *spell-FORBIDDENWORD*
  920. FORBIDDENWORD can be used just like BAD. For compatibility with Hunspell.
  921. *spell-NEEDAFFIX*
  922. The NEEDAFFIX flag is used to require that a word is used with an affix. The
  923. word itself is not a good word (unless there is an empty affix). Example:
  924. NEEDAFFIX + ~
  925. COMPOUND WORDS *spell-compound*
  926. A compound word is a longer word made by concatenating words that appear in
  927. the .dic file. To specify which words may be concatenated a character is
  928. used. This character is put in the list of affixes after the word. We will
  929. call this character a flag here. Obviously these flags must be different from
  930. any affix IDs used.
  931. *spell-COMPOUNDFLAG*
  932. The Myspell compatible method uses one flag, specified with COMPOUNDFLAG. All
  933. words with this flag combine in any order. This means there is no control
  934. over which word comes first. Example:
  935. COMPOUNDFLAG c ~
  936. *spell-COMPOUNDRULE*
  937. A more advanced method to specify how compound words can be formed uses
  938. multiple items with multiple flags. This is not compatible with Myspell 3.0.
  939. Let's start with an example:
  940. COMPOUNDRULE c+ ~
  941. COMPOUNDRULE se ~
  942. The first line defines that words with the "c" flag can be concatenated in any
  943. order. The second line defines compound words that are made of one word with
  944. the "s" flag and one word with the "e" flag. With this dictionary:
  945. bork/c ~
  946. onion/s ~
  947. soup/e ~
  948. You can make these words:
  949. bork
  950. borkbork
  951. borkborkbork
  952. (etc.)
  953. onion
  954. soup
  955. onionsoup
  956. The COMPOUNDRULE item may appear multiple times. The argument is made out of
  957. one or more groups, where each group can be:
  958. one flag e.g., c
  959. alternate flags inside [] e.g., [abc]
  960. Optionally this may be followed by:
  961. * the group appears zero or more times, e.g., sm*e
  962. + the group appears one or more times, e.g., c+
  963. ? the group appears zero times or once, e.g., x?
  964. This is similar to the regexp pattern syntax (but not the same!). A few
  965. examples with the sequence of word flags they require:
  966. COMPOUNDRULE x+          x xx xxx etc.
  967. COMPOUNDRULE yz          yz
  968. COMPOUNDRULE x+z         xz xxz xxxz etc.
  969. COMPOUNDRULE yx+         yx yxx yxxx etc.
  970. COMPOUNDRULE xy?z        xz xyz
  971. COMPOUNDRULE [abc]z      az bz cz
  972. COMPOUNDRULE [abc]+z     az aaz abaz bz baz bcbz cz caz cbaz etc.
  973. COMPOUNDRULE a[xyz]+     ax axx axyz ay ayx ayzz az azy azxy etc.
  974. COMPOUNDRULE sm*e        se sme smme smmme etc.
  975. COMPOUNDRULE s[xyz]*e    se sxe sxye sxyxe sye syze sze szye szyxe etc.
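Because the COMPOUNDRULE argument resembles a regexp, a quick way to
experiment with it is to translate it into one. The Python sketch below does
that under stated assumptions: flags are single characters, none of the flags
is one of the characters []*+?, and the function names are invented for the
illustration. It checks a sequence of word flags, not the words themselves.

```python
import re

def compound_rule_to_regex(rule):
    """Translate a COMPOUNDRULE argument into a compiled Python regexp."""
    parts = []
    i = 0
    while i < len(rule):
        ch = rule[i]
        if ch == "[":
            # alternate flags inside []
            end = rule.index("]", i)
            parts.append("[" + "".join(re.escape(c) for c in rule[i + 1:end]) + "]")
            i = end + 1
        elif ch in "*+?":
            parts.append(ch)  # repeat count for the preceding group
            i += 1
        else:
            parts.append(re.escape(ch))  # one flag
            i += 1
    return re.compile("".join(parts))

def flags_allowed(rule, flag_sequence):
    """Return True when the whole sequence of word flags fits the rule."""
    return compound_rule_to_regex(rule).fullmatch(flag_sequence) is not None
```

For example flags_allowed("[abc]+z", "abaz") holds, while
flags_allowed("x+z", "z") does not, since "x+" requires at least one word
with the "x" flag.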
  976. A specific example: Allow a compound to be made of two words and a dash:
  977. In the .aff file:
  978. COMPOUNDRULE sde ~
  979. NEEDAFFIX x ~
  980. COMPOUNDWORDMAX 3 ~
  981. COMPOUNDMIN 1 ~
  982. In the .dic file:
  983. start/s ~
  984. end/e ~
  985. -/xd ~
  986. This allows for the word "start-end", but not "startend".
  987. An additional implied rule is that, without further flags, a word with a
  988. prefix cannot be compounded after another word, and a word with a suffix
  989. cannot be compounded with a following word. Thus the affix cannot appear
  990. on the inside of a compound word. This can be changed with the
  991. |spell-COMPOUNDPERMITFLAG|.
  992. *spell-NEEDCOMPOUND*
  993. The NEEDCOMPOUND flag is used to require that a word is used as part of a
  994. compound word. The word itself is not a good word. Example:
  995. NEEDCOMPOUND & ~
  996. *spell-ONLYINCOMPOUND*
  997. The ONLYINCOMPOUND flag does exactly the same as NEEDCOMPOUND. Supported for
  998. compatibility with Hunspell.
  999. *spell-COMPOUNDMIN*
  1000. The minimal character length of a word used for compounding is specified with
  1001. COMPOUNDMIN. Example:
  1002. COMPOUNDMIN 5 ~
  1003. When omitted there is no minimal length. Obviously you could just leave out
  1004. the compound flag from short words instead, this feature is present for
  1005. compatibility with Myspell.
  1006. *spell-COMPOUNDWORDMAX*
  1007. The maximum number of words that can be concatenated into a compound word is
  1008. specified with COMPOUNDWORDMAX. Example:
  1009. COMPOUNDWORDMAX 3 ~
  1010. When omitted there is no maximum. It applies to all compound words.
  1011. To set a limit for words with specific flags make sure the items in
  1012. COMPOUNDRULE where they appear don't allow too many words.
  1013. *spell-COMPOUNDSYLMAX*
  1014. The maximum number of syllables that a compound word may contain is specified
  1015. with COMPOUNDSYLMAX. Example:
  1016. COMPOUNDSYLMAX 6 ~
  1017. This has no effect if there is no SYLLABLE item. Without COMPOUNDSYLMAX there
  1018. is no limit on the number of syllables.
  1019. If both COMPOUNDWORDMAX and COMPOUNDSYLMAX are defined, a compound word is
  1020. accepted if it fits one of the criteria, thus is either made from up to
  1021. COMPOUNDWORDMAX words or contains up to COMPOUNDSYLMAX syllables.
  1022. *spell-COMPOUNDFORBIDFLAG*
  1023. The COMPOUNDFORBIDFLAG specifies a flag that can be used on an affix. It
  1024. means that the word plus affix cannot be used in a compound word. Example:
  1025. affix file:
  1026. COMPOUNDFLAG c ~
  1027. COMPOUNDFORBIDFLAG x ~
  1028. SFX a Y 2 ~
  1029. SFX a 0 s . ~
  1030. SFX a 0 ize/x . ~
  1031. dictionary:
  1032. word/c ~
  1033. util/ac ~
  1034. This allows for "wordutil" and "wordutils" but not "wordutilize".
  1035. Note: this doesn't work for postponed prefixes yet.
  1036. *spell-COMPOUNDPERMITFLAG*
  1037. The COMPOUNDPERMITFLAG specifies a flag that can be used on an affix. It
  1038. means that the word plus affix can also be used in a compound word in a way
  1039. where the affix ends up halfway through the word. Without this flag that is
  1040. not allowed.
  1041. Note: this doesn't work for postponed prefixes yet.
  1042. *spell-COMPOUNDROOT*
  1043. The COMPOUNDROOT flag is used for words in the dictionary that are already a
  1044. compound. This means it counts for two words when checking the compounding
  1045. rules. Can also be used for an affix to count the affix as a compounding
  1046. word.
  1047. *spell-CHECKCOMPOUNDPATTERN*
  1048. CHECKCOMPOUNDPATTERN is used to define patterns that, when matching at the
  1049. position where two words are compounded together, forbid the compound.
  1050. For example:
  1051. CHECKCOMPOUNDPATTERN o e ~
  1052. This forbids compounding if the first word ends in "o" and the second word
  1053. starts with "e".
  1054. The arguments must be plain text; no patterns are actually supported, despite
  1055. the item name. Case is always ignored.
  1056. The Hunspell feature to use three arguments and flags is not supported.
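A minimal Python sketch of the check described above, assuming each
CHECKCOMPOUNDPATTERN item is held as an (end, start) pair of plain strings.
The function name and the sample words are hypothetical; this is not how Vim
stores or tests the items internally.

```python
def compound_forbidden(first, second, patterns):
    """Return True when any (end, start) pair forbids joining the words.

    The arguments are plain text, compared with case ignored.
    """
    first_folded = first.casefold()
    second_folded = second.casefold()
    return any(
        first_folded.endswith(end.casefold())
        and second_folded.startswith(start.casefold())
        for end, start in patterns
    )


# "CHECKCOMPOUNDPATTERN o e" held as one pair.
patterns = [("o", "e")]
print(compound_forbidden("foto", "eind", patterns))   # ends in "o", starts with "e"
print(compound_forbidden("foto", "album", patterns))  # second word starts with "a"
```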
  1057. *spell-NOCOMPOUNDSUGS*
  1058. This item indicates that using compounding to make suggestions is not a good
  1059. idea. Use this when compounding is used with very short or one-character
  1060. words. E.g. to make numbers out of digits. Without this flag creating
  1061. suggestions would spend most time trying all kinds of weird compound words.
  1062. NOCOMPOUNDSUGS ~
  1063. *spell-SYLLABLE*
  1064. The SYLLABLE item defines characters or character sequences that are used to
  1065. count the number of syllables in a word. Example:
  1066. SYLLABLE aáeéiíoóöõuúüûy/aa/au/ea/ee/ei/ie/oa/oe/oo/ou/uu/ui ~
  1067. Before the first slash is the set of characters that are counted for one
  1068. syllable, also when repeated and mixed, until the next character that is not
  1069. in this set. After the slash come sequences of characters that are counted
  1070. for one syllable. These are preferred over using characters from the set.
  1071. With the example "ideeen" has three syllables, counted by "i", "ee" and "e".
  1072. Only case-folded letters need to be included.
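The counting rule can be tried out with a short Python sketch. It follows the
description above: the longest matching sequence after the slash is preferred,
otherwise a run of characters from the set counts as one syllable. The
function names are invented for this example and the code is not taken from
Vim.

```python
def parse_syllable(value):
    """Split a SYLLABLE argument into (set of characters, list of sequences)."""
    chars, *items = value.split("/")
    return set(chars), [item for item in items if item]


def count_syllables(word, value):
    """Count syllables in word according to a SYLLABLE argument."""
    chars, items = parse_syllable(value)
    word = word.casefold()
    count = 0
    in_run = False  # True while inside a run of characters from the set
    i = 0
    while i < len(word):
        # Sequences are preferred over single characters; take the longest.
        longest = max(
            (len(item) for item in items if word.startswith(item, i)),
            default=0,
        )
        if longest:
            count += 1
            in_run = False
            i += longest
            continue
        if word[i] in chars:
            if not in_run:  # a run of set characters counts only once
                count += 1
                in_run = True
        else:
            in_run = False
        i += 1
    return count


SYLLABLE = "aáeéiíoóöõuúüûy/aa/au/ea/ee/ei/ie/oa/oe/oo/ou/uu/ui"
print(count_syllables("ideeen", SYLLABLE))  # 3: "i", "ee" and "e"
```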
  1073. Another way to restrict compounding was mentioned above: Adding the
  1074. |spell-COMPOUNDFORBIDFLAG| flag to an affix causes all words that are made
  1075. with that affix to not be used for compounding.
  1076. UNLIMITED COMPOUNDING *spell-NOBREAK*
  1077. For some languages, such as Thai, there is no space in between words. This
  1078. looks like all words are compounded. To specify this use the NOBREAK item in
  1079. the affix file, without arguments:
  1080. NOBREAK ~
  1081. Vim will try to figure out where one word ends and a next starts. When there
  1082. are spelling mistakes this may not be quite right.
  1083. *spell-COMMON*
  1084. Common words can be specified with the COMMON item. This will give better
  1085. suggestions when editing a short file. Example:
  1086. COMMON the of to and a in is it you that he she was for on are ~
  1087. The words must be separated by white space, up to 25 per line.
  1088. When multiple regions are specified in a ":mkspell" command the common words
  1089. for all regions are combined and used for all regions.
  1090. *spell-NOSPLITSUGS*
  1091. This item indicates that splitting a word to make suggestions is not a good
  1092. idea. Split-word suggestions will appear only when there are few similar
  1093. words.
  1094. NOSPLITSUGS ~
  1095. *spell-NOSUGGEST*
  1096. The flag specified with NOSUGGEST can be used for words that will not be
  1097. suggested. Can be used for obscene words.
  1098. NOSUGGEST % ~
  1099. REPLACEMENTS *spell-REP*
  1100. In the affix file REP items can be used to define common mistakes. This is
  1101. used to make spelling suggestions. The items define the "from" text and the
  1102. "to" replacement. Example:
  1103. REP 4 ~
  1104. REP f ph ~
  1105. REP ph f ~
  1106. REP k ch ~
  1107. REP ch k ~
  1108. The first line specifies the number of REP lines following. Vim ignores the
  1109. number, but it must be there (for compatibility with Myspell).
  1110. Don't include simple one-character replacements or swaps. Vim will try these
  1111. anyway. You can include whole words if you want to, but you might want to use
  1112. the "file:" item in 'spellsuggest' instead.
  1113. You can include a space by using an underscore:
  1114. REP the_the the ~
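How REP items turn a bad word into candidate words can be sketched in Python
as follows. The function names are hypothetical, and treating an underscore as
a space in both fields is an assumption of this sketch. A real suggestion
mechanism would also check each candidate against the word list and score it.

```python
def parse_rep(lines):
    """Collect (from, to) pairs from REP lines; the count line is skipped."""
    pairs = []
    for line in lines:
        fields = line.split()
        # "REP 4" has only two fields, so it is ignored here.
        if len(fields) == 3 and fields[0] == "REP":
            source = fields[1].replace("_", " ")
            target = fields[2].replace("_", " ")
            pairs.append((source, target))
    return pairs


def rep_candidates(bad_word, pairs):
    """Apply each replacement at each position, one at a time."""
    candidates = []
    for source, target in pairs:
        start = bad_word.find(source)
        while start != -1:
            candidate = bad_word[:start] + target + bad_word[start + len(source):]
            if candidate not in candidates:
                candidates.append(candidate)
            start = bad_word.find(source, start + 1)
    return candidates


pairs = parse_rep(["REP 4", "REP f ph", "REP ph f", "REP k ch", "REP ch k"])
print(rep_candidates("fone", pairs))  # ['phone']
```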
  1115. SIMILAR CHARACTERS *spell-MAP* *E783*
  1116. In the affix file MAP items can be used to define letters that are very much
  1117. alike. This is mostly used for a letter with different accents. This is used
  1118. to prefer suggestions with these letters substituted. Example:
  1119. MAP 2 ~
  1120. MAP eéëêè ~
  1121. MAP uüùúû ~
  1122. The first line specifies the number of MAP lines following. Vim ignores the
  1123. number, but the line must be there.
  1124. Each letter must appear in only one of the MAP items. It's a bit more
  1125. efficient if the first letter is ASCII or at least one without accents.
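The rule that a letter may appear in only one MAP item is easy to check before
running ":mkspell". A small Python sketch, with hypothetical function names;
it only illustrates the grouping idea and says nothing about how Vim stores
the items:

```python
def build_map(items):
    """Map each letter to the index of its MAP item; reject duplicates."""
    group_of = {}
    for index, item in enumerate(items):
        for letter in item:
            if letter in group_of:
                raise ValueError(f"letter {letter!r} appears more than once")
            group_of[letter] = index
    return group_of


def similar(a, b, group_of):
    """True when both letters are equal or belong to the same MAP item."""
    return a == b or (a in group_of and group_of.get(a) == group_of.get(b))


groups = build_map(["eéëêè", "uüùúû"])
print(similar("e", "é", groups))  # both are in the first item
print(similar("e", "u", groups))  # different items
```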
  1126. .SUG FILE *spell-NOSUGFILE*
  1127. When soundfolding is specified in the affix file then ":mkspell" will normally
  1128. produce a .sug file next to the .spl file. This file is used to quickly find
  1129. suggestions by their sound-a-like form, at the cost of a lot of
  1130. memory (the amount depends on the number of words; |:mkspell| will display an
  1131. estimate when it's done).
  1132. To avoid producing a .sug file use this item in the affix file:
  1133. NOSUGFILE ~
  1134. Users can simply omit the .sug file if they don't want to use it.
  1135. SOUND-A-LIKE *spell-SAL*
  1136. In the affix file SAL items can be used to define the sound-a-like mechanism
  1137. to be used. The main items define the "from" text and the "to" replacement.
  1138. Simplistic example:
  1139. SAL CIA X ~
  1140. SAL CH X ~
  1141. SAL C K ~
  1142. SAL K K ~
  1143. There are a few rules and this can become quite complicated. An explanation
  1144. how it works can be found in the Aspell manual:
  1145. http://aspell.net/man-html/Phonetic-Code.html.
  1146. There are a few special items:
  1147. SAL followup true ~
  1148. SAL collapse_result true ~
  1149. SAL remove_accents true ~
  1150. "1" has the same meaning as "true". Any other value means "false".
  1151. SIMPLE SOUNDFOLDING *spell-SOFOFROM* *spell-SOFOTO*
  1152. The SAL mechanism is complex and slow. A simpler mechanism is mapping all
  1153. characters to another character, mapping similar sounding characters to the
  1154. same character. At the same time this does case folding. You cannot have
  1155. both SAL items and simple soundfolding.
  1156. There are two items required: one to specify the characters that are mapped
  1157. and one that specifies the characters they are mapped to. They must have
  1158. exactly the same number of characters. Example:
  1159. SOFOFROM abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ~
  1160. SOFOTO ebctefghejklnnepkrstevvkesebctefghejklnnepkrstevvkes ~
  1161. In the example all vowels are mapped to the same character 'e'. Another
  1162. method would be to leave out all vowels. Some characters that sound nearly
  1163. the same and are often mixed up, such as 'm' and 'n', are mapped to the same
  1164. character. Don't do this too much, or all words will start looking alike.
  1165. Characters that do not appear in SOFOFROM will be left out, except that all
  1166. white space is replaced by one space. Sequences of the same character in
  1167. SOFOFROM are replaced by one.
  1168. You can use the |soundfold()| function to try out the results. Or set the
  1169. 'verbose' option to see the score in the output of the |z=| command.
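For experimenting outside Vim, the simple soundfolding rules can be imitated
with a short Python sketch. It uses the SOFOFROM and SOFOTO values from the
example above. The function name is made up, and collapsing repeats on the
mapped (output) character is this sketch's reading of the rule about
sequences of the same character. Use |soundfold()| to see what Vim itself
produces.

```python
import string

# Values from the example: the 26 mapped letters are the same for lower and
# upper case, which is what makes the mapping do case folding.
SOFOFROM = string.ascii_lowercase + string.ascii_uppercase
SOFOTO = "ebctefghejklnnepkrstevvkes" * 2


def soundfold_sofo(text, sofofrom=SOFOFROM, sofoto=SOFOTO):
    """Fold text with a character-to-character table (illustration only)."""
    if len(sofofrom) != len(sofoto):
        raise ValueError("SOFOFROM and SOFOTO must have the same length")
    table = dict(zip(sofofrom, sofoto))
    result = []
    for ch in text:
        if ch.isspace():
            mapped = " "  # all white space becomes one space
        else:
            mapped = table.get(ch)
            if mapped is None:
                continue  # characters not in SOFOFROM are left out
        if not result or result[-1] != mapped:
            result.append(mapped)  # repeated characters are stored once
    return "".join(result)


print(soundfold_sofo("Moon"))      # nen
print(soundfold_sofo("it's  me"))  # ets ne
```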
  1170. UNSUPPORTED ITEMS *spell-affix-not-supported*
  1171. These items appear in the affix file of other spell checkers. In Vim they are
  1172. ignored, not supported or defined in another way.
  1173. ACCENT (Hunspell) *spell-ACCENT*
  1174. Use MAP instead. |spell-MAP|
  1175. BREAK (Hunspell) *spell-BREAK*
  1176. Define break points. Unclear how it works exactly.
  1177. Not supported.
  1178. CHECKCOMPOUNDCASE (Hunspell) *spell-CHECKCOMPOUNDCASE*
  1179. Disallow uppercase letters at compound word boundaries.
  1180. Not supported.
  1181. CHECKCOMPOUNDDUP (Hunspell) *spell-CHECKCOMPOUNDDUP*
  1182. Disallow using the same word twice in a compound. Not
  1183. supported.
  1184. CHECKCOMPOUNDREP (Hunspell) *spell-CHECKCOMPOUNDREP*
  1185. Something about using REP items and compound words. Not
  1186. supported.
  1187. CHECKCOMPOUNDTRIPLE (Hunspell) *spell-CHECKCOMPOUNDTRIPLE*
  1188. Forbid three identical characters when compounding. Not
  1189. supported.
  1190. CHECKSHARPS (Hunspell) *spell-CHECKSHARPS*
  1191. SS letter pair in uppercased (German) words may be upper case
  1192. sharp s (ß). Not supported.
  1193. COMPLEXPREFIXES (Hunspell) *spell-COMPLEXPREFIXES*
  1194. Enables using two prefixes. Not supported.
  1195. COMPOUND (Hunspell) *spell-COMPOUND*
  1196. This is one line with the count of COMPOUND items, followed by
  1197. that many COMPOUND lines with a pattern.
  1198. Remove the first line with the count and rename the other
  1199. items to COMPOUNDRULE |spell-COMPOUNDRULE|
  1200. COMPOUNDFIRST (Hunspell) *spell-COMPOUNDFIRST*
  1201. Use COMPOUNDRULE instead. |spell-COMPOUNDRULE|
  1202. COMPOUNDBEGIN (Hunspell) *spell-COMPOUNDBEGIN*
  1203. Words signed with COMPOUNDBEGIN may be first elements in
  1204. compound words.
  1205. Use COMPOUNDRULE instead. |spell-COMPOUNDRULE|
  1206. COMPOUNDLAST (Hunspell) *spell-COMPOUNDLAST*
  1207. Words signed with COMPOUNDLAST may be last elements in
  1208. compound words.
  1209. Use COMPOUNDRULE instead. |spell-COMPOUNDRULE|
  1210. COMPOUNDEND (Hunspell) *spell-COMPOUNDEND*
  1211. Probably the same as COMPOUNDLAST.
  1212. COMPOUNDMIDDLE (Hunspell) *spell-COMPOUNDMIDDLE*
  1213. Words signed with COMPOUNDMIDDLE may be middle elements in
  1214. compound words.
  1215. Use COMPOUNDRULE instead. |spell-COMPOUNDRULE|
  1216. COMPOUNDRULES (Hunspell) *spell-COMPOUNDRULES*
  1217. Number of COMPOUNDRULE lines following. Ignored, but the
  1218. argument must be a number.
  1219. COMPOUNDSYLLABLE (Hunspell) *spell-COMPOUNDSYLLABLE*
  1220. Use SYLLABLE and COMPOUNDSYLMAX instead. |spell-SYLLABLE|
  1221. |spell-COMPOUNDSYLMAX|
  1222. KEY (Hunspell) *spell-KEY*
  1223. Define characters that are close together on the keyboard.
  1224. Used to give better suggestions. Not supported.
  1225. LANG (Hunspell) *spell-LANG*
  1226. This specifies language-specific behavior. This actually
  1227. moves part of the language knowledge into the program,
  1228. therefore Vim does not support it. Each language property
  1229. must be specified separately.
  1230. LEMMA_PRESENT (Hunspell) *spell-LEMMA_PRESENT*
  1231. Only needed for morphological analysis.
  1232. MAXNGRAMSUGS (Hunspell) *spell-MAXNGRAMSUGS*
  1233. Set number of n-gram suggestions. Not supported.
  1234. PSEUDOROOT (Hunspell) *spell-PSEUDOROOT*
  1235. Use NEEDAFFIX instead. |spell-NEEDAFFIX|
  1236. SUGSWITHDOTS (Hunspell) *spell-SUGSWITHDOTS*
  1237. Adds dots to suggestions. Vim doesn't need this.
  1238. SYLLABLENUM (Hunspell) *spell-SYLLABLENUM*
  1239. Not supported.
  1240. TRY (Myspell, Hunspell, others) *spell-TRY*
  1241. Vim ignores the TRY item. For making suggestions the
  1242. actual characters in the words are used, which is much more
  1243. efficient.
  1244. WORDCHARS (Hunspell) *spell-WORDCHARS*
  1245. Used to recognize words. Vim doesn't need it, because there
  1246. is no need to separate words before checking them (using a
  1247. trie instead of a hashtable).
  1248. ==============================================================================
  1249. 5. Spell checker design *develop-spell*
  1250. When spell checking was going to be added to Vim a survey was done over the
  1251. available spell checking libraries and programs. Unfortunately, the result
  1252. was that none of them provided sufficient capabilities to be used as the spell
  1253. checking engine in Vim, for various reasons:
  1254. - Missing support for multi-byte encodings. At least UTF-8 must be supported,
  1255. so that more than one language can be used in the same file.
  1256. Doing on-the-fly conversion is not always possible (would require iconv
  1257. support).
  1258. - For the programs and libraries: Using them as-is would require installing
  1259. them separately from Vim. That's mostly not impossible, but a drawback.
  1260. - Performance: A few tests showed that it's possible to check spelling on the
  1261. fly (while redrawing), just like syntax highlighting. But the mechanisms
  1262. used by other code are much slower. Myspell uses a hashtable, for example.
  1263. The affix compression that most spell checkers use makes it slower too.
  1264. - For using an external program like aspell a communication mechanism would
  1265. have to be set up. That's complicated to do in a portable way (Unix-only
  1266. would be relatively simple, but that's not good enough). And performance
  1267. will become a problem (lots of process switching involved).
  1268. - Missing support for words with non-word characters, such as "Etten-Leur" and
  1269. "et al.", would require marking the pieces of them OK, lowering the
  1270. reliability.
  1271. - Missing support for regions or dialects. Makes it difficult to accept
  1272. all English words and highlight non-Canadian words differently.
  1273. - Missing support for rare words. Many words are correct but hardly ever used
  1274. and could be a misspelled often-used word.
  1275. - For making suggestions the speed is less important and requiring to install
  1276. another program or library would be acceptable. But the word lists probably
  1277. differ, the suggestions may be wrong words.
  1278. Spelling suggestions *develop-spell-suggestions*
  1279. For making suggestions there are two basic mechanisms:
  1280. 1. Try changing the bad word a little bit and check for a match with a good
  1281. word. Or go through the list of good words, change them a little bit and
  1282. check for a match with the bad word. The changes are deleting a character,
  1283. inserting a character, swapping two characters, etc.
  1284. 2. Perform soundfolding on both the bad word and the good words and then find
  1285. matches, possibly with a few changes like with the first mechanism.
  1286. The first is good for finding typing mistakes. After experimenting with
  1287. hashtables and looking at solutions from other spell checkers the conclusion
  1288. was that a trie (a kind of tree structure) is ideal for this. Both for
  1289. reducing memory use and being able to try sensible changes. For example, when
  1290. inserting a character only characters that lead to good words need to be
  1291. tried. Other mechanisms (with hashtables) need to try all possible letters at
  1292. every position in the word. Also, a hashtable has the requirement that word
  1293. boundaries are identified separately, while a trie does not require this.
  1294. That makes the mechanism a lot simpler.
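The point about inserting a character can be illustrated with a small Python
sketch. It is a toy trie built from nested dictionaries, not Vim's compressed
word tree, and all names are invented for the example. At each position only
the characters that actually continue some good word are tried, instead of
the whole alphabet.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True when a good word ends at this node


def build_trie(words):
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def walk(node, text):
    """Follow text from node; return the node reached, or None."""
    for ch in text:
        node = node.children.get(ch)
        if node is None:
            return None
    return node


def insert_suggestions(root, bad_word):
    """Good words obtained by inserting exactly one character."""
    found = set()
    node = root
    for pos in range(len(bad_word) + 1):
        # Only characters that are children of this node can lead to a word.
        for ch, child in node.children.items():
            end = walk(child, bad_word[pos:])
            if end is not None and end.is_word:
                found.add(bad_word[:pos] + ch + bad_word[pos:])
        if pos < len(bad_word):
            node = node.children.get(bad_word[pos])
            if node is None:
                break  # no good word starts with this prefix
    return sorted(found)


trie = build_trie(["word", "world", "ward", "sword"])
print(insert_suggestions(trie, "wrd"))   # ['ward', 'word']
print(insert_suggestions(trie, "word"))  # ['sword', 'world']
```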
  1295. Soundfolding is useful when someone knows how a word sounds but doesn't
  1296. know how it is spelled. For example, the word "dictionary" might be written
  1297. as "daktonerie". The number of changes that the first method would need to
  1298. try is very big, so it's hard to find the good word that way. After soundfolding
  1299. the words become "tktnr" and "tkxnry", these differ by only two letters.
  1300. To find words by their soundfolded equivalent (soundalike word) we need a list
  1301. of all soundfolded words. A few experiments have been done to find out what
  1302. the best method is. Alternatives:
  1303. 1. Do the sound folding on the fly when looking for suggestions. This means
  1304. walking through the trie of good words, soundfolding each word and
  1305. checking how different it is from the bad word. This is very efficient for
  1306. memory use, but takes a long time. On a fast PC it takes a couple of
  1307. seconds for English, which can be acceptable for interactive use. But for
  1308. some languages it takes more than ten seconds (e.g., German, Catalan),
  1309. which is unacceptably slow. For batch processing (automatic corrections)
  1310. it's too slow for all languages.
  1311. 2. Use a trie for the soundfolded words, so that searching can be done just
  1312. like how it works without soundfolding. This requires remembering a list
  1313. of good words for each soundfolded word. This makes finding matches very
  1314. fast but requires quite a lot of memory, in the order of 1 to 10 Mbyte.
  1315. For some languages more than the original word list.
  1316. 3. Like the second alternative, but reduce the amount of memory by using affix
  1317. compression and store only the soundfolded basic word. This is what Aspell
  1318. does. Disadvantage is that affixes need to be stripped from the bad word
  1319. before soundfolding it, which means that mistakes at the start and/or end
  1320. of the word will cause the mechanism to fail. Also, this becomes slow when
  1321. the bad word is quite different from the good word.
  1322. The choice made is to use the second mechanism and use a separate file. This
  1323. way a user with sufficient memory can get very good suggestions while a user
  1324. who is short of memory or just wants the spell checking and no suggestions
  1325. doesn't use so much memory.
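The idea behind the second alternative, remembering a list of good words for
each soundfolded word, can be sketched in Python. A plain dictionary stands in
for the trie, only exact sound matches are looked up, and toy_fold() is a
made-up folding function (drop vowels, collapse repeats) that is much cruder
than SAL or SOFO rules. All names are hypothetical.

```python
from collections import defaultdict


def toy_fold(word):
    """Crude stand-in for soundfolding: drop vowels, collapse repeats."""
    out = []
    for ch in word.casefold():
        if ch in "aeiouy" or not ch.isalpha():
            continue
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)


def build_sound_index(good_words, fold=toy_fold):
    """Remember the list of good words for each soundfolded word."""
    index = defaultdict(list)
    for word in good_words:
        index[fold(word)].append(word)
    return index


def sound_suggestions(bad_word, index, fold=toy_fold):
    """Good words whose folded form equals that of the bad word."""
    return list(index.get(fold(bad_word), []))


index = build_sound_index(["receive", "separate", "recipe"])
print(sound_suggestions("recieve", index))   # ['receive']
print(sound_suggestions("seperate", index))  # ['separate']
```

The extra memory mentioned above is visible here as well: the index holds
every good word a second time, keyed by its folded form.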
  1326. Word frequency
  1327. For sorting suggestions it helps to know which words are common. In theory we
  1328. could store a word frequency with the word in the dictionary. However, this
  1329. requires storing a count per word. That degrades word tree compression a lot.
  1330. And maintaining the word frequency for all languages will be a heavy task.
  1331. Also, it would be nice to prefer words that are already in the text. This way
  1332. the words that appear in the specific text are preferred for suggestions.
  1333. What has been implemented is to count words that have been seen during
  1334. displaying. A hashtable is used to quickly find the word count. The count is
  1335. initialized from words listed in COMMON items in the affix file, so that it
  1336. also works when starting a new file.
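A rough Python sketch of this bookkeeping, with collections.Counter playing
the role of the hashtable. The function names and the initial count of 1 for
COMMON words are assumptions made for the example; they are not taken from
Vim.

```python
from collections import Counter


def init_counts(common_words, seed=1):
    """Start the counts from the words listed in COMMON items."""
    return Counter({word: seed for word in common_words})


def note_displayed(counts, text):
    """Count the words of a piece of text that has been displayed."""
    counts.update(text.split())


def rank(suggestions, counts):
    """Put often-seen words first; ties keep their original order."""
    return sorted(suggestions, key=lambda word: -counts[word])


counts = init_counts("the of to and".split())
note_displayed(counts, "the cat saw the hat")
print(rank(["hat", "that", "the"], counts))  # ['the', 'hat', 'that']
```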
  1337. This isn't ideal, because the longer Vim is running the higher the counts
  1338. become. But in practice it is a noticeable improvement over not using the word
  1339. count.
  1340. vim:tw=78:sw=4:ts=8:noet:ft=help:norl: