Commit History

Author SHA1 Message Date
  dan 3fb7bd5ee0 Fix sanitizer complaint in fts3 code. 4 years ago
  drh 2d77d80a65 Use 64-bit math to compute the sizes of memory allocations in extensions. 6 years ago
  dan e89feee5c3 Add the "remove_diacritics=2" option to the unicode61 tokenizer in both FTS5 6 years ago
  dan 920c83f18f Fix some problems in fts3 found by address-sanitizer. 8 years ago
  drh 490fe86f1a Fix harmless compiler warnings. 10 years ago
  mistachkin 86ac612e8a Fix some harmless compiler warnings in the FTS3 Unicode module. 10 years ago
  dan 2eaf03d72b Change fts3/4 so that the "unicode61" is included in builds by default. It may now be excluded by defining SQLITE_DISABLE_FTS3_UNICODE. 10 years ago
  mistachkin 48864df97d Many spelling fixes in comments. No changes to code. 12 years ago
  dan 25cdf46ae4 Add the "tokenchars=" and "separators=" options, for customizing the set of characters considered to be token separators, to the unicode61 tokenizer. 13 years ago
  dan 2c897e3e5f Disable FTS unicode61 by default. It is enabled by specifying compile time option SQLITE_ENABLE_FTS4_UNICODE61. 13 years ago
  dan 754d3adf7c Have the FTS unicode61 strip out diacritics when tokenizing text. This can be disabled by specifying the tokenizer option "remove_diacritics=0". 13 years ago
  dan 7946c53009 If SQLITE_DISABLE_FTS3_UNICODE is defined, do not build the "unicode61" tokenizer. 13 years ago
  dan 1c7016c9a5 Add special fast paths to sqlite3FtsUnicodeTolower() and Isalnum() for codepoints in the ASCII range. 13 years ago
  dan 3d403c71a8 Add an experimental tokenizer to fts4 - "unicode". This tokenizer works in the same way except that it understands unicode "simple case folding" and recognizes all characters not classified as "Letters" or "Numbers" by unicode as token separators. 13 years ago