Paul Boersma 4f5ff55abd rid register 6 years ago
30112d10.pdf 6eaa7efd9b ink 6 years ago
PropList.txt 6eaa7efd9b ink 6 years ago
README.TXT 5c009ecddd Unicode features 6 years ago
UAX #44: Unicode Character Database.html 5c009ecddd Unicode features 6 years ago
UCD_features_generated_h.praat 4f5ff55abd rid register 6 years ago
UnicodeData File Format.html 5c009ecddd Unicode features 6 years ago
UnicodeData.txt 5c009ecddd Unicode features 6 years ago
UnicodeStandard-10.0.pdf 5c009ecddd Unicode features 6 years ago
gen-unicode-ctype.c.html 6eaa7efd9b ink 6 years ago

README.TXT

File generate/unicode/README.TXT
Paul Boersma, 20180405

Steps to generate the source file UnicodeData.cpp,
which is to be put in the sys folder:

1. Download UnicodeData.txt from unicode.org.

2. Prepend the following header, using a simple text editor:

code;name;category;combining;bidi;decomp;num1;num2;num3;mirror;dum1;dum2;upper;lower;title

With the header in place, the file can be read as a Table in Praat with
"Read Table from semicolon-separated file...".
The Table can then be viewed with "View & Edit",
and information is easy to extract with commands such as
"Extract rows where column (text)..." and "Extract rows where...".

For information on the meanings of the features, see the attached HTML file and
the attached UnicodeStandard-10.0.pdf (both from June 2017).

3. Run the script UnicodeData.cpp.praat. This creates UnicodeData.cpp.
The details of the process are discussed in UnicodeData.cpp.praat.
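
For a quick check of the header from step 2 outside Praat, the following Python
sketch prepends it to a few sample lines in the UnicodeData.txt layout and reads
the result as a semicolon-separated table. It is only an illustration, not part
of the generation process: the function names (prepend_header, read_table,
rows_where) are made up for this example, and rows_where merely imitates what
"Extract rows where column (text)..." does for an exact match.

```python
import csv
import io

# The header line from step 2: one name for each of the 15 fields.
HEADER = ("code;name;category;combining;bidi;decomp;num1;num2;num3;"
          "mirror;dum1;dum2;upper;lower;title")


def prepend_header(unicode_data_text):
    """Return the text with the header line in front (added only once)."""
    if unicode_data_text.startswith(HEADER):
        return unicode_data_text
    return HEADER + "\n" + unicode_data_text


def read_table(text_with_header):
    """Parse semicolon-separated text into a list of dicts keyed by column."""
    reader = csv.DictReader(io.StringIO(text_with_header),
                            delimiter=";", quoting=csv.QUOTE_NONE)
    return list(reader)


def rows_where(rows, column, value):
    """Hypothetical helper: keep the rows whose column equals value exactly."""
    return [row for row in rows if row[column] == value]


# Three sample lines in the UnicodeData.txt layout (no header yet).
SAMPLE = (
    "0031;DIGIT ONE;Nd;0;EN;;1;1;1;N;;;;;\n"
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;\n"
    "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041\n"
)

table = read_table(prepend_header(SAMPLE))
for row in rows_where(table, "category", "Lu"):
    # For an uppercase letter, "lower" holds the code of its lowercase form.
    print(row["code"], row["name"], "lower =", row["lower"])
```

To try this on the real file, read UnicodeData.txt into a string and pass it to
prepend_header in place of SAMPLE.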

## Timing.

Step 3 is computationally intensive, and therefore a good test of the speed of the
Praat scripting language.

The measured time on my Late-2013 MacBook Pro with 2.3 GHz Intel Core i7:

Praat 6.0.40: XX seconds.
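
One possible way to take such a measurement from outside Praat is to time the
whole run as a subprocess. The Python sketch below is an assumption, not the
method used for the figure above: time_command is a made-up helper, and the
Praat command line in the comment assumes that a praat executable is on the
PATH and accepts the script via --run.

```python
import subprocess
import sys
import time


def time_command(command):
    """Run command to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


# Stand-in command so that the sketch runs anywhere. For the real measurement
# one would pass something like ["praat", "--run", "UnicodeData.cpp.praat"]
# (an assumption about the local Praat installation).
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.2f} seconds")
```

The result includes the start-up time of the program, so it slightly
overestimates the time spent in the script itself.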