basis/unicode/UCD/NamedSequencesProv.txt

# NamedSequencesProv-13.0.0.txt
# Date: 2020-01-22, 19:32:00 GMT [KW, LI]
# © 2020 Unicode®, Inc.
# For terms of use, see http://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
# For documentation, see http://www.unicode.org/reports/tr44/
#
# Provisional Unicode Named Character Sequences
#
# Note: This data file contains those named character
#   sequences which have been designated to be provisional,
#   rather than fully approved.
#
# Format:
# Name of Sequence; Code Point Sequence for USI
#
# Code point sequences in the Unicode Character Database
# use spaces as delimiters. The corresponding format for a
# UCS Sequence Identifier (USI) in ISO/IEC 10646 uses
# comma delimitation and angle brackets. Thus, a Unicode
# named character sequence of the form:
#
# EXAMPLE NAME;1000 1001 1002
#
# in this data file, would correspond to an ISO/IEC 10646 USI
# as follows:
#
# <1000, 1001, 1002>
#
# For more information, see UAX #34: Unicode Named Character
# Sequences, at http://www.unicode.org/unicode/reports/tr34/
#
# Note: The order of entries in this file is not significant.
# However, entries are generally in script order corresponding
# to block order in the Unicode Standard, to make it easier
# to find entries currently in the list.

# ================================================

# Provisional entries for NamedSequences.txt.

# Entries that correspond to Indic characters with nuktas
# that are also listed in CompositionExclusions.txt.
# These characters decompose for normalized text, even
# in NFC. Having named sequences for these helps in
# certain specifications, including Label Generation Rules (LGR)
# for Internationalized Domain Names (IDN).
#
# Provisional 2020-01-16

DEVANAGARI SEQUENCE FOR LETTER QA; 0915 093C
DEVANAGARI SEQUENCE FOR LETTER KHHA; 0916 093C
DEVANAGARI SEQUENCE FOR LETTER GHHA; 0917 093C
DEVANAGARI SEQUENCE FOR LETTER ZA; 091C 093C
DEVANAGARI SEQUENCE FOR LETTER DDDHA; 0921 093C
DEVANAGARI SEQUENCE FOR LETTER RHA; 0922 093C
DEVANAGARI SEQUENCE FOR LETTER FA; 092B 093C
DEVANAGARI SEQUENCE FOR LETTER YYA; 092F 093C
BENGALI SEQUENCE FOR LETTER RRA; 09A1 09BC
BENGALI SEQUENCE FOR LETTER RHA; 09A2 09BC
BENGALI SEQUENCE FOR LETTER YYA; 09AF 09BC
GURMUKHI SEQUENCE FOR LETTER LLA; 0A32 0A3C
GURMUKHI SEQUENCE FOR LETTER SHA; 0A38 0A3C
GURMUKHI SEQUENCE FOR LETTER KHHA; 0A16 0A3C
GURMUKHI SEQUENCE FOR LETTER GHHA; 0A17 0A3C
GURMUKHI SEQUENCE FOR LETTER ZA; 0A1C 0A3C
GURMUKHI SEQUENCE FOR LETTER FA; 0A2B 0A3C
ORIYA SEQUENCE FOR LETTER RRA; 0B21 0B3C
ORIYA SEQUENCE FOR LETTER RHA; 0B22 0B3C

# ================================================

# Entries from Unicode 4.1.0 version of NamedSequences.txt,
# subsequently disapproved because of potential errors in
# representation.

# GURMUKHI HALF YA;0A2F 0A4D
# GURMUKHI PARI YA;0A4D 0A2F

# Entry removed 2006-05-18:
#
# LATIN SMALL LETTER A WITH ACUTE AND OGONEK;00E1 0328
#
# This entry was removed because the sequence was not in NFC,
# as required. It was replaced with the NFC version of
# the sequence, based on the Lithuanian additions accepted
# for Unicode 5.0.

# EOF