unicode: keep only the files we need locally

[factor.git] / basis / unicode / UCD / NamesList.html
diff --git a/basis/unicode/UCD/NamesList.html b/basis/unicode/UCD/NamesList.html

deleted file mode 100644 (file)

index 8b1d51a..0000000
--- a/basis/unicode/UCD/NamesList.html
+++ /dev/null
@@ -1,691 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
-
-       "http://www.w3.org/TR/html4/loose.dtd"> 
-
-<html>
-
-<head>
-<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
-<meta http-equiv="Content-Language" content="en-us">
-<title>UCD: Unicode NamesList File Format</title>
-<link rel="stylesheet" type="text/css" href="https://www.unicode.org/reports/reports-v2.css">
-</head>
-
-<body>
-
-       <table class="header">
-               <tr>
-          <td class="icon" style="width:38px; height:35px">
-          <a href="https://www.unicode.org/">
-          <img border="0" src="https://www.unicode.org/webscripts/logo60s2.gif" align="middle" 
-          alt="[Unicode]" width="34" height="33"></a>
-          </td>
-
-          <td class="icon" style="vertical-align:middle">
-          <a class="bar"> </a>
-          <a class="bar" href="https://www.unicode.org/ucd/"><font size="3">Unicode Character Database</font></a>
-          </td>
-               </tr>
-               <tr>
-                       <td colspan="2" class="gray">&nbsp;</td>
-               </tr>
-       </table>
-<div class="body">
-  <h1>Unicode® NamesList File Format</h1>   
-  <table class="simple" width="90%">
-    <tbody>
-      <tr>
-        <td valign="top" width="144">Revision</td>
-        <td valign="top">15.0.0</td>
-      </tr>
-      <tr>
-        <td valign="top" width="144">Authors</td>
-        <td valign="top">Asmus Freytag, Ken Whistler</td>
-      </tr>
-      <tr>
-        <td valign="top" width="144">Date</td>
-        <td valign="top">2022-08-08</td>
-      </tr>
-      <tr>
-        <td valign="top" width="144">This Version</td>
-        <td valign="top">
-               <a href="http://www.unicode.org/Public/15.0.0/ucd/NamesList.html">
-               http://www.unicode.org/Public/15.0.0/ucd/NamesList.html</a></td>
-      </tr>
-      <tr>
-        <td valign="top" width="144">Previous Version</td>
-        <td valign="top">
-               <a href="http://www.unicode.org/Public/14.0.0/ucd/NamesList.html">
-               http://www.unicode.org/Public/14.0.0/ucd/NamesList.html</a></td>
-      </tr>
-      <tr>
-        <td valign="top" width="144">Latest Version</td>
-        <td valign="top"><a href="http://www.unicode.org/Public/UCD/latest/ucd/NamesList.html">http://www.unicode.org/Public/UCD/latest/ucd/NamesList.html</a></td>
-      </tr>
-    </tbody>
-  </table>
-  <p>&nbsp;</p>
-  <h3><i>Summary</i></h3>
-  <blockquote>
-    <p>This file describes the format and contents of NamesList.txt</p>
-  </blockquote>
-  <h3><i>Status</i></h3>
-  <blockquote>
-    <p><i>The file and the files described herein are part of the <a href="http://www.unicode.org/ucd/">Unicode 
-    Character Database</a> (UCD). The Unicode <a href="http://www.unicode.org/terms_of_use.html"> 
-    Terms of Use</a> apply.</i></p>
-  </blockquote>
-  <hr width="50%">
-
-<h2>1.0 <a name="Introduction" href="#Introduction">Introduction</a></h2>
-
-<p>The Unicode name list file NamesList.txt (also NamesList.lst) is a plain
-text file used to drive the layout of the character code charts in the Unicode
-Standard. The information in this file is a combination of several fields from
-the UnicodeData.txt and Blocks.txt files, together with additional annotations
-for many characters.</p>
-<p>This document describes the syntax rules for the file 
-format, but also gives brief information on how each construct is rendered
-when laid out for the code charts. Some of the syntax elements are used only in
-preparation of the drafts of the code charts and are not present in the final,
-released form of the NamesList.txt file.</p>
-
-<p>Over time, the syntax has been extended by adding new features. The syntax for formal aliases and index tabs was introduced with Unicode
-5.0. The syntax for marginal sidebar comments is utilized extensively in
-draft versions of the NamesList.txt file. The support for UTF-8 encoded files and the syntax for the UTF-8 charset 
-declaration in a comment at the head of the file were introduced after Unicode 
-6.1.0 was published, as was the syntax for the specification of variation sequences and alternate glyphs and their respective summaries. The repertoire restriction
-in comments and aliases in the names list format was loosened from the prior
-limitation to U+0020..U+00FF, to include the wider range U+0020..U+02FF, as of Unicode 11.0.</p>
-
-<p>The same input file can be used for the preparation of drafts and final editions for ISO/IEC 
-  10646. Earlier versions of that standard used a different style, referred to below as ISO-style. That style necessitated the presence of some 
-  information in the name list file that is not needed (and in fact removed
-  during parsing) for the Unicode code charts.</p>
-
-<p>With access to the layout program (<a href="http://www.unicode.org/unibook/">Unibook</a>) it is a simple matter of 
-creating name lists for the purpose of formatting working drafts or other documents containing 
-proposed characters.</p>  
-  <p>The content of the NamesList.txt file is optimized for code chart creation. 
-  Some information that can be inferred by the reader from context has been 
-  suppressed to make the code charts more readable. See the chapter on Code 
-       Charts in the <a href="http://www.unicode.org/versions/latest">Unicode 
-       Standard</a>.</p> 
-
-<h3>1.1 <a name="Overview" href="#Overview">NamesList File Overview</a></h3>
-
-<p>The NamesList files are plain text files which in their most simple form look 
-like this:</p>
-
-<p>@@&lt;tab&gt;0020&lt;tab&gt;BASIC LATIN&lt;tab&gt;007F<br>
-; this is a file comment (ignored)<br>
-0020&lt;tab&gt;SPACE<br>
-0021&lt;tab&gt;EXCLAMATION MARK<br>
-0022&lt;tab&gt;QUOTATION MARK<br>
-. . . <br>
-007F&lt;tab&gt;DELETE</p>
-
-<p>The semicolon (as first character), @ and &lt;tab&gt; characters are used
-by the file syntax and must be provided as shown. Hexadecimal digits must be 
-in UPPERCASE. A double @@ introduces a block header, with the title, and 
-start and ending code of the block provided as shown.</p>
-
-<p>For a minimal name list, only the NAME_LINE and BLOCKHEADER and 
-their constituent syntax elements are needed.</p>
-
-<p>The full syntax with all the options is provided in the following sections.</p>
-
-<h2>2.0 <a name="FileStructure" href="#FileStructure">NamesList File Structure</a></h2>
-
-<p>This section defines the overall file structure</p>
-
-<pre><strong>NAMELIST:     TITLE_PAGE* EXTENDED_BLOCK*
-</strong>
-<strong>TITLE_PAGE:   TITLE 
-               | TITLE_PAGE SUBTITLE 
-               | TITLE_PAGE SUBHEADER 
-               | TITLE_PAGE IGNORED_LINE 
-               | TITLE_PAGE EMPTY_LINE
-               | TITLE_PAGE NOTICE_LINE
-               | TITLE_PAGE COMMENT_LINE
-               | TITLE_PAGE PAGEBREAK 
-               | TITLE_PAGE FILE_COMMENT 
-               | FILE_COMMENT
-
-
-EXTENDED_BLOCK:        BLOCK 
-               | BLOCK SUMMARY
-
-
-BLOCK: BLOCKHEADER 
-               | BLOCKHEADER INDEX_TAB
-               | BLOCK CHAR_ENTRY 
-               | BLOCK SUBHEADER 
-               | BLOCK NOTICE_LINE 
-               | BLOCK EMPTY_LINE 
-               | BLOCK IGNORED_LINE
-               | BLOCK SIDEBAR_LINE
-               | BLOCK PAGEBREAK
-               | BLOCK FILE_COMMENT
-               | BLOCK CROSS_REF
-
-
-CHAR_ENTRY:   NAME_LINE | RESERVED_LINE
-               | CHAR_ENTRY ALIAS_LINE
-               | CHAR_ENTRY FORMALALIAS_LINE
-               | CHAR_ENTRY COMMENT_LINE
-               | CHAR_ENTRY CROSS_REF
-               | CHAR_ENTRY DECOMPOSITION
-               | CHAR_ENTRY COMPAT_MAPPING
-               | CHAR_ENTRY IGNORED_LINE
-               | CHAR_ENTRY EMPTY_LINE
-               | CHAR_ENTRY NOTICE_LINE
-               | CHAR_ENTRY FILE_COMMENT 
-               | CHAR_ENTRY VARIATION_LINE
-</strong></pre>
-
-<p>In other words:</p>
-<p> 
-  Neither TITLE nor SUBTITLE may occur after the first BLOCKHEADER. </p>
-<p>Only TITLE, SUBTITLE, SUBHEADER, PAGEBREAK, COMMENT_LINE, NOTICE_LINE, 
-    EMPTY_LINE, IGNORED_LINE and FILE_COMMENT may occur before the first BLOCKHEADER.</p>
-<ul>
-  <li>CROSS_REF, DECOMPOSITION, COMPAT_MAPPING, VARIATION_LINE, ALIAS and FORMALALIAS_LINE lines 
-    occurring before the first block header are treated as if they were 
-    COMMENT_LINEs.</li>
-</ul>
-<p>Directly following either a NAME_LINE or a RESERVED_LINE an uninterrupted 
-    sequence of the following lines may occur (in any order and repeated as often
-    as needed): ALIAS_LINE, CROSS_REF, DECOMPOSITION, COMPAT_MAPPING, FORMALALIAS_LINE, NOTICE_LINE, 
-    EMPTY_LINE, IGNORED_LINE, VARIATION_LINE and FILE_COMMENT.</p>
-<ul>
-  <li>The conventional order of elements in a char entry: NAME_LINE, 
-    FORMALALIAS_LINE, ALIAS, COMMENT_LINE or NOTICE_LINE, CROSS_REFs, VARIATION_LINE, and optionally 
-    ending in either DECOMPOSITION or COMPAT_MAPPING is not enforced by the layout program 
-    (<a href="http://www.unicode.org/unibook/">Unibook</a>). </li>
-</ul>
-<p>Except for CROSS_REF, NOTICE_LINE, SIDEBAR_LINE, EMPTY_LINE, IGNORED_LINE and 
-    FILE_COMMENT, none of these lines may 
-    occur in any other place.</p>
-<ul>
-  <li>A NOTICE_LINE or CROSS_REF displays differently depending on whether it follows a header or title 
-    or is part of a CHAR_ENTRY</li>
-  </ul>
-<p>A PAGEBREAK may appear anywhere, except the middle of a CHARACTER_ENTRY. 
-    A PAGEBREAK before the file title lines may not be supported. INDEX_TABs may 
-    appear after any block header.</p>
-<p>If the first line of a file is a file comment, it may contain a UTF-8 
-    charset declaration (see below). Alternatively, or in addition, a BOM may be 
-    present at the very beginning of the file, forcing the encoding to be 
-    interpreted as UTF-16 (little-endian only) or UTF-8. When
-    declared as UTF-8, the names list format will support use of characters in
-    the range U+0020..U+02FF in LINE and LABEL elements. Otherwise,
-    the supported repertoire is limited to Latin-1, and attempted use of characters outside
-    the Latin-1 range will result in data corruption.</p>
-<p>Several of these elements, while part of the formal definition of the 
-  file format, do not occur in final published versions of 
-   NamesList.txt in the UCD.</p>
-
-<h4>Blocks followed by Summaries</h4>
-<p>A block may be extended by a summary of standard variation sequences or selected alternate glyphs (or both) defined for characters in the block:</p>
-<pre><strong>
-SUMMARY:   ALTGLYPH_SUMMARY
-               | VARIATION SUMMARY
-               | ALTGLYPH_SUMMARY VARIATION_SUMMARY
-               | MIXED_SUMMARY
-
-ALTGLYPH_SUMMARY:   ALTGLYPH_SUBHEADER
-               | ALTGLYPH_SUMMARY SUMMARY_LINE
-
-VARIATION_SUMMARY:   VARIATION_SUBHEADER
-               | VARIATION_SUMMARY SUMMARY_LINE
-                               
-MIXED_SUMMARY:   MIXED_SUBHEADER
-               | MIXED_SUMMARY SUMMARY_LINE
-
-SUMMARY_LINE:   SUBHEADER
-               | NOTICE_LINE
-               | FILE_COMMENT
-               | EMPTY_LINE</strong>
-</pre>
-
-<p>When formatted for display, each summary will recap the information presented in the VARIATION_LINE elements
-of the preceding block, grouped by alternate glyph variants and standardized variation sequences, and
-preceded by the corresponding subheader. Additional SUBHEADER and NOTICE lines, if provided, immediately
-follow the ALTGLYPH_SUBHEADER, VARIATION_SUBHEADER or MIXED_SUBHEADER. There is no provision to provide subheaders that are
-interspersed between items in the summary.</p>
-
-<p>These syntax constructs are entirely optional. If the ALTGLYPH_SUBHEADER or VARIATION_SUBHEADER are 
-omitted from the names list, but the preceding block nevertheless contains VARIATION_LINE elements 
-as described below, Unibook will automatically generate any required summaries using a default format for the headers.</p>
-
-<p>Thus, the main purpose for providing ALTGLYPH_SUBHEADER or VARIATION_SUBHEADER elements would be to 
-provide specific contents for these summary titles as well as allow the ability to add additional 
-information via SUBHEADER and NOTICE elements. The final published version of the Unicode names list 
-is machine generated and will always explicitly provide any summary subheaders.</p>
-
-<h3>2.1 <a name="FileElements" href="#FileElements">NamesList File Elements</a></h3>
-
-<p>This section provides the details of the syntax for the individual elements.</p>
-
-<pre><strong>ELEMENT           SYNTAX</strong> // How rendered
-
-<strong>NAME_LINE:     CHAR TAB NAME LF</strong>
-                       // The CHAR and the corresponding image are echoed, 
-                       // followed by the name as given in NAME
-
-<strong>               | CHAR TAB &quot;&lt;&quot; LCNAME &quot;&gt;&quot; LF</strong>
-                       // Control and noncharacters use this form of       
-                       // lowercase, bracketed pseudo character name
-
-<strong>               | CHAR TAB NAME SP COMMENT LF</strong>
-                       // Names may have a comment, which is stripped off
-                       // unless the file is parsed for an ISO style list
-                        
-<strong>               | CHAR TAB &quot;&lt;&quot; LCNAME &quot;&gt;&quot; SP COMMENT LF</strong>
-                       // Control and noncharacters may also have comments
-       
-<strong>RESERVED_LINE: CHAR TAB &quot;&lt;reserved&gt;&quot; LF</strong>
-                       // The CHAR is echoed followed by an icon for the
-                       // reserved character and a fixed string e.g. &quot;&lt;reserved&gt;&quot;
-       
-<strong>COMMENT_LINE:  TAB &quot;*&quot; SP EXPAND_LINE</strong>
-                       // * is replaced by BULLET, output line as comment
-
-<strong>               | TAB EXPAND_LINE</strong>       
-                       // Output line as comment
-
-<strong>ALIAS_LINE:    TAB &quot;=&quot; SP LINE</strong>      
-                       // Replace = by itself, output line as alias
-
-<strong>FORMALALIAS_LINE:
-               TAB &quot;%&quot; SP NAME LF</strong>   
-                       // Replace % by U+203B, output line as formal alias
-
-<strong>CROSS_REF:     TAB &quot;x&quot; SP CHAR SP LCNAME LF  
-               | TAB &quot;x&quot; SP CHAR SP &quot;&lt;&quot; LCNAME &quot;&gt;&quot; LF</strong>
-                       // x is replaced by a right arrow
-
-<strong>               | TAB &quot;x&quot; SP &quot;(&quot; LCNAME SP &quot;-&quot; SP CHAR &quot;)&quot; LF    
-               | TAB &quot;x&quot; SP &quot;(&quot; &quot;&lt;&quot; LCNAME &quot;&gt;&quot; SP &quot;-&quot; SP CHAR &quot;)&quot; LF</strong>  
-                       // x is replaced by a right arrow;
-                       // (second type as used for control and noncharacters)
-
-                       // In the forms with parentheses the &quot;(&quot;,&quot;-&quot; and &quot;)&quot; are removed
-                       // and the order of CHAR and LCNAME is reversed;
-                       // i.e. all inputs result in the same order of output
-
-<strong>               | TAB &quot;x&quot; SP CHAR LF</strong>
-                       // x is replaced by a right arrow
-                       // (this type is the only one without LCNAME 
-                       // and is used for ideographs)
-
-<strong>VARIATION_LINE:        TAB &quot;~&quot; SP CHAR VARSEL SP LABEL LF   
-               | TAB &quot;~&quot; SP CHAR VARSEL SP LABEL &quot;(&quot; LCTAG &quot;)&quot;LF</strong>      
-                       // output standardized variation sequence or simply the char code in case of alternate
-                       // glyphs, followed by the alternate glyph or variation glyph and the label and context
-
-<strong>FILE_COMMENT:  &quot;;&quot;  LINE</strong> 
-
-<strong>EMPTY_LINE:    LF</strong>       
-                       // Empty and ignored lines as well as 
-                       // file comments are ignored
-
-<strong>IGNORED_LINE:  TAB &quot;;&quot; LINE</strong>
-                       // Ignore LINE
-
-<strong>SIDEBAR_LINE:  &quot;;;&quot; LINE</strong>
-                       // Output LINE as marginal note
-
-<strong>DECOMPOSITION: TAB &quot;:&quot; SP EXPAND_LINE       
-               | TAB &quot;:&quot; SP &quot;&lt;&quot; TAG &quot;&gt;&quot; SP EXPAND_LINE</strong>       
-                       // Replace ':' by EQUIV, expand line into decomposition
-                       // The &lt;tag&gt; gives optional information,
-                       // e.g., about composition exclusion. 
-                       // by convention the tag has initial lowercase
-
-<strong>COMPAT_MAPPING:        TAB &quot;#&quot; SP EXPAND_LINE       
-               | TAB &quot;#&quot; SP &quot;&lt;&quot; TAG &quot;&gt;&quot; SP EXPAND_LINE</strong>    
-                       // Replace '#' by APPROX, output line as mapping
-                       // The &lt;tag&gt; is the optional compatibility decomposition tag.
-                       // by convention the tag has initial lowercase
-
-<strong>NOTICE_LINE:   &quot;@+&quot; TAB LINE</strong>       
-                       // Output LINE as notice
-
-<strong>               | &quot;@+&quot; TAB * SP LINE</strong>   
-                       // Output LINE as notice
-                       // &quot;*&quot; expands to a bullet character
-                       // Notices following a character code apply to the
-                       // character and are indented. Notices not following
-                       // a character code apply to the page/block/column 
-                       // and are italicized, but not indented
-
-<strong>TITLE:         &quot;@@@&quot; TAB LINE</strong>      
-                       // Output LINE as text
-                       // Title is used in page headers
-
-<strong>SUBTITLE:      &quot;@@@+&quot; TAB LINE</strong>      
-                       // Output LINE as subtitle
-
-<strong>SUBHEADER:     &quot;@&quot; TAB LINE</strong> 
-                       // Output LINE as column header
-
-<strong>VARIATION_SUBHEADER:</strong>  <strong>&quot;@~&quot; TAB LINE</strong>                        
-                       // Output LINE as column header (summary subheader)
-               <strong>| &quot;@~&quot;</strong>                                       
-                       // Output a default standard variation sequences summary subheader
-               <strong>| &quot;@~&quot; TAB &quot;!&quot;</strong>             
-                       // Suppress output of a default standard variant sequences summary subheader
-                       // and disable display of summary
-               <strong>| &quot;@~&quot; TAB &quot;!&quot; VARSEL_LIST</strong>
-               <strong>| &quot;@~&quot; TAB &quot;!&quot; VARSEL_LIST LINE</strong>
-                       // Output a standard summary subheader, using default or LINE respectively
-                       // Suppress any std variation sequences using selectors from the list
-                                                                                       
-<strong>ALTGLYPH_SUBHEADER:</strong>   <strong>&quot;@@~&quot; TAB LINE</strong>                       
-                       // Output LINE as column header (summary subheader)
-               <strong>| &quot;@@~&quot;</strong>                              
-                       // Output a default alternate glyph summary subheader
-               <strong>| &quot;@@~&quot; TAB &quot;!&quot;</strong>            
-                       // Suppress output of a default alternate glyph summary subheader
-                       // and disable display of summary
-
-<strong>MIXED_SUBHEADER:       </strong><strong>&quot;@@@~&quot; TAB LINE</strong>                     
-                       // Output LINE as column header (summary subheader)
-               <strong>| &quot;@@@~&quot;</strong>                                     
-                       // Output a default combined variation and alternate glyph summary subheader
-               <strong>| &quot;@@@~&quot; TAB &quot;!&quot;</strong>                   
-                       // Suppress output of a default alternate glyph summary subheader
-                       // and disable display of summary
-               <strong>| &quot;@@@~&quot; TAB &quot;!&quot; VARSEL_LIST</strong> 
-               <strong>| &quot;@@@~&quot; TAB &quot;!&quot; VARSEL_LIST LINE</strong>
-                       // Output a combined summary subheader, using default or LINE respectively
-                       // Suppress any std variation sequences using selectors from the list
-
-<strong>BLOCKHEADER:   &quot;@@&quot; TAB BLOCKSTART TAB BLOCKNAME TAB BLOCKEND LF</strong>
-                       // Cause a page break and optional
-                       // blank page, then output one or more charts
-                       // followed by the list of character names. 
-                       // Use BLOCKSTART and BLOCKEND to define
-                       // what characters belong to a block.
-                       // Use BLOCKNAME in page and table headers
-
-<strong>BLOCKNAME:     LABEL
-               | LABEL SP &quot;(&quot; LABEL &quot;)&quot;</strong>       
-                       // If an alternate label is present it replaces 
-                       // the BLOCKNAME when an ISO-style names list is
-                       // laid out; it is ignored in the Unicode charts
-
-<strong>BLOCKSTART:    CHAR</strong>   // First character position in block
-<strong>BLOCKEND:      CHAR</strong>   // Last character position in block
-<strong>PAGEBREAK:     &quot;@@&quot;</strong> // Insert a (column) break
-<strong>INDEX_TAB:     &quot;@@+&quot;</strong>        // Start a new index tab at latest BLOCKSTART
-
-<strong>EXPAND_LINE:   {ESC_CHAR | CHAR | STRING | ESC +}+ LF</strong>
-                       // Instances of CHAR (see Notes) are replaced by 
-                       // CHAR NBSP x NBSP where x is the single Unicode
-                       // character corresponding to CHAR.
-                       // If character is combining, it is replaced with
-                       // CHAR NBSP &lt;circ&gt; x NBSP where &lt;circ&gt; is the 
-                       // dotted circle
-</pre>
-
-
-       <b>Notes:</b><ul>
-       <li>Blocks must be aligned on 16-code point boundary and contain an integer 
-               multiple of 16-code point columns. The exception to that rule is for blocks of
-               ideographs, <i>etc.</i>, for which no names are listed in the file. The BLOCKEND for such blocks 
-               must correspond to the last assigned character, and not the actual end of the block.</li>
-       <li>Blocks must be non-overlapping and in ascending order. NAME_LINEs 
-               must be in ascending order and follow the block header for the block to 
-               which they belong. </li>
-       <li>Reserved entries are optional, and will normally be supplied automatically. They are 
-               required whenever followed by ALIAS_LINE, COMMENT_LINE, NOTICE_LINE or CROSS_REF.
-       </li>
-<li>An empty alternative glyph summary subheader expression will result in default header &quot;Selected Alternative Glyphs&quot;</li>
-<li>An empty standard variation subheader expression will result in the default header &quot;Standardized Variation Sequences&quot;</li>
-<li> A VARSEL_LIST may only contain code points for standard variation selectors (including script specific ones)</li>
-<li>When displaying a VARIATION_LINE for alternate glyphs, the &quot;ALTn&quot; selector is not displayed. </li>
-<li>If a glyph is unavailable for the variant glyph in a VARIATION_LINE it is replaced by the glyph for U+2591 LIGHT SHADE.</li>
-       </ul>
-
-
-<h3>2.2 <a name="FilePrimitives" href="#FilePrimitives">NamesList File Primitives</a></h3>
-
-<p>The following are the primitives and terminals for the NamesList syntax.</p>
-
-<pre><strong>LINE</strong>:            <strong>STRING LF
-COMMENT:       &quot;(&quot; LABEL &quot;)&quot;
-               | &quot;(&quot; LABEL &quot;)&quot; SP &quot;*&quot;
-               | &quot;*&quot;</strong>
-
-<strong>NAME</strong>:         &lt;sequence of uppercase ASCII letters, digits, space and hyphen&gt; 
-<strong>LCNAME</strong>:               &lt;sequence of lowercase ASCII letters, digits, space and hyphen&gt;
-               <strong>| LCNAME &quot;-&quot; CHAR</strong>
-
-<strong>TAG</strong>:          &lt;sequence of ASCII letters&gt;
-<strong>LCTAG</strong>:                &lt;sequence of lowercase ASCII letters&gt;
-<strong>STRING</strong>:               &lt;sequence of characters in the range U+0020..U+02FF, except controls&gt; 
-<strong>LABEL</strong>:                &lt;sequence of characters in the range U+0020..U+02FF, except controls, &quot;(&quot; or &quot;)&quot;&gt; 
-<strong>VARSEL</strong>:               <strong>CHAR
-               | ALT ( &quot;1&quot;|&quot;2&quot;|&quot;3&quot;|&quot;4&quot;|&quot;5&quot;|&quot;6&quot;|&quot;7&quot;|&quot;8&quot;|&quot;9&quot; )</strong>
-<strong>VARSEL_LIST</strong>:  <strong>&quot;{&quot; CHAR_LIST &quot;}&quot;</strong>
-<strong>CHAR_LIST</strong>:    <strong>CHAR
-               | CHAR_LIST SP CHAR</strong>
-<strong>CHAR</strong>:         <strong>X X X X</strong>
-               <strong>| X X X X X </strong>
-               <strong>| X X X X X X </strong>
-<strong>X</strong>:            <strong>&quot;0&quot;|&quot;1&quot;|&quot;2&quot;|&quot;3&quot;|&quot;4&quot;|&quot;5&quot;|&quot;6&quot;|&quot;7&quot;|&quot;8&quot;|&quot;9&quot;|&quot;A&quot;|&quot;B&quot;|&quot;C&quot;|&quot;D&quot;|&quot;E&quot;|&quot;F&quot;</strong> 
-<strong>ESC_CHAR</strong>:     <strong>ESC CHAR</strong>       
-<strong>ESC</strong>:          <strong>&quot;\&quot;</strong>  
-                       // Special semantics of backslash (\) are supported
-                       // only in EXPAND_LINE.
-<strong>TAB</strong>:          &lt;sequence of one or more ASCII tab characters 0x09&gt;      
-<strong>SP</strong>:           &lt;ASCII 20&gt;
-<strong>LF</strong>:           &lt;any sequence of ASCII 0A and 0D&gt;
-</pre>
-
-<p><b>Notes:</b></p>
-<ul>
-       <li>Multiple or leading spaces, multiple or leading hyphens, as well as 
-       word-initial digits in NAMEs or LCNAMEs are illegal.</li>
-       <li>The French version of the names list uses French rules, which allow 
-               apostrophe and accented letters in character names.</li>
-       <li>When names containing code points are lowercased to make them LCNAMEs, 
-         the code point values remain uppercase. Such code points by convention 
-         follow a hyphen and are the last element in the name.</li>
-       <li>Special lookahead logic prevents a 4 digit number for a standard, such 
-       as ISO 9999 from being misinterpreted as ISO CHAR. Currently recognized are 
-       &quot;ISO&quot;, &quot;DIN&quot;, &quot;IEC&quot; and &quot;S X&quot; as well as &quot;S C&quot; for the JIS X and JIS C series of 
-       standards. For other standards, or for four-digit years in a comment, use a 
-       NOTICE_LINE instead, which prevents expansion, or use '\&quot; to escape the digits.</li>
-       <li>Single and double straight quotes in an EXPAND_LINE are replaced by curly quotes using English rules.
-    Smart apostrophes are supported, but nested quotes are not.
-       Single quotes can only be applied around a single word.</li>
-       <li>A CHAR inside ' or &quot; is expanded, but only its glyph image is printed, the
-    code value is not echoed.</li>
-       <li>Inside an EXPAND_LINE, backslash is treated as an escape character that
-       removes the special meaning of any literal character and also prevents
-       the following digit sequence from being expanded. A backslash character in
-       isolation is never displayed. A sequence of two backslash characters results
-       in display of a single backslash, but has no effect on the interpretation
-       of following characters.</li>
-       <li>The hyphen in a character range CHAR-CHAR is replaced by an EN DASH on 
-       output.</li>
-       <li>The NamesList.txt file is encoded in UTF-8 if the <i>first line</i> is a 
-       FILE_COMMENT containing the declaration &quot;UTF-8&quot; or any casemap variation 
-       thereof. Otherwise the file is encoded in Latin-1 (older versions). Beyond 
-       detecting the charset declaration (typically: &quot;; charset=utf-8&quot;) the 
-       remainder of that comment is ignored. 
-       If the file is not encoded as 
-       UTF-8, the character repertoire for running text (anything
-       other than CHAR) is effectively restricted to the repertoire of Latin-1.
-       Otherwise, characters in the range U+0020..U+02FF
-               are allowed in STRING or LABEL elements, and elements derived from them.</li>
-       <li>The code chart layout program
-       (<a href="http://www.unicode.org/unibook/">Unibook</a>)
-       can accept files in several other formats. These include little-endian UTF-16,
-       prefixed with a BOM, or UTF-8 prefixed with the UTF-8 BOM.</li>
-       <li>While the format allows multiple &lt;tab&gt; characters, by convention the 
-       actual number of tabs is always one or two, chosen to provide the best 
-       layout of the plain text file.</li>
-       <li>Earlier published versions of the NamesList.txt file may contain trailing or otherwise extraneous 
-       spaces or tab characters; while these are errors in the files, they are not 
-       being corrected, to retain stability of the published versions. Anyone 
-       writing a parser for older versions of this file may need to be prepared to 
-       handle such exceptions.</li>
-       <li>The final LF in the file must be present.</li>
-</ul>
-  <h2><a name="Modifications" href="#Modifications">Modifications</a></h2>
-
-    <p><b>Version 15.0.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 15.0.0.</li>
-       </ul>
-    <p><b>Version 14.0.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 14.0.0.</li>
-        <li>Corrected character name LIGHT SCREEN to LIGHT SHADE.</li>
-       </ul>
-    <p><b>Version 13.0.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 13.0.0.</li>
-        <li>Added a second expansion for DECOMPOSITION, for possible future
-               use to designate specific subtypes of canonical decompositions
-               in the names list output.</li>
-       </ul>
-    <p><b>Version 12.1.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 12.1.0.</li>
-       </ul>
-    <p><b>Version 12.0.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 12.0.0.</li>
-        <li>Added definition of TAG (allowing uppercase letters), distinct from LCTAG.</li>
-        <li>Corrected definition of VARIATION_LINE to use LCTAG instead of LCNAME.</li>
-        <li>Corrected definition of COMPAT_MAPPING to use TAG instead of LCTAG.</li>
-        <li>Corrected the documentation regarding which elements allow use of characters
-               in the range U+0020..U+02FF.</li>
-        </ul>
-    <p><b>Version 11.0.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 11.0.0.</li>
-        <li>Loosened the limitation on repertoire allowed in LINE and LABEL
-               elements to include characters outside Latin-1, in the range
-               U+0100..U+02FF.</li>
-        </ul>
-  <p><b>Version 10.0.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 10.0.0.</li>
-        </ul>
-  <p><b>Version 9.0.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 9.0.0.</li>
-        </ul>
-  <p><b>Version 8.0.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 8.0.0.</li>
-  <li>Added MIXED_SUBHEADER, VARSEL_LIST, and CHAR_LIST to the syntax.</li>
-  <li>Tweaked BNF and notes for variation summaries.</li>
-        </ul>
-  <p><b>Version 7.0.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 7.0.0.</li>
-        </ul>
-  <p><b>Version 6.3.0</b></p>
-        <ul>
-        <li>Reissued for Unicode 6.3.0.</li>
-        </ul>
-  <p><b>Version 6.2.0</b></p>
-        <ul>
-  <li>Edited the variation syntax definitions, description and corresponding notes for wording.</li>
-  <li>Minor tweaks to the layout of BNF syntax, mostly adding tabs and | characters as needed.</li>
-  <li>Fixed some typographical errors and minor inconsistencies.</li>
-  <li>Added syntax for elements required by variation sequence and alternate glyph summaries.</li>
-  <li>Edited and reformatted some notes for readability.</li>
-  <li>Documented the permitted presence of CROSS_REF outside character entries within blocks. 
-  Such CROSS_REFs have been present in published names lists, but that information was missing in 
-  the syntax description. For an example see the Currency Symbols block in the code charts.</li>
-  <li>Added description of UTF-8 charset declaration and file encoding.</li>
-        </ul>
-  <p><b>Version 6.1.0</b></p>
-        <ul>
-               <li>Removed constraint that LCTAG consist only of lowercase letters,
-                because of the existence of the &quot;noBreak&quot; tag.</li>
-        </ul>
-  <p><b>Version 6.0.0</b></p>
-        <ul>
-               <li>Added definitions for ESC_CHAR and ESC primitives.</li>
-               <li>Clarified interpretation of backslash escapes in EXPAND_LINE.</li>
-        </ul>
-  <p><b>Version 5.2.0</b></p>
-       <ul>
-               <li>Better aligned the rules section with the actual published files and 
-               behavior of existing parsers. This included fixing some obvious typos 
-               and clarifying some notes as well as the following changes, which are 
-               listed individually.</li>
-               <li>Replaced instances of &lt;tab&gt; by TAB throughout.</li>
-               <li>NAME_LINE for special names may have trailing COMMENTs including COMMENTs 
-               consisting entirely of &quot;*&quot;.</li>
-               <li>In CROSS_REF added the form without LCNAME, fixed the literal to the 
-               correct lowercase &quot;x&quot; and noted that LCNAME may have &quot;&lt;&quot; and &quot;&gt;&quot; around 
-               it in the data. Also added missing LF in the rules.</li>
-               <li>Removed a redundant rule for BLOCKHEADER.</li>
-               <li>Changed FORMALALIAS_LINE from LINE to NAME to match actual restriction 
-               on contents.</li>
-               <li>Extended the documentation of lookahead logic for CHAR.</li>
-               <li>Accounted for FILE_COMMENT in overall file structure.</li>
-       </ul>
-       <p><b>Version 5.1.0</b></p>
-       <ul>
-               <li>Noted that comments in NAME_LINEs must be preceded by SP.</li>
-               <li>Provided additional information on allowable characters in names.</li>
-               <li>Added SIDEBAR_LINE.</li>
-               <li>Noted that CROSS_REF must contain a SP and CHAR, and that 
-               COMPAT_MAPPING must contain a SP and may contain a &lt;tag&gt;</li>
-               <li>Noted that LCNAME may contain uppercase characters under 
-               exceptional circumstances.</li>
-               <li>Relaxed the restriction on lines starting with #, :, %, x and = on 
-               the TITLE_PAGE. These are now treated as comments.</li>
-       </ul>
-       <p><b>Version 5.0.0</b></p>
-       <ul>
-               <li>Added FORMALALIAS_LINE and INDEX_TAB to syntax.</li>
-               <li>Fixed the list of lines that may appear before a BLOCKHEADER by 
-               adding NOTICE_LINE.</li>
-               <li>Minor fixes to the wording of several syntax definitions.</li>
-       </ul>
-       <p><b>Version 4.0.0</b></p>
-       <ul>
-               <li>Fixed syntax to better reflect restrictions on characters 
-  in character and block names.</li>
-               <li>Better document treatment of comments in block names, plus 
-  French name rules.</li>
-       </ul>
-  <p><b>Version 3.2.0</b></p>
-       <ul>
-               <li>Fixed several broken links, added a left margin,  
-  changed version numbering.</li>
-       </ul>
-  <p><b>Version 3.1.0 (2)</b></p>
-       <ul>
-               <li>Use of 4-6 digit hex notation is now supported.</li>
-       </ul>
-  <hr width="50%">
-  <div align="center">
-    <center>
-    <table cellspacing="0" cellpadding="0" border="0">
-      <tr>
-        <td><a href="https://www.unicode.org/copyright.html">
-        <img src="https://www.unicode.org/img/hb_notice.gif" border="0" alt="Access to Copyright and terms of use" width="216" height="50"></a></td>
-      </tr>
-    </table>
-    </center>
-  </div>
-</div>
-
-</body>
-
-</html>
-