ENCYCLOPEDIA 4U .com




UCS-4

ISO 10646 defines a 32-bit encoding form called UCS-4, in which each encoded character in the Universal Character Set is represented by a 32-bit code value in the code space of integers between 0 and hexadecimal 7FFFFFFF.
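As a minimal sketch of what a 32-bit code value looks like in practice, the Python below packs one code value into four bytes and unpacks a byte string back into code values. The function names are hypothetical, and big-endian byte order is an assumption made for illustration; the text above does not specify a byte order.

```python
import struct

# Upper bound of the UCS-4 code space as stated in the text.
UCS4_MAX = 0x7FFFFFFF


def ucs4_encode_be(code_value: int) -> bytes:
    """Pack one UCS-4 code value as four bytes (big-endian is assumed here)."""
    if not 0 <= code_value <= UCS4_MAX:
        raise ValueError(f"code value {code_value:#x} is outside 0..7FFFFFFF")
    return struct.pack(">I", code_value)


def ucs4_decode_be(data: bytes) -> list:
    """Unpack a byte string into a list of UCS-4 code values."""
    if len(data) % 4 != 0:
        raise ValueError("UCS-4 data must be a multiple of four bytes long")
    values = [value for (value,) in struct.iter_unpack(">I", data)]
    for value in values:
        if value > UCS4_MAX:
            raise ValueError(f"code value {value:#x} is outside 0..7FFFFFFF")
    return values
```

For example, `ucs4_encode_be(0x41)` yields the four bytes 00 00 00 41, and decoding those bytes gives back the single code value 0x41.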

UCS-4 is sufficient to represent all of Unicode, which requires only code values up to hexadecimal 10FFFF. Some people consider it wasteful to reserve such a large code space for mapping a relatively small set of code points, so a new encoding form, UTF-32, was proposed. UTF-32 is a subset of UCS-4 that uses 32-bit code values only in the 0 to 10FFFF code space.
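The subset relationship can be illustrated with a simple range check. This is a sketch with hypothetical function names, and it covers only the 0 to 10FFFF range described above. It does not check any further restrictions UTF-32 may place on individual code values, and big-endian byte order is again an assumption.

```python
# Upper bounds of the two code spaces as given in the text.
UCS4_MAX = 0x7FFFFFFF
UTF32_MAX = 0x10FFFF


def in_ucs4_range(code_value: int) -> bool:
    """True if the value lies in the UCS-4 code space 0..7FFFFFFF."""
    return 0 <= code_value <= UCS4_MAX


def in_utf32_range(code_value: int) -> bool:
    """True if the value lies in the UTF-32 code space 0..10FFFF."""
    return 0 <= code_value <= UTF32_MAX


def to_utf32_be(code_value: int) -> bytes:
    """Serialize a value in the UTF-32 range as four big-endian bytes."""
    if not in_utf32_range(code_value):
        raise ValueError(f"{code_value:#x} is outside 0..10FFFF")
    return code_value.to_bytes(4, "big")
```

Every value accepted by `in_utf32_range` is also accepted by `in_ucs4_range`, but not the reverse: 0x110000 is a UCS-4 code value that falls outside the UTF-32 range. For values such as 0x41, `to_utf32_be` produces the same bytes as Python's built-in `"utf-32-be"` codec.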

However, the Principles and Procedures document of JTC1/SC2/WG2 now states that all future assignments of characters to 10646 will be constrained to the BMP or the first 14 supplementary planes. This effectively makes UCS-4 identical to UTF-32, except that UTF-32 has the extra requirement that additional Unicode semantics be observed for all characters.
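To make the plane constraint concrete, the sketch below computes which plane a code value falls in. It relies on a fact not stated in the text above: each plane spans 65,536 code values, with the BMP as plane 0, so the plane number is the code value shifted right by 16 bits. The function names are hypothetical.

```python
# Each plane holds 0x10000 (65,536) code values; plane 0 is the BMP.
PLANE_BITS = 16
LAST_ASSIGNABLE_PLANE = 14  # BMP plus the first 14 supplementary planes
UTF32_MAX = 0x10FFFF


def plane_of(code_value: int) -> int:
    """Return the plane number that contains the given code value."""
    if code_value < 0:
        raise ValueError("code values are non-negative")
    return code_value >> PLANE_BITS


def in_bmp_or_first_14_planes(code_value: int) -> bool:
    """True if the value lies in the BMP or supplementary planes 1..14."""
    return plane_of(code_value) <= LAST_ASSIGNABLE_PLANE


# The highest value in plane 14 is 0xEFFFF, which is below 0x10FFFF,
# so every value in these planes fits in the UTF-32 code space.
highest_in_plane_14 = ((LAST_ASSIGNABLE_PLANE + 1) << PLANE_BITS) - 1
```

Under these assumptions, `plane_of(0x41)` is 0 (the BMP) and `plane_of(0x10000)` is 1, the first supplementary plane.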




Copyright © 2005 Par Web Solutions All Rights reserved.

This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "UCS-4".