<PAGE>
<INCLUDE file="inc/header.tmpl" />
<VAR match="VAR_SEL_FEATURES" replace="selected" />
<VAR match="VAR_SEL_FEATURE_CHARSET" replace="selected" />
<PARSE file="menu1.xml" />
<PARSE file="menu2-features.xml" />
<INCLUDE file="inc/content.tmpl" />
<h1>Character set handling</h1>
<p>OpenConnect development started in 2008 on a modern Linux box, so
its character set handling was initially extremely simplistic. It boiled
down to the simple but reasonable assumption that <i>"everything is UTF-8,
all of the time"</i>. This was the case up to and including the OpenConnect
6.00 release in July 2014.</p>
<p>Since its inception, however, OpenConnect has been ported to
various less progressive POSIX-based systems and also to Windows,
which has its own particular style of charset insanity. It was
therefore necessary to implement some explicit handling for character
set conversion.</p>
<p>The design of this character set handling is that the internal
<tt>libopenconnect</tt> library still handles every string as UTF-8.
All input and output of the library remains UTF-8, and all callers
of the library are expected to convert to and from UTF-8 as needed. For the
GNOME and KDE GUI tools, this should come naturally as all strings are
expected to be UTF-8 there. For the command-line tool <tt>openconnect</tt>
itself, implemented in <tt>main.c</tt>, this means that character set
conversion is done on all terminal input and output, and on all arguments
provided on the command line.</p>
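<p>As a rough illustration of the first step a command-line caller needs,
the sketch below asks the locale which character set the terminal uses and
decides whether any conversion is required. It is not taken from
<tt>main.c</tt>; the helper names <tt>codeset_is_utf8</tt> and
<tt>terminal_codeset</tt> are hypothetical, and only the standard POSIX
calls <tt>setlocale</tt> and <tt>nl_langinfo</tt> are assumed.</p>
<pre><![CDATA[
```c
#include <langinfo.h>
#include <locale.h>
#include <strings.h>

/*
 * Hypothetical helper (not from main.c): report whether a codeset
 * name, as returned by nl_langinfo(CODESET), denotes UTF-8.
 * Both the "UTF-8" and "UTF8" spellings are accepted, in any case.
 */
int codeset_is_utf8(const char *codeset)
{
    return strcasecmp(codeset, "UTF-8") == 0 ||
           strcasecmp(codeset, "UTF8") == 0;
}

/*
 * Hypothetical helper: name of the character set selected by the
 * user's locale, which is what the terminal is assumed to use.
 */
const char *terminal_codeset(void)
{
    /* Adopt the locale from LANG / LC_* instead of the default "C". */
    setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}
```
]]></pre>
<p>A caller could skip conversion entirely when
<tt>codeset_is_utf8(terminal_codeset())</tt> is true, and otherwise convert
every argument and every line of terminal input and output.</p>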
<p>Where it is necessary to open files or interact with the system in
other ways using the legacy character set, <tt>libopenconnect</tt>
will do the required conversion transparently. On POSIX systems with
legacy non-UTF-8 character sets, it will use <tt>iconv</tt> to
convert, while on Windows it will convert to UTF-16 and use the wide
character (so-called "Unicode") APIs instead.</p>
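<p>The POSIX case can be sketched with the standard <tt>iconv</tt>
interface. This is a minimal example under stated assumptions, not the
code used inside <tt>libopenconnect</tt>: the function name
<tt>utf8_to_legacy</tt> and its parameters are hypothetical, and the caller
is assumed to supply the legacy charset name and an output buffer.</p>
<pre><![CDATA[
```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libopenconnect): convert a
 * NUL-terminated UTF-8 string into the named legacy charset.
 * On success returns 0, NUL-terminates 'out' and stores the number
 * of converted bytes in *outlen. On failure returns -1 with errno set.
 */
int utf8_to_legacy(const char *utf8, const char *legacy_charset,
                   char *out, size_t outsize, size_t *outlen)
{
    if (outsize == 0) {
        errno = E2BIG;
        return -1;
    }

    /* iconv_open() takes the target charset first, then the source. */
    iconv_t cd = iconv_open(legacy_charset, "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)utf8;      /* iconv() takes a non-const char ** */
    size_t inleft = strlen(utf8);
    char *outp = out;
    size_t outleft = outsize - 1;  /* keep one byte for the final NUL */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        /* Flush any pending shift sequence for stateful encodings. */
        rc = iconv(cd, NULL, NULL, &outp, &outleft);

    int saved_errno = errno;       /* iconv_close() may change errno */
    iconv_close(cd);

    if (rc == (size_t)-1) {
        errno = saved_errno;
        return -1;
    }

    *outp = '\0';
    *outlen = (size_t)(outp - out);
    return 0;
}
```
]]></pre>
<p>The result of such a conversion could then be handed to calls like
<tt>open</tt> or <tt>fopen</tt>, which expect file names in the system's
own character set. How characters with no equivalent in the target charset
are handled depends on the <tt>iconv</tt> implementation.</p>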
<INCLUDE file="inc/footer.tmpl" />
</PAGE>