jalf edited this page May 4, 2013 · 2 revisions

utf.hpp

utf.hpp is a tiny, easy-to-use single-header C++ library for converting text between the UTF-8, UTF-16 and UTF-32 encodings.

If you just want to use the library, check out the API and samples here.

For a clarification of the basic Unicode terminology, read here.

Why would you use utf.hpp

  • utf.hpp is a single header: the library consists of a single header file (conveniently named utf.hpp). Include it, and you're good to go. There's nothing to build, nothing to link. Just #include "utf.hpp".
  • utf.hpp has no external dependencies: the library uses a few headers from the standard library, and nothing else.
  • utf.hpp works with any string representation: the library relies on iterators (or even raw pointers) to represent strings, and never creates strings or takes ownership of memory. (Which also means no calls to new or malloc.)
  • utf.hpp is tiny: like, really really small. About 400 lines of code all in all. You could read it in your lunch break.
  • utf.hpp is lightweight: no heap allocations, no unnecessary copying of data. No virtual functions, and no exceptions. The library does what you ask it to, and nothing else, with no unnecessary overhead.
  • utf.hpp is a really really easy way to convert text between UTF-8, UTF-16 and UTF-32.
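
To give a feel for what "iterators instead of strings" means in practice, here is a minimal sketch of the general technique. Note that encode_utf8 is a hypothetical helper written for this illustration and is not utf.hpp's actual API (see the API page for that). It encodes a single code point as UTF-8 and writes the bytes through whatever output iterator the caller hands it, so the function itself never allocates and never owns memory.

```cpp
#include <cassert>
#include <iterator>
#include <string>

// Hypothetical helper for illustration only; NOT part of utf.hpp.
// Encodes one Unicode code point as UTF-8 and writes the bytes through
// any output iterator (raw pointer, back_inserter, ...).
// Returns the iterator advanced past the last byte written.
template <typename OutIt>
OutIt encode_utf8(char32_t cp, OutIt out) {
    // Precondition (assumed, not recovered from): cp is a Unicode scalar
    // value, i.e. at most 0x10FFFF and not a surrogate.
    assert(cp <= 0x10FFFF && (cp < 0xD800 || cp > 0xDFFF));

    if (cp < 0x80) {                      // 1 byte:  0xxxxxxx
        *out++ = static_cast<char>(cp);
    } else if (cp < 0x800) {              // 2 bytes: 110xxxxx 10xxxxxx
        *out++ = static_cast<char>(0xC0 | (cp >> 6));
        *out++ = static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        *out++ = static_cast<char>(0xE0 | (cp >> 12));
        *out++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        *out++ = static_cast<char>(0x80 | (cp & 0x3F));
    } else {                              // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        *out++ = static_cast<char>(0xF0 | (cp >> 18));
        *out++ = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        *out++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        *out++ = static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

The caller decides where the bytes go. Passing a pointer to a char buf[4] on the stack involves no heap at all, while passing std::back_inserter(s) appends to a std::string s, and any allocation that happens is the string's own business. The same function body serves both cases, which is the whole point of an iterator-based interface.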

What utf.hpp isn't

  • utf.hpp is not a Unicode library: it does not do Unicode normalization, cannot tell you a character's properties, and doesn't implement collation algorithms. If you want to actually manipulate Unicode text, you need a proper Unicode library such as ICU.
  • utf.hpp is not a string library: it does not define a string class. It does not tell you how strings should be represented, or how the underlying memory should be allocated. Nor does it provide all the typical string functionality.

Oh, and it's available under the permissive MIT-style Boost Software License.
