LibWchar2


A wrapper library for porting projects written for the Windows API, where wchar_t is 2 bytes, to *nix platforms. It replaces the libc and glibc functions that use the wchar_t type.

What is this for:


Software ported from the Windows API platform that relies on the built-in wchar_t type needs wchar_t to be two bytes (16 bits) wide to work correctly. On *nix systems wchar_t is 4 bytes (32 bits) by default.

It is, however, possible to compile a program with a 2-byte wchar_t. This feature is available with the GCC and clang compilers. For programs built this way to work correctly, libc and, as a consequence, all other libraries must also be built with a 2-byte wchar_t. This condition is usually impracticable.
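A minimal sketch of why mixing the two widths fails. It does not use LibWchar2 or any libc wide-character function, and the helper names (`len16`, `len32`, `make_abcd16`) are hypothetical. The same bytes are scanned once as 2-byte units, as the program wrote them, and once as 4-byte units, as a library built with a 4-byte wchar_t would read them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count 2-byte units up to the zero terminator. */
static size_t len16(const unsigned char *buf)
{
    size_t n = 0;
    uint16_t unit;

    for (;;) {
        memcpy(&unit, buf + n * sizeof unit, sizeof unit);
        if (unit == 0)
            return n;
        n++;
    }
}

/* Count 4-byte units up to the zero terminator. */
static size_t len32(const unsigned char *buf)
{
    size_t n = 0;
    uint32_t unit;

    for (;;) {
        memcpy(&unit, buf + n * sizeof unit, sizeof unit);
        if (unit == 0)
            return n;
        n++;
    }
}

/* Store "abcd" as 2-byte units, zero-padded to 12 bytes. */
static void make_abcd16(unsigned char out[12])
{
    const uint16_t text[6] = { 'a', 'b', 'c', 'd', 0, 0 };

    memcpy(out, text, sizeof text);
}
```

Scanned as 2-byte units the buffer holds four characters; scanned as 4-byte units it appears to hold two, because each unit swallows a pair of characters.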

Compiler keys:

Build your program against this library with the following compiler keys, which set the wchar_t type to 2 bytes.

CC       Key
GCC      -fshort-wchar
clang    -fwchar-type=short -fno-signed-wchar
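One way to confirm that the keys took effect is to inspect the width of wchar_t from inside the program. This is a sketch using only standard C; the helper names are hypothetical and not part of LibWchar2.

```c
#include <stddef.h>

/* Returns 1 when the compiler uses a 2-byte wchar_t, 0 otherwise. */
static int wchar_is_two_bytes(void)
{
    return sizeof(wchar_t) == 2;
}

/* Size of one element of a wide string literal; always sizeof(wchar_t). */
static size_t wide_literal_unit_size(void)
{
    return sizeof(L"x"[0]);
}
```

Built with the keys from the table, `wchar_is_two_bytes()` is expected to return 1; built without them on a typical *nix system it returns 0.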

Features of the library:

Standard ISO/IEC 9899:2011 + stddef.h

A brief, loose paraphrase of what the standard says about wchar_t and related types.

  1. wchar_t - an integer type whose range of values can represent distinct codes for all members of the largest extended character set specified among the supported locales. The null character must have the code value zero. Each member of the basic character set must have a code value equal to its value when used as the lone character in an integer character constant, unless the implementation defines __STDC_MB_MIGHT_NEQ_WC__.
  2. wint_t - an integer type unchanged by default argument promotions that can hold any value corresponding to a member of the extended character set, as well as at least one value that does not correspond to any member of the extended character set.
  3. mbstate_t - a complete object type, other than an array type, that can hold the conversion state information needed to convert between sequences of multibyte characters and wide characters.
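The three types can be seen together in a short sketch. It uses only the standard libc functions (`mbrtowc`, `WEOF`), not LibWchar2, and the helper names `is_weof` and `to_wide` are hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* wint_t holds every wide character plus WEOF, which matches none of them. */
static int is_weof(wint_t c)
{
    return c == WEOF;
}

/* Convert a multibyte string to wide characters with an explicit mbstate_t.
 * Returns the number of wide characters stored, or (size_t)-1 on error. */
static size_t to_wide(const char *src, wchar_t *dst, size_t cap)
{
    mbstate_t state;
    size_t left = strlen(src);
    size_t n = 0;

    if (cap == 0)
        return (size_t)-1;
    memset(&state, 0, sizeof state);   /* zeroed object = initial state */

    while (left > 0 && n + 1 < cap) {
        size_t used = mbrtowc(&dst[n], src, left, &state);

        if (used == (size_t)-1 || used == (size_t)-2)
            return (size_t)-1;         /* invalid or incomplete sequence */
        src += used;
        left -= used;
        n++;
    }
    dst[n] = L'\0';
    return n;
}
```

The `state` object is what carries information between calls when a multibyte character is split across buffers; here the whole string is available at once, so it stays in its initial state.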

Explanations from the developers of the wchar_t implementation

Ideological changes:

LibWchar2 removes these restrictions: it does not require rebuilding all libraries, yet lets you create applications with a 2-byte wchar_t.

LibWchar2 ignores the variable of type mbstate_t. Whether or not you set this variable, no intermediate conversion state is stored, so stored state no longer prevents interleaving input and output in one thread.

Stream orientation handling has also been removed from the input/output functions. Its necessity is debatable, and it also affects the stability of the functions that perform input/output operations.

LibWchar2 also resolves the type problem: all functions that work with wide characters in any way use the single type wchar_t.

Build and Installation:

On *nix platforms the library is built in the usual way, using autotools.

Run the configure script from the project’s root directory. In addition to the typical keys, the script understands the following options:

Next, build the library with the standard commands:

./configure --prefix=/usr
make
make check
make install

Alternatively, run the build.sh script from the root directory to avoid typing these commands by hand.

To regenerate the ./configure script, run:

./autogen.sh

Library extensions for Windows API

API

To use the library in a project, include its header last, after all system headers. wchar.h and wctype.h can be omitted because wchar2.h already includes them.

#include <stdio.h>
#include <string.h>
#include ...
#include <wchar2.h>

The library itself is linked in the standard way:

Makefile:

# GCC
CFLAGS = -I. -fshort-wchar
# clang: use this line instead of the GCC one
# CFLAGS = -I. -fwchar-type=short -fno-signed-wchar
LDFLAGS = -L. -lwchar2

A convenient feature is the redefinition of the standard functions that work with the file system, such as: mkdir, remove, rename, stat, access, basename, dirname, fopen, fputc, fputs.

To enable it, define the following macro before including the header:

#define WS_FS_REDEFINE 1
#include <wchar2.h>

or, if there is a need to use only UTF-8 encoding:

#define WS_FS_REDEFINE 1
#define WS_FS_UTF8 1
#include <wchar2.h>

Note: the WS_FS_UTF8 key has no effect without WS_FS_REDEFINE.

Example code snippets

In WS_FS_UTF8 mode, the functions mkdir, remove, rename, stat, fopen and access accept input only in wchar_t format. Otherwise, the input can be in any of the formats shown in the table, and the type is detected automatically.

See: wchar2.h macro __wchar_type_id(..)

Type          const               array        const array
char*         const char*         char[]       const char[]
wchar_t*      const wchar_t*      wchar_t[]    const wchar_t[]
string_ws*    const string_ws*
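Automatic detection of the argument type can be sketched with the C11 `_Generic` selection. This is an illustration only, not the implementation of `__wchar_type_id(..)`: the `PATH_KIND` macro and the `path_kind` enum are hypothetical, and `string_ws` is left out because its definition lives in the library.

```c
#include <stddef.h>

/* Hypothetical classification of a path argument. */
enum path_kind { PATH_OTHER = 0, PATH_CHAR = 1, PATH_WCHAR = 2 };

/* &(x)[0] turns both arrays and pointers into a pointer to the element
 * type, so every column of the table maps to one of two pointer types
 * (plus their const variants). */
#define PATH_KIND(x) _Generic(&(x)[0],      \
        char *:          PATH_CHAR,         \
        const char *:    PATH_CHAR,         \
        wchar_t *:       PATH_WCHAR,        \
        const wchar_t *: PATH_WCHAR,        \
        default:         PATH_OTHER)
```

A wrapper could use such a selection to route `char` paths and `wchar_t` paths to different conversion code while exposing a single function name to the caller.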

Test & Testing

Projects adapted to the library

Source & Materials used:

The project uses revised code by the following authors:

to whom special thanks are due :)

License

MIT