In a previous post, we talked about the mechanism behind library wrapping. In it, I said box86/box64 used some manually written files to correctly call functions. However, there is a big question: how are these files written?
To follow this article, you will need at least a basic understanding of function signatures.
- The basics
- Registering a new library
- Adding simple functions
- Adding complex functions
- An update: the wrapper helper
- Addendum: a small type to letter table
The basics
First, a reminder: what is library wrapping (in box86/box64)?
Library wrapping can refer to multiple things, but here it refers to how box86/box64 interacts with native libraries.
These native libraries export a list of functions (exported symbols) that the end user (the program) can call. To call these functions, box86/box64 needs to know what the arguments are so it can put each of them in the correct place (i.e., is that argument coming from the stack, a register…?). Fortunately, this is all done automagically behind the scenes, and the wrapper writer doesn’t need to know these details.
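For the curious, here is a rough idea of what "putting the arguments at the correct place" can look like. This is only a toy model: `toy_cpu_t`, `toy_wrapper` and `toy_memcpy` are names I made up for this sketch, and the real emulator state and generated wrappers in box86/box64 are more involved.

```c
#include <stdint.h>
#include <string.h>

/* Toy stand-in for the emulated CPU state (hypothetical type, not box64's):
   only the registers this sketch needs. */
typedef struct {
    uint64_t rdi, rsi, rdx; /* first three integer/pointer arguments (System V x86_64) */
    uint64_t rax;           /* return value */
} toy_cpu_t;

/* Native function type: returns a pointer, takes two pointers and an unsigned long. */
typedef void* (*native_fn_t)(void*, void*, unsigned long);

/* Read the arguments from the emulated registers, call the native function,
   and store the result where the emulated program expects to find it. */
static void toy_wrapper(toy_cpu_t* cpu, native_fn_t fn)
{
    void* ret = fn((void*)(uintptr_t)cpu->rdi,
                   (void*)(uintptr_t)cpu->rsi,
                   (unsigned long)cpu->rdx);
    cpu->rax = (uint64_t)(uintptr_t)ret;
}

/* Adapter so that memcpy matches native_fn_t exactly (memcpy takes a const void* source). */
static void* toy_memcpy(void* dst, void* src, unsigned long n)
{
    return memcpy(dst, src, n);
}
```

Filling `rdi`, `rsi` and `rdx` with a destination, a source and a size, then calling `toy_wrapper(&cpu, toy_memcpy)`, copies the bytes and leaves the destination pointer in `rax`.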
The first step: registering a new wrapper
So, how do we write a wrapper? First, we need to tell box that the wrapper exists. This only needs to be done once per library, but in several places. You can skip this step if you just want to add some functions to an already registered library.
First, in the src/library_list.h file, there is a list of all wrapped libraries. To add one, you need to add a line containing:
GO("my_library_file_name.so", my_library)
where my_library_file_name.so is the library file name in the file system, and my_library is how you name the auxiliary file (more precisely, it is how you name the library in that library-specific auxiliary file).
Next, you need to tell the compiler toolchain about the new auxiliary file. In the CMakeLists.txt file, there is a list of filenames put in the WRAPPED variable (the block starts with set(WRAPPED, continues with the list and ends with a closing parenthesis). You need to add a new line in this variable containing your auxiliary file.
Congratulations, you have registered a new library wrapping! Yes, it really boils down to two lines.
The last step is to create the missing files: just paste the following templates:
#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dlfcn.h>
#include "wrappedlibs.h"
#include "wrapper.h"
#include "bridge.h"
#include "librarian/library_private.h"
#include "x64emu.h"
#include "generated/wrappedmy_librarydefs.h"
const char* my_libraryName = "my_library_file_name.so";
#define LIBNAME my_library
#include "generated/wrappedmy_librarytypes.h"
#include "wrappercallback.h"
// Insert code here
#include "wrappedlib_init.h"
in the wrappedmy_library.c file, and
#if !(defined(GO) && defined(GOM) && defined(GO2) && defined(DATA))
#error Meh....
#endif
in the wrappedmy_library_private.h file. Correct the macros in the C file (by replacing every my_library with your library name), and you’re done!
You should now be able to compile box86/box64/box32, and running it with a binary that uses the library should fail with symbol not found errors.
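If you wonder why the private header starts with that #if/#error guard: the file is not meant to be compiled on its own, only to be included by something that has already defined GO, GOM, GO2 and DATA. Here is a minimal sketch of that "list of macro calls" technique. It is a simplified model, not the actual content of wrappedlib_init.h, and every toy_ name is made up for the example; the list is a macro here only so the sketch fits in one file.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a wrappedmy_library_private.h file: a list of GO(...) lines. */
#define TOY_PRIVATE_LIST \
    GO(memcpy, pFppL)    \
    GO(strlen, LFp)

typedef struct {
    const char* name;      /* exported symbol name */
    const char* signature; /* signature letters, kept as a string */
} toy_symbol_t;

/* Define GO, expand the list, then undefine GO again.
   The same list could be expanded a second time with a different GO. */
#define GO(name, sig) { #name, #sig },
static const toy_symbol_t toy_symbols[] = { TOY_PRIVATE_LIST };
#undef GO

static const size_t toy_symbol_count = sizeof toy_symbols / sizeof toy_symbols[0];

/* Look up a symbol by name; NULL plays the role of "symbol not found". */
static const char* toy_find_signature(const char* name)
{
    for (size_t i = 0; i < toy_symbol_count; ++i)
        if (strcmp(toy_symbols[i].name, name) == 0)
            return toy_symbols[i].signature;
    return NULL;
}
```

With an empty list, every lookup returns NULL, which mirrors the symbol not found errors you get from a freshly registered library.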
Sometimes, some more library initialization code is required. For arbitrary code, you can put it in the CUSTOM_INIT macro. However, there are other helper macros: for example, if the library depends on loading other libraries, you can define the NEEDED_LIBS macro to the comma-separated list of other libraries to include.
Similarly, the CUSTOM_FINI macro is used when finishing the library.
These macros must be defined in the C file before including the wrappedlib_init.h file. In the generic CUSTOM_INIT macro, the library_t structure which references the current library is called lib.
The final step (easy): adding simple functions
Update: you can use the wrapper helper to do this completely automatically.
Now, open the header file (wrappedmy_library_private.h). This file should have three non-empty lines (#if ..., #error Meh...., #endif).
You are now ready to add a new function.
First, look at its signature. I will use the example of memcpy:
void* memcpy( void *dest, const void *src, size_t count );
Notice that there are no callbacks and no variable number of arguments in this signature (no ...). Notice that the function also requires no realignment of the arguments. This is important, and we will come back to functions with those issues later.
First, you can prepare a line in the header file:
GO(memcpy, )
Notice that since this is a simple function, I used the GO macro. If the library declares this symbol as “weak” (which it usually doesn’t), you need to use the GOW macro.
The second argument should contain the function signature. Here is how:
- The first letter corresponds to the return type
- The second letter corresponds to the calling convention
- The remaining letters correspond to the types of the arguments in the correct order, or v if there is none
The calling convention is almost always F (for the System V calling convention).
If the function returns void, the first letter should be v. If the function returns an int, the first letter should be i, and so on. You can look at the top of src/wrapped/generated/wrapper.h for a list of all the common types. See also the table of types at the end of this article.
For example, the return type of memcpy is a pointer, as are the first two arguments, which corresponds to the letter p; the last argument, size_t, is a typedef of unsigned long, which is the letter L.
Therefore, we can complete the line:
GO(memcpy, pFppL)
which is exactly what you will find if you search through wrappedlibc_private.h.
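To check your reading of a signature, it can help to decode it back into C types. The sketch below does that for a handful of letters taken from the table at the end of this article. It is my own illustration (the toy_ functions do not exist in box86/box64), it only accepts F as the calling convention, and it knows nothing about the special letters such as E or V.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Partial letter-to-type table (a subset of the table at the end of the article). */
static const char* toy_letter_type(char letter)
{
    switch (letter) {
    case 'v': return "void";
    case 'c': return "char";
    case 'i': return "int";
    case 'u': return "unsigned";
    case 'l': return "long";
    case 'L': return "unsigned long";
    case 'f': return "float";
    case 'd': return "double";
    case 'p': return "void*";
    default:  return NULL; /* letter not handled by this sketch */
    }
}

/* Write a readable form of a signature such as "pFppL" into out.
   Returns 0 on success, -1 on an unsupported signature or a too-small buffer. */
static int toy_describe_signature(const char* sig, char* out, size_t out_size)
{
    size_t len = strlen(sig);
    /* Need a return letter, the 'F' convention letter, and at least one argument letter. */
    if (len < 3 || sig[1] != 'F') return -1;

    const char* ret = toy_letter_type(sig[0]);
    if (!ret) return -1;

    int n = snprintf(out, out_size, "%s (", ret);
    if (n < 0 || (size_t)n >= out_size) return -1;
    size_t used = (size_t)n;

    for (size_t i = 2; i < len; ++i) {
        const char* arg = toy_letter_type(sig[i]);
        if (!arg) return -1;
        n = snprintf(out + used, out_size - used, "%s%s", i > 2 ? ", " : "", arg);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, out_size - used, ")");
    if (n < 0 || (size_t)n >= out_size - used) return -1;
    return 0;
}
```

Decoding pFppL this way gives `void* (void*, void*, unsigned long)`, which matches the shape of the memcpy prototype once const and the size_t typedef are ignored.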
That’s it! After adding a few functions this way and compiling, you may be greeted with errors from the wrapper generator script. This means you have written something incorrectly. Read the error, correct it, and try again. (Note that the script is called automatically when compiling. Calling it manually is both non-trivial and unnecessary.)
Hopefully, all functions of the library you want to wrap are such functions. Otherwise, you will need to do…
The final step (hard): adding redirected functions
Update: the wrapper helper can generate the correct signature, which is the first part of this chapter, but the rest of it remains necessary even now.
For functions with a callback, or with structures with differing sizes, you will need to write C code. If you don’t know how, please just open a pull request; this is not a beginner-friendly project.
Let’s take an example of a harder function: printf. Its signature is the following:
int printf( const char *format, ... );
First, notice the use of the ...: this function takes a variable number of arguments, the exact number depending on the format string.
As with the easy way, we can add a new line in the wrappedmy_library_private.h file:
GOM(printf, iFEpV)
Notice the use of the GOM macro: this means that instead of calling the native function directly, box86/box64 will call its own wrapper function my_printf. For weak functions, there is an equivalent GOWM.
Side note: the my_ prefix is configurable. To change it, define the ALTMY macro to whatever you want in the C file before including the wrappedlib_init.h file. You can look at SDL2 for an example.
Next, notice the E as the first argument letter: my_printf will receive the active x86emu_t*/x64emu_t* as its first argument. This is a special argument letter, reserved only for functions with redirection and as the first argument only. The generator will enforce this and output an error otherwise.
The letter V as the final argument is the variadic argument.
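The rule about E is simple enough to be written down as a small check. The function below is a hypothetical sketch of that rule as described above; it is not the generator's actual code, and the real script certainly checks more than this.

```c
#include <string.h>

/* Returns 1 if every 'E' in the signature sits right after the calling-convention
   letter (i.e. it is the first argument) and the function is a redirected one
   (GOM-style); returns 0 otherwise. A signature without any 'E' is always accepted. */
static int toy_emu_letter_ok(const char* sig, int is_redirected)
{
    size_t len = strlen(sig);
    if (len < 3) return 0; /* need return letter, convention letter, one argument letter */

    for (size_t i = 0; i < len; ++i) {
        if (sig[i] != 'E') continue;
        /* Index 0 is the return type, index 1 the convention, index 2 the first argument. */
        if (i != 2 || !is_redirected) return 0;
    }
    return 1;
}
```

Under this check, iFEpV is accepted for a GOM line, while iFpEV (E not first) and iFEpV on a plain GO line are both rejected.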
Next, we need to define the my_printf function somewhere.
It is done in the wrappedmy_library.c file (usually before including wrappedlib_init.h).
You need to EXPORT the function my_printf with the correct arguments:
EXPORT int my_printf(x64emu_t *emu, void* fmt, void* b) {
Next, you need to do whatever you need to do to align the stack:
myStackAlign(emu, (const char*)fmt, b, emu->scratch, R_EAX, 1);
PREPARE_VALIST;
And finally, call the function:
return vprintf((const char*)fmt, VARARGS);
}
And you’re done!
Of course, the “whatever you need to do” may take a lot more than two lines. If you want some good-ish examples, just take a look at how SDL2 is done.
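If the last line of my_printf looks unfamiliar, it is the usual C idiom of forwarding to the va_list variant of a function (vprintf instead of printf). In plain native C, without any emulated stack to fix up first, the same idiom looks like the sketch below; toy_format is a made-up helper for the example, and it uses vsnprintf so the result lands in a buffer instead of on the terminal.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: format into a buffer by forwarding the variadic
   arguments to the va_list-taking variant of the function (vsnprintf). */
static int toy_format(char* out, size_t out_size, const char* fmt, ...)
{
    va_list args;
    va_start(args, fmt);                               /* collect the arguments after fmt */
    int written = vsnprintf(out, out_size, fmt, args); /* forward them as a va_list */
    va_end(args);
    return written;
}
```

The difference in my_printf is that the arguments come from the emulated program, so they have to be put into a shape the native vprintf understands before the call; that is the part the two lines before the return take care of.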
Can it be automated?
Maybe. There are a few efforts toward making an LLVM frontend to parse headers then generate the _private.h file automatically, but AFAIK none of them are ready yet, all of them having issues with integer sizes, callbacks, correctness of the output…
Furthermore, (one of) the issue(s) with all the currently available generators I know of is that once they successfully output a file, you will still need to come back to check whether the output is correct, and/or you will still need to write code for the more complex functions.
An update: the wrapper helper
There is now an automated wrapper generator! It is called the wrapper helper, and is completely contained in the wrapperhelper folder of the box64 project. This tool will automatically generate the function signatures (i.e. the thing after the function name in the wrappedmy_library_private.h file) and data sizes.
The usage is very simple: you run make in the wrapperhelper folder to compile it, which generates a wrapperhelper binary in bin. Then, you run that binary with three parameters: the first is the compilation header (which includes every function that you need); the second is a reference wrapped*_private.h file, which is parsed to get every symbol to output; the third parameter is the output file.
Several things can go wrong. The tool may fail to parse the compilation header, in which case please open an issue on box64. It may fail to parse the reference file, in which case please make sure the reference file is indeed a reference file. It may generate some warnings about unknown functions, in which case the compilation header is incomplete. It may also generate some warnings about “converting functions”, which simply means that the reference has some invalid signatures and/or invalid function/data types (e.g. GO instead of GOM) in it.
Also, it may generate some GOMs; if the reference already marks the function as a GOM, it will leave it alone, otherwise the function will be commented out in the output file. You will still need to manually write the wrapper in that case, as it is likely there is either a misalignment issue with a structure parameter, or a callback function which needs to be wrapped manually.
Addendum: a table of types
C type | Letter |
---|---|
x64emu_t | E (*) |
void | v |
int8_t | c |
char | c |
uint8_t | C |
unsigned char | C |
int16_t | w |
short | w |
uint16_t | W |
unsigned short | W |
int32_t | i |
int | i |
uint32_t | u |
unsigned | u |
int64_t | I |
long long | I |
uint64_t | U |
unsigned long long | U |
long | l |
unsigned long | L |
float | f |
double | d |
long double | D or K (see libm) |
void* | p (rarely, P (*)). Be careful with function pointers |
... | V (*) or N or M |
FILE* | S |
uint128_t | H |
float complex | x |
double complex | X |
long double complex | Y or y (see libm) |
va_list | A |
xcb_connection_t* | b |
(*) Usually, having this letter means you need to add a wrapper function (my_...).