Are you interested in writing portable and efficient C++ code? Look no further than this guideline! Unlike other guidelines, such as the C++ Core Guidelines, our guidelines prioritize portability and avoid problematic coding practices.
It's important to remember that there's no such thing as a zero-cost or zero-overhead (runtime) abstraction. Beware of anyone who claims otherwise! Even features like borrow checkers and C++ exceptions have runtime overhead (see here: When Zero Cost Abstractions Aren't Zero Cost and here: Zero-cost exceptions aren’t actually zero cost).
Note that the current C++ standard is C++23.
To write truly portable C++ code, it's important to stick to the freestanding C++ part and use platform macros to guard against the hosted environment. The cppreference website page Freestanding and hosted implementations provides a useful guide on freestanding implementations and the functions available in this environment.
However, it's important to note that many headers that you may assume should be available may not be present in a freestanding environment. Additionally, certain headers that are available may have limitations or issues that could prevent you from using them.
If you want to ensure that your code works in a freestanding environment, you can try using the x86_64-elf-gcc compiler. This compiler is available on the following GitHub repository: windows-hosted-x86_64-elf-toolchains
This is because C++ allows overloading of the operator &
which is a historical mistake. To overcome this issue, the ISO C++11 standard introduced std::addressof
. However, it was not until C++17 that std::addressof
became constexpr. Unfortunately, std::addressof
is part of the <memory>
header which is not freestanding, making it impossible to implement it in C++ without compiler magic before C++23. In C++23, the <memory>
header is finally made partial freestanding, but it is important to update to the latest toolchains to ensure its availability. It is worth noting that function pointers cannot use std::addressof
and must use operator &
to get their address.
//bad! memory header may not be available before C++23.
//#include <memory> may also hurt compilation speed
#include<memory>
int main()
{
int a{};
int *pa{std::addressof(a)};//bad! memory header may does not exist
int *pb{&a};//ok but still bad! operator & can be overloaded for class object types.
}
/*
Compilers give us errors.
x86_64-elf-g++ -v
gcc version 13.0.0 20220520 (experimental) (GCC)
D:\Desktop>x86_64-elf-g++ -c addressof.cc -O3 -std=c++23 -s -flto
addressof.cc:1:9: fatal error: memory: No such file or directory
1 | #include<memory>
| ^~~~~~~~
compilation terminated.
*/
int main()
{
int a{};
int *pc{__builtin_addressof(a)};
//Ok. GCC, clang and MSVC all support __builtin_addressof. It is safe to use it.
//Maybe some compilers don't support __builtin_addressof? Then we are screwed up for sure. That is the C++ standard to blame
}
/*
D:\Desktop>x86_64-elf-g++ -c builtin_addressof.cc -O3 -std=c++23 -s -flto
Compilation success
*/
The functions std::move
, std::forward
, and std::move_if_noexcept
are defined in the <utility>
header, which is not freestanding until C++23. To ensure maximum portability, it's recommended to write these functions yourself. However, it's worth noting that recent versions of the Clang compiler (version 15 onwards) have added a patch that treats these functions as compiler magics. As a result, it may not be possible to achieve 100% efficiency by writing these functions yourself.
//bad! utility header may not be available.
//#include <utility> may also hurt compilation speed
#include<utility>
int main()
{
int a{};
int b{std::move(a)};
}
/*
D:\Desktop>x86_64-elf-g++ -c move.cc -O3 -std=c++23 -s -flto
move.cc:1:9: fatal error: utility: No such file or directory
1 | #include<utility>
| ^~~~~~~~~
compilation terminated.
*/
Similar to the reasons why you should avoid using std::addressof
and std::move
, the <array>
header is also not freestanding
//bad! array header may not be available.
//#include <array> may also hurt compilation speed
#include<array>
int main()
{
std::array<int,4> a{};
}
/*
D:\Desktop>x86_64-elf-g++ -c move.cc -O3 -std=c++23 -s -flto
move.cc:1:9: fatal error: array: No such file or directory
1 | #include<array>
| ^~~~~~~~~
compilation terminated.
*/
The solution is to use C style array.
//Good! You should use C style array because <array> does not exist.
//cstdint does exist.
#include<cstdint>
int main()
{
::std::int_least32_t arr[4]{};//Ok! ::std::int_least32_t is freestanding
}
/*
x86_64-elf-g++ -c carray.cc -O3 -std=c++23 -s -flto
*/
Update: In C++23, std::array is not considered freestanding. However, it is possible that it will become freestanding in future versions of the C++ standard.
To achieve maximum portability, it is recommended to avoid all C++ containers, iterators, algorithms, and allocators. These components are not freestanding and have complex designs that interact with heap and exception handling, creating portability issues when included. Despite being promoted as the default container in "modern" C++ books and the C++ Core Guidelines, std::vector
is not freestanding, and including it can cause portability issues. Additionally, allocators are not freestanding, and even std::allocator
cannot be considered freestanding due to its interaction with exceptions. If exceptions are disabled, dependencies to exceptions remain, causing linkage errors or binary bloat issues.
/// WRONG!!! <vector> is not freestanding
#include<vector>
#include<cstdint>
int main()
{
std::vector<::std::int_least32_t> vec;
}
/*
D:\Desktop>x86_64-elf-g++ -c vector.cc -O3 -std=c++23 -s -flto
vector.cc:1:9: fatal error: vector: No such file or directory
1 | #include<vector>
| ^~~~~~~~
compilation terminated.
*/
Avoid using the <ranges>
and <iterator>
headers in C++ versions prior to C++23. While concepts such as std::random_access_range
and std::contiguous_range
may be useful, they are not freestanding and may not be available in all contexts. It is often simpler to stick with using pointers instead.
Starting with C++23, <ranges>
and <iterator>
are partially freestanding, meaning that they can be used in most contexts except for stream iterator-related functionality.
While it is true that <new>
is freestanding, it is not designed to be particularly useful in its current state. This is due to the fact that exceptions are thrown in a somewhat chaotic manner, making it difficult to rely on in certain contexts.
Avoid using std::nothrow
and std::nothrow_t
in C++, as the standard library implements nothrow new by internally using thrown new. This means that std::nothrow
provides no guarantee that memory allocation will not throw an exception. Additionally, the C++ standard does not allow overriding the default implementation of operator new(std::nothrow_t const&)
, unlike operator new()
. Therefore, there is no practical benefit to using std::nothrow
, and it should be avoided. Additionally, exceptions should be avoided in freestanding environments, so it is recommended to implement custom memory allocation functions that do not throw exceptions.
/*
Do not use new (std::nothrow_t) in any form. It is just horrible.
Here is the implementation from libsupc++ in GCC.
https://github.com/gcc-mirror/gcc/blob/aa2eb25c94cde4c147443a562eadc69de03b1556/libstdc%2B%2B-v3/libsupc%2B%2B/new_opnt.cc#L31
*/
_GLIBCXX_WEAK_DEFINITION void *
operator new (std::size_t sz, const std::nothrow_t&) noexcept
{
// _GLIBCXX_RESOLVE_LIB_DEFECTS
// 206. operator new(size_t, nothrow) may become unlinked to ordinary
// operator new if ordinary version replaced
__try
{
return ::operator new(sz);
}
__catch (...)
{
// N.B. catch (...) means the process will terminate if operator new(sz)
// exits with a __forced_unwind exception. The process will print
// "FATAL: exception not rethrown" to stderr before exiting.
//
// If we propagated that exception the process would still terminate
// (because this function is noexcept) but with a less informative error:
// "terminate called without active exception".
return nullptr;
}
}
We can see C++ standard library implements nothrow version's operator new with thrown version's new. It is completely useless.
In short, you cannot assume there is a default heap that is available.
Reasons:
- There might not be a heap available at all. While you could argue that you could always provide one, the reality is that this is not always possible.
- In some situations, multiple heaps might be available, such as within operating system kernels or even in Win32 applications with ABI compatibility issues between msvcrt and Universal CRT. In these cases, using any of the defaults might be the wrong choice. For example, the Windows kernel provides different heaps, one of which is interrupt-safe but has limited space, while the others are not interrupt-safe but have sufficient space.
- C++ does not provide a thread-local implementation of the heap, and the default heap might be highly inefficient. Of course, thread-local storage might not be available, but that is a separate issue.
For a portable codebase, forcing a default heap implementation never works out, so it's best to avoid using new.
It's best to avoid handling stack or heap allocation failure. One design flaw with C and C++ is that stack exhaustion is undefined behavior, while heap exhaustion is not. This creates inconsistency and can lead to a lot of issues.
In general, it's preferable to simply call std::abort
if malloc(3)
fails. Here are some reasons why:
- Destructors can allocate memory again, creating a vicious cycle that can lead to unpredictable results.
- The GCC libsupc++ implementation uses an emergency heap, but even if you implement an emergency heap, it might still call
std::terminate
in many situations, leading to crashes. To verify this, I removed the emergency heap from the libsupc++ and found no issues, including no ABI issues. - Many libraries beneath you, including glibc, call
xmalloc
, which will still crash formalloc(3)
failure. You cannot avoid the issue unless you control all your source code. - Operating systems like Linux will overcommit and kill your process if you hit allocation failures. In general, programming languages like C++ cannot handle allocation failures in any meaningful way, whether on the stack or the heap.
- C++'s
new
operator throws exceptions, and C++ codebases usually allocate memory on the heap using new, creating invisible code paths and potential exception-safety bugs.
If you want to load large files, consider using the memory mapping API instead of loading them into memory.
Loading large files, such as images, into memory may seem like a reasonable approach, but it is often not the best solution. Instead, it is recommended to use system calls such as fstat(2)
or Linux's statx
to obtain information about the file and the mmap(2)
syscall for memory mapping.
- Memory mapping avoids the issue of file size overflow on 32-bit machines.
- It avoids messing up the CRT heap and never triggers allocation failure from malloc or new.
- It avoids copying the content from kernel space to user space, compared to loading the entire file to a std::string.
- Memory mapping allows file sharing among different processes, saving your physical memory.
fseek(3)
orseek(2)
to load a file may create more TOCTOU security vulnerabilities.- The overcommit is less likely if you do not write to the copy-on-write pages.
- Memory mapping creates a
std::contiguous_range
, which is extremely useful for many workflows. - You can write to memory-mapped memory without changing the file's content if you load the pages with private pages. However, writing content to the memory region will trigger a page fault, and the operating system kernel will allocate new pages for your process.
/*
BAD. fseek(3) and fread(3) to load file
https://stackoverflow.com/questions/174531/how-to-read-the-content-of-a-file-to-a-string-in-c
*/
char * buffer = 0;
long length;
FILE * f = fopen (filename, "rb");
if (f)
{
fseek (f, 0, SEEK_END);
length = ftell (f);
fseek (f, 0, SEEK_SET);
buffer = malloc (length);
if (buffer)
{
fread (buffer, 1, length, f);
}
fclose (f);
}
if (buffer)
{
// start to process your data / extract strings here...
}
Libraries like fast_io
provide direct support for file loading and handling all the corner cases for you, including transparent support for platforms that do not offer memory mapping (like web assembly). The memory is writable.
/*
Good! native_file_loader with memory mapping
https://github.com/cppfastio/fast_io/blob/master/examples/0006.file_io/file_loader.cc
*/
#include<fast_io.h>
int main()
{
fast_io::native_file_loader loader(u8"a.txt");
//This will load entire a.txt to memory through memory mapping.
/*
This is a contiguous range of the file.
You can do these things:
std::size_t sum{};
for(auto e:loader)
sum+=e;
*/
}
- Overreliance on
std::unique_ptr
encourages overuse of OOP, perpetuating the same problems as before. - According to Google, using
std::unique_ptr
can cause significant performance inefficiencies, with potential improvements of up to 1.6% observed on certain large server macrobenchmarks. This has also resulted in slightly smaller binary sizes. For more information, refer to the libc++ documentation on Enablestd::unique_ptr [[clang::trivial_abi]]
. This indicates thatstd::unique_ptr
is highly inefficient when considering micro-level performance. - When using
std::unique_ptr
for pimpl, it is not recommended due to the compilation speed issue. Instead, module usage can help address the issue. Additionally, it is important to note thatstd::unique_ptr
is not ABI-stable, which is caused by the ABI bug in the previous point. - While
std::unique_ptr
helps with memory leaks, it cannot fix issues like type-confusions or vptr-injection. - Using
std::unique_ptr
to manage resources beyond memory is generally ineffective since APIs often do not support specific types like Unix file descriptors or SQLite3. Writing a class to wrap these resources is preferable to address all related issues, such as ignoring error codes. - Compilation speeds may suffer due to the inclusion of the
<memory>
header. - Overuse of
nullptr
may occur when relying onstd::unique_ptr
to represent empty states. - The deleter of
std::unique_ptr
can lead to inefficiencies, and it is challenging to avoid unintended performance degradation, regardless of whether the deleter is a lambda, function pointer, or function object. - Before using
std::unique_ptr
to make non-movable objects movable, it is important to question why the type is unmovable in the first place. - For
std::unique_ptr<T[]>
, the[]
can be easily forgotten, leading to undefined behavior. - Implementing data structures with
std::unique_ptr
is always incorrect and may cause stack overflow. - While
std::unique_ptr
is partially freestanding in C++23, it is not useful sincestd::make_unique
is not freestanding andnew
is not guaranteed to be available. - Most importantly,
std::unique_ptr<T>
lacks type richness, making it unclear whatstd::unique_ptr<shape>
signifies. WritingT
instead ofstd::unique_ptr<T>
would provide more clarity and meaning.
In general, std::unique_ptr
is not a smart pointer but rather a questionable pointer type that can be harmful.
C++ exception is probably the largest issue for portability. Even Linus Torvalds complaint about C++ EH before.
The issue with C++ exception handling goes beyond just portability. The reliance on operating system support and hosted C++ features makes it difficult to use in various contexts, including embedded systems and operating system development. This limitation can pose significant challenges for developers who want to write portable code.
The lack of support for exceptions on certain architectures, such as AVR and wasm, is another major challenge. While some tools, such as wasm2lua, allow for the compilation of C++ code to other languages, implementing C++-style exception handling in these contexts can be a difficult and performance-intensive process.
Even on platforms that do support exceptions, the performance hit can be significant. The SJLJ exception, a common implementation for many platforms, can slow down both happy and slow paths. This can make exception handling impractical for performance-critical applications.
Additionally, not every architecture supports C++ exceptions, and even when they do, the implementation difficulty can be significant. The need for heap memory in exception handling, as well as the enormous runtime of C++ exceptions (typically around 100kb), can make it unacceptable for many embedded systems, where binary size and memory usage are critical factors.
Ultimately, the bloat in binary size caused by C++ exceptions is yet another challenge faced by developers in many contexts, including embedded systems. With all of these issues in mind, it is important for developers to carefully consider the use of C++ exception handling in their code and to explore alternative error-handling mechanisms where appropriate.
- Many platforms do not implement table-based exception handling models, which are required for C++ exception handling.
- C++ exceptions can negatively impact compiler optimizations, regardless of whether the exception handling model is table-based or SJLJ-based.
- C++ exceptions can hurt memory locality, including TLB, BTB, cache, and page locality, regardless of the exception handling model.
- C++ exception handling does not work well with ASLR (Address Space Layout Randomization) and PIC (Position Independent Code), as loaders must relocate all exception table pointers, which can break ABIs. Recent security papers have discussed ways to make this work with PIC, but it may still break ABIs.
- C header files often do not correctly mark their functions with the "noexcept" specifier, creating not only performance issues but also technically violating the One-Definition Rule (ODR). Calling C APIs that are not "noexcept"-marked results in undefined behavior, regardless of the exception handling model.
- Throwing C++ exceptions can be extremely costly, potentially taking 200-300 times longer than a syscall instruction on x86_64.
- C++ exceptions do not work well in multithreaded systems, which has become a significant problem as more and more software becomes multithreaded. The paper C++ exceptions are becoming more and more problematic provides more information on this issue.
- Statistics show that 95% of exceptions are due to programming bugs, and these cases should be dealt with using assertions or even "std::terminate" rather than throwing exceptions, as this can result in improved exception-safety and performance.
C++ Exceptions are widely considered to be a significant pain point in the language, with few redeeming qualities. However, some members of the C++ community see hope in proposals like Herb Sutter's P0709R0. For more information, you can watch his video presentation here: De-fragmenting C++: Making Exceptions and RTTI More Affordable and Usable - Herb Sutter CppCon 2019
Avoid using iostream and stdio as they come with many pitfalls.
Even toolchain vendors like LLVM libcxx cannot build without stdio, which makes bootstrapping difficult.
The behavior of stdio and iostream can change randomly with locale settings, making a program that works on one machine fail on another.
Changes to locale settings at runtime can cause thread-safety issues with stdio and iostream. Additionally, iostream does not have any built-in thread awareness or locking mechanisms, leading to potential issues when multiple threads access the stream.
The printf
family of functions are powerful tools for formatting and printing output to the console, but they can also be a source of serious security vulnerabilities if used improperly. One of the most common ways that printf functions can be misused is through format string attacks, which occur when an attacker is able to inject format specifiers into a printf call, causing the program to output data from the stack or other sensitive areas of memory.
To avoid these kinds of vulnerabilities, it's important to always use printf functions with caution. This means carefully validating and sanitizing any input that will be used in a printf call, and avoiding the use of format strings that can be controlled by an attacker. Additionally, it's important to always use the correct format specifiers for the data being printed, and to avoid using the %n specifier, which can be used to write to arbitrary memory locations.
Overall, while printf functions can be a powerful and useful tool, it's important to use them with care and attention to detail to avoid introducing security vulnerabilities into your code.
inline void foo(std::string const& str)
{
//DANGER! format string vulnerability
printf(str.c_str());
//DANGER ignore the return value of printf or scanf
}
One of the primary issues with using std::endl
is the impact it can have on performance. Since std::endl
flushes the output buffer every time you insert a new line character, it can be inefficient if you are printing a lot of output. This can slow down your program and make it less responsive, especially if you are working with large data sets or running your program on a slow machine.
Another problem with using std::endl
is that it can be redundant. C++ output streams include a mechanism called "tie" that automatically flushes the output buffer when necessary, such as when the buffer is full or when the program terminates. This means that using std::endl to manually flush the buffer is often unnecessary and can even interfere with the automatic flushing mechanism.
To avoid these issues, it's generally better to rely on the automatic flushing mechanism built into C++ output streams and avoid using std::endl
altogether. If you do need to flush the buffer at a specific point in your code, you can use the flush()
method on the stream object itself. This method only flushes the buffer and does not insert a new line character, which can improve performance in cases where you do not need to print a new line.
If you want to print your output immediately without buffering, you can use unbuffered streams. However, the C++ stream library does not provide robust support for unbuffered streams, so you may need to use a third-party IO library like fast_io
to achieve this.
int8_t *p{};
std::cout<<p; // DANGER! iostream overloads treats p as char* and this will treat p as a string.
Unfortunately, random pitfalls like that are everywhere for iostream.
It is recommended to use the fast_io
library over stdio
and iostream
for improved performance and additional features. Unlike stdio
and iostream
, fast_io is not plagued with issues such as buffering and provides more comprehensive functionality.
Furthermore, fast_io
offers a deep understanding and integration of stdio
and iostream
, exposing more internal details that are not readily available with these standard libraries.
Perhaps the most significant advantage of using fast_io
is the dramatic increase in speed. fast_io
is generally 10-100 times faster than stdio
and iostream
due to the minimal redundant work that it performs in comparison. This enhanced performance can be particularly beneficial for applications that require a large amount of data processing or need to handle data in real-time.
It is important to avoid assuming that write(2)
or read(2)
system calls will always perform "real IO". In many cases, the operating system kernel caches files in memory to allow for fast file reads and writes without blocking the process. As a result, write(2)
and read(2)
often function similarly to memcpy()
and may not perform actual IO in the traditional sense. The data will eventually be flushed out to disk, but this process may not happen immediately.
As a consequence, IO operations typically do not take much time, with most of the time being spent on the overhead associated with the abstractions provided by iostream and stdio. It is therefore important to understand the underlying mechanisms involved in IO operations and to avoid making assumptions that can lead to inefficiencies or unexpected behavior.
It is important to avoid assuming that the performance of iostream and stdio is consistent across different platforms. Different operating systems and toolchains may provide different implementations of these libraries, resulting in significant performance gaps between platforms. As a result, it is important to test and measure the performance of IO operations on each platform to ensure optimal performance.
Moreover, while iostream and stdio are standard libraries and offer cross-platform compatibility, they may not be the fastest options for IO operations. As mentioned before, the fast_io
library can often outperform these libraries by a significant margin due to its optimization for IO operations and reduced overhead. Therefore, it is worth considering alternative options such as fast_io
for IO-intensive applications or performance-critical scenarios.
All C++ standard library implementations (libstdc++, libcxx, and msvc stl) implement <iostream>
using <stdio.h>
.
All implementations of <iostream>
such as libstdc++, libcxx and msvc stl, use <stdio.h>
in their implementation. This can be confirmed by examining the source code of each implementation. Due to this fact, it may not make sense to use iostream since it simply wraps around stdio.
Iostream can significantly increase binary size due to its object-oriented design, making it unsuitable for embedded systems. Toolchain vendors rarely optimize iostream for such systems, resulting in a typical implementation that can cost up to 1MB of binary size. Additionally, iostream needs to include all of stdio's code, making it even more bloated. In such cases, it is better to use stdio instead of iostream.
std::filesystem
is not thread-safe due to its locale-aware nature. Moreover, its design is outdated and does not meet the criteria of POSIX 2008. The lack of available at()
methods for std::filesystem
means you will always face potential TOCTOU security vulnerabilities. The extensive use of locales also results in significant bloat, adding up to 1MB in some cases.
In addition, as pointed out in Herb Sutter's P0709, std::filesystem
creates dual error reporting issues.
//https://en.cppreference.com/w/cpp/filesystem/copy_file
bool copy_file( const std::filesystem::path& from,
const std::filesystem::path& to );// Return bool but this api always returns true
bool copy_file( const std::filesystem::path& from,
const std::filesystem::path& to,
std::error_code& ec ); // NOTICE!!! NOT NOEXCEPT
Overall, C++17's std::filesystem is not a good API to use and should be avoided.
It's highly recommended to avoid using any feature that uses locale internally, especially the <cctype>
header.
The <cctype>
header is notoriously problematic. It's slow, not thread-safe, and can create undefined behavior at random. Some C++ books suggest using functions like isupper(3)
to check whether a character is uppercase, but this is a terrible recommendation for several reasons:
- The
isupper(ch)
function assumes ch is in the range [0,127], which means undefined behavior can be triggered randomly if checks aren't done beforehand. - The function is locale-aware, and locale is not thread-safe. This can lead to issues.
- Due to its locale usage, the function does not always provide deterministic results.
- The function is very slow, and in many implementations, even a trivial task like this would be a DLL indirect call. This can result in a significant performance hit, especially on platforms like Windows where it can cause a 100x performance downgrade.
- The function is neither constexpr nor noexcept, making it a poor choice.
- The function is not generic and only works for
char
. It does not work forchar8_t
orchar16_t
, for example. - Additionally, the
<cctype>
header is not freestanding, which means it cannot be used in certain contexts.
In summary, it's best to avoid using the header and other locale-dependent features to ensure better performance, thread safety, and determinism in your code.
char ch{};
if(isupper(ch))//BAD
puts("Do something\n");
template<std::integral char_type>
inline constexpr bool myisupper(char_type ch) noexcept
{
return u8'A'<=ch&&ch<=u8'Z';//simple and naive implementation
// Ok for ascii based execution charset.
// Need to do more work for ebcdic and big endian wchar_t.
// We ignore those cases here if you do not use them.
// They should not matter for 99.999999% of code.
}
char ch{};
if(myisupper(ch))//Mostly ok
puts("Do something\n");
The fast_io
library offers functions that work with wchar_t with big endian, or even with execution charsets like EBCDIC. Moreover, the library is freestanding, making it a great choice for use in a wide variety of environments.
char ch{};
if(fast_io::char_category::is_c_upper(ch))//ok
print("Do something\n");
When choosing integer types, it's generally recommended to use ::std::size_t
as the default type rather than int. While int is a commonly used type, the C++ standard does not define how large sizeof(int)
is, which can lead to issues.
On the other hand, ::std::size_t
is defined by the C++ standard to be an unsigned integer type that is guaranteed to be able to represent the size of any object that can be allocated in memory. Using ::std::size_t
as the default type can help ensure portability and prevent unexpected errors due to integer overflow.
In summary, it's best to use ::std::size_t
as the default integer type in your code to ensure better portability and prevent potential issues arising from undefined behavior.
//DANGER: This is potentially disastrous as it results in undefined behavior when vec.size() exceeds INT_MAX.
for(int i{};i!=vec.size();++i)
{
}
Exceptions: APIs that use int
, such as int main()
, for example.
Avoid assuming that char
, wchar_t
, char8_t
, char16_t
, and char32_t
are solely character types, as they are actually integer types.
Avoid assuming that char
, wchar_t
, char8_t
, char16_t
, and char32_t
are character types. While they may seem like character types, they are actually integer types, and making assumptions otherwise can lead to undefined behavior. For example, the way C++ iostream handles these types can cause issues if you assume they are purely character types.
//libc might define int8_t as char
int8_t i{},*p{__builtin_addressof(i)};
std::cout<<p;//DANGER!!!
std::cout<<std::format("{}\n",p);//DANGER!!!
When working with wchar_t, it's important to be aware of its endianness. It's possible for the endianness of wchar_t to differ from that of your native machine. For instance, wchar_t may use the UTF32BE execution encoding, while your machine uses little endian. This means that any operations you perform on wchar_t may require swapping its endianness first. Other character types, such as char16_t and char32_t, don't have this issue, as their endianness always matches that of your machine.
It's recommended to use integer types in <cstdint>
instead of basic integer types like int
. The C++ standard doesn't specify the size of sizeof(T)
for short
, int
, long
, and long long
, so relying on these types can lead to unexpected behavior.
It's best to use ::std::(u)int_leastxx_t
rather than ::std::(u)intxx_t
, as the latter types are optional and may not exist. They are used when a single byte on certain architectures isn't 8 bits, which can save money for embedded systems. For maximum portability, always use ::std::(u)int_leastxx_t
. It's also recommended to use the INTXX_C()
and UINTXX_C()
macros to define constants for ::std::(u)int_leastxx_t
, as this makes working with these types much easier. It's worth noting that despite their names, these macros are used to define constants for ::std::(u)int_leastxx_t
, not ::std::(u)intxx_t
.
These types only exist for 64-bit targets, and even then, they don't generate efficient code for compilers. It's better to unpack __uint128_t
into two ::std::uint_least64_t
.
The reason to avoid using ::std::uintmax_t
and ::std::intmax_t
is because they can cause issues with ABI (Application Binary Interface) stability in C and C++.
ABI stability refers to the ability of a library or program to maintain compatibility with other libraries or programs that use it, even when changes are made to the implementation details of that library or program.
In the case of ::std::uintmax_t
and ::std::intmax_t
, using these types can cause ABI issues because their size can vary depending on the platform and compiler being used. This can lead to compatibility issues when trying to use a library or program that was compiled with a different size for these types.
To avoid these issues and ensure compatibility, it's best to use fixed-size integer types like ::std::(u)int_leastxx_t
instead.
In portable code, it is best to avoid assuming the existence of floating point types. There are several reasons for this:
- Floating point registers might not be saved by the kernel, which could result in undefined behavior if they are used.
- Soft floating point implementations can be very slow on hardware that lacks floating point arithmetic support.
- The use of
<cmath>
APIs can lead to issues withmath_errhandling
. - If you need to treat floating point types as integers, use
std::bit_cast
instead of pointer tricks, which can violate the strict-aliasing rule.
The strict aliasing rule is a rule in the C and C++ programming languages that states that a pointer of one type cannot be dereferenced as a pointer of a different type. In other words, it is not allowed to access the same memory location through two different pointers with different types, except for a few specific cases defined by the standard.
The strict aliasing rule is important because it allows the compiler to make optimizations based on the assumption that pointers of different types do not refer to the same memory location. Violating the strict aliasing rule can result in undefined behavior, such as crashes, incorrect results, or other unexpected behavior.
For more information on the strict-aliasing rule, see: What is the Strict Aliasing Rule and Why do we care?
//BAD
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
//GOOD
#include <bit>
#include <limits>
#include <cstdint>
constexpr float Q_rsqrt(float number) noexcept
{
static_assert(std::numeric_limits<float>::is_iec559); // (enable only on IEEE 754)
float const y = std::bit_cast<float>(
0x5f3759df - (std::bit_cast<std::uint32_t>(number) >> 1));
return y * (1.5f - (number * 0.5f * y * y));
}
When you need to use SIMD (Single Instruction Multiple Data) in your code, it's generally better to use GCC and Clang's vector extensions instead of the <immintrin.h>
library.
The vector extensions are more fundamental and more flexible, and they work on a variety of platforms, including wasm. You can find more information about the vector extensions in the GCC documentation.
It's worth noting that some types of SIMD data, such as _m128
, are actually implemented using vector extensions on GCC and Clang. This means that using the <immintrin.h>
library to work with these types may not provide any additional benefits and may add unnecessary complexity to your code.
In general, using the vector extensions provided by GCC and Clang is a simpler and more portable way to incorporate SIMD operations into your code.
When working with threads, it's important to ensure that your code is platform-agnostic. This means that you should avoid making assumptions about the underlying platform and instead write code that is compatible with multiple platforms.
Spin locks are a synchronization primitive used in multithreaded programming to ensure that only one thread is accessing a shared resource at a time. When one thread acquires a spin lock, other threads that try to acquire the same lock will spin in a loop, waiting for the lock to be released.
However, spin locks can cause problems in certain contexts. For example, if a thread holding a spin lock is preempted by the kernel scheduler, other threads waiting for the lock will spin indefinitely, consuming CPU resources and potentially causing the system to become unresponsive.
This issue was famously highlighted by Linus Torvalds, the creator of the Linux operating system, in a rant from 2007. In his post, Torvalds criticized the use of spin locks in the kernel and argued that they should be replaced with other synchronization primitives that were less likely to cause scheduling problems.
While spin locks can be useful in certain situations, it's important to use them judiciously and be aware of their potential downsides. In particular, in contexts where preemptive scheduling is used, it's generally better to use other synchronization primitives, such as mutexes or semaphores, that are less likely to cause scheduling problems.
See Linus Torvalds' rant. No nuances, just buggy code
Consider using type-erasure instead of Object-Oriented Programming (OOP). Herb Sutter's Metaclasses can simplify the use of type-erasure and eliminate the need for OOP in most cases.
Herb Sutter's Metaclasses proposal aims to extend the C++ language with a new feature that allows programmers to define custom language extensions, such as type erasure, in a more efficient and safer way. Metaclasses provide a way to generate code at compile-time based on user-defined specifications, without the need for macros or external tools.
With metaclasses, it might become easier to use type erasure in C++ and, in turn, make OOP less necessary in certain situations. However, metaclasses are not yet part of the C++ standard, and their exact design and implementation may change in the future.
Ensuring code reliability and security is a critical aspect of software development. To achieve this, there are several important practices that developers should follow. Firstly, it is recommended to always run code with sanitizers. This is because sanitizers can detect a wide range of issues such as memory errors, undefined behavior, data races, and more.
Another important practice is fuzzing. Fuzzing involves generating random inputs to a program and monitoring for unexpected behaviors or crashes. This can help uncover bugs or security vulnerabilities that may not be immediately apparent during development.
In terms of efficient bounds checking, it is recommended to use macros like _GLIBCXX_ASSERTIONS
instead of the at()
method for performing bounds checking on the entire C++ standard library. This can significantly improve performance and reduce overhead. See: How to make "modern" C++ programs safer
Finally, memory tagging is a powerful tool for defending against memory safety bugs. By adding tags to memory allocations, developers can detect buffer overflows, use-after-free errors, and other common memory safety issues. This can help prevent exploits and improve overall program stability.
It's recommended to include the following flags when compiling with GCC and Clang:
-Wall -Wextra -Wpedantic -Wmisleading-indentation -Wunused -Wuninitialized -Wshadow -Wconversion
here is an extended explanation of each flag:
-Wall: Enables all warnings that are deemed safe enough to be enabled by default. However, this does not enable all warnings that GCC or Clang is capable of generating.
-Wextra: Enables even more warnings, including some that are not enabled by -Wall, such as warnings about uninitialized variables and unused function parameters.
-Wpedantic: Enables warnings about non-standard code, as defined by the relevant language standard. This can be useful for ensuring portability of code between different compilers and platforms.
-Wmisleading-indentation: Warns about possible misleading indentation, which can lead to code that is difficult to read and understand.
-Wunused: Warns about unused variables, functions, and other entities. This can help identify code that is no longer needed, or code that has not yet been completed.
-Wuninitialized: Warns about using uninitialized variables. This can help catch potential bugs that might cause undefined behavior.
-Wshadow: Warns about variables that shadow other variables with the same name in an outer scope. This can help avoid confusion and unintended behavior.
-Wconversion: Warns about implicit conversions that might result in loss of data or precision. This can help catch potential bugs and improve code quality.
In general, you should think about things like CYGWIN/MSYS2 exist. wine-gcc will define _WIN32 macros for POSIX compliant hosted GCC compilers, which are incorrect on Linux, FreeBSD, etc. That is why we should also exclude __WINE__
Reasons:
- Libraries like Boost may include the header as <Windows.h>, causing issues for cross-compilation since UNIX filesystems are case-sensitive. This can lead to compiler errors when GCC or clang fails to find the header.
- Windows APIs are not correctly marked as noexcept.
- The header can slow down the compilation process significantly.
- It introduces macros like min and max which can conflict with the C++ standard libraries, leading to unexpected behavior.
//BAD:
#pragma once
#include<Windows.h>
If you want to use win32 or nt APIs, import them by yourself. NEVER USE extern "C" to import APIs.
#pragma once
//Wrong! Here is the wrong way to import APIs.
namespace mynamespace::nt
{
struct io_status_block
{
union
{
std::uint_least32_t Status;
void* Pointer;
} DUMMYUNIONNAME;
std::uintptr_t Information;
};
using pio_apc_routine = void (*)(void*,io_status_block*,std::uint_least32_t) noexcept;
#if defined(_MSC_VER) && !defined(__clang__)
__declspec(dllimport)
#elif __has_cpp_attribute(__gnu__::__dllimport__)
[[__gnu__::__dllimport__]]
#endif
extern "C" std::uint_least32_t __stdcall ZwWriteFile(void*,void*,pio_apc_routine,void*,io_status_block*,
void const*,std::uint_least32_t,std::int_least64_t*,std::uint_least32_t*) noexcept;
}
#include<windows.h>
//BOOM!! Compiler complains.
//GOOD:
#pragma once
namespace mynamespace::nt
{
struct io_status_block
{
union
{
std::uint_least32_t Status;
void* Pointer;
} DUMMYUNIONNAME;
std::uintptr_t Information;
};
using pio_apc_routine = void (*)(void*,io_status_block*,std::uint_least32_t) noexcept;
#if defined(_MSC_VER) && !defined(__clang__)
__declspec(dllimport)
#elif __has_cpp_attribute(__gnu__::__dllimport__)
[[__gnu__::__dllimport__]]
#endif
extern std::uint_least32_t __stdcall ZwWriteFile(void*,void*,pio_apc_routine,void*,io_status_block*,
void const*,std::uint_least32_t,std::int_least64_t*,std::uint_least32_t*) noexcept
#if defined(__clang__) || defined(__GNUC__)
#if SIZE_MAX<=UINT_LEAST32_MAX &&(defined(__x86__) || defined(_M_IX86) || defined(__i386__))
#if !defined(__clang__)
__asm__("ZwWriteFile@36")
#else
__asm__("_ZwWriteFile@36")
#endif
#else
__asm__("ZwWriteFile")
#endif
#endif
;
}
/*
Ignore the part that deals with name mangling for Visual C++.
See here:
https://github.com/trcrsired/fast_io/blob/master/include/fast_io_hosted/platforms/win32/msvc_linker.h
*/
For Windows programming, it's important to know which API to call depending on the version of the operating system. In general, it's recommended to use the W APIs on modern versions of Windows, such as Windows NT-based systems, including Windows 10.
However, on older versions of Windows, such as Windows 95/98 and ME, only the A APIs are available. The W APIs do exist, but they don't do anything on these systems.
The problem with using the A APIs on modern versions of Windows is that they are influenced by the Windows Locale, which can cause issues. In contrast, all NT APIs are unicode APIs, so using the W APIs is generally safer and more portable.
To avoid issues with the wchar_t type and execution charset, it's recommended to import the APIs using C++'s char8_t and char16_t types instead. This can help ensure that your code is more portable and less likely to encounter issues related to character encoding and localization.
//Use if constexpr to trivialize the API calls.
namespace mynamspace
{
enum class win32_family
{
ansi_9x,
wide_nt,
#ifdef _WIN32_WINDOWS
native = ansi_9x
#else
native = wide_nt
#endif
};
template<typename... Args>
inline void* my_create_file(Args... args) noexcept
{
if constexpr(win32_family::native==win32_family::ansi_9x)
{
//Call A apis for Windows 95
return ::mynamespace::win32::CreateFileA(args...);
}
else
{
//Call W apis for Windows NT
return ::mynamespace::win32::CreateFileW(args...);
}
}
}
This rule applies to CRT APIs, including fopen(3) _fdopen(3). You should _wfdopen for windows NT (Yes, this includes Windows 10). Better use them with win32 or NT APIs.
Windows does provide file descriptors, and the Windows CRT actually implements file descriptors using Win32 HANDLE internally. So, although the programming interface to access file descriptors on Windows is different from that on UNIX-based systems, file descriptors can still be used in Windows programs.
See the API: _open_osfhandle
Avoid using std::unique_ptr for managing win32 HANDLE resources. While it may seem convenient to use C++ smart pointers, such as std::unique_ptr, it can lead to undefined behavior and other issues. Instead, it is recommended to write an RAII class that properly wraps the HANDLE resource.
This ensures that the HANDLE resource is properly managed and released when it is no longer needed, without relying on the behavior of smart pointers that may not be compatible with HANDLE.
/*BAD!!!
Watch the video:
https://youtu.be/5vGWM9DLrko?t=1205
*/
std::unique_ptr<HANDLE,std::function<decltype(CloseHandle)>> pHandle(hEvent,CloseHandle);
//Good! Just write a class
win32_file file(u8"a.txt");
Windows targets in GCC support three different threading ABIs, including win32, posix, and mcf. Only posix provides the <pthread.h>
header, which may not be available or used by users who opt for win32 or mcf. Relying on <pthread.h>
can cause your compilation to break and may also break the C++ standard library, which does not provide <thread>
and <mutex>
headers for win32. For instance, LLVM libc++ makes this mistake, which is why it is not available for windows targets. Instead of assuming the presence of <pthread.h>
, consider using other threading mechanisms that are available on Windows targets.
//BAD!!
#include<pthread.h>
//BAD!!
#include<thread>
//BAD!!
#include<mutex>
//BAD!!
#include<threads.h>
-
For cpu-windows-gnu clang triple targets, it's best to avoid using thread_local and _Thread since clang does not correctly implement GCC's ABI for Windows.
-
Be aware of the ABI differences between win32, posix, and mcf of GCC libstdc++-6.dll. Linking to the wrong C++ standard library runtime can cause iostream to break.
-
Testing and benchmarking your Windows applications on WINE, a compatibility layer that runs Windows apps on Linux, can help avoid issues like ABIs. To include C++ standard library DLLs like libstdc++-6.dll, remember to export the WINEPATH environment in $HOME/.bashrc.
The purpose of the inline
keyword in C++ is to prevent ODR (One Definition Rule) violations, not for providing hints or expansions. However, the keyword's usage is often confusing due to its different meanings compared to C. If a function with the same signature is defined in multiple translation units, the linker will fail to link them. When a function is marked as inline, the linker can discard any functions with the same signature and keep only one copy, assuming they work the same. However, if they don't work the same, it results in undefined behavior. Additionally, marking a function as inline allows GCC and clang to avoid emitting the function unless it's used, which is important for preventing dead code. For a better understanding of inline, watch the video at What does the keyword "inline" REALLY mean in C++??
and refer to this article A noinline inline function? What sorcery is this?
.
To ensure maximum portability, it is recommended to build and test your code on as many GCC cross/canadian toolchains as possible.
This will help you identify potential issues early on and allow you to develop a more robust and portable codebase. By testing on various platforms, you can ensure that your code behaves consistently across different systems and architectures, thus minimizing the risk of unexpected behavior and errors.