Fast integer to string conversion in C++

In this post I compare the performance of several methods of integer to string conversion in C++:

  1. sprintf
  2. std::stringstream
  3. std::to_string from C++11
  4. boost::format from the Boost Format library
  5. boost::lexical_cast
  6. karma::generate from the Boost Spirit Parser framework
  7. fmt::Writer from the fmt library
  8. fmt::format from the fmt library
  9. Public-domain ltoa implementation
  10. decimal_from function suggested by Alf P. Steinbach
  11. fmt::FormatInt from the fmt library
  12. strtk::type_to_string from the strtk library

To measure the performance I used a benchmark from Boost Karma. This benchmark generates 10,000,000 random integers and converts them to strings with each method, measuring the conversion time. I’ve replaced the non-portable itoa with sprintf and added the std::to_string, boost::lexical_cast, fmt::Writer and fmt::format methods.

Apart from adding new conversion methods, I’ve also noticed that the benchmark used an unnecessary conversion to std::string in the sprintf and karma::generate tests to compensate for string operations in other tests. To get more useful results, I’ve split every such test in two: one that converts to std::string and one that doesn’t. Tests that do the unnecessary conversion to std::string have the suffix +std::string. They are suboptimal, but I’ve included them for reference.

Here are the results ordered by the time it took a method to convert 10,000,000 integers to strings (obviously smaller is better); time ratio is the ratio of conversion time to the best time:

I consider these results pretty exciting. First they show that fmt::Writer is the fastest (was the fastest, see the updates at the bottom of the post) of the tested methods, almost 40% faster than karma::generate, the next contender. Here’s the code used to convert an integer n to a string using fmt::Writer:

fmt::Writer w;
w << n;
// The result can be converted to std::string using w.str() or
// accessed as a C string using w.c_str().

Note that fmt::Writer automatically allocates enough space to hold the formatted output, unlike sprintf and karma::generate, which use a preallocated buffer. In the case of karma::generate you can probably use another output iterator, but the performance is likely to be lower.

Another remarkable and surprising (to me) thing about the results is that sprintf is not particularly fast for integer formatting. It has about the same performance as std::stringstream, which makes it about 6 times slower than fmt::Writer. One possible reason for this is that sprintf parses the format string, but so does fmt::format, which is two times faster than sprintf. Anyway, the good thing is that you don’t have to use sprintf even for performance reasons. There are methods, even in the standard library, that are much faster, or at least equally slow but safer.
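For reference, here is a minimal sketch of the three standard-library approaches compared above. The function names are hypothetical and not part of the benchmark, and I use std::snprintf rather than plain sprintf so that the write is bounded by the buffer size.

```cpp
#include <cstdio>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical wrapper: formats into a preallocated stack buffer.
std::string with_snprintf(int n) {
  // Room for all decimal digits of an int, a sign and the terminator.
  char buffer[std::numeric_limits<int>::digits10 + 3];
  std::snprintf(buffer, sizeof buffer, "%d", n);
  return buffer;
}

// Hypothetical wrapper: formats through a stream, which allocates as needed.
std::string with_stringstream(int n) {
  std::ostringstream os;
  os << n;
  return os.str();
}

// Hypothetical wrapper: the C++11 one-liner.
std::string with_to_string(int n) {
  return std::to_string(n);
}
```

All three return the same text for the same input; only the snprintf variant needs a buffer size to be chosen by the caller.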

The benchmark results were obtained on Ubuntu 13.04 with GCC 4.7.3 and the following compiler flags: -O3 -DNDEBUG -std=c++11.

Running the benchmark:

$ git clone --recursive https://github.com/vitaut/format.git
$ cd format
$ cmake .
$ make
$ cd format-benchmark
$ ./int-generator-test.py

You can find out more about fmt::Writer and fmt::format in the fmt library repository on GitHub and in the documentation.

Update: Since I don’t have ltoa on my platform, I’ve added a basic public-domain implementation of this function from here. Let me know in the comment section if there is a better version available somewhere.

Update 2: Added the decimal_from function suggested by Alf P. Steinbach. It has approximately the same performance as fmt::Writer; the difference of 0.5% is probably less than the measurement error. In some runs it is even marginally faster. Like sprintf and ltoa, and unlike fmt::Writer, it requires a preallocated buffer.

Update 3: Inspired by a lesson learned from Alexandrescu’s talk that “no work is less work than some work” I’ve come up with a faster method of integer to string conversion. Unlike other methods it does one pass over the digits. All other methods I know do two passes and can be divided into two categories:

  1. Count digits (pass 1), then convert digits to chars writing from the end of the buffer (pass 2).
  2. Convert digits to chars writing from the beginning of the buffer (pass 1). Reverse the string in the buffer (pass 2).

Instead of doing this, I just convert digits to chars writing from the end of the buffer and return the pointer to the start of the converted string. In most cases there is some space left at the beginning of the buffer, but that’s fine because the same is true for the second category of methods above; they just have this space at the end of the buffer. This avoids unnecessary copying within a buffer that is discarded anyway.

I’ve implemented this method in the fmt::FormatInt class which can be used as follows:

fmt::FormatInt(42).str();   // convert to std::string
fmt::FormatInt(42).c_str(); // convert and get as a C string
                            // (mind the lifetime, same as std::string::c_str())

I’ve updated the test results and as you can see it is about 30% faster than the previous winner, fmt::Writer.

Update 4:

Added strtk::type_to_string as suggested in the comments.

Update 5:

Added side effects to make sure that the code being tested is not optimized away by a super clever compiler (I wish there existed one). This is implemented by computing the sum of the lengths of all formatted strings using strlen. To make sure the same extra computation is done for all methods, strlen is used even in cases where std::string::size could be used instead. Note that since this adds a more or less constant cost to every method, high performers are penalized more in relative terms.
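The idea can be sketched like this. It is not the actual benchmark code: the names make_input, Result and time_snprintf are hypothetical, and the choice of random number generator and clock is my assumption for the illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <limits>
#include <random>
#include <vector>

// Hypothetical helper: generates `count` random integers from a fixed seed.
std::vector<int> make_input(std::size_t count, unsigned seed) {
  std::mt19937 gen(seed);
  std::uniform_int_distribution<int> dist(std::numeric_limits<int>::min(),
                                          std::numeric_limits<int>::max());
  std::vector<int> input(count);
  for (int& n : input) n = dist(gen);
  return input;
}

struct Result {
  double seconds;            // wall-clock time of the conversion loop
  std::size_t total_length;  // side effect that keeps the loop alive
};

// Times one method (snprintf here) and sums the lengths of its output.
Result time_snprintf(const std::vector<int>& input) {
  char buffer[32];
  std::size_t total = 0;
  auto start = std::chrono::steady_clock::now();
  for (int n : input) {
    std::snprintf(buffer, sizeof buffer, "%d", n);
    total += std::strlen(buffer);  // same extra work for every method
  }
  auto stop = std::chrono::steady_clock::now();
  return {std::chrono::duration<double>(stop - start).count(), total};
}
```

The caller should print or otherwise use total_length; a result that is never observed could still be discarded by the optimizer.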

Update 6:

Fixed links to the fmt library (formerly C++ Format).
