2012-04-27

Speedy integer to string conversion in C++

Test

This is a log entry about playing around with the question. Robust integer-to-string conversions are already implemented out there, for example in FastFormat. Good for them. Let's see how I fared on my own playground.

I measured the speed of sprintf(), snprintf(), std::stringstream and my own implementation. I tried to make most of the decisions at compile time. I didn't bother to write time measurement into the program; instead I run the time command on four different builds of the program, selected with macros.

#include <iostream>
#include <sstream>   // std::stringstream, used by TEST_CASE 4
#include <cstdio>
#include <limits>
#define __STDC_FORMAT_MACROS
#include <inttypes.h>

#ifndef TEST_CASE
#define TEST_CASE 1
#endif
#ifndef TESTCOUNT
#define TESTCOUNT (100*1000*1000)
#endif
#ifndef INTTYPE
#define INTTYPE uint64_t
#define INTFMT PRIu64
#endif

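// Writes the decimal digits of i backwards, ending just before endPtr.
// The caller owns the buffer and the terminating NUL; the return value
// points to the first digit.
template<typename Integral>
char *myUIntToStr(Integral i, char *endPtr)
{
    char *&cursor(endPtr); // alias: the by-value parameter doubles as the write cursor
    if(!i)
    {
        // zero has no digits for the loop below to emit
        *--cursor='0';
        return cursor;
    }
    for(;i;i/=10)
    {
        *--cursor = char(i%10+'0'); // least significant digit first
    }
    return cursor;
}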

int main()
{
    const size_t BUFSIZE = std::numeric_limits<INTTYPE>::digits10 + 2;
    char buffer[BUFSIZE];
    buffer[BUFSIZE-1] = 0;
#if TEST_CASE == 1
    for(INTTYPE i = 0; i<TESTCOUNT; ++i)
    {
        sprintf(buffer,"%" INTFMT,i);
    }
#endif
#if TEST_CASE == 2
    for(INTTYPE i = 0; i<TESTCOUNT; ++i)
    {
        snprintf(buffer,BUFSIZE,"%" INTFMT,i);
    }
#endif
#if TEST_CASE == 3
    char * const endPtr = buffer + BUFSIZE - 1;
    char * output;
    for(INTTYPE i = 0; i<TESTCOUNT; ++i)
    {
        output = myUIntToStr(i,endPtr);
    }
#endif
#if TEST_CASE == 4
    std::stringstream ss;
    for(INTTYPE i = 0; i<TESTCOUNT; ++i)
    {
        ss << i;
        ss.str(""); // clearing
    }
#endif
    return 0;
}

In many cases arithmetic correctness or the interoperability of a binary format can be achieved by using fixed-size integers like uint64_t instead of the loosely specified sizes of short, int, long, etc. These fixed-size integers come from the C99 standard: stdint.h defines them, and inttypes.h includes it and adds the matching printf() format macros. I don't know of any standard option for this purpose in C++03, so I chose this second-best solution. I have been relying on it for a while and time is finally on my side: the C++11 standard includes these headers (as <cstdint> and <cinttypes>). Defining __STDC_FORMAT_MACROS before including inttypes.h makes the printf() format macros such as PRIu64 available in C++ (they expand without the % sign).
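To make the format macro part concrete, here is a minimal sketch of how PRIu64 combines with the % sign through string literal concatenation. The helper name formatU64 is made up for this illustration and is not part of the test program.

```cpp
#define __STDC_FORMAT_MACROS   // some pre-C++11 C libraries need this when compiled as C++
#include <inttypes.h>          // uint64_t and the PRIu64 format macro
#include <cstdio>
#include <string>

// Hypothetical helper, for illustration only.
std::string formatU64(uint64_t value)
{
    char buffer[21];           // at most 20 decimal digits plus the terminating NUL
    // PRIu64 expands to a string literal without the % sign; adjacent
    // literals are concatenated, which yields a complete conversion.
    std::snprintf(buffer, sizeof buffer, "%" PRIu64, value);
    return std::string(buffer);
}
```

Calling formatU64(42) should give the string "42" whether uint64_t is unsigned long or unsigned long long on the platform, which is the whole point of the macro.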

In theory this source code is portable; in practice I haven't tried it anywhere else. It is only a proof of concept.

The original idea for the conversion came from abelenky's answer to a stackoverflow.com question about alternatives to itoa(). My addition is that the buffer is allocated outside the function on the stack (practically for free) and a pointer to its end is passed into the function.
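The calling convention is easier to see in isolation, so here is a small sketch of it. It repeats the backwards-filling idea of myUIntToStr in a compact do/while variant so that it compiles on its own; the names uintToStrBackwards and uintToString are assumptions for this example, and the std::string wrapper is only there for convenience (the benchmark deliberately avoids it).

```cpp
#include <limits>
#include <string>
#include <stdint.h>

// Compact variant of the conversion: fills digits backwards from endPtr
// and returns a pointer to the first digit. The do/while handles zero.
template<typename Integral>
char *uintToStrBackwards(Integral i, char *endPtr)
{
    do
    {
        *--endPtr = char(i % 10 + '0');
        i /= 10;
    } while (i);
    return endPtr;
}

// Hypothetical wrapper showing how the caller sets up the buffer.
template<typename Integral>
std::string uintToString(Integral value)
{
    // digits10 counts the decimal digits that always fit, so the largest
    // value needs digits10 + 1 characters, plus one for the NUL.
    char buffer[std::numeric_limits<Integral>::digits10 + 2];
    char *const endPtr = buffer + sizeof buffer - 1;
    *endPtr = '\0';                       // the caller writes the terminator
    char *begin = uintToStrBackwards(value, endPtr);
    return std::string(begin, endPtr);    // copy out the digits only
}
```

The function never needs to know the buffer size or move the digits afterwards: the caller hands over the end of a stack buffer and gets back the start of the text.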

Environment

  • OS: Debian (6.0.4 Squeeze) GNU/Linux (2.6.32-5-amd64)
  • G++ 4.4.5
  • CPU: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz

Results

$ g++ -W -Wall -pedantic -Wextra -O2 -DTEST_CASE=1 sprintf_speed.cpp -o sprintf_speed
$ time ./sprintf_speed

real    0m16.776s
user    0m16.757s
sys     0m0.012s
$ g++ -W -Wall -pedantic -Wextra -O2 -DTEST_CASE=2 sprintf_speed.cpp -o sprintf_speed
$ time ./sprintf_speed

real    0m16.951s
user    0m16.953s
sys     0m0.000s
$ g++ -W -Wall -pedantic -Wextra -O2 -DTEST_CASE=3 sprintf_speed.cpp -o sprintf_speed
$ time ./sprintf_speed

real    0m4.295s
user    0m4.296s
sys     0m0.000s
$ g++ -W -Wall -pedantic -Wextra -O2 -DTEST_CASE=4 sprintf_speed.cpp -o sprintf_speed
$ time ./sprintf_speed

real    0m15.443s
user    0m15.429s
sys     0m0.012s

The difference between the sprintf() and snprintf() versions was consistently around 2 to 4 percent in favor of sprintf(). It's unfair to compare them to my solution because they also have to parse their format string at run time. Still, my version ran roughly four times faster (4.3 s versus 16.8 s), and I believe it is reasonably close to optimal. The fourth case with std::stringstream is there for reference: it is the clean, portable C++ way of doing the conversion. It is surprisingly fast.
