Perl cache modules performance

January 4th, 2010

There are already a lot of cache modules available on the great CPAN. The newest kid on the block is CHI, a Moose-based, flexible caching solution with a very sane API and a good design that separates the driver backends from the caching logic as much as possible.

The results show that Cache::FastMmap is the most efficient implementation in every run except the 1:1 set/get ratio, where the plain Cache::Memcached::libmemcached clients come out ahead. Cache::FastMmap (just like memory-based caches) is limited to the local host by design, whereas the Memcached-based caches allow distributed caching that can be accessed from various hosts. A distributed cache is a more flexible solution for scaling your application and caching requirements. CHI also supports subcaches, and its L1 cache subcache is exactly what is needed to combine a fast local cache with a slower cache that persists across application restarts.
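The idea behind such a two-level setup can be sketched with core Perl only. This is a minimal illustration, not CHI's actual implementation: the hashes %l1 and %l2 and the helpers cache_set and cache_get are hypothetical stand-ins for a local memory cache and a shared backend such as memcached.

```perl
use strict;
use warnings;

# Hypothetical stand-ins: %l1 plays the fast local cache,
# %l2 plays the slower shared backend (e.g. memcached).
my (%l1, %l2);
my %hits = (l1 => 0, l2 => 0, miss => 0);

sub cache_set {
    my ($key, $value) = @_;
    $l2{$key} = $value;    # write to the shared backend
    $l1{$key} = $value;    # and keep a local copy
    return $value;
}

sub cache_get {
    my ($key) = @_;
    if (exists $l1{$key}) {
        $hits{l1}++;
        return $l1{$key};
    }
    if (exists $l2{$key}) {
        $hits{l2}++;
        return $l1{$key} = $l2{$key};    # promote into the local cache
    }
    $hits{miss}++;
    return;
}

cache_set(alpha => 'one');
%l1 = ();                         # simulate an application restart
cache_get('alpha') for 1 .. 3;    # first read falls through to L2, the rest hit L1
cache_get('beta');                # never stored anywhere

printf "l1=%d l2=%d miss=%d\n", @hits{qw(l1 l2 miss)};
```

After the simulated restart only the first read pays the price of the slower backend, which is why this kind of setup is expected to do best when reads greatly outnumber writes.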

It seems that CHI still has room for improvement in terms of efficiency, probably because it uses Moose and does not do any XS optimizations yet. Still, the CHI:L1 combination of memory and memcached is quite efficient when cache reads greatly outnumber writes (see the 1:100 ratio in the tests below).

Of course this benchmark does not mimic any real-world scenario (and most notably not yours), but it should give some idea of the overhead the caching layer itself imposes. Keys used for storing are always 36-character UUID strings. The values used for caching are separated into small, medium and large datasets. Small values are random binary UUIDs (16 bytes), medium values are 10 times longer (160 bytes), and large values are 100 times the UUID length (1600 bytes). The tests that compare set/get ratios always use the medium dataset (160-byte values); the ratios used are 1:1 (you should probably not do any caching in that situation anyway), 1:10 and 1:100.
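A workload of this shape could be generated roughly as follows. This is a sketch using only core Perl, and the original bench_cache.pl may well do it differently: the helpers random_bytes, uuid_key and run_iteration are hypothetical names, the key is random hex in UUID layout rather than a spec-compliant UUID, and a plain hash stands in for a real cache object.

```perl
use strict;
use warnings;

# Random binary string of $n bytes.
sub random_bytes {
    my ($n) = @_;
    return pack 'C*', map { int rand 256 } 1 .. $n;
}

# 36-character key in the usual 8-4-4-4-12 text layout
# (random hex only, not a spec-compliant UUID).
sub uuid_key {
    my $hex = unpack 'H*', random_bytes(16);
    return join '-', unpack 'A8 A4 A4 A4 A12', $hex;
}

# Value sizes in bytes for the three datasets described above.
my %value_size = (small => 16, medium => 160, large => 1600);

# One iteration for a set:get ratio of 1:$gets, i.e. one set
# followed by $gets reads of the same key.
sub run_iteration {
    my ($cache, $size, $gets) = @_;
    my $key = uuid_key();
    $cache->{$key} = random_bytes($size);
    my $reads = 0;
    for (1 .. $gets) {
        $reads++ if defined $cache->{$key};
    }
    return $reads;
}

my %cache;    # plain hash standing in for a real cache backend
for my $gets (1, 10, 100) {
    my $reads = run_iteration(\%cache, $value_size{medium}, $gets);
    printf "ratio 1:%d, %d reads of a %d-byte value\n",
        $gets, $reads, $value_size{medium};
}
```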

The script used to do the benchmarks is attached and is intended to be run with prove -v bench_cache.pl (it’s actually a test using Test::More and has not been cleaned up). If you find any obvious mistakes in how the benchmarks are generated, I would be interested to know.

The script used to generate these results is available here: bench_cache.pl (and is being tuned to allow some graphing of the results)

The abbreviations used below are listed here:

  • CHI:Mc:lIP … CHI::Driver::Memcached::libmemcached over IP
  • CHI:Mc:l … CHI::Driver::Memcached::libmemcached over Socket
  • CHI:L1 … CHI::Driver::Memcached::libmemcached (IP) with CHI::Driver::Memory L1 cache
  • CHI:FMmap … CHI::Driver::FastMmap
  • CHI:Mem … CHI::Driver::Memory (with max_size set)
  • C:Mc:lIP … Cache::Memcached::libmemcached over IP
  • C:Mc:l … Cache::Memcached::libmemcached over Socket
  • C:FMmap … Cache::FastMmap

Here are the results, generated on a Dell D830 dual-core laptop using perl 5.10.1 from Debian testing:

Benchmarking caches with ratio 1:10 and small values

              Rate CHI:Mc:lIP CHI:Mc:l CHI:L1 CHI:FMmap CHI:Mem C:Mc:lIP C:Mc:l C:FMmap
CHI:Mc:lIP  5107/s         --      -2%   -23%      -26%    -43%     -43%   -49%    -56%
CHI:Mc:l    5219/s         2%       --   -22%      -25%    -41%     -42%   -48%    -55%
CHI:L1      6669/s        31%      28%     --       -4%    -25%     -26%   -34%    -42%
CHI:FMmap   6920/s        36%      33%     4%        --    -22%     -23%   -31%    -40%
CHI:Mem     8885/s        74%      70%    33%       28%      --      -1%   -12%    -23%
C:Mc:lIP    8986/s        76%      72%    35%       30%      1%       --   -11%    -22%
C:Mc:l     10087/s        98%      93%    51%       46%     14%      12%     --    -12%
C:FMmap    11498/s       125%     120%    72%       66%     29%      28%    14%      --
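For reading these tables: each percentage appears to be the row's rate relative to the column's rate, rounded to a whole number. The following sketch recomputes two cells of the first table above from the Rate column; relative_percent is a hypothetical helper, not part of the Benchmark module.

```perl
use strict;
use warnings;

# Relative difference of the row rate over the column rate,
# in percent, rounded to a whole number.
sub relative_percent {
    my ($row_rate, $col_rate) = @_;
    return sprintf '%.0f', ($row_rate / $col_rate - 1) * 100;
}

# Rates copied from the first table above (iterations per second).
my %rate = ('CHI:Mc:lIP' => 5107, 'C:FMmap' => 11498);

printf "C:FMmap vs CHI:Mc:lIP: %d%%\n",
    relative_percent($rate{'C:FMmap'}, $rate{'CHI:Mc:lIP'});
printf "CHI:Mc:lIP vs C:FMmap: %d%%\n",
    relative_percent($rate{'CHI:Mc:lIP'}, $rate{'C:FMmap'});
```

This yields 125% and -56%, matching the corresponding cells in the table.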


Benchmarking caches with ratio 1:10 and medium values

              Rate CHI:Mc:lIP CHI:Mc:l CHI:L1 CHI:FMmap C:Mc:lIP CHI:Mem C:Mc:l C:FMmap
CHI:Mc:lIP  4628/s         --     -10%   -30%      -30%     -46%    -47%   -55%    -59%
CHI:Mc:l    5140/s        11%       --   -23%      -23%     -40%    -41%   -50%    -54%
CHI:L1      6639/s        43%      29%     --       -0%     -23%    -23%   -35%    -41%
CHI:FMmap   6643/s        44%      29%     0%        --     -23%    -23%   -35%    -41%
C:Mc:lIP    8615/s        86%      68%    30%       30%       --     -1%   -15%    -23%
CHI:Mem     8661/s        87%      69%    30%       30%       1%      --   -15%    -23%
C:Mc:l     10188/s       120%      98%    53%       53%      18%     18%     --     -9%
C:FMmap    11201/s       142%     118%    69%       69%      30%     29%    10%      --


Benchmarking caches with ratio 1:10 and large values

              Rate CHI:Mc:lIP CHI:Mc:l CHI:FMmap CHI:L1 CHI:Mem C:Mc:lIP C:Mc:l C:FMmap
CHI:Mc:lIP  4139/s         --      -5%      -28%   -28%    -45%     -47%   -56%    -60%
CHI:Mc:l    4380/s         6%       --      -24%   -24%    -42%     -44%   -54%    -57%
CHI:FMmap   5731/s        38%      31%        --    -1%    -24%     -26%   -40%    -44%
CHI:L1      5777/s        40%      32%        1%     --    -23%     -26%   -39%    -44%
CHI:Mem     7501/s        81%      71%       31%    30%      --      -4%   -21%    -27%
C:Mc:lIP    7779/s        88%      78%       36%    35%      4%       --   -18%    -24%
C:Mc:l      9484/s       129%     117%       65%    64%     26%      22%     --     -7%
C:FMmap    10230/s       147%     134%       78%    77%     36%      32%     8%      --


Benchmarking caches with ratio 1:1 and medium values

              Rate CHI:L1 CHI:Mem CHI:Mc:lIP CHI:Mc:l CHI:FMmap C:FMmap C:Mc:lIP C:Mc:l
CHI:L1      2192/s     --    -42%       -42%     -47%      -54%    -59%     -73%   -79%
CHI:Mem     3787/s    73%      --        -0%      -9%      -21%    -29%     -53%   -64%
CHI:Mc:lIP  3806/s    74%      0%         --      -8%      -20%    -29%     -53%   -64%
CHI:Mc:l    4155/s    90%     10%         9%       --      -13%    -23%     -48%   -60%
CHI:FMmap   4781/s   118%     26%        26%      15%        --    -11%     -41%   -54%
C:FMmap     5368/s   145%     42%        41%      29%       12%      --     -33%   -49%
C:Mc:lIP    8043/s   267%    112%       111%      94%       68%     50%       --   -23%
C:Mc:l     10441/s   376%    176%       174%     151%      118%     94%      30%     --


Benchmarking caches with ratio 1:10 and medium values

              Rate CHI:Mc:lIP CHI:Mc:l CHI:L1 CHI:FMmap CHI:Mem C:Mc:lIP C:Mc:l C:FMmap
CHI:Mc:lIP  4630/s         --      -7%   -28%      -30%    -45%     -48%   -55%    -59%
CHI:Mc:l    4953/s         7%       --   -23%      -25%    -42%     -44%   -52%    -56%
CHI:L1      6408/s        38%      29%     --       -3%    -25%     -27%   -37%    -43%
CHI:FMmap   6604/s        43%      33%     3%        --    -22%     -25%   -35%    -42%
CHI:Mem     8493/s        83%      71%    33%       29%      --      -4%   -17%    -25%
C:Mc:lIP    8834/s        91%      78%    38%       34%      4%       --   -14%    -22%
C:Mc:l     10218/s       121%     106%    59%       55%     20%      16%     --    -10%
C:FMmap    11298/s       144%     128%    76%       71%     33%      28%    11%      --


Benchmarking caches with ratio 1:100 and medium values

              Rate CHI:Mc:lIP CHI:Mc:l CHI:FMmap CHI:L1 C:Mc:lIP CHI:Mem C:Mc:l C:FMmap
CHI:Mc:lIP  4626/s         --     -10%      -34%   -44%     -47%    -53%   -56%    -64%
CHI:Mc:l    5141/s        11%       --      -27%   -38%     -42%    -48%   -51%    -60%
CHI:FMmap   7004/s        51%      36%        --   -15%     -20%    -30%   -33%    -45%
CHI:L1      8279/s        79%      61%       18%     --      -6%    -17%   -21%    -36%
C:Mc:lIP    8799/s        90%      71%       26%     6%       --    -12%   -16%    -32%
CHI:Mem     9943/s       115%      93%       42%    20%      13%      --    -6%    -23%
C:Mc:l     10525/s       128%     105%       50%    27%      20%      6%     --    -18%
C:FMmap    12849/s       178%     150%       83%    55%      46%     29%    22%      --

These results were produced with Perl 5.10.1; here is the perl -V output for reference:

Summary of my perl5 (revision 5 version 10 subversion 1) configuration:
   
  Platform:
    osname=linux, osvers=2.6.31.6-dsa-ia32, archname=i486-linux-gnu-thread-multi
    uname='linux murphy 2.6.31.6-dsa-ia32 #1 smp tue nov 10 09:21:59 cet 2009 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i486-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.10 -Darchlib=/usr/lib/perl/5.10 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.10.1 -Dsitearch=/usr/local/lib/perl/5.10.1 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -Dlibperl=libperl.so.5.10.1 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.3.4', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /usr/lib64
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.10.1.so, so=so, useshrplib=true, libperl=libperl.so.5.10.1
    gnulibc_version='2.10.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector'


Characteristics of this binary (from libperl): 
  Compile-time options: MULTIPLICITY PERL_DONT_CREATE_GVSV
                        PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP USE_ITHREADS
                        USE_LARGE_FILES USE_PERLIO USE_REENTRANT_API
  Built under linux
  Compiled at Nov 21 2009 22:39:09
  @INC:
    /etc/perl
    /usr/local/lib/perl/5.10.1
    /usr/local/share/perl/5.10.1
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.10
    /usr/share/perl/5.10
    /usr/local/lib/site_perl
