Packing/unpacking binary data in C - doubles, 64 bits

bruce.labitt at autoliv.com
Fri Sep 11 13:37:43 EDT 2009


gnhlug-discuss-bounces at mail.gnhlug.org wrote on 09/11/2009 10:33:10 AM:

> On Thu, 2009-09-10 at 17:36 -0400, Ben Scott wrote:
> >   Just for the sake of example: Bruce said 160 MB of data.  Let's
> > assume it's all 4-byte integers.  That's roughly 42 million integers.
> > Calling sprintf() and sscanf() 42 million times is going to slow
> > things down.  Likewise, if we assume a newline separated format and
> > all significant digits used, an ASCII representation is going to use
> > 11 bytes per integer, turning 160 MB into 440 MB.
> > 
> 
> The ASCII is triple the binary in size.  That could be bearable in most
> situations.  It should also compress fairly well. 
> 
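For anyone who wants to check the arithmetic in the quoted estimate, here is a quick sketch.  It assumes "160 MB" means 160 * 2**20 bytes and that 11 bytes per integer means 10 digits plus a newline; neither assumption is spelled out in the thread.

```python
# Back-of-the-envelope check of the quoted size estimate.
# Assumption: "160 MB" is 160 * 2**20 bytes.
BINARY_BYTES = 160 * 2**20
INT_SIZE = 4          # bytes per 4-byte integer
ASCII_PER_INT = 11    # assumed: 10 digits plus a newline

n_ints = BINARY_BYTES // INT_SIZE
ascii_bytes = n_ints * ASCII_PER_INT
ascii_mb = ascii_bytes / float(2**20)
ratio = ascii_bytes / float(BINARY_BYTES)

print(n_ints)     # how many integers fit in the binary file
print(ascii_mb)   # size of the ASCII version, in MB
print(ratio)      # ASCII size relative to binary
```

Under those assumptions the ASCII version comes out at 2.75 times the binary size, so "triple" is a fair rounding.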

Ben has been doing a good job looking out for me :)  Thanks, Ben!

The reason I've been plugging away at this binary network stuff is that 
this code will be embedded in a higher-level optimization routine.  The 
optimization routine will need to run the above FFT hundreds, and perhaps 
thousands, of times. 

I started out the whole exercise in python.  Each FFT iteration takes 20 
sec in python.  In comparison, on my blade, using FFTW on the Cell, it takes 
0.2 sec.  That is why I bought the Cell: it has awesome FFT 
performance.  (That is also why geology/oil people buy these things.  LANL 
has a 1 petaFLOP computer built from these Cell processors; google Roadrunner.)

Then I wrote a client-server app with an FFT engine on the server.  This 
was easy (well, don't ask the PySIG list) in python.  It is tested and 
runs on a local network.  However, the FFT engine is in python, and it is 
still slow.  (Dominant time = FFT)

Now I am porting the server to C so I can easily exploit FFTW on the Cell. 
I needed to transport tons of data to (and from) the engine, so I was 
interested in solving the big number / binary problem.  Waiting 3 minutes 
to transport data is a non-starter for me.  A couple of seconds per 
iteration is what I'm after.  I'd like results from the optimizer in less 
than a day, rather than months!

So, now that I can 'easily' encode and decode data for the network, I can 
proceed with my work.
Hopefully, you all have found it slightly entertaining.
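
For the python end of the link, a minimal sketch of what the binary encoding could look like is below.  The function names pack_doubles and unpack_doubles and the 4-byte count header are hypothetical, made up for illustration; they are not the format actually used in my code.  The "!" prefix in the struct format asks for network (big-endian) byte order with standard sizes, so each double is always 8 bytes regardless of the sending machine.

```python
import struct

def pack_doubles(values):
    """Encode floats as a 4-byte count followed by 8-byte doubles.

    Hypothetical wire format for illustration only.  "!" selects
    network (big-endian) byte order with standard sizes.
    """
    values = list(values)
    header = struct.pack("!I", len(values))
    body = struct.pack("!%dd" % len(values), *values)
    return header + body

def unpack_doubles(payload):
    """Decode a buffer produced by pack_doubles into a list of floats."""
    (count,) = struct.unpack_from("!I", payload, 0)
    return list(struct.unpack_from("!%dd" % count, payload, 4))

samples = [3.2547270222254054, 0.88969361970157923, -1.5e300]
wire = pack_doubles(samples)
decoded = unpack_doubles(wire)

print(len(wire))            # 4-byte header + 8 bytes per double
print(decoded == samples)   # the binary round trip preserves every bit
```

The C side would then read the count, convert it with ntohl(), and read that many 8-byte values.  Whether the doubles need byte swapping on arrival depends on the receiving CPU's byte order.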


-Bruce


> I'm not second guessing Bruce's decision here.  It's all about getting
> the most out of your time using the available tools.
> 

Absolutely!  I was driven towards this solution by some of the 
constraints.  I never would have done all this if this were a 'small' 
problem.

> >>>> Python Code >>>>>>
> In [11]: m10 = 10 * 1000 * 1000
> # easier on the eyes than a long list of 0s
> 
> In [12]: m10
> Out[12]: 10000000
> 
> In [17]: f_list = [random.random()*20 for x in xrange(m10)]
> # force some of the random numbers to be greater than 1
> 
> In [24]: now();s_list = map(repr, f_list);now()
> Out[24]: datetime.datetime(2009, 9, 11, 10, 12, 22, 549050)
> Out[24]: datetime.datetime(2009, 9, 11, 10, 12, 54, 261281)
> # created 10,000,000 float strings in 32 seconds
> 
> In [25]: now();f2_list = map( float, s_list); now()
> Out[25]: datetime.datetime(2009, 9, 11, 10, 13, 11, 215100)
> Out[25]: datetime.datetime(2009, 9, 11, 10, 13, 24, 218123)
> # converted 10,000,000 strings to float in 13 seconds
> 
> In [26]: f_list[:10]
> Out[26]: 
> [3.2547270222254054,
>  4.1187838723903596,
>  19.029531987086656,
>  14.980165347124705,
>  2.1337003969489698,
>  8.2395337150073527,
>  4.7579966946618608,
>  0.88969361970157923,
>  9.5651010251147905,
>  16.707563948930382]
> 
> In [27]: f2_list[:10]
> Out[27]: 
> [3.2547270222254054,
>  4.1187838723903596,
>  19.029531987086656,
>  14.980165347124705,
>  2.1337003969489698,
>  8.2395337150073527,
>  4.7579966946618608,
>  0.88969361970157923,
>  9.5651010251147905,
>  16.707563948930382]
> 
> In [28]: from itertools import izip
> 
> In [29]: any(f1-f2 for (f1,f2) in izip(f_list, f2_list))
> Out[29]: False
> # all differences were 0 so the round trip processing was correct
> <<<<<<< end of python code <<<<<<<<
> 
> -- 
> Lloyd Kvam

Interesting code sample.  Amazing what python does....
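
For comparison, here is a rough sketch of the same round trip done in binary with the standard array module.  It is written for Python 3 (tobytes/frombytes), uses a smaller list than the session above, and the fixed seed is only there to make the run repeatable.  Note that array uses the machine's native byte order, so this shows the bulk conversion and the size difference, not a portable wire format.

```python
import random
from array import array

random.seed(0)  # assumed fixed seed, only for repeatability
f_list = [random.random() * 20 for _ in range(100000)]

# Binary path: one bulk conversion each way, 8 bytes per value,
# native byte order (call byteswap() if the other end differs).
raw = array("d", f_list).tobytes()
f2 = array("d")
f2.frombytes(raw)

# ASCII path, as in the quoted session: repr() out, one value per line.
text = "\n".join(map(repr, f_list)) + "\n"

print(len(raw) == 8 * len(f_list))  # exactly 8 bytes per double
print(list(f2) == f_list)           # round trip is exact
print(len(text) > len(raw))         # the text form is larger
```

No timings are given here since they will depend on the machine, but the binary path avoids a per-value string conversion in each direction.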
-Bruce



