FWIW: The bigger picture... Or why I have been asking a lot of questions lately...

bruce.labitt at autoliv.com bruce.labitt at autoliv.com
Tue Oct 13 11:49:22 EDT 2009


gnhlug-discuss-bounces at mail.gnhlug.org wrote on 10/13/2009 08:44:49 AM:

> Bruce Labitt <bruce.labitt at myfairpoint.net> wrote:
>  >
>  > What I'm trying to do:  Optimizer for a radar power spectral density 
> problem
>  >
>  > Problem:  FFTs required in optimization loop take too long on current
>  > workstation for the optimizer to even be viable.
>  >
>  > Attempted solution:  FFT engine on remote server to reduce overall
>  > execution time
>  >
>  > Builds client - server app implementing above solution.  Server uses
>  > OpenMP and FFTW to exploit all cores.
> [...]
>  > Implements better binary packing unpacking in code.  Stuff works
>  >
>  > Nit in solution:  TCP transport time >> FFT execution time, rendering
>  > attempted solution non-viable
>  >
> [...]
>  > Hey, that is my bigger picture...  Any and all suggestions are
>  > appreciated.  Undoubtedly, a few dumb questions will follow.  I appear
>  > to be good at it.  :P  Maybe this context will help list subscribers
>  > frame their answers if they have any, or ask insightful questions.
> 
> I don't understand anything about your domain of application,
> so take this for what it's worth...
> 
> I've gleaned the following from the previous posts. Is it a fair summary?

Not exactly; minor corrections below.

> - The local FFT is taking ~200 ms, which isn't fast enough.

The remote FFT takes ~200 ms.  The local machine takes ~20 sec.

> - The remote FFT is substantially faster than this once the data gets 
> there.

Remote is 100x faster than local.

> - However, it takes substantially longer (~1.2 seconds) to move the data
>   than to process it locally.
> 

Transfer time to the server is indeed ~1.2 sec if running at full link 
bandwidth.  I have yet to achieve this in my application; I'd be 
'deliriously happy' to get that bandwidth.  Could I use more bandwidth? 
Yes.
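To make the transfer-vs-compute trade concrete, here is a back-of-envelope check.  The link rate assumes ideal gigabit Ethernet, and the payload size is hypothetical (the thread does not state the actual data size); it is chosen only so the transfer time lands near the ~1.2 sec figure above.

```python
# Back-of-envelope transfer time vs. FFT time.  Assumptions:
# ideal 1 Gb/s Ethernet payload rate, and a hypothetical data
# size -- the real sizes are not stated in this thread.
link_rate = 125e6      # bytes/sec, ideal gigabit Ethernet
payload = 150e6        # bytes, hypothetical, yields ~1.2 s transfer
fft_remote = 0.200     # sec, remote FFT time from the thread
fft_local = 20.0       # sec, local FFT time from the thread

transfer = payload / link_rate
print(f"transfer {transfer:.2f} s vs remote FFT {fft_remote:.2f} s")

# Offloading only pays off when transfer + remote FFT < local FFT;
# here transport dominates the remote side by a factor of ~6.
speedup = fft_local / (transfer + fft_remote)
print(f"effective speedup over local: {speedup:.0f}x")
```

Even at full link bandwidth the transport term, not the remote FFT, sets the floor on the achievable cycle time.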

> What does "fast enough" mean here? What is your "time budget" per data set?

This is a very good question.  ITBP (in the big picture) I need to compute 
100 FFTs for 1 data set.  This data will be fed to an optimizer, which 
will modify coefficients and ask for another 100 FFTs.  If things are set 
up properly, everything eventually converges.  If not, it can be ages 
before you know you have a poor run, which makes debugging difficult.  So 
I like to build in some sort of parameter that I can monitor for 
simulation progress.

Hmm, I digressed.  At 20 sec per cycle it takes about 40 minutes to do the 
100 FFTs that make up a dataset.  If I could get ~2 sec per FFT cycle, I'd 
be much happier.  It would be a lot easier to debug and to watch the 
convergence indicators.
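As a rough check of that time budget (numbers from this thread; the ~40 minute figure presumably includes optimizer overhead on top of the raw FFT time):

```python
# Per-dataset wall time at the current and hoped-for per-FFT-cycle
# rates.  Numbers come from the thread; "cycle" means one FFT plus
# its surrounding overhead.
ffts_per_dataset = 100
current_cycle = 20.0   # sec per FFT cycle today (local machine)
target_cycle = 2.0     # sec per FFT cycle hoped for

current_min = ffts_per_dataset * current_cycle / 60
target_min = ffts_per_dataset * target_cycle / 60
print(f"today:  {current_min:.0f} min of raw FFT time per dataset")
print(f"target: {target_min:.1f} min of raw FFT time per dataset")
```

At the target rate a whole dataset fits in a few minutes, which is what makes interactive convergence-watching practical.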


> Is it only constrained by "catching and cooking" one data set before it is
> overwritten by a new one (or before you choke on the stream buffers :) )?
> Are there latency/timeliness requirements from downstream?
> If so, what are they?
> Provided your processing rate keeps up with the arrival rate,
> how far behind can you afford to deliver results?
> (i.e. how much pipelining is permitted in a solution?)
> 
> How fast is the remote FFT? I didn't catch a number for this one.
> Or was the 200 ms the remote processing time?
> (In which case, what's the local processing time?)
> Do you have the actual server you're targeting to benchmark this on?
> 
> This helps to frame the external requirements more clearly.
> 
> You've stated the problem in the implementation domain.
> It sounds like your range of solutions could leave very little headroom.
> My instinctive response is to ask
> "Is there a more frugal approach in the application domain?"

This is an insightful observation.  Good question!

> 
> Do you need to grind down the whole field of potential interest?
> Are there ways to narrow and intensify your focus partway through?
> Perhaps to do a much faster but weaker FFT,
> analyze it quickly to identify a narrower problem of interest,
> and then do the slower, much stronger FFT on a lot less data?
> Reducing the data load for the hard part may help with on-chip or
> off-chip solutions. It may also help to identify hybrid solutions.
> 
> Alternatively, a mid-stream focusing analysis might be so expensive
> as to negate the benefit, or any performant mid-stream analysis might
> be merely a too-risky heuristic, or the problem may simply not lend
> itself to that kind of decomposition. You did say that you had already
> encountered a number of dead-ends - this may be familiar ground :)
> 

Believe me, a lot of it is familiar :)  Nonetheless, these ideas are worth 
mulling over.  Despite the time invested, nothing is really cast in 
concrete...
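The coarse-then-fine idea quoted above can be sketched quickly.  This is a NumPy illustration only (the thread's server uses FFTW and OpenMP); the record length, decimation, and test frequency are all made up, and the fine pass is shown as a full FFT for brevity, where a real implementation would use a zoom FFT or chirp-z transform over just the selected band to get the actual savings.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1 << 18                               # hypothetical record length
t = np.arange(n)
x = np.sin(2 * np.pi * 0.1237 * t) + 0.1 * rng.standard_normal(n)

# Coarse pass: short FFT over a prefix -- cheap, low resolution.
n_coarse = 1 << 10
coarse = np.abs(np.fft.rfft(x[:n_coarse]))
k = int(np.argmax(coarse[1:])) + 1        # strongest bin, skipping DC
f_lo = (k - 1) / n_coarse                 # band of interest,
f_hi = (k + 1) / n_coarse                 # +/- one coarse bin wide

# Fine pass: full-resolution spectrum, examined only inside the band.
# (A zoom FFT over [f_lo, f_hi] would replace this full rfft.)
fine = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(n)
band = (freqs >= f_lo) & (freqs <= f_hi)
f_peak = freqs[band][np.argmax(fine[band])]
print(f"coarse band [{f_lo:.4f}, {f_hi:.4f}], fine peak at {f_peak:.4f}")
```

Whether the coarse pass is reliable enough to gate the fine one is exactly the "too-risky heuristic" question raised above.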

> I don't know your domain. I don't have answers, just questions.
> I just figured those kind of questions were worth asking
> before we try squeezing the last Mbps out of the network...
> 
> Lupestro
> 

In my current solution space, the network transport IS the dominant 
bottleneck.  I truly was not expecting such slow performance there.  Of 
course, once it is 'fixed' there will be a new bottleneck.  Some 
bottlenecks cannot be corrected: one either accepts the performance, 
tweaks things, or looks for an alternate architecture that does not have 
the same weakness.
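One way to blunt the transport bottleneck without changing the architecture is the pipelining the quoted questions hint at: overlap shipping block k+1 with computing block k, so once the pipe is full only the slower of the two stages sets the rate.  A minimal sketch, where `send` and `fft_remote` are stand-ins for the real socket and server calls in the client/server app described in the thread:

```python
# Double-buffered pipeline sketch.  send() and fft_remote() are
# hypothetical stand-ins for the actual network and server-FFT
# calls; the bounded queue gives the double buffering.
import queue
import threading

def pipeline(blocks, send, fft_remote):
    q = queue.Queue(maxsize=2)            # at most 2 blocks in flight

    def shipper():
        for b in blocks:
            send(b)                       # network transfer of block k+1
            q.put(b)                      # hand off to the compute stage
        q.put(None)                       # end-of-stream sentinel

    threading.Thread(target=shipper, daemon=True).start()
    results = []
    while (b := q.get()) is not None:
        results.append(fft_remote(b))     # overlaps the next send()
    return results
```

With ~1.2 s transfers and ~0.2 s FFTs the transfer stage still dominates, but pipelining hides the FFT time entirely and keeps the link saturated instead of idling between requests.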

You are correct to advise stepping back to rethink the problem.  I think 
that is worthwhile.  My current solution space may not be an optimal one. 
There may be other spaces that are better.

Thanks for your thoughts and observations!
Bruce




