• Home
  • Assertions
  • Poetry
  • Programming

Record and Reverie

General things I find interesting

Feed on
Posts
Comments
« Fire’s Death
Spring Break »

Branchless Code

Mar 20th, 2007 by Graham Booker

Apologies if this is technical, but since I learned something in this experiment, I thought I would write it up:

I remember going to a seminar a few weeks ago delivered by a faculty candidate which discussed the idea of how to get around the branch misprediction problem present in microprocessors. Anyway, her idea was to use predicated statements to make the instructions conditional on a comparison which may not have been completed yet. Essentially, when faced with a decision between two routes, the processor starts computation on both before it is even sure which it should take. Then, when the decision is known, it can proceed to throw out the results from the task it should not have started and only keep the results from the other. Her research mentioned using the processor’s internal register renaming system to prevent the results from the two colliding. There is a problem of both processor support as well as compiler support, as well as the fact that the active power in the processor is now increased. From the first two issues, I doubt we will ever see this come to light, especially when such things are already possible:

Recently, I rewrote a few C functions in ffmpeg’s H.264 video decompresser in Altivec. This way, instead of the processor working on one value at a time, it can work on 16, thus greatly increasing its speed. The only issue is the computation may take one route for some elements, and another route for others. So, how do you handle both situations when you can only take one at a time? The answer is to do both, and then use masks to keep a certain result.

So, to test if (x > y), just do something like:
mask = vec_cmpgt(x, y);
and multiple ifs, like (x > y && y > z) can be done like
mask = vec_and(vec_cmpgt(x, y), vec_cmpgt(y, z));
Then, when doing your operation, simply keep the mask, and use it to determine if you commit the result. Simple, eh 😉

Anyway, as a result of writing this code, not only does it process more values at a time, but it also does it without any branches, thus not creating branch mispredictions as well as doing memory operations in larger chunks, which is more efficient. Now the question is, if it were rewritten to simply use branchless code instead of being vectorized, would it have been faster, or is the branch predictor good enough in processors to have not made a difference? I suppose I just found an excellent test for that faculty candidates research.

Posted in Programming

3 Responses to “Branchless Code”

  1. on 22 Mar 2007 at 5:19 pm1Alan Humpherys

    The best place to implement this is within the hardware itself. The CPU designs I worked on at Amdahl in the late 80’s had this implemented in hardware. There were limitations, but we had a 6 stage execution pipeline where there were duplicates of registers and Arithmetic Units for 3 stages which gave sufficient time for the original conditional expression to complete evaluation so that the correct branch/nobranch sequence could be followed.

    As with all good things we did in the “dinosaur Mainframe” era, they will be “rediscovered” and put into a PC and the world will be amazed….

    Virtualization… Been there – done that – 1986

    (And we did it so well that the CIA approved our computers to run top level security clearance programs in one virtual mainframe, while potentially hostile code (obtained from inteligence operations) could be run in another virtual mainframe, and the 2 were guaranteed not to be able to interfere with each other… Something not even remotely possible on the pathetic PC hardware we are dealing with today.)

  2. on 03 May 2008 at 1:58 pm2Greg Davies

    I heard something about branchless programming before, but it was explained differently to me. Basically, boolean logic was altered by the compiler and linker to compute the address to jump to. That way you never do a comparison at all, just an unconditional jump.

  3. on 03 May 2008 at 3:17 pm3Graham Booker

    Eh, computing an address for a jump is really not much different from a branch. However, processors have branch prediction logic, so changing a branch to an address computation and a jump could potentially be much worse than a branch. The only case where such a change won’t cause a pipeline stall is when the address is computed well beforehand, in which case the branch wouldn’t stall either.

    There are actually no jumps in the code I mentioned above. It is essentially computing both paths from a branch, and then using boolean logic to determine which result is committed, and which is thrown away.

  • Recent Posts

    • Google’s Analytics Mistake
    • Switching back to Safari
    • Apple has become the new Microsoft
    • Sapphire Plugin
    • Key Fixed
    • Sometimes you just got to laugh
  • Archives

    2021
    2020
    March 2020 (1)
    2019
    November 2019 (1)
    2018
    June 2018 (1)July 2018 (1)December 2018 (1)
    2017
    January 2017 (2)June 2017 (1)August 2017 (1)
    2016
    June 2016 (1)August 2016 (1)
    2015
    January 2015 (1)February 2015 (1)December 2015 (1)
    2014
    June 2014 (1)July 2014 (1)August 2014 (2)
    2013
    February 2013 (2)March 2013 (1)April 2013 (1)June 2013 (1)November 2013 (1)
    2012
    April 2012 (2)May 2012 (1)June 2012 (1)November 2012 (1)
    2011
    January 2011 (1)October 2011 (1)November 2011 (1)December 2011 (1)
    2010
    February 2010 (2)April 2010 (1)June 2010 (1)July 2010 (1)August 2010 (1)September 2010 (1)October 2010 (2)December 2010 (3)
    2009
    January 2009 (1)February 2009 (1)March 2009 (2)May 2009 (1)July 2009 (3)September 2009 (1)
    2008
    January 2008 (1)February 2008 (4)March 2008 (1)April 2008 (6)May 2008 (1)June 2008 (3)August 2008 (1)September 2008 (2)October 2008 (2)December 2008 (1)
    2007
    January 2007 (1)February 2007 (4)March 2007 (5)April 2007 (4)May 2007 (1)June 2007 (6)August 2007 (3)September 2007 (3)November 2007 (3)December 2007 (4)
    2006
    January 2006 (4)February 2006 (10)March 2006 (4)April 2006 (6)May 2006 (2)June 2006 (4)July 2006 (1)August 2006 (1)September 2006 (4)October 2006 (6)November 2006 (3)December 2006 (3)
    2005
    October 2005 (6)November 2005 (13)December 2005 (1)
    2004
    February 2004 (2)March 2004 (1)April 2004 (1)May 2004 (6)June 2004 (6)July 2004 (3)August 2004 (2)September 2004 (1)November 2004 (5)
    2003
    September 2003 (1)October 2003 (3)November 2003 (1)December 2003 (1)
  • Categories

    • Breakaway (5)
    • Family (4)
    • Friends (2)
    • General (148)
    • Nature Pictures (8)
    • Politics (2)
    • Programming (41)
    • School (11)
    • SysAdmin (8)
    • Teaching (2)
  • Tags

    AC3 Ads Code Frontrow Java Objective-C Open Source Perian Perl permissions plex plugin RSS Sapphire School Servers ZFS

  • Pages

    • Assertions
      • Female Friends Who Won’t Date You
      • Not Dating Friends
    • Poetry
      • Curtis Staying Over
      • Girl Questions
      • Scaring Girls Off
      • Summer’s End
    • Programming
      • Fire Development
      • Kyocera Ringtone Converter for the Mac
      • Perian
      • Text Compression

Record and Reverie © 2021 All Rights Reserved.

WordPress Themes | Web Hosting Bluebook