Is this XOR value swap algorithm still in use or useful

Question

When I first started working, a mainframe assembler programmer showed me how they swap two values without using the traditional algorithm of:

a = 0xBABE
b = 0xFADE

temp = a
a = b
b = temp

What they used to swap two values - from a bit to a large buffer - was:

a = 0xBABE
b = 0xFADE

a = a XOR b
b = b XOR a
a = a XOR b

now

b == 0xBABE
a == 0xFADE

which swapped the contents of 2 objects without the need for a third temp holding space.
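For anyone who wants to try the three steps above, here is a minimal runnable sketch. Python is used only as a stand-in for the assembler the question describes, and the function name `xor_swap` is made up for illustration.

```python
def xor_swap(a, b):
    """Swap two integers with three XORs and no temporary variable."""
    a = a ^ b  # a now holds (original a) XOR (original b)
    b = b ^ a  # b becomes the original a
    a = a ^ b  # a becomes the original b
    return a, b


a, b = xor_swap(0xBABE, 0xFADE)
print(hex(a), hex(b))  # 0xfade 0xbabe
```

Each step reads the result of the step before it, so the three operations cannot be reordered or run in parallel.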

My question is: Is this XOR swap algorithm still in use and where is it still applicable?

It's still in wide use for showing off and bad interview questions; that's about it. — Michael Borgwardt, Jan 09 '13 at 14:40
@PieterB yes they are both cases of this http://stackoverflow.com/questions/1826159/swapping-two-variable-value-without-using-3rd-variable/1826259#1826259 — jk., Jan 09 '13 at 15:44
So... You save a register but pay with 3 more instructions. I think the temp version would be faster anyway. — Billy ONeal, Jan 10 '13 at 08:03
Besides, any modern CPU has plenty of registers. The only problem is x86, which has a limited number of register _names_, and plays games with register renaming. But on a CPU which can rename registers, swap is utterly trivial. — MSalters, Jul 08 '13 at 12:24
@MSalters value swap hasn't been a problem on x86 from its very beginning, thanks to the xchg instruction. OTOH swapping 2 memory locations requires 3 xchgs :) — Netch, Dec 19 '13 at 10:36

score 42 · Accepted Answer · edited Jun 10 '23 at 20:54

When using xorswap there's a danger of supplying the same variable (the same memory address, not merely the same value) as both arguments to the function. That zeroes out the variable, because XORing a value with itself turns all the bits to zero. Of course, passing the same variable twice is probably a mistake whichever algorithm is used, but with xorswap the resulting behavior is surprising and not obvious at first glance.
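A small sketch of that hazard, assuming the values live in a list so that two indexes can refer to the same slot (the Python equivalent of two pointers to the same address). The helper names `xor_swap_items` and `safe_xor_swap_items` are hypothetical.

```python
def xor_swap_items(buf, i, j):
    """XOR-swap buf[i] and buf[j] in place, with no aliasing check."""
    buf[i] ^= buf[j]
    buf[j] ^= buf[i]
    buf[i] ^= buf[j]


def safe_xor_swap_items(buf, i, j):
    """Same swap, but skip the work when both indexes name the same slot."""
    if i != j:
        xor_swap_items(buf, i, j)


equal_values = [5, 5]
xor_swap_items(equal_values, 0, 1)  # two slots, same value: still [5, 5]

same_slot = [5, 9]
xor_swap_items(same_slot, 0, 0)     # one slot passed twice: becomes [0, 9]

guarded = [5, 9]
safe_xor_swap_items(guarded, 0, 0)  # guard leaves it as [5, 9]
```

Equal values in different slots are fine; only passing the same slot twice destroys the data, which is why a guard like the one above is usually added.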

Traditionally xorswap has been used in low-level implementations for swapping data between registers. In practice there are better alternatives for swapping variables in registers. For example, Intel's x86 has an XCHG instruction which swaps the contents of two registers. Many times a compiler will figure out the semantics of such a function (it swaps the contents of the values passed to it) and can make its own optimizations if needed, so trying to optimize something as trivial as a swap function does not really buy you anything in practice. It's best to use the obvious method unless there's a proven reason why it would be inferior to, say, xorswap within the problem domain.

answered Jan 09 '13 at 14:35 by zxcdw · last edited by Harry

If both values are the same, then you'd still end up with the correct result. `a: 0101 ^ 0101 = 0000; b: 0101 ^ 0000 = 0101; a: 0101 ^ 0000 = 0101;` – Bobson Jan 09 '13 at 14:44
@Bobson Indeed. My bad. Fixd. – zxcdw Jan 09 '13 at 14:48
@apsillers That's correct, I was confused with thinking of bits instead of references to the same pattern of bits. :) – zxcdw Jan 09 '13 at 14:51
What @GlenH7 said. `-1` -> `+1`. – Bobson Jan 09 '13 at 14:55
You still seem to have the line about "When using xorswap there's a danger of supplying same variable as both arguments to the function which zeroes out the said variable due to it being xor'd with itself which turns all the bits to zero."... Was that not meant to have been removed? – Chris Jan 09 '13 at 17:58
@Chris No, at first I had written that if the values were identical as if `a=10, b=10`, and if you did `xorswap(a,b)` that would work and not zero out the variables which is false and now removed. But if you did `xorswap(a, a)` then `a` would get zeroed which I had originally meant but was being stupid. :) – zxcdw Jan 09 '13 at 18:03
Ah, I see. Though given the OP isn't talking about a function as such, just an algorithm, I'm not sure that is relevant here, and (as I can attest) it is still a little confusing. :) – Chris Jan 09 '13 at 18:06
It's pretty relevant in certain languages - small little dangers like that make for difficult to find bugs in the future when dealing with two pointers that somehow were set to point to the same address 100+ lines up that you don't see. – Drake Clarris Jan 09 '13 at 19:03
Just want to add that XCHG is two cycles on Skylake, while the XOR trick requires 3 cycles to execute (since each instruction uses the result of the previous). In terms of code size, XCHG wins as well. So there's pretty much no reason to use the XOR trick anymore, even in low-level assembly coding. – Eloff Aug 24 '16 at 04:11
The question is about swapping values in registers. The concern about supplying the same variable address twice would only arise if variable references were passed to a function. You can't pass registers to a function. – JacquesB Jun 15 '23 at 06:09

score 15 · Answer 2 · answered Jul 07 '13 at 21:31

The key to the answer is in the question - "a mainframe assembler programmer" - from the days before the compiler, when humans hunkered down with assembly instructions and hand-crafted exact solutions for a particular piece of hardware. Those solutions might or might not work on another model of the same computer: issues such as the timing of hard drives and drum memory had an impact on how code was written (read The Story of Mel if you feel the need to be nostalgic).

Back in these bygone days, registers and memory were both scarce and any trick to not have to beg for another byte or word of memory from the lead architect was time saved - both in writing the code and execution time.

Those days are gone. The trick of swapping two things without using a third is just that, a trick. Memory and registers are both plentiful in modern computing and humans don't write assembly anymore. We've taught all of our tricks to our compilers, and they do a better job of it than we do. In most cases, chances are the compiler did something even better than what we would have done. Sometimes we still need to write assembly in some inner bit of a tight loop for some reason... but it isn't to save a register or a word of memory.

It might be useful again if one is working in a particularly limited microcontroller, but optimizing a swap isn't likely the source of one's problem then - trying to be too clever is more likely a problem.

Registers aren't really so plentiful in many contexts. Even if a processor has an ample supply of registers, every register used in an ISR is a register which must be saved beforehand and restored afterward. If an ISR takes twenty cycles, and is supposed to run every forty cycles, each extra cycle added to the ISR will degrade system performance by five percentage points. — supercat, Aug 26 '15 at 17:51
Assembly is still used - after all, *someone* has to write the backends to the compilers you rely on. And compiler output is extremely performance sensitive, so saving a register can be a big deal. — JacquesB, Jun 11 '23 at 19:15
I can't imagine this being economic even for a compiler these days as it creates more cases where the pipeline can stall waiting for results. In the obvious way only instruction 3 depends on instruction 1, while in the trick way instruction 2 depends on 1 and 3 depends on 2. In the era of assembly programming on mainframes such stalls did not exist. — Loren Pechtel, Jun 13 '23 at 00:07

score 9 · Answer 3 · answered Jan 09 '13 at 15:07

Will it work? Yes.

Should you use it? No.

This sort of micro-optimization would make sense if:

- you have looked at the code the compiler generates for the straightforward way of doing this (assignment and a temporary) and decided that the XOR approach generates faster code
- you have profiled your app, and found the cost of the straightforward approach outweighs the clarity of the code (and resulting savings in maintainability)

To the first point, unless you've done this measurement, you should be trusting the compiler. When the semantics of what you're trying to do are clear, there are a lot of tricks the compiler can do, including rearranging variable access so that the swap is not needed at all, or in-lining whatever machine level instructions provide the fastest swap for a given data type. "Tricks" such as the XOR swap make it harder for the compiler to see what you are trying to do, and thus make it less able to apply such optimizations.

To the second point, what are you gaining for the added complexity? Even if you've measured and found the XOR approach faster, is this having enough impact to justify a less clear approach? How do you know?

Finally, you should look into whether there is a standard swap function for your platform/language -- the C++ STL, for instance, provides a template swap function which will tend to be highly optimized for your compiler/platform.

The question is about using the trick in assembler code, not about using it in a high-level language. There is no compiler involved. — JacquesB, Jun 11 '23 at 19:22

score 3 · Answer 4 · answered Dec 19 '13 at 10:47

My colleague reported that this is a basic trick he was taught at university while studying to be a programmer of automated systems. Many such systems are embedded ones with limited resources, and they may lack a free register to hold the temporary value; in that case this tricky exchange (or its analog with adding and subtracting) becomes vital, so it is still in use.

But when using it, one must take care that the two locations are never identical, because in that case it will effectively zero both values. So its use is usually limited to obvious cases such as exchanging a register and a memory location.
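The add/subtract analog mentioned above can be sketched like this. The 16-bit word size (`MASK`) and the function name `add_sub_swap` are assumptions made for illustration; the masking stands in for the wrap-around that fixed-width registers do on their own.

```python
MASK = 0xFFFF  # assumed 16-bit machine word; results wrap modulo 2**16


def add_sub_swap(a, b):
    """Swap two 16-bit values using add and subtract, with no temporary."""
    a = (a + b) & MASK  # a holds the (possibly wrapped) sum
    b = (a - b) & MASK  # sum minus original b gives original a
    a = (a - b) & MASK  # sum minus original a gives original b
    return a, b


print([hex(v) for v in add_sub_swap(0xBABE, 0xFADE)])  # ['0xfade', '0xbabe']
```

The sum overflows 16 bits in this example, and the swap still comes out right because every step wraps the same way. Like the XOR version, it goes wrong if both operands are the same location.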

With x86, the xchg and cmpxchg instructions satisfy the need in most cases, but RISC architectures generally don't provide such instructions (except SPARC).

score 0 · Answer 5 · answered Jun 11 '23 at 16:13

If it makes sense to do that, then any decent compiler will be smart enough to recognise the situation and do the xor trick for you. If it doesn’t make sense, then the compiler won’t.

You will get the best results by simply writing the ordinary swap of the two numbers. That also works for exchanging two pointers, two doubles, two arrays of 8 chars and so on, where the xor trick would be hard to use.
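A rough illustration of that last point, with Python standing in for whatever language you actually use (the name `plain_swap` is made up). The ordinary swap does not care about the type of its operands, while XOR is only defined for integers.

```python
def plain_swap(a, b):
    """Ordinary swap: works for any pair of objects."""
    return b, a


x, y = plain_swap(1.5, 2.5)            # floats swap fine
names = plain_swap("left", "right")    # so do strings

# XOR is not defined for floats in Python, so the trick cannot even start.
p, q = 1.5, 2.5
try:
    p ^ q
    xor_works_on_floats = True
except TypeError:
    xor_works_on_floats = False

print(x, y, names, xor_works_on_floats)  # 2.5 1.5 ('right', 'left') False
```

In a language like C you could still XOR the raw bytes of two doubles, but that takes extra casting or copying, which is the "hard to use" part.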

answered by gnasher729

To apply this optimization the compiler would need to prove that you're not trying to swap a location with itself. So it's quite likely that it wouldn't be able to apply that optimization, since that proof might require a deeper understanding of the surrounding code than the compiler has. – CodesInChaos Jun 11 '23 at 20:18
@CodesInChaos It’s called alias analysis. Any decent compiler will know most of the time that two addresses are the same, or that they are different. Swap (&a, &b) for example. – gnasher729 Jun 13 '23 at 21:28

score 0 · Answer 6 · answered Jun 12 '23 at 18:59

If you are working with distributed systems, it is plausible (although unlikely) that you may run into a memory constraint somewhere and still need to swap a variable value with another variable's value without pointing at a third location in memory.

These days, I'd bet this is the only time where this knowledge would be useful beyond an interview question.

score 0 · Answer 7 · answered Jun 13 '23 at 06:45

It will be relevant only if you

- are writing in assembler
- are using an instruction set that does not have a native swap operation
- need to be economical with registers, and the code is performance-critical enough that you can't just push a value on the stack.

The only plausible scenario I can imagine is in code generation in a compiler backend. Compiler output is performance-critical and you want to utilize all available registers to avoid more expensive stack or memory operations.

In a high level language, the xor-swap is not useful, since you use local variables stored in memory rather than registers. Introducing a temp variable does not have notable cost, and the xor swap is likely to be slower because the xor operations would be in addition to the memory reads or writes.

In a modern language, you would just write

[a, b] = [b, a]

and let the compiler figure it out.
