c - Efficient integer compare function

This one has no branches, and doesn't suffer from overflow or underflow: return (a > b) - (a < b); With gcc -O2 -S, this compiles down to the following six instructions: xorl %eax, %eax cmpl %e...
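As a minimal sketch of how that expression might be used, the block below wraps it in a three-way compare and plugs it into the standard qsort. The names compare_int, compare_int_qsort, and sort_ints are illustrative and do not come from the original answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Three-way compare: returns -1, 0, or 1. Each comparison yields 0 or 1,
   so the subtraction cannot overflow (unlike the common `a - b` trick). */
int compare_int(int a, int b)
{
    return (a > b) - (a < b);
}

/* Adapter with the signature that qsort expects. */
int compare_int_qsort(const void *pa, const void *pb)
{
    return compare_int(*(const int *)pa, *(const int *)pb);
}

/* Hypothetical helper: sort an int array in ascending order. */
void sort_ints(int *values, size_t count)
{
    assert(values != NULL || count == 0);
    if (count > 1)
        qsort(values, count, sizeof values[0], compare_int_qsort);
}
```

Calling sort_ints on an array that mixes very large positive and negative values is the case where a subtraction-based comparator can go wrong, and this version does not.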

c++ - How to force GCC to assume that a floating-point expression is non-negative?

You can write assert(x*x >= 0.f) as a compile-time promise instead of a runtime check as follows in GNU C: #include <cmath> float test1 (float x) { float tmp = x*x; if (!(tmp >= 0.0f))...
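The excerpt is cut off before it shows what follows the if. One GNU C idiom for such a promise is __builtin_unreachable(), and the sketch below assumes that is the intent. The ASSUME macro and square_clamped function are hypothetical names, and the debug/release split is an addition for illustration.

```c
#include <assert.h>

/* Hypothetical macro. Debug builds check the condition at runtime;
   with NDEBUG defined it becomes a promise to the optimizer, and
   breaking that promise is undefined behavior. */
#ifdef NDEBUG
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#define ASSUME(cond) assert(cond)
#endif

float square_clamped(float x)
{
    float tmp = x * x;
    ASSUME(tmp >= 0.0f);   /* also excludes NaN, so never pass NaN here */
    /* Under the promise, this clamp is dead code that GCC may drop. */
    return tmp < 0.0f ? 0.0f : tmp;
}
```

Whether the compiler actually exploits the promise depends on the GCC version and optimization flags, so checking the generated assembly is the only way to be sure.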

java - How to see JIT-compiled code in JVM?

General usage: as explained by other answers, you can run with the following JVM options: -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly. Filter on a specific method: you can also filter on a specifi...

How does the stack work in assembly language?

I think primarily you're getting confused between a program's stack and any old stack. A stack is an abstract data structure which consists of information in a Last In First Out system. You put arbit...
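To make the Last In First Out idea concrete, here is a minimal sketch of the abstract data structure as a fixed-size array of ints. The type and function names, and the capacity of 16, are chosen for illustration only. It is not how a program's hardware stack is implemented.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 16   /* arbitrary size chosen for this sketch */

typedef struct {
    int items[STACK_CAPACITY];
    size_t count;           /* number of items currently stored */
} IntStack;

void stack_init(IntStack *s)
{
    s->count = 0;
}

/* Push: place a value on top of the stack. */
void stack_push(IntStack *s, int value)
{
    assert(s->count < STACK_CAPACITY);
    s->items[s->count++] = value;
}

/* Pop: remove and return the most recently pushed value. */
int stack_pop(IntStack *s)
{
    assert(s->count > 0);
    return s->items[--s->count];
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1: the last item in is the first one out.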

performance - Why is a conditional move not vulnerable for Branch Prediction Failure?

Mis-predicted branches are expensive. A modern processor generally executes between one and three instructions each cycle if things go well (if it does not stall waiting for data dependencies for thes...
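A conditional move makes the choice a data dependency rather than a control dependency, so there is no branch to predict. A rough C-level analogue is a select built from a bit mask, sketched below. The function names are hypothetical, and the compiler is free to emit cmov, a branch, or plain bit operations for this code.

```c
#include <assert.h>
#include <stdint.h>

/* Select without a branch in the source: mask is all ones when cond
   is nonzero and all zeros otherwise. Both inputs are always computed. */
uint32_t select_u32(int cond, uint32_t if_true, uint32_t if_false)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (if_true & mask) | (if_false & ~mask);
}

/* Clamp x into [lo, hi] using only selects. */
uint32_t clamp_u32(uint32_t x, uint32_t lo, uint32_t hi)
{
    assert(lo <= hi);
    x = select_u32(x < lo, lo, x);
    x = select_u32(x > hi, hi, x);
    return x;
}
```

The trade-off is that the result always waits for both inputs and the condition, whereas a correctly predicted branch lets execution continue before the condition is known.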

assembly - What do the brackets mean in x86 asm?

[L1] means the memory contents at address L1. After running mov al, [L1] here, the al register will receive the byte at address L1 (the letter 'w')....
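For readers who think in C, the brackets behave roughly like a pointer dereference. The sketch below is an analogy only, assuming NASM-style syntax where a bare label denotes an address. The array L1 stands in for the label, and load_byte and address_of are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the label L1: one byte of data holding the letter 'w'. */
const char L1[] = { 'w' };

/* Like `mov al, [L1]`: read the byte stored at the address. */
char load_byte(const char *address)
{
    assert(address != NULL);
    return *address;
}

/* Like using L1 without brackets: the address itself, not the contents. */
uintptr_t address_of(const char *address)
{
    return (uintptr_t)address;
}
```

Here load_byte(L1) gives 'w', while address_of(L1) gives wherever the linker happened to place that byte.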

c++ - Why does using the ternary operator to return a string generate considerably different code from returning in an equivalent if/else block?

The overarching difference here is that the first version is branchless. 16 isn’t the length of any string here (the longer one, with NUL, is only 15 bytes long); it’s an offset into the return objec...

c++ - Replacing a 32-bit loop counter with 64-bit introduces crazy performance deviations with _mm_popcnt_u64 on Intel CPUs

Culprit: False Data Dependency (and the compiler isn't even aware of it). On Sandy/Ivy Bridge and Haswell processors, the instruction: popcnt src, dest appears to have a false dependency on the dest...

assembly - Purpose of ESI & EDI registers?

SI = Source Index, DI = Destination Index. As others have indicated, they have special uses with the string instructions. For real mode programming, the ES segment register must be used with DI and DS...
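The string-instruction role of the two registers can be sketched in C by modelling what rep movsb does when the direction flag is clear: copy a byte from the source index to the destination index, advance both, and decrement the count. The function name copy_bytes is hypothetical, and segment registers are ignored in this model.

```c
#include <assert.h>
#include <stddef.h>

/* Models `rep movsb` with the direction flag clear. */
void copy_bytes(unsigned char *dest, const unsigned char *src, size_t count)
{
    assert((dest != NULL && src != NULL) || count == 0);
    size_t si = 0;             /* source index: the role of SI/ESI */
    size_t di = 0;             /* destination index: the role of DI/EDI */
    while (count != 0) {       /* count: the role of CX/ECX */
        dest[di] = src[si];    /* movsb copies one byte */
        si++;
        di++;
        count--;
    }
}
```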

c++ - How to disassemble a binary executable in Linux to get the assembly code?

I don't think gcc has a flag for it, since it's primarily a compiler, but another of the GNU development tools does. objdump takes a -d/--disassemble flag: $ objdump -d /path/to/binary The disassembl...