For various reasons, I've become quite curious about what the % operator does _internally_ in C++.
For instance, one could define N%k for 0 < k < N by repeated subtraction, as in
while(N >= k) { N -= k; } return N;
but this would be very bad ( floor(N/k) steps is too much!! ).
Therefore, I was wondering: How is the modulus (%) actually implemented in C++? Is it dependent on the compiler or the system you're running on? Is there any online documentation which answers this type of question? (I have similar questions about integer division, bitwise operators, comparison operators, etc.)
-----------
Alternatively, here is a more explicit question. Its answer should be fairly illustrative, with or without the previous answers...
Suppose N is some non-negative integer, stored as an unsigned int, or an unsigned long long, etc... What is the difference between N % 4 and N & 3 ?
2015-02-14T00:51:10Z
Since the only answer so far doesn't answer my intended question at all, it occurs to me that I may not have been clear enough.
I have absolutely no problem with using the % operator or understanding its _result_. My question is: How?
That is, when you use the operator %, what steps does the program do to find the result?
2015-02-14T00:57:46Z
It definitely isn't by repeated subtraction, since then something like N % 4 for a typical unsigned long long N would take absolutely forever (roughly O(N) ).
I suppose it could be something like repeated subtraction of left-shifted copies of 4 (which is O(log N), I think), but this is exactly what I'm asking:
What is the precise method used to find the modulus?
Kookiemon2015-02-14T00:51:37Z
Favorite Answer
How is the modulus (%) actually implemented in C++?
If I were to hazard a guess I would imagine it relies on the DIV Assembly command which stores the remainder in the AH, DX, or EDX register depending on the size of the values being divided; a BYTE, WORD, or DWORD.
So what does all this mean?
A computer uses machine code to run every single command that the CPU uses to do it's magic; it's fairly unintelligible though. Just above machine code is assembly language which is a low-level programming language that directly translates to machine code sans all the semantic names. A compiler such as C++ converts the code you type into machine code.
MOV dl, 'a'; ' Move character 'a' into DL register. MOV ah, 2h; ' 2h is the command to print DL to the screen. ' AH is where commands can be placeD. int 21h; ' Execute the command in AH.
' This prints out the letter 'a' to the display.
DIV is an assembly instruction. It tells the processor to divide the contents of one register by the contents of another register. This is as low-level as you can get without designing the logic gates in a chip. DIV works differently based on the size of the integers you want to divide. There are essentially three different sizes of data that it can take as input; a byte, a word, and a double word. A byte is 16-bits long, a word is 32-bits long, and a double word is 64-bits long. The size of a register is dependent on the "size" of the CPU. A 32-bit CPU will have registers that are 32 bits. A true 64-bit processor will have 64-bit registers.
Now each register has a name and is broken down into 8 bits. One register name is AX which is 16 bits long. It has an upper and lower 8 bit register called AH and AL. Above AX is EAX which is 32 bits long and on 64-bit CPU's you have a 64-bit register which would be called RAX. There are other registers so don't get stuck on the example I used.
Each register has a purpose. Some store data, some store commands, some store the resulting data from the execution of those commands. You can cause some serious damage to your computer if make a mistake at this level. There is no error checking.
Is it dependent on the compiler or the system you're running on?
The registers that store the remainder would most certainly be different for big- and little-endian systems.
You can Google the differences.
Is there any online documentation which answers this type of question? (I have similar questions about integer division, bitwise operators, comparison operators, etc.)
Google
assembly <keyword>
and you should get plenty of hits that adequately explain their implementations.
Suppose N is some non-negative integer, stored as an unsigned int, or an unsigned long long, etc... What is the difference between N % 4 and N & 3 ?
The latter method only works for finding the remainder when the divisor is of the form 2ᴺ.
The latter method is significantly more efficient in Assembly but only with the aforementioned divisors.
Every piece of code you write has a cost to it. That cost is running time. The longer the running time the longer it takes for your code to run. The DIV assembly instruction is a relatively costly instruction to run. It takes several cycles of the processor to calculate the results and place those results into the registers.
The modulo operator (%) likely uses the DIV assembly instruction to calculate the remainder so it has a moderately high cost in running time compared to an AND instruction which may take only two or three cycles.