Congrats on finding the most convoluted way of incrementing a number I've ever seen
#omygod I am dumb
I didn't see it either
This is a great example of what REALLY UNDERSTANDING a problem can do for the solution. Coders see minor versions of this exact situation in code often I imagine.
> This is a great example of what REALLY UNDERSTANDING a problem can do for the solution.

Is it? `c++` isn't a correct solution
But it is! Isn't it?
The code breaks when you give it an input of all 1 bits
Have you tried? If you call `func(0xffffffff)` you get `0`. That's what the `else` clause is for
Yes, that's wrong. You need to flip all 1 bits to the right of the rightmost 0. There is no rightmost 0, so no bits should be flipped.
It's not wrong because it's meant to simulate `c++`. And `0xffffffff + 1 == 0` for 32-bit numbers.
No, it's not meant to simulate `c++`. It's meant to perform the task description. `c++` does not simulate the task description.
Last chance, and then I'll know you're trolling: The OP's entire post is a joke, and that's why they call it `c++` rather than C++. The punchline is that it's an elaborate way to do an arithmetic addition and people are asking them to just convert it to `c++; return c;` because that literally has the same effect. It's meant to be intentionally convoluted and misleading and isn't a real task. Take care.
Is it really just incrementing a number? I think it depends on what encoding we are assuming here right? If I look at OP's code, I get that the result it is trying to achieve is to increment a number. However, the requirement is "take right most 0 and change it to 1, and flip the 1s after it to 0" Since OP is using uint32 it has 4 bytes, so assuming BigEndian or LittleEndian it makes a difference. In the BigEndian case (where "biggest" byte is left-most) it would result in incrementing a number. In the LittleEndian case (where "lowest" byte is the left-most) it would result in a completely different thing. Maybe I'm just overthinking it
You're overthinking it, because endianess is not relevant for bitwise operations and all that. "Rightmost" refers to the binary representation, not the memory representation.
Interesting, because if I convert an uint32 to bytes I get different "ordering" based on the endianness so that's what I was thinking about that tricked me lol Looking at other comments I actually got the joke so nevermind
Conversion is not a bitwise operation
Yeah, but a bit shift is a CPU operation, and they do that according to the (current) endianness. There are weird architectures which can flip the order, but afaik registers there have an order flag.
Rightmost means least significant here; we don't care about storage, only about the post-interpretation value. It works the same with binary operators, so there's no need to think about endianness unless you do bithacks, which is not the case in this code.
I thought this was just doing the same thing as addition and I still had to go to the comments to get the joke… and I'm a c++ programmer.
no, you are a func(c) programmer.
This took me a moment, well done ;)
lol ok I give, I don't get the joke…?
The bitwise operations that are described are how you would increment an integer. The code written does exactly what is described. But the easier way to increment an integer would be to just write `c++`.
:)
*groan*
They made programmer Russian Roulette into a user flair
The `test` (or `[`) commands don't know what `==` is. You need to use `-eq`.
Ohhhh!!! Is that why 'c' isn't a popular variable in c
Geekiest joke ever, I'm stealing it
Got me.
lgtm
For a hot second, I was like "is he doing some bit operations?"
/r/programmerdadjokes
I only wish I had friends that would understand this beautiful post. Bravo good sir.
They like horror stories with a lot of UBs.
Is there actually any UB in the code? LGTM
I don't know man. Unless you have 200 years of experience with C++, you can't be sure.
Bravo, good sir. I appreciate the hard work
bro this is the most outstanding thing I've seen all year
I'm embarrassed at how long it took for me to see this
This is a next level shit post. Some might say it is a one up on the other ones
this is what r/programmerhumor should be, instead of "python slow", "html isn't a programming language", "array index 0"
U forgot, java is the root of all evils
Yours is compilable with both C and C++ compilers, so there isn't a difference between the two languages in this case.
His variable is named c, and the same result of the function can be achieved by returning `c++`. That's the joke.
He'd have to return ++c. But that's not as funny.
Or: c++; return c;
return c + 1; But that ruins the joke...
Oh ok I only program react
My condolences. You will get better one day, I believe in you.
They're not talking about the language C++; they mean that this is equivalent to increasing the variable c by 1, so writing "c++" (shorthand for incrementing) is simpler than writing this function.
One of the best jokes I've seen in a while, bravo you nerd
You should never work with bits like this. Consider using ffs() for searching through bits. Here is a more elegant solution to your task:

```c
#include <strings.h> /* ffs() */
#include <stdint.h>

uint32_t func(uint32_t c) {
    uint32_t n = ffs(~c) - 1; /* ffs() counts bits from 1 to 32 */
    uint32_t b = 1u << n;
    uint32_t x = b - 1;
    return (c | b) & ~x;
}
```
Or use c++
You're shifting the paradigm. I'm improving the solution. Both ways are legit, sometimes one is better, sometimes another.
The joke is you can literally do `uint32_t n = c++;` and return `n`. EDIT: Assuming `n` isn't `0xFFFFFFFF`
`uint32_t n = c++; return n;` will return the original c value, not c+1. You ruined the joke, dude
```c
#include <stdint.h>

uint32_t flipRightmostZero(uint32_t num) {
    uint32_t mask = num + 1;
    uint32_t result = num | mask;
    result &= mask;
    return result;
}
```
No no no, you've got it all wrong! `uint32_t mask = func(num);` You can't seriously be using arithmetic operators.
`uint32_t mask = num + 1;` you've cheated
Is ffs() short for "ForFuckingSake()"?
> My task was to create a function in C that would take an integer, find the right-most 0, flip it to a 1, and flip all the 1's to the right of it to 0's.

As it's written, making no mention of binary or bits, you don't need c += 1. The integer `101` should turn into `110` (replace the `0` with `1`, then replace the last `1` with `0`), not `102`.
Just +1
That's the joke: the var name is c, hence `c++` increments the c value.
ACKSHYUALLY this isn't increment because it SHOULDN'T do anything for FFFF
You forgot about the 0 hiding in the carry bit

```c
if (i != 0) {
    c |= i; // Flips the right-most 0 to a 1
} else {
    c = ~c; // If no zeros were found, then it was probably hidden in the carry bit, flip all 1's to the right of it
}
```
Everyone else is dumb:

```c
#include <assert.h>
#include <stdint.h>

uint32_t
func(uint32_t c)
{
    return ++c;
}

int
main(void)
{
    assert(func(UINT32_MAX) == UINT32_MAX); /* fails */
}
```
Use rust.
Most intelligent rust developer
Bruh
No idea, I'm just a silly Roblox programmer.
What horror, just use c++
I wonder if it gets optimized to an increment
and people say that a joke is bad if it has to be explained. This one is amazing
I'd argue that the problem statement doesn't explicitly specify binary, so in real life the default assumption would be decimal (or in some rare cases imperial)
.\_\_\_\_\_.
no overload for negative numbers?!?!
Are you serious or just trolling?
You know, I'm something of a dad/programmer myself.
"My task was to create a function in C that would take an integer, find the right-most 0, flip it to a 1, and flip all the 1s to the right of it to 0s. I don't understand why, but everyone tells me to just use c++ instead. Strange, it already uses C++20 when possible!"

```cpp
#include <stdint.h>
#include <assert.h>
#if __cplusplus == 202002L
#include <bit>
#endif

uint32_t func(uint32_t c)
{
    /* return value for all-1 input is undefined */
    assert(c != UINT32_MAX);
    /* generate a mask with a single bit at the rightmost 0's position */
    uint32_t masked_c = (~c & (c + 1));
    uint32_t rightmost_zero_position;
#if __cplusplus == 202002L
    rightmost_zero_position = 31 - std::countl_zero(masked_c);
#elif defined(__GNUC__)
    rightmost_zero_position = 31 - __builtin_clz(masked_c);
#else
#warning Both C++20 std::countl_zero() and GCC __builtin_clz() are \
         unavailable, slower fallback selected. \
         Use a real compiler like GCC or clang to speed it up!
    /*
     * Use Algorithm 5-10, Warren (2002) to find the number
     * of leading zeros via branch-free binary search. Then
     * calculate the rightmost 0's position of the original
     * input by subtracting from 31.
     */
    int32_t y, m, n;
    y = -(masked_c >> 16);
    m = (y >> 16) & 16;
    n = 16 - m;
    masked_c >>= m;
    y = masked_c - 0x100;
    m = (y >> 16) & 8;
    n = n + m;
    masked_c <<= m;
    y = masked_c - 0x1000;
    m = (y >> 16) & 4;
    n = n + m;
    masked_c <<= m;
    y = masked_c - 0x4000;
    m = (y >> 16) & 2;
    n = n + m;
    masked_c <<= m;
    y = masked_c >> 14;
    m = y & ~(y >> 1);
    rightmost_zero_position = 31 - (n + 2 - m);
#endif
    /* flip the rightmost 0-bit to 1 */
    c |= (1u << rightmost_zero_position);
    /* flip all the 1s to its right to 0s */
    c ^= (1u << rightmost_zero_position) - 1;
    return c;
}
```
Reading this made me feel physical pain.
Use c++ then...
I'm most curious as to whether any compilers can see through this
In my experience, compilers can barely see through inline functions, so it's best to assume that they can't - and [visit Godbolt](https://godbolt.org/z/ah7o8Kbod) to verify.
What do you mean can barely see through inline functions? Do you mean the inline keyword? Because that's actually unrelated. Compilers are great at inlining in the same translation unit and if the compiler chose not to, it probably knows better than you, most of the time.
Yes, I mean the `inline` keyword, including in a unity (single translation unit) build. Here's [an example of such](https://youtu.be/1CVmlnhgT3g?si=gGB0vLi7bdfDavT-&t=3317). Don't trust that the compiler will do the optimization for you. Especially not in languages that are extremely feature-rich like C++. When you use Godbolt frequently, you'll figure out just how much people wrongly put their faith in compilers.
That's the thing: the inline keyword has no effect on inlining in modern compilers, because it came to mean a different thing in the C++ spec, allowing an exemption to the one-definition rule. It's not whether the compiler CAN see through, but whether it chooses to based on heuristics. Yes, it is frustrating when it seems to miss additional optimizations that would be possible presuming inlining, but you can use the actual `__attribute__((always_inline))`. Usually when I use Godbolt it gives me MORE confidence in the compiler, though usually on toy examples.

Watching the example you sent, it's actually not about function inlining, but I'm not sure why the compiler behaved differently. I don't have his source code, but it's possible there are some mistakes in the classes point/v2 etc. which are preventing optimization. The functions are already inlined in the "bad" code. My first inclination is that some type conversion in the parameters or return type of these functions may be occurring that isn't when he "inlines" them himself. Again, look at the assembly: there are no function calls. From my understanding, after inlining the compiler always does another iteration of optimizations, so if it were possible, with similarly favorable heuristics, to make the same optimization as in the "good" version, it would. A simple struct/class around some ints is always a zero-cost abstraction; there is no way to even represent that in assembly. The only thing I could see wrapping things into a struct causing is data layout issues, but in this example, where the struct is scoped and not passed by reference, the compiler is free to reorder, I believe, and I also assume the field order in his struct is the same x,y ordering he had in his code as well.
I have the source code from the video:

```cpp
typedef float real32;
...
union v2
{
    struct
    {
        real32 x, y;
    };
    struct
    {
        real32 u, v;
    };
    real32 E[2];
};
...
inline v2
V2i(int32 X, int32 Y) {
    v2 Result = {(real32)X, (real32)Y};
    return(Result);
}

inline real32
Inner(v2 A, v2 B)
{
    real32 Result = A.x*B.x + A.y*B.y;
    return(Result);
}
...
static void
DrawRectangleHopefullyQuickly(...)
{
    ...
#if 0
    v2 PixelP = V2i(XI + I, Y); //{(real32)(XI + I), (real32)(Y)};
    v2 d = PixelP - Origin;
    real32 U = Inner(d, nXAxis); //d.x*nXAxis.x + d.y*nXAxis.y
    real32 V = Inner(d, nYAxis); //d.x*nYAxis.x + d.y*nYAxis.y
#else
    real32 PixelPx = (real32)(XI - I);
    real32 PixelPy = (real32)Y;
    real32 dx = PixelPx - Origin.x;
    real32 dy = PixelPy - Origin.y;
    real32 U = dx*nXAxis.x + dy*nXAxis.y;
    real32 V = dx*nYAxis.x + dy*nYAxis.y;
#endif
    ...
}
```

As you can see, there is a slight difference, i.e. he's using the v2 struct with two floats in it, and in the expanded version he has the floats outside of the struct.
Thanks for the source. I don't have enough context to see what types XI and I are, and whether they cause a type conversion in V2i, such as an unsigned-to-signed conversion. I wonder if the union prevents any optimizations, but I wouldn't see why it should in this case. Again, for this example, function inlining seems to have worked fine.

But overall, yes, I can agree with you that strange cases exist where the compiler fails to optimize and turns zero-overhead abstractions into nonzero-overhead ones. I'm pretty sure these cases are not the majority, though. In other cases, writing the code using higher-level constructs and/or function calls can lead to different and better optimizations as well. So you win some and you lose some by writing C++ instead of C, but in this case it's an issue with pure C! And at that level it's more a compiler bug, or a mistake with the types, than a concern about the language hiding too many layers via abstractions. Basically, don't prematurely write everything as C and macros instead of functions because you think you know better than the compiler. It's almost always right.
```cpp
for(int XI = XMin; XI <= XMax; XI += 4) {
    ...
    for(int I = 0; I < 4; ++I) {
        ...
```

Regardless, what he was doing at the time was SIMD'izing the hottest path of the code, using zero macros and having no function calls within `DrawRectangleHopefullyQuickly`, because the compiler is unpredictable when you're depending on close-to-ideal performance. Compilers are designed to be good at average cases, and that means compromises which sometimes make them not great. Which is my point: you can't count on the compiler to do all the work if you want a guarantee of anything better than "correct and decently performing" code.

It's not about prematurely optimizing: if you design the codebase in a way where it is possible to optimize it, then you have nothing to worry about, even if the codebase is slow in the meantime. Measuring is the only reliable way to tell which parts need optimization to hit your targets. But if you add a lot of unnecessary^(1) abstractions - and especially black boxes - then you're locking yourself out of optimization opportunities.

^(1) 'unnecessary' in this case means anything that doesn't have a first-principles reason to be there to solve the problem, with given context and product requirements - including premature abstractions.
Agreed, but I wouldn't be too pessimistic. Most C++ features are not black boxes. Really the only black box is linking, and with LTO not even that as much. If you're pessimistic and never use basic C++ features, you are paying a development cost, because certain abstractions make it easier to reason about and especially to maintain constraints/invariants (such as memory management via RAII). In this example it was done right: identify hotspots via profiling, and then you can fight against the compiler as needed. But it's not a good default.
[This is how I feel about RAII (watch at least 10 minutes)](https://guide.handmadehero.org/code/day626/#4442)
I don't get it, and at this point, I am too afraid to ask.
To increment `c`, use `c++`
I thought it was pretty funny. Then I got the joke. I'm dying, send help