Hacker News new | past | comments | ask | show | jobs | submit login

You don't even need to do division. There are only 3 patterns that repeat for every set of 10 numbers. You just need to track which one you're on.



Yes the patterns repeat every 15 integers. So you only need to do one division operation, get the index modulo-15.

I was bored once at work and figured out a way to compute this without doing much arithmetic. It only requires 1 modulo operation.

   for (int i=1; i<100;i++)
   {
       // make a bit vector with a 1 in ith bit, modulo 15.
       unsigned int i_as_bitvector = 1 << (i % 15); 
       // Check it against the valid positions for FizzBuzz

       // 0ctal 011111 is binary 1001001001001          
       // Octal  02041 is binary   10000100001
       printf("%d  ", i); 
       if (i_as_bitvector & 011111 ) printf("Fizz"); 
       if (i_as_bitvector &  02041 ) printf("Buzz"); 
       printf("\n");
  }

I also have a version which has NO magic constants anywhere in the program, except 3 and 5. I'll post it if someone is interested.


Also, after printing 1 to 9, you have a single pattern of 30 numbers that repeats exactly 3 times from 10 to 99, then 30 times from 100 to 999, then 300 times from 1000 to 9999, and so on. That lets you extract lots of code from the loop and run it roughly once every 10^n numbers.


Why would you think in sets of ten, when there should actually just be one pattern in 15? Then it just becomes a challenge to arrange your code to work on these blocks of 15.

We could probably code 16 versions of the block of 15 code that repeat and are nicely aligned for SIMD.


In my mind a SIMD version would work by predefining a string buffer 15 elements long with Fizz in positions 0,3,6,... and Buzz in positions 0,5,10. These are constants that don't ever change. Then vectorized code would only write to positions that change, the numbers: 1,2,4,7,8,11,13,14. Most of the time these positions would have fixed width too (lengths of large numbers don't change often) so you can use constant write addresses. So 8 SIMD threads could handle a block, and write everything blockwise.

Same idea could be used to parallelize for a GPU.


Because the digits increment in groups of ten.


I was thinking blocks of 10 because I can itoa(the number of tens) and then concat with the subset of 0, 1, 2, 3, 4, etc. that I care about. I guess with blocks of 15 you just need to do two itoas, and worry about two types of block vs 3.


> There are only 3 patterns that repeat for every set of 10 numbers

Ah, yep, good point!


The real question is if you can use some for formatting numbers.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: