Yes the patterns repeat every 15 integers.
So you only need to do one division operation, get the index modulo-15.
I was bored once at work and figured out a way to compute this without doing much arithmetic. It only requires 1 modulo operation.
for (int i=1; i<100;i++)
{
// make a bit vector with a 1 in ith bit, modulo 15.
unsigned int i_as_bitvector = 1 << (i % 15);
// Check it against the valid positions for FizzBuzz
// 0ctal 011111 is binary 1001001001001
// Octal 02041 is binary 10000100001
printf("%d ", i);
if (i_as_bitvector & 011111 ) printf("Fizz");
if (i_as_bitvector & 02041 ) printf("Buzz");
printf("\n");
}
I also have a version which has NO magic constants anywhere in the program, except 3 and 5. I'll post it if someone is interested.
Also, after printing 1 to 9, you have a single pattern of 30 numbers that repeats exactly 3 times from 10 to 99, then 30 times from 100 to 999, then 300 times from 1000 to 9999, and so on. That lets you extract lots of code from the loop and run it roughly once every 10^n numbers.
Why would you think in sets of ten, when there should actually just be one pattern in 15? Then it just becomes a challenge to arrange your code to work on these blocks of 15.
We could probably code 16 versions of the block of 15 code that repeat and are nicely aligned for SIMD.
In my mind a SIMD version would work by predefining a string buffer 15 elements long with Fizz in positions 0,3,6,... and Buzz in positions 0,5,10. These are constants that don't ever change. Then vectorized code would only write to positions that change, the numbers: 1,2,4,7,8,11,13,14. Most of the time these positions would have fixed width too (lengths of large numbers don't change often) so you can use constant write addresses. So 8 SIMD threads could handle a block, and write everything blockwise.
I was thinking blocks of 10 because I can itoa(the number of tens) and then concat with the subset of 0, 1, 2, 3, 4, etc. that I care about. I guess with blocks of 15 you just need to do two itoas, and worry about two types of block vs 3.