bazzargh · 2025-09-04T15:02:18 1756998138

A word of caution. A few years ago we had a production impact event where customers were getting identical cookies (and so started seeing each other's sessions). When I took a look at the code, what I found was that they were doing something very like your code - using a time() based seed and a PRNG.

Whenever we deployed new nginx configs, those servers would roll out and restart, getting _similar_ time() results in the seed. But the individual nginx workers? Their seeds were nearly identical. Not every call to the PRNG was meant for UUIDs, but enough were that disaster was inevitable.

The solution is to use a library that leverages libuuid (via ffi or otherwise). A "native lua" implementation is always going to miss the entropy sources available in your server and generate clashes if it's seeded with time(). (eg https://github.com/Kong/lua-uuid, https://github.com/bungle/lua-resty-uuid)
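To make the failure mode concrete, here is a minimal sketch in Python rather than Lua (an assumption for illustration; the original code is not shown in this thread). The helper names `weak_uuid` and `worker_ids` and the startup timestamp are hypothetical. It shows two workers that seed a non-cryptographic PRNG with the same time() value issuing identical IDs, next to IDs drawn from the operating system's entropy source.

```python
import random
import secrets
import uuid


def weak_uuid(rng: random.Random) -> str:
    """UUID-shaped string from a non-cryptographic PRNG (the anti-pattern)."""
    return str(uuid.UUID(int=rng.getrandbits(128), version=4))


def worker_ids(seed: int, count: int = 3) -> list:
    """Simulate one worker that seeds its PRNG at startup, then issues IDs."""
    rng = random.Random(seed)
    return [weak_uuid(rng) for _ in range(count)]


# Hypothetical: two workers restart within the same second, so time()
# hands both of them the same seed.
startup_time = 1756998138
worker_a = worker_ids(startup_time)
worker_b = worker_ids(startup_time)
print("time-seeded workers collide:", worker_a == worker_b)

# uuid.uuid4() and the secrets module draw from the OS entropy source,
# so the result does not depend on when the process started.
safe_a = [str(uuid.uuid4()) for _ in range(3)]
safe_b = [str(uuid.uuid4()) for _ in range(3)]
session_token = secrets.token_urlsafe(32)
print("OS-entropy IDs collide:", safe_a == safe_b)
```

The point is only that the seed, not the generator, decides the whole sequence; anything that ends up in a cookie should come from the OS source.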

tptacek · 2025-09-04T15:25:34 1756999534

Why would you ever use an insecure RNG to generate a cookie?

bazzargh · 2025-09-04T16:01:46 1757001706

In the code I saw, at least twice in its history people had introduced a "pure lua" solution for speed, and were clearly unaware of the shotgun they'd just pointed at their feet. (as in, somebody saw the issue and fixed it, and then someone else _fixed it back_ before I came along).

But in case _I'm_ messing up here, I'll bow to your expertise: libuuid uses /dev/random, which uses a CSPRNG (ChaCha20) with entropy ingested via Blake2 from whatever sources the system can get, right?

We did actually do a bunch of before/after testing showing the collision rates (zero after), and I believe the cookie in question has been replaced with a third party identity system in the intervening years - but if we did it wrong, I'd like to know.

akerl_ · 2025-09-04T16:55:39 1757004939

I think the question isn’t “why switch it to libuuid”, it’s “why is anybody ever setting it to a time-based non-CS PRNG”.

magicalhippo · 2025-09-04T18:12:26 1757009546

Had this issue on a ray tracer I worked on. Since sampling was supposed to be random, you could fire it up on multiple machines and just average the result to get a lower noise image.

Except the distributed code fired up all worker instances almost simultaneously and the code used time() to seed the RNG, so many workers ended up using the same seed and hence averaging those results did nothing.
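A small sketch of why the averaging did nothing, using a toy Monte Carlo estimate as a stand-in for a noisy render (the ray tracer's code is not shown here, so `noisy_estimate` and the seed value are hypothetical):

```python
import random
import secrets


def noisy_estimate(seed: int, samples: int = 1000) -> float:
    """Monte Carlo estimate of pi/4; stands in for one worker's noisy image."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return hits / samples


# Hypothetical: every worker started in the same second and read the same time().
shared_seed = 1757009546
same = [noisy_estimate(shared_seed) for _ in range(8)]
# All eight "renders" are the same sample set, so averaging removes no noise.
print("distinct results with shared seed:", len(set(same)))

# Seeding each worker from the OS entropy source gives independent samples.
distinct = [noisy_estimate(secrets.randbits(64)) for _ in range(8)]
print("distinct results with per-worker seeds:", len(set(distinct)))
```

Deriving each worker's seed from the OS, or from a master seed plus a worker index, would avoid the duplicated work.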

01HNNWZ0MV43FF · 2025-09-04T19:29:08 1757014148

I am reminded of an article about a poker site:

"There are 52-factorial ways to shuffle a deck of cards, but the site's PRNG only has 32 bits of state. 4 billion is alarmingly less than 52-factorial! But even worse, the PRNG is seeded using the number of milliseconds since midnight. 86 million is alarmingly less than 4 billion!"

So the actual entropy on the card table was equivalent to about 5 cards' worth. After seeing the 2 cards in his hand, and the 3 cards in the flop, he could use a program to solve for every other card in everyone's hand and in the entire deck!

(I may have mixed up many details - If anyone has an archive of the article please post it!)
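The arithmetic behind the quote can be checked in a few lines. This is only a sketch of the numbers as recalled above (52 cards, 32 bits of PRNG state, a seed of milliseconds since midnight), not of the site's actual code:

```python
import math

# Bits needed to index every possible ordering of a 52-card deck.
deck_bits = math.log2(math.factorial(52))

# Bits of state in the PRNG described in the quote.
prng_state_bits = 32

# Possible seeds: milliseconds since midnight.
ms_per_day = 24 * 60 * 60 * 1000
seed_bits = math.log2(ms_per_day)

# Ordered draws of 4 and 5 cards, for the "about 5 cards' worth" comparison.
four_card_bits = math.log2(math.perm(52, 4))
five_card_bits = math.log2(math.perm(52, 5))

print(f"deck orderings: {deck_bits:.1f} bits")
print(f"PRNG state:     {prng_state_bits} bits")
print(f"seed space:     {seed_bits:.1f} bits")
print(f"4 cards: {four_card_bits:.1f} bits, 5 cards: {five_card_bits:.1f} bits")
```

The seed space comes out between four and five ordered cards' worth of information, which fits the recollection that the hand plus the flop was enough to pin down the shuffle.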

degamad · 2025-09-04T22:12:07 1757023927

I assume you're talking about

https://web.archive.org/web/20140210072712/http://www.laurad...

Previously on HN

https://news.ycombinator.com/item?id=7207851

jandrewrogers · 2025-09-04T18:08:56 1757009336

UUIDv4 is banned in some environments because of how common it is to find someone using weak PRNGs to generate them. It happens way more often than it should.

delduca · 2025-09-04T15:34:09 1757000049

Thank you for your advise, I will update my blog with it.

Just FYI I only use this on a hidden n' seek game engine, so it is fine.

bazzargh · 2025-08-27T19:18:42 1756322322

Bush Derangement Syndrome is covered (the writeup is linked to from the TDS article) but there is something special when republicans in multiple state legislatures have proposed _legislation_ on the subject of TDS, under that name, which would spend taxpayer money. https://en.m.wikipedia.org/wiki/Trump_derangement_syndrome#P...

bazzargh · 2025-08-27T19:11:07 1756321867

Remove 501(c)3 status, apparently. Trump's repeatedly threatened this in other cases - the TNPA concluded he didn't have that power with executive orders, but congress did https://tnpa.org/nonprofits-under-fire-how-the-irs-can-and-c...

Not a lawyer tho, and it seems that even with a majority getting something like that through congress would be very difficult.

hshdhdhj4444 · 2025-08-27T19:14:21 1756322061

So bias is reason to remove 501c3 status?

Then should we remove the 501c3 status of every church, mosque, temple, etc in the U.S. because they are biased towards not just the existence of a god, but the existence of their particular version of god?

slipperydippery · 2025-08-27T22:48:48 1756334928

More relevantly, it’s an open secret that a lot of churches are heavily into political advocacy directly for candidates, which they’re not supposed to do under their tax status, but they’ve been playing with the boundaries unchecked and are now really obviously past where they’re supposed to be—but nobody’s got the guts to go after them, so they just keep getting bolder.

bazzargh · 2025-08-22T09:28:15 1755854895

The paper also has a website with a longer supplemental video than the one linked to above; it includes most of the examples from the paper, but animated.

https://www.epfl.ch/labs/gcm/research-projects/c-tubes/

bazzargh · 2025-08-21T12:08:28 1755778108

Absolutely you can. The places in France and Spain I've flown to just suggest you bring your own pedals, so they match your shoe cleats; they'll fit them before your hire. You can usually bring your own saddle too. It's far more convenient than bringing the bike.

I've also done it the other way, my main bike has S&S coupling so I could bring it aboard the Eurostar. For touring I prefer my own bike, because I have the racks set up for my panniers, but when I do that, I prefer travelling by ferry/train.

mc3301 · 2025-08-22T00:30:48 1755822648

I love it! Even for the lift/shuttling-bike-park crowd, I bet many places would happily install cleat-matching pedals (or your own), your saddle... maybe your grips if you are picky?

bazzargh · 2025-08-12T16:19:52 1755015592

The limit given in the article is 360KB (on floppy). At that size, you can't use Tries, you need lossy compression. A Bloom filter can get you 1 in 359 false positives with the size of word list given https://hur.st/bloomfilter/?n=234936&p=&m=360KB&k=

The error rate goes up to 1 in 66 for 256KB (in memory only).
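Those figures can be reproduced with the standard Bloom filter approximation p = (1 - e^(-kn/m))^k. A rough sketch, assuming 1 KB = 1000 bytes and a rounded optimal hash count (assumptions chosen because they appear to match the linked calculator's output):

```python
import math


def optimal_k(n_items: int, m_bits: int) -> int:
    """Hash count that minimises false positives, rounded to an integer."""
    return max(1, round(m_bits / n_items * math.log(2)))


def false_positive_rate(n_items: int, m_bits: int, k_hashes: int) -> float:
    """Standard approximation for a Bloom filter's false positive rate."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


N_WORDS = 234_936  # word list size from the calculator link

rates = {}
for kilobytes in (360, 256):
    m_bits = kilobytes * 1000 * 8  # assumption: 1 KB = 1000 bytes
    k = optimal_k(N_WORDS, m_bits)
    rates[kilobytes] = false_positive_rate(N_WORDS, m_bits, k)
    print(f"{kilobytes}KB: k={k}, about 1 in {1 / rates[kilobytes]:.0f}")
```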

bazzargh · 2025-08-05T16:04:23 1754409863

to me it doesn't look great that Perplexity use BrowserBase at all. I asked BB's doc bot if you can customise the user agent; it says you can't because it sets the user agent automatically _in order to bypass bot checks_.

This seems to be the only secret sauce they offer; other than that it's just a headless browser farm. So perplexity saying "companies like Cloudflare mischaracterize user-driven AI assistants as malicious bots" is disingenuous at best; they chose to use a tool designed to mask their traffic and it blew up in their face?

bazzargh · 2025-07-28T12:36:41 1753706201

While this is nice enough, it bothers me that these don't look much like "art". If you look at real roman mosaics, they do not place points in a grid - they use a technique called "opus vermiculatum" https://en.wikipedia.org/wiki/Opus_vermiculatum ... snaking the tiles around so that there is a flow to it; the overall effect is much better.

I think that'd be possible to automate too. I was doing something related over here: https://hachyderm.io/@bazzargh/112767548339559102 - in that I was trying to generate sketch-like renderings from photographs. What I did was to pick random points, look at the brightness gradient (taken from the Sobel operator, there are other ways to do this), move up the gradient a bit and sketch some parallel lines (and then various experiments with hatching for shading the flatter areas)
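For anyone unfamiliar with the Sobel step, a minimal pure-Python sketch of getting a gradient magnitude and direction at one pixel. The image layout (a list of rows of grey values) and the function name are assumptions for illustration, not the code behind the linked post:

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical brightness change.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_gradient(image, x, y):
    """Return (magnitude, angle) of the gradient at interior pixel (x, y)."""
    gx = gy = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            value = image[y + dy][x + dx]
            gx += SOBEL_X[dy + 1][dx + 1] * value
            gy += SOBEL_Y[dy + 1][dx + 1] * value
    # The angle points from dark towards light; hatching lines would be
    # drawn perpendicular to it, i.e. along the edge.
    return math.hypot(gx, gy), math.atan2(gy, gx)


# Tiny greyscale image: dark on the left, bright on the right.
image = [[0, 0, 255, 255] for _ in range(4)]
magnitude, angle = sobel_gradient(image, 1, 1)
print(magnitude, angle)
```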

In a similar way you could start with a grid of tiles _with some separation_, and allow them to move and align better with the gradient of the underlying picture, and not lie _on_ edges, if possible. If they overlap, allow the tessera to be cut, and only then choose images to colour-match the average on the tile, leaving some "grout" in the image (I'd probably speckle that a bit so it didn't look too uniform). Then the result might look more like real mosaics.

I might give this a go...

schwartzworld · 2025-07-28T15:50:40 1753717840

> While this is nice enough, it bothers me that these don't look much like "art".

The author didn't invent this style. The first thing I thought of was the poster for the Truman Show. https://artofthemovies.co.uk/products/the-truman-show-1998-m... The site seems to make that happen quite well.

bazzargh · 2025-07-29T21:14:10 1753823650

I managed to get some decent results using this algorithm (not at my own computer so can't post code, yet):

- create a smaller greyscale copy of the image, and use sobel to calculate gradients (smaller to speed this up)

- set a gradient magnitude threshold, and add those points to a queue, largest first.

- for these points, add 2 squares, one either side of the queued point, in the direction of the gradient (ie you expect one to be light and one dark)

- add the squares to a location hash as they are placed. (I'm using a grid size slightly larger than my square tiles). The location hash is to speed up comparisons.

- skip placing a square if it would fall within a small radius of a previous square's centre (I used 0.5 of a square size). When looking up a point in the location hash, remember to look not for the point itself, but for the hash values for the corners of an axis-aligned square with your point at the centre; this is to catch overlaps when the point falls near grid lines.

- once all points in this queue have been processed or skipped, we start on a new queue, containing all squares placed so far.

- for each square in the queue, try to place a new square to its north, south, east and west along its alignment; as before skip these if they overlap too much

- any squares we place - jitter their position and angle slightly (otherwise it looks horribly unnatural)

- any squares we do place, add them to the phase 2 queue.
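Since the real code isn't posted yet, here is a rough Python sketch of the placement steps above, not the actual implementation. The constants, the names `LocationHash` and `place_tiles`, the jitter amounts, and applying jitter in both phases are all assumptions; edge points are passed in as (x, y, angle) rather than computed from an image.

```python
import math
import random
from collections import deque

TILE = 10.0             # tile side length (assumed: pixels)
CELL = TILE * 1.1       # hash grid slightly larger than a tile
MIN_DIST = TILE * 0.5   # reject centres closer than this to an existing one


class LocationHash:
    """Buckets tile centres by grid cell so overlap checks stay local."""

    def __init__(self):
        self.cells = {}

    def _key(self, x, y):
        return (math.floor(x / CELL), math.floor(y / CELL))

    def is_free(self, x, y):
        # Look up the cells under the four corners of a square centred on
        # the point, so neighbours just across a grid line are still found.
        keys = {self._key(x + dx, y + dy)
                for dx in (-MIN_DIST, MIN_DIST)
                for dy in (-MIN_DIST, MIN_DIST)}
        return all(math.hypot(x - px, y - py) >= MIN_DIST
                   for key in keys for px, py in self.cells.get(key, ()))

    def add(self, x, y):
        self.cells.setdefault(self._key(x, y), []).append((x, y))


def place_tiles(edge_points, width, height, rng):
    """edge_points: (x, y, gradient_angle) tuples, strongest gradient first."""
    grid, placed, queue = LocationHash(), [], deque()

    def try_place(x, y, angle):
        # Jitter position and angle slightly so the result looks hand-laid.
        x += rng.uniform(-0.05, 0.05) * TILE
        y += rng.uniform(-0.05, 0.05) * TILE
        angle += rng.uniform(-0.05, 0.05)
        if not (0 <= x < width and 0 <= y < height) or not grid.is_free(x, y):
            return
        grid.add(x, y)
        placed.append((x, y, angle))
        queue.append((x, y, angle))

    # Phase 1: two tiles per edge point, one either side along the gradient.
    for x, y, angle in edge_points:
        for side in (-0.5, 0.5):
            try_place(x + side * TILE * math.cos(angle),
                      y + side * TILE * math.sin(angle), angle)

    # Phase 2: grow outwards from every placed tile along its own alignment.
    while queue:
        x, y, angle = queue.popleft()
        for quarter_turn in range(4):
            direction = angle + quarter_turn * math.pi / 2
            try_place(x + TILE * math.cos(direction),
                      y + TILE * math.sin(direction), angle)
    return placed


placed = place_tiles([(50.0, 50.0, 0.0)], 100, 100, random.Random(1))
print(len(placed), "tiles placed")
```

Because the corner lookup square has side 2 * MIN_DIST, which is smaller than CELL, the four corner cells cover every cell that could hold a conflicting centre.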

The first phase is quite slow, so set the threshold high. Second phase placement is very fast. My squares all use a grey stroke the same grey as the background for the grout effect, and the squares are drawn using the colour of the point picked as the square's centre (I don't bother averaging). I have it rendering this interactively, using requestAnimationFrame, so it doesn't clog up the browser - I add about 50 tiles per frame.

I'm looking at one it did of the mona lisa; it places the phase 1 tiles along her hairline and hand in a nice "vermiculatum" way, the phase 2 placement is less satisfying but with jitter it seems ok. Originally I'd thought about calculating where squares overlap and cutting tiles nicely but it was quicker just to _allow_ the overlap and so most of what you see are the whole tiles placed on top of partials. The overall effect isn't _quite_ like hand placed tiles but I like it better than a grid.

You can see a couple of output images here. https://hachyderm.io/deck/@bazzargh/114938616011157584

Theodores · 2025-07-28T17:14:29 1753722869

Whilst you are at it...

https://github.com/openseadragon/openseadragon

A photo mosaic demands to have photos that are high resolution. You don't want to zoom in to find blurry jpegs, it just isn't right!

I have had great fun with OpenSeadragon in the past and now there is the VIPS image processing library for writing out a massive set of image tiles.

Hence it is possible to work with thumbnails and then render out the thing with OpenSeadragon and VIPS.

OpenSeadragon was amazing when it was a Microsoft demo a few decades ago, but time has moved on. I wonder what can be done with tilesets in HTML5 with picture tags or in SVG to present an infinitely zoomable montage.

I like your suggestion and the options for rotating and clipping images with SVG methods. I always confuse mask and clip, but, in SVG, much could be done.

For me, the starting point would be to do an SVG with thousands of images in it, to just watch my computer crash as I step up the resolution. Really I want to recreate OpenSeadragon in SVG...

wang_li · 2025-07-28T16:09:44 1753718984

There is a naive approach to making this kind of thing that reduces the component images to such a small size (2-3 pixels in large image) that turns this into more of a dithering exercise than looking for artifacts in each component image to match up lines. It's still a nice effect, but it's quite different when the component images are > 10% the size of the final image, instead of < 1% the size.

wvlia5 · 2025-07-28T12:55:42 1753707342

It could be better sometimes for tiles to lie over edges. For example, there might be an edge dividing red and green areas, and one of your tiles is mostly half red and half green.

bazzargh · 2025-07-13T21:09:15 1752440955

This is frustratingly difficult to understand from the site because they never include a human in the photos for scale (though the text says it's 3m in diameter). This video is a bit better https://www.youtube.com/watch?v=BvL0T5xyG5E ... it shows a cutaway in a scale model that makes it clearer where people stand, pictures of people painting the exterior and video of the globe in motion. I can't find any of what it looks like inside while in motion, which would have been nice. I guess I'll just have to go visit!

bazzargh · 2025-05-13T20:10:37 1747167037

Back in... 2006ish? I got annoyed with being unable to copy text from multicolumn scientific papers on my iRex (an early ereader that was somewhat hackable) so dug a bit into why that was. Under the hood, the pdf reader used poppler, so I modified poppler to infer reading order in multicolumn documents using algorithms that Tesseract's author (Thomas Breuel) had published for OCR.

It was a bit of a heuristic hack; it was 20 years ago but as I recall poppler's ancient API didn't really represent text runs in a way you'd want for an accessibility API. A version of the multicolumn select made it in but it was a pain to try to persuade poppler's maintainer that subsequent suggestions to improve performance were ok - because they used slightly different heuristics so had different text selections in some circumstances. There was no 'right' answer, so wanting the results to match didn't make sense.

And that's how kpdf got multicolumn select, of a sort.

Using Tesseract directly for this has probably made more sense for some years now.

steeeeeve · 2025-05-13T21:53:46 1747173226

I too went down that rabbithole. Haha. Anything around that time to get an edge in a fantasy football league. I found a bunch of historical NFL stats pdfs and it took forever to make usable data out of them.