> Can you explain why such large numbers are required? Absolutely. Even if you s...

jiggawatts · on April 20, 2022

Disclaimer: I have zero silicon design experience.

However, I have designed computer game engines that use 32-bit floats throughout and encountered rounding errors in practice.

I’ve found that there’s always a solution that avoids the need to go past 64 bits, and even that is a last resort.

So for example the circle could be approximated with a polygon. Or fixed-point arithmetic can be used. Or simply use a quad-tree or related space partitioning algorithms to check for intersections.

There are literally thousands of algorithms that sidestep these issues and are used extensively in computer games, typically at 32-bit precision exclusively.

For example “back in the day” you would often see shimmering due to “z-fighting”. You would also often see white pixels due to “cracks” between adjacent polygons.

All of these are largely gone now. The problems have been solved without the enormous performance hit of 64-bit doubles, let alone 128!

Meanwhile contemporary CAD programs would insist on using 64-bit doubles, even through the OpenGL pipeline out to the screen.

But if you sit down for a second and just divide your screen (or wafer mask) by the range of your chosen numbers you’ll instantly see that you have thousands of steps per pixel (or atom!) to work with.

Any visible noise is your fault for choosing the wrong algorithm, not the fault of the hardware for not providing enough bits.

q3k · on April 20, 2022

Please keep in mind that even the slightest error in a silicon mask is likely to cause hundreds of millions of dollars of losses and months of delay in time to market for a modern chip.

With that in mind, does it make more sense to come up with new, experimental, untested algorithms... or just use wider numbers and slowly iterate on well known algorithms? Especially with LVS/DRC you really want the dumbest, easiest to reason about thing that is most likely to catch design issues no matter what. Even if it's excruciatingly slow, it's your last line of defense against writing off a set of masks as a hundreds of millions of dollars loss.

EDA / silicon CAD is a totally different world of design requirements compared to video games or even MCAD software.

EDIT: and just for context, here's a DRC set for the (very much not modern) SKY130 process: https://github.com/RTimothyEdwards/open_pdks/blob/master/sky...

jiggawatts · on April 20, 2022

The exact same arguments were made by CAD people insisting on 64-bit maths for OpenGL. They were wrong. They too were working on projects worth billions of dollars, over decades, where mistakes were very costly.

Your link to a "DRC set" doesn't mean much to me out of context. I see some basic looking code with small-ish numeric constants in it. So what? This is not that different to the input to a simple physics simulation or a computer game.

hardolaf · on April 20, 2022

So let's get this straight. You know nothing about this area and you assume the experts in it are wrong? Do you know what happens if you accidentally couple lines during one of the manufacturing steps? The wafer can, in the absolute worst case scenario, explode from super heating destroying not just the wafer but potentially the entire chamber it is in (any defect beyond what was designated as allowable by the design engineers means the chamber and everything in it is now scrap).

For somewhat obvious reasons, we have a vested interest in this never occurring. So we default to safety over speed. Meanwhile in the CAD world with 64-bit math not making it into OpenGL, they just wrote a library to do 64-bit math anyways on-top of or in parallel to OpenGL. They didn't switch away from 64-bit math, they just reduced its use where it isn't needed and kept it where it is needed. The semiconductor industry is full of absolutely brilliant engineers who know far too much about all of the problems and if they could use 64-bit instead of 128-bits for a data structure, they'd switch in a heartbeat to save massive amounts of compute time (and thus money).

jiggawatts · on April 21, 2022

> Do you know what happens if you accidentally couple lines during one of the manufacturing steps?

I understand the consequences. I also understand both both physics and computer science. A 32-bit integer is sufficient to subdivide something the size of a wafer mask to well under the wavelength of the light used for photolithography. There is literally no way for additional precision to matter for things like "coupling lines". It is impossible.

See for yourself: https://www.wolframalpha.com/input?i=3+cm++*+2%5E-32

Iterated algorithms are a different beast entirely, but there are fixed-point or integer algorithms that sidestep these issues.

You cannot imagine the volume of computer science research that has been written on shape-shape intersections in both 2D and 3D! Literal textbooks worth. Hundreds if not thousands of PhD-level papers. The sheer intellectual effort that has gone into optimisations in this space is staggering.

Hence my incredulity. I've worked with 128-bit numbers and even arbitrary-precision numbers, but only in the context of computational mathematics. There are no "physics constraints" in mathematics to limit the benefit of additional range or precision.

Also, the financial argument doesn't hold water either. Modern chips have tens of billions of features. The data volume can exceed the size of main memory of even the largest computers. Data representation efficiency and simulation speed absolutely would have tangible business benefits: faster iteration cycles, lower simulation cost, better optimisation solutions, etc...

This is literally the point of the article -- being able to do things in GPUs using their native 32-bit maths capabilities is a huge benefit to the chip design workflow. This requires clever algorithms and data structure design. You can't be wasteful because "it feels safer" if you have a budget of 24 GB (or whatever) to squeeze the mask data into.

> assume the experts in it are wrong?

Yes! Something I've noticed is that there is surprisingly little "cross pollination" between fields. You can have very smart people in one industry blithely unaware that another industry has solved their "very hard problem". I've seen this with biology, physics, medicine, etc...

How many chip design automation experts have also done low-level game engine programming? Maybe half a dozen in the whole world? Less?

jiggawatts · on April 21, 2022

Replying to myself to add:

Here's a Design Rule Checking (DRC) paper that mentions using 32-bit floats for things like checking distances between polygonal traces: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.49...

If 32-bit is sufficient, then 64-bit definitely is, even for the latest & greatest silicon processes...

gchadwick · on April 20, 2022

Of course when you're doing such intersection calculations you know the things you're intersecting are very close. You don't need a general method that can test arbitrarily sized and spaced polygons against each other. You need a method to determine what is sufficiently close to each other to be worthy of a more detailed check. Then a more specific method to do this check.

You could use 32 bit integers with all shapes specified vs say a 0.1 nm grid giving you around a maximum 0.4m x 0.4m chip size which seems ample. Then when you want to check for rules violations in the cases like you specify with very fine precision use a dedicated check that can assume the relevant geometry is within a small number of grid points of each other. For example the check could work using relative coordinates rather than absolute so say switch to a grid on a 0.00001nm basis (to pull an arbitrary precision out of a hat) and convert the 32-bit absolute 0.1nm coords to relative 32-bit 0.00001nm coords.

Easier said then done to be sure (as you say the tools are egregiously expensive!) but just saying I need a 64-bit or a 128-bit float isn't trying to get to the grips with the problem, just hoping you wave it away with more bits.

bsder · on April 21, 2022

> You need a method to determine what is sufficiently close to each other to be worthy of a more detailed check.

Your line sweep data structure effectively already does that. I recommend reading the Hobby paper and thinking about how it works. And then you should think about how you differentiate inside from outside when you union/difference your polygons.

Any segments in the line sweep data structure simultaneously have already demonstrated that they need the detailed check.

If you want to argue this, you're going to need to study up on about 40 years of prior art. Given how much money this is worth and how many really smart people went after it (it basically drove the field of Computational Geometry for decades), the probability of you contributing something new to the current algorithms is basically zero.

However, the probability of you contributing something new to parallel DRC algorithms is really quite decent. Nobody I know of has yet come up with "good" parallel algorithms for DRC--most of them are hacks that quite often break down when they hit even common cases.

Being able to handle DRC on a billion VLSI polygons/10 billion line segments in a parallel fashion would be quite an advance, and the field is waiting for it.