
GC is inherently harder. It has nothing to do with expectations.

You have all the proof burden of using malloc (total size of live objects must fit in memory and WCET of allocation can’t suck) plus the additional proof burden that the collector’s cycle is schedulable.



> GC is inherently harder.

You need to pay closer attention to the claim I am actually making. Of course GC is harder than malloc/free. But malloc/free is not hard-real-time by default [1], and making malloc/free hard-real-time is just as hard as (indeed is pretty much the same problem as) making GC hard-real-time. Hard-real-timeness is a hard problem that is almost entirely independent of the allocation/deallocation strategy.

[1] https://militaryembedded.com/avionics/safety-certification/j...


That linked article recommends using a custom allocation/deallocation strategy, because malloc/free do not know about your application's specific memory management requirements.


IIRC, malloc() is generally banned; instead, "database" approaches with object pools (possibly partitioned, possibly not), or in fact GC, are used.
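
For illustration, a minimal sketch of the object-pool style mentioned above (names and sizes are hypothetical): every slot is the same size and carved out at init time, so acquire and release are O(1) and the worst-case footprint is known at design time.

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical fixed-size object pool: every slot is the same size,
     * so acquire/release are O(1) and the worst-case footprint is
     * POOL_SLOTS * SLOT_SIZE, known at design time. */
    #define POOL_SLOTS 64
    #define SLOT_SIZE  128

    typedef struct {
        uint8_t  storage[POOL_SLOTS][SLOT_SIZE];
        void    *free_list[POOL_SLOTS];   /* stack of free slots */
        size_t   free_top;
    } pool_t;

    void pool_init(pool_t *p) {
        p->free_top = POOL_SLOTS;
        for (size_t i = 0; i < POOL_SLOTS; i++)
            p->free_list[i] = p->storage[i];
    }

    void *pool_acquire(pool_t *p) {
        if (p->free_top == 0)
            return NULL;                  /* pool exhausted: bounded, predictable failure */
        return p->free_list[--p->free_top];
    }

    void pool_release(pool_t *p, void *obj) {
        p->free_list[p->free_top++] = obj;   /* O(1), no search, no fragmentation */
    }

The predictable failure mode (NULL once the pool is exhausted) is part of the appeal here: the capacity analysis reduces to counting slots.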


I’m paying extremely close attention to the repeated claim that malloc is just as hard as GC for hard real time. That claim is false. See my previous comments for the reason why.


Here is your previous comment in its entirety:

> GC is inherently harder. It has nothing to do with expectations.

> You have all the proof burden of using malloc (total size of live objects must fit in memory and WCET of allocation can’t suck) plus the additional proof burden that the collector’s cycle is schedulable.

I'm sorry, but I don't find that at all persuasive. You haven't even presented an argument. All you've done is proclaim that I have the burden of proof. Sorry, but I have no such obligation. I am providing information that I believe in good faith to be true based on my experience, but it comes with no warranty of any kind. I may well be wrong. In particular, there may have been a research breakthrough in real-time malloc/free that I am unaware of. It has been quite a while (>20 years) since I kept up with this literature. But back when I was an expert there were fundamental reasons why it wasn't any easier to make malloc/free hard-real-time than GC, so a new development on this front would be a major breakthrough. It's possible that such a breakthrough has occurred and I just haven't heard about it. But it seems unlikely. It's not like I've been living in a cave, and a web search doesn't yield any recent new results.

But the appropriate response if you want to challenge me on this would be to point me to the publication where this alleged simple solution to hard-real-time malloc/free is described, not to simply proclaim that I have the burden of proof.


The proof burden isn’t on you. The proof burden is on someone trying to ship a hard real time system with a GC. That person has to convince themselves that their system meets its deadlines. In a system that uses GC, the GC is either:

- stop the world. Then you have to prove that the GC pause will never be so long that it causes you to miss deadlines and that the pause can’t happen so frequently (even if small) that you miss deadlines. That’s impractical to do unless your heap is tiny. But some success stories of GC in hard RT did take the “lots of tiny heaps” approach, though I don’t know if they went as far as meeting the kind of proof burden that is demanded by the hardest of real time. With malloc, you could do tiny heaps if you wanted to, but you’d never be forced to do it just to appease malloc.

- the GC is a task. Then you have to prove that the task will meet its deadline. You have to do that for any task in a real time system. Unfortunately, the GC task’s deadline depends on allocation rate, which depends on the schedule of the entire rest of your system, plus a worst case allocation rate (bytes allocated per second) analysis. You don’t have to think about allocation rates if you’re using malloc. You just have to know how fast malloc is. With GC, you also have to know how fast it can allocate, and you need all that other stuff. It’s more work to get right.

- Work-based incremental GC. These can avoid the schedulability analysis, sort of, but come with a lot of other baggage. The worst misfeature of these collectors is that the execution time of allocation is hella jittery. Sometimes very fast, other times very slow, though always O(1). That’s the kind of thing you could almost use in a hard RT system so long as the worst case isn’t too bad, but in practice it’s a nightmare because if you really assume that every allocation will perform as badly as the worst allocation case in one of these GCs then you’ll have to overprovision to the point of absurdity.

Most folks use a concurrent GC with some element of work-based collection. So, the GC is a task, and its schedule is somehow tightly coupled to the current allocation rate. That’s good enough for soft RT and it can be good enough for some things on the softer side of hard RT. But even that’s hard to pull off and involves a beast of a collector, tight integration between RTOS and GC, and tight integration of the GC and test plan. No matter how you do it, way harder than malloc. You have all the problems of malloc plus other problems that malloc doesn’t have.
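
To make the work-based incremental option above concrete, here is a toy sketch (hypothetical, not modeled on any particular collector) in which each allocation pays for a bounded number of marking steps. Each call is O(1), but the cost swings between a cheap no-op step and a real tracing step, which is the jitter described above.

    #include <stddef.h>

    /* Toy sketch of a work-based incremental mark phase (hypothetical, not any
     * real collector): every allocation performs a bounded number of marking
     * steps, so each call is O(1), but the constant is jittery because some
     * steps find tracing work to do and some find none. */

    #define HEAP_OBJECTS        256
    #define GC_STEPS_PER_ALLOC  8     /* tuning knob: more steps = shorter cycles, slower allocs */

    typedef struct object {
        struct object *fields[2];
        int marked;
    } object_t;

    static object_t heap[HEAP_OBJECTS];
    static size_t   next_free;                     /* bump pointer for the toy heap */

    static object_t *gray_stack[HEAP_OBJECTS];     /* worklist of objects still to scan */
    static size_t    gray_top;

    static void gc_incremental_step(void) {
        if (gray_top == 0)
            return;                                /* cheap step: nothing left to trace */
        object_t *obj = gray_stack[--gray_top];
        for (int i = 0; i < 2; i++) {              /* expensive step: scan children */
            object_t *child = obj->fields[i];
            if (child && !child->marked) {
                child->marked = 1;
                gray_stack[gray_top++] = child;
            }
        }
    }

    object_t *gc_alloc(void) {
        /* Bounded work per allocation: O(1) worst case, but the actual cost
         * swings between "pop nothing" and "scan and push two children". */
        for (int i = 0; i < GC_STEPS_PER_ALLOC; i++)
            gc_incremental_step();
        if (next_free == HEAP_OBJECTS)
            return NULL;                           /* out of memory in the toy heap */
        return &heap[next_free++];
    }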


It's true that malloc/free does not have these problems. It has different problems. A standard implementation of malloc/free is going to sometimes do heap defragmentation, or search a free list for a block of the right size, or something that isn't going to be O(1), and making it O(1) is not going to be significantly easier than making a GC O(1). And it is not true that GC has all the problems of malloc. GC can relocate objects. Malloc can't.
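
As a concrete illustration of that non-O(1) behavior, here is a deliberately naive first-fit sketch (not how any production malloc works): the search time depends on the length of the free list, i.e. on the program's whole allocation/free history rather than on the request itself.

    #include <stddef.h>

    /* Deliberately naive first-fit allocator sketch: no splitting, no headers,
     * no coalescing. The point is only that the search is proportional to the
     * length of the free list, which depends on fragmentation history. */

    typedef struct free_block {
        size_t             size;
        struct free_block *next;
    } free_block_t;

    static free_block_t *free_list;   /* head of the singly linked free list */

    void *naive_malloc(size_t size) {
        free_block_t **prev = &free_list;
        for (free_block_t *b = free_list; b; prev = &b->next, b = b->next) {
            if (b->size >= size) {    /* first fit: unlink and return this block */
                *prev = b->next;
                return b;
            }
        }
        return NULL;                  /* no block big enough: OOM or fragmentation */
    }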


The fact that GC can move objects is a plus for GC only if you don’t care about pauses. If you do care about pauses then moving objects makes everything harder.


We seem to be talking past each other. Let's go back to basics:

Neither GC nor malloc/free (which I am going to start abbreviating as MF) are 100% guaranteed not to fail under all circumstances. As a trivial example, if you try to allocate more memory than you actually have physically available, both GC and MF will fail. That is not the only example of a circumstance under which they will fail, but that is irrelevant. The point is simply that there are circumstances under which both GC and MF can fail. This is true regardless of any real-time considerations. So the possibility of failure is a fact of life that cannot be discharged even with arbitrarily clever allocation technologies, and even in the absence of hard-real-time constraints.

Likewise, there are circumstances where both GC and MF are essentially equivalent. If (for example) you only ever allocate a finite amount of memory, and that amount is less than what is available, and you never free anything, and you never produce any garbage, then both GC and MF will work. As an ancillary observation, even the most naive implementation of GC and MF will return results in O(1) time, and the multiplicative constant will be very low. In other words, under these circumstances, both GC and MF can be reasonably considered to be "hard-real-time". Their behaviors will be essentially indistinguishable.

The only circumstance under which the question of GC vs MF becomes interesting at all is in more complicated situations, where you have complex allocation and de-allocation patterns, and the total amount of memory you are allocating over the lifetime of the program is more than the amount of physical memory available. Under those circumstances you have to be clever and re-use memory that has been previously allocated and freed, either manually, or by a collector.

Under those circumstances, the devil is very much in the details. It is not hard, for example, to come up with an allocation/deallocation pattern that will (almost certainly) cause an allocator that cannot relocate objects to fail, but which would succeed using a relocating allocator.
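
A concrete instance of such a pattern (hypothetical numbers; on a hosted system with a growable heap the final call may still succeed, but on a fixed-size heap it illustrates the failure): allocate many small blocks, free every other one, then request a block larger than any single hole. A non-relocating allocator can fail even though half the memory is free, while a compacting collector could slide the survivors together and satisfy the request.

    #include <stdlib.h>
    #include <stdio.h>

    /* Hypothetical fragmentation pattern: with a non-moving allocator on a
     * fixed-size heap, the large request can fail even though roughly half
     * the heap is free, because the free space is scattered in 1 KB holes.
     * A compacting GC could slide the survivors together. */
    int main(void) {
        enum { N = 1024, SMALL = 1024 };
        void *blocks[N];

        for (int i = 0; i < N; i++)        /* fill the region with small blocks */
            blocks[i] = malloc(SMALL);

        for (int i = 0; i < N; i += 2)     /* free every other block: ~512 KB free, */
            free(blocks[i]);               /* but no contiguous hole bigger than 1 KB */

        void *big = malloc(256 * 1024);    /* needs one contiguous 256 KB region */
        printf("large allocation %s\n", big ? "succeeded" : "failed");
        return 0;
    }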

So although you are correct when you say that "if you care about pauses then moving objects makes everything harder," that is missing the point rather badly. The alternative to "making everything harder" is accepting failure in circumstances under which they would not have failed if you had "made everything harder".

Which brings me full circle to my original point: making things hard-real-time is hard, full stop. Hard-real-time GC is hard, but it is not the GC part that makes it hard, it is the hard-real-time part that makes it hard. And yes, of course you can make it less hard, but at the cost of having a system that is less capable, and which will fail in more circumstances. Making those kinds of trade-offs is part of what makes this problem hard. The differences between GC and MF are just part of the details. They are not the fundamental drivers of the difficulty of this problem.


Why not put option 2 on another CPU? That way you would only get minimal interrupts to the main workload.

Is that a done thing?


Sure. But that doesn’t solve the schedulability problem. You’d still have to prove that the GC task completes soon enough. What does soon enough mean? All of the hard things I mentioned. Parallelism may help with provisioning but it doesn’t reduce the proof burden.


Malloc does not have a constant or guaranteed runtime.

It leads to memory fragmentation and requires non-constant-time searches to find a free spot for the next malloc.


It’s extremely difficult to make anything in GC have constant-time guarantees.

Although non-real-time GCs can defrag the heap, it’s a huge problem to do it in a real-time GC.


Same problems as malloc and free. And because GC is abstracted away and has faster amortized time per allocation, you have more resources to work with.

Look at the Metronome GC from IBM.

You basically have to guarantee a certain level of resources or a certain allocation rate. Those are things you have to do anyway in non-GC'd languages.
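
As a rough sketch of what such a contract looks like in a time-based, Metronome-style design (made-up numbers, not IBM's actual implementation): the collector gets a fixed share of every period, which gives the application a guaranteed minimum mutator utilization, and the remaining obligation is to show that the worst-case allocation rate fits within what the collector's share can reclaim.

    #include <stdio.h>

    /* Rough sketch of a Metronome-style time-based contract (made-up numbers,
     * not IBM's implementation): each period is split between mutator and
     * collector quanta, giving a guaranteed minimum mutator utilization (MMU). */

    #define PERIOD_US            10000   /* scheduling window: 10 ms           */
    #define GC_QUANTUM_US          500   /* one collector time slice           */
    #define GC_QUANTA_PER_PERIOD     6   /* collector gets 3 ms of every 10 ms */

    /* The schedulability side of the argument: the collector must reclaim at
     * least as fast as the application allocates, or the heap grows without
     * bound. Both rates are inputs the designer has to establish by analysis
     * or measurement; this is the "allocation rate" obligation above. */
    static int allocation_rate_is_sustainable(double worst_case_alloc_bytes_per_s,
                                              double reclaim_bytes_per_s_of_gc_time) {
        double gc_share = (double)(GC_QUANTUM_US * GC_QUANTA_PER_PERIOD) / PERIOD_US;
        return worst_case_alloc_bytes_per_s <= reclaim_bytes_per_s_of_gc_time * gc_share;
    }

    int main(void) {
        double mmu = 1.0 - (double)(GC_QUANTUM_US * GC_QUANTA_PER_PERIOD) / PERIOD_US;
        printf("guaranteed mutator utilization: %.0f%%\n", mmu * 100.0);
        printf("10 MB/s allocation sustainable: %s\n",
               allocation_rate_is_sustainable(10e6, 50e6) ? "yes" : "no");
        return 0;
    }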

This whole conversation is moot. Hard real time does not mesh with dynamic allocation, unless your allocation behaviour is very well understood and predictable, in which case it ain't hard to design a GC that works for it.

Malloc/free are not free. They have their own limitations and performance characteristics.



