
Can someone clarify this part like I’m 5:

> Its design is inspired by purely functional programming paradigm and transputers architecture.

> RTOS is implemented in a form of an executor instead of classic scheduler and doesn't support preemption. Executor runs tasklets, which are fine-grained units of computation, that execute a processing step in a finite amount of time.

Transputers seem to be a leftover technology from the 80s, and lack of preemption seems hard to reconcile with real time guarantees?



As far as my very limited understanding goes, the transputer reference is simply referring to a highly parallel message-passing system based on message queues, not actual hardware implementations.

As for lack of preemption, that's explained by the system not implementing a scheduler. The system "just" implements an executor.

An Executor is a component that handles subscriptions (e.g. to interrupts or events/messages), services (both clients and servers), timers, and QoS events (deadline missed, invalid QoS request, task died/unresponsive, etc.), which are implemented as tasks (Aerugo calls them "tasklets") and registered with the executor.

• Subscriptions are tasks that subscribe to a topic (of a message).

• Service servers are tasks that are invoked when a request from a client is received.

• Service clients are tasks that are invoked whenever a response from a server is received.

• Timers are tasks invoked when their timer expires.

• QoS events are fired when a deadline has been missed, a task died or became unresponsive, a QoS request couldn't be met, and so on.

Aerugo supports a subset of these capabilities, namely message queues (for subscribers/clients and producers/servers), events (IRQs, h/w signals, etc.), and cyclic execution (execute the task n-times or forever).

Scheduling - i.e. the priority of things - is handled by the user or built into the Executor itself. For Aerugo the former is the case, so if you need control over priorities and the order of task execution, you'd have to implement that on top of Aerugo.
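
To make the executor/tasklet idea concrete, here is a minimal sketch in Rust of a run-to-completion executor. It is illustrative only - the names and structure are made up and do not reflect Aerugo's actual API - but it shows the key property: each tasklet's processing step runs to completion and is never interrupted.

```rust
use std::collections::VecDeque;

// A tasklet: a fine-grained unit of computation with its own message queue.
// Each invocation performs one bounded processing step on one message.
struct Tasklet {
    name: &'static str,
    queue: VecDeque<u32>, // incoming messages
}

/// Round-robin executor: run each ready tasklet's step to completion.
/// Returns a log of what ran, in order, for illustration.
fn run_executor(mut tasklets: Vec<Tasklet>) -> Vec<String> {
    let mut log = Vec::new();
    loop {
        let mut ran_any = false;
        for t in tasklets.iter_mut() {
            if let Some(msg) = t.queue.pop_front() {
                // The "processing step": bounded work, never preempted.
                log.push(format!("{} processed {}", t.name, msg));
                ran_any = true;
            }
        }
        if !ran_any {
            break; // all queues drained; a real executor would sleep/wait here
        }
    }
    log
}

fn main() {
    let tasklets = vec![
        Tasklet { name: "sensor", queue: VecDeque::from([1, 2]) },
        Tasklet { name: "logger", queue: VecDeque::from([7]) },
    ];
    for line in run_executor(tasklets) {
        println!("{line}");
    }
}
```

Note that because nothing is ever interrupted mid-step, no locks are needed around state that only one tasklet touches per step - which is much of the appeal of this design.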

Now the realtime part isn't really affected by any of this. It still allows for weaker RT systems. RT just means there's a time constraint on executed tasks. Weaker systems have average runtime guarantees (e.g. tasks have to finish within a limit on average, i.e. outliers are allowed). Stronger systems are stricter in the sense that tasks always have to finish within the limit. The strongest systems even enforce an exact limit (that is, tasks aren't even allowed to finish sooner - they have to take exactly a given amount of time).

Preemption is required if tasks have to be scheduled w.r.t. different priorities, especially in multi-threaded scenarios. Lack of it means that the executor will never interrupt a running task to execute tasks that take priority - at least from what I understand, which might be wrong.


A couple of things of note:

Run-to-completion scheduling is a thing in the real-time world. See for example Miro Samek's book Practical UML Statecharts in C++.

XMOS makes real-time embedded CPUs supporting task-per-core architecture and hardware message passing. Commonly used in consumer real-time audio hardware, to pick one example. I believe some of the original Transputer people are involved.


No preemption makes this an organizer more than an RTOS.


Would you please explain more?


No preemption means that any realtime guarantees are handled by user space, as the kernel relies on tasks yielding in order to hit time guarantees. It means the kernel doesn't have to do much of the "difficult" work for a realtime system.


On a preemptive OS, all multitasking with (non-CPU) contended resources is, to a certain extent, cooperative. If one task acquires a mutex and then never releases it, no amount of priority-inheritance, highest-locker semantics &c. will fix that. Making everything cooperative increases the analysis needed because the CPU becomes a manually managed contended resource.

In addition, while not being preemptive, it sounds like executors are expected to run in bounded time, and the system triggers an event when the time is exceeded.
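
A sketch of that bounded-time check, in Rust. This is a hypothetical mechanism, not Aerugo's actual implementation: the step runs to completion (it cannot be preempted), and only afterwards does the system compare elapsed time against the deadline and fire a "deadline missed" QoS event if it overran.

```rust
use std::time::{Duration, Instant};

/// Run one processing step to completion, then check it against its deadline.
/// Ok(elapsed) = met the deadline; Err(elapsed) = "deadline missed" QoS event.
fn run_step_with_deadline(
    step: impl FnOnce(),
    deadline: Duration,
) -> Result<Duration, Duration> {
    let start = Instant::now();
    step(); // runs to completion; nothing can interrupt it
    let elapsed = start.elapsed();
    if elapsed <= deadline { Ok(elapsed) } else { Err(elapsed) }
}

fn main() {
    // A step that deliberately overruns its 1 ms budget.
    let result = run_step_with_deadline(
        || std::thread::sleep(Duration::from_millis(5)),
        Duration::from_millis(1),
    );
    match result {
        Ok(t) => println!("step finished in {t:?}"),
        // What happens on a miss is application-specific: log, reset, safe-state.
        Err(t) => println!("QoS event: deadline missed after {t:?}"),
    }
}
```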

Complete systems, not operating-systems, are realtime. Operating systems can be non-realtime (most general purpose schedulers are completely unsuited for hard realtime systems), but at their best only make the design of realtime systems tractable.


You are confusing deadlock with scheduling. Sure, deadlock can happen in the absence of "cooperation". And lack of on-time scheduling can happen in a cooperatively scheduled system in the absence of the appropriate cooperation.

But scheduling has been considered by most kernel designers to be the responsibility of the kernel, not the participants (i.e. threads are scheduled preemptively, not cooperatively).

Even if a thread is going to deadlock as soon as it starts running (again), there is a huge difference between that thread being scheduled at the right time and it not being scheduled. You can fix the former (deadlock) with better coding. You cannot fix the latter without fixing the kernel.


> You are confusing deadlock with scheduling. Sure, deadlock can happen in the absence of "cooperation". And lack of on-time scheduling can happen in a cooperatively scheduled system in the absence of the appropriate cooperation.

I was describing situations much broader than deadlock. The following pseudocode is not deadlock, but nevertheless a failure of cooperation:

  WaitForMutex(m)
  DoSomeReallyLongComputation()
My point was that the above code in a preemptively scheduled system is as damaging to all tasks that will contend for "m" as this code is in a cooperatively scheduled system:

  DoSomeReallyLongComputation()
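
That claim can be demonstrated with ordinary preemptive threads. The sketch below (illustrative Rust, with sleeps standing in for the long computation) shows a "low-priority" thread holding a mutex across long work; a second thread contending for the same mutex stalls for the full duration, no matter how the OS schedules it.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Returns how long a contending thread waited on a mutex held across
/// a long computation, despite fully preemptive scheduling.
fn contended_wait() -> Duration {
    let m = Arc::new(Mutex::new(()));
    let m2 = Arc::clone(&m);

    // "Low-priority" task: WaitForMutex(m); DoSomeReallyLongComputation()
    let hog = thread::spawn(move || {
        let _guard = m2.lock().unwrap();
        thread::sleep(Duration::from_millis(100)); // the long computation
    });

    thread::sleep(Duration::from_millis(10)); // let the hog acquire the lock first
    let start = Instant::now();
    let _guard = m.lock().unwrap(); // the other task blocks here regardless of priority
    let waited = start.elapsed();
    hog.join().unwrap();
    waited
}

fn main() {
    println!("waited {:?} despite preemption", contended_wait());
}
```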
> But scheduling has been considered by most kernel designers to be the responsibility of the kernel, not the participants (i.e. threads are scheduled preemptively, not cooperatively).

Yes and no. Schedulers tend to have parameters. Realtime systems will rely heavily on those parameters. Those parameters will sometimes even include promises for thread T to not run for more than X amount of time in a period of Y time.

> Even if a thread is going to deadlock as soon as it starts running (again), there is a huge difference between that thread being scheduled at the right time and it not being scheduled. You can fix the former (deadlock) with better coding. You cannot fix the latter without fixing the kernel.

This is true in a preemptively scheduled kernel. It's kind of tautological that fixing issues with resource X needs to fix the kernel IFF X is managed by the kernel. See also my above paragraph about kernel scheduler parameters.

[edit]

Just saw who I was replying to. I suspect that you and I have different visions of what a "Real Time System" is, given that I'm thinking industrial control and you're probably thinking audio. There's definitely overlap in theory and discipline, but the hardware and software stacks are rather different.


A kernel (because that's where interrupt handlers are located) can ensure that a thread is scheduled within N usecs of when it "ought to be" (which could be based on some sort of time allocation algorithm, or simple priorities, or whatever other scheme may be in use). The kernel can say "oh look, it's been N usecs, let's check who is running and who is ready to run ... hey, time for Thread 2 to run". This is preemptive scheduling.

No cooperative scheduling system can ensure this.

Your example involves poorly designed code, which is not the responsibility of the scheduler. Its job is just to make sure that threads run when they "ought to" - it cannot protect against priority inversions in user space and ensure that RT guarantees are met (pick 1, and even then, you lose).


You two are talking past each other.

They are saying that a real time application requires a transitive closure analysis of all cooperating partners to verify if service can be guaranteed.

A cooperative scheduler requires you to extend the transitive closure to all code in the system.

For a critical appliance where the appliance as a whole needs to guarantee service, this means your transitive closure already encompasses all code in the system, so you are not losing too much.

The advantage of preemptive scheduling is that it allows you to subdivide your applications so that you do not need to analyze everything in the system. You can restrict yourself to only considering direct and intentional interaction. This provides modularity advantages and allows you to provide guarantees to sub-components even in the presence of errors or malicious behavior in other components.

However, if you are doing whole system analysis anyways and the systems are simple enough to be tractable to analyze without decomposition, then a cooperative scheduler is adequate to ensure real time performance.


The original claim was:

> On a preemptive OS, all multitasking with (non-CPU) contended resources is, to a certain extent, cooperative

This just isn't correct.


“with contended resources” is the part you are misinterpreting. They are saying that two applications that contend on a resource must cooperate to operate properly.

As they clearly state later: “Making everything cooperative increases the analysis needed because the CPU becomes a manually managed contended resource.”

i.e. a cooperative scheduler makes all code contend on the CPU, thus requiring global cooperation.

Given that they present that case as establishing a new requirement (the CPU becomes a contended resource), they are clearly stating that in the preemptive-scheduler case the CPU is not a contended resource, and thus no global cooperation is required. Only if tasks contend on a resource do they need cooperation amongst the contending parties.

But again, if you are already doing a whole system analysis anyways, then the benefits are less pronounced.


Veserv has correctly restated what I meant by my original comment (which I still stand by). In my first reply, I used a mutex as an example for a contended resource. A cooperative multitasking system just extends "contended resource" to include the CPU.

Just like you need to make sure you don't hold a critical lock for too long in a preemptive multitasking system, you need to make sure you don't run on the CPU for too long in a cooperative multitasking system.

[replying to GP so as to not fork this thread too much]

> A kernel (because that's where interrupt handlers are located) can ensure that a thread is scheduled within N usecs of when it "ought to be" (which could be based on some sort of time allocation algorithm, or simple priorities, or whatever other scheme may be in use). The kernel can say "oh look, it's been N usecs, let's check who is running and who is ready to run ... hey, time for Thread 2 to run". This is preemptive scheduling.

> No cooperative scheduling system can ensure this.

Sure it can: just poll for preemption every N usecs. You can even statically analyze the assembly code to calculate the maximum number of clock cycles between two points where you poll for preemption (in the event that your microcontroller has caches, you will want to use writethrough caching to help with this analysis).
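
A sketch of such manual yield points (illustrative Rust; the work and polling interval are made up). The long computation is split into bounded chunks, and between chunks the task polls whether it should hand the CPU back. The N-usec bound then follows from the worst-case chunk length, which is what the static analysis establishes.

```rust
/// A long computation split into 100 bounded chunks, polling for
/// preemption between chunks. Returns (result, number of yield points hit).
fn long_computation(should_yield: impl Fn() -> bool) -> (u64, u32) {
    let mut sum: u64 = 0;
    let mut yields = 0;
    for chunk in 0..100u64 {
        // One bounded chunk of work (worst-case cycle count known statically).
        for i in 0..1_000u64 {
            sum = sum.wrapping_add(chunk * 1_000 + i);
        }
        // Poll for preemption: a real system would save state and return
        // to the scheduler here; we just count the yield points taken.
        if should_yield() {
            yields += 1;
        }
    }
    (sum, yields)
}

fn main() {
    // Pretend the scheduler requests a yield on every 10th poll.
    let polls = std::cell::Cell::new(0u32);
    let (sum, yields) = long_computation(|| {
        polls.set(polls.get() + 1);
        polls.get() % 10 == 0
    });
    println!("sum={sum}, yielded {yields} times"); // sum=4999950000, yielded 10 times
}
```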

> Your example involves poorly designed code, which is not the responsibility of the scheduler. Its job is just to make sure that threads run when they "ought to" - it cannot protect against priority inversions in user space and ensure that RT guarantees are met (pick 1, and even then, you lose).

This was the whole point of that example. A realtime scheduler only guarantees that a thread gets the CPU time it is promised. CPU time is one of (potentially many) contended resources in a multitasking system. It is so helpful because all tasks will be contending for CPU time, and there are few other resources for which this is true (memory bandwidth, particularly on SMP systems, immediately comes to mind).

To be clear: I like realtime schedulers. They are great and simplify many things; the analyses of tasks that only contend for CPU time are greatly simplified by them.


> A kernel (because that's where interrupt handlers are located)..

Thought I would pull this one phrase out to demonstrate how we are clearly talking about different systems. For many embedded control systems, I would not say "the interrupt handlers are located in the kernel." On processors that have the distinction, interrupt handlers typically run in Supervisor mode (What the x86 calls "Ring 0"), but it's entirely possible that 100% of the code will run in Supervisor mode. When the kernel provides preemptive scheduling, it may handle a single interrupt (a timer); alternatively the kernel doesn't handle any interrupts, but instead provides an API call that application developer(s) must invoke for entering the scheduler.

Many things that are called an "RTOS" in the embedded world would be called a "threading library" in the desktop world.


What happens when a tasklet takes too long?


I haven't poked into the source code, but usually a QoS event is fired, e.g. "deadline missed". What happens exactly in that case is very application specific.


> What happens when a tasklet takes too long?

The same thing that happens when 1+1 == 3, or when a task tries to write to memory that it doesn't have permissions for. The static analysis that your system relies on for correct behavior is no longer valid, so a hardware belt-and-suspender mechanism (a schedule overrun timer interrupt, a lockstep core check failure, or an MPU fault, respectively) resets or otherwise safe-states the failed ECU and safety is assured higher up in the system analysis.


>> Executor runs tasklets, [...] that execute a processing step in a finite amount of time.

> lack of preemption seems hard to reconcile with real time guarantees?

Depends on the requirements and the bounds of the finite time allowed and the enforcement of it. For example, Erlang claims to be soft real-time; it doesn't have preemption, but any function call can result in yielding, and as a functional language there are no loops without calling functions.


And that works because it is a VM that is executing the instructions. The higher level of the VM can switch threads at will. The measure used is the number of reductions; once a set limit is used up, or something more urgent crops up, a context switch will occur. This is functionally indistinguishable from a hardware interrupt for the task that was executing, but because it is all done in software you don't actually have that interrupt. It's as though every reduction has the potential to interrupt a task, just like you can expect an interrupt anywhere in a regular stream of machine code for some processor.

The nice benefit of the Erlang method is that you can graft this onto a user process without a need for special privilege or exception handling, nothing ever really has to deal with interrupts or the messy aftermath of it, it's much cleaner than an interrupt driven system.
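
A toy sketch of that reduction-counting idea (in Rust, since the thread is about a Rust RTOS; the budget and process structure are made up - the real BEAM gives each process on the order of a few thousand reductions per slice). Each "process" runs until its reduction budget is exhausted, then goes to the back of the run queue:

```rust
use std::collections::VecDeque;

const BUDGET: u32 = 4; // reductions per time slice (illustrative; BEAM uses ~4000)

struct Process {
    id: u32,
    work_left: u32, // reductions this process still needs
}

/// Run all processes to completion, switching whenever a budget is spent.
/// Returns the order in which processes were given the CPU.
fn run(mut ready: VecDeque<Process>) -> Vec<u32> {
    let mut schedule = Vec::new();
    while let Some(mut p) = ready.pop_front() {
        schedule.push(p.id);
        // Run at most BUDGET reductions, then context-switch. The process
        // never sees an interrupt; it simply runs out of budget at a call site.
        let slice = p.work_left.min(BUDGET);
        p.work_left -= slice;
        if p.work_left > 0 {
            ready.push_back(p); // still runnable: back of the run queue
        }
    }
    schedule
}

fn main() {
    let ready = VecDeque::from([
        Process { id: 1, work_left: 10 },
        Process { id: 2, work_left: 3 },
    ]);
    println!("{:?}", run(ready)); // → [1, 2, 1, 1]
}
```

The heavy process (id 1) is repeatedly descheduled at reduction boundaries, so the light process (id 2) gets the CPU promptly - all without any hardware interrupt or special privilege.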



