The CPUs of Spacecraft Computers in Space (cpushack.com)
179 points by muunbo on Dec 18, 2020 | 82 comments



Reminds me of this post comparing the CPU and RAM of the Apollo 11 flight computer with modern USB chargers. (spoiler alert: the USB charger wins)

https://forrestheller.com/Apollo-11-Computer-vs-USB-C-charge...

Hacker News comment link: https://news.ycombinator.com/item?id=22254719


Space flight computers in the public sector are generally 15-20 years behind the types of hardware we commonly work with on the ground, as I think this page shows.

We now have pretty capable low-power SoCs and FPGAs that we've yet to see broadly leveraged for govt. space applications. SpaceX flies Starlink with Xilinx FPGAs, while NASA and DoD are still baselining new platforms on incredibly expensive (albeit rad-hard) PowerPC RAD750 and similar. This is a huge bottleneck for any computationally intensive task we might want to do on-orbit, and I'm curious if or when it will change. It's one technical reason, in my opinion, that the private sector is currently calling the shots in space.


The RAD750 (edit - the whole RAD family, there are newer models available) remains the standard because it's the highest-performance rad-hard design available, period. If you're putting an expensive satellite in orbit for 5-10 years, the cost of the processors is insignificant compared to everything else.

The real problem is that we don't have good solutions for improving the performance of rad-hard designs, so we're stuck with older, larger process sizes that limit what can be implemented. Look at the lengths involved in getting A* to run on Curiosity, and you see just how limiting the hardware is. Everyone, NASA especially, wants more compute available.

In low earth orbits and shorter mission durations, you can get away with redundant hardware instead of rad-hard. Most of the damage done by radiation is upsets, so you can reboot the affected hardware and keep going. But on an unprotected design some of the damage can be permanent, and thus redundancy alone isn't enough for longer/farther missions.


I had to select a processor that controls the camera in the GOES-R ABI. The image processing is all done by custom hardware so all that was needed was microcontroller level performance. It turns out there are very limited options in this space and all of them are quirky outdated architectures with limited available tooling.

The RAD750 in particular is a bit of a nightmare because of the high pin count, need for a support chip, and the 32-bit bus forces the use of more RAM and ROM than a smaller micro would need. I took a pass on that. I also never have liked IBM's reverse bit numbering and the implications it has on SRAM power consumption.


How does the PowerPC’s reverse bit numbering impact power consumption?


If you wire the bits as numbered to a conventional memory device designed with LSB as bit 0, sequential accesses will induce more switching on the internal address paths than normal. The internal row and column decoders will be working overtime, consuming more power than necessary. Reversing the bus to deal with that isn't always straightforward on a space-constrained board.
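A toy way to see it (my own sketch, not a model of any particular board or memory part; the bus width and row/column split below are made up): scan sequentially and count how often the bits feeding the row decoder change. With straight wiring only an occasional carry reaches the row bits; with the bus hooked up bit-reversed, the fastest-toggling bit lands on the row side and the row decoder fires on every access.

    # Toy model of a memory organized as rows and columns.
    # Upper address bits select a row; lower bits drive the column mux
    # within the currently open row. Count row activations during a
    # sequential scan for straight vs bit-reversed wiring.

    ADDR_BITS = 16          # illustrative address bus width
    COL_BITS = 6            # low bits go to the column mux

    def reverse_bits(value, width):
        """Mirror the bit order of value across a width-bit bus."""
        out = 0
        for i in range(width):
            if value & (1 << i):
                out |= 1 << (width - 1 - i)
        return out

    def row_activations(wiring):
        """How many times the row address changes during a full sequential scan."""
        activations = 0
        open_row = None
        for addr in range(1 << ADDR_BITS):
            physical = wiring(addr)        # address as seen at the memory pins
            row = physical >> COL_BITS     # upper bits pick the row
            if row != open_row:
                activations += 1           # row decoder fires: extra power (and time)
                open_row = row
        return activations

    print("straight wiring :", row_activations(lambda a: a))
    print("bit-reversed    :", row_activations(lambda a: reverse_bits(a, ADDR_BITS)))

Straight wiring opens a new row once per 64 accesses here; reversed wiring opens one on every single access, which is the decoders-working-overtime effect.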


If I understand correctly, you are saying that with the reversed bit numbering, bit 31 (in a 32-bit address bus) changes most frequently with sequential accesses because it is the LSB but when wired to the MSB of SRAM, it causes switching in the column decoder for every single access.

That makes sense, but I didn’t realize that it was difficult to simply swap the wiring. Are the physical pins ordered backwards as well (that is, do PowerPC’s A31 and A30 appear where A0 and A1, respectively, would be on a “normal” system)?


I have absolutely no idea why someone would connect the upper bits of the CPU address bus to the lower bits of the memory, if this is what the GP refers to. Their naming scheme seems irrelevant.

Almost all modern memory is built in a large matrix where the upper bits select the row into a buffer and the lower bits control a multiplexer that selects a slice of that row. Scanning incrementally through the memory will hit the fast multiplexer path and result in much faster access.

Propagating into the whole matrix at each increment is not only a power draw but a massive slowdown.
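Rough sketch of how big that slowdown can be, with invented timings (the real numbers depend entirely on the memory part; this only shows the ratio you give up by missing the fast multiplexer path):

    # Invented timings: sequential scan of a row/column organized memory,
    # hitting the fast column mux vs re-opening a row on every access.
    COL_BITS = 6                  # 64 columns per row
    ROWS = 1024
    T_COLUMN_NS = 5               # access within an already-open row
    T_ROW_OPEN_NS = 50            # extra cost of decoding/opening a new row

    accesses = ROWS * (1 << COL_BITS)

    # Well-ordered scan: one row open per 64 sequential accesses.
    t_fast = ROWS * T_ROW_OPEN_NS + accesses * T_COLUMN_NS

    # Scan that lands in a different row on every access.
    t_slow = accesses * (T_ROW_OPEN_NS + T_COLUMN_NS)

    print(f"fast path scan : {t_fast / 1e6:.2f} ms")
    print(f"row-miss scan  : {t_slow / 1e6:.2f} ms")
    print(f"slowdown       : {t_slow / t_fast:.1f}x")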


Since you have recently been through this, what were the other available options? There were a couple other rad hard processors under development years ago when I left the industry. Do you know what happened to those?


> The RAD750 remains the standard because it's the highest performance rad-hard design available, period.

RAD5500?


You're right, the RAD5500 and family are available. I should have said the whole BAE RAD family.

The reality hasn't changed much, though, there's really only one game in town for high rad-hard performance, and it's still well behind conventional processors.


> Look at the lengths involved in getting A* to run on Curiosity

Can you link to something that goes into detail? Googling it doesn't turn up anything relevant, but it sounds like it'd be interesting to read about.


Could they not offload a lot of compute to ground based computers and submit results back via radio? Or are these real-time applications?


The whole point of implementing A* on Curiosity was to give it some navigation autonomy. The time delay in getting sensor data back to earth, coming up with a motion plan, then sending the plan back to be executed imposes tight limits on how fast the rover can drive, what kinds of terrain it can cover, and ultimately how much science can be done. Local autonomy for basic "go over to that weird-looking rock" tasks is a major improvement.
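For anyone who hasn't run into it, A* itself is tiny; a minimal grid version in Python (nothing like the flight implementation) looks roughly like this:

    import heapq

    def astar(grid, start, goal):
        """Minimal A* on a 2D grid; grid[r][c] == 1 means an impassable rock."""
        rows, cols = len(grid), len(grid[0])

        def h(a, b):                                  # Manhattan-distance heuristic
            return abs(a[0] - b[0]) + abs(a[1] - b[1])

        frontier = [(h(start, goal), 0, start, [start])]   # (f, g, cell, path)
        best_g = {start: 0}

        while frontier:
            _, g, cell, path = heapq.heappop(frontier)
            if cell == goal:
                return path                           # cells from start to goal
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nxt = (cell[0] + dr, cell[1] + dc)
                if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                    continue                          # off the map
                if grid[nxt[0]][nxt[1]]:
                    continue                          # blocked by an obstacle
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(frontier, (ng + h(nxt, goal), ng, nxt, path + [nxt]))
        return None                                   # no route

    # Tiny terrain map: 1 = rock to drive around, 0 = clear ground.
    terrain = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0],
        [1, 0, 0, 0],
    ]
    print(astar(terrain, (0, 0), (3, 3)))

The hard part on the rover is everything around this: building the traversability cost map from stereo imagery and doing it all within the RAD750's memory and CPU budget, which is where the "lengths involved" come in.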


You could A* your way around the whole planet by using an Earth based computation but only if you knew where every rock was.

There must be some equation of motion in space robots that combines terrain difficulty, robot speed, round trip time to Earth, and how far ahead you’d need to be able to see.

Curiosity moves about as fast as a Roomba. The round-trip ping is roughly 10/24/40 minutes (min/avg/max). Ergo, it needs to be able to see X yards ahead of itself to plan A* from Earth, requiring a camera boom Y feet tall producing images with Z megapixels of resolution.

I wonder what X, Y and Z are.


Some very rough numbers, if you want plans valid until you see the results and travel at a good speed:

- You want a travel speed of 0.5 m/s

- Worst-case round trip time from command to result is 40 minutes, or 2400 seconds

- Max distance covered is 1200 meters

- Assume you can drive over any rock 10cm or less in size (the rover can do better, but at a reduced speed)

So you need to scan an area 1200m x 1200m, with range accuracy of, say, +/-5cm. Forget the camera boom height, or the time it might take to scan and process; there's no sensor that will give you that kind of accuracy. The stereo baseline would be huge, and you'd be entirely at the mercy of whatever texture (or lack of it) a given part of the Martian surface has to offer. LIDAR is OK if your definition of "long range sensing" is larger objects at 200m. The time cost for shipping back all the raw data for processing would kill performance as well, and if you wanted to process it locally, the compute requirements would be just as high as doing the planning locally anyway.
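To put rough numbers on how demanding that is (same assumptions as above, plus a made-up 10 cm obstacle size for the angular-resolution bit):

    import math

    # Back-of-the-envelope check of the numbers above.
    speed_m_s = 0.5
    round_trip_s = 40 * 60                        # 2400 s worst case
    drive_per_plan_m = speed_m_s * round_trip_s   # ground covered before new commands arrive

    print(f"distance per planning cycle : {drive_per_plan_m:.0f} m")             # ~1200 m
    print(f"area to pre-scan            : {(drive_per_plan_m ** 2) / 1e6:.2f} km^2")

    # Angle a 10 cm rock subtends at the far edge of that plan (small-angle approx).
    obstacle_m = 0.10
    angle_rad = obstacle_m / drive_per_plan_m
    print(f"10 cm rock at {drive_per_plan_m:.0f} m : {math.degrees(angle_rad) * 3600:.0f} arcseconds")

A 10 cm rock at 1200 m subtends roughly 17 arcseconds, before you even ask for +/-5cm range accuracy on it.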

Onboard, or at least much closer, compute is the only way forward in autonomy. Honestly, the best bet on improving local compute would be to send a robot bulldozer, some C4, a rack full of milspec servers, and a big RTG. Blow a nice crater, push the servers to the bottom, and bury them in dirt for shielding. If you get really lucky, you could find some old cave or lava tube.


That's more or less what they used to do with Sojourner, only with humans setting out the waypoints rather than A*. It never managed to get more than 10 meters from its lander, though; I assume this was partly driven by the limits of a human's patience in operating a slow vehicle with a 28-minute feedback loop.

One drawback of the remote A* approach is that you end up using more energy as the rover would have to be constantly communicating with its onboard antenna. Its relay satellites are only in range for a limited period each day. Fine grained maneuvers (like drive around that big rock to get to this small rock) would also prove difficult because of likely errors in the rover's inertial-navigation system.

https://en.wikipedia.org/wiki/Sojourner_(rover)


Does curiosity really move that fast? Roomba is maybe 2 mph. I thought curiosity was closer to 0.1 mph


His or her estimate seems wildly off. I had the same response because my Roomba moves fast!

Some quick googling says:

Curiosity's max speed is 0.08699 mph.

A Roomba does about a foot per second, which is 0.682 mph.

So the estimate was off by about 7.8x.
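For anyone who wants to check the arithmetic (same googled figures as above):

    # Unit check on the figures above.
    curiosity_mph = 0.08699            # reported top speed (~3.9 cm/s)
    roomba_ft_s = 1.0                  # roughly a foot per second
    roomba_mph = roomba_ft_s * 3600 / 5280

    print(f"Roomba    : {roomba_mph:.3f} mph")
    print(f"Curiosity : {curiosity_mph:.3f} mph")
    print(f"ratio     : {roomba_mph / curiosity_mph:.1f}x")   # about 7.8x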


Thanks: my estimate of Roomba slowness was way off. I’ve only ever seen them on TV.

I remember once hearing that Curiosity, flat out, could do 1km a day.


Random but I highly recommend the new "mapping" type roombas. Best thing I bought all year.


With what kind of processing power did the recent Chang'e 5 probe perform its autonomous docking manoeuvre in lunar orbit?

Also, it seems to be doing some type of image processing to identify a suitable landing spot and guide itself to that point.


I was thinking, instead of that, what if you had a separate, isolated tiny computer on the spacecraft that was powered by its own solar panels (so there's no electrical wiring or other connection to it) and had its own radio? This separate computer could use the latest bleeding-edge CPU, and be encased in a radiation-hardened shell. It would use its radio to talk to the slower main computer, do math really fast locally, and if need be, beam the results to Earth, or to a nearby orbiting satellite.


Unfortunately there isn't really any practical way to have a radiation-hardened shell that is sufficiently effective. E.g. 5 cm of aluminium stops only 30% of the galactic radiation. (Heavier shielding elements (e.g. gold) get struck by incoming particles and throw off secondary heavy ions, which cause even more damage.)

So practically it would still experience significant radiation.
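Back-of-the-envelope on why "just add more shell" doesn't scale, naively treating attenuation as exponential and extrapolating from the 5 cm / 30% figure above (if anything this is optimistic, since thick high-Z shields generate secondaries):

    import math

    # Naive exponential-attenuation extrapolation from "5 cm of Al stops ~30%".
    t0_cm = 5.0
    surviving = 0.70
    mu = -math.log(surviving) / t0_cm            # effective attenuation coefficient

    for target_stop in (0.50, 0.90, 0.99):
        thickness = -math.log(1 - target_stop) / mu
        # Aluminium shell (2.7 g/cm^3) around a 30x30x30 cm electronics box.
        side = 30 + 2 * thickness
        mass_kg = 2.7 * (side ** 3 - 30 ** 3) / 1000
        print(f"stop {target_stop:.0%}: ~{thickness:5.1f} cm of Al, ~{mass_kg:7.0f} kg around a 30 cm box")

You're into tonnes of aluminium long before the CPU inside sees anything like a benign environment.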

But having the main compute for Mars remain in orbit with the relay isn't a bad idea.


The Shannon limit implies a linear relation between bit rate and transmit power. The only way to get fast enough transfer would be to spend tons of the power budget on the radio.
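Sketch of what that linearity looks like, with made-up link numbers (deep-space links live in the power-limited regime, where capacity is nearly proportional to received power):

    import math

    B = 10e6        # assumed bandwidth, Hz
    N0 = 4e-21      # noise spectral density, W/Hz (~kT at 290 K)

    def capacity_bps(p_watts):
        # Shannon capacity: C = B * log2(1 + P / (N0 * B))
        return B * math.log2(1 + p_watts / (N0 * B))

    for p in (1e-15, 2e-15, 4e-15, 8e-15):   # received powers well below the noise floor
        print(f"P = {p:.0e} W  ->  C = {capacity_bps(p) / 1e3:8.1f} kbit/s")

Each doubling of power buys almost exactly a doubling of bit rate, so "send everything back for processing" means a proportionally bigger radio and power budget.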


For Mars, at least, that would be tens of minutes round trip because of the speed of light.

It works for some things, but for pathfinding it isn’t a great fit.

The other issue is bandwidth between the craft and Earth, which is quite limited.

Maybe there would be benefits to an “orbiting datacenter” around Mars carrying a bunch of rad-hardened compute? I assume NASA has considered this and decided it would be a bad idea.


SpaceX doesn't have the same requirements. The radiation environment near Mercury or halfway to Jupiter is drastically different from LEO.

SpaceX missions are also a lot shorter. Having one unrecoverable latchup a week isn't a big deal if your mission is 2 weeks long. If your mission is 10 years, it starts to become a problem (especially since some radiation damage is cumulative).

>NASA and DoD are still baselining new platforms on incredibly expensive (albeit rad-hard) PowerPC RAD750 and similar

NASA and DoD have also been sending up Xilinx and Altera boards for ages (even the non-space-grade ones). However, you can get rad-hard ARM CPUs that are cheaper and more powerful than the ones in a Zynq board.


It would be interesting if someone put a Raspberry Pi inside and outside the space station, in a completely unprotected, exposed environment, and ran some continuous tests: how long until we start to see failures, and what kind of failures would they be?


This has been done multiple times. Amateur radio satellites and some cubesat kits [1] use primarily COTS components.

The lifetime and radiation environment for those applications are very limited, though. It seems that for short missions (e.g. <2 years) and low orbits (<500km), COTS hardware should be fine if properly shielded.

It would be interesting to see what difference it actually makes for HEO or even BEO missions, especially if a high degree of redundancy is introduced as well.

[1] http://www.cubesatkit.com


Typically those sorts of tests can be done on Earth if you have access to a cyclotron. My guess is that the SD card would be the weak link.


"Having one unrecoverable latchup a week isn't a big deal if your mission is 2 weeks long.Having one unrecoverable latchup a week isn't a big deal if your mission is 2 weeks long."

Unless it happens in your attitude control system or your command and control system, causing you to lose control of or communication with your spacecraft.


I guess I didn't actually say so, but my implicit assumption was that you have a voting setup where a single failure isn't necessarily a problem


I think the assumption is that a redundancy scheme is in place. So you have your unrecoverable issue in some module of compute A, but compute B and C vote them down and life proceeds. The problem is when your mission is long enough that the same module gets hit in one of the other two units, and now you're in trouble.


> Starlink with Xilinx FPGAs, while NASA and DoD are still baselining new platforms on incredibly expensive (albeit rad-hard) PowerPC RAD750 and similar.

Ignoring that Starlink isn't very far away, I would assume NASA stuff would also have FPGAs and ASICs on them - they aren't CPUs and aren't used like them.


It’s pretty common in space to implement a soft core CPU (or redundant ones) on a space-grade FPGA.


Some NASA orgs have tried using FPGAs as a way to get around software requirements, to varying levels of success


Interesting, although I was thinking more about FPGAs in things like acquisition and processing rather than overall logic, as the parent comment seemed to imply.


The high level NASA requirements cast a pretty wide net (to include data acquisition and processing) as to what falls under the purview of those requirements. From 7150.2:

“A.30 Software. Computer programs, procedures, scripts, rules, and associated documentation and data pertaining to the development and operation of a computer system. Software includes programs and data. This also includes COTS, GOTS, MOTS, reused software, auto generated code, embedded software, firmware, and open source software components.”

https://nodis3.gsfc.nasa.gov/displayCA.cfm?Internal_ID=N_PR_...


My understanding is that certification is the bottleneck, in both time and cost. No one wants to spend the money or time to flight-certify something new when something already battle-tested will suffice.

But your comment makes me wonder if the private sector doesn't have those certification requirements?

The other differentiating factor is that the private sector is not sending multi-year (indeed multi-decade) deep space missions, where the need for battle tested systems is paramount.


Flagship multi-year science missions are generally conservative with technology choices, but some NASA projects are intended as technology demonstrations and can take on more risks.

So, for example, the Perseverance rover on its way to Mars is powered by redundant RAD750s (same as Curiosity), but the Ingenuity helicopter along for the ride is powered by a Snapdragon 801.

It will be interesting to see how it holds up.


How do you battle-test a RAD prototype? Stick it in a microwave-like device with ionizing radiation and see how many bit-flips occur?


It depends on where the spacecraft is going as radiation environments differ. I've taken parts to be exposed by a proton line at a particle accelerator. For some environments they just use Cobalt-60 as a radiation source.


This reminds me of a relevant anecdote: back in the aughts, I was doing research in cosmic rays at a large nuclear research facility. I did simulation and data analysis - office/computer work mostly. One day, a person with a clipboard comes into my office and asks about the whereabouts of some radiation source. I look at them confused - I had not touched sources since teaching nuclear physics labs. They show me their clipboard and lo and behold, it has my name next to a really high-intensity source, and they're looking to locate it.

After a few minutes of awkward shock and denying all involvement, we realized it was a colleague at the same research institute (and of the same name). He was, one building down, doing his PhD research on radiation hardened detectors for the CMS experiment at CERN(1). He was using the source for the testing. But I had a minute of real stress before that came together in my mind...

(1) I think this was his work: https://onlinelibrary.wiley.com/doi/abs/10.1002/pssa.2007763...


Pretty much!


To answer - no we don't have the same certification requirements. NASA steps in when there's human lives and/or a lot of money on the line, but most smaller projects and just about every independent project is free to assume its own level of risk.


You know (and probably are implying) this but it’s completely program/project specific. Some projects out of Armstrong, for example, must meet FAA certification requirements


NASA has used Xilinx FPGAs on a number of missions (though still mostly smaller missions). They are doing so for precisely this reason: on-spacecraft computation for intensive tasks such as image processing.

Here’s the website for the SpaceCube platform (developed at NASA Goddard). This is a little out of date (I worked on flight software for a mission called STP-H6 which I don’t see listed here), but gives an idea of how this idea is slowly but surely gaining steam in NASA.

https://spacecube.nasa.gov/


The private sector has decided to put regular ground chips in spacecraft and just deal with errors using triple redundancy. Low earth orbit where most satellites hang out doesn't have much radiation anyways.

The cost savings from using regular chips are so high that I bet SpaceX will continue to use them even in deep space. Just surround them with shielding. When a $400 desktop CPU is 500x faster than a $40,000 space-rated one, a couple pounds of shielding is well worth it.


The kind of radiation you want to protect against is not "easily" shielded.

The effectiveness of shielding is proportional to its mass and thickness, and both are at a premium for spacecraft.


There’s more to it than just specs. Consumer-grade silicon will not survive in space; radiation will just kill it.


On some of the silicon details: https://habr.com/en/post/518366/



And that's expected to be succeeded by NOEL (a LEON anagram), a radiation-hardened RISC-V CPU focused on high reliability.


It is not related to OpenSPARC. LEON is a SPARCv8 design developed initially by ESA and later by the Swedish company Gaisler Research.


How do these circuits deal with random bit flipping from cosmic rays?


There are three basic ways this is done:

1: By process, where chips are created with special or larger features to better resist cosmic rays. This is Expensive since they're made in very low volumes and the cost of the new fab line can't be spread among many millions of units. Instead, a few thousand chips might be made.

2: By design, where redundant systems such as triple-redundant memory or voting computers are used. This is probably the most interesting, as you can get into issues like the Byzantine Generals problem here. All the redundancy can be implemented in a single FPGA by simply routing out the design 3 times and using voting logic, assuming the FPGA is large enough (a minimal voting sketch follows below).

3: By shielding. Just fly regular chips in a shielded box. This causes thermal issues, but is sometimes necessary, such as in Juno, which has to deal with the enormous radiation flux around Jupiter.
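A minimal voting sketch for option 2, in Python for readability (real designs vote in hardware, e.g. three copies of the logic in one FPGA feeding a majority gate):

    from collections import Counter

    def vote(a, b, c):
        """Majority value of three redundant results, plus a flag saying
        whether one copy disagreed (so it can be scrubbed or rebooted).
        If all three disagree there is no majority: the Byzantine case."""
        value, hits = Counter([a, b, c]).most_common(1)[0]
        return value, hits < 3

    # Example: copy B takes a single-event upset that flips one bit.
    result_a = 0b1010_1100
    result_b = 0b1010_1110
    result_c = 0b1010_1100

    value, upset = vote(result_a, result_b, result_c)
    print(f"voted value: {value:#010b}, upset detected: {upset}")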


As sending mass to space becomes cheaper, I wonder if shielding will become more popular... Maybe there is some oil-like material that could serve both as a cooling bath and as a radiation shield?


Polyethylene is usually used to shield against cosmic rays. Nearly identical shielding performance as water. Slightly less weight per volume. Can be formed easily and retains its shape.


As mentioned in another comment, water is a radiation shield, but ionizing radiation in space will attenuate 50% after 7cm of water (I could be wrong), and if you want a lot of attenuation, you need a lot of water (which is extremely heavy).

Even a small amount of water used for shielding is undoubtedly much more massive than simply using bigger, rad-hard processors.


I don't see any comments about water on this post?

If we're talking about $50/kg [1] in the future... well...

- Sending a few kg of water (or other shielding) to space costs a fraction of the price of a fast non-rad-hardened CPU.

- It really costs less than the extra development cost associated with having to use bespoke toolchains.

[1] Number from the other space post on the front page today: https://getmeflyingcars.medium.com/how-much-does-it-cost-to-...


> I don't see any comments about water on this post?

Looks like it was deleted.

> - Sending a few kg of water (or other shielding) to space costs a fraction of the price of a fast non-rad-hardened CPU.

It doesn’t sound like a few kg is going to cut it. Recall that 1 kg of water is about 1 L, which (as a solid sphere) gives you about 6 cm of shielding, which is simply not enough. That’s less than 50% attenuation of the ionizing radiation you find in space. Mass increases with the cube of the thickness: if you want 20 cm of shielding, that’s 33 kg, and something like 86% attenuation.
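Quick sanity check of those numbers, using the same solid-sphere-of-water and halve-every-7-cm assumptions (a real shield would be a hollow shell around the electronics, which only makes the mass worse for a given thickness):

    import math

    HALVING_CM = 7.0    # assumed: intensity halves every ~7 cm of water

    def sphere_mass_kg(radius_cm):
        return (4 / 3) * math.pi * radius_cm ** 3 / 1000   # water: 1 g/cm^3

    def fraction_stopped(thickness_cm):
        return 1 - 0.5 ** (thickness_cm / HALVING_CM)

    for t_cm in (6, 20, 50):
        print(f"{t_cm:3d} cm: ~{sphere_mass_kg(t_cm):7.1f} kg, stops ~{fraction_stopped(t_cm):.0%}")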

It seems to me like there are better things you can do with your mass, which also needs to be spent on things like fuel for stationkeeping. Lower mass also means more satellites.

> - It really costs less than the extra development cost associated with having to use bespoke toolchains.

POWER is not exactly some obscure ISA. It is well-supported and battle-tested. You don’t need a bespoke toolchain.


The most bespoke part would be toolchain certification, which, among other things, covers "if I have source A, then I can be sure the resulting machine code does B, not anything else".

And that's something you'll get even if you're running MS-DOS on 286 in space ;-)


The packaging can be different or redundancy can be added to the design: https://en.wikipedia.org/wiki/Radiation_hardening#Radiation-...


Continuous voting on hard real-time systems


So my watch (or maybe headphones even) is several times faster than anything that's ever run a spacecraft?

(I understand why, but that this is so just blows me away)


Though true, it is important to remember that Computers Are Fast. https://computers-are-fast.github.io


Especially when they don't have to deal with 37 layers of Windows/MacOS abstractions and backward-compatibility concessions.


What about Chinese/Indian/Japanese spacecraft?


China's Chang'e 4 lander is known to use ATMEL AT697F. It is SPARC-compatible.

Source: The scientific objectives and payloads of Chang'e 4 mission. DOI 10.1016/j.pss.2018.02.011. No free source, use Sci-Hub.


"A CPU for use in space must first be MIL-STD-883"

Is this just for NASA craft? Are there any regulations for private craft or international standards?


Depends, but mostly no. No regulations.


A non-trivial number of satellites have also used Transputers (and there's a legacy of it in the SpaceWire network).


There have been non-rad-hard CPUs working in space. https://www.hpe.com/us/en/insights/articles/the-space-statio...


Low Earth Orbit (LEO), which the ISS is in, is very different from deep space. You can get away with much less radiation hardening in LEO. For example, I know of a company flying consumer-grade Xilinx MPSoCs with 4x A53 cores at 1.5 GHz.


They literally ship off-the-shelf laptops and smartphones to the ISS.

They do modify them slightly to remove the lithium-ion batteries, which you do not want inside a spacecraft.

I think they replace them with NiMH cells if the devices still need to be battery powered.


They no longer do this, and are now happy with just testing Li-ion products. There are iPads all over the station (and used by the astronauts on the SpaceX Dragon).


Do we actually have much data on the failure rates of different kinds of chips exposed to radiation in space? If so... how? What conditions have to hold so that when a chip is damaged by radiation, you get enough information about it to know which chip had what kind of problem?


Testing is done on earth in lab conditions, and some testing has been done in space. We've had good models of radiation energy and type in space for a while, so you can reproduce type and intensity of radiation on earth to test how the chip will behave. You can also fly chips for testing, where instrumentation and testing are controlled by known working hardware.

Diagnosing novel failures on hardware in space is hard, but the overall types of failures and their underlying physical phenomena are known. Obviously, in the case of a satellite that goes completely unresponsive you can't answer 100%, but you may have sensor data in the lead-up to the failure, or from other "nearby" satellites, that would allow you to partially reproduce conditions before the failure in a test scenario.


When I see posts like this, I always think about how insane it will be in 2520 if we keep the same innovation growth (and not die to whatever)


Pretty interesting. Wonder what SpaceX uses for their CPUs?


[deleted]


> Saying there are physical backup controls is a bit like saying there are backup controls at the bottom on your new TV.

Can you expand on this? Are you against physical backup controls?


What they were trying to say is that physical backup controls are often incomplete and inconvenient, just like how you can control a TV using the buttons on it but doing so is far less effective than using the remote. The idea is that making physical controls the primary interface (rather than the backup) for critical tasks means that more effort would go into making them ergonomic and effective.



