The CPUs of Spacecraft Computers in Space (cpushack.com)
179 points by muunbo on Dec 18, 2020 | 82 comments



Reminds me of this post comparing the CPU and RAM of the Apollo 11 flight computer with modern USB chargers. (spoiler alert: the USB charger wins)

https://forrestheller.com/Apollo-11-Computer-vs-USB-C-charge...

Hacker News comment link: https://news.ycombinator.com/item?id=22254719


Space flight computers in the public sector are generally 15-20 years behind the types of hardware we commonly work with on the ground, as I think this page shows.

We now have pretty capable low-power SoCs and FPGAs that we've yet to see broadly leveraged for govt. space applications. SpaceX flies Starlink with Xilinx FPGAs, while NASA and DoD are still baselining new platforms on incredibly expensive (albeit rad-hard) PowerPC RAD750 and similar. This is a huge bottleneck for any computationally intensive task we might want to do on-orbit, and I'm curious if or when it will change. It's one technical reason, in my opinion, that the private sector is currently calling the shots in space.


The RAD750 (edit - the whole RAD family, there are newer models available) remains the standard because it's the highest-performance rad-hard design available, period. If you're putting an expensive satellite in orbit for 5-10 years, the cost of the processors is insignificant compared to everything else.

The real problem is that we don't have good solutions for improving the performance of rad-hard designs, so we're stuck with older, larger process sizes that limit what can be implemented. Look at the lengths involved in getting A* to run on Curiosity, and you see just how limiting the hardware is. Everyone, NASA especially, wants more compute available.

In low earth orbits and shorter mission durations, you can get away with redundant hardware instead of rad-hard. Most of the damage done by radiation is upsets, so you can reboot the affected hardware and keep going. But on an unprotected design some of the damage can be permanent, and thus redundancy alone isn't enough for longer/farther missions.


I had to select a processor that controls the camera in the GOES-R ABI. The image processing is all done by custom hardware so all that was needed was microcontroller level performance. It turns out there are very limited options in this space and all of them are quirky outdated architectures with limited available tooling.

The RAD750 in particular is a bit of a nightmare because of the high pin count, need for a support chip, and the 32-bit bus forces the use of more RAM and ROM than a smaller micro would need. I took a pass on that. I also never have liked IBM's reverse bit numbering and the implications it has on SRAM power consumption.


How does the PowerPC’s reverse bit numbering impact power consumption?


If you wire the bits as numbered to a conventional memory device designed with LSB as bit 0, sequential accesses will induce more switching on the internal address paths than normal. The internal row and column decoders will be working overtime, consuming more power than necessary. Reversing the bus to deal with that isn't always straightforward on a space-constrained board.
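A toy way to see it (my own sketch, not a model of any particular board or memory part; the bus width and row/column split below are made up): scan sequentially and count how often the bits feeding the row decoder change. With straight wiring only an occasional carry reaches the row bits; with the bus hooked up bit-reversed, the fastest-toggling bit lands on the row side and the row decoder fires on every access.

    # Toy model of a memory organized as rows and columns.
    # Upper address bits select a row; lower bits drive the column mux
    # within the currently open row. Count row activations during a
    # sequential scan for straight vs bit-reversed wiring.

    ADDR_BITS = 16          # illustrative address bus width
    COL_BITS = 6            # low bits go to the column mux

    def reverse_bits(value, width):
        """Mirror the bit order of value across a width-bit bus."""
        out = 0
        for i in range(width):
            if value & (1 << i):
                out |= 1 << (width - 1 - i)
        return out

    def row_activations(wiring):
        """How many times the row address changes during a full sequential scan."""
        activations = 0
        open_row = None
        for addr in range(1 << ADDR_BITS):
            physical = wiring(addr)        # address as seen at the memory pins
            row = physical >> COL_BITS     # upper bits pick the row
            if row != open_row:
                activations += 1           # row decoder fires: extra power (and time)
                open_row = row
        return activations

    print("straight wiring :", row_activations(lambda a: a))
    print("bit-reversed    :", row_activations(lambda a: reverse_bits(a, ADDR_BITS)))

Straight wiring opens a new row once per 64 accesses here; reversed wiring opens one on every single access, which is the decoders-working-overtime effect.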


If I understand correctly, you are saying that with the reversed bit numbering, bit 31 (in a 32-bit address bus) changes most frequently with sequential accesses because it is the LSB but when wired to the MSB of SRAM, it causes switching in the column decoder for every single access.

That makes sense, but I didn’t realize that it was difficult to simply swap the wiring. Are the physical pins ordered backwards as well (that is, do PowerPC’s A31 and A30 appear where A0 and A1, respectively, would be on a “normal” system)?


I have absolutely no idea why someone would connect the upper bits of the CPU address bus to the lower bits of the memory, if this is what the GP refers to. Their naming scheme seems irrelevant.

Almost all modern memory is built in a large matrix where the upper bits select the row into a buffer and the lower bits control a multiplexer that selects a slice of that row. Scanning incrementally through the memory will hit the fast multiplexer path and result in much faster access.

Propagating into the whole matrix at each increment is not only a power draw but a massive slowdown.
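Rough sketch of how big that slowdown can be, with invented timings (the real numbers depend entirely on the memory part; this only shows the ratio you give up by missing the fast multiplexer path):

    # Invented timings: sequential scan of a row/column organized memory,
    # hitting the fast column mux vs re-opening a row on every access.
    COL_BITS = 6                  # 64 columns per row
    ROWS = 1024
    T_COLUMN_NS = 5               # access within an already-open row
    T_ROW_OPEN_NS = 50            # extra cost of decoding/opening a new row

    accesses = ROWS * (1 << COL_BITS)

    # Well-ordered scan: one row open per 64 sequential accesses.
    t_fast = ROWS * T_ROW_OPEN_NS + accesses * T_COLUMN_NS

    # Scan that lands in a different row on every access.
    t_slow = accesses * (T_ROW_OPEN_NS + T_COLUMN_NS)

    print(f"fast path scan : {t_fast / 1e6:.2f} ms")
    print(f"row-miss scan  : {t_slow / 1e6:.2f} ms")
    print(f"slowdown       : {t_slow / t_fast:.1f}x")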


Since you have recently been through this, what were the other available options? There were a couple other rad hard processors under development years ago when I left the industry. Do you know what happened to those?


> The RAD750 remains the standard because it's the highest performance rad-hard design available, period.

RAD5500?


You're right, the RAD5500 and family are available. I should have said the whole BAE RAD family.

The reality hasn't changed much, though, there's really only one game in town for high rad-hard performance, and it's still well behind conventional processors.


> Look at the lengths involved in getting A* to run on Curiosity

Can you link to something that goes into detail? Googling it doesn't turn up anything relevant, but it sounds like it'd be interesting to read about.


Could they not offload a lot of compute to ground based computers and submit results back via radio? Or are these real-time applications?


The whole point of implementing A* on Curiosity was to give it some navigation autonomy. The time delay in getting sensor data back to earth, coming up with a motion plan, then sending the plan back to be executed imposes tight limits on how fast the rover can drive, what kinds of terrain it can cover, and ultimately how much science can be done. Local autonomy for basic "go over to that weird-looking rock" tasks is a major improvement.
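For anyone who hasn't run into it, A* itself is tiny; a minimal grid version in Python (nothing like the flight implementation) looks roughly like this:

    import heapq

    def astar(grid, start, goal):
        """Minimal A* on a 2D grid; grid[r][c] == 1 means an impassable rock."""
        rows, cols = len(grid), len(grid[0])

        def h(a, b):                                  # Manhattan-distance heuristic
            return abs(a[0] - b[0]) + abs(a[1] - b[1])

        frontier = [(h(start, goal), 0, start, [start])]   # (f, g, cell, path)
        best_g = {start: 0}

        while frontier:
            _, g, cell, path = heapq.heappop(frontier)
            if cell == goal:
                return path                           # cells from start to goal
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nxt = (cell[0] + dr, cell[1] + dc)
                if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                    continue                          # off the map
                if grid[nxt[0]][nxt[1]]:
                    continue                          # blocked by an obstacle
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(frontier, (ng + h(nxt, goal), ng, nxt, path + [nxt]))
        return None                                   # no route

    # Tiny terrain map: 1 = rock to drive around, 0 = clear ground.
    terrain = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0],
        [1, 0, 0, 0],
    ]
    print(astar(terrain, (0, 0), (3, 3)))

The hard part on the rover is everything around this: building the traversability cost map from stereo imagery and doing it all within the RAD750's memory and CPU budget, which is where the "lengths involved" come in.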


You could A* your way around the whole planet by using an Earth based computation but only if you knew where every rock was.

There must be some equation of motion in space robots that combines terrain difficulty, robot speed, round trip time to Earth, and how far ahead you’d need to be able to see.

Curiosity moves about as fast as a Roomba. The round-trip ping is roughly 10/24/40 minutes (min/avg/max). Ergo, it needs to be able to see X yards ahead of itself to plan A* from Earth, requiring a camera boom Y feet tall producing images with Z megapixels of resolution.

I wonder what X, Y and Z are.


Some very rough numbers, if you want plans valid until you see the results and travel at a good speed:

- You want a travel speed of 0.5 m/s

- Worst-case round trip time from command to result is 40 minutes, or 2400 seconds

- Max distance covered is 1200 meters

- Assume you can drive over any rock 10cm or less in size (the rover can do better, but at a reduced speed)

So you need to scan an area 1200m x 1200m, with range accuracy of, say, +/-5cm. Forget the camera boom height, or the time it might take to scan and process; there's no sensor that will give you that kind of accuracy. The stereo baseline would be huge, and you'd be entirely at the mercy of whatever texture (or lack of it) a given part of the Martian surface has to offer. LIDAR is OK if your definition of "long range sensing" is larger objects at 200m. The time cost for shipping back all the raw data for processing would kill performance as well, and if you wanted to process it locally, the compute requirements would be just as high as doing the planning locally anyway.
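To put rough numbers on how demanding that is (same assumptions as above, plus a made-up 10 cm obstacle size for the angular-resolution bit):

    import math

    # Back-of-the-envelope check of the numbers above.
    speed_m_s = 0.5
    round_trip_s = 40 * 60                        # 2400 s worst case
    drive_per_plan_m = speed_m_s * round_trip_s   # ground covered before new commands arrive

    print(f"distance per planning cycle : {drive_per_plan_m:.0f} m")             # ~1200 m
    print(f"area to pre-scan            : {(drive_per_plan_m ** 2) / 1e6:.2f} km^2")

    # Angle a 10 cm rock subtends at the far edge of that plan (small-angle approx).
    obstacle_m = 0.10
    angle_rad = obstacle_m / drive_per_plan_m
    print(f"10 cm rock at {drive_per_plan_m:.0f} m : {math.degrees(angle_rad) * 3600:.0f} arcseconds")

A 10 cm rock at 1200 m subtends roughly 17 arcseconds, before you even ask for +/-5cm range accuracy on it.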

Onboard, or at least much closer, compute is the only way forward in autonomy. Honestly, the best bet on improving local compute would be to send a robot bulldozer, some C4, a rack full of milspec servers, and a big RTG. Blow a nice crater, push the servers to the bottom, and bury them in dirt for shielding. If you get really lucky, you could find some old cave or lava tube.


That's more or less what they used to do with Sojourner, only with humans setting out the waypoints rather than A*. It never managed to get more than 10 meters from its lander, though; I assume this was partly driven by the limits of a human's patience in operating a slow vehicle with a 28-minute feedback loop.

One drawback of the remote A* approach is that you end up using more energy as the rover would have to be constantly communicating with its onboard antenna. Its relay satellites are only in range for a limited period each day. Fine grained maneuvers (like drive around that big rock to get to this small rock) would also prove difficult because of likely errors in the rover's inertial-navigation system.

https://en.wikipedia.org/wiki/Sojourner_(rover)


Does curiosity really move that fast? Roomba is maybe 2 mph. I thought curiosity was closer to 0.1 mph


His or her estimate seems wildly off. I had the same response because my Roomba moves fast!

Some quick googling says:

Curiosity's max speed is 0.08699 mph.

A Roomba does about a foot per second, which is 0.682 mph.

So the estimate was off by about 7.8x.
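For anyone who wants to check the arithmetic (same googled figures as above):

    # Unit check on the figures above.
    curiosity_mph = 0.08699            # reported top speed (~3.9 cm/s)
    roomba_ft_s = 1.0                  # roughly a foot per second
    roomba_mph = roomba_ft_s * 3600 / 5280

    print(f"Roomba    : {roomba_mph:.3f} mph")
    print(f"Curiosity : {curiosity_mph:.3f} mph")
    print(f"ratio     : {roomba_mph / curiosity_mph:.1f}x")   # about 7.8x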


Thanks: my estimate of Roomba slowness was way off. I’ve only ever seen them on TV.

I remember once hearing that Curiosity, flat out, could do 1km a day.


Random but I highly recommend the new "mapping" type roombas. Best thing I bought all year.


With what kind of processing power did the recent Chang'e 5 probe perform its autonomous docking manoeuvre in lunar orbit?

Also, it seems to be doing some type of image processing to identify a suitable landing spot and guide itself to that point.


I was thinking, instead of that, what if you had a separate, isolated tiny computer on the spacecraft that was powered by its own solar panels (so there's no electrical wiring or other connection to it) and had its own radio? This separate computer could use the latest bleeding-edge CPU, and be encased in a radiation-hardened shell. It would use its radio to talk to the slower main computer, do math really fast locally, and if need be, beam the results to Earth, or to a nearby orbiting satellite.


Unfortunately there isn't really any practical way to have a radiation-hardened shell that is sufficiently effective. E.g. 5 cm of aluminium stops only 30% of the galactic radiation. (Heavier shielding elements (e.g. gold) get struck by incoming particles and throw off secondary heavy ions, which cause even more damage.)

So practically it would still experience significant radiation.
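Back-of-the-envelope on why "just add more shell" doesn't scale, naively treating attenuation as exponential and extrapolating from the 5 cm / 30% figure above (if anything this is optimistic, since thick high-Z shields generate secondaries):

    import math

    # Naive exponential-attenuation extrapolation from "5 cm of Al stops ~30%".
    t0_cm = 5.0
    surviving = 0.70
    mu = -math.log(surviving) / t0_cm            # effective attenuation coefficient

    for target_stop in (0.50, 0.90, 0.99):
        thickness = -math.log(1 - target_stop) / mu
        # Aluminium shell (2.7 g/cm^3) around a 30x30x30 cm electronics box.
        side = 30 + 2 * thickness
        mass_kg = 2.7 * (side ** 3 - 30 ** 3) / 1000
        print(f"stop {target_stop:.0%}: ~{thickness:5.1f} cm of Al, ~{mass_kg:7.0f} kg around a 30 cm box")

You're into tonnes of aluminium long before the CPU inside sees anything like a benign environment.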

But having the main compute for Mars remain in orbit with the relay isn't a bad idea.


The Shannon limit implies a linear relation between bit rate and transmit power. The only way to get fast enough transfer would be to spend tons of the power budget on the radio.
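Sketch of what that linearity looks like, with made-up link numbers (deep-space links live in the power-limited regime, where capacity is nearly proportional to received power):

    import math

    B = 10e6        # assumed bandwidth, Hz
    N0 = 4e-21      # noise spectral density, W/Hz (~kT at 290 K)

    def capacity_bps(p_watts):
        # Shannon capacity: C = B * log2(1 + P / (N0 * B))
        return B * math.log2(1 + p_watts / (N0 * B))

    for p in (1e-15, 2e-15, 4e-15, 8e-15):   # received powers well below the noise floor
        print(f"P = {p:.0e} W  ->  C = {capacity_bps(p) / 1e3:8.1f} kbit/s")

Each doubling of power buys almost exactly a doubling of bit rate, so "send everything back for processing" means a proportionally bigger radio and power budget.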


For Mars, at least, that would be tens of minutes round trip because of the speed of light.

It works for some things, but for pathfinding it isn’t a great fit.

The other issue is bandwidth between the craft and Earth, which is quite limited.

Maybe there would be benefits to an “orbiting datacenter” around Mars carrying a bunch of rad-hardened compute? I assume NASA has considered this and decided it would be a bad idea.


SpaceX doesn't have the same requirements. The radiation environment near Mercury or halfway to Jupiter is drastically different from LEO.

SpaceX missions are also a lot shorter. Having one unrecoverable latchup a week isn't a big deal if your mission is 2 weeks long. If your mission is 10 years, it starts to become a problem (especially since some radiation damage is cumulative).

>NASA and DoD are still baselining new platforms on incredibly expensive (albeit rad-hard) PowerPC RAD750 and similar

NASA and DoD have also been sending up Xilinx and Altera boards for ages (even the non-space-grade ones). However, you can get rad-hard ARM CPUs that are cheaper and more powerful than the ones in a Zynq board.


It would be interesting if someone put a Raspberry Pi inside and outside the space station, in a completely unprotected, exposed environment, and ran some continuous tests: how long until we start to see failures, and what kind of failures would they be?


This has been done multiple times. Amateur radio satellites and some cubesat kits [1] use primarily COTS components.

The lifetime and radiation environment for those applications are very limited, though. It seems that for short missions (e.g. <2 years) and low orbits (<500km), COTS hardware should be fine if properly shielded.

It would be interesting to see what difference it actually makes for HEO or even BEO missions, especially if a high degree of redundancy is introduced as well.

[1] http://www.cubesatkit.com


Typically those sorts of tests can be done on Earth if you have access to a cyclotron. My guess is that the SD card would be the weak link.


"Having one unrecoverable latchup a week isn't a big deal if your mission is 2 weeks long.Having one unrecoverable latchup a week isn't a big deal if your mission is 2 weeks long."

Unless it happens in your attitude control system or your command and control system, causing you to lose control of or communication with your spacecraft.


I guess I didn't actually say so, but my implicit assumption was that you have a voting setup where a single failure isn't necessarily a problem


I think the assumption is that a redundancy scheme is in place. So you have your unrecoverable issue in some module of compute A, but compute B and C vote them down and life proceeds. The problem is when your mission is long enough that the same module gets hit in one of the other two units, and now you're in trouble.


> Starlink with Xilinx FPGAs, while NASA and DoD are still baselining new platforms on incredibly expensive (albeit rad-hard) PowerPC RAD750 and similar.

Ignoring that Starlink isn't very far away, I would assume NASA stuff would also have FPGAs and ASICs on them - they aren't CPUs and aren't used like them.


It’s pretty common in space to implement a soft core CPU (or redundant ones) on a space-grade FPGA.


Some NASA orgs have tried using FPGAs as a way to get around software requirements, to varying levels of success


Interesting, although I was thinking more about FPGAs in things like acquisition and processing rather than overall logic, as the parent comment seemed to imply.


The high level NASA requirements cast a pretty wide net (to include data acquisition and processing) as to what falls under the purview of those requirements. From 7150.2:

“A.30 Software. Computer programs, procedures, scripts, rules, and associated documentation and data pertaining to the development and operation of a computer system. Software includes programs and data. This also includes COTS, GOTS, MOTS, reused software, auto generated code, embedded software, firmware, and open source software components.”

https://nodis3.gsfc.nasa.gov/displayCA.cfm?Internal_ID=N_PR_...


My understanding is that certification is the bottleneck, in both time and cost. No one wants to spend the money or time to flight-certify something new when something already battle-tested will suffice.

But your comment makes me wonder if the private sector doesn't have those certification requirements?

The other differentiating factor is that the private sector is not sending multi-year (indeed multi-decade) deep space missions, where the need for battle tested systems is paramount.


Flagship multi-year science missions are generally conservative with technology choices, but some NASA projects are intended as technology demonstrations and can take on more risks.

So, for example, the Perseverance rover on its way to Mars is powered by redundant RAD750s (same as Curiosity), but the Ingenuity helicopter along for the ride is powered by a Snapdragon 801.

It will be interesting to see how it holds up.


How do you battle-test a RAD prototype? Stick it in a microwave-like device with ionizing radiation and see how many bit-flips occur?


It depends on where the spacecraft is going as radiation environments differ. I've taken parts to be exposed by a proton line at a particle accelerator. For some environments they just use Cobalt-60 as a radiation source.


This reminds me of a relevant anecdote: back in the aughts, I was doing research in cosmic rays at a large nuclear research facility. I did simulation and data analysis - office/computer work mostly. One day, a person with a clipboard comes into my office and asks about the whereabouts of some radiation source. I look at them confused - I had not touched sources since teaching nuclear physics labs. They show me their clipboard and lo and behold, it has my name next to a really high-intensity source, and they're looking to locate it.

After a few minutes of awkward shock and denying all involvement, we realized it was a colleague at the same research institute (and of the same name). He was, one building down, doing his PhD research on radiation hardened detectors for the CMS experiment at CERN(1). He was using the source for the testing. But I had a minute of real stress before that came together in my mind...

(1) I think this was his work: https://onlinelibrary.wiley.com/doi/abs/10.1002/pssa.2007763...


Pretty much!


To answer - no we don't have the same certification requirements. NASA steps in when there's human lives and/or a lot of money on the line, but most smaller projects and just about every independent project is free to assume its own level of risk.


You know (and probably are implying) this but it’s completely program/project specific. Some projects out of Armstrong, for example, must meet FAA certification requirements


NASA has used Xilinx FPGAs on a number of missions (though still mostly smaller missions). They are doing so for precisely this reason: on-spacecraft computation for intensive tasks such as image processing.

Here’s the website for the SpaceCube platform (developed at NASA Goddard). This is a little out of date (I worked on flight software for a mission called STP-H6 which I don’t see listed here), but gives an idea of how this idea is slowly but surely gaining steam in NASA.

https://spacecube.nasa.gov/


The private sector has decided to put regular ground chips in spacecraft and just deal with errors using triple redundancy. Low earth orbit where most satellites hang out doesn't have much radiation anyways.

The cost savings from using regular chips are so high that I bet SpaceX will continue to use them even in deep space. Just surround them with shielding. When a $400 desktop CPU is 500x faster than a $40,000 space-rated one, a couple pounds of shielding is well worth it.


The kind of radiation you want to protect against is not "easily" shielded.

The effectiveness of shielding is proportional to its mass and thickness, and both are at a premium for spacecraft.


There’s more to it than just specs. Consumer-grade silicon will not survive in space; radiation will just kill it.


On some of the silicon details: https://habr.com/en/post/518366/



And that's expected to be succeeded by NOEL (a LEON anagram), a radiation-hardened RISC-V CPU focused on high reliability.


It is not related to OpenSPARC. LEON is a SPARCv8 design developed initially by ESA and later by the Swedish company Gaisler Research.


How do these circuits deal with random bit flipping from cosmic rays?


There are three basic ways this is done:

1: By process, where chips are created with special or larger features to better resist cosmic rays. This is Expensive since they're made in very low volumes and the cost of the new fab line can't be spread among many millions of units. Instead, a few thousand chips might be made.

2: By design, where redundant systems such as triple-redundant memory or voting computers are used. This is probably the most interesting, as you can get into issues like the Byzantine Generals problem here. All the redundancy can be implemented in a single FPGA by simply routing out the design 3 times and using voting logic, assuming the FPGA is large enough (a minimal voting sketch follows below).

3: By shielding. Just fly regular chips in a shielded box. This causes thermal issues, but is sometimes necessary, such as in Juno, which has to deal with the enormous radiation flux around Jupiter.
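A minimal voting sketch for option 2, in Python for readability (real designs vote in hardware, e.g. three copies of the logic in one FPGA feeding a majority gate):

    from collections import Counter

    def vote(a, b, c):
        """Majority value of three redundant results, plus a flag saying
        whether one copy disagreed (so it can be scrubbed or rebooted).
        If all three disagree there is no majority: the Byzantine case."""
        value, hits = Counter([a, b, c]).most_common(1)[0]
        return value, hits < 3

    # Example: copy B takes a single-event upset that flips one bit.
    result_a = 0b1010_1100
    result_b = 0b1010_1110
    result_c = 0b1010_1100

    value, upset = vote(result_a, result_b, result_c)
    print(f"voted value: {value:#010b}, upset detected: {upset}")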


As sending mass to space becomes cheaper, I wonder if shielding will become more popular... Maybe there is some oil-like material that could serve both as a cooling bath and as a radiation shield?


Polyethylene is usually used to shield against cosmic rays. Nearly identical shielding performance as water. Slightly less weight per volume. Can be formed easily and retains its shape.


As mentioned in another comment, water is a radiation shield, but ionizing radiation in space will attenuate 50% after 7cm of water (I could be wrong), and if you want a lot of attenuation, you need a lot of water (which is extremely heavy).

Even a small amount of water used for shielding is undoubtedly much more massive than simply using bigger, rad-hard processors.


I don't see any comments about water on this post?

If we're talking about $50/kg [1] in the future... well...

- Sending a few kg of water (or other shielding) to space costs a fraction of the price of a fast non-rad-hardened CPU.

- It really costs less than the extra development cost associated with having to use bespoke toolchains.

[1] Number from the other space post on the front page today: https://getmeflyingcars.medium.com/how-much-does-it-cost-to-...


> I don't see any comments about water on this post?

Looks like it was deleted.

> - Sending a few kg of water (or other shielding) to space costs a fraction of the price of a fast non-rad-hardened CPU.

It doesn’t sound like a few kg is going to cut it. Recall that 1 kg of water is about 1 L, which (as a solid sphere) gives you about 6 cm of shielding, which is simply not enough. That’s less than 50% attenuation of the ionizing radiation you find in space. Mass increases with the cube of the thickness: if you want 20 cm of shielding, that’s 33 kg, and something like 86% attenuation.
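Quick sanity check of those numbers, using the same solid-sphere-of-water and halve-every-7-cm assumptions (a real shield would be a hollow shell around the electronics, which only makes the mass worse for a given thickness):

    import math

    HALVING_CM = 7.0    # assumed: intensity halves every ~7 cm of water

    def sphere_mass_kg(radius_cm):
        return (4 / 3) * math.pi * radius_cm ** 3 / 1000   # water: 1 g/cm^3

    def fraction_stopped(thickness_cm):
        return 1 - 0.5 ** (thickness_cm / HALVING_CM)

    for t_cm in (6, 20, 50):
        print(f"{t_cm:3d} cm: ~{sphere_mass_kg(t_cm):7.1f} kg, stops ~{fraction_stopped(t_cm):.0%}")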

It seems to me like there are better things you can do with your mass, which also needs to be spent on things like fuel for stationkeeping. Lower mass also means more satellites.

> - It really costs less than the extra development cost associated with having to use bespoke toolchains.

POWER is not exactly some obscure ISA. It is well-supported and battle-tested. You don’t need a bespoke toolchain.


The most bespoke part would be toolchain certification, which, among other things, covers "if I have source A, then I can be sure the resulting machine code does B, not anything else".

And that's something you'll get even if you're running MS-DOS on 286 in space ;-)


The packaging can be different or redundancy can be added to the design: https://en.wikipedia.org/wiki/Radiation_hardening#Radiation-...


Continuous voting on hard real-time systems


So my watch (or maybe headphones even) is several times faster than anything that's ever run a spacecraft?

(I understand why, but that this is so just blows me away)


Though true, it is important to remember that Computers Are Fast. https://computers-are-fast.github.io


Especially when they don't have to deal with 37 layers of Windows/MacOS abstractions and backward-compatibility concessions.


What about Chinese/Indian/Japanese spacecraft?


China's Chang'e 4 lander is known to use ATMEL AT697F. It is SPARC-compatible.

Source: The scientific objectives and payloads of Chang'e 4 mission. DOI 10.1016/j.pss.2018.02.011. No free source, use Sci-Hub.


"A CPU for use in space must first be MIL-STD-883"

Is this just for NASA craft? Are there any regulations for private craft or international standards?


Depends, but mostly no. No regulations.


A non-trivial number of satellites have also used Transputers (and there's a legacy of it in the SpaceWire network).


There have been non-rad-hard CPUs working in space. https://www.hpe.com/us/en/insights/articles/the-space-statio...


Low Earth Orbit (LEO), which the ISS is in, is very different from deep space. You can get away with much less radiation hardening in LEO. For example, I know of a company flying consumer-grade Xilinx MPSoCs with 4x A53 cores at 1.5 GHz.


They literally ship off-the-shelf laptops and smartphones to the ISS.

They do modify them slightly to remove the lithium-ion batteries, which you do not want inside a spacecraft.

I think they replace them with NiMH cells if the devices still need to be battery powered.


They no longer do this, and are now happy with just testing Li-ion products. There are iPads all over the station (and used by the astronauts on the SpaceX Dragon).


Do we actually have much data on the failure rates of different kinds of chips exposed to radiation in space? If so... how? What conditions have to hold so that when a chip is damaged by radiation, you get enough information about it to know which chip had what kind of problem?


Testing is done on earth in lab conditions, and some testing has been done in space. We've had good models of radiation energy and type in space for a while, so you can reproduce type and intensity of radiation on earth to test how the chip will behave. You can also fly chips for testing, where instrumentation and testing are controlled by known working hardware.

Diagnosing novel failures on hardware in space is hard, but the overall types of failures and their underlying physical phenomena are known. Obviously, in the case of a satellite that goes completely unresponsive you can't answer 100%, but you may have sensor data in the lead-up to the failure, or from other "nearby" satellites, that would allow you to partially reproduce conditions before the failure in a test scenario.


When I see posts like this, I always think about how insane it will be in 2520 if we keep the same innovation growth (and not die to whatever)


Pretty interesting. Wonder what SpaceX uses for their CPUs?


[deleted]


> Saying there are physical backup controls is a bit like saying there are backup controls at the bottom on your new TV.

Can you expand on this? Are you against physical backup controls?


What they were trying to say is that physical backup controls are often incomplete and inconvenient, just like how you can control a TV using the buttons on it but doing so is far less effective than using the remote. The idea is that making physical controls the primary interface (rather than the backup) for critical tasks means that more effort would go into making them ergonomic and effective.



