Colonial Pipeline does in fact keep its control network disconnected from the internet - the only thing that was hit by ransomware was their corporate network. They shut the pipeline down voluntarily to prevent further spread.
If we define a critical system as "necessary to the operation of the business", then the corporate system is absolutely critical. It doesn't matter that the SCADA system is air-gapped if you can shut down the capability by crashing the corporate systems.
Their approach makes a lot of sense. The corporate network was hacked, so to be careful they shut down the pipeline until they've made really sure the pipeline side is fully safe, since there may now be more attack vectors.
I built some of the SCADA and IT systems for Colonial Pipeline.
Many industrial SCADA systems (nearly all) send data from their "OT" systems (PLC/DCS/SCADA) to their "IT" and business layers (Historians/Timeseries Databases, Dashboards, Power BI/etc). This almost always happens through a two-way link (think TCP/IP, HTTP). While the software should not allow data flow backwards, the hardware absolutely does. So how much do you trust the software?
I often advocate that industrial SCADA systems utilize "data-diodes", one-way opto-isolators, or other physically verifiable methods of confirming that no information/data/instructions can get from a "higher" layer (OSI Pi, PowerBI) to a lower layer (Allen-Bradley PLC, Siemens PLC, Emerson DeltaV DCS, etc).
Convincing the powers-that-be to do this has been nearly impossible in most places, and it's a large reason why I'm trying to transition to a different space - I simply have ethical concerns about providing engineering services to critical infrastructure without building in best practices.
Stuxnet was over a decade ago - I don't understand how these protections aren't mandated by the DHS already.
Disclaimer: I don't think it's reasonable for non-involved people to assume the OT side has been compromised. I do think Colonial will need some time to verify the integrity of their SCADA systems and it makes sense to keep the power to the physical devices (valves/pumps) offline until they do. I understand why they chose to shut down but I don't think there's any evidence that they'll be unable to start back up again.
Lastly, I saw a quote in one article:
> Digital Shadows thinks the Colonial Pipeline cyber-attack has come about due to the coronavirus pandemic - the rise of engineers remotely accessing control systems for the pipeline from home.
I strongly doubt this. It's possible, of course. But it's extremely unlikely to me that employees would have remotely accessed OT/SCADA systems from home. No one I've worked with has had that capability enabled.
Many companies use products which have been shown to have flaws, like Citrix or various corporate VPNs. These could be compromised to get access "closer" to the OT layers but never directly into it.
Onion layer security is very much practiced everywhere I've been.
Edit: I have heard of some petrochemical facilities moving towards allowing operators and engineers to manipulate valves/pumps on their iPhones. This horrifies me for many reasons. I've never actually seen it implemented and I always bring up Stuxnet when I hear people mention it. I personally believe that DHS should make this sort of thing illegal for critical infrastructure. Many good engineers disagree with me.
Dan Kaminsky spent an enormous amount of time and effort on creating a secure hardware framework 10 years ago. It went nowhere for a lot of the same reasons you discuss in this comment.
The government and industry are all talk. Until we see actual enforcement / incentives for secure hardware, just assume everything (and I mean everything) can get shut down at any time. The only people who think this is an exaggeration are those who haven’t seen what things actually look like on the inside.
> Dan Kaminsky spent an enormous amount of time and effort on creating a secure hardware framework 10 years ago.
Can someone please link me to a page that goes into more detail about this secure hardware framework? Not a month after his passing, and we get a national-security-level attack that might have been prevented if US business had taken software engineering and security engineering more seriously back then.
Also, I cannot help but wonder if Dan would still be with us if the irretrievably broken US healthcare system was a national system that supplied him with an inexpensive CGM and insulin no matter his employment situation.
There were a bunch of people who worked with him on this in Taiwan. I don't think anything ever got released publicly, and I think Dan found the whole episode to be so frustrating that he never commented on it much in public.
I’m curious — how would something like a data-diode work in real life? It makes sense, but what about something like TCP where the sending side needs the ability to receive ACK messages? Is a firewall (dedicated, if need be) enough?
Or would this be some other kind of physical interface that took some kind of read-only data (serial?) and sent it up the layers using TCP/IP, where only this box would be at risk?
Edit: looks like you answered part of this below — you suggest switching to UDP protocols.
TCP would not be possible if your physical layer doesn't support two-way communication. I think UDP would.
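To make that concrete, here's a minimal sketch of what fire-and-forget UDP looks like in code (the tag name, value, and loopback setup are just for illustration). The demo runs over loopback so it's self-contained, but the point is that the sender never needs a return path - it would work identically across a physically one-way link:

```python
import json
import socket

def send_reading(sock, addr, tag, value):
    """Fire-and-forget: no ACK is expected, so a physically one-way
    link (e.g. fiber with the return strand unused) is sufficient."""
    payload = json.dumps({"tag": tag, "value": value}).encode()
    sock.sendto(payload, addr)

# Demo over loopback; in practice the receiver sits on the far side
# of the unidirectional link and the sender never hears back.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))               # ephemeral port for the demo
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

send_reading(tx, rx.getsockname(), "valve_9881_pct", 10.0)

data, _ = rx.recvfrom(4096)
print(json.loads(data))  # {'tag': 'valve_9881_pct', 'value': 10.0}
```

TCP's three-way handshake and ACKs are impossible here, which is exactly the property you want: the protocol itself proves nothing can flow back.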
Firewalls are currently used, and probably generally configured well. Petrochemical companies have a many-layered onion security strategy with minimal communication paths through the firewalls. Generally you might have 4-8 layers of firewalls from public facing internet to the PLC/DCS/SCADA. Administrative people might VPN 1-2 layers deep and engineers would at worst get remote access to the historian, 1-2 firewall layers above the PLC/DCS/SCADA.
It's my professional opinion that firewalls are not good enough for critical infrastructure. Even a completely air-gapped system was hacked thoroughly over a decade ago in Iran (See Stuxnet).
Your suggestion would suffice, if that box ("gateway", in the IoT parlance) was connected with a one way physical connection to the SCADA system over serial or what-have-you. Then it could communicate with TCP using existing application stacks.
I am designing a system like this at my current job, where luckily we are a small enough team so people have genuinely listened to my suggestions about this.
However, good engineers often disagree with me. I may be overly zealous on this particular issue and I take a lot of criticism about how dogmatic I am at times. I'm not a senior engineer by any stretch.
> I’m curious — how would something like a data-diode work in real life?
Low, fixed-bitrate transfer over unidirectional fiber optics. Unidirectional transceivers are the norm for long-haul fiber.
My local electrical utility is still running 11.52kbps RS-232 over fiber for exactly this reason. At those bitrates you don't need backpressure -- your disk will never fill up and if the CPU can't handle that bitrate you already have much larger problems.
It's kind of funny that they have sheaths where one strand is running this piddly dozen-kilobit protocol and other strands in the same sheath are doing 10gbit/sec * 16-channel CWDM.
Most electrical utilities are into fiber optics in a very big way; they already (usually) own the poles and unlike copper it's nonconductive. Many of them have vastly more strands of fiber between substations than they need.
> how would something like a data-diode work in real life?
The ones I have worked with convert a TCP stream to UDP, send it across the diode, and then convert it back to TCP. Each UDP packet has a sequence number, and there is a single reverse diode that fires when a packet is missed or arrives out of order, triggering a retransmission of the last N packets.
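A toy model of that scheme (the window size, class names, and NAK mechanics are my assumptions, not any actual product's design): the sender keeps the last N packets, the receiver detects sequence gaps, and a single NAK pulse on the reverse diode triggers a blind replay of the window.

```python
from collections import deque

WINDOW = 8  # packets retained for retransmission (assumed size)

class DiodeSender:
    """Tags each payload with a sequence number and keeps the last
    WINDOW packets so a single NAK pulse can trigger a replay."""
    def __init__(self):
        self.seq = 0
        self.history = deque(maxlen=WINDOW)

    def send(self, payload):
        pkt = (self.seq, payload)
        self.history.append(pkt)
        self.seq += 1
        return pkt

    def replay(self):
        # Fired when the reverse diode signals a miss: blindly
        # retransmit the whole retained window.
        return list(self.history)

class DiodeReceiver:
    """Reassembles the stream; a gap in sequence numbers is the
    only event that pulses the reverse-diode NAK line."""
    def __init__(self):
        self.expected = 0
        self.received = {}

    def accept(self, pkt):
        seq, payload = pkt
        self.received[seq] = payload
        gap = seq > self.expected           # a packet was missed
        self.expected = max(self.expected, seq + 1)
        return gap                          # True -> pulse the NAK

sender, receiver = DiodeSender(), DiodeReceiver()
packets = [sender.send(f"reading-{i}") for i in range(5)]
del packets[2]                              # simulate a dropped datagram

gaps = [receiver.accept(p) for p in packets]
nak = any(gaps)
if nak:                                     # reverse diode fires once
    for p in sender.replay():
        receiver.accept(p)

stream = [receiver.received[i] for i in range(5)]
print(stream)  # all five readings recovered despite the drop
```

The reverse channel carries one bit of information ("replay"), which keeps its attack surface close to zero compared to a full TCP return path.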
I have heard some plane infotainment systems use a one-way optical link to solve this problem for getting the speed/altitude/etc to the displays. It just receives the data as a downlink (no two-way communication), and, being optical, it's electrically isolated as well as physically unable to transmit or even interfere the other way.
That was FUD[1] spread by Chris Roberts, who has been called out for it. He claimed he was able to issue climb commands from the IFE to the CAN bus.
Apart from this being technologically impossible, if he had really done it he would have been charged with endangering an aircraft and prosecuted. (So the only explanation for why he isn't in prison is that it didn't happen.) The technical reason is that data diodes are common in aviation to separate the IFE from the CAN bus or position data (e.g. in ARINC).
If you don't allow two-way comms to SCADA devices, how can you set values on those devices? For example, open valve 9881 to 10% … how would that be done?
That functionality would be local, before the one-way isolator. A human at a terminal located near the valve could still press a button to make that happen. That system could even be running Windows (most are!).
But a hacker wouldn't be able to use their access to the timeseries database (used for supply chain and logistics) to pivot to the SCADA system, because their attempts would be blocked by the lack of a physical-layer connection in that direction.
It would significantly reduce the attack surface of the OT systems.
I think the original use-case was delivering data for use in dashboards or other business systems, not the SCADA network in general (where you’d want write access). So, places where you might want to get read-only data from a secured system, but not allow write access. These business/reporting systems might be internet connected, hence the desire for better isolation.
That would be like deploying the landing gear of the airliner, because someone triggered a bug while changing the channel on the in-flight entertainment system.
The whole point of SCADA systems is that you can open and close valves remotely, without requiring to drive hundreds of miles along the pipeline to wherever the particular valve is located.
Industrial applications use unidirectional gateways (See NIST 800-82r2). The gateways have diode-like hardware at their core, but add software. The software acquires snapshots of industrial state, converts those snapshots to proprietary unidirectional protocols & formats, and on the external enterprise network makes the data available to enterprise users.
A common example is a SQLServer database of all industrial data that is authorized to share with the enterprise. Grab new and changed data as it arrives on the industrial side. Push unidirectionally to the enterprise side. Insert/update the data in an identical SQLServer. Enterprise users & applications interact normally and bi-directionally with the replica database.
The technology is used routinely to provide access to industrial data that enables business efficiencies, without providing access to the industrial systems that produce the data.
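A rough sketch of that replication pattern (sqlite3 stands in for SQL Server here, and `diode_send` models the unidirectional hop - the pattern is what matters, not the engine): grab new/changed rows on the industrial side, push them one way, and upsert into an enterprise-side replica that users can query freely.

```python
import sqlite3

# sqlite3 stands in for SQL Server; "diode_send" models the one-way link.
plant = sqlite3.connect(":memory:")    # industrial-side source database
office = sqlite3.connect(":memory:")   # enterprise-side replica

for db in (plant, office):
    db.execute("CREATE TABLE readings (tag TEXT PRIMARY KEY, value REAL)")

plant.executemany("INSERT INTO readings VALUES (?, ?)",
                  [("flow_ps1", 412.5), ("press_ps1", 61.2)])

def snapshot(db):
    """Industrial side: grab new and changed rows (here: everything)."""
    return db.execute("SELECT tag, value FROM readings").fetchall()

def diode_send(rows):
    """Models the unidirectional hop; nothing ever flows back."""
    return list(rows)

def apply_snapshot(db, rows):
    """Enterprise side: upsert the pushed rows into the replica."""
    db.executemany(
        "INSERT INTO readings VALUES (?, ?) "
        "ON CONFLICT(tag) DO UPDATE SET value = excluded.value", rows)

apply_snapshot(office, diode_send(snapshot(plant)))
replica = dict(office.execute("SELECT tag, value FROM readings"))
print(replica)  # enterprise users query this copy bidirectionally
```

Enterprise queries, joins, and dashboards all hit the replica; nothing they do can reach back through the diode to the plant.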
A data diode doesn't have to be just a one way ethernet port, it can include a pair of dedicated servers.
The inside (isolated) server would poll everything, store it into a buffer, and send that buffer (plus error correction) out through an optoisolator to other server.
The outside (internet facing) server would then keep up with the ring buffer, and serve requests, and do any outbound push of data via any protocol required.
A system to do this could be made with a pair of Raspberry Pi computers and a few discrete components, for less than $150 in hardware costs.
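A toy model of the inside server's buffer (sizes and names are illustrative): the poller writes into a fixed ring, the one-way link drains whatever has accumulated, and if the outside server falls behind, the oldest samples are simply overwritten - there is no backpressure across a diode.

```python
class DiodeRingBuffer:
    """Toy ring buffer for the inside (isolated) server: the poller
    writes, the one-way link drains. Old entries are overwritten if
    the outside server falls behind - no backpressure exists."""
    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.head = 0          # next write position (inside server)
        self.tail = 0          # next read position (outside server)

    def write(self, item):
        self.slots[self.head % self.size] = item
        self.head += 1
        if self.head - self.tail > self.size:   # overwrite oldest
            self.tail = self.head - self.size

    def drain(self):
        out = []
        while self.tail < self.head:
            out.append(self.slots[self.tail % self.size])
            self.tail += 1
        return out

buf = DiodeRingBuffer(4)
for i in range(6):            # 6 writes into a 4-slot buffer
    buf.write(f"sample-{i}")

drained = buf.drain()
print(drained)                # the oldest two samples were overwritten
```

At the RS-232-over-fiber bitrates mentioned above, overwrites would essentially never happen; the ring is just insurance for the case where the outside server stalls.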
"Operational Tech" (pipeline and safety-critical monitoring and control) and "Information Tech" (payroll, email, other business stuff)?
I could only imagine trying to tell a large corporation that their "IT" authentication system can't be linked to the access card keys for the front gate, or whatever other physical security they might have in place.
It doesn't matter if we can formally prove that a remote access system is sufficiently secure as to allow engineers to operate valves and pumps from home... for inevitably, some months from now, a wildly insecure utility will be connected to it, and you lose the ability to reason about how to keep the streams from crossing.
The easiest opto-isolator is to epoxy the SFP into the socket, fill the RX port on the critical side with epoxy, and then just run one fiber. The epoxy may seem excessive, especially since if the SFP dies you have to swap a whole NIC, but it makes people stop and think.
Do you think Colonial identified some "physical world" risk, as in the possibility of a pressure overload or pipeline leak? I imagine that verifying the integrity of these SCADA systems is a very complex task, so I'm wondering if they've already identified a possible attack vector/entry point or if this was entirely preventative.
I have no idea. Shutting down preventatively would be smart, and they had good leadership in their IT space while I was there. Friendly people who could make the hard decisions quickly, weren't afraid to pick up the phones to call people, and supported the growth of struggling employees without letting shoddy work get approved. They were also good at managing large multi-year and nation-wide project campaigns - a rare skill in this world.
That said, determining whether or not a system was compromised can be incredibly difficult. I'm sure they'll face massive pressure to turn the pipeline back on as it does supply almost half of the east coast with oil. I wouldn't want to be the person who has to make that call when it's impossible to prove a negative.
CPC had two explosions a few years back which caused gasoline shortages in New England; that may give some indication of the scale of disruption to expect.
Thanks for the response. It's amazing to have a community where "subject-matter experts" like yourself just pop up.
I'm quite surprised and comforted to hear that the leadership there is competent and knows how to manage people. I've heard from friends/acquaintances who have worked in the energy industry about how terribly things are put together on an IT front (PG&E being a prime culprit), so I was expecting the same here.
I really like your "data-diodes" concept. Interested to see if such a thing takes off especially as these attacks evolve.
> I personally believe that DHS should make this sort of thing illegal for critical infrastructure.
I can't speak to non-electrical infrastructure, but the NERC CIP "high impact" standards already make it largely impossible to operate critical electrical infrastructure from anywhere other than a secured control centre. Operating from your laptop or iPhone from the kitchen table is however allowed for "low impact" assets like small power plants.
I also wonder why nobody who has secure computing issues demands physical write-enable switches for ROM, rather than using software switches that are inevitably corrupted.
Generally it's been the opinion that the control systems need to be modifiable. For example if you add a single valve in a facility which has 4,000 valves already, it would be nice to just add add a controller for that valve to the current SCADA system.
However, a ROM-based system is possible as long as the ROM chips are reasonably affordable and a company can provide reasonable turnaround times for small modifications. That would move the target of vulnerability up the supply chain.
Some of the things which matter though are necessarily run-time variables like "is the valve commanded open or closed?" and "what are the tuning parameters for this PID control loop?". It's always theoretically possible for a buffer overflow/rowhammer/etc to flip the bit responsible for the valve's open/closed command. Even with an OS/Application stack burned into ROM. You still need RAM.
At least power cycling a readonly-storage device would remove any malicious RAM changes.
Thanks for your many informative posts here. It's a pleasure reading from someone who knows what they're talking about :-)
I did say ROMs, but you can also use EEPROMs, which are erasable in-circuit, and you certainly put a physical write-enable in that circuit. Ideally, it would be a momentary push-button that has to be pushed in person on-site.
Back in college we used EPROMs, which are erased by putting them for 20 minutes or so under a UV lamp. EEPROMs came out later.
Another thing that can be done is to divide the pipeline into several sections, not just one long one. So if one section gets compromised, it doesn't propagate to the next.
Do you propose that each section gets their own control/monitoring facility staffed 24/7 ? If not, the shared control/monitoring facility is the most likely place of compromise anyway, and it by design can control all the pipeline hardware.
I'm not sure how that would work-- each section would still need to send its petroleum products to the next section, making it effectively still one pipeline. Unless I misunderstand your statement?
Consider cars on a freeway. There is no central control. Each car controls itself, cooperating with its neighbors. If one car goes berserk, it doesn't take down the whole freeway.
With a pipeline, if sections operated autonomously but cooperated with each other, and one goes berserk, its neighbors will shut down, but they won't be damaged. The repair work only has to repair the one section.
Ah, I see. Not a segmented pipeline, but segmented pipeline control. That makes a little more sense. However, it might make coordination between segments significantly more difficult: the self-organizing behavior at work in car driving may be significantly different from what a pipeline requires.
People driving cars are essentially doing what is best for themselves individually (within the bounds of the law), and that ends up translating to something that works for the whole. With a pipeline, that might not work: If pressure gets too high in one area, it might take highly coordinated control across thousands of miles to bleed off contents into buffer tanks & ease pressure a dozen segments away.
I'm not saying that couldn't be done - I'm sure the SCADA systems could be isolated from each other in this way - it just seems like it would require much more explicit coordination between technicians, rather than a self-organizing system like driving on a highway.
I have no idea why they would do that unless the system was not airgapped properly or it was hard to untangle the admin network from the control network (in which case, the control network is effectively not airgapped).
USB ports are generally disabled in BIOS or purposefully physically damaged on most OT systems I've worked on for oil/gas/chemicals. Many places are fond of using epoxy to block the ports.
I like those people. The problem is that sometimes you need logs or data off the tools. I'm far from an IT wizard so I don't know what other solutions exist, but flash drives were the easiest way to get stuff off the tools.
It makes some things more difficult. CDs/DVDs are generally used instead. Sometimes other computers could be connected, but in that case there would be some organizational procedure for making sure that the other computer was as low-risk as possible.
You can't eliminate the possibility of malicious action, Stuxnet proves that. It's my opinion that at least for critical infrastructure we can probably make things much more difficult for our adversaries at a relatively low cost. This pipeline is purported to carry half the gasoline/diesel/heating oil to the east coast, but I'd be lying if I said I knew exactly where the cost-benefit equilibrium should land.
I'd be lying if I claimed I knew...but I would be willing to bet that cost-benefit analysis was made a long time ago before these concerns became so timely.
You need the SCADA systems to run the pipeline. They control the pumps, valves, product sequencing, etc. So Colonial purposely shut down the pipeline to prevent the SCADA system from getting affected, which might cause physical damage that truly would be a catastrophe.
It was intended to be airgapped, but we're talking about a pipeline that is several thousand miles long, with many pumping stations and delivery terminals. All it would take is one of the SCADA systems at one of those locations to suddenly open a valve and dump petroleum out into the environment to cause a disaster.
Or worse - rapidly open & close valves in rhythm, and the water hammer effect (the inertia of the petroleum in the pipeline) would cause the pipeline to destroy itself. The repair costs would be astronomical - you'd naturally have to repair the damaged sections, but then also re-test all the welds to see if any had been weakened by the pressure pulses.
It was not intended to be air-gapped. These systems generally communicate to business layers through firewalls.
Onion-layer security rather than air gaps. Communication through the firewall isn't supposed to allow control over the valves, but it does communicate both ways (TCP/IP). This is the general practice in petrochemicals, at any rate.
And to add: it's perfectly possible that the pipeline networks were air-gapped (edit: not that I believe they were), but you still need to shut down.
I could imagine a situation where information from another network (e.g. orders or incoming flows from another customer or user) is necessary to run the pipeline but becomes unavailable for operating the pipeline control system.
>Colonial has not given any public indication as to the reach of the ransomware outbreak, but Robert M. Lee, chief executive of cybersecurity firm Dragos, said he believed Colonial's operations network was shut down proactively "to make sure that nothing spread into those systems."
Most petrochemical companies have an onion structure. There are lots of layers of firewalls with what are supposed to be limited communication paths for specific applications.
We need to move to physical layers where data can only be transmitted in one direction (and then use something like UDP on top).