This chapter examines process control and SCADA systems, followed
by the reasons convergence is necessary and the challenges and threats that face the
organizations responsible for protecting the nation’s critical infrastructure. We will also hear
from two industry experts in one-on-one interviews detailing real-world examples of problems
found in process control networks.
INTRODUCTION
You may not be aware of it, you may not even consider it, but critical infrastructure allows
for all of the modern-day conveniences we are used to. The health of the nation depends on
the infrastructure that provides electricity, moves and controls water, provides gas and oil, and
ensures the operation of our transportation and communication networks. When we flick a
light switch, when we get a glass of water, when we pump gas into our cars, when we dial
9-1-1 in an emergency—all of these things that we may take for granted are available to us
because of the infrastructure that supports the delivery of these goods and services.
It is impractical to think that every time water needs to be redirected or electricity needs
to be routed down a different line that someone would actually go on-site and physically
make the change. That obviously wouldn’t make sense. This is where process control systems
come into play. These systems monitor and control critical infrastructures as well as a
number of other applications, such as automotive manufacturing plants and hospitals. In this
chapter, we will explore how process control systems work, with an emphasis on the systems
used by the oil and gas industry. These systems are commonly referred to as Supervisory
Control and Data Acquisition (SCADA) systems. Most of the time when people are talking
about process control systems they mention SCADA, although SCADA is a subset of the
larger process control system.
This chapter will take a slightly different approach than some of the other use-cases
chapters in the book. The first section examines process control and SCADA systems, followed
by the reasons convergence is necessary and the challenges and threats that face the
organizations responsible for protecting the nation’s critical infrastructure. We will also hear
from two industry experts in one-on-one interviews detailing real-world examples of problems
found in process control networks.
From a longevity perspective, protecting a nation’s critical infrastructure is far more
important than any business system, Web site, or individual organization. Without power,
water, and oil, you may as well forget about the financial database or the Web server that got
hacked, or even the personal information that disappeared on your laptop. There is a serious
threat out there, especially from terror organizations that want to inflict the most damage
they can with no regard for human life. This is not at all meant to be a scare tactic, but the
threat is real and only through a converged solution between the infrastructure owners, the
logical security vendors, and the infrastructure control vendors will a nation’s infrastructure
be protected. After you read this chapter, the benefits of a converged monitoring and detection
solution providing a single pane of glass into both physical and logical threats will be
apparent.
TECHNOLOGY BACKGROUND: PROCESS CONTROL SYSTEMS
Process control systems are commonly referred to as SCADA systems when talked about in the
context of security. Process control systems are designed to allow for automation in industrial
processes, such as controlling the flow of a chemical into a processing plant. Process
control systems are used in automated manufacturing and refinement production. When cars
are manufactured, process control systems measure the speeds at which parts of the manufacturing
process are occurring, and will adjust the rate of the conveyor belts and the delivery
of additional parts to match the current speed of assembly.
Process control systems consist of sensors that are used to detect changes in conditions,
controls which can respond to those changes in conditions, and a human interface that
allows operators to make manual changes. The sensors provide feedback to the control and
the control can reply with commands based on the feedback. A simple example is an air
compression system. A factory is using compressed air as part of its manufacturing process,
and the air should be at a constant pressure of 75 pounds per square inch (psi). When the
pressure drops below 75 psi, the sensor that is monitoring the pressure will report back to
the control. The control can then instruct an air compressor to activate, which builds the
pressure back up. This is an example of an on/off control.
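The on/off behavior just described can be sketched in a few lines of Python. This is only an illustration, not code from any real controller: the function name and the sample readings are our own, and the 75 psi setpoint is taken from the example above.

```python
SETPOINT_PSI = 75.0  # target pressure from the air compression example


def compressor_should_run(pressure_psi: float, setpoint_psi: float = SETPOINT_PSI) -> bool:
    """On/off control: run the compressor only while pressure is below the setpoint."""
    return pressure_psi < setpoint_psi


# Hypothetical sensor readings, in psi
readings = [76.0, 75.0, 74.2, 73.9, 75.5]
commands = [compressor_should_run(p) for p in readings]

for pressure, run in zip(readings, commands):
    print(f"{pressure:5.1f} psi -> compressor {'ON' if run else 'OFF'}")
```

There are only two possible outputs here, which is exactly what makes this an on/off control.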
Another example of a system using on/off controls is a heating system. The thermostat
that is monitoring the temperature reports back to the control, and if the temperature drops
below a certain degree, a command is sent to the heater to turn back on. This is a very
simple example, but it gets the point across. Figure 6.1 shows a simple process control model
for a heating system.
The process control model in Figure 6.1 includes a heater, a fuel line providing fuel to
the heater, a control, and a thermostat. The thermostat will provide the current state to the
control and the control will communicate with a valve on the fuel source that opens and
closes based on commands from the control. When the temperature drops, the valve opens,
allowing fuel to flow to the heater, thus enabling the temperature to rise.
Proportional controls don’t just turn things on and off, but also adjust output when the
sensor is reaching an upper or lower bound. In the heater example, when a sensor detects
that the temperature is dropping, it provides feedback to the control which then directs the
fuel input valve to open a little. Rather than have only two states—on and off—there can be
slight corrective actions that force the dropping temperature to be corrected gradually.
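To make the difference from on/off control concrete, here is a minimal sketch of a proportional control for the heater. The gain value and the temperatures are assumptions chosen for illustration; a real controller would be tuned for its process.

```python
def valve_opening(temperature: float, setpoint: float, gain: float = 0.1) -> float:
    """Proportional control: open the fuel valve in proportion to the shortfall.

    Returns a fraction between 0.0 (closed) and 1.0 (fully open).
    The gain is a hypothetical tuning value.
    """
    error = setpoint - temperature          # positive when it is too cold
    return max(0.0, min(1.0, gain * error)) # clamp to the valve's physical range


# A small drop opens the valve a little; a large drop opens it fully.
slight = valve_opening(temperature=68.0, setpoint=70.0)
large = valve_opening(temperature=40.0, setpoint=70.0)
warm = valve_opening(temperature=72.0, setpoint=70.0)
print(slight, large, warm)
```

The clamp matters: however large the error, the valve cannot open more than fully or close more than completely.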
The controls live within small microcomputers or embedded systems known as programmable
logic controllers (PLCs). They are designed to be very durable and can survive in
extreme temperatures, in water, and in surroundings with a lot of electrical interference. The
controllers are programmed to respond to particular conditions reported via their sensors,
such as a temperature dropping below a certain degree. The PLC can then send out a
signal—either digital or analog—to activate a pump, open a valve, or close a switch. Digital
signals can support only two states—on and off—and are typically used to control valves.
Analog signals can support multiple states, similar to the temperature control on an oven—
the oven is either completely off, or it is anywhere between off and its maximum allowable
temperature. Analog signals are used to control adjustable switches. Think of the air pressure
example again. Assume that a particular processing component needs to have pressure
between 100 psi and 200 psi. The psi begins to drop toward 100 at a very slow rate. The
PLC receives notice from its sensor of the slow drop in pressure, and instead of signaling the
switch to open full blast, it signals it to open a percentage of full blast. In this way, you can
slowly correct a condition rather than brute force a very delicate process. Imagine if the
range had only a 1 psi to 2 psi difference; fine-tuning would be required to keep the pressure
within that range.
Modbus
The communication between PLCs and the valves and switches that they control is typically
implemented using a protocol called Modbus. Modbus is a messaging structure invented by
Modicon in 1979. Its basic capabilities provide a client-server relationship using a very
simple and open standard. Modbus has been implemented for thousands of different devices
and can be considered the lingua franca for devices manufactured by different companies to
communicate. Basically, Modbus is designed to move raw bits or words over a network
regardless of the medium. This means that Modbus can operate over wireless, serial, or even
Transmission Control Protocol/Internet Protocol (TCP/IP) networks, which is the most
common application in more advanced implementations.
Modbus-compatible devices communicate with a master controller that sends commands
and receives data back from the end devices. Each device on the Modbus network has a
unique address and the controller will basically broadcast the command across the entire network,
but only the intended recipient will respond. The command can instruct the device to
take action, such as open or close, and can be used to request feedback. The technology is
very simple and is built for scalability and reliability. Security seems to have been an
afterthought, probably because in the past, the security of SCADA environments has always
been looked at from a physical perspective. For anyone connected to
the network, spoofing a Modbus command would be trivial, especially on an Internet Protocol (IP)-enabled network. One
could deduce that if access to the network isn’t an issue, an attacker could spoof a trusted
source address and broadcast commands to the entire network; something will probably
answer. Requesting feedback would also be a good way to map a process control system network:
Send a broadcast requesting feedback and everything that answers is a Modbus-capable
device. At that point, start sending Modbus function commands,1 and see what types of
devices are out there.
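To see how little there is to a Modbus request, the sketch below builds a register-read request by hand. It assumes the Modbus TCP framing as we understand it (a seven-byte header followed by a function code and its data). The transaction ID, device address, and register numbers are made up for illustration, and the code only builds the bytes; it does not send anything.

```python
import struct

READ_REGISTERS = 0x03  # function code for reading registers


def build_read_request(transaction_id: int, unit_id: int,
                       start_address: int, quantity: int) -> bytes:
    """Build a Modbus TCP register-read request as raw bytes."""
    # Request body: function code, first register, number of registers
    pdu = struct.pack(">BHH", READ_REGISTERS, start_address, quantity)
    # Header: transaction ID, protocol ID (0 for Modbus),
    # count of bytes that follow (unit ID + body), unit (device) address
    header = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return header + pdu


# Hypothetical values: device address 17, two registers starting at 0
frame = build_read_request(transaction_id=1, unit_id=17, start_address=0, quantity=2)
print(frame.hex())
```

Notice what is missing: the 12 bytes contain an address and a command, but no password, key, or signature that would tie the request to a trusted controller.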
Figure 6.2 shows a Modbus query in Ethereal, a packet capture and analysis tool available
with most Linux distributions (see www.ethereal.com).
Figure 6.2 shows the Ethereal console displaying a packet capture from a Modbus network.
To explain what we are looking at, let’s talk about the image in three panels: top,
middle, and bottom.
The top pane shows packet headers, meaning it’s only showing basic information about
the communication, such as the source and destination IP addresses. The several headers that
are shown seem to be a series of queries and responses between 1.1.1.2 and 64.69.103.153.
The middle pane allows an analyst to drill down into certain layers of the packet, such as the
Ethernet layer or the IP layer, which is the first expanded layer in the example. The second
expanded layer is the application protocol that is using TCP/IP, in this case Modbus. When
expanding the Modbus protocol we can see that this is a query from a controller to a
Modbus-capable device requesting that it read data from its registers. The logical response to
this request is to send the values that the registers returned. The bottom pane shows the
actual payload of the packet in which the hex value for the Read multiple registers command
is highlighted. The commands are plainly sent in clear text, with no encryption
or authentication, so there is little doubt that an attacker could forge a packet such as this.
Packet-spoofing programs and skilled hackers are able to construct packets that would look
just like this, and without any authentication, the remote system will believe that the packet
is coming from a trusted controller.
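What the middle pane of a packet analyzer does for the Modbus layer can be approximated in a few lines. The sketch below assumes Modbus TCP framing and decodes a sample request; the sample bytes are our own and are not taken from Figure 6.2.

```python
import struct


def parse_modbus_tcp(frame: bytes) -> dict:
    """Split a Modbus TCP frame into its header fields, function code, and data."""
    transaction_id, protocol_id, length, unit_id = struct.unpack(">HHHB", frame[:7])
    return {
        "transaction_id": transaction_id,
        "protocol_id": protocol_id,
        "length": length,
        "unit_id": unit_id,
        "function_code": frame[7],
        "data": frame[8:],
    }


# Hypothetical captured request: read two registers from device 17
sample = bytes.fromhex("000100000006110300000002")
fields = parse_modbus_tcp(sample)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Every field decodes directly from the wire with no key material involved, which is why anyone who can capture the traffic can also read and imitate it.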
PROGRAMMABLE LOGIC CONTROLLERS
PLCs come in many shapes and sizes, depending on the application, but to give you a sense
of the form factor, Figure 6.3 shows several PLCs manufactured by Direct LOGIC.
If you look closely at Figure 6.3, you can see that most of the PLCs have both serial and
Ethernet adapters, indicated by the white circles. They are also commonly equipped with
modems for cases where communications are available only via telephony networks.
Electrical inputs allow the PLCs to communicate with the different systems that they control.
The valves and switches are connected using electrical circuits, so there need to be multiple
inputs on the PLC.
The theory of process control systems and the way in which they communicate is fairly
simple, but when applied to a massive processing application such as an oil refinery, there are
hundreds of thousands of sensors, switches, controllers, and valves. The scale of the process
makes it complex, and thus, challenges arise. The oil and gas industry in particular uses
SCADA technology, which allows the monitoring and control of different aspects of the
processing facilities from a centralized location.
SCADA
SCADA is a subset of process control systems used by the oil and gas industry. SCADA is an
industrial measurement and control system, much like process control, except that the process
control system is typically contained in one facility, such as a factory or a manufacturing
plant, whereas SCADA systems tend to be geographically dispersed. SCADA systems are
designed to enable the monitoring and control of processing systems that may be thousands
of miles away from the controller. SCADA systems are typically architected in a client-server
topology in which you have the controller, or the master terminal unit (MTU), connected
to hundreds or thousands of data-gathering or control units known as remote terminal units
(RTUs). SCADA systems are designed to be rugged and durable and can communicate over
long distances. Imagine an oil pipeline spanning hundreds or thousands of miles. The flow of
oil needs to be monitored and controlled at these remote and often inaccessible locations or
substations.
SCADA is used to monitor and control processing equipment. Some of the controls are
initiated by operators who are monitoring huge process control dashboards and other commands
are automatically issued based on feedback to the controller received from the RTUs.
The RTUs are responsible for collecting or gathering data and sending it back to the
MTUs. The MTUs will process the data to look for alarm conditions; if the MTUs detect
such conditions, either they will automatically send the appropriate command back to the
RTU, or an operator will handle the situation manually. The data received from the RTUs is
typically displayed on a dashboard in a monitoring center for operators to respond to, if necessary.
The data may consist of flow graphs, switches that have been turned on or off, or the
counters of a particular process.
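The alarm check performed by the MTU can be pictured as a simple range test over incoming data. The sketch below is our own simplification: the Reading type, the RTU numbers, and the limits are hypothetical, and a real MTU would apply per-sensor limits and far richer logic.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    rtu_id: int    # which RTU reported the value
    value: float   # the measured value, in whatever unit the sensor uses


def find_alarms(readings, low: float, high: float):
    """Return the readings that fall outside the allowed operating range."""
    return [r for r in readings if not (low <= r.value <= high)]


# Hypothetical data from three RTUs, with an allowed range of 50 to 100
incoming = [Reading(1, 80.0), Reading(2, 120.0), Reading(3, 45.0)]
alarms = find_alarms(incoming, low=50.0, high=100.0)

for alarm in alarms:
    print(f"ALARM: RTU {alarm.rtu_id} reported {alarm.value}")
```

Each alarm would then either trigger an automatic command back to the RTU or be shown on the dashboard for an operator to handle.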
Figure 6.4 shows a simplified example of a SCADA topology. It shows several components
that are part of almost any SCADA network.
Figure 6.4 shows us several things. First, the brain of the operation is the MTU, and this
is where an operator can connect to view the current status of the processing network. We
will look a little closer at an operator’s view a bit later. The next thing to note is the gray
box on the right, which represents a process control system network. This could be an oil
refinery or an electrical processing plant. Within the processing network, there are sensors
connected to RTUs, responsible for controlling process equipment at remote sites and
acquiring data from the equipment to send back to the MTU. The network also consists of
flow computers which have sensors that monitor the flow of material through lines, be it
gas, oil, or electricity. All of these systems typically communicate back to the MTU using
Modbus over varying media. We can also see in Figure 6.4 that the medium used to transmit
data ranges from Frame Relay to satellite or wireless; even modems are used in some
instances where other means of communication are not available.
RTUs
Found in nearly all SCADA implementations, an RTU is a small computer that is designed
to withstand harsh environmental factors such as water, salt, humidity, temperature, dirt, and
dust. For example, RTUs should be able to operate at -10° C and up to 65° C. This is a
range of below freezing to 150° F. An RTU consists of a real-time clock, input/output interfaces,
electrical spike protectors, a restart timer to ensure that the system restarts if it fails, and
a power supply with a battery backup in case power to the system is lost. It also includes
communications ports—either serial, Ethernet, or a modem—along with volatile and nonvolatile
memory. The nonvolatile memory is used in case communication with the controller is lost: the
system will write its data to memory and then send it to the controller once communications
have been reestablished.
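This store-and-forward behavior can be sketched as a queue that is drained whenever the link is available. The class below is purely illustrative: an in-memory deque stands in for the RTU's nonvolatile memory, and a plain list stands in for the controller.

```python
from collections import deque


class RtuBuffer:
    """Hold samples while the link is down; deliver them in order once it is up."""

    def __init__(self):
        self.pending = deque()   # stand-in for nonvolatile memory
        self.delivered = []      # stand-in for data received by the controller

    def record(self, sample, link_up: bool) -> None:
        """Store a sample, then forward everything stored if the link is up."""
        self.pending.append(sample)
        if link_up:
            self.flush()

    def flush(self) -> None:
        """Send stored samples to the controller, oldest first."""
        while self.pending:
            self.delivered.append(self.pending.popleft())


buffer = RtuBuffer()
buffer.record(101.5, link_up=False)  # communications lost
buffer.record(101.7, link_up=False)
buffer.record(101.6, link_up=True)   # communications reestablished
print(buffer.delivered)
```

Writing every sample to the buffer first, even when the link is up, keeps the ordering simple and means a failure in the middle of a send loses nothing.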
Figure 6.5 shows the inner workings of a typical RTU. All of these components are
generally contained within a very durable case that is designed to withstand the extreme
conditions mentioned earlier. Note that the RTU can also be connected directly to a PLC,
so based on the feedback and the data collected the PLC can make changes to process
components.
Figure 6.6 shows an RTU manufactured by Control Wave. This RTU has built-in
Ethernet and even a File Transfer Protocol (FTP) server. Built-in FTP? That should raise an
immediate red flag. We all know how secure FTP is. It probably wouldn’t be too hard to plug
into this unit via the Ethernet port and compromise the FTP server. Just visit www.sans.org
and search for FTP vulnerabilities. FTP is probably one of the most insecure protocols out
there. And if you think that a hacker would have to get inside the case to access the
Ethernet port, guess what: An attacker would have all the time in the world to do this
because these systems are usually located in remote locations such as swamps and deserts.
Once the attacker is on the system, it’s likely that he could send falsified data back to the
controller by sniffing some sample traffic and adjusting the values. This could cause all sorts
of trouble at a processing plant; indeed, the consequences could be catastrophic.
A skilled attacker could probably compromise one of these systems in a denial of service
(DoS) manner, or actually gain access to the system using a buffer overflow attack on the
FTP service. If he gained physical access to the system, he could connect a laptop via the
Ethernet port and examine traffic to get an IP address. Once the attacker can communicate
with the system, he could conduct a port scan to see what services are running on the
system. If he saw FTP, a knowledgeable hacker would probably target that service for
exploitation. Depending on his goals, he could choose to either take the system offline with
a DoS attack, or if his intent was to gain access, he could use a buffer overflow type exploit,
gaining him console or command-line access to the system. It really depends on the version
and distribution of the FTP server that’s installed as to the extent of the capabilities and the
impact of an attack.
Figure 6.7 shows another RTU. Although this RTU doesn’t come with a hardened case,
you could purchase one for it to protect it from a sledgehammer for at least a minute. A
sledgehammer or other destructive tool would quickly and efficiently provide an attacker
physical access to the RTU and allow him to access communication ports.
As you can see from the image of TUX, the penguin, in the lower right-hand side of the
image, the Linux operating system is running on this RTU. Are you wondering what the
patching process is on these systems? Automatic updates are probably not an option, and because
this RTU is running an open version of Linux, it is vulnerable to every exploit to which
the particular version of Linux is vulnerable. If one of these were available, it would be interesting
to attach it to a network and scan it with Nmap, an open source network scanner
available from www.insecure.org, to see what’s running on it. There is even a phone number
for support right on the unit. Would they give out the default password? Social engineers
can be very convincing: "I’m out at pipeline 2234, it’s going to blow, and I have to get access
to the system! I need the admin password!" Numerous problems present themselves
regarding RTUs. We will address some of them in later sections of this chapter that look at
challenges, threats, and solutions.
Just as an example, in 2006, according to Internet Security Systems X-Force, an average
of around 600 vulnerabilities was released per month. Figure 6.8 shows the breakdown by
month.
Of course, not all of these vulnerabilities were directly related to the Linux operating
system, but ISS also reported that 1.2 percent were directly related to the Linux kernel. Out
of the 7,247 vulnerabilities, ISS reported that almost 100 directly related to the underlying
Linux kernel. It also noted that the top 10 vendors in terms of vulnerability count account
for only 14 percent of the more than 7,000 vulnerabilities reported. This means that 6,233
vulnerabilities are targeting other applications and systems. That’s an alarming number of
vulnerabilities in the applications that are running on the operating systems where the vulnerability
is not directly associated with the operating system vendor. Especially in the Linux
world, where many of the applications are open source and not developed by a particular
vendor, it’s hard to know exactly what you’re vulnerable to without vulnerability assessment
practices. It’s scary to not think about patching these systems.
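The figures quoted above can be checked with some quick arithmetic. The inputs below are the numbers given in the text (7,247 vulnerabilities, 1.2 percent, 14 percent); the results are approximations, so small differences from the quoted counts come down to rounding.

```python
total_vulnerabilities = 7247   # total for 2006, as quoted above
kernel_share = 0.012           # 1.2 percent related to the Linux kernel
top_ten_share = 0.14           # 14 percent attributed to the top 10 vendors

monthly_average = total_vulnerabilities / 12
kernel_related = total_vulnerabilities * kernel_share
outside_top_ten = total_vulnerabilities * (1 - top_ten_share)

print(f"Average per month:       {monthly_average:.0f}")
print(f"Linux kernel related:    {kernel_related:.0f}")
print(f"Outside top 10 vendors:  {outside_top_ten:.0f}")
```

The results line up with the text: roughly 600 per month, a little under 100 for the kernel, and more than 6,200 outside the top 10 vendors.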
Flow Computers
Flow computers, as the name suggests, are used to measure flows through lines. These could
be gas lines, oil lines, water lines, or electrical lines. A flow computer’s sole purpose is to
report back to the MTU the current flow rates for the line that it is monitoring. As you can
imagine, flow rates should never drop to zero, and they should typically have an average
operating flow that does not deviate much. Statistical data monitors such as the ones discussed in Chapter 13 can be used to alert on deviations from an average flow if an
Enterprise Security Management (ESM) system were to collect this flow information.
Similar technology exists on the flow computer to detect and report alarm conditions based
on feedback from its sensors that sit on transport lines. Figure 6.9 shows what a flow computer
looks like up close, and in the field.
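One simple way to alert on deviations from an average flow is to compare each new reading with the mean and standard deviation of recent history. This is our own minimal sketch, not the method used by any particular ESM product or flow computer; the threshold of three standard deviations and the sample values are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, current: float, threshold: float = 3.0) -> bool:
    """Flag a reading that lies more than `threshold` standard deviations from the mean."""
    average = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any change at all is a deviation
        return current != average
    return abs(current - average) > threshold * spread


# Hypothetical recent flow rates for one line
recent_flow = [100.0, 101.0, 99.0, 100.5, 99.5]

normal = is_anomalous(recent_flow, 100.2)
stopped = is_anomalous(recent_flow, 0.0)
print(normal, stopped)
```

A flow that drops to zero stands out immediately, while ordinary variation around the average does not raise an alert.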
As you can see in the image on the left, the flow sensor has a tap into the medium
through which material is being transported. In this case, it’s an oil line. The most common
form of flow monitoring is done using differential pressure calculations. The basic idea is that
a plate is inserted into the flow and the pressure hitting the plate is measured. This provides
feedback regarding how much of the material is flowing through the pipe at any given time.
The image on the right shows the type of remote environment in which something like this
may be deployed, and in the types of conditions these systems are expected to perform. It’s
no wonder that when they were building systems that can work in the snow, buffer overflows were not a huge consideration. Now they need to be. Both RTUs and flow sensors
report back to the MTU, the centralized command and control center.
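For differential pressure measurement, the usual relationship is that flow rate is proportional to the square root of the pressure difference across the plate. The sketch below shows only that relationship. The constant k is a hypothetical calibration factor that lumps together pipe geometry and fluid density, which a real flow computer would derive from the installation.

```python
import math


def flow_rate(differential_pressure: float, k: float = 1.0) -> float:
    """Estimate flow from the pressure difference across a plate.

    Uses the square-root relationship: flow = k * sqrt(differential pressure).
    k is a hypothetical calibration constant for a specific installation.
    """
    if differential_pressure < 0:
        raise ValueError("differential pressure must not be negative")
    return k * math.sqrt(differential_pressure)


base = flow_rate(4.0, k=2.0)
quadrupled = flow_rate(16.0, k=2.0)
print(base, quadrupled)
```

Because of the square root, the pressure difference has to quadruple for the flow to double, so small errors in a low pressure reading can translate into noticeable errors in reported flow.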
MTUs and Operator Consoles
MTUs are the single point of human interaction for the entire operation. Typically located at the
operation’s central monitoring facilities, MTUs are used to monitor and control the RTUs and sensors
with which they communicate, and to collect data from the RTUs and
the flow computers. The MTU is also what feeds and responds to the operator’s console. The
console has two uses. First, it is the human interface to the SCADA systems. Operators can
manually respond to alarm conditions by issuing commands to open and close switches, and
turn equipment or control valves on and off. The console is used to issue the command at
the operator’s will, and the command is sent to the MTU. The MTU then sends the command
to the appropriate RTU from which the actual valve or switch is directed to open,
close, or adjust. (Actually, the command is broadcast across the entire network, and the RTU
or PLC with the right address will perform the action.) Again, we see a weakness of the
protocol: broadcasting commands and waiting for the addressed device to respond is neither secure nor
efficient.
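The broadcast-and-filter behavior can be modeled in a few lines. This is a toy model with made-up addresses and commands, not an implementation of any real protocol: every device sees every command, and only the one whose address matches acts on it.

```python
class Rtu:
    """A toy device that acts only on commands carrying its own address."""

    def __init__(self, address: int):
        self.address = address
        self.state = "closed"

    def receive(self, target_address: int, command: str) -> bool:
        """Apply the command if it is addressed to this device."""
        if target_address != self.address:
            return False       # not for us, ignore it
        self.state = command   # note: the sender is never checked
        return True


def broadcast(devices, target_address: int, command: str):
    """Deliver a command to every device; return the addresses that acted on it."""
    return [d.address for d in devices if d.receive(target_address, command)]


network = [Rtu(1), Rtu(2), Rtu(3)]
responders = broadcast(network, target_address=2, command="open")
print(responders, [d.state for d in network])
```

The only test a device applies is whether the address matches. Nothing in the exchange establishes who sent the command, which is the weakness noted above.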
Now let’s look at what an operator would actually see when he is sitting in front of his
terminal. The following images represent products from several different companies that produce
MTUs and human machine interface applications. Figure 6.10 is from Iconics. The
view we see is an alarm window that alerts the operator to different conditions in a water
processing plant.
Initially, the darker boxes in Figure 6.10 were red, signifying critical alarms. This view
allows the operator to see a near-real-time view of all alerts that have occurred in the past
several hours. Because this is a water processing plant, the alarm conditions relate to different
parts of the process which cleans the water. For example, we can see chemical readings for
alkaline levels. We can also see that some of the water levels in the tanks are too low. We also
see something called a "Warp core breach," which doesn’t sound like a good thing. So, how
does an operator respond to these alarms? Well, the ones that are not automatically adjusted
using PLCs will need human interaction. This is where a human machine interface (HMI) is
needed, allowing an operator to select a portion of the process, such as a valve, and issue
commands to adjust it.
An HMI allows an operator to control particular parts of the process network from his
console. Rather than walk, drive, or fly to the system, the operator can simply push a button
to correct alarm conditions. Figure 6.11 is also from Iconics; it shows the company’s HMI
interface to the water processing plant that is generating these alarms.
In Figure 6.11, we can see a menu of commands to the right of what appears to be a
processing tank. This menu would appear after an operator has requested an action to be
performed. (Hopefully this is in response to the Warp core breach.)
Figure 6.12 shows what an operator sees when monitoring data received from flow
computers. In this example, the flow computers are monitoring the flow of different chemicals
through a processing plant. If you remember the discussions on statistical monitoring,
this should ring a bell because the technology is very similar. Instead of monitoring for
spikes and drops of logical data, such as traffic from an IP address, this application is monitoring
for spikes and drops in the amount of chemical that is flowing through the processing
lines.
In Figure 6.12, we see different shaded lines (originally in color) representing the
volume of chemical flowing through a monitored line. There is one line for each chemical
being monitored. We can derive from the figure that at the third block from the right, there
is an extreme drop in the volume of a particular chemical flowing through the system. This
type of view can be very useful for trending as well as for looking for anomalous patterns in
flow which could indicate problems.
SCADA is definitely a requirement for the operator who is monitoring thousands of circuits
or valves in a processing plant. It’s a very similar concept to the ESM systems of the
logical security world—taking in large amounts of data and presenting it in a way that allows
for human interpretation.
There is obviously no way an operator could actually look at all these alarms by going
to a console for each, just as a security analyst cannot use multiple consoles to look at intrusion
detection or firewall data. Although correlation isn’t involved here, it does leave the
door open for integration with a correlation engine. If you could correlate failed logons to
an RTU and then a successful logon, you have probably been the victim of a brute force
attack. The power of correlation in the SCADA world is a new frontier, and we will
examine some bleeding-edge examples in the use-case section. The following quote by
Howard Schmidt in New Scientist magazine sums up SCADA and process control: "It used to
be the case that we’d open floodgates by turning a wheel; today it’s done through a keyboard,
often through a remote system."
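The failed-then-successful logon rule mentioned above is easy to express as code. The sketch below is a simplified illustration of the idea, not the rule language of any correlation engine; the event format, the host names, and the threshold of three failures are all assumptions.

```python
def detect_brute_force(events, threshold: int = 3):
    """Flag (source, target) pairs where repeated failures are followed by a success.

    `events` is an ordered list of (source, target, outcome) tuples,
    where outcome is either "fail" or "success".
    """
    failures = {}
    alerts = []
    for source, target, outcome in events:
        key = (source, target)
        if outcome == "fail":
            failures[key] = failures.get(key, 0) + 1
        elif outcome == "success":
            if failures.get(key, 0) >= threshold:
                alerts.append(key)
            failures[key] = 0  # a success resets the count
    return alerts


# Hypothetical logon events against an RTU
logons = [
    ("10.0.0.5", "rtu-7", "fail"),
    ("10.0.0.5", "rtu-7", "fail"),
    ("10.0.0.5", "rtu-7", "fail"),
    ("10.0.0.5", "rtu-7", "success"),
    ("10.0.0.9", "rtu-7", "success"),  # ordinary logon, no prior failures
]
suspicious = detect_brute_force(logons)
print(suspicious)
```

A production rule would also bound the failures to a time window, but even this simple version separates the suspicious source from the ordinary one.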
A SCADA example that must be included in any conversation worth having is a
SCADA implementation in a brewery. There is an interesting article in which a brewery in
the United Kingdom has implemented five SCADA systems to optimize processing and
allow for real-time decision making. It’s an example of SCADA technology leading to operational
efficiencies in other areas besides oil and gas. The article, located at
www.industrialnetworking.co.uk/mag/v7-2/f_bottling_1.html, is worth a read.
WHY CONVERGENCE?
Unfortunately, in the world we live in today, certain organizations and individuals would
love to terrorize a nation by disrupting the processing of some part of the nation’s critical
infrastructure. Because of this, it is imperative that as a community, we investigate and
respond to the threats and challenges that exist. In this section, we will look at some of the
myths surrounding SCADA security as well as the stakeholders involved with trying to protect
SCADA and process control networks. In order to protect the critical infrastructure,
there needs to be collaboration across different organizations, from the Department of
Homeland Security (DHS) to the industrial manufacturing industry, as well as technology
vendors. SCADA technology developers and the people who use the equipment haven’t in
the past seen much need for security because their main focus is on reliability, and the capability
for systems to be up on a 24/7 basis. In a presentation by Dr. Paul Dorey, vice president
and chief information security officer (CISO) at British Petroleum, given in 2006 at
the Process Control Systems Forum Conference, many of the common myths surrounding
SCADA security were identified. We list the myths here, and provide our own explanations
to clarify them.
Myth 1: "Our process control systems are safe because they are all isolated"
According to surveys, 89 percent are connected. So what does this mean?
It means that almost all SCADA networks are in one way or another connected
back to a corporate network. It’s the old problem of security versus convenience, or
ease of getting a job done. If an admin had to cross an air gap with a CD or other
type of media every time a file needed to move into the SCADA network or out of it,
the job would be very painful. Furthermore, if the
networks were air-gapped (meaning there was absolutely no connection between
the two networks), operators would have to use their terminals only for monitoring
the SCADA processes. They would have no Internet access, no e-mail, and none of
the other conveniences of a connected workstation. It used to be the case
that SCADA networks were isolated, but with modern connection requirements,
this is no longer the case.
Myth 2: "My networks aren’t connected; my server uses a separate network
to connect to the process control network and the corporate network"
This has to be one of the biggest violations imaginable. This means the
user has two interfaces in her computer: one on the corporate network, which is
where an infection, virus, or worm could easily come from; and the other on the
process control network, where the virus or worm will likely travel to once it
infects the host system. This also means that if the user’s computer is ever compromised
by a malicious insider or even an outsider, the attacker will have full-blown
access to the process control network. This should be a direct security violation.
Myth 3: "Antivirus can’t be applied" Many people believe that vendors will
not support installation of antivirus applications on a SCADA system. According to
the presentation, antivirus is supported in more cases than expected. We will return to
this issue in the "Threats and Challenges" section of this chapter, which covers
vendors that stop supporting the software or platform if security measures are put into
place. Again, this is one of the reasons stakeholders from many different organizations
need to get together and get these vendors to take the appropriate action and
support security.
Myth 4: "Our system isn’t vulnerable, as it uses proprietary protocols"
Proprietary protocols may be the case with SCADA-specific applications, but as we
discussed earlier, many systems are running on common operating systems such as
Windows and Linux, and services such as FTP are installed. Just because the protocol
used for the SCADA application is fairly obscure doesn’t mean
that the operating system and the vulnerabilities associated with it are obscure too. Myth #3
mentions that antivirus software can’t be installed, which probably means that not
much is done as far as disabling unnecessary services or doing any hardening procedures
to these systems.
Myth 5: "I have a firewall, so I’m safe" See Myth #2; a host with an interface on
each network bypasses the firewall completely. Furthermore, firewalls can’t stop
users from plugging laptops directly into process control networks, and firewall
rules tend to be modified for
convenience. If an admin needs to connect to a system and he is also in charge of
the firewall, good money says there will be a firewall rule allowing him access. This
is an example of the failure of separation of duties. The admin for the firewall or
the person responsible for securing the environment should never be the same
group or person that has to work with the systems.
It’s amazing to look at some of the thought processes that are going on in the industry.
The different schools of thinking are very apparent. If you’re coming from the world of the
process control engineer, it’s not likely that you have ever even touched a firewall or that
you understand much about IP and logical security. If you are coming from the logical security
side of things, SCADA, process control systems, process control networks, PLC, and the
other technologies mentioned probably seem foreign to you, and the catastrophes that could
occur if these systems were to fail may seem far-fetched. Someone who has worked with
process control systems their whole career understands the implications of system failure,
which leads to the school of thinking that states that if a vendor says it doesn’t support the
system and doesn’t know the results of applying a security update, hesitation to install an
update is understandable. So how can these issues be addressed to provide security while not
breaking applications that critical processes depend on? How can there be a common
ground between vendors and the oil and gas industry? The only way is with participation
from many different organizations.
To bridge the gap between vendors and industry, there must be a collaboration that
involves players from different backgrounds, with different skill sets and different schools of
thought. The owners of the different industry sectors need to work together to demand
from process control vendors that security be taken seriously. This includes the chemical
industry, the oil and gas industry, nuclear power, water, and electric. Next, the major SCADA
and process control vendors need to get together and work with the industry sectors to
deliver secure products. These vendors would include Honeywell, Siemens, Rockwell,
Invensys, and Emerson, which are some of the major players in the process control system
field. There also needs to be support from academia. Several key organizations would include
the International Federation of Automatic Control (IFAC), the Institute for Information
Infrastructure Protection (I3P), and the American Automatic Control Council (A2C2). The
academics represent the scientists and engineers who are developing the leading-edge technologies
for the future of process control. If they are involved and are aware of the concerns,
they will take these concerns into consideration when they are inventing and designing new
products, and security will not be an afterthought.
Figure 6.13 shows a mockup of a diagram used in a presentation given at a Process
Control Systems Forum Conference in 2006 by Michael Torppey, who is a technical manager
at PCSF and a senior principal at Mitretek Systems. The figure displays the different
cross-sector groups involved, as well as the private security industry.
In Figure 6.13, the link graph starts in the center with the stakeholders in an effort to
protect critical infrastructure. Surrounding the center are the different sectors, such as
Academia and the Department of Homeland Security. Within each sector, the individual
groups or organizations that play a critical role are identified. National labs such as Sandia
and Lawrence Livermore are also involved in the overall cross-sector collaboration. The labs’
influence provides expert advice and direction to the overall strategy of the effort. Sandia
Labs, in fact, was deeply involved in a project sponsored by DHS where industry and security
vendors were brought together to try to detect a series of potential attacks. DHS is, of
expertise to the oil and gas industry, to pushing the issues up the political ladder toward
presidential sponsorship. The standards committees are very important as well, because the
research done will become best practices which then are implemented as standards where
they are accessible for others to follow as guidelines.
The only sector that was left out in the original diagram and that should be included is
the one comprising logical security vendors, the same vendors that have become leaders and
experts in protecting logical IP-based infrastructures. These companies and individuals are
key in bridging the gap. It would be a reinvention of the wheel if process control system
owners didn’t follow some of the same best practices that are used to secure financial records
at a bank or classified information within government. The advances made over the past several
years in the areas of perimeter protection have been working fairly well. It’s been several
years since the last virus or worm has caused any real damage, and you hear far less about
Web sites being hacked into. Technology has gotten better and awareness has been raised.
With that in mind, it’s time to look at some of the specific threats and challenges facing
organizations as we move to a converged secure SCADA world.
THREATS AND CHALLENGES
This section begins with a quote from a popular process control system vendor. The quote
sets the stage, as well as provides insight as to how bad the problems are that need to be
addressed in today’s process control environments:
"Security has become an increasingly critical factor and will continue to
be essential to public utility agencies. Wonderware offers robust data-level
security, in addition to the standard security features provided by
the Microsoft Windows operating system, to protect your control system
from cyber or physical risks."
(http://content.wonderware.com/NR/rdonlyres/83A979AD-A805-41A3-BC4F-D021C692F6D1/591/scada_water_4pg_rev5_final.pdf)
This is just the tip of the iceberg. The standard security features offered by Windows?
You’ve got to be kidding, right? It’s not a joke. This is the level of security awareness that
you will find when talking to some of the process control system vendors. In this section, we
will look at some of the specific threats and challenges facing process control and,
specifically, the oil and gas industry. We will also hear from two industry experts who
have worked or still work in a SCADA process control environment.
Interconnectivity
One of the first challenges or issues is the interconnectivity of process control networks and
corporate networks. As we learned earlier in the chapter, some 89 percent of process control
networks are actually connected to corporate networks. In the past, SCADA networks were
not physically connected to corporate networks. Refineries used to have completely
autonomous or self-sufficient networks. Previously, there wasn’t a need to connect into
refineries from remote locations; they didn’t need Internet access or e-mail. Nowadays,
refineries, factories, and manufacturing plants are interconnected. This means there is an
Internet connection, and a connection to headquarters for remote access. Now think of all
the connections back at corporate: business partners, virtual private networks (VPNs), and
wireless. All of these create a window through which an attacker can penetrate. Once on the
corporate network, the attacker is only one attack away from full access to the process control
network.
Another significant challenge that creates many weaknesses is that the industry is standardizing
on common operating systems and protocols. The systems and hardware being
developed and manufactured today are designed to run on Windows and Linux. These
systems are also using TCP/IP as a means of communication. This is a double-edged sword.
There is a definite upside because at least the vulnerabilities and weaknesses are somewhat
known and can be fixed, but known vulnerabilities also represent the downside. If the vulnerabilities
are known to security professionals, you can be sure they are also known to
the bad guys. With common operating systems come common problems, such as shipping
with default services enabled. In most cases, vulnerabilities are associated with these services
and they need to be disabled, but without a hardening policy in place, it’s likely that
they are not.
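A hardening policy can be reduced to a baseline of the services each class of system is allowed to expose, with anything else reported for review. The sketch below assumes a hypothetical baseline keyed by system role; the roles and port numbers are illustrative and are not taken from any vendor's guidance.

```python
# Hypothetical hardening baseline: the only TCP ports each role may expose.
# 3389 is the usual Remote Desktop port and 1433 the usual SQL Server port;
# the roles themselves are made up for this example.
BASELINE = {
    "hmi": {3389},
    "historian": {1433},
}


def unexpected_services(role, listening_ports):
    """Return the open ports that are not in the baseline for this role.

    A role with no baseline entry is treated as allowing nothing, so every
    listening port on such a system is reported.
    """
    allowed = BASELINE.get(role, set())
    return sorted(set(listening_ports) - allowed)


if __name__ == "__main__":
    # An HMI still shipping with FTP (21) and Telnet (23) enabled.
    print(unexpected_services("hmi", [21, 23, 3389]))  # prints [21, 23]
```

The list of listening ports would come from the host itself or from a configuration review, which avoids the scanning risks discussed later in this section.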
It’s been said that some vendors don’t support the patching or upgrading of systems. It’s
not uncommon to go into a process control environment and find that most systems are still
running Windows NT SP 4. Not only do the vendors not want the systems patched, but
also management doesn’t want downtime, particularly when dealing with these critical systems,
because downtime costs money. In most IT organizations, there is redundancy or
blackout periods where patches can be applied and systems can be rebooted, but in processing
and refinement, every minute a system is down means less product, which means less
revenue. If the systems are not able to be patched or updated, the process control systems are
going to remain vulnerable to attacks that already exist and are easily obtainable by even the
least sophisticated attackers. Host-based firewalls and host-based intrusion
detection/prevention software (HIDS) are a good idea, although where patches and updates are
not supported, host-based firewalls and HIDS are almost certainly not supported either.
Can you determine system security if you can’t test the system? This is another
problem in the SCADA world. Vulnerability scanning tools such as Tenable Network
Security’s Nessus will cause SCADA applications and older operating systems to freeze or
crash completely. Scanning tools commonly work by sending combinations of IP packets
to a network port in hopes of soliciting a response. Sometimes these packets are out of
band (OOB), meaning that they don’t adhere to the specifications for IP. The reason for
this is to elicit a specific response and match it against the known responses of either the operating
system in general (known as OS fingerprinting) or a service that’s listening on
the network. This type of scanning has been known to break SCADA systems. If the
developer built the network daemon to speak IP according to the standard and didn’t take
into account error conditions or bad packets, the application can behave unpredictably
when it tries to process these malformed packets. Commonly this leads to the
application simply crashing or needing to be restarted. If you can’t perform regular vulnerability
scans in an environment such as this, how can a security posture be evaluated? The
application and hardware vendors need to understand that vulnerability scanning is a critical
component in securing these systems.
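The crash scenario comes down to a daemon that trusts its input. As an application-layer illustration, here is a sketch of a parser for a Modbus/TCP frame that validates the header before using it and rejects anything malformed instead of failing. The header layout (a 7-byte MBAP header and a 260-byte maximum frame) follows the published Modbus/TCP framing; the function names, the exception class, and the decision to drop bad frames silently are assumptions made for this example.

```python
import struct

MBAP_HEADER_LEN = 7   # transaction id (2) + protocol id (2) + length (2) + unit id (1)
MAX_ADU_LEN = 260     # 7-byte MBAP header plus a PDU of at most 253 bytes


class MalformedFrame(ValueError):
    """Raised for any frame that does not conform to the expected layout."""


def parse_modbus_tcp(frame: bytes) -> dict:
    """Parse one Modbus/TCP frame, raising MalformedFrame on bad input."""
    if len(frame) < MBAP_HEADER_LEN + 1:
        raise MalformedFrame("frame too short to hold a header and function code")
    if len(frame) > MAX_ADU_LEN:
        raise MalformedFrame("frame exceeds the maximum Modbus/TCP size")

    transaction_id, protocol_id, length, unit_id = struct.unpack(
        ">HHHB", frame[:MBAP_HEADER_LEN]
    )
    if protocol_id != 0:
        raise MalformedFrame("protocol id must be 0 for Modbus")
    # The length field counts every byte after it: the unit id plus the PDU.
    if length != len(frame) - 6:
        raise MalformedFrame("length field does not match the bytes received")

    return {
        "transaction_id": transaction_id,
        "unit_id": unit_id,
        "function_code": frame[MBAP_HEADER_LEN],
        "data": frame[MBAP_HEADER_LEN + 1:],
    }


def handle_frame(frame: bytes):
    """Return the parsed frame, or None if it is malformed.

    Dropping the frame keeps the service running when a scanner (or an
    attacker) sends something unexpected.
    """
    try:
        return parse_modbus_tcp(frame)
    except MalformedFrame:
        return None
```

A daemon written this way can be scanned without falling over, which is the property vendors need to provide before regular vulnerability scans become practical.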
Tenable Network Security has begun to address this problem by working with standards
companies, such as OPC and Modicon, to develop a series of checks or plug-ins that will
check for SCADA security issues specifically, without disabling or breaking the application.
This shows the type of collaboration that is needed in the industry to move forward and
protect the critical infrastructure. The checks include default usernames and passwords, insecure
versions of protocols, problems related to Modbus, and checks to determine the applications
that are running on the systems. These checks have been specifically designed to not
break the SCADA applications, but they don’t take into account the inability to scan for
normal operating system vulnerabilities. Most scanners have settings for performing safe
checks, but it seems that even safe checks may harm SCADA applications and take down a
system. Figure 6.14 shows the SCADA plug-in selection in Nessus.
From this screen, the user can select the plug-ins that are part of a particular scan or
scanning policy, and then save his selection for future scans. The idea would be to have different
policies based on the systems being scanned. This is a great start and is a good indication
of the direction security vendors are taking. It really shows that people are concerned
and are willing to help by adapting their products to look for problems specific to SCADA
environments.
SCADA vulnerabilities are no longer a secret; there is growing interest among the hacker
community. At Black Hat Federal, a hacking/security conference sponsored by the Black
Hat organization, there were talks about how to break into SCADA systems. Also, at
Toorcon, there was a presentation which included instructions for how to attack some of the
commonly used protocols in SCADA or process control network environments such as
Modbus. So, now we have not only the fact that the systems are vulnerable, but also how to
exploit the protocols that used to make the industry feel secure because the protocols were
fairly unknown. Security through obscurity is no longer an option for the process control
industry.
As systems increasingly become interconnected and more security devices are put into
place, we see the common problem of digital overload—increasing the amount of data to
process and floods of event logs. If no one is looking at the logs, they are basically useless.
Data growth is becoming overwhelming for teams that have little to no security data analysis
experience. Again, this is where the logical security community needs to step up and help.
Products such as ESM can process large amounts of data and will help in the analysis process,
but the industry needs to invest in its security infrastructure and hire some security
experts to help with the process. It will not be a cheap retrofit, but it is necessary and the
costs of not doing it far outweigh the costs of making the commitment.
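To make the idea of automated analysis concrete, the sketch below correlates events from two zones: it flags any source address that appears in a corporate-network alert and then shows up on the process control network shortly afterward. This is only an illustration of correlation in general, not a description of how any ESM product works; the event fields, zone names, and ten-minute window are all assumptions.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # assumed correlation window


def correlate(events, window=WINDOW):
    """Flag sources seen in the corporate zone and then in the process
    control zone within `window`.

    Each event is a dict with the hypothetical keys "time" (datetime),
    "src" (source address), and "zone" ("corporate" or "process_control").
    Returns a list of (src, corporate_time, process_control_time) tuples.
    """
    last_corporate = {}  # most recent corporate-zone event time per source
    alerts = []
    for event in sorted(events, key=lambda e: e["time"]):
        if event["zone"] == "corporate":
            last_corporate[event["src"]] = event["time"]
        elif event["zone"] == "process_control":
            seen = last_corporate.get(event["src"])
            if seen is not None and event["time"] - seen <= window:
                alerts.append((event["src"], seen, event["time"]))
    return alerts


if __name__ == "__main__":
    start = datetime(2007, 1, 15, 9, 0)
    events = [
        {"time": start, "src": "10.10.4.21", "zone": "corporate"},
        {"time": start + timedelta(minutes=3), "src": "10.10.4.21",
         "zone": "process_control"},
    ]
    for src, first, second in correlate(events):
        print(src, "reached the process control network", second - first,
              "after a corporate alert")
```

Two log entries that look harmless on their own become a single finding worth an analyst's attention, which is the point of collecting the logs in one place.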
Another important challenge is the migration to the "wireless plant." Lately, everyone
seems to be moving to wireless. Now, this is a great efficiency enhancer, just as it is in a corporate
environment, but it brings with it the same problems. It is extremely easy to crack the
security that’s been applied to wireless networks. Wired Equivalent Privacy (WEP) is
not unbreakable. Downloadable tools are available that will collect a series of wireless communications
and crack the key, allowing an attacker full access to the network. Nonetheless,
many vendors these days are pushing refineries and manufacturing plants toward wireless.
Figure 6.15 is an example of an advertisement for just that.
This is scary; just think about war drivers, people who drive around trying to connect to
open or weakly secured wireless networks. This would probably represent the mother lode
for them. They could pull up with a high-gain antenna and see all kinds of radio frequencies
floating around one of these environments. Unfortunately, they don’t just want to look; they
want access. Therefore, serious attention needs to be paid to securing these wireless networks.
They need to be encrypted. Hopefully they employ Media Access Control (MAC)
address filtering and don’t use the Dynamic Host Configuration Protocol (DHCP). If this
isn’t the case, though, the ability to access these wireless systems would be trivial. Again, if
you can access the network, it’s fairly easy to start sniffing traffic and see the type of commands
or traffic floating around. This would allow an attacker to spoof sources and send falsified
commands and data back to the MTU, wreaking havoc on the organization. The other
consideration with wireless hackers is that they are not always trying to destroy systems, but
if they get onto a SCADA network without knowing what they are doing and start scanning,
they could unintentionally cause severe damage even with no malicious intent.
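The wireless checks named above (encryption, MAC address filtering, and DHCP) are simple enough to audit from exported access point settings. The following sketch assumes a hypothetical dictionary of settings per access point; the field names are invented for the example, and "wpa2" appears only as a sample value for a setting that is not flagged. Missing settings are treated as the worst case.

```python
# Encryption settings treated as inadequate in this sketch.
WEAK_ENCRYPTION = {"open", "wep"}


def audit_access_point(settings):
    """Return a list of findings for one access point's settings.

    `settings` is a dict with the hypothetical keys "encryption" (str),
    "mac_filtering" (bool), and "dhcp_enabled" (bool).
    """
    findings = []
    if settings.get("encryption", "open").lower() in WEAK_ENCRYPTION:
        findings.append("weak or missing encryption")
    if not settings.get("mac_filtering", False):
        findings.append("MAC address filtering disabled")
    if settings.get("dhcp_enabled", True):
        findings.append("DHCP enabled")
    return findings


if __name__ == "__main__":
    plant_ap = {"encryption": "WEP", "mac_filtering": False, "dhcp_enabled": True}
    for finding in audit_access_point(plant_ap):
        print(finding)
```

An empty list means only that these three checks passed; MAC addresses can be spoofed, so filtering is a thin layer of protection and not a substitute for strong encryption.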
We have looked at some of the challenges and threats out there. In the next section, we
will hear from an industry expert who will explain to us through an interview his experiences
when dealing with the protection of SCADA systems.
Interview: SCADA Penetration Testing
The following interview was conducted January 2007 with Gabriel Martinez, CISSP.
Martinez is a security expert with more than 12 years in the industry providing security
consulting services to nearly every vertical, including government, the Department of
Defense (DoD), intelligence, healthcare, and financial. In addition, he has experience with
the power and utilities industry. He has also spoken at several conferences on the topic,
including the American Gas Association. He brings numerous real-world examples that tie in
with what we have been discussing thus far.
Colby: Can you tell me a little about your background as it relates to SCADA?
Gabriel: I have performed many security risk assessments for companies in the power and
energy space. We would break up the assessment into several phases. First, we begin the
assessment with a penetration test focused on externally exposed systems, simulating an
Internet-based attack. The second phase would focus on testing from within the organization.
We would test the access gained from two perspectives: someone just plugging in
a laptop, and from a legitimate user. This really gave us the insider threat perspective. And
finally, we would review any security policies and technical security controls in place.
In general terms, the external systems tended to be relatively secure, but on the inside it
was very much a different story. We used to describe it as a hard-shell candy; hard and
crunchy on the outside but soft and chewy on the inside.
Colby: Tell me more about the external vulnerabilities. Do you think it’s possible to get in
from outside?
Gabriel: There tends to be a much smaller footprint and limited exposure. About four or five
years ago, though, you would find exposed Web servers and other DMZ systems that
were accessible from the Internet, almost guaranteeing an entryway through a trusted
access control list (ACL). These days, through awareness and better practices, these have
been locked down. However, as new vulnerabilities are discovered and exploits to match,
there is always a risk.
Colby: What are some common misconceptions surrounding process control network security?
Gabriel: Everyone tends to believe the systems are not connected to corporate networks.
This used to be one of the main findings that upper management was always surprised
to hear about. The systems really are interconnected. There might be a firewall, but they
usually had open ACLs and allowed connectivity from numerous different subnets.
To give you an example, once I discovered that a group of workstations from the corporate
network were dual-homed (they had more than one network interface). One side
had an IP connection to the corporate network, and the other had a controller card for an energy
management system, which used a proprietary technology. I scanned the workstations
and found that they were all running Carbon Copy, a remote control product, so I
downloaded a copy and found they were using the default password. Now that I was on
these workstations, I had full access to the energy management system. These workstations
were configured that way so that operators could access them remotely via their
corporate VPN so that they could manage/monitor their SCADA environment from
their desktops.
Colby: Didn’t all the scanning and pen testing activity get picked up by the security team?
Gabriel: I’m glad you asked that. The test was designed so that only upper management
knew that a penetration test was being conducted. I’d say 80 percent of the time, our
pen tests were never discovered by the security team or IT staff.
Colby: Weren’t they using intrusion detection systems?
Gabriel: Some were, some weren’t, but even the ones that were weren’t even monitoring the
logs. Really, they needed an ESM platform. What good is a firewall/intrusion detection
system if you don’t monitor it?
Colby: I agree. They generate tremendous numbers of logs, and humans cannot interpret the
mass amounts of data that are generated. That’s a great example of what we have been
talking about so far. Let’s move on to another common myth.
Gabriel: Another myth is security through obscurity. The misconception is that people don’t
have the knowledge to control a SCADA network. A person once told me there was no
way someone would know how to take over his system. I countered by simply asking
him, if he were in a financial predicament and were offered a large sum of cash or were
being blackmailed by a foreign government, could he provide the necessary details or
could he shut down the SCADA network? Much knowledge is now accessible via the
Internet, including vendor documentation. Vendors even announce who their biggest
customers are, so a targeted attack could be made easier.
Colby: The latest trends show SCADA components moving to common operating systems
such as Windows and Linux. Can you tell me about the patch and update processes you
have seen?
Gabriel: As SCADA systems move to common operating systems such as UNIX and
Windows, they are subject to the same patch management and security issues that any
other organization faces when securing a critical system or network. Systems need to be
patched and locked down and protected via better access controls. Most systems I looked
at were typically not up-to-date and had numerous vulnerabilities as well as unnecessary
services running.
Colby: Did you find problems between vendor support and patch-level and security hardening?
Gabriel: One of the industry issues is that if you did secure or harden the system by shutting
down insecure services and/or applying patches, the vendors wouldn’t support it. Today
awareness is changing and industry collaboration is beginning to get the message across
to vendors that security is a necessity.
Colby: During your penetration tests and scanning, did many systems crash?
Gabriel: We didn’t do much actual scanning of SCADA systems. We started with the corporate
networks, and once SCADA networks were ascertained, we evaluated the systems
manually—looking at firewall policies, both inbound and outbound, of the process control
networks. We would log on to servers and look at the services running and compare
the versions to a known vulnerability database we used. If anything, we conducted some
simple tests, but because SCADA applications were sensitive to particular communications
such as bad packets and out-of-band traffic that’s used by scanning products to find
vulnerabilities and identify operating systems, we tended to not use these tools.
Colby: What scanning tools did you use?
Gabriel: We used a combination of homegrown tools, and publicly available tools such as
Nessus, using custom plug-ins that we built.
Colby: Have you heard that Nessus now has SCADA plug-ins?
Gabriel: Yes, that’s pretty forward-thinking. It shows movement in the right direction because
it’s being recognized as a problem. I wish I had them available when I was doing these
tests.
Colby: Did you ever run into any geographical or physical concerns?
Gabriel: Sometimes there were terminals in remote locations, like computers connected in at
a remote substation. The only thing protecting it was a fence and a door, but they were
usually in such remote locations that nobody would notice someone climbing over or
cutting through the fence. One time we went to one of these locations and went inside
with a key and found the system was still logged in.
Colby: That’s pretty scary. I’ll bet they weren’t pleased about that. Do you have any other
security concerns surrounding remote sites?
Gabriel: Communications to remote systems. RTUs generally use satellite, X.25, or Frame
Relay protocols, but as backups they had modems for fail-safe operation with poorly
configured passwords or default passwords. You would be presented a prompt to issue
commands or make configuration changes. Also, these remote systems as of late are connected
via IP networks, making them vulnerable to standard IP-based attacks like man in
the middle attacks, spoofing, or denial of service attacks.
Colby: What were some of the top vulnerabilities you would find?
Gabriel: I’d say they are common across any industry: weak authentication (i.e.,
common/default passwords); weak access controls; insecure trusts like rshell and the r
command suite, including rlogin, where trust was based on the initiator’s IP address; and
nonpatched vulnerable systems.
We found that if you gained access to just one vulnerable system, you would gain access
to just about every other system in minutes.
Colby: What about vendor access and business-to-business relationships?
Gabriel: Vendors would have dial-up access into the systems for support and would dial
directly into SCADA systems with weak or default passwords. Plus, you’re trusting the
vendor’s employee, who now has full access to the systems. Theoretically, you could go in
as the vendor or break into a vendor and then use their systems to gain access. Plus,
there are all the normal concerns surrounding trusting other entities to access your network
in business-to-business relationships.
We never conducted a test using that angle, but given the proper motivation and
funding, anything is possible.
Colby: What about wireless?
Gabriel: It’s getting better, but it’s still a concern, just as wireless is a concern in corporate
environments. There are tools that crack WEP keys, allowing users onto the network.
And once they are on, everything is accessible. The wireless networks should be locked
down, maybe use VPN technology, or at least MAC filtering, but MACs can be spoofed,
so this is only a thin layer of protection.
Colby: There seem to be numerous problems and insecurities surrounding SCADA and process
control networks. In your opinion, what can be done?
Gabriel: Considering the consequences of a successful attack, this is a very important aspect.
Security best practices need to be implemented. The industry needs to heed the warnings
and look at how security has been achieved in other verticals, such as financial
organizations and the DoD. Secure architectures need to be implemented, policies and
procedures need to be in place; it’s not much different from securing any other critical
environment. There need to be strong access controls, and regular vulnerability assessments
need to be performed because the environments are always changing. Intrusion
detection systems and other point security products such as host-based intrusion detection
and firewalls must be not only deployed, but also the logs need to be collected and
monitored. Just having the logs doesn’t help. Correlation and analytic processing are
required to find relationships between disparate events. I suggest using an SIEM product;
unwatched alerts are meaningless.
Colby: You were directly involved with a cross-sector collaboration project involving the oil
and gas industry. Can you tell me a little about it?
Gabriel: I was brought in from an ESM perspective to see how ESM can improve the general
security of SCADA environments. The project coordinators came up with a test
environment that included a simulated process control network connected to a reproduced
corporate network. We were in a lab environment that simulated an oil refinement
network. The simulation included a corporate network, a process control network,
and a distribution control network. I developed the logic to monitor and detect whatever
attack scenarios they came up with.
Colby: Can you tell me a little about the attack scenarios?
Gabriel: Without going into too much detail, I can say that there were four or five scenarios.
Each consisted of an attacker who gained access to the process control network by
attacking systems that were either directly connected or trusted to connect to the network.
Some of the attacks started in the corporate network. One started at a remote
substation and the other was over wireless. We looked at data feeds from several different
SCADA products, including Telvent and Omniflow, as well as events from several other
security products.
Colby: Thanks for your time. Everything you have said is really enlightening and right in
line with the current industry trends. Do you have any plans to do more work in the
SCADA arena?
Gabriel: The pleasure is all mine, anything to do to help. I’m sure this won’t be the last
chance I get to use what I know about SCADA, especially with the problems that are
being identified in the industry. Thank you.
It’s amazing how the interview lined up with the items discussed in this chapter.
Martinez’s input really shows that the myths we discussed earlier, and some of the threats and
challenges, are not just scare tactics; they really do exist, and the industry really does need to
come together and improve the security posture of process control environments.
The project Martinez was involved with is Project LOGIIC, which stands for Linking
the Oil and Gas Industry to Improve Cyber Security. The project was funded by DHS and
led by Sandia National Laboratories. Other participants in the project included Chevron,
Citgo, BP, Ergon, and SRI International, to name a few. The goal of the project was to identify
new types of security sensors for process control networks. The integrated solution leveraged
ArcSight’s ESM technology and represented the first test of this type to deliver an
expertly developed, fully tested solution that enabled centralization of security information,
monitoring vulnerable points of entry within oil and gas IT and process control networks.
The LOGIIC consortium is a model example of a partnership between government and
industry that is committed to combining resources to define and advance the security of the
oil and gas industry. More information is available from Sandia
(www.sandia.gov/news/resources/releases/2006/logiic-project.html) and DHS
(www.cyber.st.dhs.gov/logic.html).
Interview: Process Control System Security
The following interview was conducted with Dr. Ben Cook from Sandia National
Laboratories. Cook is a member of the research team at Sandia and has a doctorate in science
and IT. His background is in modeling and simulation in complex systems: physical and
engineering systems. Cook does a lot of computational work and for the past five years or so
at Sandia, he’s been involved in helping to start and manage several infrastructure protection,
research, and development projects. Project LOGIIC is one of those. He also is looking very
holistically at infrastructures such as the power grid and trying to understand their vulnerabilities
as well as how Sandia can work with industry to secure those infrastructures in an
economical fashion.
Colby: Thank you for spending time to talk with me. I appreciate it. You have told me a
little about your background, dealing with complex systems. What do you mean by complex
system? Do you mean process control systems?
Ben: I’m looking at systems like the power grid. Control systems would be a piece of that:
basically, how would you model the power grid on a regional scale. A piece of the power
grid is the information infrastructure, the control systems. Another piece of the power
grid is the actual physical infrastructure in the way of transmission lines and transformers
and generators, and yet another piece of the power grid is the markets through which
the power is sold. My technical background is modeling physical systems—large physical
systems where you have lots of different things going on in the way of physics, in terms
of fluid dynamics and solid mechanics. The goal was to think about how you would
model these coupled systems: how you would idealize them, abstract them, develop the
mathematical models, and solve those mathematical models using computers and then
visualize the results.
Colby: Sounds like fun.
Ben: Sandia has had large programs for the past decade or so looking at infrastructure and
their dependencies, trying to understand linkages between infrastructures: how does the
power grid rely on the oil and gas infrastructure, and if you look at an actual gas
pipeline, how does it depend on the power grid?
If the power grid goes down, how does that outage in the power grid ripple out and
bring about consequences in other infrastructures? Clearly intertwined with all
physical infrastructures, like the power grid and the oil and gas pipelines and refineries,
you have the telecommunications infrastructure; companies are increasingly dependent upon
the telecommunications infrastructure. We’ve done a lot of work at Sandia in taking a
look at a very large scale. It’s the backbone of our infrastructure systems that support the
economy and support our way of life. How could that backbone be compromised, and
equally important, how would you protect that backbone? How might you reengineer
the infrastructure? That can be done through policy. You know, maybe there are ways of
introducing, just like we did on LOGIIC, new sensors that would allow utilities to have
a broader, more global view of their system health and to be able to anticipate failures,
and then take measures to try to stabilize those systems.
We spend a lot of time here at Sandia working on those kinds of issues, and I’ve been
involved in that work here for over five years now. Project LOGIIC is just one example
on the cyber security side.
Colby: Is there a red team/blue team model like the Marines or the DoD uses, in which
one team attacks systems and the other tries to detect the attacks?
Ben: Yes, it’s similar. If you look at the threat through your own rose-colored glasses, it’s
likely to be a very biased view. You really have to try to get into the mind of an adversary,
so part of this is trying to think about who your adversaries might be. How do they
look at the world? What are they trying to accomplish? What resources do they have
available? How sophisticated are they? What are their technical capabilities?
Colby: Besides the obvious—terrorist organizations—who else do you consider to be adversaries
that would actually try to compromise a process control system or part of the
infrastructure? I hope that the hacker who just breaks into Web sites for fun
wouldn’t consider breaking into an oil refinery, because potential loss of human life
is far different from taking down a Web site.
Ben: Certainly that’s a concern because there is some exposure and some risk of collateral
damage if the hacker just happens to come about a company that has an open door to
its control systems, and this hacker finds himself somehow having successfully exploited
one of those control system components.
Colby: Let’s talk a little about some of the differences in these systems compared to systems
in a typical IT infrastructure.
Ben: Control systems are a little different from IT systems in that in an IT system, your concern
is trying to protect the data. Some of the tenets of security from an IT perspective
are availability, confidentiality, and integrity; usually availability is something that you can
sacrifice. If my workstation goes down for the next hour, I will lose some productivity,
but there are other things that I could do.
If I’m losing data on my workstation, or if somebody steals that data, that’s a serious
issue. In the case of control systems, availability is paramount, so if the control systems
are there to control and to manage the operations, it’s the continuity of the operations
that the industry really cares about. They care about what you care about as a consumer
of electric power or gasoline. You want to make sure gas is available at the pump; that
when you flick on a light switch in your house the lights go on. Control systems really
are a different beast in that availability has to be preserved, almost at all costs.
It is really that coupling, then, between the control systems, the information systems, the
hardware and the software that make up the control systems, and the physical process—
understanding that coupling and understanding how a control system compromise might
in turn impact the operation, the refinery, and the pumping of oil. Fortunately, the
industry has been pretty good in terms of thinking through these things, because maybe
in the past they haven’t had to worry so much about someone attacking them through a
cyber means, but they have had to worry about other types of problems that may impact
operations, such as losing power, in which case they would like the refinery to continue
to operate.
They’ve thought through some of these infrastructure dependencies and interdependencies
that we were talking about earlier, and from a business continuity standpoint they’ve
tried to mitigate the potential impacts of an errant control system component or a signal
instruction, or the loss of power.
Colby: Is this done with a lot of redundancy built in?
Ben: There’s redundancy, but there’s also fail-safe safety systems; sometimes they are mechanical
and sometimes they’re electromechanical, but there’s an extra layer of protection.
Colby: That’s comforting to know.
Ben: In the past, these guys could always revert, if necessary, to mechanical, hands-on operations
of the processes.
Colby: Closing valves manually?
Ben: Yes, but that becomes harder with the trend toward full automation. Then there is the
question of whether you have the manpower to do that, as you make your transition
from no automation to partial automation to full automation. I spent a lot of time with
folks in the oil and gas industry and other related infrastructure sectors the past couple
of years, and I’ve been very impressed with the amount of effort they put into trying to
make sure their operations run smoothly and reliably. It’s really their bottom line.
Colby: If a system goes down, production goes down, and profits go down.
Ben: That’s one of the ways a lot of folks are thinking about how you make the business case
for investment in security. At the end of the day, from the CIO or CSO perspective, if
they can make that connection between availability, or continuity of operations, and
security, it’s a very powerful connection and it’s good justification for making an investment
in security.
Colby: Security is one of those hard things to prove because showing that nothing has happened
is when you’re really showing that you have a good security practice in place.
Ben: Naturally, so having a proven understanding of the state of health of your operation is
something that can be valuable for not only trying to understand whether you’re potentially
being exploited, but also whether something’s going wrong and isn’t functioning
correctly. Maybe it’s not functioning correctly because of human error, or maybe it’s not
functioning correctly because it’s just not fully optimized and you have some opportunities to squeeze more out of your business through additional optimization, or maybe
something’s not functioning correctly because a component is starting to fail.
Colby: Maybe an old piece of equipment is just starting to fail on its own…
Ben: ...yes, something like leaks in the pipeline. The leak protection is big business. A
broader view, a deeper view into your business, and a more intelligent view—this is the
power of ArcSight. It provides not only the broad view, but also the deep view. On the
process side, you can do things like monitor your process control networks. Sometimes
they’re using specialized protocols, which run on top of TCP/IP.
Colby: That’s really neat. Completely new event sources are always interesting. It’s a great
time to derive new use cases because you can look at the new data that you’re receiving
and how it can correlate with other events that are coming into the systems. What were
other things you looked at in LOGIIC?
Ben: Exposures, the vector through which an adversary can attack; they’re going to work
their way to the control systems through the business network. Plus, the technologies
that are now increasingly being used and deployed on our control systems are the same
technologies that they’re familiar with and that they probably have exploits for, and they
have the same set of vulnerabilities.
Colby: Good point. I’ve also heard that some of the data vendors actually don’t want people
upgrading the operating systems, so they’re running on older versions like Windows NT.
Ben: Yes, and this gets back to the issue that we talked about earlier, about the control systems
being a different beast to manage from a security standpoint. We talked about
availability being paramount. If availability is your utmost concern, understandably there’s
going to be a reluctance to patch.
Colby: This seems like a place where collaboration across the industry really needs to come
from both the product vendor side, and the customers in terms of trusting that the vendors
are supporting these things.
Ben: I think there’s been good dialogue there. On the vendor side, there’s increased understanding
of the importance of security, so they are trying to work more closely with
their customers to more quickly upgrade the systems that are out there, the legacy systems,
but they’re also trying to incorporate security features into their new product lines.
Colby: Awareness is key.
Ben: Yes, it is. Now you have asset owners who are saying they want to know what you’re
doing about security. They want to understand what your typical response time is to
patches and how closely you work with Microsoft, how quickly you can patch, how you
are addressing this type of vulnerability and what implementation of Modbus you are
using, and whether you have looked at these types of issues with that protocol. In
response, vendors are starting to take action to address the legacy issues, as well as embed
security into their new products. That’s encouraging.
Colby: Now scanning, that’s another big problem, right?
Ben: Yes, scanning is a problem. Again, inadvertently, because they don’t have this understanding
of the importance of availability on the control system side and there hasn’t
been a dialogue, the IT guy tries to scan something on the control system network side.
Colby: I think also that if you look at the security that’s happened as far as a lot of the IT
technology companies, and online businesses, you don’t hear so much about them getting
broken into anymore, so I think security practices have improved across the board.
And I think that if they take the best practices from these companies that already have
online entities like banks and other organizations, they will be ahead of the game.
Ben: Absolutely—looking at the best practices and just applying them into the process control
environment. In the past, people have said they can’t use a particular antivirus
product on PCs because the workload associated with the process of operation is sufficiently
high that if their box gets any further bogged down running it, it’s just not going
to perform well, and it might actually hiccup and bring the process down. People are
starting to look at that and say that maybe in some cases they actually can run that software.
Certainly, bringing together the IT guys on the business side with the control
system guys and the physical security guys—bringing those folks together and getting
them to talk and to understand that they do have shared responsibilities and that they
can work together—they’re going to be much more effective.
Colby: It’s really the converged approach; convergence is just necessary.
Ben: Yes, but thinking of it in broader terms, convergence of the security infrastructure—it’s
the technologies in the organizations, and the resources.
Colby: At the end of the day, if the power goes down, eBay is not going to be running its
Web site, so it should be willing to share its knowledge and experiences. It depends on
the critical infrastructure, just like anyone else.
Ben: And you certainly see that in LOGIIC. You saw that with the commitment of the asset
owners to open up and to share their understanding of the issues and to provide guidance
to the team toward the development of a solution that would be useful not only to
them, but (in their minds at least) also to the broader industry. Companies like Chevron,
BP, Ergon, and Citgo that were part of LOGIIC, those organizations felt this shared
responsibility, not only within their organizations, but within sectors. There’s a merging,
an understanding of supply chain integrity. These companies are very intimate; they’ve
linked with one another. One company might be the provider of crude to another company’s
refinery, and that company might be pumping its crude through maybe the company
that gave it the refined products, so the crude is coming from one company and
going through a refinery; and the refined products that come out from that other company’s
refinery go back into the company that provided the crude pipeline.
Colby: It’s all interconnected?
Ben: Yes, and ultimately, it’s distributed by some other company and trucked out and sold
through another company’s retail gas stations. At the end of the day, they all need to be
working together and their facilities have to work as one, as in a supply-chain sense.
Colby: I’m just glad to see projects like LOGIIC and the work you guys are doing because
the more I research this stuff, it’s kind of scary, actually.
Ben: It’s been a great opportunity. There are powerful forces that are not going to stop anytime
soon. The trend is toward increased connectivity, toward globalization. These companies
that we’re talking about here that we’re trying to help on LOGIIC, those are
multinational companies. It’s not a problem that’s unique to the United States. This is a
problem that exists throughout the world, because everyone is becoming more and more
connected.
Colby: I think that awareness has increased a lot over the past couple of years, which is a
good sign. It shows that people know there is a problem and that by working together,
they can address the issues. It’s already starting. Technologies and policies are out there
that can address the concerns, and it’s just a matter of getting them in place.
Well, Ben, it’s been a pleasure. I’d really like to thank you for your time today, for sharing
your experiences and knowledge with me. I look forward to working together in the
future.
Ben: My pleasure. I think this book sounds great and will probably be a real eye-opener for
people.
Cook has extremely valuable insights into many of the issues surrounding complex
system and process control system environments these days. He works on a daily basis to
help protect the nation’s critical infrastructure through awareness and better understanding
of the interdependencies between critical components of the industry sectors. His involvement
in LOGIIC, among other projects, not only gives him a unique perspective on new
ways to protect infrastructure, but also allows him to be in a thought leadership position and
apply his past experiences to solutions moving forward. It’s exciting and reassuring to hear
about some of the advances being made as well as the awareness levels among not only the
industry, but also the product vendors.
Because we have been discussing these threats and challenges that exist within process
control networks, it only makes sense to look at real-world examples of incidents where
some of these challenges or weaknesses were exploited. In the next section, we will examine
some incidents involving process control environments that made the news.
Real-Life Examples
We pulled the following examples from various presentations and articles that discuss real-life
attacks and potential threats to SCADA and process control systems. The first several
examples are from a working document published by ISA, which sets standards and provides
education and research in the process control arena. The document, "dISA-99.00.02
Manufacturing and Control Systems Security," is recommended reading for anyone who
wants to learn more about SCADA security.
In January 2003, the SQL Slammer Worm rapidly spread from one computer
to another across the Internet and within private networks. It penetrated
a computer network at Ohio’s Davis-Besse nuclear power plant
and disabled a safety monitoring system for nearly five hours, despite a
belief by plant personnel that the network was protected by a firewall.
It occurred due to an unprotected interconnection between plant and
corporate networks. The SQL Slammer Worm downed one utility’s critical
SCADA network after moving from a corporate network to the control
center LAN. Another utility lost its Frame Relay Network used for
communications, and some petrochemical plants lost Human Machine
Interfaces (HMIs) and data historians. A 9-1-1 call center was taken
offline, airline flights were delayed and canceled, and bank ATMs were
disabled.
This is an example of where patching and antivirus software would have been extremely
useful. Also note what was said about the firewall: "despite a belief by plant personnel that
the network was protected by a firewall." This just goes to show that a firewall is not
enough. There are backdoors into networks that people don’t even realize exist because
someone may have added them for convenience, and they may not even be in use anymore
but the connection remains hot. The example shows the worm moving from the corporate
network into the plant network. The next example is just as bad, but it involves destruction
of the environment via a sewage processing plant in Australia:
Over several months in 2001, a series of cyber attacks were conducted
on a computerized wastewater treatment system by a disgruntled contractor
in Queensland, Australia. One of these attacks caused the diversion of millions of gallons of raw sewage into a local river and park.
There were 46 intrusions before the perpetrator was arrested.
It’s good to know that the perpetrator was arrested. You can read the complete appeal at
www.austlii.edu.au/au/cases/qld/QCA/2002/164.html. It’s interesting to note that the man
was charged with 26 counts of computer hacking. It’s also interesting that the attacker
spoofed a pumping station in order to gain access to the network:
On examination it was found that the software to enable the laptop to
communicate with the PDS system through the PDS computer had been
re-installed in the laptop on 29 February 2000 and that the PDS Compact
computer had been programmed to identify itself as pump station 4—the identification used by the intruder in accessing the Council sewerage
system earlier that night. The software program installed in the laptop
was one developed by Hunter Watertech for its use in changing configurations
in the PDS computers. There was evidence that this program was
required to enable a computer to access the Council’s sewerage system
and that it had no other practical use.
Here is another example of a disgruntled individual—not an employee but someone an
employee had a relationship with—deciding to launch a DoS attack against a female chat
room user:
In September 2001, a teenager allegedly hacked into a computer server
at the Port of Houston in order to target a female chat room user following
an argument. It was claimed that the teenager intended to take
the woman’s computer offline by bombarding it with a huge amount of
useless data and he needed to use a number of other servers to be able
to do so. The attack bombarded scheduling computer systems at the
world’s eighth largest port with thousands of electronic messages. The
port’s web service, which contained crucial data for shipping pilots,
mooring companies, and support firms responsible for helping ships navigate
in and out of the harbor, was left inaccessible.
It’s noteworthy that although that attack wasn’t targeted at the process control system
directly, it was actually affected and operations were shut down. What if the attack were
directed toward the process control environment?
The next example is not of an attack that happened, but of a threat that is far greater
than that of a mad chat room user. In an article in the Washington Post, written by Barton
Gellman and published June 27, 2002 (www.washingtonpost.com/ac2/wp-dyn/A50765-2002Jun26), it was mentioned that much reconnaissance activity targeting Pacific Gas and
Electric as well as many other utilities across the United States was being routed through
telecommunications switches in Saudi Arabia, Indonesia, and Pakistan:
Working with experts at the Lawrence Livermore National Laboratory,
the FBI traced trails of a broader reconnaissance. A forensic summary of
the investigation, prepared in the Defense Department, said the bureau
found "multiple casings of sites" nationwide. Routed through telecommunications
switches in Saudi Arabia, Indonesia, and Pakistan, the visitors
studied emergency telephone systems, electrical generation and
transmission, water storage and distribution, nuclear power plants, and
gas facilities.
There have also been references to Al Qaeda gathering SCADA and process control
documentation, and to seized computers that contain documents as well as user
guides for operating process control systems. Al Qaeda is not a mad chat room user. They will
not try to DoS a chat client; they will launch a direct attack against the critical infrastructure
if they are given a glimpse of an opportunity. This is all the more reason to take the security
of process control networks as a serious responsibility that needs to be addressed sooner
rather than later, and that needs open involvement from many different technology sectors.
Information needs to be shared across organizations, just like the information sharing practices
that have been set up with government programs such as the US-CERT, FIRST, and
G-FIRST. These are all information sharing and response programs to computer-related
attacks. The work that SANS is doing should also be noted; it held a Process Control
and SCADA Security Summit in September 2006. This is exactly the kind of activity that is
needed to shed light and bring awareness to the issues that we face in regard to protecting
critical infrastructure.
In the next section, we will look at an example of an attack targeting an oil refinery. It
illustrates a real threat that could materialize without an increase in investment and
awareness.
Plant Meltdown
In this use-case example, we will follow the behavior of a disgruntled employee, John
McClane. The targeted organization is a national oil refinery called Petrol123, located in
Texas. John had been working for Petrol123 for nearly 20 years as a SCADA engineer, but
he was recently fired. John was overlooked for a promotion, and instead of working harder,
he became angry with other employees. After multiple reprimands and second chances, he
was finally let go. John resented the fact that he was fired after 20 years of dedicated service,
and he felt he should have been promoted over his coworkers. This is where the story
begins.
The Plot
John’s job involved monitoring and responding to alarms generated by the different RTUs
and flow computers around the organization. John is familiar with the inner workings of the
refinery and knows how to cause considerable damage. Because John worked at Petrol123
for 20 years, he is familiar with the processing network, where RTUs and flow computers
are located, as well as the geographic locations of remote substations. He devises a plan to
cause general chaos within the refinery, and at this point doesn’t have much regard for the
company or his ex-coworkers. His plan to disrupt the oil refinement processes comprises
spoofing commands to the MTU and some of the PLCs, which will allow him to control
the flow of crude oil into the plant.
Let’s start by looking at the process control network at Petrol123 and some of the different
security devices that are deployed (see Figure 6.16). John is not aware of the security
devices that have been put in place, as he was never involved in IT operations.
Starting at the top, we have a standard corporate network consisting of a wireless connection
for the employees with laptops, and we have the standard corporate servers such as
e-mail, databases, file servers, and financial systems. The corporate network, as we have found
in most cases, is connected to the process control network via a firewall. The firewall does
have some access control rules in place to try to prevent the spread of worms and viruses,
but they are minimal and there are many exceptions for remote access.
The SCADA process control network (bottom right) has several devices that we should
note. The MTU takes in data from all of the RTUs, as well as the flow computers monitoring
the flow of oil throughout the refinery, and the oil pipeline that feeds the crude oil
into the processing plant. There is also a remote substation located miles away from the
actual refinery, where flow can be controlled as well as rerouted. At the substation, there are
RTUs and flow monitoring systems connected directly to the pipeline.
John is familiar with most of this equipment; he knows where it’s located and has a general
understanding of how it works and its function in the environment. What he doesn’t
know is that within the past year, Petrol123 has deployed intrusion detection systems as well
as an ESM platform. John doesn’t know much about logical security, so as far as he is concerned,
he is in the clear.
John’s plan is pretty simple: He is going to gain access to the network via the remote
substation. Knowing it’s in an obscure location in the hill country of Texas, he is not worried
about anyone seeing him break in. The substations are also so obscure and remote that they
typically don’t have guards. John packs up his laptop, filled with some of the latest hacking
tools that he found on the Internet, as well as some how-to guides. He drives to the outskirts
of town and does a little surveillance of the substation and surrounding area.
Everything looks clear, so he parks his car and heads over to the station, needing only to
climb the fence to get onto the property.
He breaks into the housing and locates the RTU as well as the flow sensor that’s monitoring
the pipeline. He uses a large screwdriver and pops open the cover of the RTU. Just as
he figured, an Ethernet connection leads to the RTU. John decides that accessing the network
directly over Ethernet rather than hacking the satellite link will be much easier.
Because this was part of his plan, he purchased a small Ethernet hub that allows for connection
sharing. Next, he unplugs the connection to the RTU and the flow computer and connects
it to his hub, then uses another cable to reconnect the RTU. Next, he plugs his laptop
into the hub as well. John knows that in the time he disconnected the RTU, there will be a
system-down alert, but because he immediately reconnected it, a system-up alert will be sent
as well. This will cause the operator monitoring the SCADA console to most likely ignore
the message, thinking it’s just a simple error.
Once on the network, John opens Ethereal, a packet capturing tool, to sniff the network
traffic coming from the RTU and the flow computer. Because the network is not using
DHCP, he doesn’t get an IP address automatically, so he needs to see the IP addresses that
are being used by the systems that are talking on the network. He discovers they are using
standard private addresses. The RTU seems to be using 10.0.1.102 and the flow computer is
using 10.0.1.103. John then assigns himself a random address of 10.0.1.191, hoping that it is
an unused address. He is now on the network, and if he wants, he can start communicating
with other connected systems. His plan involves identifying the MTU so that he can spoof it
as the source of his fabricated commands, so he can’t do anything before he identifies the
address of the MTU. Remember, the only check that is done when a command is sent is
based on an IP address check to see whether the MTU or a PLC is sending the command.
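To see why an address-only check offers so little protection, consider the following minimal sketch. It is not the logic of any real RTU or PLC; the function name and the MTU and PLC addresses are hypothetical, chosen only for illustration.

```python
import ipaddress

# Hypothetical addresses: the text never states the real MTU or PLC addresses.
TRUSTED_COMMAND_SOURCES = {
    ipaddress.ip_address("10.0.2.10"),  # assumed MTU
    ipaddress.ip_address("10.0.2.20"),  # assumed PLC
}


def accepts_command(claimed_source: str) -> bool:
    """Return True if the claimed source address is on the trusted list.

    This mirrors an address-only check: nothing here proves that the
    sender actually owns the address it claims to be sending from.
    """
    return ipaddress.ip_address(claimed_source) in TRUSTED_COMMAND_SOURCES


# John's own address is rejected, but any packet that merely claims the
# trusted address is treated exactly like one from the real MTU.
print(accepts_command("10.0.1.191"))  # False
print(accepts_command("10.0.2.10"))   # True
```

The check looks only at a field that the sender controls, which is why identifying the MTU address is the key step in John's plan.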
John’s plan is to launch an attack by sending spoofed commands to the different RTUs
and flow computers in the process control network, instructing the systems to open their
valves full throttle. This could cause pipes to break and could destroy the plant, or at least
create a considerable amount of damage. At this point, John is still capturing traffic and
looking at which systems the RTU and the flow computer are communicating with. He
sees that most of the traffic is headed back to one address, so he assumes that is the MTU.
The RTU and the flow sensor are probably updating the MTU with their latest information
and they really don’t have reason to communicate with other systems on the network. John
can also see the system as the source of several broadcast messages.
The next thing John needs to do is map out the network for logic controllers as well as
other RTUs to which he can send commands. He knows that scanning SCADA systems
with a common port scanner will probably set off alarms, so he uses Telnet to map the network.
Most of the systems have Telnet enabled for remote administration.
What John doesn’t know is that Petrol123 is now using ESM and has deployed intrusion
detection systems. He figures Telnet will go under the radar because it’s used all the time, but
as his requests cross the network, the routers he passes through generate logs to the effect
that there was accepted traffic from his address. This sets off alarms in ESM. The process
control network does not use DHCP and this is a very static environment, so analysts want
to be alerted whenever there are any communications from hosts that have never been seen
on the network. The way this is accomplished is to map out all of the systems on the network
using assets. Once all known systems are imported to ESM as assets, a simple rule can
be written to flag any traffic that is not going from one asset to another.
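The asset-based rule described above can be sketched in a few lines. This is only an illustration of the idea, not ESM rule syntax; the event fields, the function name, and the contents of the asset list are assumptions (the two known addresses are the RTU and flow computer mentioned earlier).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouterEvent:
    src: str
    dst: str
    action: str  # assumed values: "accept" or "deny"


# Assumed asset inventory; a real deployment would import every known system.
KNOWN_ASSETS = frozenset({"10.0.1.102", "10.0.1.103"})


def rogue_hosts(events, assets=KNOWN_ASSETS):
    """Return sorted addresses in accepted traffic that are not modeled assets."""
    unknown = set()
    for event in events:
        if event.action != "accept":
            continue
        for addr in (event.src, event.dst):
            if addr not in assets:
                unknown.add(addr)
    return sorted(unknown)


events = [
    RouterEvent("10.0.1.102", "10.0.1.103", "accept"),
    RouterEvent("10.0.1.191", "10.0.1.102", "accept"),
]
print(rogue_hosts(events))  # ['10.0.1.191']
```

Because the environment is static and does not use DHCP, the asset list changes rarely, so a rule this simple should produce few false alarms.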
At this point, John has been detected, although the alert that is generated is not high-priority,
so the analysts are not going to respond right away. Figure 6.17 shows the analyst’s
view of the events in question.
In Figure 6.17, the analyst can see several correlation events, indicated by the lightning
bolt in the far-left column. Starting from the bottom, we can see the Modbus timeout events
from when John unplugged the flow computer to hook up his hub. These events caused the
Failed Communication to a Flow Computer rule to fire. Because the system did come back
online, these events were passed off as a fluke and the analysts didn’t follow up further. The
next event we see is the Rogue System Detected event. This is triggered because of a router
event with a source that hasn’t been modeled as an asset within ESM. The analysts figure
that someone just installed a new system and didn’t alert them to the fact, so they begin
investigating all recent change requests.
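A rule such as Failed Communication to a Flow Computer could plausibly be built by counting timeout events per device within a time window. The sketch below assumes that design; the threshold, the window length, and the device name are hypothetical, and the text does not say how the actual rule is defined.

```python
from collections import defaultdict, deque


def failed_communication(events, threshold=3, window=60.0):
    """Fire when a device produces `threshold` timeouts within `window` seconds.

    `events` is an iterable of (timestamp_seconds, device) pairs in time order.
    Returns the list of (timestamp, device) pairs at which the rule fires.
    """
    recent = defaultdict(deque)
    fired = []
    for ts, device in events:
        timestamps = recent[device]
        timestamps.append(ts)
        # Drop timeouts that have aged out of the window.
        while timestamps and ts - timestamps[0] > window:
            timestamps.popleft()
        if len(timestamps) >= threshold:
            fired.append((ts, device))
            timestamps.clear()  # start counting afresh after firing
    return fired


timeouts = [(0, "flow-1"), (10, "flow-1"), (20, "flow-1"), (500, "flow-1")]
print(failed_communication(timeouts))  # [(20, 'flow-1')]
```

A single isolated timeout does not fire the rule, which is consistent with analysts treating a brief outage followed by recovery as a fluke.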
Meanwhile, John still thinks he is home free. He does get frustrated with the process of
telneting to each possible address on the network and decides against his better judgment to
launch a scan regardless of the results. He decides to just issue a single port scan of each
system to determine what systems are alive on the network. This, of course, sets off many
alerts within ESM.
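One common way to recognize this kind of sweep is to count how many distinct hosts a single source contacts. The following sketch assumes that approach; the threshold, the port number, and the target addresses are illustrative and are not taken from the intrusion detection system in the story.

```python
from collections import defaultdict


def sweep_sources(connections, min_targets=5):
    """Return sorted sources that contacted at least `min_targets` distinct hosts.

    `connections` is an iterable of (src, dst, dst_port) tuples.
    """
    targets = defaultdict(set)
    for src, dst, _port in connections:
        targets[src].add(dst)
    return sorted(src for src, seen in targets.items() if len(seen) >= min_targets)


# John probes one port (502, assumed here) on ten hypothetical addresses,
# while the RTU talks to a single peer as usual.
connections = [("10.0.1.191", f"10.0.1.{i}", 502) for i in range(100, 110)]
connections.append(("10.0.1.102", "10.0.2.10", 502))
print(sweep_sources(connections))  # ['10.0.1.191']
```

In a static process control network, legitimate devices talk to a small, fixed set of peers, so even a low threshold separates a sweep from normal traffic.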
Figure 6.18 is an analyst’s view of the security posture of the process control network as
seen through ESM. Counters at each section of the network are represented by bar charts
and pie charts. In the figure, you can see three counters. The one at the upper left shows all
ESM fired rules or correlation events. The pie chart shows all attacks or suspicious activity
targeting systems in the process control network. The third counter is a bar chart showing all
attacks or suspicious activity originating from the remote substation. It is crucial to be able
to map out the network and determine from which segment attacks are occurring so that
the scope of the attack can be narrowed down.
The analysts can quickly see from where the alerts are originating and what parts of the
network are being targeted. The analysts can drill down on any part of the display to get to
the underlying events, and they discover that 10.0.1.191, the address that John assigned himself,
has not only been detected as a rogue host, but also has been the source of several port
scans.
John starts getting nervous because he has launched the scan, but he has discovered several
systems that appear to be RTUs and PLCs. He decides to start sending Modbus commands
to these systems. John is not an experienced programmer by any means, and he
knows just enough about networks to get himself in trouble. He uses a packet crafting tool
to send what he thinks look like valid commands to the different systems telling them to
open their associated valves 100 percent. The systems keep replying with errors, so John gets
frustrated and decides that if he can’t dazzle them with his brilliant attack plan, he will just
launch a DoS attack on the MTU. He uses a User Datagram Protocol (UDP) port flood
tool and launches his attack. This is obviously picked up by the intrusion detection system,
and using ArcSight’s Network Response Module, the router access control lists between the
remote substation and the process control network are changed, blocking all traffic from
John’s address. The change is done by the analyst using the authorization queue process
within the Network Response Module (see Figure 6.19).
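To make the Modbus exchange more concrete, the following sketch builds a Modbus TCP "Write Single Register" request (function code 0x06) and decodes an exception response, which a device returns by setting the high bit of the function code and appending a one-byte exception code. The register address and value are hypothetical; the text does not say which registers John targeted, and the errors he saw may or may not have been exception responses of this kind.

```python
import struct

WRITE_SINGLE_REGISTER = 0x06


def build_write_single_register(transaction_id, unit_id, address, value):
    """Build a Modbus TCP request: 7-byte MBAP header followed by the PDU."""
    pdu = struct.pack(">BHH", WRITE_SINGLE_REGISTER, address, value)
    # MBAP: transaction id, protocol id (always 0), length (unit id + PDU), unit id.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_exception(response):
    """Return (function_code, exception_code) for an exception response, else None."""
    function_code = response[7]  # first PDU byte follows the 7-byte MBAP header
    if function_code & 0x80:
        return function_code & 0x7F, response[8]
    return None


# Hypothetical request: write the value 100 to register 0x0010 on unit 1.
request = build_write_single_register(1, 1, 0x0010, 100)

# A device rejecting that address would answer with exception code 0x02
# (illegal data address) and function code 0x06 | 0x80 = 0x86.
reply = bytes.fromhex("000100000003018602")
error = parse_exception(reply)
```

A command that is merely well formed is not enough: if the register address or value does not match what the device expects, the device answers with an exception rather than acting on the request.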
Figure 6.19 shows the Authorization queue in ArcSight NRM. Because this is a critical
environment, the organization does not want to take an automated response, so it lets ESM
recommend the action that should be taken (in this case, blocking the source of the DoS) and
lets an analyst decide whether to commit the action, based on the analyst’s access rights.
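The recommend-then-commit workflow can be sketched as a small queue in which nothing takes effect until an authorized analyst approves it. The class and method names below are hypothetical and do not reflect the Network Response Module's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class PendingAction:
    """A recommended response waiting for human approval."""
    description: str
    target: str
    status: str = "pending"


@dataclass
class AuthorizationQueue:
    """Holds recommended actions until an authorized analyst commits them."""
    approvers: set
    actions: list = field(default_factory=list)

    def recommend(self, description, target):
        action = PendingAction(description, target)
        self.actions.append(action)
        return action

    def commit(self, action, analyst):
        # Only analysts with the right access may apply a change.
        if analyst not in self.approvers:
            raise PermissionError(f"{analyst} is not authorized to commit actions")
        action.status = "committed"
        return action


# Hypothetical usage: the system recommends a block, an analyst commits it.
queue = AuthorizationQueue(approvers={"analyst1"})
block = queue.recommend("Block source of UDP flood at router ACL", "10.0.1.191")
queue.commit(block, "analyst1")
```

Keeping a person in the loop costs response time, but in a process control environment it guards against an automated block cutting off legitimate control traffic.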
At this point, both a security team and the authorities have been dispatched to the
remote substation, where they find John still fumbling with his laptop, wondering why his
attempts to DoS the MTU are not successful. He will have a long time to wonder while
serving hard time at Leavenworth Federal Prison.
CONCLUSION
SCADA and process control systems are extremely important in today’s automated world.
We depend on these systems to be operational for our daily activities and well-being. Process
control systems have been designed for efficiency and stability, but a cyber attack could bring
them to their knees. The consequences of a compromised process control system reach far
beyond someone stealing your identity or breaking into a critical server within your
organization. The consequences here could mean the difference between life and death. Not
only could a successful attack result in damage to the environment, as in the Australian
sewage treatment case, or the inability to operate a port, as in the case of the mad chat room
client, but a targeted, well-funded attack could cause the loss of human life. Luckily, this has
not happened yet.
There are challenges. It’s common knowledge that issues surround the security of process
control systems, but there are also teams of dedicated individuals who are working hard
an integrated part of any organization, including the industries responsible for critical infrastructure.
It’s not that they don’t care; it’s just that in the past, security hasn’t been a top priority.
These days, with all the threats of terrorism and havoc from extremist groups, as well as
malicious insiders, everyone is aware of the need for a converged plan of action to address
these global concerns. This is evident in projects that bring together government,
industry, and security vendors across business sectors to collaborate on, research,
and address the issues. Community awareness of the concerns and the
problems is also apparent, as SANS is holding dedicated conferences to improve the security of process control
networks. This shows that not only has the government gotten involved through DHS,
but also that the security industry has gotten involved and is willing to help secure the critical
infrastructure.
- A Modbus function command is an instruction to a device to perform a task such as check a register