System Security Testing and Python

Overview

A significant part of systems security can be testing, and this presents a real challenge for most systems security engineers. Whether it is pen testing, forensic analysis, fuzz testing, or network testing, the number of System Under Test (SUT) configurations, multiplied by the necessary test variations, is effectively unbounded.

The challenge is to develop an approach to this testing that provides the necessary flexibility without imposing an undue burden on the systems security engineer. Traditionally the options have been packaged security tools, which provide an easy-to-use interface for pre-configured tests, and tests written in some high-level language, which provide a high degree of flexibility at the price of a relatively high level of effort and a steep learning curve. The downside of the packaged tools is their cost and lack of flexibility.

An approach that has become increasingly popular is to take either one (or both) of these approaches and improve on it through the use of Python.

For Example

A few examples of books that take this approach include:

  • Gray Hat Python – ISBN 978-1593271923
  • Black Hat Python – ISBN 978-1593275907
  • Violent Python – ISBN 978-1597499576
  • Python Penetration Testing Essentials – ISBN 978-1784398583
  • Python Web Penetration Testing Cookbook – ISBN 978-1784392932
  • Hacking Secret Ciphers with Python – ISBN 978-1482614374
  • Python Forensics – ISBN 978-0124186767

In general, these take the approach of custom code based on generic application templates, or scripted interfaces to security applications.

Conclusions

After skimming a few of these books and some of the code samples, it has become obvious that Python has an interesting set of characteristics that make it a better language for systems work (including systems security software) than any other language I am aware of.

Over the last few decades, I have learned and programmed in a number of languages including Basic, Fortran 77, Forth, C, Assembly, Pascal (and Delphi), and Java. Through all of these I have come to accept that each language has its own set of strengths and issues. For example, Basic was basic. It provided a very rudimentary set of language features and limited libraries, which meant there was often a very limited number of ways to do anything (and sometimes none). It was interpreted, which meant it was slow (way back when). It was not scalable, which encouraged small programs, and it was fairly easy to read. The net result is that Basic was used as a teaching language, suitable for small demonstration programs – and it fit that role reasonably well.

On the other hand, Java (and other strongly typed languages) are by nature painful to write in because of that strong typing, but they also make syntax errors less likely (after tracking down all of the missing semi-colons, unmatched braces, and type mismatches). Unfortunately, syntactical errors are usually the much simpler class of problems in a program.

Another attribute of Java (and other OO languages) is the object oriented capabilities – which really do provide advantages for upwards scalability and parallel development, but make imperative (procedural) development very difficult. Yes – everything can be an object, but that does not mean that it is the most effective way to do it.

Given that background, I spent a week (about 20 hours of it) reading books and writing code in Python. In that time I went from “hello world.py” to a program with multiple classes that collected metrics for each file in a file system, placed that data in a dictionary of objects, and wrote it out to / retrieved it from a file, all in about 100 lines of code. My overall assessment:

  • The class / OO implementation is powerful, and sufficiently ‘weakly typed’ that it is easily usable.
  • The dictionary functionality is very easy to use, performs well, is massively flexible, and becomes the go-to place to put data.
  • The included standard libraries are large and comprehensive, and these are dwarfed by the massive, high quality community developed libraries.
  • Overall – In one week, Python has become my default language of choice for Systems Security Engineering.
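
For a sense of scale, a program of the kind described above might look roughly like the sketch below. This is not the original program: the class names (FileMetrics, MetricsStore), the metrics collected (size and modification time), and the choice of JSON as the file format are all assumptions made for illustration.

```python
import json
import os
from dataclasses import dataclass, asdict


@dataclass
class FileMetrics:
    """Basic metrics for one file (the fields chosen here are illustrative)."""
    size_bytes: int
    modified: float  # modification time, seconds since the epoch


class MetricsStore:
    """Holds a dictionary of FileMetrics objects keyed by file path."""

    def __init__(self):
        self.files = {}

    def scan(self, root):
        """Walk the tree under root and record metrics for every file found."""
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    info = os.stat(path)
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                self.files[path] = FileMetrics(info.st_size, info.st_mtime)

    def save(self, filename):
        """Write the dictionary out as JSON."""
        with open(filename, "w", encoding="utf-8") as handle:
            json.dump({p: asdict(m) for p, m in self.files.items()}, handle)

    @classmethod
    def load(cls, filename):
        """Rebuild a store from a JSON file written by save()."""
        store = cls()
        with open(filename, encoding="utf-8") as handle:
            for path, fields in json.load(handle).items():
                store.files[path] = FileMetrics(**fields)
        return store
```

Typical use would be to create a MetricsStore, call scan() on a directory, save() the result, and later load() it back for comparison. The dictionary does most of the work, which matches my experience above.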

Postscript

Also of note, I looked at numerous books on Python and have discovered that:

  • There are a massive number of books purportedly for learning Python.
  • They are also nearly universally low value, with a few exceptions.

My criterion for this low value assessment is the number of “me-too” chapters. For example, every book I looked at for learning Python has at least one chapter on:

  • finding python and installing it
  • interactive mode of the Python interpreter
  • basic string functions
  • advanced string functions
  • type conversions
  • control flow
  • exceptions
  • modules, functions
  • lists, tuples, sets and dicts

In addition, each of these sections provides a basic level of coverage, and is virtually indistinguishable from a corresponding chapter in dozens of other books. Secondary to that, there was usually minimal or basic coverage of dicts, OO capabilities, and module capabilities. I wasted a lot of time looking for something that provided a more terse coverage of the basic concepts and a more complete coverage of the more advanced features of Python. My recommendation to authors of computer programming books: if your unique content is much less than half of your total content, don’t publish.

From this effort I can recommend the following books:

  • The Quick Python Book (ISBN 978-1935182207): If you skip the very basic parts, there is a decent level of useful Python content for the experienced programmer.
  • Introducing Python (ISBN 978-1449359362): Very similar to the Quick Python book, with some unique content.
  • Python Pocket Reference (ISBN 978-1449357016): Simply a must have for any language. If O’Reilly has one, you should have it.
  • Learning Python (ISBN 978-1449355739): A 1500 page book that surprised me. It does have the basic “me-too” chapters, but has a number of massive sections not found in any other Python book. Specifically, Functions and Generators (200 pages), Modules in depth (120 pages), Classes and OOP in depth (300 pages), Exceptions in depth (80 pages), and 250 pages of other Advanced topics. Overall it provides the content of at least three other books on Python, in a coherent package.

Note – Although I could have provided links on Amazon for each of these books (every one of them is available at Amazon), my purpose is to provide some information on these books as resources (not promote Amazon). I buy many books directly from O’Reilly (they often have half off sales), Amazon, and Packt.

IoT and Stuff – Cautionary Tales

Overview

IoT (Internet of Things) is an interesting phenomenon where “things” become connected and provide some control and / or sensor capability through this connection. Examples include connected thermostats, weather stations, garage door openers, smart door locks, etc.

It is an area of explosive growth, and like any other system it will have its security failures.

Tale 1 – Hacking Internet Connected Light Bulbs

LIFX lightbulbs are smart LED lights with two wireless interfaces: a Wi-Fi interface to connect to the local network and provide a control path for computers / smartphones, and an IEEE 802.15.4 mesh network to communicate between multiple LIFX smartlights. This dual wireless interface meant that any number of LIFX smartlights could be controlled and managed through a single Wi-Fi connection. Since any of the LIFX smartlights could operate as the “master” device that connected to both networks, it was necessary for each smartlight to have the Wi-Fi network access credentials.

Vulnerability

The vulnerability involves a couple of aspects in the design. These include:

  1. When an additional LIFX smartlight was added to the network, it exchanged data over the IEEE 802.15.4 network in the clear (unencrypted), except for the Wi-Fi credentials and some configuration details, which were sent as an encrypted blob.
  2. The encryption key for this blob was a pre-shared key hard coded into the firmware for every LIFX smartlight (of that firmware revision). This key was accessible via JTAG (which was pinned out on the PCB) or through the firmware image (which was not available at the time of the compromise).
  3. The system allowed a client on the IEEE 802.15.4 network to request (and receive) this encrypted configuration / credentials blob at any time in the background.

Compromise

The compromise allows an attacker physically close to the system to:

  1. Acquire the LIFX pre-shared encryption key from the firmware or JTAG interface.
  2. On the IEEE 802.15.4 network, request the encrypted configuration / credentials blob (masquerading as a LIFX smartlight).
  3. Crack open the blob using the encryption key from step 1.
  4. Connect to the Wi-Fi network using the credentials from the blob.
  5. Access the network and / or control the LIFX light bulbs.

Assessment

From this there are at least a few poor design choices that enabled this compromise. The first of these is to use a static pre-shared key to encrypt sensitive wireless data. The ability to establish a secure channel based on PKI has been standard practice for decades, allowing the use of dynamically generated keys at a session level. The use of a static pre-shared key is just lazy design.
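
To make the idea of a dynamically generated session key concrete, here is a minimal sketch of an ephemeral Diffie-Hellman key agreement using only the Python standard library. It is not LIFX code and it is not secure as written: the prime is a toy value that is far too small, the function names are made up for this example, and the exchange is unauthenticated. A PKI-based design would additionally authenticate the public values with certificates and would use a vetted cryptographic library.

```python
import hashlib
import secrets

# Toy parameters for illustration only: a real design would use a vetted
# group or curve from an established cryptographic library.
PRIME = 2**127 - 1   # a Mersenne prime; far too small for real use
GENERATOR = 3


def new_keypair():
    """Return a fresh (private, public) pair for a single session."""
    private = secrets.randbelow(PRIME - 3) + 2   # in the range 2 .. PRIME - 2
    public = pow(GENERATOR, private, PRIME)
    return private, public


def session_key(own_private, peer_public):
    """Derive a 32-byte session key from the shared Diffie-Hellman secret."""
    shared = pow(peer_public, own_private, PRIME)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def run_session():
    """Simulate one key agreement between a bulb and a controller."""
    bulb_private, bulb_public = new_keypair()
    ctrl_private, ctrl_public = new_keypair()
    # Only the public values cross the radio link.
    bulb_key = session_key(bulb_private, ctrl_public)
    ctrl_key = session_key(ctrl_private, bulb_public)
    return bulb_key, ctrl_key
```

Both sides end up with the same key, and each call to run_session() produces a different one, so there is no single long-lived secret sitting in every copy of the firmware waiting to be extracted.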

The second of these is the ability to request the encrypted credential blob silently. For an initial configuration of an additional smartlight to the network, it is reasonable to require user confirmation to share the data with the additional smartlight. An attacker requesting this data in the background should not be able to get it without user confirmation; a request that is not part of a new bulb configuration should simply be rejected.
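
As a rough illustration of that policy, the sketch below answers a credential request only inside a short window opened by an explicit user confirmation, and only once. The class name, method names, and the 60 second window are assumptions of mine, not anything from the LIFX firmware. Note also that a device ID on an unauthenticated network can be spoofed, so in practice the ID check only means something when combined with authentication.

```python
import time


class CredentialGate:
    """Decides whether a request for the credentials blob is answered."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._open_until = None       # no pairing window open initially
        self._expected_id = None

    def user_confirmed_pairing(self, new_device_id):
        """Called only after the user approves adding new_device_id."""
        self._expected_id = new_device_id
        self._open_until = self._clock() + self._window

    def handle_request(self, requester_id):
        """Return True if the blob may be sent, False if the request is rejected."""
        if self._open_until is None or self._clock() > self._open_until:
            return False              # background request: no window is open
        if requester_id != self._expected_id:
            return False              # some other device is asking
        self._open_until = None       # one blob per confirmation
        self._expected_id = None
        return True
```

With a gate like this in place, step 2 of the compromise above fails unless the attacker happens to be present, and correctly identified, during a pairing the user has just approved.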

Although having the JTAG port pinned out may seem to be a poor design choice, it does not really add significant risk. JTAG availability on the device pins would have been more than sufficient for a physical hacker, and that is assuming that the same data would not have been available in a firmware download. A JTAG port does not present a significant risk if keys are managed securely, and the security architecture takes this exposure into account.

Tale 2 – Smart Home Denial of Service

Vulnerability

The vulnerability in this story is that the smart home had connected all of its smart devices through a common Ethernet infrastructure – effectively rendering every device a node on a flat network. On a flat network, any one device can saturate the network with packets, effectively breaking it. Any one device can also monitor every packet on the network, or selectively disrupt packets. Essentially the security of this flat network can be compromised in multiple ways by any device on the network, and the overall security of the network is only as good as the weakest device.

Compromise

This particular compromise was a smartlight beaconing on the network, which amounted to a denial of service attack. The event was not malicious, but if we consider the triad of confidentiality, integrity and availability, it is still a security failure. A self-induced denial of service is still a denial of service.

Assessment

As a systems engineer the smart home described in this article makes me uncomfortable. The designer indicated that he had not installed his smart door locks since he did not want to be locked out / in by the locks. The designer also indicated that the light bulb denial of service rendered all the smart devices in his house broken / unavailable.

As a systems engineer this bothers me for a couple of reasons. The first of these is that it is possible to segment the network so that a failure does not propagate through the entire network – effectively setting up security domains on functional boundaries. Even a trivial level of peering management would provide some level of isolation without giving up the necessary control protocols.
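
A minimal sketch of what security domains on functional boundaries could mean in practice is shown below. The subnets, the domain names, and the allowed flows are all hypothetical; in a real installation this policy would live in the router, switch or firewall configuration rather than in Python, but the logic is the same.

```python
import ipaddress

# Hypothetical security domains, one subnet per function.
DOMAINS = {
    "lighting": ipaddress.ip_network("192.168.10.0/24"),
    "locks": ipaddress.ip_network("192.168.20.0/24"),
    "controller": ipaddress.ip_network("192.168.30.0/24"),
}

# Only these (source domain, destination domain) pairs may cross a boundary.
ALLOWED_FLOWS = {
    ("controller", "lighting"),
    ("controller", "locks"),
    ("lighting", "controller"),
    ("locks", "controller"),
}


def domain_of(address):
    """Return the name of the domain containing address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in DOMAINS.items():
        if ip in network:
            return name
    return None


def is_allowed(source, destination):
    """Decide whether a packet from source to destination should be forwarded."""
    src, dst = domain_of(source), domain_of(destination)
    if src is None or dst is None:
        return False                  # unknown devices are dropped
    if src == dst:
        return True                   # traffic inside one domain is not filtered here
    return (src, dst) in ALLOWED_FLOWS
```

Under a policy like this, a misbehaving light bulb can still flood its own lighting segment, but its traffic toward the door locks is dropped at the boundary, so the failure stays inside one functional domain.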

The second part that I find bothersome is that it appears that the entire system was designed with a single centralized control mechanism / scheme. Given the relatively poor reliability of network systems as compared with traditional home lighting / appliance controls, it makes sense to install a parallel control scheme that is based on a local (more reliable) control path that operates much closer to the device being controlled.

In summary – the architecture of this particular smart home implementation is brittle in that a single device failure can precipitate an entire system failure. In addition it is fragile in that the control scheme is dependent on a number of disparate sequential operations that provide a multitude of single point failures for every device. Lastly, the system is not robust in that there is not an alternate control scheme. In my opinion this smart home may be an interesting experiment, but is a weak systems design with lots of architectural / system flaws.

Tale 3 – ThingBots

This is not a cautionary tale about a specific device or attack, but a cautionary tale about embedded devices in general, and by inclusion – IoT devices. Back in the last week of 2013 and the first week of 2014, Proofpoint gathered some data from a number of botnets sending out spam. Specifically, they identified the unique IP addresses in the botnets, characterized them forensically, and found that roughly a quarter of the zombie machines were not traditional PCs, but things like DVRs, security cameras, home routers, and at least one refrigerator. From this they coined the term ‘ThingBot’, which is a botnet zombie based on some ‘thing’.

The message is that when it comes to compromise and attack, there are no devices that will not be attacked, and there is no point where your device is not a target for a botnet. Harden all embedded devices and design defensively.

Bottom Line

The messages in these three tales are diverse, but can be summarized by:

  1. Every connected device is a target. Simply being a connected device is sufficient.
  2. Key management may be mundane, but it is even more critical on devices since often the only interface is networked.
  3. Most importantly – System design matters. Most security issues occur at the integration interfaces between components of one type or another – and good system design reduces that exposure.

IoT and Stuff – The Evolution

Overview

This is the first of several posts I expect to do on IoT, including systems design, authentication, standards, and security domains. This particular post is an IoT backgrounder from my subjective viewpoint.

Introduction

The Internet of Things (IoT) is a phenomenon that is difficult to define, and difficult to scope. The reason it is difficult to define is that it is rapidly evolving, and is currently based on the foundational capabilities IoT implementations provide.

Leaving the marketing hyperbole behind, IoT is the integration of ‘things’ into what we commonly refer to as the Internet. Things are anything that can support sensors and/or controls, an RF network interface, and most importantly – a CPU. This enables ubiquitous control / visibility into something physical on the network (that wasn’t on the network before).

IoT is currently undergoing a massive level of expansion. It is a chaotic expansion without any real top down or structured planning. This expansion is (for the most part) not driven by need, but by opportunity and the convergence of many different technologies.

Software Development Background

In this section, I am going to attempt to draw a parallel to IoT from the recent history of software development. Back at the start of the PC era (the 80s), software development carried with it high costs for compilers, linkers, test tools, packagers, etc. This marketing approach was inherited from the mainframe / centralized computer system era, where these tools were purchased and licensed by “the company”. The cost of an IBM Fortran compiler and linker for the PC in the mid 80s was over $700, and libraries were $200 each (if memory serves me). In addition, the coding options were very static and very limited. Fortran, Cobol, C, Pascal, Basic and Assembly represented the vast majority of programming options. Finally (and this really surprised me at the time), if you sold a commercial software package that was compiled with the IBM compiler, you were required to purchase a distribution license from IBM that was priced based on the number of units sold. Collectively, these were significant barriers to any individual who wanted to even learn how to code.

This can be contrasted with the current software development environment, where there is a massive proliferation of languages and most of them are available as open source. The only real limitations or barriers to coding are personal ability and time. There have been many events that have led to this current state, but (IMO) there were two key events that played a significant part in this. The first of these was the development of Borland Turbo Pascal in 1983, which retailed for $49.99, with unlimited distribution rights for an additional $99.99 for any software produced by the compiler. Yes I bought a copy (v2), and later I bought Turbo Assembler, Delphi 1.0, and 3.0. This was the first real opportunity for an individual to learn a new computer language (or to program at all) at an approachable cost without pirating it.

To re-iterate, incumbent software development products were all based on a mainframe market, and mainframe enterprise prices and licensing, with clumsy workflows and interfaces, copy protection or security dongles. Borland’s Turbo Pascal integrated editor, compiler and linker into an IDE – which was an innovative concept at the time. It also had no copy protection and a very liberal license agreement referred to as the Book License. It was the first software development product targeted at end users in a PC type market rather than the enterprise that employed the end user.

The second major event that brought about the end of expensive software development tools was the GNU Compiler Collection (GCC) in 1987, with a stable release by 1991. Since then, GCC has become the default compiler engine for nearly all code development, enabling an explosion of languages, developers and open source software. It is the build engine that drives open source development.

In summary, by eliminating the barriers to software development (over the last 3 decades), software development has exploded and proliferated to a degree not even imagined when the PC was introduced.

IoT Convergence

In a manner very analogous to software development over the last 3 decades, IoT is being driven by a similar revolution in hardware development, hardware production, and software tools. One of the most significant elements of this explosion is the proliferation of System on a Chip (SoC) microprocessors. As recently as a decade ago (maybe a bit longer), the simplest practical microprocessor required a significant number of external support functions, which have now been integrated into a single piece of silicon. Today, there are microprocessors with various combinations of integrated UARTs, USB OTG ports, SDIO, I2C, persistent flash RAM, RAM, power management, GPIO, ADC and DAC converters, LCD drivers, self-clocking oscillator, and a real time clock – all for a dollar or two.

A secondary aspect of the hardware development costs is a result of the open source hardware movement (OSH), which has produced very low cost development kits. In the not so distant past, the going cost for a microprocessor development kit was about $500, and that market has been decimated by Arduino, Raspberry Pi, and dozens of other similar products.

Another convergent element of IoT comes from the open source software / hardware movement. All of the new low cost hardware development kits are based on some form of open source software packages. PCB CAD design tools like KiCad enable low cost PCB development. Projects like OSH Park enable low cost PCB prototypes and builds without lot charges or minimum panel charges.

A third facet of the hardware costs is based on the availability and lower costs of data link radios for use with microprocessors. Cellular, Wi-Fi, 802.15.4, Zigbee, Bluetooth and Bluetooth LE all provide various tradeoffs of cost, performance, and ease of use – but all of them have devices and development kits that are an order of magnitude lower in cost than a decade ago.

The bottom line is that IoT is not being driven by end use cases, or by one group, special interest or industry consortium. It is being driven by the convergent capabilities of lower cost hardware, lower cost development tools, more capable hardware / software, and the opportunity to apply them to whatever “thing” anybody is so inclined. This makes it really impossible to determine what it will look like as it evolves, and it also makes efforts by various companies to get in front of or “own” IoT seem unlikely to succeed. The best these efforts are likely to achieve is that they will dominate or drive some segment of IoT by virtue of the value they contribute to IoT. Overall, these broad driving forces and the organic nature of the IoT growth mean it is also very unlikely that it can be dominated or controlled, so my advice is to try and keep up and don’t get overwhelmed.

Personally, I am pretty excited about it.

PS – Interesting Note: Richard Stallman may be better known for his open source advocacy and the stalled Mach-based GNU Hurd kernel, but he was the driving developer behind GCC and Emacs, and GCC is probably as important as the Linux kernel in the foundation and success of the Linux OS and the open source software movement.