Category Archives: Systems Engineering

The Yin & Yang of Systems Security Engineering

Overview

Systems Security Engineering is Systems Engineering. Like any other engineered system, a security system follows a defined workflow as it progresses from concept through deployment: architectural development, design, implementation, design verification, and validation. This is the classic Systems Engineering pattern – a top down development process followed by a bottom up verification process.

However, in other ways Systems Security Engineering is very unlike Systems Engineering, because many security requirements are negative requirements, while typical systems engineering deals in positive requirements and functional goals. For example, a negative requirement may state “the security controls must prevent <some bad thing>”, where a positive requirement may state “the system must do <some functional thing>”. In addition, Systems Engineering is about functional reduction, where some higher level function is decomposed into a set of lower level functions – defining what the system does. Security Engineering is about how system functions are implemented and about what the system should not do, ideally with no impact on the overall function of the system. These two factors increase the complexity of top down security implementation, and make the bottom up verification much more difficult (since it is impossible to prove a negative).

In this post we are going to focus on how security systems are verified, and provide a few insights on how to verify system security more effectively.

Level 0 Verification: Testing Controls

As security engineers, we work to express every security requirement as a positive requirement, but that approach is fundamentally flawed since a logical positive corollary almost never exists for a negative requirement. The best we can hope for is to reduce the scope of the negative requirements. In addition, security architectures and designs are composed of controls which have specific functions. The result is that security verification often becomes a collection of tests that functionally verify security controls, and this is misinterpreted as verification of the overall system security. This is not to say such tests are unimportant (they are important), but they represent the most basic level of testing because they only exercise the functional features of specific security controls. They do not test any of the negative requirements that drive the controls. For example, suppose we start with a negative security requirement that states “implement user authentication requirements that prevent unauthorized access”. This could be implemented as a set of controls that enforce password length, complexity and update requirements for users. Those controls could then be tested to verify that they have been implemented correctly. However, if an attacker were able to get the hashed authentication datafile and extract the passwords with some ridiculous GPU based personal supercomputer or a password cracker running on EC2, that attacker would have access, since they can simply use the cracked passwords. The result is that the controls have been functionally tested (and presumably passed), but the negative requirement has not been satisfied (a small sketch after the takeaways below illustrates this gap). The takeaways are:

  • Testing the controls functionally is important, but don’t confuse that with testing the security of the system.
  • System security is limited by the security controls, and attackers are only limited by their creativity and capability. Your ability as a systems security engineer is directly correlated to your ability to identify threats and attack paths in the system.
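
To make the password example concrete, here is a purely illustrative Python sketch (the password, hash and wordlist are all invented for the example, and the “cracker” is a toy): every functional control check passes, yet an offline guessing attack against a leaked unsalted hash still recovers the credential – the negative requirement fails even though the control tests pass.

# Illustrative sketch only: a password that satisfies every functional control
# can still fall to an offline guessing attack if the hash database leaks.
import hashlib
import itertools

def passes_controls(pw: str) -> bool:
    """Functional checks of the kind a control test would verify."""
    return (len(pw) >= 10
            and any(c.isupper() for c in pw)
            and any(c.isdigit() for c in pw)
            and any(c in "!@#$%^&*" for c in pw))

# A leaked, unsalted hash (hypothetical breach artifact).
leaked_hash = hashlib.sha256(b"Summer2024!").hexdigest()

# A tiny dictionary plus common mutations stands in for a GPU rig or EC2 cracker.
words = ["password", "summer", "winter", "letmein"]
suffixes = ["2023!", "2024!", "123!", "!"]

for word, suffix in itertools.product(words, suffixes):
    guess = word.capitalize() + suffix
    if hashlib.sha256(guess.encode()).hexdigest() == leaked_hash:
        print(f"controls satisfied: {passes_controls(guess)}, yet cracked: {guess}")
        break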

Level 1 Verification: Red Teams / Blue Teams

The concept of Red Team versus Blue Team has evolved from military war gaming simulations, where the blue team represents the defenders and the red team represents the attackers. Within the context of military war gaming, this is a very powerful model since it encompasses both the static and dynamic capabilities of the conflict between the teams.

This model was adapted to system security assessment, where the blue team represents the system architects and / or system admins / ITSecOps team / system owners (collectively – the stakeholders), and the red team is some team of capable “attackers” that operates independently from the system design team. As a system testing model this brings some significant advantages.

First and foremost, system designers / system owners have a strong tendency to see the security of their system only through the lens of the controls that exist in the system. This is an example of Schneier’s Law, an axiom that states “any person (or persons) can invent a security system so clever that she or he can’t think of how to break it.” A blue team is that group that generally cannot think of a way to break their own system. A red team, being external to the system architects / system owners, is not bound by those preconceptions and is more likely to see the system in terms of potential vulnerabilities (and is much more likely to find them).

Secondary to that, since a red team is organizationally independent from the system architects / system owners, they are much less likely to be concerned about the impact of their findings on the project schedule, performance or bruised egos of the system stakeholders. In the case of penetration test teams, it is often a point of pride to cause as much havoc as possible within the constraints of their contract.

Penetration Testing teams are a form of red team testing, and work particularly well for some classes of systems where much of the system security is based on people. This is discussed in detail in the next sections.

Level 2 Verification: Black Box / White Box

In the simplest terms, black box testing is testing of a system where little or no information about the system is known by the testers. White box testing is where the maximum level of information about the system is shared with the testers.

From a practical viewpoint, white box testing can produce results much more quickly and efficiently, since the test team can skip past reconnaissance / discovery of the system architecture / design.

However, there are cases where white box testing will not give you complete / correct results, and black box testing will likely be more effective. There are two major factors that can make black box testing the better methodology.

The first factor is whether or not the implemented system actually matches the architecture / design. If the implementation has additions, deletions or modifications that do not match the documented architecture / design, white box testing may not identify those issues, since reconnaissance / discovery is not performed as part of white box testing. As a result, vulnerabilities associated with these modifications are not explored.

The second factor in determining if black box testing is the right choice is where the security controls are. Security controls can exist in the following domains:

  1. Management – These are the people policy, organizational and authority controls put in place to support the system security. Requiring all employees to follow the system security rules, or be fired and / or prosecuted, is a management control. A common failure of this rule is where corporate VPs share their usernames / passwords with their administrative assistants – and generally do not risk being fired. In most cases management controls are the teeth behind the rules.
  2. Operational – These controls are the workflow and process controls, intended to associate authority with accountability. An example is that all purchase orders are approved by accounting, and above a certain value they must be approved by a company officer. Another is to never share your username / password. These controls are people-centric (not enforced by technology), and in most cases they present the greatest vulnerabilities.
  3. Technical – These are the nuts and bolts of system security implementation. These are the firewalls, network Intrusion Detection Systems (IDS), network / host anti-virus tools, enforced authentication rules, etc. This is where 90% of the effort and attention of security controls is focused, and where a much smaller percentage of the failures actually occur.

When your system is well architected, with integral technical controls, but with a significant portion of the system security residing in operational (people) controls, black box testing is merited. Much as in the first factor, where the actual system may not reflect the architecture / design and black box testing is needed to discover those deviations, people controls are often soft and variable, and black box testing is needed to test that variability.

Penetration Test Teams

Penetration Test Teams (also known as Red Teams) are teams of systems security engineers with very specialized knowledge and skills in compromising different elements of target computer systems. An effective Red Team has the collective expertise needed to compromise most systems. When operating as a black box team, they behave in a manner consistent with real cyber attackers, but with management endorsement and the obligatory get out of jail free documentation.

At first glance, Red Teams operating in this way may seem like a very effective approach to validating the security of any system. As discussed above, that would be a flawed assumption. More specifically, Red Team testing is effective for specific types of system security architecture: where the actual system could deviate from the documented system, or where much of the system security is people-centric. By understanding where the security in a system is (and where it is not), we can determine whether black box testing is the more appropriate approach to system security testing.

Security Control Decomposition – Where “Security” Lives

In any security solution, system or architecture, it should be clear what makes the system secure. If it is not obvious which controls in a system provide the security, it is not really possible to assess and validate how effective the security is. In order to explore this question, we are going to look for parallels in another (closely related) area of cyber-security that is somewhat more mature than security engineering – cryptography.

Background: Historical Cryptography

In the dark ages of cryptography, the algorithm was the secret. The Caesar Cipher is a simple alphabet substitution cipher where plaintext is converted to ciphertext by shifting each letter some number of positions in the alphabet. Conversion back to plaintext is accomplished by reversing the process. This cipher is the basis of the infamous ROT13, which allows the plaintext to be recovered from ciphertext by applying the 13 step substitution a second time, since the basic Latin alphabet has 26 letters.

In modern terms, the algorithm of the Caesar Cipher is to shift substitute by some offset to encrypt (with wrap around at the end of the alphabet), and shift substitute by the same offset negatively to decrypt. The offset used would be considered the key for this method. The security of any cipher rests on which parts of the cipher provide the security. With the Caesar Cipher, knowledge of the method allows an attacker to simply try offsets until they succeed (a keyspace of 25 values). If the attacker knows the key but not the method, the problem appears more challenging than testing 25 values. Given this very trivial example, it would appear that the security of the Caesar Cipher is based more heavily on the algorithm than on the key. In a more practical sense, Caesar gained most of his security from the degree of illiteracy of his time.

In practice, Caesar used a fixed offset of three in all cases, with the result that his key and algorithm were fixed for all applications, which meant there was no real distinction between key and algorithm.
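
As an aside, a minimal Python sketch of the cipher makes the point (the code is mine, not from the original post): because the method is public and trivial, the entire keyspace of 25 offsets can be searched instantly.

# A minimal Caesar cipher sketch: the "algorithm" is public and trivial,
# so all of the security rests on the offset (the key) -- and with only
# 25 usable offsets, brute force is immediate.
def caesar(text: str, offset: int) -> str:
    out = []
    for ch in text.upper():
        if ch.isalpha():
            out.append(chr((ord(ch) - ord('A') + offset) % 26 + ord('A')))
        else:
            out.append(ch)
    return "".join(out)

ciphertext = caesar("ATTACK AT DAWN", 3)       # Caesar's fixed offset of three
print(ciphertext)                              # DWWDFN DW GDZQ

# An attacker who knows the method simply tries every key.
for key in range(1, 26):
    print(key, caesar(ciphertext, -key))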

Fast forward a few thousand years (give or take), and modern cryptography has a very clear distinction between key and algorithm. In any modern cipher, the algorithm is well documented and public, and all of the security is based on the keys used by the cipher. This is a really important development in cryptography.

Background: Modern Cryptography

The Advanced Encryption Standard (AES) was standardized by the US National Institute of Standards and Technology (NIST) around 2001. The process to develop and select an algorithm was essentially a bake off, starting in 1997, of 15 different ciphers, along with some very intensive and competitive analysis by the cryptography community. The process was transparent, the evaluation criteria were transparent, and many weaknesses were identified in a number of the candidate ciphers. The winning cipher (Rijndael) survived this process, and by being designated the cipher of choice by NIST it carries a lot of credibility.

Most importantly for this discussion is the fact that any attacker has access to complete and absolute knowledge of the algorithm, and even test suites to ensure interoperability, and this results in no loss of security to any system using it. Like all modern ciphers, all of the security of a system that uses AES is based on the key used and how it is managed.

Since the use of AES is completely free and open (unlicensed), over the last decade it has been implemented in numerous hardware devices and software systems. This enables interoperability between competitive products and systems, and massive proliferation of AES enabled systems. This underscores why it is so important to have a very robust and secure algorithm.
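
To make this concrete, here is a short sketch assuming the third-party Python cryptography package (my choice of library – the post does not reference one): the AES-GCM algorithm is completely public, and the only secrets in play are the key and the nonce handling.

# Sketch assuming the third-party "cryptography" package (pip install cryptography).
# The AES algorithm itself is public; everything hinges on the key and the nonce handling.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)    # all of the secrecy lives here
aesgcm = AESGCM(key)
nonce = os.urandom(12)                       # must never repeat for a given key

ciphertext = aesgcm.encrypt(nonce, b"the algorithm is public", None)
plaintext = aesgcm.decrypt(nonce, ciphertext, None)
assert plaintext == b"the algorithm is public"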

If some cipher were developed as a closed source algorithm with a high degree of secrecy, was broadly deployed, and a weakness / vulnerability was later discovered, that would compromise the security of every system using the cipher. That is exactly what happened with a stream cipher known as RC4. For details, refer to the Wikipedia reference below for RC4. The net impact is that the RC4 incident / story is one of the driving reasons for the openness of the AES standards process.

And now back to our regularly scheduled program…

The overall message from this discussion of cryptography is that a security solution can be viewed as a monolithic object, but viewed that way it cannot effectively be assessed or improved. The threats need to be identified, anti-patterns developed from those threats, and system vulnerabilities and attack vectors mapped. From that baseline, specific security controls can be defined and assessed for how well they mitigate those risks (a small sketch of that mapping follows the takeaways below).

The takeaways are:

  • System security is based on threats, vulnerabilities, and attack vectors. These are mitigated explicitly by security controls.
  • System security is built from a coordinated set of security controls, where each control provides a clear and verifiable role / function in the overall security of the system.
  • The process of identifying threats, vulnerabilities, attack vectors and mitigating controls is Systems Security Engineering. It also tells you “where your security is”.
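
As a purely hypothetical illustration of that mapping (the threat, vector and control names below are invented, not drawn from any particular framework), a minimal sketch might look like:

# Hypothetical sketch of a threat -> attack vector -> control mapping.
# All names below are invented for illustration, not taken from any framework.
threat_model = {
    "credential theft": {
        "attack_vectors": ["phishing", "offline hash cracking"],
        "controls": ["2FA on all remote access", "salted password hashing", "user awareness training"],
    },
    "data exfiltration": {
        "attack_vectors": ["compromised endpoint", "malicious insider"],
        "controls": ["egress filtering", "least-privilege access", "audit logging"],
    },
}

# "Where the security lives": every threat should trace to at least one control.
for threat, detail in threat_model.items():
    status = ", ".join(detail["controls"]) if detail["controls"] else "NO CONTROL (gap)"
    print(f"{threat} via {detail['attack_vectors']} -> mitigated by: {status}")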

Bottom Line

In this post we highlighted a number of key points in System Security Engineering.

  • Systems Security engineering is like Systems engineering in that (done properly) it is based on top down design and bottom up verification / validation.
  • Systems Security engineering is not like Systems engineering in that its requirements are usually not functional, and are expressed as negative requirements that defy normal verification / validation.
  • Security assessments can be based on red team / blue team assessments and it can be done using a white box model / black box model, and the most effective approach will be based on the nature of the system.

As always, I have provided links to interesting and topical references (below).

References

 

2016 Personal Security Recommendations

Overview

There are millions of criminals on the Internet and billions of potential victims. If you have not been attacked or compromised yet, it is probably due to the numbers – not your personal security habits.

I have a passion for cyber security. Effective cyber security is a system problem with no easy or obvious solutions, and the current state of the art leaves plenty of room for improvement. I also think that every person who uses the Internet should have a practical understanding of the risks and what reasonable steps they should take to protect themselves.

For these reasons, any conversation I am in tends toward cyber security, and I am occasionally asked what my recommendations are for personal cyber security. When not asked, I usually end up sharing my opinions anyway. My answer is generally qualified by the complexity of defending against the threats that are most ‘real’, but for most people we can make some generalizations.

The list below is what I think makes the most sense at this time. Like all guidance of this nature, the shelf life of this may be short. Before we can look at actionable recommendations, we need to really look at the threats we face. The foundation for any effective security recommendation must be to look at your threat space.

  1. Threats – These are realistic and plausible threats to your online accounts and data, against which you have realistic and plausible mitigations.
    1. Cyber Criminals – Criminals who are trying to monetize whatever they can from people on the Internet. There are many ways this can be accomplished, but in most cases it involves getting access to your online accounts or installing malware on your computer. This threat represents 99.5% of the entire threat space most users have (note – this is a made up number, but it is probably not too far off).
    2. Theft or Loss – Criminals who steal your computer or phone for the device itself. If they happen to gain access to personal information on the device that enables extortion or other criminal access to your online accounts, that is a secondary goal. This threat represents 90% of the remaining threat space (so 90% of 0.5%) for laptops and smartphones (note – this number is also made up, with the same caveats).
    3. Computer Service Criminals – Anytime you take a phone / computer in for service, there is a risk that somebody copies off more interesting information for personal gain. It really does happen – search “geek squad crime” for details.
  2. Non-Threats – These are threats that are less likely, less plausible or simply unrealistic to defend against.
    1. NSA / FBI / CIA / KGB / GRU / PLA 61398 – Notwithstanding the current issue between the FBI and Apple (which is not really about technical capability but about legal precedent), big government agencies (BGAs) have massive resources and money that they can bring to bear if you draw their attention. So my recommendation is that if you draw the attention of one or more BGAs, get a lawyer and spend some time questioning the personal choices that got you there.

In order to effectively apply security controls to these threats, it is critical to understand which threat each control protects against, with some quantifiable understanding of relative risk. In other words – it is most effective to protect against the threats that are most likely.

Of the threats identified above, we identified online threats, device theft threats and computer service threats. For most people, the total number of times a computer / smart phone has been serviced or stolen can be counted on one hand. Comparatively, your online accounts are available 24 x 365 (roughly 8,760 hours a year of exposure), and accessible by any criminal in the world with Internet access. Simple math should show you that protecting yourself online is at least 100x more critical than protecting against any other threat identified above.

Threat Vectors

In order to determine the most effective security controls for the given threats, it is important to understand the threat vectors for each threat. Threat vectors define how systems are attacked for a given threat. Fortunately, for the threats identified above the vectors are fairly simple.

In reverse order:

  1. Computer Service Threat: As part of the service process, you (the system owner) provide the device username and password so that the service people can access the operating system. This also happens to give these same service people fairly unlimited access to the personal files and data on the system, which they have been known to harvest for their personal gain. Keeping files of this nature in a secure container can reduce this threat.
  2. Theft or Loss: In recent years criminals have discovered that the information on a computer / phone may be worth much more than the physical device itself. In most cases, stolen computers and phones are harvested for whatever personal information can be monetized and then sold to a hardware broker. If your system is not encrypted, all of the information on the system is accessible even if you have a complex password. Encryption of the system is really the only protection from this threat.
  3. Cyber Criminals: This is the most complex of the threats, since there are always at least two paths to the information they are looking for. Remember that the goal of this threat is to compromise your online accounts, which means they can target the accounts directly on the Internet. However, most online Internet companies are fairly good at detecting and blocking direct attacks of this nature. So the next most direct path is to compromise a device with malware and harvest the information from this less protected device. The nature of this vector means it is also the most complex to protect against. The use of firewalls, anti-virus / anti-malware, ad blockers, more secure browsers, secure password containers, and two factor authentication all contribute to blocking this attack vector. This layering of security tools (controls) is also called “defense in depth”.

Actionable Recommendations [ranked]

  1. (Most Critical) Use Two Factor Authentication (2FA) for critical online accounts (a short sketch of how these one-time codes work follows this list).
    1. Google: Everybody (maybe not you) has a Google account, and in many cases it is your primary email account. As a primary email account, it is the target account for resetting your password on most other accounts. It is the one account to rule them all for your online world, and it needs to be secured appropriately. Use Google Authenticator on your smart phone for 2FA.
    2. Amazon: In the global first world, this is the most likely online shopping account everybody (once again – maybe not you) has. It also supports Google Authenticator for 2FA.
    3. PayPal: PayPal uses an SMS code as a second authentication factor. It is not as convenient as Google Authenticator, but it is better than 1FA.
    4. Device Integration: Apple, Google and Microsoft are increasingly integrating the devices in their product ecosystems into their online systems. This increases the capabilities of these devices, and it also increases the online exposure of your accounts.
      1. Microsoft Online: Enable 2FA. Microsoft unfortunately does not integrate with Google Authenticator, but does provide its own authentication app for your smart phone.
      2. Apple iTunes: Require authentication for any purchases and enable 2FA.
      3. Google Play: Require authentication for any purchases.
    5. Banks, Credit Unions and Credit Accounts – These groups are doing their own thing for 2FA. If your banks, credit unions or credit accounts do not have some form of 2FA, contact them and request it. Or move your account.
  2. Password Manager: Use one, and offline is better than online. Remember, the cloud is just somebody else’s computer (and may represent more risk than local storage). I personally recommend KeePass since it is open source, supports many platforms, is actively supported and free.
  3. Never store credit card info online: There are many online service providers that insist each month that they really want to store my credit card information in their systems (I am talking to you, Comcast and Verizon), and I have to uncheck the save info box every time. At some point in the past, I asked a few of these service providers (via customer service) if agreeing to store my information on their servers meant that they assumed full liability for any and all damages if they were compromised. The lack of any response indicated to me that the answer is “probably not”. So if they are not willing to take responsibility for that potential outcome, I don’t consider it reasonable to leave credit card information in their systems.
  4. Encrypt your smartphone: Smart phones are becoming the ultimate repository of personal information that can be used to steal your identity / money, and nearly all smart phones have provisions for encryption and password / PIN access. Use them. They really do work and are effective. It is interesting to note that most PIN codes are 4 to 6 digits, and most unlock patterns (when reduced to bits) are comparable to 4 digit (or shorter) codes.
  5. Encrypt your laptop: Your second most portable device is also the second most likely to be stolen or lost. If you have a Windows laptop, use BitLocker for system encryption. It is well integrated and provides a decent level of data security. In addition, I would also recommend installing VeraCrypt, the open source successor to TrueCrypt. For that extra level of assurance, you can create a secure container on your device or removable drive to store data requiring greater security / privacy.
  6. Password protect your Chrome profile: I personally save usernames and passwords in my Chrome profile purely for convenience. This allows me to go to any of my systems and log in easily to my regular sites. It also means that my profile represents a tremendous security exposure. So I sync everything and secure / encrypt it with a passphrase. Chrome offers the option to secure / encrypt with your Google Account credentials, but I chose to use a separate passphrase to create a small barrier between my Google account and my Chrome sync data.
  7. Ad blocker / antivirus / firewall / Chrome: Malware is the most likely path to having your computer compromised. This can happen through phishing emails, or through a website or popup ads. Browsers are more effective at stopping malware than they used to be, and Chrome updates silently and continuously, decreasing your exposure risk. Chrome is the browser I recommend. In addition, I use the Adblock Plus plugin in Chrome. Lastly, I am using Windows 10, so I keep Windows Defender fully enabled and updated. Pick your favorite anti-virus / anti-malware product; Defender just happens to be included and does not result in a self inflicted denial of service (McAfee anyone?).
  8. Use PayPal (or equivalent) when possible: PayPal (and some other credit providers) manage purchases more securely online by doing one time transactions for purchases rather than simply passing on your credit credentials. This limits the seller to the actual purchase, and greatly reduces the risk that your card will be compromised.
  9. (Least Critical) VPN: If you have a portable device and use any form of public Wi-Fi, there is a risk that your information could be harvested on that first hop to the Internet. VPNs will not make you anonymous, VPNs are not Tor, but an always on VPN can provide some security for this first hop. I use an always on VPN that I was able to get for $25 / 5 years. It may not provide the most advanced / best security / privacy features available, but it is probably good enough for realistic threats.
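
For the curious, the codes that Google Authenticator and similar apps display are standard time-based one-time passwords (TOTP, RFC 6238). Below is a minimal sketch using only the Python standard library; the Base32 secret is a made-up example, not a real account secret.

# Minimal TOTP (RFC 6238) sketch using only the standard library.
# The Base32 secret below is a made-up example, not a real account secret.
import base64, hashlib, hmac, struct, time

def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(time.time()) // period                  # 30-second time step
    msg = struct.pack(">Q", counter)                      # 8-byte big-endian counter
    digest = hmac.new(key, msg, hashlib.sha1).digest()    # HOTP uses HMAC-SHA1
    offset = digest[-1] & 0x0F                            # dynamic truncation
    code = (struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF) % (10 ** digits)
    return str(code).zfill(digits)

print(totp("JBSWY3DPEHPK3PXP"))   # same code the authenticator app would show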

Additional Notes

For those who are curious, there are some security tools that purport to provide security against the big government agencies. However, it is important to note that even if these tools have been compromised by those agencies, it is very unlikely the agencies would admit it, since it is more useful to have people believe they are protected by these tools.

  1. VeraCrypt: Provides standalone encryption capability for files and storage devices that is nearly unbreakable. Like any encryption, the real weakness is the key and how you manage it.
  2. KeePass: Uses standalone encryption for passwords and other credential information. Once again, it is only as good as the password credentials you use.
  3. Signal / Private Call by Open Whisper: Secure messaging and voice call apps for your smart phone. The usefulness of these is directly related to who you are chatting / talking with, since both parties have to buy into the additional effort to communicate securely.

Bottom Line

Security should do many things, but the most important elements for practical security are:

  1. It should protect against real threats in an effective manner. The corollary: it should not protect against imaginary / non-existent threats.
  2. It should be as transparent / invisible / easy to use as possible.
  3. It should be good enough that you are an obviously harder target than the rest of the herd (e.g. there is no need to be faster than the bear chasing you, just faster than the guy next to you).

Remember – the most effective security is the security that is used.

Note – I apologize for my lack of tools for Apple platforms, but since I do not own one, they are much more difficult for me to research / use.

References

IoT and Stuff – The Evolution

Overview

This is the first of several posts I expect to do on IoT, including systems design, authentication, standards, and security domains. This particular post is an IoT backgrounder from my subjective viewpoint.

Introduction

The Internet of Things (IoT) is a phenomenon that is difficult to define and difficult to scope. The reason it is difficult to define is that it is rapidly evolving, and it is currently characterized by the foundational capabilities that IoT implementations provide.

Leaving the marketing hyperbole behind, IoT is the integration of ‘things’ into what we commonly refer to as the Internet. Things are anything that can support sensors and/or controls, an RF network interface, and most importantly – a CPU. This enables ubiquitous control / visibility into something physical on the network (that wasn’t on the network before).

IoT is currently undergoing a massive level of expansion. It is a chaotic expansion without any real top down or structured planning. This expansion is (for the most part) not driven by need, but by opportunity and the convergence of many different technologies.

Software Development Background

In this section, I am going to attempt to draw a parallel to IoT from the recent history of software development. Back at the start of the PC era (the 80s), software development carried high costs for compilers, linkers, test tools, packagers, etc. This pricing approach was inherited from the mainframe / centralized computer system era, where these tools were purchased and licensed by “the company”. The cost of an IBM Fortran compiler and linker for the PC in the mid 80s was over $700, and libraries were $200 each (if memory serves me). The coding options were also very static and very limited: Fortran, Cobol, C, Pascal, Basic and Assembly represented the vast majority of programming options. In addition (and this really surprised me at the time), if you sold a commercial software package that was compiled with the IBM compiler, you were required to purchase a distribution license from IBM that was priced based on the number of units sold. Collectively, these were significant barriers to any individual who wanted to even learn how to code.

This can be contrasted with the current software development environment, where there is a massive proliferation of languages, most of them available as open source. The only real limitations or barriers to coding are personal ability and time. There have been many events that led to this current state, but (IMO) two key events played a significant part. The first was the release of Borland Turbo Pascal in 1983, which retailed for $49.99, with unlimited distribution rights for an additional $99.99 for any software produced by the compiler. Yes, I bought a copy (v2), and later I bought Turbo Assembler, Delphi 1.0, and 3.0. This was the first real opportunity for an individual to learn a new computer language (or to program at all) at an approachable cost without pirating the tools.

To re-iterate, incumbent software development products were all based on a mainframe market, with mainframe enterprise prices and licensing, clumsy workflows and interfaces, and copy protection or security dongles. Borland’s Turbo Pascal integrated the editor, compiler and linker into an IDE – an innovative concept at the time. It also had no copy protection and a very liberal license agreement referred to as the Book License. It was the first software development product targeted at end users in a PC market rather than at the enterprise that employed those end users.

The second major event that brought about the end of expensive software development tools was the GNU Compiler Collection (GCC) in 1987, with a stable release by 1991. Since then, GCC has become the default compiler engine for nearly all code development, enabling an explosion of languages, developers and open source software. It is the build engine that drives open source development.

In summary, by eliminating the barriers to software development (over the last 3 decades), software development has exploded and proliferated to a degree not even imagined when the PC was introduced.

IoT Convergence

In a manner very analogous to software development over the last 3 decades, IoT is being driven by a similar revolution in hardware development, hardware production, and software tools. One of the most significant elements of this explosion is the proliferation of System on a Chip (SoC) microprocessors. As recently as a decade ago (maybe a bit longer), the simplest practical microprocessor required a significant number of external support functions, which have now been integrated onto a single piece of silicon. Today, there are microprocessors with various combinations of integrated UARTs, USB OTG ports, SDIO, I2C, persistent flash, RAM, power management, GPIO, ADC and DAC converters, LCD drivers, a self-clocking oscillator, and a real time clock – all for a dollar or two.

A secondary aspect of lower hardware development costs is a result of the open source hardware (OSH) movement, which has produced very low cost development kits. In the not so distant past, the going cost for a microprocessor development kit was about $500; that market has been decimated by Arduino, Raspberry Pi, and dozens of other similar products.

Another element of the IoT convergence comes from the open source software / hardware movement. All of the new low cost hardware development kits are based on some form of open source software packages. PCB CAD design tools like KiCad enable low cost PCB development, and services like OSH Park enable low cost PCB prototypes and builds without lot charges or minimum panel charges.

A third facet of the lower hardware costs is the availability and falling cost of data link radios for use with microprocessors. Cellular, Wi-Fi, 802.15.4, Zigbee, Bluetooth and Bluetooth LE all provide various tradeoffs of cost, performance, and ease of use – but all of them have devices and development kits that are an order of magnitude cheaper than a decade ago.

The bottom line is that IoT is not being driven by end use cases, or by one group, special interest or industry consortium. It is being driven by the convergent capabilities of lower cost hardware, lower cost development tools, more capable hardware / software, and the opportunity to apply these to whatever “thing” anybody is so inclined to connect. This makes it really impossible to determine what IoT will look like as it evolves, and it also makes efforts by various companies to get in front of or “own” IoT seem unlikely to succeed. The best these efforts are likely to achieve is to dominate or drive some segment of IoT by virtue of the value they contribute. Overall, these broad driving forces and the organic nature of IoT growth mean it is very unlikely that it can be dominated or controlled, so my advice is to try to keep up and not get overwhelmed.

Personally, I am pretty excited about it.

PS – Interesting note: Richard Stallman may be better known for his open source advocacy and the unfinished GNU Hurd kernel (built on the Mach microkernel), but he was the driving developer behind GCC and Emacs, and GCC is probably as important as the Linux kernel to the foundation and success of the Linux OS and the open source software movement.

References

A Brief Introduction to Security Engineering

Background

One of the great myths is that security is complicated, hard to understand, and must be opaque to be effective. This is mostly fiction, perpetrated by people who would rather you did not question the security theater they are creating in lieu of real security, by security practitioners who don’t really understand what they are doing, or by those who are trying to accomplish something in their own interests under the false flag of security. This last one is why so many government “security” activities are not really about security, but about control – which is not the same thing. Designing and implementing security can be complex, but understanding security is much easier than it is generally portrayed.

Disclaimer – This is not a comprehensive or exhaustive list / analysis. It is a brief introduction that touches on a few of the most practical elements of security engineering.

Security Axioms

Anytime I look at systems security, there are a few axioms I use to set the context, limit the scope and measure the effectiveness. These are:

  1. Perfect security is unachievable, and any practical security is the result of some cost driven tradeoff.
  2. Defining and understanding your threat model is step zero of any security solution. If you don’t know who you are defending against, the solution will not fit.
  3. Defining and understanding success. This means understanding what you are trying to protect and what exactly protecting those elements means.
  4. Defending a system is more costly / difficult than attacking that same system. Attackers only need to be successful once, but defenders need to be successful every time.
  5. Security based on secrecy is weaker than security based on strength. Closed security solutions are more likely to contain flaws that weaken the security than open security solutions. Yes – this has been validated.

The first of these is a recognition that security is about a conflict between a system / information defender and an attacker of that system. Somebody is trying to take something of yours and you want to stop them. Each of these two parties can use different approaches and tools to do this, with increasing costs – where costs are money, time, resources, or the risk of being caught / punished. This first axiom simply states that if an attacker has infinite time, money and resources, and zero risk, your system will be compromised because you are outgunned. For less enabled attackers, the most cost effective security is that which is just enough to discourage them so they move on to an easier target. This of course leads to understanding your attacker, and the next axiom – know your threat.

The second axiom states that any security solution is designed to protect from a certain type of threat. Defining and understanding the threats you are defending against is foundational to security design, since it will drive every aspect of the system. A security system to keep your siblings, parents or children out of your personal data is completely different from one designed to keep cyber extortionists out of your Internet accounts.

The third axiom is based on the premise that most of what your system / systems are doing requires minimal protection (depending on the threat model), but some parts require significant protection. For example – my Internet browsing history is not that important compared with my password and account access file. I have strong controls on my passwords and account access (e.g. KeePass), while my browsing history is only behind a system password. Another way to look at this is to imagine what the impact would be if a given element were compromised – that should guide the level of protection for that item.

The fourth axiom is based on the premise that the defender must successfully defend every vulnerability in order to be successful, but the attacker only has to succeed against one vulnerability, one time. This is also why complex systems are more prone to compromise – greater complexity leads to more vulnerabilities (since there are more places for gremlins to hide).

The fifth is perhaps the least obvious axiom of this list. Simply put, the strength of a security control should not depend on its design being secret. Encryption protocols are probably the best example of how this works. Most encryption protocols of the last few decades were developed and publicized within the peer community. Invariably, weaknesses are found and corrected, improving the quality of the protocol and reducing the risk of an inherent vulnerability. These algorithms and protocols are published and well known, enabling interoperability and third party validation, which reduces the risk of vulnerabilities due to implementation flaws. In application, the security of the encryption is based solely on the keys used by the users. The favorite counter example is from the world of traditional pin tumbler locks, in which locksmith guilds attempted to keep their designs / architectures secret for centuries, and laws were passed making it a crime to possess lock picks or to know how to pick a lock unless you were a locksmith. Unfortunately, these laws did little to impede criminals, and it became an arms race between lock makers, locksmiths and criminals, with the users of locks kept fairly clueless. Clearly, of the lock choices available to a user, some locks were better, some were worse, and some were nearly useless – and this secrecy model of security meant that users did not have the information to make that judgement call (and in general they still don’t). The takeaway – if security requires that the design / architecture of the system be kept secret, it is probably not very good security.

Threat Models

In the world of Internet security and information privacy, there are only a few types of threat models that matter. This is not because there are only a few threats, but because the methods of attack and the methods of defense are common across them. Generally it is safe to ignore threat distinctions that don’t affect how the system is secured. This list includes:

  1. Immediate Family / Friends / Acquaintances – Essentially people who know you well and have some degree of physical access to you or the system you are protecting.
  2. Proximal Threats – Threats you do not know, but who are physically / geographically close to you and the system you are protecting.
  3. Cyber Extortionists – A broad category of cyber attackers whose intent is to profit by attacking and compromising your information. This group generally targets individuals, but not a specific individual – they look for easy targets.
  4. Service Compromise – Threats who attack large holders of user information – ideally credit card information. This group is looking for bulk information and is not targeting individuals directly.
  5. Advanced Persistent Threats (APTs) – Well equipped, well resourced, highly capable and persistent. These attackers are generally supported by governments or large businesses, and their targets are usually equally large. This group plans and coordinates their attacks with a specific purpose.
  6. Government (NSA / CIA / FBI / DOJ / DHS / etc.) – Currently the biggest, baddest threat. They have the most advanced technical resources and the most money, and they use National Security Letters when those are not enough. They collect data in bulk, and they target individuals.

From a personal security perspective, we are looking at the threats most likely to concern any random user of Internet services – you. In that context, we can dismiss a couple of these quickly. Let’s do this in reverse order:

Government (NSA et al) – If they are targeting you specifically, and you use Internet services – you are in need of more help than I can provide in this article. If your data is part of some massive bulk data collection – there is very little you can do about that either. So in either case, in the context of personal data security for Joe Internet User, don’t worry about it.

Advanced Persistent Threats (APTs) – Once again, much like the NSA, it is unlikely you would be targeted specifically, and if you are, your needs are beyond the scope of this article. So – although you may be concerned about this threat, there is very little you can do to stop it.

Service Compromise – I personally pay all of my bills online, and every one of these services wants to store my credit card in their database. Now the question you have to ask is: if (for example) the Verizon customer database is compromised, and somebody steals all of that credit card information (tens of millions of card numbers) and runs up hundreds of millions in charges – is Verizon (or any company in that position) going to take full responsibility? Highly unlikely – and that is why I do not store my credit information on their systems. If they are not likely to accept responsibility for that outcome, should you trust them with your credit?

Cyber Extortionists – The most interesting and creative of all these threat classes. I continue to be amazed at every new exploit I hear about. Examples include mobile apps that covertly call money transfer numbers (e.g. 1-900 numbers in the US), or apps that covertly buy other apps. Much like the salami slicing attacks (made famous in the movie Office Space), each individual attack represents some very small financial gain, but the hope is that collectively they add up to significant money.

Proximal Threats – If somebody can physically take your laptop, tablet or phone, they have a really good shot at all of the information on that device. Many years ago, I had an iPhone stolen from me on the Washington DC metro. I had not enabled the screen lock, and I had the social security numbers / birthdays of my entire family in my contacts. And yes, there were fraudulent attempts to get credit based on this information within hours – unsuccessfully. I now use (and recommend everybody use) some device access lock, and encrypt very sensitive information in some form of locker – passwords / accounts and social security numbers in KeePass, and sensitive file storage in TrueCrypt. These apps are free and provide significant protection, just in case. Remember, physical control / access to a device is its own special type of attack.

Friends / Family / Acquaintances – In most cases, the level of security needed to protect from this class of threat is small. More importantly, it is crucial to understand what you are trying to protect, why you are protecting it, and what your recovery options are. To repeat – what are your recovery options? It is very easy to secure your information and then forget the password / passphrase or corrupt your keyfile. Compromise of private data in this context is orders of magnitude less likely than locking yourself out of your data – permanently. Yes, I have done this, and family photos on a locked TrueCrypt partition cannot be recovered in your lifetime. So when you look at security controls to protect from this threat model, look for built in recovery capabilities and only protect what is necessary to protect.

Conclusions

Fundamentally security engineering is about understanding what you are trying to protect, who / what your threat is, and determining what controls to use to impede the threat while not impeding proper function. Understanding your threat is the first and most important part of that process.

Lastly – I would encourage everybody who finds this the least bit interesting to read Bruce Schneier’s blog and his books. He provides a very approachable and coherent perspective on IT security / security engineering.

Links

Software: Thoughts on Reliability and Randomness

Overview

Software reliability and randomness are slippery concepts that may be conceptually easy to understand, but are hard to pin down. As programmers, we can write the equivalent of ‘hello world’ in dozens of languages on hundreds of platforms, and once the program is functioning – it is reliable. It will produce the same results every time it is executed. Yet systems built from thousands of modules and millions of lines of code function less consistently than our hello world programs – and are functionally less reliable.

As programmers we often look for a source of randomness in our programs, and it is hard to find. Fundamentally we see computers as deterministic systems without any inherent entropy (for our purposes – randomness). For lack of true random numbers we generate pseudo random numbers (PRNs), which are not really random. They are used in generating simulations and in generating session keys for secure connections, and this lack of true randomness in computer generated PRNs has been the source of numerous security vulnerabilities.
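
A quick sketch illustrates just how deterministic a seeded generator really is (standard library only; the seed value is arbitrary):

# Sketch: a seeded PRNG is fully deterministic -- the same seed always yields
# the same "random" sequence, which is why raw PRNG output is a poor source
# of session keys.
import random
import secrets

random.seed(42)
first = [random.randint(0, 99) for _ in range(5)]

random.seed(42)
second = [random.randint(0, 99) for _ in range(5)]

print(first == second)        # True: identical sequences from identical seeds

# For keys and other security material, draw from the OS entropy pool instead.
print(secrets.token_hex(16))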

In this post I am going to discuss how software can be “unreliable”, deterministic behavior, parallel systems / programming, how modern computer programs / systems can be non-deterministic (random), and how that is connected to software reliability.

Disclaimer

The topics of software reliability, deterministic behavior, and randomness in computers form a field that is massively deep and complex. The discussions in this blog are high level and lightweight, and I make some broad generalizations and assertions that are mostly correct (if you don’t look too closely) – but hopefully they still serve to illustrate the discussion.

I also apologize in advance for this incredibly dry and abstract post.

Software Reliability

Hardware reliability, or more precisely hardware failure, most often occurs when some device in a system breaks (the smoke comes out), and the system no longer functions as expected. Software failures do not involve broken hardware or devices. Software failures are based on the concept that there are a semi-infinite number of paths (or states) through a complex software package, and the vast majority of them will result in the software acting and functioning as expected. However, there are some paths through the code that will result in the software not functioning as expected. When this happens, the software and system are doing exactly what the code is telling them to do – so from that perspective, there is no failure. However, the software is not doing what is expected – which we interpret as a software failure, and which provides a path to understanding the concept of software reliability.

Deterministic Operation

Deterministic operation in software means that a given program with a given set of inputs will function in exactly the same manner every time it is executed – without any unexpected behaviors. For the most part this characteristic is what allows us to effectively write software. If we carry this further and look at software on simple (8 / 16 bit) microprocessors / microcontrollers, where the software we write runs exclusively on the device, operation is very deterministic.

In contrast – on a modern system, our software exists in a relatively high level on top of APIs (application programming interfaces), libraries, services, and a core operating system – and in most cases this is a multitasking/multi-threaded/multi-cored environment. In the world of old school 8 / 16 bit microprocessors / microcontrollers, none of these layers exist. When we program for that environment, our program is compiled down to machine code that runs exclusively on that device.

In this context, not only does our program operate deterministically in how the software functions, but the timing and interactions external to the microprocessor are deterministic. In the context of modern complex computing systems, this is generally not the case. In any case, the very deterministic operation of software on a dedicated microprocessor makes it ideal for real world interactions and embedded controllers. This is why this model is used for toasters, coffee pots, microwave ovens and other appliances. The system is closed – meaning its inputs are limited to known and well defined sources, and its functions are fixed and static – and generally these systems are incredibly reliable. After all, how often is it necessary to update the firmware on an appliance?

If this were our model of the world of software and software reliability, we would be ignoring much of what has happened in the world of computing over the last decade or two. More importantly – we need to understand that this model is an endpoint, not the whole story, and to understand where we are today we need to look further.

Parallel Execution

One of the most pervasive trends in computing over the last decade (or so) is the transition from increasingly fast single threaded systems to increasingly parallel systems. This parallelism is accomplished through multiple computing cores on a single device and through multiple processing threads on a single core, which are both mechanisms to increase the amount of work a processor can produce by supporting concurrently running programs. A typical laptop today can have two to four cores and support two hardware threads per core, resulting in 8 relatively independent processes running at the same time. Servers with 16 to 64 cores, which would have qualified as (small) supercomputers a decade ago, are now available off the shelf.

Parallel Programming: the Masochistic Way

Now – back in the early 80s, as an intern at Cray, my supervisor spent one afternoon trying to teach me how Cray computers (at that time) were parallel coded. As one of the first parallel processing systems, and as systems where every cycle was expensive, much of the software was parallel programmed in assembly code. The process is exactly what you would imagine. There was a hardware scheduler that would transfer data to / from each processor to main memory every so many cycles. In between these transfers the processors would execute code. So if the system had four processors, you would write assembly code for each processor to execute some set of functions that were time synchronized every so many machine cycles, with NOPs (no operation) occasionally used to pad the time. NOPs were considered bad practice since cycles were precious and not to be wasted on a NOP. At the time, it was more than I wanted to take on, and I was shuffled back to hardware troubleshooting.

Over time I internalized this event, and learned something about scalability. It was easy to imagine somebody getting very good at writing two (maybe even 3 or 4) dissimilar time synchronous parallel programs. Additionally, since many programs also rely on very similar parallel functions, it was also easy to imagine somebody getting good at writing programs that did the same thing across a large number of parallel processors. However, it is much harder to imagine somebody getting very good at writing dissimilar time synchronous parallel programs effectively over a large number of parallel processors. This is in addition to the lack of scalability inherent in assembly language.

Parallel Programming – High Level Languages

Of course in the 80s, or even the 90s, most computer programmers did not need to be concerned with parallel programming; every operating system was single threaded, and the argument of the day was cooperative multitasking versus preemptive multitasking. Much like the RISC vs CISC argument of the prior decade, these issues were rendered irrelevant by the pace of processor hardware improvements. Now many of us walk around with the equivalent of that Cray supercomputer in our pockets.

In any case, the issue of parallel programming was resolved in two parts. The first was the idea of a multitasking operating system with a scheduler – the core function that controls which programs are running (and how long they run) in parallel at any one time. The second was the development of multi-threaded programming in higher level languages (without the time synchronization of the early Crays).

Breaking Random

Finally getting back to my original point… The result today is that all modern operating systems have some privileged block of code – the kernel – running continuously, along with a number of other privileged services that run the OS, including the memory manager and the task scheduler.

The key to this whole story is that these privileged processes manage access to shared resources on the computer. Of these, the task scheduler is the most interesting – mostly due to the arcane system attributes it uses to determine which processes have access to which core / thread on the processor. This is one of the most complex aspects of a multitasking / multi-core / multithreaded (hardware) system. The attributes the scheduler looks at include affinity flags that processes use to indicate core preference, priority flags, resource conflicts and hardware interrupts.

The net result is that if we take any set of processes on a highly parallel system, there are some characteristics of this set that are sufficiently complex, and impacted by enough unknown external elements, that they are random – truly random. For example, suppose we create three separate processes that each generate a pseudo random number sequence based on some seed (using unique values in each), and point all of them at some shared memory resource, where the value is read as input and the output is written back. Since the operation of the task scheduler means that the order of execution of these three threads is completely arbitrary, it is not possible to determine the resulting sequence deterministically – the result would be something more random than a PRNG. A not so subtle (and critical) assumption is that the system has other tasks and processes it is managing, which directly impact the scheduler, introducing entropy into the system.
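
A simplified stand-in for that example – threads instead of processes, and my own invented worker, so a sketch rather than the exact setup described above – shows the effect: each thread is individually deterministic, but the interleaving the scheduler produces changes from run to run.

# Each worker thread is deterministic on its own, but the order in which their
# appends land in the shared list is decided by the OS / interpreter scheduler,
# so the interleaving differs from run to run.
import threading

shared = []

def worker(name: str) -> None:
    for _ in range(50000):        # enough work that the scheduler switches mid-loop
        shared.append(name)

threads = [threading.Thread(target=worker, args=(n,)) for n in "ABC"]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Collapse consecutive repeats to expose the interleaving pattern.
pattern = [shared[0]]
for ch in shared[1:]:
    if ch != pattern[-1]:
        pattern.append(ch)
print("".join(pattern), "-", len(pattern) - 1, "scheduler hand-offs observed")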

Before we go on, let’s take a closer look at this. Note that if some piece of software functions the same (internally and externally) every time it executes, it is deterministic. If that same piece of software functions differently based on external factors that are unrelated to the software itself, it is non-deterministic. Since kernel level resource managers (memory, scheduler, etc.) function in response to system factors and to factors from each and every running process, from the perspective of any one software package certain environmental factors are non-deterministic (i.e. random). In addition to the scheduling and sequencing aspects identified above, memory allocations will also be granted or moved in a similar way.

Of course this system level random behavior is only half the story. As software packages are built to take advantage of gigabytes of RAM and lots of parallel execution power, they are becoming a functional aggregation of dozens (to hundreds) of independently functioning threads or processes, which introduces a new level of sequencing and interdependencies that depend on the task scheduler.

Bottom line – any sufficiently complex asynchronous and parallel system will have certain non-deterministic characteristics, based on the number of independent sources that influence access to / use of shared system resources. Layer on the complexity of parallel high level programming, and certain aspects of program operation become very non-deterministic.

Back to Software Reliability

Yes, we have shown that both multitasked parallel hardware and parallel programmed software contribute some non-deterministic behavior in operation, but we also know that for the most part software is relatively reliable. Some software is better and some is worse, but there is clearly some other set of factors in play.

The simple and not very useful answer is “better coding” or “code quality”. A slightly more insightful answer is that code that depends on or uses some non-deterministic feature of the system is probably going to be less reliable. An obvious example is timing loops. Back in the days of single threaded programs on single threaded platforms, programmers would introduce relatively stable timing delays with empty timing loops. This practice was easy, popular and produced fairly consistent timing – deterministic behavior. As system hardware and software have evolved, the assumptions these coding practices rely on have become less and less valid. Try writing a timing loop program on a modern platform and the results can be workable much of the time, but they can also vary by orders of magnitude – in a very non-deterministic manner. There are dozens of programming practices like this that used to work just fine but no longer do – they don’t completely break, they just operate a little bit randomly. In many cases the behavior is close enough to “correct” that the program appears to function, but not very reliably.
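
A small sketch of that measurement (my own example, using a busy loop as a stand-in for the classic empty delay loop):

# Sketch: an old-school "empty timing loop" measured on a modern multitasking
# system. The same loop count produces noticeably different delays from run to
# run because the scheduler, other processes, and CPU frequency scaling all
# intrude -- behavior the original single-threaded platforms never exhibited.
import time

def busy_wait(count: int) -> float:
    start = time.perf_counter()
    x = 0
    for _ in range(count):
        x += 1                    # stand-in for the classic empty delay loop
    return time.perf_counter() - start

samples = [busy_wait(2_000_000) for _ in range(5)]
print(["%.4f s" % s for s in samples])   # the spread is the non-determinism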

Another coding practice that used to work on single threaded systems was to call some function and expect the result to be available on the next line of code. It worked on single threaded systems because execution was handed off to that function and did not return until it was complete. Fast forward to today, and if this is written as a parallel program, the expected data may not be there when your code thinks it should be. There is a lesson here – high level parallel programming languages make writing parallel code fairly easy, but that does not mean that writing robust parallel programs is easy. Parallel interdependency issues can be just as ugly as parallel assembly code on a Cray system.
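
A minimal sketch of both the failure and the fix (the function and timing here are invented for illustration):

# Sketch: the single-threaded habit of "call it, then use the result on the
# next line" breaks once the work moves to another thread. Without an explicit
# join (or a Future / queue), the result may simply not be there yet.
import threading
import time

result = {}

def slow_lookup() -> None:
    time.sleep(0.1)               # stand-in for I/O or heavy computation
    result["value"] = 42

t = threading.Thread(target=slow_lookup)
t.start()

print(result.get("value"))        # None: this line ran before the worker finished

t.join()                          # the robust version waits for completion
print(result.get("value"))        # 42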

Summary

A single piece of code running exclusively on a dedicated processor behaves very deterministically, but parallel programmed software on a multitasking parallel hardware system can be very non-deterministic and difficult to test. Much of software reliability comes down to how little a given software package depends on these non-deterministic features. Managing software reliability and failure mechanisms requires that programmers understand the system beyond the confines of their program.

References