
General Systems Security and Security Architecture

The Yin & Yang of Systems Security Engineering

Overview

Systems Security Engineering is Systems Engineering. Like any other engineered system, a security system follows a certain workflow as it progresses from concept through to deployment: architectural development, design, implementation, design verification, and validation. This is the classic Systems Engineering top-down development process followed by a bottom-up verification process – like any other systems engineering effort.

However, in other ways Systems Security Engineering is very unlike Systems Engineering, in that many security requirements are negative requirements, while typical systems engineering is about positive requirements and functional goals. For example, a negative requirement may state “the security controls must prevent <some bad thing>”, where a positive requirement may state “the system must do <some functional thing>”. In addition, Systems Engineering is about functional reduction, where some higher-level function is reduced to a set of lower-level functions – defining what the system does. Security Engineering is about how system functions are implemented, and about things the system should not do, ideally with no impact on the overall function of the system. These two factors increase the complexity of top-down security implementation, and make the bottom-up verification much more difficult (since it is impossible to prove a negative).

In this post we are going to focus on how security systems are verified, and provide a few insights on how to verify system security more effectively.

Level 0 Verification: Testing Controls

As security engineers, we work to express every security requirement as a positive requirement, but that approach is fundamentally flawed, since a logical corollary almost never exists for a negative requirement. The best we can hope for is to reduce the scope of the negative requirements. In addition, security architectures and designs are composed of controls which have specific functions. The result is often that security verification becomes a collection of tests that functionally verify security controls, and this is misinterpreted as verification of the overall system security. This is not to say these tests are unimportant (they are important), but they represent the most basic level of testing, because testing of this nature only exercises the functional features of specific security controls. It does not test any of the negative requirements that drive the controls. For example, suppose we started out with a negative security requirement that states “implement user authentication requirements that prevent unauthorized access”. This could be implemented as a set of controls that enforce password length, complexity, and update requirements for users. These controls could then be tested to verify that they have been implemented correctly. However, if an attacker were able to obtain the hashed authentication datafile and extract the passwords with some ridiculous GPU-based personal supercomputer or a password cracker running on EC2, that attacker would have access, since they can simply use a cracked password. The result is that the controls have been functionally tested (and presumably passed), but the negative requirement has not been satisfied. The takeaways are:

  • Testing the controls functionally is important, but don’t confuse that with testing the security of the system.
  • System security is limited by the security controls, and attackers are only limited by their creativity and capability. Your ability as a systems security engineer is directly correlated to your ability to identify threats and attack paths in the system.
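The gap between functionally testing a control and satisfying the negative requirement can be sketched in a few lines of Python. All names, the sample password, and the wordlist below are hypothetical, and the unsalted hash is deliberately naive to mirror the scenario above:

```python
import hashlib

# Hypothetical password policy controls (length and complexity).
def meets_length(pw, minimum=8):
    return len(pw) >= minimum

def meets_complexity(pw):
    return (any(c.isupper() for c in pw) and any(c.islower() for c in pw)
            and any(c.isdigit() for c in pw))

# Level 0 verification: the functional tests of the controls all pass.
assert meets_length("Summer2024")
assert meets_complexity("Summer2024")

# ...but the negative requirement ("prevent unauthorized access") is not
# satisfied: an attacker holding the hash file just guesses candidates.
leaked_hash = hashlib.sha256(b"Summer2024").hexdigest()
wordlist = ["password1", "Summer2023", "Summer2024", "Winter2024"]
cracked = next((w for w in wordlist
                if hashlib.sha256(w.encode()).hexdigest() == leaked_hash), None)
assert cracked == "Summer2024"  # policy-compliant, still cracked
```

The point is not the specific hash or wordlist, but that every assertion about the controls can pass while the requirement they exist to serve still fails.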

Level 1 Verification: Red Teams / Blue Teams

The concept of Red Team versus Blue Team has evolved from military war gaming simulations, where the blue team represents the defenders and the red team represents the attackers. Within the context of military war gaming, this is a very powerful model, since it encompasses both the static and dynamic capabilities of the conflict between the teams.

This model was adapted to system security assessment, where the blue team represents the system architects and / or system admins / ITSecOps team / system owners (collectively – the stakeholders), and the red team is some team of capable “attackers” that operates independently from the system design team. As a system testing model, this brings forward some significant advantages.

First and foremost, system designers / system owners have a strong tendency to see the security of their system only through the lens of the controls that exist in the system. This is an example of Schneier’s Law, an axiom that states “any person (or persons) can invent a security system so clever that she or he can’t think of how to break it.” A blue team is the group that generally cannot think of a way to break its own system. A red team, being external to the system architects / system owners, is not bound by those preconceptions, is more likely to see the system in terms of potential vulnerabilities, and is much more likely to find them.

Second, since a red team is organizationally independent from the system architects / system owners, it is much less likely to be concerned about the impact of its findings on the project schedule, performance, or bruised egos of the system stakeholders. In the case of penetration test teams, it is often a point of pride to cause as much havoc as possible within the constraints of their contract.

Penetration Testing teams are a form of red team testing, and work particularly well for some classes of systems where much of the system security is based on people. This is discussed in detail in the next sections.

Level 2 Verification: Black Box / White Box

In the simplest terms, black box testing is testing of a system where little or no information of the system is known by the testers. White box testing is where a maximum level of information on the system is shared with the testers.

From a practical viewpoint, white box testing can produce results much more quickly and efficiently, since the test team can skip past the reconnaissance / discovery of the system architecture / design.

However, there are cases where white box testing will not give you complete / correct results and black box testing will likely be more effective. Two major factors can make black box testing the better methodology.

The first factor is whether or not the implemented system actually matches the architecture / design. If the implementation has additions, deletions, or modifications that do not match the documented architecture / design, white box testing may not identify those issues, since reconnaissance / discovery is not performed as part of white box testing. As a result, vulnerabilities associated with these modifications are not explored.

The second factor in determining whether black box testing is the right choice is where the security controls are. Security controls can exist in the following domains:

  1. Management – These are the people policy, organizational, and authority controls put in place to support the system security. Requiring all employees to follow all of the system security rules, or be fired and / or prosecuted, is a management control. A common failure of this rule is where corporate VPs share their usernames / passwords with their administrative assistants – and generally do not risk being fired. In most cases management controls are the teeth behind the rules.
  2. Operational – These controls are the workflow and process controls. These are the controls that are intended to associate authority with accountability. An example is that all purchase orders are approved by accounting, and  above a certain value they must be approved by a company officer. Another one is to not share your username / password. These controls are people-centric controls (not enforced by technology), and in most cases they present the greatest vulnerabilities.
  3. Technical – These are the nuts and bolts of system security implementation. These are the firewalls, network Intrusion Detection Systems (IDS), network / host anti-virus tools, enforced authentication rules, etc. This is where 90% of the effort and attention of security controls is focused, and where a much smaller percentage of the failures actually occur.

When your system is well architected, with integral functional technical controls, but with a significant portion of the system security residing in operational (people) controls, black box testing is merited. Much like the first factor, where the actual system may not reflect the architecture / design and it is necessary to use black box testing to discover those issues, people controls are often soft and variable, and it is necessary to use black box testing to exercise that variability.

Penetration Test Teams

Penetration Test Teams (also known as Red Teams) are comprised of systems security engineers with very specialized knowledge and skills in compromising different elements of target computer systems. An effective red team has all of the collective expertise needed to compromise most systems. When functioning as a black box team, they operate in a manner consistent with cyber attackers, but with management endorsement and the obligatory get-out-of-jail-free documentation.

At first glance, red teams operating in this way may seem like a very effective approach to validating the security of any system. As discussed above, that would be a flawed assumption. More accurately, red team testing is effective for specific types of system security architecture: where the actual system could deviate from the documented system, or where much of the system security is people-centric. By understanding where the security in a system is (and where it is not), we can determine whether black box testing is the more appropriate approach to system security testing.

Security Control Decomposition – Where “Security” Lives

In any security solution, system, or architecture, it should be clear what makes the system secure. If it is not obvious which controls in a system provide the security, it is not really possible to assess and validate how effective that security is. In order to better explore this question, we are going to look for parallels in another (closely related) area of cyber-security that is somewhat more mature than security engineering – cryptography.

Background: Historical Cryptography

In the dark ages of cryptography, the algorithm itself was the secret. The Caesar Cipher is a simple alphabet substitution cipher, where plaintext is converted to ciphertext by shifting some number of positions in the alphabet. Conversion back to plaintext is accomplished by reversing the process. This cipher is the basis of the infamous ROT13, in which applying the 13-position substitution a second time recovers the plaintext, thanks to the 26 letters of the basic Latin alphabet.

In modern terms, the algorithm of the Caesar Cipher is to shift-substitute by some offset to encrypt (with wrap-around at the end of the alphabet), and to shift-substitute by the same offset negatively to decrypt. The offset used would be considered the key for this method. The security of any cipher is based on which parts of the cipher make it secure. In the Caesar Cipher, knowledge of the method allows an attacker to try offsets until successful (with a keyspace of only 25 values). If the attacker knows the key but not the method, the problem appears more challenging than testing 25 values. Given this very trivial example, it would appear that the security of the Caesar Cipher rests more heavily on the algorithm than on the key. In a more practical sense, Caesar gained most of his security from the degree of illiteracy of his time.
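The method-versus-key distinction can be made concrete with a short Python sketch of the Caesar Cipher (a toy illustration of the algorithm described above, not usable cryptography):

```python
import string

ALPHABET = string.ascii_uppercase  # the 26-letter basic Latin alphabet

def caesar(text, offset):
    """Shift-substitute each letter by `offset`, wrapping at the alphabet end."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) + offset) % 26] if c in ALPHABET else c
        for c in text.upper())

# Caesar's fixed offset of three:
assert caesar("ATTACK AT DAWN", 3) == "DWWDFN DW GDZQ"
assert caesar("DWWDFN DW GDZQ", -3) == "ATTACK AT DAWN"

# ROT13 is its own inverse because 13 + 13 = 26:
assert caesar(caesar("HELLO", 13), 13) == "HELLO"

# Knowing the method, an attacker needs at most 25 trials to recover the key:
candidates = [caesar("DWWDFN DW GDZQ", -k) for k in range(1, 26)]
assert "ATTACK AT DAWN" in candidates
```

The brute-force loop at the end is the whole argument in miniature: once the algorithm is known, the 25-value keyspace provides essentially no security.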

In practice, Caesar used a fixed offset of three in all cases, with the result that his key and algorithm were fixed for all applications – which meant there was no distinction between key and algorithm.

Fast forward a few thousand years (give or take), and modern cryptography has a very clear distinction between key and algorithm. In any modern cipher, the algorithm is well documented and public, and all of the security is based on the keys used by the cipher. This is a really important development in cryptography.

Background: Modern Cryptography

Advanced Encryption Standard (AES) was standardized by the US National Institute of Standards and Technology (NIST) around 2001. The process to develop and select an algorithm was essentially a bake-off, starting in 1997, of 15 different ciphers, along with some very intensive and competitive analysis by the cryptography community. The process was transparent, the evaluation criteria were transparent, and many weaknesses were identified in a number of ciphers. The resulting cipher (Rijndael) survived this process, and by being designated the cipher of choice by NIST it has a great deal of credibility.

Most importantly for this discussion is the fact that any attacker has access to complete and absolute knowledge of the algorithm, and even test suites to ensure interoperability, and this results in no loss of security to any system using it. Like all modern ciphers, all of the security of a system that uses AES is based on the key used and how it is managed.

Since the use of AES is completely free and open (unlicensed), over the last decade it has been implemented in numerous hardware devices and software systems. This enables interoperability between competitive products and systems, and massive proliferation of AES enabled systems. This underscores why it is so important to have a very robust and secure algorithm.

If some cipher were developed as a closed-source algorithm with a high degree of secrecy, was broadly deployed, and a weakness / vulnerability was later discovered, this would compromise the security of any system that used that cipher. That is exactly what happened with a stream cipher known as RC4. For details, refer to the Wikipedia article on RC4. The net impact is that the RC4 incident / story is one of the driving reasons for the openness of the AES standards process.
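The RC4 algorithm is now fully public, and it is short enough to write down in a few lines – which is exactly the point: once the secret algorithm leaked, the only security left was in the key. A pure-Python sketch for illustration only (RC4 is broken and should never be used), checked against the published "Key" / "Plaintext" test vector:

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """RC4: key-scheduling (KSA) followed by keystream generation (PRGA)."""
    # Key-scheduling algorithm: permute S based on the key.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Pseudo-random generation algorithm: XOR data with the keystream.
    out, i, j = [], 0, 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

# Published test vector: with the algorithm public, all remaining security
# rests on the key -- Kerckhoffs's principle in action.
ct = rc4(b"Key", b"Plaintext")
assert ct.hex().upper() == "BBF316E8D940AF0AD3"
assert rc4(b"Key", ct) == b"Plaintext"  # same operation decrypts
```

Note that decryption is the same function as encryption, since XOR with the keystream is its own inverse.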

And now back to our regularly scheduled program…

The overall message from this discussion of cryptography is that a security solution can be viewed as a monolithic object, but viewed that way it cannot effectively be assessed or improved. The threats need to be identified, anti-patterns developed from those threats, and system vulnerabilities and attack vectors mapped. From that baseline, specific security controls can be defined and assessed for how well they mitigate these risks.

The takeaways are:

  • System security is based on threats, vulnerabilities, and attack vectors. These are mitigated explicitly by security controls.
  • System security is built from a coordinated set of security controls, where each control provides a clear and verifiable role / function in the overall security of the system.
  • The process of identifying threats, vulnerabilities, attack vectors and mitigating controls is Systems Security Engineering. It also tells you “where your security is”.
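As a minimal sketch of these takeaways, a threat model can be captured as explicit data – threats, vectors, and mitigating controls – and then checked for unmitigated risks. Every entry below is a hypothetical example, not a complete model:

```python
# Hypothetical threat model: each identified threat is mapped to its attack
# vector and the explicit control(s) that mitigate it, so it is always clear
# "where the security is".
threat_model = [
    {"threat": "credential theft",
     "vector": "offline cracking of a stolen hash file",
     "controls": ["salted slow hash", "two factor authentication"]},
    {"threat": "session hijacking",
     "vector": "token replay over an untrusted network",
     "controls": ["TLS", "short-lived tokens"]},
    {"threat": "SQL injection",
     "vector": "unsanitized user input",
     "controls": ["parameterized queries"]},
]

# Verification hook: every threat must have at least one mitigating control;
# anything in this list is an unmitigated risk.
unmitigated = [t["threat"] for t in threat_model if not t["controls"]]
assert unmitigated == []
```

Even this trivial structure makes the Level 0 trap visible: you can functionally test every control in the `controls` lists and still say nothing about threats that never made it into the model.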

Bottom Line

In this post we highlighted a number of key points in System Security Engineering.

  • Systems Security Engineering is like Systems Engineering in that (done properly) it is based on top-down design and bottom-up verification / validation.
  • Systems Security Engineering is unlike Systems Engineering in that it is usually not functional, and is expressed as negative requirements that defy normal verification / validation.
  • Security assessments can be based on red team / blue team models, can use a white box or black box approach, and the most effective approach will depend on the nature of the system.

As always, I have provided links to interesting and topical references (below).

References


2016 Personal Security Recommendations

Overview

There are millions of criminals on the Internet and billions of potential victims. If you have not been attacked or compromised, it is probably due to those numbers – not your personal security habits.

I have a passion for cyber security. Effective cyber security is a system problem with no easy or obvious solutions, and the current state of the art leaves plenty of room for improvement. I also think that every person who uses the Internet should have a practical understanding of the risks and what reasonable steps they should take to protect themselves.

For these reasons, any conversation I am in tends toward cyber security, and I occasionally am asked what my recommendations are for personal cyber security. When not asked, I usually end up sharing my opinions anyway.  My answer generally is qualified by the complexity of defending against the threats that are more ‘real’, but for most people we can make some generalizations.

The list below is what I think makes the most sense at this time. Like all guidance of this nature, its shelf life may be short. Before we can look at actionable recommendations, we need to look honestly at the threats we face. The foundation of any effective security recommendation is your threat space.

  1. Threats – These are realistic and plausible threats to your online accounts and data, against which you have realistic and plausible mitigations.
    1. Cyber Criminals – Criminals who are trying to monetize whatever they can from people on the Internet. There are many ways this can be accomplished, but in most cases it involves getting access to your online accounts or installing malware on your computer. This threat represents 99.5% of the entire threat space most users have (note – this is a made-up number, but it is probably not too far off).
    2. Theft or Loss – Criminals who steal your computer or phone for the device itself. If they happen to gain access to personal information on the device that enables extortion or other criminal access to your online accounts, that is a secondary goal. This threat represents 90% of the remaining threat space (so 90% of 0.5%) for laptops and smartphones (note – this number is also made up, with the same caveats).
    3. Computer Service Criminals – Any time you take a phone / computer in for service, there is a risk that somebody copies off interesting information for personal gain. It really does happen – search “geek squad crime” for details.
  2. Non-Threats – These are threats that are less likely, less plausible or simply unrealistic to defend against.
    1. NSA / FBI / CIA / KGB / GRU / PLA Unit 61398 – Notwithstanding the current issue between the FBI and Apple (which is not really about technical capability but about legal precedent), big government agencies (BGAs) have massive resources and money that they can bring to bear if you draw their attention. So my recommendation is: if you draw the attention of one or more BGAs, get a lawyer and spend some time questioning the personal choices that got you where you are.

In order to effectively apply security controls to these threats, it is critical to understand which threat each control protects against, with some quantifiable understanding of relative risk. In other words – it is most effective to protect against the threats that are most likely.

Of the threats identified above, we identified online threats, device theft / loss threats, and computer service threats. For most people, the total number of times a computer / smart phone has been serviced or stolen can be counted on one hand. Comparatively, your online accounts are available 365 x 24 (that’s 8,766 hours/year of exposure), and accessible by any criminal in the world with Internet access. Simple math should show you that protecting yourself online is at least 100x more critical than protecting against any other threat identified above.

Threat Vectors

In order to determine the most effective security controls for the given threats, it is important to understand what the threat vectors for each threat are. Threat vectors define how systems are attacked for a given threat. Fortunately, for the threats identified above, the vectors are fairly simple.

In reverse order:

  1. Computer Service Threat: As part of the service process, you (the system owner) provide the device username and password so that the service people can access the operating system. This also gives these same service people fairly unlimited access to the personal files and data on the system, which they have been known to harvest for personal gain. Keeping files of this nature in a secure container can reduce this threat.
  2. Theft or Loss: In recent years, criminals have discovered that the information on a computer / phone may be worth much more than the physical device itself. In most cases, stolen computers and phones are harvested for whatever personal information can be monetized and then sold to a hardware broker. If your system is not encrypted, all of the information on the system is accessible, even if you have a complex password. Encryption of the system is really the only protection from this threat.
  3. Cyber Criminals: This is the most complex of the threats, since there are always at least two paths to the information they are looking for. Remember that the goal of this threat is to compromise your online accounts, which means they can target the accounts directly on the Internet. However, most online Internet companies are fairly good at detecting and blocking direct attacks of this nature. So the next most direct path is to compromise a device with malware and harvest the information from that less protected device. The nature of this vector also means it is the most complex to protect against. The use of firewalls, anti-virus / anti-malware, ad blockers, more secure browsers, secure password containers, and two factor authentication all contribute to blocking this attack vector. This layering of security tools (controls) is also called “defense in depth”.

Actionable Recommendations [ranked]

  1. (Most Critical) Use Two Factor Authentication (2FA) for critical online accounts.
    1. Google: Everybody (maybe not you) has a Google account, and in many cases it is your primary email account. As your primary email account, it is the target account for resetting your password on most other accounts. It is the one account to rule them all in your online world, and it needs to be secured appropriately. Use Google Authenticator on your smart phone for 2FA.
    2. Amazon: In the global first world, this is the most likely online shopping account everybody (once again – maybe not you) has. It also supports Google Authenticator for 2FA.
    3. PayPal: PayPal uses an SMS code as the second authentication factor. It is not as convenient as Google Authenticator, but it is better than 1FA.
    4. Device Integration: Apple, Google and Microsoft are increasingly integrating the devices in their product ecosystems into their online systems. This increases the capabilities of these devices, and it also increases the online exposure of your accounts.
      1. Microsoft Online: Enable 2FA. Microsoft unfortunately does not integrate with Google Authenticator, but does provide its own authentication app for your smart phone.
      2. Apple iTunes: Require authentication for any purchases, and enable 2FA.
      3. Google Play: Require authentication for any purchases.
    5. Banks, Credit Unions and Credit Accounts – These groups are doing their own thing for 2FA. If your bank, credit union or credit account does not offer some form of 2FA, contact them and request it. Or move your account.
  2. Password Manager: Use one, and offline is better than online. Remember, the cloud is just somebody else’s computer (and may represent more risk than local storage). I personally recommend KeePass, since it is open source, supports many platforms, is actively supported, and is free.
  3. Never store credit card info online: There are many online service providers that insist each month that they really want to store my credit card information in their systems (I am talking to you, Comcast and Verizon), and I have to uncheck the save-info box every time. At some point in the past, I asked a few of these service providers (via customer service) whether agreeing to store my information on their servers meant that they assumed full liability for any and all damages if they were compromised. The lack of any response indicated to me that the answer is “probably not”. So if they are not willing to take responsibility for that potential outcome, I don’t consider it reasonable to leave credit card information in their systems.
  4. Encrypt your smart phone: Smart phones are becoming the ultimate repository of personal information that can be used to steal your identity / money, and nearly all smart phones have provisions for encryption and password / PIN access. Use them. They really do work and are effective. It is interesting to note that most PIN codes are 4 to 6 digits, and most unlock patterns (when reduced to bits) are comparable to 4-digit (or shorter) codes.
  5. Encrypt your laptop: Your second most portable device is also the second most likely to be stolen or lost. If you have a Windows laptop, use BitLocker for system encryption. It is well integrated and provides a decent level of data security. In addition, I would also recommend installing VeraCrypt, the open source successor to TrueCrypt. For that extra level of assurance, you can create a secure container on your device or a removable drive to store data requiring greater security / privacy.
  6. Password-protect your Chrome profile: I personally save usernames and passwords in my Chrome profile purely for convenience. This allows me to go to any of my systems and log in easily to my regular sites. It also means that my profile represents a tremendous security exposure. So I sync everything and secure / encrypt it with a passphrase. Chrome offers the option to secure / encrypt with your Google Account credentials, but I chose a separate passphrase to create a small barrier between my Google account and my Chrome sync data.
  7. Adblock Plus / Anti-Virus / Firewall / Chrome: Malware is the most likely path to having your computer compromised. This can happen through phishing emails, or through a website or popup ads. Browsers are more effective at stopping malware than they used to be, and Chrome updates silently and continuously, decreasing your exposure risk. Chrome is the browser I recommend. In addition, I use the Adblock Plus plugin in Chrome. Lastly, I am using Windows 10, so I keep Windows Defender fully enabled and updated. Pick your favorite anti-virus / anti-malware product; Defender just happens to be included and does not result in a self-inflicted denial of service (McAfee, anyone?).
  8. Use PayPal (or equivalent) when possible: PayPal (and some other credit providers) manage online purchases more securely by doing one-time transactions rather than simply passing on your credit card credentials. This limits the seller to the actual purchase, and greatly reduces the risk that your card can be compromised.
  9. (Least Critical) VPN: If you have a portable device and use any form of public Wi-Fi, there is a risk that your information could be harvested on that first hop to the Internet. VPNs will not make you anonymous – VPNs are not Tor – but an always-on VPN can provide some security for this first hop. I use an always-on VPN that I was able to get for $25 / 5 years. It may not provide the most advanced / best security / privacy features available, but it is probably good enough for realistic threats.
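For the curious, the 2FA codes generated by Google Authenticator are just RFC 6238 TOTP: an HMAC-SHA1 over the current 30-second time step, truncated to 6 digits. A standard-library Python sketch, checked against the RFC 6238 test key at T=59 seconds:

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32, t=None, step=30, digits=6):
    """RFC 6238 TOTP as used by Google Authenticator (HMAC-SHA1, 30 s steps)."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int((time.time() if t is None else t) // step)
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation (RFC 4226)
    code = (struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF) % 10 ** digits
    return str(code).zfill(digits)

# RFC 6238 test key "12345678901234567890" (base32 below) at T=59 falls in
# time step 1 and yields the published code 287082.
assert totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", t=59) == "287082"
```

This is why the scheme works offline: server and phone share only the secret and a clock, so a stolen password alone is not enough to produce a valid code.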

Additional Notes

For those who are curious, there are some security tools that purport to provide security against the big government agencies. However, it is important to note that even if these tools have been compromised by those agencies, it is very unlikely that the agencies would admit it, since it is more useful to have people believe they are protected.

  1. VeraCrypt: Provides standalone encryption capability for files and storage devices that is nearly unbreakable. Like any encryption, the real weakness is the key and how you manage it.
  2. KeePass: Uses standalone encryption for passwords and other credential information. Once again, it is only as good as the password credentials you use.
  3. Signal / Private Call by Open Whisper: Secure messaging and voice call apps for your smart phone. The usefulness of these is directly related to who you are chatting / talking with, since both parties have to buy into the additional effort of communicating securely.

Bottom Line

Security should do many things, but the most important elements for practical security are:

  1. It should protect against real threats in an effective manner. The corollary: it should not protect against imaginary / non-existent threats.
  2. It should be as transparent / invisible / easy to use as possible.
  3. It should be good enough that you are an obviously harder target than the rest of the herd (e.g., there is no need to be faster than the bear chasing you, just faster than the guy next to you).

Remember – the most effective security is the security that actually gets used.

Note – I apologize for the lack of tools for Apple platforms, but since I do not own one, they are much more difficult for me to research / use.


IOT and Stuff – The Evolution

Overview

This is the first of several posts I expect to do on IoT, including systems design, authentication, standards, and security domains. This particular post is an IoT backgrounder from my subjective viewpoint.

Introduction

The Internet of Things (IoT) is a phenomenon that is difficult to define and difficult to scope. The reason it is difficult to define is that it is rapidly evolving, and any definition is currently based on the foundational capabilities that today's IoT implementations provide.

Leaving the marketing hyperbole behind, IoT is the integration of ‘things’ into what we commonly refer to as the Internet. Things are anything that can support sensors and/or controls, an RF network interface, and most importantly – a CPU. This enables ubiquitous control / visibility into something physical on the network (that wasn’t on the network before).

IoT is currently undergoing a massive level of expansion. It is a chaotic expansion without any real top down or structured planning. This expansion is (for the most part) not driven by need, but by opportunity and the convergence of many different technologies.

Software Development Background

In this section, I am going to attempt to draw a parallel to IoT from the recent history of software development. Back at the start of the PC era (the 80s), software development carried high costs for compilers, linkers, test tools, packagers, etc. This pricing model was inherited from the mainframe / centralized computer system era, where these tools were purchased and licensed by “the company”. The cost of an IBM Fortran compiler and linker for the PC in the mid 80s was over $700, and libraries were $200 each (if memory serves me). In addition, the coding options were very static and very limited: Fortran, Cobol, C, Pascal, Basic, and Assembly represented the vast majority of programming options. In addition (and this really surprised me at the time), if you sold a commercial software package compiled with the IBM compiler, you were required to purchase a distribution license from IBM, priced based on the number of units sold. Collectively, these were significant barriers to any individual who wanted even to learn how to code.

This can be contrasted with the current software development environment, where there is a massive proliferation of languages, most of them available as open source. The only real barriers to coding are personal ability and time. Many events led to this state, but (IMO) two key events played a significant part. The first was Borland Turbo Pascal in 1983, which retailed for $49.99, with unlimited distribution rights for an additional $99.99 for any software produced by the compiler. Yes, I bought a copy (v2), and later I bought Turbo Assembler, Delphi 1.0, and 3.0. This was the first real opportunity for an individual to learn a new computer language (or to program at all) at an approachable cost without pirating tools.

To re-iterate, incumbent software development products were all based on a mainframe market, and mainframe enterprise prices and licensing, with clumsy workflows and interfaces, copy protection or security dongles. Borland’s Turbo Pascal integrated editor, compiler and linker into an IDE – which was an innovative concept at the time. It also had no copy protection and a very liberal license agreement referred to as the Book License. It was the first software development product targeted at end users in a PC type market rather than the enterprise that employed the end user.

The second major event that brought about the end of expensive software development tools was the GNU Compiler Collection (GCC) in 1987, with a stable release by 1991. Since then, GCC has become the default compiler engine for nearly all code development, enabling an explosion of languages, developers and open source software. It is the build engine that drives open source development.

In summary, by eliminating the barriers to entry over the last three decades, software development has exploded and proliferated to a degree not even imagined when the PC was introduced.

IoT Convergence

In a manner very analogous to software development over the last three decades, IoT is being driven by a similar revolution in hardware development, hardware production, and software tools. One of the most significant elements of this explosion is the proliferation of System on a Chip (SoC) microprocessors. As recently as a decade ago (maybe a bit longer), the simplest practical microprocessor required a significant number of external support functions, which have now been integrated onto a single piece of silicon. Today, there are microprocessors with various combinations of integrated UARTs, USB OTG ports, SDIO, I2C, persistent flash RAM, RAM, power management, GPIO, ADC and DAC converters, LCD drivers, a self-clocking oscillator, and a real time clock – all for a dollar or two.

A second aspect of lower hardware development costs is the open source hardware (OSH) movement, which has produced very low cost development kits. In the not so distant past, the going cost for a microprocessor development kit was about $500; that market has been decimated by Arduino, Raspberry Pi, and dozens of other similar products.

Another element of this convergence comes from open source design tools and services. All of the new low cost hardware development kits are built on some form of open source software. PCB CAD tools like KiCad enable low cost PCB design, and services like OSH Park enable low cost PCB prototypes and small builds without lot charges or minimum panel charges.

A third facet of lower hardware costs is the availability and falling price of data link radios for use with microprocessors. Cellular, Wi-Fi, 802.15.4, Zigbee, Bluetooth and Bluetooth LE all provide various tradeoffs of cost, performance, and ease of use – but all of them have devices and development kits that are an order of magnitude lower in cost than a decade ago.

The bottom line is that IoT is not being driven by end use cases, or by any one group, special interest or industry consortium. It is being driven by the convergent capabilities of lower cost hardware, lower cost development tools, more capable hardware / software, and the opportunity to apply them to whatever “thing” anybody is so inclined to connect. This makes it really impossible to determine what IoT will look like as it evolves, and it also makes efforts by various companies to get in front of or “own” IoT seem unlikely to succeed. The best these efforts are likely to achieve is to dominate or drive some segment of IoT by virtue of the value they contribute to it. Overall, these broad driving forces and the organic nature of IoT growth mean it is very unlikely that IoT can be dominated or controlled, so my advice is to try to keep up and not get overwhelmed.

Personally, I am pretty excited about it.

PS – Interesting Note: Richard Stallman may be better known for his free software advocacy and the never-finished GNU Hurd kernel, but he was the driving developer behind GCC and Emacs, and GCC is probably as important as the Linux kernel to the foundation and success of the Linux OS and the open source software movement.

A Brief Introduction to Security Engineering

Background

One of the great myths is that security is complicated, hard to understand, and must be opaque to be effective. This is mostly fiction, perpetuated by people who would rather you did not question the security theater they are creating in lieu of real security, by security practitioners who don’t really understand what they are doing, or by those trying to accomplish something in their own interests under the false flag of security. This last one is why so many government “security” activities are not really about security, but about control – which is not the same thing. Designing and implementing security can be complex, but understanding security is much easier than it is generally portrayed.

Disclaimer – This is not a comprehensive or exhaustive list / analysis. It is a brief introduction that touches on a few of the most practical elements of security engineering.

Security Axioms

Anytime I look at systems security, there are a few axioms I use to set the context, limit the scope and measure the effectiveness. These are:

  1. Perfect security is unachievable, and any practical security is the result of some cost driven tradeoff.
  2. Defining and understanding your threat model is step zero of any security solution. If you don’t know who you are defending against, the solution will not fit.
  3. Defining and understanding success. This means understanding what you are trying to protect and what exactly protecting those elements means.
  4. Defending a system is more costly / difficult than attacking that same system. Attackers only need to be successful once, but defenders need to be successful every time.
  5. Security based on secrecy is weaker than security based on strength. Closed security solutions are more likely to contain flaws that weaken the security versus open security solutions. Yes – this has been validated.

The first of these is a recognition that security is about a conflict between a defender of a system / information and an attacker of that same system. Somebody is trying to take something of yours, and you want to stop them. Each of these two parties can use different approaches and tools, with increasing costs – where costs are money, time, resources, or the risk of being caught / punished. This first axiom simply states that if an attacker has infinite time, money, and resources, and zero risk, your system will be compromised because you are outgunned. For less enabled attackers, the most cost effective security is that which is just enough to discourage them so they move on to an easier target. This of course leads to understanding your attacker, and the next axiom – know your threat.

The second axiom states that any security solution is designed to protect against a certain type of threat. Defining and understanding the threats you are defending against is foundational to security design, since it will drive every aspect of the system. A security system to keep your siblings, parents, or children out of your personal data is completely different from one designed to keep cyber extortionists out of your Internet accounts.

The third axiom is based on the premise that most of what your system is doing requires minimal protection (depending on the threat model), but some parts of it require significant protection. For example – my Internet browsing history is not that important compared with my password and account access file. I have strong controls on my passwords and account access (eg KeePass), while my browsing history sits behind a system password. Another way to look at this is to imagine what the impact would be if a given element were compromised – that should guide the level of protection for that item.

The fourth axiom is based on the premise that the defender must successfully defend every vulnerability in order to be successful, but the attacker only has to succeed against one vulnerability, one time. This is also why complex systems are more prone to compromise – greater complexity leads to more vulnerabilities (since there are more places for gremlins to hide).

The fifth is perhaps the least obvious axiom on this list. Simply put, the strength of a security control should not depend on its design being secret. Encryption protocols are probably the best example of how this works. Most encryption protocols of the last few decades were developed and publicized within the peer community. Invariably, weaknesses are found and corrected, improving the quality of the protocol and reducing the risk of an inherent vulnerability. These algorithms and protocols are published and well known, enabling interoperability and third party validation, which reduces the risk of vulnerabilities due to implementation flaws. In application, the security of the encryption rests solely on the keys used by the users. The favorite counterexample is from the world of traditional pin tumbler locks, in which locksmith guilds attempted to keep their designs secret for centuries and had laws passed making it a crime to possess lock picks, or even to know how to pick a lock, unless you were a locksmith. Unfortunately, these laws did little to impede criminals, and it became an arms race between lock makers, locksmiths and criminals, with the users of locks kept fairly clueless. Of the lock choices available to a user, some were better, some were worse, and some were nearly useless – and this secrecy model of security meant that users did not have the information to make that judgement call (and in general they still don’t). The takeaway – if security requires that the design / architecture of the system be kept secret, it is probably not very good security.
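To make this axiom concrete, here is a toy sketch (in Python, purely illustrative – a repeating-key XOR is not a real cipher, and real systems should use vetted algorithms like AES). The algorithm is fully public; all of the security rests in the key:

```python
import secrets

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy stream cipher: XOR each byte with the (repeated) key.
    The algorithm is completely public -- only the key is secret."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# The key is the only secret; generate it from a cryptographic source.
key = secrets.token_bytes(32)

message = b"the design can be public"
ciphertext = xor_cipher(message, key)

# Decryption is the same operation with the same key.
assert xor_cipher(ciphertext, key) == message
```

Anyone can read the code, yet without the key the ciphertext is not directly readable – which is the essential shape of Kerckhoffs’s principle that modern encryption follows.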

Threat Models

In the world of Internet security and information privacy, there are only a few types of threat models that matter. This is not because there are only a few threats, but because the methods of attack and the methods of defense are common across them. Generally it is safe to ignore threat distinctions that don’t affect how the system is secured. The list includes:

  1. Immediate family / Friends / Acquaintances – Essentially people who know you well and have some degree of physical access to you or the system you are protecting.
  2. Proximal Threats : Threats you do not know, but who are physically / geographically close to you and the system you are protecting.
  3. Cyber Extortionists : A broad category of cyber attackers whose intent is to profit by attacking and compromising your information. This group generally targets individuals, but not a specific individual – they look for easy targets.
  4. Service Compromise : Threats who attack large holders of user information – ideally credit card information. This group is looking for bulk information and is not targeting individuals directly.
  5. Advanced Persistent Threats (APTs) : Well equipped, well resourced, highly capable and persistent. These attackers are generally supported by governments or large businesses and their targets are usually equally large. This group plans and coordinates their attacks with a specific purpose.
  6. Government (NSA / CIA / FBI / DOJ / DHS / etc): Currently the biggest, baddest threat. They have the most advanced technical resources, the most money, and they use National Security Letters when those are not enough. They collect data in bulk, and they target individuals.

From a personal security perspective we are looking at the threats most likely to concern any random user of Internet services – you. In that context, we can dismiss a couple of these quickly. Let’s do this in reverse order:

Government (NSA et al) – If they are targeting you specifically, and you use Internet services – you are in need of more help than I can provide in this article. If your data is part of some massive bulk data collection – there is very little you can do about that either. So in either case, in the context of personal data security for Joe Internet User, don’t worry about it.

Advanced Persistent Threats (APTs) – Once again, much like the NSA, it is unlikely you would be targeted specifically, and if you are, your needs are beyond the scope of this article. So – although you may be concerned about this threat, there is very little you can do to stop it.

Service Compromise – I personally pay all of my bills online, and every one of these services wants to store my credit card in their database. Now the question you have to ask is: if (for example) the Verizon customer database is compromised and somebody steals all of that credit card information (tens of millions of card numbers) and uses it to run up hundreds of millions in charges – is Verizon (or any company in that position) going to take full responsibility? Highly unlikely – and that is why I do not store my credit information on their systems. If they are not likely to accept responsibility for the outcome, should you trust them with your credit?

Cyber Extortionists – The most interesting and creative of all these threat classes. I continue to be amazed at every new exploit I hear about. Examples include mobile apps that covertly call money transfer numbers (eg 1-900 numbers in US), or apps that buy other apps covertly. Much like the Salami Slicing attacks (made famous in the movie Office Space), individual attacks represent some very small financial gain, but the hope is that collectively they can represent significant money.

Proximal Threats – If somebody can physically take your laptop, tablet, or phone, they have a really good shot at all of the information on that device. Many years ago, I had an iPhone stolen from me on the Washington DC metro. I had not enabled the screen lock, and I had the social security numbers / birthdays of my entire family in my contacts. And yes, there were attempts to fraudulently obtain credit based on this information within hours – unsuccessfully. I now use (and recommend everybody use) some device access lock, and encrypt very sensitive information in some form of locker: passwords / accounts and social security numbers in KeePass, and sensitive file storage in TrueCrypt. These apps are free and provide significant protection, just in case. Remember, physical control / access to a device is its own special type of attack.

Friends / Family / Acquaintances – In most cases, the level of security needed to protect from this class of threat is small. More importantly, it is crucial to understand what it is you are trying to protect, why you are protecting it, and what your recovery options are. To repeat – what are your recovery options? It is very easy to secure your information and then forget the password / passphrase or corrupt your keyfile. Compromise of private data in this context is orders of magnitude less likely than locking yourself out of your data – permanently. Yes, I have done this, and family photos on a locked TrueCrypt partition cannot be recovered in your lifetime. So when you look at security controls to protect from this threat model, look for built in recovery capabilities and only protect what is necessary to protect.

Conclusions

Fundamentally security engineering is about understanding what you are trying to protect, who / what your threat is, and determining what controls to use to impede the threat while not impeding proper function. Understanding your threat is the first and most important part of that process.

Lastly – I would encourage everybody who finds this the least bit interesting to read Bruce Schneier’s blog and his books. He provides a very approachable and coherent perspective on IT security / security engineering.

Software: Thoughts on Reliability and Randomness

Overview

Software Reliability and Randomness are slippery concepts that may be conceptually easy to understand, but hard to pin down. As programmers, we can write the equivalent of ‘hello world’ in dozens of languages on hundreds of platforms and once the program is functioning – it is reliable. It will produce the same results every time it is executed. Yet systems built from thousands of modules and millions of lines of code function less consistently than our hello world programs – and are functionally less reliable.

As programmers we often look for a source of randomness in our programs, and it is hard to find. Fundamentally we see computers as deterministic systems without any inherent entropy (for our purposes – randomness). For lack of true random numbers we generate Pseudo Random Numbers (PRNs), which are not really random. They are used in generating simulations, and in generating session keys for secure connections, and this lack of true randomness in computer generated PRNs has been the source of numerous security vulnerabilities.
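The determinism of a PRNG is easy to demonstrate. In this sketch (Python, purely illustrative), seeding a generator with the same value reproduces the “random” sequence exactly – which is precisely why raw PRNG output is dangerous for session keys:

```python
import random

def prng_sequence(seed: int, n: int = 5) -> list:
    """Draw n pseudo-random numbers from a freshly seeded generator."""
    rng = random.Random(seed)  # deterministic Mersenne Twister state
    return [rng.randint(0, 999) for _ in range(n)]

# Same seed -> the identical "random" sequence, every single time.
assert prng_sequence(42) == prng_sequence(42)

# A different seed gives a different (but equally repeatable) sequence.
assert prng_sequence(42) != prng_sequence(43)
```

An attacker who can guess or recover the seed can regenerate every value the PRNG ever produced – the historical root of several of the key-generation vulnerabilities mentioned above.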

In this post I am going to discuss how software can be “unreliable”, deterministic behavior, parallel systems / programming, how modern computer programs / systems can be non-deterministic (random), and how that is connected to software reliability.

Disclaimer

The topics of software reliability, deterministic behavior, and randomness in computers form a field that is massively deep and complex. The discussions in this blog are high level and lightweight, and I make some broad generalizations and assertions that are mostly correct (if you don’t look too closely) – but hopefully they still serve to illustrate the discussion.

I also apologize in advance for this incredibly dry and abstract post.

Software Reliability

Hardware reliability – more precisely, hardware failure – most often occurs when some device in a system breaks (the smoke comes out), and the system no longer functions as expected. Software failures do not involve broken hardware or devices. They are based on the concept that there is a semi-infinite number of paths (or states) through a complex software package, and the vast majority of them result in the software acting and functioning as expected. However, some paths through the code result in the software not functioning as expected. When this happens, the software and system are doing exactly what the code tells them to do – so from that perspective, there is no failure. From the perspective of expectations, however, the software has failed – and this gap between expected and actual behavior provides a path to understanding the concept of software reliability.

Deterministic Operation

Deterministic operation in software means that a given program with a given set of inputs will function in exactly the same manner every time it is executed – without any unexpected behaviors. For the most part this characteristic is what allows us to effectively write software. If we carry this further and look at software on simple (8 / 16 bit) microprocessors / microcontrollers, where the software we write runs exclusively on the device, operation is very deterministic.

In contrast – on a modern system, our software exists in a relatively high level on top of APIs (application programming interfaces), libraries, services, and a core operating system – and in most cases this is a multitasking/multi-threaded/multi-cored environment. In the world of old school 8 / 16 bit microprocessors / microcontrollers, none of these layers exist. When we program for that environment, our program is compiled down to machine code that runs exclusively on that device.

In this context, our program not only operates deterministically in how the software functions, but the timing and interactions external to the microprocessor are deterministic as well. In the context of modern complex computing systems, this is generally not the case. The very deterministic operation of software on a dedicated microprocessor makes it ideal for real world interactions and embedded controllers, which is why this model is used for toasters, coffee pots, microwave ovens and other appliances. The system is closed – meaning its inputs are limited to known and well defined sources, and its functions are fixed and static – and generally these systems are incredibly reliable. After all, how often is it necessary to update the firmware on an appliance?

If this were our model of software and software reliability, we would be ignoring much of what has happened in the world of computing over the last decade or two. More importantly – we need to understand that this model is an endpoint, not the whole story, and to understand where we are today we need to look further.

Parallel Execution

One of the most pervasive trends in computing over the last decade (or so) is the transition from increasingly fast single threaded systems to increasingly parallel systems. This parallelism is accomplished through multiple computing cores on a single device and through multiple hardware threads on a single core, both of which increase the processor’s ability to run programs concurrently. A typical laptop today can have two to four cores and support two hardware threads per core, resulting in up to eight relatively independent processes running at the same time. Servers with 16 to 64 cores – machines that would have qualified as (small) supercomputers a decade ago – are now available off the shelf.

Parallel Programming: the Masochistic Way

Now – back in the early 80s as an intern at Cray, my supervisor spent one afternoon trying to teach me how Cray computers (at that time) were parallel coded. As some of the first parallel processing systems, and as systems where every cycle was expensive, much of the software was parallel programmed in assembly code. The process is exactly as you would imagine. A hardware scheduler would transfer data to / from each processor and main memory every so many cycles. In between these transfers the processors would execute code. So if the system had four processors, you would write assembly code for each processor to execute some set of functions, time synchronized every so many machine cycles, with NOPs (no operation) occasionally used to pad the time. NOPs were considered bad practice, since cycles were precious and not to be wasted. At the time, it was more than I wanted to take on, and I was shuffled back to hardware troubleshooting.

Over time I internalized this event, and learned something about scalability. It was easy to imagine somebody getting very good at doing two (maybe even 3 or 4) dissimilar time synchronous parallel programs. Additionally, since many programs also rely on very similar parallel functions, it was also easy to imagine somebody getting good at writing programs that did the same thing across a large number of parallel processors. However, it is much harder to imagine somebody getting very good at writing dissimilar time synchronous parallel programs effectively over a large number of parallel processors. This is in addition to the lack of scalability inherent in assembly language.

Parallel Programming – High Level Languages

Of course in the 80s, and even the 90s, most computer programmers did not need to be concerned with parallel programming; mainstream operating systems were single threaded, and the argument of the day was cooperative multitasking versus preemptive multitasking. Much like the RISC vs CISC argument of the prior decade, these issues were rendered irrelevant by the pace of processor hardware improvements. Now many of us walk around with the equivalent of that Cray supercomputer in our pockets.

In any case the issue of parallel programming was resolved in two parts. The first being the idea of a multi-tasking operating systems with a scheduler – the core function that controls what programs are running (and how long they run) in parallel at any one time. The second being the development of multi-threaded programming in higher level languages (without the time synchronization of early Crays).

Breaking Random

Finally getting back to my original point… The result today is that all modern operating systems have some privileged block of code – the kernel – running continuously, along with a number of other privileged services that run the OS, including the memory manager and the task scheduler.

The key to this whole story is that these privileged processes manage access to shared resources on the computer. Of the two, the task scheduler is the most interesting – mostly due to the arcane system attributes it uses to determine which processes have access to which core / thread on the processor. This is one of the most complex aspects of a multitasking / multi-core / multithreaded (hardware) system. The attributes the scheduler looks at include affinity flags that processes use to indicate core preference, priority flags, resource conflicts and hardware interrupts.

The net result is that if we take any set of processes on a highly parallel system, there are some characteristics of this set that are sufficiently complex, and impacted by enough unknown external elements, that they are random – truly random. For example, suppose we create three separate processes that each generate a pseudo random number set from its own unique seed, and point all of them at a shared memory resource – where the value is read as input and the output is written back. Since the operation of the task scheduler means that the order of execution of these three threads is completely arbitrary, it is not possible to determine the sequence deterministically – the result would be something more random than a PRNG alone. A not so subtle (and critical) assumption is that the system has other tasks and processes it is managing, which directly impact the scheduler, introducing entropy into the system.
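A simplified sketch of this idea in Python (threads instead of separate processes, and a shared list instead of raw shared memory – simplifications for brevity): the total amount of work done is fully deterministic, but the interleaving is decided by the OS scheduler on each run:

```python
import threading

shared = []  # the shared resource; access order is up to the scheduler
lock = threading.Lock()

def worker(tag: str, count: int = 1000):
    """Append this thread's tag repeatedly to the shared list."""
    for _ in range(count):
        with lock:              # the lock makes each append atomic,
            shared.append(tag)  # but NOT the ordering between threads

threads = [threading.Thread(target=worker, args=(t,)) for t in "ABC"]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The total work is deterministic...
assert len(shared) == 3000
# ...but the interleaving (e.g. shared[:10]) varies from run to run,
# driven by the scheduler, system load, and hardware interrupts.
```

Every functional property we can assert on is stable; the one thing we cannot assert on is the sequence itself – which is exactly the non-deterministic residue the paragraph above describes.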

Before we go on, let’s take a closer look at this. Note that if some piece of software functions the same way (internally and externally) every time it executes, it is deterministic. If this same piece of software functions differently based on external factors unrelated to the software itself, it is non-deterministic. Since kernel level resource managers (memory, scheduler, etc) function in response to system factors and to factors from each and every running process, certain environmental factors are – from the perspective of any one software package – non-deterministic (i.e. random). In addition to the scheduling and sequencing aspects identified above, memory allocations will also be granted or moved in a similar way.

Of course this system level random behavior is only half the story. As software packages are built to take advantage of gigabytes of RAM and lots of parallel execution power, they are becoming functional aggregations of dozens (to hundreds) of independently functioning threads or processes, which introduce a new level of sequencing and interdependencies, all dependent on the task scheduler.

Bottom Line – Any sufficiently complex asynchronous and parallel system will have certain non-deterministic characteristics, based on the number of independent sources that influence access to / use of shared system resources. Layer on the complexity of parallel high level programming, and certain aspects of program operation become very non-deterministic.

Back to Software Reliability

Yes, we have shown that both multitasked parallel hardware and parallel programmed software contribute some non-deterministic behavior in operation, but we also know that for the most part software is relatively reliable. Some software is better and some is worse, but there is clearly some other set of factors in play.

The simple and not very useful answer is “better coding” or “code quality”. A slightly more insightful answer is that code that depends on some non-deterministic feature of the system is probably going to be less reliable. An obvious example is timing loops. Back in the days of single threaded programs on single threaded platforms, programmers would introduce relatively stable timing delays with empty timing loops. This practice was easy, popular, and produced fairly consistent timing – deterministic behavior. As systems hardware and software have evolved, the assumptions this coding practice relies on have become less and less valid. Write a timing loop program on a modern platform and the results can be workable much of the time, but they can also vary by orders of magnitude – in a very non-deterministic manner. There are dozens of programming practices like this that used to work just fine but no longer do – yet they don’t completely break, they just operate a little bit randomly. In many cases, the behavior is close enough to “correct” that the program appears to function, but not very reliably.
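A quick way to see this is to time the same empty loop several times on a modern multitasking system: identical work, varying wall time. A rough Python sketch (the iteration count is arbitrary):

```python
import time

def busy_wait(iterations: int) -> float:
    """Old-school empty timing loop; returns elapsed wall-clock time."""
    start = time.perf_counter()
    count = 0
    for _ in range(iterations):
        count += 1  # burn cycles, as an empty delay loop would
    return time.perf_counter() - start

# The same loop does the same work every run, but on a multitasking
# system its wall time varies with scheduling and system load.
samples = [busy_wait(200_000) for _ in range(5)]
print(samples)  # five durations for identical work, rarely identical
```

On a dedicated single threaded microcontroller those five numbers would be essentially constant; on a loaded desktop they drift – the deterministic assumption behind the technique no longer holds.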

Another coding practice that used to work on single threaded systems was to call some function and expect the result to be available on the next line of code. It worked on single threaded systems because execution was handed off to that function and did not return until it was complete. Fast forward to today, and if this is written as a parallel program, the expected data may not be there when your code thinks it should be. There is a lesson here – high level parallel programming languages make writing parallel code fairly easy, but that does not mean that writing robust parallel programs is easy. Parallel interdependency issues can be just as ugly as parallel assembly code on a Cray system.
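A minimal Python sketch of this failure mode (the 0.1 second delay is an arbitrary stand-in for a slow computation):

```python
import threading
import time

result = {}

def compute():
    """Simulate a slow computation running in a worker thread."""
    time.sleep(0.1)
    result["value"] = 42

t = threading.Thread(target=compute)
t.start()

# Single-threaded habit: expect the result on the very next line.
# In a parallel program the worker almost certainly hasn't finished.
print(result.get("value"))  # typically None -- the data isn't there yet

t.join()                    # explicit synchronization fixes it
print(result.get("value"))  # prints 42
```

The fix is always some explicit synchronization (a join, a future, a queue) – the language makes starting parallel work easy, but making the dependencies correct is still the programmer’s job.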

Summary

A single piece of code running exclusively on a dedicated processor is very deterministic, but parallel programmed software on a multitasking parallel hardware system can be very non-deterministic, and difficult to test. Much of software reliability comes down to how little a given software package depends on these non-deterministic features. Managing software reliability and failure mechanisms requires that programmers understand the system beyond the confines of their program.

Howto: Browse (more) Securely / Privately / Anonymously

Background

For a number of reasons, many people are increasingly concerned with their privacy and security on the Internet. Since the primary thing most people use the Internet for is browsing, this is an opportune use model in which to look for improvement. Of course the tradeoff is that as we make browsing more secure, we may also make the browsing experience more difficult. The list below progresses from low impact / low return to high impact / high return, so you can pick your pain threshold.

Note that in the context of a browser (and browsing), I define security as the ability to browse without being infected or compromised by malware. I define privacy as the ability to browse without sites (or other parties) tracking, harvesting information from my browser. Anonymity is when there is a sufficiently high degree of privacy that the browsing activity is anonymous – and true anonymity is not easy to achieve.

Off the Shelf / Good Browser Hygiene

Browser: There are lots of browser options and I cannot offer an opinion on most of them. On a regular basis browsers are reviewed for security – and Chrome and Firefox are usually in the top three. Privacy is distinct from security, and generally Firefox rates higher than Chrome in that respect. However everything is a tradeoff, and I personally think that Chrome has better performance (which I may be imagining), and my Android devices and Chromebook are Chrome by design – so that is my browser choice by default. Secondary to that, I appreciate the rolling updates and aggressive stance Google takes on security, and I think that outweighs the weaker stance they take on privacy – since I believe I can manage my privacy / personal data more easily than I can manage security threats. Consider browser selection the first step in cleaning up your browser security / privacy concerns.

Browser Settings: The obvious things to check in your browser include:

  • Turn on “Do Not Track”: Open settings and search for this flag – if it is not set, set it. This provides some minimal – and not necessarily honored – level of tracking reduction, since sites are not obligated to respect it.
  • Content Settings (Cookies): I change the defaults to “Keep local data only until I quit my browser” and “Block third-party cookies and site data”.
  • [Chrome specific] Under Sign-in and Sync Settings, I encrypt my sync data with a passphrase. This is all about key management and reducing personal data on Google servers.

Browser Plugins: The following list includes a few plugins that provide improved privacy.

  • HTTPS Everywhere: This is a plugin that will force a HTTPS connection as the default, with HTTP (non-secure) as the fallback.
  • DuckDuckGo Search: Duck Duck Go is a search service that provides much stronger statements about not tracking your browsing / searching activity (as compared with Google). They feel fairly strongly that this is a big deal. Take a look at their positions on results bubbling.
  • DoNotTrackMe: A plugin that gives you explicit tracking information as you browse. This actually provides some visibility into what sites are tracking you in realtime.

Sites: What to do to reduce your browsing footprint.

  • Google Search History: By default Google saves your search history and uses it to target ads and search results. My recommendation – turn it off.
  • Google Dashboard: A nice portal that provides a single view into your data footprint on Google servers. Review and clean it up. While you are there, set up an Alert on your name. It will give you some visibility into possible misuse of your name.
  • Twitter Privacy: Twitter by definition is fairly public so there is not much to tweak. However it makes sense to verify that “Do Not Track” is enabled and consider turning off / deleting location data.
  • Facebook: Expect this to change over time. Privacy settings seem to be a fast moving target at Facebook. So much of the business value proposition of Facebook is about eliminating privacy, so this will always be about providing some minimal level of privacy control that is just enough to keep most users from leaving.

Overall these tweaks to your browsing experience will provide some improved level of security and privacy, but fundamentally much of the browsing process from your client system will still be relatively visible – the contents may be protected with SSL/TLS, but where you are going (page by page by page), how long you are there, and how many kilobytes you have downloaded is not. If your ISP / employer / campus / hotel / building has a proxy server between you and the Internet, they have access to this level of information.

Overall I consider these steps to just be good browser hygiene.

Some Better

If this level of exposure bothers you (it may), and you feel a need to mitigate this issue, read on – a VPN / proxy service may be the solution you are craving.

Technically a VPN and a proxy server are two very distinct functions. A VPN (Virtual Private Network) is a secure (i.e. encrypted) and authenticated (i.e. username/password and server certificate) channel from your client system to some server on the Internet. In the enterprise / business world, VPNs are used to give authorized users on the Internet access to corporate servers on private networks. In the world of proxy servers, VPNs are used to provide a secure channel to some proxy server on the Internet.

A Proxy server is simply a relay for your Internet / Browsing traffic. You send some Internet request to the proxy server, and it redirects it to the Internet, with the source mapped back to the proxy server. When the response is received by the proxy server, it is then relayed back to your client system. Proxy servers are not explicitly secure, so they are generally coupled with some form of VPN to provide a secure channel.
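As a rough sketch of how a client relays traffic this way, Python's standard library can install a proxy handler. The host `proxy.example.com:8080` is a placeholder, not a real service – substitute your provider's endpoint:

```python
import urllib.request

# Hypothetical proxy endpoint -- substitute your provider's host and port.
proxy = urllib.request.ProxyHandler({
    "http":  "http://proxy.example.com:8080",
    "https": "http://proxy.example.com:8080",
})
opener = urllib.request.build_opener(proxy)

# Every request made through this opener is relayed by the proxy, so the
# destination server sees the proxy's address as the source, not yours.
# opener.open("https://example.com")  # not executed here -- placeholder host
```

Note that the proxy itself still sees both endpoints of every request – which matters in the risk discussion below.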

There are large number of VPN/Proxy service providers around the world. For the most part, the free ones (reportedly) have a fairly high rate of malware infection and the for pay ones are from $40 to $100 a year. This is not an endorsement – but PureVPN and HideMyAss are both typical for-pay VPN/Proxy Services, with very typical pricing and functionality providing a wide range of target servers around the world.

When using a VPN/Proxy service, the net effect is that any geolocation will place you at (or near) the location of the proxy server. This means that if you are accessing some Internet service with geolocation service qualifiers (e.g. bbc.com, nfl.com) , you can appear to be somewhere that you are not. It also means that if your employer, hotel, campus, school has blocked sites/services, you can circumvent these restrictions with a VPN/proxy. In both of these cases you are not likely violating any laws, but you are likely violating some Terms of Service – implied or otherwise.

More legitimately, if you often use public or untrusted WiFi networks, a VPN / Proxy ensures that your traffic will not be sniffed on the local network. If you use WiFi in a high density environment, and are concerned about your network being compromised, or you don’t trust the other users on a shared network – a VPN/Proxy can ensure your traffic is secure / private even if your network may not be.

Ultimately, a VPN / Proxy service can provide a step up in privacy / security for a specific set of threats. However, by using a VPN / Proxy service you are literally handing this same information over to the VPN/Proxy service provider – so if your concern is browsing security in general, you have just shifted the risk.

More Better

From this point, there is one very obvious way to achieve better security/privacy – the TOR Browser. The TOR (The Onion Router) Browser is a custom version of Firefox packaged/integrated with a few tools related to The Onion Router, including an Onion Router proxy for your client system. The download package installs easily, and the TOR proxy starts automatically just by launching the TOR browser. If you are serious about using it for the privacy it can provide, read the Warnings FAQ.

The general principle behind TOR is that an outgoing data packet is wrapped in multiple successive layers of encryption, one for each relay in a path through the TOR network. Each relay peels off its own layer to learn the next hop, and the final relay sends the packet to the Internet destination. The goal of this approach is that, through this obfuscated path, the user is much more anonymous and their privacy is protected.
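The layering principle can be sketched with a toy example. XOR stands in for real encryption here – TOR actually uses proper ciphers and per-hop circuits, so this is only to show how each relay peels exactly one layer:

```python
import os

def xor(data: bytes, key: bytes) -> bytes:
    # Stand-in for real encryption -- never use XOR for actual security.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Three relays, each holding a key shared with the client.
relays = [("relay-A", os.urandom(16)),
          ("relay-B", os.urandom(16)),
          ("relay-C", os.urandom(16))]

message = b"GET https://example.com"

# The client wraps the message in one layer per relay; the first relay's
# layer ends up outermost.
packet = message
for name, key in reversed(relays):
    packet = xor(name.encode() + b"|" + packet, key)

# Each relay peels off exactly one layer and forwards the remainder.
for name, key in relays:
    peeled = xor(packet, key)
    hop, packet = peeled.split(b"|", 1)
    assert hop == name.encode()   # this layer was addressed to this relay

print(packet)  # -> b'GET https://example.com'
```

No single relay sees both the source and the destination – that separation is what provides the anonymity.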

In an ideal world, where TOR relays were spread around the world across different organizations, it is possible to achieve some level of anonymity. In the real world, some of these relays are operated by agencies with the intent to compromise the TOR network, reducing its effectiveness. In addition, some academic research has shown a few other weaknesses related to traffic correlation between TOR relays. The net result is that the TOR network and the TOR browser provide a much higher degree of anonymity than any other readily available solution – but it can be broken. For a recent example, refer to the story behind the Silk Road shutdown. Details are lacking, but it does show TOR is susceptible if the incentive is high enough.

Bottom Line

There are a wide range of things you (as a user) can do to reduce your browsing footprint, reduce your ability to be tracked, and increase your security and privacy (and anonymity). However, the first step to any of this is to assess what your threats are, and take reasonable steps to mitigate those threats. If your threats are non-specific and general, then it is likely that the non-specific and general browser hygiene solutions are sufficient. If you have specific threats that fit the more elaborate solutions, apply them appropriately.

Howto: Share Files Securely/Privately

Background

The joint concepts of Secure and Private are relative and subjective. Relative in that there are very few absolutes, but there are an infinite number of variations that may be better or worse. Qualifying “better or worse” is where the subjective comes into play. It is subjective in terms of who / what you are trying to protect your files from. Is it your family, co-workers, your neighbors, the Internet, some large corporation trying to characterize you (in order to better sell to you), or the government? Depending on how good a solution you need and who you are trying to protect your privacy from, we can look at a few easy (and practical) solutions.

Off the Shelf

There are off the shelf solutions that provide file sharing options. Dropbox, Box and Google Drive are three popular examples of cloud storage solutions – meaning your files are on their servers. Each one of these provides some degree of privacy / security. Each of these services uses a username / password to restrict access, and additionally Google and Dropbox support two factor authentication using Google Authenticator. Each of these services uses SSL/TLS to provide a secure channel from the client to their servers. What they do not provide is any explicit privacy or security from the respective services themselves, or from anybody with an NSL (National Security Letter).

Fundamentally these services are not particularly private or secure, but they do provide some degree of both. If you use them and Two Factor Authentication is an option – use it.

A Better Option(s)

If the convenience of these services is appealing, but you have some real need for something more secure, there is a better solution. TrueCrypt is a disk encryption tool that can create secure containers for files. Specifically, TrueCrypt can be used to create a secure file container in your GDrive/Dropbox/Box sync directory on your client system. This container can be opened by TrueCrypt, files placed inside, and then closed – at which point the service will sync the file up to their servers. The services will have access to the container file, but its contents will be completely hidden from all except the keyholder. Note – a large container will hold lots of files, but the entire container will need to be re-synced even if there is a minor change – so consider wisely how large / small this container should be.

Another tool is Keepass, a secure password locker that is similar – but only for password / account information. Both of these tools are also cross platform and open source.

An Even Better Option

One of the core flaws with each of the cloud storage solutions identified above (as examples) is that ultimately all of your data resides on their servers within the provider's data centers. BitTorrent Sync is a solution that breaks that paradigm by distributing files using the BitTorrent protocols in a peer to peer (P2P) fashion. The result is that files can be distributed and shared between multiple users / platforms, but they do not exist on any cloud server – greatly reducing the risk of compromise. BitTorrent Sync is easy to set up and use. Specifically, the app is installed, and then you create a share and generate a key – initiating the share. If you are connecting to an existing share, you create a share and provide the key for that share, and it will automagically be synced from the other clients on that share.

The most significant upside (other than P2P architecture) is that there are no storage or transfer limits – the only limitation being your local capacity.

The only significant downside to BitTorrent Sync is that synchronization requires peers to be online at the same time – since there is no cloud storage server, at least two members must be online for a sync to occur.

For the truly paranoid, TrueCrypt can be used on top of BitTorrent Sync.

Bottom Line

These are a few examples of how to secure / privatize file sharing on the Internet using relatively non-private services coupled with a few open source applications. However, it is very important to understand key management – since this security / privacy is only as strong as the keys you use to protect it. The applications themselves are fairly mature, well reviewed and generally accepted as secure.

Google Two Factor Authentication

Background

Anytime we get real data on Internet user passwords, we once again discover people are bad with passwords. Additionally, as the tools to compromise and crack passwords get better, even high quality passwords are becoming less secure. Two factor authentication is something that should be used when available – and if you have an authenticated website / webapp, there is a cheap and easy method to implement it.

In an earlier post, I showed that some online services were more critical than others from a security perspective – specifically the email account used for account recovery for other services. In many cases, this is Google Gmail and in this post I will be using it as an example.

One Time Passwords and Google Authenticator

Google Authenticator is a relatively simple app written by Google that generates time windowed One Time Passwords (OTP) every 30 seconds. This app is available for Blackberry, iOS and Android devices, and can be used for Google account access as a Two Factor Authenticator (2FA). More importantly, it can be used by any non-Google website or application developer. Let me back up a minute, and explain why this is a good thing.

An Authenticator is something you use to authenticate – or prove who you are to a system. A password is an authenticator, but not a very good one by itself (anymore). Authenticators can be based on:

  •  Something you know : Password, PIN code
  • Something you possess : Smart Card/Fob, SecurID, device with Google Authenticator
  • Something you are (biometrics) : Fingerprints, Retina scan, etc.

The idea behind Two Factor Authentication is that even if one of the factors is weak, the combination of two factors is much stronger than either one of the authenticators individually. Most importantly – it is very easy to share passwords, but very hard to share both parts of a Two Factor Authentication. In the very recent past, 2FA was not very accessible since passwords are cheap to use / implement, and none of the other authenticator options were.

Here is where Google Authenticator comes in. Google Authenticator provides a well known (RFC6238) method to generate six digit authenticator tokens based on the current time and a shared secret key. The app can also support multiple concurrent authentication generators. The app does not depend on Google services – and up until a certain point, it was open sourced. Open source equivalents to Authenticator are available. Details on the alternatives and how Authenticator functions is in the associated Wikipedia article.
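As an illustration of how RFC 6238 works, the six digit token can be reproduced with only the Python standard library. This is a sketch for understanding the mechanism – use a vetted implementation in production:

```python
import base64, hashlib, hmac, struct, time

def totp(secret_b32, at=None, digits=6, period=30):
    """RFC 6238 TOTP: HMAC-SHA1 over the current time step, dynamically truncated."""
    pad = "=" * (-len(secret_b32) % 8)
    key = base64.b32decode(secret_b32.upper() + pad)
    counter = (int(time.time()) if at is None else at) // period
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 test vector: ASCII secret "12345678901234567890" at T=59 seconds
print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", at=59, digits=8))  # -> 94287082
```

Both the server and the app hold the shared secret; since the token depends only on that secret and the clock, the two sides independently agree on the same six digits every 30 seconds – no network connection to Google required.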

Enabling on Google Account – How it Works

To setup 2FA on your Google account, do the following:

  1. Install Google Authenticator on your Smart Device (phone / tablet / etc)
  2. Login to your Google Account
  3. Go to Account Settings / Security / 2-Step Verification and select ‘edit’
  4. Enter the information including the phone number and printing out the 10 emergency codes. Safety nets are what prevent Self Inflicted Denial of Service Attacks (SIDoSA).
  5. Follow the instructions to load the shared secret into the app AND verify it.
  6. That’s it – you are setup.

After that, you will be asked to enter username / password followed by a request for the six digit authenticator from your smart device. Since I don’t store cookies, I need to do this each time I login – but after a few days it becomes an easy habit. I also have the knowledge that my account is fairly secure – even if my password looks like “Fluffy-Bunnies” instead of something like ‘H@Af5%Zwqhkh*6iJ8’.

Potential Risks with using Google Authenticator

There are no risk free solutions to real problems, and Google Authenticator also has its risks. We can look at a couple of scenarios to see what some of those may be:

  • When used on Google Account:
    • Q: If my Google Authenticator device is lost or stolen and it happens to be the phone listed as my recovery, could somebody use that to access my Google account?
    • A: Only if: your phone is not locked (it should be), and they also have your password – since they need both factors to get in. Low Risk (and yes you should put a lock on your phone).
    • Steps: If this actually did happen, the first action you should take is to use one of your 10 recovery codes to log in to your Google account, disable 2FA, disable that device's password (if you use device passwords) and change your primary password – taking your lost / stolen authenticator out of the loop and disabling access of any form from that device.
  • When used on some Non-Google website /application:
    • Q: Since the secret key for this non-Google website / application is entered into Google Authenticator, does Google now have access to my account on this non-Google website / application?
    • A: Not very likely. It is possible that they are backdooring all of these secret keys, but since:
      • There is no direct association between a secret key and a given website / application, there is no direct way for Google to know where this key should be used; and
      • It is only one half of a two factor authentication, since they are missing the password authenticator (and the username).

Bottom Line

Passwords alone are about a decade past being effective and rapidly approaching useless. Google Authenticator provides an effective authenticator generator for Google accounts that can also be used on just about anything (there is a PAM plugin available), and when paired with a password it provides a much better degree of security.

Recommendations

Use it for Google accounts and any other website that offers it as an option. Use it for your enterprise login.

For the Maker community – Use it for your PIN pad on your house/garage door. Use it for access to your home automation webserver.  A rolling Google Authenticator can be duplicated on multiple devices easily to allow family wide access, but cannot be shared with others (something to be said for that). 

Use it everywhere you can imagine – and if you can use it with a password, you have 2FA and all of the goodness that comes with that.

Security as a System: iMessage

Background

Say it with me folks, “Security is a system”. In case it is not obvious what that means, I will articulate. Security is made up of a collection of parts, and the system security of this collection is not based on the average security of those parts, or the sum security of these parts – it is based on the weakest security of those parts and how they are integrated. And – sometimes that weakness is not even a part, but a gap in the integration of those parts.

I know, in the abstract we have all heard it before. However we have a new example that highlights this principle.

Before we get into the example I want to share a few of Tom’s Rules of Security:

  1. Know your Threat – Security can only be understood / judged in the context of a given threat type. Good security against one class of threat may be worthless against another class of threat.
  2. Follow the Keys – Key management is about how and where the encryption / access control keys are kept. Whoever holds the keys controls the access. If it isn’t you, you don’t control access.

Note – These are not really my rules, just my personal versions of well known principles.

In the Spotlight: Apple iMessage

Recently there was a presentation in Kuala Lumpur (by pod2g) addressing the security of Apple iMessage. More specifically, the presentation highlighted a few weaknesses that illustrate the two rules identified above, and went into great detail as to how it could be compromised. For our purposes, we need to first look at how iMessage works and then we will look at how it can fail – or at least be insecure.

From any given iMessage client a secure message can be sent to any other iMessage client. This message is protected with public/private key encryption. With public/private key encryption, every user has two keys – a private key (which is secret) and a public key (which can be shared with anybody). These are special keys in that any message encrypted with the public key can only be decrypted with the corresponding private key, and conversely any message encrypted with the private key can only be decrypted with the corresponding public key. In the first case, it allows somebody to send a secure message to a given person without needing to exchange secret keys. In the second case – also known as digital signing – it identifies the source of the message, since only the holder of that private key could create it. When both methods are used, a message can be sent from Ted to Alice (using their respective keys) and:

  • Ted knows Alice and only Alice can decrypt the message if it was encrypted with her public key.
  • Alice knows that Ted and only Ted could have sent the message if it was signed with his private key.
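The two properties above can be demonstrated with textbook RSA using deliberately tiny primes. This is purely illustrative – real systems use large keys and padding, and iMessage's exact algorithms are not detailed here:

```python
# Textbook RSA with toy primes -- NOT secure, for intuition only.
p, q = 61, 53
n = p * q                      # public modulus (3233)
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (requires Python 3.8+)

m = 65                         # a "message" small enough to fit below n

# Ted -> Alice: encrypt with Alice's PUBLIC key; only her PRIVATE key decrypts.
c = pow(m, e, n)
assert pow(c, d, n) == m

# Signing: Ted transforms with his PRIVATE key; anyone with his PUBLIC key
# can verify that only Ted could have produced it.
s = pow(m, d, n)
assert pow(s, e, n) == m
```

Note that both properties depend entirely on who holds which key – which is exactly why control of the key directory (discussed next) matters so much.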

To a certain degree, that is what Apple iMessage does. However, messages are not sent directly between Alice and Ted; they are sent through Apple services and retained under both of the AppleID accounts of the message participants. This is not a security exposure by itself, but as a user I would like the option to purge my historical data. It may be securely encrypted today, but who is to say how secure that may be in the future?

In any case, the next part of the story is that all the public keys used by iMessage are stored on Apple ESS servers and are delivered to iMessage clients automatically – which puts Apple in a perfect position to compromise any encrypted iMessage with a Man in the Middle (MiTM) attack. Specifically, pages 75 & 77 (of the presentation) show that Apple has full control of the public key directory, public keys are retrieved by clients “as needed” (with a 30 minute cache window), and users have no visibility into the public keys being used. At any point Apple has the technical capability to insert themselves as the endpoint to a message and then re-encrypt the message and send it to the intended recipient. Since the keys are exchanged in the background, the users will not be aware that it was not an end to end encryption.

Most of the other 88 pages in this presentation illustrate how iMessage works under the covers, and the challenges of a third party compromise. I will give you a clue – it would be very difficult for anybody who is not Apple to compromise iMessage, but technically very easy for Apple to do.

Bottom line

Apple controls the keys: There is nothing to imply that Apple is spying on iMessages. However, there are no technical limitations that would prevent them from doing so if they were so inclined or directed to, since indirectly they control the keys.

Know your threat: If the threat you are concerned about is Joe Internet Hacker, iMessage is very secure, with a very low risk of interception/decryption. However, if the threat you are concerned about has a National Security Letter in their pocket, iMessage probably does not provide much security.

Update [2013 Nov 5] : A very well written analysis of Lavabit at thoughtcrime.org shows that Lavabit had a similar approach to key management – and same weakness of co-mingling the keys with the “secured” accounts on the server.

Internet Security As a System

Background

Most of us do not see our activities on the Internet as a system, and if it is a system, we are not sure what that has to do with securing ourselves on the Internet. First let's look at a typical Joe Internet User in terms of the definition of a system – “a set of connected things or parts forming a complex whole”. The parts are the individual services we use – GMail, Facebook, Amazon, iTunes, PayPal, Verizon and/or AT&T, etc. For each one of these we have a username and password – which may or may not be very unique. The connecting element is the user – Joe Internet User – who is the real target of an attacker.

How you defend this type of a system is not entirely obvious, however if we flip the perspective around it may give us some insight. Specifically, how would an attacker plan to go after your accounts to their benefit?

If we assume the threat model is a high volume, Internet cyber extortionist looking for a quick return, we can characterize an attack pattern.

Phases of an Attack

A simple attack has three phases:

Compromise – This phase is where an attacker has already identified you as a target, and is probing for a weakness / vulnerability to “get inside” – compromising the system.

Mapping / Discovery – This phase is where the attacker has compromised some part of your system of services and is mapping out your other accounts / services. Since this phase is essentially passive information gathering, it is fairly hard to detect. This information is used to plan and execute the next phase as quickly as possible.

Exploitation – This phase is where the attacker implements a plan to use the information collected to their benefit – and usually to your detriment.

An Example of a Common Attack

In this example, Joe Internet User is a typical first world Internet power user with all of the accounts listed above – GMail, Facebook, Amazon, iTunes, PayPal, Verizon and/or AT&T, etc.

The attacker has been perusing Facebook and found a public profile for a promising target. The status updates indicate an iPhone / iPad / Android tablet / smartphone, etc. – indicating an iTunes or Google Play account, or both. Other references may indicate online shopping habits – enabling the attacker to identify target accounts. Most importantly, the attacker discovers the target's primary email address – either GMail, Hotmail or Yahoo (for example). Connections to other social networks (e.g. Twitter, Google+, Instagram, etc.) provide additional sources of personal information. At this point the attacker knows where you live, your age, family / marital status, friends, pets / kids names / ages, where you work, what you do for a living, where you went to school, and what you do for fun. All from public sources.

The next part of discovery is compromising an account. The most promising is usually the primary email account. This is due to the magical feature of every Internet service – the password recovery email address. People forget passwords and people forget usernames, but every service has an email address for password recovery. This is usually set up when the account is initially created, and forgotten shortly afterwards.

To get back to our process, the attacker makes a number of educated guesses at the password for the user's primary email account – and sadly most people are still using simple passwords. Is your email password based on a birthday, names (parents, spouse, kids, pets), sports team / player, or personal interests? With one or two numbers appended? In any case, let's just guess that an attacker will compromise a quarter of all accounts in less than 25 guesses – and our Joe Internet User GMail account has been compromised. Where does that lead us?
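To make this concrete, here is a sketch of how such a guess list gets built. The profile facts are entirely invented, but the pattern (personal word plus a short suffix) matches what many people actually use:

```python
# Facts harvested from a hypothetical public profile.
profile = {"pet": "fluffy", "team": "giants", "kid": "emma", "birthyear": "1985"}

guesses = []
for word in profile.values():
    for suffix in ("", "1", "12", "123", profile["birthyear"][-2:]):
        guesses.append(word + suffix)               # e.g. "fluffy85"
        guesses.append(word.capitalize() + suffix)  # e.g. "Fluffy85"

# Forty candidate passwords built entirely from public facts.
print(len(guesses))  # -> 40
```

A few dozen guesses like these, tried quietly over days to avoid lockouts, cover a depressing share of real-world passwords.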

The attacker is patient, and access to a primary email account is a much better way to collect more useful / personal information. One of the first things an attacker is going to do is download the user's contacts and email – in case the user suspects compromise and changes the password. Most webmail services provide this feature, and it ensures that the attacker has a backup of your information. At this point we have to ask a few questions about Joe User's webmail account. Does he have a folder with his online account email? Bills, credit cards, online shopping accounts? Do the contacts have birthdays, anniversaries, even Social Security numbers? We know they have addresses, email and phone numbers. Each of these helps build data for credit card fraud. At this point this is still a discovery process, and the attacker is very careful not to touch, change or leave any clues of activity.

Exploitation is the next step, and the attacker will develop a plan of attack – usually the first step is based on the accounts and stored credit cards / store credit cards. For example – is there an Amazon, Tiffanys, Macys, Sears, etc. online account with a credit card saved in the online store? Is the email account tied in with a Google Play Store account and a credit card? The attacker can buy phones, tablets and computers using that account. Is it tied to a Verizon, AT&T, or T-Mobile account with a credit card stored in the account? Once again, the attacker can buy phones and tablets from these accounts. The first thing to consider for online shopping is embedded credit card numbers. Some of these are credit cards that can be removed – but most store credit cards are automatically available on the account and cannot be removed without cancelling the credit card.

The next step of exploitation is to look for signs of illegal or incriminating information that can be used to extort something from the user. Most people know this as blackmail, and although it does not occur often – it does occur. Think about the depth and breadth of highly personal information that is in your email accounts.

Going one step beyond blackmail, attackers will sometimes “hijack” all of the accounts by changing the passwords and redirecting the recovery email address to some email account held by the attacker. Then a message is sent to the user, asking for ransom to get their accounts back. Once again – this is rare, but it does occur.

Generally the last part of exploitation is where all of this personal information gathered on Joe User, his friends, family, acquaintances etc., is used to build a persona database for applying for credit and loans – credit fraud and what is commonly known as identity theft.

A Few Simple Steps

This example shows how attackers see the collective accounts and services of Joe Internet User as a system – with Joe User as the key connective element, and how attacking a few weaknesses provides significant opportunity to the attacker.

  1. Learn how to create Good Passwords (and use them when possible) – I get frustrated when an account service requires an 8-12 character password, with upper case, lower case, numbers and symbols. This does create a high entropy password – but it is also very difficult to remember. Take a look at this xkcd panel and think about it when you create passwords.
  2. Primary Email Account – Since your primary email account is your account recovery account, this account is more critical than any other account. Choose / use a quality password and if possible use two factor authentication.
  3. Two Factor Authentication (2FA) – If the service offers two factor authentication, referred to as “2-step verification” by Google – use it. Two factor authentication does not make an account impossible to compromise, but it makes it sufficiently hard that this type of attacker will move on as soon as they discover you are using it. Google (GMail, Google Play) and WordPress both offer free 2FA for user accounts. In both cases it is based on a mobile device app – Google Authenticator.
  4. Stored Credit Card Numbers / Bank Account Numbers – Carefully tradeoff the convenience of storing a credit card online in an account versus the cost if it is compromised. I recommend removing any general credit card numbers.
  5. Store Credit Accounts – Store credit accounts are usually tied right to that stores online store and cannot be removed without closing that line of credit. Attackers know this and use this to their advantage. Consider closing those lines of credit.
  6. Sanitize Contacts / Email – Audit your contacts and all of your email to see what could be deleted and clean it up. How necessary is a 5 year archive of all sent mail? If you are worried about holding onto everything – back it up before cleaning. The less information available in a compromise, the lower the risk.
  7. Sanitize Social networks / Make your profile Private – Most of the social networks now enable you to make your profile private – so only your circles / friends can see what is on your pages. In addition, content should be cleaned up to reduce your online presence. Once again, is it really necessary to have a 5 year archive of Facebook posts?
  8. Unique Passwords – DO NOT use the same password for all your accounts. Do not use a couple of passwords across all your accounts. Use a unique password for each account. If one of your accounts is compromised, make the attacker work for each account – don't just give it to them.
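To put numbers behind the password advice in step 1: entropy in bits is simply length × log2(pool size). A quick sketch comparing a "complex" 8-character password against an xkcd-style passphrase (the 2048-word list size is an assumption matching common diceware-style lists):

```python
import math

def entropy_bits(pool_size, length):
    """Bits of entropy for `length` symbols drawn uniformly at random from a pool."""
    return length * math.log2(pool_size)

complex8 = entropy_bits(94, 8)     # 8 chars from ~94 printable ASCII: ~52 bits
words4   = entropy_bits(2048, 4)   # 4 words from a 2048-word list: 44 bits
words5   = entropy_bits(2048, 5)   # 5 words: 55 bits -- beats the hard-to-remember
                                   # 8-character "complex" password

print(round(complex8), round(words4), round(words5))  # -> 52 44 55
```

The caveat: these figures assume truly random selection. A password derived from a pet's name and a birth year has far less entropy than its length suggests – which is exactly what the guessing attack above exploits.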

These steps will not make your accounts bulletproof, but most attackers are opportunists and these steps will harden your accounts enough for them to move on to somebody else.