A Brief Introduction to Security Engineering

Background

One of the great myths is that security is complicated, hard to understand, and must be opaque to be effective. This is mostly fiction perpetuated by people who would rather you did not question the security theater they are creating in lieu of real security, by security practitioners who don’t really understand what they are doing, or by those who are trying to accomplish something in their own interests under the false flag of security. This last group is why so many government “security” activities are not really about security, but about control – which is not the same thing. Designing and doing security can be complex, but understanding security is much easier than it is generally portrayed.

Disclaimer – This is not a comprehensive or exhaustive list / analysis. It is a brief introduction that touches on a few of the most practical elements of security engineering.

Security Axioms

Anytime I look at systems security, there are a few axioms I use to set the context, limit the scope and measure the effectiveness. These are:

  1. Perfect security is unachievable, and any practical security is the result of some cost-driven tradeoff.
  2. Defining and understanding your threat model is step zero of any security solution. If you don’t know who you are defending against, the solution will not fit.
  3. Defining and understanding success. This means understanding what you are trying to protect and what exactly protecting those elements means.
  4. Defending a system is more costly / difficult than attacking that same system. Attackers only need to be successful once, but defenders need to be successful every time.
  5. Security based on secrecy is weaker than security based on strength. Closed security solutions are more likely to contain flaws that weaken the security versus open security solutions. Yes – this has been validated.

The first of these is a recognition that security is about a conflict between a system / information defender and an attacker on that system. Somebody is trying to take something of yours and you want to stop them. Each of these two parties can use different approaches and tools to do this, with increasing costs – where costs are monetary, time, resources, or risks of being caught / punished. This first axiom simply states that if an attacker has infinite time, money, resources, and zero risk, your system will be compromised because you are outgunned. For less enabled attackers, the most cost effective security is that which is just enough to discourage them so they move on to an easier target. This of course leads to understanding your attacker, and the next axiom – know your threat.

The second axiom states that any security solution is designed to protect from a certain type of threat. Defining and understanding the threats you are defending against is foundational to security design since it will drive every aspect of the system. A security system to keep your siblings, parents, or children out of your personal data is completely different than one designed to keep cyber extortionists out of your Internet accounts.

The third axiom is based on the premise that most of what your system / systems are doing requires minimal protection (depending on the threat model), but some parts of it require significant protection. For example – my Internet browsing history is not that important as compared with my password and account access file. I have strong controls on my passwords and account access (e.g. KeePass), and my browsing history is behind a system password. Another way to look at this is to imagine what the impact could be if a given element were compromised – that should guide the level of protection for that item.
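
As a rough illustration of letting impact guide protection, here is a minimal Python sketch. The asset names come from the example above, but the impact scores, the thresholds, and the `protection_level` helper are all assumptions made up for illustration, not a standard scale.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    impact: int  # 1 (annoying) to 5 (severe) if compromised; illustrative values


# Hypothetical inventory; the scores are guesses for the sake of the example.
assets = [
    Asset("browsing history", 2),
    Asset("password and account file", 5),
    Asset("family photos", 3),
]


def protection_level(asset: Asset) -> str:
    """Map impact to a coarse level of protection (thresholds are assumptions)."""
    if asset.impact >= 4:
        return "strong: encrypted store with its own passphrase"
    if asset.impact >= 2:
        return "basic: device or account password"
    return "none needed"


# List the assets from highest to lowest impact.
for asset in sorted(assets, key=lambda a: a.impact, reverse=True):
    print(f"{asset.name}: {protection_level(asset)}")
```

The point is only that the ranking comes first and the choice of control follows from it.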

The fourth axiom is based on the premise that the defender must successfully defend every vulnerability, every time, but the attacker only has to succeed against one vulnerability, one time. This is also why complex systems are more prone to compromise – greater complexity leads to more vulnerabilities (since there are more places for gremlins to hide).

The fifth one is perhaps the least obvious axiom of this list. Simply put, the strength of a security control should not be based on the design being secret. Encryption protocols are probably the best example of how this works. Most encryption protocols over the last few decades have been developed and publicized within the peer community. Invariably, weaknesses are found and corrected, improving the quality of the protocol, and reducing the risk of an inherent vulnerability. These algorithms and protocols are published and well known, enabling interoperability and third party validation, which reduces the risk of vulnerabilities due to implementation flaws. In application, the security of the encryption is based solely on the keys used by the users. The favorite counter example is from the world of traditional pin tumbler locks, in which locksmith guilds attempted to keep their design / architecture secret for centuries, and passed laws making it a crime to possess lock picks or to know how to pick a lock unless you were a locksmith. Unfortunately, these laws did little to impede criminals and it became an arms race between lock makers, locksmiths and criminals, with the users of locks being kept fairly clueless. Clearly of the lock choices available to a user, some locks were better, some were worse, and some were nearly useless – and this secrecy model of security meant that users did not have the information to make that judgement call (and in general they still don’t). The takeaway – if security requires that the design / architecture of the system be kept secret, it is probably not very good security.
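
To make the "public design, secret key" idea concrete, here is a small Python sketch using HMAC-SHA256 from the standard library. It uses message authentication rather than encryption, simply because the standard library ships it; the helper names `make_tag` and `verify_tag` and the sample message are my own illustrative choices.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Authenticate a message with HMAC-SHA256, a fully published algorithm."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time check that the tag matches the message under this key."""
    return hmac.compare_digest(make_tag(key, message), tag)


# The only secret is the key: 32 random bytes from the operating system.
key = secrets.token_bytes(32)
message = b"transfer 10 units to account 42"
tag = make_tag(key, message)

# An attacker who knows the algorithm but not the key has to guess a key.
wrong_key = secrets.token_bytes(32)

print(verify_tag(key, message, tag))        # the key holder can verify
print(verify_tag(wrong_key, message, tag))  # knowing the design is not enough
```

Everything about the algorithm is documented and reviewed in public; the protection rests entirely on `key` staying private.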

Threat Models

In the world of Internet security and information privacy, there are only a few types of threat models that matter. This is not because there are only a few threats, but because the methods of attack and the methods to defend are common. Generally it is safe to ignore threat distinctions that don’t affect how the system is secured. This list includes:

  1. Immediate family / Friends / Acquaintances – Essentially people who know you well and have some degree of physical access to you or the system you are protecting.
  2. Proximal Threats : Threats you do not know, but who are physically / geographically close to you and the system you are protecting.
  3. Cyber Extortionists : A broad category of cyber attackers whose intent is to profit by attacking and compromising your information. This group generally targets individuals, but not a specific individual – they look for easy targets.
  4. Service Compromise : Threats who attack large holders of user information – ideally credit card information. This group is looking for bulk information and is not targeting individuals directly.
  5. Advanced Persistent Threats (APTs) : Well equipped, well resourced, highly capable and persistent. These attackers are generally supported by governments or large businesses and their targets are usually equally large. This group plans and coordinates their attacks with a specific purpose.
  6. Government (NSA / CIA / FBI / DOJ / DHS / etc): Currently the biggest, baddest threat. They have the most advanced technical resources, the most money, and they use National Security Letters when those are not enough. They collect data in bulk, and they target individuals.

From a personal security perspective we are looking at threats most likely to concern any random user of internet services – you. In that context, we can dismiss a couple of these quickly. Let’s do this in reverse order:

Government (NSA et al) – If they are targeting you specifically, and you use Internet services – you are in need of more help than I can provide in this article. If your data is part of some massive bulk data collection – there is very little you can do about that either. So in either case, in the context of personal data security for Joe Internet User, don’t worry about it.

Advanced Persistent Threats (APTs) – Once again, much like the NSA, it is unlikely you would be targeted specifically, and if you are, your needs are beyond the scope of this article. So – although you may be concerned about this threat, there is very little you can do to stop it.

Service Compromise – I personally pay all of my bills online, and every one of these services wants to store my credit card in their database. Now the question you have to ask is: if (for example) the Verizon customer database is compromised and somebody steals all of that credit card information (tens of millions of card numbers) and uses it to run up hundreds of millions in charges – is Verizon (or any company in that position) going to take full responsibility? Highly unlikely – and that is why I do not store my credit information on their systems. If they are not likely to accept responsibility for any outcome, should you trust them with your credit?

Cyber Extortionists – The most interesting and creative of all these threat classes. I continue to be amazed at every new exploit I hear about. Examples include mobile apps that covertly call money transfer numbers (eg 1-900 numbers in US), or apps that buy other apps covertly. Much like the Salami Slicing attacks (made famous in the movie Office Space), individual attacks represent some very small financial gain, but the hope is that collectively they can represent significant money.

Proximal Threats – If somebody can physically take your laptop, tablet, or phone, they have a really good shot at all of the information on that device. Many years ago, I had an iPhone stolen from me on the Washington DC metro. I had not enabled the screen lock, and I had the social security numbers / birthdays of my entire family in my contacts. And yes, there were fraudulent attempts to get credit based on this information within hours – unsuccessfully. I now use / recommend everybody use some device access lock, and encrypt very sensitive information in some form of locker: passwords / accounts and social security numbers in KeePass, and sensitive file storage in TrueCrypt. These apps are free and provide significant protection for Just In Case. Remember physical control / access to a device is its own special type of attack.

Friends / Family / Acquaintances – In most cases, the level of security to protect from this class of threat is small. More importantly, it is crucial to understand what it is you are trying to protect, why you are protecting it, and what your recovery options are. To repeat – what are your recovery options? It is very easy to secure your information, and then forget the password / passphrase or corrupt your keyfile. Compromise of private data in this context is orders of magnitude less likely than you locking yourself out of your data – permanently. Yes, I have done this, and family photos on a locked TrueCrypt partition cannot be recovered in your lifetime. So when you look at security controls to protect from this threat model, look for built in recovery capabilities and only protect what is necessary to protect.

Conclusions

Fundamentally security engineering is about understanding what you are trying to protect, who / what your threat is, and determining what controls to use to impede the threat while not impeding proper function. Understanding your threat is the first and most important part of that process.

Lastly – I would encourage everybody who finds this the least bit interesting to read Bruce Schneier’s blog and his books. He provides a very approachable and coherent perspective on IT security / Security Engineering.

Software: Thoughts on Reliability and Randomness

Overview

Software Reliability and Randomness are slippery concepts that may be conceptually easy to understand, but hard to pin down. As programmers, we can write the equivalent of ‘hello world’ in dozens of languages on hundreds of platforms and once the program is functioning – it is reliable. It will produce the same results every time it is executed. Yet systems built from thousands of modules and millions of lines of code function less consistently than our hello world programs – and are functionally less reliable.

As programmers we often look for a source of randomness in our programs, and it is hard to find. Fundamentally we see computers as deterministic systems without any inherent entropy (for our purposes – randomness). For lack of true random numbers we generate Pseudo Random Numbers (PRNs), which are not really random. They are used in generating simulations, and in generating session keys for secure connections, and this lack of true randomness in computer generated PRNs has been the source of numerous security vulnerabilities.
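
A quick Python sketch of why seeded PRNs are "not really random": the same seed always reproduces the same sequence. The helper name `prng_sequence` and the seed value are arbitrary choices for illustration. For session keys, the standard library's `secrets` module draws from the operating system's entropy source instead.

```python
import random
import secrets


def prng_sequence(seed: int, count: int = 5) -> list[int]:
    """Return `count` pseudo random integers from a generator with a known seed."""
    gen = random.Random(seed)
    return [gen.randrange(1_000_000) for _ in range(count)]


first = prng_sequence(seed=1234)
second = prng_sequence(seed=1234)
print(first == second)  # anyone who learns the seed can replay the sequence

# For keys and tokens, ask the operating system's entropy source instead.
session_key = secrets.token_bytes(16)
print(session_key.hex())
```

That reproducibility is exactly what you want in a simulation and exactly what you do not want in a session key.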

In this post I am going to discuss how software can be “unreliable”, deterministic behavior, parallel systems / programming, how modern computer programs / systems can be non-deterministic (random), and how that is connected to software reliability.

Disclaimer

The topics of software reliability, deterministic behavior, and randomness in computers are massively deep and complex. The discussions in this blog are high level and lightweight, and I make some broad generalizations and assertions that are mostly correct (if you don’t look too closely) – but hopefully still serve to illustrate the discussion.

I also apologize in advance for this incredibly dry and abstract post.

Software Reliability

Hardware failure most often occurs when some device in a system breaks (the smoke comes out), and the system no longer functions as expected. Software failures do not involve broken hardware or devices. Software failures are based on the concept that there are a semi-infinite number of paths (or states) through a complex software package, and the vast majority will result in the software acting and functioning as expected. However there are some paths through the code that will result in the software not functioning as expected. When this happens, the software and system are doing exactly what the code is telling them to do – so from that perspective, there is no failure. However, the software is not doing what is expected – which we interpret as a software failure, and which provides a path to understanding the concept of software reliability.

Deterministic Operation

Deterministic operation in software means that a given program with a given set of inputs will function in exactly the same manner every time it is executed – without any unexpected behaviors. For the most part this characteristic is what allows us to effectively write software. If we carry this further, and look at software on simple (8 / 16 bit) microprocessors / microcontrollers, where the software we write runs exclusively on the device, operation is very deterministic.

In contrast – on a modern system, our software exists in a relatively high level on top of APIs (application programming interfaces), libraries, services, and a core operating system – and in most cases this is a multitasking/multi-threaded/multi-cored environment. In the world of old school 8 / 16 bit microprocessors / microcontrollers, none of these layers exist. When we program for that environment, our program is compiled down to machine code that runs exclusively on that device.

In this context, our program not only operates deterministically in how the software functions, but the timing and interactions external to the microprocessor are also deterministic. In the context of modern complex computing systems, this is generally not the case. In any case, the very deterministic operation of software on a dedicated microprocessor makes it ideal for real world interactions and embedded controllers. This is why this model is used for toasters, coffee pots, microwave ovens and other appliances. The system is closed – meaning its inputs are limited to known and well defined sources, and its functions are fixed and static, and generally these systems are incredibly reliable. After all, how often is it necessary to update the firmware on an appliance?

If this were our model for the world of software and software reliability, we would be ignoring much of what has happened in the world of computing over the last decade or two. More importantly – we need to understand that this model is an endpoint, not the whole story, and to understand where we are today we need to look further.

Parallel Execution

One of the most pervasive trends in computing over the last decade (or so) is the transition from increasingly faster single threaded systems to increasingly parallel systems. This parallelism is accomplished through multiple computing cores on a single device and through multiple processing threads on a single core, which are both mechanisms to increase the ability of the processor to produce more work by being able to support concurrently running programs. A typical laptop today can have two to four cores and support two hardware threads per core, resulting in up to 8 relatively independent processes running at the same time. Servers with 16 to 64 cores, which would have qualified as supercomputers (small ones) a decade ago, are now available off the shelf.

Parallel Programming: the Masochistic Way

Now – back in the early 80s as an intern at Cray, my supervisor spent one afternoon trying to teach me about how Cray computers (at that time) were parallel coded. As one of the first parallel processing systems, and as systems where every cycle was expensive – much of the software was parallel programmed in assembly code. The process is exactly how you would imagine. There was a hardware scheduler that would transfer data to/from each processor to main memory every so many cycles. In between these transfers the processors would execute code. So if the system had four processors, you would write assembly code for each processor to execute some set of functions that were time synchronized every so many machine cycles, with NOPs (no operation) occasionally used to pad the time. NOPs were considered bad practice since cycles were precious and not to be wasted on a NOP. At the time, it was more than I wanted to take on, and I was shuffled back to hardware troubleshooting.

Over time I internalized this event, and learned something about scalability. It was easy to imagine somebody getting very good at doing two (maybe even 3 or 4) dissimilar time synchronous parallel programs. Additionally, since many programs also rely on very similar parallel functions, it was also easy to imagine somebody getting good at writing programs that did the same thing across a large number of parallel processors. However, it is much harder to imagine somebody getting very good at writing dissimilar time synchronous parallel programs effectively over a large number of parallel processors. This is in addition to the lack of scalability inherent in assembly language.

Parallel Programming – High Level Languages

Of course in the 80s or even the 90s, most computer programmers did not need to be concerned with parallel programming, and every Operating System was single threaded, and the argument of the day was Cooperative multitasking versus Preemptive multitasking. Much like the RISC vs CISC argument from the prior decade, these issues were rendered irrelevant by the pace of processor hardware improvements. Now many of us walk around with the equivalent of that Cray supercomputer in our pockets.

In any case the issue of parallel programming was resolved in two parts. The first being the idea of a multi-tasking operating system with a scheduler – the core function that controls what programs are running (and how long they run) in parallel at any one time. The second being the development of multi-threaded programming in higher level languages (without the time synchronization of early Crays).

Breaking Random

Finally getting back to my original point… The result today is that all modern operating systems have some privileged block of code (the kernel) running continuously, along with a number of other services that run the OS, including the memory manager and the task scheduler.

The key to this whole story is that these privileged processes manage access to shared resources on the computer. Of these two, the task scheduler is the most interesting – mostly due to the arcane system attributes it uses to determine which processes have access to which core / thread on the processor. This is one of the most complex aspects of a multitasking / multi-core / multithreaded (hardware) system. The attributes the scheduler looks at include affinity flags that processes use to indicate core preference, priority flags, resource conflicts and hardware interrupts.

The net result is that if we take any set of processes on a highly parallel system there are some characteristics of this set that are sufficiently complex and impacted by unknown external elements that they are random – truly random. For example, suppose we create three separate processes that each generate a pseudo random number set based on some seed (using unique values in each), and point all of them to some shared memory resource, where the value is read as input and the output is written back. Since the operation of the task scheduler means that the order of execution of these three processes is completely arbitrary, it is not possible to determine the sequence deterministically – the result would be something more random than a PRNG. A not so subtle (and critical) assumption is that the system has other tasks and processes it is managing, which directly impact the scheduler, introducing entropy to the system.
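
The three-worker experiment can be sketched in Python roughly as follows. Threads stand in for the separate processes, a dictionary stands in for the shared memory resource, and the mixing formula, seeds and step count are arbitrary choices of mine. On a lightly loaded machine the workers may well run back to back and give the same result every time, and nothing here shows that the output is fit for any security purpose.

```python
import random
import threading

shared = {"value": 1}    # stand-in for the shared memory resource
lock = threading.Lock()  # keeps each read-modify-write step intact
order = []               # records which worker performed each step


def worker(name: str, seed: int, steps: int) -> None:
    gen = random.Random(seed)  # each worker has its own deterministic PRNG
    for _ in range(steps):
        with lock:
            # Read the shared value as input, mix it, write the output back.
            mixed = (shared["value"] * 31 + gen.randrange(1 << 16)) % (1 << 32)
            shared["value"] = mixed
            order.append(name)


threads = [
    threading.Thread(target=worker, args=(f"w{i}", 100 + i, 1000))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared["value"])                    # can differ from run to run
print("".join(n[1] for n in order[:30]))  # interleaving picked by the scheduler
```

Each worker on its own is completely deterministic; any run-to-run variation in the final value comes only from the order in which the scheduler let the workers touch the shared value.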

Before we go on, let’s take a closer look at this. Note that if some piece of software functions the same (internally and externally) every time it executes, it is deterministic. If this same piece of software functions differently based on external factors that are unrelated to this software, that is non-deterministic. Since kernel level resource managers (memory, scheduler, etc) function in response to system factors and factors from each and every running process – that means that from the perspective of any one software package, certain environmental factors are non-deterministic (i.e. random). In addition to the scheduling and sequencing aspects identified above, memory allocations will also be granted or moved in a similar way.

Of course this system level random behavior is only half the story. As software packages are built to take advantage of gigabytes of RAM, and lots of parallel execution power, they are becoming a functional aggregation of dozens (to hundreds) of independently functioning threads or processes, which introduce a new level of sequencing and interdependencies that depend on the task scheduler.

Bottom Line – Any sufficiently complex asynchronous and parallel system will have certain non-deterministic characteristics based on the number of independent sources that will influence access / use of system shared resources. Layer on the complexity of parallel high level programming, and certain aspects of program operation are very non-deterministic.

Back to Software Reliability

Yes we have shown that both multitasked parallel hardware and parallel programmed software contribute to some non-deterministic behavior in operation, but we also know that for the most part software is relatively reliable. Some software is better and some is worse, but there clearly is some other set of factors in play.

The simple and not very useful answer is “better coding” or “code quality”. A slightly more insightful answer would tell you that code that depends on or uses some non-deterministic feature of the system is probably going to be less reliable. An obvious example is timing loops. Back in the days of single threaded programs and single threaded platforms, programmers would introduce relatively stable timing delays with empty timing loops. This practice was easy, popular and produced fairly consistent timing – showing deterministic behavior. As systems hardware and software have evolved, the assumptions these coding practices rely on have become less and less valid. Try writing a timing loop program on a modern platform and the results can be workable much of the time, but they can also vary by orders of magnitude – in a very non-deterministic manner. There are dozens of programming practices like this that used to work just fine, but no longer do – they don’t completely break, they just operate a little bit randomly. In many cases, the behavior is close enough to “correct” that the program appears to function, but not very reliably.
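
Here is a small Python sketch of the timing loop problem. The `busy_delay` helper and the iteration count are arbitrary illustrations; how much spread you see will depend entirely on your machine and what else it is doing. The second half shows one clock-based alternative, which at least asks the system for a duration instead of guessing at one.

```python
import time


def busy_delay(iterations: int) -> float:
    """Old-style empty timing loop; returns the seconds it actually took."""
    start = time.perf_counter()
    for _ in range(iterations):
        pass
    return time.perf_counter() - start


# Run the "same" delay twenty times and compare the extremes.
samples = [busy_delay(200_000) for _ in range(20)]
print(f"fastest {min(samples):.6f}s, slowest {max(samples):.6f}s")

# A clock-based delay states the intent directly instead of counting iterations.
start = time.monotonic()
time.sleep(0.05)
elapsed = time.monotonic() - start
print(f"slept for about {elapsed:.3f}s")
```

Even the clock-based version only promises a minimum delay; the scheduler still decides when the program gets to run again.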

Another coding practice that used to work on single threaded systems was to call some function and expect the result would be available on the next line of code. It worked on single threaded systems because execution was handed off to that function, and did not return until it was complete. Fast forward to today, and if this is written as a parallel program – the expected data may not be there when your code thinks it should be. There is a lesson here – high level parallel programming languages make writing parallel code fairly easy, but that does not mean that writing robust parallel programs is easy. Parallel interdependency issues can be just as ugly as parallel assembly code on a Cray system.
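
A minimal Python sketch of the "result on the next line" trap, using the standard library's `concurrent.futures`. The `slow_lookup` function is a hypothetical stand-in for any long-running call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_lookup(key: str) -> str:
    """Hypothetical slow function standing in for any long-running call."""
    time.sleep(0.1)
    return key.upper()


with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(slow_lookup, "answer")
    # submit() returns right away; the work is most likely still running here.
    ready_immediately = future.done()
    # Waiting explicitly is what makes it safe to use the value afterwards.
    value = future.result(timeout=5)

print(ready_immediately)
print(value)
```

Code that treats `future` as if it already held the answer is the modern version of the old assumption; the explicit `result()` call is the synchronization point that makes the dependency visible.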

Summary

A single piece of code running exclusively on a dedicated processor is very deterministic, but parallel programmed software on a multitasking parallel hardware system can be very non-deterministic, and difficult to test. Much of software reliability is based on how little a given software package depends on these non-deterministic features. Managing software reliability and failure mechanisms requires that programmers understand the system beyond the confines of the program.

Howto: Browse (more) Securely / Privately / Anonymously

Background

For a number of reasons, many people are increasingly concerned with their privacy and security on the Internet. Since the primary reason most people use the Internet is for browsing, browsing is an opportune place to look for improvement. Of course the tradeoff is that as we make browsing more secure, we also may make the browsing experience more difficult. So the list below progresses from low impact / low return to high impact / high return, and you can pick your pain threshold.

Note that in the context of a browser (and browsing), I define security as the ability to browse without being infected or compromised by malware. I define privacy as the ability to browse without sites (or other parties) tracking me or harvesting information from my browser. Anonymity is when there is a sufficiently high degree of privacy that the browsing activity is anonymous – and true anonymity is not easy to achieve.

Off the Shelf / Good Browser Hygiene

Browser: There are lots of browser options and I cannot offer an opinion on most of them. On a regular basis browsers are reviewed for security – and Chrome and Firefox are usually in the top three. Privacy is distinct from security, and generally Firefox rates higher than Chrome in that respect. However everything is a tradeoff, and I personally think that Chrome has better performance (which I may be imagining), and my Android devices and Chromebook are Chrome by design – so that is my browser choice by default. Secondary to that, I appreciate the rolling updates and aggressive stance Google takes on security, and I think that outweighs the weaker stance they take on privacy – since I believe I can manage my privacy / personal data more easily than I can manage security threats. Consider browser selection as the first thing to do in cleaning up your browser security / privacy concerns.

Browser Settings: The obvious things to check in your browser include:

  • Turn on “Do Not Track” / Open settings and search for this flag – if it is not set, set it. This provides some minimal and not necessarily mandatory level of tracking reduction.
  • Content Settings (Cookies): I up the default level to “Keep local data only until I quit my browser” and “Block third-party cookies and site data”.
  • [Chrome Specific] Under Signin and Sync Settings, I encrypt my sync data with a passphrase. This is all about key management and reducing personal data on Google Servers.

Browser Plugins: The following list includes a few plugins that provide improved privacy.

  • HTTPS Everywhere: This is a plugin that will force an HTTPS connection as the default, with HTTP (non-secure) as the fallback.
  • DuckDuckGo Search: Duck Duck Go is a search service that provides much stronger statements about not tracking your browsing / searching activity (as compared with Google). They feel fairly strongly that this is a big deal. Take a look at their positions on results bubbling.
  • DoNotTrackMe: A plugin that gives you explicit tracking information as you browse. This actually provides some visibility into what sites are tracking you in realtime.
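
The "HTTPS as the default" idea behind HTTPS Everywhere can be sketched in a few lines of Python. This is only an illustration of the concept, not how the plugin is implemented: the `prefer_https` helper is my own, it does not check whether the site actually supports HTTPS, it does not implement the HTTP fallback, and it would mishandle URLs with an explicit port such as `:80`.

```python
from urllib.parse import urlsplit, urlunsplit


def prefer_https(url: str) -> str:
    """Rewrite an http:// URL to https://, leaving other schemes untouched."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


print(prefer_https("http://example.com/login?next=/home"))
print(prefer_https("https://example.com/"))
```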

Sites: What to do to reduce your browsing footprint.

  • Google Search History: By default Google saves your search history and uses it to target ads and search results. My recommendation – turn it off.
  • Google Dashboard: A nice portal that provides a single view into your data footprint on Google Servers. Review and clean it up. While you are there, set up an Alert on your name. It will give you some visibility into possible misuse of your name.
  • Twitter Privacy: Twitter by definition is fairly public so there is not much to tweak. However it makes sense to verify that “Do Not Track” is enabled and consider turning off / deleting location data.
  • Facebook: Expect this to change over time. Privacy settings seem to be a fast moving target at Facebook. So much of the business value proposition of Facebook is about eliminating privacy, so this will always be about providing some minimal level of privacy control that is just enough to keep most users from leaving.

Overall these tweaks to your browsing experience will provide some improved level of security and privacy, but fundamentally much of the browsing process from your client system will still be relatively visible – the contents may be protected with SSL/TLS, but where you are going, what you are downloading and how long you are there is not. Specifically, where you are going (page by page by page), how long you are there and how many kilobytes you have downloaded is all visible. If your ISP / employer / campus / hotel / building has a proxy server between you and the Internet, they have access to this level of information.

Overall I consider these steps to just be good browser hygiene.

Some Better

If this level of exposure bothers you (it may), and you feel a need to mitigate this issue, read on – a VPN / proxy service may be the solution you are craving.

Technically a VPN and a proxy server are two very distinct functions. A VPN (Virtual Private Network) is a secure (i.e. encrypted) and authenticated (i.e. username/password and server certificate) channel from your client system to some server on the Internet. In the enterprise / business world, VPNs are used to give authorized users on the Internet access to corporate servers on private networks. In the world of proxy servers, VPNs are used to provide a secure channel to some proxy server on the Internet.

A Proxy server is simply a relay for your Internet / Browsing traffic. You send some Internet request to the proxy server, and it redirects it to the Internet, with the source mapped back to the proxy server. When the response is received by the proxy server, it is then relayed back to your client system. Proxy servers are not explicitly secure, so they are generally coupled with some form of VPN to provide a secure channel.
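
From the client side, using a proxy is mostly a matter of telling your software where to send its requests. Here is a minimal Python sketch using the standard library's `urllib.request`. The proxy address is a hypothetical placeholder, and the actual request is left commented out so the sketch runs without a proxy listening.

```python
import urllib.request

# Hypothetical proxy address; replace with one you actually control or trust.
PROXY = "http://127.0.0.1:8080"

# Route both plain and TLS requests through the same relay.
handler = urllib.request.ProxyHandler({"http": PROXY, "https": PROXY})
opener = urllib.request.build_opener(handler)

# Requests made through `opener` would be relayed by the proxy, for example:
#   opener.open("http://example.com/", timeout=10)
# (left commented out so the sketch runs without a proxy listening)
print([type(h).__name__ for h in opener.handlers])
```

Note that nothing in this configuration encrypts the hop between you and the proxy, which is why proxy services are usually paired with a VPN.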

There are a large number of VPN/Proxy service providers around the world. For the most part, the free ones (reportedly) have a fairly high rate of malware infection and the for-pay ones run from $40 to $100 a year. This is not an endorsement – but PureVPN and HideMyAss are both typical for-pay VPN/Proxy services, with very typical pricing and functionality, providing a wide range of target servers around the world.

When using a VPN/Proxy service, the net effect is that any geolocation will place you at (or near) the location of the proxy server. This means that if you are accessing some Internet service with geolocation service qualifiers (e.g. bbc.com, nfl.com), you can appear to be somewhere that you are not. It also means that if your employer, hotel, campus or school has blocked sites/services, you can circumvent these restrictions with a VPN/proxy. In both of these cases you are not likely violating any laws, but you are likely violating some Terms of Service – implied or otherwise.

More legitimately, if you often use public or untrusted WiFi networks, a VPN / Proxy ensures that your traffic will not be sniffed on the local network. If you use WiFi in a high density environment, and are concerned about your network being compromised, or you don’t trust the other users on a shared network – a VPN/Proxy can ensure your traffic is secure / private even if your network may not be.

Ultimately, a VPN / Proxy service can provide a step up in privacy / security for a specific set of threats. However, by using a VPN / Proxy service you are literally handing this same information over to the VPN/Proxy service provider – so if your concern is browsing privacy/security in general, you have just shifted the risk.

More Better

From this point, there is one very obvious and better way to achieve better security/privacy – the TOR Browser. The TOR (The Onion Router) Browser is a custom version of Firefox packaged/integrated with a few tools related to The Onion Router, including an Onion Router proxy for your client system. The download package installs easily, and the TOR proxy starts automatically just by launching the TOR browser. If you are serious about using it for the privacy it can provide, read the Warnings FAQ.

The general principle behind TOR is that an outgoing data packet is encrypted in layers, one layer for each relay on a path through the TOR network. The packet is then sent into the network, where each relay in turn peels off its own layer and passes the rest on, and the last relay finally sends it to the Internet destination. The goal / purpose of this effort is that through this obfuscated path, the user is much more anonymous and their privacy is protected.
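To make the layering idea concrete, here is a minimal sketch in Python. It is not the TOR protocol: the relay names, the keys, the "next hop|payload" framing and the toy XOR cipher are all assumptions made up for illustration, and the cipher is not secure. It only shows that the client wraps one layer per relay, and that each relay can remove only its own layer and learn only the next hop.

```python
import hashlib

def keystream_xor(key, data):
    """Toy stream cipher (NOT secure): XOR data with a SHA-256 based keystream.
    Applying it twice with the same key restores the original bytes."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def build_onion(message, destination, path, keys):
    """Client side: wrap one layer per relay, innermost layer first.
    Each layer hides the name of the next hop plus everything inside it."""
    packet, next_hop = message, destination
    for name in reversed(path):
        packet = keystream_xor(keys[name], next_hop.encode() + b"|" + packet)
        next_hop = name
    return next_hop, packet  # the first relay to contact, and the full onion

def peel(key, packet):
    """Relay side: remove exactly one layer and learn only the next hop."""
    plain = keystream_xor(key, packet)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner

# Hypothetical relays and keys, for illustration only.
keys = {"entry": b"k-entry", "middle": b"k-middle", "exit": b"k-exit"}
path = ["entry", "middle", "exit"]

hop, packet = build_onion(b"hello world", "example.org", path, keys)
onion = packet
trace = []
while hop in keys:
    hop, packet = peel(keys[hop], packet)
    trace.append(hop)

print(trace)   # the hops each relay handed the packet to, in order
print(packet)  # what finally leaves the exit relay
```

In this sketch the entry relay sees who you are but not where the packet ends up, and the exit relay sees the destination but not who sent it.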

In an ideal world, where TOR relays were spread around the world and run by different organizations, it is possible to achieve some level of anonymity. In the real world, some of these relays are operated by agencies with the intent to compromise the TOR network, reducing the effectiveness. In addition some academic research has shown a few other weaknesses related to coordination between TOR relays. The net result is that the TOR network and the TOR browser provide a much higher degree of anonymity than any other readily available solution – but it can be broken. For a recent example, refer to the story behind the Silk Road shutdown. Details are lacking, but this does show it is susceptible if the incentive is high enough.

Bottom Line

There are a wide range of things you (as a user) can do to reduce your browsing footprint, reduce your ability to be tracked, and increase your security and privacy (and anonymity). However, the first step to any of this is to assess what your threats are, and take reasonable steps to mitigate those threats. If your threats are non-specific and general, then it is likely that the non-specific and general browser hygiene solutions are sufficient. If you have specific threats that fit the more elaborate solutions, use them appropriately.

Security as a System: iMessage

Background

Say it with me folks, “Security is a system”. In case it is not obvious what that means, I will articulate. Security is made up of a collection of parts, and the system security of this collection is not based on the average security of those parts, or the sum security of these parts – it is based on the weakest security of those parts and how they are integrated. And – sometimes that weakness is not even a part, but a gap in the integration of those parts.

I know, in the abstract we have all heard it before. However we have a new example that highlights this principle.

Before we get into the example I want to share a few of Tom’s Rules of Security:

  1. Know your Threat – Security can only be understood / judged in the context of a given threat type. Good security against one class of threat may be worthless against another class of threat.
  2. Follow the Keys – Key management is about how and where the encryption / access control keys are kept. Whoever holds the keys controls the access. If it isn’t you, you don’t control access.

Note – These are not really my rules, just my personal versions of well known principles.

In the Spotlight: Apple iMessage

Recently there was a presentation in Kuala Lumpur (by pod2g) addressing the security of Apple iMessage. More specifically the presentation highlighted a few weaknesses that illustrate the two rules identified above, and went into great detail as to how it could be compromised. For our purposes, we need to first look at how iMessage works and then we will look at how it can fail – or at least be insecure.

From any given iMessage client a secure message can be sent to any other iMessage client. This message is protected with public-private key encryption. With public-private key encryption, every user has two keys – a private key (which is the secret key) and a public key (which can be shared with anybody). These are special keys in that any message encrypted with the public key can only be decrypted with the corresponding private key, and conversely any message encrypted with a private key can only be decrypted with the corresponding public key. In the first case, it allows somebody to send a secure message to a given person without needing to exchange secret keys. In the second case – also known as digital signing – it identifies the source of the message since only the holder of that given private key could create that message. When these methods are both used, a message can be sent from Ted to Alice (encrypted with Alice’s public key and signed with Ted’s private key) and:

  • Ted knows Alice and only Alice can decrypt the message if it was encrypted with her public key.
  • Alice knows that Ted and only Ted could have sent the message if it was signed with his private key.
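The two properties above can be seen in a few lines of Python using "textbook" RSA with tiny primes. This is only a sketch of the math: the primes, exponents and function names are chosen for illustration, the keys are far too small to be secure, and real systems add padding and sign a hash of the message rather than the message itself.

```python
def make_keypair(p, q, e):
    """Build a toy RSA key pair from two small primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)        # modular inverse, needs Python 3.8+
    return (e, n), (d, n)      # (public key, private key)

def apply_key(key, number):
    """Encrypt, decrypt, sign and verify are all the same modular power."""
    exponent, modulus = key
    return pow(number, exponent, modulus)

alice_public, alice_private = make_keypair(61, 53, 17)
ted_public, ted_private = make_keypair(89, 97, 5)

message = 42  # a message encoded as a number smaller than both moduli

# Ted's side: encrypt for Alice, sign as Ted.
ciphertext = apply_key(alice_public, message)
signature = apply_key(ted_private, message)

# Alice's side: only her private key recovers the message,
# and only Ted's public key validates the signature.
recovered = apply_key(alice_private, ciphertext)
signed_by_ted = apply_key(ted_public, signature) == recovered

print(recovered, signed_by_ted)
```

Note that Alice never needed a secret from Ted, and Ted never needed a secret from Alice. Everything depends on each of them having the right public key for the other, which is exactly where the next part of the story goes.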

To a certain degree, that is what Apple iMessage does. However, messages are not sent directly between Alice and Ted; they are sent through Apple services and retained under both of the AppleID accounts of the message participants. This is not a security exposure by itself, but as a user I would like the option to purge my historical data. It may be securely encrypted today, but who is to say how secure that may be in the future?

In any case, the next part of the story is that all the public keys used by iMessage are stored on Apple ESS servers and are delivered to iMessage clients automatically. This puts Apple in a perfect position to compromise any encrypted iMessage with a Man in the Middle (MiTM) attack. Specifically, pages 75 and 77 (of the presentation) show that Apple has full control of the public key directory, public keys are retrieved by clients “as needed” (with a 30 minute cache window), and users have no visibility into the public keys being used. At any point Apple has the technical capability to insert themselves as the endpoint to a message and then recreate / encrypt the message and send it to the intended recipient. Since the keys are exchanged in the background – the users will not be aware that it was not an end to end encryption.
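A small simulation shows why control of the key directory is enough. This is not the iMessage protocol: the directory is a plain dictionary, the keys are toy textbook RSA with tiny primes, and all names are made up for illustration. The point is only that the sender encrypts to whatever key the directory hands out.

```python
def make_keypair(p, q, e):
    """Toy RSA key pair from two small primes (illustration only)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # needs Python 3.8+
    return (e, n), (d, n)

def apply_key(key, number):
    exponent, modulus = key
    return pow(number, exponent, modulus)

alice_public, alice_private = make_keypair(61, 53, 17)
provider_public, provider_private = make_keypair(89, 97, 5)

def send(directory, recipient, message):
    """The sender trusts the directory and never sees which key it used."""
    return apply_key(directory[recipient], message)

# Honest directory: the provider relays a ciphertext it cannot read.
directory = {"alice": alice_public}
honest_ciphertext = send(directory, "alice", 42)

# Substituted directory: the provider hands out its own key instead.
directory["alice"] = provider_public
intercepted = send(directory, "alice", 42)
read_by_provider = apply_key(provider_private, intercepted)

# The provider re-encrypts with Alice's real key and forwards the message.
forwarded = apply_key(alice_public, read_by_provider)
delivered = apply_key(alice_private, forwarded)

print(read_by_provider, delivered, forwarded == honest_ciphertext)
```

In this sketch Alice receives exactly what she would have received in the honest case, so neither end has anything to notice.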

Most of the other 88 pages in this presentation illustrate how iMessage works under the covers, and the challenges of a third party compromise. I will give you a clue – it would be very difficult for anybody who is not Apple to compromise iMessage, but technically very easy for Apple to do.

Bottom line

Apple controls the Keys: There is nothing to imply that Apple is spying on iMessages. However there are no technical limitations that would prevent them from doing so if they were so inclined or directed to, since indirectly they control the keys.

Know your threat: If the threat you are concerned about is Joe Internet Hacker, iMessage is very secure, with a very low risk of interception/decryption. However if the threat you are concerned about has a National Security Letter in their pocket, iMessage probably does not provide much security.

Update [2013 Nov 5]: A very well written analysis of Lavabit at thoughtcrime.org shows that Lavabit had a similar approach to key management – and the same weakness of co-mingling the keys with the “secured” accounts on the server.

Internet Security As a System

Background

Most of us do not see our activities on the Internet as a system, and if it is a system we are not sure what that has to do with securing ourselves on the Internet. First let’s look at a typical Joe Internet User in terms of the definition of system – “a set of connected things or parts forming a complex whole”. The parts are the individual services we use – GMail, Facebook, Amazon, iTunes, PayPal, Verizon and/or AT&T, etc. For each one of these we have a username and password – which may or may not be unique. The connectivity part is the user, Joe Internet User – who is the real target of an attacker.

How you defend this type of a system is not entirely obvious, however if we flip the perspective around it may give us some insight. Specifically, how would an attacker plan to go after your accounts to their benefit?

If we assume the threat model is a high volume, Internet cyber extortionist looking for a quick return, we can characterize an attack pattern.

Phases of an Attack

A simple attack has three phases:

Compromise – This phase is where an attacker has already identified you as a target, and is probing for a weakness / vulnerability to “get inside” – compromising the system.

Mapping / Discovery – This phase is where the attacker has compromised some part of your system of services and is mapping out your other accounts / services. Since this process is essentially information gathering / compromise – it is fairly hard to detect. This information is used to plan and execute the next phase as quickly as possible.

Exploitation – This phase is where the attacker implements a plan to use the information collected to their benefit – and usually to your detriment.

An Example of a Common Attack

In this example, Joe Internet User is a typical first world Internet power user with all of the accounts listed above – GMail, Facebook, Amazon, iTunes, PayPal, Verizon and/or AT&T, etc.

In our first example, the attacker has been perusing Facebook and found a public profile for a promising target. The status updates indicate either an iPhone/iPad/Android Tablet / Smartphone etc – indicating either an iTunes or Google Play account, or both. Other references may indicate online shopping habits – enabling the attacker to identify target accounts. Most importantly, the attacker discovers the target’s primary email address – either GMail, HotMail or Yahoo (for example). Connections to other social networks (e.g. Twitter, Google+, Instagram, etc) provide additional sources of personal information. At this point the attacker knows where you live, your age, family / marital status, friends, pets / kids names / ages, where you work, what you do for a living, where you went to school, and what you do for fun. All from public sources.

The next part of discovery is compromising an account. The most promising is usually the primary email account. This is due to this magical feature of every Internet service – the password recovery email address. People forget passwords and people forget usernames, but every service has an email address for password recovery. This is usually set up when the account is initially created, and forgotten shortly afterwards.
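One way to see why the primary email account matters so much is to write the recovery relationships down as a small graph and ask what falls if one account falls. The sketch below does that in Python. The account names come from the example above, but the recovery mapping itself is hypothetical, and the function name is just for illustration.

```python
from collections import deque

# Hypothetical map: each account -> the mailbox that receives its password resets.
recovery_email = {
    "Facebook": "GMail",
    "Amazon": "GMail",
    "iTunes": "GMail",
    "PayPal": "GMail",
    "GMail": None,  # the primary account has no further recovery address here
}

def reachable_after_compromise(start, recovery_email):
    """Return every account an attacker can take over by starting at `start`
    and requesting password resets for accounts that recover through it."""
    # Invert the map: mailbox -> accounts that send their reset mail there.
    resets = {}
    for account, mailbox in recovery_email.items():
        if mailbox is not None:
            resets.setdefault(mailbox, []).append(account)

    taken = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for account in resets.get(current, []):
            if account not in taken:
                taken.add(account)
                queue.append(account)
    return taken

print(sorted(reachable_after_compromise("GMail", recovery_email)))
print(sorted(reachable_after_compromise("Amazon", recovery_email)))
```

With this mapping, losing the shopping account costs you the shopping account, while losing the email account costs you everything that recovers through it.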

To get back to our process, the attacker makes a number of educated guesses for the password for the user’s primary email account – and sadly most people are still using simple passwords. Is your email password based on a birthday, names (parents, spouse, kids, pets), sports team / player, personal interests? With one or two numbers appended? In any case, let’s just guess that an attacker will compromise a quarter of all accounts in less than 25 guesses – and our Joe Internet User GMail account has been compromised. Where does that lead us?
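If you want to check your own password against the pattern described above, a few lines of Python are enough. This is a rough self-check under a narrow assumption: it only catches a personal fact (in any letter case) with at most two digits tacked on the end. The function name and the sample facts are made up for illustration.

```python
def looks_guessable(password, personal_facts):
    """True if the password is one of the personal facts, ignoring case,
    with zero, one or two digits appended."""
    base = password
    for _ in range(2):                 # strip at most two trailing digits
        if base and base[-1].isdigit():
            base = base[:-1]
    facts = {fact.lower().replace(" ", "") for fact in personal_facts}
    return base.lower() in facts

# Facts an attacker could read off a public profile (hypothetical examples).
public_facts = ["Rex", "Packers", "Emily"]

print(looks_guessable("Rex07", public_facts))
print(looks_guessable("packers1", public_facts))
print(looks_guessable("correct horse battery staple", public_facts))
```

A password that passes this check is not automatically good, but one that fails it is exactly the kind an attacker finds within a handful of guesses.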

The attacker is patient, and access to a primary email account is a much better way to collect more useful / personal information. One of the first things an attacker is going to do is download the user contacts and email – in case the user suspects compromise and changes the password. Most webmail services provide this feature, and it ensures that the attacker has a backup of your information. At this point we have to ask a few questions about Joe User’s webmail account. Does he have a folder with his online account email? Bills, credit cards, online shopping accounts? Do the contacts have birthdays, anniversaries, even Social Security numbers? We know they have addresses, email and phone numbers. Each of these helps build data for credit card fraud. At this point this is still a discovery process, and the attacker is very careful to not touch, change or leave any clues of activity.

Exploitation is the next step. The attacker will develop a plan of attack, and usually the first step is based on the accounts and stored credit cards / store credit cards. For example – is there an Amazon, Tiffanys, Macys, Sears, etc online account with a credit card saved in the online store? Is the email account tied in with a Google Play Store and a credit card? The attacker can buy phones, tablets and computers using that account. Is it tied to a Verizon, AT&T, or T-Mobile account with a credit card stored in it? Once again, the attacker can buy phones and tablets from these accounts. The first thing to consider for online shopping is embedded credit card numbers. Some of these are credit cards that can be removed – but most store credit cards are automatically available on the account and cannot be removed without cancelling the credit card.

The next step of exploitation is to look for signs of illegal or incriminating information that can be used to extort something from the user. Most people know this as blackmail, and although it does not occur often – it does occur. Think about the depth and breadth of highly personal information that is in your email accounts.

Going one step beyond blackmail, attackers will sometimes “hijack” all of the accounts by changing the passwords and redirecting the recovery email address to some email account held by the attacker. Then a message is sent to the user, asking for ransom to get their accounts back. Once again – this is rare, but it does occur.

Generally the last part of exploitation is where all of this personal information gathered on Joe User, his friends, family, acquaintances etc, is used to build a persona database used to apply for credit and loans – credit fraud and what is commonly known as identity theft.

A Few Simple Steps

This example shows how attackers see the collective accounts and services of Joe Internet User as a system – with Joe User as the key connective element, and how attacking a few weaknesses provides significant opportunity to the attacker.

  1. Learn how to create Good Passwords (and use them when possible) – I get frustrated when an account service requires an 8-12 character password, with upper case, lower case, numbers and symbols. This does create a high entropy password – but one that is also very difficult to remember. Take a look at this xkcd panel and think about it when you create passwords.
  2. Primary Email Account – Since your primary email account is your account recovery account, this account is more critical than any other account. Choose / use a quality password and if possible use two factor authentication.
  3. Two Factor Authentication (2FA) – If the service offers two factor authentication, referred to as “2-step verification” by Google – use it. Two factor authentication does not make an account impossible to compromise, but it makes it sufficiently hard that this type of attacker will move on as soon as they discover you are using it. Google (GMail, Google Play) and WordPress both offer free 2FA for user accounts. In both cases it is based on a mobile device app – Google Authenticator.
  4. Stored Credit Card Numbers / Bank Account Numbers – Carefully trade off the convenience of storing a credit card online in an account versus the cost if it is compromised. I recommend removing any general credit card numbers.
  5. Store Credit Accounts – Store credit accounts are usually tied right to that store’s online store and cannot be removed without closing that line of credit. Attackers know this and use this to their advantage. Consider closing those lines of credit.
  6. Sanitize Contacts / Email – Audit your contacts and all of your email to see what could be deleted and clean it up. How necessary is a 5 year archive of all sent mail? If you are worried about holding onto everything – back it up before cleaning. The less information available in a compromise, the lower the risk.
  7. Sanitize Social networks / Make your profile Private – Most of the social networks now enable you to make your profile private – so only your circles / friends can see what is on your pages. In addition, content should be cleaned up to reduce your online presence. Once again, is it really necessary to have a 5 year archive of Facebook posts?
  8. Unique Passwords – DO NOT use the same password for all your accounts. DO NOT use a couple of passwords for all your accounts. Use unique passwords for each account. If one of your accounts is compromised, make them work for each account – don’t just give it to them.
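For the first step above, "entropy" can be put into numbers. If every symbol of a password is picked uniformly at random, the entropy in bits is the length times log2 of the number of choices per symbol. The sketch below compares a random 8 character password with a random word passphrase, and generates a passphrase using Python's `secrets` module. The alphabet size of 94 (printable ASCII without the space), the 2048 word list size and the tiny sample word list are assumptions for illustration, and the figures only hold for truly random choices, not for passwords a person makes up.

```python
import math
import secrets

def entropy_bits(choices_per_symbol, length):
    """Entropy of `length` symbols, each chosen uniformly from `choices_per_symbol`."""
    return length * math.log2(choices_per_symbol)

random_8_chars = entropy_bits(94, 8)   # 8 random printable ASCII characters
four_words = entropy_bits(2048, 4)     # 4 random words from a 2048-word list
five_words = entropy_bits(2048, 5)     # one more word overtakes the 8 characters

def make_passphrase(wordlist, count=4):
    """Pick `count` words at random with a cryptographically secure generator."""
    return " ".join(secrets.choice(wordlist) for _ in range(count))

print(round(random_8_chars, 1), four_words, five_words)

# A real word list should have thousands of entries; this one is only a stand-in.
sample_words = ["correct", "horse", "battery", "staple", "anchor", "violet"]
print(make_passphrase(sample_words))
```

Four random words from a 2048-word list works out to 44 bits, the figure used in the xkcd panel, and each additional word adds another 11 bits without making the phrase much harder to remember.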

These steps will not make your accounts bulletproof, but most attackers are opportunists and these steps will harden your accounts enough for them to move on to somebody else.
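For the two factor step, it helps to know that the six digit codes from an app like Google Authenticator are not magic. They follow the time-based one-time password scheme (TOTP, RFC 6238), which is an HMAC over the current 30 second time window using a secret shared between your device and the service. The sketch below uses only the Python standard library; the function names are mine, and the secret is the published RFC test key, not anything you would use for a real account.

```python
import hashlib
import hmac
import struct

def hotp(key, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                               # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(key, unix_time, step=30, digits=6):
    """Time-based one-time password (RFC 6238): HOTP over the time window."""
    return hotp(key, int(unix_time) // step, digits)

# Test key from the RFC 6238 appendix (SHA-1 variant), evaluated at time 59.
rfc_test_key = b"12345678901234567890"
print(totp(rfc_test_key, 59, digits=8))  # 94287082 per the RFC test vector
print(totp(rfc_test_key, 59))            # the usual six digit form
```

An attacker who guesses your password still needs the shared secret stored on your phone to produce the current code, which is why this type of attacker usually moves on.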

How to Secure Dropbox (and others) – Part 1

Personal security and privacy on the Internet are often seen as lost dreams – something we sacrificed back in the 90s without a clue. In this blog, I cannot give this back to you, but my hope is to help you take back at least some parts of your personal online security / privacy piece by piece.

Background

One of the most interesting transformations in how people use the Internet is personal data convergence. In this model, a user may have a phone, a laptop, a tablet and a desktop system. Or another particular type of user would “roost” at different computers that were convenient. Personal data convergence is where that user has some mechanism or function to access and update a personal datastore from each one of these devices – fairly transparently. This is a big deal because (when done correctly) this process renders the platform or device transparent – enabling people to more effectively do what they do.

For example – at one time everybody had a home telephone, and each one had the same basic capabilities, and the primary value of having a telephone had very little to do with the actual telephone, and everything about the function and service – how it enabled the user. This personal data convergence means that each user can have their cloud of resources follow anywhere they go, and this has resulted in a proliferation of services that offer something like this. Examples include:

  • Dropbox – a basic client / server / cloud service that provides some gigabytes of storage that can be synchronized between Windows, iOS, Android, Linux, OSX and others. Premium service offers more space. Free format allows any file type. Free storage is 2GB to 5GB, depending on their promotions.
  • Box – Similar to Dropbox with fewer client types supported, but more space (with the free service), which is at 10GB at this time.
  • GDrive – The Google spin on a user-centric filestore. This was originally an extension of their online office suite, and only supports specific file types.
  • Chrome – This is not a general filestore, but a specialized synchronization where all of the personal features of Chrome are stored in the Google Cloud. This includes favorites, cached usernames / passwords, cookies, history and configuration.
  • iCloud – The Apple spin of online backup / synchronization. It synchronizes and backs up the entire Apple universe of devices, but like most things Apple, there is more left unsaid than should be. We can guess that it is better than average, but no better than it has to be. But it will work well with Apple devices and it will look good the whole time.

The Issue

Each one of these has their value add / differentiator to appeal to some specific use case, but each one of these also has a significant structural security issue. In each of these services, data is essentially unsecured within the service provider. Seriously – several (if not all) of these service providers will make strong statements about the level of encryption they use on their SSL/TLS connections and how data is encrypted on servers with some form of disk encryption. However, if the keys are held by the same service provider, it means nearly nothing. In any case, this class of service is not going away – and will only increase in size and capability – but from a basic privacy and security perspective, it is one (big) step up from public storage on the Internet.

For example, right now, today – passwords for nearly every WiFi router (that is paired with an Android device – worldwide) are stored on Google servers. As part of the account backup process, Google has been backing up WiFi settings for the last several Android versions – which means hundreds of millions of WiFi passwords worldwide. Recall that Google Streetview got into some trouble over harvesting WiFi passwords, but now they build it into the Android ecosystem – and they get the passwords with no muss or fuss.

From a personal viewpoint – I see this consolidation of my online footprint, particularly private elements like usernames, passwords, and network access as something to be very uncomfortable with.

Security Theater in the Cloud

The following articles provide an entertaining juxtaposition between real security and security theater. Both are technically correct, but have very different messages.

With that perspective, now take a look at this post from Google regarding G-Drive encryption.

Yeah. As a security guy, I have to ask the question – Google and Apple have smart guys working there, lots of them. So if this is supposed to be real security, clue me in who the threat is? In both cases they control the encryption and they control the keys, so the data is not protected from these vendors, insider threats, anybody who could compromise their keystores, or National Security Letters. My cynical nature whispers to me and says it is security theater.

Is there a “fix” ?

For these service providers, there is no “fix” since the unsecured nature of their services is a key part of their business model. With this level of access to your personal data and files, they can build an incredibly detailed demographic profile of you as a consumer, you as a citizen, you as a security threat, and you as a future employee for any firm willing to get / buy the data. I don’t think it comes as any surprise that even if you pay for a service, a very subtle and implicit part of the cost is giving up any claim to privacy and security for the data stored in the service. This is very much a case of broken by design (and intent). So any talk of a privacy / security “fix” is purely subjective and not likely to be supported by the service providers. Depending on the service provider, they may consider it a violation of their terms of service. Caveat Emptor.

We have Options

If retaining your privacy and securing your personal data matter, we basically have three options.

1) Air Gap : Don’t put private / personal data in these services. It may sound excessive – but air-gapping your personal data from the Internet is the most robust privacy control you can use.

2) File Level Encryption: Use encrypted containers on the cloud sync service. Examples include TrueCrypt and KeePass.

3) Private Cloud synchronization: Drop these services for something where you – the user, controls the encryption keys. Examples include BitTorrent Sync and SharePlan.
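What options 2 and 3 have in common is that the encryption key is created on your device and never handed to the service provider. A minimal sketch of that idea in Python is deriving the key from a passphrase with PBKDF2 from the standard library. The function name, passphrase and iteration count are assumptions for illustration, and the actual encryption of files should be left to a vetted tool rather than home-grown code.

```python
import hashlib
import secrets

def derive_key(passphrase, salt, iterations=200_000):
    """Derive a 256-bit key from a passphrase on the user's own device."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )

# The salt is not secret and can sit next to the encrypted file in the cloud.
salt = secrets.token_bytes(16)

# The passphrase (and the key derived from it) stays with the user.
key = derive_key("correct horse battery staple", salt)

print(len(key), key.hex()[:16] + "...")
```

In this arrangement the provider only ever stores the salt and the encrypted container, so the statements they make about their own disk encryption stop mattering for your data.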

In any case, this is only part 1, and in part 2 I will expand on the options to better secure these synchronization services.