Unsupervised learning for fraud and abuse:
Abuse traffic is the bane of many online services. Bots generate huge volumes of traffic that guess passwords, open accounts, defeat CAPTCHAs, commit click-fraud and otherwise abuse services intended for humans. Supervised approaches are hard to apply since labels are often either impossible to obtain or arrive with long delays. The phenomena are also high-flux: approaches that rely on attackers re-using a small set of user agents or a limited pool of IP addresses are very brittle. Using one-sided learning, we make minimal assumptions about attack traffic and instead use everything available to triangulate the clean distributions. The approach applies to password guessing, signup fraud, CAPTCHA solving, and automated generation of firewall rules. Paper on application to password-guessing.
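One way to picture the one-sided idea is as a mixture: observed traffic is clean traffic plus an unknown attack component about which we assume only that its counts cannot be negative. The sketch below illustrates that assumption and is not the method from the paper. The function name, the feature bins, the counts, and the existence of a reference sample believed to be clean are all hypothetical.

```python
def split_clean_and_attack(observed, clean_reference):
    """Split observed per-bin counts into a clean part and an attack part.

    observed:        counts per feature bin in the mixed (clean + attack) traffic
    clean_reference: counts per bin from a sample believed to be clean
                     (hypothetical; obtaining it is the hard part and not shown)
    The only assumption about attack traffic is that its counts are >= 0.
    """
    # Largest scaling of the clean reference that never exceeds the observed
    # count in any bin. Anything larger would imply negative attack traffic.
    scale = min(o / r for o, r in zip(observed, clean_reference) if r > 0)
    clean = [scale * r for r in clean_reference]
    attack = [o - c for o, c in zip(observed, clean)]
    return clean, attack


# Hypothetical login requests bucketed by some feature (made-up numbers).
observed = [500, 300, 1100, 100]
clean_reference = [50, 30, 10, 10]

clean, attack = split_clean_and_attack(observed, clean_reference)
for i, (c, a) in enumerate(zip(clean, attack)):
    print(f"bin {i}: clean ~ {c:.0f}, unexplained ~ {a:.0f}")
```

With these made-up numbers the excess concentrates in a single bin, and nothing was assumed about which user agents or IP addresses the attacker uses. The estimate is an upper bound on the clean volume, so the attack part it reports is conservative.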
Much in computer security involves recommending defensive measures: telling people how they should choose and maintain passwords, manage their computers, and so on. We show that claims that any measure is necessary for security are empirically unfalsifiable. That is, no possible observation contradicts a claim of the form “if you don’t do X you are not secure.” A consequence is that self-correction operates only in one direction. If we are wrong about a measure being sufficient, a successful attack will demonstrate that fact, but if we are wrong about necessity, no possible observation reveals the error. The fact that claims of necessity are easy to make, but impossible to refute, makes waste inevitable and cumulative. Paper on falsifiability. How do we justify security measures? Review paper on Science and Security.
Users are constantly encouraged to choose stronger passwords, but when does it actually make a difference, and how much is enough? We find that offline guessing is less of a factor than commonly thought, even when the password file is stolen. Further, once a password is strong enough to withstand online guessing, increases in strength that fall short of what's needed to withstand offline guessing are wasted. Passwords in this online-offline chasm waste effort since they do too much for online guessing and not enough for offline. Overview paper in CACM.
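The chasm argument can be made concrete with two guess-count thresholds. The sketch below assumes illustrative values of 10^6 guesses for an online attacker and 10^14 for an offline attacker. Both numbers and the function name are assumptions made for illustration, not figures stated in this summary.

```python
ONLINE_THRESHOLD = 10**6    # assumed: guesses an online attacker can attempt
OFFLINE_THRESHOLD = 10**14  # assumed: guesses an offline attacker can attempt


def guessing_outcome(guesses_to_crack):
    """Return (survives_online, survives_offline) for a password that an
    attacker would need `guesses_to_crack` guesses to find."""
    survives_online = guesses_to_crack > ONLINE_THRESHOLD
    survives_offline = guesses_to_crack > OFFLINE_THRESHOLD
    return survives_online, survives_offline


# Hypothetical passwords of increasing strength.
for guesses in (10**4, 10**8, 10**12, 10**16):
    online, offline = guessing_outcome(guesses)
    print(f"{guesses:.0e} guesses: survives online={online}, offline={offline}")
```

Under these assumed thresholds, a password needing 10^8 guesses and one needing 10^12 guesses have identical outcomes: both survive online guessing and both fall to offline guessing. The extra effort spent moving from the first to the second buys nothing.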
We've all received variants of the Nigerian email scam. A stranger asks for help to rescue a hidden fortune from corrupt officials and wicked in-laws. Who falls for those crazy stories? Who hasn't seen it hundreds of times already? The fabulous tales may be a feature, not a bug. Because of the long odds of finding people who'll go all the way and not back out at some point, the scammer needs to identify those who are credulous and who haven't seen it before. "By sending an email that repels all but the most gullible the scammer gets the most promising marks to self-select, and tilts the true to false positive ratio in his favor." Paper explains in depth.
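The self-selection argument is easier to see with numbers. The sketch below treats the email as a classifier: every respondent costs the scammer effort, and only respondents who go all the way pay off. The model is a simplification, and every name and figure in it (population sizes, response rates, gain and cost) is a made-up assumption rather than a value from the paper.

```python
def scam_profit(viable, non_viable, tp_rate, fp_rate, gain, cost):
    """Expected profit when a fraction tp_rate of viable targets and a
    fraction fp_rate of non-viable targets respond to the email.

    Each viable respondent yields `gain`; each non-viable respondent
    wastes `cost` worth of the scammer's effort before backing out.
    """
    true_positives = viable * tp_rate
    false_positives = non_viable * fp_rate
    return true_positives * gain - false_positives * cost


# Hypothetical population: 100 viable victims among one million recipients.
VIABLE, NON_VIABLE = 100, 999_900
GAIN, COST = 2000, 20  # assumed payoff per victim and effort per false positive

# A plausible-sounding email draws many replies, most of them dead ends.
plausible = scam_profit(VIABLE, NON_VIABLE, tp_rate=0.9, fp_rate=0.05,
                        gain=GAIN, cost=COST)

# An outlandish email loses some viable victims but repels almost everyone else.
outlandish = scam_profit(VIABLE, NON_VIABLE, tp_rate=0.5, fp_rate=0.0005,
                         gain=GAIN, cost=COST)

print(f"plausible email:  {plausible:,.0f}")
print(f"outlandish email: {outlandish:,.0f}")
```

With these assumed figures the plausible email loses money because false positives swamp the gains, while the outlandish email is profitable even though it reaches fewer of the viable victims.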
People have been predicting the death of passwords for a long time, and yet we rely on them more than ever. We examine two decades' worth of proposed replacements against the usability, deployability and security benefits that an ideal scheme might provide. "Not only does no known scheme come close to providing all desired benefits: none even retains the full set of benefits that legacy passwords already provide." Overview paper with the dream team of Joe Bonneau, Paul van Oorschot and Frank Stajano.
We show that survey-based estimates of cyber-crime losses have systematic biases; errors are cumulative and directional; a single outlier answer can make a 10x difference in the estimate. This doesn't mean that cyber-crime isn't a big deal, but those eye-popping numbers about billions being made by cyber-criminals are junk. WEIS paper and invited NY Times OpEd.
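The sensitivity to a single answer follows from how survey means are extrapolated to a whole population. The sketch below uses made-up numbers throughout (sample size, reported losses, population size), and the function name is hypothetical. It shows only the arithmetic, not the data from the paper.

```python
def extrapolated_loss(reported_losses, population):
    """Scale the survey's mean reported loss up to the whole population."""
    return sum(reported_losses) * population // len(reported_losses)


POPULATION = 200_000_000  # hypothetical number of people the survey represents

# Hypothetical survey of 1000 people: 5 report a $200 loss, 995 report none.
survey = [200] * 5 + [0] * 995

# Same survey, except one respondent who reported no loss now reports $9,000.
# The answer may be exaggerated, mistaken or true; the survey cannot tell.
survey_with_outlier = survey[:-1] + [9000]

baseline = extrapolated_loss(survey, POPULATION)
inflated = extrapolated_loss(survey_with_outlier, POPULATION)

print(f"baseline estimate:     ${baseline:,}")
print(f"with one outlier:      ${inflated:,}")
print(f"ratio of the two:      {inflated // baseline}x")
```

In this example one answer out of a thousand moves the national estimate from $200 million to $2 billion. Because losses cannot be negative, there is no offsetting answer that could pull the estimate down by a similar amount, which is why the errors are directional.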
This paper caused quite a stir when it came out.