[cabfpub] Revocation ballot

Ryan Sleevi sleevi at google.com
Mon Jul 17 07:13:20 MST 2017


On Sat, Jul 15, 2017 at 11:57 PM, Jeremy Rowley
<jeremy.rowley at digicert.com> wrote:
> While perhaps not true in all cases, I think key compromise situations should be handled similarly to other software security vulnerabilities. Because they are vulnerabilities, we should react to them as we would to other reported issues. We should learn from the work and lessons of other responsible disclosure practices (Project Zero is a great example). I doubt you're in the camp that claims all vulnerabilities should be promptly disclosed without a correction period, so I think we can find some common ground on how certificate vulnerability and key compromise reporting should happen. The remediation window is a vital part of responsible disclosure; it's not merely a nice gesture, but the opportunity to work with information security and software development colleagues to remediate vulnerabilities in a way that minimally impacts users (i.e. it attempts to reduce the potential harm the vulnerability could cause to the greatest extent possible, while recognizing that if one responsible party discovered the issue, it's only a matter of time before another, potentially less responsible, party does likewise).

I don't think we get to make that false equivalence, certainly not
given how certificates work.

If the purpose of a certificate is to bind a key to the domain, and
the domain is no longer bound to the key (by virtue of the key having
been compromised and thus possessed by an unknown number of parties -
from zero to everyone), then the certificate has failed its purpose
and is actively misrepresenting that binding.

For example, DigiCert's Relying Party Agreement -
https://www.digicert.com/wp-content/uploads/2017/05/DigiCertRelyingPartyAgreement_5-9-17.pdf
- limits its liability to losses from online financial transactions.
That is, if a key is compromised, and DigiCert fails to promptly
revoke, and the RP checked revocation information, then:

a) If an online transaction was performed, and it was disputed and not
reversed, then DigiCert is liable
b) If something else was performed - for example, the act of signing
in - DigiCert is disclaiming liability
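
(As context for what "checked revocation information" means in
practice, here's a minimal sketch of a Relying Party performing an
OCSP status check, using Python's cryptography and requests
libraries. The file paths are hypothetical, and a real client would
also verify the responder's signature and the response's freshness,
which this sketch omits.)

# Sketch: how a Relying Party might ask the CA whether a certificate
# has been revoked. Paths are illustrative; error handling is minimal.
import requests
from cryptography import x509
from cryptography.x509 import ocsp
from cryptography.x509.oid import (AuthorityInformationAccessOID,
                                   ExtensionOID)
from cryptography.hazmat.primitives import hashes, serialization

def ocsp_status(leaf_path, issuer_path):
    with open(leaf_path, "rb") as f:
        leaf = x509.load_pem_x509_certificate(f.read())
    with open(issuer_path, "rb") as f:
        issuer = x509.load_pem_x509_certificate(f.read())

    # Find the OCSP responder URL in the leaf certificate's
    # Authority Information Access extension.
    aia = leaf.extensions.get_extension_for_oid(
        ExtensionOID.AUTHORITY_INFORMATION_ACCESS).value
    url = next(d.access_location.value for d in aia
               if d.access_method == AuthorityInformationAccessOID.OCSP)

    # Build the OCSP request and POST it to the responder.
    req = (ocsp.OCSPRequestBuilder()
           .add_certificate(leaf, issuer, hashes.SHA1())
           .build())
    resp = requests.post(
        url,
        data=req.public_bytes(serialization.Encoding.DER),
        headers={"Content-Type": "application/ocsp-request"})
    resp.raise_for_status()

    # GOOD, REVOKED, or UNKNOWN - a compromised certificate only shows
    # up as REVOKED here once the CA has actually revoked it.
    return ocsp.load_der_ocsp_response(resp.content).certificate_status

print(ocsp_status("leaf.pem", "issuer.pem"))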

> Of course, waiting on disclosure or revocation doesn't remove the vulnerability. However, using responsible disclosure does give developers and IT teams time to properly assess the scope of the issue, update their systems, and deploy fixes before the knowledge is widespread.

I think we're in agreement that, given the nature of the ecosystem,
the target state to be moving towards is to enable and allow the
prompt and timely rotation of keys, for both emergency and
non-emergency situations, correct?
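
(To make "prompt and timely rotation" concrete, here's a minimal
monitoring sketch. The hostname, threshold, and renewal command - an
already-configured ACME client, certbot - are illustrative
assumptions, not a recommendation for any particular deployment.)

# Sketch: a cron-style check that triggers certificate rotation.
import ssl
import subprocess
from datetime import datetime, timezone
from cryptography import x509

HOST = "example.com"        # hypothetical host
THRESHOLD_DAYS = 7          # rotate well before expiry

def days_until_expiry(host, port=443):
    # Fetch the certificate the server is actually presenting.
    pem = ssl.get_server_certificate((host, port))
    cert = x509.load_pem_x509_certificate(pem.encode())
    expiry = cert.not_valid_after.replace(tzinfo=timezone.utc)
    return (expiry - datetime.now(timezone.utc)).total_seconds() / 86400

def rotate_if_needed(key_compromised=False):
    if key_compromised:
        # Emergency: force re-issuance immediately (by default certbot
        # also generates a fresh key), regardless of remaining lifetime.
        subprocess.run(["certbot", "renew", "--force-renewal", "--quiet"],
                       check=True)
    elif days_until_expiry(HOST) < THRESHOLD_DAYS:
        # Routine: renew when expiry is approaching.
        subprocess.run(["certbot", "renew", "--quiet"], check=True)

rotate_if_needed()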

> I agree with prompt revocation, but what constitutes "timely and promptly" is subjective and depends heavily on context and severity. For non-emergency situations (a vulnerability or compromise is discovered, but no evidence is found of active exploitation and the discovery occurred in such a way that repeat discovery is unlikely to be imminent), 24 hours seems too short to advise on restructuring, deploy new certificates, and update systems.

How do you propose to discover active exploitation? A signature from
a compromised key is mathematically indistinguishable from a
legitimate one to every participant but the Subscriber, and even
then, it's questionable.

> For emergency situations (a vulnerability is discovered to be undergoing active exploitation or is publicly available to an extent that re-discovery and exploitation are inevitable), 24 hours is probably too long. My proposal is we bifurcate the timelines based on the reason for revocation and impact. A company that changes address and fails to update their certificate within 24 hours is in a different risk category than someone who publishes the key pair used on their shopping site to GitHub. I think we're in violent agreement that automation is essential within the industry, and that is a key recommendation we always make. Unfortunately, automation isn't available on every device, nor does every entity deploying digital certificates automate for all situations. Generally, if the device operator is disclosing their private key on a public repository like GitHub, they likely did not plan on addressing a revocation massively affecting their devices.

Then it sounds like they will be disproportionately affected if they
publish their key on GitHub. I don't see a reasonable argument to
increase the risk to all Relying Parties (by virtue of relaxing
requirements for all certificates) in order to satisfy the incentives
of those who do not automate.
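
(As an aside on the GitHub scenario: screening for that particular
failure before anything is published is cheap. A minimal pre-publish
scan - the PEM markers and scanned path are illustrative, and it
obviously catches only the most blatant leaks - might look like the
following.)

import sys
from pathlib import Path

# Common PEM headers that indicate private key material.
KEY_MARKERS = (b"-----BEGIN PRIVATE KEY-----",
               b"-----BEGIN RSA PRIVATE KEY-----",
               b"-----BEGIN EC PRIVATE KEY-----")

def find_private_keys(root):
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            data = path.read_bytes()
        except OSError:
            continue
        if any(marker in data for marker in KEY_MARKERS):
            hits.append(path)
    return hits

leaked = find_private_keys(sys.argv[1] if len(sys.argv) > 1 else ".")
for p in leaked:
    print("possible private key material:", p)
sys.exit(1 if leaked else 0)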

> Every company has "perverse" incentives to help their customers. Google has its own perverse incentives to keep users on its search engine and browser. Somehow, despite these perverse incentives, we all seem to work towards Internet security in our own fashion. Ours is helping entities (customers and non-customers alike) configure PKI-related systems, deploy certificates, and remediate messes, e.g. those caused by reusing private keys on hardware devices sent to all of their users.

This is also a false equivalence. As proposed, there is zero
incentive for timely revocation - all responsibility is disclaimed,
the only fiduciary relationship is with the Subscriber, and the
proposal seeks to explicitly legitimize that behaviour to reduce the
risk of any supervisory sanction.


> For example, let's say we have a popular program that has 10 million downloads. Unfortunately, they deployed the same private key in every download (note that this was not our recommendation; at this point, they haven't contacted us for recommendations on best practices for key management and protection, etc.). We receive notice at 1 am on a Saturday. While we maintain a constantly monitored certificate problem reporting process, the supplier does not. Although we immediately spam every email address we have and call their emergency numbers, there's no way they can get someone technical on the phone prior to 1 am Sunday. The net effect is that on Sunday at 1 am, each user is potentially blocked from their app.

Yup.

> That doesn't help relying parties at all.

But it does. They have reasonable assurance that their communications
remain secure. The software vendor did not provide a way to ensure
that, and as such, the software stops working until the vendor does.

> Instead, if we received responsible disclosure of the issue but could wait a week before revoking, the company could push out an emergency update deploying unique certificates to each device (or re-work how their app communicates internally so that local private keys aren't needed at all; yes, that's what we'd recommend in plenty of cases because we actually do care about what’s best for users), eliminating the shutdown. Luckily for us, so far, the revocations have not had quite such widespread detrimental effects.
>
> Seems like there should be a balance we can strike between the need for prompt revocation and the desire not to impact relying parties while still encouraging certificate use.  I proposed two weeks based on the reason for revocation, but I'm certainly open to other suggestions. Maybe it's simply not possible to treat key compromise similarly to other vulnerability disclosures, but I'd like to at least explore the possibility before giving up on it.

So I can understand and appreciate that perspective, but it's also
not what you proposed. As written, the proposal allows indefinitely
delayed revocations, fully disclaimed liability (which is itself
already an unreasonable burden, and so the solution is not to impose
more of this fiction as a justification for reducing security), and
no reasonable bounds for either Relying Parties _or_ Subscribers. It
solely benefits CAs.

