In response to incident 1724458 - Sectigo: Mojibake in certificate Subject fields (mozilla.org)<https://bugzilla.mozilla.org/show_bug.cgi?id=1724458>, I wrote a lint for zlint and tested it against all pre-certificates in crt.sh that contain at least an organization name (to exclude DV certificates). There is still room for improvement, but I think the results are useful.

I have attached the initial report so that we can discuss the findings with the validation subcommittee.

Findings include certificates:

  *   that mix scripts, for example Greek with Latin (often in the incorporation suffix ΑΕ)
  *   with a U+FEFF or U+200B (Zero Width No-Break Space / Zero Width Space) character
  *   containing a symbol U+FFFD (Replacement Character)
  *   containing a symbol U+00A4 (Currency Sign)
  *   ...
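
For illustration, here is a minimal sketch in Go of how the individual code points listed above could be flagged. It is not the code from the WIP lint; the names `suspicious` and `findSuspicious` are hypothetical, and the table only covers the four code points mentioned in this mail.

```go
package main

import "fmt"

// suspicious maps the code points named in the findings to their Unicode names.
// Hypothetical table for illustration; the WIP lint may check more or differently.
var suspicious = map[rune]string{
	'\uFEFF': "ZERO WIDTH NO-BREAK SPACE",
	'\u200B': "ZERO WIDTH SPACE",
	'\uFFFD': "REPLACEMENT CHARACTER",
	'\u00A4': "CURRENCY SIGN",
}

// findSuspicious returns one entry per suspicious code point in s,
// in order of appearance. Note that ranging over a Go string yields
// U+FFFD for invalid UTF-8 bytes, so those are reported as well.
func findSuspicious(s string) []string {
	var out []string
	for _, r := range s {
		if name, ok := suspicious[r]; ok {
			out = append(out, fmt.Sprintf("U+%04X %s", r, name))
		}
	}
	return out
}

func main() {
	// An organization name with a hidden zero width space.
	fmt.Println(findSuspicious("Example\u200b Ltd"))
}
```

A real lint would run such a check per Subject attribute value rather than on a single string.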

To my understanding, scripts are normally not mixed, in order to prevent script spoofing (mixing would, for example, permit a homograph attack in IDNs). In CJK (Chinese, Japanese, and Korean, as in the Han, Katakana, and Hiragana scripts) it does seem to be common to mix scripts. I also see a few cases where scripts are combined with Latin characters, for example in addresses; this might be fine, but it would be good to discuss.
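
As a rough sketch of the idea (again not the code from the WIP lint), script mixing can be detected in Go with the standard `unicode` package. The function names `scriptsOf` and `mixesScripts` are hypothetical. Treating Han, Hiragana, Katakana, and Hangul as one "CJK" group is an assumption made for this example (Hangul is not named above). Latin combined with another script is still reported, so the address case remains open for discussion.

```go
package main

import (
	"fmt"
	"sort"
	"unicode"
)

// cjk lists scripts that are treated as a single group in this sketch.
// Assumption for illustration only, not a rule taken from the lint.
var cjk = map[string]bool{"Han": true, "Hiragana": true, "Katakana": true, "Hangul": true}

// scriptsOf returns the sorted set of scripts used by the letters in s.
// Non-letters and the Common and Inherited scripts are ignored.
func scriptsOf(s string) []string {
	seen := map[string]bool{}
	for _, r := range s {
		if !unicode.IsLetter(r) {
			continue
		}
		for name, table := range unicode.Scripts {
			if name == "Common" || name == "Inherited" {
				continue
			}
			if unicode.Is(table, r) {
				if cjk[name] {
					name = "CJK"
				}
				seen[name] = true
				break // a code point belongs to exactly one script
			}
		}
	}
	out := make([]string, 0, len(seen))
	for name := range seen {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

// mixesScripts reports whether s uses letters from more than one script group.
func mixesScripts(s string) bool {
	return len(scriptsOf(s)) > 1
}

func main() {
	// Latin name followed by the Greek suffix ΑΕ (U+0391 U+0395).
	fmt.Println(scriptsOf("Example \u0391\u0395"), mixesScripts("Example \u0391\u0395"))
	// Han followed by Katakana, counted as one group here.
	fmt.Println(scriptsOf("東京カフェ"), mixesScripts("東京カフェ"))
}
```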

The source of the current lint (WIP) can be found at:
Comparing zmap:master...pkic:SubjectDNScript · zmap/zlint (github.com)<https://github.com/zmap/zlint/compare/master...pkic:SubjectDNScript?expand=1>

