Ductus Exemplo

That one different gummy bear

Sometimes when you buy a pack of sweets, some of the sweets are damaged or look weird. One time, I even had them mixed with a different type of sweets. This may happen to you once a year or less, but the people who are working in the factory surely see it happening more often.

This works for everybody: doctors see more injuries, bartenders hear more weird stories, and we see more inaccurate WHOIS entries and compliance violations than most other people.

At some point, it became so much that I got very interested in how this works in detail. ICANN provides binding (hah!) agreements for Registrars and Registries. Let’s take postal codes as an example.

ICANN says (through the RAA): Postal codes need to be in the format specified in RFC 5733 and in accordance with the UPU address format for the country in question (in this example: China).
IETF says (through RFC 5733): Postal codes are represented through character strings with a defined minimum and maximum length.
UPU says (through international addressing sheet China): 6 digits.

You could agree with me here that we have established that the postal code for a registrant from China needs to be six digits. Not two, not eight, not letters, not symbols and so on, right?

Well, at the time of writing, there are 11.100.156 new gTLD domains registered to registrants from China.

905.693 of them are in violation of the UPU address format, RFC 5733 and thus ICANN’s RAA. That’s roughly 8,2%. And that’s postal codes in China only. Worldwide we also have
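To make the rule concrete, here is a minimal sketch of that check. It assumes nothing beyond the "6 digits" requirement quoted above; the function name is made up for illustration and this is not how any registrar actually implements it.

```python
import re

# Assumption: the only rule applied is the "6 digits" requirement quoted above.
# ASCII digits are matched explicitly, because \d would also accept
# non-ASCII digits such as full-width characters.
CN_POSTAL_CODE = re.compile(r"[0-9]{6}")


def is_valid_cn_postal_code(value: str) -> bool:
    """Return True if value consists of exactly six ASCII digits."""
    return CN_POSTAL_CODE.fullmatch(value) is not None
```

With this, `is_valid_cn_postal_code("100000")` passes, while two digits, eight digits, letters or symbols all fail.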

  • 2.211 domains with invalid country ISO-3166-1 codes
  • 205.952 telephone numbers with non-existent country codes
  • 263.906 domains without any telephone number at all
  • about 50 email addresses that should fail even the simplest email syntax check (containing spaces, missing @, missing host name, etc.)
  • 941 missing email addresses
  • 137.138 disposable email addresses
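Checks like the ones in this list don't need much machinery. The following is a rough sketch, not our actual code: the record layout, the function names and the tiny sample sets are assumptions for illustration, and a real check would load the full ISO 3166-1 list and a maintained list of disposable email domains.

```python
from typing import Dict, List, Optional

# Hypothetical sample data, deliberately tiny.
KNOWN_COUNTRY_CODES = {"CN", "DE", "US"}
DISPOSABLE_DOMAINS = {"disposable.example"}


def email_problem(email: Optional[str]) -> Optional[str]:
    """Return a short problem label for an email address, or None if it passes."""
    if not email:
        return "missing email"
    if any(ch.isspace() for ch in email):
        return "email contains whitespace"
    local, at, host = email.rpartition("@")
    if not at or not local:
        return "email missing @ or local part"
    if not host:
        return "email missing host name"
    if host.lower() in DISPOSABLE_DOMAINS:
        return "disposable email"
    return None


def contact_problems(record: Dict[str, Optional[str]]) -> List[str]:
    """Collect every problem found in one contact record (assumed keys)."""
    problems = []
    if (record.get("country") or "") not in KNOWN_COUNTRY_CODES:
        problems.append("invalid country code")
    if not (record.get("phone") or "").strip():
        problems.append("missing telephone number")
    problem = email_problem(record.get("email"))
    if problem:
        problems.append(problem)
    return problems
```

Note that this is only "the simplest email syntax check" mentioned above, far short of full RFC 5322 parsing.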

And we’re far from being done with our checks yet.

Knowing this makes ICANN’s specifications look like a joke to me. To put it into perspective, here is my favourite piece from ICANN regarding this:

  1. Validate the presence of data for all fields required under Subsection 3.3.1 of the Agreement in a proper format for the applicable country or territory.
  2. Validate that all email addresses are in the proper format according to RFC 5322 (or its successors).
  3. Validate that telephone numbers are in the proper format according to the ITU-T E.164 notation for international telephone numbers (or its equivalents or successors).
  4. Validate that postal addresses are in a proper format for the applicable country or territory as defined in UPU Postal addressing format templates, the S42 address templates (as they may be updated) or other standard formats.
  5. Validate that all postal address fields are consistent across fields (for example: street exists in city, city exists in state/province, city matches postal code) where such information is technically and commercially feasible for the applicable country or territory.
  6. Verify:
    1. the email address of the Registered Name Holder (and, if different, the Account Holder) by sending an email requiring an affirmative response through a tool-based authentication method such as providing a unique code that must be returned in a manner designated by the Registrar, or
    2. the telephone number of the Registered Name Holder (and, if different, the Account Holder) by either (A) calling or sending an SMS to the Registered Name Holder’s telephone number providing a unique code that must be returned in a manner designated by the Registrar, or (B) calling the Registered Name Holder’s telephone number and requiring the Registered Name Holder to provide a unique code that was sent to the Registered Name Holder via web, email or postal mail.
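None of this is rocket science. As a rough sketch of points 3 and 6 above: a shape check for E.164 numbers (a plus sign followed by at most 15 digits) and a "unique code that must be returned". The function names are invented for illustration, the regex checks only the shape and not whether the country code exists, and how the code is delivered (email, SMS, call) is left out entirely.

```python
import hmac
import re
import secrets

# E.164 numbers have at most 15 digits in total and no leading zero.
# This checks the shape only; it does not know which country codes exist.
E164 = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(number: str) -> bool:
    """Return True if number looks like +<up to 15 digits>."""
    return E164.fullmatch(number) is not None


def new_verification_code() -> str:
    """Create an unguessable one-time code to send to the registrant."""
    return secrets.token_urlsafe(16)


def code_matches(expected: str, returned: str) -> bool:
    """Compare codes in constant time to avoid leaking partial matches."""
    return hmac.compare_digest(expected.encode(), returned.encode())
```

A registrar would store the code from `new_verification_code()` next to the contact and mark the contact as verified once `code_matches()` succeeds on the value the registrant sends back.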

And this inaccurate information belongs to valid and active domains. A quick check revealed that only 1% of those domains are in clientHold or a similar status, under which such false entries might be tolerated. But even then you could question whether that is acceptable, since even a domain on clientHold is taken and can’t be registered by someone else for the time being.

Now this post isn’t supposed to be just a rant. We want to lead by example and act within our capabilities to make the whole new gTLD domain space a better place (I know it sounds cheesy, but whatever). So we approached ICANN with the goal of finding a way to provide them with the most up-to-date inaccuracies. Basically, we want to enable ICANN to fast-forward the process of identifying “faulty” domains and the Registrars/Registries who just don’t care and are thus “poisoning” (yep, strong word) the whole thing.

Believe me when I tell you that while working with every type of client from the domain industry, I realised that if everyone did their job with just a little more effort, this whole thing would not only be much easier for you and me, but also way more fun. Instead of reacting to a bazillion errors and fixing problems that are being thrown in our faces because some Registry is changing creationDate entries (suddenly dated back, missing, etc.) – thus making our statistics worthless – we could provide you with statistics so interesting and amazing that you’d neglect your own wife because you “just wanna browse nTLDStats a bit” instead of coming to bed.

That being said – good night!

 

WHOIS validation, anyone?

We did it. Mark your calendars, since today is the day on which we are releasing our WHOIS validation tool. The super-serious description would be Registration Data Directory Service Specifications Validator (add “9000” for extra tension while saying it) – or you can just call it Brian. Why Brian? Because I feel like Brian suits the secret purpose of why we are releasing this tool. But I’ll talk about that later in this post.

Let me explain its official purpose first:

The first part is the validator itself. You can now check whether the WHOIS output of a specific registry complies with ICANN’s Registration Data Directory Service Specifications set out in the various Registry Agreements.

The second part is an interactive rules-guide for the above-mentioned specifications. When (formatting) errors are found, our tool points them out and lets you read up on them in detail.
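To give you an idea of how a validator and a rules-guide can fit together, here is a small sketch. It is not Brian's source code: the two rules, their ids and the field labels are assumptions picked for illustration, not quotes from the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    rule_id: str        # stable id the rules-guide could link to
    description: str    # human-readable explanation shown next to an error
    check: Callable[[Dict[str, str]], bool]  # True means the rule is satisfied


# Illustrative rules only; the field labels are assumptions.
RULES = [
    Rule("has-domain-name",
         "The output must contain a non-empty 'Domain Name' field.",
         lambda fields: bool(fields.get("Domain Name"))),
    Rule("has-creation-date",
         "The output must contain a non-empty 'Creation Date' field.",
         lambda fields: bool(fields.get("Creation Date"))),
]


def parse_fields(whois_text: str) -> Dict[str, str]:
    """Split 'Key: Value' lines into a dict, keeping the first occurrence of each key."""
    fields: Dict[str, str] = {}
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields.setdefault(key.strip(), value.strip())
    return fields


def validate(whois_text: str) -> List[Rule]:
    """Return every rule the given WHOIS output violates."""
    fields = parse_fields(whois_text)
    return [rule for rule in RULES if not rule.check(fields)]
```

Because each violated rule carries its own description, the same list can feed both the error report and the "read up on it" part.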

 

That’s it. That is its purpose. Nothing more. That is also the boring part, because now I am gonna explain to you the secret purpose of our new tool. And please don’t tell anyone, because – you know – it’s secret.

Imagine you handle about 100.000 (sometimes five times more, other times a fifth of that) database records per day. You process them automatically, because otherwise you’d need about as many employees as Walmart has. We obviously have the money, but we just don’t feel like hiring so many people. Which leaves us with the only thing that makes sense: processing insane amounts of data automatically.

Which of course is what we are doing. We programmed a nice piece of software that validates everything. And by everything I mean everything. It goes through the nTLDStats database and checks for bugs, burglars and insurance agents. A few minutes after we ran it for the first time, it died. The reason was an unexpected output format provided by a registry. So we added an exception for that and ran it a second time. Again, it died shortly after we started it. Same reason. So we added another exception. It went on and on like this for a while – basically forever. At this point, I am not sure whether you can even remotely comprehend the frustration we felt in the office. Let’s just say some tables, keyboards, coffee cups as well as one USB fan had to be replaced. But we are okay now.
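In hindsight, the way out of that loop is to stop treating an unexpected format as fatal and to collect it as a finding instead. A minimal sketch of that idea follows; the names and the "Key: Value" assumption are made up for illustration and this is not our production code.

```python
from typing import Dict, Iterable, List, Tuple


class UnexpectedFormat(ValueError):
    """Raised when a WHOIS record does not have the assumed 'Key: Value' shape."""


def parse_record(raw: str) -> Dict[str, str]:
    """Parse one record, raising UnexpectedFormat on the first odd line."""
    fields: Dict[str, str] = {}
    for number, line in enumerate(raw.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are harmless
        key, sep, value = line.partition(":")
        if not sep:
            raise UnexpectedFormat(f"line {number} is not 'Key: Value': {line!r}")
        fields[key.strip()] = value.strip()
    return fields


def process_all(
    records: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, Dict[str, str]], List[Tuple[str, str]]]:
    """Parse (tld, raw_whois) pairs; report failures instead of dying on them."""
    parsed: Dict[str, Dict[str, str]] = {}
    failures: List[Tuple[str, str]] = []
    for tld, raw in records:
        try:
            parsed[tld] = parse_record(raw)
        except UnexpectedFormat as exc:
            failures.append((tld, str(exc)))
    return parsed, failures
```

One run then yields both the usable data and the full list of registries with odd output, with no restart after every surprise.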

Error Distribution

Only 16 TLDs manage to deliver a WHOIS output following ICANNs specifications.

We now have Brian.

Brian will display which Registry does not comply with the Registration Data Directory Service Specifications set out in the Registry Agreement for its own gTLD. Brian will also show you a graph detailing which registries’ WHOIS output has errors and how many there are. We will even send emails, manually(!), to the registries with the highest error counts in their WHOIS output to make them feel bad!
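Finding out who gets an email is then just a matter of counting. A tiny sketch, assuming the findings arrive as (registry, error) pairs; the registry names in the usage note are placeholders, not real results.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_by_errors(
    findings: Iterable[Tuple[str, str]], top: int = 3
) -> List[Tuple[str, int]]:
    """Count errors per registry and return the worst offenders first."""
    counts = Counter(registry for registry, _error in findings)
    return counts.most_common(top)
```

For example, feeding it three errors for `registry-a` and one for `registry-b` puts `registry-a` at the top of the list, which is also the data a per-registry error graph needs.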

 

Fear Brian!

Runs away laughing maniacally

P.S.: On a more serious note: ICANN is planning to do what we’re doing starting in 2016, but with real consequences for the registry in question. So love us or hate us, we’re doing you a favour.


=> See the new nTLDStats.com WHOIS Validation Tool and be sure to check out the interactive rules-guide as well.