Internet Privacy

We built Dombox to address privacy and spam issues. The average internet user doesn't quite understand the importance of privacy. Allow us to explain, with an example, why the internet today lacks privacy.

Hash

Can you identify a piece of text you typed, a photo you took, a video you captured, or any other digital file without looking at its contents?

With the help of "Hash" you can.

A hash is a short string that identifies a given file or piece of text. For practical purposes, it is unique to that input.

A hash is a one-way ticket: you can create the hash as long as you have the original message, but you cannot recreate the original message from the hash.

A hash is not a secret value. Anyone can create it as long as they have the original message. The hash of "SomeRandomString" is going to be the same for you and for a person who lives on the other side of the world.

Hashes are fixed-length strings. No matter how much data you feed in, you always get a hash of the same length.

Hash Demo
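
Here is a minimal sketch of the properties described above, using Python's standard hashlib module. MD5 is chosen only because it is the algorithm that shows up later in this article, and the input strings are arbitrary examples.

```python
import hashlib


def md5_hex(message: str) -> str:
    """Return the MD5 hash of a text message as a hexadecimal string."""
    return hashlib.md5(message.encode("utf-8")).hexdigest()


# Same input, same hash: anyone, anywhere, gets this exact value.
first = md5_hex("SomeRandomString")
second = md5_hex("SomeRandomString")
print(first == second)

# Fixed length: a one-character input and a one-million-character input
# both produce 32 hexadecimal characters.
print(len(md5_hex("a")), len(md5_hex("a" * 1_000_000)))

# A tiny change in the input produces a different hash.
print(md5_hex("SomeRandomString"))
print(md5_hex("SomeRandomString!"))
```

Notice that nothing in the code is secret: there is no key and no password, only the message itself.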

Hashes are about two rules.

Rule 1: Each Hash must be unique

Rule 2: Ditto Rule 1

When a hashing algorithm produces the same hash for two different inputs, it's called a collision. Algorithms for which collisions can be found in practice are considered broken and get phased out for security reasons. For example, MD5 and SHA-1 are both vulnerable to collision attacks.

Use Case 1: Passwords => Your passwords are hashed before they are stored on servers

Use Case 2: Storage => For identifying duplicate content and saving storage space (e.g. file and video hosting websites)

Use Case 3: Anti-Virus => For scanning malicious files. If the hash of a file on your computer matches a malicious file hash in your anti-virus software's hash list, then that file is a virus

Use Case 4: Integrity => File integrity check after downloading a file from the internet.

Use Case 5: Digital Signatures => For verifying the authenticity of digital messages or documents.
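
Two of the use cases above (storage and integrity) can be sketched in a few lines. This is only an illustration, not how any particular hosting site or download tool works. The function names are made up for this example, and SHA-256 is used instead of MD5 because it is not known to be broken.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Storage use case: group files that have identical contents."""
    by_hash = defaultdict(list)
    for path in paths:
        by_hash[sha256_of_file(path)].append(Path(path))
    # Keep only the hashes that were seen for more than one file.
    return [group for group in by_hash.values() if len(group) > 1]


def verify_download(path, expected_hex):
    """Integrity use case: compare a file against a published hash."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

In both cases the file contents are never compared directly. Only the short, fixed-length hashes are.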

Gravatar

Have you ever heard of a site called Gravatar? It is one of the most popular avatar services on the internet. Gravatar stands for "Globally Recognized Avatar".

Before Gravatar, you needed to upload your avatar manually on every website you signed up for. With Gravatar, it's all "one" avatar.

According to their stats, they serve avatars over 8.6 billion times a day.

WordPress is popular open source software. More than 60 million websites on the internet are powered by it. It comes with Gravatar support by default, so more than 60 million websites support Gravatar today.

Even major professional websites like StackOverflow and GitHub depend on the Gravatar service for avatars.

This is how Gravatar works. You go to gravatar.com, sign up with your email address, and upload an avatar. This avatar is now linked to your email address.

Gravatar uses the email hash to build the avatar URL. [Hash is a unique string that identifies the data]

This is what your avatar image URL looks like: https://secure.gravatar.com/avatar/{MD5 email hash goes here}
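
As a rough sketch, building that URL takes only a few lines of Python. It assumes the email address is trimmed and lowercased before hashing, as Gravatar's documentation describes. The function name and the address are made up for illustration.

```python
import hashlib


def gravatar_url(email: str) -> str:
    """Build the avatar URL for an email address (no account or key needed)."""
    # Normalize first so " User@Example.com " and "user@example.com" match.
    normalized = email.strip().lower()
    email_hash = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return f"https://secure.gravatar.com/avatar/{email_hash}"


# Hypothetical address, used only to show the shape of the result.
print(gravatar_url("user@example.com"))
```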

Now if you sign up on any third-party website or post a comment with your email address, your Gravatar will be displayed if the site supports it.

Although Gravatar solved a major issue, it created two more major issues.

(1) Email Brute-forcing (2) Privacy

Note: The average internet user may not notice these things, so we will try to explain them as clearly as we can.

Entropy

In a nutshell, entropy is the "degree of unpredictability".

Do you know what the most common password on the internet is?

It's 123456

Now... a hacker's first move would be to try that password. So the entropy of that password is "literally zero", because the hacker cracks it on the first attempt.

To increase the entropy, you need to pick a very strong password.

If we give you a "Hash" of an email address and ask you to find the real email address, you would be completely lost. Right?

e.g. 503A8F0B2D11DA49A27150C868A5EEB5 => ?????????@????????

Because there are a gazillion possibilities, the entropy is very high. The value of this entropy depends on the number of possible email address combinations, so you have no idea where to start.

But if we give you the "Name" too, your job gets much easier. A man whose name is "Donald Trump" is definitely not going to have an email address that looks like "barackobama@gmail.com".

Underline the word "definitely". Although you still have no idea what the real email address is, you are "sure" of something now. So you have weakened the entropy.
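
To put a rough number on "weakened", entropy is often measured in bits: the base-2 logarithm of the number of equally likely possibilities. The sketch below uses two made-up search spaces purely for illustration. They are assumptions, not measurements.

```python
import math


def entropy_bits(possibilities: int) -> float:
    """Entropy in bits when every possibility is equally likely."""
    return math.log2(possibilities)


# A password the attacker tries first leaves exactly one possibility.
print(entropy_bits(1))  # 0.0 bits, "literally zero"

# Assumption: a blind guess at an 8-letter lowercase local part.
blind_guess = 26 ** 8
# Assumption: about 1000 plausible addresses once the name is known.
name_based = 1000

print(round(entropy_bits(blind_guess), 1), "bits without the name")
print(round(entropy_bits(name_based), 1), "bits with the name")
```

Every bit removed halves the number of guesses an attacker needs, which is why one extra piece of information such as a name matters so much.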

Let us give you the "Name" and "Email Hash".

Name          Email Hash
Jeff Bezos    503A8F0B2D11DA49A27150C868A5EEB5

Let's try the following combinations.

Email Address        Hash
jeff@amazon.com      27D637B6F491BCBEE2C87F13136B675E
bezos@amazon.com     12B79F144DBF4AA7FEADFD71679A2F91
jbezos@amazon.com    503A8F0B2D11DA49A27150C868A5EEB5

There... the last attempt matches the email hash we were given.
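
The experiment above can be sketched as a small script. Everything here is hypothetical: the function names, the handful of name patterns, and the "Jane Doe" example are made up for illustration, and a real attacker would try far more patterns and domains.

```python
import hashlib


def md5_hex(address: str) -> str:
    """MD5 of a trimmed, lowercased email address, as a hex string."""
    return hashlib.md5(address.strip().lower().encode("utf-8")).hexdigest()


def candidate_addresses(full_name, domains):
    """Generate a few plausible addresses from a display name."""
    parts = full_name.lower().split()
    first, last = parts[0], parts[-1]
    local_parts = [
        first,               # jane
        last,                # doe
        first + last,        # janedoe
        first + "." + last,  # jane.doe
        first[0] + last,     # jdoe
        first + last[0],     # janed
    ]
    return [f"{local}@{domain}" for local in local_parts for domain in domains]


def recover_email(full_name, target_hash, domains):
    """Return the first candidate whose hash matches, or None."""
    target = target_hash.lower()  # hashes may be shown in upper case
    for address in candidate_addresses(full_name, domains):
        if md5_hex(address) == target:
            return address
    return None


# Hypothetical target: in a real attack only the hash would be known.
known_hash = md5_hex("jdoe@example.com")
print(recover_email("Jane Doe", known_hash, ["example.com"]))
```

No email is ever sent and no server is contacted. The whole search happens on the attacker's own machine.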

So one thing is clear from that experiment.

You can find "Valid Email Addresses" if we give you the "Name" and the "Email Hash".

But if we give you the "Date" too, then you can easily find the "Active Email Addresses", right?

For example, if a user posted a comment within the past 6 months or 1 year, then most likely the user is an active email user.

Email Hash + Name => Valid Email Addresses
Email Hash + Name + Date => Active Email Addresses

Brute-forcing

In the brute-force method, spammers have to generate many email addresses and try sending an email to each one. If the email is accepted, then it's a valid email address.

The success rate of this method is very low. Let's just say the success rate is 5%. That means 95 out of 100 emails fail. In such cases, popular mail services like Gmail and Outlook usually block and blacklist the spammer's IP address.

In Gravatar's case, brute-force, dictionary, and combination attacks don't run into that obstacle, because no email needs to be sent. All you have to do is generate email hashes based on the name you see right next to the avatar and compare them with the email hash in the avatar URL. If one matches, you have found a valid email address.

A spammer can find a massive amount of Gravatar URLs by crawling the web.

Efficiency

The Gravatar method is actually efficient too. Let's measure the efficiency.

Total number of email users in the world: 3.8 Billion

Although some users may have multiple accounts, let's go with one mail address for each user.

So we have 3.8 billion email addresses.

An average consumer computer can generate millions of hashes per second.

A high-end gaming computer with a graphics card can generate billions of hashes per second.

An Application-Specific Integrated Circuit (ASIC) is a chip designed for one specific application. For example, an ASIC designed for Bitcoin mining usually has a huge hash rate.

How much are we talking about?

Let us grab the screenshot for the AntMiner S9.

[Screenshot: AntMiner S9 hash rate]

Can you tell us what "TH" stands for in that screenshot?

Exactly...

Trillion Hashes / Tera Hashes.

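In the screenshot, they claim the chip can generate up to 14 trillion hashes per second. (Bitcoin ASICs compute SHA-256 rather than MD5, so treat this figure as an indication of what dedicated hardware can reach.)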

If you try 1000 name combinations for each email address, you would use only 3.8 Trillion hashes for 3.8 Billion email addresses.

So you would need roughly a quarter of a second to try all the email addresses in the world.
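
The arithmetic behind that estimate, written out as a short script. All three figures are the ones quoted above, and the calculation assumes the quoted hash rate could be applied directly to email hashes.

```python
EMAIL_ADDRESSES = 3_800_000_000          # 3.8 billion email users
GUESSES_PER_ADDRESS = 1_000              # name combinations tried per user
HASHES_PER_SECOND = 14_000_000_000_000   # 14 trillion hashes per second

total_hashes = EMAIL_ADDRESSES * GUESSES_PER_ADDRESS
seconds_needed = total_hashes / HASHES_PER_SECOND

print(f"{total_hashes:,} hashes in total")
print(f"{seconds_needed:.2f} seconds needed")
```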

That's more efficient than sending emails to services like Gmail to validate email addresses. Wouldn't you agree?

Privacy

Gravatar means globally recognized avatar, right? If you sign up on any website that supports Gravatar, your avatar URL is going to be the same everywhere. This is the real problem.

Let us explain clearly. Let's say you have a website example.com and you would like to support Gravatar.

You don't need any special API access to use Gravatar. All you have to do is take your user's email address and generate its hash.

Now just load the following URL for the image. That's it.

https://secure.gravatar.com/avatar/{your user's MD5 email hash goes here}

If you can do that, then everyone in the world can do that too, right? That is the problem here.

On the internet, sex sells. There are plenty of people out there who use the same email address for everything, from professional use to signing up for porn websites.

For the sake of argument, imagine you are a girl who gets kinky on such websites and one of your colleagues is stalking you. If your colleague does a deep web scan, it would reveal all your activity on every site that supports Gravatar.

As far as we know, the Gravatar TOS doesn't exclude any such websites.

Even if a site doesn't support Gravatar today, there is no guarantee the site won't support it in the future.

To be quite honest, we are less concerned about the porn websites.

There are things that require more privacy. For example, a person in an oppressive country who protests under a pen name can now be traced back.

We can give you even more examples: people who hide their sexuality in the real world but are open about it on the internet, people who seek discreet medical help on public forums, and so on.

Let us demonstrate the issue using the avatar of one of their team members.

Pay attention. We are going to use only the avatar URL to find the user's activity on the internet.

Toni's avatar on Gravatar Blog
Copy avatar URL
Google that avatar URL
Valid Result 1: Article written by Toni on Gravatar blog
Valid Result 2: Personal Blog Of Toni
Valid Result 3: A comment posted by Toni on Numenity blog
Valid Result 4: Article written by Toni on WordPress blog

Google has indexed only certain pages. But if you build a web crawler just for this job, you can get more results.

Even if "Toni" changes his name to "John" while signing up on a website or commenting on an article, the avatar URL is going to stay the same, since it's linked to his email address. So he can be traced back.
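
A small sketch of why the name change does not help. The crawled data below is entirely made up, and a real crawler would read the hash out of each avatar URL rather than compute it. The point is only that grouping by hash links the accounts.

```python
import hashlib
from collections import defaultdict


def avatar_hash(email: str) -> str:
    """Hash used in the avatar URL: MD5 of the trimmed, lowercased email."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def group_by_avatar(comments):
    """Group (site, display_name, email_hash) records by email hash."""
    groups = defaultdict(list)
    for site, display_name, email_hash in comments:
        groups[email_hash].append((site, display_name))
    return dict(groups)


# Hypothetical crawl results. The first two use the same email address
# under different display names on different sites.
crawled_comments = [
    ("blog.example.com", "Toni", avatar_hash("toni@example.com")),
    ("forum.example.org", "John", avatar_hash("Toni@Example.com ")),
    ("news.example.net", "Alice", avatar_hash("alice@example.com")),
]

for email_hash, sightings in group_by_avatar(crawled_comments).items():
    if len(sightings) > 1:
        print(email_hash, sightings)
```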

Again... we found those results using only his avatar URL, not his name.

Government agencies are able to create a full-fledged scanning tool just for this purpose.

Now we know what you are gonna say.

"I have never heard of Gravatar before. So why should I bother?"

Well... we've got news for you. The disturbing thing here is that it doesn't matter whether you have signed up for Gravatar or not.

Keep in mind, the subject of our discussion here is the "Gravatar URL", not "Gravatar Users".

If you have ever used your email address on a third party website for commenting or signing up, chances are your privacy is at risk.

This is because third-party websites have no idea whether you have signed up for Gravatar or not. So they need to build the Gravatar URL for everyone using the email hash.

If there is an avatar linked to your email address, then that avatar will be displayed. Else a default avatar will be displayed.

The blog in the next image contains 500+ user comments with avatars.

Blog Comments

The comments that show a real avatar belong to actual "Gravatar" users. The comments that show a dummy avatar belong to "Non-Gravatar" users.

Non-Gravatar vs Gravatar Images

Pay attention to the "Non-Gravatar" user avatar URLs. Email addresses are still hashed there.

Non-Gravatar user URL

There may be a few million Gravatar users. But you can most likely find billions of Gravatar URLs on the internet.

For what it's worth, we are not blaming Gravatar for this, because the problem they set out to solve is completely different. We are just pointing out the flaws in their system.

Note: The Gravatar privacy issue applies only to public pages that can be crawled.