Dombox - Conversational Email Address (CEA)

Spam Filters

Today, the world relies primarily on spam filters to reduce spam. But spam filters only silence the spam problem; they don't solve it. Bandwidth, CPU and storage are still wasted. Spam filters are not perfect, but they work well enough for most people. This creates a "placebo effect".

The problem with spam filters is that you need to teach them what is good and what is bad. To train them, you need lots of data. It is easy to find billions of spam emails (i.e. unwanted mails) on the internet to train your spam filter. But it's very hard to find ham emails (i.e. wanted mails) because of their sensitive nature. Nobody wants to give away their "genuine" inbox mails just to train your spam filter.

Gmail's biggest strength is not its proprietary spam filter algorithm, but the data it has from billions of users. So even if Google open-sourced its proprietary spam filter algorithm, your spam filter would never be as good as Gmail's unless you could convince billions of people to trust you with their data.

Probability vs Certainty

Many spam filters rely on Bayes' theorem, which is all about calculating probability. Since spam filters deal with probability rather than certainty, spam mails need to be stored in the spam folder for further inspection by the end user.
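
To make the "probability" point concrete, here is a minimal sketch of Bayes' theorem applied to a single word. The function name and the sample numbers are assumptions made up for illustration; they are not taken from any real spam filter.

```python
def spam_probability(p_word_given_spam: float,
                     p_word_given_ham: float,
                     p_spam: float = 0.5) -> float:
    """Return P(spam | word) using Bayes' theorem."""
    p_ham = 1.0 - p_spam
    numerator = p_word_given_spam * p_spam
    evidence = numerator + p_word_given_ham * p_ham
    return numerator / evidence


# Hypothetical numbers: the word shows up in 40% of spam and 1% of ham.
score = spam_probability(0.40, 0.01)
print(f"{score:.3f}")  # prints 0.976
```

Even a score of 0.976 leaves a small chance that the mail is wanted, which is why the mail goes to a spam folder instead of being rejected outright.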

If we are ever going to solve the spam problem, we need a "Certainty" system, not a "Probability" system. With a "Certainty" system, we can reject mails instead of accepting them. When we reject mails, we also save bandwidth, CPU and storage.

Universal Mechanism

The reason we rely on spam filters is that we don't have a universal mechanism that can prove the sender is who they say they are. i.e. If an email claims to be from jeff@amazon.com, that's not always going to be true.

Sender Policy Framework (SPF) is one of the mechanisms email has for detecting spoofing. It was introduced in 2006. In SPF, the receiving server usually compares the incoming mail's IP address (i.e. the client IP) with the whitelisted IP addresses found in the SPF record of the "Envelope Domain".

For example, this is the SPF record of facebook.com.

dig +short txt facebook.com
"v=spf1 redirect=_spf.facebook.com"
dig +short txt _spf.facebook.com
"v=spf1 ip4:66.220.144.128/25 ip4:66.220.155.0/24 ip4:66.220.157.0/25 ip4:69.63.178.128/25 ip4:69.63.181.0/24 ip4:69.63.184.0/25" " ip4:69.171.232.0/24 ip4:69.171.244.0/23 -all"

Facebook.com says: accept mails only from those whitelisted IP addresses when the MAIL FROM email address ends with @facebook.com.
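
The comparison a receiving server makes can be sketched as follows. This is a simplified illustration that only understands "ip4:" terms; a real SPF evaluator also has to handle mechanisms such as include, redirect, a, mx and ip6. The function names are our own, and the record string is copied from the _spf.facebook.com output above.

```python
import ipaddress

# ip4 ranges copied from the _spf.facebook.com record shown above.
FACEBOOK_SPF = (
    "v=spf1 ip4:66.220.144.128/25 ip4:66.220.155.0/24 ip4:66.220.157.0/25 "
    "ip4:69.63.178.128/25 ip4:69.63.181.0/24 ip4:69.63.184.0/25 "
    "ip4:69.171.232.0/24 ip4:69.171.244.0/23 -all"
)


def spf_ip4_networks(record: str) -> list:
    """Extract every ip4: term of an SPF record as a network object."""
    return [
        ipaddress.ip_network(term[len("ip4:"):])
        for term in record.split()
        if term.startswith("ip4:")
    ]


def client_ip_allowed(client_ip: str, record: str) -> bool:
    """True if the connecting IP falls inside any whitelisted ip4 range."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in network for network in spf_ip4_networks(record))


print(client_ip_allowed("66.220.155.10", FACEBOOK_SPF))  # inside 66.220.155.0/24
print(client_ip_allowed("203.0.113.5", FACEBOOK_SPF))    # not whitelisted
```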

But there is one big problem with SPF. It's an OPTIONAL mechanism. i.e. There is no internet standard that says a domain MUST configure SPF.

SPF adoption fades once we get past the Alexa top 1 million domains. So if we rely only on SPF records, the solution may work for the 100th domain, but it's not going to work for the 100 millionth domain.

Hot Gates

Have you ever watched the movie 300, starring Gerard Butler? If yes, let us ask you a question.

In that movie, King Leonidas and his soldiers battle 300,000 Persian soldiers near a narrow pass called Thermopylae, aka the "Hot Gates".

Our question is: why the Hot Gates? Why not battle on open ground?

That's because the Spartans' strength lay not only in their superior fighting skills but also in their tactical advantage. Without the "Hot Gates", the whole battle would have been an instant massacre.

To solve the spam problem, we need that kind of tactical advantage. To get it, we need to make our system deal with "conversational mails" and "non-conversational mails" separately.

Non-Conversational Mails

Let's deal with "non-conversational mails" first.

We need to offload all non-conversational mails to Domboxes. Dombox stands for Domain-based Isolated Mailbox. You create a new Dombox for each and every domain you sign up with. Note: You have full control over your Domboxes, so you can delete them anytime you want.

There's more to Domboxes. Visit our features page to fully understand how Dombox addresses work.

Conversational Mails

Email is ubiquitous. You know what else is ubiquitous?

MX Records. They were introduced in 1986.

Today, all email communication uses MX records to transmit mail from one domain to another.

e.g. When john@example.com sends an email to jane@gmail.com, the gmail.com MX record is queried and the mail is transferred to one of the Gmail MX servers. When Jane replies to that mail, the example.com MX record is queried and the mail is transferred to one of the example.com MX servers. So conversational mails require MX records on both sides.
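
The "query the MX record, then pick a server" step can be sketched like this. The MX table below is made-up sample data standing in for a real DNS lookup, and the function names are our own. The one rule it relies on is that a lower MX preference value is tried first.

```python
# Hypothetical MX data for illustration only; a real sender would query DNS.
MX_TABLE = {
    "example.com": [(20, "mx2.example.com"), (10, "mx1.example.com")],
}


def delivery_order(mx_records):
    """Sort (preference, host) pairs; the lowest preference is tried first."""
    return [host for _, host in sorted(mx_records)]


def next_hops(recipient: str):
    """Return the MX hosts to try for a recipient address, in order."""
    domain = recipient.rsplit("@", 1)[1].lower()
    return delivery_order(MX_TABLE.get(domain, []))


print(next_hops("john@example.com"))
```

A domain with no MX entry in the table returns an empty list, which mirrors the point that a reply has nowhere to go unless both sides publish MX records.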

The term "Conversational Mails" can also be read as Mailbox-to-Mailbox / MX-to-MX / Human-to-Human emails.

So if we use "MX Records" as our "Hot Gates", we can solve the email spam problem.

An email address that only deals with "Conversational Mails" is called a "Conversational Address" in our system.

An incoming mail for a conversational address can be either from a self-hosted sender or a third-party hosted sender.

Self-Hosted Sender

When a mail comes from richard@piedpiper.com, we compare the incoming mail's IP address (i.e. the client IP) with the IP addresses extracted from the following records.

dig MX piedpiper.com (MX Records)

dig TXT piedpiper.com (SPF Record)

dig A piedpiper.com (A Record)

If the IP address is not found among the extracted IP addresses, we check whether "richard@piedpiper.com" is a recognized contact by searching the recipient's address book. If that fails too, we reject the mail with the following error.

550 Unverified and Unrecognized Sender. Please send this mail from one of your MX server IP address OR whitelist the IP address [XXX.XXX.XXX.XXX] in piedpiper.com SPF record.
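
The decision above can be sketched as a small function. It assumes the IP addresses from the MX, SPF and A lookups have already been collected into one set, so no DNS code is shown. The function name and the sample IP addresses (taken from the documentation ranges) are assumptions for illustration.

```python
def verify_self_hosted(client_ip, sender, authorized_ips, address_book):
    """Return an SMTP-style (code, text) decision for one incoming mail."""
    # Step 1: is the connecting IP one the sending domain vouches for?
    if client_ip in authorized_ips:
        return 250, "OK"
    # Step 2: fall back to the recipient's address book.
    if sender.lower() in address_book:
        return 250, "OK"
    # Step 3: neither check passed, so reject instead of storing the mail.
    domain = sender.rsplit("@", 1)[1]
    return 550, (
        "Unverified and Unrecognized Sender. Please send this mail from one "
        "of your MX server IP address OR whitelist the IP address "
        f"[{client_ip}] in {domain} SPF record."
    )


# Hypothetical data: IPs gathered from the MX, SPF and A records.
authorized = {"192.0.2.10", "192.0.2.11"}
contacts = {"monica@piedpiper.com"}

print(verify_self_hosted("192.0.2.10", "richard@piedpiper.com", authorized, contacts))
print(verify_self_hosted("198.51.100.7", "richard@piedpiper.com", authorized, contacts))
```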

Third-Party Hosted Sender

When the MX server's hostname does not end with the sender's domain, that domain is considered a third-party hosted domain.

In this case, piedpiper.com hosts its mail on Google's servers.

So we compare the incoming mail's IP address (i.e. the client IP) with the IP addresses extracted from the following records.

dig MX piedpiper.com (MX Records, pointing to google.com)

dig TXT piedpiper.com (PiedPiper SPF Record)

dig A piedpiper.com (A Record)

dig TXT google.com (Google SPF Record — The base domain of MX host)

If the IP address is not found among the extracted IP addresses, we check whether "richard@piedpiper.com" is a recognized contact by searching the recipient's address book. If that fails too, we reject the mail with the following error.

550 Unverified and Unrecognized Sender. Please send this mail from one of your MX server IP address OR whitelist the IP address [XXX.XXX.XXX.XXX] in piedpiper.com SPF record or google.com SPF record.
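
Deciding which SPF records to consult for a third-party hosted sender can be sketched as below. The function names are our own, and taking the last two labels as the "base domain" is a simplifying assumption; a real system would need the Public Suffix List to handle domains such as example.co.uk.

```python
def naive_base_domain(host: str) -> str:
    # Assumption: the base domain is the last two labels of the host name.
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def spf_domains_to_check(sender_domain: str, mx_hosts) -> list:
    """Sender domain first, then the base domain of any outside MX host."""
    sender_domain = sender_domain.lower()
    domains = [sender_domain]
    for host in mx_hosts:
        host = host.lower().rstrip(".")
        inside = host == sender_domain or host.endswith("." + sender_domain)
        base = naive_base_domain(host)
        if not inside and base not in domains:
            domains.append(base)  # third-party host: add its SPF domain too
    return domains


# piedpiper.com hosted on Google: both SPF records are consulted.
print(spf_domains_to_check("piedpiper.com", ["aspmx.l.google.com"]))
# Self-hosted: only the sender's own SPF record is consulted.
print(spf_domains_to_check("piedpiper.com", ["mail.piedpiper.com"]))
```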

IP-based Reputation System

IP addresses are unstable.

You go to Starbucks for a coffee and connect to their wifi. Now you have an IP address. Let's say you keep sending spam from that IP address. A receiving server may rate limit mails from that Starbucks IP address to safeguard its systems from abuse. However, the receiving server can't ban the Starbucks IP address forever.

Your internet service provider (ISP) assigns you an IP address. Once you close your account, that IP address will be released and assigned to some other person.

If you search for "free ip proxy", you can find millions of results. So IP addresses are literally free.

To summarise, IP addresses are cheap, unstable and the punishment for spamming is temporary.

Domain-based Reputation System

Our system is a domain-based reputation system. i.e. The sending IP must be whitelisted by the sending domain either implicitly (MX, A / AAAA) or explicitly (SPF) in order to deliver conversational mails.

A domain-based reputation system has the following advantages over an IP-based reputation system.

Not Free - Starbucks is not going to give you their domain "starbucks.com" for sending out mails. That means a spammer needs to spend some money buying a domain.
Increased Cost - If IP addresses are equivalent to "Silver", then domains are "Gold". Domains are costly compared to IP addresses, which increases the cost of spamming.
Stable - Once a domain is flagged as a spamming domain, we can permanently ban it. This means a spammer needs to keep buying more domains.
Registration Date - We can use the domain registration date to rate limit mails from fresh domains.
Botnets - Spammers hijack lots of computers to get fresh IP addresses for spamming. For example, the Bredolab botnet was estimated to consist of millions of computers. At its peak, it was estimated to send 3.6 billion emails with Bredolab virus payloads daily. A domain-based reputation system can make such hijacking useless.
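
The "Stable" and "Registration Date" points can be sketched together as a rate limit keyed on the sending domain. The ban list, the age thresholds and the limits below are placeholder values chosen for illustration, not the numbers our system uses.

```python
from datetime import date

# Hypothetical ban list; a banned domain stays banned.
BANNED_DOMAINS = {"spammer.example"}


def hourly_limit(domain: str, registered_on: date, today: date) -> int:
    """Return how many mails per hour to accept from a sending domain."""
    if domain.lower() in BANNED_DOMAINS:
        return 0  # permanent ban: accept nothing
    age_days = (today - registered_on).days
    if age_days < 30:
        return 10  # fresh domain: very tight limit
    if age_days < 365:
        return 100  # young domain: moderate limit
    return 1000  # established domain


today = date(2024, 1, 10)
print(hourly_limit("fresh.example", date(2024, 1, 1), today))
print(hourly_limit("old.example", date(2015, 5, 1), today))
```

Because the limit follows the domain and not the IP address, a spammer who hops between hijacked machines gains nothing from the fresh IP addresses.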

Challenge / Response

Our "conversational address" accepts only verified mails. i.e. The sender is who they say they are.

Sometimes a spammer may configure everything properly and be okay with getting their domain banned. To withstand such spammers, we need another barrier.

If you don't want to annoy your senders, you can go the traditional "spam filter" route. i.e. The verified mails are passed into a spam filter for further inspection. Since we use the "spam filter" as a secondary spam prevention mechanism, we are dealing with only a tiny percentage of emails here.

On the other hand, if you are okay with annoying your senders a little bit, you can make our system send a challenge mail saying "Please click this link and prove that you are human by filling the CAPTCHA to deliver the mail. Mail will be automatically discarded if you don't prove that you are a human within the next 7 days".

Since spammers don't want to keep filling in CAPTCHAs to deliver mails, this kind of system cuts down a lot of spam.

Note: "Authorized Personnel" skip the "CAPTCHA" process. The following addresses are called "Authorized Personnel" in our system.

The email addresses found in your "Address Book / Contacts".
The email addresses you have mailed.
The email addresses you have replied to.
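
The routing of an already verified mail can be sketched as follows. The class, field and return-value names are assumptions for illustration; the logic only mirrors the rules described above: "Authorized Personnel" are delivered directly, and everyone else goes either to a CAPTCHA challenge or to the spam filter, depending on the recipient's preference.

```python
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    contacts: set = field(default_factory=set)  # Address Book / Contacts
    mailed: set = field(default_factory=set)    # addresses you have mailed
    replied: set = field(default_factory=set)   # addresses you have replied to
    challenge_senders: bool = False             # True = CAPTCHA route

    def is_authorized(self, sender: str) -> bool:
        """'Authorized Personnel' appear in any of the three sets."""
        s = sender.lower()
        return s in self.contacts or s in self.mailed or s in self.replied


def next_step(mailbox: Mailbox, sender: str) -> str:
    """Decide what happens to a mail that already passed verification."""
    if mailbox.is_authorized(sender):
        return "deliver"
    return "challenge" if mailbox.challenge_senders else "spam_filter"


box = Mailbox(contacts={"jane@gmail.com"}, challenge_senders=True)
print(next_step(box, "jane@gmail.com"))      # known contact skips the CAPTCHA
print(next_step(box, "stranger@example.com"))
```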