E-mail address checking technologies
Delivering an e-mail message to the addressee takes two stages:
- 1. The sender’s mail server determines the addressee’s mail server using the DNS service;
- 2. The sender’s mail server connects to the addressee’s mail server via the SMTP protocol and transmits the message.
To check whether an e-mail address is available, it is necessary to emulate these stages. The problem is that some mail services do not check whether the addressee’s mailbox actually exists in their domain when accepting incoming mail. They accept every message and, if the address turns out not to exist, send a delivery failure notice back to the original sender. Addresses that belong to such mail services make up about 30% of all e-mail addresses, and their availability cannot be checked by software. Thus, only about 70% of unavailable e-mail addresses can be detected with domain or email validator tools.
In turn, about 30% of the unavailable addresses that software tools can detect are discovered in the first checking stage (the DNS request); discovering the other 70% requires the second stage (SMTP connection emulation). The second stage usually takes 10 times more time and 5 times more network traffic than the first. In fact, a complete two-stage check of whether an e-mail address exists takes about the same time and traffic as sending a short message to that address.
Let’s look at the check stages in more detail.
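The percentages above are nested, so it can help to express them all relative to the full set of unavailable addresses. The following is a small sketch of that arithmetic, using only the approximate figures quoted in the text; the variable names are illustrative.

```python
# Approximate shares quoted in the text, as whole percentages.
checkable_pct = 70   # unavailable addresses that software can detect at all
stage1_pct = 30      # of those, the share found by the DNS request (stage 1)
stage2_pct = 70      # of those, the share found only by the SMTP dialog (stage 2)

# Re-express every share relative to ALL unavailable addresses.
found_by_dns = checkable_pct * stage1_pct // 100
found_by_smtp = checkable_pct * stage2_pct // 100
undetectable = 100 - checkable_pct

print(found_by_dns, found_by_smtp, undetectable)  # prints: 21 49 30
```

In other words, under these figures the fast DNS-only check would catch roughly one in five unavailable addresses, and the full two-stage check roughly seven in ten.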
Stage 1 The email verifying software parses the e-mail address syntactically, extracts the mail domain and sends a request to a DNS server to get the mail server for this domain. The exchange with the DNS server uses the UDP protocol, which is faster than TCP because it does not involve establishing a connection between the servers. A DNS request usually takes 1-2 seconds. This includes sending a request packet (about 60 bytes including the packet header) and receiving a response packet (usually 200-300 bytes, but not more than 512). This stage filters out all syntactically incorrect e-mail addresses as well as addresses in non-existent domains.
Note: The syntactic check performed by Email Verifier is a very simple one: an e-mail address must include one @ sign and must end with one of the basic top-level domains (TLDs). The list of TLDs is stored in the file BulkVerifier.tld in the application’s main folder. A more precise syntactic check does not seem reasonable, since it would slow down processing.
Stage 2 The email verifying software establishes a connection to the mail server via the SMTP protocol (based on TCP). TCP is connection-oriented, so the servers first exchange service packets to establish the connection. Once the connection is established, the servers exchange hello messages (the first lines in the log below). Then the sender’s address is transmitted and the receiving server confirms that it accepts mail from this address. Finally, the addressee’s address is transmitted.
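To make the DNS request more concrete, here is a minimal sketch of how a mail-server (MX) query could be assembled and sent over UDP using only the Python standard library. It is not the code used by Email Verifier; the function names, the query ID and the resolver address parameter are assumptions made for illustration.

```python
import socket
import struct

def mail_domain(address: str) -> str:
    """Return the part of the address after the last @ sign."""
    return address.rsplit("@", 1)[1]

def build_mx_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query message asking for the MX records of `domain`."""
    # Header: ID, flags (only "recursion desired" set), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 15 = MX, QCLASS 1 = IN (Internet).
    return header + qname + struct.pack("!HH", 15, 1)

def send_query(query: bytes, resolver_ip: str, timeout: float = 2.0) -> bytes:
    """Send the query to a resolver over UDP and return the raw response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (resolver_ip, 53))   # no connection setup is needed
        response, _ = sock.recvfrom(512)        # classic UDP DNS replies fit in 512 bytes
        return response

query = build_mx_query(mail_domain("noshuchaddress@ibm.com"))
print(len(query))  # DNS payload only, before the UDP and IP headers are added
```

The sketch only builds the query when run; send_query is left uncalled because it needs the address of a reachable DNS resolver, and parsing the response is omitted.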
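The simple check described in the note can be sketched roughly as follows. The TLD set here is a small illustrative sample, not the contents of BulkVerifier.tld (whose format is not described in this text), and the requirement that both sides of the @ sign be non-empty is an added assumption.

```python
# Illustrative sample only; the application reads its list from BulkVerifier.tld.
SAMPLE_TLDS = {"com", "net", "org", "edu", "gov"}

def passes_simple_check(address: str, tlds=SAMPLE_TLDS) -> bool:
    """Return True if the address has one @ sign and ends with a known TLD."""
    # Exactly one @ sign.
    if address.count("@") != 1:
        return False
    local, domain = address.split("@")
    # Assumption: both parts must be non-empty.
    if not local or not domain:
        return False
    # The domain must end with a dot followed by a known TLD.
    name, dot, tld = domain.rpartition(".")
    return bool(name) and dot == "." and tld.lower() in tlds

print(passes_simple_check("noshuchaddress@ibm.com"))  # True
print(passes_simple_check("broken@@ibm.com"))         # False
```

Note that an address such as noshuchaddress@ibm.com passes this check even though the mailbox does not exist; that is what the later stages are for.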
Here is a log example:
< 220 Thu, 22 Aug 2002 20:44:07 +0500
> HELO cisco.my.net
< 250-ns.watson.ibm.com Hello cisco.my.net [12.44.72.94],
< 250 pleased to meet you
> MAIL FROM:<verify@testmail.com>
< 250 <verify@testmail.com>... Sender is valid.
> RCPT TO:<noshuchaddress@ibm.com>
< 550 <noshuchaddress@ibm.com>... User unknown
> RSET
< 250 Resetting the state.
> QUIT
As you can see, the receiving server responded that the user with the address noshuchaddress@ibm.com is unknown and refused to receive a message for this user. Then the servers exchanged commands to close the connection.
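A dialog like the one in the log could be reproduced with Python's standard smtplib module, as in the sketch below. This is an illustration under assumptions, not the implementation used by Email Verifier: the function names, the verdict labels and the timeout value are hypothetical, and the mail server host is expected to come from the stage 1 DNS lookup.

```python
import smtplib

def classify_rcpt_code(code: int) -> str:
    """Map the reply code of RCPT TO onto a verdict."""
    if 200 <= code < 300:
        # The server will take the message; some servers accept every address,
        # so this does not prove that the mailbox exists.
        return "accepted"
    if 500 <= code < 600:
        return "rejected"       # e.g. 550 User unknown, as in the log
    return "undetermined"       # temporary failures (4xx) and anything else

def probe_mailbox(mx_host: str, sender: str, recipient: str,
                  helo_name: str, timeout: float = 30.0) -> str:
    """Run the HELO / MAIL FROM / RCPT TO / RSET / QUIT dialog without sending a message."""
    # Connecting reads the 220 greeting; leaving the with-block sends QUIT.
    with smtplib.SMTP(mx_host, 25, timeout=timeout) as server:
        server.helo(helo_name)
        server.mail(sender)
        code, _reply = server.rcpt(recipient)
        server.rset()
    return classify_rcpt_code(code)

print(classify_rcpt_code(550))  # rejected
```

Only classify_rcpt_code runs here; probe_mailbox needs network access to a real mail server, so it is defined but not called.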
During the address check the servers exchanged 10 messages with a total size of about 500 bytes. In fact, it took more than 20 packets to deliver these messages, which brought the total traffic to about 2 KBytes. Most of the time was spent waiting for responses from the other server.
BulkVerifier can perform both a complete (but slow) two-stage check of e-mail address availability and a high-speed check that involves only the first stage (the DNS server request). This fast email verifier checks the validity of e-mail addresses or domains in any bulk email list, database, or spreadsheet (such as Excel).
