BulkVerifier – Bulk Domain & Email Verifier
User’s Guide
Introduction to e-mailing technologies
E-mail addresses check technologies
General Bulk Verifier features
Fast Mode - High Speed Bulk Verifier
Deep (slow) mode of Bulk Verifier
Bulk Verifier interface and options
Welcome to Bulk Verifier, an efficient multi-threaded, high-speed application for checking e-mail addresses and domain availability. Bulk Verifier checks every e-mail address in a given mailing list so that you can determine whether the addresses still exist. It offers two processing modes, fast and deep, for cleaning and validating e-mail lists and domains.
In fast mode, Bulk Verifier can process mailing lists containing tens of millions of e-mail addresses at a speed of several thousand addresses per second. This mode does not give the highest checking accuracy, but it is the most economical in time and traffic and gives quite sufficient results. We recommend the fast mode as a high-speed tool for sifting obvious rubbish out of large mailing lists containing millions of e-mail addresses. For details, see the section “Fast Mode - High Speed Bulk Verifier”.
In deep mode (the default), Bulk Verifier works significantly slower but gives much more precise results. The optimal amount of data for this mode is 70,000 to 100,000 e-mail addresses. We recommend the deep mode as a slow but high-quality tool for checking mailing lists that are not very large. For details, see the section “Deep (slow) mode of Bulk Verifier”.
There are 2 stages in e-mail message delivery to the addressee:
1. The sender’s mail server determines the addressee’s mail server using the DNS service;
2. The sender’s mail server connects to the addressee’s mail server via the SMTP protocol and transmits the message.
A mail domain (e.g. mail.com for the address nicky@mail.com) is usually different from the name of the mail server that receives e-mail messages for the address. For example, at the time this Guide was written, the servers mail-com.mr.outblaze.com and mail-com-bk.mr.outblaze.com accepted messages for the address nicky@mail.com, while the computers with the addresses mail.com and www.mail.com did not accept messages for any e-mail addresses at all. That is why you should not directly associate an e-mail domain with the name of a mail server: messages are often accepted by another computer with a completely different name.
To determine the address of the addressee’s server, a request is sent to the DNS service, which stores (together with other data) the mapping between mail domains and the mail servers that receive messages for them. DNS is a distributed database. For example, the DNS server ns1.outblaze.com stores all information about the domain mail.com but has no information about other domains (e.g. hotmail.com). Likewise, the server ns1.hotmail.com has information about the domain hotmail.com but no data about other domains. A separate server is responsible for the .com zone and stores the information about the domains of this zone.
The DNS server of your provider does not contain any records about mail.com or hotmail.com. When it receives a request about, for example, mail.com, it asks the server responsible for the .com zone for the address of the server holding the information about the domain mail.com (ns1.outblaze.com), then connects to that server and sends the response back to you. This way of executing a request is called recursive.
DNS technologies are described in detail in many public sources and are not the subject of this Guide. What is important to know is that a DNS request can pass through several DNS servers in different locations before you get the response, and that the owner of a domain is responsible for storing the information about that domain.
DNS requests are also cached. Usually a DNS server stores the results of the latest requests for several days to decrease the load on DNS servers and speed up request execution. This means that after an unforeseen change in a DNS server’s records, it may take several days before the caches of other DNS servers are refreshed and their users receive the updated information.
As mentioned above, there are 2 stages in e-mail message delivery to the addressee:
1. The sender’s mail server determines the addressee’s mail server using the DNS service;
2. The sender’s mail server connects to the addressee’s mail server via the SMTP protocol and transmits the message.
To check whether an e-mail address is available, it is necessary to emulate these stages. The problem is that some mail services do not check whether the addressee’s e-mail address (mailbox) actually exists in their domain when accepting incoming mail. They accept all messages and then, if an address does not in fact exist, send the original sender a delivery failure message. About 30% of all e-mail addresses belong to such mail services, and their availability cannot be checked by software. Thus, only about 70% of unavailable e-mail addresses can be detected with software tools.
In turn, about 30% of the unavailable addresses that domain or e-mail validation software can detect are discovered at the first checking stage (DNS request); discovering the other 70% requires the 2nd stage (SMTP connection emulation). The 2nd checking stage usually takes 10 times more time and 5 times more network traffic than the 1st one. In fact, a complete two-stage check of an e-mail address takes the same time and traffic as sending a short message to that address.
Let’s look at the check stages in more detail.
Stage 1. The software parses the e-mail address syntactically, extracts the mail domain and sends a request to a DNS server to get the mail server of this domain. The exchange with the DNS server uses the UDP protocol, which is faster than TCP because it does not involve establishing a connection between the servers. A request to a DNS server usually takes 1-2 seconds. This includes sending a request packet (about 60 bytes including the packet header) and receiving a response packet (usually 200-300 bytes, but not more than 512). This stage filters out all syntactically incorrect e-mail addresses as well as addresses in non-existent domains.
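To make the size of a Stage 1 request more concrete, here is a minimal sketch of how a DNS query for a domain’s mail server (MX record) can be assembled by hand. It follows the standard DNS message layout; the function name build_mx_query is hypothetical, and this is an illustration only, not a description of how Bulk Verifier builds its requests.

```python
import random
import struct
from typing import Optional


def build_mx_query(domain: str, query_id: Optional[int] = None) -> bytes:
    """Build a minimal DNS query packet asking for the MX records of `domain`."""
    if query_id is None:
        query_id = random.randint(0, 0xFFFF)
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # 0 answers, 0 authority records, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: every label is prefixed with its length,
    # and the name is terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 15 = MX, QCLASS 1 = IN (Internet).
    question = qname + struct.pack("!HH", 15, 1)
    return header + question


packet = build_mx_query("mail.com", query_id=0x1234)
print(len(packet))  # 26 bytes of DNS payload, before UDP and IP headers
```

Such a packet would be sent to port 53 of a DNS server over a UDP socket. The UDP and IP headers add roughly 28 more bytes, which is consistent with the figure of about 60 bytes per request given above.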
Note. The syntactic check performed by Bulk Verifier is a very simple one: an e-mail address must include one “@” sign and must end with one of the basic top-level domains (TLDs). The list of TLDs is stored in the file “Bulk Verifier.tld” in the application’s main folder. A more precise syntactic check does not seem reasonable, since it would slow down the processing.
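The simple syntactic check described in the note can be sketched as follows. The function names are hypothetical, and the sketch assumes that the TLD file holds one TLD per line, which the Guide does not state; the requirements of a non-empty local part and a dot in the domain are also assumptions added for illustration.

```python
def load_tlds(lines):
    """Return a set of lower-case TLDs from an iterable of text lines.

    Assumption: one TLD per line, with or without a leading dot.
    """
    return {line.strip().lower().lstrip(".") for line in lines if line.strip()}


def is_syntactically_valid(address: str, tlds: set) -> bool:
    """Very simple check: exactly one '@' and a known top-level domain."""
    address = address.strip().lower()
    if address.count("@") != 1:
        return False
    local_part, domain = address.split("@")
    # Assumption: an empty local part or a domain without a dot is invalid.
    if not local_part or "." not in domain:
        return False
    return domain.rsplit(".", 1)[1] in tlds


tlds = load_tlds(["com", "net", ".org"])
print(is_syntactically_valid("nicky@mail.com", tlds))   # True
print(is_syntactically_valid("nicky.mail.com", tlds))   # False
```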
Stage 2. The checking software establishes a connection to the mail server via the SMTP protocol (based on TCP). TCP is connection-oriented, so the servers first exchange service packets to establish the connection.
After the connection is established, the servers exchange “hello messages” (the first lines in the log below). Then the sender’s address is transmitted, and the receiving server confirms that it will accept a message from this address. Then the addressee’s address is transmitted.
Here is a log example:
< 220-ns.watson.ibm.com ESMTP Sendmail AIX4.3/8.9.3/8.9.0
< 220 Thu, 22 Aug 2002 20:44:07 +0500
> HELO cisco.my.net
< 250-ns.watson.ibm.com Hello cisco.my.net [12.44.72.94],
< 250 pleased to meet you
> MAIL FROM:<verify@testmail.com>
< 250 <verify@testmail.com>... Sender is valid.
> RCPT TO:<noshuchaddress@ibm.com>
< 550 <noshuchaddress@ibm.com>... User unknown
> RSET
< 250 Resetting the state.
> QUIT
As you can see, the receiving server responded that the user with the address noshuchaddress@ibm.com is unknown and refused to receive a message for this user. Then the servers exchanged commands to close the connection.
During the address check the servers exchanged 10 messages with a total size of about 500 bytes. In fact, it took more than 20 packets to deliver these messages, which brought the total traffic to about 2 KBytes. Most of the time was spent waiting for the response from the other server.
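The count of 10 messages follows from the SMTP reply format: a server reply may span several lines, where a hyphen after the three-digit code (“250-”) means that more lines follow and a space (“250 ”) marks the last line. The sketch below groups the lines of the log above into messages on that basis. The function name group_messages is hypothetical; the “<” and “>” prefixes are taken from the log as server and client lines respectively.

```python
import re

LOG = """\
< 220-ns.watson.ibm.com ESMTP Sendmail AIX4.3/8.9.3/8.9.0
< 220 Thu, 22 Aug 2002 20:44:07 +0500
> HELO cisco.my.net
< 250-ns.watson.ibm.com Hello cisco.my.net [12.44.72.94],
< 250 pleased to meet you
> MAIL FROM:<verify@testmail.com>
< 250 <verify@testmail.com>... Sender is valid.
> RCPT TO:<noshuchaddress@ibm.com>
< 550 <noshuchaddress@ibm.com>... User unknown
> RSET
< 250 Resetting the state.
> QUIT
"""


def group_messages(log: str):
    """Group log lines into (direction, reply_code, lines) messages."""
    messages = []
    pending = []  # lines of a multi-line server reply collected so far
    for line in log.splitlines():
        if not line.strip():
            continue
        direction, text = line[0], line[2:]
        if direction == ">":
            # Each client command is a single-line message.
            messages.append(("client", None, [text]))
            continue
        pending.append(text)
        # "250-..." announces more lines; "250 ..." closes the reply.
        match = re.match(r"(\d{3})([ -])", text)
        if match and match.group(2) == " ":
            messages.append(("server", int(match.group(1)), pending))
            pending = []
    return messages


messages = group_messages(LOG)
print(len(messages))                                 # 10
print([m[1] for m in messages if m[0] == "server"])  # [220, 250, 250, 550, 250]
```

The reply code 550 that follows the RCPT TO command is the one that tells the checker the mailbox does not exist.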
Bulk Verifier can perform both a complete (but slow) two-stage check of e-mail address availability and a high-speed check that involves only the 1st stage (DNS server request). For details, see the sections “Fast Mode - High Speed Bulk Verifier” and “Deep (slow) mode of Bulk Verifier”. Bulk Verifier is software for verifying e-mail addresses and cleaning dead addresses out of a mailing list.
Bulk Verifier is a powerful e-mail checking tool for verifying your customers’ e-mail addresses taken from your mailbox or contact files. It can process both plain lists of e-mail addresses or domains, where each line contains one item, and files of a more complex structure, where each line is a multi-field record of the same structure (i.e. containing the same fields separated by the same delimiter). For example, you can export a worksheet from an MS Excel file to check the availability of the e-mail addresses or domains listed there. Each line of an input file is expected to contain one e-mail address and/or one domain. Bulk Verifier can perform several checks on an e-mail address, including syntax, DNS MX lookup, top-level domain name validation, etc.
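As an illustration of the multi-field input format, the following sketch pulls the address field out of delimited records using Python’s standard csv module. The delimiter, the column position and the function name extract_addresses are assumptions chosen for the example; the sample records are made up.

```python
import csv
import io


def extract_addresses(text: str, delimiter: str = ";", column: int = 0):
    """Yield the e-mail/domain field from each multi-field record."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        # Skip records that are too short or have an empty field.
        if len(row) > column and row[column].strip():
            yield row[column].strip()


# Hypothetical export: name; e-mail; city (one record per line).
sample = "John Smith;john@example.com;New York\nAnn Lee;ann@example.org;Boston\n"
print(list(extract_addresses(sample, delimiter=";", column=1)))
# ['john@example.com', 'ann@example.org']
```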
Bulk Verifier stores domain check results in an internal cache. If another e-mail address from the same domain is found in the same mailing list, Bulk Verifier does not query the DNS server again but uses the result from the cache. The cache size is limited only by the memory size of your computer. It takes 40 bytes of memory to store the result of one domain check, so storing the results for one million different domains takes 40 MBytes. The time needed to find a previous check result in the cache is practically independent of the cache size.
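The caching idea can be sketched with a hash table keyed by domain, which is one common way to get lookup times that barely depend on the number of stored entries. The class and function names below are hypothetical, and the sketch does not claim to reproduce Bulk Verifier’s internal data structure; in particular, a Python dictionary uses considerably more than 40 bytes per entry.

```python
class DomainCache:
    """Remember the result of each domain check so it is done only once."""

    def __init__(self, lookup):
        self._lookup = lookup    # function: domain -> check result
        self._results = {}       # hash table: domain -> cached result
        self.lookups_made = 0    # how many real lookups were needed

    def check(self, address: str):
        # Accepts either a bare domain or a full e-mail address.
        domain = address.rsplit("@", 1)[-1].lower()
        if domain not in self._results:
            self.lookups_made += 1
            self._results[domain] = self._lookup(domain)
        return self._results[domain]


def estimated_cache_bytes(domains: int, bytes_per_entry: int = 40) -> int:
    """Memory estimate using the 40 bytes per domain stated in the Guide."""
    return domains * bytes_per_entry


# Stand-in lookup used only for the demonstration.
cache = DomainCache(lambda domain: domain.endswith(".com"))
for address in ["a@mail.com", "b@MAIL.com", "c@example.org"]:
    cache.check(address)
print(cache.lookups_made)                # 2
print(estimated_cache_bytes(1_000_000))  # 40000000
```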
The quality of the DNS server list used by Bulk Verifier (Options\DNS) also strongly influences the application’s performance. If Bulk Verifier does not receive a response from a DNS server within the specified period of time (Options\Timeout, in seconds), it makes new attempts, each time using another DNS server from the list. If all these attempts fail, the e-mail address is listed as not checked due to the connection timeout. The longer the list of DNS servers available to Bulk Verifier, the lower the probability that a couple of DNS servers with operating problems will affect the application’s performance.
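The retry behaviour described above can be sketched as a loop over the configured servers. The function name query_with_failover and the injected query callable are hypothetical, and the sketch assumes that each server in the list is tried once, in order; the Guide does not specify the exact number of attempts.

```python
def query_with_failover(domain, dns_servers, query):
    """Try each DNS server in turn; return the first answer received.

    `query(domain, server)` is expected to raise TimeoutError when the
    server does not answer within the configured timeout.
    Returns None when every attempt timed out ("not checked").
    """
    for server in dns_servers:
        try:
            return query(domain, server)
        except TimeoutError:
            continue  # move on to the next server in the list
    return None


# Stand-in query function: the first server never answers.
def fake_query(domain, server):
    if server == "192.0.2.1":
        raise TimeoutError
    return "mx." + domain


print(query_with_failover("example.com", ["192.0.2.1", "192.0.2.2"], fake_query))
# mx.example.com
```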
Bulk Verifier is a multi-threaded application. You can define up to 600 threads to be used simultaneously (one thread checks one e-mail address/domain).
Please note that using the maximum number of threads is not always the best choice. For example, if you use 600 threads, the application checks 600 domains at the same time, sending up to 15,000 requests to DNS servers per minute. The traffic may then amount to 700 kbps. A DNS server’s software may regard this as a hacker attack and block you.
It is also possible that a DNS server can process only a limited number of requests per second from the same address and ignores the rest, to ensure that other users have enough resources to work with the server. In this case the application’s throughput decreases significantly, since some addresses are checked repeatedly after previous attempts failed due to timeouts.
Thus, if your network connection can support more than 50 threads, you should configure Bulk Verifier with about one DNS server (Options\DNS) per ten threads. In this case you can be sure that the servers will not fail because of overload.
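The rule of thumb of one DNS server per ten threads, and the request rate quoted for 600 threads, amount to simple arithmetic. The helper names below are hypothetical, and rounding up to at least one server is an assumption.

```python
import math


def recommended_dns_servers(threads: int, threads_per_server: int = 10) -> int:
    """About one DNS server per ten threads, and never fewer than one."""
    return max(1, math.ceil(threads / threads_per_server))


def requests_per_thread_per_minute(total_per_minute: int, threads: int) -> float:
    """Average number of DNS requests each thread sends per minute."""
    return total_per_minute / threads


print(recommended_dns_servers(600))                 # 60
print(requests_per_thread_per_minute(15_000, 600))  # 25.0
```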
Multi-threaded applications behave differently under different operating systems of the Windows family. Windows XP copes perfectly with 600 processing threads, with only an insignificant increase in processor load. Older operating systems (e.g. Windows 98, Windows NT4) are more sensitive to a large number of threads, and even a hundred threads may lead to a considerable processor load. We recommend using Bulk Verifier on computers running Windows XP to reach the application’s maximum performance.
In fast mode, Bulk Verifier is able to process mailing lists containing tens of millions of e-mail addresses at a speed of several thousand addresses per second. To switch to this mode, UNcheck the option “Advanced e-mail check using SMTP” in the Bulk Verifier Options dialog (see also the section “Bulk Verifier interface and options”).
Working in fast mode, Bulk Verifier detects about 25-30% of the unavailable e-mail addresses in a mailing list. These figures may seem weak, since theoretically up to 70% of unavailable e-mail addresses in a list can be detected by software, but in practice these 30% can amount to 10% of the whole mailing list, which is quite significant.
A more precise check, which detects another 40% of unavailable e-mail addresses, is available in the deep mode. Keep in mind, however, that the deep check may sometimes take 10 times more time and 5 times more network traffic, which often makes it impractical for large e-mail lists.
In fast mode, Bulk Verifier uses only the DNS request stage to check e-mail address availability. During an e-mail address availability check the following actions are executed:
If the initial e-mail address is syntactically incorrect or its top-level domain is not found in the file Bulk Verifier.tld, the address is regarded as invalid. No further processing is performed for this address.
In the deep (slow) mode, Bulk Verifier performs a complete two-stage check of e-mail address availability. To switch to this mode, check the option “Advanced e-mail check using SMTP” in the Bulk Verifier Options dialog (see also the section “Bulk Verifier interface and options”).
The first stage of the check is exactly the same as the one used in fast mode: the application obtains the mail server address for an e-mail address from DNS. See the section “Fast Mode - High Speed Bulk Verifier” for more details.
If the mail server address is obtained successfully, the second processing stage starts. Bulk Verifier attempts to connect to this mail server and emulate sending a message. No message is actually sent during the e-mail availability check. Bulk Verifier establishes the connection with the mail server, sends a “hello message”, transmits the sender’s address (Options\Sender) as if a message were about to follow, and then transmits the addressee’s mailbox address (the e-mail address being checked). As soon as the receiving server confirms or denies the availability of the requested mailbox, Bulk Verifier disconnects.
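The second stage can be approximated with Python’s standard smtplib module, as in the sketch below. The function names, the default timeout and the three-way classification of reply codes are assumptions made for illustration; this is not Bulk Verifier’s own implementation. Keep in mind that, as explained earlier, some mail services accept every address, so an “accepted” result does not prove that the mailbox exists.

```python
import smtplib


def classify_rcpt_reply(code: int) -> str:
    """Map the reply code for RCPT TO onto a simple verdict."""
    if 200 <= code < 300:
        return "accepted"   # e.g. 250: the server agrees to take the message
    if 500 <= code < 600:
        return "rejected"   # e.g. 550: user unknown
    return "unknown"        # e.g. 4xx: temporary failure, try again later


def probe_mailbox(mail_server: str, sender: str, recipient: str,
                  helo_name: str, timeout: float = 30.0) -> str:
    """Emulate the start of a delivery; no message body is ever sent."""
    with smtplib.SMTP(mail_server, 25, local_hostname=helo_name,
                      timeout=timeout) as smtp:
        smtp.helo()                      # the "hello message"
        smtp.mail(sender)                # MAIL FROM:<sender>
        code, _ = smtp.rcpt(recipient)   # RCPT TO:<recipient>
        smtp.rset()                      # abandon the transaction
    # Leaving the "with" block sends QUIT and closes the connection.
    return classify_rcpt_reply(code)


print(classify_rcpt_reply(550))  # rejected
```

A call such as probe_mailbox(mail_server, sender, recipient, helo_name) needs network access and a mail server name obtained in the first stage, so it is not executed here.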
The Bulk Verifier interface is simple and intuitive. There are two windows in the application: the main window and the Options dialog.
In the Bulk Verifier main window you can indicate the following parameters:
For this mode you can indicate the following additional parameters:
The section Statistics shows the current processing results.
The section Log reflects the processing progress; a log file can also be created in the specified path. These features slow down the processing, so they are disabled by default. To enable them, check the option Enable above the log window.
To set DNS, e-mail, proxy and other parameters, open the Options dialog by pressing the “Options” button in the main window toolbar.
The following fields are available here:
· DNS – the address(es) of the DNS server(s) you want to use during the check. You can indicate several addresses (each on a separate line). If the first DNS server in the list does not respond, the second one is queried, and so on. This slows down the processing but increases its accuracy.
· E-mail – the section where you can indicate the sender’s attributes used when emulating the sending of test messages (SMTP ID, sender address). Please note that these settings are used only in the deep (slow) mode of high-quality check. See the section “Deep (slow) mode of Bulk Verifier”.
· Threads – the number of simultaneously available processing threads, which defines the number of e-mail addresses checked at the same moment.
· “No relay” is not error – do not consider the DNS response “No relay” as a sign of domain invalidity.
· Advanced e-mail check using SMTP – run Bulk Verifier in the deep (slow) mode of high-quality check. See the section “Deep (slow) mode of Bulk Verifier”.
· Proxy – the section where you can set your proxy parameters (Server address, Authentication method, Port, Username/Password, etc.) if you use one.
Bulk Verifier writes the processing results to several files placed in the Output folder specified in the application Options:
For example, if the input file name was
“Master.txt”, after the processing you may get:
Master.ses
Master.invalid.domains.txt
Master.valid.domains-and-emails.txt
Etc.
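The naming pattern in this example (the input file’s base name followed by a fixed suffix) can be sketched as follows. Only the three suffixes shown above are used; the function name output_names and the folder name are hypothetical, and the sketch makes no claim about the remaining files or about what each file contains.

```python
from pathlib import Path

# Only the suffixes that appear in the example above.
KNOWN_SUFFIXES = [
    ".ses",
    ".invalid.domains.txt",
    ".valid.domains-and-emails.txt",
]


def output_names(input_file: str, output_folder: str):
    """Derive result file paths from the input file's base name."""
    stem = Path(input_file).stem  # "Master.txt" -> "Master"
    return [Path(output_folder) / (stem + suffix) for suffix in KNOWN_SUFFIXES]


for path in output_names("Master.txt", "Output"):
    print(path.name)
```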