MarketCom's MktAgent v1.28

The InterNet Publicity Engine


MarketCom's MktAgent, webwalker and wanderer, is often but erroneously referred to as a "robot". Written in 100% pure Java, it is one of an entirely new breed of user's agents. Executing on your desktop system, it allows inspection of thousands upon thousands of homepages, taking action based upon the content that was observed.

MktAgent can generate tremendous website traffic almost overnight. When thousands of netizens become aware of a particular site, the word-of-mouth continues to bring visitors for a very long time. It is this initial "critical mass" that MktAgent was originally designed to attract.

But, MktAgent is also much more than that. By carefully planning your MktAgent "Ad Campaign", you can directly generate TARGETED marketing lists containing literally hundreds of thousands of potential customers. All on your Windows-95/NT system.

Given a starting URL (an HTTP address), MktAgent will search that site for any hyperlinks, and follow them, from there to anywhere. Along the way, it'll find Email addresses embedded within HTML pages. In effect, MktAgent crawls from site-to-site, locating and pushing people's "mail-ME" buttons.

MktAgent does NOT skim newsgroups, databases, and/or private subscriber lists. It locates addresses on published, accessible WWWeb-pages, and sends just a single message to each recipient. MktAgent will contact those people who have invited this comment by displaying their Email coordinates in a "please-mail-to-me" context (i.e. a MAILTO button on a public page).


MktAgent's Menus and Threads




Theory of Operation

As described above, MktAgent is designed to roam the World Wide Web looking for, contacting, and remembering Email Addresses. It contains various sub-programs that perform specific tasks. A few of the most important ones are described briefly here (and are covered in detail in subsequent sections).

Basically, you envision a MktAgent "Ad Campaign", and based on this vision, prepare a message that you believe will draw people to your business. Then, you search the web for people who are interested in what you have to say. Having found them, you send each a single copy of your Electronic Advertisement. Continuing in this manner, you can easily locate and contact hundreds of thousands of potential clients.

MktCrawler is the HTTP-crawling thread, that scours the WWW for published Email Addresses. It will check every address found against MktAgent's ever-growing lists and report all previously unencountered addresses to MktAgent. Any new addresses will be placed on MktAgent's Pending List.

MktMailer is the SMTP-mailing thread, which sends a message to each address found by MktCrawler. It obtains its addresses from MktAgent's Pending list, and will save all "sent" addresses in MktAgent's Sent list and in MktAgent's Email Address DataBase.

MktMailDB is similar to MktMailer in that it is another SMTP-mailing thread. But, MktMailDB is capable of remailing all addresses that are stored in MktAgent's DataBase.



MktAgent - Main Menu

The MktAgent menu has the following three (3) selections:

About - displays info about MktAgent, including its license agreement
Setup - invokes the MktSetup dialog to modify MktAgent's configuration
Exit - causes MktAgent and all of its threads to quit



MktSetup - Configuration Dialog

The MktSetup dialog box allows you to configure MktAgent. You can select this option from MktAgent's main menu at any time. MktSetup allows you to specify which sites/addresses are excluded from being searched by MktCrawler, and it lets you define the message headers and body that will be sent by MktMailer and/or MktMailDB.

Exclusion Lists Setup
The bottom half of the MktSetup dialog box allows you to control what types of sites and addresses will be ignored by MktCrawler. There are three (3) lists of exclusions:

Excluded Sites - domains that'll be skipped
Excluded Names - stock names that'll be avoided
Excluded Addrs - individual Email addresses that'll be ignored
These three lists are supplied containing those entities which MarketCom has already encountered. Any new MktAgent user should continue to update these, based on her/his experiences.

Excluded Sites is a list of domain names that MktCrawler will NOT add to its search list. It is impossible to crawl over anything in this list. Some large sites (such as search engines) do not have many MAILTO buttons and are better ignored. Other sites have 1000's of such buttons, and will get aggravated the first time MktCrawler visits them. No problem, just drop their domain name into the Excluded Sites list. After all, everybody in that site has already received the message.

Excluded Names is a list of stock names, things like WEBMASTER or SYSOP. Usually, whoever made the pages at a given site has some catch-all name on many pages, but their personal address will appear inside somewhere. Skipping these ADMIN-type names helps to avoid any duplication which can occur from sending to SUSAN@host.com and to WEBMISTRESS@host.com. This exclusion list is also a good way to avoid mail lists. Often list server names are identifiable (like MajorDomo, ListServ, or containing -list or somesuch). If a mailing list gets hit, you'll hear about it, so just put its name on the Excluded Names list.

Excluded Addresses is the most specific of all. It lists individuals whose addresses should be ignored. This includes complainers, discussion lists, basically anyone whose feedback indicates that they don't want to hear about it. Sometimes lists slip in, when they've been given a non-list looking name (e.g. BAPA - Bay Area Pinball Association, real big complainers).

NOTE:
All three exclusion lists treat their entries as FRAGMENTS of names, links, etc. That is, putting .MIL in any list will cause MktCrawler to ignore anything which contains .MIL in its URL and/or Email address.
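
The fragment rule in the note above can be sketched in a few lines of Python. This is only an illustration, not MktAgent's actual code: the function name is_excluded is made up for this example, and case-insensitive matching is an assumption (this manual does not say how case is handled).

```python
def is_excluded(candidate: str, fragments: list) -> bool:
    """Return True if any exclusion fragment occurs anywhere in candidate.

    Assumptions (not stated in the manual): matching ignores case,
    and empty entries in a list are skipped.
    """
    text = candidate.upper()
    return any(frag.upper() in text for frag in fragments if frag)


# A fragment matches wherever it appears, not only at the end of a name.
print(is_excluded("http://www.navy.mil/index.html", [".MIL"]))
print(is_excluded("user@host.milan.example", [".MIL"]))
print(is_excluded("susan@host.com", ["WEBMASTER", ".MIL"]))
```

Because entries are fragments, a short entry can exclude more than intended. In this sketch .MIL also matches host.milan.example, so longer, more specific fragments are the safer choice.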

Email Message Setup

The bottom half of the MktSetup dialog box allows you to configure the message that will be sent by MktMailer and/or MktMailDB. The following settings are available:
Return Address
the address of the message's sender, as it will appear on the "envelope".
Precedence
checkbox which allows selection of the Precedence header, where "normal" causes no such header to be emitted.
Priority
checkbox which allows selection of the Priority header, where "normal" causes no such header to be emitted.
Anonymize
checking this box will make the Return Address the same as the Recipient's address.

Use of this feature is encouraged only in very special circumstances.

When messages are anonymized in this way, they will pass through almost every Email server and/or client filter. Also, when bouncing, they will end-up in the SMTP server's ROOT mailbox. Some people find this very distressing (to have received a message from "themselves"), and as a consequence, start complaining loudly. When employing this mode, users would be well-advised to connect to an SMTP server that fails to reveal the sender's true IP address. There are many such services, easily locatable with your favourite search engine.
Subject Line
the subject of the message, as it will appear on the "envelope".
Message Body
the message's actual contents. This text area can contain HTML, MIME, extra headers, and so on; so, be creative.
NOTE:
By providing the precedence header, operators can choose "junkmail," "normal," "first class," or "urgent," and thereby make it possible for irritated SysOps to filter out MktAgent's messages, and at the same time, for the curious, open-minded ones to gather impact statistics. Almost all PostMasters appreciate the use of this header.



MktLists - RunTime List Manager

MktLists allows you to manipulate the contents of MktAgent's Pending and/or Sent lists. Access to this execution thread is via the MktLists menu. It has the following 3 selections:

Stop List Operation
Abandon a long in-progress List operation.
Load File into Pending
Read the contents of a flat text file into MktAgent's Pending list.
This file must contain only a single Email address per line, or a name and address, where the address is enclosed in angle brackets (e.g. somebody <username@host.com>).
Dump Both Lists
Empty out both the Pending list and the Sent list.
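
The flat file format described under "Load File into Pending" (a bare address per line, or a name followed by the address in angle brackets) can be read with a small helper. This is a sketch for illustration only: parse_address_line is a hypothetical name, and skipping blank lines is an assumption.

```python
from typing import Optional


def parse_address_line(line: str) -> Optional[str]:
    """Extract the address from one line of a flat address file.

    Accepts either 'username@host.com' or 'somebody <username@host.com>'.
    Blank lines yield None (an assumption; the manual does not cover them).
    """
    line = line.strip()
    if not line:
        return None
    start = line.find("<")
    end = line.find(">", start + 1)
    if start != -1 and end != -1:
        # Name-and-address form: keep only what sits inside the brackets.
        return line[start + 1:end].strip()
    # Bare form: the whole line is the address.
    return line


print(parse_address_line("somebody <username@host.com>"))
print(parse_address_line("username@host.com"))
```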



MktAgentDB - Address DataBase Manager

MktAgentDB allows you to examine and manipulate the contents of MktAgent's DataBase. This database is optimized to contain Email Addresses, which can have one of the following three states:

Sent
This address has been SENT a message.
Pending
This address has not been sent a message (i.e. it is PENDING).
Removed
Address will NEVER be sent any messages (i.e. it is REMOVED from all future mailings). Note that once an address reaches this state, there is no turning back. It won't ever be collected or emailed ever again.
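
The three states, and the rule that Removed is final, can be modelled as below. This is a minimal sketch under stated assumptions: the State enum and the mark function are invented for this example and say nothing about how MktAgentDB is actually stored.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    SENT = "sent"
    REMOVED = "removed"


def mark(current: State, requested: State) -> State:
    """Return the state an address ends up in after a 'mark as' request.

    Removed is terminal: once an address is Removed, no later request
    can bring it back to Pending or Sent.
    """
    if current is State.REMOVED:
        return State.REMOVED
    return requested


print(mark(State.PENDING, State.SENT))
print(mark(State.REMOVED, State.PENDING))
```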


MktAgent's database can be manipulated via the MktAgentDB menu, which has the following 13 selections:

Launch MktFilter
Startup MktFilter, the pattern matching database filtering utility, which is described below.
Compute DB Statistics
Scan the entire DataBase counting up the number of addresses that are Sent, Pending, and Removed.
Stop DB Operation
Abandon a long in-progress DataBase operation.
Import Sent File to DB
Load the contents of a flat textfile into the Sent DataBase.
This file must contain only a single Email address per line, or a name and address, where the address is enclosed in angle brackets (e.g. somebody <username@host.com>).
Export Sent DB to File
Write all addresses in the Sent DataBase out to the textfile of your choice (one address per line).
Mark all Sent as Pending
Run through the entire Sent DataBase, changing all addresses' states to Pending.
Mark all Sent as Removed
Run through the entire Sent DataBase, changing all addresses' states to Removed. Exercise caution in the use of this feature, as it can easily remove your entire database.
Import Pending File to DB
Load the contents of a flat textfile (one address per line) into the Pending DataBase.
This file must contain only a single Email address per line, or a name and address, where the address is enclosed in angle brackets (e.g. somebody <username@host.com>).
Export Pending DB to File
Write all addresses in the Pending DataBase out to the textfile of your choice (one address per line).
Mark all Pending as Sent
Run through the entire Pending DataBase, changing all addresses' states to Sent.
Mark all Pending as Removed
Run through the entire Pending DataBase, changing all addresses' states to Removed. Since this feature can set the remove state on large numbers of addresses, some bit of care should be taken in its use.
Import Removed File to DB
Load the contents of a flat textfile (one address per line) into the Removed DataBase.
This file must contain only a single Email address per line, or a name/address, where the address is enclosed in angle brackets (e.g. some name <s.name@host.com>).
Export Removed DB to File
Write all addresses in the Removed DataBase out to the textfile of your choice (one address per line).



MktFilter - Pattern Matching DB Filter

Launching MktFilter

Once MktFilter has been invoked (via its selection from the MktAgentDB menu), there are two values which must be entered, to define what pattern will influence the selection of Email addresses:

Operation
The pull-down list has six possible values (CONTAINS, DOESN'T CONTAIN, ENDS WITH, DOESN'T END WITH, STARTS WITH, or DOESN'T START WITH), indicating where in the address the given pattern should appear (or not appear).
Pattern
Whatever character string is entered into the text-field will be looked for in every address (at a position dependent upon the Operation field, see above). If found, that address will be affected by MktFilter. Leaving this field empty will cause MktFilter to select EVERY address.

Once the filter pattern has been chosen, MktFilter's database filtering operations can be initiated by using one of the five (5) buttons:

Count
Addresses which match the pattern are counted, and the totals are displayed in the status line upon completion. It's a good idea to run this first, to verify how many addresses will be affected by the other buttons.
Sent
All addresses which match the pattern are marked as having been SENT a message. Note that this does not affect whether or not this address is REMOVED.
Pending
Matched addresses are marked as not having been sent a message (i.e. they are PENDING). Again, this will not affect the addresses' REMOVED state.
Removed
Selected addresses will NEVER be sent any messages (i.e. they are REMOVED from all future mailings). Be careful when using this option. It can remove your entire database at the press of a button. On the other hand, it makes removing whole classes of addresses quite easy.
Export
Addresses that are matched will be written out to the flat file of your choice (one address per line).
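
The six Operation values and the empty-pattern rule described above map onto ordinary string tests. The sketch below is illustrative only: OPERATIONS, matches, and select are hypothetical names, and case-sensitive comparison is an assumption, since this manual does not say whether MktFilter ignores case.

```python
# Each Operation name from the pull-down list, mapped to a string test.
OPERATIONS = {
    "CONTAINS": lambda addr, pat: pat in addr,
    "DOESN'T CONTAIN": lambda addr, pat: pat not in addr,
    "ENDS WITH": lambda addr, pat: addr.endswith(pat),
    "DOESN'T END WITH": lambda addr, pat: not addr.endswith(pat),
    "STARTS WITH": lambda addr, pat: addr.startswith(pat),
    "DOESN'T START WITH": lambda addr, pat: not addr.startswith(pat),
}


def matches(address: str, operation: str, pattern: str) -> bool:
    """Apply one filter operation to one address."""
    if pattern == "":
        # Per the manual, an empty Pattern selects EVERY address.
        return True
    return OPERATIONS[operation](address, pattern)


def select(addresses, operation, pattern):
    """Return the addresses a filter run would act on, in order."""
    return [a for a in addresses if matches(a, operation, pattern)]


addresses = ["susan@host.com", "webmaster@host.com", "bob@example.org"]
# Equivalent of pressing Count before any of the other buttons.
print(len(select(addresses, "ENDS WITH", ".com")))
```

Running the equivalent of Count first, as recommended above, is cheap insurance before a Removed operation, which cannot be undone.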



MktCrawler - MultiThreaded HTTP Crawler

Launching MktCrawlers

Probably the main work component of MktAgent is its webwalking spider called MktCrawler. This is a traditional "web-crawler", in that it is given a webpage to start from, and it examines this page for links to other pages/sites and follows them. To launch a MktCrawler, just enter an HTTP-type URL (Uniform Resource Locator, something like http://www.xyzzy.web) into the Starting URL field at the top of the MktAgent window; then, press the launch MktCrawler button just below it. This will start one MktCrawler directed towards whatever website that URL points to.

You'll see a MktCrawler box appear, and links to other sites and other pages within the same site will begin appearing in its two list boxes. The upper box lists site names, the lower box lists specific pages within those sites. MktCrawler will proceed to search pages within a site on a first-seen, first-searched basis, and will continue examining pages until it has picked up Search Depth pages. Obviously, you can control how deep into a site MktCrawler will go, by changing this value before launching MktCrawler.

Search Depth is an important parameter. By setting it quite low (less than 10), you're basically conducting a "breadth-wise" search. That is, you're looking at the front few pages of each site. This is great for scanning over large lists of links or lots of so-called vanity domains.

On the other hand, you can set this quite high (or put it on ZERO to cause MktCrawler to examine every page in every site). This type of crawling is referred to as "depth-wise" crawling. Deep inspection of sites can be very useful when a MktCrawler is started from a list of all user pages at a large domain. Most Internet Service Providers (ISPs) publish exactly such a page, with links to every one of their clients' sites. Full deep searching of such a list will often yield the Email address of every published account on their system.
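
The effect of Search Depth on visiting order can be illustrated with a first-seen, first-searched queue over a small in-memory link map. This is a simplified sketch, not MktCrawler's code: it models the pages of a single site only, performs no network access, and the names visit_order and links are made up for the example.

```python
from collections import deque


def visit_order(start, links, search_depth):
    """Return pages in first-seen, first-searched order.

    links maps each page to the pages it links to (hypothetical data).
    search_depth is the page budget; ZERO means no limit.
    """
    seen = {start}
    queue = deque([start])
    visited = []
    while queue:
        if search_depth and len(visited) >= search_depth:
            break  # page budget used up
        page = queue.popleft()
        visited.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


links = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}
print(visit_order("a", links, 3))  # low value: front pages only
print(visit_order("a", links, 0))  # ZERO: every reachable page
```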

Lastly, there are the Pause and Abort buttons on each MktCrawler, which do exactly what you'd expect. Pause asks MktCrawler to suspend searching as soon as it finishes the current page. Press it again and searching will resume. Abort will cause MktCrawler to exit, but only after it makes sure that's what you really want.

NOTE:
You can start as many MktCrawler threads as your system can handle. The number varies greatly, depending on your processor speed, your memory size, and the speed of your Internet connection. Even the smallest, slowest systems can usually handle 3 or 4. Pentium based (and faster) machines, with large memory spaces and ISDN connections, can easily execute 10 or more MktCrawlers.

Controlling individual MktCrawlers

Overall, MktCrawler is fairly simple to use; and, while it can run unattended, it is much more efficient, if its human operator helps out. While crawling (and possibly mailing), the user can hit the Slide and Float buttons to help MktCrawler to skip over files or entire sites that are not of interest. Slide skips only the currently inspected file. Float is more drastic, and abandons the entire site.

It can be quite interesting and instructive to "drive" the MktCrawler in this way. The driver will quickly notice where to find "link farms", guest books, or contact lists, trends in what links to what, sites to exclude for various reasons, stock names to avoid, difficult spaces, "black holes", all manner of hitherto unknown aspects of the WWW virtual terrain.

Controlling ALL MktCrawlers

You can use the selections on the MktCrawler menu, to Pause and/or Abort all currently executing MktCrawler threads. The Pause selection will toggle the "pause" state; that is, the first click will pause everybody, the next will restart them. Note also that neither Pause nor Abort takes effect immediately. Both selections pass a request to all threads to perform the operation as soon as it becomes possible to do so (i.e. as soon as the currently waiting connection is complete).

What happens when MktCrawler finds an Email Address

Whenever MktCrawler locates an address, lots of things happen. First of all, that address is placed in a queue to be inspected by MktAgent. When it gets a chance, MktAgent will check that address against the Excluded Names and the Excluded Addresses lists. If it's not on those lists, then that address is searched for in the DataBase. If it's not there either, then it's considered a new address and placed on the Pending list, and entered into the MktAgentDB as a Pending mailbox.
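
The checks described above (exclusion lists first, then the DataBase, then the Pending list) amount to a short de-duplication routine. The sketch below is illustrative only: handle_found_address is a hypothetical name, a plain dict and list stand in for MktAgentDB and the Pending list, and lower-casing addresses before comparison is an assumption.

```python
def handle_found_address(address, excluded_fragments, database, pending):
    """Decide whether a located address counts as new.

    Returns True if the address was added as Pending, False if it was
    excluded or already known. Stand-in data structures, not MktAgentDB.
    """
    key = address.strip().lower()
    # Step 1: Excluded Names / Excluded Addrs entries are fragments.
    if any(frag.lower() in key for frag in excluded_fragments if frag):
        return False
    # Step 2: anything already in the database is not new.
    if key in database:
        return False
    # Step 3: record it as Pending in both places.
    database[key] = "Pending"
    pending.append(key)
    return True


database = {"old@host.com": "Sent"}
pending = []
print(handle_found_address("New@Host.com", ["webmaster"], database, pending))
print(handle_found_address("old@host.com", ["webmaster"], database, pending))
```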

If you don't plan to use MktMailer to send to these addresses right now, that's OK. They will be in the Pending section of the MktAgentDB for mailing later by using MktMailDB (see below).

Using MktCrawler with your favourite Search Engine

To improve the effectiveness of your MktAgent Ad Campaign, you can use any public search engine to help locate sites that are within your target audience. Just run whatever web browser you like and visit whatever search tool you prefer (no recommendations from us on either of these thorny issues). Refine and revise your search until you're viewing a search engine link page that has lots of links that look good. Copy the long, squiggly, CGI-type URL for that page from your browser's URL field. Then paste that URL in to MktAgent's Starting URL field and start a MktCrawler. It'll search all those sites for you, and the ones they link to, and on and on.

Remember to set the Search Depth field to a value large enough to accommodate however many links your search engine will return. HotBot and Excite, for example, can return 100 links per page, and will return a maximum of 1000 links for a given search term. You can start 10 MktCrawlers towards each of those 10 lists of 100 links and build a LARGE, TARGETED list quickly.

The default .INI file, that MktAgent is shipped with, is set up to exclude crawling over all the "side-sites" that hang off of both HotBot and Lycos. These two search engines allow one to fetch large pages of links (50 - 100 pages long), and will return ten pages for each search. By excluding the "side-sites", MktAgent does not get bogged down trying to get pages from the bells & whistles (e.g. stocks.hotbot.com or personal.lycos.com); but rather, proceeds immediately to the real work of searching the list you've located.



MktMailer - MultiThreaded SMTP Mailer

Launching MktMailers

Once MktCrawler has located new addresses and placed them in MktAgent's Pending list, it's time to startup a MktMailer or two. Assuming that you've used MktSetup to configure the message and its envelope, just enter the name of your SMTP server into the SMTP Port field in MktAgent's window and press the launch MktMailer button right next to it.

You'll see a MktMailer box appear, and addresses will begin to disappear from MktAgent's Pending list and reappear on its Sent list. Those addresses have received your Electronic Advertisement.

The Batch Size field controls how many messages MktMailer will send in each connection to the SMTP daemon. Setting this to a high number (like 100) will result in somewhat better performance, since MktMailer won't have to wait as often to establish a connection. For unstable or slow mail servers, or when using the Anonymize feature (see the MktSetup section for more info on anonymizing your message), you should set this value very low, even as low as one (1).

The BCC Count field controls how many (if any) recipients will be included in a Blind Carbon Copy list. If this field's value is ZERO (the default), then no BCC list will be created. That is, for each recipient address, an individual message will be constructed and sent.

On the other hand, setting BCC Count to a value greater than ZERO instructs MktMailer to combine that many recipients into a Blind CC-List and send a bulk message. This message will be addressed "To" and "From" whatever return address you have configured, and all of the actual recipients will appear in the BCC-List (this ensures that no copied recipient will be aware of any of the others).

There are advantages and disadvantages to each approach. Individual Emails have a tendency to be more deliverable, for the simple fact that many mail servers have filters that will delete copy-listed mailings. Individual mailings are significantly slower however, since a new message must be constructed for each recipient.

By contrast, BCC mailings will run much faster. If you use a BCC Count of 50 (maximum is 100), you can expect to send about 30-40 times more messages than an individual mailing would have. Additionally, you would receive one message out of every 50 to your return address, so you can know for sure that your Electronic Advertisement is being sent. This can be quite important, since so many of today's SMTPs and POPs apply fatal filtering (i.e. they delete, bounce, or otherwise fail to deliver messages that meet certain criteria; that is, they are censoring your Email).

Aggressive logging, monitoring, and filtering of Email can be such a problem for Electronic Advertisers, that MktAgent includes the MktScanner for finding and testing SMTP server daemons.

Lastly, you can monitor MktMailer's progress by checking the Sent Count and the Fail Count, which reflect exactly the totals that their names describe.

Controlling individual MktMailers

MktMailer has two buttons that function exactly like MktCrawler's. They are Pause and Abort, and again do exactly what you'd expect. Pause asks MktMailer to suspend sending as soon as it finishes the current message. Press it again and mailing will resume. Abort will cause MktMailer to exit, but only after it makes sure that's what you really want.

Controlling ALL MktMailers

You can use the selections on the MktMailer menu, to Pause and/or Abort all currently executing MktMailer threads.

The Pause selection will toggle the "pause" state; that is, the first click will pause everybody, the next will restart them. Note also that neither Pause nor Abort takes effect immediately. Both selections pass a request to all threads to perform the operation as soon as it becomes possible to do so (i.e. as soon as the currently waiting connection is complete).

Sending a Test Message

You can send a test message, by using the selection on the MktMailer menu. Choosing this menu item will cause the message you have configured in the MktSetup window to be sent out through the SMTP server named in the MktAgent window. This allows you to verify that the mailer daemon you have chosen will safely relay your message, and that the envelope will appear exactly as you want it. For more information on locating usable Email servers, see the discussion of MktScanner below.

NOTE: It's a good idea to send a test Email prior to every mailing session.



MktMailDB - Address DataBase ReMailer

Launching MktMailDBs

The startup process is the same for MktMailDB as for MktMailer. The only difference between these two close-relatives is the source of the addresses that will be sent to. In this case, each MktMailDB thread will compete to find Pending addresses in the DataBase. As they are sent to, they are marked as Sent. Once the entire DataBase has been sent to, all MktMailDBs will exit.

You can start as many MktMailDB threads as your system can handle. The number varies greatly, depending on your processor speed, your memory size, the speed of your Internet connection, and the speed of the SMTP server that you pointed the MktMailDB to.

NOTE:
You can use selections on the MktAgentDB menu to scan the entire DataBase and toggle the state of Pending or Sent addresses. This allows you to reset the whole DataBase for remailing with a different message. Consult the MktAgentDB section above for more info.

Controlling individual MktMailDBs

Just like its cousin MktMailer, MktMailDB has Pause and Abort buttons, and they work exactly the same.

Controlling ALL MktMailDBs

You can use the "Pause MktMailDB(s)" and "Abort MktMailDB(s)" selections on the MktMailer menu, to Pause and/or Abort all currently executing MktMailDB threads.



MktScanner - useful SMTP locator/tester

MktScanner is a feature unique to MktAgent, and is invoked via its selection from the MktMailer menu. To run, it requires that you supply a Return Address in the MktSetup window, which it will use as the destination address for its test messages.

Once started, it will run through the MktAgentDB, extracting the hostnames from the Email addresses that're saved therein. MktScanner will attempt to connect to every unique mail server named in the MktAgentDB. If successful, it will perform a series of tests to determine how this mailer daemon constructs Email headers. The various servers might fall into one of six (6) possible categories:

Domains Not Found
These mailers could not be located. That is, they did not have a Domain Name Server (DNS) entry, and are either being moved to a new location or have been taken offline.
Connection Refused
While these mail servers could be located, they could not be connected to (because they are either down, offline, behind a firewall, or for any other reason unreachable). Obviously, these mailers are of no use.
IP Address Logged
These daemons perform reverse-DNS lookups, to determine the ISP-name and IP-address of the actual originator of any/all messages that are relayed through. Servers in this category may relay Email for you, following your electronic request, however they do keep track of (i.e. log) who sent what.
Relaying Prohibited
Servers on this list refused to relay messages to locations outside their own domain. These mailers also are of no use.
Miscellaneous Failures
This list records the names of those SMTP servers that failed to complete the message transaction. Might be caused by an unresponsive server timing-out, non-standard error messages, or other bizarre configurations. Again, not too useful.
Useful SMTP Servers
This is the good list, because these SMTP daemons accepted the Email without regard to where it's from or where it's going. But, this only means that they are POSSIBLY safe; therefore, a test message will be sent through these servers to your return address.

But, that message might not arrive at all, because some systems run ChuckMail (the server that delivers all messages to the bit-bucket, cute huh?). Or, the test might arrive, only to discover that on close inspection of the RECEIVED headers, the originator's IP address was logged.

MktScanner can be instrumental in locating a useful SMTP server for sending your message. Do remember though, that you *must* examine the test Emails very carefully, to determine the exact contents of the actual RECEIVED headers.

Controlling MktScanner

MktScanner has three (3) buttons that're pretty self-explanatory. They are Pause, Abort, and Write, and they do exactly what you'd expect:

Pause
Asks MktScanner to suspend searching as soon as it finishes the current server. Click Pause again, and scanning will continue.
Abort
Causes MktScanner to exit, but only after it makes sure that's what you really want (i.e. it'll ask you to click Abort again, to actually exit).
Write
generates a flat text file (in the location of your choice), which contains the listings of all six (6) SMTP categories listed above.



Installation Notes

MktAgent support libraries

First time installations will require the Run Time Libraries to be loaded. If you're upgrading to a newer version of MktAgent, you can skip to the next section; otherwise, you should have downloaded the MktAgentRTL.exe self-extracting archive. When executed, that archive will install the following ten (10) runtime support files into your \Windows\System directory:

MktAgent program and documentation

The MktAgent128.exe self-extracting archive, when executed, will install the following four (4) files into your \MktAgent directory:

When you opened the self-extracting distribution archive, all of the above listed files should have been placed in a folder named MktAgent (in the root directory of your main hard disk). This is the location that the MktAgent executable requires, and at this point all you need to do is to ask Windows to run it.

MktAgent is relatively self-contained.

The first time MktAgent is executed on a particular system, it will take a few minutes to build and initialize its database, the MktAgentDB. Take a look at the green status line at the bottom of the main window. It'll show the progress of this initialization. Once it says "MarketCom's MktAgent is ready!", you can begin webcrawling, listbuilding, and emailing.

Good Luck! God Bless! Do Business!

For questions, comments, technical support, order enquiries, or whatever, please contact:

mailto:HipCrime@MarketCom.COM

This is the information file for MarketCom's MktAgent, which was designed, implemented, and is solely owned by MarketCom, Houston, Texas.

World rights reserved (c) 1997-1999. Use by permission only.