* * * * *

               More Ins and Outs of Calculating Weblog Traffic

Obligatory Sidebar Links

The Ins and Outs of Calculating Weblog Traffic [1]
The Ins and Outs of Calculating Browser Usage [2]
Googletracking [3]



As I do occasionally, I ran the stats for the Boston Diaries [4]. I use some
programs I wrote to go through the log files pretty much manually, as I feel
it gives me a better sense of the actual traffic I get than a program like
Analog [5] would. Besides, doing it this way I often find interesting things
going on with autonomous agents silently indexing websites for their own
nefarious reasons (muahahahahaha!).

I suspect that most people who run their stats don't take the time to really
look into the results; it wouldn't surprise me if the reported stats for most
bloggers are inflated quite a bit.

I ran the stats as I have in the past and noticed that I had a higher rate of
traffic than normal; I usually get about 100 human hits per day but last
month it looked more like 116 per day. Okay, not that big a spike but enough
to make me curious as to what's going on. I look at some of the requests that
are being counted as human hits and I see [output truncated somewhat]:

> 213.60.99.73 GET /2002/11/29 HTTP/1.0 200 Mozilla [email protected]
> 213.60.99.73 GET /2002/11/29.1 HTTP/1.0 200 Mozilla [email protected]
> 213.60.99.73 GET /2002/11/23.1 HTTP/1.0 200 Mozilla [email protected]
>

Interesting … seems to be some unspecified robot. A quick query (Réseaux IP
(Internet Protocol) Européens Whois Query) [6] shows it to be from Spain, but
other than that, no real information unless I want to track this down
further. I'm not that curious, so add that to the list of agents to ignore
and rerun the stats.
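
The "add it to the ignore list and rerun" step can be sketched roughly as
follows. This is not the set of programs mentioned above, only a minimal
illustration: it assumes each log line has the truncated layout shown in the
excerpts (address, method, path, protocol, status, agent), and the names
IGNORED_ADDRESSES, parse_line, human_hits and hits_per_day are made up for
the example. The number of days has to be supplied by the caller because the
truncated lines carry no dates.

```python
# Assumed layout, copied from the truncated excerpts in this entry:
#   address method path protocol status agent...
IGNORED_ADDRESSES = {"213.60.99.73"}  # the unspecified robot from Spain


def parse_line(line):
    """Split one truncated log line into fields; the agent may contain spaces."""
    address, method, path, protocol, status, agent = line.split(None, 5)
    return {
        "address": address,
        "method": method,
        "path": path,
        "protocol": protocol,
        "status": int(status),
        "agent": agent.strip(),
    }


def human_hits(lines, ignored=IGNORED_ADDRESSES):
    """Return the parsed requests that do not come from an ignored address."""
    hits = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        request = parse_line(line)
        if request["address"] not in ignored:
            hits.append(request)
    return hits


def hits_per_day(lines, days, ignored=IGNORED_ADDRESSES):
    """Average the remaining requests over a caller-supplied number of days."""
    return len(human_hits(lines, ignored)) / days
```

Each newly discovered robot is one more entry in the ignore set, after which
the average is simply recomputed. A line with fewer than six fields would
raise ValueError here; a real log would need more forgiving parsing.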

Still high—about 114 visits per day. Check the requests and find:

> 12.148.209.196 GET / HTTP/1.1 200 Mozilla/4.7
> 12.148.209.196 GET /2002/6 HTTP/1.1 200 Mozilla/4.7
> 12.148.209.196 GET /2001/10 HTTP/1.1 200 Mozilla/4.7
> 12.148.209.196 GET /2000/6 HTTP/1.1 200 Mozilla/4.7
> 12.148.209.196 GET /2002/5 HTTP/1.1 200 Mozilla/4.7
>

Now that is odd. Netscape 4.7 is usually a bit more verbose about what it is
than just Mozilla/4.7. Looking up the address (American Registry for Internet
Numbers Whois Query) [7] I see that it belongs to NameProtect® [8]:

>  [9] NameProtect, Inc.® is committed to setting the industry standard when
> it comes to trademark research and registration services. As one of the
> world's leading trademark research firms, we have helped thousands of
> entrepreneurs, businesses, attorneys, and other intellectual property
> professionals with trademark needs.
>

“NameProtect®—About us [10]”

Oh how nice …

I probably wouldn't be so upset over these guys if they weren't trying to hide
behind a browser, or if they respected the Robots Exclusion Protocol [11],
but they don't do either (and I wonder what they'll think of my using their
logo here? It won't be the first time I got a cease-and-desist letter for
trademark violations—my first, and so far, only one was in September/October
of 1998).
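
For comparison, here is a minimal sketch of what respecting the Robots
Exclusion Protocol looks like from the robot's side, using Python's standard
urllib.robotparser module. The robots.txt rules, the /private/ path and the
agent name "ExampleBot/1.0" are all hypothetical; the site's real rules are
not shown in this entry.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, agent, url):
    """Return True if the given rules allow this agent to request the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("https://boston.conman.org/2002/11/29",
                "https://boston.conman.org/private/notes"):
        print(url, may_fetch(EXAMPLE_ROBOTS_TXT, "ExampleBot/1.0", url))
```

A well-behaved robot makes this check before every request and identifies
itself with an agent name of its own, which is exactly what a robot posing as
a browser avoids doing.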

> This section of your report includes information on generic top-level
> domain names (.com, .net, .org) and other country-specific domain name
> registrations that are similar to your name. Use this section to identify
> potential competitors and assess the potential for your web traffic to be
> diverted.
>

“NameGuard Free Name Monitoring [12]”

Okay, so removing the “anonymous” NameProtect® robot and rerunning again, I
see I'm now down to a more normal 106 human visits per day, but just on the
safe side …

> 4.64.202.64 GET /2000/08/30 HTTP/1.0 200 Mozilla/3.0 (compatible)
> 4.64.202.64 GET /2000/08/28.2 HTTP/1.0 200 Mozilla/3.0 (compatible)
> 4.64.202.64 GET /2000/08/31.3 HTTP/1.0 200 Mozilla/3.0 (compatible)
> 4.64.202.64 GET /2000/08/19.1 HTTP/1.0 200 Mozilla/3.0 (compatible)
> 4.64.202.64 GET /2000/08/14.7 HTTP/1.0 200 Mozilla/3.0 (compatible)
> 4.64.202.64 GET /2000/08/15 HTTP/1.0 200 Mozilla/3.0 (compatible)
>

A large number of requests came from this address: 143 to be exact, the
majority on December 8th and mostly requesting entries from August of 2000
[13]. It's hard to tell if this is an actual user or a robot someone is working
on. If I filter these requests out, I get 101 human visits per day.

Which is about what I expect.
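
The manual hunt above boils down to two questions: which addresses make the
most requests, and do their agent strings look thinner than a real browser's?
A rough sketch of that, again assuming the truncated line layout from the
excerpts. The function name busiest_addresses and the TERSE_AGENT pattern are
inventions for this example, and the pattern encodes an assumption (a bare
"Mozilla/<version>", optionally followed by "(compatible)", is suspicious)
rather than any established rule.

```python
import re
from collections import Counter

# Assumption: a bare "Mozilla/<version>", with or without "(compatible)",
# is terser than a real browser normally is.
TERSE_AGENT = re.compile(r"^Mozilla/\d+(\.\d+)*( \(compatible\))?$")


def busiest_addresses(lines, top=5):
    """Return (address, requests, first agent seen, looks_terse) tuples."""
    counts = Counter()
    agents = {}
    for line in lines:
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # not in the assumed six-field layout
        address, agent = fields[0], fields[5].strip()
        counts[address] += 1
        agents.setdefault(address, agent)  # remember the first agent seen
    return [
        (address, n, agents[address], bool(TERSE_AGENT.match(agents[address])))
        for address, n in counts.most_common(top)
    ]
```

The output is only a list of suspects. As with the 143 requests from
4.64.202.64, deciding whether a busy address is a robot or a very thorough
reader still takes a look at the requests themselves.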

[1] gopher://gopher.conman.org/0Phlog:2002/04/09.3
[2] gopher://gopher.conman.org/0Phlog:2002/07/16.1
[3] gopher://gopher.conman.org/0Phlog:2002/07/27.1
[4] https://boston.conman.org/
[5] http://www.analog.cx/
[6] http://www.ripe.net/perl/whois?form_type=simple&full_query_string=&searchtext=213.60.99.73&do_search=Search
[7] http://ws.arin.net/cgi-bin/whois.pl?queryinput=12.148.209.192
[8] http://www.nameprotect.com/
[9] gopher://gopher.conman.org/IPhlog:2003/01/05/logo.png
[10] http://www.nameprotect.com/about.html
[11] http://www.robotstxt.org/wc/norobots.html
[12] http://nameguard.nameprotect.com/ng/Conflict.po;jsessionid=c0bXI1WmapHdh_5qkp2c2t1d?a=172920&b=414904
[13] gopher://gopher.conman.org/1Phlog:2000/08

Email author at [email protected]