Seattle Web Design

Meet the author:
Jill Olkoski

Jill has an MA in Clinical Psychology, a BS in Computer Science, and a BS in Mechanical Engineering.

She currently owns Aldebaran Web Design in Seattle WA and enjoys educating her clients on topics related to small business website design.

In Jill's previous life, she spent 17 years in the engineering and quality organizations of a Fortune 100 tech company.

Please enjoy the articles and leave a comment!


Kavam.net Traffic Mystery: Limelight Networks in Tempe Arizona? It might be from SearchMe’s Spider/Robot

February 3rd, 2008

I watch my website traffic closely, and this morning I noticed something strange: many hits to my website, each to only a single page, all from different subdomains of Kavam.net, all through an ISP named Limelight Networks LLC, and all from Tempe, Arizona. I remembered reading a while ago that Kavam was connected with a search engine robot, so I looked into this traffic.

Here’s a snapshot of what the traffic looked like:
[Image: kavamvisits.jpg]

The Whois for Kavam.net showed:

Registrant:
searchme
Domain Administrator
335 Bryant st
3rd Floor
Palo Alto, CA 94301
US

I looked up Searchme.com and found this on a page having to do with their spider, called Charlotte:

Who are you and why are you accessing my site?

Charlotte is a spider created by Searchme, Inc. in Mountain View, CA, a startup that is indexing the web for sites to include in its search engine index.

We are not attempting to steal any copyrighted information from your site and will not be re-distributing your content. We will only be allowing users to find your website more easily.

How can I control Charlotte’s access to my site?

The easiest way to prevent Charlotte from accessing your site is via the robots.txt file, which should be located at the root of your website, i.e. http://www.mysite.com/robots.txt. If you add the following lines, requests to your website will soon stop:

User-Agent: Charlotte
Disallow: /

It may take up to an hour for this change to take effect. A common mistake people make is to deny access via robots.txt and also deny via another method, such as blocking the IP from which the requests are coming. This ends up blocking the spider’s access to robots.txt, and when robots.txt can’t be retrieved, the spider continues requesting other pages. It’s best to wait for roughly an hour for the robots.txt changes to take effect before attempting the more drastic measure of IP-based blocking.
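
Before waiting out the hour, the rule itself can be checked locally. The sketch below uses Python's standard urllib.robotparser to confirm that the two lines above deny a crawler calling itself Charlotte while leaving other crawlers alone. The "Googlebot" user agent and the page URL are only illustrative, and the helper name is_allowed is made up for this example. It shows how a standards-following parser reads the rule, not how Searchme's spider actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The two lines recommended on Searchme's page, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-Agent: Charlotte
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a well-behaved crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.mysite.com/some-page.html"  # illustrative URL
    print("Charlotte allowed:", is_allowed("Charlotte", page))
    print("Googlebot allowed:", is_allowed("Googlebot", page))
```

If the first check reports that Charlotte is still allowed, the robots.txt file probably has a typo in it, which is worth ruling out before moving on to IP-based blocking.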

You may be asking yourself, why do I care about this traffic?

I care, because I measure “bounce rate” statistics for my website. Bounce rate is the percentage of visitors who come to only one page and leave. That’s exactly what this robot appears to be doing, and therefore, it will artificially raise my bounce rate. I’m going to contact my website traffic company, Web-Stat, and see what they say about this traffic source.
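
To make the effect concrete, here is a small sketch of the bounce rate arithmetic. All the visit counts are invented for illustration and are not taken from my stats: 40 human visits, 16 of them single-page, plus 60 single-page robot visits. With those made-up numbers, the bounce rate goes from 40% for humans alone to 76% once the robot is counted.

```python
def bounce_rate(pages_per_visit):
    """Percentage of visits that viewed exactly one page."""
    if not pages_per_visit:
        return 0.0
    bounces = sum(1 for pages in pages_per_visit if pages == 1)
    return 100.0 * bounces / len(pages_per_visit)


# Hypothetical human traffic: 16 one-page visits and 24 three-page visits.
human_visits = [1] * 16 + [3] * 24
# Hypothetical robot traffic: 60 visits, each loading a single page.
robot_visits = [1] * 60

if __name__ == "__main__":
    print("Humans only:      ", bounce_rate(human_visits))
    print("Humans plus robot:", bounce_rate(human_visits + robot_visits))
```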

And comments on this topic would be much appreciated!

UPDATE:
Although I updated my robots.txt file over an hour ago, I am still getting visits from Kavam.net - so apparently Charlotte the Kavam.net spider is not paying attention to my robots.txt file.

I asked my website traffic tracking company, Web-Stat, for their impression, and here’s what they said:

“It indeed looks like a robot (which is strange because robots do not normally load the Web-Stat [javascript] code so they are not counted in the stats, but for some reason this one does).

You can exclude it by entering 208.111.154 in the IP exclusion box in the Web-Stat Control Panel (click on Configure first)”

Aren’t you impressed that my traffic tracking company actually is available to talk to me about things like this? I sure am :-)
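
For anyone who wants to apply the same exclusion to their own logs, here is a rough sketch of what a partial address like 208.111.154 means when treated as a prefix. I don't know how Web-Stat implements its exclusion box internally, so the prefix-match behavior here is an assumption. The function names and the sample visitor addresses are made up for the example.

```python
import ipaddress


def prefix_to_network(prefix: str) -> ipaddress.IPv4Network:
    """Turn a partial dotted prefix such as '208.111.154' into the network it covers."""
    octets = prefix.strip(".").split(".")
    if not 1 <= len(octets) <= 4:
        raise ValueError(f"not a usable IPv4 prefix: {prefix!r}")
    # Pad the missing octets with zeros and use 8 bits per supplied octet.
    padded = octets + ["0"] * (4 - len(octets))
    return ipaddress.ip_network(f"{'.'.join(padded)}/{8 * len(octets)}")


def is_excluded(visitor_ip: str, prefix: str = "208.111.154") -> bool:
    """Return True if visitor_ip falls under the excluded prefix (assumed prefix match)."""
    return ipaddress.ip_address(visitor_ip) in prefix_to_network(prefix)


if __name__ == "__main__":
    print(prefix_to_network("208.111.154"))  # the network the prefix stands for
    for ip in ("208.111.154.16", "208.111.155.1"):  # sample addresses, not from real logs
        print(ip, "excluded:", is_excluded(ip))
```

Note that this only hides the visits from the statistics. The robot can still request pages from the server.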

Jill
--------------
J. Olkoski
Aldebaran Web Design, Seattle
Jill Olkoski has a BS in Engineering, a BS in Computer Science and an MA in Clinical Psychology. She delights in using her advanced technical, psychological and interpersonal skills to help small business owners develop cost-effective and successful websites.



35 Responses to “Kavam.net Traffic Mystery: Limelight Networks in Tempe Arizona? It might be from SearchMe’s Spider/Robot”

  1. carl Says:

    thanks for the info. i too was being bombarded by this little spider. now i know how to block it…

  2. CJ Says:

    Hey Jill,

    Thanks for that. Kavam’s been annoying me for months, too.

    Regards,

    CJ

    PS Of course, this post is now #2 on Google for “kavam.net”

  3. J Barber Says:

    We were (are) getting hit hard by this spider - averaging about 1,000 “visits” per day for the last 2 weeks! It really sucks because each request shows up in our analytics as a new visit, 0:00 spent on site, 99.9% bounce rate. We’ve also noticed short periods of high server load during this time. Is anyone else seeing performance issues associated with the bot?

    I’ve added the IP address you mentioned above to our analytics reporting exclusion list - I assume the last # in the address above is left blank to catch all the IPs in the range? I also plan on contacting Limelight to tell them to fix their bot!

  4. Jill Olkoski Says:

    Carl and CJ,
    You’re very welcome, I hope we are all a little less annoyed soon.

    J Barber,
    Wow, 1000 visits per day - that would certainly ruin your bounce rate stats. I haven’t seen any performance issue…but we’ll see what the others say.

    Yes, the last # is left blank as a wildcard - it should eliminate all traffic from all the numbers, hopefully. And do let us know how contacting Limelight goes!

  5. TourPro Says:

    Start IP End IP
    69.25.60.0 69.25.61.255
    208.111.154.0 208.111.154.255
    209.249.86.0 209.249.86.255

    Blocked!

    Initially, I thought I had a new fan in Tempe, but now I see that it is merely an annoyance to be ignored.

    Thanks.

  6. Lu Says:

    I don’t have a website, just a blog, but I, too, was getting slammed by this thing. I was beginning to think I had a stalker — all my blog read but no comments made, and no referring links… kinda scary.

    Thanks for the info. I wish there was a way to block them altogether. But for now I’m satisfied that I at least know what this traffic is.

  7. Jill Olkoski Says:

    Hi TourPro,
    Thanks for sending the IP ranges you’re seeing!

  8. Jill Olkoski Says:

    Hi Lu,
    Perhaps there is a WordPress plugin that would enable you to block IP addresses, not sure. If you find one, let us know.

  9. Charity Says:

    Hi Jill, here via a Google search. Thanks for the informative post! I sadly have just a dinky little LiveJournal, but I’ve noticed the hits from them too. At least now I know it’s not some weirdo just reading each individual post one by one for what purpose I could not imagine! I’m fascinating, but hardly THAT riveting. Thanks again!

  10. Jill Olkoski Says:

    Hi Charity,
    You’re very welcome…it certainly was a creepy feeling the first time I saw it as well, but it’s nice knowing we’re not alone!

  11. Lu Says:

    Jill,

    I’m with Typepad, so I can’t say about Wordpress plugins (I don’t know if they’d work even if I could pull one in)… but I have opened a ticket with Typepad support on this issue. I’ll let you know what happens with that.

    I’ve also opened a support ticket with Sitemeter.com, one of two of the stats counters I use, to see if I can exclude the IP range from my count. I don’t have but a few visitors, so these constant visits really mess up my count.

    I’ll let you know when I know more.

    As others have said already, thank you again for this post. VERY helpful!

  12. Mad William Flint Says:

    yeah I was all kinds of psyched to have new readers ’til I drilled down.

    I’m pretty much stuck with ‘em as I don’t have deep enough control to nuke the IPs. It’s a hit every 45 minutes or so for me.

  13. Angie Says:

    Jill: I found your site after googling “kavem.net.” I’m also getting regular hits.

    It’s been a couple of days now since you posted this…are you still getting hits from kavam even after disallowing Charlotte? Just wondering whether I should bother with altering my robots.txt file. Unfortunately, I can’t block specific IPs. Thanks!

  14. Stacy Says:

    Mine was around 100 a day. Always accessing different pages, these things never show for me in StatCounter, but they were this time; it obviously concerned me. I blocked the IP range through my cPanel.

  15. Jill Olkoski Says:

    Hi Angie,
    I initially blocked them via the Robots.txt, but after two hours, was still getting hits, so I switched to simply blocking them from showing in my stats. So I am probably still getting hits, just not seeing them in my stats. So I’d say, give Robots.txt a try and let us know if it works.

  16. Jill Olkoski Says:

    Hi Stacy,
    Thanks for the comment! Glad you got them blocked.

  17. Norbert Spittka Says:

    Hi Jill

    thank you very much for the information. For several days I thought I had a very good fan in Tempe because he visited my German website very often. Now it is obvious to me that the fan is a spider :-(

    Norbert

  18. e Says:

    i don’t know enough to figure out how to do this, but i’m getting help so i’m hopeful now. all the hits are just about to drive me to close down my tiny little blog! i even called tempe and, after trying many disconnected numbers, got some poor guy in sales who said it was due to youtube (!), wasn’t the least bit interested in even taking down the offending ip #, but did call back to transfer me to “support”, which then hung up on me before i got through to anybody. if they are a legitimate concern, i feel sorry for them. apparently they don’t care that they’re getting trashed all over google.

  19. Angela Says:

    I have been going crazy trying to figure out what this activity is on my site. Here is a new twist though: I tracked the activity very closely and, along with days and days of single-page drops, noticed it went into a gift certificate page on my site. That struck me as very strange, because I get no activity on my gift certificates. When I checked the gift certificates page, I found that this activity included dropping text links to porno sites and a whole lot of garbage into my gift certificates message text box!!! I am obviously very upset about this and have contacted this ISP in Tempe, AZ, copied the activity I captured, and insisted they let me know what is happening here. ???

  20. Jenny Says:

    My cafepress store has been getting visits from this for weeks now. More and more often. ANNOYINGLY often. So many hits I cannot even find real traffic in my stats. I’ve been reading the posts here and most complain this spider is landing and leaving. Why then does my stat tracker show some visits from Kavam lasting over 20 HOURS? I’ve never seen any search crawler do that before.
    I’d like to see what search engine they are building. Do some simple searches and just see if any of my pages come up. But I cannot even do that.
    Do they have a search engine up and running yet? Is there some place I can visit and do a search? Or are they building some new one?

    I think unless there is some actual search engine on line (in which case, good.) I don’t know if I want to be crawled to death each and every day.
    I don’t want to put any anti crawling code on my store unless I can specify THIS one in particular. Is that possible?

  21. AscenderRisesAbove Says:

    It has been sitting on my site for six or seven days now. I watch it go from post to post hour after hour. I tried to post it in wordpress but since it is not leaving comments I am unable to do so. I have stat counter. Is it possible to block it in there?

    Will it get full eventually and move on?

    I keep checking back here for updates. Thanks for keeping us informed.

    J

  22. Jill Olkoski Says:

    Hi Norbert…sorry your Tempe fan turned out to be the Kavam.net spider :-(

  23. Jill Olkoski Says:

    Hi Ascender,
    “Will it get full eventually and move on?”
    I doubt that spiders get “full”…I think you need to take some action to block it, unless they stop on their own.

  24. Jill Olkoski Says:

    Hi Jenny:

    “Do they have a search engine up and running yet? Is there some place I can visit and do a search? Or are they building some new one?

    I think unless there is some actual search engine on line (in which case, good.) I don’t know if I want to be crawled to death each and every day.
    I don’t want to put any anti crawling code on my store unless I can specify THIS one in particular. Is that possible?”

    I believe they’re building a search engine. If you read my first post, there’s a link to Searchme.com.

    You can try creating a Robots.txt file per their instructions on their page, this is the most conservative approach. If that doesn’t work, you’ll have to block it via .htaccess via IP numbers.

  25. Chris Says:

    thanks so much for the detailed information on these guys. Some associates of mine think that they might be running an intellectual property bot and some say they are taking screen shots.

    “Kavam.com Inc. manufactures and markets software solutions for corporates.”

    now the question is… is Kavam.com and Kavam.net related?
    http://investing.businessweek.com/research/stocks/private/snapshot.asp?privcapId=26492528

    they also send traffic through:
    ASN Name
    6461 ABOVENET
    22822 LLNW

    http://fixedorbit.com/AS/36/AS36737.htm
    love that tool for bot busting

  26. Jill Olkoski Says:

    Hi Chris,
    Sure looks like they’re related…and thanks for the additional info!

  27. Anonymous Coward Says:

    My servers have seen more than 83,000 hits from kavam.net’s bots in February 2008, and there are 5 days left before the end of the month. The bandwidth consumed is nearly 500 MB.

    I plan to telephone them and ask why they need to hit so much — they’re eating up more bandwidth than Google’s bots (which are definitely worthwhile).

  28. Simon Jones Says:

    I know this is an old post, but I had my two blogs accessed so much over a period of a couple of weeks that the requests slowed my server to a crawl which required a restart to escape from.

    I believe it is the searchme.com search engine taking copies of my entire site for the purposes of creating a massive visual directory of the internet in time for the launch of their visual search engine.

    As much as I will welcome their users finding my content I have closed the door to their repeated and obnoxious requests on my server. I found out how to do this from the following forum.

    http://www.webmasterworld.com/analytics/3560874.htm

  29. Jill Olkoski Says:

    Hi Simon,
    It may be an old post, but it gets the most traffic of anything I’ve written - so your comment will probably be read by around 10 people each day. So thank you very much for the information and the tip on how to stop the traffic.

  30. Dr. Nicole Sundene Says:

    I’m confused…I thought we wanted a low bounce rate…so this bot is “lowering it”…?

    Maybe I need to go to bed LOL

  31. Jill Olkoski Says:

    Hi Nicole,
    Yes, you’re totally correct, and I’ve fixed the error above. If the bot comes and leaves, the bounce rate will be artificially too HIGH, not too low.
    Thanks for the correction!
    J

  32. Daryl Clark Says:

    Jill,

    Thanks for this information. I’m a web-stat user and have been for years. I’ve been getting frequent visits from Limelight and I was wondering why? The Kavam visits started today so your post was very useful. The bounce rate doesn’t matter to me but I’m interested in real visitors.

  33. Jill Olkoski Says:

    Hi Daryl,
    You’re welcome - it’s funny how different websites start getting hit by Kavam at different times. I was hoping that by now, with all the complaints they’ve received (mine included), they would have fixed this bad behavior. I guess not.

  34. Troels Nybo Nielsen Says:

    For months my statistics from Hitslink had shown that my blog at My Opera got visits from Tempe, Arizona. Nothing big, only a few visits a day, and not every day, just enough to suggest that my blog might have won some rather eager, but very shy, fan in Tempe.

    Those visits stopped for some time, but recently they have returned. And almost at the same time I began to observe a new regular guest from San Jose, California.

    Little by little I had become curious enough to make some investigations. This also included doing searches on the organisations that brought those visits to me. One of those searches brought me here.

    I wish to thank you, Jill. From your blog post it has become obvious for me that Limelight Networks in Tempe and Kavam in San Jose send bots to my blog. I have excluded their IP ranges from tracking in my Hitslink account.

  35. Jill Olkoski Says:

    Hi Troels,
    You’re very welcome and it makes me happy you found this article helpful. For those of us who watch our traffic carefully, it sure is nice to know how to exclude badly behaving bots, like Limelight Networks and Kavam.





Aldebaran Web Design, Seattle WA
206.523.6560
Jill@AldebaranWebDesign.com


 ©2008 Aldebaran Website Design, Seattle WA
 206.523.6560
 All Rights Reserved
Small Business Website Design
by Aldebaran Website Design, Seattle WA
Site Last Modified: November 18, 2006