Problems with robots.txt

 
 
Reply Sat 2 Jul, 2005 10:01 am
I have a problem with my robots.txt file. I list URLs to disallow, but both Google and MSN are still indexing them. The robots.txt file has been on my phpBB forum since I started the board, so there is no way Google or MSN crawled the site without it.

This is the robots.txt I use:
Code:
User-agent: *
Disallow: /admin/
Disallow: /db/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /templates/
Disallow: /common.php
Disallow: /config.php
Disallow: /faq.php
Disallow: /groupcp.php
Disallow: /login.php
Disallow: /memberlist.php
Disallow: /modcp.php
Disallow: /posting.php
Disallow: /privmsg.php
Disallow: /profile.php
Disallow: /search.php
Disallow: /viewonline.php


A site:www.domain.com search on Google and MSN shows pages that shouldn't appear, for example:

Google:
www.domain.com/privmsg.php?mode=post&u=38
www.domain.com/posting.php?mode=quote&p=408

MSN:
http://www.domain.com/profile.php?mode=viewprofile&u=3

(MSN shows only 1 page it shouldn't, but Google shows over 200 pages that robots.txt disallows.)



I assumed able2know is an optimized forum, so I checked its robots.txt file (http://www.able2know.com/robots.txt). That robots.txt also disallows many pages, yet a search for site:www.able2know.com on Google shows thousands of pages it shouldn't be showing...

What's wrong? Why isn't Google obeying the robots.txt files? Is there any way I can change mine to make Google understand it?

 
Craven de Kere
 
Reply Wed 6 Jul, 2005 12:26 am
Simple answer: search engines will not spider (i.e. download) the blocked pages.

However, if they crawl pages that link to the blocked pages, they will often index just the link without downloading its content.

In Google, for example, such pages show up as bare URLs, with no page title or description.
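
To make the difference between "not downloaded" and "not listed" concrete, here is a rough sketch in Python. It uses the standard library's urllib.robotparser to apply a few of the rules from the robots.txt above, then runs a toy crawler over a made-up in-memory site. The SITE dictionary, the viewtopic.php page, the page titles and the build_index function are all assumptions for illustration only. This is not how Google or MSN actually build their indexes, just a model of the behaviour described above.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules from the robots.txt quoted in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /posting.php
Disallow: /privmsg.php
Disallow: /profile.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical in-memory "site": URL -> (page title, outgoing links).
# viewtopic.php and all titles are invented for this example.
SITE = {
    "http://www.domain.com/viewtopic.php?t=12": (
        "Some topic",
        [
            "http://www.domain.com/posting.php?mode=quote&p=408",
            "http://www.domain.com/profile.php?mode=viewprofile&u=3",
        ],
    ),
    "http://www.domain.com/posting.php?mode=quote&p=408": ("Post a reply", []),
    "http://www.domain.com/profile.php?mode=viewprofile&u=3": ("Viewing profile", []),
}


def build_index(start_url, agent="*"):
    """Toy crawler: downloads only allowed pages but records every URL it sees."""
    index = {}  # URL -> title, or None for a URL-only entry
    queue = [start_url]
    while queue:
        url = queue.pop(0)
        if url in index:
            continue
        if parser.can_fetch(agent, url):
            title, links = SITE[url]  # "download" the page
            index[url] = title
            queue.extend(links)
        else:
            # Blocked by robots.txt: the URL is known only from a link,
            # so there is no title or description to store.
            index[url] = None
    return index


index = build_index("http://www.domain.com/viewtopic.php?t=12")
for url, title in index.items():
    print(url, "->", title if title else "(URL only, no title or description)")
```

In this sketch the rules do match posting.php and profile.php, so those pages are never downloaded, yet their URLs still end up in the index because an allowed page links to them. That matches the URL-only listings seen in the site: search.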