robots.txt 101 for dummies?

 
 
tthome
Reply Wed 9 Jun, 2004 03:24 pm
CDK or anyone else that can help....

Is there a tool online that can spider my site and tell me how well it "might" do against different search engines and possibly give suggestions on how to improve my site for SEO? Right now I'm using bcentral's submit-It product to tell me what's good and what's considered bad, but according to what I've learned here, I probably shouldn't do some of the things bcentral suggests.

Thanks...

Tim

Craven de Kere
Reply Wed 9 Jun, 2004 03:26 pm
tthome wrote:
Is there a tool online that can spider my site and tell me how well it "might" do against different search engines and possibly give suggestions on how to improve my site for SEO?


Yes, there are tons, but they are as accurate as a blind monkey throwing dartboards.

Quote:
Right now I'm using bcentral's submit-It product to tell me what's good and what's considered bad, but according to what I've learned here, I probably shouldn't do some of the things bcentral suggests.


Lesson number 1 with SEO:

Automated tools are usually not worth anything and usually give horrid, outdated and patently incorrect advice.

Automated SEO tools have more to do with making money off of n00bs than anything else.

tthome
Reply Thu 24 Jun, 2004 09:26 am
I seem to have some new visitors from Google. You said the IPs I listed earlier in this thread were "fresh" bots. Well, are these the "deep search bots"?

IP address: 64.68.82.27
Host name: crawler10.googlebot.com
IP address: 64.68.82.181
Host name: crawler14.googlebot.com
IP address: 64.68.82.135
Host name: crawler13.googlebot.com
IP address: 64.68.82.136
Host name: crawler13.googlebot.com
IP address: 64.68.82.164
Host name: crawler14.googlebot.com
IP address: 64.68.82.10
Host name: crawler10.googlebot.com
IP address: 64.68.82.79
Host name: crawler12.googlebot.com
IP address: 64.68.82.159
Host name: crawler14.googlebot.com
IP address: 64.68.82.44
Host name: crawler11.googlebot.com

I sure hope they are, I've been waiting patiently to get "crawled" on... Laughing

Tim
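
For anyone reading along who wants to check visitors like these: a minimal sketch of one way to confirm that a log entry really is a Google crawler, using a reverse DNS lookup followed by a forward lookup. The function names are made up for illustration, and the only domain checked is the `googlebot.com` suffix that appears in the list above; that suffix is an assumption taken from this post, not a complete rule.

```python
import socket


def looks_like_googlebot_host(hostname: str) -> bool:
    """Pure string check: does the name end in .googlebot.com?"""
    host = hostname.strip().rstrip(".").lower()
    return host.endswith(".googlebot.com")


def verify_crawler_ip(ip: str) -> bool:
    """Reverse-resolve the IP, check the name, then forward-confirm it.

    Needs network access. Returns False on any lookup failure.
    """
    try:
        # Reverse lookup: IP address -> host name
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        if not looks_like_googlebot_host(hostname):
            return False
        # Forward lookup: the host name must map back to the same IP
        _name, _aliases, forward_ips = socket.gethostbyname_ex(hostname)
    except OSError:
        return False
    return ip in forward_ips
```

The string check alone is not proof, since anyone can fake a host name in a log; the forward lookup is what ties the name back to the address. Results for the 2004 addresses above will depend on current DNS records.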

tthome
Reply Thu 24 Jun, 2004 09:38 am
hmmm.... Sad

I just read that the freshbot has IPs in 64.* and that the deepbot has IPs in 216.*, so I guess I still have to wait... This is the first time I've seen this type of activity on the site in such numbers from Google.

Craven de Kere
Reply Thu 24 Jun, 2004 10:58 am
Well, the good news is that Google has been phasing out deep crawl updates, and they plan to rely on freshbot-style updates.

They used to do a major update around once a month, but now they do many more mini updates.

So the turnaround is a lot faster these days, and the days of waiting for deepbot may be over.

tthome
Reply Thu 24 Jun, 2004 11:29 am
CDK,

I don't want to detract or redirect ppl from the great information that you provide here, but can you suggest a site that has some of this new info about Google, or some site that lists the "current Google" actions that you refer to? I sure would be interested in reading a little bit about the changes.

Tim

BTW, I've had these freshbots on my forum nearly all day. I've seen multiple instances for over 6 hours now.

Craven de Kere
Reply Thu 24 Jun, 2004 11:38 am
There are hundreds of thousands of such sites. Search Engine Watch by Danny Sullivan is the leading search engine industry news site.

But if you want the info, the info is only found on the search engines themselves.

Any list of tips and tricks will be outdated fast. If you are looking for my source of info, it is the engines themselves.

The algo is guessed at through observation of their behavior.

lavinya
Reply Fri 13 Aug, 2004 08:50 am
hi all.

warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/a2k-post*.html$
24 warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/a2k-view-poll*.html$
25 warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/updates-topic.html*$
26 warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/stop-updates-topic.html*$
27 warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/ptopic*.html$
28 warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/ntopic*.html$

url:
http://www.searchengineworld.com/cgi-bin/robotcheck.cgi

Why the warning? Please help me.

lavinya
Reply Sat 14 Aug, 2004 12:57 am
Hello.

I added the Able2Know.com SEO 2.0.0 mod to my site, but the search engines have not listed my forum. Sad

Mod added on: 11 Aug

Craven de Kere
Reply Sat 14 Aug, 2004 01:31 am
lavinya wrote:
hi all.

warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.....

Why the warning? Please help me.


It means exactly what it says: using wildcards in robots.txt is nonstandard, even though Google supports them.

Other spiders support other non-standard features as well.
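
To make the difference concrete, here is a rough sketch of how a Google-style pattern could be interpreted, assuming `*` means "any run of characters" and a trailing `$` means "end of the URL". The function names are hypothetical and this is not Google's actual matcher; a spider that follows only the original standard would treat the whole value as a literal prefix instead.

```python
import re


def google_style_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a Disallow value with * and $ into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with "match anything"
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    regex = ".*".join(pieces)
    if anchored:
        regex += r"\Z"  # the URL must end here
    return re.compile(regex)


def is_blocked(path: str, pattern: str) -> bool:
    """True if the path matches the pattern, starting at the beginning."""
    if pattern == "":
        return False  # an empty Disallow value blocks nothing
    return google_style_pattern_to_regex(pattern).match(path) is not None
```

With the rule `/phpBB2/ptopic*.html$`, `is_blocked("/phpBB2/ptopic123.html", ...)` is true, while the same path with a query string appended is not, because of the `$` anchor. A rule with no wildcard, such as `/refer/`, behaves as an ordinary prefix match.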

Craven de Kere
Reply Sat 14 Aug, 2004 01:32 am
lavinya wrote:
Hello.

I added the Able2Know.com SEO 2.0.0 mod to my site, but the search engines have not listed my forum. Sad

Mod added on: 11 Aug


I will answer this question on the SEO mod thread, and not in all three of the threads where you asked it. Confused

lavinya
Reply Sat 14 Aug, 2004 02:48 am
thanks for your answer.

lavinya
Reply Thu 12 May, 2005 12:29 am
hello all.

-----
Robots.txt Validator

http status: 200 OK

Syntax check robots.txt on http://www.lavinya.net/robots.txt (219 bytes)
Line Severity Code
It validates, but has some bad style.

2 warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/post-*.html$
3 warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/updates-topic.html*$
4 warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/stop-updates-topic.html*$
5 warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/ptopic*.html$
6 warning Possible Missplaced Wildcard. Although Google supports wildcards in the Disallow field, it is nonstandard.

Disallow: /phpBB2/ntopic*.html$


robots.txt source code for http://www.lavinya.net/robots.txt
Line Code
1 User-agent: *
2 Disallow: /phpBB2/post-*.html$
3 Disallow: /phpBB2/updates-topic.html*$
4 Disallow: /phpBB2/stop-updates-topic.html*$
5 Disallow: /phpBB2/ptopic*.html$
6 Disallow: /phpBB2/ntopic*.html$
7 Disallow: /refer/

-------
Why the warning? Please help me.

Craven de Kere
Reply Fri 13 May, 2005 01:28 am
lavinya wrote:
Why the warning? Please help me.


Read the warning; it tells you why, repeatedly.
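
For illustration, a small sketch of the kind of check that produces this warning: it flags every Disallow line whose value contains `*` or `$`. The function name is made up, and this is only a guess at the rule, not the validator's actual code. The sample text is the robots.txt source posted above.

```python
SAMPLE_ROBOTS_TXT = "\n".join([
    "User-agent: *",
    "Disallow: /phpBB2/post-*.html$",
    "Disallow: /phpBB2/updates-topic.html*$",
    "Disallow: /phpBB2/stop-updates-topic.html*$",
    "Disallow: /phpBB2/ptopic*.html$",
    "Disallow: /phpBB2/ntopic*.html$",
    "Disallow: /refer/",
])


def find_wildcard_warnings(robots_txt: str) -> list:
    """Return (line number, value) for Disallow lines that use * or $."""
    warnings = []
    for line_no, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if not sep:
            continue  # blank or malformed line
        if field.strip().lower() == "disallow" and ("*" in value or "$" in value):
            warnings.append((line_no, value.strip()))
    return warnings


for line_no, value in find_wildcard_warnings(SAMPLE_ROBOTS_TXT):
    print(f"{line_no} warning: nonstandard wildcard in Disallow: {value}")
```

Lines 2 through 6 are flagged, and line 7 (`/refer/`) is not, which matches the validator report. A plain prefix such as `Disallow: /phpBB2/ptopic` would stay within the original standard, at the cost of also matching any other URL that starts with that prefix.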