OK, I'm trying to figure out how, or if, Google is crawling my site. I did some research and found out that I could download the "raw files" (the raw access logs) to my computer and see who has accessed the site. I imported the file into Excel and was able to see some very interesting info.
I see that the MSN bot crawls my site, and that it has hit a bunch of /forums/about-*.html files, which I know the SEO MOD is supposed to help with.
I also see that there is another bot called GornKer Crawler that has crawled these same /forums/about-*.html files.
YET...
When I look at what Google has done, it doesn't appear to be crawling those pages the way MSN and GornKer are.
Here is my robots.txt file
User-agent: *
Disallow: /forums/admin/
Disallow: /forums/db/
Disallow: /forums/images/
Disallow: /forums/includes/
Disallow: /forums/language/
Disallow: /forums/templates/
Disallow: /forums/common.php
Disallow: /forums/groupcp.php
Disallow: /forums/memberlist.php
Disallow: /forums/modcp.php
Disallow: /forums/posting.php
Disallow: /forums/profile.php
Disallow: /forums/privmsg.php
Disallow: /forums/viewonline.php
Disallow: /forums/faq.php
Disallow: forums/updates-topic.html*$
Does this look right, and what can I do to help Google crawl more of my site?
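One way to sanity-check the rules is to run them through Python's standard-library robots.txt parser. This is only a rough sketch: the helper name is_allowed and the sample URL /forums/about-123.html are made up for illustration (the number is hypothetical), and this parser does plain prefix matching without wildcard support, so it is not an exact stand-in for how Googlebot reads the file and it does not really exercise the last rule.

```python
from urllib.robotparser import RobotFileParser

# The rules from the robots.txt above, copied as-is.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forums/admin/
Disallow: /forums/db/
Disallow: /forums/images/
Disallow: /forums/includes/
Disallow: /forums/language/
Disallow: /forums/templates/
Disallow: /forums/common.php
Disallow: /forums/groupcp.php
Disallow: /forums/memberlist.php
Disallow: /forums/modcp.php
Disallow: /forums/posting.php
Disallow: /forums/profile.php
Disallow: /forums/privmsg.php
Disallow: /forums/viewonline.php
Disallow: /forums/faq.php
Disallow: forums/updates-topic.html*$
"""


def is_allowed(path, agent="Googlebot"):
    """Return True if these rules let `agent` fetch `path` (prefix matching only)."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # /forums/about-123.html is a hypothetical example of an about-*.html page.
    for path in ("/forums", "/forums/index.php",
                 "/forums/about-123.html", "/forums/profile.php"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

If the about-*.html pages come back as allowed here, the Disallow lines are probably not what is keeping Google away from them.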
Oh please, great CDK and Monger, or whoever knows, please help. I feel I'm on the cusp of search engine acceptance, but I don't have something quite right.
Here is an almost complete listing of what googlebot touches:
GET /robots.txt HTTP/1.0
(it hits this a lot)
GET / HTTP/1.0
GET /robots.txt HTTP/1.0
GET / HTTP/1.0
GET /robots.txt HTTP/1.0
GET /cal/cal.php HTTP/1.0
GET /robots.txt HTTP/1.0
GET /main.htm HTTP/1.0
GET /contactinfo.htm HTTP/1.0
GET /contactinfo.htm HTTP/1.0
GET /robots.txt HTTP/1.0
GET / HTTP/1.0
GET /HC_web_tou.htm HTTP/1.0
GET /why.htm HTTP/1.0
GET /why.htm HTTP/1.0
GET /computing.htm HTTP/1.0
GET /why.htm HTTP/1.0
GET / HTTP/1.0
GET /aboutus.htm HTTP/1.0
GET /aboutus.htm HTTP/1.0
GET /forums HTTP/1.0
(I want it to tunnel deeper here; this is my forum root)
GET /HC_web_tou.htm HTTP/1.0
GET /HC_web_tou.htm HTTP/1.0
GET /HC_privacy.htm HTTP/1.0
GET /HC_privacy.htm HTTP/1.0
GET /reviews HTTP/1.0
GET /hometheater.htm HTTP/1.0
Am I even close to where I need to be? Or should the link that produces GET /forums HTTP/1.0 be replaced with a more direct /forums/index.php link?
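Instead of eyeballing this in Excel, the same listing can be tallied with a few lines of Python. This is just a sketch: it assumes each entry is a bare "GET path HTTP/1.0" request string as pasted above (a real raw log line has more fields around it), and the names tally_paths and deep_forum_hits are made up for illustration.

```python
from collections import Counter

# The Googlebot request lines from the listing above, in order.
REQUESTS = """\
GET /robots.txt HTTP/1.0
GET / HTTP/1.0
GET /robots.txt HTTP/1.0
GET / HTTP/1.0
GET /robots.txt HTTP/1.0
GET /cal/cal.php HTTP/1.0
GET /robots.txt HTTP/1.0
GET /main.htm HTTP/1.0
GET /contactinfo.htm HTTP/1.0
GET /contactinfo.htm HTTP/1.0
GET /robots.txt HTTP/1.0
GET / HTTP/1.0
GET /HC_web_tou.htm HTTP/1.0
GET /why.htm HTTP/1.0
GET /why.htm HTTP/1.0
GET /computing.htm HTTP/1.0
GET /why.htm HTTP/1.0
GET / HTTP/1.0
GET /aboutus.htm HTTP/1.0
GET /aboutus.htm HTTP/1.0
GET /forums HTTP/1.0
GET /HC_web_tou.htm HTTP/1.0
GET /HC_web_tou.htm HTTP/1.0
GET /HC_privacy.htm HTTP/1.0
GET /HC_privacy.htm HTTP/1.0
GET /reviews HTTP/1.0
GET /hometheater.htm HTTP/1.0
"""


def tally_paths(request_lines):
    """Count how many times each path appears in 'GET <path> HTTP/x' lines."""
    counts = Counter()
    for line in request_lines:
        parts = line.split()
        if len(parts) == 3 and parts[0] == "GET":
            counts[parts[1]] += 1
    return counts


counts = tally_paths(REQUESTS.splitlines())

# Any request below the forum root would start with "/forums/".
deep_forum_hits = [path for path in counts if path.startswith("/forums/")]

if __name__ == "__main__":
    for path, hits in counts.most_common():
        print(hits, path)
    print("requests below /forums/:", len(deep_forum_hits))
```

Run against this listing, it makes it easy to see which pages Googlebot keeps coming back to and whether it ever requested anything below the forum root.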