Google spiders and other things that go bump on the internet

 
 
Reply Mon 16 Jun, 2003 01:10 pm
Hi - this is my first post here, so please be gentle! :)
I looked at the forums, and I hope this hasn't been asked (and answered) before. I found this site, able2know, from looking around www.phpbb.com. If you don't know phpbb.com, they're the folks behind the engine for this fine bulletin-board-based site and numerous others, including my fledgling site. Anyway, Google did stop by my site once to visit, but didn't spider the site, meaning my content does not appear on Google. I was happy to have them at least list my site now, but I do want spidering. I saw various modifications to the phpBB sessions.php file, and I was hoping some kind, brilliant soul here might actually have a good modification that works, to ensure Google will in fact spider the site.
For whatever it's worth, my meta tags are robot friendly, as is my robots.txt file. Thanks for any help you geniuses are able to give! :)
Type: Discussion • Score: 2 • Views: 5,291 • Replies: 33

 
Craven de Kere
Reply Mon 16 Jun, 2003 01:23 pm
OK, first of all, the mod you mentioned to me by PM covers Google and, if I remember correctly, Inktomi. It is a good start and is what I am currently using.

If you read the huge thread about google you will notice that two people made "spyder page" code at my suggestion.

Now neither of those two is using the page anymore, over concerns that they might be penalized by Google.

So if you want that code I can give it to you.

You can see it in action here:

http://www.able2know.com/ataglance.php

The other one is simply not worth using.

I have modified netclectic's code for actual use by site visitors. You can see the result here:

http://www.able2know.com/allinone.php

As for meta tags, Google doesn't use them, just remember that. Meta tags are a great way for webmasters to "cheat", so while I miss the popularity of meta tags I understand Google's position on the issue.

Note that I am speaking of keyword and description meta tags and their usefulness in SEO.

I also told you by PM that I had a better mod for you to use than the Google mod you are using. I plan to use this code on the next version of this site.

It is phpBB 2.0.4 compliant:

Code:
#
#-----[ OPEN ]------------------------------------------
#
includes/sessions.php

#
#-----[ FIND ]------------------------------------------
#
$SID = 'sid=' . $session_id;

#
#-----[ REPLACE WITH ]------------------------------------------
#
// Logged-in users keep the session ID in URLs; guests (including spiders) get none.
if ( $userdata['session_user_id'] != ANONYMOUS ) {
    $SID = 'sid=' . $session_id;
} else {
    $SID = '';
}


----------------------------------------------------------


It was posted to that huge topic about spiders, so I do not remember who posted it, but it does work, and it works for ALL spiders!

The reason it works for all spiders, as opposed to just the Google and Inktomi bots, is that it simply disables session IDs for all guests.

There is one issue with this, though. Using that code will eliminate the ability to allow guest posting, because posting requires a valid session.
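
To make the effect of the mod concrete, here is a minimal sketch of how links differ for guests and members. It is not phpBB's own code: `ANONYMOUS_ID`, `build_sid()` and `add_sid_to_url()` are hypothetical stand-ins (phpBB's real helper for this is `append_sid()`), and it is written for a current PHP version rather than the PHP 4 of the phpBB 2.0.4 era.

```php
<?php
// Placeholder for phpBB's ANONYMOUS constant (assumption for this sketch).
const ANONYMOUS_ID = -1;

// Mirrors the mod: guests get an empty session string, members get 'sid=...'.
function build_sid(int $user_id, string $session_id): string
{
    return ($user_id != ANONYMOUS_ID) ? 'sid=' . $session_id : '';
}

// Simplified stand-in for a URL helper: appends the session string only if there is one.
function add_sid_to_url(string $url, string $sid): string
{
    if ($sid === '') {
        return $url;
    }
    $separator = (strpos($url, '?') !== false) ? '&' : '?';
    return $url . $separator . $sid;
}

$guest_url  = add_sid_to_url('viewtopic.php?t=8606', build_sid(ANONYMOUS_ID, 'abc123'));
$member_url = add_sid_to_url('viewtopic.php?t=8606', build_sid(2, 'abc123'));

echo $guest_url, "\n";   // viewtopic.php?t=8606
echo $member_url, "\n";  // viewtopic.php?t=8606&sid=abc123
```

A spider crawling as a guest therefore sees one stable URL per topic instead of a new `sid=` variant on every visit.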

I do not use guest posting so I will be using that code.

I have also modified the phpBB code to make the page titles more search engine friendly. I can give you my code for that too (it's not written up as a mod, but it's easy, just one line).

In the headers I also repeat the page title for keyword frequency.

Another important thing is that I used a modified (heavily heavily modified) version of fetch all to make CMS pages.

These have proven to be absolute search engine bait.

Let me know if you want code for any of my modifications. I have not published most of them (haven't had time to write up the instructions).

Sorry for the segmented post, I am at work and in a hurry.

gadgetaddict
Reply Mon 16 Jun, 2003 01:40 pm
You're apologizing to me for fragmentation? Thank you thank you thank you so much for:
1) replying
2) replying at length
3) replying so quickly!
So I think I can overlook a little fragmentation :)

I did NOT understand this part though, so perhaps when you have more time, you would consider a little elaboration? You said (I'm horrible with the quoting function so I cut and pasted):

In the headers I also repeat the page title for keyword frequency.

Another important thing is that I used a modified (heavily heavily modified) version of fetch all to make CMS pages.

These have proven to be absolute search engine bait.

Now, as long as it's not jail bait, count me in ;) I didn't understand the CMS reference. As for the headers, I'm sorry to be stupid here, but do you mean headers in headers.php, or on each module, or index.html and index.php? I know just enough to wipe out my database, but not enough to fix it...
I understand what you did to the sid with your one-liner. It makes a LOT of sense, instead of specifying individual bots. As I don't want anonymous posting, I don't mind that it keeps guests from posting; in fact, that would be my preference!

So, I guess I'm basically begging for whatever search engine mods you're willing to post. The phpBB site had roughly 29 pages on the major thread, and then a couple of other threads, and basically I was more and more confused as time went on. I did look through my site's logs and see Google's webcrawler visited April 23 and was never seen again. I had already enabled one of the mods I had seen from the phpBB site, but I guess Google didn't like it.
I am using phpbbfm (fully modded), which is basically a phpBB 2.0.4 package with 217 mods, which could potentially be giving Google a coronary.
I have avoided mentioning my site because I didn't want to seem "opportunistic", but it is http://www.gadgetaddict.com
which then routes you to a subdirectory and the index.
In appreciation, I'd like to steer you to a small site I found recently, which has a small set of modules which make a site mobile friendly. You can't post from your handheld, but it is very readable. For all I know, you already have a mobile version, but in case you don't (or for my fellow newbies) it is: http://www.togosolo.com/
I hope I didn't break any rules posting links, etc., and thank you again for the help you have already provided! :)

Craven de Kere
Reply Mon 16 Jun, 2003 01:57 pm
gadgetaddict wrote:

You're apologizing to me for fragmentation? Thank you thank you thank you so much for:
1) replying
2) replying at length
3) replying so quickly!
So I think I can overlook a little fragmentation :)


I'll answer anything I know the answer to and then a few million things that I don't. :-)

gadgetaddict wrote:

In the headers I also repeat the page title for keyword frequency.


Notice how, in the header near "Able2Know.com", the page title is repeated? The page title is the key in phpBB code for optimizing for search engines, and to my knowledge this site is the only one that has optimized the way I have done it.

With regard to the page title, I did two things:

A) I removed the site name from the page title. Nobody is searching for the site name; they are searching for the keywords.

If you look at the title of this page in IE (or whatever your browser is) you will see only the topic title as the page title. I removed the site name.

B) I repeat the page title in the header template (overall_header.tpl). This is for keyword frequency.
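
As a rough illustration of points A and B, the idea can be sketched like this. It is not my actual one-line change and not phpBB's template code: `seo_page_title()` is a hypothetical helper, and the "Site :: Topic" format it can optionally produce is only assumed here as the kind of title being replaced.

```php
<?php
// Hypothetical helper for illustration only; not taken from phpBB or from this site.
function seo_page_title(string $topic_title, string $site_name, bool $include_site_name = false): string
{
    $title = trim($topic_title);
    if ($title === '') {
        // Fall back to the site name so a page is never left untitled.
        return $site_name;
    }
    // Point A: by default the site name is left out, so only the keywords remain.
    return $include_site_name ? $site_name . ' :: ' . $title : $title;
}

$title = seo_page_title('Google spiders and other things that go bump on the internet', 'Able2Know');

// Point B: the same title is emitted twice, once in <title> and once as visible header text.
echo '<title>' . htmlspecialchars($title) . "</title>\n";
echo '<h1>' . htmlspecialchars($title) . "</h1>\n";
```

In a real phpBB install the equivalent change would be made where the page title is assigned and in overall_header.tpl, not in a standalone script like this.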

gadgetaddict wrote:

Now, as long as it's not jail bait, count me in ;) I didn't understand the CMS reference.


CMS is content management system. It's what some people call a "portal" (portal means both a directory and a CMS these days).

On our home page you might notice that there is dynamic forum content, and that there are over 60 pages that dynamically use the phpbb database.

I have integrated everything with the forum database. If you use the contact us page: http://www.able2know.com/contact.php

You will see that the username and email are pulled from the database.

If you check out the Flash chat room you will see a quick and dirty database integration as well (many people have asked for my chat code and I have released it without documentation).

The CMS pages I use were originally phpbb fetch_all pages by a coder named Ca5ey.

I modified them with suggestions from him and have used those pages for many many many applications.

I have written modified portal scripts and have used other scripts in these pages.

If you want to use fetch all I'll dig up a link but phpbb is coming out with their own portal soon so you may want to wait.

gadgetaddict wrote:

As for the headers, I'm sorry to be stupid here, but do you mean headers in headers.php, or on each module, or index.html and index.php? I know just enough to wipe out my database, but not enough to fix it...


By headers I meant the top part of the page that is repeated.

In phpbb this is defined in the templates using templates/templatename/overall_header.tpl

My code is weird and I actually use 6 header files (of the php variety) and 6 header tpl files.

gadgetaddict wrote:

I understand what you did to the sid with your one-liner. It makes a LOT of sense, instead of specifying individual bots. As I don't want anonymous posting, I don't mind that it keeps guests from posting; in fact, that would be my preference!


Those are the reasons I will be using the code. It also decreases the risk of being accused by a search engine of "cloaking".

gadgetaddict wrote:
So, I guess I'm basically begging for whatever search engine mods you're willing to post.


Remind me if I don't post some tonight, I need someone to bug me to write up my own mods.

gadgetaddict wrote:
The phpBB site had roughly 29 pages on the major thread, and then a couple of other threads, and basically I was more and more confused as time went on.


I know, I tried to help with a simplified set of threads but people kept asking the same questions and confusing each other. Plus I have less time to help on phpbb.com.

gadgetaddict wrote:

I did look through my site's logs and see Google's webcrawler visited April 23 and was never seen again.


You need inbound links to get spidered. Otherwise Google will just check you out once and not crawl any deeper.

gadgetaddict wrote:

I had already enabled one of the mods I had seen from the phpBB site, but I guess Google didn't like it.
I am using phpbbfm (fully modded), which is basically a phpBB 2.0.4 package with 217 mods, which could potentially be giving Google a coronary.


Mods are unlikely to kill your chances in Google but they can hurt.

gadgetaddict wrote:
routes you to a subdirectory and the index


That kind of a redirect REALLY hurts you in search engines.

bobsmyth
Reply Mon 16 Jun, 2003 02:04 pm
Craven:

Just wanted to remark on what a nice guy you are. Thanks for being so kind to a poor soul lost in the dark. My already great esteem for you has risen even more.

Bob

gadgetaddict
Reply Mon 16 Jun, 2003 02:08 pm
bobsmyth wrote:
Craven:

Just wanted to remark on what a nice guy you are. Thanks for being so kind to a poor soul lost in the dark. My already great esteem for you has risen even more.

Bob

Me too! Although that really doesn't say much, since I hadn't heard of you before today :) Unless, are you the hamster fella from phpbb.com?

Craven de Kere
Reply Mon 16 Jun, 2003 02:10 pm
Yes, I am the hamster fella over there. I've also used a few other names because phpBB's database has been corrupted a few times and my login to the forums wouldn't work to submit my mods to the mod database.

A big reason you won't find many of my mods is because of that blasted buggy mod database login at phpbb.

Craven de Kere
Reply Mon 16 Jun, 2003 02:11 pm
Before I forget, the hamster account was also used by a friend who was helping me make this site in Brazil.

bobsmyth
Reply Mon 16 Jun, 2003 02:21 pm
Dear gadgetaddict:

As you can see from Craven's post I am not the aforementioned hamster. I tried to be but just wasn't cute enough. C'est la vie. Visit us during your copious free time. Welcome!

Craven de Kere
Reply Mon 16 Jun, 2003 02:51 pm
BTW, I forgot to say:

Thanks bob!

I'm just passing it on. Whenever I need help people give me their time and the open source community is all about passing it on.

gadgetaddict
Reply Mon 16 Jun, 2003 03:02 pm
Now THAT'S the body building hamster I've come to know and respect! Why, I once saw him lift a car just to get at a piece of carrot....
Actually, I'm so brain dead today that for the life of me I can't remember your avatar from earlier today. At least I remember most of the tips you mentioned...something about 15% of the check total as I recall....
SERIOUSLY, thanks again for your openness and help earlier. I have contributed zippo to this site, but you've treated me better than admins at sites where I have published articles or written reviews on their behalf. Watch out, fella, I'm going to pay you back! Err, that didn't come out right. No stalking references intended.
Now I do seriously have to consider stripping my site down to the phpBB essentials and then bionically building it up.
Is it likely I'll lose my posts if I do that? I imagine as long as I don't touch the MySQL tables I should be okay, agree?

Craven de Kere
Reply Mon 16 Jun, 2003 03:06 pm
Again, there is no reason you HAVE to lose your posts but there are plenty of ways to screw up the DB.

phpbbfm modified the DB, so you might need to do it a special way. But probably not (you'll probably just have extra fields in the DB).

If you want to do it I can help you to make sure you do it right.

jespah
Reply Mon 16 Jun, 2003 04:25 pm
Howdy, gadgetaddict! Good to see you made it in all right! :-D

Craven de Kere
Reply Mon 16 Jun, 2003 06:51 pm
Gadget,

Here is the spider page mod:

http://www.able2know.com/forums/viewtopic.php?t=8606

gadgetaddict
Reply Wed 25 Jun, 2003 08:24 am
I did it! I installed Spyder_Manage, and it was so simple even a gary could do it! It is a little bit jumbled (like yours), but it is also a terrific cramped summary. I saw posts I didn't know were on the board, with photos, etc. Anyway, installation took about 3 minutes and life is great with Spyder_Manage. Now to get the rest of the headaches taken care of....

Craven de Kere
Reply Wed 25 Jun, 2003 09:51 am
Make sure to place a link to the spider page in the header or footer. Do not put an invisible link. Google penalizes for this (I have a few hidden links that I'll be removing when I overhaul the site).

gadgetaddict
Reply Wed 25 Jun, 2003 03:27 pm
Strange, this one didn't send me a reply e-mail either....
Anyway, I don't know how to put that in the header or footer, so if you would tell moi, it would be gratefully appreciated! :)
Something strange happened on GA: a guest was able to post, though I thought I had disabled that (lemme go check). Strange, I don't see how to require registered users to post, but someone (actually, netclectic) was able to post a reply while on as a guest, and someone else is posting as we speak. I had added your code, so I thought it would not allow this. Did I do something wrong, or maybe the code doesn't do what you thought? I'm referring of course to the sessions mod. I'll copy and paste if that might help....

Craven de Kere
Reply Wed 25 Jun, 2003 04:22 pm
The reason it didn't send you a reply email is probably because you still had an active session when I replied. I just checked the server and the emails are going out without any problems.

Craven de Kere
Reply Wed 25 Jun, 2003 04:25 pm
Well, the sessions mod isn't supposed to let guests post, but that's not the purpose. It just removes the session ID for guests, and a side effect was that guests couldn't post.

I had a look at your forum and you are allowing guest replies (but not allowing guests to post a new thread).

To add a link in your header or footer, use simple HTML links and place them in the overall_header.tpl or overall_footer.tpl files.

gadgetaddict
Reply Thu 26 Jun, 2003 07:38 am
Stupid question #158: I'm not sure what an invisible link is.
I used this:
Site summary can be found by clicking <a href="http://www.gadgetaddict.com/forums/spyder_manage.php" target="_blank">here.</a>

Was this what you had in mind? Personally, I'm out of my mind :)