[nycphp-talk] Search Engine Indexable PHP Sites

ophir prusak prutwo at onebox.com
Mon Sep 9 16:19:51 EDT 2002


I gave a presentation on this subject at a PHP meeting a couple of months
ago.
My notes are at http://www.consolemonkey.com/nyphp/

---- "Lynn, Michael " <MLynn at exchange.ml.com> wrote:
> Greetings,
> 
> I've developed a LAMP-based online catalog for a jewelry company and
> for the most part the site is great.  The problem is that nobody can
> tell it works so well, due in large part to the fact that I
> can't get the site indexed by the search engines.  I believe this is
> because it is a dynamic site and each URL on the generated pages
> refers to a sub-page using
> pagename.php?variable1=value&variable2=value
> 
> I've read up on a few articles from evolt.org and searchtools regarding
> "clean" urls and I have embarked on a redevelopment of the site to
> use urls like
> 
> http://www.camelotbridal.com/new/index/rings/2 (keep in mind - I'm
> currently working on this so don't be shocked by the debug output and
> random errors)
> 
> Instead of 
> 
> http://www.camelotbridal.com/index.php?p=rings&c=3 
> 
> My question is this: 
> 
> Are there any tools that mimic the search engines' indexing behavior
> so that I can gauge the effectiveness of my redevelopment?

Not really.
Each search engine uses its own rules for what to index and how.

> 
> I've tried weblech and it's quite nice - builds a directory tree with
> the results of a scan of my new layout.  But is weblech doing /just/
> what the spiders are?  There should be a tool that mimics
> each search engine spider (google, altavista, etc)... If not, then
> I'm wondering if someone knows how to get hold of the searching behaviour
> of the most popular engines.  For example: Does the spider
> stop when it finds a URL with a "?" in it (second example above)?

For the most part, question marks in a URL are a sure way of reducing
your chances of getting into a search engine.

Your best bet is to read up on sites like searchenginewatch.com to understand
how the engines work.

While the "path info" solution (what you're trying to do) is much better
than the previous URLs, I personally prefer using mod_rewrite.

That way you can create a URL like http://www.camelotbridal.com/rings_3_something.html
and have Apache rewrite it internally to http://www.camelotbridal.com/index.php?p=rings&c=3
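
As a rough sketch of how such a rule might look, assuming mod_rewrite is
enabled and the rule lives in an .htaccess file in the document root. The
pattern itself (lowercase letters for the category, digits for the id, and a
free-form trailing word that is ignored) is an assumption for illustration,
not something taken from the real site:

```apache
# Turn on the rewrite engine for this directory.
RewriteEngine On

# Hypothetical pattern: <category>_<id>_<anything>.html
#   rings_3_something.html  ->  index.php?p=rings&c=3
# The rewrite is internal, so the visitor (or spider) only ever sees
# the .html URL. [L] stops further rule processing once this matches.
RewriteRule ^([a-z]+)_([0-9]+)_[^/]*\.html$ index.php?p=$1&c=$2 [L]
```

The third part of the file name is thrown away by the rule, so it can hold
any descriptive word without index.php needing to change.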


Either way, I think you should add .html as a file suffix (even if your
script ignores it), so the URL looks like a static page.
Something like http://www.camelotbridal.com/new/index/rings/2/x.html
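
For the path info variant, a minimal sketch of how a script could ignore a
dummy trailing file name. It assumes the server passes "/rings/2/x.html" to
the script in $_SERVER['PATH_INFO']; the function name parse_catalog_path and
the mapping of the first two segments to p and c are hypothetical:

```php
<?php
// Hypothetical helper: turn a PATH_INFO string such as "/rings/2/x.html"
// into the two values the old query string carried (p and c).
function parse_catalog_path($pathInfo)
{
    // Split on "/" and drop the empty segments produced by
    // leading or trailing slashes.
    $segments = array_values(array_filter(explode('/', $pathInfo), 'strlen'));

    // Discard a trailing dummy file name like "x.html". It is only there
    // so the URL looks like a static page.
    if ($segments && preg_match('/\.html$/', end($segments))) {
        array_pop($segments);
    }

    return array(
        'p' => isset($segments[0]) ? $segments[0] : null,
        'c' => isset($segments[1]) ? $segments[1] : null,
    );
}

// PATH_INFO is not set when the script is requested without extra path.
$params = parse_catalog_path(
    isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : ''
);
```

With this, /rings/2 and /rings/2/x.html yield the same values, so the suffix
can be added to links without touching the rest of the script.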

> 
> aTdHvAaNnKcSe,
> Mike

----
Ophir Prusak
Internet developer 
prutwo at onebox.com | http://www.prusak.com/ 

