[nycphp-talk] Tuning MySQL Full Text Search
Ben Sgro (ProjectSkyLine)
ben at projectskyline.com
Wed Aug 22 12:42:45 EDT 2007
Hello Rob,
I'm happy with the relevance, but the order isn't right, and too many
results are being returned (which is my own issue to fix).
> ( (3 * MATCH(title) AGAINST ('term')) + (1 * MATCH(body) AGAINST
> ('term')) )
That's really cool. I didn't realize you could do that, and that is
something I'd like to do.
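
A minimal sketch of how the weighted expression might be combined with the
existing boolean-mode filter. The helper name buildWeightedSearchSql() and the
table name 'content' are hypothetical, and the sketch assumes that
$dbObject->Safe() returns an escaped, already-quoted literal, as the original
query suggests. It also assumes separate FULLTEXT indexes exist on title and
on body, which the natural-language MATCH(title) and MATCH(body) calls need on
MyISAM.

```php
<?php
// Hypothetical helper: filter rows in boolean mode, but order them by a
// weighted natural-language score (title matches count more than body).
// $quotedTerm must already be escaped and quoted, e.g. by $dbObject->Safe().
function buildWeightedSearchSql($table, $quotedTerm, $titleWeight = 3, $bodyWeight = 1)
{
    // Weighted score, following Rob's (3 * title) + (1 * body) suggestion.
    $score = '((' . (int) $titleWeight . ' * MATCH(title) AGAINST (' . $quotedTerm . '))'
           . ' + (' . (int) $bodyWeight . ' * MATCH(body) AGAINST (' . $quotedTerm . ')))';

    return 'SELECT id, title, body, links_to, ' . $score . ' AS score'
         . ' FROM ' . $table
         . ' WHERE MATCH(title, body) AGAINST (' . $quotedTerm . ' IN BOOLEAN MODE)'
         . ' ORDER BY score DESC';
}

// Example: print the generated SQL for a single search term.
echo buildWeightedSearchSql('content', "'water'"), "\n";
```

The WHERE clause keeps the boolean-mode matching that gave better results,
while the ORDER BY uses the floating-point natural-language scores, so the
weights actually affect the ordering.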
Without BOOLEAN, the results were really off: very few results, and not
that accurate.
Once I added BOOLEAN, the results got a lot better.
There is also the problem where common words aren't returning anything,
such as a search for "water". It should, however, since the water keyword
is very frequent throughout the site.
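
One possible explanation, assuming the "water" search ran without IN BOOLEAN
MODE on a MyISAM table: natural-language full-text search treats a word that
appears in 50% or more of the rows as too common to match. The function below
is only an illustration of that rule, not MySQL's actual code, and its name is
hypothetical. The stopword list and the ft_min_word_len setting are other
things worth checking.

```php
<?php
// Sketch of the natural-language-mode 50% rule: a word found in half or
// more of the rows is treated as too common and matches nothing.
function isTooCommonForNaturalLanguage($rowsContainingWord, $totalRows)
{
    if ($totalRows <= 0) {
        return false; // empty table: nothing to compare against
    }
    return ($rowsContainingWord * 2) >= $totalRows;
}

var_dump(isTooCommonForNaturalLanguage(60, 100)); // bool(true)
var_dump(isTooCommonForNaturalLanguage(10, 100)); // bool(false)
```

Boolean mode does not apply this threshold, which would fit with the results
improving once BOOLEAN was added.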
- Ben
Ben Sgro, Chief Engineer
ProjectSkyLine - Defining New Horizons
+1 718.487.9368 (N.Y. Office)
Our company: www.projectskyline.com
Our products: www.project-contact.com
This e-mail is confidential information intended only for the use of the
individual to whom it is addressed.
----- Original Message -----
From: "Rob Marscher" <rmarscher at beaffinitive.com>
To: "NYPHP Talk" <talk at lists.nyphp.org>
Sent: Wednesday, August 22, 2007 12:31 PM
Subject: Re: [nycphp-talk] Tuning MySQL Full Text Search
> On Aug 22, 2007, at 10:50 AM, Ben Sgro (ProjectSkyLine) wrote:
>> I'd like to tune this to have different weights for words, because I'm
>> not happy with the search results.
>> $dbObject->DatabaseQuery('SELECT id, title, body, links_to,'
>>     . ' MATCH(title, body)'
>>     . ' AGAINST (' . $dbObject->Safe($searchStr)
>>     . ' IN BOOLEAN MODE)'
>>     . ' AS score FROM ' . DATABASE_TABLE_CONTENT
>>     . ' WHERE MATCH (title, body)'
>>     . ' AGAINST (' . $dbObject->Safe($searchStr)
>>     . ' IN BOOLEAN MODE)'
>>     . ' ORDER BY score DESC',
>>     constReturnArray, LOG_LEVEL_DEBUG);
>
> Hey Ben,
>
> Are you not happy with the relevance sorting? Or is it not returning
> rows that you think should match... or returning too many rows? What's
> an example of how you would weight the words?
>
> You should test this to see if it's true... but I've seen it mentioned
> that the score returned by "IN BOOLEAN MODE" is an integer with the
> number of terms matched. Without "IN BOOLEAN MODE", it gives a floating
> point number that "is computed based on the number of words in the row,
> the number of unique words in that row, the total number of words in the
> collection, and the number of documents (rows) that contain a particular
> word."
>
> Actually... do you need the "IN BOOLEAN MODE"? Without it, results are
> automatically sorted by relevance.
>
> Also... you could put a higher weight on title matches over body matches:
> ( (3 * MATCH(title) AGAINST ('term')) + (1 * MATCH(body) AGAINST
> ('term')) )
>
> -Rob