Posted Oct 30, 2014 by Paul Briscoe
By default, Magento uses MySQL full-text, LIKE, or combined search.
“Most full text search implementations use an “inverted index.” This is an index where the keys are individual terms and the associated values are sets of records that contain the term. Full text search is optimized to compute the intersection, union, etc. of these record sets, and usually provides a ranking algorithm to quantify how strongly a given record matches search keywords.
The SQL LIKE operator can be extremely inefficient. If you apply it to an un-indexed column, a full scan will be used to find matches (just like any query on an un-indexed field). If the column is indexed, matching can be performed against index keys, but with far less efficiency than most index lookups. In the worst case, the LIKE pattern will have leading wildcards that require every index key to be examined. In contrast, many information retrieval systems can enable support for leading wildcards by pre-compiling suffix trees in selected fields.”
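To make the inverted index idea from the quote concrete, here is a minimal sketch in Python. It is not how MySQL or Magento implement full-text search; the product names and function names are made up for illustration. It only shows terms used as keys, record sets as values, and a multi-word query answered by intersecting those sets.

```python
from collections import defaultdict


def build_inverted_index(records):
    """Map each lowercase term to the set of record ids that contain it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for term in text.lower().split():
            index[term].add(record_id)
    return index


def search_all_terms(index, query):
    """Return ids of records that contain every term in the query."""
    # .get() avoids creating empty entries in the defaultdict for unknown terms.
    term_sets = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*term_sets) if term_sets else set()


# Hypothetical product names, used only for illustration.
products = {
    1: "Teve wireless speaker",
    2: "Whatever you need gift card",
    3: "Teve speaker stand",
}

index = build_inverted_index(products)
hits = search_all_terms(index, "teve speaker")
```

Because the lookup works on whole terms, record 2 is not returned here even though "whatever" contains the letters "teve".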
What this means in practice is that a LIKE search for a term such as “Teve,” if that is the name of a product, will also return results containing the word “whatever.” You get the idea. Magento search is notoriously poor because of this, but there are ways to make the results a little more relevant if you know which attributes the system indexes for search and where it stores the indexed data. With this information, you can suggest better ways to set up your indexable data and get the most out of full-text search.
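The “Teve” versus “whatever” overlap is easy to reproduce. The sketch below uses an in-memory SQLite database as a stand-in for MySQL, so it only demonstrates the general behavior of a LIKE pattern with leading and trailing wildcards. The table name `search_demo` and the rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE search_demo (product_id INTEGER, data_index TEXT)")
conn.executemany(
    "INSERT INTO search_demo VALUES (?, ?)",
    [
        (1, "Teve wireless speaker"),
        (2, "Gift card for whatever you need"),
        (3, "Speaker stand"),
    ],
)

# A pattern wrapped in wildcards matches the term anywhere, even inside another word.
rows = conn.execute(
    "SELECT product_id FROM search_demo WHERE data_index LIKE ? ORDER BY product_id",
    ("%teve%",),
).fetchall()
like_hits = [row[0] for row in rows]
conn.close()
```

Product 2 is returned only because “whatever” happens to contain the search term, which is exactly the kind of irrelevant hit described above.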
In Catalog->Manage Attributes, you can set Use in Quick Search to Yes or No for each attribute to control whether that attribute's values are indexed for search.
Basic search settings are found in System->Configuration, under Catalog -> Catalog Search.
In the database, Magento uses three tables to store search information:
The most important of these tables (in my opinion) is catalogsearch_fulltext, because this is where Magento stores the indexed data from all attributes where Use in Quick Search is set to ‘Yes’.
Data index value sample:
This is the entry point for catalogsearch, specifically the indexAction, which is where the query object is built and the index is referenced.
$query = Mage::helper('catalogsearch')->getQuery();
If you want to find where the index events actually get built you would look at:
The System->Index Management actions pass through this class, where they are directed to their module's resource model to build and set the data.
In the case of the CatalogSearch Index Process, this happens in:
If most of your customers search by product name, then indexing descriptions, short descriptions, and other text fields means MySQL will likely return a result collection that is too large and full of matches on sentence fragments that are not relevant to the search.
An easy way to work around this is to make the product name the only indexable data for a product.
Obviously, this won’t work for all examples, but it is a way to make the Magento search more useable without the overhead of a service like Solr. Further, if you know the way this works, you can instruct product managers to name their products appropriately to help further improve search results. Keep in mind that for configurable products, Magento will index all of the simple products into the data_index as well.
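The effect of the name-only workaround can be sketched outside Magento. The Python below is an illustration under assumptions, not Magento's indexer: the `|` separator, the function names, and the sample products are all hypothetical. It joins the searchable attribute values into one string per product, roughly in the spirit of data_index, and runs a substring search against it.

```python
def build_data_index(product, searchable_attributes, separator="|"):
    """Join the values of the searchable attributes into one indexed string.

    The separator is an assumption made for this sketch.
    """
    values = [product[a] for a in searchable_attributes if product.get(a)]
    return separator.join(values)


def like_search(products, searchable_attributes, term):
    """Return ids whose joined index text contains the term as a substring."""
    term = term.lower()
    return [
        p["id"]
        for p in products
        if term in build_data_index(p, searchable_attributes).lower()
    ]


# Hypothetical catalog, used only for illustration.
products = [
    {
        "id": 1,
        "name": "Teve wireless speaker",
        "description": "Compact speaker with a long battery life.",
    },
    {
        "id": 2,
        "name": "Canvas tote bag",
        "description": "Carry whatever you need for the day.",
    },
]

all_text_hits = like_search(products, ["name", "description"], "teve")
name_only_hits = like_search(products, ["name"], "teve")
```

With descriptions indexed, the tote bag matches because of the word “whatever.” With only the name indexed, the search returns just the speaker.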