Excluding assets from search

We have a bunch (i.e. hundreds) of PDF assets that seem to come up in search results no matter what we search for.


We'd like to be able to exclude them from search. If we change their weighting to "0", they still appear in search results - and it would seem that it's because they have metadata applied to them. The metadata is important though, as it's used in numerous asset listings.



I've heard whispers of some kind of "exclude from search" setting that might be coming in upcoming releases, but I'm really only interested in a solution that will work now in 3.6.



Any ideas?

Katen,


Not sure if this will help you, but we had the same problem with our searches, so we removed PDFs and Word documents from our searches altogether.



From what I understand, when Word and PDF files are uploaded, two external tools (pdftohtml and Antiword) convert the contents of these files to HTML or plain text. That output is then indexed for use by the search engine.



If, for example, you have a 180-page PDF, it contains a LOT of keywords, which will always push large PDF and Word docs to the top of your search results.
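
To illustrate why a long document tends to win, here is a rough sketch in Python. It is not how Matrix actually scores results; it assumes a naive ranking that just counts how often the query words occur in the indexed text. The function names and sample texts are made up for the example.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, text):
    """Toy relevance score: total occurrences of the query words in the text."""
    counts = Counter(tokenize(text))
    return sum(counts[term] for term in tokenize(query))


# A short page that is actually about the topic.
short_page = "Contact us about parking permits."

# A stand-in for a 180-page PDF: the same sentence repeated once per page.
long_pdf = " ".join(
    ["Annual report: parking, permits, budgets, staff, contact details."] * 180
)

documents = {"short_page": short_page, "long_pdf": long_pdf}

# Rank the documents by score, highest first.
results = sorted(
    documents.items(),
    key=lambda item: score("parking permits", item[1]),
    reverse=True,
)

for name, text in results:
    print(name, score("parking permits", text))
```

Under this simplified scoring the long PDF comes first purely because of its size, which matches what we were seeing.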



Under System Configuration > External Tools set the following to No:



Enable pdftohtml

Enable Antiword



If you do this, the contents of PDF and Word docs won't be indexed for searching. However, you can still use the metadata on your assets to search for these files.
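
As a rough picture of the effect, the Python sketch below builds a tiny word index with and without file contents. This is only a toy model, not Matrix's indexing code: `build_index`, `search`, the `index_file_contents` flag and the sample assets are all hypothetical names invented for the example.

```python
def build_index(assets, index_file_contents):
    """Map each word to the set of asset names it appears in.

    Metadata is always indexed. File contents are indexed only when
    index_file_contents is True (standing in for the external tools
    being enabled).
    """
    index = {}
    for asset in assets:
        text = asset["metadata"]
        if index_file_contents:
            text += " " + asset["contents"]
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(asset["name"])
    return index


def search(index, word):
    """Return the names of assets matching a single word, sorted."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical file assets with metadata and extracted contents.
assets = [
    {
        "name": "report.pdf",
        "metadata": "annual report finance",
        "contents": "parking permits budget",
    },
    {
        "name": "minutes.doc",
        "metadata": "council minutes",
        "contents": "parking fees discussed",
    },
]

with_contents = build_index(assets, index_file_contents=True)
metadata_only = build_index(assets, index_file_contents=False)

print(search(with_contents, "parking"))
print(search(metadata_only, "parking"))
print(search(metadata_only, "minutes"))
```

With contents indexed, a word that only appears inside the files matches both of them. With contents left out, the same word matches nothing, while a word from the metadata still finds the file.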



One more thing you may need to do is re-index your site from the Search Manager.



Hope this helps!



Ben