Problem with search and multiple terms in metadata field


(Mahearnpad) #1

Matrix Version:
5.3.2.1

I have a search asset with a search field defined to search for terms in a metadata field. The logic is set to ALL WORDS.
The form contains a select list of terms. The terms used for this select list, and the metadata field, are from the same set of data (list of legislation).

My problem is that the metadata field can contain multiple terms (a policy instrument can be tied to multiple pieces of legislation).

From what I can tell, the search asset is seeing one long string, so individual terms are not getting matched in the word logic.

Is there a way to parse out the terms from the source metadata field before the search asset does its thing? I can’t see any way to do this.


(Mahearnpad) #2

Can someone from Squiz please tell me whether it is possible to do this or not?

To clarify, the user can select from multiple terms (checkboxes), and I need the search page to find those terms within a metadata field that may have multiple terms in it.

And no, Funnelback is not an option.
???


(Bart Banda) #3

You’ve got a couple of options for it, but depends on how your metadata select field is set up.

Can you give us some examples of some of the option values and keys that are used?

My suggestion would be to have a different key to the value, for example:

Squiz Matrix

That way, the search would be done on the value, and the reader-friendly value would just be used on the front end for display purposes.

Would something like that work?


(Mahearnpad) #4

Thanks for responding Bart

I’m not sure you’ve understood my problem.

Here is part of the multi-select metadata field:

Option Key | Option Value
0 | Aboriginal Land Rights Act 1983
1 | Aboriginal Land Rights Regulation 2014
2 | Administrative Decisions Tribunal Act 1997
3 | Administrative Decisions Review Regulation 2014
4 | Agricultural Industry Services Act 1998
5 | Agricultural Industry Services (Polls and Elections) Regulation 2005 currently called Agricultural Industry Services Regulation 2015

When the content author selects some of the values from this select dropdown, they get added to the “Legislation” metadata field on the page. They appear then as:

value 1; value 2; value 3; etc

Now if a user wants to search for documents with metadata values equal to “value 2” or “value 3”, I don’t believe the search page is finding these values as discrete values. It seems that the values in the fields are evaluated as one long string. I may be wrong - I haven’t gone to the extent of trying to find the file that actually does this, but the way the results are returned certainly bears this out.

Is this clear now, what I’m talking about?


(Bart Banda) #5

Thanks for that. And yes, the search treats the metadata value as one long string and doesn’t store them as separate values in the DB, so it actually just stores “value 1; value 2; value 3;” as a string.

So to combat that you would then need to ideally store it as:
value_1; value_2; value_3;

So that the search could then differentiate the values properly.
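As a rough illustration of that target shape (plain JavaScript, not Matrix keyword syntax; `toSearchableTokens` is a hypothetical helper name), each selected value becomes a single underscore-joined token, so a word-based match can tell one value from another:

```javascript
// Hypothetical helper, for illustration only: turn a stored multi-select
// string such as "value 1; value 2; value 3;" into one token per value.
function toSearchableTokens(fieldValue) {
  return fieldValue
    .split(';')                                 // one piece per selected value
    .map((term) => term.trim())                 // drop the padding around each piece
    .filter((term) => term.length > 0)          // ignore the empty piece after a trailing ";"
    .map((term) => term.replace(/\s+/g, '_'));  // spaces inside a value become underscores
}

console.log(toSearchableTokens('value 1; value 2; value 3;').join(' '));
// prints: value_1 value_2 value_3
```

This only shows what the stored text should end up looking like; how you get there inside Matrix (changing the option keys, or a second field with keyword modifiers) is a separate question.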

The other alternative is to have another metadata text field that uses the value of your first one by default, using the %metadata_field_<field_name>% format, see https://matrix.squiz.net/manuals/keyword-replacements/chapters/metadata

You could then use a keyword modifier on it to replace the spaces with underscores, something like:
%metadata_field_<field_name>^replace: :_%

Hope that helps.


(Mahearnpad) #6

Bart, what you suggest certainly makes sense. I’ll call you a bloody genius, even before trying these out. Hope I don’t have to take that back! :wink:

thanks for your help!


(Mahearnpad) #7

Hi Bart

I’ve tried to do what you suggested as far as using the keyword replacement
%asset_metadata_Legislation^replace: :_%
but I’m not sure I’m doing it the correct way.

I’ve created a new metadata field, and I’ve put this keyword replacement as the default value.

This is not working. Am I doing it the right way?


#8

A lot easier is to customise the search result page to return the results in JSON format, for example. You can then parse the results any way you want.

I use semicolon-separated data in a metadata field for our navigation. (Typical metadata string in the field: login to, or register for, My Account;login to, or join, My Account;sign-in to, or register for, My Account)

Example:
Search results page:
{"tokens":[%asset_metadata_i_want_to^preg_replace:208925^preg_replace:208928%],"url":"%asset_url%","display_string":"%asset_metadata_i_want_to^preg_replace:209174%"},

Regular expression to parse the results:
(regex) - (replace)
208925:

/([a-zA-Z0-9 .,'"():&*\-_\/<>]+)(\s|\n|)(;|)(\s|\n|)/ - "$1"

(regex) - (replace)
208928:
/""/ - ","

(regex) - (replace)
209174 (2 sets):
/(\s|\n|)( ; )(\s|\n|)([a-zA-Z0-9 ,.()&'"*\-_\/<>]+)/ - $1 /;$/
Remove the spaces between the brackets in “( ; )”. I had to add them to prevent it from being converted to a smiley face.
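For anyone who wants to try the first two replacements outside Matrix, here is a minimal JavaScript sketch of what 208925 and 208928 are meant to do to a semicolon-separated string. The function name `toJsonTokenList` is made up for illustration, and the `g` flag is added because JavaScript's `replace` only replaces the first match without it.

```javascript
// Sketch only: mimics the two regex replacements above on a plain string.
function toJsonTokenList(metadataValue) {
  // 208925: wrap every run of allowed characters in double quotes and
  // swallow the semicolon (and optional whitespace) that follows it.
  const quoted = metadataValue.replace(
    /([a-zA-Z0-9 .,'"():&*\-_\/<>]+)(\s|\n|)(;|)(\s|\n|)/g,
    '"$1"'
  );
  // 208928: adjacent quoted strings ("a""b") become a comma-separated list.
  return quoted.replace(/""/g, '","');
}

const list = toJsonTokenList('login to, or register for, My Account;login to, or join, My Account');
console.log(list);
// prints: "login to, or register for, My Account","login to, or join, My Account"
console.log(JSON.parse(`[${list}]`).length);
// prints: 2
```

Note that the character class also allows a double quote, so a metadata value that itself contains quotes would need extra handling before the output is valid JSON.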

Then create a Design that sets the content type to JSON by adding only:
<MySource_PRINT id_name="__global__" var="content_type" content_type="application/json; charset=utf-8" /> <MySource_area id_name="body" design_area="body" />

to the parse file, apply the design to the result page and you are done.

Then make an Ajax call to that result page and parse the results into a dynamic list.

You then have two options:
When the user types text into the search field, you call the JSON search; if they don’t select anything from the result list, you redirect them to the generic search page. (That’s what I do on our site.)

Alternatively, do a normal search but leave the result pages empty of Matrix keywords and parse the JSON instead (or whatever you want to do: mix Matrix default results with JSON, etc.).

You can see the whole thing at work on our site: www.tmbc.gov.uk (“I want to…”)

Sorry, one more thing: of course you then use JavaScript to search the returned JSON and adapt the regexes to your needs (e.g. to split results on a semicolon).
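A minimal sketch of that JavaScript step, under some assumptions: the result page prints one object per result followed by a comma (as in the example format above), with no surrounding brackets, so the code strips the last comma and wraps the body itself. `parseResults`, `findMatches` and the `/my-account` URL are hypothetical names for illustration.

```javascript
// Assumption: the body is a run of '{...},' objects without enclosing [ ].
function parseResults(body) {
  const trimmed = body.trim().replace(/,\s*$/, ''); // drop the trailing comma
  return JSON.parse(`[${trimmed}]`);
}

// Return the entries where any token contains the typed text (case-insensitive).
function findMatches(entries, typed) {
  const needle = typed.trim().toLowerCase();
  if (needle === '') return [];
  return entries.filter((entry) =>
    entry.tokens.some((token) => token.toLowerCase().includes(needle))
  );
}

const body =
  '{"tokens":["login to, or register for, My Account","sign-in to, or register for, My Account"],' +
  '"url":"/my-account","display_string":"login to, or register for, My Account"},';

const entries = parseResults(body);
console.log(findMatches(entries, 'Sign-in').map((entry) => entry.url));
// prints: [ '/my-account' ]
```

In a real page the `body` string would come from the Ajax call to the result page rather than being hard-coded.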

Another way could be to have an asset listing page with that metadata, split the data on the semicolon and search the contents of that page.


(Mahearnpad) #9

Thanks for your reply piivonen. This is the kind of solution I was looking for myself.

However, it seems that after doing all this, you might as well just use an asset listing and pull in all results, then use JavaScript to filter on the front-end, based on the user’s keywords. You end up having to reproduce the search anyway, on the front-end, on the returned JSON, as you point out in your penultimate paragraph.
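A minimal sketch of that front-end filter, assuming the asset listing has already been loaded as an array of objects with a `legislation` string in the "value 1; value 2" format. The item shape, the `name` values and the function names are all hypothetical.

```javascript
// Split a stored multi-select string into its discrete values.
function splitField(fieldValue) {
  return fieldValue
    .split(';')
    .map((term) => term.trim())
    .filter((term) => term.length > 0);
}

// Keep only the items tagged with every selected term (ALL logic).
function filterByAllTerms(items, selectedTerms) {
  return items.filter((item) => {
    const tagged = new Set(splitField(item.legislation));
    return selectedTerms.every((term) => tagged.has(term));
  });
}

// Hypothetical listing data, using option values from the select field.
const items = [
  { name: 'Policy A', legislation: 'Aboriginal Land Rights Act 1983; Agricultural Industry Services Act 1998' },
  { name: 'Policy B', legislation: 'Aboriginal Land Rights Regulation 2014' },
];

const hits = filterByAllTerms(items, [
  'Aboriginal Land Rights Act 1983',
  'Agricultural Industry Services Act 1998',
]);
console.log(hits.map((item) => item.name));
// prints: [ 'Policy A' ]
```

Because each value is compared whole, "Aboriginal Land Rights Act 1983" does not match an item that is only tagged with "Aboriginal Land Rights Regulation 2014", even though the two share most of their words.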

I’ll give this a go, but still not sure it’s the right way to do it.

Thanks again for your detailed response
cheers

Michael


#10

I agree, the dynamic search may not be best for your needs, hence the asset listing idea. I built the first method to turn search into a navigation tool with typeahead and Bloodhound, with results pulled from the metadata as described. If the user doesn’t select anything, they are sent to the regular search results.

Using the asset listing, where you split the content on the semicolon and search the result, is probably better suited to traditional search. If you have lots of entries throughout the site, use the advanced search features that allow the user to select root nodes etc. to make it more manageable.


(Bart Banda) #11

You need to do it this way:

So your keyword value would be:
%metadata_field_Legislation^replace: :_%


(Mahearnpad) #12

Bart, I already tried that keyword replacement and it didn’t work, which is why I started to try other ones, as in my post. I’ll try again, but the question is:
where do I put this keyword replacement? Is it in the metadata schema, as another field? If so, and the field needs to mirror the select field, then how to handle the multiple terms?

There are a number of problems here which seem to be unsolvable in this CMS. But I will continue to reserve judgement and keep trying.

The other problem, for me, is that I have a meeting tomorrow with the part of the business I’m building this for, and I need to give them some idea whether this function is possible.


(Bart Banda) #13

I’m interested to know what’s not actually working?

I have replicated your scenario a little bit but changed the keys over to match what you wanted. I did notice that I had to add another keyword replacement to remove the semi colons first.

See below for what I did, would this work for you?

Ideally, if you could, you’d just change the option keys from “value 1” to “value_1” so that you wouldn’t even need this extra metadata field.

I also noticed a limitation with the %metadata_field_<field_name>% keyword when used against select metadata fields: you can’t add _value or _key after it to bring out either one, as you can with %asset_metadata_<fieldname>_value%. I have reported that in our roadmap tool so it can be added.


(Mahearnpad) #14

Thanks for getting back to me Bart.

OK, so this is working OK for checkboxes, as per your example. I have a few of these, so this should work, although I haven’t yet tried.

However, the one I’ve been working with is a SELECT list, and you’ve unfortunately just confirmed my worst fears about not being able to specify the key and value separately. My key is a numeric value, so would have worked if there wasn’t this limitation.

Thanks


(Bart Banda) #15

Any chance you could just change your keys to what I mentioned above?
Instead of 0, 1, 2
Use key_0, key_1, key_2

?


(Mahearnpad) #16

Yes, of course. I’ll do that and let you know whether it works.

thanks


(Mahearnpad) #17

I know I’m getting close now, but still having this one problem (hope I can explain it succinctly enough):

Using your keyword replacement for the Legislation field value (this field is a SELECT list), produces this:
key_1 key_2

However, the values from the form are the text inside the <option> tags:

queries_legislation_query_posted=1&queries_legislation_query=Aboriginal+Land+Rights+Act+1983

I can’t get it to print the “value” attribute of the <option> tag, because the keyword replacement for the data recordset field (the field corresponding to the key in the metadata field, e.g. key_1, key_2 etc.) is not working for some reason.

The HTML I’m using to print the SELECT list for the search form, is:

<option value="%ds__field_0%">
    %ds__field_1%
</option>

As mentioned, %ds__field_0% does not get printed. I’ve tried with and without the quotes.

So really, I need to be able to compare the value of the select list with the values in the metadata field (which are now the keys of the reference list). This is not possible while the above keyword replacement is not getting printed.

EDIT: Since writing this, I’ve managed now to get the key (%ds__field_0%) to print inside the value attribute of the tag. So I’ll now work from there. Hopefully I can get this working correctly now.
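To show why the value attribute matters here, a small JavaScript sketch of the two query strings follows. It assumes the renamed keys follow the key_0, key_1 pattern suggested above, so that key_0 stands for the first option; `buildQuery` is a hypothetical helper, and `URLSearchParams` is used only to reproduce the form encoding.

```javascript
// Build the query string the search form would submit for one selected option.
function buildQuery(submittedValue) {
  const params = new URLSearchParams();
  params.append('queries_legislation_query_posted', '1');
  params.append('queries_legislation_query', submittedValue);
  return params.toString();
}

// Without a value attribute, the option's text is what gets submitted:
console.log(buildQuery('Aboriginal Land Rights Act 1983'));
// prints: queries_legislation_query_posted=1&queries_legislation_query=Aboriginal+Land+Rights+Act+1983

// With value="key_0", the key is submitted and can match the key stored in the metadata field:
console.log(buildQuery('key_0'));
// prints: queries_legislation_query_posted=1&queries_legislation_query=key_0
```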


(Mahearnpad) #18

Thank you so much for all your help with this, Bart!

I’ve got it working correctly now and very relieved!

Michael