Is there any way to decode HTML entities within keywords?
I’ve looked through the documentation at https://matrix.squiz.net/manuals/keyword-replacements/chapters/keyword-modifiers and it doesn’t look like it – you’ve got htmlentities to encode but no way of decoding. As a workaround, I’ve considered using a Regular Expression but it’s going to be cumbersome to manage so many entities, even if I don’t try and regex them all.
In PHP, you’d just be calling html_entity_decode() and I’d imagine adding the implementation as a new modifier would be straightforward.
Thanks for the suggestion. Unfortunately, since unescapehtml uses htmlspecialchars_decode() in its implementation, it only decodes the entities for &, ", ', < and > (docs).
In case anyone’s curious, you can fake it with a Regular Expression asset. You just need to know that the regex format needs some cajoling in the case of ; (semicolon) characters in the original content. What appeared to be a string of – needs to be matched against the string &ndash\;, which means the regex /&ndash\\;/ (noting the escape of the backslash) is needed.
In my case, I want my metadata description field to feature neither the special characters nor their HTML entities, so I used the following:
This would capture entities using either name or number format (e.g. for an en dash: &ndash; OR &#8211; OR &#x2013;), but would also remove language characters such as diacritical marks, which may not be ideal depending on your specific requirements.
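For anyone following along, here is a rough JavaScript sketch of the kind of pattern described above. It is not the regex from the post (that isn't reproduced here), and the names ENTITY_PATTERN and stripEntities are made up for illustration. It only assumes entities appear in named, decimal or hex form.

```javascript
// Hypothetical helper, not the poster's actual regex.
// Matches named (&ndash;), decimal (&#8211;) and hex (&#x2013;) entities.
const ENTITY_PATTERN = /&(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);/g;

// Removes every entity outright; nothing is decoded.
function stripEntities(text) {
  return text.replace(ENTITY_PATTERN, "");
}

console.log(stripEntities("2010&ndash;2020")); // prints "20102020"
console.log(stripEntities("caf&eacute;"));     // prints "caf" (the accented letter is lost)
```

The second call shows the drawback mentioned above: any letter stored as an entity disappears along with the punctuation.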
Thanks again, but the aim isn’t to get rid of any part of the content. In my current case, that’d get rid of quotation marks, dashes and foreign characters, affecting the readability and actual content. Also, for what it’s worth, asking editors to not use special characters is untenable – the content could contain words or names with foreign characters.
I see this didn’t go anywhere. Strange to have encode but not decode.
I was hoping not to have to use client-side JavaScript to decode a long string into actual HTML tags.
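If it does come down to JavaScript, something along these lines might do as a stopgap. This is only a sketch: decodeEntities and NAMED_ENTITIES are hypothetical names, and the lookup table is deliberately tiny, so any named entity not listed is left as-is. Numeric entities (decimal and hex) are handled generically.

```javascript
// Hypothetical, deliberately small lookup table; extend it for the
// named entities your content actually uses.
const NAMED_ENTITIES = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'",
  nbsp: "\u00A0", ndash: "\u2013", mdash: "\u2014",
  lsquo: "\u2018", rsquo: "\u2019", ldquo: "\u201C", rdquo: "\u201D",
  eacute: "\u00E9",
};

function decodeEntities(text) {
  // Single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
  return text.replace(
    /&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);/g,
    (match, body) => {
      if (body[0] === "#") {
        const isHex = body[1] === "x" || body[1] === "X";
        const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
        // Leave out-of-range references untouched rather than throwing.
        return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
      }
      // Unknown names are returned unchanged.
      return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
        ? NAMED_ENTITIES[body]
        : match;
    }
  );
}
```

For example, decodeEntities("&lt;p&gt;caf&eacute;&lt;/p&gt;") gives back the string "<p>café</p>", which can then be inserted into the page as markup.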