Context - https://marketplace.squiz.net/extensions/asset-version-history
Firstly, thanks to Squiz for publishing the new versioning extension on the Marketplace so we can take it for a test toast before getting it installed in production. It makes all the difference in easing adoption. I’ve given it a whirl on a test instance and cobbled together some notes and requests, most of which I’m sure the devs have already thought of.
No pagination
Presumably just because this is a bit MVP, there’s no way to get history records past the first page.
Serialised attributes not stored
Serialised attributes (like questions on form assets) don’t appear to be captured in the history JSON.
Suggestion - API field formats
The API behind the Version History screen would be useful to consume programmatically at times, but some of the response fields are already formatted for display (version_date and status_icon).
/__avh/versionList/49
"versions" : [
{
"version_date" : "Feb 20th 2020 - 02:42am",
"status_icon" : "<span title=\"Live\" class=\"sq-status-square sq-status-16\"></span>",
"user_name" : "Root User",
"version" : "0.2.1"
},
It would be handy for the date to be ISO 8601 or similar, and for the status to be sent as the status code rather than HTML.
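In the meantime, here is a rough sketch of how a client could normalise the two display-formatted fields from the sample response above. The function names are my own, and it assumes the date always follows the "Feb 20th 2020 - 02:42am" pattern with English month names. Reading the number in the `sq-status-16` class as the status code is a guess on my part, not something the extension documents.

```python
import re
from datetime import datetime
from html.parser import HTMLParser


def parse_version_date(display: str) -> str:
    """Convert e.g. 'Feb 20th 2020 - 02:42am' to ISO 8601 (no timezone info)."""
    # Drop the ordinal suffix (st, nd, rd, th) that follows the day number.
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)", r"\1", display)
    # %b and %p are locale-dependent; this assumes an English locale.
    return datetime.strptime(cleaned, "%b %d %Y - %I:%M%p").isoformat()


class _FirstTagAttrs(HTMLParser):
    """Collects the attributes of the first start tag in a fragment."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if self.attrs is None:
            self.attrs = dict(attrs)


def parse_status(status_icon: str):
    """Return (status name, status code or None) from the status_icon HTML."""
    parser = _FirstTagAttrs()
    parser.feed(status_icon)
    attrs = parser.attrs or {}
    # Assumption: the trailing number in 'sq-status-NN' is the status code.
    match = re.search(r"sq-status-(\d+)", attrs.get("class") or "")
    code = int(match.group(1)) if match else None
    return attrs.get("title"), code


print(parse_version_date("Feb 20th 2020 - 02:42am"))
print(parse_status('<span title="Live" class="sq-status-square sq-status-16"></span>'))
```

This works, but it is brittle: any change to the display format or the icon markup would break it, which is why raw values in the API would be preferable.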
Request - Exempt asset types
It would be very useful to be able to (either by default, or opt-in) exempt form submissions from the version history system. Form submission contents generally don’t change after initial creation, and can be very sensitive in nature. You don’t want to have to sanitise the version history to remove things like that, and I suspect that for most use cases submissions would be out of scope for version concerns (because they’re not content).
Request - add URLs to version_data
Currently the web paths for an asset are stored with each version, but not the full URLs. Technically this information should be findable from the links (with effort), but it would be useful to store an extra copy of this info with each version_data.
Content Template compatibility
Currently when you update a page it stores a copy of each container’s “contents” against that page, but it doesn’t store the MD fields for any CCT which is attached to those containers. I think that the page should either store neither (because that information is stored in the version history of the container assets) or it should have the MD fields as well to get a complete picture.
Storage requirements and performance?
My main concern is with how much data is going to end up in this table and what that means for performance and the recommended frequency of truncation.
Consider a page with a big container (say 100kB of HTML) and a small container, going through these hypothetical changes:
- Change to Safe Edit
- Edit Small Container
- Apply for Approval
- Approve & Make Live
After that process, which didn’t change the big container at all, sq_ast_vers_history will contain another 7 copies of the big container’s content. By comparison, rollback and sq_internal_msg would not have created any new copies. For large pages (and we have some blobs of HTML approaching 1MB) this might be significant.
I don’t know for sure this is a big problem, but as users we would always want to keep as much history as possible in this table. If it’s significantly over-storing that may impact how much we can safely keep.
This could be partially mitigated by not duplicating container detail on page assets. Instead, version_data could store the container IDs, and the Version History screen could query for the version_data stored against those containers with the same correlation_id. That approach would bring the above example from 7 down to 3 (just the status changes) and would also fix the content template compatibility (above).
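To put rough numbers on that, here is a back-of-envelope sketch using the copy counts from the example above (7 now, 3 with the suggested change). It only counts the unchanged big container's content and ignores row overhead, compression and everything else in version_data, so treat the output as indicative only.

```python
def extra_storage_kb(container_kb: float, copies: int) -> float:
    """Extra data written for one unchanged container across one edit cycle."""
    return container_kb * copies


COPIES_CURRENT = 7   # copies observed in the Safe Edit -> Live example above
COPIES_PROPOSED = 3  # copies if only the status changes stored the container

# 100 kB is the example container; 1000 kB approximates our ~1MB HTML blobs.
for size_kb in (100, 1000):
    current = extra_storage_kb(size_kb, COPIES_CURRENT)
    proposed = extra_storage_kb(size_kb, COPIES_PROPOSED)
    print(f"{size_kb} kB container: {current:.0f} kB now, {proposed:.0f} kB proposed")
```

So a single approval cycle on a page with a ~1MB container would write roughly 7MB of unchanged content today, against roughly 3MB under the suggestion.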
Suggestion - correlation_id added to sq_internal_msg?
I was thinking that since sq_ast_vers_history doesn’t capture the detail of “what changed?” we will want to keep using the existing logs for some purposes (probably with attributes.fulllog.scalar blacklisted). It would be helpful for using the two in concert if sq_internal_msg rows had the same correlation_id as sq_ast_vers_history. We’d also need a way to query sq_ast_vers_history by correlation_id to make use of it.
Request - Access the history of assets no longer in the system
One of the issues with the existing logs is there is no way to access information for assets that have been deleted from the system. It would be very useful to have a method of accessing this data for arbitrary AssetIDs. Obviously there are permission issues, so maybe this would need to be System Administrators only.
Contexts compatibility
Version history doesn’t seem to play nicely with Contexts (although it’s pretty fine with Variations). I don’t use contexts so it’s not pressing for me, just something I noticed.