Delete_assets_by_id.php


(Douglas (@finnatic at @waikato)) #1

The description for delete_assets_by_id.php on https://matrix.squiz.net/manuals/server-administrator/chapters/squiz-matrix-script-index is:

Deletes assets under specified root ID; move to Trash.

Can anyone confirm that it doesn’t actually do that when a child asset of the root ID is linked in multiple locations?

:rage:


(David Schoen) #2

Hi Douglas,

It looks like at the moment that script is deleting the actual assets supplied to it directly, so if you have:

Root
|- A 
|  |- C
|
|- B
   |- C

and pass it “A”, expecting it to remove the children of “A” (which is what I would have expected from the manual’s description), it will end up moving a link of “A” to trash. If you instead ran it with the ID for “C” (the way the script seems to be intended to be used), one of the two links would be moved to trash, but there would be no way to tell which.

I’ve been looking through the history of this script and I believe it was kept somewhat by accident. It originally lived under scripts/dev (which got cleaned up a few years ago) and was added purely to clean up some assets when doing load testing. It was fit for that purpose, but it’s definitely not fit to be used generally.

I’ll see what I can do to get that sorted out, but in the meantime would you like to expand on the problem you were originally trying to solve and we’ll see what we can do to help?

Cheers,
Dave


(Douglas (@finnatic at @waikato)) #3

Thanks for the response David.

We have a situation where we have a large collection of assets (C) which are now redundant within Squiz Matrix, along with two collections of assets (A and B) left over from development work which weren’t tidied up.

Each collection has a parent asset (A & B are asset listings, C is a folder), and child assets of A have links through to assets in B, and child assets of B have links through to assets in C.

While most C assets have only a small number of links back to B assets, most B assets have links back to the majority of A assets. All this creates a significant number of webpaths on the B and C assets.

We also have A2 and B2 collections of assets, which have a similar link structure, with the B2 assets having links through to assets in C. We want to keep both A2 and B2.

Through a review with Chris Grist, we’ve identified that we want to reduce the number of weblinks and assets on our system and have taken actions that have seen the C assets made redundant.

Ideally, we want to delete the parent assets and all assets underneath them, and remove any links back to other parents.

In looking to delete them we’ve talked with Squiz NZ, who unfortunately didn’t have any experience with the script. We’re looking for a solution that’s smarter than trying to do manual deletes: there are ~5744 assets in the C collection and, in my experience with move to trash, the HIPO job would time out only partially completed.

Apologies for the length of the above explanation. If there are any solutions you’re aware of that we could use, that would be appreciated. At this point I’m considering an asset listing to identify all relevant IDs, which we’ll then feed to delete_assets_by_id via either a hideous monolithic command list or a script that loops through them all.


(Douglas (@finnatic at @waikato)) #4

So, this is what I’m going with

  • a simple asset listing which outputs asset id and asset name in a CSV format (asset name wrapped in quotes)
  • a simple shell script:
#!/bin/sh
#--------------------------------------------------------------------------------------
# Name: data.sh - Delete All The Assets
# Author: Douglas
# Date: August 2017
# Description: Iterate through a file of assets, running delete_assets_by_id against each id
# Input: csv / text file containing the target asset ids
#-------------------------------------------------------------------------------------
filename=assets.txt
if [ -n "$1" ]; then
   filename=$1
fi
# Plain echo: "echo -e" is not portable under /bin/sh
echo "Starting DATA (Delete All The Assets)"
# Split each line on the first comma: asset id, then asset name
while IFS="," read -r assetid assetname; do
   echo "running delete_assets_by_id against $assetid - $assetname"
   sudo -u openresty php /var/www/uow-prod/scripts/delete_assets_by_id.php "$assetid"
done < "$filename"
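
Since the asset listing output is fed straight into a delete, it may be worth filtering the file first. The following is only a rough sketch: valid_ids is a hypothetical helper name (not part of Squiz Matrix), and the sample rows are made up. It keeps rows whose first field is purely numeric and skips headers or blank lines.

```shell
#!/bin/sh
# Sketch: print only the asset ids from a CSV of the form  assetid,"asset name"
# valid_ids is a hypothetical helper, not something shipped with Squiz Matrix.
valid_ids() {
    # $1: path to the CSV file
    while IFS="," read -r assetid _rest; do
        case "$assetid" in
            ''|*[!0-9]*) echo "skipping: $assetid" >&2 ;;  # empty or non-numeric
            *)           echo "$assetid" ;;
        esac
    done < "$1"
}

# Demo against a throwaway file with invented rows
tmp=$(mktemp)
printf '%s\n' '1234,"Course 1"' 'Asset ID,"Name"' '5678,"Course, with comma"' '' > "$tmp"
ids=$(valid_ids "$tmp")
rm -f "$tmp"
printf '%s\n' "$ids"
```

Running the filter on its own first gives a dry run of exactly which ids would be passed to the delete script.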

(David Schoen) #5

Sorry, I got stuck about here. I’m not clear where A2 and B2 have come from. It sounds like you’ve got a solution though…

I just want to make sure you’re clear that any asset that has multiple parents will have its link to one parent removed and it’ll be unpredictable which parent that is.

The JS_API moveLink call might be more useful here as you can at least specify the parentid the link has to be moved out from under: https://matrix.squiz.net/manuals/web-services/chapters/javascript-api#moveLink

If you’re already working with Chris and we (Squiz AU) have any kind of access I’d be happy to take a more direct look too - but I won’t take any action on a client instance from a forum post, it’d have to come through via a ticket.

Edit: typos, mostly.


(David Schoen) #6

I just wanted to clarify a bit more linking theory.

There is only ever one link between a given parent and child (internally “majorid” and “minorid”). Take this tree:

Root
|- A 
|  |- C
|     |- D
|
|- B
   |- C
      |- D

Here A, B and D each have only one parent (despite D showing up in the tree twice), but C has two parents.

If D had any children (grandchildren, etc.), they’d also show up in the tree twice even if they only had a single link under D.
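
To make the counting concrete, here is a rough sketch that writes the example tree as parent/child pairs and counts parents per child. It is only an illustrative model: the pairs are typed in by hand from the tree above, not read from a Matrix database.

```shell
#!/bin/sh
# Illustrative model only: the example tree as "majorid minorid" (parent child)
# links. Note there is a single "C D" link even though D appears twice in the tree.
links="Root A
Root B
A C
B C
C D"

# Count how many parent links each child has
parent_counts=$(printf '%s\n' "$links" \
    | awk '{ n[$2]++ } END { for (c in n) print c, n[c] }' \
    | sort)
printf '%s\n' "$parent_counts"
```

Only C comes out with a count of 2, which is why it is the one asset where removing "a link" is ambiguous.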


(Douglas (@finnatic at @waikato)) #7

A and B were prototype setups for A2 and B2, which were abandoned but never cleared up. A2 and B2 are the live asset collections we have producing rather important content for our site.

Is there a reason why there isn’t a ‘delete an asset and all links to it’ function somewhere (JS API, shell script, etc.)?

From the linking screen inside _admin, you can delete all links… :confused:

I’d love to raise a support ticket directly with Squiz Australia for this, but I’m not sure that our contract with Squiz New Zealand allows for that. I will make enquiries.

In terms of linking - it looks more like this:

|-Qualifications
| |-Qualification 1
| | |-Subject 1
| | | |- Course 1
| | | |- Course 2
| | | |- Course 3...
| | |-Subject 2
| | | |- Course 3
| | | |- Course 4...
| | |-Subject 3

We want to delete all the course assets, then delete all the Subjects and Qualifications from the prototype assets, which have been redundant for some time now.


(Douglas (@finnatic at @waikato)) #8

I’ve created a Squizmap idea - Refine delete_assets_by_id.php :

What:
Refine delete_assets_by_id.php so that it operates as per its stated description in the Squiz manuals, with additional options for:

  • removing all links to the asset except the link to trash
  • automatically purging the asset from trash after deletion of all (non-trash) links

Why:
  1. A script should be available that allows Squiz Matrix users to script deletion of assets where a manual deletion would require excessive manual effort.
  2. The script should operate as the manuals state.

(Douglas (@finnatic at @waikato)) #9

Which would include a link to trash as per your squiz map idea and as I discovered…

I’ve refined the script I’m using a little. For each asset it now loops, calling delete_assets_by_id followed by purge_trash, so it should delete each link one by one as it works through the asset list.

#!/bin/sh
#--------------------------------------------------------------------------------------
# Name: data.sh - Delete All The Assets
# Author: Douglas Davey
# Date: August 2017
#
# Description: Iterate through a file of assets, and for each id:
#  1) run delete_assets_by_id
#  2) run purge_trash
#  3) repeat until the delete_assets_by_id output contains "unable"
#
# Input: csv / text file containing the target asset ids
#-------------------------------------------------------------------------------------
filename=assets.txt
if [ -n "$1" ]; then
   filename=$1
fi

# Plain echo: "echo -e" is not portable under /bin/sh
echo "Starting DATA (Delete All The Assets)"
# Split each line on the first comma: asset id, then asset name
while IFS="," read -r assetid assetname; do
    echo "running delete_assets_by_id against $assetid - $assetname"
    OUTPUT=""
    # Each pass trashes one link and purges it; stop once the output contains "unable"
    while ! echo "$OUTPUT" | grep -q unable; do
        OUTPUT=$(sudo -u openresty php /var/www/uow-prod/scripts/delete_assets_by_id.php "$assetid")
        sudo -u openresty php /var/www/uow-prod/scripts/purge_trash.php /var/www/uow-prod/
        echo "$OUTPUT"
    done
done < "$filename"
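
One caveat with the inner loop: if the output never contains "unable" (for example because the delete fails in some other way), it will spin forever. Below is a minimal sketch of a bounded version. delete_link is a made-up stand-in for the real delete_assets_by_id.php call, its messages are invented, and MAX_ATTEMPTS is an assumed limit.

```shell
#!/bin/sh
# Sketch: cap the number of passes so the loop cannot run forever.
# delete_link is a stand-in for the real PHP call; it pretends the asset has
# two links and then prints a message containing "unable" (invented wording).
MAX_ATTEMPTS=10            # assumed limit, adjust as needed
state=$(mktemp)            # file-backed counter, survives the $(...) subshell
echo 2 > "$state"

delete_link() {
    remaining=$(cat "$state")
    if [ "$remaining" -gt 0 ]; then
        echo $((remaining - 1)) > "$state"
        echo "moved a link of asset $1 to trash"
    else
        echo "unable to find asset $1"
    fi
}

attempts=0
output=""
while [ "$attempts" -lt "$MAX_ATTEMPTS" ]; do
    attempts=$((attempts + 1))
    output=$(delete_link 1234)
    echo "$output"
    case "$output" in
        *unable*) break ;;   # same stop condition as the grep in the script above
    esac
done
rm -f "$state"
echo "stopped after $attempts attempts"
```

In the real script the delete_link call would be replaced by the sudo/php line, with the purge_trash call kept inside the loop.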