Performance of sq_vw_ast_perm in 3.20 on postgres (with roles)


(Peter Sheppard) #1

The roles saga continues!


I've just upgraded our intranet to 3.20.0 and on testing it was running awfully, with basic permissions lookups taking seconds.



I tried putting in our usual changes to asset_manager.inc that make the roles table joins in there far more efficent; whilst this went from everything locked up to pages taking 20 seconds, it's obviously far too long.



I have now isolated the issue to the sq_vw_ast_perm view taking a stupidly long time to do anything useful. This view has been totally re-written since our previous version (3.18.5), and I think it changed in 3.18.7? I'm not sure of the reason for the change, or if the old and new views are supposed to produce any different results - perhaps somebody can give the technical background to that?



Anyway, when I looked at the view a little deeper, I noticed some duplication of effort. The first CASE statement is checking to see if r.assetid exists, and using that if it does, and p.assetid if it doesn't. However, if r.assetid exists, it will ALWAYS be the same as p.assetid, due to the ON clause of the LEFT JOIN further on in the query. Replacing the first of the CASE statements with just p.assetid produced exactly the same results in all my test cases, so I changed my system to reflect this, and it brought page load times back to where they were in 3.18.5



As usual, comments/feedback/further analysis welcome!


(Chris Smith) #2

Hi Peter,

[quote]The roles saga continues!



I've just upgraded our intranet to 3.20.0 and on testing it was running awfully, with basic permissions lookups taking seconds.



I tried putting in our usual changes to asset_manager.inc that make the roles table joins in there far more efficent; whilst this went from everything locked up to pages taking 20 seconds, it's obviously far too long.[/quote]



Can we see a diff containing your changes?


[quote]I have now isolated the issue to the sq_vw_ast_perm view taking a stupidly long time to do anything useful. This view has been totally re-written since our previous version (3.18.5), and I think it changed in 3.18.7? I'm not sure of the reason for the change, or if the old and new views are supposed to produce any different results - perhaps somebody can give the technical background to that?[/quote]



They give the same results, but it was turned into a simple left join.


[quote]Anyway, when I looked at the view a little deeper, I noticed some duplication of effort. The first CASE statement is checking to see if r.assetid exists, and using that if it does, and p.assetid if it doesn't. However, if r.assetid exists, it will ALWAYS be the same as p.assetid, due to the ON clause of the LEFT JOIN further on in the query. Replacing the first of the CASE statements with just p.assetid produced exactly the same results in all my test cases, so I changed my system to reflect this, and it brought page load times back to where they were in 3.18.5[/quote]



Good pickup, I will try that here and run it through our unit test system to make sure there are no adverse effects. Though I'm not sure how that will change the load time because it's still doing the left join etc.



Can you post an 'EXPLAIN ANALYZE' of both the old & new views (using the same assetid's, userid's etc) ? If you don't want them public, you can send me a private message with the details.


(Chris Smith) #3

[quote]Good pickup, I will try that here and run it through our unit test system to make sure there are no adverse effects. Though I’m not sure how that will change the load time because it’s still doing the left join etc.


Can you post an ‘EXPLAIN ANALYZE’ of both the old & new views (using the same assetid’s, userid’s etc) ? If you don’t want them public, you can send me a private message with the details.[/quote]



For anyone interested in the technical details, I posted the question to the postgresql-performance list. You can see the thread here:



http://archives.postgresql.org/pgsql-performance/2009-01/msg00137.php



The CASE statement (even though it’s in the select) is changing the way the join is put together - which turns it from a left join which uses an index, into a join that can’t use an index (the CASE is being evaluated per row because it involves columns in the left join).



I’ll run this change through the unit tests tonight and see if any problems show up.