Problems with PDF files not working


(Kequi) #1

Matrix Version: 5.1.2.1

I have a pdf with 79 pages. It’s only 3.4MB but lots of text. 150dpi images with high jpeg compression.
Working in the admin interface - it appears to load fine as a pdf file asset - but it refuses to display on the front end.
It looks like it’s uploading fine - uploading completes - I can see the right filename and size etc in the asset tree and details screen etc.
But if I try to preview or go to the web path of the file - it just hangs with a blank page and never loads.

If I redo the file with only 30 pages or less - it works fine.
If I redo the file with 40 pages or more - it doesn’t work.

What else I have tried so far:

Tried a 30 page file with less compression ( 300 dpi high jpeg - 11 MB) - and this worked OK.

Tried a 40 page file with more compression (1.9 MB) - and that worked OK.

Tried a full 79 page pdf on lowest compression (72dpi low jpeg images - 2.6 MB) - and that worked OK.

Tried a full 79 page pdf on medium compression (100dpi high jpeg - 2.9 MB) - and that failed.

Tried turning OFF search indexing while I uploaded the file - no difference.

Tried different browsers, different pdf creation software, different pdf settings.

It appears to be some combination of number of pages + image compression - and if I go over that it just doesn’t work.

Not working example: http://www.qualityincare.org/__data/assets/pdf_file/0006/1176387/QiCD_CS16_DIGITAL_ALL_medium_FILE.pdf

Working example:
http://www.qualityincare.org/__data/assets/pdf_file/0005/1176386/QiCD_CS16_DIGITAL_ALL_SMALL_FILE.pdf

Anyone got any ideas?
Help appreciated as always.

Thanks

Karl


(Peter McLeod) #2

Hi
The not working link you gave seems fine when I view it.
Might be a caching thing with you viewing it? Have you tried recaching, or testing the issue in another browser, etc.?
Peter


(Kequi) #3

Thanks Peter.

Tried in 3 different browsers with no success.
Then tried /_nocache - and that worked!
Then tried /_recache - and that didn’t work.
Then tried /_nocache again - and now that doesn’t work either.

Right now - nothing working. Tried clearing all browser cache and history - still no joy.

My colleague on a PC seems to be able to view the file OK.
Other colleague on a Mac can see the pdf in Safari - but not Chrome.
Other colleague on a Mac can’t see the file in Chrome or Firefox.

Worked fine on my iPhone.

I’m at a loss at the moment.

Thanks Peter

Karl


(Nic Hubbard) #4

Both those links are working for me. Did you change something?


(Kequi) #5

Nope - haven’t changed those at all - they still don’t work for me.

The one below worked for a while - but as soon as I cleared browser cache it was gone - wouldn’t load any more.

http://www.qualityincare.org/__data/assets/pdf_file/0003/1176348/QiCD_CS_DIGITAL_2016.pdf

It’s like it would make a local cache of it the first time it loaded in Chrome - then use this all the time - then as soon as I clear the cache it refuses to download it again. And it won’t download in either of the other browsers.

I just get “connecting…” or “Waiting for…” messages.

Smaller pdf files seem to be fine - 100% reliable - and pdf files on other sites seem to be fine.

I’ve even uploaded the files to other systems like pdfarchive - and they work 100% on there, so I know it’s nothing to do with the pdf file itself.

EG: https://www.pdf-archive.com/2016/11/11/qicd-cs16-digital-l/qicd-cs16-digital-l.pdf


(Marcus Fong) #6

The interesting thing about Matrix URLs starting with __data (i.e. file assets with Unrestricted Access enabled) is that they’re served directly as plain files by your webserver software (usually Apache or Nginx). It’s a performance optimisation, to save wasted database queries checking permissions for a file which you’ve deliberately made public (hence the name “Unrestricted Access”).

As such, there really isn’t anything in Matrix that can go wrong when requesting __data URLs, because the Matrix PHP code doesn’t even execute when you request them! Your webserver just reads the file and sends it. So if there’s an issue with serving __data URLs, it’s generally at a lower level than Matrix - the webserver software, the server, the network or some combination of those.

Have you tried testing it from home or some other different network? Does your office network have some kind of proxy that your browsers have to go through? Or are there any errors in your webserver’s logs?
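One way to take the browser and its PDF plugin out of the picture is to fetch the file with a small script, from the office network and again from another network. The following is only a rough sketch using Python's standard library; the function name `probe` and the 30-second timeout are illustrative assumptions, not anything provided by Matrix.

```python
import time
import urllib.request


def probe(url, timeout=30.0, chunk_size=64 * 1024):
    """Download url outside the browser and report what actually arrived.

    Hypothetical helper for diagnosis only. A stalled transfer raises an
    exception once `timeout` seconds pass without data, instead of
    hanging on a blank page.
    """
    started = time.monotonic()
    received = 0
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # Size the server says it will send (may be absent).
        declared = resp.headers.get("Content-Length")
        status = getattr(resp, "status", None)
        # Read in chunks so a large PDF is never held fully in memory.
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            received += len(chunk)
    elapsed = time.monotonic() - started
    declared_bytes = int(declared) if declared is not None else None
    return {
        "status": status,
        "declared_bytes": declared_bytes,
        "received_bytes": received,
        # None means the server gave no size to compare against.
        "complete": None if declared_bytes is None else declared_bytes == received,
        "seconds": round(elapsed, 2),
    }
```

Calling `probe(...)` with one of the failing `__data` URLs from each network and comparing the results should show whether the hang follows the network or the browser. If the script also times out on the office network, that points at the webserver or network path rather than at Chrome or Firefox.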

I’ve tried your examples and they all work fine for me (MacOS X, Chrome and Firefox).


(Hugh Williams) #7

I am also able to access all of the example PDF links (MacOS Sierra, Chrome, Firefox and Safari)


(Kequi) #8

Thanks Marcus, Nic, Hugh

Very puzzling. I can’t find anyone outside of our office who has the same issue (yet)…

I’ve published some of the links on the website now - so I guess I’ll soon see if I get a flood of complaints!

Our IT guys can’t see anything odd about the internet setup - not using any proxies or routing the data differently to the Macs, for example.
When I can get it to work - after the first time it downloads - it’s then super fast - obviously loading from a local cache. When it breaks it just can’t seem to get the file from the webserver - just hanging at connection.
It’s the same whether it’s a __data url or a friendly url - makes no difference.

I’ll keep treating it as a local issue for the moment.

Thanks for all the help and advice.

Karl


(Bruce Kyle) #9

Hi - we have a similar issue, seems to be just a Chrome issue (or more specifically the PDF plugin that Chrome uses). So far have managed to reproduce it both internally and externally. One solution that does work is using https instead of http. I would recommend trying that to see if you can successfully download your test PDF. Then as a stop-gap measure - go to settings on the page that hosts the PDF and force secure connection.
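To try the http versus https comparison on a specific file, it helps to build both forms of the same link so that nothing else differs between the two requests. A minimal sketch, assuming the site answers on the default ports for both schemes; `scheme_variants` is a made-up helper name:

```python
from urllib.parse import urlsplit


def scheme_variants(url):
    """Return the (http, https) forms of url.

    Hypothetical helper: only the scheme changes; host, path and query
    are left as they are. Assumes no explicit port in the URL.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URL, got {url!r}")
    return (
        parts._replace(scheme="http").geturl(),
        parts._replace(scheme="https").geturl(),
    )
```

Opening both returned links in Chrome with a cleared cache shows whether only the plain http request stalls. Whether a given site actually serves the file over https is something to confirm first.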

Currently looking at switching to https for the entire site (there are some Google SEO benefits to going down this path too).

Cheers
Bruce