pdf.js relies on urls to contain the 'pdf' extension #4284

octosquid · 2014-02-12T17:33:32Z

When the server doesn't provide the Content-Disposition header, pdf.js relies on URLs to contain the 'pdf' extension. But URLs are locators, not names.
Steps to reproduce:

mv web/compressed.tracemonkey-pldi-09.pdf web/compressed.tracemonkey-pldi-09
sed -i 's/compressed.tracemonkey-pldi-09.pdf/compressed.tracemonkey-pldi-09/g' web/viewer.js
firefox web/viewer.html

Now click on download. You will be offered a 'document.pdf' file. The name should be something more meaningful.
The bug also happens when I leave out the pdf extension on my Apache web server.

Proposed solution:
Use the title of the PDF (as this viewer.js code does). Firefox also uses the title for the File -> "Save Page As" functionality when displaying an HTML page like http://en.wikipedia.org/wiki/Internet_media_type .

Not every HTML web page ends in .html. A document's type is specified by its MIME type, not by its extension.
However, most PDF files have the pdf extension, and most PDFs online also have a good-to-store name in the URL.
I don't know whether the new retrieval method should override the URL-based retrieval or be a fallback to it.
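
For illustration, the fallback variant could look roughly like the sketch below. `chooseDownloadName` and its two parameters are hypothetical names, not existing pdf.js API; the sketch assumes the URL-derived name and the PDF title have already been obtained elsewhere.

```javascript
// Hypothetical helper, not part of pdf.js: pick a download name.
// `urlName` is whatever name was derived from the URL (may be empty),
// `title` is the title stored in the PDF metadata (may be empty).
function chooseDownloadName(urlName, title) {
  // Keep the URL-derived name when it already ends in ".pdf".
  if (/\.pdf$/i.test(urlName || "")) {
    return urlName;
  }
  // Otherwise fall back to the title, replacing characters that are
  // unsafe in file names on common platforms and collapsing whitespace.
  const cleaned = (title || "")
    .replace(/[\\\/:*?"<>|\x00-\x1f]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
  if (cleaned) {
    return /\.pdf$/i.test(cleaned) ? cleaned : cleaned + ".pdf";
  }
  // Last resort: the generic name the viewer offers today.
  return "document.pdf";
}
```

Swapping the first two checks would give the "override" variant instead, where the title wins whenever one is present.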

See also #3455.


Mercieral · 2016-08-22T13:24:19Z

@timvandermeij

Any update on this? It has been open for more than two years now. My file param is a server call that sends a PDF file back, so the viewer cannot detect the name of the file (it seems to be looking for a .pdf extension). I'm stuck with "document.pdf" when downloading and "untitled.pdf" in the window bar when viewing.

It would be handy if we could specify a "title" in the URI in addition to the "file", such as .../pdf-viewer/viewer.html?file="..."&title="..."

timvandermeij · 2016-08-23T12:27:05Z

I know that currently work is being done in #7554 to support the Content-Disposition header, which is a way to solve this issue. I do agree, however, that document.pdf is not the best possible name and we might need to improve the function for getting the (file)name. Patches for this are welcome, so I'm labeling this as a good beginner bug as it should not be too hard to implement.

Mercieral · 2016-08-23T12:33:00Z

@timvandermeij Excellent, thank you. I believe supporting Content-Disposition would actually fix my issue.

I agree; as I was going through the code I noticed it should not be too difficult to just add another URL param for the filename. I'll give it a try in the next few days. Thanks.

Snuffleupagus · 2016-08-23T13:03:23Z

> Patches for this are welcome, so I'm labeling this as a good beginner bug as it should not be too hard to implement.

@timvandermeij Please remember that in PR #4956 we purposely moved away from letting various hash parameters affect the viewer (unless debugging is enabled, see https://github.com/mozilla/pdf.js/wiki/Debugging-pdf.js).
Hence I do not think that we should make it possible to specify the title using a hash parameter!

Especially considering that it would be non-standard (in the context of http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_open_parameters.pdf), and compared to the Content-Disposition approach in PR #7554, it really doesn't add much value.

timvandermeij · 2016-08-23T13:06:12Z

Sorry, I should have been more clear. I meant that patches are welcome for improving the function that determines the file name from the URL. I think we can do better there instead of only relying on the file extension. I agree that we should not add more hash parameters.
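
As a rough sketch of what "not only relying on the file extension" could mean: take the last path segment of the URL even when it has no extension, and append ".pdf" if needed. `nameFromUrl` is a hypothetical name used for illustration, not the actual pdf.js function, and the placeholder base URL is an assumption used only to resolve relative URLs.

```javascript
// Hypothetical helper, not part of pdf.js: derive a download name from a
// URL without requiring a ".pdf" extension in it.
function nameFromUrl(url, fallback = "document.pdf") {
  let pathname;
  try {
    // The base is only a placeholder so that relative URLs can be parsed.
    pathname = new URL(url, "http://localhost/").pathname;
  } catch (e) {
    return fallback; // not parseable as a URL at all
  }
  // The query string and fragment are already excluded from `pathname`.
  const segments = pathname.split("/").filter(Boolean);
  let name = segments.length ? segments[segments.length - 1] : "";
  try {
    name = decodeURIComponent(name);
  } catch (e) {
    // Keep the raw segment if it is not valid percent-encoding.
  }
  if (!name) {
    return fallback;
  }
  return /\.pdf$/i.test(name) ? name : name + ".pdf";
}
```

With the reproduction steps from the original report, `nameFromUrl("web/compressed.tracemonkey-pldi-09")` returns `"compressed.tracemonkey-pldi-09.pdf"` rather than the generic name. A server endpoint such as a download script would still yield the script's name, so this complements the Content-Disposition approach rather than replacing it.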

anirudhrb · 2017-03-01T16:35:29Z

What's the status of this issue? Is this still open?

yurydelendik · 2017-03-01T16:52:04Z

> What's the status of this issue? Is this still open?

@anirudhrb still open; there was an attempt to implement that in #7554. Would you like to contribute to that?

anirudhrb · 2017-03-01T17:26:54Z

@yurydelendik Yes, I would like to contribute. What is expected in a PR for this issue?

yurydelendik · 2017-03-01T17:31:16Z

@anirudhrb, you may just take the above PR as a base since it has remoting of data somewhat right -- we would expect a small patch with unit tests. We don't need spec-complete Content-Disposition parsing, just enough to get the filename.
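
To make "enough to get the filename" concrete, here is a minimal sketch. It is not the code from #7554 or from any later fix; `filenameFromContentDisposition` is a hypothetical name, and the sketch only handles the plain `filename=` form and the UTF-8 `filename*=` form.

```javascript
// Hypothetical helper, not part of pdf.js: pull a file name out of a
// Content-Disposition header value. Deliberately not a full RFC 6266 parser.
function filenameFromContentDisposition(header) {
  if (!header) {
    return null;
  }
  // Prefer the extended form: filename*=UTF-8''percent-encoded-name
  const ext = /(?:^|;)\s*filename\*\s*=\s*UTF-8''([^;\s]+)/i.exec(header);
  if (ext) {
    try {
      return decodeURIComponent(ext[1]);
    } catch (e) {
      // Malformed percent-encoding: fall through to the plain form.
    }
  }
  // Plain form: filename="quoted name" or filename=token
  const plain =
    /(?:^|;)\s*filename\s*=\s*(?:"((?:[^"\\]|\\.)*)"|([^;\s]+))/i.exec(header);
  if (plain) {
    if (plain[1] !== undefined) {
      return plain[1].replace(/\\(.)/g, "$1"); // undo backslash escapes
    }
    return plain[2];
  }
  return null; // no filename parameter present
}
```

A unit test for a patch along these lines would likely cover at least a quoted name, an unquoted name, the `filename*` form, and a header without any filename parameter.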

anirudhrb · 2017-03-04T11:41:44Z

@yurydelendik I have started working on this. This is my first attempt at contributing to an open-source project. I'll need some time to get comfortable with the codebase. :)

himanish-star · 2018-01-02T10:28:31Z

@yurydelendik , @timvandermeij Could I take this issue up if it's okay with you all?

timvandermeij · 2018-01-02T21:38:00Z

There is a pull request above which looks like the right direction, but there has been no more activity for it. If you're interested in fixing up that one, that sounds good. I'll ask if the original author is still planning to work on it.

timvandermeij · 2018-01-18T22:30:30Z

Fixed in #9379.

timvandermeij added 1-viewer and removed 1-viewer labels Feb 12, 2014

jrubins mentioned this issue Feb 16, 2015

When downloading a document, the file name should be set to the object name tsgrp/HPI#529

Open

11 tasks

timvandermeij added the good-beginner-bug label Aug 23, 2016

jsperezg mentioned this issue Jul 23, 2017

Parse content disposition header #8688

Closed

timvandermeij closed this as completed Jan 18, 2018
