View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0000104||file||[All Projects] General||public||2019-09-10 21:04||2019-09-11 17:07|
|Priority||normal||Severity||minor||Reproducibility||have not tried|
|Target Version||Fixed in Version|
|Summary||0000104: pdf file incorrectly reported as `data`|
|Description||Some PDF files downloaded from the internet are incorrectly reported as `data` by file. Their reported MIME type is `application/octet-stream` instead of `application/pdf`. I have attached one such PDF to this report.|
|Tags||No tags attached.|
certificat_scolarité_l2_eco.pdf (1,184,843 bytes)
These are the first few lines of the file:
HTTP/1.1 200 OK
Date: Tue, 10 Sep 2019 08:38:20 GMT
Server: Apache/2.4.38 (Debian)
Content-Disposition: attachment; filename="21808995-2019-certificat-scolarite.pdf"
Cache-Control: no-cache, private
Here's where the pdf file starts:
Either the tool you used to download it put junk in front (here, an HTTP response header), or the original file already had it. Of course, some browsers ignore the junk and process it as a PDF file (because users want things to just work), but this is just crappy behavior. Most applications will not open it properly, and it is also a security issue, since you can masquerade files this way. It is also fragile: how many lines does it try to parse? 10? 1K of data? Who knows; it depends on the implementation. Of course, file could also be modified to mimic this behavior, at the cost of efficiency and of encouraging people to produce junk...
Oh, I didn’t know I could open pdf files with a text editor.
I don’t think you should ignore junk in front of a file. I just needed some way to get this file (and a few others) recognized as PDF files, but if I can just open them and remove the incorrect leading lines, I will do that.
Thank you for your answer.
As far as I’m concerned, you can consider this issue closed.
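For anyone landing here with the same problem, the cleanup discussed above could be scripted roughly as follows. This is only a sketch, not something file itself does: the names `strip_leading_junk` and `clean_file` are made up for illustration, and the 1024-byte scan window is an arbitrary assumption. Working on raw bytes avoids a text editor re-encoding the binary parts of the PDF.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"  # every PDF header starts with these bytes


def strip_leading_junk(data: bytes, max_scan: int = 1024) -> bytes:
    """Return data starting at the first %PDF- marker.

    Only markers that begin within the first max_scan bytes (plus offset
    max_scan itself) are accepted; max_scan is an arbitrary choice here.
    """
    offset = data.find(PDF_MAGIC, 0, max_scan + len(PDF_MAGIC))
    if offset < 0:
        raise ValueError(f"no %PDF- header within the first {max_scan} bytes")
    return data[offset:]


def clean_file(src: str, dst: str) -> None:
    """Hypothetical helper: write a copy of src without the leading junk."""
    Path(dst).write_bytes(strip_leading_junk(Path(src).read_bytes()))
```

Calling `clean_file("broken.pdf", "fixed.pdf")` (file names are placeholders) writes a copy that starts at the `%PDF-` header; running file on the copy is a quick way to check the result.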
|2019-09-10 21:04||Ilrandar||New Issue|
|2019-09-10 21:04||Ilrandar||File Added: certificat_scolarité_l2_eco.pdf|
|2019-09-11 14:39||christos||Assigned To||=> christos|
|2019-09-11 14:39||christos||Status||new => assigned|
|2019-09-11 14:42||christos||Status||assigned => feedback|
|2019-09-11 14:42||christos||Note Added: 0003288|
|2019-09-11 17:07||Ilrandar||Note Added: 0003295|
|2019-09-11 17:07||Ilrandar||Status||feedback => assigned|