View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000554 | file | General | public | 2024-08-25 06:48 | 2024-08-25 13:30 |
Reporter | Slush9 | Assigned To | |||
Priority | normal | Severity | minor | Reproducibility | always |
Status | new | Resolution | open | ||
Platform | Linux | OS | Debian | OS Version | 12 |
Product Version | 5.44 | ||||
Summary | 0000554: .eml file identified as text/html (should be message/rfc822) when "Subject:" header is first | ||||
Description | Hi,

I have a .eml file generated by Apple Mail which seems to fully conform to RFC 5322 but is being identified by file-5.44 as text/html:

$ file --mime-type Re_\ We\'ve\ made\ changes\ to\ your\ bill.eml
Re_ We've made changes to your bill.eml: text/html
$ file -v
file-5.44
magic file from /etc/magic:/usr/share/misc/magic

I've determined that the false identification is caused by the fact that the first header in the file is a "Subject:" header:

$ head -1 Re_\ We\'ve\ made\ changes\ to\ your\ bill.eml
Subject: Re: We've made changes to your bill

If I move the "Date:" header to the top and run the command again, I get the expected identification:

$ head -1 Re_\ We\'ve\ made\ changes\ to\ your\ bill.eml
Date: Mon, 18 Dec 2023 23:34:30 +1100
$ file --mime-type Re_\ We\'ve\ made\ changes\ to\ your\ bill.eml
Re_ We've made changes to your bill.eml: message/rfc822

False identifications also occur when other headers are placed first; they are listed under Additional Information. | ||||
Additional Information | Out of all the headers in this particular email, only "Date:" and "From:" cause `file` to correctly identify the file as message/rfc822, and only when one of them is the first header in the file.

Any of the other headers in this email:

- Cc:
- Content-Type:
- In-Reply-To:
- Message-Id:
- Mime-Version:
- References:
- Subject:
- To:
- X-Apple-Base-Url:
- X-Apple-Mail-Remote-Attachments:
- X-Apple-Mail-Signature:
- X-Apple-Windows-Friendly:
- X-Uniform-Type-Identifier:
- X-Universally-Unique-Identifier:

if placed first, causes the file to be misidentified as either text/html or text/plain.

I appreciate that it wouldn't be possible for the libmagic maintainers to predict every possible first header in a .eml file, but "To:", "Cc:", and "Subject:" are all highly prevalent and should perhaps be checked. | ||||
Tags | No tags attached. | ||||
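
The header-by-header check described under Additional Information could be automated roughly as follows. This is only a sketch for reproducing the report, not part of `file` or libmagic: the helper names (`split_headers`, `with_header_first`, `detect_mime`) are made up for illustration, and the script assumes the `file` binary is on `PATH` and that rewriting the message with LF line endings does not change the result.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def split_headers(raw: str):
    """Split a message into (header blocks, rest).

    Each block is one header plus any folded continuation lines;
    `rest` is the blank separator line followed by the body.
    """
    head, sep, body = raw.partition("\n\n")
    blocks = []
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and blocks:
            blocks[-1] += "\n" + line  # folded continuation of the previous header
        else:
            blocks.append(line)
    return blocks, sep + body


def with_header_first(raw: str, name: str) -> str:
    """Return the message with header `name` moved to the top; other headers keep their order."""
    blocks, rest = split_headers(raw)
    prefix = name.lower() + ":"
    for i, block in enumerate(blocks):
        if block.lower().startswith(prefix):
            reordered = [block] + blocks[:i] + blocks[i + 1:]
            return "\n".join(reordered) + rest
    raise KeyError(name)


def detect_mime(raw: str) -> str:
    """Write the message to a temporary .eml file and ask `file` for its MIME type."""
    with tempfile.NamedTemporaryFile("w", suffix=".eml", delete=False) as tmp:
        tmp.write(raw)
    try:
        result = subprocess.run(
            ["file", "--brief", "--mime-type", tmp.name],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    finally:
        Path(tmp.name).unlink()


if __name__ == "__main__":
    # Usage (hypothetical script name): python3 first_header_check.py message.eml
    raw = Path(sys.argv[1]).read_text(errors="replace")
    blocks, _ = split_headers(raw)
    for block in blocks:
        name = block.split(":", 1)[0]
        print(f"{name}: first -> {detect_mime(with_header_first(raw, name))}")
```

Run against the affected .eml file, this prints one line per header showing what `file --mime-type` reports when that header is moved to the top, which makes it easy to recheck the list above against other versions of `file`.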
Date Modified | Username | Field | Change |
---|---|---|---|
2024-08-25 06:48 | Slush9 | New Issue |