View Issue Details

ID: 0000131
Project: file
Category: General
View Status: public
Last Update: 2020-01-17 21:29
Reporter: david_keeffe
Assigned To: christos
Priority: normal
Severity: minor
Reproducibility: always
Status: assigned
Resolution: open
Platform: RHEL 7
OS: Linux
OS Version: 3.10.0
Summary: 0000131: The magic file in File 5.11 reports some HTML as C++
Description: If an HTML file has the word 'class' at the start of a line, file reports it as C++.
This happens often with HTML output by Microsoft Word, which line-wraps the text between <...>, so an attribute such as class can end up at the start of a line.

The cause seems to be that the distributed magic file raises the strength of the C++ entries, that the C++ entries are tested before the HTML ones, or both.
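
One way to check this hypothesis, as a rough sketch: the file(1) man page documents `--list`, which prints the magic patterns together with their strength, and `-k` (`--keep-going`), which reports every match instead of only the first. Whether File 5.11 supports both options in the same form is an assumption, and `filter_entries` is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# filter_entries is a hypothetical helper: it keeps only the lines of its
# input that mention C++ or HTML (case-insensitive).
filter_entries() {
    grep -i -E 'c\+\+|html'
}

if command -v file >/dev/null 2>&1; then
    # List the magic patterns with their strength and keep the C++ and
    # HTML ones. stderr is merged in because some releases may print the
    # list there (assumption).
    file --list 2>&1 | filter_entries

    # If a file name was passed as the first argument, show every match
    # for it rather than only the first one.
    if [ -n "$1" ]; then
        file -k "$1"
    fi
else
    echo "file(1) is not installed" >&2
fi
```

Comparing the strength values printed for the C++ and HTML entries on the affected system with those from a newer release would show whether the ordering differs.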
Steps To Reproduce: On RHEL 7:

file /path/to/file
Tags: No tags attached.

Activities

christos

2020-01-17 17:26

manager   ~0003343

Can you please share an example?

david_keeffe

2020-01-17 21:29

reporter   ~0003346

I think the problem has been fixed in a newer version: testing on a Mac gives the expected results. I think this bug needs raising with Red Hat, since they choose to ship a much older version of 'file'.

Idril:~ david$ cat small.html
<html>
<body>
<div class='xxx'>
    

Hello World


</div>
</body>
</html>
Idril:~ david$ file small.html
small.html: HTML document text, ASCII text
Idril:~ david$ cat small-c.html
<html>
<body>
<div
class='xxx'>
    

Hello World


</div>
</body>
</html>
Idril:~ david$ file small-c.html
small-c.html: HTML document text, ASCII text
Idril:~ david$ file -v
file-5.33
magic file from /usr/share/file/magic
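
To repeat the same check on another system, such as the RHEL 7 host from the report, the session above can be condensed into a script. This is a sketch only: the file names and contents follow the transcript (with the blank lines dropped for brevity), the `workdir` variable is introduced here for illustration, and no output is shown because the result depends on the installed version of file and its magic database.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing files are overwritten.
workdir=$(mktemp -d)

# Control case: 'class' is on the same line as '<div'.
cat > "$workdir/small.html" <<'EOF'
<html>
<body>
<div class='xxx'>
Hello World
</div>
</body>
</html>
EOF

# Trigger case: the tag is wrapped so that 'class' starts a line.
cat > "$workdir/small-c.html" <<'EOF'
<html>
<body>
<div
class='xxx'>
Hello World
</div>
</body>
</html>
EOF

if command -v file >/dev/null 2>&1; then
    # Print the version and magic file location, then classify both files.
    file -v
    file "$workdir/small.html" "$workdir/small-c.html"
else
    echo "file(1) is not installed" >&2
fi
```

If the two files are classified differently on one system and identically on another, the `file -v` lines give the versions to quote when raising the bug with the distribution.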

Issue History

Date Modified     Username      Field                Change
2020-01-14 03:35  david_keeffe  New Issue
2020-01-17 17:25  christos      Assigned To          => christos
2020-01-17 17:25  christos      Status               new => assigned
2020-01-17 17:26  christos      Status               assigned => feedback
2020-01-17 17:26  christos      Note Added: 0003343
2020-01-17 21:29  david_keeffe  Note Added: 0003346
2020-01-17 21:29  david_keeffe  Status               feedback => assigned