View Issue Details

ID: 0000180
Project: file
Category: [All Projects] General
View Status: public
Last Update: 2021-04-27 19:35
Reporter: EuphCat
Assigned To: christos
Priority: normal
Severity: trivial
Reproducibility: always
Status: resolved
Resolution: fixed
Product Version: 5.39
Target Version:
Fixed in Version: 5.40
Summary: 0000180: A file filled with 0xFF gets reported to be ISO-8859
Description:
* I don't know how the parser works, or how file types are managed. I'm okay with NOTABUG.

A file filled with 0xFF gets reported to be ISO-8859. I find this misleading.
Steps To Reproduce:
$ dd if=/dev/zero ibs=1k count=1 | tr "\000" "\377" > 0xFFfile.bin
1+0 records in
2+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000256375 s, 4.0 MB/s
$ xxd ./0xFFfile.bin | head
00000000: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000010: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000020: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000030: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000040: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000050: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000060: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000070: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000080: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000090: ffff ffff ffff ffff ffff ffff ffff ffff ................
$ file 0xFFfile.bin
0xFFfile.bin: ISO-8859 text, with very long lines, with no line terminators
Additional Information:
Because of this issue, mkfs.ext4 raises a false positive when it asks for confirmation before overwriting existing content:
"/dev/sda2 contains `ISO-8859 text, with very long lines, with no line terminators`
Proceed anyway? (y,N)"
Tags: No tags attached.

Activities

christos (manager)   2020-08-15 12:06   ~0003447

Fixed by requiring at least 3 distinct character values.

christos (manager)   2021-04-27 19:35   ~0003602

Reverted the fix. Breaks other tests. A file can have the same character repeated many times and that should not change how file detects it. Heuristics just add confusion to the behavior.
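
For illustration, the heuristic from note ~0003447 and the objection in note ~0003602 can be approximated from the shell. This is only a rough sketch and not file's actual implementation: the helper name distinct_bytes and the sample file repeated.txt are made up here, and the threshold of 3 is simply taken from the note above.

```shell
# Hypothetical helper (not part of file): print how many distinct
# byte values occur in the file named by $1.
distinct_bytes() {
    od -An -v -tu1 "$1" | tr -s ' ' '\n' | grep -v '^$' | sort -u | wc -l | tr -d ' '
}

# Sample inputs: 1 KiB of 0xFF, as in the report, and a one-line
# text file that repeats a single letter (name chosen arbitrarily).
dd if=/dev/zero bs=1k count=1 2>/dev/null | LC_ALL=C tr '\000' '\377' > 0xFFfile.bin
printf 'aaaaaaaaaaaaaaaa\n' > repeated.txt

# Apply the assumed "at least 3 distinct values" rule to each sample.
for f in 0xFFfile.bin repeated.txt; do
    n=$(distinct_bytes "$f")
    if [ "$n" -ge 3 ]; then
        echo "$f: $n distinct byte values, threshold met"
    else
        echo "$f: $n distinct byte values, threshold not met"
    fi
done
```

The 0xFF file contains 1 distinct value and repeated.txt contains 2 (the letter and the newline), so a 3-value threshold treats both the same way, even though the second is ordinary text. That is the kind of side effect the revert note describes.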

Issue History

Date Modified     Username  Field                Change
2020-08-15 05:30  EuphCat   New Issue
2020-08-15 12:06  christos  Assigned To          => christos
2020-08-15 12:06  christos  Status               new => assigned
2020-08-15 12:06  christos  Status               assigned => resolved
2020-08-15 12:06  christos  Resolution           open => fixed
2020-08-15 12:06  christos  Fixed in Version     => 5.40
2020-08-15 12:06  christos  Note Added: 0003447
2021-04-27 19:35  christos  Note Added: 0003602