View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
456 [file] General minor always 2023-06-05 16:52 2023-06-05 16:52
Reporter: Albrecht Platform: x86_64  
Assigned To: OS: Debian  
Priority: normal OS Version: Bookworm  
Status: new Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: LRZip: wrong MIME type
Description: For the attached LRZip sample, the human-readable message is correct:

$ file sample.lrz
sample.lrz: LRZIP compressed data - version 0.6

…but the MIME type is not – expected is “application/x-lrzip”:

$ file --mime-type sample.lrz
sample.lrz: application/octet-stream
Tags: patch
Steps To Reproduce: See above.
Additional Information: I think the effect is caused by a bad order of statements in the file magic/Magdir/compress, at least the attached patch seems to fix the issue. No idea if it has any adverse effects, though.
Attached Files: lrzip-mime.patch (527 bytes) 2023-06-05 16:52
https://bugs.astron.com/file_download.php?file_id=343&type=bug
sample.lrz (81 bytes) 2023-06-05 16:52
https://bugs.astron.com/file_download.php?file_id=342&type=bug
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
455 [file] General text N/A 2023-06-02 02:13 2023-06-02 02:13
Reporter: raf Platform: All  
Assigned To: OS: All  
Priority: low OS Version: All  
Status: new Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Incomplete documentation for magic_getpath in libmagic.man
Description: The libmagic manual entry only mentions magic_getpath() in relation to its return value. It doesn't include it in the list of available functions, and it doesn't explain what it does. Below, in the additional information section, is a patch to fix that. Technically, the existing statement that magic_getpath() returns NULL on error isn't true. It returns the system default path whenever there is an error, but it's probably harmless to leave that.
Tags: patch
Steps To Reproduce: Run "man libmagic" then search for "magic_getpath".
Additional Information: diff --git a/doc/libmagic.man b/doc/libmagic.man
index f006c828..ec5cca0c 100644
--- a/doc/libmagic.man
+++ b/doc/libmagic.man
@@ -84,6 +84,8 @@
 .Fn magic_setparam "magic_t cookie" "int param" "const void *value"
 .Ft int
 .Fn magic_version "void"
+.Ft const char *
+.Fn magic_getpath "const char *magicfile" "int action"
 .Sh DESCRIPTION
 These functions
 operate on the magic database file
@@ -347,6 +349,18 @@ from
 .In magic.h .
 This can be used by client programs to verify that the version they compile
 against is the same as the version that they run against.
+.Pp
+The
+.Fn magic_getpath
+command returns the colon separated list of magic database locations. If the
+.Fa filename
+is non-NULL, then it is returned. Otherwise, if the
+.Dv MAGIC
+environment variable is defined, then it is returned.
+Otherwise, if
+.Fa action
+is 0 (meaning "file load"), then any user-specific magic database file is included.
+Otherwise, only the system default magic database path is included.
 .Sh RETURN VALUES
 The function
 .Fn magic_open
Attached Files:
There are no notes attached to this issue.
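
The lookup order described in the proposed man page text can be modelled as below. This is only a sketch of the documented behaviour, not a call into libmagic: the two path constants are hypothetical stand-ins (the real defaults are fixed when libmagic is built), and the model always includes the user-specific file rather than checking whether it exists.

```python
import os

# Hypothetical stand-ins; the real paths are chosen when libmagic is built.
SYSTEM_DEFAULT = "/usr/share/misc/magic"
USER_MAGIC = "~/.magic"

FILE_LOAD = 0  # the action value the patch text calls "file load"


def getpath_model(magicfile, action, env=None):
    """Model the lookup order from the proposed magic_getpath() text."""
    env = os.environ if env is None else env
    if magicfile is not None:
        # An explicit file name is returned unchanged.
        return magicfile
    if "MAGIC" in env:
        # Otherwise the MAGIC environment variable wins.
        return env["MAGIC"]
    if action == FILE_LOAD:
        # For "file load", the user-specific database is included.
        return f"{USER_MAGIC}:{SYSTEM_DEFAULT}"
    # Any other action only gets the system default path.
    return SYSTEM_DEFAULT


print(getpath_model(None, FILE_LOAD, env={}))
```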


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
452 [file] General minor always 2023-05-22 10:13 2023-06-01 16:01
Reporter: lukem Platform:  
Assigned To: christos OS: NetBSD  
Priority: normal OS Version: 10.99.4  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: mdoc man pages misidentified as C source
Description: I noticed that file 5.43 (as nbfile tool in NetBSD-current) misidentifies various mdoc man pages as C source.

If I use file 5.41 on macOS 13.4 on the same source tree, it misidentifies a smaller subset.


Tags:
Steps To Reproduce: cd src/lib/libc
find . -name '*.[1-9]' | xargs path/to/file | grep -v 'troff or' | sort
./gen/ctype.3: C source, ASCII text
./gen/getdevmajor.3: C source, ASCII text
./gen/randomid.3: C source, ASCII text
./locale/nl_langinfo.3: C source, ASCII text
./net/getifaddrs.3: C source, ASCII text
./sys/dup.2: C source, ASCII text
./sys/mremap.2: C source, ASCII text
./sys/readlink.2: C source, ASCII text
./sys/select.2: C source, ASCII text
./sys/undelete.2: C source, ASCII text
Additional Information: I discovered this whilst debugging NetBSD/src/share/man/man0 and observing different behaviour between the host file and the TOOLDIR nbfile, and realised that both were buggy, one was less buggy than the other.
Attached Files:
Notes
(0003942)
polluks   
2023-05-22 12:05   
At least NetBSD 9.3 file-5.37 says
$ file /usr/share/man/man3/ctype.3
/usr/share/man/man3/ctype.3: troff or preprocessor input, ASCII text
(0003943)
lukem   
2023-05-22 12:10   
Yes, file 5.43 seems to detect more of that list as C instead of troff, but file 5.37 will still misidentify some of that list.

Using file 5.37 (NetBSD 9.0, ftp.netbsd.org), checking /usr/share/man there's a bunch of mismatches including those I listed from libc.

./man1/flex.1: C source, ASCII text
./man1/lex.1: C source, ASCII text
./man1/calendar.1: C source, ASCII text
./man1/ident.1: C source, ASCII text
./man1/makeinfo.1: HTML document, ASCII text
./man1/sqlite3.1: HTML document, ASCII text
./man2/dup2.2: C source, ASCII text
./man2/dup.2: C source, ASCII text
./man2/dup3.2: C source, ASCII text
./man2/mremap.2: C source, ASCII text

...
(0003946)
christos   
2023-06-01 16:01   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
134 [tcsh] General minor always 2020-01-29 07:00 2023-06-01 15:31
Reporter: Kazuo Kuroi Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: assigned Product Version: 6.22.02  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: [tcsh 6.22.02] testsuite: 132 175 189 failed (IRIX 6.5.22/MIPSPro)
Description: I am reporting some testsuite failures on IRIX 6.5.22 using the MIPSPro compiler 7.4.4m (later versions are nearly identical for these cases, for the record)

configure command: % ./configure --disable-nls

CC=c99
Version info: % c99 --version
c99 ERROR: -- not allowed in non XPG4 environment
c99 ERROR parsing --version: unknown flag
MIPSpro Compilers: Version 7.4.4m

% uname -spR
IRIX 6.5 6.5.22m mips
Tags:
Steps To Reproduce: run configure command

make
make test
Additional Information: The release otherwise appears to be fine. I am more than happy to help troubleshoot the issue and if needed launch cvd (IRIX debugger)
Attached Files: testsuite.log (97,932 bytes) 2020-01-29 07:00
https://bugs.astron.com/file_download.php?file_id=100&type=bug
Notes
(0003378)
christos   
2020-02-18 20:22   
Can you try compiling without the optimizer? Or using gcc?
(0003387)
Kazuo Kuroi   
2020-02-22 21:29   
Removed optimize flag from CFLAGS, exact same results

After reviewing this again, I think the issue is that IRIX doesn't have getent as part of its command set. I didn't realize this as I had been using getent from glibc on IRIX as part of some experiments, and thus my reason for being perplexed. Checking other IRIX machines, the exact same thing happens, so I feel that the main issue is the lack of getent on IRIX. If these test cases could be refactored to not use getent, I'm sure they'd pass on IRIX. OK to close.
(0003945)
lukem   
2023-06-01 15:31   
Kazuo: is there an equivalent to getent(1) on IRIX that we can use instead? (It's been over 25 years since I last used an IRIX system).

If not, we can instead modify the testsuite to skip those test groups if getent isn't available.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
454 [file] General tweak always 2023-05-24 15:14 2023-05-24 15:14
Reporter: Albrecht Platform: x86_64  
Assigned To: OS: Debian  
Priority: normal OS Version: Bookworm  
Status: new Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: C source file classified as HTML
Description: Some of my C source files (like the attached sample) are classified as

/tmp/sample.c: HTML document, Unicode text, UTF-8 text

if any comment contains Doxygen HTML-style links (see https://www.doxygen.nl/manual/autolink.html#linkurl). Removing the 5th line containing the link produces the expected result

/tmp/sample.c: C source, Unicode text, UTF-8 text

This is a little ugly as Doxygen is widely used for code documentation, and supports a variety of HTML commands to pimp up the output (see https://www.doxygen.nl/manual/htmlcmds.html). I didn't check if any other HTML tag command produces the same effect, though.
Tags:
Steps To Reproduce: See above - just run file v. 5.44 or the latest Git version on the attached sample. Note that file 5.39 coming with Debian Bullseye correctly returns

/tmp/sample.c: C source, UTF-8 Unicode text

Additional Information:
Attached Files: sample.c (328 bytes) 2023-05-24 15:14
https://bugs.astron.com/file_download.php?file_id=341&type=bug
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
442 [file] General feature always 2023-04-26 14:49 2023-05-22 12:24
Reporter: sprinter Platform:  
Assigned To: OS:  
Priority: normal OS Version:  
Status: new Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: DKIF / IVF formats not recognized
Description: file does not recognize the DKIF / IVF multimedia formats; such files are reported as data.
Tags:
Steps To Reproduce:
Additional Information: https://gitlab.gnome.org/GNOME/gimp/-/issues/9356
Attached Files: animation-woods.ivf (406,941 bytes) 2023-04-26 14:49
https://bugs.astron.com/file_download.php?file_id=332&type=bug
Notes
(0003944)
polluks   
2023-05-22 12:24   
See https://chromium.googlesource.com/chromium/src/media/+/master/filters/ivf_parser.h
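
A minimal sketch of what a DKIF / IVF check might look at. The 32-byte little-endian header layout used here (signature, version, header length, codec FourCC, width, height) is an assumption based on the commonly described IVF container layout, not something stated in this report; the function name and the sample values are made up for illustration.

```python
import struct


def parse_ivf_header(data):
    """Return basic fields from an IVF header, or None if it does not match.

    Assumed layout (little-endian): 'DKIF', u16 version, u16 header length,
    4-byte codec FourCC, u16 width, u16 height, then further fields.
    """
    if len(data) < 32 or data[:4] != b"DKIF":
        return None
    version, header_len, fourcc, width, height = struct.unpack_from("<HH4sHH", data, 4)
    return {
        "version": version,
        "header_len": header_len,
        "codec": fourcc.decode("latin-1"),
        "width": width,
        "height": height,
    }


# Synthetic 32-byte header built for the example (not taken from the attachment).
sample = b"DKIF" + struct.pack("<HH4sHHIII4x", 0, 32, b"VP80", 640, 480, 30, 1, 100)
print(parse_ivf_header(sample))
```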


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
453 [file] General major always 2023-05-22 11:56 2023-05-22 11:56
Reporter: polluks Platform:  
Assigned To: OS:  
Priority: urgent OS Version:  
Status: new Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Make magic.mgc fails
Description: [...]
make[2]: Entering directory '/home/stefan/g/file/magic'
../src/file -C -m magic
magic/dwarfs, 1: Warning: offset ``' invalid
magic/images, 2565: Warning: offset `!mime image/vnd.radiance' invalid
magic/svf, 5: Warning: no need to escape `:'
file: could not find any valid magic files!
make[2]: *** [Makefile:866: magic.mgc] Error 1
make[2]: Leaving directory '/home/stefan/g/file/magic'
make[1]: *** [Makefile:463: all-recursive] Error 1
make[1]: Leaving directory '/home/stefan/g/file'
make: *** [Makefile:372: all] Error 2
Tags:
Steps To Reproduce: make clean all
Additional Information:
Attached Files:
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
437 [file] General minor always 2023-03-28 08:49 2023-05-22 05:38
Reporter: laz Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Match strength not accounted in custom magic definition
Description: One can enrich the default magic database (in my case, "/usr/share/misc/magic.mgc") by adding a custom .mgc file in their home directory. Matches from this custom file however will always take precedence over the default definitions, regardless of the `strength` across all matches.

Tags: magic
Steps To Reproduce: # custom_magic
0 search/1024 ABC ABC file
!:mime application/abc
!:strength - 10

# default_magic
0 search/1024 ABCD ABCD file
!:mime application/abcd
!:strength + 50

# file -C -m default_magic && cp default_magic.mgc /usr/local/share/misc/magic.mgc
# file -C -m custom_magic && cp custom_magic.mgc ~/.magic.mgc

# file --list -m /usr/local/share/misc/magic.mgc
Set 0:
Binary patterns:
Text patterns:
Strength = 88@1: ABCD file [application/abcd]
Set 1:
Binary patterns:
Text patterns:

# file --list -m ~/.magic.mgc
Set 0:
Binary patterns:
Text patterns:
Strength = 29@1: ABC file [application/abc]
Set 1:
Binary patterns:
Text patterns:

# echo "ABCD" > abcd.txt && file abcd.txt -k
abcd.txt: ABC file\012- ABCD file, ASCII text

The order of the returned MIME types should have been reversed - `ABCD file\012- ABC file`
Additional Information:
Attached Files:
Notes
(0003939)
christos   
2023-05-21 17:28   
The magic strengths of patterns are implemented by sorting magic entries within a set (a file) and then matching them in sequence. This is why you see that effect once you split the magic entries across multiple files. Fixing this would be very expensive at runtime (it would require merging the entries for multiple files and then re-sorting). Not impossible, but is it really worth it?
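
The effect described in this note can be illustrated with a small model. This is a sketch of the idea only, not libmagic's data structures: entries are plain (description, strength) pairs using the strengths from the report, and the function names are made up for the example.

```python
def per_file_order(files):
    """Sort entries by strength within each file, then keep files in load order."""
    ordered = []
    for entries in files:
        ordered.extend(sorted(entries, key=lambda e: e[1], reverse=True))
    return [name for name, _ in ordered]


def merged_order(files):
    """Merge all files first, then sort once by strength (the costlier option)."""
    merged = [entry for entries in files for entry in entries]
    return [name for name, _ in sorted(merged, key=lambda e: e[1], reverse=True)]


custom = [("ABC file", 29)]    # strength reported for ~/.magic.mgc
default = [("ABCD file", 88)]  # strength reported for the default database

print(per_file_order([custom, default]))  # ['ABC file', 'ABCD file']
print(merged_order([custom, default]))    # ['ABCD file', 'ABC file']
```

With per-file sorting, the weaker custom entry still comes first because its file is consulted first; only a global merge gives the order the reporter expected.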
(0003941)
laz   
2023-05-22 05:38   
Hey christos

I'm not aware of the implementation of the magic entries sorting, so feel free to close this one if you think it's going to be more of a headache than actually beneficial.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
435 [file] General minor always 2023-03-22 21:47 2023-05-22 01:44
Reporter: haowenl Platform:  
Assigned To: christos OS: Windows  
Priority: normal OS Version:  
Status: assigned Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: file_magic: stat fails with errno EOVERFLOW on Windows for large files
Description: On Windows, stat by default calls _stat64i32, which uses 32bit file length types. This causes stat to fail with EOVERFLOW for large files.

This can be fixed either by using _stat64 for all files on Windows, or by catching the EOVERFLOW errno and retrying with _stat64.
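
As a rough illustration of the limit involved, assuming the 32-bit length field is signed, any size above 2^31 - 1 bytes cannot be represented. The helper name below is made up for the example.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit size field can hold


def fits_in_32bit_size(size_in_bytes):
    """True if a file size can be stored in a signed 32-bit length field."""
    return 0 <= size_in_bytes <= INT32_MAX


print(fits_in_32bit_size(INT32_MAX))   # True
print(fits_in_32bit_size(3 * 1024**3)) # False: a 3 GiB file overflows
```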
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003936)
christos   
2023-05-21 17:15   
does adding:

#ifdef _WIN64
#define stat _stat64
#endif

in file.h fix it?
(0003940)
haowenl   
2023-05-22 01:44   
Yes, that seems sufficient. Unfortunately, though, I do not currently have access to a Windows machine and cannot test it.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
445 [file] General minor always 2023-04-28 16:00 2023-05-21 17:24
Reporter: milahu Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: feedback Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: file/libmagic fails to detect cp1252 encoding
Description: actual result is the encoding "unknown-8bit"

$ printf "what\x92s up?\n" | file -i -
/dev/stdin: text/plain; charset=unknown-8bit

expected result is cp1252:

$ printf "what\x92s up?\n" | chardetect
<stdin>: Windows-1252 with confidence 0.73

$ printf "what\x92s up?\n" | iconv -f cp1252 -t utf8
what’s up?
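
The same bytes can be checked in Python. This only shows why the input is neither valid UTF-8 nor meaningful ISO-8859-1 while cp1252 yields readable text; it is not a charset detector, and the helper name is made up for the example.

```python
data = b"what\x92s up?\n"


def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


print(try_decode(data, "utf-8"))          # None: 0x92 cannot start a UTF-8 sequence
print(repr(try_decode(data, "latin-1")))  # 0x92 becomes a C1 control character
print(repr(try_decode(data, "cp1252")))   # 0x92 becomes U+2019, a right single quote
```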
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003938)
christos   
2023-05-21 17:24   
Fixing it would require either implementing more heuristics to improve charset detection, or using a 3rd party library that already does this well. Both of these are fairly large projects to implement.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
451 [file] General feature N/A 2023-05-21 16:06 2023-05-21 17:19
Reporter: mhx Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Improved magic file for DwarFS file system images
Description: Author of DwarFS here, following up on issue 449 (which was resolved while I was writing an update).

I've been working on a magic file independently of @srjs when the issue was raised on the github issue tracker. I'm posting my magic file as well, as it uses stricter checks that prevent accidental matches, supports DwarFS images with headers, and outputs the block compression algorithm used.

One thing that I don't know how to solve is that files with headers (which are typically shell scripts) will be identified either as shell scripts or as DwarFS images, depending on the order in which the magic file rules are read. I don't know what approach the file utility typically takes in case of such "hybrid" files.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: magictest-header.dwarfs (951 bytes) 2023-05-21 16:06
https://bugs.astron.com/file_download.php?file_id=340&type=bug
magictest.dwarfs (839 bytes) 2023-05-21 16:06
https://bugs.astron.com/file_download.php?file_id=339&type=bug
dwarfs.magic (1,659 bytes) 2023-05-21 16:06
https://bugs.astron.com/file_download.php?file_id=338&type=bug
Notes
(0003937)
christos   
2023-05-21 17:19   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
438 [file] General minor always 2023-04-03 12:00 2023-05-21 17:13
Reporter: Touchstone64 Platform: M1 Macbook Pro  
Assigned To: christos OS: MacOS  
Priority: normal OS Version: 13.3 (22E252)  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Some Apple QuickTime videos not recognised by file 5.44
Description: Using the MacOS Image Capture utility, specifically Image Capture Version 8.0 (1106), importing 'live' photos and regular movies from an iPhone 14 Pro using iOS 16 results in QuickTime .MOV files.

About 75% of these videos are recognised as 'video/quicktime' when using 'file --mime-type video.mov'. These videos contain atoms 'ftyp', 'wide', 'mdat' and 'moov' in that order.

However, about 25% of the video files don't have the 'ftyp' atom, they just have 'wide', 'mdat' and 'moov', in that order. In these cases 'file --mime-type video.mov' identifies the video files as 'application/octet-stream'. Using exiftool, it can be seen that the 'MIME type' tag in these files is indeed 'video/quicktime'.

.MOV files from earlier versions of iOS don't seem to have this issue, presumably a change in iOS is responsible for the different .MOV file content.
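
To list the top-level atoms of such a file, a small walker like the following could be used. It is a sketch that assumes the usual QuickTime atom layout (a 4-byte big-endian size that includes the header, then a 4-byte type, with size 1 meaning a 64-bit size follows and size 0 meaning "to end of file"); the sample bytes are synthetic, not taken from a real .MOV file.

```python
import struct


def top_level_atoms(data):
    """Return the four-character types of the top-level atoms in data."""
    types = []
    pos = 0
    while pos + 8 <= len(data):
        size, kind = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit extended size follows the type field.
            if pos + 16 > len(data):
                break
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:
            # The atom runs to the end of the data.
            size = len(data) - pos
        if size < header:
            break  # malformed atom, stop rather than loop forever
        types.append(kind.decode("latin-1"))
        pos += size
    return types


def atom(kind, payload=b""):
    """Build a synthetic atom for the example."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload


without_ftyp = atom(b"wide") + atom(b"mdat", b"\0" * 4) + atom(b"moov")
with_ftyp = atom(b"ftyp", b"qt  ") + without_ftyp
print(top_level_atoms(without_ftyp))  # ['wide', 'mdat', 'moov']
print(top_level_atoms(with_ftyp))     # ['ftyp', 'wide', 'mdat', 'moov']
```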
Tags: QuickTime
Steps To Reproduce: Use Image Capture to import a few dozen 'live' photos and videos from an iPhone running iOS 16.
Use the command 'file --mime-type *.MOV' in the directory where the files are stored and observe the output on the command-line.

(Unfortunately all my .MOV files with this content are larger than 2,048KiB so I can't upload one to demonstrate.)
Additional Information: There is already a test for the 'wide' atom in source file ./magic/Magdir/animation 1.87, at line 21. Uncommenting lines 21 and 22 in that file, and using file's -m command-line option to use the modified magic file(s), file recognises the video files as 'video/quicktime'.
Attached Files:
Notes
(0003935)
christos   
2023-05-21 17:13   
Uncommented, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
436 [file] General minor always 2023-03-23 15:17 2023-05-21 17:10
Reporter: fstanchina Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: "file" reports incorrect line ending for a file with CR on 65536th byte
Description: I have a file that is reported as having "CRLF, CR line terminators":
```
$ file xxx/file.cpp
xxx/file.cpp: C source, ISO-8859 text, with CRLF, CR line terminators
```

After numerous attempts at re-saving it with uniform line terminators, it still reported an inconsistency.

Turns out that "file" reads at most 65536 bytes to check for encoding and such, and this particular file has a CR exactly on the 65536th byte. So "file" doesn't see the LF on the 65537th byte and reports an inconsistency.
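
The boundary effect can be reproduced with a simple terminator counter. This is a sketch, not file's actual implementation: the 65536-byte limit comes from the description above, and the synthetic data is built so that a CR sits exactly on the 65536th byte.

```python
LIMIT = 65536  # number of bytes inspected, per the description above


def line_terminators(buf):
    """Count CRLF pairs, lone CRs and lone LFs in a byte buffer."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    i = 0
    while i < len(buf):
        if buf[i:i + 2] == b"\r\n":
            counts["CRLF"] += 1
            i += 2
        elif buf[i:i + 1] == b"\r":
            counts["CR"] += 1
            i += 1
        elif buf[i:i + 1] == b"\n":
            counts["LF"] += 1
            i += 1
        else:
            i += 1
    return counts


# One 17-byte line followed by 16-byte lines puts a CR at offset 65535.
data = b"x" * 15 + b"\r\n" + (b"x" * 14 + b"\r\n") * 4096

print(line_terminators(data))          # only CRLF terminators in the whole file
print(line_terminators(data[:LIMIT]))  # the truncated view ends in a lone CR
```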

This is obviously not a big problem, but I believe it's worth fixing if possible.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003934)
christos   
2023-05-21 17:10   
Dup of PR/444


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
450 [file] General minor always 2023-05-20 19:04 2023-05-21 17:08
Reporter: Ambie Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: SIMH emulated tape files (".tap") mis-identified as ATARI Degas
Description: Emulated tape files for SIMH emulated tape drives are mis-identified as ATARI Degas Elite bitmap 640 x 400 x 2, color palette 0000 d08f 0000 0700 5dc1 ...

SIMH is a well-known computer hardware emulator (it is the first result of a Google search for SIMH).

Computers that have tape drives usually have simulated/emulated support on SIMH. There is a file format which (probably) will never change because a change in the format would cause widespread breakage to software which is distributed as SIMH tape files.

Numerous SIMH tape files are distributed in the Internet Archive, The Unix Historical Society, SourceForge, GitHub, etc.

References:
https://github.com/simh/simtools
Tags: bug, magic
Steps To Reproduce: Install file. I tested this on Arch Linux 2023-05-20 and Ubuntu 23.04.
Download a SIMH *.tap file.
Example:
https://www.tuhs.org/Archive/Distributions/UCB/4.1BSD-19810901-reconstructed/tape1.tap.xz

$ unxz tape1.tap.xz
$ file tape1.tap
tape1.tap: Atari DEGAS Elite bitmap 640 x 400 x 2, color palette 0000 d08f 0000 0700 5dc1 ...
Additional Information: Thanks for everything you do!
Attached Files:
Notes
(0003933)
christos   
2023-05-21 17:08   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
447 [file] General minor always 2023-05-11 17:24 2023-05-21 16:10
Reporter: Albrecht Platform: x86_64  
Assigned To: christos OS: Debian  
Priority: normal OS Version: Bookworm  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: MIME type output: missing separator between matches from multiple magic files
Description: In order to detect some broken or exotic file formats, I use a custom magic file in addition to the standard one coming with the Debian package. E.g. consider the following simple rule for broken (typically Malware) RTF files (which Word does open, btw.):

0 string {\\rt Rich Text Format (invalid header)
!:mime text/rtf

On Debian Bullseye (file v. 5.39) this used to work perfectly for detecting the MIME type, e.g. with the simple files in the attached ZIP:

file --mime-type -k -m ./magext.mgc:/usr/share/misc/magic Test.rtf
Test.rtf: text/rtf\012-
file --mime-type -k -m ./magext.mgc:/usr/share/misc/magic broken.rtf
broken.rtf: text/rtf\012-

On Debian Bookworm (file v. 5.44) the output is

file --mime-type -k -m ./magext.mgc:/usr/share/misc/magic Test.rtf
Test.rtf: text/rtftext/rtf
file --mime-type -k -m ./magext.mgc:/usr/share/misc/magic broken.rtf
broken.rtf: text/rtf

which looks as if the usual separator (“\012- ”) between multiple MIME types coming from different magic files is missing. For any input producing multiple MIME types from the same magic file the output is separated correctly.
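
The expected behaviour amounts to joining all matches with the same separator regardless of which magic file produced them. The sketch below models that; it is not file's output code, and the function names are made up for the example.

```python
SEPARATOR = "\\012- "  # the separator as it appears in the output quoted above


def join_expected(per_file_matches):
    """Join every match with the separator, across all magic files."""
    flat = [m for matches in per_file_matches for m in matches]
    return SEPARATOR.join(flat)


def join_observed(per_file_matches):
    """Model of the reported output: separator only within one magic file."""
    return "".join(SEPARATOR.join(matches) for matches in per_file_matches)


matches = [["text/rtf"], ["text/rtf"]]  # one match from each magic file
print(join_observed(matches))  # text/rtftext/rtf
print(join_expected(matches))  # text/rtf\012- text/rtf
```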
Tags:
Steps To Reproduce: * unpack the attached ZIP file
* cd file_issue
* if necessary, edit the script variable MAGIC to point to the standard magic file (the value in the script is the Debian file location)
* ./runtest.sh

Note: the archive contains the results of running the script on Bullseye/5.39 and Bookworm/5.44, respectively.
Additional Information: For the RTF example above, it would be possible to fix the issue by adding a check like “not followed by the char f”. However, I noticed some more complex cases where e.g. the standard magic patterns classify the input as text/plain, whereas my rules actually detect a message/rfc822. Similar to the RTF example above, the output is “message/rfc822text/plain”, so this looks like a more general issue to me.
Attached Files: file_issue.zip (2,590 bytes) 2023-05-11 17:24
https://bugs.astron.com/file_download.php?file_id=335&type=bug
Notes
(0003932)
christos   
2023-05-21 16:10   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
439 [file] General minor always 2023-04-07 21:43 2023-05-21 16:03
Reporter: dajhorn Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Add magic for SmartVersion binary patch files
Description: This patch recognizes SVF files, which do the same thing as VCDIFF files.
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: 0001-Add-magic-for-SmartVersion-binary-patch-files.patch (1,082 bytes) 2023-04-07 21:43
https://bugs.astron.com/file_download.php?file_id=331&type=bug
Notes
(0003931)
christos   
2023-05-21 16:03   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
440 [file] General minor always 2023-04-11 11:55 2023-05-21 16:00
Reporter: truff Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Bug when using -z with an lrzip compressed file
Description: file is not using the right options to call lrzip when using -z
Tags:
Steps To Reproduce: $ file -z hello.lrz
hello.lrz: ERROR:[lrzip: Option requires an argument -- 'o'] (LRZIP compressed data - version 0.6)
Additional Information: from compress.c:
#define lrzip_flags "-do"

$ lrzip -h 2>&1 | grep -- -o,
        -o, --outfile filename specify the output file name and/or path

-o option is not what you are looking for here and there is no obvious option to have lrzip read content from stdin, it refuses to use /dev/stdin or - so the fix won't be trivial
Attached Files:
Notes
(0003930)
christos   
2023-05-21 16:00   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
441 [file] General minor always 2023-04-17 10:37 2023-05-21 15:51
Reporter: fabianthdev Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Radiance File MIME-Types Commented Out
Description: When checking the file type of an `.hdr` Radiance file, the correct format is reported by `file`:
```
$ file radiance_file.hdr
radiance_file.hdr: Radiance HDR image data
```

However, when querying for the MIME type, the correct value of `image/vnd.radiance` is not returned, instead the following MIME type is determined:
```
$ file --mime radiance_file.hdr
radiance_file.hdr: application/octet-stream; charset=binary
```

Looking at `magic/Magdir/images`, it appears that the correct MIME type is defined for radiance files, however it seems to be commented out and is thus not returned.

I would suggest the following patch to return the correct MIME type for Radiance files:
```
diff --git a/magic/Magdir/images b/magic/Magdir/images
index 19e362ae..be10993a 100644
--- a/magic/Magdir/images
+++ b/magic/Magdir/images
@@ -2549,7 +2549,7 @@
 # URL: http://local.wasp.uwa.edu.au/~pbourke/dataformats/pic/
 # Radiance HDR; usually has .pic or .hdr extension.
 0 string #?RADIANCE\n Radiance HDR image data
-#!mime image/vnd.radiance
+!:mime image/vnd.radiance
 
 # From: Adam Buchbinder <adam.buchbinder@gmail.com>
 # URL: https://www.mpi-inf.mpg.de/resources/pfstools/pfs_format_spec.pdf
```

If the MIME type definition has been commented out deliberately, please explain why.
Tags:
Steps To Reproduce: Check the file type of a Radiance HDR file with the `file` command:
```
$ file --mime radiance_file.hdr
```

Result: `radiance_file.hdr: application/octet-stream; charset=binary`

Expected result: `radiance_file.hdr: image/vnd.radiance; charset=binary`
Additional Information:
Attached Files:
Notes
(0003929)
christos   
2023-05-21 15:51   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
446 [file] General minor always 2023-05-02 20:01 2023-05-21 15:49
Reporter: mike Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: PDF appendix allows header in first 1024 bytes; Magic looks in 256
Description: TL;DR, I think this line needs to be changed to 1024, not 256:

https://github.com/file/file/blob/FILE5_39/magic/Magdir/pdf#L39

----

Long version....

The Seventh Circuit Court of Appeals in the United States has started publishing documents that do not comply with the PDF specification, but which do comply with the PDF Compatibility and Implementation notes from Appendix H of the specification. See attached for an example, or this link should work:

http://media.ca7.uscourts.gov/cgi-bin/OpinionsWeb/processWebInputExternal.pl?Submit=Display&Path=Y2023/D04-27/C:22-2500:J:Brennan:aut:T:fnOp:N:3036932:S:0

The PDF specification says on page 92 that:

> The first line of a PDF file is a header identifying the version of the PDF [...] For a file conforming to PDF 1.7, the header should be:
>
> `%PDF-1.7`

(See: https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.7old.pdf#page=92)

Easy enough.

But Appendix H of the same spec says:

> Acrobat viewers require only that the header appear somewhere within the first 1024 bytes of the file.

(See: https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.7old.pdf#page=1102)

A few years ago, this issue came up:

https://bugs.astron.com/view.php?id=104

If I'm understanding correctly, a fix was put in place to look in the first 256 bytes of the file, here:

https://github.com/file/file/blob/FILE5_39/magic/Magdir/pdf#L39

I think we just need to adjust this to look in the first 1024 bytes instead, and it should fix this and other issues.
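
The difference between the two window sizes can be shown with a simplified check. This is a sketch rather than the magic rule itself: it requires the whole `%PDF-` marker to lie within the window, which may differ in detail from how file's search range works, and the sample bytes are synthetic.

```python
def has_pdf_header(data, window):
    """True if the '%PDF-' marker lies within the first `window` bytes."""
    return data.find(b"%PDF-", 0, window) != -1


# 300 bytes of leading text before the header, as in the problem case.
sample = b"x" * 300 + b"%PDF-1.7\n"

print(has_pdf_header(sample, 256))   # False: the header is past the 256-byte window
print(has_pdf_header(sample, 1024))  # True: within the Appendix H limit
```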
Tags:
Steps To Reproduce: 1. Download the file
2. Run `file the-file.pdf`
3. Note that it's detected as `data`.
4. Do `head the-file.pdf`
5. Note that `%PDF-` is there, but that there's text before it.
6. Study the spec and appendix H
7. Note that this file opens properly in Adobe Reader, and other PDF readers.
8. Note that file doesn't follow the implementation (aka the de facto specification)
Additional Information:
Attached Files: document.pdf (337,981 bytes) 2023-05-02 20:01
https://bugs.astron.com/file_download.php?file_id=334&type=bug
Notes
(0003928)
christos   
2023-05-21 15:49   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
444 [file] General minor always 2023-04-28 14:11 2023-05-21 15:43
Reporter: beijingjazzpanda Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: different result when file length is bigger than 65537 bytes
Description: Take two text files, one shorter than 65537 bytes and the other longer than 65537 bytes.
Running `file` on both files returns different EOL (End Of Line) information.
Tags: file size
Steps To Reproduce: unzip the attachment, execute the `test-script` bash script.
The result will be like

```
$ ./test-script
bigger_than_65537.txt: ASCII text, with CRLF, CR line terminators
smaller_than_65537.txt: ASCII text, with CRLF line terminators
```
Additional Information: The issue is similar to 0000071. [https://bugs.astron.com/view.php?id=71](https://bugs.astron.com/view.php?id=71)
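
For cross-checking the output above, here is a small Python sketch that counts line terminators over a whole buffer. It also shows one plausible way a spurious "CR" could appear: a buffer cut between the `\r` and `\n` of a CRLF pair. That explanation is an assumption, not something confirmed in this report, and the function name is made up for the example.

```python
def count_line_terminators(data: bytes) -> dict:
    """Count CRLF pairs, lone CRs and lone LFs in a byte buffer."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # CRs that are not part of a CRLF pair
    lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    return {"CRLF": crlf, "CR": cr, "LF": lf}


text = b"line\r\n" * 4
print(count_line_terminators(text))       # whole buffer
print(count_line_terminators(text[:-1]))  # cut in the middle of the last CRLF
```

In the truncated case the final `\r` has lost its `\n`, so it is counted as a lone CR even though the file on disk contains only CRLF terminators.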
Attached Files: 65537-file-length-issue.zip (1,252 bytes) 2023-04-28 14:11
https://bugs.astron.com/file_download.php?file_id=333&type=bug
Notes
(0003927)
christos   
2023-05-21 15:43   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
449 [file] General feature N/A 2023-05-20 08:41 2023-05-21 15:25
Reporter: srjs Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Add magic for the DWARFS compressed file system format
Description: DwarFS ( https://github.com/mhx/dwarfs ) is a high-compression read-only file system, much like SquashFS or EROFS, that is not recognized by the file utility. Attached is a magic file to recognize DwarFS archives.

The file format may be found at: https://github.com/mhx/dwarfs/blob/main/doc/dwarfs-format.md

Also attached is a test file. Further test files may be found under the test folder of the project repo: https://github.com/mhx/dwarfs/tree/main/test

Tags:
Steps To Reproduce:
Additional Information:
Attached Files: example.dwarfs (4,070 bytes) 2023-05-20 08:41
https://bugs.astron.com/file_download.php?file_id=337&type=bug
dwarfs.magic (242 bytes) 2023-05-20 08:41
https://bugs.astron.com/file_download.php?file_id=336&type=bug
Notes
(0003926)
christos   
2023-05-21 15:25   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
448 [file] General minor always 2023-05-13 10:32 2023-05-18 13:29
Reporter: Albrecht Platform: x86_64  
Assigned To: christos OS: Debian  
Priority: normal OS Version: Bookworm  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: invalid MIME type for broken input
Description: Running file v. 5.44 for detecting the MIME type of the broken executable file as of issue 211 (https://bugs.astron.com/file_download.php?file_id=187&type=bug) returns

$ file --mime-type -b ./file-error-sample
application/x-executable, can't read section at -1

Note that the returned value is not a valid MIME type according to RFC 6838, Sect. 4.2. Whilst it would not be complicated to remove the broken extra output (starting at the “,”), this unnecessarily complicates any software parsing the output of file. Thus, the extra informational message should be present only when file shall produce human-readable output. When running file with the option --mime-type, the output should always comply with RFC 6838.
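
As an illustration of what a consumer of `--mime-type` output could check, the following Python sketch validates a string against the `restricted-name` grammar of RFC 6838, Sect. 4.2 (a leading letter or digit, then up to 126 characters from a restricted set, for both type and subtype). The helper name is hypothetical and parameters such as `; charset=...` are deliberately not handled.

```python
import re

# restricted-name = restricted-name-first *126restricted-name-chars
_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&\-^_.+]{0,126}"
MEDIA_TYPE_RE = re.compile(_NAME + "/" + _NAME)


def is_valid_media_type(value: str) -> bool:
    """Return True if `value` is a bare type/subtype pair per RFC 6838 4.2."""
    return MEDIA_TYPE_RE.fullmatch(value) is not None


print(is_valid_media_type("application/x-executable"))
print(is_valid_media_type("application/x-executable, can't read section at -1"))
```

The second string fails because a comma, spaces and an apostrophe are not allowed in a subtype name.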
Tags:
Steps To Reproduce: see above
Additional Information: -
Attached Files:
Notes
(0003925)
christos   
2023-05-18 13:29   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
427 [file] General major always 2023-02-18 00:37 2023-05-09 18:08
Reporter: a1rind Platform:  
Assigned To: christos OS:  
Priority: high OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: reopened  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: docx file is determined as zip
Description: Hi!

There is an OOXML format docx file that is being determined as application/zip. Unfortunately I can not share the document yet but I have some debug info that hopefully can help.

`zipinfo` lists the following directories/files:

```
Zip file size: 36239 bytes, number of entries: 36
-rw---- 4.5 fat 399 b- stor 80-Jan-01 00:00 [trash]/0000.dat
-rw---- 4.5 fat 739 b- defN 80-Jan-01 00:00 _rels/.rels
-rw---- 4.5 fat 41347 b- defN 80-Jan-01 00:00 word/document.xml
-rw---- 4.5 fat 1116 b- defN 80-Jan-01 00:00 docProps/app.xml
-rw---- 4.5 fat 381 b- stor 80-Jan-01 00:00 [trash]/0002.dat
-rw---- 4.5 fat 269 b- stor 80-Jan-01 00:00 [trash]/0003.dat
-rw---- 4.5 fat 450 b- stor 80-Jan-01 00:00 [trash]/0001.dat
-rw---- 4.5 fat 288 b- defN 80-Jan-01 00:00 word/_rels/header1.xml.rels
-rw---- 4.5 fat 288 b- defN 80-Jan-01 00:00 word/_rels/header3.xml.rels
-rw---- 4.5 fat 3225 b- defN 80-Jan-01 00:00 word/fontTable.xml
-rw---- 4.5 fat 2864 b- defN 80-Jan-01 00:00 word/footer1.xml
-rw---- 4.5 fat 3380 b- defN 80-Jan-01 00:00 word/header1.xml
-rw---- 4.5 fat 9807 b- defN 80-Jan-01 00:00 word/header2.xml
-rw---- 4.5 fat 3380 b- defN 80-Jan-01 00:00 word/header3.xml
-rw---- 4.5 fat 680 b- defN 80-Jan-01 00:00 word/media/image1.wmf
-rw---- 4.5 fat 38367 b- defN 80-Jan-01 00:00 word/numbering.xml
-rw---- 4.5 fat 9410 b- defN 80-Jan-01 00:00 word/settings.xml
-rw---- 4.5 fat 31843 b- defN 80-Jan-01 00:00 word/styles.xml
-rw---- 4.5 fat 6992 b- defN 80-Jan-01 00:00 word/theme/theme1.xml
-rw---- 4.5 fat 483 b- defN 80-Jan-01 00:00 word/webSettings.xml
-rw---- 4.5 fat 1768 b- stor 80-Jan-01 00:00 [trash]/0005.dat
-rw---- 4.5 fat 296 b- defS 80-Jan-01 00:00 customXml/_rels/item1.xml.rels
-rw---- 4.5 fat 201 b- defS 80-Jan-01 00:00 customXml/itemProps2.xml
-rw---- 4.5 fat 219 b- defS 80-Jan-01 00:00 customXml/item2.xml
-rw---- 4.5 fat 201 b- defS 80-Jan-01 00:00 customXml/itemProps1.xml
-rw---- 4.5 fat 296 b- defS 80-Jan-01 00:00 customXml/_rels/item2.xml.rels
-rw---- 4.5 fat 443 b- stor 80-Jan-01 00:00 [trash]/0004.dat
-rw---- 4.5 fat 2383 b- defN 80-Jan-01 00:00 word/_rels/document.xml.rels
-rw---- 4.5 fat 236 b- stor 80-Jan-01 00:00 [trash]/0006.dat
-rw---- 4.5 fat 201 b- defS 80-Jan-01 00:00 customXml/itemProps3.xml
-rw---- 4.5 fat 296 b- defS 80-Jan-01 00:00 customXml/_rels/item3.xml.rels
-rw---- 4.5 fat 775 b- defN 80-Jan-01 00:00 docProps/core.xml
-rw---- 4.5 fat 563 b- defN 80-Jan-01 00:00 docProps/custom.xml
-rw---- 4.5 fat 2530 b- defN 80-Jan-01 00:00 [Content_Types].xml
-rw---- 4.5 fat 11932 b- defS 80-Jan-01 00:00 customXml/item1.xml
-rw---- 4.5 fat 587 b- defS 80-Jan-01 00:00 customXml/item3.xml
36 files, 178635 bytes uncompressed, 31036 bytes compressed: 82.6%
```

The first file listed comes from a [trash] directory, e.g. [trash]/0000.dat, and the regex at line 36 here (https://github.com/file/file/blob/master/magic/Magdir/msooxml#L36) isn't expecting such a file.

Furthermore, according to the OOXML specification, a trash directory can exist:

> Trash items represent parts that have been discarded or are no longer in use. Trash items shall not conform to
OPC part naming guidelines as defined in ECMA-376-2 and shall not be associated with a content type. All trash
items shall follow the naming scheme: [trash]/HHHH.dat where H represents a hexadecimal digit.

As far as I understand, the msooxml magic rules expect the files in a certain order so that the correct content type can be identified from magic bytes at certain locations. The presence of trash items causes this to fail.

Any tips and tricks to skip over trash items?

Thanks!
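
To illustrate the naming scheme quoted above, here is a hedged Python sketch that skips `[trash]/HHHH.dat` entries and returns the first regular member of an archive. It reads the ZIP central directory through the standard `zipfile` module, whereas the magic rules walk the local file headers, so it only demonstrates the filtering idea. The helper names and the in-memory sample are made up for the example.

```python
import io
import re
import zipfile

# [trash]/HHHH.dat, where H is a hexadecimal digit
TRASH_ITEM = re.compile(r"\[trash\]/[0-9A-Fa-f]{4}\.dat")


def first_real_member(archive):
    """Return the first member name that is not a trash item, or None."""
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not TRASH_ITEM.fullmatch(name):
                return name
    return None


def make_sample():
    """Build a tiny in-memory archive that starts with a trash item."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in ("[trash]/0000.dat", "_rels/.rels",
                     "word/document.xml", "[trash]/0002.dat"):
            zf.writestr(name, b"")
    buf.seek(0)
    return buf


print(first_real_member(make_sample()))
```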
Tags: bug, magic
Steps To Reproduce:
Additional Information:
Attached Files: unsupported-prepared.docx (182,798 bytes) 2023-03-13 11:33
https://bugs.astron.com/file_download.php?file_id=328&type=bug
Notes
(0003898)
a1rind   
2023-02-28 16:46   
Hi! Any thoughts on this?
(0003903)
christos   
2023-03-05 19:52   
Does this diff fix it?

--- msooxml 16 Aug 2022 11:16:39 -0000 1.18
+++ msooxml 5 Mar 2023 19:51:25 -0000
@@ -33,7 +33,7 @@
 # make sure the first file is correct
 >0x1E use msooxml
 >0x1E default x
->>0x1E regex \\[Content_Types\\]\\.xml|_rels/\\.rels|docProps|customXml
+>>0x1E regex \\[trash\\]|\\[Content_Types\\]\\.xml|_rels/\\.rels|docProps|customXml
 # skip to the second local file header
 # since some documents include a 520-byte extra field following the file
 # header, we need to scan for the next header
(0003909)
a1rind   
2023-03-09 10:43   
Hi!

Thanks for the response. The suggested change doesn't fix the problem. I think we need to skip trash files so that the logic after the regex works by reading bytes from the expected file header. As you can see, those trash files are not in any particular order; they can be anywhere, not just at the start or the end.

Unfortunately I can not share the document yet but soon I will for the ease of debugging.

Kind Regards!
(0003915)
a1rind   
2023-03-13 11:33   
Hi!

I've attached the problematic document. Had to remove some confidential information and manually zip it according to the order of the same files as before.

Thanks!
(0003916)
christos   
2023-03-14 19:46   
Fixed, thanks!
(0003918)
a1rind   
2023-03-15 12:49   
Hi!

Thanks a lot for looking into this. However, I think the latest changes don't fix the issue. When I try the latest magic rules, the file is still recognized as application/zip:

```
file -m msooxml unsupported-prepared.docx
```

Produces:
```
Zip archive data, at least v2.0 to extract, compression method=store
```

Also when I try to compile the rules with the latest changes I get the following error:
```
/usr/share/file/magic/mail.news, 84: Warning: Unparsable number `xu \b, dcrypt version %d'
```
(0003919)
a1rind   
2023-03-21 12:16   
Hi!

Any thoughts on the issue? or am I doing something wrong?

Kind Regards!
(0003920)
christos   
2023-03-21 14:03   
Why is it picking up files from /usr/share/file/magic? Is there some environment setting? Also, line 84 in the most recent version of file does not match that string...
(0003921)
a1rind   
2023-04-04 10:22   
Sorry for getting back late on this. It turned out the newer changes work only with the latest version. Tested with file-5.44 and it works fine, but it does not work with file-5.41; I was unable to test file-5.42 and file-5.43.
(0003924)
christos   
2023-05-09 18:08   
Submitter verified it is fixed on the latest version.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
443 [file] General minor always 2023-04-26 20:43 2023-04-26 20:43
Reporter: andrushka Platform:  
Assigned To: OS:  
Priority: normal OS Version:  
Status: new Product Version: 5.43  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Private PKCS8 SSH keys aren't recognized
Description: Private PKCS8 SSH keys show up as ASCII text.

Thanks!
Tags:
Steps To Reproduce: ssh-keygen -m pkcs8 -f /tmp/debug_key.pass -qN 'passphrase'
ssh-keygen -m pkcs8 -f /tmp/debug_key.nopass -qN ''
file /tmp/debug_key.*
# Shows:
# /tmp/debug_key.nopass: ASCII text
# /tmp/debug_key.nopass.pub: OpenSSH RSA public key
# /tmp/debug_key.pass: ASCII text
# /tmp/debug_key.pass.pub: OpenSSH RSA public key

Additional Information: head -n 1 /tmp/debug_key.{,no}pass
# Shows:
# ==> /tmp/debug_key.pass <==
# -----BEGIN ENCRYPTED PRIVATE KEY-----
#
# ==> /tmp/debug_key.nopass <==
# -----BEGIN PRIVATE KEY-----

Maybe relevant: https://www.rfc-editor.org/rfc/rfc5958
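
The two PEM header lines shown above are enough to tell the encrypted and unencrypted cases apart. A minimal Python sketch of that check follows; the description strings and the function name are hypothetical and do not reflect any existing magic entry.

```python
# Header lines taken from the `head -n 1` output above.
PKCS8_HEADERS = {
    b"-----BEGIN PRIVATE KEY-----": "PKCS#8 private key (unencrypted)",
    b"-----BEGIN ENCRYPTED PRIVATE KEY-----": "PKCS#8 private key (encrypted)",
}


def classify_pem(data: bytes):
    """Return a description if the first line is a PKCS#8 PEM header, else None."""
    first_line = data.split(b"\n", 1)[0].rstrip(b"\r")
    return PKCS8_HEADERS.get(first_line)


print(classify_pem(b"-----BEGIN ENCRYPTED PRIVATE KEY-----\nMIIF...\n"))
```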
Attached Files:
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
252 [file] General minor always 2021-03-31 07:40 2023-04-17 12:46
Reporter: malat Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version: 5.38  
Product Build: Resolution: reopened  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: mime-type: image/jpeg instead of image/jls
Description: It would be nice to disambiguate 'image/jpeg' vs 'image/jls' mime type as seen in the example attached.

$ file --mime-type filelogo.jls
filelogo.jls: image/jpeg

with:

$ file --version
file-5.38
magic file from /etc/magic:/usr/share/misc/magic


For reference:

* https://www.iana.org/assignments/media-types/image/jls

Implementation used to generate this JPEG-LS file is at:

* https://github.com/team-charls/charls
Tags:
Steps To Reproduce: $ file --mime-type filelogo.jls
filelogo.jls: image/jpeg
Additional Information:
Attached Files: filelogo.jls (16,386 bytes) 2021-03-31 07:40
https://bugs.astron.com/file_download.php?file_id=214&type=bug
Notes
(0003591)
christos   
2021-04-19 18:58   
Added, thanks!
(0003922)
malat   
2023-04-13 13:37   
It seems the issue is back. At least on Debian:

% file --mime-type filelogo.jls
filelogo.jls: image/jpeg

with:

% file --version
file-5.44
magic file from /etc/magic:/usr/share/misc/magic

* https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1034353
(0003923)
malat   
2023-04-17 12:46   
Upstream regression, bisect led to

    commit 19d5ac6c83fb5d0d9a3868f0f6f2709b1f11882f
    Author: Christos Zoulas <christos@zoulas.com>
    Date: Sat Aug 28 12:30:52 2021 +0000

        restore jpeg strength to beat msdos boot sector

    Christoph


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
434 [file] General minor always 2023-03-15 14:19 2023-03-15 14:19
Reporter: toni.reed Platform:  
Assigned To: OS:  
Priority: normal OS Version:  
Status: new Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: DOS executable detection classifies files inside of OOXML documents as DOS block device drivers
Description: file uses a heuristic to determine whether a file is a DOS executable, for example, a DOS block device driver. This heuristic seems too broad and imprecise. It regularly classifies files inside of OOXML documents created by Microsoft Word as DOS block device drivers. The email content filter amavis uses libmagic to determine the file type of email attachments and regularly rejects emails with OOXML documents when it is configured to reject executables for Microsoft operating systems and to unpack OOXML documents (default behaviour). Therefore, this heuristic is more than an exotic classification mistake. Multiple workarounds are documented on the Internet because the issue affects many users of amavis.
Tags:
Steps To Reproduce: 1. Create a file with the following contents:

ff ff ff ff 00 00 00 00

For example:

$ hexdump \[trash\]/0000.dat
0000000 ffff ffff 0000 0000 0000 0000 0000 0000
0000010 0000 0000 0000 0000 0000 0000 0000 0000
*
0000790 0000 0000
0000793

2. Let file determine the type of the file:

$ file \[trash\]/0000.dat
[trash]/0000.dat: DOS executable (block device driver)
Additional Information: The relevant section in the file msdos seems to be

# DOS device driver updated by Joerg Jenderek at May 2011,Mar 2017,Aug 2020
# URL: http://fileformats.archiveteam.org/wiki/DOS_device_driver
# Reference: http://www.delorie.com/djgpp/doc/rbinter/it/46/16.html
# https://amaus.net/static/S100/IBM/software/DOS/DOS%20techref/CHAPTER.009
0 ulequad&0x07a0ffffffff 0xffffffff

and

# DOS device driver attributes
>4 uleshort&0x8000 0x0000 \bblock device driver

However, the heuristic seems so broad that it might also classify other files as DOS executables, and the entire heuristic seems to be affected. Classifying DOS executables also seems to be a hard problem, as they don't seem to have an easily distinguishable magic number.
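
To show how little the two magic lines quoted above require, here is a Python sketch that re-implements just those two tests with `struct`. It is an approximation of the rule as quoted, not of the full msdos magic file, and the function names are invented for the example.

```python
import struct


def matches_dos_driver_rule(data: bytes) -> bool:
    """0 ulequad&0x07a0ffffffff 0xffffffff"""
    if len(data) < 8:
        return False
    (quad,) = struct.unpack_from("<Q", data, 0)
    return (quad & 0x07A0FFFFFFFF) == 0xFFFFFFFF


def is_block_device(data: bytes) -> bool:
    """>4 uleshort&0x8000 0x0000 (attribute bit 15 clear means block device)"""
    (attributes,) = struct.unpack_from("<H", data, 4)
    return (attributes & 0x8000) == 0


sample = bytes.fromhex("ffffffff00000000")  # the 8 bytes from step 1
print(matches_dos_driver_rule(sample), is_block_device(sample))
```

Only the first four bytes and a handful of bits in the following two are constrained, which is consistent with the observation that unrelated data can match.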
Attached Files: 0000.dat (1,939 bytes) 2023-03-15 14:19
https://bugs.astron.com/file_download.php?file_id=330&type=bug
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
433 [file] General major always 2023-03-14 12:58 2023-03-14 19:48
Reporter: nix Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Since f7a65db, ELF file magic is broken for all ELF files with a note section
Description: The symptoms are simple:

compiler@loom 7897 /usr/src/file/x86_64-silk/shai-build.silk% src/file src/.libs/libmagic.so.1.0.0
src/.libs/libmagic.so.1.0.0: ERROR: , dynamically linked Note section size too big (48 > 0) (Invalid argument)
Tags:
Steps To Reproduce: (see above)
Additional Information: Caused by missing initialization of recently added ms->elf_shsize_max. Patch attached.
Attached Files: shsize-max.diff (442 bytes) 2023-03-14 12:58
https://bugs.astron.com/file_download.php?file_id=329&type=bug
Notes
(0003917)
christos   
2023-03-14 19:48   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
421 [file] General minor always 2023-02-04 19:48 2023-03-11 18:27
Reporter: bbaovanc Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Debian package with control file with zstd compression has extension truncated
Description: Debian packages (.deb) that have the control archive compressed with zstd will be named `control.tar.zst`, but the last letter of that filename gets chopped off in the output from `file`.
Tags:
Steps To Reproduce: 1. Download attached deb
2. Run `file com.jaidan.testtool_1.0.0_all.deb`
3. Note that it says "with control.tar.zs"
Additional Information: Looking at `magic/Magdir/archive:274`, it looks like it only reads a 14-character long filename for the control archive. That works for `control.tar.gz` and `control.tar.xz`, but `control.tar.zst` is 15 characters long.
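
The length arithmetic can be checked with a few lines of Python. The 14-character width comes from the description above; the function name is made up for the example.

```python
FIELD_WIDTH = 14  # characters read for the control archive name, per the report


def displayed_name(member: str, width: int = FIELD_WIDTH) -> str:
    """Return what a fixed-width read of `width` characters would show."""
    return member[:width]


for member in ("control.tar.gz", "control.tar.xz", "control.tar.zst"):
    print(len(member), displayed_name(member))
```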

Attached Files: com.jaidan.testtool_1.0.0_all.deb (2,202 bytes) 2023-02-04 19:48
https://bugs.astron.com/file_download.php?file_id=326&type=bug
Notes
(0003914)
christos   
2023-03-11 18:27   
Widened, thanks.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
423 [file] General minor always 2023-02-12 23:25 2023-03-11 18:12
Reporter: jpferreira Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Doesn't properly identify Mathematica notebooks MIME type
Description: Trying to identify a Mathematica notebook MIME type using "file" will say "text/plain", even though Mathematica has registered MIME types, which you can see in the section "Background & Context" on the following URL: https://reference.wolfram.com/language/ref/format/NB.html
Also, the beginning of the file also includes some information about that, e.g.:
(* Content-type: application/vnd.wolfram.mathematica *)

(*** Wolfram Notebook File ***)
(* http://www.wolfram.com/nb *)

(* CreatedBy='Mathematica 12.1' *)

(*CacheID: 234*)
(* Internal cache information:
NotebookFileLineBreakTest
NotebookFileLineBreakTest
NotebookDataPosition[ 158, 7]
NotebookDataLength[ 405898, 6949]
NotebookOptionsPosition[ 403928, 6910]
NotebookOutlinePosition[ 404321, 6926]
CellTagsIndexPosition[ 404278, 6923]
WindowFrame->Normal*)

(* Beginning of Notebook Content *)
(...)
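
As a rough illustration, the content type can be read straight out of the header comment shown above. The Python sketch below does that with a regular expression; the 512-byte search window and the function name are assumptions, and this is not how the magic entry itself is written.

```python
import re

# Matches the first header line of the notebook excerpt above.
CONTENT_TYPE_RE = re.compile(rb"\(\* Content-type: (\S+) \*\)")


def notebook_content_type(data: bytes, window: int = 512):
    """Return the declared content type near the start of the file, or None."""
    match = CONTENT_TYPE_RE.search(data[:window])
    if match is None:
        return None
    return match.group(1).decode("ascii", errors="replace")


header = (b"(* Content-type: application/vnd.wolfram.mathematica *)\n\n"
          b"(*** Wolfram Notebook File ***)\n")
print(notebook_content_type(header))
```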
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003913)
christos   
2023-03-11 18:12   
Added, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
432 [file] General minor always 2023-03-08 22:55 2023-03-11 17:54
Reporter: maarten Platform: Linux  
Assigned To: christos OS: Fedora  
Priority: normal OS Version: 6.1.14-100.fc36.  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: file does not recognize precompiled files, generated by llvm
Description: Precompiled headers, generated by llvm (clang), are not recognized by file.

$ file pch.h.pch
pch.h.pch: data
Tags:
Steps To Reproduce: # Copy/paste these commands to your terminal.
# They require clang to be available.

# 1. Create a small header
cat >pch.h <<EOF
__attribute__((visibility("default"))) int myfunction(void);
EOF

# 2. Create a dummy source file
cat >pch.c <<EOF
EOF

# 3. Build the precompiled header
clang -Xclang -emit-pch -Xclang -include -Xclang pch.h -x c-header -o pch.h.pch -c pch.c

# 4. Run file on the precompiled header
file pch.h.pch
Additional Information:
Attached Files:
Notes
(0003912)
christos   
2023-03-11 17:54   
Added, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
428 [file] General tweak always 2023-02-24 15:56 2023-03-11 17:48
Reporter: lu3 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: extended filesystem: ext2/3/4 (ext4 identified as ext2)
Description: In short: file identifies ext4 file system as ext2.

Long story:
I have this ext4 file system that once was an ext2 file system. The kernel now only supports mounting it as ext4, since I added extra_isize (which is ext4-only). To create such a file system, use "mke2fs -t ext2", then use "tune2fs -O extra_isize" on the file system. The Linux kernel will then only mount it as ext4. (Like it will only mount an ext2 with added journal, e.g. tune2fs -O has_journal, as ext3 or ext4...)

Suggestion: Change "ext2" to either "extended filesystem" (or "extended file system", but the man pages write "filesystem" without space), which includes the original ext, ext2, ext3 and ext4. OR change the text to "ext2/3/4". OR include additional logic to distinguish ext2, ext3 and ext4 more reliably.

I did some testing and created ext2/ext3/ext4 file systems (as sparse files). The results are inconsistent. For example, I created an ext4 file system; the default features are: "has_journal ext_attr resize_inode dir_index filetype extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum". I then deactivated everything I could to bring it back to ext2 standards, except extents and flex_bg, leaving "ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file uninit_bg". NOTE that this is STILL an ext4 filesystem due to extents! file now identifies it as ext2...
Tags: file system
Steps To Reproduce: # dd if=/dev/zero of=sparse_file bs=1 count=0 seek=512M
0+0 records in
0+0 records out
0 bytes copied, 3.8972e-05 s, 0.0 kB/s

# mke2fs -t ext2 sparse_file
mke2fs 1.46.5 (30-Dec-2021)
Discarding device blocks: done
Creating filesystem with 131072 4k blocks and 32768 inodes
Filesystem UUID: 60ab585c-0cc5-4e1c-b89d-98ee8eab6ef6
Superblock backups stored on blocks:
        32768, 98304

Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done

# tune2fs -O extra_isize sparse_file
tune2fs 1.46.5 (30-Dec-2021)

# dumpe2fs -h sparse_file | grep features
dumpe2fs 1.46.5 (30-Dec-2021)
Filesystem features: ext_attr resize_inode dir_index filetype sparse_super large_file extra_isize

# file sparse_file
sparse_file: Linux rev 1.0 ext2 filesystem data, UUID=60ab585c-0cc5-4e1c-b89d-98ee8eab6ef6 (large files)

# rm sparse_file


# dd if=/dev/zero of=sparse_file bs=1 count=0 seek=512M
0+0 records in
0+0 records out
0 bytes copied, 4.1416e-05 s, 0.0 kB/s

# mke2fs -t ext4 sparse_file
mke2fs 1.46.5 (30-Dec-2021)
Discarding device blocks: done
Creating filesystem with 131072 4k blocks and 32768 inodes
Filesystem UUID: bf3337cb-bdd0-489c-a161-3d77450c4341
Superblock backups stored on blocks:
        32768, 98304

Allocating group tables: done
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done

# dumpe2fs -h sparse_file | grep features
dumpe2fs 1.46.5 (30-Dec-2021)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Journal features: (none)

# tune2fs -O ^has_journal,^64bit,^huge_file,^dir_nlink,^extra_isize,^metadata_csum sparse_file
tune2fs 1.46.5 (30-Dec-2021)
Disabling checksums could take some time.
Proceed anyway (or wait 5 seconds to proceed) ? (y,N) y

Please run e2fsck -f on the filesystem.

After running e2fsck, please run `resize2fs -s sparse_file' to disable 64-bit mode.

# e2fsck -f sparse_file
e2fsck 1.46.5 (30-Dec-2021)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: +(98304--98368)
Fix<y>? yes to all
Free blocks count wrong for group 0000001 (36799, counted=32703).
Free blocks count wrong for group 0000002 (28672, counted=32768).

sparse_file: ***** FILE SYSTEM WAS MODIFIED *****
sparse_file: 11/32768 files (0.0% non-contiguous), 2257/131072 blocks

# resize2fs -s sparse_file
resize2fs 1.46.5 (30-Dec-2021)
Converting the filesystem to 32-bit.
The filesystem on sparse_file is now 131072 (4k) blocks long.

# dumpe2fs -h sparse_file | grep features
dumpe2fs 1.46.5 (30-Dec-2021)
Filesystem features: ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file uninit_bg

# file sparse_file
sparse_file: Linux rev 1.0 ext2 filesystem data, UUID=bf3337cb-bdd0-489c-a161-3d77450c4341 (extents) (large files)

# rm sparse_file
Additional Information: (My ext2+extra_isize=ext4 is on /dev/nvme0n1p5:)

# dumpe2fs -h /dev/nvme0n1p5 | grep features
dumpe2fs 1.46.5 (30-Dec-2021)
Filesystem features: ext_attr resize_inode dir_index filetype sparse_super large_file extra_isize

# mount -t ext2 /dev/nvme0n1p5 /boot
mount: /boot: wrong fs type, bad option, bad superblock on /dev/nvme0n1p5, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

# mount -t ext3 /dev/nvme0n1p5 /boot
mount: /boot: wrong fs type, bad option, bad superblock on /dev/nvme0n1p5, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

# dmesg -t | tail
EXT4-fs (nvme0n1p5): couldn't mount as ext2 due to feature incompatibilities
EXT4-fs (nvme0n1p5): couldn't mount as ext3 due to feature incompatibilities

# mount -t ext4 /dev/nvme0n1p5 /boot
# dmesg -t | tail
EXT4-fs (nvme0n1p5): mounted filesystem 01234567-89ab-cdef-fecd-ba9876543210 without journal. Quota mode: disabled.
ext4 filesystem being mounted at /boot supports timestamps until 2038 (0x7fffffff)

# file -s /dev/nvme0n1p5
/dev/nvme0n1p5: Linux rev 1.0 ext2 filesystem data (mounted or unclean), UUID=01234567-89ab-cdef-fecd-ba9876543210, volume name "Linux boot" (large files)
Attached Files:
Notes
(0003905)
christos   
2023-03-05 19:56   
The test for ext2 vs 3,4 is if it has a journal. I guess we should make it smarter. Do you know where the flags vs versions documentation lives?
(0003911)
lu3   
2023-03-11 17:48   
If not from the Linux kernel sources themselves, I guess maybe the Ext4 (and Ext2/Ext3) Wiki can be informative: https://ext4.wiki.kernel.org/index.php/Main_Page
Especially the https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout should describe what you need.

The man page also lists which features correspond with which extended filesystem version: https://www.man7.org/linux/man-pages/man5/ext4.5.html

I'm not sure what the real differences are, since the ext4 kernel driver can mount ext2/3/4 as well, hence my suggestion to simply call it all "extended filesystem". I guess a logic would simply have to check for the enabled features and decide whether it's an ext2, ext3 or ext4 based on them:
ext2 = filetype | sparse_super | large_file | ext_attr
ext3 = filetype | sparse_super | large_file | has_journal | ext_attr | dir_index | resize_inode
ext4 = (all the rest, currently:) filetype | sparse_super | large_file | has_journal | ext_attr | dir_index | resize_inode | 64bit | dir_nlink | extent | extra_isize | flex_bg | huge_file | meta_bg | uninit_bg | mmp | bigalloc | quota | inline_data | sparse_super2 | metadata_csum | encrypt | metadata_csum_seed | project | ea_inode | large_dir | casefold | verity | stable_inodes

This does, however, leave out the original extended filesystem ("ext1" if you will, or "ext").
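
The feature-based logic suggested above could be sketched in Python as follows. The feature sets are copied verbatim from this note and are not an authoritative mapping; the function name and the subset test are assumptions made for the example.

```python
# Feature lists taken literally from the note above.
EXT2_FEATURES = {"filetype", "sparse_super", "large_file", "ext_attr"}
EXT3_FEATURES = EXT2_FEATURES | {"has_journal", "dir_index", "resize_inode"}


def classify_ext(features: str) -> str:
    """Classify a `dumpe2fs -h` feature string as ext2, ext3 or ext4."""
    enabled = set(features.split())
    if enabled <= EXT2_FEATURES:
        return "ext2"
    if enabled <= EXT3_FEATURES:
        return "ext3"
    return "ext4"  # anything beyond the ext3 set


print(classify_ext("ext_attr resize_inode dir_index filetype "
                   "sparse_super large_file extra_isize"))
```

Taken literally, these lists would classify a feature set such as `ext_attr resize_inode dir_index filetype sparse_super large_file` as ext3, because `resize_inode` and `dir_index` are not in the ext2 list, so the sets would likely need tuning before being turned into magic rules.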


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
217 [file] General tweak always 2020-12-21 12:02 2023-03-11 13:36
Reporter: Helge Kreutzmann Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: assigned Product Version: 5.39  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Issues in file man pages
Description: Dear file maintainer,
the manpage-l10n project maintains a large number of translations of
man pages both from a large variety of sources (including file) as
well for a large variety of target languages.

During their work translators notice different possible issues in the
original (English) man pages. Sometimes this is a straightforward
typo, sometimes a hard to read sentence, sometimes this is a
convention not held up and sometimes we simply do not understand the
original.

We use several distributions as sources and update regularly (at
least every 2 months). This means we are fairly recent (some
distributions like archlinux also update frequently) but might miss
the latest upstream version once in a while, so the error might be
already fixed. We apologize and ask you to close the issue immediately
if this should be the case, but given the huge volume of projects and
the very limited number of volunteers we are not able to double check
each and every issue.

Secondly we translators see the manpages in the neutral po format,
i.e. converted and harmonized, but not the original source (be it man,
groff, xml or other). So we cannot provide a true patch (where
possible), but only an approximation which you need to convert into
your source format.

Finally the issues I'm reporting have accumulated over time and are
not always discovered by me, so sometimes my description of the
problem may be a bit limited - do not hesitate to ask so we can clarify
them.

I'm now reporting the errors for your project. If future reports
should use another channel, please let me know.

Man page: file.1
Issue: What is "E<.Bk -words>"?

"E<.Nm> E<.Bk -words> E<.Op Fl bcdEhiklLNnprsSvzZ0> E<.Op Fl Fl apple> E<.Op "
"Fl Fl exclude-quiet> E<.Op Fl Fl extension> E<.Op Fl Fl mime-encoding> E<.Op "
"Fl Fl mime-type> E<.Op Fl e Ar testname> E<.Op Fl F Ar separator> E<.Op Fl f "
"Ar namefile> E<.Op Fl m Ar magicfiles> E<.Op Fl P Ar name=value> E<.Ar> E<."
"Ek> E<.Nm> E<.Fl C> E<.Op Fl m Ar magicfiles> E<.Nm> E<.Op Fl Fl help>"
--
Man page: file.1
Issue: Two quote signs around magic / magic numbers does not make sense

"in the standard include directory. These files have a E<.Dq \"magic number"
"\"> stored in a particular place near the beginning of the file that tells "
"the E<.Tn UNIX> operating system that the file is a binary executable, and "
"which of several types thereof. The concept of a E<.Dq \"magic\"> has been "
"applied by extension to data files. Any file with some invariant identifier "
"at a small fixed offset into the file can usually be described in this way. "
"The information identifying these files is read from the compiled magic file "
"E<.Pa /usr/share/file/misc/magic.mgc>, or the files in the directory E<.Pa /"
"usr/share/file/misc/magic> if the compiled file does not exist. In "
"addition, if E<.Pa $HOME/.magic.mgc> or E<.Pa $HOME/.magic> exists, it will "
"be used in preference to the system magic files."
--
Man page: file.1
Issue: the file command → the command <.Nm>

"Causes the file command to output the file type and creator code as used by "
"older MacOS versions. The code consists of eight letters, the first "
"describing the file type, the latter the creator. This option works "
"properly only for file formats that have the apple-style output defined."
--
Man page: file.1
Issue: option → (This) option

"option causes symlinks not to be followed (on systems that support symbolic "
"links). This is the default if the environment variable E<.Dv "
"POSIXLY_CORRECT> is not defined."

"option causes symlinks to be followed, as the like-named option in E<.Xr ls "
"1> (on systems that support symbolic links). This is the default if the "
"environment variable E<.Ev POSIXLY_CORRECT> is defined."
--
Man page: file.1
Issue: file → E<.Nm>

"Causes the file command to output mime type strings rather than the more "
"traditional human readable ones. Thus it may say E<.Sq text/plain; "
"charset=us-ascii> rather than E<.Dq ASCII text>."

"On systems where libseccomp E<.Pa ( https://github.com/seccomp/libseccomp>) "
"is available, the E<.Fl S> flag disables sandboxing which is enabled by "
"default. This option is needed for file to execute external decompressing "
"programs, i.e. when the E<.Fl z> flag is specified and the built-in "
"decompressors are not available. On systems where sandboxing is not "
"available, this option has no effect."

"The magic file entries have been collected from various sources, mainly "
"USENET, and contributed by various authors. Christos Zoulas (address below) "
"will collect additional or corrected magic file entries. A consolidation of "
"magic file entries will be distributed periodically."

"John Gilmore revised the code extensively, making it better than the first "
"version. Geoff Collyer found several inadequacies and provided some magic "
"file entries. Contributions of the E<.Sq \\*[Am]> operator by Rob McMahon, "
"E<.Aq cudcv@warwick.ac.uk>, 1989."

"E<.Em Note:> This Debian version of file was built without seccomp support, "
"so this option has no effect."
--
Man page: file.1
Issue: option → flag?

"On systems where libseccomp E<.Pa ( https://github.com/seccomp/libseccomp>) "
"is available, E<.Nm> is enforces limiting system calls to only the ones "
"necessary for the operation of the program. This enforcement does not "
"provide any security benefit when E<.Nm> is asked to decompress input files "
"running external programs with the E<.Fl z> option. To enable execution of "
"external decompressors, one needs to disable sandboxing using the E<.Fl S> "
"flag."
--
Man page: file.1
Issue: Missing full stop.

"Some of the encoding logic is hard-coded in encoding.c and can be moved to "
"the magic files if we had a !:charset annotation"
--
Man page: file.1
Issue: This bug was closed in 2008?

"Store arbitrarily long strings, for example for %s patterns, so that they "
"can be printed out. Fixes Debian bug #271672. This can be done by "
"allocating strings in a string pool, storing the string pool at the end of "
"the magic file and converting all the string pointers to relative offsets "
"from the string pool."
--
Man page: file.1
Issue 1: security considerations) → security) considerations
Issue 2: so move around the file → to move around in the file

"If the offsets specified internally in the file exceed the buffer size ( E<."
"Dv HOWMANY> variable in file.h), then we don't seek to that offset, but we "
"give up. It would be better if buffer managements was done when the file "
"descriptor is available so move around the file. One must be careful though "
"because this has performance (and thus security considerations)."
--
Man page: file.1
Issue: is E<.Ek> → file? This long string is difficult to review for translation

"E<.Nm> E<.Bk -words> E<.Op Fl bcdEhiklLNnprsSvzZ0> E<.Op Fl Fl apple> E<.Op "
"Fl Fl extension> E<.Op Fl Fl mime-encoding> E<.Op Fl Fl mime-type> E<.Op Fl "
"e Ar testname> E<.Op Fl F Ar separator> E<.Op Fl f Ar namefile> E<.Op Fl m "
"Ar magicfiles> E<.Op Fl P Ar name=value> E<.Ar> E<.Ek> E<.Nm> E<.Fl C> E<.Op "
"Fl m Ar magicfiles> E<.Nm> E<.Op Fl Fl help>"
--
Man page: file.1
Issue: Two quote signs around magic / magic numbers does not make sense

"in the standard include directory. These files have a E<.Dq \"magic number"
"\"> stored in a particular place near the beginning of the file that tells "
"the E<.Tn UNIX> operating system that the file is a binary executable, and "
"which of several types thereof. The concept of a E<.Dq \"magic\"> has been "
"applied by extension to data files. Any file with some invariant identifier "
"at a small fixed offset into the file can usually be described in this way. "
"The information identifying these files is read from /etc/magic and the "
"compiled magic file E<.Pa /usr/share/misc/magic.mgc>, or the files in the "
"directory E<.Pa /usr/share/misc/magic> if the compiled file does not exist. "
"In addition, if E<.Pa $HOME/.magic.mgc> or E<.Pa $HOME/.magic> exists, it "
"will be used in preference to the system magic files."
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003517)
christos   
2021-01-03 20:45   
Issue: What is "E<.Bk -words>"?
- See the "Keep" macro https://www.freebsd.org/cgi/man.cgi?query=mdoc&sektion=7

Issue: Two quote signs around magic / magic numbers does not make sense
- Removed; they don't make a difference.

Issue: the file command → the command <.Nm>
- Use .Nm

Issue: option → (This) option
- Added "This"

Issue: file → E<.Nm>
- fixed .Nm

Issue: option → flag?
- changed all to option

Issue: Missing full stop.
- fixed.

Issue: This bug was closed in 2008?
- No, it is still broken (string limits in magic descriptions)

Issue 1: security considerations) → security) considerations
Issue 2: so move around the file → to move around in the file
- Rewrote and clarified.

Issue: is E<.Ek> → file? This long string is difficult to review for translation
- This is the list of flags. .Ek is the closing of .Bk macro.

Issue: Two quote signs around magic / magic numbers does not make sense
- duplicate, fixed.
(0003524)
Helge Kreutzmann   
2021-01-05 15:39   
Thanks for the swift handling, no more comments from my side.
(0003707)
Helge Kreutzmann   
2022-03-09 17:14   
Has this already landed in some kind of release? I wonder because I still see some issues in file(1) in all our upstream distributions.
(0003910)
Helge Kreutzmann   
2023-03-11 13:36   
As of mid February 2023 I still see them in the major distros. What is the ETA for including them?


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
424 [file] General feature N/A 2023-02-15 11:27 2023-03-05 20:16
Reporter: polluks Platform: MacBookPro17,1  
Assigned To: christos OS: macOS  
Priority: normal OS Version: 12.5  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: audio update
Description: mod.get psyched: 4-channel Protracker module sound data Title: "get psyched"
mod.lose: 4-channel Protracker module sound data Title: "lose"
mod.metal heads: 4-channel Protracker module sound data Title: "metal heads"
mod.metallic bop amiga: 4-channel Protracker module sound data Title: "metallic bop amiga"
mod.robot attack: 4-channel Protracker module sound data Title: "robot attack"
mod.rushin in: 4-channel Protracker module sound data Title: "rushin in"
mod.soundfx: 4-channel Protracker module sound data Title: "soundfx"
mod.win: 4-channel Protracker module sound data Title: "win"

https://github.com/zeropolis79/PETSCIIRobots-SDL/tree/main/Music
Tags:
Steps To Reproduce: diff --git a/magic/Magdir/audio b/magic/Magdir/audio
index 7a0a192b..c7fa4b38 100644
--- a/magic/Magdir/audio
+++ b/magic/Magdir/audio
@@ -188,6 +188,7 @@
 #audio/x-protracker-module
 >0 string >\0 Title: "%s"
 1080 string M!K! 4-channel Protracker module sound data
+1080 string !PM! 4-channel Protracker module sound data
 !:mime audio/x-mod
 #audio/x-protracker-module
 >0 string >\0 Title: "%s"
Additional Information: Running test: ../tests/cmd1
TZ=UTC MAGIC=../magic/magic ./test -e ../tests/cmd1.testfile ../tests/cmd1.result
../tests/cmd1.testfile: 4-channel Protracker module sound data
test: ERROR: result was (len 38)
4-channel Protracker module sound data
expected (len 46)
a /usr/bin/cmd1 script, ASCII text executable
System Description Apple M1
Attached Files:
Notes
(0003895)
polluks   
2023-02-15 12:01   
Magic fixed, check ok:

diff --git a/magic/Magdir/audio b/magic/Magdir/audio
index 7a0a192b..c7fa4b38 100644
--- a/magic/Magdir/audio
+++ b/magic/Magdir/audio
@@ -188,6 +188,7 @@
 #audio/x-protracker-module
 >0 string >\0 Title: "%s"
 1080 string M!K! 4-channel Protracker module sound data
+1080 string \!PM! 4-channel Protracker module sound data
 !:mime audio/x-mod
 #audio/x-protracker-module
 >0 string >\0 Title: "%s"
(0003908)
christos   
2023-03-05 20:16   
Added (the bang needed to be escaped; that's why the test broke).
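To illustrate what the added entry is meant to test, here is a minimal Python sketch of the signature check at offset 1080. It is not libmagic code: `is_protracker_4ch`, the sample buffer and the stand-in script content are made up for illustration. My reading of magic(5) is that an unescaped leading `!` in the test value is taken as the negation operator, which would explain why the first patch also matched the cmd1 script; treat that reading as an assumption.

```python
# Offset and signatures are taken from the magic entries quoted in this issue.
SIGNATURE_OFFSET = 1080
SIGNATURES = (b"M!K!", b"!PM!")


def is_protracker_4ch(data: bytes) -> bool:
    """Return True if one of the 4-byte signatures sits at offset 1080."""
    return data[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 4] in SIGNATURES


# Hypothetical sample: 1080 zero bytes followed by the new signature.
module_like = bytes(SIGNATURE_OFFSET) + b"!PM!"
# Hypothetical stand-in for a short script such as tests/cmd1.testfile.
script_like = b"#!/usr/bin/cmd1\n"

print(is_protracker_4ch(module_like))  # True
print(is_protracker_4ch(script_like))  # False
```

With the literal comparison, a short script can never match, because it has no bytes at offset 1080 at all.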


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
425 [file] General minor always 2023-02-15 12:04 2023-03-05 20:04
Reporter: polluks Platform: MacBookPro17,1  
Assigned To: christos OS: macOS  
Priority: high OS Version: 13.2  
Status: feedback Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: check fails
Description: Running test: ../tests/hello-racket_rkt
TZ=UTC MAGIC=../magic/magic ./test -e ../tests/hello-racket_rkt.testfile ../tests/hello-racket_rkt.result
../tests/hello-racket_rkt.testfile: Racket bytecode (version \002)
test: ERROR: result was (len 30)
Racket bytecode (version \002)
expected (len 30)
Racket bytecode (version 8.5)
Tags:
Steps To Reproduce:
Additional Information:
System Description Apple M1
Attached Files:
Notes
(0003907)
christos   
2023-03-05 20:04   
Interesting, I tried it on my M1 with 13.2.1 and I could not reproduce it...


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
426 [file] General minor always 2023-02-17 15:04 2023-03-05 20:01
Reporter: claudiu Platform:  
Assigned To: christos OS: Ubuntu  
Priority: normal OS Version: 20.04  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Error "lhs/off overflow 4294967295 0" is printed to console
Description: When running "file" over files which are composed of only 0xff bytes (at least 6 bytes), I get the above error. For example:

{code}
$ ./file -m magic.mgc ff.bin
lhs/off overflow 4294967295 0
ff.bin: ISO-8859 text, with no line terminators
$ hexdump -C ff.bin
00000000 ff ff ff ff ff ff |......|
00000006
{code}

The error seems to be generated from the do_ops function:
{code}
file_private int
do_ops(struct magic *m, uint32_t *rv, intmax_t lhs, intmax_t off)
{
    intmax_t offset;
    // On purpose not INTMAX_MAX
    if (lhs >= UINT_MAX || lhs <= INT_MIN ||
        off >= UINT_MAX || off <= INT_MIN) {
        fprintf(stderr, "lhs/off overflow %jd %jd\n", lhs, off);
        return 1;
    }
{code}
However, my knowledge of libmagic is limited, so I don't understand why this is a problem.

Aside from the error itself, I'm wondering why such errors are printed to the console, since this is part of the libmagic functionality...but of course, this is a separate issue.
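A rough way to see why the message appears for this input: the printed lhs value, 4294967295, is exactly what four 0xff bytes give when read as an unsigned 32-bit integer, and that equals UINT_MAX on platforms with a 32-bit int. The Python sketch below mirrors only the quoted condition. The 32-bit widths are an assumption, `out_of_range` is a hypothetical helper, and which magic entry produces the 4-byte read is not stated in this report.

```python
# Assumption: int and unsigned int are 32 bits wide, as on common platforms.
UINT_MAX = 2**32 - 1
INT_MIN = -(2**31)


def out_of_range(lhs: int, off: int) -> bool:
    """Mirror the quoted do_ops() condition; True means the overflow path is taken."""
    return (lhs >= UINT_MAX or lhs <= INT_MIN or
            off >= UINT_MAX or off <= INT_MIN)


# Four 0xff bytes as an unsigned 32-bit value (byte order does not matter here).
lhs = int.from_bytes(b"\xff\xff\xff\xff", "little")

print(lhs)                              # 4294967295
print(out_of_range(lhs, 0))             # True
print(out_of_range(lhs - 1, 0))         # False
```

Because the comparison is `>=` rather than `>`, the all-ones value itself is rejected, which matches the "lhs/off overflow 4294967295 0" line above.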
Tags: bug
Steps To Reproduce: 1. Create a file with only 0xff bytes:
{code}
$ printf "\xff\xff\xff\xff\xff\xff" > ff.bin
{code}
2. Run "file" on it:
{code}
$ ./file -m magic.mgc ff.bin
lhs/off overflow 4294967295 0
ff.bin: ISO-8859 text, with no line terminators
{code}
Additional Information: I first encountered this in a file from an ISO archive: https://mirror.netsite.dk/centos/7.9.2009/isos/x86_64/CentOS-7-x86_64-DVD-2207-02.iso

The file location within the ISO is: CentOS-7-x86_64-DVD-2207-02.iso --> Packages/ecj-4.5.2-3.el7.x86_64.rpm --> ecj-4.5.2-3.el7.src.cpio.xz --> ecj-4.5.2-3.el7.src.cpio --> ./usr/share/java/ecj.jar --> org/eclipse/jdt/internal/compiler/parser/unicode/part2.rsc
Attached Files: softmagic.c.patch (3,506 bytes) 2023-02-23 08:39
https://bugs.astron.com/file_download.php?file_id=327&type=bug
Notes
(0003896)
polluks   
2023-02-20 13:32   
workaround "2>/dev/null"
(0003897)
claudiu   
2023-02-23 08:39   
I've attached a patch that only prints those messages to stderr if the MAGIC_DEBUG flag is set. This seems to be the rule in the libmagic code, aside from some special cases (e.g. if CDF_DEBUG is defined).
(0003906)
christos   
2023-03-05 20:01   
Fixed to only print debugging with debug.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
429 [file] General minor always 2023-02-27 06:54 2023-03-05 19:52
Reporter: xry111 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Build failure with gcc 4.8
Description: We have the following build failure with file 5.44 and gcc 4.8:

  CC funcs.lo
../../src/funcs.c: In function 'check_regex':
../../src/funcs.c:665:2: error: 'for' loop initial declarations are only allowed in C99 mode
  for (const char *p = pat; *p; p++) {
  ^
../../src/funcs.c:665:2: note: use option -std=c99 or -std=gnu99 to compile your code
make[3]: *** [Makefile:571: funcs.lo] Error 1
Tags: build
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003904)
christos   
2023-03-05 19:52   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
431 [file] General minor always 2023-03-04 20:50 2023-03-05 19:45
Reporter: Barteks2x Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: ERROR: (null) when running with --mime-type on a specific byte sequence
Description: I found this issue while trying to use the source code of "file" to make a "data recovery" tool for my own use that scans a disk for file signatures, and ran into this error.

The shortest sequence of bytes (in hex) that reproduces this issue is:
0000 feff f9ff f6ff
Tags:
Steps To Reproduce: printf "\x00\x00\xfe\xff\xf9\xff\xf6\xff" > test && file --mime-type test
Additional Information: After debugging it and reading the code this behavior seems intentional, but previous bug reports about similar output seem to contradict that observation.

It appears to fail in file_ascmagic, in file_ascmagic_with_encoding - it initially detects this as a UTF-32 file but then attempts to handle it as UTF-8. But overall outputting that error when this code fails to get any information about the text seems intentional.
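The byte sequence can be inspected outside of file as well. The Python sketch below uses Python's own codecs, not libmagic's encoding logic, so it only illustrates the data: the first four bytes are the UTF-32 big-endian byte order mark, the next four bytes are not a valid code point, and the whole buffer is not valid UTF-8 either. `try_decode` is a hypothetical helper.

```python
import codecs

# The shortest reproducer from this report: 0000 feff f9ff f6ff
data = bytes.fromhex("0000fefff9fff6ff")


def try_decode(raw: bytes, encoding: str) -> bool:
    """Return True if raw decodes cleanly with the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


has_utf32_be_bom = data.startswith(codecs.BOM_UTF32_BE)
code_unit = int.from_bytes(data[4:8], "big")

print(has_utf32_be_bom)              # True
print(code_unit > 0x10FFFF)          # True: beyond the Unicode range
print(try_decode(data, "utf-32"))    # False
print(try_decode(data, "utf-8"))     # False
```

So the input looks like UTF-32 from its first four bytes but cannot be decoded as text in either encoding, which fits the observation that the text path ends up with no information to report.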
Attached Files:
Notes
(0003902)
christos   
2023-03-05 19:45   
fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
430 [file] General tweak always 2023-03-03 01:38 2023-03-05 19:02
Reporter: lbrtchx Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: feedback Product Version: 5.39  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: is there a way to get just the basic metadata about a file type without too much specifics about the particular file?
Description:  if you use:
 file --brief --mime "$file"
 you would get too little, practically unusable info.
 file --brief "$file"
 would give you what you need when it comes to pdf files (yes, the structure of pdf files depends on their version), but with images it gives you data relating to the specific file, such as:

 org/wikipedia/en/Big_Bang_files/16px-He1523a.jpg|JPEG image data,
baseline, precision 8, 16x18, components 3
 org/wikipedia/en/East_Germany_files/20px-DDR_-_helfer_der_volkspolizei.jpg|JPEG
image data, baseline, precision 8, 20x13, components 3
 org/wikipedia/en/Country_code_top-level_domain_files/23px-Flag_of_Hong_Kong.png|PNG
image data, 23 x 15, 8-bit colormap, non-interlaced
 org/wikipedia/en/Country_code_top-level_domain_files/45px-Flag_of_Venezuela.png|PNG
image data, 45 x 30, 8-bit colormap, non-interlaced

 is there a way to go like:

 file --brief --no-specifics "$file"

 to just get:

 JPEG image data, baseline, precision 8, components 3
 PNG image data, 8-bit colormap, non-interlaced

 If not I would propose it as a recommendation/RFE of sort.

 You could always use some code to clean up the output from file, but it would be optimal if file had that option.
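 Until such an option exists, a post-processing step along these lines might do. This is a minimal Python sketch, assuming the only file-specific field is the "WxH" dimension pair as in the examples above; `strip_dimensions` is a hypothetical helper, not part of file.

```python
import re

# A field that consists only of "<digits> x <digits>", with optional spaces.
DIMENSIONS = re.compile(r"\d+\s*x\s*\d+")


def strip_dimensions(description: str) -> str:
    """Drop comma-separated fields that look like image dimensions."""
    fields = [field.strip() for field in description.split(",")]
    return ", ".join(f for f in fields if not DIMENSIONS.fullmatch(f))


print(strip_dimensions("JPEG image data, baseline, precision 8, 16x18, components 3"))
print(strip_dimensions("PNG image data, 23 x 15, 8-bit colormap, non-interlaced"))
```

 Other formats may print per-file details in different shapes, so the pattern would need extending for anything beyond these two cases.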

Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003901)
christos   
2023-03-05 19:02   
So the size of the image is the only information that should be considered image-specific?


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
422 [file] General feature always 2023-02-07 06:13 2023-02-07 06:13
Reporter: wzy Platform:  
Assigned To: OS:  
Priority: low OS Version:  
Status: new Product Version: 5.44  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: shell completion
Description: Can file have shell completions for bash/zsh/...?
Tags: completion, shell
Steps To Reproduce:
Additional Information: Some projects like [github-cli](https://github.com/cli/cli/blob/cf4b73ff958b272cf3c9c0cf9351459f76b793a0/pkg/cmd/completion/completion.go) maintain shell completions in their own repository, and some projects have their shell completions maintained in other repositories like bash-completion.
(file has a bash completion in bash-completion, although it is old.)
Attached Files:
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
419 [file] General minor always 2023-02-02 06:56 2023-02-04 13:23
Reporter: davidjb Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Add Cflags compile-time flags for pkg-config
Description: Using libmagic (magic.h) requires knowing where its headers are located on the system. In my case, I'm compiling a project that depends on magic.h on a system where the path isn't set system-wide (e.g. macOS, Homebrew). In other words, running `pkg-config --cflags` should return the location of the header files for this project so that it's usable.

I've attached the patch that facilitates this for libmagic.pc.in. It would be greatly appreciated if you could incorporate this fix. Thanks very much for such a useful library!
Tags:
Steps To Reproduce: 1. ./configure && make && make install
2. Run pkg-config --cflags and observe no output
Additional Information:
Attached Files: libmagic.pc.in-cflags.patch (273 bytes) 2023-02-02 06:56
https://bugs.astron.com/file_download.php?file_id=325&type=bug
Notes
(0003894)
christos   
2023-02-04 13:23   
committed, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
420 [file] General minor have not tried 2023-02-03 20:35 2023-02-03 21:06
Reporter: jstein Platform:  
Assigned To: administrator OS:  
Priority: normal OS Version:  
Status: assigned Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: GARMIN Firmware images (new file format)
Description: The format is documented on:
https://www.memotech.franken.de/FileFormats/Garmin_GCD_Format.pdf

You find examples here:
https://www.gpsrchive.com/GPSMAP/GPSMAP%2066sr/Firmware.html#x-Firmware%20History-5.50


head gupdate.gcd | strings
GARMINd
Copyright 1996-2013 by Garmin Ltd. or its subsidiaries.
@FYFRFKF


head gupdate.gcd | hexdump
0000000 4147 4d52 4e49 0064 0001 0001 02dc 1500
0000010 0000 0000 0000 0000 0000 0000 0000 0000
0000020 0000 0000 0000 0003 0009 d410 135c 4504
0000030 140d 0541 3700 4300 706f 7279 6769 7468
0000040 3120 3939 2d36 3032 3331 6220 2079 6147
0000050 6d72 6e69 4c20 6474 202e 726f 6920 7374
0000060 7320 6275 6973 6964 7261 6569 2e73 0001
0000070 0001 024b 8900 000f 0000 0000 0000 0000
0000080 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000 0001 0001 0664 0a00 0a00 1510 0920 0d10
0001010 0310 0750 0a00 0800 0000 0942 6e00 260e
0001020 0802 0000 08ff 9ff0 dae5 0f41 2c00 0a9b
0001030 2e00 0a9b 3c00 0600 0000 0000 f000 9db9
0001040 0f38 6246 e8c7 ffff ffff ffff c2ff 0941
0001050 0000 0000 6e00 000e 2600 2402 0000 c800
0001060 0c00 a080 14e1 9f01 00e5 a010 08e3 8010
0001070 00e5 a000 4ae3 0102 01eb a000 00e3 9f11
0001080 3ae5 0102 01eb a000 f8e3 9f10 29e5 0102
0001090 01eb a000 42e3 0102 01eb a000 e8e3 9f10
00010a0 32e5 0102 01eb a000 e0e3 9f10 21e5 0102
00010b0 00eb 0f00 1fe1 0000 13e2 5000 00e3 a080
00010c0 0003 a0b0 0003 a0a0 0003 a090 d103 a000
00010d0 00e3 21f0 b8e1 9fd0 d2e5 a000 00e3 21f0
00010e0 b0e1 9fd0 d3e5 a000 00e3 21f0 a8e1 9fd0
00010f0 dbe5 a000 00e3 21f0 a0e1 9fd0 d7e5 a000
0001100 00e3 21f0 98e1 9fd0 dfe5 a000 00e3 21f0
0001110 90e1 9fd0 02e5 a01a 10e3 011f 00ee a010
0001120 10e3 051f 17ee 071f 17ee 081f 10ee 120f
0001130 00ee a000 ffe1 ffff 10ea 110f 01ee 800a
0001140 10e3 010f 10ee 120f 00ee a000 ffe1 ffff
0001150 01ea 8f00 10e2 2fff 40e1 b2f0 27fa 84f0
0001160 40f9 5946 5246 4b46 0046 2524 261c 271c
0001170 a01c a146 a246 a346 1046 7ef0 46f9 19b0
0001180 0004 1e10 0010 ff00 00ff fe00 00ff ff10
0001190 00ff fe10 20ff fe00 20ff fe02 40ff fe02
00011a0 80ff fe02 60ff fe02 f0ff e8a9 6e02 5100
00011b0 75e3 00f5 700a 5100 eae3 00f5 660a 5100
00011c0 00e3 010b 650a 5100 fee3 000a
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
417 [file] General feature always 2023-01-17 01:29 2023-01-24 20:46
Reporter: rebus Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: filetype for Microsoft OneNote files is not detected
Description: The file utility currently doesn't recognize the format of Microsoft OneNote files. As the format is starting to be used as a vector for spreading malware, it is becoming useful to have the file utility recognize it.
Tags: magic
Steps To Reproduce: $ file test.one
test.one: data

Additional Information: OneNote files start with the magic bytes:
- E4525C7B8CD8A74DAEB15378D02996D3 - for .one file = GUID {7B5C52E4-D88C-4DA7-AEB1-5378D02996D3} as specified by MS-ONESTORE - https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-onestore/405b958b-4cb7-4bac-81cc-ce0184249670

- A12FFF43D9EF764C9EE210EA5722765F - for the .onetoc2 filetype = GUID {43FF2FA1-EFD9-4C76-9EE2-10EA5722765F} as specified by the MS-ONESTORE and ONETOC2 - https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-onestore/2b394c6b-8788-441f-b631-da1583d772fd

Mime type should be probably "application/onenote".
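The relation between the GUIDs and the magic bytes listed above can be checked with Python's `uuid` module: the first three GUID fields are stored little-endian, the rest in the order written. This is only a cross-check of the values in this report; `guid_to_magic` is a hypothetical helper.

```python
import uuid


def guid_to_magic(guid: str) -> str:
    """Return the on-disk byte sequence of a GUID as upper-case hex."""
    return uuid.UUID(guid).bytes_le.hex().upper()


print(guid_to_magic("7B5C52E4-D88C-4DA7-AEB1-5378D02996D3"))
# E4525C7B8CD8A74DAEB15378D02996D3  (.one)
print(guid_to_magic("43FF2FA1-EFD9-4C76-9EE2-10EA5722765F"))
# A12FFF43D9EF764C9EE210EA5722765F  (.onetoc2)
```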


Attached Files:
Notes
(0003893)
christos   
2023-01-24 20:46   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
416 [file] General minor have not tried 2023-01-11 21:47 2023-01-24 20:37
Reporter: thesamesam Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Test failure on sparc
Description: Running the test suite on Linux + sparc results in a test suite failure.

Noticed with 5.44 but reproduced from git:
```
Running test: ../tests/fit-map-data
TZ=UTC MAGIC=../magic/magic ./test -e ../tests/fit-map-data.testfile ../tests/fit-map-data.result
../tests/fit-map-data.testfile: FIT Map data, unit id 65536, serial 3879446968, Sat May 31 03:00:34 2014, manufacturer 1 (garmin), product 1632, type 4 (Activity)
test: ERROR: result was (len 130)
FIT Map data, unit id 65536, serial 3879446968, Sat May 31 03:00:34 2014, manufacturer 1 (garmin), product 1632, type 4 (Activity)
expected (len 131)
FIT Map data, unit id 65536, serial 3879446968, Sat May 31 10:00:34 2014, manufacturer 1 (garmin), product 1632, type 4 (Activity)
make[2]: *** [Makefile:761: check-local] Error 1
make[2]: Leaving directory '/root/file/tests'
make[1]: *** [Makefile:637: check-am] Error 2
make[1]: Leaving directory '/root/file/tests'
make: *** [Makefile:465: check-recursive] Error 1
```

I've attached the full log from `make check`. Let me know if I can give more information. Access to the environment is available if needed.
Tags:
Steps To Reproduce:
Additional Information: # emerge --info
Portage 3.0.42 (python 3.10.9-final-0, default/linux/sparc/17.0/64ul/desktop, gcc-12, glibc-2.36-r6, 5.15.69-gentoo sparc64)
=================================================================
System uname: Linux-5.15.69-gentoo-sparc64-sun4v-with-glibc2.36
KiB Mem: 531346544 total, 402137592 free
KiB Swap: 0 total, 0 free
Timestamp of repository gentoo: Wed, 11 Jan 2023 21:02:13 +0000
sh dash 0.5.12
ld GNU ld (Gentoo 2.39 p5) 2.39.0
ccache version 4.7.4 [disabled]
app-misc/pax-utils: 1.3.5::gentoo
app-shells/bash: 5.2_p15::gentoo
dev-lang/perl: 5.36.0-r1::gentoo
dev-lang/python: 3.8.16::gentoo, 3.9.16::gentoo, 3.10.9::gentoo, 3.11.1::gentoo
dev-lang/rust-bin: 1.65.0::gentoo
dev-util/ccache: 4.7.4::gentoo
dev-util/cmake: 3.25.1::gentoo
dev-util/meson: 1.0.0::gentoo
sys-apps/baselayout: 2.9::gentoo
sys-apps/openrc: 0.45.2-r2::gentoo
sys-apps/sandbox: 2.29::gentoo
sys-devel/autoconf: 2.71-r5::gentoo
sys-devel/automake: 1.16.5::gentoo
sys-devel/binutils: 2.39-r4::gentoo
sys-devel/binutils-config: 5.4.1::gentoo
sys-devel/gcc: 10.4.1_p20221222::gentoo, 11.3.1_p20221223::gentoo, 12.2.1_p20221224::gentoo
sys-devel/gcc-config: 2.8::gentoo
sys-devel/libtool: 2.4.7-r1::gentoo
sys-devel/make: 4.4::gentoo
sys-kernel/linux-headers: 6.1::gentoo (virtual/os-headers)
sys-libs/glibc: 2.36-r6::gentoo
Repositories:

gentoo
    location: /bound/portage
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    volatile: True
    sync-rsync-verify-jobs: 1
    sync-rsync-extra-opts:
    sync-rsync-verify-max-age: 24
    sync-rsync-verify-metamanifest: yes

ACCEPT_KEYWORDS="sparc ~sparc"
ACCEPT_LICENSE="@FREE"
CBUILD="sparc64-unknown-linux-gnu"
CFLAGS="-O2 -pipe -mcpu=ultrasparc"
CHOST="sparc64-unknown-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe -mcpu=ultrasparc"
DISTDIR="/bound/distfiles"
EMERGE_DEFAULT_OPTS="--complete-graph --with-bdeps=y --keep-going --deep --jobs=4"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-O2 -pipe -mcpu=ultrasparc"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance buildpkg-live cgroup config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe -mcpu=ultrasparc"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="C.UTF8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,--hash-style=gnu"
LEX="flex"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="X a52 aac acl alsa big-endian branding bzip2 cairo caps cdda cdr cli crypt cups dbus dri dts dvd dvdr encode exif filecaps flac fortran gdbm gif gmp gpm gtk gui iconv icu introspection ipv6 jit jpeg lcms libglvnd libnotify libtirpc llvm-libunwind mad mng mp3 mp4 mpeg ncurses nls nptl ogg opengl openmp pam pango pcre pdf png policykit ppds readline sdl sound sparc spell split-usr ssl startup-notification svg test-rust tiff truetype udev udisks unicode upower usb vorbis wxwidgets x264 xattr xcb xft xml xv xvid zlib" ADA_TARGET="gnat_2021" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-4 php8-0" POSTGRES_TARGETS="postgres12 postgres13" PYTHON_SINGLE_TARGET="python3_10" PYTHON_TARGETS="python3_10 python3_8 python3_9" RUBY_TARGETS="ruby27 ruby26" USERLAND="GNU" VIDEO_CARDS="dummy fbdev" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal 
rawnat logmark ipmark dhcpmac delude chaos account"
Unset: ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, LINGUAS, MAKE, MAKEFLAGS, MAKEOPTS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
Attached Files: file-test-suite.log (5,612 bytes) 2023-01-11 21:47
https://bugs.astron.com/file_download.php?file_id=320&type=bug
Notes
(0003888)
thesamesam   
2023-01-11 21:59   
I've gone back to 5.39 and the same test fails consistently, which is weird because I feel like I would've noticed if it failed previously, as we have 5.43 marked stable on sparc in Gentoo.
(0003889)
thesamesam   
2023-01-12 07:19   
I can't reproduce this in another container, I think something was wrong with the environment, as 'date' wouldn't respect TZ either. Apologies! Please close as invalid.
(0003892)
christos   
2023-01-24 20:37   
The problem was that the test is printing localtime and your timezone is not what the test expects (UTC). Forced in the test program now.
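The two strings in the log differ by exactly seven hours, which is what one instant looks like when rendered in UTC and in a zone seven hours behind it. A minimal Python sketch, assuming a fixed UTC-7 offset inferred from the log (the actual zone of the test machine is not stated here):

```python
from datetime import datetime, timedelta, timezone

FMT = "%a %b %d %H:%M:%S %Y"

# The expected timestamp from the test result, taken as UTC.
stamp = datetime(2014, 5, 31, 10, 0, 34, tzinfo=timezone.utc)

# Assumption: a zone 7 hours behind UTC, inferred from the logged output.
behind_utc = timezone(timedelta(hours=-7))

utc_text = stamp.strftime(FMT)
local_text = stamp.astimezone(behind_utc).strftime(FMT)

print(utc_text)    # Sat May 31 10:00:34 2014
print(local_text)  # Sat May 31 03:00:34 2014
```

Formatting in UTC regardless of the environment makes the output stable, which is the effect of forcing the timezone in the test program.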


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
418 [file] General tweak have not tried 2023-01-20 17:58 2023-01-24 20:30
Reporter: joveler Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Patch for HWP file format signature
Description: This patch revises HancomOffice HWP (Hangul Word Processor) document file format signatures.
HancomOffice HWP is a word processor (or semi-desktop publishing software) mainly used in the Republic of Korea.

*Changes*
1. Add support for the HWPX format
- Hancom is promoting a change of its primary supported format from HWP 5.0 to HWPX.
- HWPX (OWPML) is based on the OCF specification (a PKZIP container), so the signature goes into magDir/archive.

2. Update filetype of HWP 3.0/5.0 format
- HWP 3.0/5.0 filetype now starts with `Hancom HWP (Hangul Word Processor) file`.
- Current HWP 3.0/5.0 format filetype contains `Hangul (Korean)`, but it is highly ambiguous.
  In this context, Hangul is a trademarked name of the word processor, not Korean characters.
  Also, the HWP formats do not have a distinction between Korean/Global HWP (program) releases.
- I put the company name (Hancom) and program name (HWP), following the OOXML filetype convention (e.g. Microsoft Word 2007+).
  I also added the full name of the HWP program, 'Hangul Word Processor', to avoid ambiguity between the program name and extension.
- HWP 3.0 format is a proprietary binary format, so it had been in magDir/wordprocessors.
- HWP 5.0 format uses MS compound data format similar to MS Office 97 ~ 2003.
  The filetype string is hardcoded in src/readcdf.c and also exists in magDir/ole2compounddocs. Both files were patched.

*Before Patch*
```
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwp: Hangul (Korean) Word Processor File 5.x
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwpx: Zip data (MIME type "application/hwp+zip"?)
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP97.hwp: Hangul (Korean) Word Processor File 3.0
```

*After Patch*
```
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwp: Hancom HWP (Hangul Word Processor) file, version 5.0
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwpx: Hancom HWP (Hangul Word Processor) file, HWPX
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP97.hwp: Hancom HWP (Hangul Word Processor) file, version 3.0
```
Tags: hwp hwpx magic
Steps To Reproduce:
Additional Information:
Attached Files: file-5.44-hwp.diff (3,151 bytes) 2023-01-20 17:58
https://bugs.astron.com/file_download.php?file_id=321&type=bug
HWP2016.hwpx (14,377 bytes) 2023-01-24 15:59
https://bugs.astron.com/file_download.php?file_id=324&type=bug
HWP2016.hwp (9,216 bytes) 2023-01-24 15:59
https://bugs.astron.com/file_download.php?file_id=323&type=bug
HWP97.hwp (8,975 bytes) 2023-01-24 15:59
https://bugs.astron.com/file_download.php?file_id=322&type=bug
Notes
(0003890)
joveler   
2023-01-24 15:59   
Here are test sample files of the HWP format family.
(0003891)
christos   
2023-01-24 20:30   
Committed, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
415 [file] General minor always 2023-01-04 14:50 2023-01-08 20:14
Reporter: vt Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: ChangeLog.lz: empty (lzip compressed data, version: 1)
Description: `FILE5_44`/`master` (at 8a07d6803554b25fbee28cd34bb0a0e7cea4cfd8) no longer correctly detects the content of `lzip`-compressed files.
```
file (master)$ src/file -S -z -m magic/magic.mgc ChangeLog.lz
ChangeLog.lz: empty (lzip compressed data, version: 1)
```
Tags:
Steps To Reproduce: ```
file (master)$ lzip -k ChangeLog
file (master)$ ls -la ChangeLog*
-rw-r--r-- 1 vt vt 61943 Jan 4 17:41 ChangeLog
-rw-r--r-- 1 vt vt 17529 Jan 4 17:39 ChangeLog.lz

file (master)$ src/file -S -z -m magic/magic.mgc ChangeLog.lz
ChangeLog.lz: empty (lzip compressed data, version: 1)
```
Additional Information: `FILE5_43` was working correctly:
```
file ((FILE5_43))$ src/file -S -z -m magic/magic.mgc ChangeLog.lz
ChangeLog.lz: ASCII text (lzip compressed data, version: 1)
```
Attached Files:
Notes
(0003887)
christos   
2023-01-08 20:14   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
414 [file] General minor always 2023-01-01 14:12 2023-01-08 17:11
Reporter: arsenm Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Missing handling for EM_AMDGPU and most other architectures
Description: file does not recognize the e_machine type for AMDGPU and many other registered EM_* values. It reports this as "*unknown arch 0xe0*".
Tags:
Steps To Reproduce: $ file amdgpu.o
amdgpu.o: ELF 64-bit LSB shared object, *unknown arch 0xe0* version 1, dynamically linked, not stripped

Attached a random sample
Additional Information: Grepping the sources suggests the list of handled cases is much smaller than the list of registered elf machines:

src/readelf.h:#define EM_SPARC 2
src/readelf.h:#define EM_386 3
src/readelf.h:#define EM_SPARC32PLUS 18
src/readelf.h:#define EM_SPARCV9 43
src/readelf.h:#define EM_IA_64 50
src/readelf.h:#define EM_AMD64 62

The current list of registered target IDs is at https://www.sco.com/developers/gabi/latest/ch4.eheader.html; all of these should be recognized.
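
For reference, the value in the "unknown arch" message corresponds to the e_machine field of the ELF header. The following is a minimal Python sketch of reading that field, not how file itself does it: the function name `elf_machine` and the table `EM_NAMES` are hypothetical, and the table holds only the constants quoted above plus EM_AMDGPU (224, i.e. 0xe0).

```python
import struct

# e_machine values quoted in this report, plus EM_AMDGPU (224 == 0xe0).
# Illustrative subset only; the gABI list is far longer.
EM_NAMES = {
    2: "EM_SPARC",
    3: "EM_386",
    18: "EM_SPARC32PLUS",
    43: "EM_SPARCV9",
    50: "EM_IA_64",
    62: "EM_AMD64",
    224: "EM_AMDGPU",
}


def elf_machine(header: bytes) -> str:
    """Return a name for e_machine, or an 'unknown arch' string."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian.
    if header[5] == 1:
        endian = "<"
    elif header[5] == 2:
        endian = ">"
    else:
        raise ValueError("invalid EI_DATA")
    # e_machine is a 16-bit field at offset 18 in both ELF32 and ELF64.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return EM_NAMES.get(machine, f"unknown arch {machine:#x}")
```

Passing the first 20 bytes of a file (for example `open("amdgpu.o", "rb").read(20)`) is enough for this lookup.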
Attached Files: amdgpu.o (107,040 bytes) 2023-01-01 14:12
https://bugs.astron.com/file_download.php?file_id=319&type=bug
Notes
(0003886)
christos   
2023-01-08 17:11   
These constants are not used for detection, they are used in code to provide custom hardware properties for specific processor architectures. The missing amdgpu was added in magic/Magdir/elf. Thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
409 [file] General block always 2022-12-12 16:37 2023-01-08 16:50
Reporter: Girgias Platform: Linux  
Assigned To: christos OS: CentOS (Core)   
Priority: normal OS Version: 7.9.2009  
Status: assigned Product Version: 5.43  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Compilation error in softmagic.c due to redefining strndup in AIX preprocessor block
Description: This was first reported to PHP here: https://github.com/php/php-src/pull/10074

The issue is located between lines 503 and 523 (https://github.com/file/file/blob/master/src/softmagic.c#L503-L523).
The redeclaration of strndup there should probably have been in an else block; otherwise the following compile errors can arise (the line numbers are inaccurate because this is the PHP version, patched to use the PHP memory allocator):

/www/server/source/php/php82/ext/fileinfo/libmagic/softmagic.c:507:7: error: expected identifier or ‘(’ before ‘__extension__’
char *strndup(const char *, size_t);
^
/www/server/source/php/php82/ext/fileinfo/libmagic/softmagic.c:510:1: error: expected identifier or ‘(’ before ‘__extension__’
strndup(const char *str, size_t n)
^
make: *** [libmagic/softmagic.lo] Error 1
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003878)
christos   
2022-12-26 19:23   
can you provide the preprocessor output cc -E of those lines?
(0003883)
Girgias   
2023-01-02 15:54   
I've just asked the end-user if they can provide this information.
(0003884)
Girgias   
2023-01-08 15:49   
End user came back with this:

    gcc -E ext/fileinfo/libmagic/softmagic.c

[root@VM-EtsmUAZgUPAH php82]# gcc -E ext/fileinfo/libmagic/softmagic.c
# 1 "ext/fileinfo/libmagic/softmagic.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "ext/fileinfo/libmagic/softmagic.c"
# 32 "ext/fileinfo/libmagic/softmagic.c"
# 1 "ext/fileinfo/libmagic/file.h" 1
# 36 "ext/fileinfo/libmagic/file.h"
# 1 "ext/fileinfo/libmagic/config.h" 1
In file included from ext/fileinfo/libmagic/file.h:36:0,
                 from ext/fileinfo/libmagic/softmagic.c:32:
ext/fileinfo/libmagic/config.h:1:17: fatal error: php.h: No such file or directory
 #include "php.h"
                 ^
compilation terminated.

    Compile time error

libmagic/softmagic.dep -MT libmagic/softmagic.lo -fPIC -DPIC -o libmagic/.libs/softmagic.o
In file included from /usr/include/string.h:633:0,
                 from /www/server/php/82/include/php/main/../main/php_config.h:2207,
                 from /www/server/php/82/include/php/Zend/zend_config.h:1,
                 from /www/server/php/82/include/php/Zend/zend_portability.h:43,
                 from /www/server/php/82/include/php/Zend/zend_types.h:25,
                 from /www/server/php/82/include/php/Zend/zend.h:27,
                 from /www/server/php/82/include/php/main/php.h:31,
                 from /www/server/source/php/php82/ext/fileinfo/libmagic/config.h:1,
                 from /www/server/source/php/php82/ext/fileinfo/libmagic/file.h:36,
                 from /www/server/source/php/php82/ext/fileinfo/libmagic/softmagic.c:32:
/www/server/source/php/php82/ext/fileinfo/libmagic/softmagic.c:507:7: error: expected identifier or ‘(’ before ‘__extension__’
 char *strndup(const char *, size_t);
       ^
/www/server/source/php/php82/ext/fileinfo/libmagic/softmagic.c:510:1: error: expected identifier or ‘(’ before ‘__extension__’
 strndup(const char *str, size_t n)
 ^
make: *** [libmagic/softmagic.lo] Error 1

Hopefully this is the requested information, and apologies for the delay.
(0003885)
christos   
2023-01-08 16:50   
Unfortunately not, since the compilation terminated early because the user did not supply the correct -I path where php.h lives.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
411 [file] General minor always 2022-12-24 18:57 2022-12-28 17:50
Reporter: yakov Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: Sony XAVC reported as application/octet-stream
Description: The file utility does recognize "MPEG v4 system, Sony XAVC Codec", however it doesn't have a mime-type in the database. Consequently it's reported as application/octet-stream, which prevents xdg-open from opening those files with a video player.
Probably reporting it as video/mp4 would be more appropriate.
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: xavc.patch (422 bytes) 2022-12-26 20:25
https://bugs.astron.com/file_download.php?file_id=317&type=bug
Notes
(0003876)
christos   
2022-12-26 19:15   
It is very difficult to recognize MPEG4. If you have some suggested magic, by all means send it to me.
(0003880)
yakov   
2022-12-26 20:25   
It's already recognized, but it's missing a mime-type; here's a patch.
(0003882)
christos   
2022-12-28 17:50   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
413 [file] General block always 2022-12-27 19:24 2022-12-28 17:48
Reporter: grobian Platform: x86_64  
Assigned To: christos OS: OpenIndiana  
Priority: normal OS Version: 2021.10  
Status: resolved Product Version: 5.44  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: softmagic.c fails to compile due to undefined UINT_MAX
Description: libtool: compile: x86_64-pc-solaris2.11-gcc -DHAVE_CONFIG_H -I. -I/gentoo/prefix64/var/tmp/portage/sys-apps/file-5.44/work/file-5.44/src -I.. -DMAGIC=\"/gentoo/prefix64/usr/share/misc/magic\" -fvisibility=hidden -Wall -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wmissing-declarations -Wredundant-decls -Wnested-externs -Wsign-compare -Wreturn-type -Wswitch -Wshadow -Wcast-qual -Wwrite-strings -Wextra -Wunused-parameter -Wformat=2 -O2 -pipe -c /gentoo/prefix64/var/tmp/portage/sys-apps/file-5.44/work/file-5.44/src/softmagic.c -fPIC -DPIC -o .libs/softmagic.o
/gentoo/prefix64/var/tmp/portage/sys-apps/file-5.44/work/file-5.44/src/softmagic.c: In function ‘do_ops’:
/gentoo/prefix64/var/tmp/portage/sys-apps/file-5.44/work/file-5.44/src/softmagic.c:1462:20: error: ‘UINT_MAX’ undeclared (first use in this function)
 1462 | if (lhs >= UINT_MAX || lhs <= INT_MIN ||
      | ^~~~~~~~
/gentoo/prefix64/var/tmp/portage/sys-apps/file-5.44/work/file-5.44/src/softmagic.c:46:1: note: ‘UINT_MAX’ is defined in header ‘<limits.h>’; did you forget to ‘#include <limits.h>’?
   45 | #include "der.h"
  +++ |+#include <limits.h>
   46 |

% gcc -dumpversion
12.2.0
% uname -a
SunOS hollandscheleeuw 5.11 illumos-a7aaa5137d i86pc i386 i86pc Solaris

Adding a patch as the compiler suggests makes it compile:

% cat solaris.patch
--- src/softmagic.c.orig 2022-12-27 20:05:39.006034099 +0000
+++ src/softmagic.c 2022-12-27 20:05:46.330774386 +0000
@@ -42,6 +42,7 @@
 #include <ctype.h>
 #include <stdlib.h>
 #include <time.h>
+#include <limits.h>
 #include "der.h"
 
 file_private int match(struct magic_set *, struct magic *, file_regex_t **, size_t,

Please see attached patch.
Thanks!
Tags: build
Steps To Reproduce: configure, make
Additional Information:
Attached Files: file-5.44-solaris.patch (395 bytes) 2022-12-27 19:24
https://bugs.astron.com/file_download.php?file_id=318&type=bug
Notes
(0003881)
christos   
2022-12-28 17:48   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
408 [file] General crash always 2022-12-11 11:34 2022-12-26 19:23
Reporter: SpraxDev Platform:  
Assigned To: christos OS: Manjaro Linux  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: --preserve-date causes crash when sandboxing syscalls (SIGSYS)
Description: When I run 'file --preserve-date /bin/true' I get an error message (not in English, otherwise I would share).
But strace gives me the additional info '+++ killed by SIGSYS +++'.

When running 'file --no-sandbox --preserve-date /bin/true' it works without any issues.
Tags:
Steps To Reproduce: Run file --preserve-date /bin/true
Additional Information: When running 'file --no-sandbox --preserve-date /bin/true' it works without any issues.
Attached Files:
Notes
(0003879)
christos   
2022-12-26 19:23   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
410 [file] General minor always 2022-12-18 10:19 2022-12-26 19:21
Reporter: pandrew Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: add support for bitcoin-core blk.dat, rev.dat, also .ldb files
Description: Detect magic bytes and display basic data from the first header.

For the LevelDB table data files, just detect the 64-bit magic number, no further processing.

File attached and on github:

https://github.com/pandrewhk/andrew/blob/master/magic/bitcoin
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: bitcoin (669 bytes) 2022-12-18 10:19
https://bugs.astron.com/file_download.php?file_id=313&type=bug
Notes
(0003877)
christos   
2022-12-26 19:21   
Added, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
412 [file] General block always 2022-12-26 18:37 2022-12-26 18:49
Reporter: joveler Platform: MSYS2 MinGW-w64  
Assigned To: christos OS: Windows 10 22H2  
Priority: normal OS Version: gcc 12.2.0  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: HEAD  
    Target Version:  
Summary: file 5.43 compile error on MinGW/MSYS2 and fix
Description: * This issue is continued from the 5.40 compile issue report, https://bugs.astron.com/view.php?id=255

file 5.43 needs patches to be compiled on MSYS MinGW-w64.

[*] putc call

In `file.c`, `fname_print()` has a code path dedicated to builds without wide-character support.
MinGW-w64 takes that path and hits a `putc` call with no FILE* to write to.
Simply adding a `stdout` argument solves the issue.

-----
file.c: In function 'fname_print':
file.c:605:25: error: too few arguments to function 'putc'
  605 | putc(c);
      | ^~~~
In file included from file.h:79,
                 from file.c:32:
C:/msys64/mingw64/include/stdio.h:705:15: note: declared here
  705 | int __cdecl putc(int _Ch,FILE *_File);
      | ^~~~
-----
[*] pipe call

This issue is related to the previous 5.40 fcntl build error.
Thanks to the official patch, 5.43 does not have any fcntl compile/link error on Windows.
Despite that fix, the Windows C runtime does not provide a `pipe()` function, which causes a linker error.
The proposed patch adds a `__MINGW32__` check at the top of the function to mitigate this.

-----
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: .libs/funcs.o: in function `file_pipe_closexec':
src/funcs.c:850: undefined reference to `pipe'
-----

[*] ioctl call

This issue is identical to the previous 5.40 ioctl build error.

MinGW does not support the ioctl call, which produces a linker error.

-----
C:/msys64/mingw32/bin/../lib/gcc/i686-w64-mingw32/10.2.0/../../../../i686-w64-mingw32/bin/ld.exe: D:\Jang\Build\Source\PEBakery\Library\libmagic\file-5.40-mod\src/compress.c:417: undefined reference to `ioctl'
-----

The proposed patch fixes this by adding a !defined(__MINGW32__) check.

If the library is configured with `--disable-xzlib --disable-bzlib`, the code is excluded and does not block the build.



Tags: build, patch, Windows
Steps To Reproduce: - Install MSYS2 on a Windows machine, and compile file source with normal `configure` and `make`.
Additional Information: [*] Additional info of compile environment

- Platform: Windows 10 22H2 with MSYS2
- Tested Target Arch : i686, x86_64, aarch64
- Toolchain: (1) MinGW-w64 of MSYS2 (i686/x86_64) (2) llvm-mingw (aarch64)
Attached Files: MinGW_w64_pipe_fix.diff (374 bytes) 2022-12-26 18:37
https://bugs.astron.com/file_download.php?file_id=316&type=bug
MinGW_w64_ioctl_fix.diff (660 bytes) 2022-12-26 18:37
https://bugs.astron.com/file_download.php?file_id=315&type=bug
MinGW_w64_putc_fix.diff (326 bytes) 2022-12-26 18:37
https://bugs.astron.com/file_download.php?file_id=314&type=bug
Notes
(0003875)
christos   
2022-12-26 18:49   
Committed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
404 [file] General minor always 2022-11-08 16:03 2022-12-09 18:02
Reporter: manfredsc Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: reopened  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: mime-types and extensions for GRIB
Description: I would suggest enhancing the magic entries of "GRIB" with ext and mime items:

in file magic/Magdir/meteorological:
# https://en.wikipedia.org/wiki/GRIB
0 string GRIB
>7 byte =1 Gridded binary (GRIB) version 1
!:mime application/x-grib
!:ext grb/grib
>7 byte =2 Gridded binary (GRIB) version 2
!:mime application/x-grib2
!:ext grb2/grib2
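
The two proposed entries amount to a check on the 4-byte "GRIB" signature followed by the edition byte at offset 7. A minimal Python sketch of that logic follows; the function name `grib_info` is hypothetical, and this only mirrors the entries above rather than how file evaluates magic.

```python
def grib_info(data: bytes):
    """Return (description, mime, extensions) for GRIB data, or None.

    Mirrors the proposed entries: "GRIB" at offset 0, then the edition
    number as a single byte at offset 7.
    """
    if len(data) < 8 or data[:4] != b"GRIB":
        return None
    edition = data[7]
    if edition == 1:
        return ("Gridded binary (GRIB) version 1",
                "application/x-grib", ("grb", "grib"))
    if edition == 2:
        return ("Gridded binary (GRIB) version 2",
                "application/x-grib2", ("grb2", "grib2"))
    # Other edition values are not covered by the proposed entries.
    return None
```

Keeping the two editions as separate branches reflects the suggestion to give GRIB1 and GRIB2 distinct mime types and suffixes.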


Tags:
Steps To Reproduce:
Additional Information: For reasoning for the mime type, see https://tika.apache.org/1.28.5/formats.html and search for "grib"

http://www.opengis.net/doc/IS/CIS-GRIB2/1.0 (search for "application/")
suggests the mime-type "application/wmo-grib2" to be registered with IANA,
but this never happened and is nowhere used in practice. Until then, it is recommended to use "application/x-grib2"
(for the version 2 variant).

For file suffixes, both the short and the long variants are used approximately equally often in practice,
so I suggest adding both variants. The suffixes grb1 or grib1 are seldom used.

As there is software which only can handle either GRIB1 or GRIB2 and not both, I suggest adding two separate mime entries
and not merging them.
Attached Files:
Notes
(0003870)
christos   
2022-12-02 17:45   
Thanks!
(0003871)
manfredsc   
2022-12-05 10:58   
Thanks a lot.
Your change resulted in

# https://en.wikipedia.org/wiki/GRIB
0 string GRIB
>7 byte >0
>>7 byte <3 Gridded binary (GRIB) version %d
!:mime application/x-grib
!:ext grb/grib

I expected to get an entry for GRIB2, too. GRIB1 is a legacy format which is only seldom used nowadays.
Since 2003 the successor GRIB2 has been the recommended format, and for about 10 years the overwhelming
majority of weather agencies have used GRIB2, in my view.
Suffixes grb and grib are explicitly, exclusively used for GRIB1 datasets. The same goes for the mime type x-grib.

Could we have some suffix and mime entries for GRIB2 too, please?

There may be some vague dreams of a GRIB3, but this will most likely not manifest in the next 10 years, if at all.
It is far more likely that some sort of XML or JSON format will be the successor of GRIB2, and it is not clear what it
will be called.


(0003872)
christos   
2022-12-09 18:02   
changed as suggested.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
405 [file] General minor always 2022-11-15 23:04 2022-12-02 17:42
Reporter: eagr Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Wrong mime for JPEG XL images
Description: The mime type for JXL images has been incorrectly redefined as image/x-jxl by commit b62c39d1e13a7547e634ce36272f94f3fd39cbc4.

Instead, it should be defined as image/jxl
- As such it used to be before the aforementioned commit
- As provided by the URL in the magic/Magdir/jpeg comment to the JXL-related definitions: http://fileformats.archiveteam.org/wiki/JPEG_XL
- As there is no definition for image/x-jxl anywhere in the official JPEG XL reference implementation. There is, however, an image/jxl mime definition:
https://github.com/libjxl/libjxl/blob/main/plugins/mime/image-jxl.xml

The mistake causes image viewers such as XFCE Ristretto, that rely on file, to be unable to load JXL images because the gdk-pixbuf loader from the JXL reference implementation which they also employ only understands the image/jxl mime.
https://github.com/libjxl/libjxl/tree/main/plugins/gdk-pixbuf
Tags:
Steps To Reproduce: - Install ristretto and libjxl, and also jxl-pixbuf-loader if the distro provides the pixbuf loader in a separate package
- Try to open a JPEG XL image with ristretto

Reproducible on a fresh archlinux install.
Additional Information: Setting the JXL mime in magic/Magdir/jpeg back to image/jxl makes things work again.
Attached Files:
Notes
(0003869)
christos   
2022-12-02 17:42   
thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
406 [file] General crash always 2022-11-27 18:00 2022-12-02 17:38
Reporter: rwmjones Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.42  
Product Build: Resolution: won't fix  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: seccomp support causes the -z option to fail on zstd-compressed files
Description: "file" has been compiled with seccomp support. When we ask it to inspect a zstd-compressed file with the -z option, it crashes:

$ ./file-5.42/src/.libs/file -z /var/tmp/lib-i586.so.zst
/var/tmp/lib-i586.so.zst: Bad system call
Tags:
Steps To Reproduce: Please see attached zstd file. Just run the above command on it.
Additional Information:
Attached Files: lib-i586.so.zst (1,712 bytes) 2022-11-27 18:00
https://bugs.astron.com/file_download.php?file_id=311&type=bug
Notes
(0003863)
rwmjones   
2022-11-27 18:00   
This issue was originally reported by Toolybird here:
https://github.com/libguestfs/libguestfs/issues/100#issuecomment-1328182986
(0003864)
rwmjones   
2022-11-27 18:03   
Bad system call is possibly pipe2 ...?

pipe2(0x7fff7c648048, O_CLOEXEC) = 293
+++ killed by SIGSYS +++
Bad system call (core dumped)
(0003865)
rwmjones   
2022-11-28 10:19   
I've read about the -S option which the manual suggests combining with -z.

Nevertheless it is possible to get it to work by enabling the following syscalls (quite a broad set):

clone3
execve
pipe2
prlimit64
wait4

I wonder if they could only be enabled when we know we're going to need to run a subcommand.
(0003868)
christos   
2022-12-02 17:38   
If you enable clone and execve, you might as well turn off the sandbox. 5.43 gives you a better error message than "Bad system call". Fix the distribution provider to include the proper libraries instead.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
407 [file] General minor always 2022-11-28 15:06 2022-12-02 17:26
Reporter: manfredsc Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: mime-type and extension not emitted for lzma
Description: # file test.txt.lzma
test.txt.lzma: LZMA compressed data, non-streamed, size 5

# file --mime-type test.txt.lzma
test.txt.lzma: application/octet-stream

# file --extension test.txt.lzma
test.txt.lzma: ???


The magic entry is as follows [magic/Magdir/compress]:
# Type: LZMA
0 lelong&0xffffff =0x5d
>12 leshort 0xff LZMA compressed data,
!:mime application/x-lzma
!:ext lzma
>>5 lequad =0xffffffffffffffff streamed
>>5 lequad !0xffffffffffffffff non-streamed, size %lld
>12 leshort 0 LZMA compressed data,
>>5 lequad =0xffffffffffffffff streamed
>>5 lequad !0xffffffffffffffff non-streamed, size %lld


If the mime entry is moved to the end of the block, the mime description is
given just fine. So I suggest moving the "!:mime" and "!:ext" lines to the end.

# Type: LZMA
0 lelong&0xffffff =0x5d
>12 leshort 0xff LZMA compressed data,
>>5 lequad =0xffffffffffffffff streamed
>>5 lequad !0xffffffffffffffff non-streamed, size %lld
>12 leshort 0 LZMA compressed data,
>>5 lequad =0xffffffffffffffff streamed
>>5 lequad !0xffffffffffffffff non-streamed, size %lld
!:mime application/x-lzma
!:ext lzma


With this, I get
# file --mime-type test.txt.lzma
test.txt.lzma: application/x-lzma
# file --extension test.txt.lzma
test.txt.lzma: lzma
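
For readers less familiar with the magic syntax, the LZMA entry above reads three fields: a little-endian 32-bit value at offset 0 masked with 0xffffff, a little-endian 16-bit value at offset 12, and a little-endian 64-bit size at offset 5. A minimal Python sketch of those tests follows; the function name `describe_lzma` is hypothetical, and it covers only the description text, not the `!:mime`/`!:ext` handling this report is about.

```python
import struct


def describe_lzma(header: bytes):
    """Apply the tests from the LZMA magic entry to the first 14 bytes."""
    if len(header) < 14:
        return None
    # "0 lelong&0xffffff =0x5d": low three bytes of the first LE 32-bit word.
    (first,) = struct.unpack_from("<I", header, 0)
    if first & 0xFFFFFF != 0x5D:
        return None
    # ">12 leshort 0xff" or ">12 leshort 0": LE 16-bit value at offset 12.
    (marker,) = struct.unpack_from("<H", header, 12)
    if marker not in (0xFF, 0):
        return None
    # ">>5 lequad": the 64-bit size field; all ones means "streamed".
    (size,) = struct.unpack_from("<Q", header, 5)
    if size == 0xFFFFFFFFFFFFFFFF:
        return "LZMA compressed data, streamed"
    return f"LZMA compressed data, non-streamed, size {size}"
```

Both `>12` branches produce the same description prefix, which is why the mime and ext annotations need to apply to either branch.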
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: test.txt.lzma (23 bytes) 2022-11-28 15:06
https://bugs.astron.com/file_download.php?file_id=312&type=bug
Notes
(0003866)
manfredsc   
2022-11-28 15:23   
Another possibility would be to duplicate the mime and ext entries. I'm not
familiar enough with the magic format to decide which is better:

# Type: LZMA
0 lelong&0xffffff =0x5d
>12 leshort 0xff LZMA compressed data,
!:mime application/x-lzma
!:ext lzma
>>5 lequad =0xffffffffffffffff streamed
>>5 lequad !0xffffffffffffffff non-streamed, size %lld
>12 leshort 0 LZMA compressed data,
!:mime application/x-lzma
!:ext lzma
>>5 lequad =0xffffffffffffffff streamed
>>5 lequad !0xffffffffffffffff non-streamed, size %lld
(0003867)
christos   
2022-12-02 17:26   
Fixed, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
400 [file] General feature N/A 2022-10-31 15:56 2022-11-06 18:47
Reporter: lindblad Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Haskell & Julia
Description: Add detection of Haskell and Julia scripts.

Haskell

$ cat Hi.hs
#!/usr/bin/env runghc
main = putStrLn "hi!"
$ cat Hello.hs
#!/usr/bin/env runhaskell
main = putStrLn "hello!"
$

desired output simulated below

$ file Hi.hs
Hi.hs: GHC script, ASCII text executable
$ file Hello.hs
Hello.hs: Haskell script, ASCII text executable
$

Julia

$ cat Hello.jl
#!/usr/bin/env julia
println("hello world")
$
 
desired output simulated below
 
$ file Hello.jl
Hello.jl: Julia script, ASCII text executable
$
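
The desired outputs above depend only on the interpreter named in the first line of the script. A minimal Python sketch of that mapping follows; the names `INTERPRETERS` and `script_type` are hypothetical, and it is not the magic that was eventually added.

```python
# Interpreter name -> description, taken from the desired outputs above.
INTERPRETERS = {
    "runghc": "GHC script",
    "runhaskell": "Haskell script",
    "julia": "Julia script",
}


def script_type(first_line: str):
    """Return the description for a '#!' line, or None if unrecognized."""
    if not first_line.startswith("#!"):
        return None
    words = first_line[2:].split()
    if not words:
        return None
    # "#!/usr/bin/env julia": skip env and use the next word instead.
    if words[0].rsplit("/", 1)[-1] == "env" and len(words) > 1:
        words = words[1:]
    # Strip any directory part, e.g. "/usr/bin/julia" -> "julia".
    name = words[0].rsplit("/", 1)[-1]
    return INTERPRETERS.get(name)
```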
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003860)
christos   
2022-11-06 18:47   
Fixed, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
403 [file] General minor always 2022-11-06 10:04 2022-11-06 18:32
Reporter: pinymantis Platform: Thinkpad  
Assigned To: christos OS: Linux  
Priority: normal OS Version: OpenSuse Leap 15  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Any text file containing a line starting with "PROC" identified as Algol 68 source text
Description: Note: This is related to https://bugs.astron.com/view.php?id=49

Any text file containing a line starting with "PROC" is identified as Algol 68 source text.

This is at least irritating.
Tags:
Steps To Reproduce: Run
```
#! /bin/bash

echo "PROC" > t1
echo " PROC" > t2
echo "PROC " > t3
echo "PROCESSOR" > t4
echo "PROCX" > t5

cat << EOF > t6
MACHINE INFORMATION
   
Machine Manufacturer: LENOVO
Machine Type-Model(MTM): XXXX
Product Version: ThinkPad XXXX
Serial Number: XXXX
Eth Physical Address: XX-XX
   
BIOS INFORMATION
   
BIOS Version: N3EET22W (1.08 )
BIOS Release Date: 07/21/2022
BIOS Manufacturer: LENOVO
EC Version: N3EHT18W(1.08)
Intel ME Version: 16.1.25.1932
   
PROCESSOR INFORMATION

XXXX
XXXX
...
EOF

for f in t1 t2 t3 t4 t5 t6; do
    file $f
done
```

and the result will be

t1: Algol 68 source, ASCII text
t2: ASCII text
t3: Algol 68 source, ASCII text
t4: Algol 68 source, ASCII text
t5: Algol 68 source, ASCII text
t6: Algol 68 source, ASCII text
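
These results are consistent with a rule that matches "PROC" at the start of any line without requiring a word boundary after it. The Python sketch below reproduces that pattern and compares it with a word-boundary variant. Both regular expressions are assumptions for illustration: neither is the actual magic rule, nor the change that was later committed.

```python
import re

# Assumed model of the observed behaviour: "PROC" at the start of any line.
LOOSE = re.compile(r"^PROC", re.MULTILINE)
# Hypothetical stricter variant: "PROC" must be a whole word.
STRICT = re.compile(r"^PROC\b", re.MULTILINE)

# Contents written by the echo commands above (echo appends a newline).
SAMPLES = {
    "t1": "PROC\n",
    "t2": " PROC\n",
    "t3": "PROC \n",
    "t4": "PROCESSOR\n",
    "t5": "PROCX\n",
}

for name, text in SAMPLES.items():
    loose = bool(LOOSE.search(text))
    strict = bool(STRICT.search(text))
    print(f"{name}: loose={loose} strict={strict}")
```

With the word-boundary variant, "PROCESSOR" and "PROCX" no longer match, while a bare "PROC" still does.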
Additional Information: test code `ptest` attached

Motivation: Information from the UEFI-BIOS is UTF-16 encoded and needs to be converted to UTF-8 to be useful on (std) Linux.
Attached Files: ptest (558 bytes) 2022-11-06 10:04
https://bugs.astron.com/file_download.php?file_id=310&type=bug
Notes
(0003859)
christos   
2022-11-06 18:32   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
402 [file] General feature always 2022-11-03 01:51 2022-11-04 13:36
Reporter: Alan Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Add support for Playdate native files
Description: The Playdate portable video game console (https://play.date/) has several native file formats: pdi, pdt, pdv, pda, pdz, and pds. Currently file doesn't recognize any of them. The attached magic file adds support for them.

Test data for many of these can be found in the free SDK https://play.date/dev/, and it can be used to create more.
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: playdate-magic (1,538 bytes) 2022-11-03 01:51
https://bugs.astron.com/file_download.php?file_id=309&type=bug
Notes
(0003857)
christos   
2022-11-04 13:36   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
399 [file] General minor N/A 2022-10-30 12:25 2022-11-04 13:35
Reporter: lindblad Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: reopened  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: file typos
Description: I recently cloned the GitHub mirror.

The code appears to contain some typographical errors.

The attached shell script is authorised for modification and/or use by the file utility developers.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: typos.sh (1,831 bytes) 2022-10-30 12:25
https://bugs.astron.com/file_download.php?file_id=308&type=bug
Notes
(0003855)
christos   
2022-10-31 14:24   
Committed, thanks. Portability note: sed -i <pattern> is not portable; you should use sed -ie <pattern>. For example, sed -i <pattern> fails on macOS and BSD.
(0003856)
lindblad   
2022-10-31 15:26   
I was using GNU sed, so thanks for the tip regarding sed -ie <pattern>.

sed -i 's/alpabetical/alphabetic/g' file/magic/Magdir/msdos

factually the above line should have read

sed -i 's/alpabetical/alphabetical/g' file/magic/Magdir/msdos

which would have resulted in the following

# check for "long" alphabetical Lotus driver name like:


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
397 [file] General major always 2022-10-20 09:55 2022-10-27 19:20
Reporter: dadv Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: reopened  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Regression in mode -f- (stdin)
Description: Up to file-5.42, an application could popen("file -bnf-"), feed it filenames, and get results immediately.
For example:

(echo /lib/libc.so.7; sleep 3600) | file -bnf -

This command prints the result immediately with file-5.42.
file-5.43 does not print the result immediately.

It was broken by the following commit:
https://github.com/file/file/commit/19bf47777d0002ee884467e45e6ace702e40a4c1

Rolling back that commit fixes the regression.
Tags: regression
Steps To Reproduce: See above.
Additional Information: The problem is serious because it breaks applications that continuously feed "file -bnf-" with filenames and expect each result at once, before EOF.
Attached Files:
Notes
(0003845)
christos   
2022-10-23 14:23   
Added -I flag to address this.
(0003846)
dadv   
2022-10-23 20:18   
Thank you very much for the prompt response. Sadly, the new flag does not fix the breakage of existing applications that rely on the previous behaviour, which existed for a long time; such applications would need to be modified, and that is not always possible.

Please consider adding a new flag for the new behaviour instead, so that "file -bnf-" would behave as before.
(0003848)
christos   
2022-10-24 20:19   
I am guessing that there are very few applications that depend on this behavior (which was not guaranteed anyway). I think that it is better to have consistent output between:

      cat foo | file -f -
and
      file -f foo

as opposed to having them be different by default. If you want immediate results and inconsistent output, you should be explicit about it. Now that I think about it, there is nothing special about -f - vs -f /dev/stdin, so I had better make that consistent as well.
(0003849)
dadv   
2022-10-25 02:07   
Such applications do exist, nevertheless. Now they cannot just run "file" with the same set of flags on any operating system, because different systems will have different "file" versions with different default behaviour for a considerable time.

# echo my.cnf | file -bnIf -
file: invalid option -- I
ASCII text
Usage: file [-bcCdEhikLlNnprsSvzZ0] [--apple] [--extension] [--mime-encoding]
            [--mime-type] [-e <testname>] [-F <separator>] [-f <namefile>]
            [-m <magicfiles>] [-P <parameter=value>] [--exclude-quiet]
            <file> ...
       file -C [-m <magicfiles>]
       file [--help]
(0003850)
christos   
2022-10-25 22:25   
I am not very sympathetic to such apps. They could easily be converted to use libmagic. Running commands needlessly causes security issues (for example, filenames with \n in them will not be processed correctly).
(0003851)
dadv   
2022-10-26 08:21   
From the manual page:

     -n, --no-buffer
             Force stdout to be flushed after checking each file. This is
             only useful if checking a list of files. It is intended to be
             used by programs that want filetype output from a pipe.

For me, it means that behaviour was "guaranteed".
(0003852)
christos   
2022-10-26 16:57   
That is a good point! I just used -n to restore the previous behavior and removed -I.
(0003854)
dadv   
2022-10-27 01:59   
Thank you very much for this change. It really fixes compatibility. This PR can be closed now.
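
For reference, the usage pattern discussed in this issue can be sketched in Python as follows. This is only an illustration, not code from any affected application: `classify_stream` is a hypothetical helper, and it assumes the child prints exactly one line per input name and flushes after each one, which is what the manual text quoted above describes for -n. As noted earlier in the thread, names containing a newline would not be handled correctly by this approach.

```python
import subprocess


def classify_stream(command, names):
    """Yield (name, reply) pairs, reading each reply before sending the next name.

    Assumes the child answers every input line with exactly one flushed
    output line; otherwise readline() would block.
    """
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered pipes on our side
    )
    try:
        for name in names:
            proc.stdin.write(name + "\n")
            proc.stdin.flush()  # hand the name to the child right away
            yield name, proc.stdout.readline().rstrip("\n")
    finally:
        proc.stdin.close()  # EOF lets the child exit
        proc.wait()
```

With the command from the report this would be called as `classify_stream(["file", "-bnf", "-"], ["/lib/libc.so.7"])`; whether each reply arrives before EOF is exactly the behaviour this issue is about.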


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
398 [file] General major always 2022-10-25 22:36 2022-10-26 18:09
Reporter: Fabrice Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Build failure without wide support (e.g. on uclibc)
Description: The build fails without wide support (e.g. on uclibc) since version 5.43 and
https://github.com/file/file/commit/c80065fe6900be5e794941e29b32440e9969b1c3:

file.c: In function 'fname_print':
file.c:605:10: error: macro "putc" requires 2 arguments, but only 1 given
  605 | putc(c);
      | ^

Full log:
 - http://autobuild.buildroot.org/results/7ff1dd9f79408d2e6286c005302b6f3c505ab259
Tags: build, patch
Steps To Reproduce:
Additional Information: The attached patch will fix the build failure.
Attached Files: 0001-src-file.c-fix-build-without-wide-support.patch (1,113 bytes) 2022-10-25 22:36
https://bugs.astron.com/file_download.php?file_id=307&type=bug
Notes
(0003853)
christos   
2022-10-26 18:09   
Thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
387 [file] General minor always 2022-10-01 19:52 2022-10-24 19:50
Reporter: rrt Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: !:strength sometimes causes magic to disappear
Description: I was investigating a problem with Lua magic patterns. With the default Lua magic in CVS, I get these results:

```
$ MAGIC=./magic/magic src/file --list|grep "Lua script text executable"
Strength = 51@11: Lua script text executable [text/x-lua]
Strength = 49@15: Lua script text executable [text/x-lua]
Strength = 48@13: Lua script text executable [text/x-lua]
Strength = 45@9: Lua script text executable [text/x-lua]
```

This is correct: there are four patterns in the lua magic file.

If I then add identical !:strength annotations to each entry, specifically "!:strength * 2", then one of the entries disappears (the one that should now have strength 96):

```
$ MAGIC=./magic/magic src/file --list|grep "Lua script text executable"
Strength = 102@12: Lua script text executable [text/x-lua]
Strength = 98@18: Lua script text executable [text/x-lua]
Strength = 90@9: Lua script text executable [text/x-lua]
```

If I change the multiplier on the missing entry from 2 to 3, it returns:

```
$ MAGIC=./magic/magic src/file --list|grep "Lua script text executable"
Strength = 144@15: Lua script text executable [text/x-lua]
Strength = 102@12: Lua script text executable [text/x-lua]
Strength = 98@18: Lua script text executable [text/x-lua]
Strength = 90@9: Lua script text executable [text/x-lua]
```
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: lua.mgc (6,016 bytes) 2022-10-20 19:19
https://bugs.astron.com/file_download.php?file_id=306&type=bug
Notes
(0003825)
christos   
2022-10-09 17:23   
Can you please upload the two different minimal magic files you are using to reproduce this?
(0003843)
rrt   
2022-10-20 19:19   
I tried to make a minimal magic file, and in the process I obtained a similar strange result. When I compile just the CVS Lua magic file and then list its contents, I only get two text patterns, although the file contains four patterns. However, when I built the default "everything" magic file, it gives the expected four results.

Here is the unexpected result:

```
$ ./src/file -C -m ./magic/Magdir/lua
$ MAGIC=./lua.mgc src/file --list
Set 0:
Binary patterns:
Strength = 70@19: Lua bytecode, []
Text patterns:
Strength = 51@11: Lua script text executable [text/x-lua]
Strength = 48@13: Lua script text executable [text/x-lua]
Set 1:
Binary patterns:
Text patterns:
```

Note that there are only two text pattern results in Set 0. (I am not sure what Set 1 is?)

If I run against the complete magic file, I get more patterns that can only come from the `lua` file:

```
$ MAGIC=./magic/magic.mgc src/file --list|grep text/x-lua
Strength = 51@11: Lua script text executable [text/x-lua]
Strength = 49@15: Lua script text executable [text/x-lua]
Strength = 48@105: LuaTex script text executable [text/x-luatex]
Strength = 48@13: Lua script text executable [text/x-lua]
Strength = 45@9: Lua script text executable [text/x-lua]
```

In this set of results, only the LuaTex [sic] result is not from the `lua` file.

I have attached the `lua.mgc` file produced by my system.
(0003844)
christos   
2022-10-23 13:23   
Yes, the listing was skipping entries. I fixed it now. (apprentice.c 1.338).
(0003847)
rrt   
2022-10-24 10:03   
Many thanks for the prompt fix!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
396 [file] General minor have not tried 2022-10-18 08:13 2022-10-18 14:12
Reporter: mwfearnley Platform: Linux  
Assigned To: christos OS: Ubuntu (WSL)  
Priority: low OS Version: 20.04.4 LTS  
Status: assigned Product Version: 5.38  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: JS source file misdetected as JPEG2000
Description: The source code at https://jplayer.org/latest/dist/jplayer/jquery.jplayer.min.js is misdetected as "JPEG 2000 image".
The first six bytes ("/*! jP") are enough to trigger this identification.
Tags:
Steps To Reproduce: $ wget https://jplayer.org/latest/dist/jplayer/jquery.jplayer.min.js
$ file jquery.jplayer.min.js
jquery.jplayer.min.js: JPEG 2000 image

$ printf "/*! jP" > jp.txt
$ file jp.txt
jp.txt: JPEG 2000 image
Additional Information:
Attached Files: jp.txt (6 bytes) 2022-10-18 08:13
https://bugs.astron.com/file_download.php?file_id=305&type=bug
jquery.jplayer.min.js (60,950 bytes) 2022-10-18 08:13
https://bugs.astron.com/file_download.php?file_id=304&type=bug
Notes
(0003840)
christos   
2022-10-18 13:04   
Can't reproduce in file 5.43.
(0003841)
mwfearnley   
2022-10-18 13:47   
OK, thanks.
I've managed to build the code from GitHub, and on a hunch I found the commit (and, I suspect, the removed line) that fixed it:

https://github.com/file/file/commit/b62c39d1e13a7547e634ce36272f94f3fd39cbc4#diff-b5e20963b12b6811f0ca24ddd36790e9f3016bd5bb7af6083ca585c50ed34d4fL33

It's beyond my understanding (of JPEG 2000, of QuickTime formats, or of file itself) to say much more, though.

Does it warrant a regression test?
(0003842)
christos   
2022-10-18 14:12   
Sure, I just added one.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
389 [file] General minor have not tried 2022-10-01 20:11 2022-10-18 13:01
Reporter: rrt Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: /usr/bin/env patterns should be stronger than more general patterns
Description: In "varied.script" the /usr/bin/env patterns are annotated with "!:strength / 10", while the more general patterns that do not match /usr/bin/env get "!:strength / 2". This means that the /usr/bin/env-specific patterns never match first, which seems wrong.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003826)
christos   
2022-10-09 17:26   
Can you show an example of incorrect behavior?
(0003838)
rrt   
2022-10-17 18:54   
With current file CVS, with the file `foo.script` containing:

```
#!/usr/bin/env foo
```

I get the following output from file:

```
$ file foo.script
foo.script: a /usr/bin/env foo script. ASCII text executable
```

If I run `file -k foo.script`, the second diagnosis is the one I would expect:

```
$ file -k foo.script
foo.script: a /usr/bin/env foo script text executable\012- a foo script, ASCII text executable
```
(0003839)
christos   
2022-10-18 13:01   
Fixed, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
388 [file] General minor have not tried 2022-10-01 20:09 2022-10-17 18:39
Reporter: rrt Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Generic match for #! in "commands" file outweighs some language-specific matches
Description: (Tested in current CVS)

The pattern at around line 100 in "commands" says:

```
0 string/wt #!\ a
>&-1 string/T x %s script text executable
```

This gets strength 60. By contrast, the Lua-specific patterns in "lua" score between 40 and 50.

Also, it's not obvious that this pattern should exist at all, as there are similar patterns in the "varied.script" file.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003827)
christos   
2022-10-09 17:53   
I made all of them 30 and reorganized.
(0003837)
rrt   
2022-10-17 18:39   
Thanks, some nice simplification here!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
395 [file] General minor always 2022-10-14 12:34 2022-10-16 13:00
Reporter: saltedcoffii Platform: x86_64  
Assigned To: christos OS: Arch Linux  
Priority: normal OS Version: rolling  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: no magic for .tar.bz3 and .bz3 files
Description: .tar.bz3 and .bz3 files created by the bzip3 utility are reported as “data” and “application/octet-stream”. They should probably show up similarly to .tar.bz2 and .bz2 files, e.g., “bzip3 compressed data, block size = <>” and “application/x-bzip3”.
Tags: compression, magic
Steps To Reproduce: 1. Compress a file using the bzip3 utility (or use the random data file provided)
2a. Read the file using `file <>.bz3`
2b. and `file --mime-type <>.bz3`
Additional Information: External links:
https://github.com/kspalaiologos/bzip3
https://aur.archlinux.org/packages/bzip3
https://pkgs.alpinelinux.org/package/edge/testing/x86_64/bzip3
Attached Files: foo.bz3 (756,960 bytes) 2022-10-14 12:34
https://bugs.astron.com/file_download.php?file_id=302&type=bug
16.10.22-file-issue-0000395-files-to-help-devs.tar.gz (211,312 bytes) 2022-10-16 12:21
https://bugs.astron.com/file_download.php?file_id=303&type=bug
Notes
(0003835)
saltedcoffii   
2022-10-16 12:21   
I realized that to find magic bytes, multiple files are required to compare and not all developers may have access to a compiled bzip3 yet. Here is a statically linked bzip3 v1.1.6 (latest version at the time of writing) for x86_64, and some files that have been randomly generated with `base64 /dev/random | head -c 10000` and compressed with `bzip3` and `tar`/`bzip3`.

Also, how do developers find magic bytes? I work with file a lot, so I'm sure I could help if I knew how to find magic bytes for unsupported files. Let me know!
(0003836)
christos   
2022-10-16 13:00   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
394 [file] General minor always 2022-10-10 07:32 2022-10-10 18:46
Reporter: ulm Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Makefile misidentified as "PRO-PACK archive data", while file -i is correct
Description: file-5.43 incorrectly reports "PRO-PACK archive data" for attached Makefile.
"file -i" reports it correctly as text/x-makefile.
Tags:
Steps To Reproduce: $ file Makefile
Makefile: PRO-PACK archive data
$ file -i Makefile
Makefile: text/x-makefile; charset=us-ascii
Additional Information:
Attached Files: Makefile (479 bytes) 2022-10-10 07:32
https://bugs.astron.com/file_download.php?file_id=301&type=bug
Notes
(0003833)
ulm   
2022-10-10 07:58   
Here's some information about the Pro-Pack (aka RNC) header format:
https://www.segaretro.org/Rob_Northen_compression

Looks like requiring an additional byte 0x01 or 0x02 after "RNC" would mitigate the problem.
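
The suggested mitigation can be sketched as follows. This is a Python illustration of the check only, not the magic entry that was committed; `looks_like_rnc` is a hypothetical name, and the Makefile-like sample is made up for the example (the attached Makefile is not reproduced here).

```python
def looks_like_rnc(data: bytes) -> bool:
    """Accept "RNC" only when the following byte is 0x01 or 0x02."""
    return len(data) >= 4 and data[:3] == b"RNC" and data[3] in (0x01, 0x02)


# Hypothetical inputs: a header with method byte 0x01, and text that merely
# starts with the letters "RNC".
print(looks_like_rnc(b"RNC\x01\x00\x00\x10\x00"))  # True
print(looks_like_rnc(b"RNCFLAGS = -O2\n"))         # False
```

Requiring the fourth byte rejects plain text because printable characters never have the value 0x01 or 0x02.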
(0003834)
christos   
2022-10-10 18:46   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
304 [file] General tweak always 2021-12-06 06:32 2022-10-09 19:04
Reporter: zachs18 Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Netpbm format does not correctly parse size for some images
Description: The Netpbm image formats allow any amount of any whitespace between the width, height, and (if given) maxval fields in the header (it is only after the last field that exactly one whitespace character is allowed).
The format specification in magic/Magdir/images, however, assumes that there is exactly one space character (\ ) between width and height. This leads file to not correctly parse the width and height in files where the width and height are separated by other whitespace, e.g. a newline, two spaces, or a tab.
Tags: magic
Steps To Reproduce: Place the below into a file (not including quotes):
"P2
2
2
255
0 1 2 3"
`file` will report the file as `Netpbm image data, size = 0 x 1, greymap, ASCII text`, instead of `Netpbm image data, size = 2 x 2, greymap, ASCII text`
Additional Information: I believe this can be fixed by changing the regular expression on line 182 of magic/Magdir/images from "[0-9]{1,50}\ [0-9]{1,50}" to "[0-9]{1,50}[\040\t\f\r\n]+[0-9]{1,50}"

Attached files are reported as:
test1.pgm: `Netpbm image data, size = 0 x 0, greymap, ASCII text`
test2.pgm: `, rawbits, greymap`
but should be reported as:
test1.pgm: `Netpbm image data, size = 2 x 2, greymap, ASCII text`
test2.pgm: `Netpbm image data, size = 2 x 2, rawbits, greymap`
Attached Files: test1.pgm (19 bytes) 2021-12-06 06:32
https://bugs.astron.com/file_download.php?file_id=251&type=bug
test2.pgm (15 bytes) 2021-12-06 06:32
https://bugs.astron.com/file_download.php?file_id=250&type=bug
Notes
(0003684)
christos   
2021-12-17 14:43   
Fixed as suggested, thanks!
(0003822)
christos   
2022-10-02 22:42   
Fix Breaks:

P3
# CREATOR: GIMP PNM Filter Version 1.1
10 20
255
255
(...)
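
The effect of the two regular expressions can be compared with a small Python sketch. It is an approximation under stated assumptions: Python's `re` stands in for the regex engine used by file, capture groups are added to pull out the numbers, `\040` is written as a literal space, the search simply starts after the two-byte magic number, and `first_size` is a hypothetical helper. It is not how magic/Magdir/images is actually evaluated.

```python
import re

# Pattern quoted in the report: exactly one space between width and height.
OLD = re.compile(r"([0-9]{1,50}) ([0-9]{1,50})")
# Proposed pattern: one or more whitespace characters between the two fields.
NEW = re.compile(r"([0-9]{1,50})[ \t\f\r\n]+([0-9]{1,50})")


def first_size(header, pattern):
    """Return (width, height) from the first match after the 2-byte magic."""
    match = pattern.search(header[2:])
    return (int(match.group(1)), int(match.group(2))) if match else None


newline_header = "P2\n2\n2\n255\n0 1 2 3"
gimp_header = "P3\n# CREATOR: GIMP PNM Filter Version 1.1\n10 20\n255\n255\n"

print(first_size(newline_header, OLD))  # (0, 1)
print(first_size(newline_header, NEW))  # (2, 2)
print(first_size(gimp_header, OLD))     # (10, 20)
print(first_size(gimp_header, NEW))     # (1, 10)
```

In this sketch the old pattern skips the newline-separated fields and lands on the first two pixel values, which matches the `size = 0 x 1` reported above. The relaxed pattern fixes that case but picks up the trailing digit of "Version 1.1" in the comment line, which is one plausible reading of the breakage shown in the last note.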


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
337 [file] General major always 2022-04-05 19:06 2022-10-09 18:55
Reporter: jmp3r Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Data files identified as text, text as data
Description: New bug, based on previous tickets
0000319, 0000334

I used the latest sources (HEAD) from GitHub.

I understand that correctly identifying text files can be complicated, but I hope it is possible.

More test files are attached.

Folders in archive:
data - binaries (files encrypted with ransomware) detected as `Unicode text, UTF-16, big-endian text, with no line terminators`
text - correct text files detected as data
CORRECT_DETECTED_AS_DATA - one file similar to the others (in data folder) but identified correctly.

If needed, I can provide more files.
Tags: bug
Steps To Reproduce: Scan the attached files with `file` built from the latest sources (HEAD).
Additional Information:
Attached Files: bug_new.zip (166,197 bytes) 2022-04-05 19:06
https://bugs.astron.com/file_download.php?file_id=272&type=bug
Notes
(0003748)
christos   
2022-05-22 20:01   
It all has to do with the ucs16 detection in looks_ucs16 in encoding.c. If you change https://github.com/file/file/blob/master/src/encoding.c#L480 to 'return 0;', i.e. ignoring all ucs16 files that don't have a BOM, then the misdetected text files will get fixed (but then we'll miss ucs16 files without a BOM).
The Estonian and Hebrew files have invalid low surrogate pair characters (dc00 and de05 respectively). If you comment out https://github.com/file/file/blob/master/src/encoding.c#L516, they succeed.

The English.tr file has two 0x13 (^S) characters, which is why it fails. The rest have some 0x7f (DEL) characters, which is why they fail. If you comment out https://github.com/file/file/blob/master/src/encoding.c#L511, they all succeed.
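
The kinds of rejection described in this note can be illustrated with a short Python sketch. It is not the looks_ucs16 code from encoding.c: `utf16_problems` and `ALLOWED_CONTROLS` are hypothetical names, and the set of control characters treated as acceptable text is an assumption made for the example.

```python
import struct

# Assumption: controls accepted as text here are tab, LF, FF and CR only.
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0C, 0x0D}


def utf16_problems(data, big_endian=False):
    """List (index, reason) for code units that make data implausible as UTF-16 text."""
    fmt = ">H" if big_endian else "<H"
    usable = data[: len(data) - len(data) % 2]  # ignore a trailing odd byte
    units = [unit for (unit,) in struct.iter_unpack(fmt, usable)]
    problems = []
    pending_high = False  # previous unit was a high surrogate awaiting its pair
    for index, unit in enumerate(units):
        is_low = 0xDC00 <= unit <= 0xDFFF
        if pending_high:
            pending_high = False
            if is_low:
                continue  # valid surrogate pair
            problems.append((index - 1, "high surrogate without low surrogate"))
        if 0xD800 <= unit <= 0xDBFF:
            pending_high = True
        elif is_low:
            problems.append((index, "unpaired low surrogate"))
        elif unit == 0x7F or (unit < 0x20 and unit not in ALLOWED_CONTROLS):
            problems.append((index, "control character"))
    if pending_high:
        problems.append((len(units) - 1, "high surrogate without low surrogate"))
    return problems
```

Under these assumptions a lone dc00 unit, a 0x13 and a 0x7f are each reported, while ordinary text and well-formed surrogate pairs produce an empty list.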
(0003750)
jmp3r   
2022-05-27 22:56   
Thank you for the explanation. I'm now trying to apply a third-party library (chardet / charset-normalizer) to detect text files.

But what about the files in the `data` folder? They are encrypted, yet still detected as Unicode text?
(0003751)
christos   
2022-05-28 00:27   
That is covered by the first sentence (about UCS16 files without BOM). If you comment out line 480, they will all return data.
(0003752)
jmp3r   
2022-05-28 01:02   
Oh, sorry, I didn't check correctly. Now I get `data` as I want and can handle those files in post-processing using `charset-normalizer` (pypi.org/project/charset-normalizer).

Could I ask for a future option (perhaps post-processing for possible text files, or a replacement for the current text detection) that provides results as good as charset-normalizer's?

For all encrypted files (the `data` folder) we get the correct `undefined` output:

normalizer -m *
Unable to identify originating encoding for "encry-https___download.eclipse.org_eclipse_updates_4.22_R-4.22-202111241800_p2.index". Maybe try increasing maximum amount of chaos.
Unable to identify originating encoding for "encry-https___download.eclipse.org_technology_epp_packages_2021-12_202112021200_p2.index". Maybe try increasing maximum amount of chaos.
Unable to identify originating encoding for "encry-https___download.eclipse.org_tools_cdt_releases_10.5_p2.index". Maybe try increasing maximum amount of chaos.
Unable to identify originating encoding for "encry-led-20-headers.h". Maybe try increasing maximum amount of chaos.
Unable to identify originating encoding for "encry-pom.properties". Maybe try increasing maximum amount of chaos.
undefined
undefined
undefined
undefined
undefined

and also for files in `text` folder:
normalizer -m *
utf_16
utf_16_le
utf_16
utf_16_le
utf_16
utf_16
utf_16

As you can see, the output for all files is 'closest to perfect'.
(0003832)
christos   
2022-10-09 18:55   
Seems to be fine now.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
372 [file] General minor always 2022-07-30 14:24 2022-10-09 18:54
Reporter: LevilJiang Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: allocation-size-too-big for file with ASAN
Description: Hi dev

I tested file at the latest commit (3d8a991) with AddressSanitizer. Unfortunately, it crashed with the following error information. Any help would be greatly appreciated :D

```
=================================================================
==123037==ERROR: AddressSanitizer: requested allocation size 0xffffffffffffff06 (0x708 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x10000000000 (thread T0)
    #0 0x55a7d202447e in __interceptor_malloc (/workspace/file/src/file+0xbf47e) (BuildId: 39c0b201f6cf154ce3a6ce6f762fe5e98224e3f3)
    #1 0x55a7d20bc640 in doshn readelf.c
    #2 0x55a7d20b865b in file_tryelf (/workspace/file/src/file+0x15365b) (BuildId: 39c0b201f6cf154ce3a6ce6f762fe5e98224e3f3)
    #3 0x55a7d208fdef in file_buffer (/workspace/file/src/file+0x12adef) (BuildId: 39c0b201f6cf154ce3a6ce6f762fe5e98224e3f3)
    #4 0x55a7d2064462 in file_or_fd magic.c
    #5 0x55a7d2064636 in magic_file (/workspace/file/src/file+0xff636) (BuildId: 39c0b201f6cf154ce3a6ce6f762fe5e98224e3f3)
    #6 0x55a7d2062232 in process file.c
    #7 0x55a7d205fed0 in main (/workspace/file/src/file+0xfaed0) (BuildId: 39c0b201f6cf154ce3a6ce6f762fe5e98224e3f3)
    #8 0x7f74af057d8f (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 89c3cb85f9e55046776471fed05ec441581d1969)

==123037==HINT: if you don't care about these errors you may set allocator_may_return_null=1
SUMMARY: AddressSanitizer: allocation-size-too-big (/workspace/file/src/file+0xbf47e) (BuildId: 39c0b201f6cf154ce3a6ce6f762fe5e98224e3f3) in __interceptor_malloc
==123037==ABORTING
```

Thanks & Best regards!
Tags:
Steps To Reproduce: 1. Clone the GitHub repository and build file with AddressSanitizer.

2. Run the file binary with the attached crash input.
Additional Information:
Attached Files: crash_input (516 bytes) 2022-07-30 14:24
https://bugs.astron.com/file_download.php?file_id=293&type=bug
Notes
(0003792)
christos   
2022-07-30 18:12   
What are you trying to do? Are you giving it a big elf file to identify?
(0003793)
LevilJiang   
2022-07-31 05:29   
Actually, I applied fuzz testing to the file program, and it generated the crash input. I'm not sure why the crash input causes allocation-size-too-big.
(0003794)
christos   
2022-07-31 16:01   
Ok, I committed a change to limit it to 128M. Does this work for you?
(0003795)
LevilJiang   
2022-08-01 03:06   
Thanks very much!
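
As a rough illustration of the size involved, the following Python sketch reinterprets the value from the ASAN report and applies a cap like the 128M limit mentioned above. The names `as_signed64`, `checked_size` and `MAX_SIZE` are hypothetical; the sketch does not show which quantity readelf.c actually limits, and reading the value as a small negative number is only one plausible interpretation.

```python
MAX_SIZE = 128 * 1024 * 1024  # the 128M cap mentioned in the note above


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed one."""
    return value - (1 << 64) if value >= (1 << 63) else value


def checked_size(requested, limit=MAX_SIZE):
    """Return requested if it is within the limit, else raise ValueError."""
    if requested > limit:
        raise ValueError(f"size {requested:#x} exceeds limit {limit:#x}")
    return requested


reported = 0xFFFFFFFFFFFFFF06  # allocation size from the ASAN report
print(as_signed64(reported))   # -250
```

Seen as a signed number the request is -250, so a check against an upper bound before allocating rejects it along with any other oversized value taken from an untrusted header.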


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
373 [file] General minor always 2022-08-02 18:58 2022-10-09 18:53
Reporter: vismarli Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.34  
Product Build: Resolution: reopened  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: only match first out of multiple magic files
Description: If we specify multiple magic files with "-m" and the file matches entries in all of them, only the first magic file with a match is evaluated; the rest are not.

For example, with "-m magicA:magicB:magicC" I observed the following:
* if the file matches in magicA, magicB and magicC: only magicA is evaluated
* if the file matches in magicB and magicC: magicA and magicB are evaluated, magicC is not
* if the file matches in magicC only: magicA, magicB and magicC are evaluated

I am not sure whether this is expected behavior; the manual page mentions neither any shortcut in evaluating multiple magic files nor any details of how multiple magic files are evaluated.
Tags:
Steps To Reproduce: $ cat magicA
0 search {\\rt1 RTF1.0
16 search ViVa2 Viva File 2.0

$ cat magicB
6 search ABCD ABCD File
10 search TesT Test File 1.0

$ xxd test-file-AB
0000000: 7b5c 7274 3120 4142 4344 5465 7354 2078 {\rt1 ABCDTesT x
0000010: 7856 6956 6132 xViVa2

$ file -km magicA test-file-AB
test-file-AB: RTF1.0\012- Viva File 2.0, ASCII text, with no line terminators\012- data
$ file -km magicB test-file-AB
test-file-AB: ABCD File\012- Test File 1.0, ASCII text, with no line terminators\012- data

$ file -km magicA:magicB test-file-AB
test-file-AB: RTF1.0\012- Viva File 2.0, ASCII text, with no line terminators\012- data

$ file -km magicB:magicA test-file-AB
test-file-AB: ABCD File\012- Test File 1.0, ASCII text, with no line terminators\012- data

Additional Information: $ file -d -km magicA:magicB test-file-AB
[try zmagic 0]
[try tar 0]
[try cdf 0]
[try elf 0]
[try softmagic 0]
bb=[0x1f37280,22], 0 [b=0x1f37280,22], [o=0, c=0]
mget(type=20, flag=0x40, offset=0, o=0, nbytes=22, il=0, nc=0)
mget/96 @0: \000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000

1: > 0 search,={\rt1,"RTF1.0"]
0 == 0 = 1
bb=[0x1f37280,22], 16 [b=0x1f37280,22], [o=0x10, c=0]
mget(type=20, flag=0x40, offset=16, o=0, nbytes=22, il=0, nc=0)
mget/96 @16: \000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000

2: > 16 search,=ViVa2,"Viva File 2.0"]
0 == 0 = 1
[try ascmagic 1]
test-file-AB: RTF1.0\012- Viva File 2.0, ASCII text, with no line terminators\012- data





$ file -d -km magicB:magicA test-file-AB
[try zmagic 0]
[try tar 0]
[try cdf 0]
[try elf 0]
[try softmagic 0]
bb=[0x794280,22], 6 [b=0x794280,22], [o=0x6, c=0]
mget(type=20, flag=0x40, offset=6, o=0, nbytes=22, il=0, nc=0)
mget/96 @6: \000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000

1: > 6 search,=ABCD,"ABCD File"]
0 == 0 = 1
bb=[0x794280,22], 10 [b=0x794280,22], [o=0xa, c=0]
mget(type=20, flag=0x40, offset=10, o=0, nbytes=22, il=0, nc=0)
mget/96 @10: \000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000

2: > 10 search,=TesT,"Test File 1.0"]
0 == 0 = 1
[try ascmagic 1]
test-file-AB: ABCD File\012- Test File 1.0, ASCII text, with no line terminators\012- data

Attached Files:
Notes
(0003796)
vismarli   
2022-08-03 01:04   
Intuitively, if I specify multiple magic files, e.g. "magicA:magicB", I expect libmagic to evaluate all rules from magicA and magicB combined (their union). But apparently it's not that straightforward.
Please tell me whether this is a bug or expected. Can anyone clarify the behavior of multiple magic files given via the "-m" argument?
(0003797)
vismarli   
2022-08-16 14:40   
Adding more clarifying information:
* arguments to trigger this issue are: keep-going (-k) with multiple magic files specified in (-m) argument
* the issue is always reproducible
* this issue also occurs in latest version 5.42
(0003798)
christos   
2022-08-17 08:45   
Fixed, thanks!
(0003801)
vismarli   
2022-08-17 16:03   
Thanks, christos, for the update. I can see that all rules from multiple files now hit correctly, but the output separator is not correct.

I got something like this when testing with multiple magic files "magicB:magicA":
MagicB-Rule1\012- MagicB-Rule2MagicA-Rule1\012- MagicA-Rule2, ASCII text, with no line terminators

"MagicB-Rule2" is supposed to be separated from "MagicA-Rule1".

I believe the "firstline" flag in match() should not be initialized to 1; it should be based on the "need_separator" and "printed_something" flags.
(0003804)
christos   
2022-08-18 07:58   
Yup, fixed.
(0003807)
vismarli   
2022-08-18 17:27   
Looks great now, thank you christos. You can close this ticket.
(0003831)
christos   
2022-10-09 18:53   
Verified fixed.
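
A minimal sketch of the separator logic discussed in the notes above, written in Python rather than libmagic's C. It assumes each magic file yields a list of matching descriptions. `join_matches`, `SEPARATOR` and the `reset_per_file` switch are hypothetical names for illustration, not libmagic identifiers. With `reset_per_file=True` the "first line" state is reset for every magic file, which reproduces the fused output from note 0003801. Keeping one printed-something state across all files restores the separator.

```python
# The separator as it appears in the logged output ("\012- " is an escaped newline plus "- ").
SEPARATOR = "\\012- "


def join_matches(match_sets, reset_per_file=False):
    """Join match descriptions from several magic files into one output string.

    match_sets: list of lists, one inner list of descriptions per magic file.
    reset_per_file: hypothetical switch that mimics the reported bug by
    forgetting, for each new magic file, that something was already printed.
    """
    parts = []
    printed_something = False
    for matches in match_sets:
        if reset_per_file:
            printed_something = False  # buggy variant: state lost between files
        for description in matches:
            if printed_something:
                parts.append(SEPARATOR)
            parts.append(description)
            printed_something = True
    return "".join(parts)
```

Called with the two rule sets from the report, the default mode separates every description, while `reset_per_file=True` glues "MagicB-Rule2" to "MagicA-Rule1".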


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
377 [file] General minor have not tried 2022-08-23 15:49 2022-10-09 18:52
Reporter: standage Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Support TSV detection
Description: It is very convenient that file and libmagic now support detection of CSV files. Files separated by tabs instead of commas are also very common. In fact, it's the de facto standard that UNIX shell tools use for pipeline compatibility: sort, cut, paste, and so on. In any case, being able to distinguish TSV files (mime type: text/tab-separated-values) from other unstructured ASCII text files would be a very useful feature. I hope it wouldn't be too difficult to extend the existing CSV support to support TSV.
Tags:
Steps To Reproduce: $ file some.csv
some.csv: CSV text
$ file some.tsv
some.tsv: ASCII text
Additional Information:
Attached Files:
Notes
(0003830)
christos   
2022-10-09 18:52   
Yes, it is on the list of things to do :-)
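
To illustrate the request, here is a rough Python sketch of one possible TSV heuristic: every non-empty line must contain the same, non-zero number of tab characters. This is an assumption for illustration only. It is not how file's existing CSV detection is implemented, and `looks_like_tsv` and `min_lines` are hypothetical names.

```python
def looks_like_tsv(text, min_lines=2):
    """Return True if every non-empty line has the same non-zero tab count."""
    lines = [line for line in text.splitlines() if line]
    if len(lines) < min_lines:
        return False
    tab_counts = {line.count("\t") for line in lines}
    # Exactly one distinct count, and that count must be at least one tab.
    return len(tab_counts) == 1 and tab_counts.pop() >= 1
```

A real implementation would also need to cope with quoting, ragged rows and very long inputs, which this sketch ignores.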


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
390 [file] General feature always 2022-10-02 09:19 2022-10-09 18:51
Reporter: cweiske Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Detection for Frontier Silicon firmware downloads
Description: Frontier Silicon (now Frontier Smart) is a company building chips that other companies build radios with. All of their firmware downloads have the same format, and they begin with 0x76110000 - see https://github.com/MatrixEditor/frontier-smart-api/blob/main/docs/firmware-2.0.md#11-header-structure

The following magic line detects them:

0 belong 0x76110000 Frontier Silicon firmware download

Example files can be found at https://github.com/cweiske/frontier-silicon-firmwares (ending with .isu.bin)
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003829)
christos   
2022-10-09 18:51   
Committed, thanks!
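
For readers unfamiliar with magic syntax, the line `0 belong 0x76110000` compares a 32-bit big-endian integer at offset 0 with the given value. The same check can be sketched in Python as follows. `is_frontier_firmware` and `FRONTIER_MAGIC` are hypothetical names used only for this illustration.

```python
import struct

FRONTIER_MAGIC = 0x76110000  # value from the magic line in the report


def is_frontier_firmware(data: bytes) -> bool:
    """Check whether the first four bytes, read big-endian, equal the magic value."""
    if len(data) < 4:
        return False
    (value,) = struct.unpack(">I", data[:4])
    return value == FRONTIER_MAGIC
```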


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
386 [file] General minor always 2022-10-01 06:18 2022-10-09 18:00
Reporter: delphij Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: tests/gpkg-1-zst.result contains an extra \n
Description: (I'm not 100% sure if this is a defect or not; it was discovered by FreeBSD CI, and https://ci.freebsd.org/job/FreeBSD-main-amd64-test/22193/testReport/usr.bin.file/file_test/contrib_file_tests/ is an example)

Basically, for each *.testfile under file's tests/, our test case runs "file --brief" on it and compares the trimmed output against the matching .result file. The code can be found here:

https://cgit.freebsd.org/src/tree/usr.bin/file/tests/file_test.sh

In file 5.42, a test case was added for gpkg-1-zst.testfile, but its .result file contains an extra \n at the end, so the cmp fails because the two are no longer byte-for-byte identical.
Tags:
Steps To Reproduce: $ file --brief tests/gpkg-1-zst.testfile | tr -d '\012' > /tmp/actual
$ cmp tests/gpkg-1-zst.result /tmp/actual
cmp: EOF on /tmp/actual

Expected result: cmp should produce no output.
Additional Information: Proposed fix would be to remove the trailing \n so it matches the others:

truncate -s 85 tests/gpkg-1-zst.result

and possibly run the tests as part of the CI process for file :) However, I am not 100% sure whether the omission of the trailing \n was intentional for these .result files. It took me some time to understand why the FreeBSD test case was removing the \n (https://cgit.freebsd.org/src/tree/usr.bin/file/tests/file_test.sh#n43).

So please consider either applying this fix or changing all the .result files so that they have a trailing \n.
Attached Files:
Notes
(0003828)
christos   
2022-10-09 18:00   
Added newlines to all result files for consistency.
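
The two comparison strategies at play can be sketched in Python. `strict_match` mirrors the reported steps (newlines are removed from file's output only, then the bytes are compared), so a .result file with a trailing newline fails. `tolerant_match` ignores trailing newlines on both sides and passes either way. Both function names are hypothetical. Neither is taken from the FreeBSD script or from file's own test driver.

```python
def strict_match(actual_output: bytes, expected_result: bytes) -> bool:
    """Strip newlines from the command output only, then compare byte-for-byte."""
    return actual_output.replace(b"\n", b"") == expected_result


def tolerant_match(actual_output: bytes, expected_result: bytes) -> bool:
    """Ignore trailing newlines on both sides before comparing."""
    return actual_output.rstrip(b"\n") == expected_result.rstrip(b"\n")
```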


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
393 [file] General minor always 2022-10-09 05:41 2022-10-09 16:48
Reporter: redshift Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: dsd64-dsf / JW07022A.mp3 tests failing with musl libc
Description: In https://bugs.astron.com/view.php?id=382 an issue was reported for compiling file 5.43 against musl libc. I confirmed that finding. When I add the patch committed for issue 382 (https://github.com/file/file/commit/1294029cdb18d4c0997f2b52df435076b8444137) I don't get success but a different failure on the same test:

```
Running test: ../tests/dsd64-dsf.testfile
../tests/dsd64-dsf.testfile: DSF audio bitstream data mono, simple-rate, 1 bit, 141184 samples
lt-test: ERROR: result was (len 65)
DSF audio bitstream data mono, simple-rate, 1 bit, 141184 samples
expected (len 94)
DSF audio bitstream data, 1 bit, mono, "DSD 64" 2822400 Hz, no compression, ID3 version 2.3.0
```

I used git bisect to check whether this is fixed by any later commit up to HEAD, and it's not, but there's one interesting thing - as soon as I hit the second-to-latest commit (https://github.com/file/file/commit/a960a2adb1c41ecb5b996390f2484035878c02f6) I get a different failure on a different test:

```
Running test: ../tests/JW07022A.mp3.testfile
../tests/JW07022A.mp3.testfile: Audio file with ID3 version 2.2.0, contains:MPEG ADTS, layer III, v1, 96 kbps, 44.1 kHz, Monaural
lt-test: ERROR: result was (len 97)
Audio file with ID3 version 2.2.0, contains:MPEG ADTS, layer III, v1, 96 kbps, 44.1 kHz, Monaural
expected (len 99)
Audio file with ID3 version 2.2.0, contains: MPEG ADTS, layer III, v1, 96 kbps, 44.1 kHz, Monaural
```

I suspected the test system stops on the first failure, so I removed the tests that run before dsd64-dsf to test whether it still fails, and it does. So, commit a960a2adb1 is adding additional failures, in this scenario.

These failures are reproducible every time for me.
Tags:
Steps To Reproduce: Build against musl libc; I'm on x86-64. I'm building for Void Linux which uses this build template - https://github.com/void-linux/void-packages/blob/master/srcpkgs/file/template - which does a fairly standard configure, make, make check, make install. Here's the full configure:

./configure --prefix=/usr --sysconfdir=/etc --sbindir=/usr/bin --bindir=/usr/bin --mandir=/usr/share/man --infodir=/usr/share/info --localstatedir=/var --host=x86_64-unknown-linux-musl --build=x86_64-unknown-linux-musl --libdir=${exec_prefix}/lib64 --enable-static --disable-libseccomp
Additional Information:
Attached Files:
Notes
(0003824)
christos   
2022-10-09 16:48   
It's all been fixed now, thanks! It had nothing to do with musl libc; it was broken magic and broken code.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
382 [file] General minor always 2022-09-15 14:32 2022-10-09 05:15
Reporter: Marv Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: dsd64-dsf test fails with musl C library
Description: When updating file from 5.42 to 5.43 on a system using the musl libc (version 1.2.3) the following test failure surfaced:
```
Running test: ../tests/dsd64-dsf.testfile
../tests/dsd64-dsf.testfile: DSD Stream File, mono, simple-rate, 1 bit, 141184 samples
test: ERROR: result was (len 57)
DSD Stream File, mono, simple-rate, 1 bit, 141184 samples
expected (len 94)
DSF audio bitstream data, 1 bit, mono, "DSD 64" 2822400 Hz, no compression, ID3 version 2.3.0
make[2]: *** [Makefile:738: check-local] Error 1
```

It seems like the wrong magic information is picked since there are multiple definitions for "DSD\x20" at position 0:
```
magic/Magdir/dsf:7:0 string DSD\x20 DSD Stream File,
magic/Magdir/audio:1230:0 string DSD\x20 DSF audio bitstream data
```

Any advice on how to track this down further is appreciated, as I'm not familiar with the codebase.

Best regards,
Marvin
Tags:
Steps To Reproduce: Build using the musl libc

I used the following configure invocation:
```
./configure --build=x86_64-pc-linux-musl --host=x86_64-pc-linux-musl --prefix=/usr/x86_64-pc-linux-musl --bindir=/usr/x86_64-pc-linux-musl/bin --sbindir=/usr/x86_64-pc-linux-musl/bin --libdir=/usr/x86_64-pc-linux-musl/lib --datadir=/usr/share --datarootdir=/usr/share --docdir=/usr/share/doc/file-5.43 --infodir=/usr/share/info --mandir=/usr/share/man --sysconfdir=/etc --localstatedir=/var/lib --disable-dependency-tracking --disable-silent-rules --enable-fast-install --enable-bzlib --enable-xzlib --enable-zlib --disable-libseccomp --disable-static
```
Additional Information:
Attached Files:
Notes
(0003819)
christos   
2022-09-27 19:08   
Merged duplicate entries.
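
Duplicate top-level tests like the two `DSD\x20` entries above can be found mechanically. The following Python sketch groups top-level magic lines by their first three fields (offset, type, test value) and reports any group that occurs more than once. It is a simplification under stated assumptions: it skips comment (`#`), directive (`!`) and continuation (`>`) lines, and it does not handle test values containing escaped whitespace. `find_duplicate_tests` is a hypothetical helper, not part of file.

```python
from collections import defaultdict


def find_duplicate_tests(magic_sources):
    """Map (offset, type, test) to the places it occurs, keeping only repeats.

    magic_sources: dict of file name -> magic file text.
    Returns a dict of key tuple -> list of (file name, line number).
    """
    seen = defaultdict(list)
    for name, text in magic_sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if not line or line[0] in "#!>":
                continue  # blank, comment, directive, or continuation line
            fields = line.split(None, 3)
            if len(fields) < 3:
                continue
            seen[tuple(fields[:3])].append((name, lineno))
    return {key: places for key, places in seen.items() if len(places) > 1}
```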


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
392 [tcsh] General major sometimes 2022-10-05 18:24 2022-10-05 18:24
Reporter: RohanTalip Platform:  
Assigned To: OS:  
Priority: normal OS Version:  
Status: new Product Version: 6.21.00  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Segmentation faults on macOS Monterey 12.6
Description: In one terminal window, after the issue seen in https://bugs.astron.com/view.php?id=391 was observed, typing commands that don't exist (initially accidentally) resulted in "Segmentation fault
Exit 139" being printed (I have printexitvalue set).

Later, trying to pipe the history to less also resulted in the same segmentation fault message, as did trying to use tab completion.

I am not sure how related this is to https://bugs.astron.com/view.php?id=391 (maybe the issue with job control / pipes corrupted some memory) but it seemed different enough to warrant a separate bug report.
Tags: bug
Steps To Reproduce: Assume "%N" below is my prompt, and "#" starts a comment.

%157 asdf list all nodejs | less

# Success.


%158 asdf list all nodejs | less

[1] + Done asdf list all nodejs |
       Suspended (tty output) /usr/local/bin/less
[1] 34808 34809


%159 jobs
[1] + Done asdf list all nodejs |
       Suspended (tty output) /usr/local/bin/less


%160 fg
asdf list all nodejs | /usr/local/bin/less
fg: No such job (badjob).
Exit 1


%161 asdf list all nodejs | less

# Success, I think. At least there was no exit status printed and no issue with the job control.

%162 asdf list all nodejs | grep ^18
18.0.0
18.1.0
18.2.0
18.3.0
18.4.0
18.5.0
18.6.0
18.7.0
18.8.0
18.9.0
18.9.1
18.10.0


# I accidentally typed tsc (which if configured correctly will compile/transpile TypeScript files into JavaScript files) :

%236 tsc
tsc: Command not found.
Segmentation fault
Exit 139

%237 npx tsc

# Success.


%238 tsc
tsc: Command not found.
Segmentation fault
Exit 139


# h is an alias for history:

%335 h | less
Segmentation fault
Exit 139


# "h 10" works as expected:

%337 h 10

%338 h | grep eslint | grep fix
Segmentation fault
Exit 1

%339 h | less
Segmentation fault
Exit 139


# gc is an alias for "git checkout"; I also have a completion for "gc"; trying to use tab completion resulted in "Segmentation fault" :

%367 gc sr<TAB>Segmentation fault

error: pathspec 'sr' did not match any file(s) known to git
Exit 1

# Later:

%460 gc ro<TAB>Segmentation fault

%570 aabb
aabb: Command not found.
Segmentation fault
Exit 139

%571 echo $version
tcsh 6.21.00 (Astron) 2019-05-08 (x86_64-apple-darwin) options wide,nls,dl,bye,al,kan,sm,rh,color,filec

%572 aaa
aaa: Command not found.
Segmentation fault
Exit 139

%581 complete gc
'p/1/`git branch | cut -c 3-`/'


# Other valid commands not requiring pipes or job control or tab completion seem to work fine.
Additional Information: Other less long-lived tcsh sessions that have also seen the issue in https://bugs.astron.com/view.php?id=391 don't exhibit these segmentation fault issues:

%42 aabb
aabb: Command not found.
Exit 1
Attached Files:
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
391 [tcsh] General major sometimes 2022-10-05 17:45 2022-10-05 17:45
Reporter: RohanTalip Platform:  
Assigned To: OS:  
Priority: normal OS Version:  
Status: new Product Version: 6.21.00  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Job control or tty output does not work all the time on macOS Monterey 12.6
Description: Either the job control or the tty output fails roughly 1 in 30 times in tcsh 6.21.00 on macOS Monterey 12.6. tcsh 6.21.00 is the version that is installed on macOS Monterey 12.6. I also tried tcsh 6.24.01 from Homebrew ( https://formulae.brew.sh/formula/tcsh#default ), with similar results.

I didn't notice the problem on macOS Monterey 12.5.1 or lower versions but maybe I didn't exercise the problem enough.
Tags: bug
Steps To Reproduce: Assume "%N" below is my prompt, and "..." represents omitted text or commands:

%1 echo $version
tcsh 6.21.00 (Astron) 2019-05-08 (x86_64-apple-darwin) options wide,nls,dl,bye,al,kan,sm,rh,color,filec

%2 sleep 2 |& less

%3 sleep 2 |& more

%4 sleep 2 |& more

% 5 sleep 1 |& more

%6 which less
less: aliased to /usr/local/bin/less

%7 less --version
less 608 (PCRE2 regular expressions)
...


%8 sleep 1 |& less

%9 sleep 1 |& less

...

%31 vmstat |& less
Exit 1

%32 iostat |& less

...

% 34 iostat -c 10 |& less

% 35 sleep 1 |& less

% 36 sleep 1 |& less

[1] + Done sleep 1 |&
       Suspended (tty output) /usr/local/bin/less
[1] 52209 52210

%37 jobs
[1] + Done sleep 1 |&
       Suspended (tty output) /usr/local/bin/less

%39 less --version | head -1
less 608 (PCRE2 regular expressions)
[1] + Done sleep 1 |&
       Terminated /usr/local/bin/less

%40 less --version | head -1
less 608 (PCRE2 regular expressions)

%41 jobs

# No jobs returned.
Additional Information: Occasionally, typing "fg" after there is a "Exit 1" or a "Done" followed by a "Suspended (tty output)" in the job output that is reported when there is an error/problem, will result in the tcsh process taking up 100% of a CPU core, after which I have to kill it.

Other times typing "fg" after such messages in the job output results in this message: "fg: No such job (badjob)."
Attached Files:
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
350 [tcsh] General block always 2022-05-27 09:02 2022-10-03 17:11
Reporter: HKOB Platform:  
Assigned To: OS:  
Priority: none OS Version:  
Status: new Product Version: 6.21.00  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: complete: completion list words do not accept the `/` character.
Description: It seems to me that completion list words do not accept the `/` character. Escaping it does not seem to work. If there is a correct way to write such an expression, a small example would be nice, perhaps in the docs or in a test.
Tags:
Steps To Reproduce: Example. Neither of these gives the expected completion options a/a a/b a/c:
```
complete ml 'p/1/(a/a a/b a/c)/'
complete ml 'p/1/"(a/a a/b a/c)"/'
```
Additional Information:
Attached Files:
Notes
(0003816)
alzwded   
2022-09-10 17:35   
This seems to work

```
> complete ml 'p_1_(a/a a/b a/c)_'
> ml^D
a/a a/b a/c
```

The second character is treated as the separator throughout (# seems to work as well).

I'm not entirely sure where this note would go in that pretty long section on the `complete` builtin; I don't think I have ever read every word of that section.

I'm not sure what an autotest would test, since it requires someone to trigger completion interactively. When you add a completion, tcsh pretty much just stores the string as is and only evaluates it when needed. So the man page is probably the correct place to mention this.
(0003823)
alzwded   
2022-10-03 17:11   
I was getting ready to update the man page, but on re-reading the section more carefully, I noticed there already is a passing mention of alternative delimiters:
                [.........................] For example, the Elm mail program uses
               `=' as an abbreviation for one's mail directory. One might use

                   > complete elm c@=@F:$HOME/Mail/@

               to complete `elm -f =' as if it were `elm -f ~/Mail/'. Note
               that we used `@' instead of `/' to avoid confusion with the
               select argument [...................]
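
The effect of the delimiter choice can be illustrated with a small Python sketch. It only assumes what the note above states, namely that the second character of the pattern is used as the separator throughout. tcsh's real parser is more involved, and `split_completion` is a hypothetical helper.

```python
def split_completion(pattern):
    """Split a completion pattern on its own delimiter (the second character).

    Returns the command letter and the list of delimiter-separated fields.
    """
    command, delimiter = pattern[0], pattern[1]
    fields = pattern[2:].split(delimiter)
    return command, fields
```

With `/` as the delimiter, the word list `(a/a a/b a/c)` is cut into pieces at every slash. With `_` (or `@`, as in the man page excerpt) it stays in one field.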


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
385 [file] General tweak always 2022-09-27 10:10 2022-09-27 19:26
Reporter: saltedcoffii Platform: x86_64  
Assigned To: christos OS: Arch Linux  
Priority: normal OS Version: rolling  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Anki APKG files are not recognized separately from ZIP files
Description: Anki (ankiweb.net) APKG files are compressed ZIP archives containing exported flash card information, so that flash card sets from the app can be shared. From what I understand, libmagic sees the ZIP magic bytes and identifies the file as a ZIP file. This failure to recognize APKG files has practical consequences: on GNOME, double-clicking an APKG file does not import the flash cards into Anki when it is installed (the expected behavior) but instead opens the archive in an archive manager such as file-roller.
Tags: filename, magic, zip
Steps To Reproduce: 1. Download any APKG file (many are available here https://ankiweb.net/shared/decks/)
2. Run `file --mime-type <your filename>.apkg` and observe that the MIME type is `application/zip` rather than `application/x-apkg`.
Additional Information:
Attached Files:
Notes
(0003821)
christos   
2022-09-27 19:26   
Added magic for them, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
384 [file] General minor always 2022-09-25 15:48 2022-09-27 19:12
Reporter: darose Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: Mime type application/x-ndjson misspelled as "x-ndjason"
Description: This is a follow up from issue 0000359. That issue was resolved by recognizing JSON Lines files and identifying them as type "application/x-ndjson". However, the mimetype is being misspelled as "x-ndjason".
Tags:
Steps To Reproduce: 1. Upgrade "file" to v5.43
2. Run "file" on a valid JSON Lines file.
3. "file" will identify it as "application/x-ndjason" rather than "application/x-ndjson".
Additional Information: See the following output for an example:

$ cat /tmp/test.json
{}
{}

$ pacman -Q file
file 5.43-1

$ file -bz --mime-type /tmp/test.json
application/x-ndjason
Attached Files:
Notes
(0003820)
christos   
2022-09-27 19:12   
Fixed, thanks.
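
For context, a JSON Lines check of the kind that yields `application/x-ndjson` can be sketched in Python: every non-empty line must parse as a JSON value on its own, and there must be more than one such line. This is an illustrative assumption, not file's actual detection logic, and `looks_like_ndjson` is a hypothetical name.

```python
import json


def looks_like_ndjson(text, min_lines=2):
    """Return True if each non-empty line is a standalone JSON document."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < min_lines:
        return False
    for line in lines:
        try:
            json.loads(line)
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            return False
    return True
```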


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
381 [file] General minor always 2022-09-15 00:33 2022-09-27 19:05
Reporter: vt Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.44  
    Target Version:  
Summary: test.c:61:22: warning: use after 'free' of 's_17' [CWE-416] [-Wanalyzer-use-after-free]
Description: The GCC 12 static analyzer seems to correctly report:
```
test.c: In function 'slurp':
test.c:61:22: warning: use after 'free' of 's_17' [CWE-416] [-Wanalyzer-use-after-free]
   61 | *s++ = c;
      | ^

test.c:67:14: warning: use after 'free' of 's_18' [CWE-416] [-Wanalyzer-use-after-free]
   67 | *s++ = '\0';
      | ^

```
Even though this is just a test.

P.S. There is also an additional warning:
```
file.c: In function 'unwrap':
file.c:550:25: warning: leak of 'flist_41' [CWE-401] [-Wanalyzer-malloc-leak]
  550 | return 1;
      | ^
```
Tags:
Steps To Reproduce: CFLAGS=-fanalyzer
Additional Information:
Attached Files:
Notes
(0003818)
christos   
2022-09-27 19:05   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
383 [file] General minor always 2022-09-25 04:27 2022-09-27 19:01
Reporter: delphij Platform: arm64  
Assigned To: christos OS: FreeBSD  
Priority: normal OS Version: -CURRENT  
Status: resolved Product Version: 5.43  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: printf %lc expects wint_t
Description: In src/file.c, fname_print():

        wchar_t nextchar;
[...]
                        printf("%lc", nextchar);

but %lc expects wint_t, and this will cause build breakage for FreeBSD/arm64.
Tags:
Steps To Reproduce: build with -Werror and -Wformat.
Additional Information:
Attached Files: 0001-file.c-Explicitly-cast-nextchar-to-wint_t-for-printf.patch (775 bytes) 2022-09-25 04:27
https://bugs.astron.com/file_download.php?file_id=300&type=bug
Notes
(0003817)
christos   
2022-09-27 19:01   
fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
380 [file] General crash always 2022-09-01 14:23 2022-09-01 16:03
Reporter: jmoyano Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: feedback Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: PDF file incorrectly reported as "data"
Description: PDF files are incorrectly reported as "data" if the file has leading spaces in the "%PDF-" string

  %PDF-1.4 <-- Notice the spaces at the beginning
%����
3 0 obj
<</Type /Page
/Parent 1 0 R
/MediaBox [0 0 595.280 841.890]
/TrimBox [0.000 0.000 595.280 841.890]
/Resources 2 0 R...
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: boleta (5).pdf (60,395 bytes) 2022-09-01 14:23
https://bugs.astron.com/file_download.php?file_id=299&type=bug
Notes
(0003813)
christos   
2022-09-01 16:03   
While it is easy to fix the magic recognition to ignore spaces, according to the spec https://opensource.adobe.com/dc-acrobat-sdk-docs/standards/pdfstandards/pdf/PDF32000_2008.pdf this is not a pdf file.
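
The distinction drawn in the note can be sketched in Python: a strict check requires `%PDF-` at offset 0, while a lenient check first skips leading whitespace. `pdf_header_status` and its return labels are hypothetical and only illustrate the two behaviors. Which one file should implement is the open question of this issue.

```python
def pdf_header_status(data: bytes) -> str:
    """Classify data by where the "%PDF-" marker appears.

    Returns "pdf" if the marker is at offset 0, "pdf-after-whitespace" if it
    only appears after leading spaces, tabs or line breaks, else "other".
    """
    marker = b"%PDF-"
    if data.startswith(marker):
        return "pdf"
    if data.lstrip(b" \t\r\n").startswith(marker):
        return "pdf-after-whitespace"
    return "other"
```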


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
379 [file] General trivial always 2022-08-30 17:33 2022-09-01 12:27
Reporter: lvieirajr Platform: All  
Assigned To: christos OS: All  
Priority: normal OS Version: All  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: New python file-magic release
Description: Hi,
I'm sorry if this is not the correct place to mention this but it was the only location I could find.

A bug in the Python binding for file was fixed close to a year ago, but the fix has not yet been released as a new version on PyPI:
https://bugs.astron.com/view.php?id=285

It was supposed to be out whenever 0.4.1 came out, but that never happened.
As you can see in here: https://pypi.org/project/file-magic/#history
File-magic is still on version 0.4.0

Any chance we could get version 0.4.1 built and distributed to PyPI? This is an issue that we face daily while using file-magic.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003811)
christos   
2022-09-01 12:27   
Release 0.4.1


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
363 [file] General major always 2022-07-01 01:55 2022-08-31 13:53
Reporter: dimich Platform: x86_64  
Assigned To: christos OS: Linux  
Priority: normal OS Version: Arch Linux  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: Truncated filenames containing multibyte characters
Description: The bugfix for issue 351 introduced a new bug: filenames are truncated due to an incorrect calculation of the printable filename width.

The filename width is first calculated in file_mbswidth() with respect to multibyte characters (first statement of #if/#endif); it uses iswprint() and handles multibyte characters correctly. The filename is then passed to file_printable(), which uses plain isprint() and replaces every byte of a multibyte character with 4 characters. The output is limited to the previously calculated width (the wid argument) and is therefore truncated, even in raw mode.
Tags: bug, filename, multibyte
Steps To Reproduce: $ touch файл.txt
$
$ ls --zero | hexdump -b
0000000 321 204 320 260 320 271 320 273 056 164 170 164 000
000000d
$
$ file файл.txt
\321\204\320\260\320\271\320\273: empty
$
$ file -r файл.txt
файл: empty
Additional Information: I think the --raw option should not affect filenames at all. Non-printable characters may be replaced, but the replacement should at least respect multibyte encodings.
See also issue 362.
Attached Files: issue363.patch (2,036 bytes) 2022-07-01 03:10
https://bugs.astron.com/file_download.php?file_id=283&type=bug
issue363-upd1.patch (3,308 bytes) 2022-07-01 04:20
https://bugs.astron.com/file_download.php?file_id=284&type=bug
issue363-upd2.patch (3,406 bytes) 2022-07-01 06:43
https://bugs.astron.com/file_download.php?file_id=285&type=bug
Notes
(0003771)
dimich   
2022-07-01 03:10   
This patch seems to fix the issue.
(0003772)
dimich   
2022-07-01 04:20   
Updated patch: handle invalid sequences in filenames and multicolumn characters.
(0003773)
dimich   
2022-07-01 06:43   
One more try :) Do not replace invalid sequence characters in raw mode; print them as is.
The only issue I found is when --raw mode is on, --no-pad is off, LC_CTYPE=C (or another 1-byte encoding) and the console is UTF-8. In this case the field width cannot be calculated correctly: we don't know how many character cells a sequence will take.
A possible solution is to force --no-pad in --raw mode.
(0003780)
christos   
2022-07-04 19:46   
Dup for PR/362
(0003782)
christos   
2022-07-04 20:16   
I like your idea to print invalid as octal, so I applied to my patch.
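
The width mismatch described in this issue can be made concrete with a Python sketch. `char_width` counts one column per printable character, roughly what a wide-character pass would compute. `escape_bytes` replaces every byte outside printable ASCII with a four-character octal escape, as in the output shown in the report. Both helpers are hypothetical, assume UTF-8, and ignore multi-column characters. They are not the libmagic functions.

```python
def char_width(name: str) -> int:
    """Width in columns, counting 1 per printable character and 4 otherwise."""
    return sum(1 if ch.isprintable() else 4 for ch in name)


def escape_bytes(name: str, encoding: str = "utf-8") -> str:
    """Render a name byte by byte, octal-escaping anything outside printable ASCII."""
    out = []
    for byte in name.encode(encoding):
        if 0x20 <= byte < 0x7F:
            out.append(chr(byte))
        else:
            out.append("\\%03o" % byte)
    return "".join(out)
```

For the filename from the report, the character-based width is 8 while the escaped form needs 36 characters, so any output limited to the first figure cannot hold the second.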


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
376 [file] General minor always 2022-08-22 10:17 2022-08-23 08:01
Reporter: bsevens Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: Commit 43f7989076aa3731f3558c6954780bc7b2734b64 broke VHD detection
Description: Commit 43f7989076aa3731f3558c6954780bc7b2734b64 (https://github.com/file/file/commit/43f7989076aa3731f3558c6954780bc7b2734b64) aimed at fixing spelling errors, but also included changes to some file signatures.

E.g. `conectix` was changed to `connectix`, which means Microsoft Disk Images are currently not detected by file.
Tags:
Steps To Reproduce: $ echo connectix > /tmp/wrong.vhd
$ file /tmp/wrong.vhd
/tmp/wrong.vhd: Microsoft Disk Image, Virtual Server or Virtual PC, 0 bytes, type 0, State 0
$ echo conectix > /tmp/correct.vhd
$ file /tmp/correct.vhd
/tmp/correct.vhd: ASCII text
Additional Information:
Attached Files:
Notes
(0003808)
christos   
2022-08-23 08:01   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
375 [file] General major always 2022-08-08 18:07 2022-08-18 08:08
Reporter: DC Platform: Linux  
Assigned To: christos OS: Gentoo  
Priority: high OS Version: Latest  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: Broken output for national characters
Description: Since version 5.42, the file command does not correctly print paths that contain national characters:

$ file -v
file-5.42
magic file from /usr/share/misc/magic
seccomp support included
$ file /tmp/ěščřžýáíé/file.txt
/tmp/\304\233\305\241\304\215\305\231\305\276\303\275\303\241\303\255\303\251: ASCII text

File version 5.41 is working fine:
$ file -v
file-5.41
magic file from /usr/share/misc/magic
seccomp support included
$ file /tmp/ěščřžýáíé/file.txt
/tmp/ěščřžýáíé/file.txt: ASCII text

My locales are:
LANG=cs_CZ.UTF-8
LC_ALL=cs_CZ.UTF-8
Tags: bug, filename
Steps To Reproduce: Run the file command on a file with national characters in the path.
Additional Information:
Attached Files:
Notes
(0003800)
christos   
2022-08-17 08:56   
How do you reproduce it?
bash-3.2$ touch /tmp/ěščřžýáíéfile.txt
bash-3.2$ ./file -m ../magic/magic.mgc /tmp/*.txt
/tmp/ěščřžýáíéfile.txt: empty
bash-3.2$ file -v
file-5.42
magic file from /usr/local/share/misc/magic
(0003802)
lilydjwg   
2022-08-18 04:17   
I get this issue with Chinese characters too.
christos, did you check the version of the wrong file binary?
(0003803)
christos   
2022-08-18 06:40   
I am running with the version from the HEAD of the tree which is different. Can you try that?
(0003805)
DC   
2022-08-18 08:05   
I confirm that when I build the file binary from the latest sources, everything seems to work fine again!
(0003806)
christos   
2022-08-18 08:08   
Great, let me close this and plan for a new release! Thanks for testing.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
374 [file] General major always 2022-08-06 15:53 2022-08-17 08:48
Reporter: piru Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: Endless busyloop in file_mbswidth
Description: file_mbswidth contains an endless busyloop in the non-widechar version.

The bug is in the while loop:

    while (*s) {
        width += (ms->flags & MAGIC_RAW) != 0
            || isprint(CAST(unsigned char, *s)) ? 1 : 4;
    }

ref: https://github.com/file/file/blob/e1233247bbe4d2d66b891224336a23384a93cce1/src/file.c#L678


Note that the variable `s' is not incremented at all. The fix is easy: add s++; to the loop.
Tags:
Steps To Reproduce: 1. Build file for system without widechar support
2. file anyfile
Additional Information: This bug was added by commit https://github.com/file/file/commit/f448f3e5c37de8c285ac14b032b2bdcea82fc08b
Attached Files:
Notes
(0003799)
christos   
2022-08-17 08:48   
Fixed, thanks!
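
The intended behavior of the loop, with the missing increment in place, can be sketched in Python. Each byte counts as 1 column when raw mode is on or the byte is printable ASCII, and as 4 columns otherwise (the length of an octal escape). `bytes_width` and its `raw` parameter are hypothetical stand-ins for the C function and the MAGIC_RAW flag. The printable range assumes the C locale.

```python
def bytes_width(name: bytes, raw: bool) -> int:
    """Byte-wise display width: 1 per byte if raw or printable, else 4."""
    width = 0
    i = 0
    while i < len(name):
        byte = name[i]
        width += 1 if (raw or 0x20 <= byte < 0x7F) else 4
        i += 1  # advancing the position is the step missing from the reported loop
    return width
```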


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
99 [tcsh] General minor always 2019-08-13 14:44 2022-08-09 09:53
Reporter: xdelaruelle Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version: 6.21.00  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: use of :q in back-tick context leads to erroneous extra history entries
Description: I am developing a tool called Modules (http://modules.sourceforge.net/) which makes it possible to dynamically update the user's shell environment. In short, this tool provides a shell alias called `module` which evaluates, in the current shell, the output of a script that produces shell code.

The `module` command is currently defined this way:

  $ alias module 'eval "`/usr/share/Modules/libexec/modulecmd.tcl tcsh \!*:q`"'

Which leads to erroneous entries in the ~/.history file:

  #+1565705427
  module load foo bar
  #+1565705427
  load foo bar#+1565705430
  history

All arguments passed from the alias to the modulecmd.tcl script are added as a second history line. These erroneous entries seem to come from the use of the quote modifier `:q` within a back-tick context ``.

Some explanation of the use of the quote modifier :q in this alias: the modulecmd.tcl script outputs shell code, which sometimes contains special characters such as curly braces, so the result of the script execution is enclosed in double quotes before being passed to eval. In this situation the quote modifier is needed to obtain correctly quoted arguments.

The issue has been reproduced on tcsh versions 6.18, 6.19, 6.20 and 6.21.
Tags:
Steps To Reproduce: To reproduce the issue on a small example, here is a script that produces some shell code (to define an alias or set an environment variable):

$ cat ./dispatch
#!/bin/csh
if ( $#argv != 1 ) then
  echo echo should get exactly 1 arg
  exit 1
endif

switch ( $argv[1] )
case myname:
  echo alias myname getent\\ passwd\\ \\\$USER\ \\\|\\ awk\\ -F:\\ \\\'\\\{print\ \\\$5\\\}\\\';
  breaksw;
case "":
  echo setenv EMPTY 1;
  breaksw;
endsw

$ echo $tcsh
6.21.00

Here we define the shell alias that calls the script and evaluates the shell code this script outputs:

$ alias tuneenv 'eval "`./dispatch \!*:q`"'

Then we use the alias:

$ tuneenv ""
$ echo $EMPTY
1
$ tuneenv myname
$ myname
Xavier
$ exit

Looking at history file, erroneous entries can be seen:

$ tail ~/.history
#+1565702553
tuneenv ""
#+1565702553
""#+1565702559
echo $EMPTY
#+1565702565
tuneenv myname
#+1565702565
myname#+1565702567
myname
Additional Information: Some additional tests to demonstrate the need to enclose script result in double quotes to pass it to eval:

$ alias tuneenv 'eval `./dispatch \!*`'
$ tuneenv myname
Missing '}'.

So without the double quotes, the shell alias myname which contains curly braces cannot be set

Then if we enclose script result in double quotes, !* should get the quote modifier applied to correctly transmit quoted arguments:

$ alias tuneenv 'eval "`./dispatch \!*`"'
$ tuneenv myname
$ myname
Xavier
$ tuneenv ""
should get exactly 1 arg

Without :q, the "" argument is not transmitted to the dispatch script.
Attached Files: dispatch (279 bytes) 2019-08-13 14:44
https://bugs.astron.com/file_download.php?file_id=72&type=bug
Notes
(0003317)
christos   
2019-10-19 18:54   
Sorry, I can't reproduce it with 6.21.00 on either Linux or NetBSD. Did that break with 6.21.00, or was the bug always there?
(0003318)
xdelaruelle   
2019-10-20 15:21   
I get the exact same result whether I test this on tcsh version 6.18, 6.19, 6.20 or 6.21 on a Linux system.

Here are the details of the tcsh 6.21.00 build I ran:

$ echo $version
tcsh 6.21.00 (Astron) 2019-05-08 (x86_64-unknown-linux) options wide,nls,dl,al,kan,sm,rh,color,filec

With that shell, applying the command sequence described in the 'Steps To Reproduce' section, using the dispatch script attached to this ticket (with the shebang adapted to match the tcsh shell being run), produces the same erroneous ~/.history file as described there.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
371 [file] General minor always 2022-07-27 20:56 2022-07-30 18:07
Reporter: Mark.Taylor Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: magic type returned is 100% bogus: ", utcTime=2073-\0120-.0 23:85:8\012 GMT"
Description: Tested using Mac `file` 5.41, Linux `file` 5.37, 5.38, and 5.42: a sample text file containing floating point numbers returns a bogus magic type, as in ", utcTime=2073-\0120-.0 23:85:8\012 GMT".


0.000252
0.065583
0.024648
0.028474
0.024969
0.024273
0.031606
0.024479
0.02417
0.024741
0.024859
0.024396
0.027473
0.023858
0.024483
0.024377
0.032009
0.024065
0.024564
Tags:
Steps To Reproduce: Put the above in a file, or use the attached download, and run `file` on it. It should return `ASCII text`, but instead it returns the indicated string.

Note that on a Mac with 5.41 it returns:
# file x.txt
x.txt: , not-valid-before=2073-
0-.0 23:85:8
 GMT
Additional Information: Not including "soft" (`file -e soft`) makes the return `ASCII text` as expected. Note that I tracked it down to a problem in softmagic.c:match() when it was working on either magindex 14184 or 14231 (skipping both of those seems to make it DTRT).
Attached Files: x.txt (170 bytes) 2022-07-27 20:56
https://bugs.astron.com/file_download.php?file_id=292&type=bug
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
370 [file] General trivial have not tried 2022-07-25 17:54 2022-07-30 17:06
Reporter: phll4 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: Typo for "PhotometricIntepretation"
Description: magic/Magdir/images contains a typo in the word "PhotometricIntepretation" which should be "PhotometricInterpretation" (additional "r": PhotometricInte**r**pretation).

Link to the code mirror on GitHub of the line in current master: https://github.com/file/file/blob/1c9f670f69bd508da2b4d05a28d27bd92407f2c8/magic/Magdir/images#L368
Tags:
Steps To Reproduce:
Additional Information: The same typo did exist in libtiff, at least in the manpage of tiffgt at some point (old version is still online at http://www.libtiff.org/man/tiffgt.1.html).
Current tiffgt does NOT have the typo anymore: https://gitlab.com/libtiff/libtiff/-/blob/master/doc/tools/tiffgt.rst
Nor does it exist anywhere in modern libtiff: https://gitlab.com/search?search=PhotometricIntepretation&nav_source=navbar&project_id=4720790&group_id=2221836&search_code=true&repository_ref=master
Attached Files:
Notes
(0003791)
christos   
2022-07-30 17:06   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
369 [file] General minor always 2022-07-21 15:00 2022-07-30 17:04
Reporter: hbent Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: src/vasprintf.c typo fix
Description: src/vasprintf.c line 638 is:
memcpy (&s.vargs, &vargs, sizeof (s.va_args));

should be
memcpy (&s.vargs, &vargs, sizeof (s.vargs));
Tags:
Steps To Reproduce: Attempt to compile on a platform that lacks va_copy
Additional Information: Found on alphaev56-dec-osf5.1b
Attached Files:
Notes
(0003790)
christos   
2022-07-30 17:04   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
366 [file] General feature N/A 2022-07-10 23:25 2022-07-17 15:36
Reporter: polluks Platform: MacBookPro17,1  
Assigned To: christos OS: macOS  
Priority: normal OS Version: 12.3.1  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: Added to xo65
Description: The simulator files are missing so far.
Tags:
Steps To Reproduce:
Additional Information:
System Description Apple M1
Attached Files: xo65 (1,154 bytes) 2022-07-10 23:25
https://bugs.astron.com/file_download.php?file_id=288&type=bug
Notes
(0003789)
christos   
2022-07-17 15:36   
Added, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
367 [file] General feature always 2022-07-14 16:14 2022-07-17 15:33
Reporter: Mytherin Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: Add detection of DuckDB database files
Description: DuckDB is an open source database system, similar to SQLite but focused on analytical workloads: https://github.com/duckdb/duckdb

This patch adds detection for database files generated by DuckDB.

Attached is an example database file.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: example-duckdb.db (274,432 bytes) 2022-07-14 16:14
https://bugs.astron.com/file_download.php?file_id=290&type=bug
duckdb.magic.patch (346 bytes) 2022-07-14 16:14
https://bugs.astron.com/file_download.php?file_id=289&type=bug
Notes
(0003788)
christos   
2022-07-17 15:33   
added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
368 [tcsh] General minor always 2022-07-15 19:09 2022-07-15 19:09
Reporter: untitled Platform: amd64  
Assigned To: OS: FreeBSD, Linux  
Priority: normal OS Version: all  
Status: new Product Version: 6.23.00  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Pipe error
Description: vmstat -m | head -n 1 && vmstat -m | grep smtn_present

does not _always_ show the output of the second command, but prints only the headers. The main word here is "always", as sometimes it works fine, but sometimes shows only the headers of vmstat. When commands are separated with a semicolon, everything works fine.
Also
vmstat -m | head -n 1 || vmstat -m | grep smtn_present
sometimes shows output from both commands.

Tested on FreeBSD and Debian linux.
Tags:
Steps To Reproduce: - log into tcsh
- vmstat -m | head -n 1 && vmstat -m | grep smtn_present
- enter the command many-many times and see the different results
Additional Information:
Attached Files: Screenshot 2022-07-15 at 22.09.23.png (768,709 bytes) 2022-07-15 19:09
https://bugs.astron.com/file_download.php?file_id=291&type=bug
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
365 [file] General feature always 2022-07-09 11:04 2022-07-09 16:12
Reporter: wof Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: add detection of Unison archive format
Description: Unison is a two-way cross-platform file synchronizer.
To store the current state after a sync, unison creates archive files (see https://www.cis.upenn.edu/~bcpierce/unison/download/releases/stable/unison-manual.html#archives).

This patch adds detection for this file type.

The unison project can be found on https://github.com/bcpierce00/unison
Tags:
Steps To Reproduce: % file ar1a68e37cd2df302f691f0881c17b2074
ar1a68e37cd2df302f691f0881c17b2074: data
Additional Information:
Attached Files: unison.magic.patch (880 bytes) 2022-07-09 11:04
https://bugs.astron.com/file_download.php?file_id=287&type=bug
ar1a68e37cd2df302f691f0881c17b2074 (6,624 bytes) 2022-07-09 11:04
https://bugs.astron.com/file_download.php?file_id=286&type=bug
Notes
(0003787)
christos   
2022-07-09 16:12   
Added to archive.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
364 [file] General minor always 2022-07-05 14:24 2022-07-07 17:20
Reporter: mam-ableton Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: Truncated "from" string when processing QEMU ELF coredumps
Description: File shows incorrect output when processing ELF coredumps created by QEMU.

Here is example output:

root@489a926e3c6b:/ci/test-arm# file qemu_segfault-arm-more-than-sixteen-chars_20220701-162324_884.core
qemu_segfault-arm-more-than-sixteen-chars_20220701-162324_884.core: ELF 32-bit LSB core file, ARM, version 1 (SYSV), SVR4-style, from 'segfault-arm-mor./segfault-arm-more-than-sixteen-chars', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: './segfault-arm-more-than-sixteen-chars', platform: 'v8l'

Note this specific part: "from 'segfault-arm-mor./segfault-arm-more-than-sixteen-chars"

It should actually say: "from './segfault-arm-more-than-sixteen-chars"

This is two separate strings that have been incorrectly concatenated: "segfault-arm-mor" and "./segfault-arm-more-than-sixteen-chars"

The first string comes from the `pr_fname` member of the `struct elf_prpsinfo`. It is a 16 byte char buffer that contains the first 16 bytes of the filename. It is not required to be NULL terminated, *although Linux does*. QEMU does not in its coredumps — if the filename is >=16 characters, there is no NULL separating this buffer from `pr_psargs` (described next).

The second string comes from the immediate following struct member, `pr_psargs`, an 80 byte char buffer with the argv strings.

See struct definition here: https://github.com/torvalds/linux/blob/c1084b6c5620a743f86947caca66d90f24060f56/include/linux/elfcore.h#L73-L74
QEMU has a matching one: https://github.com/qemu/qemu/blob/19361471b59441cd6f2aa22d4fbee7a6e9e76586/linux-user/elfload.c#L3558-L3559

Here's the bug: File first examines `pr_psargs` via its offsets lists: https://github.com/file/file/blob/f042050f59bfc037677871c4d1037c33273f5213/src/readelf.c#L266-L267

If it finds a valid string there, it then checks whether the `pr_fname` buffer (immediately before it in memory) contains only printable characters. If it does, it concludes that both buffers are the first and second parts of the same string, and prints output with them concatenated. See https://github.com/file/file/blob/f042050f59bfc037677871c4d1037c33273f5213/src/readelf.c#L894-L905

Strictly speaking, this is not sound because the `pr_fname` buffer is not guaranteed to be NULL terminated (i.e. have a non-printable character).

In practice, this bug does not manifest for most coredumps because they are generated by the Linux kernel, which happens to NULL terminate `pr_fname`. This causes the above printable character check to fail, and only the `pr_psargs` buffer is output.
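
To illustrate the point about termination, here is a hedged sketch of reading a fixed-size field such as `pr_fname` without relying on a terminator. It is not the code in readelf.c; `copy_fixed_field` and `FNAME_SIZE` are hypothetical names, and the 16-byte size is taken from the description above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FNAME_SIZE 16	/* size of pr_fname as described in this report */

/*
 * Copy a fixed-size, possibly unterminated field into dst and always
 * NUL-terminate the result. dst must hold at least FNAME_SIZE + 1 bytes.
 * Returns the number of bytes copied, excluding the terminator.
 */
static size_t
copy_fixed_field(char *dst, const char *field)
{
	const char *nul;
	size_t len;

	assert(dst != NULL && field != NULL);
	/* Never look past the end of the field, terminated or not. */
	nul = memchr(field, '\0', FNAME_SIZE);
	len = nul != NULL ? (size_t)(nul - field) : FNAME_SIZE;
	memcpy(dst, field, len);
	dst[len] = '\0';
	return len;
}
```

Bounding the read by the field size keeps the short name and the following `pr_psargs` bytes separate even when no NUL sits between them.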



Environment:

```
# file --version
file-5.38
magic file from /etc/magic:/usr/share/misc/magic

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.4 LTS
Release: 20.04
Codename: focal

# uname -a
Linux 489a926e3c6b 5.10.104-linuxkit #1 SMP Thu Mar 17 17:08:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
```
Tags:
Steps To Reproduce: For convenience I've attached a QEMU coredump and native Linux coredump. They exceed the upload limit, so I've made them available here: https://www.dropbox.com/sh/2si8tcbt2w5rumh/AAB447Q2Vey6kgSozQV5uKa6a?dl=0

# file *
native-linux-core.core: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from './main-more-than-sixteen-chars', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: './main-more-than-sixteen-chars', platform: 'x86_64'
qemu_main-test-this-is-more-than-sixteen-chars_20220705-161944_20938.core: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from 'main-test-this-i', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, platform: 'i686'

To try from scratch:

- Build an x86_64 binary that segfaults and name it with a filename >= 16 bytes in length

int main(){
        *(int*)(0) = 0;
}

gcc -o main-more-than-sixteen-chars main.c

- Install qemu-user (e.g. apt install qemu-user)
- Enable coredumps (ulimit -c unlimited)
- Run the binary under qemu-x86_64 (e.g. qemu-x86_64 main-more-than-sixteen-chars)
- Run file on the coredump (e.g. file core)
Additional Information:
Attached Files:
Notes
(0003783)
mam-ableton   
2022-07-05 14:26   
> - Run file on the coredump (e.g. file core)

Typo here; the coredump will not be named "core". It will begin with "qemu_..."
(0003784)
christos   
2022-07-07 15:32   
Hmm the native hexdump looks like:
000006e0 be 51 00 00 01 00 00 00 6d 61 69 6e 2d 6d 6f 72 |.Q......main-mor|
000006f0 65 2d 74 68 61 6e 2d 00 2e 2f 6d 61 69 6e 2d 6d |e-than-../main-m|
00000700 6f 72 65 2d 74 68 61 6e 2d 73 69 78 74 65 65 6e |ore-than-sixteen|
00000710 2d 63 68 61 72 73 20 00 00 00 00 00 00 00 00 00 |-chars .........|

We first look at the full name at 0x6f8, we find it and we print it.

Where the qemu one looks like:
00000590 ca 51 00 00 01 00 00 00 6d 61 69 6e 2d 74 65 73 |.Q......main-tes|
000005a0 74 2d 74 68 69 73 2d 69 d7 58 80 01 40 20 20 20 |t-this-i.X..@ |
000005b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000005f0 00 00 00 00 00 00 00 00 05 00 00 00 10 01 00 00 |................|
00000600 06 00 00 00 43 4f 52 45 00 75 6e 61 03 00 00 00 |....CORE.una....|


As you can see QEMU does not put the full command line where we expect it (at 0x5a8 byte 0xd7) and the contents there are non printable, so it tries at 0x598 and prints the short name (which is as you mention non-nul-terminated).
(0003785)
mam-ableton   
2022-07-07 16:01   
Thanks for the quick reply. My mistake - that qemu core dump was from a buggy old qemu version which produced buggy coredumps. (See https://github.com/qemu/qemu/commit/5f779a3a26a9dcc8072d909b7759bb9fade097a9)

I have supplied a coredump from a newer qemu version : "qemu_main-test-this-is-more-than-sixteen-chars_20220702-121202_14541.core" in the same link: https://www.dropbox.com/sh/2si8tcbt2w5rumh/AAB447Q2Vey6kgSozQV5uKa6a?dl=0

That produces output like this:

qemu_main-test-this-is-more-than-sixteen-chars_20220702-121202_14541.core: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from 'main-test-this-i./main-test-this-is-more-than-sixteen-chars', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: './main-test-this-is-more-than-sixteen-chars', platform: 'x86_64'

In this case the short name and arguments are directly contiguous. First it will look at the args, but then it will peek at the short name immediately before, see that it consists entirely of printable characters, and assume they are both part of the same string, which is not the case.

000005f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000600: 0000 0000 0500 0000 8800 0000 0300 0000 ................
00000610: 434f 5245 0075 6e61 0000 0000 0000 0000 CORE.una........
00000620: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000630: cd38 0000 ad38 0000 cd38 0000 ad38 0000 .8...8...8...8..
00000640: 6d61 696e 2d74 6573 742d 7468 6973 2d69 main-test-this-i
00000650: 2e2f 6d61 696e 2d74 6573 742d 7468 6973 ./main-test-this
00000660: 2d69 732d 6d6f 7265 2d74 6861 6e2d 7369 -is-more-than-si
00000670: 7874 6565 6e2d 6368 6172 7320 0000 0000 xteen-chars ....
00000680: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000690: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000006a0: 0500 0000 2001 0000 0600 0000 434f 5245 .... .......CORE
000006b0: 0075 6e61 0300 0000 0000 0000 4000 0000 .una........@...
(0003786)
christos   
2022-07-07 17:20   
Thanks, fixed.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
359 [file] General minor always 2022-06-14 17:02 2022-07-04 20:08
Reporter: darose Platform:  
Assigned To: christos OS: Linux  
Priority: normal OS Version: 5.18.3  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: v5.42 not correctly identifying JSON Lines files
Description: v5.41 would correctly identify a JSON Lines file (see https://jsonlines.org/ and http://ndjson.org/) as json. v5.42 now only identifies it as ascii text.

(This is breaking functionality in a project of mine which requires the ability to detect json files.)
Tags:
Steps To Reproduce: 1. Upgrade "file" to v5.42
2. Run "file" on a valid JSON Lines file.
3. "file" will not identify it as json; only ascii text.
Additional Information: See the following output for an example:

$ cat /tmp/test.json
{}
{}

$ pacman -Q file
file 5.41-1

$ file /tmp/test.json
/tmp/test.json: JSON data

$ sudo pacman -S file
...
Packages (1) file-5.42-1
...

$ pacman -Q file
file 5.42-1

$ file /tmp/test.json
/tmp/test.json: ASCII text
Attached Files:
Notes
(0003781)
christos   
2022-07-04 20:08   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
362 [file] General major always 2022-06-25 14:43 2022-07-04 19:45
Reporter: ro-ee Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: File name in output shortened with -raw option
Description: With recent PR/351: CathyKMeow: octalify unprintable characters in filenames unless raw, a regression has been introduced that has ramifications for Midnight Commander (mc). MC doesn’t use the -raw option, and does not expect the octal version of the file name in the output. This is not the issue, though.

With the -raw option, the output is shortened. It looks like the file name in the output is shortened to /n/ Bytes where /n/ is the number of Characters, not Bytes.

Example:
Testö.jpg = 9 characters, but 10 Bytes (because ö requires 2 bytes).
File then outputs testö.jp, which is a string 9 bytes long.
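
The byte/character mismatch can be sketched as follows. This is an illustration only, assuming valid UTF-8 input; `utf8_chars` and `bytes_lost` are hypothetical helpers, not functions from file:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count UTF-8 characters by skipping continuation bytes (10xxxxxx).
 * Assumes the input is valid UTF-8; no validation is attempted.
 */
static size_t
utf8_chars(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	for (; *s != '\0'; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return n;
}

/* Bytes cut off when a character count is used as a byte limit. */
static size_t
bytes_lost(const char *s)
{
	return strlen(s) - utf8_chars(s);
}
```

For a name such as "Testö.jpg" (10 bytes, 9 characters), `bytes_lost` returns 1, which matches the single missing trailing byte in the output above.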
Tags:
Steps To Reproduce: Have a file with one or more characters ≥ U+0080, e.g. ä ö ü Χ Л هل

file -r <filename> will output the filename shortened

file äöü -r
ä�: empty

Additional Information: I noticed different behavior when using shell globbing.

file -r * in a directory with äöü will output äöü correctly, but will shorten Χαίρετε to Χαίρετ , Здравствуйте to Здравс
Attached Files: image.png (3,142 bytes) 2022-06-25 14:43
https://bugs.astron.com/file_download.php?file_id=282&type=bug
Notes
(0003766)
ro-ee   
2022-06-25 18:03   
Thinking about it...
"PR/351: CathyKMeow: octalify unprintable characters in filenames unless raw"

why are characters ≥ U+0080 even considered unprintable?
The change was originally introduced because of some issues with control characters < U+0020 (especially \n), see Bug 351.
(0003767)
dimich   
2022-06-28 03:05   
Bug 351 is closed and I can't comment there, so I'm commenting here.
1) `ls` and `find` replace non-printable characters only if stdout is a tty. There is no character replacement for piped output:
```
$ mkdir a$'\n'b
$ ls | cat
a
b
$ find . | cat
.
./a
b
```
2) Characters above 0x80 aren't non-printable.
(0003768)
ro-ee   
2022-06-28 08:53   
Hm, the changes from bug 351 lead to Midnight commander not properly detecting images with umlauts etc. in the file name. See https://www.midnight-commander.org/ticket/4377
I find it strange that midnight commander would not use piped output, so in theory the change should not even have any effect.

without the -r option, all characters above 0x80 get octalified.

alex@horus:~> file testö.jpg
test\303\266.jp: JPEG image data, JFIF standard 1.01, resolution (DPCM), density 118x118, segment length 16, progressive, precision 8, 1706x1132, components 3
(0003769)
dimich   
2022-06-28 09:15   
> I find it strange that midnight commander would not use piped output, so in theory the change should not even have any effect.
Yep, two overlapping issues together lead to the bug. First, the `file` utility corrupts filenames. Second, `mc` relies on the filename from file's output.
First one can be fixed by checking isatty(STDOUT_FILENO) as other tools do. Second one can be fixed by removing filename from output with --brief option (or even using libmagic directly).
But I still can't understand why all non-ASCII characters are considered non-printable.
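
A rough sketch of the isatty idea, under the assumption that only control bytes should be escaped and only when the output is a terminal. `want_escape` and `format_name` are hypothetical names and this is not how file implements it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical policy: escape only when fd refers to a terminal. */
static int
want_escape(int fd)
{
	return isatty(fd) != 0;
}

/*
 * Copy name into dst. When escape is set, control bytes (< 0x20, 0x7f)
 * become \ooo; bytes >= 0x80 always pass through unchanged.
 * Returns the number of bytes written, excluding the terminator.
 */
static size_t
format_name(char *dst, size_t dstlen, const char *name, int escape)
{
	size_t used = 0;

	assert(dst != NULL && name != NULL && dstlen > 0);
	for (; *name != '\0'; name++) {
		unsigned char c = (unsigned char)*name;
		int ctrl = c < 0x20 || c == 0x7f;
		size_t need = (escape && ctrl) ? 4 : 1;

		if (used + need >= dstlen)
			break;	/* stop rather than overflow dst */
		if (escape && ctrl)
			snprintf(dst + used, need + 1, "\\%03o", (unsigned int)c);
		else
			dst[used] = (char)c;
		used += need;
	}
	dst[used] = '\0';
	return used;
}
```

A caller would pass `want_escape(STDOUT_FILENO)` as the last argument, so piped output keeps the filename bytes intact.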
(0003770)
dimich   
2022-06-28 09:44   
Sorry ro-ee, maybe i misunderstood your previous comment. I know about mc bug and commented there also. I was going to create a ticket for `file` here but you made it first.
The fix for "bug 351" is implemented incorrectly. I'd like to draw CathyKMeow's attention to this, but I can't comment on or reopen ticket 351. This issue affects not only mc but any other software which invokes `file` and reads the filename; it also confuses users.
(0003779)
christos   
2022-07-04 19:45   
Try it now.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
361 [file] General feature always 2022-06-22 22:12 2022-07-04 17:15
Reporter: wof Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: add save file detection of burp
Description: BurpSuite is an HTTP man-in-the-middle proxy used for penetration tests. At the moment save files are not detected.
I've attached my first patch to file.
Tags:
Steps To Reproduce: % file 2022-06-23-test.burp
2022-06-23-test.burp: data
Additional Information:
Attached Files: 2022-06-23-test.burp (524,288 bytes) 2022-06-22 22:12
https://bugs.astron.com/file_download.php?file_id=281&type=bug
burp.magic.patch (858 bytes) 2022-06-22 22:12
https://bugs.astron.com/file_download.php?file_id=280&type=bug
Notes
(0003778)
christos   
2022-07-04 17:15   
committed, thanks


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
360 [file] General major have not tried 2022-06-14 22:21 2022-07-04 17:12
Reporter: kloczek Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: "file -N -f" print dot instead file name
Description: I just found that after the upgrade I'm not able to build any rpm package, because "file -N -f" prints just "." in the first column of the output instead of the file name.
Tags:
Steps To Reproduce: Example

[tkloczko@devel-g2v SPECS]$ find /home/tkloczko/rpmbuild/BUILDROOT/libusb-1.0.26-2.fc35.x86_64/ '!' -path '/home/tkloczko/rpmbuild/BUILDROOT/libusb-1.0.26-2.fc35.x86_64//usr/lib/debug/*.debug' -type f '(' -perm -0100 -or -perm -0010 -or -perm -0001 ')' -print
/home/tkloczko/rpmbuild/BUILDROOT/libusb-1.0.26-2.fc35.x86_64/usr/lib64/libusb-1.0.so.0.3.0
/home/tkloczko/rpmbuild/BUILDROOT/libusb-1.0.26-2.fc35.x86_64/usr/lib64/libusb-1.0.la

[tkloczko@devel-g2v SPECS]$ find /home/tkloczko/rpmbuild/BUILDROOT/libusb-1.0.26-2.fc35.x86_64/ '!' -path '/home/tkloczko/rpmbuild/BUILDROOT/libusb-1.0.26-2.fc35.x86_64//usr/lib/debug/*.debug' -type f '(' -perm -0100 -or -perm -0010 -or -perm -0001 ')' -print |file -N -f -
/: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=5dc726ff611aee897cb801d30ac1927f0434a46e, with debug_info, not stripped, too many notes (256)
/: libtool library file, ASCII text
[tkloczko@devel-g2v SPECS]$ file --version
file-5.42
magic file from /etc/magic:/usr/share/misc/magic

The same command after downgrading to the previous version:

[tkloczko@devel-g2v SPECS]$ find /home/tkloczko/rpmbuild/BUILDROOT/libusb-1.0.26-2.fc35.x86_64/ '!' -path '/home/tkloczko/rpmbuild/BUILDROOT/libusb-1.0.26-2.fc35.x86_64//usr/lib/debug/*.debug' -type f '(' -perm -0100 -or -perm -0010 -or -perm -0001 ')' -print |file -N -f -
/home/tkloczko/rpmbuild/BUILDROOT/libusb-1.0.26-2.fc35.x86_64/usr/lib64/libusb-1.0.so.0.3.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=5dc726ff611aee897cb801d30ac1927f0434a46e, with debug_info, not stripped, too many notes (256)
/home/tkloczko/rpmbuild/BUILDROOT/libusb-1.0.26-2.fc35.x86_64/usr/lib64/libusb-1.0.la: libtool library file, ASCII text
[tkloczko@devel-g2v SPECS]$ file --version
file-5.41
magic file from /etc/magic:/usr/share/misc/magic
Additional Information: I've not changed anything in my build procedure between those two versions except the version itself.
Part of my spec file which builds the file package:

%build
autoreconf -fiv
%configure \
        --disable-libseccomp \
        --disable-rpath \
        --disable-static \
        --enable-fsect-man5 \
        %{nil}
%make_build
Attached Files:
Notes
(0003765)
kloczek   
2022-06-16 22:19   
Looks like Fedora rolled back as well.
https://bugzilla.redhat.com/show_bug.cgi?id=2095871
(0003777)
christos   
2022-07-04 17:12   
Dup of PR/358


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
358 [file] General major always 2022-06-12 21:48 2022-07-04 17:01
Reporter: jpalus Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: filenames in output limited to single char when reading files from stdin (5.42 regression)
Description: When reading files from stdin file 5.42 cuts filename in output to single char (see "c" instead of whole "configure.ac"):
```
$ echo configure.ac|./src/file -m magic/magic.mgc -f -
c: M4 macro processor script, ASCII text
```
Appears to be regression introduced by:
https://github.com/file/file/commit/f448f3e5c37de8c285ac14b032b2bdcea82fc08b

where `inname` is now passed to `file_printable` before printing; however, the last parameter `wid`, which should be the length of `inname`, is always `1` for input from stdin.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: file-5.42-fix-size-of-lines-read-from-stdin.patch (689 bytes) 2022-06-14 12:56
https://bugs.astron.com/file_download.php?file_id=279&type=bug
Notes
(0003764)
bero   
2022-06-14 12:56   
The problem is that stdin can't be rewound, therefore the check for the longest filename size doesn't work. The code is aware of this, but "fixes" it by hardcoding 1.
The attached patch should fix it correctly.
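
One possible shape for a fix, shown only as a sketch and not necessarily what the attached patch does: measure each line as it is read, so no rewind is needed. `chomp_len` is a hypothetical helper:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Strip the trailing newline from a line read from stdin and return the
 * length of what remains, so the caller can pass the real name length
 * instead of a hardcoded 1.
 */
static size_t
chomp_len(char *line)
{
	size_t len;

	assert(line != NULL);
	len = strcspn(line, "\n");
	line[len] = '\0';
	return len;
}
```

For the input line "configure.ac\n" this yields 12, the full length of the name.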
(0003776)
christos   
2022-07-04 17:01   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
357 [file] General feature N/A 2022-06-12 05:32 2022-07-04 16:40
Reporter: a@gitadora.top Platform: aarch64  
Assigned To: christos OS: Debian  
Priority: normal OS Version: 11  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: Please add EROFS loop file detection
Description: Please add EROFS loop file detection; it's a relatively new file system format.
Tags: magic
Steps To Reproduce: 1. Get an EROFS file (e.g. system_a.bin)
2. Run file on it
$ file system_a.bin
system_a.bin: data
3. Test-mount it
$ sudo mount ./system_a.bin /media
$sudo df -Th /media
Filesystem Type Size Used Avail Use% Mounted on
/dev/loop0 erofs 3.7G 3.7G 0 100% /media
Additional Information:
Attached Files:
Notes
(0003762)
polluks   
2022-06-12 18:42   
#define EROFS_SUPER_MAGIC_V1 0xE0F5E1E2
https://kernel.googlesource.com/pub/scm/linux/kernel/git/xiang/erofs-utils/+/refs/heads/experimental/include/erofs_fs.h#12
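
A hedged sketch of how that magic could be checked in C. The magic value is the one quoted above; the superblock offset of 1024 bytes and the little-endian byte order are assumptions not stated in this report, and `has_erofs_magic` is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EROFS_SUPER_MAGIC_V1	0xE0F5E1E2u
#define EROFS_SUPER_OFFSET	1024	/* assumption: superblock offset */

/* Return 1 if img holds the EROFS magic at the assumed offset. */
static int
has_erofs_magic(const unsigned char *img, size_t len)
{
	const unsigned char *p;
	uint32_t magic;

	assert(img != NULL);
	if (len < EROFS_SUPER_OFFSET + 4)
		return 0;
	p = img + EROFS_SUPER_OFFSET;
	/* assumption: the magic is stored little-endian on disk */
	magic = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
	return magic == EROFS_SUPER_MAGIC_V1;
}
```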
(0003775)
christos   
2022-07-04 16:40   
Added!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
356 [file] General minor always 2022-06-11 11:42 2022-07-04 16:18
Reporter: davewhite Platform: Linux  
Assigned To: christos OS: Ubuntu  
Priority: normal OS Version: 20.04  
Status: resolved Product Version: 5.42  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: JSON parsing incorrectly accepts misspellings for true/false/null in json_parse_const()
Description: A JSON file containing the structure
{"test":true}
is detected as 'JSON data'. The text
{"test":txxx}
is also considered valid JSON.

During parsing, when detecting 't' (in file is_json.c:374) the json_parse_const() is called with the value "true".

json_parse_const() verifies the text found matches the expected constant, but does nothing with the
result of the test, and always returns true as long as the first letter matches (t for true, f for false or n for null)
and the word found was the correct length.

Hence
{"test":nxx} is invalid, while
{"test":nxxx} is valid json.
Tags: json
Steps To Reproduce: $ echo '{"test":txxx}' > file.json
$ file file.json
file.json: JSON data
Additional Information: Issue exists when built from latest source.
Attached Files:
Notes
(0003761)
davewhite   
2022-06-11 11:45   
The following patch resolves the issue:

diff --git src/is_json.c src/is_json.c
index 86def31..8053d4f 100644
--- src/is_json.c
+++ src/is_json.c
@@ -327,6 +327,7 @@ json_parse_const(const unsigned char **ucp, const unsigned char *ue,
    for (len--; uc < ue && --len;) {
        if (*uc++ == *++str)
            continue;
+       break;
    }
    if (len)
        DPRINTF("Bad const: ", uc, *ucp);
(0003774)
christos   
2022-07-04 16:18   
Fixed, thanks!
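
A simplified Python model of the behavior described in this issue may help. It is not a translation of `json_parse_const()`; both function names are hypothetical, and the model only looks at the candidate word itself.

```python
def accepts_const_buggy(word: bytes, const: bytes) -> bool:
    """Model of the reported bug: only the first letter and the length matter."""
    if len(word) != len(const) or word[:1] != const[:1]:
        return False
    for got, want in zip(word[1:], const[1:]):
        if got != want:
            pass  # the mismatch is noticed but never acted on
    return True


def accepts_const_fixed(word: bytes, const: bytes) -> bool:
    """Model of the fix: any mismatching byte rejects the word."""
    return word == const
```

Under this model `txxx` passes the buggy check against `true`, while `nxx` fails against `null` because the length differs, which matches the examples in the description.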


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
351 [file] General feature always 2022-05-27 23:50 2022-06-28 04:03
Reporter: CathyKMeow Platform: GNU/Linux  
Assigned To: christos OS: Arch Linux ARM  
Priority: none OS Version: Rolling  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Escape "special" characters before outputting
Description: `file` does not escape "special" characters in file name before outputting. This is vulnerable to Trojan Source attacks.

(See https://trojansource.codes)

Example:
An attacker can make an executable binary file containing malicious code look like a non-executable ASCII text file, so the user might try to open it in the GUI by double-clicking on it, which executes the file instead.

Expected behavior:
```
user@localhost:~$ mkdir $'a\nb'
user@localhost:~$ file $'a\nb'
'a'$'\n''b': directory
```

What I see instead:
```
user@localhost:~$ mkdir $'a\nb'
user@localhost:~$ file $'a\nb'
a
b: directory
```
Tags:
Steps To Reproduce: ```
$ mkdir $'a\nb'
$ file $'a\nb'
```
Additional Information: ```
user@localhost:~/file_bug_test$ mkdir $'a\nb'
mkdir: cannot create directory 'a\nb': File exists
user@localhost:~/file_bug_test$ ls
'a'$'\n''b'
user@localhost:~/file_bug_test$ find .
.
./a?b
user@localhost:~/file_bug_test$ tar -cf file_bug_test.tar *
user@localhost:~/file_bug_test$ tar --list -f file_bug_test.tar
a\nb/
user@localhost:~/file_bug_test$
```
Attached Files:
Notes
(0003753)
christos   
2022-05-28 01:06   
Fixed, thanks!
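
As a rough illustration of the kind of escaping requested here, the sketch below replaces every byte outside printable ASCII with a backslash followed by three octal digits. The output format is an assumption (the report does not show what the fixed `file` prints), and `printable_name` is a hypothetical helper.

```python
def printable_name(name: str) -> str:
    """Escape every byte outside printable ASCII as a three-digit octal sequence."""
    out = []
    # surrogateescape lets undecodable bytes from the file system round-trip.
    for byte in name.encode("utf-8", "surrogateescape"):
        if 0x20 <= byte < 0x7F:
            out.append(chr(byte))
        else:
            out.append("\\{:03o}".format(byte))
    return "".join(out)


print(printable_name("a\nb") + ": directory")
```

With this scheme the name from the report is printed on a single line, so a newline in a file name can no longer forge a second line of output. A literal backslash is left unescaped in this sketch, which keeps ordinary names readable at the cost of some ambiguity.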


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
354 [file] General minor always 2022-06-06 21:37 2022-06-10 14:14
Reporter: vinc17 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.43  
    Target Version:  
Summary: Fails to detect JSON in case of empty array with spaces
Description: When a JSON file has an empty array with spaces between the brackets, it is misdetected as a text file.
Tags:
Steps To Reproduce: $ echo '{"a":[ ]}' | file -
/dev/stdin: ASCII text
Additional Information: This is due to a missing call to json_skip_space in function json_parse_array of src/is_json.c (see json_parse_object as a comparison). I've attached a patch with a testcase.
Attached Files: file-json-array.patch (1,189 bytes) 2022-06-06 21:37
https://bugs.astron.com/file_download.php?file_id=277&type=bug
Notes
(0003760)
christos   
2022-06-10 14:14   
Committed, many thanks (and for providing the test case!).
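
The effect of the missing whitespace skip can be shown with a small stand-alone Python parser for arrays of unsigned integers. It is a sketch of the technique and not the code from `src/is_json.c`; the `skip_leading_space` flag exists only to reproduce the behavior before and after the fix.

```python
JSON_WS = b" \t\r\n"


def skip_space(buf: bytes, i: int) -> int:
    """Advance i past JSON whitespace."""
    while i < len(buf) and buf[i] in JSON_WS:
        i += 1
    return i


def parse_array(buf: bytes, skip_leading_space: bool = True) -> bool:
    """Return True if buf starts with a JSON array of unsigned integers."""
    if not buf.startswith(b"["):
        return False
    i = 1
    if skip_leading_space:
        i = skip_space(buf, i)  # the step the report says was missing
    if buf[i:i + 1] == b"]":
        return True  # empty array
    while i < len(buf):
        i = skip_space(buf, i)
        start = i
        while buf[i:i + 1].isdigit():
            i += 1
        if i == start:
            return False  # expected a number here
        i = skip_space(buf, i)
        if buf[i:i + 1] == b"]":
            return True
        if buf[i:i + 1] != b",":
            return False
        i += 1
    return False
```

Calling `parse_array(b"[ ]", skip_leading_space=False)` rejects the input because the closing bracket is reached where an element is expected, while the default call accepts it.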


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
355 [file] General minor always 2022-06-08 18:35 2022-06-10 14:10
Reporter: hridoy31 Platform: Linux  
Assigned To: christos OS: Debian  
Priority: normal OS Version: 11.3  
Status: feedback Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: multiple definition of `handle_interrupt'
Description: When compiling tcsh according to the BUILDING file, at step 8, after running make, make reports an error about a multiple definition of `handle_interrupt', with the following line:
/usr/bin/ld: tc.sig.o:/home/hridoy/tcsh/tc.sig.c:59: multiple definition of `handle_interrupt'; sh.o:/home/hridoy/tcsh/sh.h:569: first defined here
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: Screenshot from 2022-06-09 00-34-26.png (31,149 bytes) 2022-06-08 18:35
https://bugs.astron.com/file_download.php?file_id=278&type=bug
Notes
(0003758)
christos   
2022-06-10 14:10   
Should be fixed in HEAD.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
352 [tcsh] General minor always 2022-05-30 05:56 2022-06-05 10:01
Reporter: zjs Platform: AMD64  
Assigned To: OS: FreeBSD  
Priority: normal OS Version: 13.1-RELEASE  
Status: new Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: tcsh does not display emoji
Description: Hi there,

I am running FreeBSD 13.1-RELEASE. I have a Chinese input method that can input emoji. The problem is that when I input an emoji, for example 🥰, after the tcsh prompt, it is displayed as:

```
zjs@freebsd:~ % \U+1F970
🥰: Command not found.
```

tcsh can display some emoji, for example: ❤️️.

The version of the tcsh is:
tcsh 6.22.04 (Astron) 2021-04-26

I have run into the same problem on Debian with tcsh version 6.21.00 (Astron) 2019-05-08.

Best,
Jinsong
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003757)
polluks   
2022-06-05 10:01   
Indeed, Unicode's SMP support is missing.
Here comes a quick macOS check:
bash ok
dash ok
fish ok
ksh ok
tcsh fails
zsh fails
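
To make the SMP remark concrete, the short Python sketch below compares the two characters from the report. `describe_code_point` is a hypothetical helper, and the heart is taken as the bare code point U+2764 without variation selectors.

```python
def describe_code_point(ch: str) -> dict:
    """Summarize where a single code point lives and how it is encoded."""
    cp = ord(ch)
    return {
        "code_point": f"U+{cp:04X}",
        "plane": cp >> 16,  # 0 = BMP, 1 = SMP
        "utf8_bytes": len(ch.encode("utf-8")),
        "utf16_units": len(ch.encode("utf-16-le")) // 2,
    }


smiling_face = describe_code_point("\U0001F970")  # the emoji that fails
heart = describe_code_point("\u2764")             # the emoji that works
print(smiling_face)
print(heart)
```

The failing character sits in plane 1 and needs four UTF-8 bytes or a UTF-16 surrogate pair, whereas the working one fits in the BMP.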


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
345 [file] General minor always 2022-05-12 18:20 2022-06-01 12:05
Reporter: Almalixia Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: feedback Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: mime_content_type result gets duplicated for xlsx
Description: When calling mime_content_type() on a file of type xlsx, it's returning 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheetapplication/vnd.openxmlformats-officedocument.spreadsheetml.sheet'.
Tags:
Steps To Reproduce: echo mime_content_type('file_name.xlsx');
Additional Information:
Attached Files:
Notes
(0003745)
christos   
2022-05-21 22:30   
I don't maintain the php bindings for libmagic; I just tried it on my machine and it works:
[6:09pm] 370>php
<?php
echo mime_content_type('foo.xlsx') . "\n";
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

[6:10pm] 371>php -v
PHP 7.4.27 (cli) (built: Apr 25 2022 13:02:57) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
(0003749)
jeremysawesome   
2022-05-26 22:44   
Performed similar steps on my machine:
```
[jdev@dev-01 ~]$ php -r 'echo mime_content_type("Foo.xlsx")."\n";'
application/vnd.openxmlformats-officedocument.spreadsheetml.sheetapplication/vnd.openxmlformats-officedocument.spreadsheetml.sheet
[jdev@dev-01 ~]$ php -v
PHP 7.4.28 (cli) (built: Feb 15 2022 13:23:10) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
    with Zend OPcache v7.4.28, Copyright (c), by Zend Technologies
    with Xdebug v2.9.6, Copyright (c) 2002-2020, by Derick Rethans
```

@christos - is this the expected output? And, is this the correct place to report these errors? If this is not the correct place to report these errors, where would the correct place be?

Thanks!
(0003756)
christos   
2022-06-01 12:05   
I think that the PHP bug tracker would be a more appropriate place.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
348 [file] General minor always 2022-05-24 06:14 2022-05-31 18:54
Reporter: frokaikan Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Can the "-m" parameter take anything as its value?
Description: `./file -m ./bad_magic ./file`
The file `bad_magic` was attached.
Then the program throws SIGABRT.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: bad_magic (1,378 bytes) 2022-05-24 06:14
https://bugs.astron.com/file_download.php?file_id=276&type=bug
Notes
(0003755)
christos   
2022-05-31 18:54   
Add missing switch cases.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
349 [file] General minor always 2022-05-24 06:24 2022-05-31 18:41
Reporter: Farknay Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Any text file content starting with PR is detected as RAGE Package Format (RPF)
Description: Since the addition of GTA file formats in the games magic file, any text file starting with the string PR is detected as RAGE Package Format (RPF).

I've only been able to test this in Git Bash on windows, all the Linux boxes I work on are using older versions of file, and I'm not allowed to upgrade them.

Tags:
Steps To Reproduce: echo 'PROMPT Hello' > sample.sql

file sample.sql
sample: RAGE Package Format (RPF),

Additional Information:
Attached Files:
Notes
(0003754)
christos   
2022-05-31 18:41   
Made stronger.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
343 [file] General feature have not tried 2022-05-09 20:56 2022-05-21 22:51
Reporter: jstein Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: btrfs send image (new file format)
Description: btrfs send can dump a filesystem in a structured file system image file. This file can be imported by btrfs receive.
The dump starts with
btrfs-stream

Tags:
Steps To Reproduce:
Additional Information: see also
https://btrfs.wiki.kernel.org/index.php/Design_notes_on_Send/Receive
Attached Files:
Notes
(0003747)
christos   
2022-05-21 22:51   
Added, thanks!
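
A minimal Python sketch of the test this request implies, assuming only what the description states (the dump starts with the string `btrfs-stream`); the helper name is hypothetical and nothing beyond that prefix is checked.

```python
BTRFS_SEND_MAGIC = b"btrfs-stream"  # prefix quoted in the description


def looks_like_btrfs_send_stream(data: bytes) -> bool:
    """Return True if the buffer begins with the btrfs send stream prefix."""
    return data.startswith(BTRFS_SEND_MAGIC)
```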


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
344 [file] General major always 2022-05-11 06:13 2022-05-21 22:47
Reporter: rven Platform:  
Assigned To: christos OS:  
Priority: high OS Version: Ubuntu 20.04.4 L  
Status: feedback Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: image/svg+xml not correctly guessed from buffer
Description: When an SVG is parsed with the from_buffer method, an incorrect MIME type is returned if the <?xml version='1.0' encoding='UTF-8' ?> declaration precedes the <svg> element.
Tags:
Steps To Reproduce: import magic
a = b"<svg height='180' width='180' xmlns='http://www.w3.org/2000/svg' xmlns:xlink='http://www.w3.org/1999/xlink'><rect fill='hsl(349, 60%, 45%)' height='180' width='180'/><text fill='#ffffff' font-size='96' text-anchor='middle' x='90' y='125' font-family='sans-serif'>M</text></svg>"
magic.from_buffer(a, mime=True)

=> 'image/svg+xml'

import magic
a = b"<?xml version='1.0' encoding='UTF-8' ?><svg height='180' width='180' xmlns='http://www.w3.org/2000/svg' xmlns:xlink='http://www.w3.org/1999/xlink'><rect fill='hsl(349, 60%, 45%)' height='180' width='180'/><text fill='#ffffff' font-size='96' text-anchor='middle' x='90' y='125' font-family='sans-serif'>M</text></svg>"
magic.from_buffer(a, mime=True)

=> 'text/xml'
Additional Information:
Attached Files:
Notes
(0003746)
christos   
2022-05-21 22:47   
should be fixed in HEAD.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
346 [file] General minor always 2022-05-13 12:09 2022-05-20 11:12
Reporter: jukuisma Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: feedback Product Version: 5.41  
Product Build: Resolution: reopened  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Incorrect video/dv mimetype
Description: `file` identifies DV files correctly, but returns `application/octet-stream` as the mimetype. No mimetype has been defined for DV files: https://github.com/file/file/blob/22209154702032e9b7f2e96eb7eab174f8e87af9/magic/Magdir/animation#L944.
Tags:
Steps To Reproduce: $ wget https://github.com/Digital-Preservation-Finland/file-scraper/raw/c9facae6df774544e4ef8f7a039a926796ef57b8/tests/data/video_dv/valid__pal_lossy.dv
$ file valid__pal_lossy.dv
$ file --mime-type valid__pal_lossy.dv
Additional Information: https://www.iana.org/assignments/media-types/video/DV
Attached Files:
Notes
(0003743)
christos   
2022-05-14 22:06   
Fixed, thanks
(0003744)
jukuisma   
2022-05-20 11:12   
Shouldn't this be "video/DV" or "video/dv" instead of "video/x-dv"? MIME type "video/DV" is registered in IANA, see the RFC of the registration:

https://www.rfc-editor.org/rfc/rfc6469.html

As we understand, MIME types with "x-" prefix should be avoided:

https://www.rfc-editor.org/rfc/rfc6838.html#section-3.4 (last paragraph of section 3.4)

which refers to:

https://www.rfc-editor.org/rfc/rfc6648.html


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
347 [file] General feature N/A 2022-05-14 18:04 2022-05-14 20:36
Reporter: GerbilSoft Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Detect Godot textures; improvements for NGPC, Mega Drive, others
Description: The attached patches add the following:
* Detect Godot STEX textures from Godot 3 and Godot 4. This includes image size, codec, and rescale value (if applicable).
* Neo Geo Pocket Color: Print the NEOPxxxx serial number.
* riff: Print calling metadata from RecorderGear TR500 call recordings. The metadata indicates if it was an incoming or outgoing call, and the dialed/received phone number.
* DDS: Print DXGI formats.
* Mega Drive: Improve system type detection; add more variants for Sega Pico, including a few that don't start with "SEGA".
* c64: Expand CBM cartridge image detection for VICE 3.0, which now includes C128, CBM-II, VIC-20, and Plus/4.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: file.2022-05-14.ngpc.TR500.DXGI.godot.c64-cart.tar.gz (9,486 bytes) 2022-05-14 18:04
https://bugs.astron.com/file_download.php?file_id=275&type=bug
Notes
(0003742)
christos   
2022-05-14 20:36   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
341 [file] General minor have not tried 2022-04-23 11:00 2022-04-25 17:34
Reporter: blacktav Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: feedback Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: "file -z" breaks on zipped files with "Bad system call"
Description: When testing a zip archive, "file" breaks reporting "Bad system call"

Testing a dump from Google Photos
$ file Photos-001.zip
Photos-001.zip: Zip archive data, at least v2.0 to extract, compression method=deflate

$ file -z Photos-001.zip
Photos-001.zip: Bad system call

Testing a zip archive
$ file mc-test-pass.zip
mc-test-pass.zip: Zip archive data, at least v1.0 to extract, compression method=store

$ file -z mc-test-pass.zip
mc-test-pass.zip: Bad system call

OS is ArchLinux
Tags:
Steps To Reproduce: 1. download a bundle from Google Photos
2. test download with "file -z <filename>"

or

1. create an archive using zip (Zip 3.0 (July 5th 2008), by Info-ZIP)
2. test download with "file -z <filename>"
Additional Information: $ file --version
file-5.41
magic file from /usr/share/file/misc/magic
seccomp support included
Attached Files: mc-test-pass.zip (16,836 bytes) 2022-04-23 11:00
https://bugs.astron.com/file_download.php?file_id=274&type=bug
Notes
(0003736)
blacktav   
2022-04-23 11:10   
Sorry, inappropriate report
Solution being to use -S switch as in "file -S -z <filename>"

Maybe the error response could be more useful though
(0003741)
christos   
2022-04-25 17:34   
Yes, we could install a bad system call handler, but it is ugly. I prefer to leave it as is.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
340 [file] General minor always 2022-04-12 20:47 2022-04-25 17:33
Reporter: ESultanik Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: The ASF_JFIF_Media GUID definition is missing two hex digits
Description: Line 91 of `magic/Magdir/asf` contains this GUID: `B61BE100-5B4E-11CF-A8FD-00805F5C44`. That GUID is missing its last two hex digits (one byte). I believe it should actually be `B61BE100-5B4E-11CF-A8FD-00805F5C442B`.

https://github.com/file/file/blob/961e193e4519d40983322ed853cea6511d4b6494/magic/Magdir/asf#L91
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003740)
christos   
2022-04-25 17:33   
Added, thanks!
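
A truncated GUID like this one can be caught mechanically. The sketch below uses Python's standard `uuid` module, which requires exactly 32 hexadecimal digits; `is_well_formed_guid` is a hypothetical helper and is not part of the `file` build.

```python
import uuid


def is_well_formed_guid(text: str) -> bool:
    """Return True if text parses as a 128-bit GUID (32 hex digits)."""
    try:
        uuid.UUID(text)
    except ValueError:
        return False
    return True


for guid in ("B61BE100-5B4E-11CF-A8FD-00805F5C44",
             "B61BE100-5B4E-11CF-A8FD-00805F5C442B"):
    print(guid, is_well_formed_guid(guid))
```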


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
339 [file] General feature always 2022-04-10 14:53 2022-04-25 17:31
Reporter: jmaynard Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Add more Hercules DASD image types
Description: The attached .magic file can be used to replace lines 1696-1711 of magic/Magdir/images . It adds recognition of CKD64/CCKD64 DASD, and for compressed DASD, it will report the number of cylinders on the volume and the compression algorithm.
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: .magic (1,942 bytes) 2022-04-10 14:53
https://bugs.astron.com/file_download.php?file_id=273&type=bug
Notes
(0003739)
christos   
2022-04-25 17:31   
Replaced, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
338 [file] General feature N/A 2022-04-09 09:15 2022-04-25 17:28
Reporter: polluks Platform: MacBook Pro  
Assigned To: christos OS: macOS  
Priority: normal OS Version: 12.3  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Added more Oric
Description: diff --git a/magic/Magdir/oric b/magic/Magdir/oric
index 79e264ea..678ba770 100644
--- a/magic/Magdir/oric
+++ b/magic/Magdir/oric
@@ -5,8 +5,12 @@
 # From: Stefan A. Haubenthal <polluks@sdf.lonestar.org>
 # References:
 # http://fileformats.archiveteam.org/wiki/TAP_(Oric)
+# http://fileformats.archiveteam.org/wiki/DSK_(Oric)
 0 string \x16\x16\x16\x24 Oric tape,
 >6 byte =0x00 BASIC,
 >6 byte =0x80 memory block,
 >7 byte >0x00 autorun,
 >13 string x "%.15s"
+
+0 string ORICDISK Oric Image
+0 string MFM_DISK Oric Image
Tags: magic
Steps To Reproduce:
Additional Information:
System Description Apple M1
Attached Files:
Notes
(0003738)
christos   
2022-04-25 17:28   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
342 [file] General minor always 2022-04-25 06:34 2022-04-25 06:34
Reporter: jayvdb Platform:  
Assigned To: OS:  
Priority: normal OS Version:  
Status: new Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: jar files with POSIX shell script header do not mention they are JAR files
Description: https://github.com/pinterest/ktlint/releases/download/0.45.2/ktlint is an example of a JAR file with a POSIX shell script header, which looks like

```
#!/bin/sh

JV=$(java -version 2>&1 | head -1 | cut -d'"' -f2 | sed '/^1\./s///' | cut -d'.' -f1)

X=$( [ "$JV" -ge "16" ] && echo "--add-opens java.base/java.lang=ALL-UNNAMED" || echo "")

exec java $X -Xmx512m -jar "$0" "$@"

PK...
```

The java executable can run it as a jar file directly. i.e. the following prints the help on all platforms

java -jar /path/to/ktlint --help

The file command says it is "POSIX shell script executable (binary data)"

When I manually remove the script header, file then reports it as "Zip archive data, at least v1.0 to extract, compression method=deflate"

It would be great if it could mention that it is a JAR or ZIP file, perhaps like

"POSIX shell script executable (JAR ..)" or "POSIX shell script executable (Zip archive data, ...)"
Tags:
Steps To Reproduce: 1. Download https://github.com/pinterest/ktlint/releases/download/0.45.2/ktlint
2. `file ktlint`
Additional Information:
Attached Files:
There are no notes attached to this issue.
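
One way to see that such a file is both a script and an archive is sketched below in Python. ZIP readers locate the archive from its end-of-central-directory record, so data prepended to the archive does not prevent detection. The description strings returned here are hypothetical and are not `file` output.

```python
import io
import zipfile


def build_shell_prefixed_zip() -> bytes:
    """Build a tiny ZIP in memory and prepend a shell script header to it."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    header = b'#!/bin/sh\nexec java -jar "$0" "$@"\n'
    return header + buf.getvalue()


def describe(data: bytes) -> str:
    """Report the script header and, if present, the embedded archive."""
    is_script = data.startswith(b"#!")
    has_zip = zipfile.is_zipfile(io.BytesIO(data))
    if is_script and has_zip:
        return "shell script (Zip archive data)"
    if is_script:
        return "shell script"
    return "Zip archive data" if has_zip else "data"


print(describe(build_shell_prefixed_zip()))
```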


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
282 [tcsh] General minor always 2021-08-15 13:51 2022-04-23 12:21
Reporter: kato Platform: GNU/Linux x86_64  
Assigned To: christos OS: Open SuSE Leap  
Priority: normal OS Version: 15.3  
Status: confirmed Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: tcsh 6.20.00 : shell variable "anyerror" does not work as described in the tcsh man page
Description: If "anyerror" is set, the exit status of a non-simple command should be non-zero if any subcommand fails. However, this does not hold in all cases.
Tags:
Steps To Reproduce: /home/test> tcsh --version
tcsh 6.20.00 (Astron) 2016-11-24 (x86_64-unknown-linux) options wide,nls,lf,dl,al,kan,sm,color,filec
/home/test> cat non-existent-file|cat
cat: non-existent-file: No such file or directory
Exit 1
/home/test> set variable=`cat non-existent-file`
cat: non-existent-file: No such file or directory
Exit 1
/home/test> set variable=`cat non-existent-file|cat`
cat: non-existent-file: No such file or directory
/home/test> echo $?
0
Additional Information:
Attached Files:
Notes
(0003676)
christos   
2021-11-14 17:35   
set x=`cat /does/not/exist`
should set status to 1 and does not.
(0003737)
kato   
2022-04-23 12:21   
The bug persists with tcsh 6.24.00 with Open SuSE Leap 15.3:

/home/test> echo $version
tcsh 6.24.00 (Astron) 2022-02-02 (x86_64-suse-linux-suse-linux) options wide,nls,lf,dl,al,kan,sm,color,filec
/home/test> set anyerror
/home/test> set x=`cat /does/not/exist`
cat: /does/not/exist: No such file or directory
/home/test> echo $status
1
/home/test> set x=`cat /does/not/exist|less`
cat: /does/not/exist: No such file or directory
/home/test> echo $status
0


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
186 [file] General minor always 2020-08-24 02:14 2022-04-13 07:16
Reporter: joveler Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: confirmed Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: Korean text file misidentified as 'COM executable for DOS'
Description: [Summary]
Some Korean text files encoded as EUC-KR (aka CP949 on Windows) are misidentified as 'COM executable for DOS'.
Part of the COM signatures should be disabled to fix it.

[Technical Detail]
EUC-KR encodes 4% of Korean characters as 'B8xx' ('륫/B8A0' ~ '뫼/B8FE').
In libmagic, the simplest COM signature only checks for 0xB8 at offset 0.
As a result, libmagic produces false positives on EUC-KR text that starts with one of these Korean characters.

Windows Notepad (prior to Windows 10 19H1) used the ANSI encoding by default.
This means almost every text file produced on Korean Windows is encoded as EUC-KR.
It is therefore a critical issue for Korean text files, as many of them are misidentified as executables.

[Fix]
To reduce the negative impact, I propose to disable the simplest COM file signature.
I have attached the diff file.

Tags:
Steps To Reproduce: Run file command with attached euckr_falsepositive.txt.

$ file euckr_falsepositive.txt
euckr_falsepositive.txt: COM executable for DOS

$ file euckr_falsepositive.txt --mime-type
euckr_falsepositive.txt: application/x-dosexec
Additional Information:
Attached Files: 0001-Disable-simplest-COM-signature-to-avoid-FP.patch (1,869 bytes) 2020-08-24 02:14
https://bugs.astron.com/file_download.php?file_id=155&type=bug
euckr_falsepositive.txt (293 bytes) 2020-08-24 02:14
https://bugs.astron.com/file_download.php?file_id=154&type=bug
Notes
(0003482)
christos   
2020-09-06 15:14   
Patched, thanks!
(0003648)
christos   
2021-10-12 18:24   
Will revert for now and revisit. Breaks too many com executables. Perhaps we can limit it on what follows b8?
(0003734)
joveler   
2022-04-13 06:54   
> Perhaps we can limit it on what follows b8?

I have tried, but it is impossible.

In 8086 machine code, 0xB8 is the 'MOV AX, imm16' opcode.
Since the immediate is an arbitrary two-byte value (stored little-endian), we cannot constrain the bytes that follow.
- B8 0A 16 -> MOV AX, 0x160A
- B8 40 00 -> MOV AX, 0x0040
(0003735)
joveler   
2022-04-13 07:16   
Every Extended Unix Code charset, such as EUC-JP, shares the same address space as EUC-KR. (Bytes of 0xA0-0xFF range, except 0x80)
Keeping 0xB8 COM signature may also cause problems in every EUC charset.

One idea is to use text/binary detection on buffers, since the EUC charsets try to avoid ASCII control characters.
I do not know how libmagic's text detection works yet; doesn't it involve code patching?
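
The collision discussed in this issue can be reproduced with a few lines of Python. The sketch models the "simplest COM signature" as a one-byte test at offset 0, as described above; both helper names are hypothetical, and the byte pair `B8 FE` is the encoding the report gives for '뫼'.

```python
import struct


def naive_com_match(buf: bytes) -> bool:
    """Model of the simplest COM signature: byte 0xB8 at offset 0."""
    return buf[:1] == b"\xb8"


def decode_mov_ax(buf: bytes) -> int:
    """Decode 'MOV AX, imm16' and return the little-endian immediate."""
    opcode, imm16 = struct.unpack_from("<BH", buf)
    if opcode != 0xB8:
        raise ValueError("not a MOV AX, imm16 instruction")
    return imm16


korean_text = b"\xb8\xfe"           # bytes the report lists for one EUC-KR character
com_code = bytes.fromhex("b80a16")  # MOV AX, 0x160A

print(naive_com_match(korean_text), naive_com_match(com_code))
print(hex(decode_mov_ax(com_code)))
```

Both buffers satisfy the one-byte test, and because the immediate may hold any 16-bit value, the two bytes after 0xB8 cannot be used to tell them apart.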


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
335 [file] General feature N/A 2022-04-01 15:09 2022-04-09 09:13
Reporter: polluks Platform: MacBook Pro  
Assigned To: christos OS: macOS  
Priority: normal OS Version: 12.3  
Status: assigned Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Added mib
Description: magic for Management Information Base
Tags: magic
Steps To Reproduce:
Additional Information:
System Description Apple M1
Attached Files: mib (213 bytes) 2022-04-01 15:09
https://bugs.astron.com/file_download.php?file_id=271&type=bug
Notes
(0003729)
christos   
2022-04-04 16:14   
I have a feeling this will end up with too many false positives.
(0003732)
polluks   
2022-04-08 13:49   
Indeed, it's a bit weak magic.
See also https://datatracker.ietf.org/doc/html/rfc1213#section-6
(0003733)
polluks   
2022-04-09 09:13   
However, this assignment operator is pretty unique.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
332 [file] General minor always 2022-03-22 09:49 2022-04-04 17:48
Reporter: vinc17 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: misdetects "[number] text" files as JSON data
Description: Some non-JSON text files have a form like "[number] text". For instance, Xorg.0.log log files start with something like

[ 48.187]
X.Org X Server 1.21.1.3

where "[number]" is a timestamp. Because "[number]" looks like a JSON object, "file" detects such files as JSON data, even though the object is followed by garbage when interpreted as JSON.
Tags:
Steps To Reproduce: $ echo "[1] foo" | file -
/dev/stdin: JSON data
Additional Information: I don't know whether this is related, but I can see in the src/is_json.c source:

/*
 * if JSON_COUNT != 0:
 * count all the objects, require that we have the whole data file
 * otherwise:
 * stop if we find an object or an array
 */
[...]
#if JSON_COUNT
        /* bail quickly if not counting */
        if (lvl > 1 && (st[JSON_OBJECT] || st[JSON_ARRAYN]))
                return 1;
#endif

The "#if JSON_COUNT" and "bail quickly if not counting" comment seem contradictory (if JSON_COUNT != 0, then it is counting), so I'm wondering what is expected.
Attached Files: PR332.patch (1,061 bytes) 2022-03-25 09:30
https://bugs.astron.com/file_download.php?file_id=269&type=bug
Notes
(0003721)
vinc17   
2022-03-22 09:59   
See also commit 479e0995523c42b83a055781d27a0c651dc286e2, whose intent was to fix PR/69 (the same bug I had reported in the past).
(0003722)
wgh   
2022-03-25 03:51   
PR/165 reported that some JSON files were recognized as ASCII text, so the conditions for JSON recognition were relaxed, which resulted in some files being mistakenly recognized as JSON again. I think that is the reason.
(0003723)
vinc17   
2022-03-25 08:58   
Note that in PR/165, all the examples consisted of one JSON object, with no "garbage" following it. If the rules are relaxed to allow very simple objects like some of the PR/165 examples, then garbage detection becomes important to avoid many false positives. Anyway, I suppose that the fix for PR/69 was wrong: the solution was not to discard simple JSON objects; instead, it should have detected garbage (i.e. any non-whitespace character) after a JSON object has been parsed. Examples with json_pp:

$ echo '[]' | json_pp
[]
$ echo '[] ' | json_pp
[]
$ echo '[] 1' | json_pp
garbage after JSON object, at character offset 4 (before "\n") at /usr/bin/json_pp line 59.
(0003724)
vinc17   
2022-03-25 09:16   
I'm going to provide a very simple patch, with testcases.
(0003725)
vinc17   
2022-03-25 09:30   
In json_parse for the end of the recursion (lvl == 0), return 0 (failure) if the end of the file has not been reached (whitespace has been skipped just before).

Two testcases are provided:
1. A simple JSON array followed by whitespace (a newline character), which should be recognized as JSON data.
2. Ditto followed by a non-whitespace character (a digit); this is not a valid JSON file, thus should be recognized as ASCII text.
(0003726)
vinc17   
2022-03-25 11:01   
FYI, I've also reported the bug in the Debian BTS and put a simplified patch there (no testcases, 2 lines of context removed) so that it can also be applied on the current Debian package: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1008247
(0003731)
christos   
2022-04-04 17:48   
Committed, thanks!
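
The rule proposed in this thread (accept a JSON value only if nothing but whitespace follows it) can be sketched in Python with the standard `json` module. This illustrates the idea only, not the patch applied to `src/is_json.c`; `is_json_document` is a hypothetical helper.

```python
import json

JSON_WS = " \t\r\n"


def is_json_document(text: str) -> bool:
    """Accept text only if it is one JSON value followed by whitespace alone."""
    stripped = text.lstrip(JSON_WS)
    try:
        # raw_decode parses one value and reports where it stopped.
        _, end = json.JSONDecoder().raw_decode(stripped)
    except json.JSONDecodeError:
        return False
    return stripped[end:].strip(JSON_WS) == ""


for sample in ("[]\n", "[] 1\n", "[1] foo\n"):
    print(repr(sample), is_json_document(sample))
```

The first sample corresponds to the testcase that should be recognized as JSON data, and the other two to inputs that should fall back to ASCII text.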


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
334 [file] General major always 2022-03-30 13:51 2022-04-04 16:46
Reporter: jmp3r Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Binary data files detected as Unicode text
Description: After the fix for https://bugs.astron.com/view.php?id=319, a new bug appeared.

Now the inverse happens for some files:
encrypted binary files are detected as text.

I used the latest sources (HEAD) from GitHub.
Tags:
Steps To Reproduce: Scan the attached files with `file` built from the latest sources (HEAD).
I attached only two files, but there are thousands of such files.
Additional Information:
Attached Files: bin.zip (23,800 bytes) 2022-03-30 13:51
https://bugs.astron.com/file_download.php?file_id=270&type=bug
Notes
(0003730)
christos   
2022-04-04 16:46   
Detect invalid UTF16 and surrogate pairs.
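As a rough illustration of what such a check involves (a sketch only, not the committed C code; `has_valid_utf16_surrogates` is a hypothetical name), a UTF-16 validator can walk the 16-bit code units and require every high surrogate (0xD800-0xDBFF) to be followed immediately by a low surrogate (0xDC00-0xDFFF). Random or encrypted binary data usually breaks this pairing quickly.

```python
def has_valid_utf16_surrogates(data: bytes, little_endian: bool = True) -> bool:
    """Return True if data is a whole number of 16-bit units with correctly paired surrogates."""
    if len(data) % 2:
        return False  # a trailing odd byte cannot be a UTF-16 code unit
    order = "little" if little_endian else "big"
    expecting_low = False
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], order)
        is_high = 0xD800 <= unit <= 0xDBFF
        is_low = 0xDC00 <= unit <= 0xDFFF
        if expecting_low:
            if not is_low:
                return False  # high surrogate not followed by a low one
            expecting_low = False
        elif is_low:
            return False  # low surrogate without a preceding high one
        elif is_high:
            expecting_low = True
    # A high surrogate at the very end is also invalid.
    return not expecting_low


if __name__ == "__main__":
    print(has_valid_utf16_surrogates("text \U0001F600".encode("utf-16-le")))
    print(has_valid_utf16_surrogates(b"\x00\xd8\x41\x00"))
```

A real detector would combine this with other heuristics (BOM handling, control characters); the sketch covers only the surrogate rule.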


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
336 [file] General minor always 2022-04-04 08:29 2022-04-04 16:13
Reporter: stefanwascoding Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: using `env` breaks detection of zsh scripts (shown as plain text)
Description: Scripts using `#!/usr/bin/env zsh` as shebang are detected as text/plain mime type.

Both default detection as well as using "--mime" are broken; `#!/usr/bin/zsh` shows up as "Paul Falstad's zsh script text executable" & "text/x-shellscript", env version shows up as "a /usr/bin/env zsh script text executable" or "text/plain".
Works as expected when using bash in place of zsh.

This might be a regression of https://bugs.astron.com/view.php?id=114
Tags:
Steps To Reproduce: echo '#!/usr/bin/env zsh' > myzshscript && chmod +x myzshscript && file --mime-type myzshscript
Additional Information:
Attached Files:
Notes
(0003727)
polluks   
2022-04-04 13:04   
Indeed
$ grep usr/bin/env magic/Magdir/commands
0 search/1 #!/usr/bin/env\ zsh Paul Falstad's zsh script text executable
0 string/fwt #!\ /usr/bin/env\ bash Bourne-Again shell script text executable
0 string/fwt #!\ /usr/bin/env\ fish fish shell script text executable
0 string/fwt #!\ /usr/bin/env\ execlineb execline script text executable
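The grep output shows that the zsh entry is written differently from the bash, fish and execlineb entries. Independently of the magic syntax, the intended behaviour can be sketched in Python: treat `#!/usr/bin/env NAME` and `#!/path/to/NAME` as the same interpreter. `shebang_interpreter` is a hypothetical helper for illustration, and the handling of env options is a simplifying assumption.

```python
import posixpath


def shebang_interpreter(first_line: str):
    """Return the interpreter name from a shebang line, looking through env, or None."""
    if not first_line.startswith("#!"):
        return None
    words = first_line[2:].split()
    if not words:
        return None
    if posixpath.basename(words[0]) == "env":
        # Assumption: skip env options (-S, -i, ...) and VAR=value assignments.
        candidates = [w for w in words[1:] if not w.startswith("-") and "=" not in w]
        if not candidates:
            return None
        return posixpath.basename(candidates[0])
    return posixpath.basename(words[0])


if __name__ == "__main__":
    for line in ("#!/usr/bin/env zsh", "#!/usr/bin/zsh", "#! /usr/bin/env bash"):
        print(line, "->", shebang_interpreter(line))
```

With this normalisation, both zsh forms from the report resolve to the same interpreter name, so they could be mapped to the same description and MIME type.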
(0003728)
christos   
2022-04-04 16:13   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
333 [file] General feature N/A 2022-03-24 08:26 2022-03-24 08:26
Reporter: evyatar Platform:  
Assigned To: OS:  
Priority: none OS Version:  
Status: new Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Introduce a magic_file_at() function
Description: I suggest introducing a new libmagic API function: magic_file_at which will have the signature:
const char *magic_file_at(magic_t cookie, int dirfd, const char *path)
It behaves exactly like magic_file(), except that a relative path is interpreted relative to the directory referred to by dirfd. If dirfd is negative, path is instead interpreted relative to the current working directory.
This is analogous to the openat() family of syscalls (except that any negative value takes the place of AT_FDCWD).
The rationale behind this addition is laid out in the Linux man page for open(2). In my personal experience, it also greatly simplifies the use of readdir(), as no string copying needs to take place to call magic_file().
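To make the proposed semantics concrete, here is a small Python sketch of the path resolution only. `open_at` and `read_header_at` are hypothetical helpers, not part of libmagic; the sketch relies on `os.open(..., dir_fd=...)`, which wraps openat() on platforms that support it. A C caller could get a similar effect today by opening the file with openat() and handing the descriptor to magic_descriptor().

```python
import os


def open_at(dirfd: int, path: str) -> int:
    """Open path read-only; relative paths resolve against dirfd, or the cwd if dirfd < 0."""
    if dirfd < 0:
        # Negative dirfd: plain open(), relative to the current working directory.
        return os.open(path, os.O_RDONLY)
    return os.open(path, os.O_RDONLY, dir_fd=dirfd)


def read_header_at(dirfd: int, name: str, size: int = 16) -> bytes:
    """Read the first bytes of a directory entry without building a full path string."""
    fd = open_at(dirfd, name)
    try:
        return os.read(fd, size)
    finally:
        os.close(fd)


if __name__ == "__main__":
    dirfd = os.open(".", os.O_RDONLY)
    try:
        for entry in os.listdir("."):
            if os.path.isfile(entry):
                print(entry, read_header_at(dirfd, entry, 4))
    finally:
        os.close(dirfd)
```

The point of the design is visible in the loop: the entry name is passed as-is, with no concatenation of directory and file name.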
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
283 [file] General feature N/A 2021-08-17 06:03 2022-03-21 23:34
Reporter: polluks Platform: PowerBook5,8  
Assigned To: christos OS: MorphOS  
Priority: normal OS Version: 3.15  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: More X11
Description: Please use official name
Added bitmap
Tags:
Steps To Reproduce: --- xwindows.bak 2020-06-19 14:19:13 +0200
+++ xwindow 2021-08-17 00:57:25 +0200
@@ -1,7 +1,7 @@
 
 #------------------------------------------------------------------------------
 # $File: xwindows,v 1.11 2019/04/19 00:42:27 christos Exp $
-# xwindows: file(1) magic for various X/Window system file formats.
+# xwindow: file(1) magic for various X Window System file formats.
 
 # Compiled X Keymap
 # XKM (compiled X keymap) files (including version and byte ordering)
@@ -33,3 +33,7 @@
 !:mime image/x-xcursor
 >10 leshort x version %d
 >>8 leshort x \b.%d
+
+# X bitmap https://en.wikipedia.org/wiki/X_BitMap
+0 string #define\
+>8 regex [a-zA-Z0-9]+_width xbm image
Additional Information:
System Description
Attached Files:
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
330 [file] General minor have not tried 2022-03-19 15:41 2022-03-21 23:28
Reporter: polluks Platform: MacBook Pro  
Assigned To: christos OS: macOS  
Priority: normal OS Version: 12.1  
Status: assigned Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: msdos: ZIP self-extracting archive
Description: file did not recognize the ZIP, unzip worked fine
Tags: magic, zip
Steps To Reproduce: IFAZE475.EXE: MS-DOS executable, NE for MS Windows 3.x (EXE)

│00003E60 4D 00 DD 0A 00 00 4C 9F 00 00 9E 03 CF 06 50 4B │◆│......L.......PK│
│00003E70 03 04 14 00 00 80 08 00 F3 80 73 20 59 59 74 17 │▒│..........s YYt.│
Additional Information: http://cd.textfiles.com/psl/pslv5nv05/WIN/GRAPHICS/IFAZE475.ZIP
System Description Apple M1
Attached Files:
Notes
(0003719)
christos   
2022-03-21 21:42   
It is a self-extracting zip (even unzip says so)... What would you have file say?
(0003720)
polluks   
2022-03-21 23:28   
File should say: This is not a plain NE exe but a ZIP.
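One way to express the requested behaviour, sketched in Python rather than as a magic rule: report an executable as a self-extracting ZIP when it starts with the "MZ" signature and also carries a ZIP end-of-central-directory record, which is what `zipfile.is_zipfile()` looks for. The function name and the output strings are illustrative assumptions, not what file prints.

```python
import io
import zipfile


def describe_executable(data: bytes) -> str:
    """Classify a buffer as a plain MS-DOS executable or one with an appended ZIP archive."""
    if not data.startswith(b"MZ"):
        return "not an MS-DOS executable"
    # is_zipfile() searches for the end-of-central-directory record near the end
    # of the data, so it still succeeds when an executable stub precedes the archive.
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "MS-DOS executable, ZIP self-extracting archive"
    return "MS-DOS executable"


if __name__ == "__main__":
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("readme.txt", "hello")
    stub = b"MZ" + b"\x00" * 62  # placeholder stub, not a runnable program
    print(describe_executable(stub + archive.getvalue()))
```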


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
326 [file] General feature always 2022-03-04 00:59 2022-03-21 21:37
Reporter: aichingm Platform: amd64  
Assigned To: christos OS: Linux  
Priority: normal OS Version: 5.16  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: add support for QGis files which are currently identified as HTML document
Description: QGIS: A Free and Open Source Geographic Information System

File format descriptions: https://github.com/qgis/QGIS/blob/master/rpm/sources/qgis-mime.xml
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: qgis-project.qgs (3,709 bytes) 2022-03-04 00:59
https://bugs.astron.com/file_download.php?file_id=266&type=bug
Notes
(0003718)
christos   
2022-03-21 21:37   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
325 [file] General trivial always 2022-02-28 14:20 2022-03-21 21:28
Reporter: wolfgangwalther Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: WOFF / WOFF2 fonts have no mimetype associated
Description: WOFF / WOFF2 files are correctly identified as such, but the returned MIME type is application/octet-stream, even though RFC 8081 [1] defines font/woff and font/woff2 as MIME types for those file types.

[1]: https://www.rfc-editor.org/rfc/rfc8081#section-4.4.5
Tags: magic
Steps To Reproduce: Using any example woff/woff2 file (e.g. https://filesamples.com/formats/woff):

% file fontawesome-webfont.woff
fontawesome-webfont.woff: Web Open Font Format, TrueType, length 98164, version 4.7

% file --mime-type fontawesome-webfont.woff
fontawesome-webfont.woff: application/octet-stream
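The expected mapping is small enough to write down. The sketch below is in Python and is not the magic entry itself; `woff_mime_type` is a hypothetical name. It keys on the 4-byte signatures "wOFF" and "wOF2" at offset 0 and returns the RFC 8081 types.

```python
# Signature at offset 0 -> MIME type registered by RFC 8081.
WOFF_MIME_BY_SIGNATURE = {
    b"wOFF": "font/woff",
    b"wOF2": "font/woff2",
}


def woff_mime_type(header: bytes) -> str:
    """Return the font MIME type for a WOFF/WOFF2 header, else the generic fallback."""
    return WOFF_MIME_BY_SIGNATURE.get(header[:4], "application/octet-stream")


if __name__ == "__main__":
    print(woff_mime_type(b"wOFF\x00\x01\x00\x00"))
    print(woff_mime_type(b"wOF2\x00\x01\x00\x00"))
```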
Additional Information:
Attached Files:
Notes
(0003717)
christos   
2022-03-21 21:28   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
328 [file] General minor always 2022-03-15 10:49 2022-03-21 21:26
Reporter: adepasquale Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Add various missing MIME-Types
Description: Add missing MIME-Types for:
- ACE archives
- Windows CHM
- Windows URL
- Windows LNK
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: mimetypes.FILE5_41.patch (1,674 bytes) 2022-03-15 10:49
https://bugs.astron.com/file_download.php?file_id=268&type=bug
Notes
(0003716)
christos   
2022-03-21 21:26   
Committed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
327 [file] General minor always 2022-03-15 00:28 2022-03-21 21:24
Reporter: vinc17 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: fails to detect a json file as JSON data
Description: file 5.41 fails to detect the attached json file as JSON data.
Tags:
Steps To Reproduce: $ file Q3235109.json
Q3235109.json: ASCII text, with very long lines (2409), with no line terminators

And with the -d option, I can see: "[try json 0]".
Additional Information: The json_pp utility doesn't detect any issue on this file.
Attached Files: Q3235109.json (2,409 bytes) 2022-03-15 00:28
https://bugs.astron.com/file_download.php?file_id=267&type=bug
Notes
(0003711)
polluks   
2022-03-21 14:04   
By the way "cc -DTEST is_json.c" and "cc -DTEST is_tar.c" are broken, "cc -DTEST is_csv.c" still works.
(0003712)
polluks   
2022-03-21 14:19   
--- is_json.c.bak 2022-03-21 15:13:15.933289900 +0100
+++ is_json.c 2022-03-21 15:14:48.814366000 +0100
@@ -37,6 +37,8 @@

 #include <string.h>
 #include "magic.h"
+#else
+#include <stddef.h>
 #endif

 #ifdef DEBUG
(0003715)
christos   
2022-03-21 21:24   
Bumped recursion limit.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
331 [file] General minor have not tried 2022-03-20 23:48 2022-03-21 19:58
Reporter: polluks Platform: MacBook Pro  
Assigned To: christos OS: macOS  
Priority: normal OS Version: 12.3  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: macOS: check fails
Description: Running test: ../tests/CVE-2014-1943.testfile
../tests/CVE-2014-1943.testfile: Apple Driver Map, blocksize 0
Running test: ../tests/JW07022A.mp3.testfile
../tests/JW07022A.mp3.testfile: Audio file with ID3 version 2.2.0, contains: MPEG ADTS, layer III, v1, 96 kbps, 44.1 kHz, Monaural
Running test: ../tests/android-vdex-1.testfile
../tests/android-vdex-1.testfile: Android vdex file, verifier deps version: 021, dex section version: 002, number of dex files: 4, verifier deps size: 106328
Running test: ../tests/android-vdex-2.testfile
../tests/android-vdex-2.testfile: Android vdex file, being processed by dex2oat, verifier deps version: 019, dex section version: 002, number of dex files: 1, verifier deps size: 1016
Running test: ../tests/arj.testfile
../tests/arj.testfile: ARJ archive data, v11, slash-switched, created 5 1980+48, original name: example_m0.arj, os: Unix
test: ERROR: result was (len 97)
ARJ archive data, v11, slash-switched, created 5 1980+48, original name: example_m0.arj, os: Unix
expected (len 79)
ARJ archive data, v11, slash-switched, original name: example_m0.arj, OS: Unix
make[2]: *** [check-local] Error 1
make[1]: *** [check-am] Error 2
make: *** [check-recursive] Error 1
Tags: build
Steps To Reproduce:
Additional Information:
System Description Apple M1
Attached Files:
Notes
(0003714)
christos   
2022-03-21 19:58   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
329 [file] General feature N/A 2022-03-17 14:35 2022-03-21 19:57
Reporter: polluks Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Another IFF format
Description: --- iff.bak 2021-12-06 12:05:46.956819300 +0100
+++ iff 2022-03-17 15:32:22.461280200 +0100
@@ -45,6 +45,7 @@
 >8 string ACBM \b, ACBM continuous image
 >8 string FAXX \b, FAXX fax image
 >8 string STFX \b, ST-Fax image
+>8 string IMAGIHDR \b, CD-i image
 # other formats
 >8 string FTXT \b, FTXT formatted text
 >8 string CTLG \b, CTLG message catalog
Tags: magic
Steps To Reproduce:
Additional Information: See also https://github.com/jsummers/deark/issues/40
Attached Files:
Notes
(0003713)
christos   
2022-03-21 19:57   
Thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
323 [file] General minor always 2022-02-27 13:30 2022-03-21 19:55
Reporter: vmurashev Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: 2 test samples are broken
Description: Once issue 0000322 is fixed, it becomes clear
that 2 test samples are broken
  - fit-map-data
  - regex-eol

---

/mnt/c/yr/file/tests/regex-eol.testfile: Ansible Vault, version 1.1, encryption AES256
file_test: ERROR: result was (len 45)
Ansible Vault, version 1.1, encryption AES256
expected (len 57)
Ansible Vault text, version 1.1, using AES256 encryption

---

/mnt/c/yr/file/tests/fit-map-data.testfile: FIT Map data, unit id 65536, serial 3879446968, Sat May 31 13:00:34 2014, manufacturer 1 (garmin), product 1632, type 4 (Activity)
file_test: ERROR: result was (len 130)
FIT Map data, unit id 65536, serial 3879446968, Sat May 31 13:00:34 2014, manufacturer 1 (garmin), product 1632, type 4 (Activity)
expected (len 131)
FIT Map data, unit id 65536, serial 3879446968, Sat May 31 10:00:34 2014, manufacturer 1 (garmin), product 1632, type 4 (Activity)
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
322 [file] General major always 2022-02-27 13:22 2022-03-16 12:03
Reporter: vmurashev Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: test should not skip underlying OS errors (e.g. file not found)
Description: Please take a look at tests/test.c

The magic cookie is opened with the flag MAGIC_NONE.

As a result, the test exits with a zero exit code even if the input file for testing is not found.

I believe that for testing, the magic cookie should be opened with the flag MAGIC_ERROR.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003709)
christos   
2022-03-16 12:03   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
321 [file] General minor have not tried 2022-02-27 13:12 2022-03-16 11:59
Reporter: vmurashev Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: double free when test is invoked with an unexpected number of arguments
Description: Please take a look at tests/test.c

    if (argc != 3) {
        (void)fprintf(stderr, "Usage: %s TEST-FILE RESULT\n", prog);
        magic_close(ms);
        goto bad;
    }
...
bad:
    free(desired);
    magic_close(ms);
    return e;

You can see that magic_close(ms) is called twice in that case.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003708)
christos   
2022-03-16 11:59   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
320 [file] General tweak always 2022-02-21 16:38 2022-02-21 19:21
Reporter: BEEDELLROKEJULIANLOCKHART Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: feedback Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: 'file' reports encodement as merely 'ISO-8859' rather than specifically 'ISO-8859-1' or 'ISO-8859-15' or 'ISO-8859-14'.
Description: 'file' reports encodement as merely 'ISO-8859' rather than specifically 'ISO-8859-1' or 'ISO-8859-15' or 'ISO-8859-14', which means that the information is not useful for me, because I must know more specifically the current encodement to be able to configure utilities that are similar to 'http://invent.kde.org/utilities/kate' to
Tags:
Steps To Reproduce: Install Windows 10.
Create one '.deskthemepack'-file by exporting the current theme.
Install 'http://dl.fedoraproject.org/pub/fedora/linux/development/rawhide/Server/x86_64/iso'.
Install 'http://src.fedoraproject.org/rpms/file'.
Utilise 'http://invent.kde.org/utilities/ark' to extract the '.theme'-file from the '.deskthemepack'-archive-file.
Invoke '/usr/bin/file' with the path of the '.theme'-file of text as the sole argument as '/usr/bin/file 'file.theme''.

Alternatively, or conclusively, utilise 'file' to provide the encodement of any file that contains text whose encodement is 'ISO-8859'.
Additional Information:
Attached Files:
Notes
(0003706)
christos   
2022-02-21 19:21   
The question is how to tell the difference? For example while you can probably tell the difference between 8859-1 and 8859-14 by using a Celtic dictionary, it would be nearly impossible to tell the difference between 8859-1 and 8859-15.
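The difficulty can be shown concretely. Every byte value is valid in both ISO-8859-1 and ISO-8859-15, and the two encodings assign different characters to only eight of them (for example 0xA4, which is the currency sign in 8859-1 and the euro sign in 8859-15). The Python sketch below, with the hypothetical helper `differing_bytes`, lists those positions using the standard codecs.

```python
def differing_bytes(codec_a: str, codec_b: str) -> list:
    """Return the byte values that decode to different characters in the two codecs."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" keeps the loop total even for codecs with undefined bytes.
        if raw.decode(codec_a, errors="replace") != raw.decode(codec_b, errors="replace"):
            diffs.append(value)
    return diffs


if __name__ == "__main__":
    for value in differing_bytes("iso8859-1", "iso8859-15"):
        raw = bytes([value])
        print(hex(value), raw.decode("iso8859-1"), raw.decode("iso8859-15"))
```

A text that uses none of these byte values decodes identically under both encodings, so no byte-level test can tell them apart; a text that does use them is still valid in both, which leaves only linguistic guesswork.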


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
309 [file] General minor always 2022-01-20 14:52 2022-02-21 07:52
Reporter: malat Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Add support for JPEG-XL
Description: It would be nice to add support for JPEG-XL :

```
% convert -size 512x512 -depth 8 xc:black black.pgm
% cjxl black.pgm black.jxl
% file black.jxl
black.jxl: data
```
Tags:
Steps To Reproduce:
Additional Information: Here are the typical bits to check:

* https://github.com/libjxl/libjxl/blob/main/plugins/mime/image-jxl.xml

```
<?xml version="1.0" encoding="UTF-8"?>
<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
  <mime-type type="image/jxl">
    <comment>JPEG XL image</comment>
    <comment xml:lang="fr">image JPEG XL</comment>
    <comment xml:lang="nl">JPEG XL afbeelding</comment>
    <magic priority="50">
      <match type="string" offset="0" value="\xFF\x0A"/>
      <match type="string" offset="0" value="\0\0\0\x0CJXL \x0D\x0A\x87\x0A"/>
    </magic>
    <glob pattern="*.jxl"/>
  </mime-type>
</mime-info>
```
Attached Files:
Notes
(0003703)
christos   
2022-02-20 18:28   
What version of file are you using? The magic seems to be there in HEAD:
# fgrep -i JPEG * | fgrep -i xl
jpeg:# JPEG XL
jpeg:0 string \xff\x0a JPEG XL codestream
jpeg:# JPEG XL (transcoded JPEG file)
jpeg:0 string \x00\x00\x00\x0cJXL\x20\x0d\x0a\x87\x0a JPEG XL container
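For reference, the two signatures matched by these magic lines (and by the shared-mime-info XML in the report) can be checked with a few lines of Python. This is only an illustrative equivalent of the rules above; `jpeg_xl_kind` is a hypothetical name.

```python
# Signatures taken from the magic entries quoted above.
JXL_CODESTREAM = b"\xff\x0a"
JXL_CONTAINER = b"\x00\x00\x00\x0cJXL \x0d\x0a\x87\x0a"


def jpeg_xl_kind(header: bytes):
    """Return a description for a JPEG XL header, or None if neither signature matches."""
    if header.startswith(JXL_CONTAINER):
        return "JPEG XL container"
    if header.startswith(JXL_CODESTREAM):
        return "JPEG XL codestream"
    return None


if __name__ == "__main__":
    print(jpeg_xl_kind(b"\xff\x0a\x00\x00"))
    print(jpeg_xl_kind(JXL_CONTAINER + b"\x00\x00\x00\x14ftyp"))
```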

(0003704)
malat   
2022-02-21 07:51   
Indeed, I was using file from Debian/bullseye. Closing.
(0003705)
malat   
2022-02-21 07:52   
For reference: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1004081


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
319 [file] General major always 2022-02-17 11:19 2022-02-19 22:49
Reporter: jmp3r Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Text files are identified as data
Description: Some text files could not be identified correctly, tested with 5.39, 5.41

wise_lang - 3 language ini files
log_data - samples log files
Tags:
Steps To Reproduce: Scan any file from the attached archives: the file utility reports "data", but these files are simple text files.
Additional Information:
Attached Files: log_data.zip (97,149 bytes) 2022-02-17 11:19
https://bugs.astron.com/file_download.php?file_id=265&type=bug
wise_lang.zip (6,079 bytes) 2022-02-17 11:19
https://bugs.astron.com/file_download.php?file_id=264&type=bug
Notes
(0003702)
christos   
2022-02-19 22:49   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
316 [file] General minor always 2022-01-31 14:59 2022-02-19 22:36
Reporter: karagian Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: .dbf files misidentified as "amd 29k coff prebar executable"
Description: file identifies .dbf files as "amd 29k coff prebar executable".

Tags:
Steps To Reproduce: you can test file on attached file
Additional Information: amd 29k coff prebar executable checks for the first two bytes, according to Magdir/varied.out and expected octal 01572

0 beshort 01572 amd 29k coff prebar executable

According to dbf specification (found in this link https://www.dbase.com/Knowledgebase/INT/db7_file_fmt.htm ) a DBASE level 5 file, last updated in 2022 (makes second byte 122, that's 122 years after 1900 :P ), matches the above 2-byte signature, so it gets misidentified as executable
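The collision can be verified with a little arithmetic: octal 01572 is 0x037A, i.e. the bytes 0x03 and 0x7A (decimal 122) when read as a big-endian short. The Python sketch below reproduces that; `dbf_header_word` is a hypothetical helper, and it models only the first two header bytes (version byte, then year minus 1900) as described in the report.

```python
import struct

# Value tested by the "amd 29k coff prebar executable" beshort rule.
AMD_29K_MAGIC = 0o1572


def dbf_header_word(version: int, year: int) -> int:
    """Return the first two dBASE header bytes (version, year - 1900) as a big-endian short."""
    header = bytes([version, year - 1900])
    return struct.unpack(">H", header)[0]


if __name__ == "__main__":
    for year in (2021, 2022, 2023):
        word = dbf_header_word(0x03, year)
        print(year, oct(word), word == AMD_29K_MAGIC)
```

Only files whose last-update year is 2022 hit the 2-byte signature, which is why the misidentification appeared for recently updated files.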
Attached Files: DiorthPolykastrou062.dbf (2,574 bytes) 2022-01-31 14:59
https://bugs.astron.com/file_download.php?file_id=261&type=bug
Notes
(0003692)
polluks   
2022-02-02 12:13   
Raise priority of Magdir/database and check two more bytes...
"12-13 2 bytes Reserved; filled with zeros."
(0003701)
christos   
2022-02-19 22:36   
Bumped version of dbf.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
317 [file] General major always 2022-02-02 11:18 2022-02-15 13:57
Reporter: ssaschaa Platform: MacBook Pro M1 arm64  
Assigned To: christos OS: MacOS  
Priority: normal OS Version: 12.2 Darwin 21.3  
Status: assigned Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: OOXML mime-type fails with "application/x-decompression-error-gzip-Unknown-compression-format"
Description: When trying to get the MIME type of an OOXML file (e.g. an Excel file) on macOS, decompression fails and the MIME type "application/x-decompression-error-gzip-Unknown-compression-format" is returned instead of "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet".

Tried both: compilation and test as ARM64 and X86_64 binary

Tags: bug, compression
Steps To Reproduce: git clone https://github.com/file/file
cd ./file
autoreconf -i
./configure
make check
./src/file -m ./magic/magic.mgc --mime-type /tmp/Excel_Test.xlsx
Additional Information: Tried both: compilation and test as ARM64 and X86_64 binary
Attached Files: Excel_Test.xlsx (8,460 bytes) 2022-02-02 11:18
https://bugs.astron.com/file_download.php?file_id=262&type=bug
Notes
(0003699)
christos   
2022-02-15 13:32   
Tried to reproduce it, but could not:
[8:32am] 1761>./file -m ../magic/magic.mgc --mime-type Excel_Test.xlsx
Excel_Test.xlsx: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
[8:32am] 1762>uname -a
Darwin vpn1-1.astron.com 21.2.0 Darwin Kernel Version 21.2.0: Sun Nov 28 20:29:10 PST 2021; root:xnu-8019.61.5~1/RELEASE_ARM64_T8101 arm64
(0003700)
ssaschaa   
2022-02-15 13:57   
Sorry, I forgot to add the CLI switch "-z" for decompression attempts.
On Linux I get (as expected): /tmp/Excel_Test.xlsx: text/xml; charset=us-ascii compressed-encoding=application/vnd.openxmlformats-officedocument.spreadsheetml.sheet; charset=binary
But on MacOS I get: /tmp/Excel_Test.xlsx: application/x-decompression-error-gzip-Unknown-compression-format


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
318 [file] General feature N/A 2022-02-04 23:55 2022-02-15 13:01
Reporter: polluks Platform: PowerBook5,8  
Assigned To: christos OS: MorphOS  
Priority: normal OS Version: 3.15  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Added Oric
Description: See http://fileformats.archiveteam.org/wiki/TAP_(Oric)
Tags:
Steps To Reproduce:
Additional Information:
System Description
Attached Files: oric (374 bytes) 2022-02-04 23:55
https://bugs.astron.com/file_download.php?file_id=263&type=bug
Notes
(0003698)
christos   
2022-02-15 13:01   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
314 [file] General minor always 2022-01-29 19:10 2022-02-15 12:58
Reporter: gms Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Add magic for Pronto CCF files
Description: I've attached a small magic section for identifying Philips Pronto IR remote control CCF exchange format files.

Example:

file -m ccf_magic Panasonic.ccf
Panasonic.ccf: Philips Pronto IR remote control CCF

Remote control databases such as remote central carry a lot of CCF files, cf. e.g. http://files.remotecentral.com/download/45/pan-air-csakr.zip.html for the above example file.

I've also tested it with other CCF files.

I couldn't find real documentation for the CCF file format, but there are some open source utilities which use these magic bytes.

See for example my extract utility:

https://github.com/gsauthof/pronto-ccf/blob/78084a46109356d2bbf6e8d86eeb2f051d4e6022/ccf2pulse.py#L150

Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: ccf_magic (294 bytes) 2022-01-29 19:10
https://bugs.astron.com/file_download.php?file_id=259&type=bug
Notes
(0003697)
christos   
2022-02-15 12:58   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
315 [file] General minor always 2022-01-31 00:25 2022-02-14 23:57
Reporter: polluks Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Fixed console
Description: Fixed my newer Atari Lynx cartridge dump
Tags: bug
Steps To Reproduce: $ file -m console *.lnx
Solitaire_[AtariGamer.Com](Homebrew).lnx: Lynx cartridge, bank 0 256k, bank 1 256k, "Solitare pack for Lynx ", "Karris project "
a.lnx: Lynx cartridge, bank 0 256k, "Cart name ", "Manufacturer "
cart.lnx: Lynx cartridge, bank 0 512k, bank 1 512k, "Cart name ", "Manufacturer "
Additional Information:
Attached Files: console (38,975 bytes) 2022-01-31 00:25
https://bugs.astron.com/file_download.php?file_id=260&type=bug
Notes
(0003696)
christos   
2022-02-14 23:57   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
311 [file] General feature N/A 2022-01-21 16:28 2022-02-14 16:51
Reporter: ylep Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: New magic for the NIfTI neuroimaging file format
Description: I would like to propose the attached magic rules for inclusion, which are for identifying files in NIfTI format. NIfTI is a widely used format for image storage and exchange in the neuroimaging community, see https://nifti.nimh.nih.gov/.

I went beyond mere identification of the file type, and implemented rules for printing some high-level metadata (image size, resolution, datatype, etc.) which I find really useful, but I would understand if you think it is too much to include in the main database.

Test files can be found at the links below:
https://nifti.nimh.nih.gov/nifti-1/data/
https://nifti.nimh.nih.gov/pub/dist/data/nifti2
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: nifti.magic (4,992 bytes) 2022-01-21 16:28
https://bugs.astron.com/file_download.php?file_id=258&type=bug
Notes
(0003695)
christos   
2022-02-14 16:51   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
308 [file] General minor always 2022-01-18 10:39 2022-02-14 16:47
Reporter: adepasquale Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Fix and improve ARJ file information
Description: Current file 5.41 has an issue parsing the "original filename" from ARJ archive headers (wrong offset).

I updated the offset, as well as added more information based on open source documentation:
https://fossies.org/linux/unarj/unarj.c
https://www.fileformat.info/format/arj/corion.htm
http://hmelnov.icc.ru/geos/scripts/WWWBinV.dll/ShowR?ARJ.rfh

See the attached patch and hexdump of a sample file header (use xxd -r to restore).

Output before/after applying the patch:
test.arj: ARJ archive data, v11, slash-switched, original name: , os: Unix
test.arj: ARJ archive data, v11, slash-switched, original name: example_m0.arj, OS: Unix
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: test_arj.hex (262 bytes) 2022-01-18 10:39
https://bugs.astron.com/file_download.php?file_id=256&type=bug
arj.patch (1,335 bytes) 2022-01-18 10:39
https://bugs.astron.com/file_download.php?file_id=255&type=bug
Notes
(0003694)
christos   
2022-02-14 16:47   
Committed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
310 [file] General major always 2022-01-21 14:50 2022-02-14 16:39
Reporter: p870613 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: AddressSanitizer: stack-buffer-overflow on address 0x7ffc1ece1ae0 at pc 0x00000050bb19 bp 0x7ffc1ecdf090 sp 0x7ffc1ecdf088
Description: - version
    ```
    ➜ src ./file --version
    file-5.41
    magic file from /usr/local/share/misc/magic
    ```
    - at branch 4c94d085
- environment
    ```
    ➜ release git:(master) uname -a
    Linux lin-System-Product-Name 5.11.0-40-generic #44~20.04.2-Ubuntu SMP Tue Oct 26 18:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
    ➜ release git:(master) lsb_release -r
    Release: 20.04
    ```
- reproduce
    ```
    git clone https://github.com/file/file.git
    cd file
    autoreconf -i
    ./configure CC=gcc CXX=g++ CFLAGS="-g -fsanitize=address" --disable-shared
    make V=1 all
    ./src/file -m ./magic/magic.mgc ./poc
    ```

- asan
```
=================================================================
==1321923==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd284ba010 at pc 0x56188a508267 bp 0x7ffd284b75f0 sp 0x7ffd284b75e0
READ of size 1 at 0x7ffd284ba010 thread T0
    #0 0x56188a508266 in strlcpy /home/lin/file/src/strlcpy.c:49
    #1 0x56188a4fec64 in file_copystr /home/lin/file/src/funcs.c:59
    #2 0x56188a521563 in do_core_note /home/lin/file/src/readelf.c:918
    #3 0x56188a523610 in donote /home/lin/file/src/readelf.c:1236
    #4 0x56188a51e0be in dophn_core /home/lin/file/src/readelf.c:412
    #5 0x56188a52753c in file_tryelf /home/lin/file/src/elfclass.h:43
    #6 0x56188a50113e in file_buffer /home/lin/file/src/funcs.c:433
    #7 0x56188a4e0a33 in file_or_fd /home/lin/file/src/magic.c:533
    #8 0x56188a4e0376 in magic_file /home/lin/file/src/magic.c:417
    #9 0x56188a4dd9d4 in process /home/lin/file/src/file.c:555
    #10 0x56188a4dce13 in main /home/lin/file/src/file.c:428
    #11 0x7f4a175d40b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
    #12 0x56188a4dc28d in _start (/home/lin/file/src/file+0x1728d)

Address 0x7ffd284ba010 is located in stack of thread T0 at offset 8384 in frame
    #0 0x56188a51d977 in dophn_core /home/lin/file/src/readelf.c:351

  This frame has 3 object(s):
    [32, 64) 'ph32' (line 352)
    [96, 152) 'ph64' (line 353)
    [192, 8384) 'nbuf' (line 355) <== Memory access at offset 8384 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/lin/file/src/strlcpy.c:49 in strlcpy
Shadow bytes around the buggy address:
  0x10002508f3b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002508f3c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002508f3d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002508f3e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002508f3f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x10002508f400: 00 00[f3]f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3
  0x10002508f410: f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3
  0x10002508f420: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002508f430: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x10002508f440: 02 f2 04 f2 04 f2 00 00 00 00 00 00 04 f2 f2 f2
  0x10002508f450: f2 f2 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable: 00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone: fa
  Freed heap region: fd
  Stack left redzone: f1
  Stack mid redzone: f2
  Stack right redzone: f3
  Stack after return: f5
  Stack use after scope: f8
  Global redzone: f9
  Global init order: f6
  Poisoned by user: f7
  Container overflow: fc
  Array cookie: ac
  Intra object redzone: bb
  ASan internal: fe
  Left alloca redzone: ca
  Right alloca redzone: cb
  Shadow gap: cc
==1321923==ABORTING

```
Tags:
Steps To Reproduce:  git clone https://github.com/file/file.git
 cd file
 autoreconf -i
./configure CC=gcc CXX=g++ CFLAGS="-g -fsanitize=address" --disable-shared
 make V=1 all
 ./src/file -m ./magic/magic.mgc ./poc
Additional Information:
Attached Files: poc (28,105 bytes) 2022-01-21 14:50
https://bugs.astron.com/file_download.php?file_id=257&type=bug
Notes
(0003693)
christos   
2022-02-14 16:39   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
313 [tcsh] General minor always 2022-01-22 15:10 2022-01-22 15:10
Reporter: dgusev Platform: AMD64  
Assigned To: OS: FreeBSD  
Priority: normal OS Version: 13.0  
Status: new Product Version: 6.21.00  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: On exit history file is overwritten without merging
Description: Settings:
set history = 65535
set savehist = ( 65535 merge )

According to the tcsh manual page (description section of the "history -S|-L|-M [filename]"):
"If the second word of savehist is set to `merge', the history list is merged with
the existing history file instead of replacing it"

But actually it overwrites the history file without merging.
I get the same results using "history -S" or on exit from tcsh session.

Also tried to set lower history number to 1000 etc. Same results.
Tags:
Steps To Reproduce: 1. Set history options in "~/.cshrc" or "/etc/csh.cshrc":
set history = 65535
set savehist = ( 65535 merge )

2. Open 2 tcsh sessions.

3. Enter some commands in the first session and then save the history by using "history -S" command or by exiting from the session.

4. Save history the same way in the second session. History file will be overwritten, the new commands from 1st session will be lost.
Additional Information:
Attached Files:
There are no notes attached to this issue.
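
The merge behaviour quoted from the manual can be sketched as follows. This is only an illustration of the expected outcome in Python, not tcsh's implementation: the `(timestamp, command)` tuples and the `merge_history` helper are hypothetical, and the sketch assumes entries are ordered by timestamp with exact duplicates dropped.

```python
def merge_history(on_disk, in_memory, limit):
    """Merge two history lists of (timestamp, command) tuples.

    Hypothetical helper: keeps entries from both lists ordered by
    timestamp, drops exact duplicates, and keeps the newest `limit` entries.
    """
    merged = sorted(set(on_disk) | set(in_memory))
    return merged[-limit:] if limit > 0 else []


# Two sessions that started from the same history file and each added one command.
session1 = [(100, "ls"), (110, "make")]
session2 = [(100, "ls"), (120, "git status")]

# Session 1 saves first, then session 2 merges with what is already on disk.
saved = merge_history([(100, "ls")], session1, limit=65535)
saved = merge_history(saved, session2, limit=65535)
print(saved)
```

With merging, the command entered in the first session survives the second save; the report describes the second save replacing the file instead.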


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
306 [file] General minor always 2021-12-27 18:26 2022-01-10 20:12
Reporter: es20490446e Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: assigned Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Not recognized: MPEG-2 transport stream
Description: "MPEG-2 transport stream" is recognized as "application/octet-stream".

Mime spec at:
/usr/share/mime/video/mp2t.xml
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003687)
christos   
2022-01-10 16:19   
This is a data container format with no particular magic identifier. https://en.wikipedia.org/wiki/MPEG_transport_stream
(0003690)
es20490446e   
2022-01-10 20:12   
Do you mean it is okay as it is now?


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
305 [file] General major always 2021-12-23 15:18 2022-01-10 19:32
Reporter: felixsch Platform: linux  
Assigned To: christos OS: ubuntu  
Priority: normal OS Version: impish  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: file utility fails on a simple binary file with ERROR: (null)
Description: The file utility fails on the attached simple binary file with output

    tmp.bin: ERROR: (null)

It turns out that the first 4 bytes trigger this issue, in fact the error occurs if the binary file starts with
0x02020100 or 0x02020200

file-5.38 gives the expected result

    tmp.bin: data
Tags:
Steps To Reproduce: Run the command
    `file tmp.bin`
on the attached file
Additional Information: I encountered the problem on ubuntu impish with file utility version 5.39. The attached file is just the head of a large binary file.
After cloning the repository and doing a `git bisect` it turned out that the problematic commit is

commit 2ca292bcdf217bfddeeeaad1adc38c716ffab181 (HEAD, refs/bisect/bad)
Author: Christos Zoulas <christos@zoulas.com>
Date: Sun Mar 15 16:44:37 2020 +0000

    Improve on Windoes Precompiled INFO files (Joerg Jenderek)

PS: I used the github repo to reproduce and bisect, but with the commit message above it should be possible to find the corresponding commit in the original repo.

The issue is still present in the actual master (commit message)

    PR/304: zachs18: Allow whitespace in netpbm sizes.
Attached Files: tmp.bin (32 bytes) 2021-12-23 15:18
https://bugs.astron.com/file_download.php?file_id=252&type=bug
tmp2.bin (32 bytes) 2022-01-10 19:00
https://bugs.astron.com/file_download.php?file_id=254&type=bug
Notes
(0003686)
christos   
2022-01-10 15:04   
Thanks, but the problem seems to be fixed; I can't reproduce this with file 5.41...
(0003688)
felixsch   
2022-01-10 19:00   
Oh sorry, I must be an idiot.
When playing around with starting signature of the file I tried different signatures to narrow the bug, and then I attached the wrong file.
In fact tmp.bin starts with 0x02020300, and this is working.
Please try the newly attached file tmp2.bin, which should start with 0x02020200, if you are patient enough.
I get the issue with file-5.41 in the latest ubuntu:jammy docker container.
(0003689)
christos   
2022-01-10 19:32   
Spurious mprint() return value.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
307 [file] General minor always 2022-01-02 20:01 2022-01-10 14:15
Reporter: Fabrice Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Build failure with gcc 4.8
Description: We have the following build failure on buildroot with file 5.41 and gcc 4.8

readelf.c: In function 'do_auxv_note':
readelf.c:1046:2: error: 'for' loop initial declarations are only allowed in C99 mode
  for (size_t off = 0; off + elsize <= descsz; off += elsize) {
  ^

funcs.c:93:2: error: 'for' loop initial declarations are only allowed in C99 mode
  for (const char *p = fmt; *p; p++) {
  ^

Please find a patch below

Full build log:
 - http://autobuild.buildroot.org/results/31c/31cbc313fceb84c0cbb1969fca5ac44244871dbc/build-end.log
Tags: build
Steps To Reproduce:
Additional Information:
Attached Files: 0001-fix-build-with-gcc-4.8.patch (3,086 bytes) 2022-01-02 20:01
https://bugs.astron.com/file_download.php?file_id=253&type=bug
Notes
(0003685)
christos   
2022-01-10 14:15   
fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
302 [file] General minor have not tried 2021-12-04 03:35 2021-12-06 22:13
Reporter: calestyo Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: GPT not correctly detected as such, because of its protective MBR
Description: Hey.

A file that contains a GPT (GUID Partition Table) is nonetheless detected as MBR, which is probably because every GPT contains a protective MBR just at the position where the regular MBR would be.

E.g.:
# gdisk -l example.img
GPT fdisk (gdisk) version 1.0.6

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk example.img: 16777216 sectors, 8.0 GiB
Sector size (logical): 512 bytes
Disk identifier (GUID): C93252EC-C2A3-41C8-9A12-ECF030C66D7E
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 16777182
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number Start (sector) End (sector) Size Code Name
   1 2048 67583 32.0 MiB EF00 EFI system partition
   2 67584 16777182 8.0 GiB 8300 Linux filesystem


whereas file only detects the MBR:
# file example.img
example.img: DOS/MBR boot sector; partition 1 : ID=0xee, start-CHS (0x0,0,2), end-CHS (0x3ff,255,63), startsector 1, 16777215 sectors, extended partition table (last)


Not really sure how one should (safely) detect a GPT.

The protective MBR always looks like this:
Expert command (? for help): o

Disk size is 16777216 sectors (8.0 GiB)
MBR disk identifier: 0x00000000
MBR partitions:

Number Boot Start Sector End Sector Status Code
   1 1 16777215 primary 0xEE

i.e. always type 0xEE ... but that alone is AFAIU not enough to say there's a GPT... furthermore, GPTs may not use a protective MBR at all.


The best is perhaps to just rely on the GPT signature, which is:
EFI PART at offset 0x0 of the partition table header at LBA 1 (and I think with respect to the endianness)
(see https://en.wikipedia.org/wiki/GUID_Partition_Table#Partition_table_header_(LBA_1) )


Not sure if it makes sense to check for the backup GPT, probably not.

Cheers,
Chris.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003682)
christos   
2021-12-06 20:07   
file just does not look past the MBR for the GPT information... It could be made to look, but currently it does not.
(0003683)
calestyo   
2021-12-06 22:13   
Guess that would make sense... just like it looks deeper into various container formats to find out what's actually inside.
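
A minimal sketch of the check discussed above, written in Python rather than as a magic rule. It assumes 512-byte logical sectors, so the GPT header (LBA 1) starts at byte 512; `describe_partition_table` and its return labels are hypothetical, and the backup GPT is not examined.

```python
SECTOR = 512  # assumed logical sector size; other sizes move the GPT header


def describe_partition_table(image: bytes) -> str:
    """Classify the start of a disk image as GPT, plain MBR, or unknown."""
    # MBR boot signature in the last two bytes of the first sector.
    has_mbr = len(image) >= SECTOR and image[510:512] == b"\x55\xaa"
    # Type byte of the first MBR partition entry (entry at 446, type at +4).
    protective = has_mbr and image[450] == 0xEE
    # GPT header signature at the start of LBA 1.
    has_gpt = image[SECTOR:SECTOR + 8] == b"EFI PART"
    if has_gpt:
        return "GPT (protective MBR)" if protective else "GPT"
    if has_mbr:
        return "MBR"
    return "unknown"


# Synthetic two-sector image: protective MBR followed by a GPT signature.
image = bytearray(2 * SECTOR)
image[510:512] = b"\x55\xaa"
image[450] = 0xEE
image[SECTOR:SECTOR + 8] = b"EFI PART"
print(describe_partition_table(bytes(image)))
```

Checking the signature rather than the 0xEE type alone follows the reporter's point that the protective MBR by itself does not prove a GPT is present.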


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
301 [file] General minor N/A 2021-11-24 23:48 2021-12-06 19:59
Reporter: rowlap Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: feedback Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: sysstat magic
Description: Please find attached magic rules to recognise data files generated by sysstat.

Discussion thread
https://github.com/sysstat/sysstat/discussions/297
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: sysstat.magic (288 bytes) 2021-11-24 23:48
https://bugs.astron.com/file_download.php?file_id=248&type=bug
Notes
(0003681)
christos   
2021-12-06 19:59   
Have you read "new magic guidelines" in https://github.com/file/file? It is strongly recommended to not submit magic with an initial prefix of <= 16 bits.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
303 [file] General minor have not tried 2021-12-05 15:15 2021-12-06 19:33
Reporter: polluks Platform: PowerBook5,8  
Assigned To: christos OS: MorphOS  
Priority: normal OS Version: 3.15  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: False positive PCP
Description: Because it's a 3DS...
Tags:
Steps To Reproduce: > file *
ball.3ds: 3D Studio model
myAdder.3DS: 3D Studio model
myAnaconda.3ds: 3D Studio model
myCoriolis.3DS: 3D Studio model
myMissile.3DS: PCP memory mapped values (V.512)
myThargoid.3ds: 3D Studio model
Additional Information:
System Description
Attached Files: myMissile.3DS (1,366 bytes) 2021-12-05 15:15
https://bugs.astron.com/file_download.php?file_id=249&type=bug
Notes
(0003679)
polluks   
2021-12-06 12:22   
5.41 says "PCP memory mapped values (V.131072)" because of sgi vs cad
(0003680)
christos   
2021-12-06 19:33   
bumped strength


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
300 [file] General minor always 2021-11-20 08:45 2021-11-21 19:40
Reporter: JoshuaFern Platform: Linux  
Assigned To: christos OS: NixOS  
Priority: normal OS Version: 21.11  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Long GIF image doesn't show correct resolution
Description: I have a very long GIF, 630 x 52337. When running file the larger resolution is blank.
Tags:
Steps To Reproduce: Run the following:

file acidshowcase.gif
acidshowcase.gif: GIF image data, version 89a, 630 x
Additional Information: I'm new to file, hopefully this bug report is valid and useful. I will upload the file in question upon request, it's too large to upload to this bugtracker.
Attached Files:
Notes
(0003678)
christos   
2021-11-20 13:34   
No need, I can reproduce with:
$ pbmmake 630 52337 | pamtogif > /tmp/foo.gif

Fixed, thanks!
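
For reference, a short Python sketch of how the GIF logical screen size is laid out: width and height are unsigned 16-bit little-endian values at offsets 6 and 8. The helper name `gif_screen_size` is hypothetical. The last line shows what the same height bytes give when read as a signed 16-bit value; whether that is what caused the blank output is not stated in this thread, so treat it as an assumption.

```python
import struct


def gif_screen_size(data: bytes):
    """Return (width, height) from a GIF logical screen descriptor."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF header")
    # Width and height are unsigned 16-bit little-endian at offsets 6 and 8.
    return struct.unpack_from("<HH", data, 6)


# Header bytes for a 630 x 52337 image, as in the report.
header = b"GIF89a" + struct.pack("<HH", 630, 52337)
print(gif_screen_size(header))

# The same height bytes interpreted as a signed 16-bit integer.
signed_height = struct.unpack_from("<h", header, 8)[0]
print(signed_height)
```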


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
293 [file] General minor always 2021-10-20 13:01 2021-11-19 16:49
Reporter: Yardanico Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: feedback Product Version: 5.40  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: file is too strict with Nim file detection
Description: https://bugs.astron.com/view.php?id=273 reported that Nim wasn't in the file database and Nim magic was subsequently added to the `file`, but it's too strict - the current magic definition was clearly written for `koch.nim` specifically, but Nim is a programming language with a lot of possible code.

I think checking for `import` and `let` is enough for some minimal level of detection (if no other language matches that), and then for more accurate checking you can also check for the existence of `proc` or `echo`. I'm aware that quite a lot of languages have both `import` and `let` as keywords, but I think that file's magics would be enough to differentiate between them?
Tags: bug, magic, nim-lang
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003660)
christos   
2021-10-28 15:49   
Well, you know the language better, why don't you propose a patch?


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
291 [file] General minor have not tried 2021-09-24 14:09 2021-11-16 19:35
Reporter: rootkea Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: scan-build reports multiple logic errors
Description: Hello!

On latest master, scan-build[0] reports multiple logical errors. Please see the scan-build report here: https://rootkea.gitlab.io/file/scan-build/

Here's the .gitlab-ci.yml which generated this scan-build report: https://gitlab.com/rootkea/file/-/raw/gitlab-scan-build/.gitlab-ci.yml

Thanks!

[0] https://clang-analyzer.llvm.org/scan-build
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003664)
christos   
2021-10-28 16:37   
Fixed all but the vfork() ones. Yes, lseek(2)/close(2) are not strictly legal to call after vfork(2), but if you follow the strict rules, then the number of cases where you can use vfork(2) becomes close to 0.
(0003667)
rootkea   
2021-10-29 05:11   
Can we replace `vfork()` with safer `posix_spawn()` as suggested by scan-build?
(0003677)
christos   
2021-11-16 19:35   
We can, and perhaps we should. It is a bit of work though... We'd also want to keep the old vfork code (or change it to use fork) for systems that don't have posix_spawn


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
271 [file] General minor always 2021-06-13 06:11 2021-11-14 12:21
Reporter: DaarkWel Platform:  
Assigned To: administrator OS:  
Priority: normal OS Version:  
Status: assigned Product Version: 5.40  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: MIME type for .nef needs to be image/x-nikon-nef
Description: MIME type for .nef files is "image/tiff" but it needs to be "image/x-nikon-nef". Mimetype from perl-file-mimeinfo shows it right.

file --mime-type /tmp/DSC_1234.nef
/tmp/DSC_1234.nef: image/tiff

mimetype /tmp/DSC_1234.nef
/tmp/DSC_1234.nef: image/x-nikon-nef
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003614)
administrator   
2021-06-30 09:46   
Do you have an example .nef file I can test with?
(0003629)
DaarkWel   
2021-07-14 08:24   
Sorry for delay. Yes.

https://mega.nz/file/4MsXkKJJ#kH1v5XHPXkWReCJBFFh-RKJW_1aTGCVm9N6wpAFZkbY
(0003640)
maxicarlos0   
2021-08-30 11:42   
Same issue here, this causes a lot of image viewing programs to fail opening raw images (NEF, ARW tested)
(0003674)
Tamaranch   
2021-11-14 12:20   
Attached is another example of a NEF file recognized as TIFF.
It comes from https://filesamples.com/categories/image where you can also find examples of PEF and DNG files recognized as TIFF.
(0003675)
Tamaranch   
2021-11-14 12:21   
Not possible to attach the file in fact: too big.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
298 [file] General minor always 2021-11-08 07:53 2021-11-14 12:12
Reporter: Tamaranch Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: SVGZ compressed image files created by Inkscape are identified as application/gzip
Description: Inkscape can create two types of SVGZ formats: lossy and lossless.
Both are identified by `file` or `magic_file()` as application/gzip, while a library like librsvg is able to recognize them and allow their loading via gdk-pixbuf.
Tags: magic
Steps To Reproduce: Open an SVG image file in Inkscape and save it as SVGZ.
Additional Information:
Attached Files: splash.svg (11,867 bytes) 2021-11-13 18:22
https://bugs.astron.com/file_download.php?file_id=245&type=bug
org.xfce.ristretto_inkscape_simple.svgz (4,530 bytes) 2021-11-13 18:22
https://bugs.astron.com/file_download.php?file_id=244&type=bug
org.xfce.ristretto_inkscape.svgz (5,246 bytes) 2021-11-13 18:22
https://bugs.astron.com/file_download.php?file_id=243&type=bug
org.xfce.ristretto.svg (30,221 bytes) 2021-11-13 18:22
https://bugs.astron.com/file_download.php?file_id=242&type=bug
org.xfce.mousepad_inkscape_simple.svgz (8,650 bytes) 2021-11-13 18:22
https://bugs.astron.com/file_download.php?file_id=241&type=bug
org.xfce.mousepad_inkscape.svgz (9,394 bytes) 2021-11-13 18:22
https://bugs.astron.com/file_download.php?file_id=240&type=bug
org.xfce.mousepad.svg (98,088 bytes) 2021-11-13 18:22
https://bugs.astron.com/file_download.php?file_id=239&type=bug
splash_inkscape_simple.svgz (4,510 bytes) 2021-11-13 18:22
https://bugs.astron.com/file_download.php?file_id=247&type=bug
splash_inkscape.svgz (4,838 bytes) 2021-11-13 18:22
https://bugs.astron.com/file_download.php?file_id=246&type=bug
Notes
(0003669)
christos   
2021-11-13 17:49   
Can you please attach a couple of sample files?
(0003670)
Tamaranch   
2021-11-13 18:22   
Here are three files: original (*.svg), Inkscape lossy (*_inkscape_simple.svgz), Inkscape lossless (*_inkscape.svgz).
(0003671)
christos   
2021-11-14 01:08   
Thanks, these are just compressed files; use file -z to see what's inside.
(0003672)
Tamaranch   
2021-11-14 11:38   
Oh right, and so the `MAGIC_COMPRESS` flag for libmagic.
I was misled by the fact that the .svgz (or .svg.gz) files produced by `convert file.svg file.svgz` were recognized directly.
But they are actually still SVG files in this case, unlike the files linked here.

I'll try to push my tests a bit further and read the documentation better before reporting a bug next time, sorry ^^'.
Thanks!
(0003673)
christos   
2021-11-14 12:12   
No fix necessary, files are just compressed.
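
The difference between the default result and the `file -z` / `MAGIC_COMPRESS` result can be sketched in Python as below. This is an illustration only, not libmagic's logic: the `identify` helper, its `look_inside` parameter, and the returned labels are hypothetical, and the SVG test is a naive substring check.

```python
import gzip


def identify(data: bytes, look_inside: bool = False) -> str:
    """Return a rough type label, optionally looking inside gzip data."""
    if data[:2] == b"\x1f\x8b":  # gzip magic bytes
        if not look_inside:
            return "application/gzip"
        # Decompress and classify the payload instead of the container.
        return identify(gzip.decompress(data), look_inside) + " (gzip compressed)"
    if b"<svg" in data[:1024]:
        return "image/svg+xml"
    return "application/octet-stream"


svg = b'<?xml version="1.0"?><svg xmlns="http://www.w3.org/2000/svg"/>'
svgz = gzip.compress(svg)
print(identify(svgz))
print(identify(svgz, look_inside=True))
```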


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
286 [file] General minor always 2021-09-04 15:53 2021-11-14 11:57
Reporter: maxicarlos0 Platform: 64 Bit  
Assigned To: christos OS: Arch Linux  
Priority: normal OS Version: up to date  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: MIME of raw images detected as image/tiff
Description: Recently, `file` started to detect raw images (NEF and ARW tested) as image/tiff.
This breaks a lot of image viewing programs such as Gwenview, lximage, Okular, and many more.

This wasn't like that always, I remember being able to correctly detect raw images a month ago or so.

Thanks in advance.
Tags:
Steps To Reproduce: Get a raw image (eg. http://www.luminescentphoto.com/nx2/nefs.html)
run `file RAW_IMAGE` or `file --mime-type RAW_IMAGE`
The output will say that it's a image/tiff image when it should be image/x-nikon-nef
Additional Information: `mimetype` recognizes the MIME correctly...
System Description
Attached Files:
Notes
(0003643)
christos   
2021-09-11 19:33   
I don't think that file(1) ever reported anything else but tiff for these files. They are tiff files after all...
(0003644)
maxicarlos0   
2021-09-11 20:48   
you might be right, I just checked and the last time file got updated was something around May, way before this problem started happening (in KDE)
(0003653)
christos   
2021-10-28 15:33   
Feedback timeout.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
299 [file] General minor always 2021-11-12 08:24 2021-11-13 17:48
Reporter: adepasquale Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.41  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Separate magic for uuencode and xxencode
Description: Current file 5.41 doesn't distinguish between uuencode and xxencode.

I have updated the magic files based on existing comments, see attached patch.

Note that some archivers (e.g. https://wiki.powerarchiver.com/en:help:main:tools:uuencode_xxencode_mime_base64_yenc) do not have the "begin " keyword at zero, so I had to use a regex/1024.
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: uuencode.patch (1,276 bytes) 2021-11-12 08:24
https://bugs.astron.com/file_download.php?file_id=238&type=bug
Notes
(0003668)
christos   
2021-11-13 17:48   
Applied, thanks!
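
One way to tell the two encodings apart, sketched in Python. This is not the logic of the attached patch (which is not shown here); it relies on the fact that both formats start each data line with a length character, drawn from different alphabets, followed by 4 characters per 3 input bytes. `classify_data_line` and `_fits` are hypothetical helper names.

```python
import binascii
import math

XX_ALPHABET = "+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def _fits(line: str, nbytes: int) -> bool:
    # A data line is one length character plus 4 characters per 3 bytes.
    return len(line) == 1 + 4 * math.ceil(nbytes / 3)


def classify_data_line(line: str) -> str:
    """Guess the encoding of a single data line (the line after 'begin ...')."""
    line = line.rstrip("\r\n")
    if not line:
        return "unknown"
    first = line[0]
    # uuencode stores the byte count as chr(32 + n); '`' stands for 0.
    uu = 32 <= ord(first) <= 96 and _fits(line, (ord(first) - 32) & 63)
    # xxencode stores the byte count as an index into its own alphabet.
    xx = first in XX_ALPHABET and _fits(line, XX_ALPHABET.index(first))
    if uu and not xx:
        return "uuencode"
    if xx and not uu:
        return "xxencode"
    return "ambiguous" if uu and xx else "unknown"


uu_line = binascii.b2a_uu(bytes(range(45))).decode("ascii")
xx_line = "h" + "+" * 60  # 45 zero bytes in xxencode
print(classify_data_line(uu_line), classify_data_line(xx_line))
```

A full 45-byte line starts with `M` in uuencode and `h` in xxencode, which is why the length check separates the two in the common case.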


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
292 [file] General minor always 2021-10-18 11:00 2021-10-28 16:41
Reporter: mikewalrus Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: HTML files with LaTeX keywords are identified as LaTeX files.
Description: An HTML file containing magic entries like `\begin' is identified as a "LaTeX document".
Tags: magic
Steps To Reproduce: Save the following to a.html, and run file on it.

<!doctype html>
<html>
<head>
<title>title</title>
</head>
<body>


\begin{a}


</body>
</html>
Additional Information:
Attached Files:
Notes
(0003666)
christos   
2021-10-28 16:41   
Bumped the strength of HTML to beat LaTeX, but it is unclear if this is a win considering the opposite case :-)


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
296 [file] General major always 2021-10-24 10:21 2021-10-28 16:39
Reporter: eliz Platform: MinGW  
Assigned To: christos OS: MS-Windows  
Priority: normal OS Version: XPSP3  
Status: assigned Product Version: 5.40  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Problems building file-5.41 with MinGW on MS-Windows
Description: I've built the latest version 5.41 of file natively on MS-Windows using MinGW tools. (Yes, version 5.41; the bug tracker doesn't allow to select that version when reporting an issue.)

I've found several problems while building, described below. Let me know if you want me to attach proposed patches for any of those.

First, there are multiple compilation warnings due to C99 formatted output features, like the '#' flag and the %z or %j descriptors. (file.h has portability macros for taking care of that, but they are not used everywhere, and don't cover the '#' flag, for example.)

Next, compress.c triggers several warnings because variables and functions used only if HAVE_FORK are declared or defined without that conditional, so they are unused in a build without HAVE_FORK.

The function 'sread' uses 'ioctl' for sockets and pipes, but that cannot compile on Windows, and won't work even if tweaked (e.g., 'select' doesn't work on Windows pipes). So this needs to be #ifdef'ed away. Same for calls to 'fcntl' in funcs.c.

In magic.c, I added support for the HOME environment variable on MS-Windows. While this variable is not normally set on Windows, many users of ported Unix software have it set, so it is useful to support that.

The function 'unreadable_info' in magic.c calls 'access' with X_OK, which doesn't work well, or not at all, on MS-Windows, so I replaced it with a simple test of the file's extension to detect executables based on that.

Thanks for developing this package.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: DIFFS.mingw (12,033 bytes) 2021-10-28 16:39
https://bugs.astron.com/file_download.php?file_id=237&type=bug
Notes
(0003657)
christos   
2021-10-28 15:36   
Sure, please send patches.
(0003665)
eliz   
2021-10-28 16:39   
I attach patches for MinGW-related issues.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
297 [file] General major always 2021-10-24 13:52 2021-10-28 16:34
Reporter: eliz Platform: MinGW  
Assigned To: christos OS: MS-Windows  
Priority: normal OS Version: XPSP3  
Status: assigned Product Version: 5.40  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Looking inside compressed files disabled in builds which don't define HAVE_FORK
Description: Building version 5.41 on a system that doesn't have 'fork' disables built-in support for accessing compressed files, even if decompression libraries (zlib, liblzma, etc.) are available and enabled during configure run. For example 'file_zmagic' isn't even called if HAVE_FORK is not defined to a non-zero value.

I've reshuffled the various #define's and made some minor changes to the code to enable built-in decompression support when HAVE_FORK is not available. Let me know if you are interested in the patch to do that.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: DIFFS.fork (6,155 bytes) 2021-10-28 16:34
https://bugs.astron.com/file_download.php?file_id=236&type=bug
Notes
(0003651)
christos   
2021-10-28 15:31   
Sure, please send me the patch.
(0003663)
eliz   
2021-10-28 16:34   
I attach the patches for the 'fork' problem.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
295 [file] General minor always 2021-10-23 17:31 2021-10-28 16:20
Reporter: eliz Platform: MS-Windows  
Assigned To: christos OS: XP  
Priority: normal OS Version: SP3  
Status: assigned Product Version: 5.41  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: pgp-binary-key-v4-dsa test fails due to incorrect expected result
Description: This test fails:

     Running test: ../tests/pgp-binary-key-v4-dsa.testfile
     ../tests/pgp-binary-key-v4-dsa.testfile: OpenPGP Public Key Version 4, Created Mon Apr 07 22:23:01 1997, DSA (1024 bits); User ID; Signature; OpenPGP Certificate
     test.exe: ERROR: result was (len 120)
     OpenPGP Public Key Version 4, Created Mon Apr 07 22:23:01 1997, DSA (1024 bits); User ID; Signature; OpenPGP Certificate
     expected (len 121)

This is because the expected result says "Mon Apr  7" instead of "Mon Apr 07" (2 spaces instead of a space and a zero). Correcting the expected result makes the test succeed.
Tags:
Steps To Reproduce: make check
Additional Information:
Attached Files:
Notes
(0003650)
eliz   
2021-10-24 05:56   
By the way, this is for version 5.41, not 5.40; but the bug tracker doesn't allow to select 5.41 as the version for the bug.
(0003658)
christos   
2021-10-28 15:43   
Your libc is broken: file(1) just uses ctime/asctime: https://pubs.opengroup.org/onlinepubs/9699919799/
(0003659)
christos   
2021-10-28 15:47   
Added 5.41 to the list of releases.
(0003662)
eliz   
2021-10-28 16:20   
That broken libc is MSVCRT, the C runtime used by MinGW builds on MS-Windows. So I guess this means that test will fail for any MinGW build, unless the expected results are manually "fixed".
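
The two day formats at issue can be shown with a short Python sketch. The C standard defines the asctime day field as `%3d`, i.e. space padded, and Python's `time.asctime` documents the same padding. The `zero_padded_day` helper is hypothetical; it only reproduces the "Apr 07" form reported above for comparison.

```python
import time

# Mon Apr 7 1997 22:23:01 as a struct_time-style tuple:
# (year, month, day, hour, minute, second, weekday, yearday, dst flag).
created = (1997, 4, 7, 22, 23, 1, 0, 97, 0)

# Standard form: the day field is space padded.
standard = time.asctime(created)


def zero_padded_day(stamp: str) -> str:
    """Rewrite 'Mon Apr  7 ...' as 'Mon Apr 07 ...' for comparison."""
    return stamp[:8] + stamp[8:10].replace(" ", "0") + stamp[10:]


print(repr(standard))
print(repr(zero_padded_day(standard)))
```

A test harness that must pass on both runtimes could normalise the day field this way before comparing, though the thread does not say that any such change was made.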


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
294 [file] General feature always 2021-10-21 21:06 2021-10-28 15:53
Reporter: Jamie Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.42  
    Target Version:  
Summary: Magic bytes c64 .dfi format
Description: Hello everyone,

attached is a patch for recognizing the c64 .dfi file format.

Currently such files are reported as data:

    $ file --version
    file-5.40
    magic file from /usr/share/file/misc/magic
    seccomp support included

    $ file --keep-going 10_Years_HVSC.dfi
    10_Years_HVSC.dfi: data

I got the structure from https://www.lemon64.com/forum/viewtopic.php?t=37415&sid=494dc2ca91289e05dadf80a7f8a968fe (at the bottom).
More general information about the format can be found at https://www.c64-wiki.com/wiki/DreamLoad.

An example file can be found in the HVSC Commodore 64 music collection, for example at https://kohina.duckdns.org/HVSC/C64Music/10_Years_HVSC.dfi.

Do you think it makes sense to include this?

Best,
Jamie
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: c64.patch (476 bytes) 2021-10-21 21:06
https://bugs.astron.com/file_download.php?file_id=235&type=bug
Notes
(0003649)
Jamie   
2021-10-21 21:09   
With the patch applied, the output is as follows:

    $ file --magic-file tmp.magic 10_Years_HVSC.dfi
    10_Years_HVSC.dfi: DFI Image version: 1.0 tracks: 4
(0003661)
christos   
2021-10-28 15:53   
Added!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
276 [file] General minor always 2021-07-26 15:24 2021-10-28 15:33
Reporter: abathur Platform: intel/x86_64  
Assigned To: christos OS: macOS  
Priority: normal OS Version: 10.15  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: between 5.37 and 5.39, file starts identifying bin/sh script with patched shebang as awk/perl
Description: I noticed a patched copy of esh 0.1.1 getting identified as an "awk or perl script" by newer versions of file (confirmed I see this in file 5.39 from nixpkgs and file 5.40 from homebrew).

I've attached an unpatched copy named `esh` and a patched copy named `nix_esh`, but I assume the salient part is the fact that it has had its shebang patched from /bin/sh to /nix/store/pcjan45rssdn01cxx3sjg70avjg6c3ni-bash-4.4-p23/bin/sh

Output here is captured on macOS, but I've confirmed the same behavior with file 5.39 in Linux (NixOS).
Tags:
Steps To Reproduce: # system/macOS file
$ file --version
file-5.37
magic file from /usr/share/file/magic

# unpatched /bin/sh shebang
$ file esh
esh: POSIX shell script text executable, ASCII text

# shebang patched by Nix to /nix/store/pcjan45rssdn01cxx3sjg70avjg6c3ni-bash-4.4-p23/bin/sh
$ file nix_esh
nix_esh: a /nix/store/pcjan45rssdn01cxx3sjg70avjg6c3ni-bash-4.4-p23/bin/sh script text executable, ASCII text

# file from nixpkgs
$ file --version
file-5.39
magic file from /nix/store/77p3lid93i5xjgdi9vkj3zqcpf2zddlw-file-5.39/share/misc/magic

# unpatched /bin/sh shebang
$ file esh
esh: POSIX shell script, ASCII text executable

# shebang patched by Nix to /nix/store/pcjan45rssdn01cxx3sjg70avjg6c3ni-bash-4.4-p23/bin/sh
$ file nix_esh
nix_esh: awk or perl script, ASCII text

# file from homebrew
$ file --version
file-5.40
magic file from /usr/local/Cellar/libmagic/5.40/share/misc/magic

# unpatched /bin/sh shebang
$ file esh
esh: POSIX shell script, ASCII text executable

# shebang patched by Nix to /nix/store/pcjan45rssdn01cxx3sjg70avjg6c3ni-bash-4.4-p23/bin/sh
$ file nix_esh
nix_esh: awk or perl script, ASCII text
Additional Information:
Attached Files: nix_esh (4,710 bytes) 2021-07-26 15:24
https://bugs.astron.com/file_download.php?file_id=232&type=bug
esh (4,302 bytes) 2021-07-26 15:24
https://bugs.astron.com/file_download.php?file_id=231&type=bug
Notes
(0003630)
christos   
2021-07-30 08:44   
With the HEAD of the file code this reports:
$ ./file -m ../magic/magic.mgc ~/nix_esh
/Users/christos/nix_esh: a /nix/store/pcjan45rssdn01cxx3sjg70avjg6c3ni-bash-4.4-p23/bin/sh script, ASCII text executable
(0003633)
abathur   
2021-07-31 22:28   
Thanks--I'll keep an eye out for the next release.

(I do see the same after figuring out how to rebuild the Nix package from the latest commit on the GH mirror).
(0003654)
christos   
2021-10-28 15:33   
Release has been out.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
290 [file] General minor always 2021-09-20 13:34 2021-10-28 15:32
Reporter: ChaoticRoman Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Wrong installation instructions in both INSTALL and README.DEVELOPER files
Description: In the INSTALL, there are generic "./configure; make; make install" instructions but it seems that correct process is

autoreconf -f -i
./configure --disable-silent-rules
make
make install

In the README.DEVELOPER, there is

    autoreconf -f -i
    make distclean
    ./configure --disable-silent-rules
    make -j4
    make -C tests check

but "make distclean" would complain "make: *** No rule to make target 'distclean'".

Tested on Ubuntu 20.04 by myself.

Original report: https://stackoverflow.com/questions/69222631/46-regex-error-17-for-dryad-bibo-v0-9-0-9-match-failed
Tags:
Steps To Reproduce: ./configure
autoreconf -f -i
make distclean
Additional Information:
Attached Files:
Notes
(0003645)
christos   
2021-09-20 14:05   
I added a comment in README.DEVELOPER that it can fail. The INSTALL file is fine because it instructs you to use 'make distclean' to clean up after a previous build.
(0003652)
christos   
2021-10-28 15:32   
Feedback timeout.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
288 [file] General minor always 2021-09-13 07:18 2021-09-20 17:46
Reporter: Cirn09 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Naming conflict when compiling libmagic as Windows static library
Description: hello,
I want to build `libmagic` as a static library for Windows. But `libmagic` defines `DllMain` in `magic.c`:
```
/* Placate GCC by offering a sacrificial previous prototype */
BOOL WINAPI DllMain(HINSTANCE, DWORD, LPVOID);

BOOL WINAPI
DllMain(HINSTANCE hinstDLL, DWORD fdwReason,
    LPVOID lpvReserved __attribute__((__unused__)))
{
    if (fdwReason == DLL_PROCESS_ATTACH)
        _w32_dll_instance = hinstDLL;
    return 1;
}
```
This causes naming conflicts when linking libmagic to a dynamic library:
> `libmagic.lib(magic.obj) : error LNK2005: DllMain already defined in dllmain.obj`

`DllMain` is only used to initialize `_w32_dll_instance`.
In fact, for static libraries, `_w32_dll_instance` is not needed.
`_w32_dll_instance` is only used in `get_default_magic`:

```
private const char *
get_default_magic(void)
{
...
    /* Fourth, try to get magic file relative to exe location */
        _w32_get_magic_relative_to(&hmagicpath, NULL);

    /* Fifth, try to get magic file relative to dll location */
        _w32_get_magic_relative_to(&hmagicpath, _w32_dll_instance);
...
static void
_w32_get_magic_relative_to(char **hmagicpath, HINSTANCE module)
{
...
    if (!GetModuleFileNameA(module, dllpath, MAX_PATH))
...
```
> Parameters
> hModule
> A handle to the loaded module whose path is being requested. If this parameter is **NULL**,
> GetModuleFileName retrieves the path of the executable file of the current process.
> https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-getmodulefilenamea

I think solving this problem requires a new macro guard, like:
```
#ifndef BUILD_AS_WINDOWS_STATIC_LIBARAY
BOOL WINAPI DllMain(HINSTANCE, DWORD, LPVOID)
...
#endif
```
Tags: build
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003646)
christos   
2021-09-20 17:46   
Added as suggested, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
287 [file] General minor always 2021-09-09 17:23 2021-09-11 19:20
Reporter: alealbonico Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: MIME of message/rfc822 detected as text/plain
Description: Email files with MIME message/rfc822 are being detected as text/plain because said file's header starts with "Date:" instead of "From:" or "Received:".

I checked the documentation for the standard way to write a rfc822 message and it looks like date is actually allowed to be on the first line, too.

Thanks in advance.
Tags:
Steps To Reproduce: Get a rfc822 file and place the header date as the first line in it.

Example of file:
-------

Date: Fri, 07 Aug 2020 02:09:32 +0100
From: "xxxx" <xxxx@gmail.com>
To: "yyyy" <yyyy@gmail.com>
Subject: zzzzzzzz
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="--464994466596adLKMdfn3566452152"
X-Rejection-Reason: zzzzzz

----464994466596adLKMdfn3566452152
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

[redacted]
----464994466596adLKMdfn3566452152
Content-Type: application/msword; name="xxxx"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="xxxx"

xyz

----464994466596adLKMdfn3566452152--

-------

Try to get the MIME type.
Output: text/plain

Notice that removing date and leaving "from" on top instead will return the correct output.
Additional Information:
Attached Files:
Notes
(0003642)
christos   
2021-09-11 19:20   
Fixed, thanks!
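
The report above hinges on which header field is accepted as the first line of a message. The following Python sketch illustrates the idea of checking the first header name against a set that includes "Date". The set is an illustrative assumption, not the list used by file(1), and the function names are hypothetical.

```python
# Header field names that may plausibly open a message; an illustrative,
# non-exhaustive set (assumption, not the list used by file(1)).
LEADING_HEADERS = {"date", "from", "received", "return-path", "message-id"}


def first_header_name(text):
    """Return the lower-cased field name of the first non-blank line, or None."""
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines, as in the sample above
        name, sep, _ = line.partition(":")
        if sep and name and " " not in name:
            return name.lower()
        return None
    return None


def looks_like_rfc822(text):
    """Heuristic: the first header is one commonly seen at the top of a message."""
    return first_header_name(text) in LEADING_HEADERS
```

With such a check, a message starting with "Date:" is treated the same way as one starting with "From:" or "Received:".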


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
285 [file] General block always 2021-08-26 09:25 2021-09-09 17:49
Reporter: Benjamin Platform: Linux  
Assigned To: christos OS: Ubuntu  
Priority: normal OS Version: Ubuntu 18.04.5  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Python binding of libmagic crashes when trying to detect encoding in multiple threads
Description: Hello,

When I use the following code in a multithreaded environment (some Django workers using Huey or Dramatiq)

magic.detect_from_content(
            bytes_file.read(MAX_BYTES)
        ).encoding

I get errors like:

  File "/home/benjamin/.cache/pypoetry/virtualenvs/qvNem8AN-py3.9/lib/python3.9/site-packages/magic.py", line 284, in detect_from_content
    return _create_filemagic(mime_magic.buffer(byte_content),
  File "/home/benjamin/.cache/pypoetry/virtualenvs/qvNem8AN-py3.9/lib/python3.9/site-packages/magic.py", line 251, in _create_filemagic
    mime_type, mime_encoding = mime_detected.split('; ')
ValueError: too many values to unpack (expected 2)

With the reproducible code given in this ticket, the errors can differ ("munmap_chunk(): invalid pointer", "free(): invalid size", "double free or corruption (fasttop)")


It works correctly on a single thread.

Tags: magic
Steps To Reproduce:

import threading

import magic

MAX_BYTES = 4096


def thread_function():
    with open("/tmp/foo.csv", "rb") as bytes_file:
        print(magic.detect_from_content(
            bytes_file.read(MAX_BYTES)
        ).encoding)


if __name__ == '__main__':
    threads = list()
    for index in range(3):
        thread = threading.Thread(target=thread_function)
        threads.append(thread)
        thread.start()

    for index, thread in enumerate(threads):
        thread.join()
Additional Information:
My Ubuntu libmagic1 version is: 1:5.32-2ubuntu0.4
My file-magic Python package is 0.4.0
Attached Files:
Notes
(0003641)
christos   
2021-09-09 17:49   
Thanks, fixed in HEAD. Will release as 0.4.1 when I release the next version of file(1).
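
Until a fixed binding is installed, callers could serialize access to the detector with a lock. This is a minimal sketch of that workaround, not the upstream fix; `locked` and `fake_detect` are hypothetical names, and `fake_detect` merely stands in for `magic.detect_from_content` so the example runs without libmagic.

```python
import threading

# One lock shared by all threads; this serializes detection calls.
_detect_lock = threading.Lock()


def locked(func):
    """Wrap func so that only one thread runs it at a time."""
    def wrapper(*args, **kwargs):
        with _detect_lock:
            return func(*args, **kwargs)
    return wrapper


def fake_detect(content):
    """Stand-in for magic.detect_from_content, used here for illustration only."""
    return "binary" if b"\0" in content else "us-ascii"


safe_detect = locked(fake_detect)
```

In real code the wrapper would be applied as `locked(magic.detect_from_content)`. The cost is that detection no longer runs concurrently.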


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
284 [file] General feature always 2021-08-18 23:35 2021-08-24 09:25
Reporter: ntavares Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: New file format - SER video sequence
Description: Hi there,

I found it strange that there was no magic for SER files, please add it. This is a very popular video format among astrophotographers.

It's a video file, so I'm not attaching a sample. I could provide the header from a real file, though, if it's useful.

Just let me know.
-NT
Tags:
Steps To Reproduce: # SER file format - simple uncompressed video format for astronomical use
# Initially developed by Lucam Recorder, as of 2021 maintained by Heiko Wilkens, Grischa Hahn
# Typical extensions: .SER
# V3 - http://www.grischa-hahn.homepage.t-online.de/astro/ser/SER%20Doc%20V3b.pdf

0 string LUCAM-RECORDER SER video sequence
>18 lelong 0 \b, bayer: mono
>18 lelong 8 \b, bayer: RGGB
>18 lelong 9 \b, bayer: GRBG
>18 lelong 10 \b, bayer: GBRG
>18 lelong 11 \b, bayer: BGGR
>18 lelong 16 \b, bayer: CYYM
>18 lelong 17 \b, bayer: YCMY
>18 lelong 18 \b, bayer: YMCY
>18 lelong 19 \b, bayer: MYYC
>18 lelong 100 \b, bayer: RGB
>18 lelong 101 \b, bayer: BGR
>22 lelong 0 \b, big-endian
>22 lelong 1 \b, little-endian
>26 lelong x \b, width: %d
>30 lelong x \b, height: %d
>34 lelong x \b, %d bit
>38 lelong x \b, frames: %d
Additional Information:
Attached Files:
Notes
(0003639)
christos   
2021-08-24 09:25   
Added, thanks!
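
For readers who want to check the offsets in the proposed magic, the same header fields can be decoded with a few lines of Python. This is a sketch that mirrors the magic entries in this ticket (offsets 18 to 38, little-endian 32-bit values); `parse_ser_header` is a hypothetical helper, not part of file(1).

```python
import struct

# Colour IDs copied from the magic entries in this ticket.
BAYER = {0: "mono", 8: "RGGB", 9: "GRBG", 10: "GBRG", 11: "BGGR",
         16: "CYYM", 17: "YCMY", 18: "YMCY", 19: "MYYC",
         100: "RGB", 101: "BGR"}


def parse_ser_header(data):
    """Decode the fields the magic entries print; return None if not SER."""
    if len(data) < 42 or not data.startswith(b"LUCAM-RECORDER"):
        return None
    # Six little-endian 32-bit integers at offsets 18, 22, 26, 30, 34, 38.
    color, endian, width, height, depth, frames = struct.unpack_from("<6i", data, 18)
    return {
        "bayer": BAYER.get(color, "unknown"),
        "endian": {0: "big-endian", 1: "little-endian"}.get(endian, "unknown"),
        "width": width, "height": height, "depth": depth, "frames": frames,
    }
```

The endianness labels follow the magic lines as submitted (0 printed as big-endian, 1 as little-endian).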


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
281 [file] General trivial always 2021-08-11 16:08 2021-08-16 10:20
Reporter: pwinckles Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: epub files report duplicate format names
Description: When you execute file on an epub file the output contains a duplicate format name: EPUB document EPUB document
Tags:
Steps To Reproduce: 1. Get an epub file (example: https://github.com/openpreserve/format-corpus/blob/master/ebooks/calibre%200.9.0/lorem-ipsum.epub)
2. Execute file on the file
3. See output like: lorem-ipsum.epub: EPUB document EPUB document
Additional Information:
Attached Files:
Notes
(0003637)
christos   
2021-08-16 10:20   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
279 [file] General major always 2021-08-05 03:20 2021-08-16 10:13
Reporter: Jayc3Ca0 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Java source code is reported as C source file
Description: Java source code file is reported as C source file by file command.
Tags: magic
Steps To Reproduce: Write the following Java sample code (TestJava.java):

public class TestJava {
    public static void main(String[] args) {
        System.out.println("Hello Java");
    }
}

Use file command of the latest version to check its type:

$ file TestJava.java
TestJava.java: C source, ASCII text


Additional Information: Tested on macOS 11.3.1 with version 5.40, which is installed by homebrew.
Also tested on CentOS 7 with version 5.11.37, which is installed by yum.
Attached Files:
Notes
(0003636)
christos   
2021-08-16 10:13   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
280 [file] General trivial always 2021-08-09 21:55 2021-08-16 10:07
Reporter: rouca Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Detect silverlight
Description: Hi,

I just sent a mail to the mailing list; it is copied and pasted here:
In order to remove problematic sources from Debian, we want to detect
compiled Silverlight programs.

Some example:
http://cespage.com/silverlight/tutorials/sl4tut1.xap
https://www.microsoft.com/silverlight/new-controls/demo/System.Windows.Controls.Samples.xap

It seems that AppManifest.xaml string is present in the zip file
(uncompressed)...

Thanks

Bastien
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003635)
christos   
2021-08-16 10:07   
Fixed, thanks!
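
The observation in this request (a .xap file is a ZIP archive containing AppManifest.xaml) can also be checked outside of file(1). The Python sketch below looks at the archive's member list rather than searching the raw bytes as a magic entry would; `looks_like_silverlight_xap` is a hypothetical name.

```python
import io
import zipfile


def looks_like_silverlight_xap(data):
    """Return True if data is a ZIP archive containing AppManifest.xaml."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return "AppManifest.xaml" in archive.namelist()
    except zipfile.BadZipFile:
        return False
```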


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
277 [file] General feature N/A 2021-07-26 18:07 2021-07-30 12:25
Reporter: polluks Platform: PowerBook5,8  
Assigned To: christos OS: MorphOS  
Priority: normal OS Version: 3.15  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: More Plan 9
Description: Added object files
Tags: magic
Steps To Reproduce:
Additional Information:
System Description
Attached Files: plan9 (1,146 bytes) 2021-07-26 18:07
https://bugs.astron.com/file_download.php?file_id=233&type=bug
Notes
(0003632)
christos   
2021-07-30 12:25   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
275 [file] General minor always 2021-07-19 13:50 2021-07-30 11:47
Reporter: pwinckles Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: PDFs with /Filter/FlateDecode streams are incorrectly marked as "password protected"
Description: This commit[1] introduces a change that appears to label PDFs containing /Filter/FlateDecode streams as "password protected". I'm no PDF expert, but this designation seems incorrect as it just indicates that the stream is compressed. It may or may not be encrypted.


[1] https://github.com/file/file/commit/629972a91e05fcad8a1b5d906344838539b5f7ab#diff-1d80c89187edc2a2fab5b3ef59fadc199e03d7c8319e7e41e2bd1f329c00fee7
Tags:
Steps To Reproduce: 1. Download the following example PDF: https://github.com/harvard-lts/fits/blob/dev/testfiles/PDF_embedded_resources.pdf
2. Execute: file PDF_embedded_resources.pdf
3. See the following output, even though the file is not password protected: PDF_embedded_resources.pdf: PDF document, version 1.6 (password protected)
Additional Information:
Attached Files:
Notes
(0003631)
christos   
2021-07-30 11:47   
Fixed, thanks!
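
To illustrate the distinction raised in this report: in PDF, /FlateDecode only names a compression filter, whereas an encrypted document carries an /Encrypt entry in its trailer. The Python sketch below reports the two separately using a naive byte search, so it is a rough heuristic under that assumption and not how file(1) decides; `pdf_flags` is a hypothetical name.

```python
def pdf_flags(data):
    """Report, separately, whether a PDF mentions compression and encryption."""
    return {
        # /FlateDecode only means some stream is zlib-compressed.
        "compressed": b"/FlateDecode" in data,
        # An /Encrypt entry in the trailer is what marks an encrypted document.
        "encrypted": b"/Encrypt" in data,
    }
```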


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
239 [file] General feature N/A 2021-02-18 22:57 2021-07-27 22:20
Reporter: polluks Platform: PowerBook5,8  
Assigned To: christos OS: MorphOS  
Priority: normal OS Version: 3.15  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: More IFF magic again
Description: --- iff.bak 2020-10-27 00:41:15 +0100
+++ iff 2021-02-18 22:41:57 +0100
@@ -44,6 +44,7 @@
 >8 string FANT \b, Fantavision animation
 >8 string ACBM \b, ACBM continuous image
 >8 string FAXX \b, FAXX fax image
+>8 string STFX \b, ST-Fax image
 # other formats
 >8 string FTXT \b, FTXT formatted text
 >8 string CTLG \b, CTLG message catalog
Tags: magic
Steps To Reproduce:
Additional Information:
System Description
Attached Files:
Notes
(0003554)
polluks   
2021-02-20 17:34   
Update
--- iff.bak 2020-10-27 00:41:15 +0100
+++ iff 2021-02-20 18:28:42 +0100
@@ -44,6 +44,7 @@
 >8 string FANT \b, Fantavision animation
 >8 string ACBM \b, ACBM continuous image
 >8 string FAXX \b, FAXX fax image
+>8 string STFX \b, ST-Fax image
 # other formats
 >8 string FTXT \b, FTXT formatted text
 >8 string CTLG \b, CTLG message catalog
@@ -54,6 +55,7 @@
 >8 string WZRD \b, WZRD StormWIZARD resource
 >8 string DOC\ \b, DOC desktop publishing document
 >8 string SWRT \b, SWRT Final Copy/Writer document
+>8 string WORD \b, ProWrite document
 >8 string WTXT \b, WTXT Wordworth document
 >8 string WOWO \b, WOWO Wordworth document
 >8 string WVQA \b, Westwood Studios VQA Multimedia,
(0003556)
christos   
2021-02-23 01:07   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
274 [tcsh] General minor always 2021-07-09 12:04 2021-07-09 16:10
Reporter: polarnik Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 6.23.00  
    Target Version:  
Summary: Regression: slashes dropped on directory tab completion
Description: A commit on 2021-07-05 introduced a regression, in that terminal slashes are now dropped on directory tab completion.
Tags:
Steps To Reproduce: $ cd /bin/<tab>

expected: dir1/ dir2/ ...
actual: dir1 dir2
Additional Information: The last known working version is ed9ba69fe360d5b1110e4b1e71995ccf3eb72925.

The regression was introduced by one of two commits on 2021-07-05:

b5160d8de71e29c7cfa29efd26bcf149863ac544 -or- 9ad196ca789236217a9d7d330bc08a4e0b38bd57


----------

A subset of my settings:

set autoexpand
set autolist
set filec
complete cd 'p/1/d/'
Attached Files:
Notes
(0003627)
polarnik   
2021-07-09 12:08   
Minor correction: in steps to repro, it should have been a directory with subdirs: e.g. cd /<tab>
(0003628)
christos   
2021-07-09 16:10   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
273 [file] General minor have not tried 2021-07-04 18:25 2021-07-05 11:56
Reporter: timothee Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: nim files reported as ASCII text
Description: nim files reported as ASCII text
eg: this file https://github.com/nim-lang/Nim/blob/devel/koch.nim
should be reported as
nim program text, ASCII text
instead of: ASCII text


Tags:
Steps To Reproduce: create a file koch.nim with a nim extension (or download https://github.com/nim-lang/Nim/blob/devel/koch.nim)
type `file koch.nim`
shows:
koch.nim: ASCII text
Additional Information:
Attached Files:
Notes
(0003625)
christos   
2021-07-05 09:49   
Committed some magic.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
258 [tcsh] General feature always 2021-04-10 17:08 2021-07-05 10:41
Reporter: ajr Platform: Mac  
Assigned To: christos OS: macOS Big Sur  
Priority: normal OS Version: 11.2.3  
Status: resolved Product Version: 6.22.03  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 6.23.00  
    Target Version:  
Summary: ls-F and file expansion do not support 'ln=target' in LS_COLORS
Description: GNU ls and bash allow for 'ln=target' in LS_COLORS. This sets the color to that of the file pointed to while maintaining the '@' suffix. In tcsh it comes out as "argetm" since it assembles the color string as "\e[targetm".

Changes to tw.decls.h, tw.color.c, and tw.parse.c (attached) implement support for ln=target
Tags:
Steps To Reproduce:
Additional Information: Diffs can be seen at https://github.com/ajrosen/tcsh/commit/b516f30f4849267a1e953c4f3fa613805415bf82
Attached Files: tw.parse.c (58,277 bytes) 2021-04-10 17:08
https://bugs.astron.com/file_download.php?file_id=219&type=bug
tw.decls.h (5,030 bytes) 2021-04-10 17:08
https://bugs.astron.com/file_download.php?file_id=218&type=bug
tw.color.c (13,267 bytes) 2021-04-10 17:08
https://bugs.astron.com/file_download.php?file_id=217&type=bug
Notes
(0003626)
christos   
2021-07-05 10:41   
committed, thanks!
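
The behaviour described in this request can be sketched as follows: parse LS_COLORS, and when the symlink entry is the literal word "target", substitute the colour of the pointed-to file instead of pasting the word into the escape sequence. This is an illustration in Python, not the tcsh patch; both function names are hypothetical.

```python
def parse_ls_colors(value):
    """Split an LS_COLORS string such as 'di=01;34:ln=target' into a dict."""
    entries = {}
    for item in value.split(":"):
        key, sep, code = item.partition("=")
        if sep and key:
            entries[key] = code
    return entries


def symlink_color(colors, target_kind):
    """Return the escape sequence for a symlink, honouring 'ln=target'.

    target_kind is the LS_COLORS key of the file the link points to, e.g. 'di'.
    """
    code = colors.get("ln", "")
    if code == "target":
        # Use the colour of the pointed-to file instead of emitting ESC[targetm.
        code = colors.get(target_kind, "")
    return "\x1b[" + code + "m" if code else ""
```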


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
272 [file] General minor always 2021-06-28 09:27 2021-07-01 07:52
Reporter: kiefermat Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Missing separator in multiple mimetypes for some files
Description: When getting the mimetype of an mp3 file (e.g. https://filesamples.com/samples/audio/mp3/sample1.mp3) using the -k flag ("file -k --mime-type sample1.mp3"), file 5.40 reports the following mimetype:
audio/mpegapplication/octet-stream

Version 5.39 reports it as
audio/mpeg
- application/octet-stream
Tags:
Steps To Reproduce: file -k --mime-type sample1.mp3
Additional Information:
Attached Files:
Notes
(0003623)
christos   
2021-07-01 07:52   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
268 [file] General feature N/A 2021-06-03 12:04 2021-06-30 12:01
Reporter: benedikt Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: reopened  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Magic number and MIME type registration for Resilient Logic
Description: To whom I may concern,

I have just registered the MIME-type application/vnd.resilient.logic for my company.
The official entry at IANA can be found here: https://www.iana.org/assignments/media-types/application/vnd.resilient.logic

I'd like to have it included in the list of magic numbers that file includes.
The magic number sequence of this file format is 0x07, 0x52, 0x4c, 0x4d.


Best regards,
Benedikt Muessig
Resilient TechEd GmbH
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: resilient_logic_test.rlm (87 bytes) 2021-06-30 10:40
https://bugs.astron.com/file_download.php?file_id=230&type=bug
Notes
(0003617)
christos   
2021-06-30 10:23   
Thanks, added magic, but without an example file, I can't test.
(0003619)
benedikt   
2021-06-30 10:39   
Hello Christos,

thank you for taking care of adding the magic number.
I am sorry for not providing you with an example file.

I've just uploaded one here: https://c.gmx.net/@702592736817053755/K7lhZLD2Rqujc5BS-bbm0w


Best regards,
Benedikt
(0003620)
benedikt   
2021-06-30 10:40   
For some reason the file upload section did not show up earlier. This is a re-upload of the other file.

BR,
Benedikt
(0003621)
benedikt   
2021-06-30 11:37   
I've got two more questions, if you don't mind:

In the updated mime file, it says ">4 beshort x \b, version %d".
I suppose that beshort means 16 bit, big endian.
The file version though is just one 8-bit byte, where the high 4 bits are the major and the low 4 bits the minor version.

The other question is, if it is possible to add the mime-type, so that it works with `file --mime-type`?

Best regards and thanks,
Benedikt
(0003622)
christos   
2021-06-30 12:01   
Should be all better now.
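
The version layout described in note 0003621 (one byte, high 4 bits major, low 4 bits minor) can be decoded as in the Python sketch below. The offset of 4 is an assumption taken from the ">4 beshort" line quoted in that note, and `parse_rlm_version` is a hypothetical helper.

```python
RLM_MAGIC = bytes([0x07, 0x52, 0x4C, 0x4D])  # magic sequence from the ticket


def parse_rlm_version(data):
    """Return (major, minor) from the version byte, or None if not an RLM file."""
    if len(data) < 5 or data[:4] != RLM_MAGIC:
        return None
    version = data[4]  # single byte at offset 4 (assumed, see the note above)
    return version >> 4, version & 0x0F
```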


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
267 [file] General major always 2021-05-31 17:15 2021-06-30 10:25
Reporter: xexaxo Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: File detects CPIO files as "application/octet-stream"
Description: CPIO files made by bsdtar seem to be detected incorrectly when using --mime.

In particular:
 file --mime foo.cpio
 foo.cpio: application/octet-stream; charset=binary

On the other hand, when omitting the --mime it is detected properly:
 file foo.cpio
 foo.cpio: ASCII cpio archive (SVR4 with no CRC)

The file foo.cpio was created using the following:
 echo test > test-file
 LANG=C bsdtar --uid 0 --gid 0 --null -cf - --format=newc test-file > foo.cpio
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003609)
xexaxo   
2021-05-31 17:19   
In case it matters, xdg-mime correctly detects the file:
 xdg-mime query filetype foo.cpio
 application/x-cpio
(0003610)
polluks   
2021-06-03 21:45   
--- archive.bak 2021-04-26 09:54:42 +0200
+++ archive 2021-06-03 23:43:29 +0200
@@ -169,6 +169,7 @@
 !:mime application/x-cpio # encoding: swapped
 0 string 070707 ASCII cpio archive (pre-SVR4 or odc)
 0 string 070701 ASCII cpio archive (SVR4 with no CRC)
+!:mime application/x-cpio
 0 string 070702 ASCII cpio archive (SVR4 with CRC)
 
 #
(0003611)
xexaxo   
2021-06-04 15:10   
Thanks polluks - your suggestion works like a charm.
Now if I can figure out how this fix can get merged into the official tree...
(0003618)
christos   
2021-06-30 10:25   
Fixed, thanks
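
The patch in note 0003610 works because a `!:mime` line in a magic file applies to the entry directly before it, so each cpio variant needs its own annotation. As a rough Python illustration of the intended result, the sketch below maps the three ASCII cpio magic strings to one MIME type; assigning application/x-cpio to all three is an assumption, and `identify_cpio` is a hypothetical name.

```python
# ASCII cpio magic strings and descriptions from the magic entries quoted above.
CPIO_MAGIC = {
    b"070707": "ASCII cpio archive (pre-SVR4 or odc)",
    b"070701": "ASCII cpio archive (SVR4 with no CRC)",
    b"070702": "ASCII cpio archive (SVR4 with CRC)",
}


def identify_cpio(data):
    """Return (description, mime type) for an ASCII cpio header, else None."""
    description = CPIO_MAGIC.get(data[:6])
    if description is None:
        return None
    return description, "application/x-cpio"
```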


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
269 [file] General crash always 2021-06-07 16:40 2021-06-30 10:12
Reporter: roneyth Platform:  
Assigned To: christos OS:  
Priority: high OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Undefined Behavior: applying zero offset to null pointer
Description: Enabling Undefined Behavior Sanitizer (UBSAN) check for pointer overflow(-fsanitize=pointer-overflow) causes the below error to be detected in file/src/apprentice.c.

/src/apprentice.c:567:43: runtime error: applying zero offset to null pointer
    #0 0x7f9c571ef541 in apprentice_unmap src/apprentice.c:567:43
    #1 0x7f9c571ef34b in mlist_free_one src/apprentice.c:611:3
    #2 0x7f9c571ed261 in mlist_free src/apprentice.c:625:3
    #3 0x7f9c571ed147 in file_ms_free src/apprentice.c:504:3
    #4 0x7f9c572172ae in magic_close src/magic.c:291:2
    #5 0x2f16d5 in main tests/test.c
    #6 0x7f9c56008674 in __libc_start_main libc-start.c
    #7 0x24aeb8 in _start elfstart.S

The code where the error is observed:
    
                CAST(char *, b) <= CAST(char *, p) + map->len)

Tags:
Steps To Reproduce: clang++ -fsanitize=pointer-overflow sourcefile
Additional Information: FWIW, we have thought of a fix such as:
CAST(char *, b) <= (p ? CAST(char *, p) + map->len : CAST(char *, map->len)))
I wonder if there isn't a more elegant solution. Please check the issue and make a fix ASAP.
Attached Files:
Notes
(0003616)
christos   
2021-06-30 10:12   
Fixed, thanks.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
270 [file] General trivial always 2021-06-08 03:41 2021-06-30 10:09
Reporter: ContronThePanda Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: When printing with %s from magic file, some characters are still escaped as \ooo in output even with -r option
Description: Basically exactly what the title says, I came across this while trying to see if I could use a custom magic file to print a terminal bell. However, even if I pass the -r flag, file still escapes the terminal bell as \007 as long as I print it using %s. It's worth noting that it actually prints the bell character when I use the %c pattern. This only applies to %s, for some reason.
Tags: magic
Steps To Reproduce: I've attached both the custom magic file and a file containing a bell character to test it on. You can reproduce this by running `file -rm bell-magic bell-file`.
Additional Information:
Attached Files: bell-magic (15 bytes) 2021-06-08 03:41
https://bugs.astron.com/file_download.php?file_id=227&type=bug
bell-file (2 bytes) 2021-06-08 03:41
https://bugs.astron.com/file_download.php?file_id=226&type=bug
Notes
(0003615)
christos   
2021-06-30 10:09   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
266 [file] General minor always 2021-05-20 22:01 2021-06-08 18:13
Reporter: j2j Platform:  
Assigned To: OS:  
Priority: normal OS Version:  
Status: new Product Version: 5.40  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: False hits by Magdir/pgp-binary-keys
Description: When I run file version 5.40 on some files with the -k option, I
often also get misidentification messages starting with "OpenPGP". See
the attached output OpenPGP-bad-k.txt.

Looking at the sources, I see that such messages are triggered by
magic lines inside Magdir/pgp-binary-keys. These magic lines should
identify OpenPGP files.

The above mentioned examples are handled by starting lines like
 0 ubyte&0xFC =0x94 OpenPGP Secret Key
 >&-1 use primary_key_length_old

After inspecting just one byte, these lines print a message starting with
"OpenPGP Secret Key" and only then do some additional checks by calling a
subroutine like primary_key_length_old. Obviously, checking only 1 byte is
not sufficient. So non-PGP examples with starting byte 95h like
mathemusic and PEDE, and samples starting with 97h like Event.Tdf,
RIRE6.SPL, Rx.GS and Welcome.Snd, are misidentified.
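
To illustrate why the one-byte test is weak, here is a minimal Python sketch of the mask comparison `ubyte&0xFC =0x94` quoted above. It is not code from file; the helper name `first_byte_matches` is made up for the example.

```python
# Illustration only: mimics the magic test "0 ubyte&0xFC =0x94".
# first_byte_matches is a hypothetical helper, not part of file/libmagic.

def first_byte_matches(data: bytes) -> bool:
    """Return True if the first byte passes the mask test."""
    return len(data) > 0 and (data[0] & 0xFC) == 0x94

# Collect every possible first byte that passes; the low two bits are ignored.
matching = [b for b in range(256) if first_byte_matches(bytes([b]))]
print([hex(b) for b in matching])
```

Any file whose first byte is 94h through 97h passes this test, which is consistent with the 95h and 97h samples listed above.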

The consistency check is done later by the subroutine
pgp_binary_key_pk_check, which checks for a valid version range (2-7)
and valid time stamps (after 1990).

The correct way would be to check some possible PGP packets for valid
version and time stamp. If this succeeds then afterwards display some
message text.

Furthermore the samples starting with 97h are also described inside
Magdir/pgp in a more unreliable way by line starting with
 0 byte 0x97 PGP Secret Sub-key -

So once the checking and describing is done by Magdir/pgp-binary-keys,
the corresponding lines should be removed from Magdir/pgp.

Furthermore, with the --extension option the 31 byte string pgp/gpg/pkr/asd
is shown. For public "foo" the extension pkr is used, whereas for secret
"foo" the extension skr is used. So skr is missing in the
following magic line: !:ext pgp/gpg/pkr/asd. And since the effort is made to
inspect the PGP packet for "OpenPGP Public Key" and "OpenPGP Secret
Key", it would be good to display the right
file name extension (pkr or skr) afterwards.

Furthermore, the extension asd is listed as a possibility. As far as I
know, no PGP or related file exists with that extension.

My misidentified examples are stored in the attached archive
OpenPGP-bad.zip.
Tags: PGP
Steps To Reproduce:
Additional Information:
Attached Files: OpenPGP-bad.zip (66,795 bytes) 2021-05-20 22:01
https://bugs.astron.com/file_download.php?file_id=225&type=bug
OpenPGP-bad-k.txt (578 bytes) 2021-05-20 22:01
https://bugs.astron.com/file_download.php?file_id=224&type=bug
0002-For-binary-PGP-keys-only-use-the-pgp-and-gpg-extensi.patch (792 bytes) 2021-06-08 18:12
https://bugs.astron.com/file_download.php?file_id=229&type=bug
0001-Show-information-about-a-PGP-key-only-if-we-have-a-s.patch (2,356 bytes) 2021-06-08 18:12
https://bugs.astron.com/file_download.php?file_id=228&type=bug
Notes
(0003608)
neal   
2021-05-29 21:27   
(I wrote pgp-binary-keys.)

Thanks for the thorough report. I tested a lot of true positives (a large portion of the SKS dump), but it seems I failed to test enough false positives. The code checks a lot of bits, so it should be unambiguous. I suspect the problem is that I just emit "OpenPGP Secret Key" too early.

As for the file extensions, I'm only aware of .pgp and .gpg. The other variants existed in the old version of the code, so I kept them assuming that they used to be used.

I'll take a look in the next few days.
(0003612)
neal   
2021-06-08 18:12   
The issue identifies three problems:

  1. Descriptions in pgp-binary-keys are printed too eagerly.

  2. Descriptions in pgp (PGP Secret Sub-key) are printed too eagerly.

  3. The extensions listed in pgp-binary-keys are wrong.

I've fixed one as j2j suggested. Unfortunately, I can't figure out how to distinguish public and secret keys anymore, because the first byte of the file is not accessible from a function ("use").

The other patch fixes 3. I've changed it to only report pgp and gpg as valid extensions. I've never actually seen skr, pkr or asd used in practice and I've been doing PGP stuff for nearly a decade.
(0003613)
neal   
2021-06-08 18:13   
(I'll take a look at pgp and prune the secret subkey detection and some other stuff that is not actually useful in practice.)


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
196 [file] General minor have not tried 2020-09-01 14:02 2021-05-29 19:09
Reporter: neal Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: Improve pgp binary key detection support
Description: The pgp binary key detection support is broken in numerous ways:

  - It only deals with old-style CTBs (new-style CTBs were introduced in RFC 2440 released in 1998)

  - It only deals with two byte length encoding. With small keys (thanks to ECC), 1 byte length encoding is typical for new keys.

  - It's not terribly robust.

  - It prints the MPI prefix, which is completely meaningless to all users (the fingerprint is more interesting, but it is the hash of the public key, which we can't compute using magic).

The new version checks more data. It checks that the first three packets are sane before emitting a mime type. And, it standardizes the terminology (PGP/GPG key public ring is... unusual).
Tags:
Steps To Reproduce: To test, I downloaded the SKS dump and extracted the first 262753 public keys.

$ cd /tmp
$ GNUPGHOME=$(mktemp -d)
$ for i in `seq 0 100`; do wget https://pgp.key-server.io/dump/current/sks-dump-$(printf %04d $i).pgp; gpg --import sks-dump-$(printf %04d $i).pgp; done
$ mkdir /tmp/keys
$ gpg -k --with-colons | grep '^pub' | awk -F: '{ print $5 }' | while read k; do gpg --export $k > keys/$k.pgp; done

Then I did:

~/src/file$ make && find /tmp/keys/ | xargs src/file -m magic/Magdir/pgp-binary-keys 2>&1 > /tmp/normal-detection.txt; find /tmp/keys/ | xargs src/file --mime-type -m magic/Magdir/pgp-binary-keys 2>&1 > /tmp/mime-detection.txt

I did this using the new version and the version installed in Debian.

My patched version detects all certificates and correctly prints the mime type for all certificates. The original version misses 12095 certificates.
Additional Information:
Attached Files: 0006-Improve-detection-of-OpenPGP-binary-keys.patch (49,321 bytes) 2020-09-01 14:02
https://bugs.astron.com/file_download.php?file_id=176&type=bug
0001-Improve-detection-of-OpenPGP-binary-keys.patch (63,243 bytes) 2020-10-14 12:10
https://bugs.astron.com/file_download.php?file_id=184&type=bug
Notes
(0003472)
neal   
2020-09-01 14:03   
Note: this requires the fixes that I submitted to work.
(0003479)
christos   
2020-09-05 17:27   
The pgp-binary-keys file is missing from the patch?
(0003494)
neal   
2020-10-14 12:10   
Sorry about the delay. I've added pgp-binary-keys file.
(0003495)
christos   
2020-10-14 21:11   
Thanks!
(0003607)
christos   
2021-05-29 19:09   
It appears that we mis-identify many files as pgp keys: see https://bugs.astron.com/view.php?id=266. Is there any way to make the detection stronger?


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
264 [file] General feature have not tried 2021-05-09 22:47 2021-05-10 01:11
Reporter: jbosboom Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version:  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Magic for Python pickle serialization format
Description: Pickle is a Python serialization format. Starting with version 2, pickles have a 2-byte protocol header, and starting with version 4, pickles have a frame opcode that provides a length hint (for that frame, not necessarily for the whole pickle). All versions end with a period. Version 0 is an ASCII text format and version 1 adds some binary opcodes, but as neither has a version header, they are not definable with magic. Pickles that have been modified by removing the header/framing or adding trailing garbage can still be deserialized, but are also not definable by magic.

0 string \x80\x02
>-1 byte 0x2e Python pickle data, protocol version 2
0 string \x80\x03
>-1 byte 0x2e Python pickle data, protocol version 3
0 string \x80\x04\x95
>-1 byte 0x2e Python pickle data, protocol version 4
0 string \x80\x05\x95
>-1 byte 0x2e Python pickle data, protocol version 5

Pickle is defined by the reference implementation; see https://docs.python.org/3/library/pickle.html#data-stream-format and the PEPs linked from that section. For testing, https://gist.github.com/jbosboom/1438dcbc304b7325802c36257f5dede9 is a Python script that creates pickles of each version containing the same data. `python -m pickletools <file>` can be used to disassemble a pickle.
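
As a rough cross-check of the four rules above, the following Python sketch applies the same byte tests to pickles produced by the standard `pickle` module. The helper `pickle_protocol` is hypothetical and only mirrors the magic lines; it is not part of file.

```python
import pickle

# Hypothetical helper mirroring the four magic rules above; not part of file.
def pickle_protocol(data: bytes):
    """Return the protocol version (2-5) if data matches the rules, else None."""
    if len(data) < 3 or data[-1:] != b".":      # last byte must be '.' (0x2e)
        return None
    if data[0] != 0x80:                         # first byte of the protocol header
        return None
    version = data[1]
    if version in (2, 3):                       # rules for \x80\x02 and \x80\x03
        return version
    if version in (4, 5) and data[2] == 0x95:   # rules for \x80\x04\x95 and \x80\x05\x95
        return version
    return None

sample = "hello world"
for proto in range(0, pickle.HIGHEST_PROTOCOL + 1):
    blob = pickle.dumps(sample, protocol=proto)
    print(proto, pickle_protocol(blob))
```

Protocols 0 and 1 carry no header, so the helper returns None for them, in line with the statement that they are not definable with magic.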


According to `man magic`, negative offsets (as used in the magic definitions above) can only appear at the top level or as a continuation offset (with &), but this is evidently not a limitation, and in fact my initial attempt to write pickle magic given below does not seem to work:

-1 byte 0x2e
>0 string \x80\x02 Python pickle data, protocol version 2
>0 string \x80\x03 Python pickle data, protocol version 3
>0 string \x80\x04\x95 Python pickle data, protocol version 4
>0 string \x80\x05\x95 Python pickle data, protocol version 5

You may wish to bring the implementation and the documentation into alignment.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
There are no notes attached to this issue.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
15 [file] General feature N/A 2018-07-24 15:19 2021-05-10 01:09
Reporter: eschwartz Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Try to acquire the "magic" name on the Python Package Index
Description: Per https://www.python.org/dev/peps/pep-0541/ an abandoned project can be deleted by the PyPI maintainers to clear the way for reusing the name.

The current "magic" package on https://pypi.org/project/magic/ is unmaintained since initial upload in 2003, it cannot even be installed as there is no code uploaded to PyPI, the "Project links: Download" points at a website that no longer exists (according to web.archive.org it disappeared sometime between 20121203 and 20130103), and I cannot find the original uploader on the internet more recently than 2003.

I suspect it would not be difficult to convince the PyPI maintainers of the validity of a claim by the official project. :)
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0000016)
christos   
2018-07-25 06:55   
I'll send mail to jp-py@jsnp.net who owns the project first.
(0000017)
christos   
2018-07-25 06:59   
I've sent mail, waiting for a response.
(0000021)
christos   
2018-08-01 09:02   
Sent mail to infrastructure-stuff@python.org to ask what to do next.
(0000046)
christos   
2018-08-11 11:27   
No answer there, opened: https://github.com/pypa/pypi-legacy/issues/802
(0003421)
christos   
2020-06-01 19:49   
Re-filed under: https://github.com/pypa/pypi-support/issues/429
(0003584)
eschwartz   
2021-04-14 14:34   
The ticket has been processed by PyPI support and transferred over.

So this can be closed as implemented.
(0003606)
christos   
2021-05-10 01:09   
We have it now.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
263 [file] General minor always 2021-05-02 19:52 2021-05-09 22:39
Reporter: peoro Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Shebang patterns only match the beginning of the command, rather than the whole word
Description: Currently, magic patterns that detect files using a shebang only test the beginning of the command string.

A file starting with e.g. `#!/usr/bin/env basher` is detected as "Bourne-Again shell script", even though "basher" is not bash.
The same issue affects every shebang pattern I tried (sh, bash, node, python, perl etc).
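
The difference between the two behaviours can be sketched in Python with regular expressions. This is only an illustration of prefix matching versus whole-word matching; the helpers `prefix_match` and `word_match` are hypothetical, and file's magic patterns are written differently.

```python
import re

# Hypothetical helpers for illustration; not how file implements its patterns.
def _shebang_pattern(interpreter: str, whole_word: bool) -> "re.Pattern[str]":
    # "#!", optional directory part, optional "env ", then the interpreter name.
    pattern = r"#!\s*(\S*/)?(env\s+)?" + re.escape(interpreter)
    if whole_word:
        pattern += r"(?=\s|$)"   # name must be followed by whitespace or end of line
    return re.compile(pattern)

def prefix_match(first_line: str, interpreter: str) -> bool:
    """Match if the command merely starts with the interpreter name."""
    return _shebang_pattern(interpreter, whole_word=False).match(first_line) is not None

def word_match(first_line: str, interpreter: str) -> bool:
    """Match only if the interpreter name is the whole command word."""
    return _shebang_pattern(interpreter, whole_word=True).match(first_line) is not None

for line, name in [("#!/bin/shawarma", "sh"), ("#!/usr/bin/env basher", "bash"),
                   ("#!/bin/sh -e", "sh")]:
    print(line, name, prefix_match(line, name), word_match(line, name))
```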
Tags:
Steps To Reproduce: $ echo '#!/bin/shawarma' > script
$ file script
script: POSIX shell script, ASCII text executable
Additional Information:
Attached Files:
Notes
(0003605)
christos   
2021-05-09 22:39   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
262 [file] General minor sometimes 2021-04-27 15:52 2021-04-27 20:36
Reporter: polluks Platform: PowerBook5,8  
Assigned To: christos OS: MorphOS  
Priority: normal OS Version: 3.15  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: better sinclair
Description: --- sinclair.bak 2019-02-22 14:06:34 +0100
+++ sinclair 2021-04-11 02:00:00 +0200
@@ -31,6 +31,8 @@
 # Sinclair QL executables (was ThMO)
 4 belong 0x4AFB QDOS executable
 >9 pstring x '%s'
+6 beshort 0x4AFB QDOS executable
+>9 pstring x '%s'
 
 # Sinclair QL ROM (ThMO)
 0 belong =0x4AFB0001 QL plugin-ROM data,
Tags:
Steps To Reproduce:
Additional Information:
System Description
Attached Files: config (7,280 bytes) 2021-04-27 15:52
https://bugs.astron.com/file_download.php?file_id=223&type=bug
Notes
(0003604)
christos   
2021-04-27 20:36   
applied, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
261 [file] General minor always 2021-04-20 11:24 2021-04-27 19:39
Reporter: bitstreamout Platform: x86_64  
Assigned To: christos OS: openSUSE  
Priority: normal OS Version: Tumbleweed  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: New version breaks subversion tests
Description: Currently subversion build breaks at several points ... many of the breaks are caused by behaviour change of file in detecting ASCII text without newlines.
Tags:
Steps To Reproduce: With version 4.30

echo xx | file -
/dev/stdin: ASCII text, with no line terminators

with version 5.40

echo -n xx | file -
/dev/stdin: data
Additional Information:
Attached Files: fails.log (18,188 bytes) 2021-04-22 15:47
https://bugs.astron.com/file_download.php?file_id=221&type=bug
file-5.50-ascii.patch (699 bytes) 2021-04-23 07:36
https://bugs.astron.com/file_download.php?file_id=222&type=bug
Notes
(0003593)
bitstreamout   
2021-04-21 05:58   
Just to correct typo ... the 'With version 4.30' should be 'With version 5.39'

```
/suse/werner> file --version
file-5.39
magic file from /etc/magic:/usr/share/misc/magic
/suse/werner> echo -n xx | file -
/dev/stdin: ASCII text, with no line terminators
/suse/werner>
```
(0003594)
bitstreamout   
2021-04-21 08:38   
Could it be that the condition

if (u < 3)

within the LOOKS_ macro should be

if (u < 2)

at least for ASCII and latin1
(0003595)
bitstreamout   
2021-04-22 12:56   
Duplicate of https://bugs.astron.com/view.php?id=256
(0003596)
bitstreamout   
2021-04-22 15:25   
Hmmm ... still problems with files without last newline

```
abuild@noether:~/rpmbuild/BUILD/subversion-1.14.1> wc -c /dev/shm/svn-test-work/working_copies/merge_tests-2/A/B/F/foo
3 /dev/shm/svn-test-work/working_copies/merge_tests-2/A/B/F/foo
abuild@noether:~/rpmbuild/BUILD/subversion-1.14.1> file /dev/shm/svn-test-work/working_copies/merge_tests-2/A/B/F/foo
/dev/shm/svn-test-work/working_copies/merge_tests-2/A/B/F/foo: data
abuild@noether:~/rpmbuild/BUILD/subversion-1.14.1> cat /dev/shm/svn-test-work/working_copies/merge_tests-2/A/B/F/foo && echo
foo
```
(0003597)
bitstreamout   
2021-04-22 15:27   
Test with file 5.39
```
file /abuild/oscbuild/openSUSE_Tumbleweed/dev/shm/svn-test-work/working_copies/merge_tests-2/A/B/F/foo
/abuild/oscbuild/openSUSE_Tumbleweed/dev/shm/svn-test-work/working_copies/merge_tests-2/A/B/F/foo: ASCII text, with no line terminators
```
(0003598)
bitstreamout   
2021-04-22 15:47   
The fails.log of subversion with file 5.40
(0003599)
bitstreamout   
2021-04-23 06:58   
Something goes wrong even with those commits for bug PR/256
```
abuild@noether:~/rpmbuild/BUILD/file-5.40> echo -e "fo" | $PWD/src/.libs/file -m $PWD/magic/magic -
/dev/stdin: ASCII text
abuild@noether:~/rpmbuild/BUILD/file-5.40> echo -e "xx" | $PWD/src/.libs/file -m $PWD/magic/magic -
/dev/stdin: data
abuild@noether:~/rpmbuild/BUILD/file-5.40> echo -e "hi" | $PWD/src/.libs/file -m $PWD/magic/magic -
/dev/stdin: ASCII text
abuild@noether:~/rpmbuild/BUILD/file-5.40> echo -en "hi" | $PWD/src/.libs/file -m $PWD/magic/magic -
/dev/stdin: ASCII text, with no line terminators
abuild@noether:~/rpmbuild/BUILD/file-5.40> echo -en "foo" | $PWD/src/.libs/file -m $PWD/magic/magic -
/dev/stdin: data
abuild@noether:~/rpmbuild/BUILD/file-5.40> echo -e "foo" | $PWD/src/.libs/file -m $PWD/magic/magic -
/dev/stdin: ASCII text
abuild@noether:~/rpmbuild/BUILD/file-5.40> echo -e "xxx" | $PWD/src/.libs/file -m $PWD/magic/magic -
/dev/stdin: data
```
(0003600)
bitstreamout   
2021-04-23 07:36   
I suggest the attached patch to count every ASCII character even if it appears several times
(0003601)
bitstreamout   
2021-04-23 11:07   
I see that commit 3096f87f823e1e936139e48d6a3bae9a95557861 introduced the `if (dist[i]) u++` check, which misdetects smaller ASCII files with and without newlines
(0003603)
christos   
2021-04-27 19:39   
The whole character count/distribution approach leads to more confusion as it tries to solve some corner cases. It is not worth using heuristics to resolve the corner cases. I've reverted the fix to PR/180 and that should bring back the original behavior.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
180 [file] General trivial always 2020-08-15 05:30 2021-04-27 19:35
Reporter: EuphCat Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: A file filled with 0xFF gets reported to be ISO-8859
Description: * I don't know how the parser works, or how file types are managed. I'm okay with NOTABUG

A file filled with 0xFF gets reported to be ISO-8859. I find this misleading.
Tags:
Steps To Reproduce: $ dd if=/dev/zero ibs=1k count=1 | tr "\000" "\377" > 0xFFfile.bin
1+0 records in
2+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000256375 s, 4.0 MB/s
$ xxd ./0xFFfile.bin | head
00000000: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000010: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000020: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000030: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000040: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000050: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000060: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000070: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000080: ffff ffff ffff ffff ffff ffff ffff ffff ................
00000090: ffff ffff ffff ffff ffff ffff ffff ffff ................
$ file 0xFFfile.bin
0xFFfile.bin: ISO-8859 text, with very long lines, with no line terminators
Additional Information: Because of this issue, mkfs.ext4 reports false positive on file validity confirmation.
"/dev/sda2 contains `ISO-8859 text, with very long lines, with no line terminators`
Proceed anyway? (y,N)"
Attached Files:
Notes
(0003447)
christos   
2020-08-15 12:06   
Fixed by requiring at least 3 distinct character values.
(0003602)
christos   
2021-04-27 19:35   
Reverted the fix. Breaks other tests. A file can have the same character repeated many times and that should not change how file detects it. Heuristics just add confusion to the behavior.
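
For illustration, the rule described in note 0003447 can be sketched in Python as a plain count of distinct byte values. This assumes nothing beyond the wording of that note; `passes_distinct_rule` and `MIN_DISTINCT` are made-up names, and the sketch is not the code that was committed or reverted.

```python
# Sketch of "requiring at least 3 distinct character values" (note 0003447).
# Names are hypothetical; this is not file's implementation.
MIN_DISTINCT = 3

def passes_distinct_rule(data: bytes) -> bool:
    """Return True if data contains at least MIN_DISTINCT different byte values."""
    return len(set(data)) >= MIN_DISTINCT

samples = {
    "0xFF fill": b"\xff" * 1024,   # the file from this report
    "h + newline": b"h\n",         # short text with two distinct bytes
    "aa + newline": b"aa\n",       # repeated character plus newline
    "fo + newline": b"fo\n",       # three distinct bytes
}
for name, data in samples.items():
    print(name, passes_distinct_rule(data))
```

Under this rule the 0xFF file is rejected as intended, but so are short texts such as `h\n` and `aa\n`, which is the kind of side effect that led to the revert.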


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
253 [file] General minor always 2021-03-31 11:46 2021-04-19 19:04
Reporter: rwmjones Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: file 5.40 can no longer print ext4 filesystem UUIDs correctly.
Description: file 5.40 can no longer print ext4 filesystem UUIDs correctly.

Reported in Fedora:
file-5.40-1.fc35.x86_64

Downstream bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1945122
Tags:
Steps To Reproduce: 1. Prepare a disk image in a file:

  $ rm /var/tmp/test.img
  $ truncate -s 1G /var/tmp/test.img
  $ mkfs.ext4 /var/tmp/test.img

2. Run 'file' against it to display the UUID.

With the previous version of file it would display the UUID correctly, eg:

  $ file /var/tmp/test.img
  /var/tmp/test.img: Linux rev 1.0 ext4 filesystem data, UUID=b1bc22cc-7392-4780-8b50-77dac556236d (extents) (64bit) (large files) (huge files)

With the current version of file it displays it incorrectly, eg:

  $ file /var/tmp/test.img
  /var/tmp/test.img: Linux rev 1.0 ext4 filesystem data, UUID=b1bc22cc-7392-4780-ffff8b50-77dac556236d (extents) (64bit) (large files) (huge files)

Notice that some parts of the UUID are sign-extended.
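
The symptom can be reproduced with a small Python sketch that widens a 16-bit field to 32 bits either as an unsigned or as a signed value before printing it in hex. This only illustrates the arithmetic; the helper names are hypothetical and this is not file's printing code.

```python
# Illustration of the symptom only; helper names are hypothetical.
def as_signed16(value: int) -> int:
    """Reinterpret a 16-bit unsigned value as a signed 16-bit integer."""
    return value - 0x10000 if value & 0x8000 else value

def hex_after_widening(value: int, signed: bool) -> str:
    """Format a 16-bit field like C's %04x after widening it to 32 bits."""
    widened = as_signed16(value) if signed else value
    return format(widened & 0xFFFFFFFF, "04x")

# The three 16-bit fields of the UUID shown above.
for field in (0x7392, 0x4780, 0x8B50):
    print(hex_after_widening(field, signed=False), hex_after_widening(field, signed=True))
```

Only the field with its top bit set (8b50) changes, becoming ffff8b50 when treated as signed, which matches the output above.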
Additional Information: I bisected this to the following upstream commit:

  0478d9251abafd0876cdb3121ef2c07af6c99513 is the first bad commit
  commit 0478d9251abafd0876cdb3121ef2c07af6c99513
  Author: Christos Zoulas <christos@zoulas.com>
  Date: Sat Aug 22 18:27:42 2020 +0000

    Treat printf numbers as signed.

   src/softmagic.c | 28 ++++++++++++++--------------
   1 file changed, 14 insertions(+), 14 deletions(-)
Attached Files:
Notes
(0003580)
rwmjones   
2021-03-31 11:57   
The magic entry (magic/Magdir/filesystems) specifies %08x and %04x for the fields of the UUID. Normally %x would be unsigned (eg. the printf(3) man page documents this). So I think this is a regression in "file" itself, not a bug in the magic data.
(0003592)
christos   
2021-04-19 19:04   
Thanks! The type and signedness of the argument is determined by the magic type field, and this has been fixed to be unsigned now.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
256 [file] General major always 2021-04-02 18:41 2021-04-19 18:38
Reporter: mutableVoid Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Wrong file type for file with one-byte char before new-line
Description:
When I execute `file` on a file that contains only a single (one-byte) character before the newline which terminates the file, the reported file type is binary instead of Unicode text (see section: Steps to reproduce).

This messes with programs like `more`, which therefore refuse to print the file's content, as the reported file type is binary.

I encountered this in the following scenario:

```bash
> printf 'h\n' > file2
> more file2
 
******** file2: Not a text file ********

> od -x file2
0000000 0a68
0000002
```

When I fill the file with a character that takes more than a single byte in Unicode, the problem does not occur:
```bash
printf 'ä\n' > new_file
file new_file
new_file: Unicode text, UTF-8 text
```
 
Tags:
Steps To Reproduce: ```bash
printf 'h\n' > new_file
file new_file
```
prints
`new_file: data` instead of the expected `new_file: Unicode text, UTF-8 text`
Additional Information:
Attached Files:
Notes
(0003582)
mutableVoid   
2021-04-02 19:03   
Might be related to the fix of bugs.astron.com/view.php?id=180, since printing the same character multiple times also reports `binary` instead of UTF-8 text:

printf 'aa\n' > new_file
file new_file
new_file: data
(0003590)
christos   
2021-04-19 18:38   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
257 [file] General major always 2021-04-03 17:05 2021-04-19 17:02
Reporter: cuihao Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: file -i doesn't recognize xz files
Description: file -i doesn't recognize xz files possibly because the stock magic file is broken.
Tags:
Steps To Reproduce: $ file -i xxx.xz

With file 5.39, I got:
xxx.xz: application/x-xz; charset=binary

With newest file 5.40, I got:
xxx.xz: application/octet-stream; charset=binary
Additional Information: I did git-bisect and found commit 3ebd747d (Add checksum for XZ) introduced the bug. I don't know the format so IDK how to fix it.

OS: Arch Linux, latest packages
Kernel: 5.11.11-zen1-1-zen
Attached Files:
Notes
(0003583)
sgallagh   
2021-04-12 14:02   
Additional downstream bug in Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=1947317
(0003589)
christos   
2021-04-19 17:02   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
259 [file] General minor always 2021-04-16 09:43 2021-04-19 16:47
Reporter: aleksandr.v.novichkov Platform:  
Assigned To: administrator OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.40  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.41  
    Target Version:  
Summary: Glueing mime types into one
Description: The new version of file (5.40) glues MIME types together for mp3 files.
Tags:
Steps To Reproduce: 1. download attached file
2. run command:
```
file --mime-type attachments_audio.mp3
```
Additional Information: Actual type is:
```
audio/mpegapplication/octet-stream
```

Expected type is:
```
audio/mpeg
```
Attached Files: attachments_audio.mp3 (10,493 bytes) 2021-04-16 09:43
https://bugs.astron.com/file_download.php?file_id=220&type=bug
Notes
(0003586)
administrator   
2021-04-19 16:47   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
260 [file] General feature always 2021-04-16 10:04 2021-04-19 15:57
Reporter: aleksandr.v.novichkov Platform:  
Assigned To: administrator OS:  
Priority: high OS Version:  
Status: feedback Product Version: 5.40  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Create tests
Description: Our project uses libmagic to determine file content types.
We often find bugs that have already been fixed.
It would be a good idea to create tests.

We have tests written in Go, which we can adapt to your project.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files:
Notes
(0003585)
administrator   
2021-04-19 15:57   
There is a tests subdirectory in the file distribution, and a test framework on https://github.com/file/file-tests. Can you convert your test to use either? I don't think that adding a 3rd framework is a good idea :-)


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
213 [file] General feature N/A 2020-11-30 14:52 2021-03-28 19:03
Reporter: Mikhail.Kovalev Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: reopened  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: Chiasmus encrypted files (.xia)
Description: Chiasmus (https://www.bsi.bund.de/EN/Topics/OtherTopics/Chiasmus/Chiasmus_node.html) is encryption software used by many public authorities in Germany. It would be great if file library could detect .xia files.
Tags:
Steps To Reproduce:
Additional Information: Detection should be easy: the first characters in the file are always "XIA". An example file is attached.
Attached Files: example.xia (338 bytes) 2020-11-30 16:50
https://bugs.astron.com/file_download.php?file_id=190&type=bug
Notes
(0003510)
christos   
2020-12-17 00:04   
Added, thanks!
(0003574)
Mikhail.Kovalev   
2021-03-25 11:43   
I am really sorry, but this turned out to be a duplicate. Detection of Chiasmus files was already implemented in https://github.com/file/file/blob/master/magic/Magdir/bsi
So the newly added file https://github.com/file/file/blob/master/magic/Magdir/crypto should be removed.

The problem is that it's only the "textual description" which gets set if the file type is detected. But the MIME-type remains "application/octet-stream". Which indeed should be the case according to https://datatypes.net/open-xia-files
However, in the past there were some attempts to introduce special MIME types for Chiasmus files, e.g. from https://bugs.freedesktop.org/show_bug.cgi?id=23255:
- application/x-chiasmus-key Chiasmus key
- application/x-chiasmus-encrypted Chiasmus encrypted data

Would it be possible to add such MIME types to the File library?
(0003579)
christos   
2021-03-28 19:03   
removed contents from crypto


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
251 [file] General trivial always 2021-03-25 22:47 2021-03-27 20:18
Reporter: vineetg76 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: elf files for Synopsys ARC are not identified correctly
Description: There are 3 variants of the Synopsys ARC ISA (ARCompact, ARCv2 and ARCv3) and processors based on them.
However, the current implementation of file does not identify them correctly:

1. ARCompact-based ELF is incorrectly identified as legacy ARC Tangent A5, which doesn't exist.
2. ARCv2 is not even listed.
3. ARCv3 is incorrectly identified as ARCv2.3.

Tags:
Steps To Reproduce:
Additional Information: I'm attaching a patch which addresses the above.
Attached Files: 0001-Fix-names-for-Synopsys-ARC-cores.patch (1,858 bytes) 2021-03-25 22:47
https://bugs.astron.com/file_download.php?file_id=213&type=bug
Notes
(0003578)
christos   
2021-03-27 20:18   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
235 [file] General tweak always 2021-02-08 15:26 2021-03-14 17:13
Reporter: jschleus Platform:  
Assigned To: christos OS:  
Priority: low OS Version:  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: 5.39 reports "Certificate" instead of "ASCII text" for the PHP project NEWS file
Description: file 5.39 (but not 5.38) reports incorrectly "Certificate" instead of "ASCII text" for the NEWS file of the PHP tarballs (e.g. viewable at https://raw.githubusercontent.com/php/php-src/master/NEWS).

Probably that is related to an addition in 5.39 to the file magic/Magdir/der (handling DER encoded files).

Tags:
Steps To Reproduce:
Additional Information: Unfortunately I'm not able to analyze the reason but just for curiosity I created some one-liner files starting with the first line of the mentioned NEWS file and got the following results:

"Certificate":
PHP NEWS
PHP xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx NEWS
PHP xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
pHP xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
pHp xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

"ASCII text":
PHA xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
php xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PHP xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PHP xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Unfortunately the text here seems not to use a monospaced font (e.g. the first two example lines have both 79 chars).
 
Attached Files: file-5.39_CERTIFICATE_bug.tests.txt (714 bytes) 2021-02-08 15:26
https://bugs.astron.com/file_download.php?file_id=202&type=bug
Notes
(0003560)
christos   
2021-02-24 22:52   
I can't reproduce this with the version from HEAD.
(0003562)
jschleus   
2021-02-25 10:17   
Hmm, I just compiled the GitHub R/O master version and could reproduce the described behavior.

Perhaps I expressed myself somewhat imprecisely: for tests with the attached file, you have to put only one of the given lines into each test file.

file 5.38 outputs "UTF-8 Unicode text" for the NEWS file at the originally mentioned URL, and "ASCII text" for the NEWS file of the latest PHP 8.0.2 release (https://raw.githubusercontent.com/php/php-src/PHP-8.0.2/NEWS). The current file 5.39 and the HEAD version both output "Certificate".

By the way, my tests were done under Linux openSUSE Leap 15.2.
(0003572)
christos   
2021-03-14 17:13   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
241 [file] General minor unable to reproduce 2021-03-02 05:03 2021-03-14 17:03
Reporter: thesamesam Platform: Linux  
Assigned To: christos OS: Gentoo GNU/Linux  
Priority: normal OS Version: amd64 (stable)  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: file is killed by seccomp filter when executing futex syscall
Description: A user reported to us in Gentoo that all file invocations, when built with seccomp, resulted in "bad system call" (i.e. killed by the seccomp filter).

They were able to produce strace output showing futex() was the problematic syscall, although it's not clear why futex() is being used here. I've attached their system information and strace output.

Currently, src/seccomp.c has:
>#ifdef XZLIBSUPPORT
> ALLOW_RULE(futex);
>#endif

So, clearly we expect futex in some cases. In this case, the user had not built file with LZMA support, but when they were asked to enable it, the issue disappeared (as expected).

Do you have any suggestions as to why this syscall was being used (and if it is problematic - seems not), or does the filter simply need updating to allow it unconditionally?
Tags:
Steps To Reproduce: I have not been able to reproduce this, but problems like this are often quite sensitive to the glibc version and other factors in the system environment.

The user's glibc version appeared to be the main version in deployment in Gentoo, and the problem has not been visible on any of my systems. This is the only report I've seen of this issue, although in the past we've occasionally seen this style of problem with other syscalls, which was resolved by filter updates upstream in file.

I am happy to try to pass on questions to the user downstream.
Additional Information: Downstream report in Gentoo: https://bugs.gentoo.org/771096
Attached Files: strace.txt (15,549 bytes) 2021-03-02 05:03
https://bugs.astron.com/file_download.php?file_id=206&type=bug
emerge_info.txt (7,876 bytes) 2021-03-02 05:03
https://bugs.astron.com/file_download.php?file_id=205&type=bug
Notes
(0003571)
christos   
2021-03-14 17:03   
I enabled futexes unconditionally. They are used for threaded programs and it is possible that newer glibc does some thread initialization unconditionally. It would be interesting to find the stack trace of the futex call though.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
242 [file] General minor have not tried 2021-03-02 15:31 2021-03-14 16:57
Reporter: catull Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: Follow-up to 000226, now the marker is 4 bytes long
Description: The Birtual Machine file marker was originally introduced as a 2-byte marker.

Now the implementor has adopted a larger marker.
The diff attached to this ticket accounts for the most recent change.
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: bm.diff (676 bytes) 2021-03-02 15:31
https://bugs.astron.com/file_download.php?file_id=207&type=bug
Notes
(0003570)
christos   
2021-03-14 16:57   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
243 [file] General minor have not tried 2021-03-02 15:45 2021-03-14 16:54
Reporter: catull Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: Add libmagic.pc to .gitignore
Description: When developing under git, the generated file above appears as "untracked file", see below.

With the patch applied, it will be duly ignored by git.
Tags:
Steps To Reproduce: git clone git@github.com:file/file.git
cd file
autoreconf -f -i
./configure
make
git status -b
Additional Information: The last command shows:

➜ file.git git:(master) git status -b
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
    libmagic.pc

nothing added to commit but untracked files present (use "git add" to track)
Attached Files: gitignore.diff (213 bytes) 2021-03-02 15:45
https://bugs.astron.com/file_download.php?file_id=208&type=bug
Notes
(0003569)
christos   
2021-03-14 16:54   
Fixed, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
244 [file] General feature always 2021-03-08 17:23 2021-03-14 16:52
Reporter: mainframed767 Platform: 64  
Assigned To: christos OS: Linux  
Priority: normal OS Version: Ubuntu 20.04  
Status: resolved Product Version: 5.38  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: Detect NETDATA (z/OS and CMS XMI) files
Description: NETDATA (https://en.wikipedia.org/wiki/NETDATA) is a simple file format used to move files between mainframes, oftentimes referred to as XMI or XMIT. More information and test files here: http://planetmvs.com/unxmit/

NETDATA files are stored in EBCDIC. The first two bytes are a size and a flag, which vary, followed by 'INMR01' in EBCDIC, followed by the IBM text unit INMLRECL, which is always the same:

00000002 c9 d5 d4 d9 f0 f1 00 42 00 01 00 01 50 10 11 00

It would be great if detection for this format could be added. At the moment it is just reported as "data".

Attached is a sample: seq.xmi
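
The layout described above can be sketched as a small check. This is only an illustration of the byte comparison, not the rule that was added to the magic database; the function name looks_like_netdata is hypothetical, and cp037 is assumed as the EBCDIC code page.

```python
# Sketch only: mirrors the byte layout given in the report.
# Assumption: code page 037 is used to encode the EBCDIC string.
INMR01_EBCDIC = "INMR01".encode("cp037")  # c9 d5 d4 d9 f0 f1


def looks_like_netdata(header: bytes) -> bool:
    """Return True if the EBCDIC string 'INMR01' sits at offset 2."""
    # Bytes 0-1 hold the size and flag, which vary from file to file,
    # so they are deliberately not compared.
    return header[2:8] == INMR01_EBCDIC


if __name__ == "__main__":
    # Two placeholder bytes (hypothetical values) followed by the bytes
    # quoted in the report at offset 2.
    sample = b"\x00\x00" + bytes.fromhex("c9d5d4d9f0f1004200010001")
    print(looks_like_netdata(sample))
```

Skipping the first two bytes keeps the check independent of the size and flag values.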
Tags:
Steps To Reproduce: 1) Generate XMI file using TSO TRANSMIT: https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.2.0/com.ibm.zos.v2r2.ikjc500/transmi.htm
2) Transfer file to local machine
3) Run 'file' against the downloaded file
Additional Information:
Attached Files: seq.xmi (560 bytes) 2021-03-08 17:23
https://bugs.astron.com/file_download.php?file_id=209&type=bug
Notes
(0003568)
christos   
2021-03-14 16:52   
Added a simple detection for now. We can get more elaborate and extract the fields if needed.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
246 [file] General minor always 2021-03-13 11:56 2021-03-14 16:37
Reporter: Kid Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: video/mp4 identified as application/octet-stream
Description: The only thing possibly unusual I see is ftypiso4 instead of ftypisom. Thank you for maintaining this essential utility.

The OS is Arch Linux.
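
To check which brand a sample carries, the four bytes after "ftyp" can be read directly. The following is a minimal sketch, assuming the file starts with an ftyp box (4-byte size, 4-byte type, 4-byte major brand); the function name major_brand is hypothetical.

```python
def major_brand(header: bytes):
    """Return the major brand of an ISO base media file, or None."""
    # Layout assumed here: bytes 0-3 box size, bytes 4-7 box type,
    # bytes 8-11 major brand.
    if len(header) < 12 or header[4:8] != b"ftyp":
        return None
    return header[8:12].decode("ascii", errors="replace")


if __name__ == "__main__":
    # Hypothetical 12-byte header built for illustration only.
    header = (24).to_bytes(4, "big") + b"ftyp" + b"iso4"
    print(major_brand(header))
```

A rule that only compares offset 8 against "isom" would not match a file whose major brand is "iso4".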
Tags: magic
Steps To Reproduce:
Additional Information:
Attached Files: 2KiB.mp4 (2,048 bytes) 2021-03-13 11:56
https://bugs.astron.com/file_download.php?file_id=211&type=bug
Notes
(0003567)
christos   
2021-03-14 16:37   
Fixed in HEAD, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
247 [file] General feature N/A 2021-03-14 10:18 2021-03-14 16:24
Reporter: akohlmey Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.38  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: Magic file patterns for the LAMMPS MD code
Description: The LAMMPS MD simulation code ( https://lammps.sandia.gov/ ) produces several binary and text mode files that can be easily recognized with the additional patterns in the file attached to this issue.
Tags: magic
Steps To Reproduce: Example output (tested on a Fedora Linux, and MacOS 11):

$ file *.*
dihedral-quadratic.restart: LAMMPS binary restart file (rev 2), Version 10 Mar 2021, Little Endian
mol-pair-wf_cut.restart: LAMMPS binary restart file (rev 2), Version 24 Dec 2020, Little Endian
atom.bin: LAMMPS atom style binary dump (rev 2), Little Endian, First time step: 445570
custom.bin: LAMMPS custom style binary dump (rev 2), Little Endian, First time step: 100
bn1.lammpstrj: LAMMPS text mode dump, First time step: 5000
data.fourmol: LAMMPS data file written by LAMMPS
pnc.data: LAMMPS data file written by msi2lmp
data.spce: LAMMPS data file written by TopoTools
B.data: LAMMPS data file written by OVITO
log.lammps: LAMMPS log file written by version 10 Feb 2021
Additional Information:
Attached Files: magic.lammps (2,329 bytes) 2021-03-14 10:18
https://bugs.astron.com/file_download.php?file_id=212&type=bug
Notes
(0003566)
christos   
2021-03-14 16:24   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
245 [file] General minor have not tried 2021-03-11 16:04 2021-03-14 16:22
Reporter: pamelawardtx2021 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version:  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: problems with javascript
Description: problems with javascript
Tags:
Steps To Reproduce:
Additional Information:
Attached Files: payforresearchpaperonline_a2.docx.pdf (31,104 bytes) 2021-03-11 16:04
https://bugs.astron.com/file_download.php?file_id=210&type=bug
Notes
(0003565)
christos   
2021-03-14 16:22   
spam


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
214 [tcsh] General minor always 2020-12-05 00:40 2021-02-27 01:02
Reporter: andrew@ugh.net.au Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: assigned Product Version: 6.22.03  
Product Build: Resolution: open  
Projection: none      
ETA: none Fixed in Version:  
    Target Version:  
Summary: Can't escape delimiters in :s modifier
Description: The man page, under "History substitution" for the s modifier says:

> Any character may be used as the delimiter in place of `/'; a `\' can be used to quote the delimiter inside l and r.

\ does not quote the delimiter currently. I didn't go back to see if it used to work.
Tags: patch
Steps To Reproduce: ```
>set a='a/b'
>echo $a
a/b
>echo $a:s/\//#/
a/b
```

The output should have been `a#b`
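
The behaviour the man page describes can be sketched outside tcsh as follows. This is not the tcsh implementation and not the attached patch; parse_subst and apply_subst are hypothetical names, the sketch ignores the special meaning of `&` in the replacement, and it assumes only the first occurrence is replaced.

```python
def parse_subst(spec: str):
    """Split 's<d>l<d>r<d>' into (l, r), honouring backslash-quoted delimiters."""
    if len(spec) < 2 or spec[0] != "s":
        raise ValueError("expected an s modifier")
    delim = spec[1]
    parts, current = [], []
    i = 2
    while i < len(spec) and len(parts) < 2:
        ch = spec[i]
        if ch == "\\" and i + 1 < len(spec) and spec[i + 1] == delim:
            current.append(delim)  # quoted delimiter becomes a literal
            i += 2
            continue
        if ch == delim:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    if len(parts) < 2:
        # Assumption: a missing closing delimiter is tolerated.
        parts.append("".join(current))
    if len(parts) != 2:
        raise ValueError("incomplete s modifier")
    return parts[0], parts[1]


def apply_subst(value: str, spec: str) -> str:
    left, right = parse_subst(spec)
    # Assumption: only the first occurrence is replaced.
    return value.replace(left, right, 1)


if __name__ == "__main__":
    print(apply_subst("a/b", r"s/\//#/"))  # prints a#b
```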
Additional Information: Patch attached
Attached Files: sh.dol.c.patch (2,458 bytes) 2020-12-05 00:40
https://bugs.astron.com/file_download.php?file_id=191&type=bug
sh.dol.c-2.patch (2,097 bytes) 2020-12-05 00:43
https://bugs.astron.com/file_download.php?file_id=192&type=bug
delim.diff (1,412 bytes) 2021-02-27 01:02
https://bugs.astron.com/file_download.php?file_id=204&type=bug
Notes
(0003499)
andrew@ugh.net.au   
2020-12-05 00:43   
This replaces the previous patch which somehow had some misplaced comments in it.
(0003563)
christos   
2021-02-26 14:33   
I am wondering if that ever worked and we broke it, or if it never worked and this patch is needed. I need to take a more careful look.
(0003564)
christos   
2021-02-27 01:02   
I think we need to parse both at the lexical level and at dollar evaluation like below.


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
237 [file] General feature always 2021-02-11 09:17 2021-02-24 23:56
Reporter: pxeger Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: Detect Ansible Vault files
Description: Ansible Vault (https://docs.ansible.com/ansible/latest/user_guide/vault.html) is a simple AES-based file encryption system for Ansible.

The exact file format is documented at https://docs.ansible.com/ansible/latest/user_guide/vault.html#format-of-files-encrypted-with-ansible-vault, but essentially the magic bytes are (hexdump):

00000000: 2441 4e53 4942 4c45 5f56 4155 4c54 3b $ANSIBLE_VAULT;

It would be great if detection for this format could be added. At the moment it is just reported as "ASCII text".

Attached is an example file which contains the content "this is an example file" with the password "123"
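
The check requested here amounts to a prefix comparison. A minimal sketch, with the hypothetical function name looks_like_ansible_vault; it only illustrates the magic bytes quoted above and is not the rule that was added to file.

```python
# The 15 magic bytes from the hexdump in the report.
VAULT_MAGIC = b"$ANSIBLE_VAULT;"


def looks_like_ansible_vault(data: bytes) -> bool:
    """Return True if the data starts with the Ansible Vault marker."""
    return data.startswith(VAULT_MAGIC)


if __name__ == "__main__":
    # Compare against the hex bytes quoted in the report.
    print(bytes.fromhex("24414e5349424c455f5641554c543b") == VAULT_MAGIC)
    print(looks_like_ansible_vault(b"$ANSIBLE_VAULT;1.1;AES256\n"))
```

Because the marker is plain ASCII, a detector without a dedicated rule falls through to the generic text classification, which matches the "ASCII text" result reported above.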
Tags: magic
Steps To Reproduce: 1. Create a file using `ansible-vault create myfile`
2. Enter some content in your editor and save the file
3. Use `file myfile`
Additional Information:
Attached Files: example_file (419 bytes) 2021-02-11 09:17
https://bugs.astron.com/file_download.php?file_id=203&type=bug
Notes
(0003561)
christos   
2021-02-24 23:56   
Added, thanks!


View Issue Details
ID: Category: Severity: Reproducibility: Date Submitted: Last Update:
238 [file] General text N/A 2021-02-15 18:16 2021-02-24 22:35
Reporter: lu3 Platform:  
Assigned To: christos OS:  
Priority: normal OS Version:  
Status: resolved Product Version: 5.39  
Product Build: Resolution: fixed  
Projection: none      
ETA: none Fixed in Version: 5.40  
    Target Version:  
Summary: MP4 Base Media v1 has the typo "IS0" (0 as the number zero) instead of "ISO" (O as the letter)
Description: --- /etc/share/misc/magic/animation.orig 2021-01-19 12:12:37.274757489 +0100
+++ /etc/share/misc/magic/animation 2021-02-15 19:05:48.250105805 +0100
@@ -112,7 +112,7 @@
 # ?/enc-isoff-generic
 >8 string iso2 \b, MP4 Base Media v2 [ISO 14496-12:2005]
 !:mime video/mp4
->8 string isom \b, MP4 Base Media v1 [IS0 14496-12:2003]
+>8 string isom \b, MP4 Base Media v1 [ISO 14496-12:2003]
 !:mime video/mp4
 >8 string/W jp2 \b, JPEG 2000
 !:mime image/jp2
Tags: magic
Steps To Reproduce: # file -b movie1.mp4
ISO Media, MP4 v2 [ISO 14496-14]
# file -b movie2.mp4
ISO Media, MP4 Base Media v1 [IS0 14496-12:2003]
Additional Information: For completeness: this is the standard installation on Gentoo Linux, version sys-apps/file-5.39-r3, found in /etc/share/misc/magic/animation.
Attached Files:
Notes
(0003559)