View Issue Details

ID: 0000068
Project: file
Category: General
View Status: public
Last Update: 2019-07-21 18:31
Reporter: valoq
Assigned To:
Priority: normal
Severity: feature
Reproducibility: have not tried
Status: new
Resolution: open
Summary: 0000068: Native decompression formats
Description: Note: This is a copy of the original issue (https://bugs.astron.com/view.php?id=3) without the spam.


Currently file uses external applications to decompress certain file formats before analysing them.
This prevents effective sandboxing via seccomp.

This bug is to collect information and keep track of the progress of implementing all compression formats using libraries.

Currently we have the following native decompression functions:
uncompressgzipped
uncompresszlib


The following external decompression tools are currently used and need to be implemented using their respective libraries:

gzip
uncompress
bzip2
lzip
xz
lrzip
lz4
zstd


Additional information:

Most of the necessary source code can be found here:
https://nxr.netbsd.org/search?q=&project=src&defs=&refs=&path=usr.bin%2Fgzip&hist=
(thanks christos)
Tags: No tags attached.

Activities

valoq

2019-02-24 16:27

reporter   ~0003223

@christos:
Do you want all compression to be handled within compress.c (using only this one file) or can we use additional files for each compression algorithm?

Also, it seems I cannot close my own issues, so if possible please close the old one: https://bugs.astron.com/view.php?id=3

christos

2019-02-24 18:09

manager   ~0003224

I think each compression scheme can come in its own file; this way we can probably use an adapter so we don't have to modify all the sources.
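
To make the adapter idea concrete, here is a minimal sketch of what such a layer could look like. It is not code from file: the names `decomp_fn`, `struct decomp_adapter`, `decomp_stub` and `decomp_find` are hypothetical, and the stub only copies bytes where a real per-format decompressor (living in its own source file) would go. Only the bzip2 and xz magic bytes are real.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical common signature: decompress from one memory buffer into
 * another. Returns 0 on success, -1 on error; *outlen receives the
 * number of bytes written.
 */
typedef int (*decomp_fn)(const unsigned char *in, size_t inlen,
    unsigned char *out, size_t outcap, size_t *outlen);

/* One table entry per compression scheme. */
struct decomp_adapter {
	const char *name;
	const unsigned char *magic;
	size_t magiclen;
	decomp_fn fn;
};

/*
 * Placeholder standing in for a real decompressor. It copies the input
 * unchanged; a real entry would call into the per-format source file.
 */
static int
decomp_stub(const unsigned char *in, size_t inlen,
    unsigned char *out, size_t outcap, size_t *outlen)
{
	if (inlen > outcap)
		return -1;
	memcpy(out, in, inlen);
	*outlen = inlen;
	return 0;
}

static const unsigned char magic_bzip2[] = { 'B', 'Z', 'h' };
static const unsigned char magic_xz[] = { 0xFD, '7', 'z', 'X', 'Z', 0x00 };

static const struct decomp_adapter adapters[] = {
	{ "bzip2", magic_bzip2, sizeof(magic_bzip2), decomp_stub },
	{ "xz", magic_xz, sizeof(magic_xz), decomp_stub },
};

/* Pick the adapter whose magic bytes match the start of the buffer. */
static const struct decomp_adapter *
decomp_find(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(adapters) / sizeof(adapters[0]); i++) {
		if (len >= adapters[i].magiclen &&
		    memcmp(buf, adapters[i].magic, adapters[i].magiclen) == 0)
			return &adapters[i];
	}
	return NULL;
}
```

With a table like this, adding a format would mean adding one source file and one table entry, and the callers would only ever see `decomp_find()` and the common function signature.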

valoq

2019-02-24 22:18

reporter   ~0003226

After taking a look at the BSD implementations, I think we can at least use the xz (unxz.c) decompression and the bzip (unbzip2.c) decompression code.

One remaining issue is that they use file descriptors for input and output rather than strings.
I am not quite sure how we should adapt the code to be used by file.
Also I still don't completely understand some of the decompression handling of file.
Since I am not an expert in C, I am rather reluctant to mess with stuff like that, especially since it involves a widely used program like file.

Nonetheless I am looking to push this to get a working implementation of seccomp into file (Debian buster just disabled seccomp as well :( )
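
On the file descriptor question above, one possible way to adapt such code is to swap its read(2)/write(2) calls for helpers with the same contract that operate on memory. The sketch below is only an illustration: `struct memsrc`, `struct memdst`, `memsrc_read`, `memdst_write` and `pump` are hypothetical names, and `pump` merely copies data to show the loop shape that a decompressor's read/process/write loop would keep.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical in-memory stand-in for an input file descriptor. */
struct memsrc {
	const unsigned char *buf;
	size_t len;
	size_t off;	/* current read position */
};

/* Same contract as read(2): returns bytes copied, 0 at end of input. */
static ssize_t
memsrc_read(struct memsrc *src, void *dst, size_t n)
{
	size_t left = src->len - src->off;

	if (n > left)
		n = left;
	memcpy(dst, src->buf + src->off, n);
	src->off += n;
	return (ssize_t)n;
}

/* Hypothetical in-memory stand-in for an output file descriptor. */
struct memdst {
	unsigned char *buf;
	size_t cap;
	size_t len;	/* bytes written so far */
};

/* Like write(2), but returns -1 when the fixed buffer would overflow. */
static ssize_t
memdst_write(struct memdst *dst, const void *src, size_t n)
{
	if (n > dst->cap - dst->len)
		return -1;
	memcpy(dst->buf + dst->len, src, n);
	dst->len += n;
	return (ssize_t)n;
}

/*
 * The usual fd-style loop, unchanged except for the two helper calls.
 * A real decompressor would transform each chunk instead of copying it.
 */
static int
pump(struct memsrc *src, struct memdst *dst)
{
	unsigned char chunk[4];
	ssize_t n;

	while ((n = memsrc_read(src, chunk, sizeof(chunk))) > 0) {
		if (memdst_write(dst, chunk, (size_t)n) != n)
			return -1;
	}
	return 0;
}
```

The point of keeping the read(2)/write(2) contract is that the borrowed decompression loops would need little more than a search-and-replace of the I/O calls, and no file descriptors or child processes are involved once the data is in memory.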

valoq

2019-03-03 14:56

reporter   ~0003228

@christos:
Can you please provide the adapter to allow for easier adoption of third-party compression code? That would be a great help.

christos

2019-07-21 18:31

manager   ~0003265

bzip, lzma, xz added by Christoph Biedl

Issue History

Date Modified     Username  Field                Change
2019-02-24 16:23  valoq     New Issue
2019-02-24 16:27  valoq     Note Added: 0003223
2019-02-24 18:09  christos  Note Added: 0003224
2019-02-24 22:18  valoq     Note Added: 0003226
2019-03-03 14:56  valoq     Note Added: 0003228
2019-07-21 18:31  christos  Note Added: 0003265