View Issue Details

ID: 0000478
Project: tcsh
Category: General
View Status: public
Last Update: 2023-09-18 16:31
Reporter: alper.akcan
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open
Product Version: 6.24.10
Summary: 0000478: Detect Microsoft Office XML files generated with different file order in ZIP
Description: If the first word, ppt, xl, or visio folder entry appears at the 6th position in the ZIP archive, the MIME type is not recognized.
Steps To Reproduce: Unzip an xlsx file, then zip it again with the command below:

zip test.zip \
./[Content_Types].xml \
./docProps/app.xml \
./docProps/core.xml \
./docProps/custom.xml \
./_rels/.rels \
./xl/_rels \
./xl/_rels/workbook.xml.rels \
./xl/sharedStrings.xml \
./xl/styles.xml \
./xl/workbook.xml \
./xl/worksheets \
./xl/worksheets/sheet18.xml \
./xl/worksheets/sheet19.xml \
./xl/worksheets/sheet1.xml \
./xl/worksheets/sheet2.xml \
./xl/worksheets/sheet3.xml
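
The same reordering can also be scripted. The following is a minimal sketch, not part of the original report: `rezip_in_order` is a hypothetical helper name, and it relies only on Python's standard `zipfile` module. It copies an archive and writes a chosen list of entries first, so the position of the `xl/` folder can be controlled when building test files.

```python
import zipfile


def rezip_in_order(src, dst, ordered_names):
    """Copy the ZIP at src to dst, writing ordered_names first, in that order.

    src and dst may be file paths or seekable file-like objects.
    Entries not listed in ordered_names keep their original relative order.
    """
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        present = zin.namelist()
        # Listed entries come first; names missing from the archive are skipped.
        order = [name for name in ordered_names if name in present]
        order += [name for name in present if name not in order]
        for name in order:
            zout.writestr(name, zin.read(name))
```

Calling it with the five non-`xl/` entries listed first (for example `rezip_in_order("in.xlsx", "test.zip", [...])`, where both file names are placeholders) should yield an archive whose first `xl/` entry is the 6th, matching the command above.
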
Additional Information: A possible patch would be:

diff --git a/magic/Magdir/msooxml b/magic/Magdir/msooxml
index eed9a418..8b606964 100644
--- a/magic/Magdir/msooxml
+++ b/magic/Magdir/msooxml
@@ -53,6 +53,11 @@
 >>>>>>>&26 default x
 >>>>>>>>&26 search/6000 PK\003\004
 >>>>>>>>>&26 use msooxml
+# Some OOXML generators order ZIP entries differently, so check the 6th file
+>>>>>>>>>&26 default x
+>>>>>>>>>>&26 search/6000 PK\003\004
+>>>>>>>>>>>&26 use msooxml
+>>>>>>>>>>>&26 default x Microsoft OOXML
 >>>>>>>>>&26 default x Microsoft OOXML
 >>>>>>>&26 default x Microsoft OOXML
 >>>>>&26 default x Microsoft OOXML
Tags: No tags attached.

Activities

gmile

2023-09-18 16:31

reporter   ~0003976

Running this appears to correctly guess the file format:

```
$ file /tmp/document-debugging.doc
/tmp/document-debugging.doc: Microsoft OOXML
$
```

Running this fails to correctly guess the file's MIME type:

```
$ file --mime --brief /tmp/document-debugging.docx
application/octet-stream; charset=binary
```

File contents:

```
$ unzip -l /tmp/document-debugging.docx
Archive: /tmp/document-debugging.docx
  Length Date Time Name
--------- ---------- ----- ----
     2107 09-18-2023 14:01 [Content_Types].xml
      592 09-18-2023 14:01 _rels/.rels
      193 09-18-2023 14:01 customXml/item1.xml
      451 09-18-2023 14:01 docProps/app.xml
      522 09-18-2023 14:01 docProps/core.xml
     1363 09-18-2023 14:01 word/_rels/document.xml.rels
    13168 09-18-2023 14:01 word/document.xml
     1631 09-18-2023 14:01 word/endnotes.xml
      828 09-18-2023 14:01 word/fontTable.xml
     1637 09-18-2023 14:01 word/footnotes.xml
     3447 09-18-2023 14:01 word/numbering.xml
     1794 09-18-2023 14:01 word/settings.xml
   341612 09-18-2023 14:01 word/styles.xml
     5307 09-18-2023 14:01 word/theme/theme1.xml
      183 09-18-2023 14:01 word/webSettings.xml
--------- -------
   374835 15 files
$
```
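
To check at which position the first `word/`, `ppt/`, `xl/`, or `visio/` entry sits, the archive can be inspected directly. The sketch below is an illustration only and is not how `file` itself works: `local_entry_names` and `first_ooxml_position` are hypothetical names. It scans for local file header signatures (`PK\003\004`) and reads each name using the standard ZIP layout (compressed size at offset 18, name length at offset 26, name at offset 30). It assumes the sizes are stored in the local headers; archives that use data descriptors may need a proper ZIP parser. Note that `unzip -l` lists the central directory, which normally matches the physical order but is not guaranteed to.

```python
import struct

OOXML_DIRS = ("word/", "ppt/", "xl/", "visio/")
LOCAL_HEADER_SIG = b"PK\x03\x04"


def local_entry_names(data):
    """Return entry names in the physical order of their local file headers."""
    names = []
    pos = data.find(LOCAL_HEADER_SIG)
    while pos != -1 and pos + 30 <= len(data):
        (comp_size,) = struct.unpack_from("<I", data, pos + 18)
        name_len, extra_len = struct.unpack_from("<HH", data, pos + 26)
        name = data[pos + 30:pos + 30 + name_len]
        names.append(name.decode("utf-8", "replace"))
        # Skip the header, name, extra field, and compressed data,
        # then look for the next local file header signature.
        next_from = pos + 30 + name_len + extra_len + comp_size
        pos = data.find(LOCAL_HEADER_SIG, next_from)
    return names


def first_ooxml_position(names):
    """Return the 1-based position of the first OOXML content entry, or None."""
    for position, name in enumerate(names, start=1):
        if name.startswith(OOXML_DIRS):
            return position
    return None
```

For the listing above, the first `word/` entry (`word/_rels/document.xml.rels`) is the 6th name, which is the case described in this issue.
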
document-debugging.docx (19,161 bytes)

Issue History

Date Modified Username Field Change
2023-09-04 10:39 alper.akcan New Issue
2023-09-18 16:31 gmile Note Added: 0003976
2023-09-18 16:31 gmile File Added: document-debugging.docx