View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000418 | file | General | public | 2023-01-20 17:58 | 2023-01-24 20:30 |
| Reporter | joveler | Assigned To | christos | ||
| Priority | normal | Severity | tweak | Reproducibility | have not tried |
| Status | resolved | Resolution | fixed | ||
| Product Version | 5.44 | ||||
| Fixed in Version | 5.45 | ||||
| Summary | 0000418: Patch for HWP file format signature | ||||
| Description | This patch revises the file format signatures for HancomOffice HWP (Hangul Word Processor) documents. HancomOffice HWP is a word processor (or semi-desktop publishing software) mainly used in the Republic of Korea. *Changes* 1. Add support for the HWPX format - Hancom is promoting a move of its primary format from HWP 5.0 to HWPX. - HWPX (OWPML) is based on the OCF specification (PKZIP container), so the signature goes into magic/Magdir/archive. 2. Update the filetype of the HWP 3.0/5.0 formats - The HWP 3.0/5.0 filetype now starts with `Hancom HWP (Hangul Word Processor) file`. - The current HWP 3.0/5.0 filetype contains `Hangul (Korean)`, which is highly ambiguous. In this context, Hangul is the trademarked name of the word processor, not the Korean script. Also, the HWP formats do not distinguish between Korean and Global releases of the HWP program. - I used the company name (Hancom) and program name (HWP), following the OOXML filetype convention (e.g. Microsoft Word 2007+). I also added the full name of the HWP program, 'Hangul Word Processor', to avoid ambiguity between the program name and the file extension. - The HWP 3.0 format is a proprietary binary format, so its signature is in magic/Magdir/wordprocessors. - The HWP 5.0 format uses the MS compound document format, like MS Office 97 ~ 2003. The filetype string is hardcoded in src/readcdf.c and also exists in magic/Magdir/ole2compounddocs. Both files were patched. *Before Patch* ``` /c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwp: Hangul (Korean) Word Processor File 5.x /c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwpx: Zip data (MIME type "application/hwp+zip"?) 
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP97.hwp: Hangul (Korean) Word Processor File 3.0 ``` *After Patch* ``` /c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwp: Hancom HWP (Hangul Word Processor) file, version 5.0 /c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwpx: Hancom HWP (Hangul Word Processor) file, HWPX /c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP97.hwp: Hancom HWP (Hangul Word Processor) file, version 3.0 ``` | ||||
| Tags | hwp hwpx magic | ||||
file-5.44-hwp.diff (3,151 bytes)
```diff
diff --git a/file-5.44-org/magic/Magdir/archive b/file-5.44-mod/magic/Magdir/archive
index a706556..abc8740 100644
--- a/file-5.44-org/magic/Magdir/archive
+++ b/file-5.44-mod/magic/Magdir/archive
@@ -1669,6 +1669,16 @@
 >>50 string epub+zip EPUB document
 !:mime application/epub+zip
+# From: Hajin Jang <jb6804@naver.com>
+# hwpx (OWPML) document format follows OCF specification.
+# Hangul Word Processor 2010+ supports HWPX format.
+# URL: https://www.hancom.com/etc/hwpDownload.do
+# https://standard.go.kr/KSCI/standardIntro/getStandardSearchView.do?menuId=503&topMenuId=502&ksNo=KSX6101
+# https://e-ks.kr/streamdocs/view/sd;streamdocsId=72059197557727331
+>>50 string hwp+zip Hancom HWP (Hangul Word Processor) file, HWPX
+!:mime application/hwp+zip
+!:ext hwpx
+
 # From: Joerg Jenderek
 # URL: http://en.wikipedia.org/wiki/CorelDRAW
 # NOTE: version; til 2 WL-based; from 3 til 13 by ./riff; from 14 zip based
diff --git a/file-5.44-org/magic/Magdir/ole2compounddocs b/file-5.44-mod/magic/Magdir/ole2compounddocs
index dc08e9c..9a89ebe 100644
--- a/file-5.44-org/magic/Magdir/ole2compounddocs
+++ b/file-5.44-mod/magic/Magdir/ole2compounddocs
@@ -262,9 +262,11 @@
 !:ext tpl
 #
 # URL: https://en.wikipedia.org/wiki/Hangul_(word_processor)
+# https://www.hancom.com/etc/hwpDownload.do
 # Note: "HWP Document File" signature found in FileHeader
+# Hangul Word Processor WORDIAN, 2002 and later is using HWP 5.0 format.
 # Second directory entry name FileHeader hint for Thinkfree Office document
->>>>128 lestring16 FileHeader : Hangul (Korean) 5.0 Word Processor File
+>>>>128 lestring16 FileHeader : Hancom HWP (Hangul Word Processor) file, version 5.0
 #!:mime application/haansofthwp
 !:mime application/x-hwp
 # https://example-files.online-convert.com/document/hwp/example.hwp
diff --git a/file-5.44-org/magic/Magdir/wordprocessors b/file-5.44-mod/magic/Magdir/wordprocessors
index be71676..034c034 100644
--- a/file-5.44-org/magic/Magdir/wordprocessors
+++ b/file-5.44-mod/magic/Magdir/wordprocessors
@@ -381,8 +381,11 @@
 >10 byte !0 \b, v%d.
 >11 byte x \b%d
-# Hangul (Korean) Word Processor File
-0 string HWP\ Document\ File Hangul (Korean) Word Processor File 3.0
+# Hancom HWP (Hangul Word Processor)
+# Hangul Word Processor 3.0 through 97 used HWP 3.0 format.
+# URL: https://www.hancom.com/etc/hwpDownload.do
+0 string HWP\ Document\ File Hancom HWP (Hangul Word Processor) file, version 3.0
+!:ext hwp
 # CosmicBook, from Benoit Rouits
 0 string CSBK Ted Neslson's CosmicBook hypertext file
diff --git a/file-5.44-org/src/readcdf.c b/file-5.44-mod/src/readcdf.c
index 1e2593a..5a730af 100644
--- a/file-5.44-org/src/readcdf.c
+++ b/file-5.44-mod/src/readcdf.c
@@ -613,7 +613,7 @@ file_trycdf(struct magic_set *ms, const struct buffer *b)
 sizeof(HWP5_SIGNATURE) - 1) == 0) {
 if (NOTMIME(ms)) {
 if (file_printf(ms,
- "Hangul (Korean) Word Processor File 5.x") == -1)
+ "Hancom HWP (Hangul Word Processor) file, version 5.0") == -1)
 return -1;
 } else if (ms->flags & MAGIC_MIME_TYPE) {
 if (file_printf(ms, "application/x-hwp") == -1)
```
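For readers unfamiliar with the magic syntax, the three signatures touched by the patch can be sketched in Python. This is an illustration, not the code `file` runs: the function name `classify_hwp` is hypothetical, the HWPX check assumes the usual OCF layout (a first ZIP entry named `mimetype`, stored uncompressed, with no extra field), and the HWP 5.0 check substitutes a crude substring search for the real compound-file directory parsing done in `src/readcdf.c`.

```python
from typing import Optional

HWP3_SIGNATURE = b"HWP Document File"           # ASCII signature at offset 0
ZIP_LOCAL_HEADER = b"PK\x03\x04"                # ZIP local file header magic
CFB_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")   # OLE2 compound file magic
FILEHEADER_NAME = "FileHeader".encode("utf-16-le")


def classify_hwp(data: bytes) -> Optional[str]:
    """Return a description for HWP-family data, or None if nothing matches."""
    prefix = "Hancom HWP (Hangul Word Processor) file, "

    # HWP 3.0: proprietary binary format with a signature at offset 0.
    if data.startswith(HWP3_SIGNATURE):
        return prefix + "version 3.0"

    # HWPX: OCF-style ZIP whose first entry is an uncompressed "mimetype".
    # Bytes 26-29 hold the name length (8) and extra-field length (0), so
    # the entry body starts at 30 + 8 = 38. After the 12-byte prefix
    # "application/", the subtype "hwp+zip" sits at offset 50, which is
    # the offset tested by the new rule in Magdir/archive.
    if (data.startswith(ZIP_LOCAL_HEADER)
            and data[26:30] == b"\x08\x00\x00\x00"
            and data[30:38] == b"mimetype"
            and data[38:50] == b"application/"
            and data[50:57] == b"hwp+zip"):
        return prefix + "HWPX"

    # HWP 5.0: OLE2 compound file with a "FileHeader" directory entry.
    # Assumption: a substring search stands in for real directory parsing.
    if data.startswith(CFB_MAGIC) and FILEHEADER_NAME in data:
        return prefix + "version 5.0"

    return None
```

Passing the first few kilobytes of a file, for example `classify_hwp(open(path, "rb").read(4096))`, should be enough for the HWP 3.0 and HWPX checks; the HWP 5.0 heuristic may need more data, since it depends on where the directory sector sits in the file.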
Note 0003890 (joveler, 2023-01-24 15:59):
Here are test sample files of the HWP format family.

Note 0003891 (christos, 2023-01-24 20:30):
Committed, thanks.

| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2023-01-20 17:58 | joveler | New Issue | |
| 2023-01-20 17:58 | joveler | Tag Attached: hwp hwpx magic | |
| 2023-01-20 17:58 | joveler | File Added: file-5.44-hwp.diff | |
| 2023-01-24 15:59 | joveler | Note Added: 0003890 | |
| 2023-01-24 15:59 | joveler | File Added: HWP97.hwp | |
| 2023-01-24 15:59 | joveler | File Added: HWP2016.hwp | |
| 2023-01-24 15:59 | joveler | File Added: HWP2016.hwpx | |
| 2023-01-24 20:30 | christos | Assigned To | => christos |
| 2023-01-24 20:30 | christos | Status | new => assigned |
| 2023-01-24 20:30 | christos | Status | assigned => resolved |
| 2023-01-24 20:30 | christos | Resolution | open => fixed |
| 2023-01-24 20:30 | christos | Fixed in Version | => 5.45 |
| 2023-01-24 20:30 | christos | Note Added: 0003891 | |