View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000418 | file | General | public | 2023-01-20 17:58 | 2023-01-24 20:30 |
Reporter | joveler | Assigned To | christos | ||
Priority | normal | Severity | tweak | Reproducibility | have not tried |
Status | resolved | Resolution | fixed | ||
Product Version | 5.44 | ||||
Fixed in Version | 5.45 | ||||
Summary | 0000418: Patch for HWP file format signature | ||||
Description |

This patch revises the HancomOffice HWP (Hangul Word Processor) document file format signatures. HancomOffice HWP is a word processor (or semi-desktop publishing software) used mainly in the Republic of Korea.

*Changes*

1. Add support for the HWPX format
   - Hancom is promoting a move of its primary supported format from HWP 5.0 to HWPX.
   - HWPX (OWPML) is based on the OCF specification (PKZIP container), so the signature goes into magic/Magdir/archive.
2. Update filetype of HWP 3.0/5.0 format
   - The HWP 3.0/5.0 filetype now starts with `Hancom HWP (Hangul Word Processor) file`.
   - The current HWP 3.0/5.0 filetype contains `Hangul (Korean)`, which is highly ambiguous. In this context, Hangul is the trademarked name of the word processor, not the Korean script. Also, the HWP formats do not distinguish between Korean and Global releases of the HWP program.
   - I put the company name (Hancom) and program name (HWP) first, following the OOXML filetype convention (e.g. Microsoft Word 2007+). I also added the full name of the HWP program, 'Hangul Word Processor', to avoid ambiguity between the program name and the file extension.
   - The HWP 3.0 format is a proprietary binary format, so its signature was already in magic/Magdir/wordprocessors.
   - The HWP 5.0 format uses the MS compound document format, similar to MS Office 97 ~ 2003. The filetype string is hardcoded in src/readcdf.c and also exists in magic/Magdir/ole2compounddocs. Both files were patched.

*Before Patch*
```
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwp: Hangul (Korean) Word Processor File 5.x
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwpx: Zip data (MIME type "application/hwp+zip"?)
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP97.hwp: Hangul (Korean) Word Processor File 3.0
```
*After Patch*
```
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwp: Hancom HWP (Hangul Word Processor) file, version 5.0
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP2016.hwpx: Hancom HWP (Hangul Word Processor) file, HWPX
/c/Joveler/Build/Joveler.FileMagician/Joveler.FileMagician.Tests/Samples/HWP97.hwp: Hancom HWP (Hangul Word Processor) file, version 3.0
```
Tags | hwp hwpx magic | ||||
|
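The HWPX rule in the patch tests for the string `hwp+zip` at offset 50. A minimal sketch of why that offset works, assuming the container starts with an uncompressed `mimetype` member that has no extra field (the helper names `build_ocf_container` and `looks_like_hwpx` are made up for this illustration and are not part of file or libmagic): the ZIP local file header is 30 bytes, the member name `mimetype` takes 8 bytes, and `application/` takes 12, so the subtype begins at byte 50.

```python
import io
import zipfile


def build_ocf_container(mime: bytes) -> bytes:
    """Build a minimal OCF-style ZIP whose first member is a stored 'mimetype'."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # ZipInfo defaults to ZIP_STORED, so the MIME string is written verbatim.
        zf.writestr(zipfile.ZipInfo("mimetype"), mime)
    return buf.getvalue()


def looks_like_hwpx(data: bytes) -> bool:
    """Hand-written byte checks for illustration only (not libmagic itself)."""
    return (
        data[0:4] == b"PK\x03\x04"                # ZIP local file header magic
        and data[26:30] == b"\x08\x00\x00\x00"    # name length 8, extra length 0
        and data[30:38] == b"mimetype"            # first member name
        and data[38:50] == b"application/"        # MIME type prefix
        and data[50:57] == b"hwp+zip"             # subtype tested at offset 50
    )


if __name__ == "__main__":
    print(looks_like_hwpx(build_ocf_container(b"application/hwp+zip")))
    print(looks_like_hwpx(build_ocf_container(b"application/epub+zip")))
```

The same layout is what lets the neighbouring EPUB rule test `epub+zip` at offset 50; a container whose `mimetype` member is compressed or not first would not match either rule.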
file-5.44-hwp.diff (3,151 bytes)
diff --git a/file-5.44-org/magic/Magdir/archive b/file-5.44-mod/magic/Magdir/archive
index a706556..abc8740 100644
--- a/file-5.44-org/magic/Magdir/archive
+++ b/file-5.44-mod/magic/Magdir/archive
@@ -1669,6 +1669,16 @@
 >>50 string epub+zip EPUB document
 !:mime application/epub+zip
 
+# From: Hajin Jang <jb6804@naver.com>
+# hwpx (OWPML) document format follows OCF specification.
+# Hangul Word Processor 2010+ supports HWPX format.
+# URL: https://www.hancom.com/etc/hwpDownload.do
+# https://standard.go.kr/KSCI/standardIntro/getStandardSearchView.do?menuId=503&topMenuId=502&ksNo=KSX6101
+# https://e-ks.kr/streamdocs/view/sd;streamdocsId=72059197557727331
+>>50 string hwp+zip Hancom HWP (Hangul Word Processor) file, HWPX
+!:mime application/hwp+zip
+!:ext hwpx
+
 # From: Joerg Jenderek
 # URL: http://en.wikipedia.org/wiki/CorelDRAW
 # NOTE: version; til 2 WL-based; from 3 til 13 by ./riff; from 14 zip based
diff --git a/file-5.44-org/magic/Magdir/ole2compounddocs b/file-5.44-mod/magic/Magdir/ole2compounddocs
index dc08e9c..9a89ebe 100644
--- a/file-5.44-org/magic/Magdir/ole2compounddocs
+++ b/file-5.44-mod/magic/Magdir/ole2compounddocs
@@ -262,9 +262,11 @@
 !:ext tpl
 #
 # URL: https://en.wikipedia.org/wiki/Hangul_(word_processor)
+# https://www.hancom.com/etc/hwpDownload.do
 # Note: "HWP Document File" signature found in FileHeader
+# Hangul Word Processor WORDIAN, 2002 and later is using HWP 5.0 format.
 # Second directory entry name FileHeader hint for Thinkfree Office document
->>>>128 lestring16 FileHeader : Hangul (Korean) 5.0 Word Processor File
+>>>>128 lestring16 FileHeader : Hancom HWP (Hangul Word Processor) file, version 5.0
 #!:mime application/haansofthwp
 !:mime application/x-hwp
 # https://example-files.online-convert.com/document/hwp/example.hwp
diff --git a/file-5.44-org/magic/Magdir/wordprocessors b/file-5.44-mod/magic/Magdir/wordprocessors
index be71676..034c034 100644
--- a/file-5.44-org/magic/Magdir/wordprocessors
+++ b/file-5.44-mod/magic/Magdir/wordprocessors
@@ -381,8 +381,11 @@
 >10 byte !0 \b, v%d.
 >11 byte x \b%d
 
-# Hangul (Korean) Word Processor File
-0 string HWP\ Document\ File Hangul (Korean) Word Processor File 3.0
+# Hancom HWP (Hangul Word Processor)
+# Hangul Word Processor 3.0 through 97 used HWP 3.0 format.
+# URL: https://www.hancom.com/etc/hwpDownload.do
+0 string HWP\ Document\ File Hancom HWP (Hangul Word Processor) file, version 3.0
+!:ext hwp
 
 # CosmicBook, from Benoit Rouits
 0 string CSBK Ted Neslson's CosmicBook hypertext file
diff --git a/file-5.44-org/src/readcdf.c b/file-5.44-mod/src/readcdf.c
index 1e2593a..5a730af 100644
--- a/file-5.44-org/src/readcdf.c
+++ b/file-5.44-mod/src/readcdf.c
@@ -613,7 +613,7 @@ file_trycdf(struct magic_set *ms, const struct buffer *b)
      sizeof(HWP5_SIGNATURE) - 1) == 0) {
      if (NOTMIME(ms)) {
   if (file_printf(ms,
-      "Hangul (Korean) Word Processor File 5.x") == -1)
+      "Hancom HWP (Hangul Word Processor) file, version 5.0") == -1)
       return -1;
      } else if (ms->flags & MAGIC_MIME_TYPE) {
   if (file_printf(ms, "application/x-hwp") == -1)
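The wordprocessors rule in the diff is a plain string match at offset 0. A rough sketch of that check in Python, assuming only what the diff shows (the function name `describe_by_prefix` and the fallback labels are invented for this example; real detection is done by libmagic, and HWP 5.0 additionally needs the compound-document directory to be inspected for the `FileHeader` entry, which this sketch does not attempt):

```python
HWP3_SIGNATURE = b"HWP Document File"               # string tested at offset 0
CFB_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")       # OLE2 compound document header


def describe_by_prefix(data: bytes) -> str:
    """Classify a buffer by its leading bytes only (illustrative, not libmagic)."""
    if data.startswith(HWP3_SIGNATURE):
        # Mirrors the patched filetype string for the HWP 3.0 rule.
        return "Hancom HWP (Hangul Word Processor) file, version 3.0"
    if data.startswith(CFB_MAGIC):
        # HWP 5.0 lives inside this container; the prefix alone cannot confirm it.
        return "OLE2 compound document (FileHeader check needed for HWP 5.0)"
    return "unknown"


if __name__ == "__main__":
    print(describe_by_prefix(HWP3_SIGNATURE + b" ...rest of file..."))
    print(describe_by_prefix(CFB_MAGIC + bytes(504)))
```

Because the HWP 5.0 `FileHeader` stream carries the same "HWP Document File" text inside the container, the offset-0 test only ever fires for the older standalone binary format.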
|
Here are test sample files of the HWP format family. |
|
Committed, thanks |
Date Modified | Username | Field | Change |
---|---|---|---|
2023-01-20 17:58 | joveler | New Issue | |
2023-01-20 17:58 | joveler | Tag Attached: hwp hwpx magic | |
2023-01-20 17:58 | joveler | File Added: file-5.44-hwp.diff | |
2023-01-24 15:59 | joveler | Note Added: 0003890 | |
2023-01-24 15:59 | joveler | File Added: HWP97.hwp | |
2023-01-24 15:59 | joveler | File Added: HWP2016.hwp | |
2023-01-24 15:59 | joveler | File Added: HWP2016.hwpx | |
2023-01-24 20:30 | christos | Assigned To | => christos |
2023-01-24 20:30 | christos | Status | new => assigned |
2023-01-24 20:30 | christos | Status | assigned => resolved |
2023-01-24 20:30 | christos | Resolution | open => fixed |
2023-01-24 20:30 | christos | Fixed in Version | => 5.45 |
2023-01-24 20:30 | christos | Note Added: 0003891 |