View Issue Details

ID: 0000049
Project: file
Category: [All Projects] General
View Status: public
Last Update: 2018-10-19 01:04
Reporter: giosh94mhz
Assigned To: christos
Priority: normal
Severity: minor
Reproducibility: always
Status: resolved
Resolution: fixed
Product Version: 5.34
Target Version:
Fixed in Version: HEAD
Summary: 0000049: Normal TSV identified as Algol 68
Description: The rules for Algol 68 are too broad and give false positives, especially with tab-separated text files.

Since Algol is pretty rare and TSV is very common, this can be an annoying issue. For now, I've attached a patch that makes the regex matches stricter (word boundaries) and limits each search to the first 1024 bytes.
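
To illustrate why an unanchored keyword pattern misfires on tabular text, here is a minimal Python sketch. It is not how `file` evaluates magic rules: the patterns are Python approximations of the REF rule before and after the attached patch, the `matches` helper is hypothetical, the sample row is made up (it is not the content of sample.xls), and `limit=None` simply means "no slice applied" rather than modelling whatever default range `file` uses.

```python
import re

# Python approximations of the REF rule (assumption: loosely translated
# from the magic syntax in the patch, not taken from file's own code).
OLD_REF = re.compile(rb"REF[\t ]")      # before the patch: matches inside longer words
NEW_REF = re.compile(rb"\bREF[\t ]")    # after the patch: REF must start at a word boundary


def matches(pattern, data, limit=None):
    """Hypothetical helper: search only the first `limit` bytes (all bytes if None)."""
    head = data if limit is None else data[:limit]
    return pattern.search(head) is not None


# Made-up TSV header row; "HREF" followed by a tab contains "REF\t".
tsv_row = b"ITEM\tHREF\tCOUNT\n"
print(matches(OLD_REF, tsv_row))              # old pattern fires on the TSV row
print(matches(NEW_REF, tsv_row, limit=1024))  # stricter pattern does not

# A keyword that only appears after the first 1024 bytes is ignored.
late_keyword = b"x" * 2000 + b"\nREF \n"
print(matches(NEW_REF, late_keyword, limit=1024))
```

A column header that is exactly `REF` or `MODE` would still match the stricter pattern, which is why the patch is described as a first, non-optimal solution.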

A better solution would be to run multiple checks for a rare file type like this, but I'm not an Algol developer, so I'm a bit clueless here. I think we should use a combined match that ensures 2-3 Algol keywords are present (e.g. PROC && MODE && REF instead of PROC || MODE || REF).
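
The combined-match idea can be sketched in Python as follows. This is only an illustration of the "require several keywords" logic, not a magic rule and not what was committed: the function name, the `min_hits` threshold, the Python translations of the patterns, and both sample inputs are assumptions made for the example.

```python
import re

# Python approximations of the four active patterns from the patch
# (assumption: the translation from magic syntax is approximate).
ALGOL_HINTS = {
    "PROC": re.compile(rb"^PROC", re.MULTILINE),
    "MODE": re.compile(rb"\bMODE[\t ]"),
    "REF": re.compile(rb"\bREF[\t ]"),
    "FLEX": re.compile(rb"\bFLEX[\t ]*\["),
}


def looks_like_algol68(data, min_hits=2, limit=1024):
    """Hypothetical check: at least `min_hits` distinct keywords in the first `limit` bytes."""
    head = data[:limit]
    hits = sum(1 for pattern in ALGOL_HINTS.values() if pattern.search(head))
    return hits >= min_hits


# Made-up inputs: a TSV header that happens to contain MODE, and an
# Algol-like snippet that uses several keywords.
tsv = b"ID\tMODE\tVALUE\n1\tfast\t3\n"
algol_like = b"PROC main = VOID:\nBEGIN\n  MODE VEC = REF [] REAL;\n  SKIP\nEND\n"

print(looks_like_algol68(tsv))         # one keyword only, below the threshold
print(looks_like_algol68(algol_like))  # PROC, MODE and REF all present
```

Requiring two hits rather than all of them keeps the check tolerant of Algol sources that do not use every keyword, at the cost of still misfiring on tabular text that contains two of them.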

Steps To Reproduce: See the attached sample.xls (actually a TSV file) and a first, non-optimal solution in file-algol.patch.
Tags: No tags attached.

Activities

giosh94mhz

2018-10-18 14:33

reporter  

sample.xls (62 bytes)
file-algol.patch (1,069 bytes)
commit ace4bbf03b69d843ef49b34daefdeb0d2580c150 (HEAD -> refs/heads/v5.34)
Author: Giorgio Premi <giosh94mhz@gmail.com>
Date:   Thu Oct 18 16:01:11 2018 +0200

    Stricted Algol checks
---
 magic/Magdir/algol68 | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/magic/Magdir/algol68 b/magic/Magdir/algol68
index 68583dfa..2479192d 100644
--- a/magic/Magdir/algol68
+++ b/magic/Magdir/algol68
@@ -5,13 +5,13 @@
 #
 0	search/8192	(input,			Algol 68 source text
 !:mime	text/x-Algol68
-0	regex		\^PROC			Algol 68 source text
+0	regex/1024	\^PROC			Algol 68 source text
 !:mime	text/x-Algol68
-0	regex           MODE[\t\ ]		Algol 68 source text
+0	regex/1024	\bMODE[\t\ ]		Algol 68 source text
 !:mime	text/x-Algol68
-0	regex          	REF[\t\ ]		Algol 68 source text
+0	regex/1024	\bREF[\t\ ]		Algol 68 source text
 !:mime	text/x-Algol68
-0	regex          	FLEX[\t\ ]\*\\[		Algol 68 source text
+0	regex/1024	\bFLEX[\t\ ]\*\\[	Algol 68 source text
 !:mime	text/x-Algol68
 #0	regex          	[\t\ ]OD		Algol 68 source text
 #!:mime	text/x-Algol68

christos

2018-10-19 01:04

manager   ~0000098

applied, thanks!

Issue History

Date Modified Username Field Change
2018-10-18 14:33 giosh94mhz New Issue
2018-10-18 14:33 giosh94mhz File Added: file-algol.patch
2018-10-18 14:33 giosh94mhz File Added: sample.xls
2018-10-19 01:04 christos Assigned To => christos
2018-10-19 01:04 christos Status new => assigned
2018-10-19 01:04 christos Status assigned => resolved
2018-10-19 01:04 christos Resolution open => fixed
2018-10-19 01:04 christos Fixed in Version => HEAD
2018-10-19 01:04 christos Note Added: 0000098