View Issue Details

ID: 0000205
Project: file
Category: General
View Status: public
Last Update: 2021-02-06 21:45
Reporter: utoddl
Assigned To: christos
Priority: low
Severity: feature
Reproducibility: N/A
Status: assigned
Resolution: open
Product Version: 5.38
Summary: 0000205: option to indicate whether txt files have terminating line ending
Description: It would be nice to have an option to indicate whether the last line of a text file, or really any file that's made of lines of text, has a terminator on its last line.

This could be off by default since it would involve a seek to the end and thus could impact performance and backward compatibility in the output. Or not, if that's not a big deal.
Tags: No tags attached.

Activities

christos

2021-02-05 23:03

manager   ~0003545

That's not easy because we don't always read the whole file.

utoddl

2021-02-05 23:27

reporter   ~0003548

Exactly why I suggested it would have to be an off-by-default option. (Although now I'm having second thoughts...)

It would only apply to inputs which had already been determined to be some sort of "text" in a seekable file, and not from some exotic source like a pipe.

Only then would it be reasonable to seek to the end (certainly not reading all the data -- that could be huge) and check the last couple of bytes for some combination of CR and LF, at which point an appropriate string could be added to that file's output.
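To make the idea above concrete, here is a rough sketch in Python of the seek-to-the-end check. It is not how `file` works internally and is not a proposed patch. The function name `last_line_terminator` and its return strings are made up for illustration, and it assumes the caller has already decided the input is some kind of text.

```python
import os
import stat


def last_line_terminator(path):
    """Report how the last line of a regular file ends.

    Returns "CRLF", "LF", "CR", or "none". Returns None when the check
    does not apply (not a regular file, or an empty file).
    Hypothetical helper for illustration only.
    """
    st = os.stat(path)
    # Pipes, sockets, and devices are skipped: only regular files are
    # treated as safely seekable here.
    if not stat.S_ISREG(st.st_mode):
        return None
    if st.st_size == 0:
        return None

    # Read at most the final two bytes; the rest of the file is never touched.
    tail_len = min(2, st.st_size)
    with open(path, "rb") as f:
        f.seek(-tail_len, os.SEEK_END)
        tail = f.read(tail_len)

    # Test CRLF first so it is not misreported as a bare LF.
    if tail.endswith(b"\r\n"):
        return "CRLF"
    if tail.endswith(b"\n"):
        return "LF"
    if tail.endswith(b"\r"):
        return "CR"
    return "none"
```

The cost is one `stat`, one seek, and a two-byte read per file, independent of file size. A multi-byte encoding such as UTF-16 would need a wider tail and different byte patterns, which this sketch does not attempt.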

christos

2021-02-06 13:08

manager   ~0003549

All this is doable, but the cost-benefit balance (from both the code complexity and performance perspectives) leans heavily toward the cost and not the benefit.

utoddl

2021-02-06 21:45

reporter   ~0003550

I just crawled around through the code, and, yeah, I have to agree with you. :( It seems such a simple ask, but the existing code would have to be significantly reworked to wedge it in. I'm sorry to say we can let this one go until the next extreme makeover, because that's what it would take.

If you ever get the ./TODO list knocked out, particularly such that struct buffer becomes a thing that you can query, reload different parts of, etc., then implementing tests at the tail end of files would be much more straightforward. You'd still have to deal with piped data which may be very large or even unending. It would be neat if it could say, for example, how many pages are in a .pdf, how many YAML bodies are in a .yaml, or whether a text file is terminated. It would also be neat if `file` could produce its output in a format more consumable by scripts, maybe JSON. But that's a different RFQ altogether.

Thanks anyway.

Issue History

Date Modified Username Field Change
2020-10-23 12:27 utoddl New Issue
2021-02-05 23:03 christos Assigned To => christos
2021-02-05 23:03 christos Status new => assigned
2021-02-05 23:03 christos Status assigned => feedback
2021-02-05 23:03 christos Note Added: 0003545
2021-02-05 23:27 utoddl Note Added: 0003548
2021-02-05 23:27 utoddl Status feedback => assigned
2021-02-06 13:08 christos Note Added: 0003549
2021-02-06 21:45 utoddl Note Added: 0003550