0000205: option to indicate whether txt files have terminating line ending - MantisBT

ID	Project	Category	View Status	Date Submitted	Last Update

0000205	file	General	public	2020-10-23 12:27	2021-02-06 21:45

Reporter	utoddl	Assigned To	christos
Priority	low	Severity	feature	Reproducibility	N/A
Status	assigned	Resolution	open
Product Version	5.38

Summary	0000205: option to indicate whether txt files have terminating line ending
Description	It would be nice to have an option to indicate whether the last line of a text file, or really any file that's made of lines of text, has a terminator on its last line. This could be off by default since it would involve a seek to the end and thus could impact performance and backward compatibility in the output. Or not, if that's not a big deal.
Tags	No tags attached.

christos 2021-02-05 23:03 manager ~0003545	That's not easy because we don't always read the whole file.

utoddl 2021-02-05 23:27 reporter ~0003548	Exactly why I suggested it would have to be an off-by-default option. (Although now I'm having second thoughts...) It would only apply to inputs which had already been determined to be some sort of "text" in a seekable file, and not from some exotic source like a pipe. Only then would it be reasonable to seek to the end (certainly not reading all the data -- that could be huge) and check the last couple of bytes for some combination of CR and LF, at which point an appropriate string could be added to that file's output.
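For illustration only, the check described in the note above (seek to the end of a seekable file and inspect the last couple of bytes for CR and LF) might look roughly like the following Python sketch. This is not code from `file`, which is written in C; the function name `final_line_ending` and the returned labels are hypothetical and chosen just for this example. It assumes the input has already been identified as text and lives in a regular file.

```python
import os
import stat


def final_line_ending(path):
    """Classify the terminator on the last line of a regular file.

    Returns "CRLF", "LF", "CR", "none" (no terminator), "empty" (zero-length
    file), or None when the path is not a regular file (e.g. a pipe), in
    which case no check is attempted.
    """
    # Refuse anything that is not a regular file before opening it,
    # since opening a FIFO could block and it cannot be seeked anyway.
    if not stat.S_ISREG(os.stat(path).st_mode):
        return None

    with open(path, "rb") as f:
        # Jump to the end instead of reading the (possibly huge) contents.
        size = f.seek(0, os.SEEK_END)
        if size == 0:
            return "empty"
        # Only the last two bytes are needed to tell CRLF, LF and CR apart.
        f.seek(max(size - 2, 0))
        tail = f.read(2)

    if tail.endswith(b"\r\n"):
        return "CRLF"
    if tail.endswith(b"\n"):
        return "LF"
    if tail.endswith(b"\r"):
        return "CR"
    return "none"
```

The sketch reads at most two bytes regardless of file size, which is the property the note asks for. It says nothing about how such a check would fit into the existing buffer handling in `file`.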

christos 2021-02-06 13:08 manager ~0003549	All this is doable, but the cost-benefit (from both the code complexity and performance perspective) leans heavily toward the cost and not the benefit.

utoddl 2021-02-06 21:45 reporter ~0003550	I just crawled around through the code, and, yeah, I have to agree with you. :( It seems such a simple ask, but the existing code would have to be significantly reworked to wedge it in. I'm sorry to say we can let this one go until the next extreme makeover, because that's what it would take. If you ever get the ./TODO list knocked out, particularly such that struct buffer becomes a thing that you can query, reload different parts of, etc., then implementing tests at the tail end of files would be much more straightforward. You'd still have to deal with piped data which may be very large or even unending. It would be neat if it could say, for example, how many pages are in a .pdf or how many YAML bodies are in a .yaml or whether a text file is terminated. It would also be neat if `file` could produce its output in a format more consumable by scripts, maybe json. But that's a different RFQ altogether. Thanks anyway.

Date Modified	Username	Field	Change
2020-10-23 12:27	utoddl	New Issue
2021-02-05 23:03	christos	Assigned To	=> christos
2021-02-05 23:03	christos	Status	new => assigned
2021-02-05 23:03	christos	Status	assigned => feedback
2021-02-05 23:03	christos	Note Added: 0003545
2021-02-05 23:27	utoddl	Note Added: 0003548
2021-02-05 23:27	utoddl	Status	feedback => assigned
2021-02-06 13:08	christos	Note Added: 0003549
2021-02-06 21:45	utoddl	Note Added: 0003550