View Issue Details

ID: 0000109
Project: file
Category: General
View Status: public
Last Update: 2019-11-02 18:42
Reporter: bcb
Assigned To: christos
Priority: normal
Severity: minor
Reproducibility: always
Status: feedback
Resolution: open
Platform: Linux
OS: Arch Linux
Product Version: 5.37
Summary: 0000109: reStructuredText with embedded code detected as Python script
Description: We're documenting our Python project with reStructuredText files. These contain various snippets of Python code which serve as examples. I've attached a simple document showing this.

When run on this document, file picks up on the embedded code and detects it as a Python script:

$ file -v
file-5.37
magic file from /usr/share/file/misc/magic
$ file document.rst
document.rst: Python script, ASCII text executable
$ file -b --mime-type document.rst
text/x-python

The particular use-case where we ran into this was in a Git hook which runs a code formatting tool to reject any Python code which doesn't meet the project coding style. It was using file to only run the checker on Python files. For the time being we've changed this to a simple file extension test.
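
The extension test mentioned above could look roughly like the following sketch. The hook's actual code is not part of this report, so the helper names (`is_python_file`, `select_python_files`) and the choice to match only `.py` are assumptions made for illustration.

```python
from pathlib import PurePath

# Hypothetical helpers: the real hook is not shown in this report.
# Assumption: only files ending in ".py" count as Python sources.
PYTHON_SUFFIXES = {".py"}


def is_python_file(path):
    """Return True if the path has a Python source extension."""
    return PurePath(path).suffix.lower() in PYTHON_SUFFIXES


def select_python_files(paths):
    """Keep only the paths that should be passed to the style checker."""
    return [p for p in paths if is_python_file(p)]
```

With this filter, `document.rst` is skipped regardless of what it contains, at the cost of missing Python scripts that have no extension.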

Note that it doesn't happen on every document, just those with enough magic matches to get the strength high enough. On documents which don't meet the threshold it reports "ASCII text" (MIME type text/plain). Ideally, it would report this in all cases.
Tags: No tags attached.

Activities

bcb

2019-09-27 12:10

reporter  

document.rst (566 bytes)   
Documentation
=============

This is some documentation for our project.

It is written in reStructuredText format. However, it contains embedded Python
code which serve as examples. For instance, the following snippet:

.. code:: python

    import numpy as np
    import matplotlib.pyplot as plt

    def test_function(A):
        f = 17.2
        t = np.linspace(0.5, 0.6, 200)
        return t, A * np.sin(2 * np.pi * f * t)

    time, signal = test_function(1.55)
    plt.plot(time, signal)

Ideally, this document would *not* be identified as a Python script.

christos

2019-11-02 18:42

manager   ~0003328

Added some magic to recognize ReStructuredText but it is not easy as there is no pattern for it.
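
To illustrate why there is no single pattern to match, here is a rough heuristic sketch in Python. It is not the magic that was added to file; the function name `looks_like_rst` and the two signals it checks (a directive line such as `.. code:: python`, or a title followed by an adornment line at least as long as the title) are assumptions chosen for illustration.

```python
import re

# Illustrative heuristic only; this is NOT the rule set used by file(1).
# A directive line looks like ".. name::" at the start of a line.
DIRECTIVE_RE = re.compile(r"^\.\. [A-Za-z][\w-]*::")
# A section underline repeats one adornment character (a common subset here).
UNDERLINE_RE = re.compile(r"^([=\-~^#*+])\1{2,}\s*$")


def looks_like_rst(text):
    """Guess whether text is reStructuredText based on two weak signals."""
    lines = text.splitlines()
    # Pair every line with the line before it ("" for the first line).
    for prev, line in zip([""] + lines, lines):
        if DIRECTIVE_RE.match(line):
            return True
        title = prev.strip()
        if title and UNDERLINE_RE.match(line) and len(line.rstrip()) >= len(title):
            return True
    return False
```

Both signals are weak: a Markdown file with a setext-style heading would also pass the underline check, which is consistent with the remark that recognizing the format is not easy.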

Issue History

Date Modified    | Username | Field                    | Change
2019-09-27 12:10 | bcb      | New Issue                |
2019-09-27 12:10 | bcb      | File Added: document.rst |
2019-11-02 18:41 | christos | Assigned To              | => christos
2019-11-02 18:41 | christos | Status                   | new => assigned
2019-11-02 18:42 | christos | Status                   | assigned => feedback
2019-11-02 18:42 | christos | Note Added: 0003328      |