View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000264 | file | General | public | 2021-05-09 22:47 | 2021-05-10 01:11 |
Reporter | jbosboom | Assigned To | christos | ||
Priority | normal | Severity | feature | Reproducibility | have not tried |
Status | assigned | Resolution | open | ||
Summary | 0000264: Magic for Python pickle serialization format | ||||
Description | Pickle is a Python serialization format. Starting with protocol version 2, pickles begin with a 2-byte protocol header, and starting with version 4 they also have a frame opcode that provides a length hint (for that frame, not necessarily for the whole pickle). All versions end with a period. Version 0 is an ASCII text format and version 1 adds some binary opcodes, but as neither has a version header, they are not definable with magic. Pickles that have been modified by removing the header/framing or adding trailing garbage can still be deserialized, but are also not definable by magic.

    0 string \x80\x02
    >-1 byte 0x2e Python pickle data, protocol version 2
    0 string \x80\x03
    >-1 byte 0x2e Python pickle data, protocol version 3
    0 string \x80\x04\x95
    >-1 byte 0x2e Python pickle data, protocol version 4
    0 string \x80\x05\x95
    >-1 byte 0x2e Python pickle data, protocol version 5

Pickle is defined by the reference implementation; see https://docs.python.org/3/library/pickle.html#data-stream-format and the PEPs linked from that section. For testing, https://gist.github.com/jbosboom/1438dcbc304b7325802c36257f5dede9 is a Python script that creates pickles of each version containing the same data. `python -m pickletools <file>` can be used to disassemble a pickle.

According to `man magic`, negative offsets can only appear at the top level or as a continuation offset (with &). The definitions above use a negative offset on a plain continuation line, so this is evidently not a limitation in practice. Conversely, my initial attempt at pickle magic, given below, puts the negative offset at the top level and does not seem to work:

    -1 byte 0x2e
    >0 string \x80\x02 Python pickle data, protocol version 2
    >0 string \x80\x03 Python pickle data, protocol version 3
    >0 string \x80\x04\x95 Python pickle data, protocol version 4
    >0 string \x80\x05\x95 Python pickle data, protocol version 5

You may wish to bring the implementation and the documentation into alignment. | ||||
Tags | No tags attached. | ||||
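
For anyone wanting to sanity-check the proposed rules without building file(1), the following is a minimal Python sketch that mirrors the same two tests: a header prefix at offset 0 and a period as the last byte. The function name `guess_pickle_protocol` and the sample object are hypothetical and are not taken from the gist linked in the description. This only imitates the logic of the magic entries and says nothing about how file(1) evaluates negative offsets.

```python
import pickle

# Header prefixes copied from the four magic entries in the description.
# 0x80 is the PROTO opcode; 0x95 is the FRAME opcode used from protocol 4 on.
_HEADERS = {
    b"\x80\x02": 2,
    b"\x80\x03": 3,
    b"\x80\x04\x95": 4,
    b"\x80\x05\x95": 5,
}


def guess_pickle_protocol(data: bytes):
    """Return the protocol version the magic rules would report, or None.

    Hypothetical helper: it applies the same checks as the proposed magic,
    it does not validate that the data is a well-formed pickle.
    """
    # Equivalent of ">-1 byte 0x2e": the last byte must be the STOP opcode '.'.
    if not data.endswith(b"."):
        return None
    # Equivalent of "0 string ...": match one of the known header prefixes.
    for prefix, version in _HEADERS.items():
        if data.startswith(prefix):
            return version
    return None


if __name__ == "__main__":
    sample = {"key": [1, 2, 3]}  # arbitrary test payload (assumption)
    for protocol in range(0, pickle.HIGHEST_PROTOCOL + 1):
        blob = pickle.dumps(sample, protocol=protocol)
        print(protocol, guess_pickle_protocol(blob))
```

Protocols 0 and 1 come back as `None`, matching the statement that they have no version header. As far as I can tell, CPython may also omit the frame opcode for very small payloads under protocols 4 and 5 (for example a pickled `None`), so both this sketch and the proposed magic would miss such files.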