View Issue Details

ID: 0000264
Project: file
Category: General
View Status: public
Last Update: 2021-05-10 01:11
Reporter: jbosboom
Assigned To: christos
Priority: normal
Severity: feature
Reproducibility: have not tried
Status: assigned
Resolution: open
Summary: 0000264: Magic for Python pickle serialization format
Description:
Pickle is a Python serialization format. Starting with version 2, pickles begin with a 2-byte protocol header (the PROTO opcode 0x80 followed by the version byte). Starting with version 4, the header is followed by a FRAME opcode (0x95) that provides a length hint (for that frame, not necessarily for the whole pickle). All versions end with a period (0x2e, the STOP opcode). Version 0 is an ASCII text format and version 1 adds some binary opcodes, but as neither has a version header, they cannot be defined with magic. Pickles that have been modified by removing the header/framing or by adding trailing garbage can still be deserialized, but they cannot be defined with magic either.

0 string \x80\x02
>-1 byte 0x2e Python pickle data, protocol version 2
0 string \x80\x03
>-1 byte 0x2e Python pickle data, protocol version 3
0 string \x80\x04\x95
>-1 byte 0x2e Python pickle data, protocol version 4
0 string \x80\x05\x95
>-1 byte 0x2e Python pickle data, protocol version 5

Pickle is defined by the reference implementation; see https://docs.python.org/3/library/pickle.html#data-stream-format and the PEPs linked from that section. For testing, https://gist.github.com/jbosboom/1438dcbc304b7325802c36257f5dede9 is a Python script that creates pickles of each version containing the same data. `python -m pickletools <file>` can be used to disassemble a pickle.
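
To make the byte layout described above concrete, here is a minimal Python sketch that applies the same checks as the proposed magic (header bytes at offset 0, a period as the last byte) to freshly generated pickles. The helper name `describe_pickle` and the sample data are made up for illustration and are not part of `file` or of the gist linked above; protocol 5 assumes Python 3.8 or later.

```python
import pickle


def describe_pickle(data: bytes):
    """Return the protocol version if `data` matches the proposed magic, else None.

    Hypothetical helper: mirrors the magic entries only, it does not validate
    that the payload is a well-formed pickle.
    """
    # Every pickle ends with the STOP opcode, an ASCII period (0x2e).
    if len(data) < 3 or data[-1] != 0x2E:
        return None
    # Protocol 2 and later start with the PROTO opcode (0x80) and a version byte.
    if data[0] != 0x80:
        return None
    version = data[1]
    if version in (2, 3):
        return version
    # The magic for protocols 4 and 5 additionally requires the FRAME opcode (0x95).
    if version in (4, 5) and data[2] == 0x95:
        return version
    return None


if __name__ == "__main__":
    sample = {"key": [1, 2, 3]}  # arbitrary test data
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
        data = pickle.dumps(sample, protocol=protocol)
        print(protocol, data[:3], describe_pickle(data))
```

Protocols 0 and 1 are reported as `None` because they carry no version header, and a pickle with trailing garbage is also rejected, matching the limitations noted in the description.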


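According to `man magic`, negative offsets (as used in the magic definitions above) can appear only at the top level or as a continuation offset (with &). The implementation evidently does not behave that way: the definitions above use a negative offset on an ordinary continuation line (without &) and work, whereas my initial attempt at pickle magic, given below, uses a negative offset at the top level and does not seem to work: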

-1 byte 0x2e
>0 string \x80\x02 Python pickle data, protocol version 2
>0 string \x80\x03 Python pickle data, protocol version 3
>0 string \x80\x04\x95 Python pickle data, protocol version 4
>0 string \x80\x05\x95 Python pickle data, protocol version 5

You may wish to bring the implementation and the documentation into alignment.
Tags: No tags attached.

Activities

There are no notes attached to this issue.

Issue History

Date Modified Username Field Change
2021-05-09 22:47 jbosboom New Issue
2021-05-10 01:11 christos Assigned To => christos
2021-05-10 01:11 christos Status new => assigned