Feature or enhancement
Proposal:
The email.header.decode_header function calls parts.pop(0) repeatedly in a loop, where parts is a list. This is slow, because each pop(0) has to shift all of the remaining elements down by one position.
I have a local change that stops calling pop(0) and instead walks the list by index, which avoids the shifting.
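To illustrate the general pattern (this is not the actual patch), here is a minimal sketch with two hypothetical helpers, consume_with_pop and consume_with_index. Both walk a list laid out as plain text followed by (charset, encoding, payload) triples. That layout and the tuple output format are assumptions made for the example, not the real decode_header internals.

```python
def consume_with_pop(parts):
    """Drain the list from the front; each pop(0) shifts the remaining items."""
    parts = list(parts)  # copy so the caller's list is not mutated
    out = []
    while parts:
        out.append(("text", parts.pop(0)))
        if parts:
            charset = parts.pop(0)
            encoding = parts.pop(0)
            payload = parts.pop(0)
            out.append((charset, encoding, payload))
    return out


def consume_with_index(parts):
    """Same traversal, but advance an index instead of mutating the list."""
    out = []
    i = 0
    n = len(parts)
    while i < n:
        out.append(("text", parts[i]))
        i += 1
        if i < n:
            out.append((parts[i], parts[i + 1], parts[i + 2]))
            i += 3
    return out


# Hypothetical sample input: text, one (charset, encoding, payload) triple, text.
sample = ["", "iso-8859-1", "q", "Andr=E9", " Pirard <pirard@dom.ain>"]
assert consume_with_pop(sample) == consume_with_index(sample)
```

Both helpers return the same result. The index version just avoids the per-call shifting, which matters more as the list grows.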
This is what pyperf says (the test string is copied from unit tests):
$ python -m pyperf timeit 'from email.header import decode_header;decode_header("=?ISO-8859-1?Q?Andr=E9?= Pirard <pirard@dom.ain>")' -o email_decode_header_baseline.json
# apply optimization
$ python -m pyperf timeit 'from email.header import decode_header;decode_header("=?ISO-8859-1?Q?Andr=E9?= Pirard <pirard@dom.ain>")' -o email_decode_header_optimized.json
$ python -m pyperf compare_to email_decode_header_baseline.json email_decode_header_optimized.json
Mean +- std dev: [email_decode_header_baseline] 6.21 us +- 0.05 us -> [email_decode_header_optimized] 6.11 us +- 0.06 us: 1.02x faster
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response
Linked PRs