Feature or enhancement
Proposal:
Adding adaptivity to binarysort routine of list.sort.
See PR for specifics.
Initial Post (Outdated)
Note:
- This is POC that this has a observable impact.
- This is subject to further calibration and optimizations, but high level concept is dicusable.
- Will issue PR shortly.
Rationale
Galloping provides adaptivity when merging runs.
However, underlying binarysort always does O(nlogn).
Concept
The concept is simple:
binarysort optionally does adaptive routine.
- It has a mechanism to switch it off during the run and go to simple binarysort
- It returns 1 if it has completed full data with binary routine and 0 otherwise
timsort calls binarysort
- If it returns 1, then use it again next time
- If it returns 0, then use
binarysort without adaptivity next time
- increase number of simple
binarysort runs before next attempt of adaptive run with every 0 returned
High level results
A0 is current
A1 is with adaptivity
P means performance / runtime
C means comparison count
- Aggregate numbers for all datasets combined are in the title
- Plain integers and floats
- Integers and floats wrapped in
list, so comparisons are __lt__ calls, thus more expensive
So the benefit is higher when comparisons cost more.
But there is some benefit for optimized comparisons as well.
Has this already been discussed elsewhere?
I have already discussed this feature proposal on Discourse
Links to previous discussion of this feature:
https://discuss.python.org/t/sorting-adaptivity-improvement/103700
Linked PRs
Feature or enhancement
Proposal:
Adding adaptivity to
binarysortroutine oflist.sort.See PR for specifics.
Initial Post (Outdated)
Note:
Rationale
Galloping provides adaptivity when merging runs.
However, underlying
binarysortalways doesO(nlogn).Concept
The concept is simple:
binarysortoptionally does adaptive routine.timsortcallsbinarysortbinarysortwithout adaptivity next timebinarysortruns before next attempt of adaptive run with every 0 returnedHigh level results
A0is currentA1is with adaptivityPmeans performance / runtimeCmeans comparison countlist, so comparisons are__lt__calls, thus more expensiveSo the benefit is higher when comparisons cost more.
But there is some benefit for optimized comparisons as well.
Has this already been discussed elsewhere?
I have already discussed this feature proposal on Discourse
Links to previous discussion of this feature:
https://discuss.python.org/t/sorting-adaptivity-improvement/103700
Linked PRs
list.sortenhancement proposal: Adaptivity forbinarysort#138947