3

I would like to sort this list of strings by the first number, preferably using regular expressions in a single line, but other suggestions are welcome. I am looking for the quickest way to do it. Here is the list:

[
  "10. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless I",
  "11. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless J",
  "12. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless K",
  "13. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless L",
  "14. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless M",
  "15. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless N",
  "16. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless O",
  "17. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless P",
  "18. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless Q",
  "19. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless R",
  "20. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless S",
  "21. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless z",
  "22. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless A",
  "5. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless D",
  "6. Command Mounting Refill Strips - Large Pack of 1 6 Strips E",
  "7. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless F",
  "8. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless G",
  "9. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless H"
]

This is the code that I have tried so far.

dirs = sorted(next(walk(self.rootDirectory))[1], key=lambda x: int(x[0]))

But this returns the list sorted by the first digit only:

[
  "10. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless I",
  "11. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless J",
  "12. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless K",
  "13. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless L",
  "14. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless M",
  "15. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless N",
  "16. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless O",
  "17. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless P",
  "18. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless Q",
  "19. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless R",
  "20. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless S",
  "21. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless z",
  "22. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless A",
  "5. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless D",
  "6. Command Mounting Refill Strips - Large Pack of 1 6 Strips E",
  "7. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless F",
  "8. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless G",
  "9. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless H"
]

Update

Can I also have an example of sorting it when the strings do not contain the . character? For example: "20 iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless S"

  • You may need a natural sort algorithm. https://stackoverflow.com/a/3033342/4531270 – pylang Oct 16 '17 at 14:35
  • Input list is sorted based on first digit, so is output. There seems to be no problem here. – Tomasz Plaskota Oct 16 '17 at 14:36
  • Seems like it _is_ sorted by the first digit. Did you actually mean that you'd expect the order to be `5, 6, 7, 8, 9, 10, 11, 12, ...`? Also the quickest way might not always be the _shortest_ way possible (ie, a one liner might be slower than a bigger piece of code). – Horia Coman Oct 16 '17 at 14:36
  • Yes, I have updated the question. The first number. I am thinking there is possibly a quick way to write it using the same amount of script that I have written but I don't mind having very long code. –  Oct 16 '17 at 14:38
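Since the comments raise the point that the shortest code is not necessarily the quickest, here is a minimal sketch of how the two could be compared with `timeit`. The data is made up, and the names `split_key` and `regex_key` are only for illustration; actual timings depend on the machine and the input, so none are shown.

```python
import re
import timeit

# Made-up data in reverse numeric order: "1000. item", ..., "1. item".
data = ["%d. item" % n for n in range(1000, 0, -1)]

# Key 1: split on '.' and convert the leading part to int.
split_key = lambda x: int(x.split('.')[0])

# Key 2: match the leading digits with a precompiled regex.
pattern = re.compile(r'\d+')
regex_key = lambda x: int(pattern.match(x).group())

# Both keys must produce the same order before timing means anything.
assert sorted(data, key=split_key) == sorted(data, key=regex_key)

for name, key in [("split", split_key), ("regex", regex_key)]:
    seconds = timeit.timeit(lambda: sorted(data, key=key), number=100)
    print(name, round(seconds, 4))
```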

4 Answers

11

Convert to int after splitting on . to take the full numbers, not just the first digit:

lst = next(walk(self.rootDirectory))[1]
dirs = sorted(lst, key=lambda x: int(x.split('.')[0]))

To sort when the '.' is not guaranteed to be in every string, split on whitespace and convert to float instead:

dirs = sorted(lst, key=lambda x: float(x.split()[0]))

Works with or without the '.'.
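As a quick check of the two keys above, here is a minimal sketch on a shortened, made-up list (the names `with_dot` and `mixed` are only for illustration). It relies on `float` accepting a trailing dot, so `float('10.')` is `10.0`.

```python
# Shortened sample data; in the question the list comes from next(walk(...))[1].
with_dot = ["10. Clock I", "5. Clock D", "22. Clock A", "6. Strips E"]
mixed = ["10. Clock I", "5 Clock D", "22 Clock A", "6. Strips E"]

# Split on '.' and convert the whole leading number to int.
by_int = sorted(with_dot, key=lambda x: int(x.split('.')[0]))

# Split on whitespace and convert to float; works with or without '.'.
by_float = sorted(mixed, key=lambda x: float(x.split()[0]))

print(by_int)    # ['5. Clock D', '6. Strips E', '10. Clock I', '22. Clock A']
print(by_float)  # ['5 Clock D', '6. Strips E', '10. Clock I', '22 Clock A']
```

Note that the `int` version raises `ValueError` on a string such as "5 Clock D", because splitting on '.' leaves the whole string intact.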

Moses Koledoye
  • Thanks this worked. Just out of curiosity is there a way to sort it provided that I don't have the . character. I might need to sort strings without the . very soon. –  Oct 16 '17 at 14:44
  • You can split on the space and convert to float: `lambda x: float(x.split()[0])` Works with or without `.`. – Moses Koledoye Oct 16 '17 at 14:46
6

If you wish to sort the list by numeric strings in general, consider a natural sorting algorithm.

Code

import re


def natural_key(string_):
    return [int(s) if s.isdigit() else s for s in re.split(r'(\d+)', string_) if s]

The code above is adapted from this SO post. It assumes each string starts with a number; the numeric substrings are converted to integers so that they sort numerically rather than as text.

iterable = [
  "10. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless I",
  "11. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless J",
  "12. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless K",
  "13. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless L",
  "14. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless M",
  "15. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless N",
  "16. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless O",
  "17. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless P",
  "18. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless Q",
  "19. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless R",
  "20. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless S",
  "21. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless z",
  "22. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless A",
  "5. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless D",
  "6. Command Mounting Refill Strips - Large Pack of 1 6 Strips E",
  "7. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless F",
  "8. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless G",
  "9. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless H"
]

sorted(iterable, key=natural_key)

Output

['5. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless D',
 '6. Command Mounting Refill Strips - Large Pack of 1 6 Strips E',
 '7. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless F',
 '8. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless G',
 '9. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless H',
 '10. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless I',
 ...]
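To see why this sorts numerically, it may help to look at the keys themselves. The sketch below repeats `natural_key` so it runs on its own and prints the key for two shortened, made-up strings.

```python
import re


def natural_key(string_):
    # Same function as above, repeated so this snippet is self-contained.
    return [int(s) if s.isdigit() else s for s in re.split(r'(\d+)', string_) if s]


# Digit runs become ints, everything else stays a string.
print(natural_key("10. Clock I"))              # [10, '. Clock I']
print(natural_key("6. Pack of 1 6 Strips E"))  # [6, '. Pack of ', 1, ' ', 6, ' Strips E']
```

One caveat in Python 3: if some strings start with a digit and others with a letter, the keys put an int and a str in the same position, and comparing them raises `TypeError`.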
pylang
2
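import re

# l is your list of strings, e.g. the list from the question.
# re.match anchors at the start of the string, so this takes the leading number.
sorted(l, key=lambda x: int(re.match(r'\d+', x).group()))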

Output:

['5. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless D',
 '6. Command Mounting Refill Strips - Large Pack of 1 6 Strips E',
 '7. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless F',
 '8. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless G',
 '9. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless H',
 '10. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless I',
 '11. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless J',
 '12. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless K',
 '13. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless L',
 '14. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless M',
 '15. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless N',
 '16. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless O',
 '17. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless P',
 '18. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless Q',
 '19. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless R',
 '20. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless S',
 '21. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless z',
 '22. iTOMA Radio Alarm Clock FM Digital Radio Clock Bedside Alarm Clock Wireless A']
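The one-liner above fails with `AttributeError` if any string does not start with a digit, because `re.match` returns `None`. A minimal sketch of a more tolerant key follows; the helper `leading_number` and the choice to push unnumbered strings to the end are assumptions for illustration, not part of the original answer.

```python
import re


def leading_number(s, default=float('inf')):
    # Hypothetical helper: return the leading integer of s,
    # or `default` when s does not start with digits.
    m = re.match(r'\s*(\d+)', s)
    return int(m.group(1)) if m else default


# Shortened, made-up sample; works with or without the '.' after the number.
items = ["10. Clock I", "5 Clock D", "Misc", "22. Clock A"]
print(sorted(items, key=leading_number))
# ['5 Clock D', '10. Clock I', '22. Clock A', 'Misc']
```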
Transhuman
0

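You can sort the list in place by the first character of each string this way. Note that this compares single characters, not numbers, so "10." still sorts before "5.".

iterable.sort(key=lambda x: x[0])
print(iterable)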
Shaon shaonty