
First, I have this long string:

s = '1MichaelAngelo'

How can I get the output as

new_s = '1 Michael Angelo' 

and as a list

new_list = [1,'Michael', 'Angelo']

Note: I have about a thousand such strings, parsed from an HTML page.

Second, I have a huge string consisting of names and numbers (up to 1000), e.g.

1\nfirstName\nlastName\n.......999\nfirstName\nlastName

where \n denotes a newline.

How can I extract data from it to output something like:

[1, 'Michael', 'Emily'], [2, 'Mathew', 'Jessica'], [3, 'Jacob', 'Ashley']

and so on.

Ronan Boiteau
Olamide226

1 Answer


Two questions, two answers. Next time please ask one question at a time.

import re
s = '1MichaelAngelo'
[int(x) for x in re.findall(r'\d+',s)] + re.findall('[A-Z][^A-Z]*',s)
# result: [1, 'Michael', 'Angelo']

or, alternatively,

import re
s = '1MichaelAngelo'
[int(x) if re.match(r'\d+',x) else x for x in re.findall(r'\d+|[A-Z][^A-Z]*',s)]

where re.findall splits the longer string at the required boundaries: a run of digits, or an uppercase letter followed by any number of non-uppercase characters.
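The snippets above return only the list form. A minimal sketch of how the same pattern could also produce the spaced string from the question, and be applied to many strings at once, might look like this. The helper names `split_tokens` and `spaced` and the second sample string are made up for illustration; only the regular expression comes from the answer.

```python
import re

# Same pattern as in the answer: a run of digits, or an uppercase
# letter followed by any non-uppercase characters.
TOKEN = re.compile(r'\d+|[A-Z][^A-Z]*')

def split_tokens(s):
    """Hypothetical helper: digits become int, words stay str."""
    return [int(t) if t.isdigit() else t for t in TOKEN.findall(s)]

def spaced(s):
    """Hypothetical helper: join the same tokens with single spaces."""
    return ' '.join(TOKEN.findall(s))

# Made-up sample input standing in for the strings parsed from HTML.
strings = ['1MichaelAngelo', '2JohnSmith']
new_lists = [split_tokens(s) for s in strings]
new_strings = [spaced(s) for s in strings]
```

This assumes every name part starts with an ASCII uppercase letter; names such as "McDonald" or names with accented capitals would need a different pattern.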

For the second question:

import re
s = '1\nfirstName\nlastName\n999\nfirstName2\nlastName2'
parts = s.split('\n')  # split once instead of on every iteration
[[int(x) if re.fullmatch(r'\d+',x) else x for x in parts[i:i+3]] for i in range(0,len(parts),3)]
# result: [[1, 'firstName', 'lastName'], [999, 'firstName2', 'lastName2']]

where the list comprehension first splits the entire string on newlines and groups the pieces in threes (using the trick shown in https://stackoverflow.com/a/15890829/2564301), then scans each group for integers and converts only those.
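For a thousand records it may be easier to wrap this in a function. The sketch below is one possible way to do it, not the only one: `parse_records` is a hypothetical name, and stripping whitespace and skipping blank lines are assumptions about what HTML-derived input tends to need.

```python
def parse_records(text, size=3):
    """Hypothetical helper: group newline-separated fields into records.

    Assumes each record is exactly `size` lines: a number followed by
    two names. Blank lines are dropped and surrounding whitespace is
    stripped (both are assumptions, not part of the original answer).
    """
    fields = [f.strip() for f in text.split('\n') if f.strip()]
    return [
        [int(f) if f.isdigit() else f for f in fields[i:i + size]]
        for i in range(0, len(fields), size)
    ]

# Sample data taken from the question, including the trailing space.
sample = '1\nMichael\nEmily\n2\nMathew\nJessica\n3\nJacob\nAshley '
records = parse_records(sample)
# records == [[1, 'Michael', 'Emily'], [2, 'Mathew', 'Jessica'], [3, 'Jacob', 'Ashley']]
```

If a record is missing a line, every following group shifts by one, so it is worth checking that the first element of each record is an int before trusting the result.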

Jongware