
The code below does not produce the output I expect:

# Explanation:
# Walk through the tokens in splits until a position listed in indexes is reached. Prefix that token with an underscore, and join it to the next token with an underscore (no space) if the next position is also in indexes.
# The expected line output is therefore: '7 Waitohu Road _York_Bay Co Manager _York_Bay Asst Co Dir _Central_Lower_Hutt General Hand _Wainuiomata School Caretaker'

# A list of suburb words and their index positions in the line
uniqueList = ['York', 3, 'Bay', 4, 'York', 7, 'Bay', 8, 'Central', 12, 'Lower', 13, 'Hutt', 14, 'Wainuiomata', 17]

# Using indexes = uniqueList[1::2] to reduce uniqueList down to just indexes
indexes = [3, 4, 7, 8, 12, 13, 14, 17]

# The line example
line = '7 Waitohu Road York Bay Co Manager York Bay Asst Co Dir Central Lower Hutt General Hand Wainuiomata School Caretaker'

# Split the line into tokens for counting indexes
splits = line.split(' ')

# Read index 
for i in range(len(indexes)):
    check = indexes[i]
    for j in range(len(splits)):
        if j == check and (i + 1 < len(indexes)):
            # Determine if next index incremental
            next = indexes[i + 1]
            if 1 == next - check:
                splits[j] = '_' + splits[j] + '_' + splits[j + 1]            
        else:
            if j == check:
                splits[j] = '_' + splits[j]

# Results here                
newLine = ' '.join(splits)
print(newLine)

Output:

7 Waitohu Road _York_Bay Bay Co Manager _York_Bay Bay Asst Co Dir _Central_Lower _Lower_Hutt Hutt General Hand _Wainuiomata School Caretaker

How can I:

  • Avoid outputting the doubled-up words Bay and Hutt
  • Handle a run of more than two underscored words, to get _Central_Lower_Hutt
Dave

1 Answer


There are three cases:

  • A word in the list where the previous word was also in the list
  • A word in the list where the previous word was NOT in the list
  • A word not in the list

We just need to do the right thing for those three cases.

# Index positions of the suburb words in the line
indexes = [3, 4, 7, 8, 12, 13, 14, 17]

# The line example
line = '7 Waitohu Road York Bay Co Manager York Bay Asst Co Dir Central Lower Hutt General Hand Wainuiomata School Caretaker'

# Split the line into tokens for counting indexes
splits = line.split(' ')

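# Build the output piece by piece
outs = []
for i, word in enumerate(splits):
    if i in indexes:
        # Word in the list: start a new run with a space unless the
        # previous word was also in the list (or this is the first token)
        if i - 1 not in indexes and outs:
            outs.append(' ')
        outs.append('_')
    elif outs:
        # Word not in the list: separate it from the previous token
        outs.append(' ')
    outs.append(word)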

# Results here                
newLine = ''.join(outs)
print(newLine)
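
If you need this for more than one line, the same three cases can be wrapped in a function. The following is only a sketch: the name underscore_runs is made up for illustration, and it assumes indexes holds 0-based token positions as in the question. Instead of collecting separators, it glues a continuing word onto the previous output token, and it uses a set so that each membership test is cheap.

```python
def underscore_runs(line, indexes):
    """Join each run of consecutive indexed tokens into one '_'-prefixed token.

    Hypothetical helper; assumes `indexes` are 0-based positions in line.split(' ').
    """
    marked = set(indexes)  # set membership is O(1); a list would be scanned each time
    out = []
    for i, word in enumerate(line.split(' ')):
        if i in marked and i - 1 in marked:
            # In the list, previous word also in the list: extend the current run
            out[-1] += '_' + word
        elif i in marked:
            # In the list, previous word not in the list: start a new run
            out.append('_' + word)
        else:
            # Not in the list: keep the word unchanged
            out.append(word)
    return ' '.join(out)


uniqueList = ['York', 3, 'Bay', 4, 'York', 7, 'Bay', 8,
              'Central', 12, 'Lower', 13, 'Hutt', 14, 'Wainuiomata', 17]
line = ('7 Waitohu Road York Bay Co Manager York Bay Asst Co Dir '
        'Central Lower Hutt General Hand Wainuiomata School Caretaker')

# Every second element of uniqueList, starting at position 1, is an index
print(underscore_runs(line, uniqueList[1::2]))
# 7 Waitohu Road _York_Bay Co Manager _York_Bay Asst Co Dir _Central_Lower_Hutt General Hand _Wainuiomata School Caretaker
```

Because a run is extended in place rather than copied from the next token, no word is emitted twice, and runs of any length (such as Central Lower Hutt) need no special handling.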
Tim Roberts