I'm trying to create a random text generator in Python. I'm using Markovify to produce the text, with a filter so that generation only starts when the first word is capitalized. To prevent the output from ending mid-sentence, I want the program to search from the back of the output toward the front and remove all text after the last delimiter (a period, for instance). It should ignore all earlier instances of the selected delimiter(s). I have no idea how many instances of the delimiter will occur in the generated text, nor any way to know in advance.

While looking into this I found rsplit() and tried using it, but ran into a problem. This is what I tried first:

tweet = buff.rsplit('.')[-1]

I thought it was working until I noticed that every line printed this way contained only a single sentence, never more. The problem seems to be that the text is being split into a list of strings, and the [-1] index selects just one entry from that list. Next I tried this:

tweet = buff.rsplit('.') - buff.rsplit('.')[-1]

The thinking was that it would remove the last entry from the list, and then I could just print what remained. It didn't go to plan: I get an "unsupported operand type" error, specifically tied to the attempt to subtract. I'm not sure what I'm missing at this point.
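To make the two failed attempts concrete, here is a minimal sketch of what rsplit() returns and why the subtraction fails. The sample string in buff is made up for illustration and is not actual Markovify output.

```python
# Hypothetical sample text standing in for the generated output.
buff = "First sentence. Second sentence. Trailing frag"

# Without maxsplit, rsplit splits at every period and returns a list.
pieces = buff.rsplit('.')
print(pieces)        # ['First sentence', ' Second sentence', ' Trailing frag']
print(pieces[-1])    # only the last piece, i.e. the unfinished fragment

# Lists do not define "-", so list minus str raises TypeError.
try:
    pieces - pieces[-1]
except TypeError as exc:
    print("TypeError:", exc)

# Dropping the last piece with a slice and re-joining restores the periods.
tweet = '.'.join(pieces[:-1]) + '.'
print(tweet)         # First sentence. Second sentence.
```

Note that the slice-and-join route only puts periods back, so it would not preserve a ! or ? if those were also used as split points.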

  • It would make it easier to answer this question if you showed an example of the text you are starting with and what you hope for as a result. – Mark Apr 16 '20 at 06:11
  • So part of the problem is that I'm looking at the data type wrong. The output is being dumped into an array, so I need to process it like an array. This gets me to pop(). If I take the array and add .pop(-1), it wraps around and removes the last entry from the array. This gets me part of the way there, but having turned the periods into a delimiter, I now have to put them back in when printing. A while loop with an extra string variable should fix that, but I'm now stuck with replacing all delimiters with the same character, which isn't quite where I wanted this to go. – James Anthony Apr 16 '20 at 06:25
  • Not an actual example, but the text coming out is something like "All provided by those who fear the worst, the self-professed non-conformist, is unwittingly playing right into the crowd! The executioners are terrified and suddenly become revitalized. Theoretically, it would wipe out the names involved (despite some occultists who insist, 'You can't". The idea would be for the program to remove everything after the last period, so everything from the word 'Theoretically' onward, and to do that automatically for every blurb. – James Anthony Apr 16 '20 at 06:30
  • You know what, I just need to go in manually and add a delimiter that doesn't occur natively in the text to the end of every sentence, then filter by that. It'll mean more work, but it will also mean not having to figure out how to place the delimiters back in before printing the text (in case the "end of sentence" punctuation is a ! or ?, that is). – James Anthony Apr 16 '20 at 06:33
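A custom delimiter may not be necessary for the ! and ? case raised in the comments. The following is a sketch, not the asker's or answerer's method: it finds the right-most position of any terminator with str.rfind and slices up to it, so the original punctuation is kept. The function name trim_to_last_sentence, the default terminator set, and the choice to return an empty string when no terminator is found are all assumptions made for illustration.

```python
def trim_to_last_sentence(text, terminators=".!?"):
    """Return text up to and including the last terminator character."""
    # rfind returns -1 when a character is absent, so max() picks the
    # right-most terminator that actually occurs in the text.
    cut = max(text.rfind(ch) for ch in terminators)
    if cut == -1:
        # Assumption: with no complete sentence, nothing is kept.
        return ""
    return text[:cut + 1]


# Hypothetical blurb that ends mid-sentence.
blurb = "It works! The crowd cheers. Is it over? Theoretically, it would"
print(trim_to_last_sentence(blurb))  # It works! The crowd cheers. Is it over?
```

This treats every period as a sentence end, so abbreviations or decimal numbers near the end of the text would also count as cut points.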

1 Answer

.rsplit takes an optional second argument, maxsplit, which is the maximum number of splits to perform. You could use it the following way:

txt = 'some.text.with.dots'
all_but_last = txt.rsplit('.', 1)[0]
print(all_but_last)

Output:

some.text.with
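
Applied to sentence-like text, the same call drops the final period along with the fragment, so it has to be added back. Here is a small sketch under that assumption, with a made-up buff string; str.rpartition is shown as an alternative that hands the separator back.

```python
# Hypothetical sample text that ends mid-sentence.
buff = "The executioners are terrified. They become revitalized. Theoretically, it would"

# maxsplit=1 splits only at the right-most period; [0] is everything before it.
head = buff.rsplit('.', 1)[0]
tweet = head + '.'   # the separator is consumed by the split, so add it back
print(tweet)

# rpartition returns (before, separator, after), so the period is not lost.
before, sep, _ = buff.rpartition('.')
print(before + sep)
```

The two differ when the text contains no period at all: rsplit returns the whole string as the only element, while rpartition returns empty strings for the first two parts.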
Daweo