1

I have a few strings, like:

address1 = 'Красноярский край, г Красноярск, ул Академика Вавилова, 2Д, кв. 311'
address2 = 'Москва г, ул Ольховская, 45 стр. 1, квартира 3'
address3 = 'Красноярский край, г Красноярск, ул Академика Вавилова, 2Д, квартира 311'

I need to extract the part of each string that starts at кв. I am using a regular expression, and this is my code:

import re

flat_template = r"кв(.*)$"

flat1 = re.search(flat_template, address1)
flat2 = re.search(flat_template, address2)
flat3 = re.search(flat_template, address3)

cut_flat1 = address1[flat1.start():flat1.end()]
cut_flat2 = address2[flat2.start():flat2.end()]
cut_flat3 = address3[flat3.start():flat3.end()]

Output: 

cut_flat1 = 'кв. 311'
cut_flat2 = 'ква г, ул Ольховская, 45 стр. 1, квартира 3'
cut_flat3 = 'квартира 311'

But I want to get:

cut_flat1 = 'кв. 311'
cut_flat2 = 'квартира 3'
cut_flat3 = 'квартира 311'

All I want is to search from the end of the string.

2 Answers

1

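Regular expression quantifiers are "greedy" by default: they try to match as many characters as possible. Keep in mind that re.search also returns the leftmost match, and in address2 the first "кв" is the one inside "Москва", which is where your unwanted result starts.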

You can make them non-greedy instead:

flat_template = r"кв(.*?)$"

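Note the use of .*? for the non-greedy variant of .*. This will match the minimum number of characters possible. Since the pattern ends with $, the match must still extend to the end of the string, so on its own this does not change where the match starts.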
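To see what the lazy quantifier does and does not change, here is a small sketch using address2 from the question. The two patterns ending in a comma are made up purely for illustration; they are not part of the original code.

```python
import re

address2 = 'Москва г, ул Ольховская, 45 стр. 1, квартира 3'

# Greedy: .* runs as far as it can, so the match ends at the LAST comma.
greedy = re.search(r"кв(.*),", address2).group()

# Lazy: .*? stops as early as it can, so the match ends at the FIRST comma.
lazy = re.search(r"кв(.*?),", address2).group()

print(greedy)  # ква г, ул Ольховская, 45 стр. 1,
print(lazy)    # ква г,

# With a trailing $, both variants have to reach the end of the string,
# and both start at the leftmost "кв" (the one inside "Москва").
anchored_greedy = re.search(r"кв(.*)$", address2).group()
anchored_lazy = re.search(r"кв(.*?)$", address2).group()

print(anchored_greedy == anchored_lazy)  # True
```

In other words, laziness affects where a match ends, not where it begins.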

To make sure "кв" matches only at the beginning of a word (and not in the middle of "Москва"), use a word boundary, \b:

flat_template = r"\bкв(.*?)$"
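As a quick check, the word-boundary template can be run against the three addresses from the question. This is only a sketch: the helper name extract_flat is hypothetical, and returning None for an address without a flat is my assumption about the desired behaviour.

```python
import re

address1 = 'Красноярский край, г Красноярск, ул Академика Вавилова, 2Д, кв. 311'
address2 = 'Москва г, ул Ольховская, 45 стр. 1, квартира 3'
address3 = 'Красноярский край, г Красноярск, ул Академика Вавилова, 2Д, квартира 311'

flat_template = r"\bкв(.*?)$"


def extract_flat(address):
    """Return the flat part of the address, or None if there is none.

    Hypothetical helper, not part of the original code.
    """
    match = re.search(flat_template, address)
    # match.group() is the whole matched text, same as slicing
    # address[match.start():match.end()].
    return match.group() if match else None


cut_flat1 = extract_flat(address1)
cut_flat2 = extract_flat(address2)
cut_flat3 = extract_flat(address3)

print(cut_flat1)  # кв. 311
print(cut_flat2)  # квартира 3
print(cut_flat3)  # квартира 311
```

This relies on Python 3 treating Cyrillic letters as word characters in str patterns, so there is no \b between "с" and "к" in "Москва".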
9769953
  • Thanks, I used your template in my code, but also added a space before the keyword – AidarDzhumagulov Nov 19 '22 at 10:56
  • @AidarDzhumagulov You'd need a word boundary instead: an input such as "кв. 311", or "Москва г, ул Ольховская, 45 стр. 1,квартира 3" (space missing due to a typo), would fail with a space in the pattern, but a word boundary would solve that problem. – 9769953 Nov 19 '22 at 11:04
0

I solved my particular problem by adding a space before the keyword, i.e. ' кв'. So my code is:

flat_template = r" кв(.*?)$"
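For comparison, here is a rough sketch of how the space-prefixed template behaves next to the word-boundary one, using the two edge cases mentioned in the comments above. The variable names are mine and only for illustration.

```python
import re

space_template = r" кв(.*?)$"
boundary_template = r"\bкв(.*?)$"

# Edge cases from the comments: no text before "кв",
# and a missing space after the comma.
samples = [
    'кв. 311',
    'Москва г, ул Ольховская, 45 стр. 1,квартира 3',
]

space_results = [re.search(space_template, s) for s in samples]
boundary_results = [re.search(boundary_template, s).group() for s in samples]

print(space_results)     # [None, None]
print(boundary_results)  # ['кв. 311', 'квартира 3']

# On well-formed input the space version works, but the matched text
# includes the leading space.
address1 = 'Красноярский край, г Красноярск, ул Академика Вавилова, 2Д, кв. 311'
with_space = re.search(space_template, address1).group()
print(repr(with_space))  # ' кв. 311'
```

So the space version is fine as long as every address has a space right before "кв"; the word boundary is the more tolerant choice.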