Please help, regex has blown my mind.
I am cleaning data in a pandas DataFrame (Python 3).
I have tried many regex patterns for digits found on the web, but none work for my case. I can't figure out how to write my own regex for the pattern: 2 digits, space, "to", space, 2 digits (for example, 26 to 40).
My challenge is to extract the number of petals from the pandas column BLOOM (scraped data). Petal counts are frequently given as "dd to dd petals". I know that 2 digits in regex are \d\d or \d{2}, but how do I incorporate the "to" separator between them? It would also be good to require that the pattern is followed by the word "petals".
Surely I am not the first person who needs a regex in Python for the pattern \d\d to \d\d.
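To make the target concrete, here is a minimal sketch of the pattern described above, using only the standard `re` module. The name `PETAL_RANGE` and the test sentence are illustrative assumptions, and the word boundaries (`\b`) are one possible way to avoid matching inside longer numbers, not a confirmed solution:

```python
import re

# Two digits, a literal " to ", two digits, then the word "petals".
# \b stops the match from starting or ending inside a longer number or word.
PETAL_RANGE = re.compile(r"\b(\d{2}) to (\d{2}) petals\b")  # hypothetical name

text = 'Mild fragrance. 35 to 40 petals. Average diameter 2.5".'
match = PETAL_RANGE.search(text)
if match:
    # Each pair of parentheses is a capture group holding one of the numbers.
    low, high = int(match.group(1)), int(match.group(2))
    print(low, high)
```

A hyphenated range such as "(26-40 petals)" is deliberately not matched by this sketch, since it has no "to" between the numbers.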
Edit:
I realised that my question is a bit confusing without a sample DataFrame, so here is one.
import pandas as pd
import re
# initialize list of lists
data = [['Evert van Dijk', 'Carmine-pink, salmon-pink streaks, stripes, flecks. Warm pink, clear carmine pink, rose pink shaded salmon. Mild fragrance. Large, very double, in small clusters, high-centered bloom form. Blooms in flushes throughout the season.'],
['Every Good Gift', 'Red. Flowers velvety red. Moderate fragrance. Average diameter 4". Medium-large, full (26-40 petals), borne mostly solitary bloom form. Blooms in flushes throughout the season.'],
['Evghenya', 'Orange-pink. 75 petals. Large, very double bloom form. Blooms in flushes throughout the season.'],
['Evita', 'White or white blend. None to mild fragrance. 35 petals. Large, full (26-40 petals), high-centered bloom form. Blooms in flushes throughout the season.'],
['Evrathin', 'Light pink. [Deep pink.] Outer petals white. Expand rarely. Mild fragrance. 35 to 40 petals. Average diameter 2.5". Medium, double (17-25 petals), full (26-40 petals), cluster-flowered, in small clusters bloom form. Prolific, once-blooming spring or summer. Glandular sepals, leafy sepals, long sepals buds.'],
['Evita 2', 'White, blush shading. Mild, wild rose fragrance. 20 to 25 petals. Average diameter 1.25". Small, very double, cluster-flowered bloom form. Blooms in flushes throughout the season.']]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['NAME', 'BLOOM'])
# print dataframe.
df
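For illustration, this is roughly the kind of result I am after, sketched with `Series.str.extract` on a small stand-in Series. The column names `petals_min` and `petals_max` and the shortened strings are assumptions made for this example; on the real data the same call would presumably be applied to `df['BLOOM']`:

```python
import pandas as pd

# Stand-in for the BLOOM column (shortened, illustrative strings).
bloom = pd.Series([
    'Orange-pink. 75 petals. Large, very double bloom form.',
    'Mild fragrance. 35 to 40 petals. Medium, double (17-25 petals).',
    'Mild, wild rose fragrance. 20 to 25 petals. Small, very double.',
])

# Named groups become column names; rows without a match are filled with NaN.
pattern = r"\b(?P<petals_min>\d{2}) to (?P<petals_max>\d{2}) petals\b"
petals = bloom.str.extract(pattern).astype(float)

print(petals)
```

Rows such as "75 petals" contain no "dd to dd petals" range, so they come back as missing values and would need separate handling.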