I'm trying to extract some prices from the IKEA website, but the price format is pretty messy (whitespace, carriage returns, a stray comma on its own line). This is what I extracted:
39,90 €
,
I used Scrapy for this and the scraping itself works fine, except that I would like to get rid of everything that is not the price (and the euro symbol)!
I tried to use this regex (in Python 2.7):
re(\S[0-9]+([ ,]?[ ])([0-9]{2}?)u"\u20AC")
I'm new to programming and only learned what a regular expression is this afternoon. I tried a huge number of variations without getting anything better than:
SyntaxError: unexpected character after line continuation character
If someone could take a few minutes to look at what I did and tell me where I went wrong, that would be great!
Cheers everyone
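
Note: the SyntaxError appears to come from the pattern not being written inside a string literal. Outside quotes, Python reads the backslash in \S as a line-continuation character and then complains about the S that follows. Below is a minimal sketch of one way to do the extraction with the standard re module, assuming the scraped value is a unicode string shaped like the sample above. The names raw, PRICE_RE and extract_price are made up for illustration, and returning a float without the euro sign is only a guess at what is wanted.

```python
# -*- coding: utf-8 -*-
import re

# Hypothetical sample mimicking the scraped text: padding whitespace,
# carriage returns, and a stray comma on its own line.
raw = u"\r\n   39,90 \u20ac\r\n   ,\r\n"

# The pattern is a (unicode) string: digits, a comma or dot, two decimals,
# optional whitespace, then the euro sign. Backslashes are doubled so the
# same literal works in Python 2.7 and Python 3.
PRICE_RE = re.compile(u"(\\d+)[,.](\\d{2})\\s*\u20ac", re.UNICODE)

def extract_price(text):
    """Return the first price found in text as a float, or None if there is none."""
    match = PRICE_RE.search(text)
    if match is None:
        return None
    # Join the integer and decimal parts with a dot so float() can parse them.
    return float(match.group(1) + "." + match.group(2))

print(extract_price(raw))  # 39.9
```

If the euro sign should be kept, match.group(0) holds the matched text including the symbol. The stray comma on the second line is ignored because it has no digits around it.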