I'm new to Python and trying to figure this out, so sorry if this has been asked before. I couldn't find it, and I don't know what this is called.

The short version: I want to take a link like this:

http://www.somedomainhere.com/embed-somekeyhere-650x370.html

and turn it into this:

http://www.somedomainhere.com/somekeyhere

The long version: I have been working on an addon for XBMC that goes to a website, grabs a URL, then goes to that URL to find another URL. Basically, a URL resolver.

The program searches the site and comes up with somekeyhere-650x370.html. That page is in Java and is unusable to me, but when I go to .com/somekeyhere, the code is usable. So I need to grab the first URL, change it to the usable page, and then scrape that page.

So far the code I have is:

if 'somename' in name:
    try:
        n = re.compile('<iframe title="somename" type="text/html" frameborder="0" scrolling="no" width=".+?" height=".+?" src="(.+?)">" frameborder="0"', re.DOTALL).findall(net().http_GET(url).content)[0]
        # CONVERT URL to .com/somekeyhere SO BELOW NA CAN READ IT.
        na = re.compile("'file=(.+?)&.+?'", re.DOTALL).findall(net().http_GET(na).content)[0]

Any suggestions on how I can convert the URL?

twasbrillig

1 Answer

I didn't really follow the long version of your question, so I'm answering the short one.

Assumption: somekey is alphanumeric.

import re

a = 'http://www.domain.com/embed-somekey-650x370.html'
# Dots are escaped so they match only a literal '.'
p = re.match(r'^http://www\.domain\.com/embed-(?P<key>[0-9A-Za-z]+)-650x370\.html$', a)
somekey = p.group('key')
requiredString = "http://www.domain.com/" + somekey #comment1

This is a very specific answer that only handles this one domain name and these exact dimensions. You should modify the regex as required. Since the code in your question already uses regexes, I assume you can write one that matches your requirements better.
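As one possible generalization, here is a minimal sketch that rewrites the path with re.sub instead of hard-coding the domain. The function name embed_to_plain is hypothetical, and the pattern assumes the key is alphanumeric and that the size suffix always has the form WIDTHxHEIGHT followed by .html.

```python
import re

# Assumed URL shape: .../embed-<alphanumeric key>-<width>x<height>.html
EMBED_PATTERN = re.compile(r'/embed-([0-9A-Za-z]+)-\d+x\d+\.html$')

def embed_to_plain(url):
    """Return the URL with '/embed-KEY-WxH.html' replaced by '/KEY'.

    URLs that do not match the assumed shape are returned unchanged.
    """
    return EMBED_PATTERN.sub(r'/\1', url)

print(embed_to_plain('http://www.somedomainhere.com/embed-somekeyhere-650x370.html'))
```

Because the domain is not part of the pattern, the same function should work for any host that follows this naming scheme.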

EDIT 1: Also see the urlparse module: https://docs.python.org/2/library/urlparse.html?highlight=urlparse#module-urlparse

It provides an easy way to parse your URL.
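A rough sketch of how urlparse could be combined with a regex on the path only, so the scheme and domain are carried over rather than typed out. The function name resolve_embed_url is hypothetical, and the path pattern again assumes an alphanumeric key and a WIDTHxHEIGHT suffix. The import falls back to the Python 2 module name, since XBMC addons of this era run on Python 2.

```python
import re

try:
    from urllib.parse import urlparse, urlunparse  # Python 3
except ImportError:
    from urlparse import urlparse, urlunparse  # Python 2

def resolve_embed_url(url):
    """Build scheme://domain/KEY from scheme://domain/embed-KEY-WxH.html.

    Returns None when the path does not match the assumed shape.
    """
    parts = urlparse(url)
    match = re.match(r'^/embed-([0-9A-Za-z]+)-\d+x\d+\.html$', parts.path)
    if match is None:
        return None
    # Keep scheme and domain; drop params, query string, and fragment.
    return urlunparse((parts.scheme, parts.netloc, '/' + match.group(1), '', '', ''))

print(resolve_embed_url('http://www.somedomainhere.com/embed-somekeyhere-650x370.html'))
```

Returning None on a non-match lets the caller skip pages that do not follow the embed naming scheme instead of failing on them.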

Also, on the line marked "#comment1", you can save the domain name to a variable and reuse it there.

Vasif
  • Thank you for your answer. It answered most of what I needed. Unfortunately, even though it's not throwing errors, it's not working either. I would paste in my whole code, but since it has multiple links, this site won't let me post it until I have more rep. So back to square one. – Paul Rocco Nov 18 '14 at 23:29
  • Use something like pastebin to share your code; I hope that helps me understand your problem better. – Vasif Nov 19 '14 at 17:47