I am trying to modify the following regex to require that the domain is either youtube.com or youtu.be. The original regex is meant to capture the video ID in its second group, e.g. lVIGhYMwRgs.

My current test list:

http://www.youtube.com/watch?v=lVIGhYMwRgs&feature=feedrec_grec_index
http://www.youtube.com/v/lVIGhYMwRgs?fs=1&hl=en_US&rel=0
http://www.youtube.com/watch?v=lVIGhYMwRgs#t=0m10s
http://www.youtube.com/embed/lVIGhYMwRgs?rel=0
http://www.youtube.com/watch?v=lVIGhYMwRgs
http://youtu.be/lVIGhYMwRgs
http://www.example.com/media/embed/83295164

First Regex

(youtu.be\/|v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^#\&\?]*)

The problem is that example.com matches!
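
For anyone who wants to reproduce this without the screenshot, here is a minimal sketch. The question does not name a language, so Python's `re` module and the helper name `first_regex_id` are assumptions. It shows that the `embed\/` alternative is not tied to any domain, which is why example.com slips through.

```python
import re

# The first pattern from the question, unchanged.
FIRST = re.compile(r"(youtu.be\/|v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^#\&\?]*)")

urls = [
    "http://www.youtube.com/watch?v=lVIGhYMwRgs&feature=feedrec_grec_index",
    "http://youtu.be/lVIGhYMwRgs",
    "http://www.example.com/media/embed/83295164",
]

def first_regex_id(url):
    """Return group 2 of the first pattern, or None if nothing matches."""
    m = FIRST.search(url)
    return m.group(2) if m else None

for url in urls:
    print(url, "->", first_regex_id(url))
# The example.com line prints 83295164, because "embed/" matches on any host.
```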

So I tried modifying the regex to the following to ensure either youtube or youtu.be are in the url:

((youtu.be\/)|(youtube.com\/))(v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^#\&\?]*)

While this solves my example.com problem, it does not match the youtu.be URL.
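
A small sketch of that behaviour, again assuming Python's `re` module (the helper name `second_regex_id` is made up for illustration). The second pattern demands a domain and then one of the path prefixes, but a youtu.be URL has the ID directly after the slash, so there is no prefix to match.

```python
import re

# The second pattern from the question, unchanged.
# Groups: 1 = either domain, 2 = youtu.be/, 3 = youtube.com/, 4 = prefix, 5 = ID.
SECOND = re.compile(
    r"((youtu.be\/)|(youtube.com\/))(v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^#\&\?]*)"
)

def second_regex_id(url):
    """Return the ID group of the second pattern, or None if nothing matches."""
    m = SECOND.search(url)
    return m.group(5) if m else None

print(second_regex_id("http://www.youtube.com/watch?v=lVIGhYMwRgs"))   # lVIGhYMwRgs
print(second_regex_id("http://www.example.com/media/embed/83295164"))  # None
print(second_regex_id("http://youtu.be/lVIGhYMwRgs"))                  # None
```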

I also tried this regex, because I think my problem is that in a youtu.be URL the ID comes directly after the slash.

(youtube.com\/)(youtu.be|v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^#\&\?]*)

Then I tried this, which works for youtu.be and not much else.

((youtube.com\/)|(v\/|u\/\w\/|embed\/|watch\?v=|\&v=)|(youtu.be\/))([^#\&\?]*)

How can I fix my modification?

Valamas

3 Answers

Are the IDs always 11 characters? Some options below.

Fiddle

http://(www.)?youtu([.]be|be[.]com).*[/=]([A-Za-z0-9]{11})[?#&]*.*$

or

[=/]([A-Za-z0-9]{11})([?#&]|$)

I also found this question, which might help: "JavaScript REGEX: How do I get the YouTube video id from a URL?"
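
To check the two options against the question's list, a rough sketch in Python (an assumption, since the thread names no language; `full_id` and `short_id` are hypothetical helper names). It also shows two caveats: the short pattern does not look at the domain at all, and neither pattern accepts IDs containing `-` or `_`, which YouTube IDs can include. The last two URLs are invented for illustration.

```python
import re

# Both patterns from this answer, unchanged.
FULL = re.compile(r"http://(www.)?youtu([.]be|be[.]com).*[/=]([A-Za-z0-9]{11})[?#&]*.*$")
SHORT = re.compile(r"[=/]([A-Za-z0-9]{11})([?#&]|$)")

def full_id(url):
    """ID from the domain-checking pattern (group 3), or None."""
    m = FULL.search(url)
    return m.group(3) if m else None

def short_id(url):
    """ID from the short pattern (group 1), or None."""
    m = SHORT.search(url)
    return m.group(1) if m else None

print(full_id("http://www.youtube.com/watch?v=lVIGhYMwRgs#t=0m10s"))  # lVIGhYMwRgs
print(full_id("http://youtu.be/lVIGhYMwRgs"))                         # lVIGhYMwRgs
print(full_id("http://www.example.com/media/embed/83295164"))         # None

# Caveat 1: the short pattern ignores the host (hypothetical URL).
print(short_id("http://www.example.com/media/abcdefghijk"))           # abcdefghijk

# Caveat 2: an ID with '-' or '_' is rejected (hypothetical ID).
print(full_id("http://youtu.be/a-b_c1d2e3f"))                         # None
```

Widening the character class to `[A-Za-z0-9_-]` would cover the second caveat, at the cost of a slightly looser match.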

gillyspy

I've cracked it. Could you verify it?

Note: The empty groups, i.e. ()(), are padding meant to make the groups easier to process. For a youtu.be URL the ID (lVIGhYMwRgs) is in group 6; for a youtube.com URL it is in group 8.

((you(tu.be\/()()(.*)|tube.com\/(v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^#\&\?]*))))
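
One way to read the ID out of this pattern, sketched in Python (an assumption, as is the helper name `video_id`): take group 6 if it is set, otherwise group 8. Note that the youtu.be branch uses `(.*)`, so anything after the ID, such as a query string, ends up in the capture too.

```python
import re

# The pattern from this answer, unchanged. Group 6 is the youtu.be ID,
# group 8 is the youtube.com ID; groups 4 and 5 are the empty padding groups.
PADDED = re.compile(
    r"((you(tu.be\/()()(.*)|tube.com\/(v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^#\&\?]*))))"
)

def video_id(url):
    """Return whichever ID group took part in the match, or None."""
    m = PADDED.search(url)
    if m is None:
        return None
    return m.group(6) or m.group(8)

print(video_id("http://youtu.be/lVIGhYMwRgs"))                     # lVIGhYMwRgs (group 6)
print(video_id("http://www.youtube.com/embed/lVIGhYMwRgs?rel=0"))  # lVIGhYMwRgs (group 8)
print(video_id("http://www.example.com/media/embed/83295164"))     # None
print(video_id("http://youtu.be/lVIGhYMwRgs?t=10"))                # lVIGhYMwRgs?t=10
```
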
Santosh Panda
  • I recommend replacing the generic dot with something that will actually test for a dot character, or this will match `youtuBbeQcom` or `youtuBbe` which depending on the type of validation OP is looking for might cause problems. – Ro Yo Mi May 15 '13 at 04:08
  • @Denomales: It would not cause any problem, rather the dot character will match the video name after youtu.be/. It will never match youtuBbeQcom or youtuBbe. – Santosh Panda May 15 '13 at 04:14
  • this is awesome. Looking forward to dissecting how this works. I have changed my code to use the last group. All are group 8 unless it is the youtu.be address which is group 6. Works great. – Valamas May 15 '13 at 04:56

Description

Try this regex. It looks for either youtube.com followed by a video code of some length up to the next parameter delimiter, or the youtu.be format, where it looks for a / followed by a video code of variable length.

You will need some logic to parse the returned groups: groups 1 and 2 match for youtube.com, and groups 3 and 4 match for youtu.be.

(?:(youtube[.]com).*?(?:[?&]v=[^&]*?|[/](?:v|embed)[/]([^&?]*?))(?=$|[?#&]))|(?:(youtu[.]be)[/](.*?)(?=[?&]|$))

Note how the www.example.com line is not matched
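
A sketch of the parsing logic mentioned above, in Python (an assumption, since the thread names no language). One thing to watch: in the pattern as posted, the `[?&]v=` branch has no capture group, so group 2 stays empty for `watch?v=` URLs. `ADJUSTED` below is a hypothetical variant, not the answer's own pattern, that moves the capture so both youtube.com forms fill group 2 and also stops at `#`.

```python
import re

# The pattern from this answer, unchanged.
POSTED = re.compile(
    r"(?:(youtube[.]com).*?(?:[?&]v=[^&]*?|[/](?:v|embed)[/]([^&?]*?))(?=$|[?#&]))"
    r"|(?:(youtu[.]be)[/](.*?)(?=[?&]|$))"
)

# Hypothetical variant: one shared capture after either youtube.com prefix.
ADJUSTED = re.compile(
    r"(?:(youtube[.]com).*?(?:[?&]v=|[/](?:v|embed)[/])([^&?#]*?)(?=$|[?#&]))"
    r"|(?:(youtu[.]be)[/](.*?)(?=[?&#]|$))"
)

def video_id(url, pattern=ADJUSTED):
    """Group 2 holds a youtube.com ID, group 4 a youtu.be ID."""
    m = pattern.search(url)
    if m is None:
        return None
    return m.group(2) or m.group(4)

watch = "http://www.youtube.com/watch?v=lVIGhYMwRgs"
print(POSTED.search(watch).group(2))                            # None
print(video_id(watch))                                          # lVIGhYMwRgs
print(video_id("http://youtu.be/lVIGhYMwRgs"))                  # lVIGhYMwRgs
print(video_id("http://www.example.com/media/embed/83295164"))  # None
```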

Ro Yo Mi