PHP cURL: parsing an m3u file

Question

Hope you guys can help me out. I have the following .m3u file

#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
#EXTINF:-1 tvg-id="" tvg-name="Animal Planet" tvg-logo="" group-title="ENTRETENIMIENTO",Animal Planet
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/185.ts

As you can see, the file starts with the main tag #EXTM3U. Below it, each video has an information tag (#EXTINF:-1 ...) followed on the next line by the video link (http:// .....).

Can you tell me explicitly how I can parse this whole file (it's a pretty large one) and save the fields in an array, for example videos[ ], so that later I can access every video's attributes? Say, videos[0]['title'] to get the title of the first video, and likewise for the other attributes, for example videos[42]['link'] to get the link to video #42.

I am already using curl to get the file content into a variable like this

<?php
   $handler = curl_init("link to m3u file");  
   $response = curl_exec ($handler);  
   curl_close($handler); 
   echo $response;
?>

What I need now is to parse the cURL response and save all the video information into an array where I can access every attribute of every video.

I know I must use a regexp or something like that; I just don't understand how. Can you please help me with some code? Thank you so much.

What are those spaces, are they tabs or spaces? If they are tabs you could parse them as CSV, like `fgetcsv($handle, 0, "\t")`, maybe using `fopen('php://temp')` for the stream wrapper. — ArtisticPhoenix, Jan 24 '17 at 05:51

ArtisticPhoenix · Accepted Answer · 2017-01-24T07:35:59.940

Behold the magik of Regx

$string = <<<CUT
#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
CUT;

preg_match_all('/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")|(?<something>,[^\r\n]+)|(?<url>http[^\s]+)/', $string, $match );

$count = count( $match[0] );

$result = [];
$index = -1;

for( $i =0; $i < $count; $i++ ){
    $item = $match[0][$i];

    if( !empty($match['tag'][$i])){
        //is a tag increment the result index
        ++$index;
    }elseif( !empty($match['prop_key'][$i])){
        //is a prop - split item
        $result[$index][$match['prop_key'][$i]] = $match['prop_val'][$i];
    }elseif( !empty($match['something'][$i])){
        //is the trailing ",title" text - keep the whole match
        $result[$index]['something'] = $item;
    }elseif( !empty($match['url'][$i])){
        $result[$index]['url'] = $item ;
    }
}

print_r( $result );

Returns

array (
  0 => 
  array (
    'tvg-name' => 'A&E',
    'group-title' => 'ENTRETENIMIENTO',
    'something' => ',A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts',
    'url' => 'http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts',
  ),
  1 => 
  array (
    'tvg-name' => 'ABC Puerto Rico',
    'group-title' => 'NACIONALES',
    'something' => ',ABC Puerto Rico',
    'url' => 'http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts',
  ),
)

Seriously though, I have no clue what some of this is, the `something` part for example. Anyway, this should get you started.

As for the regex, it's actually pretty simple once it's broken down. The real trick is using preg_match_all instead of preg_match.

Here is our regex

 /(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")|(?<something>,[^\r\n]+)|(?<url>http[^\s]+)/

First we will break it down into more manageable bits. These are separated by the pipe |, which means "or". Each one can be thought of as a separate pattern: match this one or the next one. Now, the order can be important, because at any given position the alternatives are tried left to right, and the first one that matches wins. So you have to be careful not to have a regex that can match in two places (if you don't want that). However, it can be used to your advantage too, as I will show below. This is really what we are dealing with

 (?P<tag>#EXTINF:-1)

 (?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")

 (?<something>,[^\r\n]+)

 (?<url>http[^\s]+)

Four regular expressions. In all of these, (?P<name>...) (or the equivalent (?<name>...)) is a named capture group; it just makes things more readable and the bits easier to find. If you look at the conditions I use to find the matches, for example !empty($match['tag'][$i]), we can use the tag key because of the named capture group; otherwise it would be 1. With several patterns joined together, having 1, 2, 3 can get messy, especially as the array is nested, so it would be $match[1][$i] for tag, etc. Anyway, once the group names are taken out we have

#EXTINF:-1 matches this string literally
(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)") is more complicated. (?: .. ) is a non-capturing group; it is there so the key and value wind up at the same index in the match array without also being captured together. Broken down, this is ([-a-z]+)=\"([^"]+)\", or: match a word followed by =, then ", then anything but a ", ending with ". Basically one side captures the key, the other the value excluding the double quotes. Because of the +, an empty value such as tvg-id="" does not match, which is why it is missing from the result
,[^\r\n]+ starts with a comma, then anything but a line break
and last http[^\s]+, a URL
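To see the key/value pattern in isolation, here is a minimal sketch that applies only that part to one line of the sample playlist. The variable names `$line`, `$m` and `$props` are just for illustration; the point is that the two named groups line up by index, so they can be paired directly.

```php
<?php
// One #EXTINF line copied from the sample playlist in the question.
$line = '#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E';

// Only the key="value" alternative of the full regex is used here.
preg_match_all('/(?P<prop_key>[-a-z]+)="(?P<prop_val>[^"]+)"/', $line, $m);

// The same position in both named arrays belongs to the same key/value pair.
$props = array_combine($m['prop_key'], $m['prop_val']);

// tvg-id and tvg-logo are absent: [^"]+ needs at least one character.
print_r($props);
```

If the empty attributes should be kept as well, changing `[^"]+` to `[^"]*` in this pattern should let them through, though the `!empty()` checks in the loop above would then need a closer look.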

Now remember what I said about order being important. The URL http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts would match the last expression, except that on the first line of the sample it is part of ,A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts, which starts with a comma and so matches the 3rd one. The whole line is consumed there and it never gets to number 4.
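A small sketch of that effect, using only the last two alternatives and two lines modelled on the sample (the `$text` variable is made up for the demo). The glued-on URL ends up inside the `something` match, while the URL on its own line is the only one that reaches the `url` group.

```php
<?php
// A title with a URL glued onto it, then the real URL on its own line.
$text = ",A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts\n"
      . "http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts";

preg_match_all('/(?<something>,[^\r\n]+)|(?<url>http[^\s]+)/', $text, $m);

// First match: the comma pattern consumes the rest of the line, URL included.
echo $m['something'][0], "\n";

// Second match: only the url pattern can start at "http" on the next line.
echo $m['url'][1], "\n";
```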

Hope that helps. Granted, you'll need a basic grasp of regex; this is not really the place for a full tutorial on that, and you can find better examples than I can provide in a few short minutes.

Just for the sake of completeness, here is part of what preg_match_all returns

(
    [0] => Array(
            [0] => #EXTINF:-1
            [1] => tvg-name="A&E"
            [2] => group-title="ENTRETENIMIENTO"
            [3] => ,A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
            [4] => http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
            [5] => #EXTINF:-1
            [6] => tvg-name="ABC Puerto Rico"
            [7] => group-title="NACIONALES"
            [8] => ,ABC Puerto Rico
            [9] => http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
        )
    [tag] => Array(
            [0] => #EXTINF:-1
            [1] => 
            [2] => 
            [3] => 
            [4] => 
            [5] => #EXTINF:-1
            [6] => 
            [7] => 
            [8] => 
            [9] => 
        )
    [1] => Array(
            [0] => #EXTINF:-1
            [1] => 
            [2] => 
            [3] => 
            [4] => 
            [5] => #EXTINF:-1
            [6] => 
            [7] => 
            [8] => 
            [9] => 
        )
    [prop_key] => Array(
            [0] => 
            [1] => tvg-name
            [2] => group-title
            [3] => 
            [4] => 
            [5] => 
            [6] => tvg-name
            [7] => group-title
            [8] => 
            [9] => 
        )
    [2] => Array( ... duplicate of prop_key .. ) 
   etc. 
)

To find an item in the above array, look at the for loop. The main part of the match, $match[0], contains all the matches, while the tag array only has a non-empty entry where that particular regex matched. We can correlate the two using the $i index, starting with the first iteration at index 0.

    if( !empty($match['tag'][$i])){
        //is a tag increment the result index
        ++$index;
    }

This checks whether $match['tag'][$i] is not empty. If you look at $match['tag'][0], when $i = 0, you will see that indeed it is not empty. On the second iteration $match['tag'][1] is empty but $match['prop_key'][1] is not, so we know that when $i = 1 the item is a prop_key match. That's how that works.

-ps- If you can find a way to remove the duplicated numeric indexes, please share it with me ... lol ... these are the keys you would normally get if I didn't use named capture groups; as I said, it can get messy.
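One possible way to drop the numeric duplicates after the fact, sketched here on a shortened line and a shortened version of the regex, is to filter the match array by key. `array_filter` with `ARRAY_FILTER_USE_KEY` (available since PHP 5.6) passes each key to the callback, and `is_string` keeps only the named groups. The `$named` variable is just a name chosen for this example.

```php
<?php
$line = '#EXTINF:-1 tvg-name="A&E" group-title="ENTRETENIMIENTO",A&E';

preg_match_all(
    '/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)="(?P<prop_val>[^"]+)")/',
    $line,
    $match
);

// Keep only the entries whose key is a string, i.e. the named groups.
// This also drops $match[0] (the full matches), so copy it first if needed.
$named = array_filter($match, 'is_string', ARRAY_FILTER_USE_KEY);

print_r(array_keys($named));
```

The loop in the answer reads `$match[0][$i]` for the `something` and `url` items, so with this approach that array would have to be kept separately, or the named group value used instead.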

Sorry, I used your code and it returned an empty array, Array(). I just edited the question and fixed the code part with the m3u file content. Can you give me a hand again? I need the tags tvg-name, group-title and url; omit the "something" tag. And if you could explain the regex a little, I would appreciate it. Thank you again — Alejandro Arenas, Jan 24 '17 at 06:48
What's the point of omitting it when you can just ignore it? Besides, where does that data go? — ArtisticPhoenix, Jan 24 '17 at 06:53
OK, let's say I ignore it. But the code still returns an empty array. I corrected the code in the question; can you please help me get the values now? — Alejandro Arenas, Jan 24 '17 at 06:55
You're likely having an issue with the HEREDOC, this bit `$string = << — ArtisticPhoenix, Jan 24 '17 at 07:17
Instead of `$string` you can just use the result of cURL, `$response` — ArtisticPhoenix, Jan 24 '17 at 07:22
And if you really don't want that `something`, just remove the regex part for it, `|(?<something>,[^\r\n]+)`, including the `|` pipe. — ArtisticPhoenix, Jan 24 '17 at 07:23
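Pulling the comments above together, here is a rough sketch of the same loop with the `something` alternative removed, wrapped in a function so it can take any playlist string. `parsePlaylist` is a hypothetical name, not something from the answer. One thing worth checking for the empty-array problem: `curl_exec()` only returns the body when `CURLOPT_RETURNTRANSFER` is set; without it the body is printed and `$response` is just `true`. The snippet in the question does not set that option.

```php
<?php
// parsePlaylist() is a hypothetical helper name, not part of the answer above.
function parsePlaylist(string $m3u): array
{
    // Same pattern as the answer, minus the "something" alternative.
    $regex = '/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)="(?P<prop_val>[^"]+)")|(?<url>http[^\s]+)/';
    preg_match_all($regex, $m3u, $match);

    $result = [];
    $index = -1;
    foreach ($match[0] as $i => $item) {
        if ($match['tag'][$i] !== '') {
            ++$index;                 // a new #EXTINF entry starts here
            $result[$index] = [];
        } elseif ($index < 0) {
            continue;                 // ignore anything before the first entry
        } elseif ($match['prop_key'][$i] !== '') {
            $result[$index][$match['prop_key'][$i]] = $match['prop_val'][$i];
        } elseif ($match['url'][$i] !== '') {
            $result[$index]['url'] = $item;
        }
    }
    return $result;
}

// With cURL, set CURLOPT_RETURNTRANSFER so curl_exec() returns the body:
//   curl_setopt($handler, CURLOPT_RETURNTRANSFER, true);
//   $videos = parsePlaylist(curl_exec($handler));

// Local sample so the sketch runs without a network request.
$sample = "#EXTM3U\n"
        . "#EXTINF:-1 tvg-id=\"\" tvg-name=\"A&E\" tvg-logo=\"\" group-title=\"ENTRETENIMIENTO\",A&E\n"
        . "http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts\n";

$videos = parsePlaylist($sample);
echo $videos[0]['tvg-name'], ' -> ', $videos[0]['url'], "\n";
```

From there, `$videos[0]['tvg-name']` and `$videos[0]['url']` play the role of the `title` and `link` keys asked for in the question.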

score 0 · Answer 2 · answered Nov 26 '18 at 22:07

I wrote a simple working m3u8 parser in PHP. It parses a remote m3u8 file to JSON, but it's easy to change the output: https://github.com/onigetoc/m3u8-PHP-Parser

I may soon change it to use cURL, or add a cURL option, instead of file_get_contents().

m3u-parser.php?url=https://raw.githubusercontent.com/onigetoc/m3u8-PHP-Parser/master/ressources/demofile.m3u

score -1 · Answer 3 · answered Jan 24 '17 at 05:47

Once you have the cURL response, read the file from the remote location via cURL or the fopen function.

To do that, read the files in the directory at the remote location and save them all to the local server.

You can use the file function stat to get all the information and keep it in $files.

I have given the idea of how to collect all the information; from there you can create the array.

Once the array is created, you can serialize the response for printing.
