
My code works, but the output can occasionally contain the same $categoryurl more than once. How can I keep only the unique entries?

I have a folder called "xml" in the webroot, and I use glob() to search that /xml/ directory for XML files.

I loop over all XML files and collect every item node. Item nodes can be duplicated, because some of them appear in multiple XML files, so I use $html = array_unique($html); to remove all exact duplicates from my array.
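A small sketch of why array_unique() on the finished HTML strings may not be enough here: it compares the values as strings, so it only drops entries that are identical character for character. The sample strings below are made up for illustration; two items that share a URL but differ anywhere else both survive.

```php
<?php
// Hypothetical rendered items: the first two are identical,
// the third shares the URL but has a different title.
$html = [
    '<a href="https://google.com">Pittige rijst</a>',
    '<a href="https://google.com">Pittige rijst</a>',
    '<a href="https://google.com">Pittige rijst met bonen</a>',
];

// Only the exact copy is removed; array_values() renumbers the keys.
$unique = array_values(array_unique($html));

echo count($unique), "\n"; // the same URL is still present more than once
```

So deduplicating by URL needs the URL itself as the comparison key, not the whole rendered string.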

Some code:

<?php
// Code above this line was removed because it is not relevant to the question.
// $URL_array is defined above; it is an array of XML file URLs.
$html = array(); // collects one rendered HTML block per item
foreach ($URL_array as $XML_url) {
    $xml = simplexml_load_file($XML_url); // returns false on failure
    if ($xml === false || !is_object($xml))
        die('Kon het XML bestand niet laden, rapporteer a.u.b. deze fout.'); // "Could not load the XML file, please report this error."
    if (!is_object($xml->item))
        die('Kon de items niet laden, rapporteer a.u.b. deze fout.'); // "Could not load the items, please report this error."
    $Number_Of_Nodes = $xml->item->count(); /** Count number of items **/
    for ($i = 0; $i < $Number_Of_Nodes; $i++) { /** Loop over every item node **/
        $categoryname = $xml->item[$i]->recepttitle;
        $categoryurl = $xml->item[$i]->recepturl;
        $receptintroduction = $xml->item[$i]->receptintroduction;
        $receptimageurl = $xml->item[$i]->receptimageurl;
        $receptcategoryurl = $xml->item[$i]->receptcategoryurl;
        $receptcategory = $xml->item[$i]->receptcategory;
        $html[] = '<div class="content_box">' . "\r\n" . '<div class="content_box_header">' . "\r\n\t" . ucfirst($categoryname) . ' &bull; <a href="'. $receptcategoryurl . '">'. $receptcategory . '</a>' . "\r\n" . '</div>' . "\r\n" . '<div class="story_box_text">' . "\r\n" . '<br />' . "\r\n" . '<p><a title="' . $categoryname . '" href="' . $categoryurl . '"><img src="' . $receptimageurl . '" alt="' . $categoryname . '" title="' . $categoryname . '" /></a><br />' . $receptintroduction . '<br /><span class="align-right"><a title="'. $categoryname . '" href="' . $categoryurl . '" class="purplesmallbutton">Lees verder</a></span><br /></p>' . "\r\n" . '</div></div>' . "\r\n" . '<div class="clear"></div>' . "\r\n";
    }
}
if (empty($html)) {
    echo '<p class="error">In verband met werkzaamheden geen inhoud beschikbaar</p>' . "\r\n";
} else {
    $html = array_unique($html); /** Remove all exact duplicates **/
    shuffle($html);

Now I have a large shuffled array full of strings.

If $categoryurl is duplicated, I'd like to keep the first entry found plus all unique ones. How should I achieve this?

Last part of my code:

    echo implode("\n", array_slice($html, 0, 6)); /** output should always be at most 6 entries, with no duplicated $categoryurl; any duplicates must be removed before this step **/
}
?>
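One way to get "first one wins" per URL, sketched under assumptions: the item data is shown as plain arrays with hypothetical `url` and `html` keys, and `uniqueByUrl()` is a made-up helper name, not something from the code above. The idea is to remember which URLs have been seen and only store the first block for each.

```php
<?php
/**
 * Keep only the first item seen for each URL.
 * $items is a list of ['url' => string, 'html' => string] arrays (hypothetical shape).
 */
function uniqueByUrl(array $items): array
{
    $seen = [];
    foreach ($items as $item) {
        $key = (string) $item['url']; // array keys must be strings or integers
        if (!isset($seen[$key])) {
            $seen[$key] = $item['html']; // first occurrence wins
        }
    }
    return array_values($seen);
}

// Made-up sample data: two items share the same URL.
$items = [
    ['url' => 'https://google.com', 'html' => '<div>first</div>'],
    ['url' => 'https://google.com', 'html' => '<div>duplicate URL</div>'],
    ['url' => 'https://yahoo.com',  'html' => '<div>second</div>'],
];

$html = uniqueByUrl($items);
shuffle($html);
echo implode("\n", array_slice($html, 0, 6)), "\n";
```

Deduplicating before shuffle() and array_slice() means the six-entry output cannot contain the same URL twice.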

Edit

Contents for XML file: 1.xml

<?xml version="1.0" encoding="UTF-8"?>
<channel xmlns="http://www.w3.org/2005/Atom">
  <id>...</id>
  <title><![CDATA[vegetarische hoofdgerechten RSS]]></title>
  <author>
    <name>Voeding vegetarische hoofdgerechten</name>
    <email>webmaster@example.com</email>
  </author>
  <updated></updated>
  <link rel="alternate" href="https://voeding.esthervrees.nl/vegetarische-hoofdgerechten" />
  <subtitle><![CDATA[Some subtitle here.]]></subtitle>
  <rights>Copyrights reserved. Feel free to use the embed function.</rights>

    <item>
    <recepttitle><![CDATA[Pittige rijst met bonen voor 6 tot 8 personen]]></recepttitle>
    <shortrecepttitle><![CDATA[Pittige rijst met bonen]]></shortrecepttitle>
    <receptintroduction>Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here </receptintroduction>
    <recepturl>https://google.com</recepturl>
    <receptimageurl>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptimageurl>
    <receptcategoryurl>https://www.yahoo.com</receptcategoryurl>
    <receptcategoryimage>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptcategoryimage>
    <receptcategory>Vegetarische hoofdgerechten</receptcategory>
    </item>

</channel>

Contents for XML file: 2.xml

<?xml version="1.0" encoding="UTF-8"?>
<channel xmlns="http://www.w3.org/2005/Atom">
  <id>...</id>
  <title><![CDATA[vegetarische hoofdgerechten RSS]]></title>
  <author>
    <name>Voeding vegetarische hoofdgerechten</name>
    <email>webmaster@example.com</email>
  </author>
  <updated></updated>
  <link rel="alternate" href="https://voeding.esthervrees.nl/vegetarische-hoofdgerechten" />
  <subtitle><![CDATA[Some subtitle here.]]></subtitle>
  <rights>Copyrights reserved. Feel free to use the embed function.</rights>

    <item>
    <recepttitle><![CDATA[Pittige rijst met bonen voor 6 tot 8 personen]]></recepttitle>
    <shortrecepttitle><![CDATA[Pittige rijst met bonen]]></shortrecepttitle>
    <receptintroduction>Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here </receptintroduction>
    <recepturl>https://google.com</recepturl>
    <receptimageurl>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptimageurl>
    <receptcategoryurl>https://www.yahoo.com</receptcategoryurl>
    <receptcategoryimage>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptcategoryimage>
    <receptcategory>Vegetarische hoofdgerechten</receptcategory>
    </item>
    <item>
    <recepttitle><![CDATA[Pittige rijst met bonen voor 6 tot 8 personen]]></recepttitle>
    <shortrecepttitle><![CDATA[Pittige rijst met bonen]]></shortrecepttitle>
    <receptintroduction>Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here </receptintroduction>
    <recepturl>https://yahoo.com</recepturl>
    <receptimageurl>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptimageurl>
    <receptcategoryurl>https://www.yahoo.com</receptcategoryurl>
    <receptcategoryimage>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptcategoryimage>
    <receptcategory>Vegetarische hoofdgerechten</receptcategory>
    </item>

</channel>

The item node in 1.xml and the first item node in 2.xml have the same content for $categoryurl, but a given $categoryurl should not appear more than once across the items.

If a $categoryurl is duplicated (the same value appears in one or more other item nodes), I would like to keep only one randomly chosen item node from those duplicates, plus all items whose $categoryurl is unique. $html[] already contains only 100% unique strings (done with array_unique()). What is still missing is that every $categoryurl should be a unique URL: if it is not unique, keep one randomly selected entry and skip the other duplicates.
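A possible way to express "one random item per duplicated URL, plus all uniques", again as a sketch on plain arrays. The `url`/`html` keys and the `pickRandomPerUrl()` name are assumptions made for illustration: items are grouped by URL first, then one entry is drawn from each group with array_rand().

```php
<?php
/**
 * Group items by URL, then keep one randomly chosen item per group.
 * $items is a list of ['url' => string, 'html' => string] arrays (hypothetical shape).
 */
function pickRandomPerUrl(array $items): array
{
    $groups = [];
    foreach ($items as $item) {
        $groups[(string) $item['url']][] = $item['html'];
    }

    $picked = [];
    foreach ($groups as $candidates) {
        // array_rand() returns one random key of the group;
        // a group with a single entry simply yields that entry.
        $picked[] = $candidates[array_rand($candidates)];
    }
    return $picked;
}

// Made-up sample data: two items share a URL, one is unique.
$items = [
    ['url' => 'https://google.com', 'html' => '<div>google A</div>'],
    ['url' => 'https://google.com', 'html' => '<div>google B</div>'],
    ['url' => 'https://yahoo.com',  'html' => '<div>yahoo only</div>'],
];

$html = pickRandomPerUrl($items);
shuffle($html);
echo implode("\n", array_slice($html, 0, 6)), "\n";
```

Shuffling the item list first and then keeping the first occurrence per URL should give a similar effect with less bookkeeping.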

PHP example array:

    $URL_array = array($_SERVER['DOCUMENT_ROOT'] . '/xml/1.xml', $_SERVER['DOCUMENT_ROOT'] . '/xml/xml2.xml'); /** I added a lot more xml files to this array **/
jagb
  • _"keep the first key found and all uniques"_, you want to get the first only? And all others elsewhere? – Syscall Mar 02 '18 at 18:43
  • If there are any duplicates I would like to keep 1 of the duplicates (the first one where the url is duplicated) and all uniques – jagb Mar 02 '18 at 18:48
  • http://php.net/manual/en/function.array-unique.php – admcfajn Mar 02 '18 at 19:18
  • @IncredibleHat The answers below didn't do the trick, I believe your answer is looking very well but I wasn't able to get it to work. I receive php warnings such as `Illegal offset type in isset or empty` and `Illegal offset type` I'll try to get it to work by checking the other values if `$categoryurl` isset then if present add the string to the $html array as the ones without `$categoryurl` set are duplicates.. but it's still not working – jagb Mar 04 '18 at 22:11
  • @IncredibleHat Yes, I noticed the line with `trim()` but was not able to get it to work. With it the errors are gone, but the output still shows items with a duplicated `$categoryurl`. If a duplicated `$categoryurl` is found, only 1 XML item using that `$categoryurl` may appear in the output, chosen randomly; if multiple item nodes have the same `$categoryurl`, only 1 of the duplicates and all uniques should appear in the `$html[]` array. I updated my question with XML and PHP code examples that will allow you to test and see the output. How can I use your sample code? – jagb Mar 21 '18 at 03:36
  • You changed the fundamental basis of the question. What we had worked for what you originally described. But now tossing in all that extra behavior with randomizing, and whatnots. Sorry. Nope. – IncredibleHat Mar 21 '18 at 14:05
  • @IncredibleHat No, I didn't change the fundamental basis of the question. I wrote in my original question: "Now I have a large shuffled array full of strings **if $categoryurl is duplicated I'd like to keep the first key found and all uniques, How should I achive this?**" As each XML file within the `$URL_array` array has no duplicated `$categoryurl`, the first key found inside the `$URL_array` array is random, so there is nothing to randomize. I only added more information and XML code examples to the question; it's still the same question. Best Regards – jagb Mar 21 '18 at 19:04

1 Answer


Instead of:

$html[] = '<div class="content_box">...';

try:

// Cast to string: a SimpleXMLElement cannot be used as an array key
$key = (string) $categoryurl;
if (array_key_exists($key, $html)) {
    // this key already exists, we proceed to
    // the next item in the loop
    continue;
}
$html[$key] = '<div class="content_box">...';

This way you make sure that only the first occurrence is stored and any later ones are skipped.
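The "Illegal offset type" warnings mentioned in the comments are what PHP reports when an object is used as an array key. Values read from SimpleXML are SimpleXMLElement objects, so they need a string cast first. A minimal sketch, assuming a simplified inline document (no namespace, only two of the fields) in place of the real files:

```php
<?php
// Simplified sample document for illustration; the real files are loaded with simplexml_load_file().
$xml = simplexml_load_string(
    '<channel>'
    . '<item><recepttitle>A</recepttitle><recepturl>https://google.com</recepturl></item>'
    . '<item><recepttitle>B</recepttitle><recepturl>https://google.com</recepturl></item>'
    . '<item><recepttitle>C</recepttitle><recepturl>https://yahoo.com</recepturl></item>'
    . '</channel>'
);

$html = [];
foreach ($xml->item as $item) {
    // $item->recepturl is a SimpleXMLElement; cast and trim it to get a usable string key.
    $categoryurl = trim((string) $item->recepturl);
    if (isset($html[$categoryurl])) {
        continue; // URL already stored: skip this duplicate
    }
    $html[$categoryurl] = '<a href="' . htmlspecialchars($categoryurl) . '">'
        . htmlspecialchars((string) $item->recepttitle) . '</a>';
}

echo implode("\n", $html), "\n";
```

Because `$html` is declared once before the loop, the check also works across several files when the file loop wraps this item loop.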

  • I updated my question; it's about skipping duplicated `$categoryurl`. Please read the updated question, I added some lines and code. – jagb Mar 21 '18 at 03:50