I applied for a job recently and got sent a hackerrank exam with a couple of questions.One of them was a huffman decoding algorithm. There is a similar problem available here which explains the formatting alot better then I can.
The actual task was to take two arguments and return the decoded string.
The first argument is the codes, which is a string array like:
[
"a 00",
"b 101",
"c 0111",
"[newline] 1001"
]
Which is like: single character, two tabs, huffman code.
The newline was specified as being in this format due to the way that hacker rank is set up.
The second argument is a string to decode using the codes. For example:
101000111 = bac
This is my solution:
function decode($codes, $encoded) {
$returnString = '';
$codeArray = array();
foreach($codes as $code) {
sscanf($code, "%s\t\t%s", $letter, $code);
if ($letter == "[newline]")
$letter = "\n";
$codeArray[$code] = $letter;
}
print_r($codeArray);
$numbers = str_split($encoded);
$searchCode = '';
foreach ($numbers as $number) {
$searchCode .= $number;
if (isset($codeArray[$searchCode])) {
$returnString .= $codeArray[$searchCode];
$searchCode = '';
}
}
return $returnString;
}
It passed the two initial tests but there were another five hidden tests which it did not pass and gave no feedback on.
I realize that this solution would not pass if the character was a white space so I tried a less optimal solution that used substr to get the first character and regex matching to get the number but this still passed the first two and failed the hidden five. I tried function in the hacker rank platform with white-space as input and the sandboxed environment could not handle it anyway so I reverted to the above solution as it was more elegant.
I tried the code with special characters, characters from other languages, codes of various sizes and it always returned the desired solution.
I am just frustrated that I could not find the cases that caused this to fail as I found this to be an elegant solution. I would love some feedback both on why this could fail given that there is no white-space and also any feedback on performance increases.