
I want to compare two files to check whether the second file has been modified from the first.

For this I planned to compare the md5_file() hashes of both files. The problem is that the original file was created with Unix line endings, while the second file may use any line-ending style (Unix, Mac, or Windows), so the comparison always fails. How can I solve this?

I tried removing the whitespace from both files before comparing them, but that also failed. Is there another way to solve this?

I am not supposed to copy or change the second file.

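I fixed it myself as follows:

// Strip all whitespace (including \r and \n) before hashing, so line-ending differences no longer matter.
// Note that this also ignores every other whitespace difference between the files.
$hash1 = md5(preg_replace('/\s/', '', file_get_contents($file1)));
$hash2 = md5(preg_replace('/\s/', '', file_get_contents($file2)));

if ($hash1 === $hash2)
    continue; // files match; move on to the next iteration of the surrounding loop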
Santhanakumar

2 Answers


Simply replace all of the line endings in the second file with Unix-style ones, but do it on a temp file (or similar) so that the original is preserved.
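As a rough sketch of that idea, the normalization can also be done on an in-memory copy, so the second file on disk is never touched. The helper names `normalizeLineEndings()` and `filesMatchIgnoringLineEndings()` are made up for this illustration, and the sketch assumes both files are small enough to read fully into memory:

```php
<?php
// Hypothetical helper: convert Windows (\r\n) and classic Mac (\r)
// line endings to Unix (\n).
function normalizeLineEndings(string $text): string
{
    // "\r\n" is replaced first so a Windows ending does not turn into two newlines.
    return str_replace(["\r\n", "\r"], "\n", $text);
}

// Hypothetical helper: compare two files while ignoring line-ending style.
// Only in-memory copies are changed; neither file is written to.
function filesMatchIgnoringLineEndings(string $pathA, string $pathB): bool
{
    $a = file_get_contents($pathA);
    $b = file_get_contents($pathB);
    if ($a === false || $b === false) {
        throw new RuntimeException('Could not read one of the files.');
    }
    return md5(normalizeLineEndings($a)) === md5(normalizeLineEndings($b));
}
```

Unlike stripping all whitespace, this only treats line endings as interchangeable, so a change such as an added space or a removed blank line still counts as a modification.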

John V.

Depending on how big the files are, you could just read them into strings, taking the encoding into account, and then md5 those strings.

  // Read both files fully into memory.
  $file1 = file_get_contents($file_url_1);
  $file2 = file_get_contents($file_url_2);

  // Convert both to a common encoding; the source-encoding names are placeholders.
  $file1 = mb_convert_encoding($file1, "UTF-8", "whateverEncoding");
  $file2 = mb_convert_encoding($file2, "UTF-8", "whateverOtherEncoding");

  if (md5($file1) === md5($file2))

  ....
dognose
  • If you are reading the entire file content anyway, why not compare the contents directly instead of calculating MD5 sums and comparing those? Also, it would be much better to compute the MD5 sum from the file's byte values instead of converting to a string. – Dainius Jul 22 '13 at 08:27
  • @Dainius Well, I assume that the MD5 string of the "current" version can be stored, so you only need to hash the new file instead of hashing both files every time. But yes, a byte array would make sense. – dognose Jul 22 '13 at 11:47
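The stored-hash idea from the comments could look roughly like the sketch below. The function names `lineEndingAgnosticHash()` and `isUnmodified()` are hypothetical, and the sketch assumes the hash of the original file was computed once with the same function and saved somewhere:

```php
<?php
// Hypothetical helper: MD5 of a file's bytes after mapping \r\n and \r to \n.
// The file itself is only read, never written.
function lineEndingAgnosticHash(string $path): string
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Could not read $path");
    }
    return md5(str_replace(["\r\n", "\r"], "\n", $bytes));
}

// Hypothetical helper: compare a candidate file against a previously stored hash,
// so the original file does not have to be read and hashed again.
function isUnmodified(string $path, string $storedHash): bool
{
    return lineEndingAgnosticHash($path) === $storedHash;
}
```

If both files are being read in full on every check anyway, comparing the two normalized strings directly with `===` would work just as well and skips the hashing step.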