PHP search in textfile then echoing it

Question

I want to search for specific data in a text file that contains accented letters. I used this code:

<?php
    $file = 'textfile.txt';
    $searchfor = 'key';


    // get the file contents, assuming the file to be readable (and exist)
    $contents = file_get_contents($file);
    // escape special characters in the query
    $pattern = preg_quote($searchfor, '/');
    // finalise the regular expression, matching the whole line
    $pattern = "/^.*$pattern.*\$/m";
    // search, and store all matching occurrences in $matches
    if(preg_match_all($pattern, $contents, $matches))
    {
       echo utf8_encode(implode("\n", $matches[0]));
    }
    else
    {
       echo utf8_encode("No matches found");
    }
?>

But it's case-sensitive and doesn't work with accented letters.

Can somebody help me please?

Thanks :)

Use $pattern = "/^.*$pattern.*\$/mi" to make the pattern case-insensitive. – Syed mohamed aladeen, Sep 06 '16 at 13:43
@Barmar This question is not a duplicate as you said. There are some aspects we have to consider when taking data from file_get_contents. So please remove the **duplicate** mark from this. – Manish, Sep 06 '16 at 14:41
@Barmar I want to post an answer to this question, but because it is marked as a duplicate I cannot post it. – Manish, Sep 06 '16 at 14:42
@Manish I've reopened it, although I think the only significant part of the question is the stuff that's duplicated. – Barmar, Sep 06 '16 at 15:34

score 0 · Answer 1 · answered Sep 06 '16 at 13:44

Add the i modifier to your current pattern:

$pattern = "/^.*$pattern.*\$/mi";
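
The i modifier on its own only folds ASCII letters in a byte-oriented match. A minimal sketch, assuming the file contents and this script are both saved as UTF-8: adding the u modifier lets PCRE treat É and é as the same letter. The sample string below is made up for illustration. This makes the match case-insensitive for accented letters, but it does not make 'key' match 'kÉy'; that needs the accents stripped first.

```php
<?php
// Hypothetical sample data standing in for the contents of textfile.txt (UTF-8).
$contents  = "Éric Cantona\nsomething else\nÉRIC again\n";
$searchfor = 'éric';

// m: ^ and $ match at line boundaries, i: case-insensitive, u: treat input as UTF-8
$pattern = '/^.*' . preg_quote($searchfor, '/') . '.*$/miu';

preg_match_all($pattern, $contents, $matches);
echo implode("\n", $matches[0]), "\n";
```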

Syed mohamed aladeen

Thank you, now it isn't case-sensitive, but it still doesn't work with accented letters. – Hiroo17, Sep 06 '16 at 13:47

score 0 · Answer 2 · answered Sep 06 '16 at 17:42

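You can use this to get all the strings that contain accented letters.

preg_match_all("/\s+(.*?[Ç-Ü]+.*?)\s+/iu", $str, $matches);

[Ç-Ü] is the range of characters between Ç (U+00C7) and Ü (U+00DC); written without the hyphen, [ÇÜ] would match only those two characters. The u modifier makes the pattern and $str be treated as UTF-8, so each accented letter counts as one character rather than several bytes.

These characters are not part of ASCII; for more details about that range, check the Latin-1 Supplement block of the Unicode table.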
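
A small sketch of a character-class range for accented letters, assuming $str and this script are UTF-8. The sample sentence is made up, and \p{L} (any Unicode letter) is used here in place of the \s+ delimiters so that words at the start or end of the string are also found.

```php
<?php
// Hypothetical sample text (UTF-8).
$str = "plain words Çelik and Über but not these";

// [Ç-Ü] is a real range only with the hyphen; the u modifier makes PCRE
// compare code points instead of individual bytes.
preg_match_all('/\p{L}*[Ç-Ü]\p{L}*/u', $str, $matches);

echo implode(', ', $matches[0]), "\n";
```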

zakaria35

score 0 · Accepted Answer · answered Sep 06 '16 at 18:09

@Hiroo17 I am explaining a method to do this.

Suppose you have a textfile.txt file that contains accented letters, like below.

Éric Cantona kÉy.

Here is a script that deals with accented letters.

$file = 'textfile.txt';
$searchfor = 'key';
function file_get_contents_utf8($fn) {
    $content = file_get_contents($fn);
    return mb_convert_encoding($content, 'UTF-8',
    mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}

function normalizeChars($s) {
    $replace = array(
    'ъ'=>'-', 'Ь'=>'-', 'Ъ'=>'-', 'ь'=>'-',
    'Ă'=>'A', 'Ą'=>'A', 'À'=>'A', 'Ã'=>'A', 'Á'=>'A', 'Æ'=>'A', 'Â'=>'A', 'Å'=>'A', 'Ä'=>'Ae',
    'Þ'=>'B',
    'Ć'=>'C', 'ץ'=>'C', 'Ç'=>'C',
    'È'=>'E', 'Ę'=>'E', 'É'=>'E', 'Ë'=>'E', 'Ê'=>'E',
    'Ğ'=>'G',
    'İ'=>'I', 'Ï'=>'I', 'Î'=>'I', 'Í'=>'I', 'Ì'=>'I',
    'Ł'=>'L',
    'Ñ'=>'N', 'Ń'=>'N',
    'Ø'=>'O', 'Ó'=>'O', 'Ò'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'Oe',
    'Ş'=>'S', 'Ś'=>'S', 'Ș'=>'S', 'Š'=>'S',
    'Ț'=>'T',
    'Ù'=>'U', 'Û'=>'U', 'Ú'=>'U', 'Ü'=>'Ue',
    'Ý'=>'Y',
    'Ź'=>'Z', 'Ž'=>'Z', 'Ż'=>'Z',
    'â'=>'a', 'ǎ'=>'a', 'ą'=>'a', 'á'=>'a', 'ă'=>'a', 'ã'=>'a', 'Ǎ'=>'a', 'а'=>'a', 'А'=>'a', 'å'=>'a', 'à'=>'a', 'א'=>'a', 'Ǻ'=>'a', 'Ā'=>'a', 'ǻ'=>'a', 'ā'=>'a', 'ä'=>'ae', 'æ'=>'ae', 'Ǽ'=>'ae', 'ǽ'=>'ae',
    'б'=>'b', 'ב'=>'b', 'Б'=>'b', 'þ'=>'b',
    'ĉ'=>'c', 'Ĉ'=>'c', 'Ċ'=>'c', 'ć'=>'c', 'ç'=>'c', 'ц'=>'c', 'צ'=>'c', 'ċ'=>'c', 'Ц'=>'c', 'Č'=>'c', 'č'=>'c', 'Ч'=>'ch', 'ч'=>'ch',
    'ד'=>'d', 'ď'=>'d', 'Đ'=>'d', 'Ď'=>'d', 'đ'=>'d', 'д'=>'d', 'Д'=>'D', 'ð'=>'d',
    'є'=>'e', 'ע'=>'e', 'е'=>'e', 'Е'=>'e', 'Ə'=>'e', 'ę'=>'e', 'ĕ'=>'e', 'ē'=>'e', 'Ē'=>'e', 'Ė'=>'e', 'ė'=>'e', 'ě'=>'e', 'Ě'=>'e', 'Є'=>'e', 'Ĕ'=>'e', 'ê'=>'e', 'ə'=>'e', 'è'=>'e', 'ë'=>'e', 'é'=>'e',
    'ф'=>'f', 'ƒ'=>'f', 'Ф'=>'f',
    'ġ'=>'g', 'Ģ'=>'g', 'Ġ'=>'g', 'Ĝ'=>'g', 'Г'=>'g', 'г'=>'g', 'ĝ'=>'g', 'ğ'=>'g', 'ג'=>'g', 'Ґ'=>'g', 'ґ'=>'g', 'ģ'=>'g',
    'ח'=>'h', 'ħ'=>'h', 'Х'=>'h', 'Ħ'=>'h', 'Ĥ'=>'h', 'ĥ'=>'h', 'х'=>'h', 'ה'=>'h',
    'î'=>'i', 'ï'=>'i', 'í'=>'i', 'ì'=>'i', 'į'=>'i', 'ĭ'=>'i', 'ı'=>'i', 'Ĭ'=>'i', 'И'=>'i', 'ĩ'=>'i', 'ǐ'=>'i', 'Ĩ'=>'i', 'Ǐ'=>'i', 'и'=>'i', 'Į'=>'i', 'י'=>'i', 'Ї'=>'i', 'Ī'=>'i', 'І'=>'i', 'ї'=>'i', 'і'=>'i', 'ī'=>'i', 'ĳ'=>'ij', 'Ĳ'=>'ij',
    'й'=>'j', 'Й'=>'j', 'Ĵ'=>'j', 'ĵ'=>'j', 'я'=>'ja', 'Я'=>'ja', 'Э'=>'je', 'э'=>'je', 'ё'=>'jo', 'Ё'=>'jo', 'ю'=>'ju', 'Ю'=>'ju',
    'ĸ'=>'k', 'כ'=>'k', 'Ķ'=>'k', 'К'=>'k', 'к'=>'k', 'ķ'=>'k', 'ך'=>'k',
    'Ŀ'=>'l', 'ŀ'=>'l', 'Л'=>'l', 'ł'=>'l', 'ļ'=>'l', 'ĺ'=>'l', 'Ĺ'=>'l', 'Ļ'=>'l', 'л'=>'l', 'Ľ'=>'l', 'ľ'=>'l', 'ל'=>'l',
    'מ'=>'m', 'М'=>'m', 'ם'=>'m', 'м'=>'m',
    'ñ'=>'n', 'н'=>'n', 'Ņ'=>'n', 'ן'=>'n', 'ŋ'=>'n', 'נ'=>'n', 'Н'=>'n', 'ń'=>'n', 'Ŋ'=>'n', 'ņ'=>'n', 'ŉ'=>'n', 'Ň'=>'n', 'ň'=>'n',
    'о'=>'o', 'О'=>'o', 'ő'=>'o', 'õ'=>'o', 'ô'=>'o', 'Ő'=>'o', 'ŏ'=>'o', 'Ŏ'=>'o', 'Ō'=>'o', 'ō'=>'o', 'ø'=>'o', 'ǿ'=>'o', 'ǒ'=>'o', 'ò'=>'o', 'Ǿ'=>'o', 'Ǒ'=>'o', 'ơ'=>'o', 'ó'=>'o', 'Ơ'=>'o', 'œ'=>'oe', 'Œ'=>'oe', 'ö'=>'oe',
    'פ'=>'p', 'ף'=>'p', 'п'=>'p', 'П'=>'p',
    'ק'=>'q',
    'ŕ'=>'r', 'ř'=>'r', 'Ř'=>'r', 'ŗ'=>'r', 'Ŗ'=>'r', 'ר'=>'r', 'Ŕ'=>'r', 'Р'=>'r', 'р'=>'r',
    'ș'=>'s', 'с'=>'s', 'Ŝ'=>'s', 'š'=>'s', 'ś'=>'s', 'ס'=>'s', 'ş'=>'s', 'С'=>'s', 'ŝ'=>'s', 'Щ'=>'sch', 'щ'=>'sch', 'ш'=>'sh', 'Ш'=>'sh', 'ß'=>'ss',
    'т'=>'t', 'ט'=>'t', 'ŧ'=>'t', 'ת'=>'t', 'ť'=>'t', 'ţ'=>'t', 'Ţ'=>'t', 'Т'=>'t', 'ț'=>'t', 'Ŧ'=>'t', 'Ť'=>'t', '™'=>'tm',
    'ū'=>'u', 'у'=>'u', 'Ũ'=>'u', 'ũ'=>'u', 'Ư'=>'u', 'ư'=>'u', 'Ū'=>'u', 'Ǔ'=>'u', 'ų'=>'u', 'Ų'=>'u', 'ŭ'=>'u', 'Ŭ'=>'u', 'Ů'=>'u', 'ů'=>'u', 'ű'=>'u', 'Ű'=>'u', 'Ǖ'=>'u', 'ǔ'=>'u', 'Ǜ'=>'u', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'У'=>'u', 'ǚ'=>'u', 'ǜ'=>'u', 'Ǚ'=>'u', 'Ǘ'=>'u', 'ǖ'=>'u', 'ǘ'=>'u', 'ü'=>'ue',
    'в'=>'v', 'ו'=>'v', 'В'=>'v',
    'ש'=>'w', 'ŵ'=>'w', 'Ŵ'=>'w',
    'ы'=>'y', 'ŷ'=>'y', 'ý'=>'y', 'ÿ'=>'y', 'Ÿ'=>'y', 'Ŷ'=>'y',
    'Ы'=>'y', 'ž'=>'z', 'З'=>'z', 'з'=>'z', 'ź'=>'z', 'ז'=>'z', 'ż'=>'z', 'ſ'=>'z', 'Ж'=>'zh', 'ж'=>'zh'
    );
    return strtr($s, $replace);
}

$contents = file_get_contents_utf8($file);
$contents = normalizeChars($contents);

// escape special characters in the query
$pattern = preg_quote($searchfor, '/');

// finalise the regular expression, matching the whole line
$pattern = "/^.*$pattern.*\$/mi";

// search, and store all matching occurrences in $matches
if (preg_match_all($pattern, $contents, $matches))
{
    // $contents is already UTF-8 here, so utf8_encode() would double-encode it
    echo implode("\n", $matches[0]);
}
else
{
    echo "No matches found";
}

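Now I will explain why I use the above method. file_get_contents returns the raw bytes of the file and does not know their encoding, so a file saved as ISO-8859-1 will not come back as UTF-8. To fix this, you can either use mb_convert_encoding as in the file_get_contents_utf8 function above or, if the file is known to be ISO-8859-1, use utf8_encode directly, like $contents = utf8_encode(file_get_contents($file)); (note that utf8_encode is deprecated as of PHP 8.2).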
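
The encoding step can be sketched on its own. This is a variant, not the answer's exact function: it uses mb_check_encoding instead of mb_detect_encoding, and it assumes that anything that is not valid UTF-8 is ISO-8859-1. The name ensure_utf8 is hypothetical, and the mbstring extension must be loaded.

```php
<?php
// Hypothetical helper: return $bytes as UTF-8.
// Assumption: input that is not valid UTF-8 is ISO-8859-1.
function ensure_utf8(string $bytes): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "\xC9ric";     // "Éric" in ISO-8859-1 (one byte for É)
$utf8   = "\xC3\x89ric"; // "Éric" in UTF-8 (two bytes for É)

echo bin2hex(ensure_utf8($latin1)), "\n";
echo bin2hex(ensure_utf8($utf8)), "\n";
```

Both calls produce the same UTF-8 bytes, so the rest of the script can assume one encoding.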

Then I use the normalizeChars function to deal with accented letters. It uses strtr, which translates characters or replaces substrings, to map each accented letter to an unaccented equivalent.
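
Here is a reduced sketch of the strtr idea, with a three-entry map standing in for the full table and made-up sample data. It differs from the script above in one respect, chosen for illustration: it normalizes each line only for the comparison and keeps the original line for output, so the accents still appear in what is echoed.

```php
<?php
// Tiny illustrative map; a real table would cover many more characters.
$map = ['É' => 'E', 'é' => 'e', 'Ü' => 'Ue'];

// Hypothetical sample data standing in for the UTF-8 contents of textfile.txt.
$contents  = "Éric Cantona kÉy.\nno match here\n";
$searchfor = 'key';

$pattern = '/' . preg_quote($searchfor, '/') . '/i';

$hits = [];
foreach (explode("\n", $contents) as $line) {
    // Compare against the accent-free copy, but store the original line.
    if (preg_match($pattern, strtr($line, $map)) === 1) {
        $hits[] = $line;
    }
}

echo $hits ? implode("\n", $hits) : 'No matches found', "\n";
```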

I hope this will resolve your issue.

And thanks again @Barmar for reopening this question. I hope you will not be disappointed with my answer.

@Hiroo17 Great. If it is resolved, please mark the answer as accepted. – Manish, Sep 07 '16 at 06:34
