I have a file with a passage in it (Assignment2inputfile.txt), and I can open that file just fine. I have another file (stopwords) containing a list of words that, if found in Assignment2inputfile.txt, need to be replaced with the word "stop" (I put it in all caps in the code so I can see immediately when it works). I feel like I'm right on the edge of what I need, but the replacement is not happening. This is an exercise, which is why my variables are named very generally or after what they do ($chng -> change -> the changed contents of the original file; $new -> the result of the changes).

$x = file_get_contents('Assignment2inputfile.txt');
$chng = str_replace("stopwords",'STOP', $x); 
$new = file_put_contents('Assignment2inputfile.txt', $chng);
echo $new; 
Kari
  • What is the structure of the stopwords file? Is it just a list of words, each on a new line, or CSV, or something else? It doesn't look like you're loading that file in your code. (Also there's a syntax error on line 2, but that's probably just a typo in your question.) – Don't Panic Oct 04 '18 at 21:36
  • A real example would be most useful. For example if you want to replace `red` but not `redemption` a better usage would be `preg_replace` and word boundaries. – user3783243 Oct 04 '18 at 21:38
  • The key point that's missing for you to complete the assignment is that you need to have an array where you have the string "stopwords" in the `str_replace` function. How you get that array depends on the structure of the stopwords file. – Don't Panic Oct 04 '18 at 21:39
  • When put up onto the server and accessed in browser, the stopwords file is a massive list of words, one per line, in alphabetical order. Yes, I just fixed the line 2 error, it was a typo. – Kari Oct 04 '18 at 21:39
  • Do you use `file` or `file_get_contents` on it? – user3783243 Oct 04 '18 at 21:40
  • How massive is it? If it's too big to hold in memory you may need to read it line by line and str_replace repeatedly rather than reading it into an array. – Don't Panic Oct 04 '18 at 21:41
  • If selected, copied, and pasted to Word: 927 words, one per line – Kari Oct 04 '18 at 21:43
  • @user3783243 What I use is what is there, I've been using file_get_contents. I'll probably need to repeat that function on stopwords since I haven't yet done that – Kari Oct 04 '18 at 21:45
  • Oh, 927 is not too bad. You can probably just read it in with [`file`](http://php.net/manual/en/function.file.php), as @user3783243 suggested. That will be better than `file_get_contents`, since you need an array anyway. Be sure to use the `FILE_IGNORE_NEW_LINES` flag. – Don't Panic Oct 04 '18 at 21:47
  • Different people have different ideas of massive ... lol ... for me massive is over 10 million. If they are one per line you can use [file](http://php.net/manual/en/function.file.php). I hardly ever use that function, but I remember it from the good old days. It creates an array from a file, split on the line breaks. But you'll probably want to trim the lines, e.g. `$filearray = array_map('trim', $filearray);`, since the manual says it `Returns the file in an array. Each element of the array corresponds to a line in the file, with the newline still attached` – ArtisticPhoenix Oct 04 '18 at 21:50
  • Keep in mind https://en.wikipedia.org/wiki/Scunthorpe_problem – user3783243 Oct 04 '18 at 21:52
  • Sorry, I just started coding no more than 4 weeks ago, so I am still getting used to how much data is managed for the functions and exercises I do. – Kari Oct 04 '18 at 21:52
  • @user3783243 I'd never heard of the Scunthorpe Problem so thank you for that! Definitely helpful to know about – Kari Oct 04 '18 at 21:54

2 Answers

str_replace can take an array of strings as its first parameter, and it will find and replace each of them in the target string. So here

$chng = str_replace("stopwords", 'STOP', $x);

"stopwords" needs to be an array $stopwords containing the list of words from that file.

Probably the easiest way to get that array is to use file, a function that reads a file into an array.

$stopwords = file('stopwords.txt', FILE_IGNORE_NEW_LINES);
$chng = str_replace($stopwords, 'STOP', $x);

FILE_IGNORE_NEW_LINES is needed because otherwise the strings in the array will include the newlines, and consequently probably won't match anything in your other file.
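To see why the flag matters, here is a minimal sketch. The three-word list, the temporary file, and the sample sentence are all made up for illustration; they are not the asker's real data.

```php
<?php
// Hypothetical three-word stopwords list, written to a temp file for the demo.
$path = tempnam(sys_get_temp_dir(), 'stop');
file_put_contents($path, "the\nand\nfor\n");

$withNewlines = file($path);                         // each element keeps its "\n"
$stopwords    = file($path, FILE_IGNORE_NEW_LINES);  // newlines stripped

$text = 'bread and butter for the table';

// "the\n", "and\n" and "for\n" never occur in $text, so nothing is replaced.
$unchanged = str_replace($withNewlines, 'STOP', $text);

// With clean words, every occurrence is replaced.
$changed = str_replace($stopwords, 'STOP', $text);

unlink($path);

echo $unchanged, "\n";
echo $changed, "\n";
```

Note that str_replace matches substrings, so a stopword such as "the" would also be replaced inside a longer word like "theatre".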


Sort of a sidenote, but file_put_contents doesn't return the new contents, it returns the number of bytes written to the file. So if you want to see the altered text, just echo $chng; instead of $new.
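A small sketch of that difference, writing a made-up string to a temporary file rather than to the real assignment file:

```php
<?php
// Temp file and sample string are stand-ins for illustration only.
$path = tempnam(sys_get_temp_dir(), 'out');
$chng = 'STOP quick brown fox';

// file_put_contents returns the number of bytes written (or false on failure).
$new = file_put_contents($path, $chng);
unlink($path);

echo $new, "\n";   // the byte count, not the text
echo $chng, "\n";  // the altered text itself
```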

Don't Panic

Here I will do you a solid (untested)

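$x = file_get_contents('Assignment2inputfile.txt');

// file() returns false on failure, and false can't be used as an array, so check it first
// (FILE_SKIP_EMPTY_LINES only has an effect when combined with FILE_IGNORE_NEW_LINES)
if (false === ($stopwords = file('stopwords.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES))) {
    throw new Exception('Could not load stop words from file');
}

// trim each word and escape regex metacharacters (including the / delimiter)
$stopwords = array_map(function($item){
    return preg_quote(trim($item), '/');
}, $stopwords);
// drop whitespace-only entries; an empty alternative would match at every word boundary
$stopwords = array_filter($stopwords, 'strlen');
$pattern = '/\b('.implode('|', $stopwords).')\b/';

$chng = preg_replace($pattern, 'STOP', $x);
$new = file_put_contents('Assignment2inputfile.txt', $chng);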

Basically after filtering the stopwords (array) you get a pattern like this

/\b(the|and|for)\b/

The pattern is basically

  • \b word boundary
  • ( ... | ... ) is OR

But the words need to be trimmed and escaped with preg_quote first, which is what the array_map call does.

If you are just replacing using 'STOP' for all the words this should be fine.
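For comparison, here is a small sketch of how the word-boundary pattern behaves next to a plain str_replace. The stopwords and the sample sentence are invented for the example, and the list is hard-coded instead of being read from a file.

```php
<?php
// Made-up stopwords and sample text, used only for illustration.
$stopwords = ['red', 'the', 'a.m.'];
$text = 'the red car passed the theatre on redemption road';

// Plain substring replacement also hits "theatre" and "redemption".
$substring = str_replace($stopwords, 'STOP', $text);

// Escape each word, then join into one alternation wrapped in \b ... \b.
$quoted  = array_map(function ($w) { return preg_quote($w, '/'); }, $stopwords);
$pattern = '/\b(' . implode('|', $quoted) . ')\b/';

// Only whole words are replaced.
$wholeWords = preg_replace($pattern, 'STOP', $text);

echo $pattern, "\n";
echo $substring, "\n";
echo $wholeWords, "\n";
```

The pattern is case-sensitive as written; adding the i modifier after the closing delimiter would make it match "The" as well.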

http://php.net/manual/en/function.file.php

http://php.net/manual/en/function.preg-quote.php

Oh and 'stopwords.txt' should be the name of your stopwords file.

ArtisticPhoenix