I have two files, blacklist.txt and email.txt. Both contain domain names. I have to compare the two files and find the domain names from blacklist.txt that appear in email.txt, using the strstr() function. Below is the code I have written. The problem is that it prints NULL instead of the matched text/domain name.

#include <stdio.h>
#include <string.h>

#define MAXCHAR 1000

int main() {
    FILE *fp, *fp1;
    char str[MAXCHAR];
    char str2[MAXCHAR];
    char *result;

    fp = fopen("blacklist.txt", "r");
    fp1 = fopen("email.txt", "r");
    if (fp == NULL || fp1 == NULL) { 
        printf("\n Cannot open one of the files %s %s \n", fp, fp1); 
        //return = 0; 
        if (fp != NULL) fclose(fp); 
        if (fp1 != NULL) fclose(fp1); 
    } 
    
    while (fgets(str, MAXCHAR, fp) != NULL || fgets(str2, MAXCHAR, fp1) != NULL)
    //printf("%s, %s", str,str2);
    fclose(fp);
    fclose(fp1);
    result = strstr(str, str2);
    printf("The substring starting from the given string: %s", result);
    return 0;
}
chqrlie
sanober
  • 3
    [Why is “Can someone help me?” not an actual question?](https://meta.stackoverflow.com/questions/284236/why-is-can-someone-help-me-not-an-actual-question). Please describe more specifically what prevents you from implementing the `chkSpam` function. And please fix up the code formatting, specifically the indentation, to make it readable. – kaylum Aug 10 '20 at 07:34
  • 1
    Please describe in which way the shown code fails to achieve your goal. – Yunnosch Aug 10 '20 at 07:41
  • 2
    `while (fgets(str, MAXCHAR, fp) != NULL || fgets(str2, MAXCHAR, fp1) != NULL) fclose(fp);` What's going on in this loop? You want to close the file each time you read one line from one file __or__ read one line from another? – KamilCuk Aug 10 '20 at 09:21
  • 2
    ^^^^^ You commented out the `printf`, but you shouldn't have commented out the `;`. – Bob__ Aug 10 '20 at 09:26

1 Answer

Here are some remarks on your code:

  • printing the error message has undefined behavior because you pass the FILE* pointers to %s instead of the file names.
  • your program has undefined behavior because the intended body of the while loop is missing: with the printf commented out, fclose(fp) becomes the loop body, so the next fgets() reads from a closed stream.
  • since one cannot assume that both files are sorted, each line from blacklist.txt should be tested against all lines in email.txt.
  • if we can assume that both lines end with a newline, a match with strstr() means the line from the second file is a suffix of the line from the first file. It is a domain match if the return value is the start of the buffer, and a match for a subdomain if the previous character is a '.'.
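
The last remark can be tried in isolation, without any files. The following is only a minimal sketch: the helper name domain_matches is hypothetical, and it assumes both strings end with a newline, as lines returned by fgets() normally do.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (name chosen for illustration only).
 * Assumes both `line` and `domain` end with '\n'.
 * Because `line` holds a single '\n' at its end, any occurrence of
 * `domain` inside `line` must be a suffix of `line`.
 * Returns 1 for an exact domain match or a subdomain match, else 0. */
int domain_matches(const char *line, const char *domain) {
    const char *p = strstr(line, domain);
    if (p == NULL)
        return 0;               /* `domain` is not a suffix of `line` */
    if (p == line)
        return 1;               /* exact match: same domain */
    return p[-1] == '.';        /* subdomain only if preceded by a dot */
}
```

With this test, "mail.example.com\n" matches "example.com\n", while "badexample.com\n" does not, because the character before the suffix is not a dot.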

Here is a modified version:

#include <stdio.h>
#include <string.h>

#define MAXCHAR 1000

int main() {
    FILE *fp, *fp2;
    char str[MAXCHAR];
    char str2[MAXCHAR];

    // open and check blacklist.txt
    fp = fopen("blacklist.txt", "r");
    if (fp == NULL) {
        printf("Cannot open %s\n", "blacklist.txt");
        return 1;
    }
    // open and check email.txt
    fp2 = fopen("email.txt", "r");
    if (fp2 == NULL) {
        printf("Cannot open %s\n", "email.txt");
        fclose(fp);
        return 1;
    }
    
    // for each line in blacklist.txt
    while (fgets(str, MAXCHAR, fp) != NULL) {
        // restart from the beginning of email.txt
        rewind(fp2);
        // for each line of email.txt
        while (fgets(str2, MAXCHAR, fp2) != NULL) {
            // check for a domain match
            char *p = strstr(str, str2);
            if (p != NULL && (p == str || p[-1] == '.')) {
                // compute the length of the domains (excluding the newline)
                // strcspn() returns a size_t, but %.*s expects an int
                int n = (int)strcspn(str, "\n");
                int n2 = (int)strcspn(str2, "\n");
                // output the message with the matching domains
                printf("domain match on %.*s for %.*s\n", n2, str2, n, str);
            }
        }
    }
    fclose(fp);
    fclose(fp2);
    return 0;
}
chqrlie
  • I ran this code, and the output is "Cannot open email.txt". – sanober Aug 10 '20 at 15:47
  • @sanober: sorry, there was a missing line `if (fp2 == NULL) {` and also a typo on `while (fgets(str2, MAXCHAR, fp2) != NULL) {`. I also added some explanatory comments. – chqrlie Aug 11 '20 at 10:10
  • This only outputs the last matched domain. It does not give me all the matched domains. – sanober Aug 18 '20 at 18:08
  • @sanober: it actually outputs the first domain of `email.txt` that matches a domain in `blacklist.txt`. If you want all matches, remove the `break` statement. You might also want to transpose the loops and test each domain in `email.txt` against all domains in `blacklist.txt` in this order. – chqrlie Aug 18 '20 at 18:14
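
The transposed loop order suggested in the last comment might look like the sketch below. The function name report_matches and its FILE * parameters are assumptions made for illustration and are not part of the answer. It keeps the answer's strstr() argument order and its assumption that every line ends with a newline, has no break statement so all matches are reported, and returns the number of matches.

```c
#include <stdio.h>
#include <string.h>

#define MAXCHAR 1000

/* Hypothetical helper: for each line of `emails`, scan every line of
 * `blacklist` and print all domain matches to `out`.
 * Both input streams must be seekable, since they are rewound.
 * Returns the number of matches found. */
int report_matches(FILE *emails, FILE *blacklist, FILE *out) {
    char email[MAXCHAR];
    char black[MAXCHAR];
    int count = 0;

    rewind(emails);                 /* start from the first email domain */
    while (fgets(email, sizeof email, emails) != NULL) {
        rewind(blacklist);          /* rescan the whole blacklist each time */
        while (fgets(black, sizeof black, blacklist) != NULL) {
            /* same test as in the answer: suffix match at a domain boundary */
            char *p = strstr(black, email);
            if (p != NULL && (p == black || p[-1] == '.')) {
                /* lengths of both domains, excluding the newline */
                int n = (int)strcspn(black, "\n");
                int n2 = (int)strcspn(email, "\n");
                fprintf(out, "domain match on %.*s for %.*s\n",
                        n2, email, n, black);
                count++;            /* no break: keep looking for more matches */
            }
        }
    }
    return count;
}
```

It could be called as report_matches(fp2, fp, stdout) once both files have been opened and checked as in the answer, so that matches are grouped by the domains listed in email.txt.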