0

I am trying to find a string in a file. The file is actually a tar file (please pay attention to this). I opened it in Notepad++ and copied an arbitrary string from it. I have read the whole tar file into a buffer, and now I want to find the position of the copied string in that buffer using the strstr function.

The code to do this (which I am sure is correct) is:

const char *compare = "_.png"; // suppose this is the copied string,
                               // which is to be found in buffer using strstr
char *StartPosition;
StartPosition = strstr(buffer, compare);
__int64 count = 0;
MessageBox(m_hwndPreview, L"before the while loop", L"BTN WND6", MB_ICONINFORMATION);
while (StartPosition != NULL)
{
    MessageBox(m_hwndPreview, L"hurr inside the while loop", L"BTN WND6", MB_ICONINFORMATION);
    MessageBoxA(m_hwndPreview, strerror(errno), "BTN WND4", MB_ICONINFORMATION);
    count = StartPosition - buffer + 1; // 1-based position of the match
    return 0;
}

Suppose the content of the tar file, as shown in Notepad++, looks like the line below. This is where I copied the string stored in compare from:

3_VehicleWithKinematicsAndAerodynamics_.000.png  IHDR (here there is some strange data that can't be copied, and also a lot of NULs, but we only have to find the position of "_.png", so it is not difficult in this case).

My code works fine as long as the string I search for is located before the .png data: strstr finds its position. The problem appears when I try to find the position of a string that comes after it:

`3_VehicleWithKinematicsAndAerodynamics_.000.png  IHDR ...suppose here we have strange data (the data block, in tar parser terms)...after this we have another file, like "3_VehicleWithKinematicsAndAerodynamics_.html"`

If I want to find "3_VehicleWithKinematicsAndAerodynamics_.html" using strstr, I am not able to find it because of the strange data in between. (I think that data is not recognized by the compiler, and due to that I am not able to access the file located after the strange data.) To make it clearer, the files are laid out in the tar file as follows:

3_VehicleWithKinematicsAndAerodynamics_.000.png  IHDR ............(its data block, which shows up as strange content if you open the tar file)....3_VehicleWithKinematicsAndAerodynamics_.000.html

I have to access the .html file using strstr. Why is it not finding it? Any ideas?

Please suggest alternatives to achieve this. I am sure what I am trying won't work.

Sss

2 Answers

2

A C style string is a number of characters terminated by a zero-character (NUL character - the value zero, not the character '0'). This means that strstr will stop as soon as it hits such a byte.
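A minimal sketch of that behaviour, assuming a small made-up buffer rather than a real tar file: `strstr_offset` is a hypothetical helper (not part of the question's code) that reports where strstr finds a string, or -1 if it does not. A string placed before the first NUL byte is found; one placed after it is never reached.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: offset of needle within haystack as seen by
   strstr, or -1 if strstr does not find it. */
static long strstr_offset(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    const char *hit = strstr(haystack, needle);
    return hit ? (long)(hit - haystack) : -1L;
}

/* Made-up sample: "a.png", one NUL byte, then "b.html".
   strstr treats the first NUL as the end of the string, so only the
   part before it is searched:
     strstr_offset(sample, ".png")   finds the match at offset 1
     strstr_offset(sample, ".html")  returns -1 */
static const char sample[] = "a.png\0b.html";
```

The bytes after the NUL are still in memory; strstr simply never looks at them.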

One quite plausible solution is to simply write a function that searches through binary data based on its length, not on a "terminating character".

Something like this (this still assumes that the str is a C style string):

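 #include <string.h>  /* strlen */

 const char *find_str_in_data(const char *str, const char *data, int length)
 {
    int pos = 0;
    int slen = (int)strlen(str);
    /* <= so that a match ending exactly at the end of data is found */
    while (pos <= length - slen)
    {
       int i = 0;
       /* compare with ==, not = (which would assign) */
       while (i < slen && str[i] == data[pos + i])
       {
           i++;
       }
       if (i == slen)
          return data + pos;
       pos++;  /* move on to the next candidate position */
    }
    return NULL;
 }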
Mats Petersson
  • but i have to do in visual c++ and there not only NULL in between two files there are strange data like"™‹xvŠˆŒÁ­¡8Ÿ+„ç÷ˆ@wáæ/&O‘S„›¥h>T¤ÇßÌþí.·Cx9ˆð¾"2RDF Dè Qq‡hðý»Ã¨Ñ(ÎïC…¹\XX 7¥˜þªøý”0…BÄKƒ…ßcB!Ôbí ñå&¡½¾›Åk»0¡®£½Æ‹½©Ø3E…Ntâõ!¯­H-B£AìY%l-L;[w?!òt54lé$a*TB”dŠæ¾"P)ÚOZ£ˆß)\­ÄªCBSBÄlxнñB" are also present. How to access the second file which is located after this strange data – Sss Jul 26 '13 at 14:51
  • The above should work to find the string if it's present, whatever the rest of the data. To find another string further on, just start after the last found point (and of course adjust the length accordingly). Although I don't see any `.png` in that string. – Mats Petersson Jul 26 '13 at 14:53
  • The tar file contains data like this: 3_VehicleWithKinematicsAndAerodynamics_.000.png (at the start), then strange data and NULs in between, and then 3_VehicleWithKinematicsAndAerodynamics_.html (I have to load the data starting from here once I know its location). What I meant before is that strstr is able to find the position of a string up to the .png (look at the end of the first file name, it has the extension ".png"). After this .png there is strange data, so I am not able to access the contents of the second file (which is actually a .html file that I have to store in a buffer). Do you understand me? – Sss Jul 26 '13 at 15:17
  • @Mats What if the string to be searched for matches some other string of the same length? As far as I can see, your algorithm only deals with the length of the string to be searched for. Is there any solution for this, or have I misunderstood something? – Sss Jul 29 '13 at 06:26
  • Not sure I understand your questions? It does deal with the length of the string to search for as well - it just doesn't take it as an argument, because it can figure it out using strlen. Are you asking "how do I search for several different strings at the same time"? That's not at all so easy to do - just calling this type of function several times will be the best way to do that. But I'm not quite sure what you are asking. Since this question is already answered, I'd suggest that you write a new one (that way other people will look at it too). – Mats Petersson Jul 29 '13 at 10:05
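The "start after the last found point and adjust the length" suggestion from the comments could be sketched as follows. This is an illustration under stated assumptions, not code from the answer: `find_bytes` and `find_all` are hypothetical names, and memcmp is used for the comparison, so every byte of the pattern has to match, not just its length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: first occurrence of pat (pat_len bytes) in data
   (data_len bytes), or NULL. NUL bytes are treated as ordinary data. */
static const char *find_bytes(const char *data, size_t data_len,
                              const char *pat, size_t pat_len)
{
    assert(data != NULL && pat != NULL);
    if (pat_len == 0 || pat_len > data_len)
        return NULL;
    for (size_t pos = 0; pos + pat_len <= data_len; pos++) {
        if (memcmp(data + pos, pat, pat_len) == 0)
            return data + pos;
    }
    return NULL;
}

/* Hypothetical helper: stores the offsets of up to max_hits
   non-overlapping occurrences of the C string pat; returns the count. */
static size_t find_all(const char *data, size_t data_len, const char *pat,
                       size_t *offsets, size_t max_hits)
{
    size_t pat_len = strlen(pat);
    size_t found = 0;
    const char *cur = data;
    size_t remaining = data_len;
    const char *hit;

    while (found < max_hits &&
           (hit = find_bytes(cur, remaining, pat, pat_len)) != NULL) {
        offsets[found++] = (size_t)(hit - data);
        cur = hit + pat_len;                         /* restart after the hit */
        remaining = data_len - (size_t)(cur - data); /* adjust the length */
    }
    return found;
}
```

The caller passes the number of bytes actually read from the file as `data_len`; strlen on the buffer would stop at the first NUL and give the wrong length.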
0

If you really want to use strstr, then you need to terminate the string contained in buffer with '\0'. If you know the size of the data that was put into the buffer (let's say, sizeOfData), then you could do something like this before you use strstr:

buffer[sizeOfData] = '\0';

Warning: if sizeOfData is equal to the size of buffer, then you will either need a larger buffer or have to overwrite the last character with '\0' (in the second case you should check the tail of the buffer manually, because the character you've overwritten could be one of the characters of the sequence you are looking for).
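One caveat worth illustrating: terminating the buffer keeps strstr from reading past the end, but strstr still stops at the first NUL byte inside the data. A possible way to keep using strstr anyway is to call it once per NUL-separated segment. The sketch below assumes the buffer has room for at least `size + 1` bytes and that the search string itself contains no NUL; `strstr_across_nuls` is a hypothetical name, not an existing library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper. Assumes buffer can hold size + 1 bytes, so that
   buffer[size] can be set to '\0' as described in the answer above. */
static char *strstr_across_nuls(char *buffer, size_t size, const char *needle)
{
    assert(buffer != NULL && needle != NULL);
    buffer[size] = '\0';          /* terminate the data */

    char *p = buffer;
    char *end = buffer + size;
    while (p < end) {
        char *hit = strstr(p, needle);  /* searches one segment only */
        if (hit != NULL)
            return hit;
        p += strlen(p) + 1;       /* skip this segment and its NUL */
    }
    return NULL;
}
```

A match cannot span a NUL byte with this approach, which is fine for file names but not for arbitrary binary patterns; a length-based search does not have that limitation.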

podkova