How to skip a file inside a tar file to get a particular file

Question

I am trying to get the contents of an HTML file that is stored inside a tar file (I am using Visual C++). My approach is to read the tar file into a buffer using a stream and then store the contents of the HTML file in another buffer. Using the buffer, I go to the name of each file in the tar file, which sits at buffer[0-100], store that name in "contents" (in my case), and check whether it has the .html extension.

If the file name contains .html, I store its contents starting from buffer[PreviousFileSizes + 512]. By PreviousFileSizes I mean the sizes of the files that come before the HTML file, which have to be added to the buffer index to reach the correct location; I am not assuming that the first file in the tar file is the HTML file. In my code I call this PreviousFileSizes "skip", meaning the number of bytes to skip to reach the HTML file.
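For reference, the offsets mentioned above (name at the start of the header, size at offset 124, data after the 512-byte header) can be written down as a struct. This is only a sketch of the common ustar header layout; the struct and its field names are illustrative and do not come from the code in this thread.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Sketch of one 512-byte ustar header block; field names are illustrative.
struct TarHeader {
    char name[100];     // offset   0: file name, NUL-padded
    char mode[8];       // offset 100
    char uid[8];        // offset 108
    char gid[8];        // offset 116
    char size[12];      // offset 124: file size written as octal text
    char mtime[12];     // offset 136
    char chksum[8];     // offset 148
    char typeflag;      // offset 156
    char linkname[100]; // offset 157
    char magic[6];      // offset 257: "ustar" in POSIX archives
    char version[2];    // offset 263
    char uname[32];     // offset 265
    char gname[32];     // offset 297
    char devmajor[8];   // offset 329
    char devminor[8];   // offset 337
    char prefix[155];   // offset 345
    char padding[12];   // offset 500: fills the block up to 512 bytes
};

// Compile-time checks that the sketch matches the offsets used in the question.
static_assert(sizeof(TarHeader) == 512, "a tar header occupies one 512-byte block");
static_assert(offsetof(TarHeader, name) == 0, "the name field starts the header");
static_assert(offsetof(TarHeader, size) == 124, "the size field sits at offset 124");
```

Because every member is a `char` array, the compiler inserts no padding, so the struct can be overlaid on the buffer at any header position.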

My code to achieve this is:

int skip=0;
            char contents [100];
            //char test[1000];
            do
            {

                    int SizeOfFile = CreateOctalToInteger(&buffer[skip+124],11);
                    size_t distance= ((SizeOfFile%512) ? SizeOfFile + 512 - (SizeOfFile%512) : SizeOfFile );
                    size_t skip= distance +512;
                    memcpy(contents,&buffer[skip],100);




            }
            while(strstr(contents,".html") != NULL);

Am I going in the right direction? Please correct me if there is anything wrong in my logic.

Have you tried using `tar xf mytarfile myfile.html`? Wouldn't that be A LOT easier? – Mats Petersson, Jul 29 '13 at 10:14
Is that a command you are talking about? If so, I cannot use the command prompt; I need C++ code that gets the contents of an HTML file inside a tar file. If it is something new and easy, please explain what `tar xf mytarfile myfile.html` does. – Sss, Jul 29 '13 at 10:35
I mean the command prompt tool. You'd have been done last week if you'd taken that option. Just live with the fact that it's a command prompt tool. Or find a `tar`-capable tool that runs in a GUI; I'm sure there are such things. – Mats Petersson, Jul 29 '13 at 10:57
Sorry Mats, I don't know how to do what you are saying. Please give me some links so I can understand it. With the code above I am getting the size of the next file, but I am only able to skip the first file, not the second. I think there is a problem in the do-while loop. Can you tell what the problem is? The skip does not repeat until the .html file is found: the loop executes only once. In the debugger I can see the contents of the second file in the tar file, but the loop exits after that. – Sss, Jul 29 '13 at 11:21
No, I don't know any links. I'm sure they exist, but I'm sure you can use Google just as well as I can. – Mats Petersson, Jul 29 '13 at 11:29

Ingo Leonhardt · Answer 1 · 2013-07-29T13:55:43.270

Doesn't look too bad except for the errors :-)

1. You set `skip = ...` instead of `skip += ...`, so your position in the buffer is only correct for the second file.
2. You don't check the first file, because the loop is `do { ... } while()`: by the first time you call `strstr()`, `contents` has already been filled from the buffer at some position skip > 0.
3. You should also add a break condition that stops the loop when you find an empty file name `""`.

EDIT: and we should of course also check against the tar file size.
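To make the first point concrete, here is a minimal sketch of how the offset of the next header follows from the current one. The helper name `nextHeaderOffset` is hypothetical; it only restates the rounding used in the question: one 512-byte header, then the file data padded up to a multiple of 512.

```cpp
#include <cstddef>

// Hypothetical helper: offset of the header that follows the entry whose
// header starts at `skip` and whose data is `fileSize` bytes long.
std::size_t nextHeaderOffset(std::size_t skip, std::size_t fileSize) {
    const std::size_t kBlock = 512;
    // Round the data size up to a whole number of 512-byte blocks.
    std::size_t padded = (fileSize + kBlock - 1) / kBlock * kBlock;
    // Accumulate onto the current position: this is the `+=` in the loop.
    return skip + kBlock + padded;
}
```

For an archive whose first two files are 700 and 100 bytes long, the headers sit at offsets 0, 1536 and 2560. Assigning instead of accumulating would compute 1024 for the third header, which is why only the second file is found correctly.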

I would try it like this:

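// I assume size_t bufsize to be the tar file size

size_t skip = 0;
// Keep skipping entries while we are still inside the archive, the name is not
// empty (end marker), and the name does NOT contain ".html"
while( bufsize > skip && strcmp( buffer+skip, "" ) != 0 && strstr( buffer+skip, ".html" ) == 0 ) {
     int SizeOfFile = CreateOctalToInteger(&buffer[skip+124],11);
     // round the data size up to a multiple of 512
     size_t distance= ((SizeOfFile%512) ? SizeOfFile + 512 - (SizeOfFile%512) : SizeOfFile );
     skip += distance +512;   // data blocks plus the 512-byte header
}

// The loop stopped either at an entry whose name contains ".html" or at the end
if( bufsize > skip && strstr( buffer+skip, ".html" ) != 0 ) {
    // hooray
    int SizeOfHTML = CreateOctalToInteger(&buffer[skip+124],11);
    char *htmlData = buffer+skip+512;

    // do stuff with htmlData
}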
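For anyone who wants to try the idea without the rest of the program, the following is a self-contained sketch of the same scan. Everything named here (`HtmlEntry`, `parseOctal`, `findFirstHtml`) is hypothetical; `parseOctal` merely stands in for `CreateOctalToInteger`, whose implementation is not shown in this thread. Like the code above, it matches ".html" anywhere in the name rather than only as an extension.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Result of the scan: where the HTML data starts in the buffer and how long it is.
struct HtmlEntry {
    bool found;
    std::size_t dataOffset;
    std::size_t size;
};

// Parse an octal text field such as the size field at offset 124.
static std::size_t parseOctal(const char* field, std::size_t width) {
    std::size_t i = 0, value = 0;
    while (i < width && field[i] == ' ') ++i;   // some writers pad with spaces
    for (; i < width && field[i] >= '0' && field[i] <= '7'; ++i)
        value = value * 8 + static_cast<std::size_t>(field[i] - '0');
    return value;
}

// Walk the headers until a name containing ".html" or the end of the archive.
HtmlEntry findFirstHtml(const char* buffer, std::size_t bufsize) {
    const std::size_t kBlock = 512;
    std::size_t skip = 0;
    // Stop at the end of the buffer or at an empty name (end-of-archive marker).
    while (skip + kBlock <= bufsize && buffer[skip] != '\0') {
        // The name field is 100 bytes and is not NUL-terminated when full.
        const void* nul = std::memchr(buffer + skip, '\0', 100);
        std::size_t nameLen =
            nul ? static_cast<std::size_t>(static_cast<const char*>(nul) - (buffer + skip)) : 100;
        std::string name(buffer + skip, nameLen);

        std::size_t size = parseOctal(buffer + skip + 124, 12);
        std::size_t dataOffset = skip + kBlock;
        // Only report the entry if its data lies completely inside the buffer.
        if (name.find(".html") != std::string::npos && size <= bufsize - dataOffset)
            return HtmlEntry{true, dataOffset, size};

        // Otherwise jump over the header and the padded data blocks.
        std::size_t padded = (size + kBlock - 1) / kBlock * kBlock;
        skip = dataOffset + padded;
    }
    return HtmlEntry{false, 0, 0};
}
```

A caller would then copy `size` bytes starting at `buffer + dataOffset` when `found` is true. Copying the name into a `std::string` costs an allocation per entry, but it avoids running `strstr` past a name field that has no terminating NUL.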

edited Jul 29 '13 at 13:55

answered Jul 29 '13 at 12:50

Ingo Leonhardt


I have some doubts. (1.) Why did you use the condition strcmp( buffer+skip, "WHY YOU HAVE KEPT IT BLANK" ) != 0? (2.) What do you mean by buffer+skip? (3.) Where are you asking me to put the break condition? Do I need to add a break condition to the code you wrote? – Sss Jul 29 '13 at 13:12
Comparing the filename to `""` is because of point 3 of my answer. That just *is* the additional break condition. `buffer+skip` is equivalent to `&buffer[skip]`. It's up to you which you like better; guess what I prefer :-). Anyway, that's basically the same as what you do in your original code, comparing the filename at position `skip`, just without copying a portion to `contents`, which is simply unnecessary – Ingo Leonhardt Jul 29 '13 at 13:15
But in my program I wouldn't know the file name, because it has to work for every tar file that contains an .html file. I need to store the .html contents and display them using a buffer, so I wouldn't know the names of the files inside the tar file. In this case I can look up the file name manually, but that is not possible for every tar file containing an HTML file. My program should work for every HTML file present in a tar file. Do you understand what I mean? – Sss Jul 29 '13 at 13:24
What filename is used in my code except `""`, which is what you find *behind the last file*? Look [here again](http://stackoverflow.com/questions/17862383/how-to-know-the-files-inside-the-tar-parser). Please just try it out – Ingo Leonhardt Jul 29 '13 at 13:27

score 0 · Accepted Answer · answered Jul 29 '13 at 16:01

I have finally found a solution to this question. The code is as follows:

size_t skip = 0;
char HtmlFileContents[200000];
char contents[101];                 // 100-byte name field plus a terminating NUL
do
{
    // size of the current entry, rounded up to a multiple of 512
    int SizeOfFile = CreateOctalToInteger(&buffer[skip+124],11);
    size_t distance = ((SizeOfFile%512) ? SizeOfFile + 512 - (SizeOfFile%512) : SizeOfFile );
    skip += distance + 512;         // jump over this entry's header and data
    memcpy(contents,&buffer[skip],100);
    contents[100] = '\0';           // strstr needs a terminated string
    if (strstr(contents,".html"))
    {
        MessageBox(m_hwndPreview,L"finally string is copied",L"BTN WND6",MB_ICONINFORMATION);
        int SizeOfHtml = CreateOctalToInteger(&buffer[skip+124],11);
        memcpy(HtmlFileContents,&buffer[skip+512],SizeOfHtml);
        break;
    }
}
while(contents[0] != '\0');         // an empty name marks the end of the archive
// Note: as written, the name of the first entry in the archive is never tested.

I guess it's self-explanatory. If not, do not hesitate to ask me.
