
This question is somewhat related to another question I posted before (I'm posting it as a new question because I didn't want to interrupt the ongoing discussion in the other thread). I'm trying to write my own read() implementation using (among others) the pread system call. Note that this is not intended to improve performance in any way, but only to check how pread() could be used to achieve the same result as read().

I intercept the read() call and forward it to my own read_handler(). There I extract the parameters from their respective registers and perform my mapping.

static volatile int read_handler(void){

    register void *rdi asm ("rdi");
    register void *rsi asm ("rsi");
    register void *rdx asm ("rdx");

    int fd = rdi;
    char *buf = rsi;
    int count = rdx;
    printf("[OWN_READ] Got read(%d, %p, %d)\n", fd,buf, count);

    int current_offset = lseek(fd, 0, SEEK_CUR);

    int pread_return = pread(fd,buf,count,current_offset);

    //set fd offset, as pread will NOT change it automatically
    lseek(fd, current_offset+pread_return, SEEK_CUR);  

    return pread_return;
}

So, first I extract the parameters from the registers, which works fine. Next I get the current file offset (as pread will not change the offset, according to the man page). I call pread with the same parameters, plus the current offset of the file descriptor. Finally, I update the file offset using lseek and return the number of bytes read.

As stated in my previous question, a read() call within fseek will somehow break my read implementation. I had a function to get the current file size, which was as follows:

long get_file_size(const char *name)
{
    FILE *temp_file = fopen(name, "rb");
    if (temp_file == NULL)
    {
        return -1;
    }

    fseek(temp_file, 0L, SEEK_END);
    long sz =  ftell(temp_file);
    fclose(temp_file);
    return sz;
}

When I execute this function using the reference read() implementation, it returns the correct file size. With my implementation, on the other hand, get_file_size always returns double the actual size.

My understanding of read() and pread() is that the main difference (regarding the functionality to read from a file) is that pread does not update the file offset, which I added in my implementation using lseek. Thus (ignoring corner cases for now), my implementation should behave just like the reference implementation.
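To make that intended mapping concrete, here is a minimal sketch of read() emulated with pread() on a seekable descriptor. The names read_via_pread and check_read_via_pread and the temporary path template are hypothetical and are not part of the handler above. The sketch restores the offset with SEEK_SET because pos + n is an absolute position; passing the same value with SEEK_CUR would advance the offset by pos + n rather than by n, which may be worth checking against the handler.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not the handler from the question):
   emulate read() on a seekable descriptor using pread(). */
static ssize_t read_via_pread(int fd, void *buf, size_t count)
{
    off_t pos = lseek(fd, 0, SEEK_CUR);      /* current offset */
    if (pos == (off_t)-1)
        return -1;                           /* e.g. ESPIPE on a pipe */

    ssize_t n = pread(fd, buf, count, pos);  /* does not move the offset */
    if (n <= 0)
        return n;                            /* error or end of file */

    /* pos + n is an absolute position, hence SEEK_SET. */
    if (lseek(fd, pos + n, SEEK_SET) == (off_t)-1)
        return -1;
    return n;
}

/* Self-check on a temporary file; the path template is an assumption. */
static void check_read_via_pread(void)
{
    char path[] = "/tmp/pread_demo_XXXXXX";
    int fd = mkstemp(path);
    assert(fd >= 0);
    unlink(path);                            /* file disappears on close */

    ssize_t written = write(fd, "hello world", 11);
    assert(written == 11);
    off_t start = lseek(fd, 0, SEEK_SET);
    assert(start == 0);

    char buf[16];
    ssize_t n = read_via_pread(fd, buf, 5);
    assert(n == 5 && memcmp(buf, "hello", 5) == 0);
    assert(lseek(fd, 0, SEEK_CUR) == 5);     /* advanced by 5, like read() */

    n = read_via_pread(fd, buf, sizeof buf);
    assert(n == 6 && memcmp(buf, " world", 6) == 0);
    assert(lseek(fd, 0, SEEK_CUR) == 11);

    n = read_via_pread(fd, buf, sizeof buf);
    assert(n == 0);                          /* end of file */
    close(fd);
}
```

Calling check_read_via_pread() from a test program exercises the helper. Note that the lseek/pread/lseek sequence is not atomic, so unlike read() it can misbehave if another thread or process shares the same file offset.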

Additionally (if it is helpful), this "get_file_size" function works just fine:

unsigned long get_file_size()
{
    const char *text_file = "/tmp/syscalltest/tests/truncate_test.txt";
    int fd = open(text_file, O_RDONLY);
    unsigned int size = lseek(fd, 0 , SEEK_END);
    printf("[FILE SIZE] Current file size: %d\n",size );
    close(fd);
    return size;
}
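As a cross-check for the two variants above, the following sketch compares the stdio-based size (fseek/ftell, which may go through read() internally) with the size reported by stat(), which involves no read() at all. The function names and the temporary path template are hypothetical; the idea is only that a mismatch between the two values points at the intercepted read path rather than at the file.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Size as recorded in the inode; no read() is involved. */
static long size_via_stat(const char *name)
{
    struct stat st;
    if (stat(name, &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Size via stdio, same approach as the first get_file_size(),
   with the fseek() result checked. */
static long size_via_stdio(const char *name)
{
    FILE *f = fopen(name, "rb");
    if (f == NULL)
        return -1;

    long sz = -1;
    if (fseek(f, 0L, SEEK_END) == 0)
        sz = ftell(f);
    fclose(f);
    return sz;
}

/* Both methods should agree on an 11-byte temporary file. */
static void check_sizes_agree(void)
{
    char path[] = "/tmp/size_demo_XXXXXX";   /* assumed location */
    int fd = mkstemp(path);
    assert(fd >= 0);
    ssize_t written = write(fd, "hello world", 11);
    assert(written == 11);
    close(fd);

    assert(size_via_stat(path) == 11);
    assert(size_via_stdio(path) == 11);
    unlink(path);
}
```

Running check_sizes_agree() once with the reference read() and once with the intercepted one would show whether only the stdio path is affected.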

My goal with this test is to check whether pread() can produce the same output as a read() call. I tried to verify this by running various test files, including one that uses the get_file_size above. In my previous question it was hinted that my read() implementation may have an error that forces get_file_size() to produce wrong results. I'm trying to understand whether I have to check my read() implementation, or whether the error is caused by undefined behaviour in the use of fseek in the function. Thanks for any hints on which part might cause the error here.

MajorasKid
  • Why don't you use `stat()` to get the file size? – Shawn Nov 20 '19 at 12:55
  • Maybe related, the return types are not correct. Also see the [`pread (2)` man page](http://man7.org/linux/man-pages/man2/pread.2.html). – jww Nov 20 '19 at 13:09
  • Thanks for the comment. I have to adapt it so that I still return a size_t value. I also have another implementation for the get_size (which works fine) but still, this might be a valid case where my code should have the same output as the reference read implementation, which means I should check if the error is my code. But anyway thanks for the suggestion – MajorasKid Nov 20 '19 at 14:00
  • How do you intercept the 'read' ? – dash-o Nov 20 '19 at 17:51
  • It's a mechanism we developed for an internal project (thus, I cannot disclose it yet, sorry). But the arguments are extracted correctly, as I can just use printf to compare all their values with the parameters I use in my test file – MajorasKid Nov 21 '19 at 07:52
  • I'm voting to close this question as off-topic because without an MCVE it's impossible to know what's wrong. – Employed Russian Nov 22 '19 at 03:15
  • Running the test under `strace` may explain what's going on. But if the interception mechanism itself uses `ptrace`, then debugging this would be hard. – Employed Russian Nov 22 '19 at 03:16

0 Answers