I am writing a syscall wrapper for Python (just as a fun project to get used to the API),
and now that I have come to implementing read(), I am perplexed as to how I can modify a Python buffer that is passed to my function.

The function is a simple CPython wrapper around the read() syscall.
It takes an integer (the file descriptor), a buffer, and the maximum number of bytes to read, then returns the number of bytes read.

I have everything working except for the modification of the buffer:

py_obj py_read(py_obj self, py_obj args){
    char* buff;
    int fd;
    int len;

    if(!PyArg_ParseTuple(args, "isi", &fd, &buff, &len)){
        return NULL;
    }

    return Py_BuildValue("i", read(fd, buff, len));
}

After loading the module, then calling read:

>> from syscalls import read
>> STDIN = 1
>> s = ""
>> read(STDIN,s, 256)
napkin
7
>> s
""

Though this is what I expected (and is what should happen, since I did not have an actual reference to the argument), I would like to know how to get a reference to the parameter.

EDIT: After applying @user2357112's answer, it still does not modify the value:

>>> b = memoryview(b"")
>>> from syscalls import *
>>> read(1, b, 10)
test
5
>>> b
<memory at 0x7fa060628408>
>>> b.tolist()
[]
>>>

EDIT 2: It does work with a bytearray, as long as I size it correctly. Thank you, @user2357112.

Shipof123
  • This question is way too broad. It isn't clear what you've tried, what your desired outcome is or what the problem is! Please consult https://stackoverflow.com/help/how-to-ask and https://stackoverflow.com/help/mcve. – Karl Dec 22 '18 at 07:52
  • Thank you @Karl for notifying me! I realize it was way too unclear. Knowing that, I have updated the question and provided examples to clarify. – Shipof123 Dec 22 '18 at 18:32

1 Answer

You have a reference to the argument. You may have just corrupted the argument object or the memory surrounding it, in fact. You don't have a reference to the caller's s variable, but variables and references don't work like that in Python anyway; references always refer to objects.
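The difference between a reference to an object and a reference to the caller's variable can be seen from the Python side. This is only an illustrative sketch; the function names `rebind` and `mutate` are hypothetical and not part of the question's module.

```python
def rebind(buf):
    # Assigning to the parameter only changes the local name;
    # the caller's variable still refers to the original object.
    buf = b"napkin\n"


def mutate(buf):
    # Slice assignment changes the object itself, and the caller
    # holds a reference to that same object.
    buf[:7] = b"napkin\n"


s = ""
rebind(s)

b = bytearray(7)
mutate(b)

print(repr(s), b)
```

After this runs, `s` is still the empty string, while `b` holds the new bytes because the bytearray object itself was modified.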

Python string objects aren't appropriate for use as mutable buffers. They're supposed to be immutable, after all. Also, they're Unicode, and read reads bytes. Instead, use an appropriately-sized bytearray and view its contents through a Py_buffer structure with the y* format code.
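As a rough check of which objects can serve as a writable buffer, the following sketch inspects a few candidates through `memoryview`. The names `candidates`, `report`, and `str_has_buffer` are made up for the illustration.

```python
candidates = {
    "bytes": b"",
    "bytearray": bytearray(10),
}

report = {}
for name, obj in candidates.items():
    view = memoryview(obj)
    # readonly says whether writes are allowed; len(view) is the room available.
    report[name] = (view.readonly, len(view))
    view.release()

# str does not expose the buffer interface at all.
try:
    memoryview("")
    str_has_buffer = True
except TypeError:
    str_has_buffer = False

print(report, str_has_buffer)
```

An empty bytes object is read-only and has no room for any data, which is consistent with `memoryview(b"")` still being empty after the call, whereas `bytearray(10)` is writable and has space for ten bytes.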

Also, since read returns ssize_t rather than int, you should use the n format code rather than i. n corresponds to Py_ssize_t, which is intended to match ssize_t when ssize_t exists.

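PyObject *my_read(PyObject *self, PyObject *args){
    Py_buffer buff;
    int fd;
    Py_ssize_t len;

    /* y* fills a Py_buffer view of the caller's bytes-like object. */
    if(!PyArg_ParseTuple(args, "iy*n", &fd, &buff, &len)){
        return NULL;
    }

    /* Refuse read-only buffers (e.g. bytes) and lengths the buffer cannot hold. */
    if(buff.readonly || len < 0 || len > buff.len){
        PyBuffer_Release(&buff);
        PyErr_SetString(PyExc_ValueError, "need a writable buffer of at least len bytes");
        return NULL;
    }

    ssize_t read_count = read(fd, buff.buf, len);
    PyBuffer_Release(&buff);

    /* n converts a Py_ssize_t, matching read's ssize_t return type. */
    return Py_BuildValue("n", read_count);
}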
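The calling convention this function aims for (file descriptor, caller-supplied buffer, maximum length, byte count returned) can be tried out in pure Python before building the extension. The sketch below is not the C wrapper itself: `read_into` is a hypothetical helper built on the standard library's raw file `readinto`, it assumes a non-negative `max_len`, and a pipe stands in for the terminal so the example runs unattended.

```python
import os


def read_into(fd, buf, max_len):
    """Hypothetical helper: read up to max_len bytes from fd into buf."""
    # Slicing a memoryview caps the write at min(max_len, len(buf)) bytes.
    window = memoryview(buf)[:max_len]
    # closefd=False leaves ownership of fd with the caller.
    with open(fd, "rb", buffering=0, closefd=False) as raw:
        return raw.readinto(window)


# A pipe stands in for stdin so the example needs no keyboard input.
r, w = os.pipe()
os.write(w, b"napkin\n")
os.close(w)

buf = bytearray(256)  # sized up front, as the C interface requires
count = read_into(r, buf, 256)
os.close(r)

data = bytes(buf[:count])
print(count, data)
```

The bytearray keeps its full size of 256; only the first `count` bytes are meaningful, which is why the result is sliced before use.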
user2357112
  • @Mememyselfandaverycreepy: I could've sworn I had the `.buf`. Fixing... – user2357112 Dec 23 '18 at 04:58
  • Still not modifying it (using memoryviewer) – Shipof123 Dec 23 '18 at 05:34
  • @Mememyselfandaverycreepy: I don't know what you mean by "memoryviewer". Did you size the bytearray appropriately? – user2357112 Dec 23 '18 at 05:35
  • Is there a way to make it so they don't have to initialize the bytearray for sizing purposes? – Shipof123 Jan 25 '19 at 00:23
  • @Mememyselfandaverycreepy: Yes, but you should consider your design objectives first. If your goal is to have a very direct translation between the Python interface and the C `read` function, then you should require the user to provide an appropriately-sized buffer, just like what the C interface requires. If you want a convenient Python interface, then why have the user provide a buffer at all? You could just make your own bytearray - or you could just use the existing [`os.read`](https://docs.python.org/3/library/os.html#os.read), or you could just use a file object the usual way. – user2357112 Jan 25 '19 at 00:32
  • I see your point, is that the only way to allocate a buffer in Python? – Shipof123 Jan 25 '19 at 23:21
  • You could technically use other objects that expose the buffer interface, but a bytearray is probably the most appropriate. – user2357112 Jan 26 '19 at 23:30
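For the convenient interface mentioned in the comments, the existing `os.read` already avoids a caller-supplied buffer by returning a new bytes object. A minimal sketch, again using a pipe in place of a terminal:

```python
import os

r, w = os.pipe()
os.write(w, b"test\n")
os.close(w)

# os.read allocates and returns a new bytes object of at most 10 bytes,
# so the caller never has to size a buffer.
chunk = os.read(r, 10)
os.close(r)

print(chunk)
```

The trade-off is the one described above: this is more convenient than the C-style signature, but it is no longer a direct translation of `read(fd, buf, len)`.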