I am passing strings from user space to my BPF code, and I was wondering whether it is possible to go beyond the size limit imposed by the char array in my struct. Is it possible to put my lines into a map one by one and bypass the stack size limit? This is how I pass in my strings from Python:

import ctypes
from bcc import BPF


b = BPF(src_file="TP-4091lite.c")

lookupTable = b["lookupTable"]
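ARRAYSIZE = 64  # must match ARRAYSIZE in TP-4091lite.c

# Add the lines of hello.csv to the lookupTable array, one line per entry
with open("hello.csv", "r") as f:
    contents = f.readlines()
# The map only has ARRAYSIZE entries, so extra lines are skipped
for i, line in enumerate(contents[:ARRAYSIZE]):
    # Truncate to the value size, leaving one byte for the NUL terminator
    string = line.encode('utf-8')[:ARRAYSIZE - 1]
    print(len(string))
    # Pad the buffer to the full value size, since the kernel copies ARRAYSIZE bytes
    lookupTable[ctypes.c_int(i)] = ctypes.create_string_buffer(string, ARRAYSIZE)
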
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="TP4091")
b.trace_print()

This reads my CSV line by line, prints the length of each line, and stores the line in my BPF map. My C code is here:

#include <uapi/linux/bpf.h>
#define ARRAYSIZE 64

struct data_t {
    char buf[ARRAYSIZE];
};

struct char_t {
    char c[ARRAYSIZE];
};

BPF_ARRAY(lookupTable, struct data_t, ARRAYSIZE);
BPF_ARRAY(characterTable, struct char_t, ARRAYSIZE);

//find substring in a string
static bool isSubstring(struct data_t stringVal)
{
    char substring[] = "New York";
    int M = sizeof(substring) - 1;
    int N = sizeof(stringVal.buf) - 1;
 
    /* Slide the pattern over the buffer one position at a time */
    for (int i = 0; i <= N - M; i++) {
        int j;
 
        /* For the current index i, check for a pattern match */
        for (j = 0; j < M; j++) {
            if (stringVal.buf[i + j] != substring[j]){
                break;
            }
        }
 
        if (j == M) {
            return true;
        }
    }
 
    return false;
}

int TP4091(void *ctx)
{
    for (int i = 0; i < ARRAYSIZE; i++) {
        int k = i; /* lookup() takes a pointer to the key */
        struct data_t *line = lookupTable.lookup(&k);
        if (line) {
            // bpf_trace_printk("%s\n", line->buf);
            if (isSubstring(*line) == true) {
                bpf_trace_printk("%s\n", line->buf);
            }

        }
    }
    return 0;
}

There is still a size limit when putting in arbitrary values, and I want to see how far I can push it.
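
For illustration, here is a minimal user-space sketch of what "putting lines in piece by piece" could look like: a long line is split into fixed-size fragments that are chained through an index. The `Fragment` struct, its `next` field, and the `split_line` helper are hypothetical names, not part of bcc, and the map on the BPF side would need a matching struct as its value type. This only addresses the per-entry value size, not the kernel stack limit.

```python
import ctypes

ARRAYSIZE = 64  # assumed to match ARRAYSIZE in the C code


class Fragment(ctypes.Structure):
    """Hypothetical map value: one chunk of a line plus the index of the next chunk."""
    _fields_ = [
        ("next", ctypes.c_int),              # index of the next fragment, -1 if last
        ("buf", ctypes.c_char * ARRAYSIZE),  # NUL-terminated chunk of the line
    ]


def split_line(line, first_index):
    """Split one encoded line into Fragment values chained through `next`."""
    payload = ARRAYSIZE - 1  # keep one byte for the NUL terminator
    chunks = [line[i:i + payload] for i in range(0, len(line), payload)] or [b""]
    fragments = []
    for n, chunk in enumerate(chunks):
        frag = Fragment()
        frag.buf = chunk
        frag.next = first_index + n + 1 if n + 1 < len(chunks) else -1
        fragments.append(frag)
    return fragments
```

Each fragment could then be stored under consecutive keys starting at `first_index`. A substring that straddles two fragments would not be found by a per-entry search, so the matching code would have to follow `next` as well.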

  • Not clear what you want to bypass exactly, do you mean on kernel side or user space side? From user space you can't go over the max limit for an entry, but you could maybe split your lines over several entries and implement some sort of list where your value is a string fragment + pointer to next fragment (not sure how easy to process this would end up, especially if you try to match random substrings). On kernel side I'm not sure what you can do to help with the stack size. – Qeole Aug 07 '21 at 18:30
  • Do you really need all your CSV lines in the map though? Wouldn't it make sense to parse it in user space and fill your map with structured data (a struct with one field for each column of your CSV line?), so you can process it more easily in eBPF for example? – Qeole Aug 07 '21 at 18:31
  • This is just a test program for now, but in the future I would like to do some processing within BPF so that I don't have to process some of the junk in user space, hence the question. The question came from the answer to this Stack Overflow post, about how one can increase the size limit even beyond what was provided: https://stackoverflow.com/questions/68578415/is-there-a-string-size-limit-when-sending-strings-back-to-bpf-code-and-back-to-u/68639317?noredirect=1#comment121308817_68639317 – maxterthrowaway Aug 08 '21 at 20:49
  • Could you describe exactly the use case for processing strings with BPF in the kernel instead of processing in userspace? – pchaigno Aug 09 '21 at 12:28
  • This is mainly for my testing of CSDs. Writing BPF code gives me easy access to the kernel, and I would like to see how far I can push some of my code to run within the kernel. – maxterthrowaway Aug 09 '21 at 15:55
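
The suggestion above to parse the CSV in user space and store structured data could be sketched roughly as follows. The column layout (`name`, `city`), the field sizes, and the `Row` and `parse_rows` names are assumptions made for illustration, since the actual contents of hello.csv are not shown; the BPF map would need a matching struct as its value type.

```python
import csv
import ctypes
import io


class Row(ctypes.Structure):
    """Hypothetical map value with one fixed-size field per CSV column."""
    _fields_ = [
        ("name", ctypes.c_char * 32),
        ("city", ctypes.c_char * 32),
    ]


def parse_rows(text):
    """Parse CSV text into Row values, truncating each field to fit."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) < 2:
            continue  # skip blank or malformed lines
        row = Row()
        row.name = record[0].encode("utf-8")[:31]  # leave room for the NUL
        row.city = record[1].encode("utf-8")[:31]
        rows.append(row)
    return rows
```

With this layout the BPF program would only need to compare one short field against the pattern instead of scanning a whole line.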