
I developed two functions. The first, a Cartesian product, takes sets and generates a vector of tuples with all possible combinations; I then take a small sample of that vector. The second, searchsorted, takes the two tuple vectors as inputs and returns, for each sample tuple, its index in the larger vector.

The problem is that my functions are sadly too slow (see the benchmark below).

I would appreciate your feedback very much. What can I do to improve the execution time? Please, bear in mind that this is my first Rust project; I am a Rust newbie :)

Details of the approach:

Ready to run -> code repository here: https://github.com/cdgaete/searchsorted

The idea is the following: we have a table in long format (the sample):

dim1 dim2 dim3 value
'a1' 'b2' 'c1' 10
'a2' 'b3' 'c2' 20

Every dimension has a known domain:

dim1 = {'a1', 'a2'}

dim2 = {'b1','b2','b3'}

dim3 = {'c1','c2'}

I want to create a table with all possible combinations (full) and then allocate the values of the sample table into the full one.

So, my approach is the following:

In the first step, I create an array of tuples (which looks like a long-format table) with all possible combinations (this is the Cartesian product function). In the second step, I find the location of each sample tuple (the table above) within the full array so that, later on, I can insert its value into the full table.

Step 1: cartesian product (full)

dim1 dim2 dim3
1 'a1' 'b1' 'c1'
2 'a1' 'b1' 'c2'
3 'a1' 'b2' 'c1'
4 'a1' 'b2' 'c2'
5 'a1' 'b3' 'c1'
6 'a1' 'b3' 'c2'
7 'a2' 'b1' 'c1'
8 'a2' 'b1' 'c2'
9 'a2' 'b2' 'c1'
10 'a2' 'b2' 'c2'
11 'a2' 'b3' 'c1'
12 'a2' 'b3' 'c2'

Step 2: searchsorted

dim1 dim2 dim3 value index in full table
'a1' 'b2' 'c1' 10 3
'a2' 'b3' 'c2' 20 12

Summary:

  • cartesian product input: the sets of dimension values
  • cartesian product output: a vector of tuples
  • searchsorted input: the full table and the sample table (both vectors of tuples)
  • searchsorted output: a vector of integers (the indices)
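For reference, the two steps above can be checked end to end with a small dependency-free sketch. Note that it uses 0-based indices, so the rows shown as 3 and 12 in the 1-based tables above come out as 2 and 11:

```rust
use std::collections::HashMap;

fn main() {
    let d1 = ["a1", "a2"];
    let d2 = ["b1", "b2", "b3"];
    let d3 = ["c1", "c2"];

    // Step 1: full table (Cartesian product) via plain nested loops.
    let mut full = Vec::new();
    for a in &d1 {
        for b in &d2 {
            for c in &d3 {
                full.push((*a, *b, *c));
            }
        }
    }
    assert_eq!(full.len(), 12);

    // Step 2: map each full-table row to its position, then look the sample up.
    let htbl: HashMap<_, _> = full.iter().enumerate().map(|(i, t)| (t, i)).collect();
    let sample = [("a1", "b2", "c1"), ("a2", "b3", "c2")];
    let location: Vec<usize> = sample.iter().map(|t| htbl[t]).collect();

    // 0-based: these are rows 3 and 12 in the 1-based tables above.
    assert_eq!(location, vec![2, 11]);
    println!("{:?}", location);
}
```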

Functions

Cartesian product:

use itertools::iproduct; // external crate: itertools

type TS3 = (String, String, String);

pub fn cartesian_3d(l1: Vec<String>, l2: Vec<String>, l3: Vec<String>) -> Vec<TS3> {
    let mut collector = Vec::new();
    for tuple in iproduct!(l1, l2, l3) {
        collector.push(tuple);
    }
    collector
}
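For comparison, here is a dependency-free sketch of the same function. It borrows slices instead of taking ownership (the name cartesian_3d_ref is made up for this example) and preallocates the exact output capacity; whether this helps in practice needs measuring:

```rust
type TS3 = (String, String, String);

// Sketch only: same result as cartesian_3d, but borrowing the inputs
// and preallocating the exact capacity up front.
pub fn cartesian_3d_ref(l1: &[String], l2: &[String], l3: &[String]) -> Vec<TS3> {
    let mut out = Vec::with_capacity(l1.len() * l2.len() * l3.len());
    for a in l1 {
        for b in l2 {
            for c in l3 {
                // Cloning three Strings per row is the dominant cost here.
                out.push((a.clone(), b.clone(), c.clone()));
            }
        }
    }
    out
}

fn main() {
    let to_vec = |xs: &[&str]| xs.iter().map(|s| s.to_string()).collect::<Vec<_>>();
    let d1 = to_vec(&["a1", "a2"]);
    let d2 = to_vec(&["b1", "b2", "b3"]);
    let d3 = to_vec(&["c1", "c2"]);
    let full = cartesian_3d_ref(&d1, &d2, &d3);
    assert_eq!(full.len(), 12);
    assert_eq!(full[2], ("a1".to_string(), "b2".to_string(), "c1".to_string()));
}
```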

Searchsorted:

use std::collections::HashMap;

type TS3 = (String, String, String);

pub fn searchsorted_3d(dense_list: Vec<TS3>, index_list: Vec<TS3>) -> Vec<i64> {
    let mut htbl = HashMap::new();
    let mut i: i64 = 0;
    for key in dense_list.iter() {
        htbl.insert(key, i);
        i += 1;
    }
    let mut location: Vec<i64> = Vec::new();
    for tuple in index_list.iter() {
        let value = htbl.get(tuple).unwrap();
        location.push(*value);
    }
    location
}
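A sketch of the same lookup in a more idiomatic shape (the name searchsorted_3d_ref is made up for this example): it borrows the inputs instead of consuming them, replaces the manual counter with enumerate(), and returns None instead of panicking when a sample tuple is missing from the full table:

```rust
use std::collections::HashMap;

type TS3 = (String, String, String);

// Sketch only: same behavior as searchsorted_3d on valid input,
// but with borrowed slices and no unwrap().
pub fn searchsorted_3d_ref(dense_list: &[TS3], index_list: &[TS3]) -> Option<Vec<i64>> {
    let htbl: HashMap<&TS3, i64> = dense_list
        .iter()
        .enumerate()
        .map(|(i, key)| (key, i as i64))
        .collect();
    // collect() over Option<i64> items yields None if any lookup fails.
    index_list.iter().map(|t| htbl.get(t).copied()).collect()
}

fn main() {
    let s = |a: &str, b: &str, c: &str| (a.to_string(), b.to_string(), c.to_string());
    let full = vec![s("a1", "b1", "c1"), s("a1", "b2", "c1"), s("a2", "b3", "c2")];
    let sample = vec![s("a1", "b2", "c1"), s("a2", "b3", "c2")];
    assert_eq!(searchsorted_3d_ref(&full, &sample), Some(vec![1, 2]));
    assert_eq!(searchsorted_3d_ref(&full, &[s("zz", "zz", "zz")]), None);
}
```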

Benchmark

The files eg2.py and eg2.rs in the examples folder contain the benchmark code:

  • full vector: a million tuples of five strings each
  • sample vector: 1,000 tuples
  • each string in a tuple has two characters

Results:

Cartesian product:

Rust-Python   eTime: 1342697 μs
Pure Rust     eTime:  246470 μs
Pure Rust     eTime:  140097 μs (cargo --release)
Pure Python   eTime:   84879 μs

searchsorted:

Rust-Python   eTime: 2599270 μs
Pure Rust     eTime: 2015062 μs
Pure Rust     eTime:  678256 μs (cargo --release)
Pure Python   eTime:  103814 μs

Code for pure Python:

Cartesian product: the itertools module

list(itertools.product(lst1,lst2,lst3,lst4,lst5))

searchsorted: dictionary and list comprehension

def pysearchsorted(full_list, sample_list):
    fullhashtable = {tupl: idx for idx, tupl in enumerate(full_list)}
    return [fullhashtable[tupl] for tupl in sample_list]

Thanks for your support!

Carlos_G
  • If you have working code that you want to have peer reviewed for improvements, your question belongs on [codereview.se], which was created for exactly that purpose. Also, this is a question and answer site, not a multiple question site. You have two separate questions, one for each function, which means they belong in separate posts. This is covered in the [tour] and [help]. – Ken White Dec 02 '22 at 01:37
  • You did run your code in release mode, right? Otherwise this is a duplicate of https://stackoverflow.com/q/25255736/5397009 – Jmb Dec 02 '22 at 07:37
  • Thanks, @Jmb it improved quite significantly the performance. I edited the post, including the new time. – Carlos_G Dec 02 '22 at 15:14
  • Note that on [codereview.se], you can ask for `comparative-review` using tags. I recommend you read [A guide to Code Review for Stack Overflow users](//codereview.meta.stackexchange.com/a/5778), as some things are done differently over there - e.g. question titles should simply say what the code *does*, as the question is always, "How can I improve this?". Be sure that the code works correctly; include your unit tests if possible. You'll likely get some suggestions on making it more efficient, easier to read, and better tested. – Toby Speight Dec 02 '22 at 15:36

1 Answer


There are two questions here:

  1. How to make the existing implementation faster than Python.
  2. How to achieve better overall performance.

A few ideas:

Regarding 1: instead of let mut collector = Vec::new();, calculate the capacity first (multiply the lengths of the three input lists) and create the vector with Vec::with_capacity. That way you avoid repeated resizing.

The same logic applies to let mut htbl = HashMap::new(); and let mut location: Vec<i64> = Vec::new();: use HashMap::with_capacity and Vec::with_capacity there as well.
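The preallocation suggestion can be sketched like this (the sizes here are hypothetical stand-ins for the real input lengths):

```rust
use std::collections::HashMap;

fn main() {
    // Hypothetical sizes standing in for l1.len(), l2.len(), l3.len().
    let (n1, n2, n3) = (10usize, 10, 10);
    let cap = n1 * n2 * n3;

    // Each container is sized once up front instead of growing by reallocation.
    let collector: Vec<(String, String, String)> = Vec::with_capacity(cap);
    let htbl: HashMap<(String, String, String), i64> = HashMap::with_capacity(cap);
    let location: Vec<i64> = Vec::with_capacity(1000); // sample size

    // with_capacity guarantees at least the requested capacity.
    assert!(collector.capacity() >= cap);
    assert!(htbl.capacity() >= cap);
    assert!(location.capacity() >= 1000);
}
```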

Regarding 2: I see a lot of redundancy in calculating the full table first, and Cartesian products grow very fast, so memory will suffer too. Why not go directly to the end result? First collect the domain lists list_1, list_2, ... into hash maps so that each value maps to its index. Then, for each row of the index list, look up the index of each column value (index_1, index_2, index_3, ...) and compute the final value as index_n + index_{n-1} * len_n + index_{n-2} * len_n * len_{n-1} + ..., where len_n is the length of list_n. The factors len_n, len_n * len_{n-1}, ... should be precomputed before the loop. Overall, I suppose this will be faster whenever the index list is significantly smaller than the full Cartesian product.
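A minimal sketch of this direct approach for the three-dimension case (the function name direct_index_3d is made up for this example; it assumes 0-based indices and that every sample value exists in its domain):

```rust
use std::collections::HashMap;

// Map each domain value to its position in the domain list.
fn pos<'a>(l: &[&'a str]) -> HashMap<&'a str, i64> {
    l.iter().enumerate().map(|(i, v)| (*v, i as i64)).collect()
}

// Sketch: compute the row index in the would-be full table arithmetically,
// without ever materializing the Cartesian product.
fn direct_index_3d(
    l1: &[&str],
    l2: &[&str],
    l3: &[&str],
    sample: &[(&str, &str, &str)],
) -> Vec<i64> {
    let (p1, p2, p3) = (pos(l1), pos(l2), pos(l3));
    // Strides: one step in dim1 skips len2*len3 rows, one step in dim2 skips len3.
    let s1 = (l2.len() * l3.len()) as i64;
    let s2 = l3.len() as i64;
    sample
        .iter()
        .map(|(a, b, c)| p1[a] * s1 + p2[b] * s2 + p3[c])
        .collect()
}

fn main() {
    let l1 = ["a1", "a2"];
    let l2 = ["b1", "b2", "b3"];
    let l3 = ["c1", "c2"];
    let sample = [("a1", "b2", "c1"), ("a2", "b3", "c2")];
    // Matches the 0-based indices the hash-map approach produces.
    assert_eq!(direct_index_3d(&l1, &l2, &l3, &sample), vec![2, 11]);
}
```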

Nikolay Zakirov
  • Hi @Nikolay, thank you very much for your support. I tested with_capacity, and the time increased by ~15%. I agree that the approach could be different, and your second comment sheds light on where to go. This will require new code. I am also thinking of using integers instead of strings: string -> integer -> perform several operations -> strings back again. I will post the new approach on Code Review; I hope you are also there. Thanks – Carlos_G Dec 02 '22 at 15:24
  • Great, post a link please – Nikolay Zakirov Dec 04 '22 at 06:36