I have a general n-dimensional array type I implemented with:
pub struct NdArray<T: Clone, const N: usize> {
    pub shape: [usize; N],
    pub data: Vec<T>,
}

impl<T: Clone, const N: usize> NdArray<T, N> {
    // Creates a new NdArray from a slice of
    // values with a given shape (copies the slice)
    pub fn new(array: &[T], shape: [usize; N]) -> Self {
        NdArray {
            shape,
            data: array.to_vec(),
        }
    }

    // Creates a new NdArray from a `Vec` of
    // values with a given shape (takes ownership, no copy)
    pub fn from(array: Vec<T>, shape: [usize; N]) -> Self {
        NdArray { shape, data: array }
    }
}
I am now implementing arithmetic operations for it (e.g. addition, subtraction, multiplication). The problem is that my implementation uses a lot of clones, such as this one for elementwise addition:
use std::ops::Add;

impl<T: Clone + Add<Output = T>, const N: usize> Add<&NdArray<T, N>> for &NdArray<T, N> {
    type Output = NdArray<T, N>;

    fn add(self, rhs: &NdArray<T, N>) -> Self::Output {
        assert_eq!(self.shape, rhs.shape);
        // Both operands are cloned for every element
        let sum_vec = self
            .data
            .iter()
            .zip(&rhs.data)
            .map(|(a, b)| a.clone() + b.clone())
            .collect();
        NdArray::from(sum_vec, self.shape)
    }
}
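For comparison, here is a minimal sketch of what a clone-free version of that addition could look like. It assumes the element type supports adding by reference (`&T + &T -> T`), which holds for the primitive numeric types but is an extra requirement for custom element types. The struct is redefined without the `Clone` bound only so the sketch compiles on its own; it is one possible direction, not necessarily the best one.

```rust
use std::ops::Add;

// Same layout as the struct above, redefined so this sketch is self-contained.
// The `T: Clone` bound is dropped here on purpose (an assumption of the sketch).
pub struct NdArray<T, const N: usize> {
    pub shape: [usize; N],
    pub data: Vec<T>,
}

// Instead of `T: Clone + Add<Output = T>`, require that two references
// to `T` can be added to produce a new `T`.
impl<'a, T, const N: usize> Add<&'a NdArray<T, N>> for &'a NdArray<T, N>
where
    &'a T: Add<&'a T, Output = T>,
{
    type Output = NdArray<T, N>;

    fn add(self, rhs: &'a NdArray<T, N>) -> Self::Output {
        assert_eq!(self.shape, rhs.shape);
        // `a` and `b` are `&T`, so no element is cloned; only the
        // output buffer is allocated.
        let data = self
            .data
            .iter()
            .zip(&rhs.data)
            .map(|(a, b)| a + b)
            .collect();
        NdArray { shape: self.shape, data }
    }
}
```

With this bound, `&a + &b` still works for arrays of `i32`, `f64`, and so on. Note that for `Copy` element types `clone()` is already a plain copy, so this kind of change mainly matters for heap-backed element types.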
And the same with several other operations:
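use std::iter::{Product, Sum};

impl<T: Clone, const N: usize> NdArray<T, N> {
    // Panics on an empty array; clones the largest element
    pub fn max(&self) -> T
    where
        T: Ord,
    {
        self.data.iter().max().unwrap().clone()
    }

    // Panics on an empty array; clones the smallest element
    pub fn min(&self) -> T
    where
        T: Ord,
    {
        self.data.iter().min().unwrap().clone()
    }

    // Clones every element before summing
    pub fn sum(&self) -> T
    where
        T: Clone + Sum,
    {
        self.data.iter().cloned().sum()
    }

    // Clones every element before multiplying
    pub fn product(&self) -> T
    where
        T: Clone + Product,
    {
        self.data.iter().cloned().product()
    }
}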
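As a point of comparison, the reductions above could in principle work on references instead of cloned values. The sketch below assumes `T: Sum<&T>` and `T: Product<&T>` (the standard library provides these for the primitive numeric types) and returns `Option<&T>` from the maximum instead of a cloned `T`. The method names `max_ref`, `sum_by_ref`, and `product_by_ref` are hypothetical, chosen only for this illustration, and the struct is redefined without the `Clone` bound so the block stands alone.

```rust
use std::iter::{Product, Sum};

// Same layout as the struct above, redefined so this sketch is self-contained.
pub struct NdArray<T, const N: usize> {
    pub shape: [usize; N],
    pub data: Vec<T>,
}

impl<T, const N: usize> NdArray<T, N> {
    // Returns a reference to the largest element, or `None` if the
    // array is empty. No clone and no panic.
    pub fn max_ref(&self) -> Option<&T>
    where
        T: Ord,
    {
        self.data.iter().max()
    }

    // Sums the elements through references; requires `T: Sum<&T>`.
    pub fn sum_by_ref<'a>(&'a self) -> T
    where
        T: Sum<&'a T>,
    {
        self.data.iter().sum()
    }

    // Multiplies the elements through references; requires `T: Product<&T>`.
    pub fn product_by_ref<'a>(&'a self) -> T
    where
        T: Product<&'a T>,
    {
        self.data.iter().product()
    }
}
```

A caller that really needs an owned maximum can still write `arr.max_ref().cloned()` (given `T: Clone`), which clones a single element rather than being forced into it by the API.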
This makes it very slow when handling large arrays, which is problematic.
I've already tried to fix this by using a BLAS library for certain array operations, but there isn't a BLAS function for many elementwise ops (e.g. addition), so I'm pretty lost as to what to do.
How can I minimize clone usage in my ndarray library?
And to answer the obvious: yes, I know ndarray exists; this is an educational exercise.