Please fasten your seat belts, this is verbose.
I'm assuming you want to deserialize some JSON data
{"x": [[1, "a"], [2, "b"]]}
to some Rust struct
struct X {
    x: Vec<Vec<Value>>, // Value is some enum containing string/int/float…
}
all while
- transposing the elements of the inner lists while inserting into the vectors
- checking that the inner vector elements conform to some type passed to deserialization
- not doing any transient allocations
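To pin down what "transposing" and "type checking" mean here, this is a serde-free sketch of the target behavior. The Value and DataTypes definitions are my guess at what you have (two variants only, to keep it short), and has_type / push_row are hypothetical helpers, not part of serde. Note that this naive version takes each row as a Vec, which is exactly the transient allocation the serde code further down avoids.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Stri(String),
    Numb(f64),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum DataTypes {
    Stri,
    Numb,
}

/// Returns true if `value` is the variant that `typ` asks for.
fn has_type(value: &Value, typ: DataTypes) -> bool {
    matches!(
        (value, typ),
        (Value::Stri(_), DataTypes::Stri) | (Value::Numb(_), DataTypes::Numb)
    )
}

/// Appends one row to the columns: element i of the row goes to column i.
/// Assumes `cols.len() == types.len()`.
fn push_row(cols: &mut [Vec<Value>], types: &[DataTypes], row: Vec<Value>) -> Result<(), String> {
    debug_assert_eq!(cols.len(), types.len());
    if row.len() != types.len() {
        return Err(format!("expected {} elements, got {}", types.len(), row.len()));
    }
    // Validate the whole row first so that a bad row leaves the columns untouched.
    if let Some(i) = row.iter().zip(types).position(|(v, t)| !has_type(v, *t)) {
        return Err(format!("element {} is not a {:?}", i, types[i]));
    }
    for (col, value) in cols.iter_mut().zip(row) {
        col.push(value);
    }
    Ok(())
}
```

Feeding it the rows [1, "a"] and [2, "b"] with the type list [Numb, Stri] should leave you with the columns [1, 2] and ["a", "b"].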
At the start, you have to realize that you have three different types that you want to deserialize: X, Vec<Vec<Value>>, and Vec<Value>. (You don't need one for Value itself, because what you actually want to deserialize are strings and ints and whatnot, not Value.) So you need three deserializers and three visitors.
The innermost DeserializeSeed has a mutable reference to a Vec<Vec<Value>>, and distributes the elements of a single row such as [1, "a"], one to each Vec<Value>:
use serde::de::{DeserializeSeed, Deserializer, SeqAccess, Visitor};

struct ExtendVecs<'a>(&'a mut Vec<Vec<Value>>, &'a [DataTypes]);
impl<'de, 'a> DeserializeSeed<'de> for ExtendVecs<'a> {
    type Value = ();
    fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct ExtendVecVisitor<'a>(&'a mut Vec<Vec<Value>>, &'a [DataTypes]);
        impl<'de, 'a> Visitor<'de> for ExtendVecVisitor<'a> {
            type Value = ();
            // Required by Visitor; only used to build error messages
            fn expecting(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
                f.write_str("a list of values matching the type list")
            }
            fn visit_seq<A>(self, mut seq: A) -> Result<(), A::Error>
            where
                A: SeqAccess<'de>,
            {
                for (i, typ) in self.1.iter().enumerate() {
                    match typ {
                        // too_short checks for None and turns it into Err("expected more elements")
                        DataTypes::Stri => self.0[i].push(Value::Stri(too_short(self.1, seq.next_element::<String>())?)),
                        DataTypes::Numb => self.0[i].push(Value::Numb(too_short(self.1, seq.next_element::<f64>())?)),
                    }
                }
                // TODO: check all elements consumed
                Ok(())
            }
        }
        deserializer.deserialize_seq(ExtendVecVisitor(self.0, self.1))
    }
}
The middle DeserializeSeed constructs the Vec<Vec<Value>>, gives the innermost ExtendVecs access to it, and asks ExtendVecs to have a look at each of the inner lists of [[…], […]]:
struct TransposeVecs<'a>(&'a [DataTypes]);
impl<'de, 'a> DeserializeSeed<'de> for TransposeVecs<'a> {
    type Value = Vec<Vec<Value>>;
    fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct TransposeVecsVisitor<'a>(&'a [DataTypes]);
        impl<'de, 'a> Visitor<'de> for TransposeVecsVisitor<'a> {
            type Value = Vec<Vec<Value>>;
            // Required by Visitor; only used to build error messages
            fn expecting(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
                f.write_str("a list of lists")
            }
            fn visit_seq<A>(self, mut seq: A) -> Result<Vec<Vec<Value>>, A::Error>
            where
                A: SeqAccess<'de>,
            {
                // One (initially empty) output vector per entry in the type list
                let mut vec = Vec::new();
                vec.resize_with(self.0.len(), Vec::new);
                while let Some(()) = seq.next_element_seed(ExtendVecs(&mut vec, self.0))? {}
                Ok(vec)
            }
        }
        deserializer.deserialize_seq(TransposeVecsVisitor(self.0))
    }
}
Finally, the outermost DeserializeSeed is nothing special anymore; it just hands access to the type array down:
struct XD<'a>(&'a [DataTypes]);
impl<'de, 'a> DeserializeSeed<'de> for XD<'a> {
    type Value = X;
    fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct XV<'a>(&'a [DataTypes]);
        impl<'de, 'a> Visitor<'de> for XV<'a> {
            type Value = X;
            // Required by Visitor; only used to build error messages
            fn expecting(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
                f.write_str("struct X")
            }
            fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
            where
                A: serde::de::MapAccess<'de>,
            {
                let k = map.next_key::<String>()?;
                // TODO: check k = "x"
                Ok(X { x: map.next_value_seed(TransposeVecs(self.0))? })
            }
        }
        deserializer.deserialize_struct("X", &["x"], XV(self.0))
    }
}
Now you can seed the outermost DeserializeSeed with your desired type list and use it to deserialize one X, e.g.:
XD(&[DataTypes::Numb, DataTypes::Stri]).deserialize(
    &mut serde_json::Deserializer::from_str(r#"{"x": [[1, "a"], [2, "b"]]}"#)
)
Playground with all the left-out error handling
Side note: If you can (i.e. if the format you're deserializing is self-describing like JSON), I'd recommend doing the type checking after deserialization. Why? Because doing it during deserialization means that all deserializers up to the top one must be DeserializeSeed, and you can't use #[derive(Deserialize)]. If you do the type checking afterwards, you can #[derive(Deserialize)] on X, annotate the field with #[serde(deserialize_with = "TransposeVecs_deserialize_as_free_function")] x: Vec<Vec<Value>>, and save half of the cruft in this post.
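For the check-afterwards variant, the type check itself becomes a plain function over the already transposed columns. A minimal sketch, again assuming a two-variant Value and a matching DataTypes (both are my stand-ins, and check_columns is a hypothetical name); how Value gets deserialized in the first place (e.g. as an untagged enum) is left out here:

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Stri(String),
    Numb(f64),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum DataTypes {
    Stri,
    Numb,
}

/// Checks that column i contains only values of type `types[i]`.
fn check_columns(cols: &[Vec<Value>], types: &[DataTypes]) -> Result<(), String> {
    if cols.len() != types.len() {
        return Err(format!("expected {} columns, got {}", types.len(), cols.len()));
    }
    for (i, (col, typ)) in cols.iter().zip(types).enumerate() {
        for (row, value) in col.iter().enumerate() {
            let ok = matches!(
                (value, typ),
                (Value::Stri(_), DataTypes::Stri) | (Value::Numb(_), DataTypes::Numb)
            );
            if !ok {
                return Err(format!(
                    "row {}, column {}: expected {:?}, got {:?}",
                    row, i, typ, value
                ));
            }
        }
    }
    Ok(())
}
```

You'd call this on the x field right after deserializing X. The trade-off is that a type error is only reported once the whole input has been parsed, rather than at the offending element.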