
I have JSON content where, deeply nested, there is an array of numbers I want to extract. I'd like to not create the intermediate structs, so I tried the following:

// ... get f
let json = serde_json::from_reader::<_, serde_json::Value>(f)?;
let xs: Vec<(f64, f64)> = serde_json::from_value(json["subtree"][0])?;

This complains with

11 | serde_json::from_value(json["subtree"][0])?;
   |                        ^^^^^^^^^^^^^^^^^^^^^ move occurs because value has type `serde_json::value::Value`, which does not implement the `Copy` trait

If I clone, it works fine:

let xs: Vec<(f64, f64)> = serde_json::from_value(json["subtree"][0].clone())?;

But this seems unnecessary. I will not be using the rest of the structure. How do I achieve this without having to create the intermediate structs and without having to clone?

Listerone

3 Answers


Oh, missed the totally obvious.

// ... get f
let mut json = serde_json::from_reader::<_, serde_json::Value>(f)?;
let xs: Vec<(f64, f64)> = serde_json::from_value(json["subtree"][0].take())?;
Listerone

I'd probably use Value::pointer. An example:

use serde_json::json;

fn main() {
    let value = json!({
        "deeply": {
            "nested": {
                "array": [0, 1, 2, 3, 4, 5]
            }
        }
    });

    let numbers: Vec<u64> = value
        .pointer("/deeply/nested/array")
        .unwrap()
        .as_array()
        .unwrap()
        .iter()
        .map(|x| x.as_u64().unwrap())
        .collect();

    println!("{:?}", numbers);
}

NOTE: This example leans heavily on unwrap() to keep it short. Each of those calls panics if the path is missing or a value has an unexpected type, so handle the None cases in real code.


Aren’t you still constructing multiple vectors in this case?

No. Let's expand this whole machinery.

use serde_json::{json, Value};
use std::iter::Map;
use std::slice::Iter;

fn main() {
    let value: Value = json!({
        "deeply": {
            "nested": {
                "array": [0, 1, 2, 3, 4, 5]
            }
        }
    });

    // Option<&Value> - Option & reference
    //
    //    pub enum Option<T> {
    //        None,
    //        Some(T),
    //    }
    //
    // T = &Value - reference
    let maybe_value_ref: Option<&Value> = value.pointer("/deeply/nested/array");

    // &Value - reference
    let value_ref: &Value = maybe_value_ref.unwrap();

    // Option<&Vec<Value>> - Option & reference
    //
    //    pub enum Option<T> {
    //        None,
    //        Some(T),
    //    }
    //
    // T = &Vec<Value> - reference to Vec
    let maybe_vec_ref: Option<&Vec<Value>> = value_ref.as_array();

    // &Vec<Value> - reference
    let vec_ref: &Vec<Value> = maybe_vec_ref.unwrap();

    // Iter<Value> - a small struct created on the stack (no heap allocation)
    //
    //    pub struct Iter<'a, T: 'a> {
    //        ptr: *const T,
    //        end: *const T,
    //        _marker: marker::PhantomData<&'a T>,
    //    }
    //
    // .next() returns Option<&Value>
    let vec_ref_iter: Iter<Value> = vec_ref.iter();

    // Map<..., ...> - another small struct created on the stack (no heap allocation)
    //
    //    pub struct Map<I, F> {
    //        iter: I,
    //        f: F,
    //    }
    //
    // .next() returns Option<u64>
    let vec_ref_iter_map: Map<Iter<Value>, fn(&Value) -> u64> =
        vec_ref_iter.map(|x: &Value| x.as_u64().unwrap());

    // Nothing has been computed up to this point: only the Iter and Map
    // structs were created. Because iterators are lazy, we have to consume
    // the last Map (vec_ref_iter_map) to run the whole chain.
    //
    // What's going on (simplified)?
    //
    // * Vec implements FromIterator
    // * vec_ref_iter_map implements Iterator
    // * FromIterator consumes vec_ref_iter_map.next() till None (= end)
    // * vec_ref_iter_map.next() returns Option<u64>
    //   * it internally gets vec_ref_iter.next() value
    //   * if value is None then None is returned (= end)
    //   * if value is Some(x) then it applies .map() closure (x.as_u64().unwrap())
    //     and returns Some(closure result)
    //
    // The only Vec allocated here is the last one (numbers). No other Vecs
    // were allocated.
    let numbers: Vec<u64> = vec_ref_iter_map.collect();

    println!("{:?}", numbers);
}
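The laziness described in the comments can also be observed directly. The following standard-library-only sketch counts how often the closure runs before and after `collect()`; the function name `lazy_map_demo` and the doubling closure are made up for illustration and are not part of the example above.

```rust
use std::cell::Cell;

/// Hypothetical demo: returns (closure calls before collect,
/// closure calls after collect, collected values).
fn lazy_map_demo(source: &[u64]) -> (usize, usize, Vec<u64>) {
    let calls = Cell::new(0usize);

    // Building the Map adapter does not run the closure yet.
    let mapped = source.iter().map(|x| {
        calls.set(calls.get() + 1);
        x * 2
    });
    let calls_before_collect = calls.get();

    // collect() drives the iterator and allocates the only Vec.
    let doubled: Vec<u64> = mapped.collect();

    (calls_before_collect, calls.get(), doubled)
}

fn main() {
    let (before, after, doubled) = lazy_map_demo(&[0, 1, 2, 3]);
    println!("closure calls before collect: {}", before);
    println!("closure calls after collect: {}", after);
    println!("{:?}", doubled);
}
```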


zrzka
  • Aren’t you still constructing multiple vectors in this case? – Listerone Aug 22 '19 at 21:22
  • @Listerone see updated answer (end of it). Every step result stored to a temporary variable, annotated with type and some comments. Should help to understand what's going on. If not, feel free to ask. – zrzka Aug 23 '19 at 08:16

Since you mentioned in your question that you don't actually need the rest of the JSON data, maybe the Struson library and its seek_to method could be helpful for this. It allows positioning the JSON reader at the specified path, skipping all other values. This will likely be more efficient memory-wise than having to deserialize the complete JSON data as Value first before obtaining the relevant deeply nested portion.

You could then call Struson's JsonReader method begin_array to enter the enclosing JSON array and, for each element inside it, use begin_array, next_number (twice) and end_array to read one (f64, f64) pair (assuming the pairs look like [1.2, 3.4] in JSON).

// Assumes this is roughly the structure of your JSON data
let json = r#"
{
    "subtree": [
        [
            [1.2, 3.4],
            [-4.5, 6]
        ]
    ]
}     
"#;
// `std::io::Read` providing the JSON data; in this example the str bytes
let reader = json.as_bytes();

let mut xs: Vec<(f64, f64)> = Vec::new();

let mut json_reader = JsonStreamReader::new(reader);
json_reader.seek_to(&json_path!["subtree", 0])?;
json_reader.begin_array()?;

while json_reader.has_next()? {
    // Read the (f64, f64) values
    json_reader.begin_array()?;
    xs.push((
        json_reader.next_number()??,
        json_reader.next_number()??
    ));
    json_reader.end_array()?;
}

// Optionally consume the remainder of the JSON document
json_reader.skip_to_top_level()?;
json_reader.consume_trailing_whitespace()?;

println!("xs: {xs:?}");

Alternatively, if you really want Serde-like functionality to deserialize the Vec<(f64, f64)>, you could enable Struson's serde feature and use deserialize_next instead:

let json = r#"
{
    "subtree": [
        [
            [1.2, 3.4],
            [-4.5, 6]
        ]
    ]
}     
"#;
// `std::io::Read` providing the JSON data; in this example the str bytes
let reader = json.as_bytes();

let mut json_reader = JsonStreamReader::new(reader);
json_reader.seek_to(&json_path!["subtree", 0])?;

// Uses Serde's Deserialize implementation
let xs: Vec<(f64, f64)> = json_reader.deserialize_next()?;

// Optionally consume the remainder of the JSON document
json_reader.skip_to_top_level()?;
json_reader.consume_trailing_whitespace()?;

println!("xs: {xs:?}");

Disclaimer: I am the author of Struson, and currently it is still experimental (but feedback is highly appreciated!).

Marcono1234