I'm having trouble wrapping my head around this. Here's my first take, using some of the built-in JSON functions:
What I've tried so far
SELECT json_build_object(
         'id', MD5(p.id),
         'firstName', p.first_name,
         'lastName', p.last_name,
         'location', json_build_object('city', pl.city, 'state', pl.state))
FROM person p
LEFT JOIN (SELECT id, city, state
           FROM person_location) pl ON pl.id = p.id
LIMIT 10;
This works well enough for the small example and returns 10 rows; if I drop the LIMIT, I get all of my rows. Perfect. However, this is meant to power a view that feeds data directly into Logstash, and there's custom business logic that cleans up the fields and performs some other light tasks. That's what led me to try plv8.
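To make the question concrete, here's roughly the shape of the cleanup I have in mind, prototyped as a plain JavaScript function outside plv8. The specific rules here (trimming and title-casing names) are placeholders, not my real business logic:

```javascript
// Hypothetical per-row cleanup, prototyped outside plv8.
// The exact rules (trim, title-case) stand in for the real business logic.
function cleanRow(row) {
  const titleCase = (s) =>
    (s || '').trim().toLowerCase().replace(/\b\w/g, (c) => c.toUpperCase());
  return {
    id: row.id,
    firstName: titleCase(row.first_name),
    lastName: titleCase(row.last_name),
    location: { city: row.city, state: row.state },
  };
}

// Example: a raw row in, a cleaned search document out.
console.log(JSON.stringify(cleanRow({
  id: 'abc123', first_name: '  jane ', last_name: 'DOE',
  city: 'Austin', state: 'TX',
})));
```

The idea is that each database row would pass through something like this before being returned as a JSON document.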
plv8 method:
CREATE OR REPLACE FUNCTION generate_search_documents()
RETURNS SETOF person_test AS
$$
  const _ = require('lodash');
  const people = plv8.execute('SELECT * FROM person LIMIT 10');
  for (const person of people) {
    plv8.return_next(person);
  }
$$
LANGUAGE plv8;
And I'm already immediately lost. What I'm trying to do is select the tables I need with the proper joins, clean up the data, build my JSON object, and return one document per row. In the first method, it's clear that within the query I'm working in the context of a single person. In the plv8 method, it appears I'm executing a query that pulls all 80M people in our table into memory at once, which won't scale.
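From reading the plv8 docs, I suspect the direction is a prepared plan plus a cursor, so rows are fetched and returned one at a time instead of all at once, but I'm not sure this is right. The cleanup inside the loop is a placeholder for my real logic:

```sql
-- Sketch: stream rows with plv8's cursor API instead of plv8.execute().
-- Table/column names match my example; the cleanup is a placeholder.
CREATE OR REPLACE FUNCTION generate_search_documents()
RETURNS SETOF json AS
$$
  const plan = plv8.prepare(
    'SELECT p.id, p.first_name, p.last_name, pl.city, pl.state ' +
    'FROM person p LEFT JOIN person_location pl ON pl.id = p.id');
  const cursor = plan.cursor();
  let row;
  while ((row = cursor.fetch())) {
    // per-row cleanup / business logic would go here
    plv8.return_next({
      id: row.id,
      firstName: (row.first_name || '').trim(),
      lastName: (row.last_name || '').trim(),
      location: { city: row.city, state: row.state }
    });
  }
  cursor.close();
  plan.free();
$$
LANGUAGE plv8;
```

Is this the right way to keep the per-person context from the first example?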
Any idea how I can use this method to replicate what I'm attempting to do in the first example?