I have PostgreSQL database dump by pg_dump version 9.5.2, which contains DDLs and also INSERT INTO
statements for each table in given database. Dump looks like this:
SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
CREATE TABLE unimportant_table (
id integer NOT NULL,
col1 character varying
);
CREATE TABLE important_table (
id integer NOT NULL,
col2 character varying NOT NULL,
unimportant_col character varying NOT NULL
);
INSERT INTO unimportant_table VALUES (123456, 'some data split into
- multiple
- lines
just for fun');
INSERT INTO important_table VALUES (987654321, 'some important data', 'another crap split into
- lines');
...
-- thousands of inserts into both tables
The dump file is really large and it is produced by another company, so I am not able to influence the export process. I need create 2 files from this dump:
- All DDL statements (all statements that doesn't start with
INSERT INTO
) - All
INSERT INTO important_table
statements (I want restore only some tables from dump)
If all statements would be on single line without new line character in the data, it will be very easy to create 2 SQL script by grep, for example:
grep -v '^INSERT INTO .*;$' my_dump.sql > ddl.sql
grep -o '^INSERT INTO important_table .*;$' my_dump.sql > important_table.sql
# Create empty structures
psql < ddl.sql
# Import only one table for now
psql < important_table.sql
Firstly I was thinking about using grep
but I did not find, how to process multiple lines at once, then I tried sed
but it is returning only single line inserts. I also used https://regex101.com/ to find out right regular expression but I don't know how to combine it with grep
or sed
:
^(?!(INSERT INTO)).*$ -- for ddl
^INSERT INTO important_table(\s|[[:alnum:]])*;$ -- for inserts
I found similar question pcregrep multiline SQL match but there is no answer. Also, I don't mind if the solution will work with grep
, sed
or whatever you suggest, but it should work on Ubuntu 18.04.4 TLS.