0

i have a very large database, with size upto around 2 gb. the database contains 10 tables and one of the tables has over 60 columns and 13902134 rows (vt_owner)

Insert into insurance.uninsured_vehicle 
           select a.regn_no,a.chasi_no,a.rto_cd,a.state_cd,a.regn_dt,a.vh_class 
from insurance.vt_owner a 
            where a.regn_no not in(select regn_no from insurance.irda_insurance)

executing this query takes a lot of time! currently using btree indexing. i wanted to optimize this query and wanted some suggestions on how to go about it. using postgres 9.2

i decided to vertically partition the table and then apply indexing on the various partitions .

vt_owner table definition

CREATE TABLE insurance.vt_owner
(
  regn_no character varying(10) NOT NULL,
  regn_dt timestamp without time zone,
  purchase_dt timestamp without time zone,
  owner_sr numeric(5,0),
  owner_name character varying(150),
  pan_no character varying(10),
  f_name character varying(150),
  c_add1 character varying(50),
  c_add2 character varying(50),
  c_city character varying(50),
  c_district character varying(30),
  c_pincode character varying(6),
  p_add1 character varying(50),
  p_add2 character varying(50),
  p_city character varying(50),
  p_district character varying(30),
  p_pincode character varying(6),
  owner_cd numeric(5,0),
  owner_cd_desc character varying(50),
  regn_type character varying(1),
  regn_type_desc character varying(50),
  vh_class numeric(5,0),
  vh_class_desc character varying(50),
  chasi_no character varying(30),
  eng_no character varying(30),
  maker numeric(5,0),
  maker_desc character varying(50),
  maker_model character varying(50),
  body_type character varying(3),
  body_type_desc character varying(50),
  no_cyl numeric(2,0),
  hp character varying(10),
  seat_cap numeric(3,0),
  stand_cap numeric(3,0),
  sleeper_cap numeric(2,0),
  unld_wt numeric(9,0),
  ld_wt numeric(9,0),
  fuel numeric(3,0),
  fuel_desc character varying(50),
  color character varying(50),
  manu_mon numeric(2,0),
  manu_yr numeric(4,0),
  fit_dt timestamp without time zone,
  norms numeric(2,0),
  norms_desc character varying(50),
  wheelbase character varying(10),
  cubic_cap character varying(10),
  floor_area numeric(7,3),
  ac_fiitted character(1),
  audio_fiitted character(1),
  video_fiitted character(1),
  vch_purchase_as character(1),
  vch_catg character(3),
  dealer_cd numeric(5,0),
  dealer_cd_desc character varying(50),
  sale_amt numeric(9,0),
  laser_code character varying(10),
  garage_add character varying(50),
  state_cd character varying(2) NOT NULL,
  rto_cd character varying(3) NOT NULL,
  CONSTRAINT vt_owner_pkey PRIMARY KEY (state_cd, rto_cd, regn_no)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE insurance.vt_owner
  OWNER TO postgres;

-- Index: insurance."index_VT_OWNER"

-- DROP INDEX insurance."index_VT_OWNER";

CREATE INDEX "index_VT_OWNER"
  ON insurance.vt_owner
  USING btree
  (regn_no COLLATE pg_catalog."default", chasi_no COLLATE pg_catalog."default", state_cd COLLATE pg_catalog."default", rto_cd COLLATE pg_catalog."default");

i also thought of creating trigrams. please i need some help.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786

1 Answers1

1

Is insurance.vt_owner.regn_no indexed?

Is there an index on insurance.irda_insurance.regn_no?

Have you run ANALYZE recently?

Are there any triggers on insurance.uninsured_vehicle?

Does this query perform any better?

INSERT INTO insurance.uninsured_vehicle 
SELECT a.regn_no,a.chasi_no,a.rto_cd,a.state_cd,a.regn_dt,a.vh_class 
FROM insurance.vt_owner a 
WHERE NOT EXISTS (SELECT 1 FROM insurance.irda_insurance WHERE regn_no = a.regn_no)
bma
  • 9,424
  • 2
  • 33
  • 22
  • insurance is the schema.. vt_owner, irda_insurance and uninsured_vehicle all the three are btree indexed. – Varun Gupta Jul 18 '13 at 06:55
  • taking ages to analyze.. "Insert on insurance.uninsured_vehicle (cost=2.95..855121.64 rows=6951225 width=42)" " -> Seq Scan on insurance.vt_owner a (cost=2.95..855121.64 rows=6951225 width=42)" " Output: a.regn_no, a.chasi_no, a.rto_cd, (a.state_cd)::character varying(3), a.regn_dt, a.vh_class" " Filter: (NOT (hashed SubPlan 1))" " SubPlan 1" " -> Seq Scan on insurance.irda_insurance (cost=0.00..2.76 rows=76 width=11)" " Output: irda_insurance.regn_no" – Varun Gupta Jul 18 '13 at 06:57
  • used EXPLAIN VERBOSE command with your query.. cost comes out to be higher than the existing one. havent used triggers on insurance.uninsured_vehicle. – Varun Gupta Jul 18 '13 at 07:01
  • Post the entire output of "EXPLAIN ANALYZE your_query_here". Also, is there an index or Primary/Unique key on regn_no? – bma Jul 18 '13 at 14:42