Using PostgreSQL 12, I am attempting to import data from a CSV with the following format:
country,state,county,lat,lng,type,measure,beds,population,year,source,source_url
US,AK,ketchikan gateway,63.588753,-154.493062,ICU,1000HAB,3.928701,13745,2018,arcgis,https://services1.arcgis.com/Hp6G80Pky0om7QvQ/arcgis/rest/services/Hospitals_1/FeatureServer/0
US,AK,kodiak island,63.588753,-154.493062,ACUTE,1000HAB,,n,2018,arcgis,https://services1.arcgis.com/Hp6G80Pky0om7QvQ/arcgis/rest/services/Hospitals_1/FeatureServer/0
Notice that the second row's "population" field has an n
instead of empty. My goal is to import the CSV so that the "population" column is BIGINT and "n" is replaced with NULL. My current solution is:
CREATE TABLE temp_table
(
country CHAR(2),
state CHAR(2),
county VARCHAR(255),
lat DOUBLE PRECISION,
lng DOUBLE PRECISION,
type VARCHAR(11),
measure VARCHAR(255),
beds DOUBLE PRECISION,
pop VARCHAR(255),
year SMALLINT,
source VARCHAR(255),
source_url VARCHAR(255)
);
COPY temp_table
FROM 'C:\\Users\\mconr\\Downloads\\global-hospital-beds-capacity-for-covid19\\hospital_beds_USA_v1.CSV'
WITH (DELIMITER ',', FORMAT CSV, HEADER TRUE);
SELECT country, state, county, lat, lng, type, measure, beds, CAST (NULLIF (pop, 'n') AS BIGINT) AS population, year, source, source_url
INTO USA
FROM temp_table;
DROP TABLE temp_table;
My current solution is to create a temporary table where "population" is VARCHAR(255), import the data, create a new table from a SELECT statement that replaces "n" with NULL and casts the column to BIGINT, then delete the temp table. However, this seems to be a little inefficient, since I am creating and deleting an intermediate table. Does anyone know a better way of doing this?