2

I have data stored in PostgreSQL with the data type text.

When I load this data into Stata it has type strL, even if every string in a column is only one charter long. This takes up too much memory. I would like to continue using the text type in PostgreSQL.

Is there a way to specify that text data from PostgreSQL is loaded into Stata with type str8?

I also want numeric data to be loaded as numeric values so allstring is not a good solution. I would also like to avoid specifying data type on a column by column basis.

The command I use to load data into Stata is this:

odbc load, exec("SELECT * FROM mytable")  <connect_options>

The file profile.do contains the following:

set odbcmgr unixodbc, permanently
set odbcdriver ansi, permanently

The file odbci.ini contains the following:

[database_name]
Debug = 0
CommLog = 0
ReadOnly = no
Driver = /usr/local/lib/psqlodbcw.so
Servername = <server>
FetchBufferSize = 99
Port = 5432
Database = postgres

In PosrgreSQL mytable looks like this:

postgres=# \d+ mytable
                      Table "public.mytable"
 Column | Type | Modifiers | Storage  | Stats target | Description
--------+------+-----------+----------+--------------+-------------
 c1     | text |           | extended |              |
 c2     | text |           | extended |              |


postgres=# select * from mytable;
 c1 |  c2
----+-------
 a  | one
 b  | two
 c  | three
(3 rows)

In Stata mutable looks like this:

. describe

Contains data
  obs:             3                          
 vars:             2                          
 size:           500                          
---------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
---------------------------------------------------------------------------
c1              strL    %9s                   
c2              strL    %9s                   
---------------------------------------------------------------------------
Sorted by: 
     Note: Dataset has changed since last saved.

I am using PostgreSQL v9.6.5 and Stata v14.2.

Molly
  • 13,240
  • 4
  • 44
  • 45

1 Answers1

1

You can do this in Stata by compress-ing your data after you load the variables:

clear

input strL string
"My name is Pearly Spencer"
"I am a contributor on Stack Overflow"
"This is an example variable"
end

describe 

Contains data
  obs:             3                          
 vars:             1                          
 size:           355                          
------------------------------------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------------------------------------------------
string          strL    %9s                   
------------------------------------------------------------------------------------------------------------------------
Sorted by: 
     Note: Dataset has changed since last saved.

compress, nocoalesce 

describe

Contains data
  obs:             3                          
 vars:             1                          
 size:           108                          
------------------------------------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------------------------------------------------
string          str36   %36s                  
------------------------------------------------------------------------------------------------------------------------
Sorted by: 
     Note: Dataset has changed since last saved.

The option nocoalesce forces Stata to choose the appropriate length for the loaded string variables.