If you have a table with a column whose type is SQL ARRAY, how do you find the base type of the array, i.e. the type of the array's individual elements?
- How do you do this in vendor-agnostic pure JDBC?
- How do you do this without fetching and inspecting actual row data? Equivalently: what if the table is empty?
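To make the gap concrete, here is a minimal sketch of how far vendor-agnostic metadata takes you (`my_table` and the connection URL are placeholders, not anything from a real schema): `ResultSetMetaData` will tell you the column is an ARRAY, but offers no accessor for the element type.

```java
import java.sql.*;

public class ArrayMetadataProbe {
    public static void main(String[] args) throws SQLException {
        String jdbcUrl = args[0]; // connection URL supplied by the caller
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();
             // WHERE 1 = 0 so no rows are fetched; we only want metadata
             ResultSet rs = stmt.executeQuery(
                     "SELECT * FROM my_table WHERE 1 = 0")) {
            ResultSetMetaData md = rs.getMetaData();
            for (int i = 1; i <= md.getColumnCount(); i++) {
                if (md.getColumnType(i) == Types.ARRAY) {
                    // This much is vendor-agnostic: the column is an ARRAY.
                    // But there is no md.getArrayBaseType(i) or similar;
                    // getColumnTypeName(i) returns a vendor-specific string
                    // (e.g. "_int4" on PostgreSQL) and the API stops here.
                    System.out.println(md.getColumnName(i) + " -> "
                            + md.getColumnTypeName(i));
                }
            }
        }
    }
}
```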
Similar questions have been asked before. However, I am asking for a vendor-agnostic way through the JDBC API itself: how is one supposed to solve this problem with pure JDBC? This seems like a core use case of JDBC, and I'm really surprised that I can't find a solution in it.
I've spent hours reading and re-reading the JDBC API javadocs, and several more hours scouring the internet, and there doesn't seem to be a proper way of doing this. It should be right there in DatabaseMetaData or ResultSetMetaData, but it's apparently not.
Here are the insufficient workarounds and alternatives that I've found:
- Fetch rows until you get one with an actual (non-null) value for that column, get the column value, cast it to java.sql.Array, and call getBaseType (see the sketch after this list).
- For PostgreSQL, rely on the convention that SQL ARRAY type names are encoded as ("_" + baseTypeName).
- For Oracle, use Oracle-specific extensions that expose the answer.
- Some databases have a special "element_types" view, which contains one row for each SQL ARRAY type used by current tables and the like; each row contains the base type and base type name.
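Here is a minimal sketch of the first workaround, assuming the table and column names are trusted (they are concatenated into the query text). It necessarily fails on an empty table or an all-NULL column, which is exactly the case I'm worried about:

```java
import java.sql.*;

public class ArrayBaseTypeFromData {
    /** Workaround 1: fetch rows until a non-null array value appears, then
     *  ask the java.sql.Array object itself. Fails if the table is empty
     *  or the column contains only NULLs. */
    static int arrayBaseType(Connection conn, String table, String column)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.setMaxRows(1); // one non-null row is enough
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT " + column + " FROM " + table
                    + " WHERE " + column + " IS NOT NULL")) {
                if (rs.next()) {
                    Array value = rs.getArray(1);
                    // getBaseType() returns a java.sql.Types code for the
                    // elements; getBaseTypeName() would give the vendor's
                    // name for that type.
                    return value.getBaseType();
                }
            }
        }
        throw new SQLException(
                "no non-null values; base type cannot be determined this way");
    }
}
```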
My context is that I would like to use vendor-supplied JDBC connectors with Spark, in the cloud, in my company's product, so metadata discovery becomes important. I'm also investigating the feasibility of writing JDBC connectors myself for other data sources that don't have a JDBC driver or Spark connector yet. Metadata discovery matters because it is what lets one define the Spark InternalRow and the Spark-JDBC data getters correctly. Currently, Spark-JDBC has very limited support for SQL ARRAY and SQL STRUCT; I managed to provide the missing bits with a day or two of coding, but during that process I hit this problem, which is now blocking me.
If I have control over the JDBC driver implementation, I could use a kludge: encode the type information in the type name, and in the Spark JdbcDialect, decode the type name to create the Catalyst type (sketched below). However, I want to do it the proper JDBC way, and ideally in a way that other vendor-supplied JDBC drivers will support.
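For concreteness, here is what that decoding kludge looks like in plain Java (rather than inside a Spark JdbcDialect), assuming the PostgreSQL "_" + baseTypeName naming convention. The method name and the convention check are mine, not part of any API; other drivers report other TYPE_NAME strings, so each dialect would need its own decoder:

```java
import java.sql.*;

public class ArrayTypeNameKludge {
    /** Recover the element type name from the vendor-specific TYPE_NAME
     *  reported by DatabaseMetaData.getColumns(). */
    static String decodeArrayTypeName(Connection conn, String table,
                                      String column) throws SQLException {
        DatabaseMetaData dbmd = conn.getMetaData();
        try (ResultSet cols = dbmd.getColumns(null, null, table, column)) {
            while (cols.next()) {
                if (cols.getInt("DATA_TYPE") == Types.ARRAY) {
                    String typeName = cols.getString("TYPE_NAME"); // e.g. "_int4"
                    return typeName.startsWith("_")
                            ? typeName.substring(1) // e.g. "int4"
                            : typeName;             // unknown convention: as-is
                }
            }
        }
        return null; // column not found, or not an ARRAY column
    }
}
```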
PS: It took me a surprising amount of time to locate DatabaseMetaData.getAttributes(). If I'm reading it right, this can give me the names and types of the fields/attributes of a SQL STRUCT (see the sketch below). So, with vendor-agnostic pure JDBC, I can get the fields/attributes of a SQL STRUCT, yet I cannot get the base type of a SQL ARRAY. I find that very surprising.
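Here is a minimal sketch of that STRUCT attribute discovery; the method name is mine, but getAttributes() and the ATTR_NAME / DATA_TYPE / ATTR_TYPE_NAME result columns are standard JDBC:

```java
import java.sql.*;

public class StructAttributeDiscovery {
    /** Lists the attributes of a STRUCT (UDT) via pure JDBC. Each row of
     *  the getAttributes() result describes one field of the type. */
    static void printStructAttributes(Connection conn, String typeNamePattern)
            throws SQLException {
        DatabaseMetaData dbmd = conn.getMetaData();
        try (ResultSet attrs =
                 dbmd.getAttributes(null, null, typeNamePattern, "%")) {
            while (attrs.next()) {
                System.out.printf("%s : %d (%s)%n",
                        attrs.getString("ATTR_NAME"),       // field name
                        attrs.getInt("DATA_TYPE"),          // java.sql.Types code
                        attrs.getString("ATTR_TYPE_NAME")); // vendor type name
            }
        }
    }
}
```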