0

I'm trying to use the CONCAT function to concatenate two columns in Databricks SQL. However, when I try to use it the CONCAT functions is prepending and appending .0 to the values in the second column. For example, my data looks like this:
ColA ColB
10001 25678
18921 25678
27331 89011

My query looks like this:

`SELECT cola, colb, CONCAT(cola, colb) AS colc
FROM <parquet table in question>`

And the output I'm expecting is
ColA ColB ColC
10001 25678 1000125678
18921 25678 1892125678
27331 89011 2733189011

But instead I get this:


ColA ColB ColC
10001 25678 10001.025678.0
18921 25678 18921.025678.0
27331 89011 27331.089011.0

Why is CONCAT adding these zeroes and decimal points? I had thought it may be related to the fact that I'm working with integers, but I tested the concat function of w3schools.com using their interactive command runner and it concatenated fine on there. What have I missed about the concat function?

Yoddlenod
  • 41
  • 8
  • A table is not a "database". The chances that w3schools.com interactive command runner uses Spark, is 0. You haven't supplied the parquet schema. – David דודו Markovitz Sep 01 '22 at 17:24
  • Sorry, I've corrected my terminology. I'm aware of that, but I thought since it is the same command the behaviour might be similar, as CONCAT seems to vendor agnostic. I didn't supply a schema because the table I give as a dummy example and doesn't actually exist. Would a schema help with answering the question? Sorry if these questions seem stupid, I'm still learning the ropes of SQL and Databricks SQL in particular. – Yoddlenod Sep 02 '22 at 08:13

1 Answers1

0

While I did not manage to solve the issue directly with SQL, I did solve it with SparkR with the help of another answer to a related question I posted (Is there a more memory efficient way to concatenate two columns in SparkR?) I think the issue was a type error. I think the two columns being concatenated were being converted to doubles, hence why they were getting decimal points added to them. The solution was to cast them as strings or ints to get rid of the decimals.

Yoddlenod
  • 41
  • 8