1

When trying to list down all tables names in a database having a specific name format, the following query works fine :

show tables like '*case*';

while the following does not

show tables like '%case%';

On the other hand, when comparing the actual data inside string columns its the vice-versa case

Working query :

select column from database.table where column like '%ABC%' limit 5;

Not working query :

select column from database.table where column like '*ABC*' limit 5;

What's the difference between the 2 operators * and % ?

OneCricketeer
  • 179,855
  • 19
  • 132
  • 245
Kshitij Kohli
  • 4,055
  • 4
  • 19
  • 27

2 Answers2

1

This is the difference between regular expressions and like patterns.

LIKE is built into the SQL language. It has two wildcards:

  • % represents any number of characters including zero.
  • _ represents exactly one character.

Regular expressions are much more flexible for matching almost any pattern in a string.

When SQL was invented, I don't think regular expressions were in common use in computer systems -- at the very least, the folks at IBM who worked on relational databases may not have been familiar with the folks at ATT who were inventing Unix.

Regular expressions are much more powerful than LIKE patterns, of course. And Hive supports them via the RLIKE operator (and some other functions).

The SHOW functionality is not standard SQL. So, the developers of Hive chose the more flexible method for pattern matching.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

HiveQL attempts to mimic the SQL, but it does not strictly follow its standards.

The usage of the wildcards are not pertinent to the LIKE clause, but to the statement itself. SHOW statements validate the wildcards based on the Java regular expression whereas when it comes to SELECT statements, Hive tries to stick with the SQL's wildcard validation.

franklinsijo
  • 17,784
  • 4
  • 45
  • 63