Questions tagged [non-equi-join]

Non-equi-join is a join using non-equality binary relational operator such as (<, ≤, ≠, >, ≥) . Also known as a Non-equi Theta-Join. Theta join combines tuples from different relations provided they satisfy the theta condition. General form of theta-join can use use all kinds of comparison operators (<, ≤, =, ≠, >, ≥) Both equi-join (join using = operator) and non-equi-join are subsets of general form on Theta-join.

More info about non-equijoin on w3resource.com

46 questions
1
vote
1 answer

Add column to data.table specifying if time is between time points in another table

I have a table (dt1) of fairly regular time points, like this: DATE TIME READING 2022-02-02 11:50:23 123.34 2022-02-02 11:50:59 125.66 2022-02-02 11:51:16 159.23 2022-02-02 11:52:34 234.22 etc... Note…
rw2
  • 1,549
  • 1
  • 11
  • 20
1
vote
1 answer

Conditional non-equi join in data.table

I have a dataset with some missing data (in reality <1%). We are going to use data from other sites to fill in the missing data. We have a data.table named dt1, which contains the data. We have another data.table named dt2 and this is an index of…
TBP
  • 697
  • 6
  • 16
1
vote
3 answers

How would one merge 2 data frames based on a category and numeric in df1 - matching the category and finding a value falling within the range of df2?

I want to merge df1 and df2, by pulling in only the numerics from df2 into df1? This would be a nested (Lookup(xlookup) in excel, but Im having difficulty working this in r? Any info would be much appreciated. df1 <- data.frame(Names =…
1
vote
1 answer

Can I create a non-equi join using python on strings?

I want to join two dataframes together to add labels to codes. The codes consist of a combination of a letter and a number. Is there a way to join the tables based on the letter and number in one step? It does not have to be native pandas. Note that…
Inkling
  • 469
  • 1
  • 4
  • 19
1
vote
1 answer

How to write a non-equi join in Hive

My tables include 3 date columns, dateA , dateB and dateC What I require: JOIN ON dateA between dateB and dateC The JOIN works quite well in Teradata, but I'm getting errors while running in Hive. select * from table A left join table B on A.col1 =…
1
vote
1 answer

Filtering by different ranges by group in data.table

I have some data in a table and I would like to do a non-equi join (I think is the right terminology) and filter it by different ranges for different groups. In the below example I would like to filter group "a" so that it only returns values…
TBP
  • 697
  • 6
  • 16
1
vote
2 answers

HIVE join taking too long, but is fast on Impala

i have a query like the below. This query runs in 15 seconds on Impala, but when I run the same on HIVE, it takes 10+ minutes. I have to join several other tables to this query (with similar joins as the below) and the total time it takes is more…
Sawan S
  • 87
  • 8
1
vote
2 answers

Hive Where clause with subquery - Only SubQuery expressions that are top level conjuncts are allowed

I need to get all the record from tableA greater than cdc_date which is stored in another tmp_table tmp_table has only one column cdc_date and only one record. tableA has more the 5 million records. My Hive query Select count(*) from tableA as a…
1
vote
1 answer

How to LEFT JOIN on ANY of the matching clauses in R?

could you please help me out with this: I have a dataframe (df1) that has index of all articles published in the website's CMS. There's a column for current URL and a column of original URLs in case they were changed after publication (column name…
dan
  • 33
  • 4
1
vote
3 answers

Column name labelling in data.table joins

I am trying to join data.table x to z using a non-equi join. Table x contains two columns X1 and X2 that are used as the range to use for joining with column Z1 in z. The current code successfully does the non-equi join however certain columns are…
JFG123
  • 577
  • 5
  • 13
1
vote
1 answer

Why Hive can not support non-equi join?

I found that the Hive does not support non-equi join.Is it just because it is difficult to convert non-equi join to Map reduce?
zun
  • 11
  • 1
1
vote
2 answers

Non-equi join of data table operation

I'd like to add columns to data table 1 that are operations on data table 2, joining by a variable and where dates from data table 2 are <= the dates from data table 1. I'm looking for a solution that isn't too computationally expensive (I have…
Slash
  • 501
  • 2
  • 9
1
vote
1 answer

R sum by group if date within date range

Suppose I have two dataframes. The first one includes "Date" at which a "Name" issues a "Rec" for an "ID" and the "Stop.Date" at which "Rec" becomes invalid. df (only a part) structure(list(Date = structure(c(13236, 13363, 14074, 13199, 14554),…
JB_G
  • 45
  • 7
1
vote
2 answers

Summing a dataframe based on another dataframe

I have daily data of rainfall from 10 locations across 10 years set.seed(123) df <- data.frame(loc.id = rep(1:10, each = 10*365),years = rep(rep(2001:2010,each = 365),times = 10), day = rep(rep(1:365,times = 10),times = 10), rain =…
89_Simple
  • 3,393
  • 3
  • 39
  • 94
1
vote
1 answer

Join big dataframe in r and filter in the same time

df1 = data.frame(id=1,start=as.Date("2012-07-05"),end=as.Date("2012-07-15")) df2 = data.frame(id=rep(1,1371),date = as.Date(as.Date("2012-05-06"):as.Date("2016-02-05"))) output = dplyr::inner_join(x=df1,y=df2,by="id") %>% filter(date>=start &…
Rona
  • 97
  • 8