For questions related to pattern matching, using character sequences or tree structures. In contrast to pattern recognition, the match described here usually has to be exact.
Questions tagged [matching]
2735 questions
0
votes
2 answers
Extract text using regular expression
I'm trying to extract a case ID from the string below. Could someone help me with this?
https://looney-tunes/review/case/CAAAAAAAAR-hw7QEAAAAMf___-A?caseGroup=12&queueId=52
I want to extract the portion after case/ and before '?' in the link. They're 27…

Nagasai Tenekondala
- 48
- 1
- 5
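For the case ID question above, here is a minimal Python sketch, assuming the ID is always the path segment that follows case/ and ends at the '?'. The function name extract_case_id is hypothetical, and a regular expression is only one possible approach, not necessarily the one the answers used.

```python
import re

# URL copied from the question text.
url = "https://looney-tunes/review/case/CAAAAAAAAR-hw7QEAAAAMf___-A?caseGroup=12&queueId=52"

def extract_case_id(link):
    """Return the segment after 'case/' up to the next '/', '?' or '#', or None."""
    # [^/?#]+ stops at the query string, so no separate split on '?' is needed.
    match = re.search(r"/case/([^/?#]+)", link)
    return match.group(1) if match else None

case_id = extract_case_id(url)
print(case_id)
```

Returning None for links without a case/ segment keeps the caller from having to catch an exception on malformed input.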
0
votes
1 answer
R: Nearest neighbour matching with MatchIt
I would like to use nearest neighbour matching with MatchIt in R.
So far I have used the following code:
Matching <- matchit(Treatment ~ Size + Age + Expenses, data = data, method = "nearest", distance = "glm", replace = TRUE)
I have two…

remo
- 365
- 1
- 10
0
votes
1 answer
How to find the best matching dates on two dataframes in R?
I have two dataframes insitu and model:
dput(head(insitu,20))
structure(list(ID = c("AUR", "AUR", "AUR", "AUR", "AUR", "AUR",
"LAM", "LAM", "LAM", "LAM", "LAM", "LAM"), D_SOS = structure(c(16929,
17149, 17422, 17850, 18389, 18202, 17044, 16744,…

Cláudio Siva
- 502
- 2
- 10
0
votes
1 answer
match two DFs based on first, middle, last name & date of birth (account for data flaws)
I have a very simple problem: I want to check which persons in DF1 are contained in DF2. I want to do so based on their
first name,
middle name,
last name, and
date of birth.
I want to keep only those rows of DF1 and DF2 that are correct…

Philipp
- 3
- 2
0
votes
0 answers
Finding matching IDs based on similar string match
I have a large pandas dataframe (10 million records), shown below (snapshot):
CID Address
100 22 park street springvale nsw2655
101 U111 28 james road, Vic 2755
102 22 park st. springvale, nsw-2655
103 29 Bino Avenue , Mac - 3990
104 …

A DUBEY
- 806
- 6
- 20
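For the address question above, here is a rough sketch of string-similarity scoring using only the Python standard library. The helper names normalize_address and similarity are hypothetical, and the normalization rules (lowercasing, dropping punctuation) are assumptions rather than anything stated in the question. Comparing every pair across 10 million records is not practical, so a real run would first need to group candidates, for example by postcode.

```python
import re
from difflib import SequenceMatcher

def normalize_address(text):
    """Lowercase, drop punctuation, and collapse repeated whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(cleaned.split())

def similarity(left, right):
    """Similarity ratio in [0, 1] between two normalized addresses."""
    return SequenceMatcher(None, normalize_address(left), normalize_address(right)).ratio()

# Sample rows copied from the snapshot in the question.
addr_100 = "22 park street springvale nsw2655"
addr_101 = "U111 28 james road, Vic 2755"
addr_102 = "22 park st. springvale, nsw-2655"

print(similarity(addr_100, addr_102))  # near-duplicate pair, scores high
print(similarity(addr_100, addr_101))  # unrelated pair, scores lower
```

A threshold on the ratio would then decide which CIDs count as the same address; the right cut-off depends on the data and has to be tuned by inspection.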
0
votes
0 answers
Excel Index and match formula
This is a tricky one. I have a list of raw data and am looking to break down this data in another tab using formulas. To keep it simple, I have raw data of all equity and list options with their tickers. I am trying to distinguish all equity tickers and…
0
votes
2 answers
Fuzzy matching gives unmatched data
Hi, I want to merge df1 and df2 based on x.
df1 <- tibble(x = c("TRP OVERSEAS STOCK |", "PIMCO TOTAL RETURN FUND"), y = c(1, 2))
df2 <- tibble(x = c("AB Portfolios: AB Growth Fund; Class K Shares", "PIMCO TOTAL RETURN FUND"), z = c(2020, 2021))
However, when I…

Jane
- 91
- 4
0
votes
1 answer
Most Effective Way to get matched and unmatched objects in 2 ArrayLists
I have a task to read 2 files, match their contents, and provide a list of the unmatched entries of both files. That means I have to report how many entries match across the two files and how many unmatched entries in file 1 are not in…

Ran_Macavity
- 154
- 2
- 21
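The question above is about Java ArrayLists, but the underlying idea (set intersection and set difference) can be sketched in Python. This is a minimal illustration that assumes entries are hashable and that duplicates within a file do not matter. The function name compare_entries and the sample data are hypothetical.

```python
def compare_entries(first, second):
    """Return (matched, only_in_first, only_in_second) as sets."""
    first_set, second_set = set(first), set(second)
    matched = first_set & second_set        # present in both inputs
    only_in_first = first_set - second_set  # unmatched entries from the first input
    only_in_second = second_set - first_set # unmatched entries from the second input
    return matched, only_in_first, only_in_second

# Hypothetical sample data standing in for the lines read from the two files.
file1_entries = ["alpha", "beta", "gamma"]
file2_entries = ["beta", "gamma", "delta"]

matched, only_in_1, only_in_2 = compare_entries(file1_entries, file2_entries)
print(len(matched), sorted(only_in_1), sorted(only_in_2))
```

Building the sets costs time proportional to the input sizes, which avoids the nested loop over both lists.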
0
votes
0 answers
Python script getting different subtotals on the same csv with the same code as a co-worker
Hi all, I am working in Python and am running a script on a CSV to get totals and subtotals, and for some reason I am getting a slightly different subtotal in one of the columns than my co-worker, who is running the exact same Python script on the…

Craig
- 1
- 1
0
votes
0 answers
Dummy code based on whether two other variables match each other
For my master's thesis I am working with RStudio and analyzing M&A data. I want to create a dummy code (firm relatedness yes/no) based on whether the acquiring and target firms' SIC codes match.
I do know how to create dummy codes, but not based on the…

Fero
- 1
- 1
0
votes
1 answer
Bash: find a line (match string) that begins with text and a forward slash
I am trying to find (match) a line in a file that starts with TEST /
This works, TEST with whitespace:
if [[ "$LINE" == 'TEST '* ]]
then
    echo "$LINE"
fi
Text with a forward slash doesn't work. How can I make this work?
if [[…

Scripter
- 23
- 4
0
votes
1 answer
Python: fastest way to cross-match findall occurrences of a list of string lists in a list of sentences with a million rows
I have a list of sentences with a million rows (N) and a list of string lists (M). I want to get an M×N matrix in which each element is the number of occurrences, including overlapping ones, of a string list in a sentence.
For example:
sentence_list = ['Homegrown…
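The overlapping count described above can be sketched with a zero-width lookahead, which lets re.findall report matches that share characters. This is a minimal illustration, not a tuned solution for a million rows. It assumes that a cell is the sum of the overlapping counts of every string in one string list. The names count_overlapping and match_matrix and the sample data are hypothetical.

```python
import re

def count_overlapping(text, term):
    """Count occurrences of term in text, including overlapping ones."""
    # (?=...) matches at a position without consuming characters,
    # so every start position of term is counted.
    return len(re.findall(f"(?={re.escape(term)})", text))

def match_matrix(term_groups, sentences):
    """Build an M x N nested list: rows are term groups, columns are sentences."""
    return [
        [sum(count_overlapping(sentence, term) for term in group)
         for sentence in sentences]
        for group in term_groups
    ]

# Hypothetical sample data.
groups = [["aa"], ["b", "ab"]]
texts = ["aaaa", "abab"]
print(match_matrix(groups, texts))
```

For large inputs the inner loops would likely need compiling each pattern once or a multi-pattern approach, which this sketch leaves out.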
0
votes
1 answer
MatchIt: Full Matching - Long Vector Error
I am running an analysis to assess the impact of a land conservation policy on land use change at the parcel level. To address the non-random nature of conservation program enrollment, I am running a matching analysis between treated and non-treated parcel…

Pranab
- 3
- 1
0
votes
1 answer
Matching with multi-level multiple membership data
I am designing a within multi-level study in which my data has both nested and multiple-membership structures. The subjects are multimembers of V1 and are also nested in V2. The subjects are all from one year. When applying matching or weighting,…