Since the dataframe is not really used here, I will simply work with the following two lists:
import string
import random
random.seed(18)
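# two lists of 1000 random strings, 13-20 characters each, drawn from lowercase letters and spaces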
dataset1 = [''.join(random.choice(string.ascii_lowercase + ' ') for _ in range(random.randint(13, 20))) for s in range(1000)]
dataset2 = [''.join(random.choice(string.ascii_lowercase + ' ') for _ in range(random.randint(13, 20))) for s in range(1000)]
I benchmarked these two lists against the code you provided using FuzzyWuzzy. As a first change you could use RapidFuzz (I am the author), which does basically the same as FuzzyWuzzy but is quite a bit faster. On my test lists this was about 7 times as fast as your code. Another issue is that fuzz.token_sort_ratio always lowercases the strings and removes e.g. punctuation. While this makes sense for the string matching, you are doing it multiple times for each string in the list, which adds up when working with bigger lists. Using RapidFuzz and preprocessing only once is about 14 times as fast on these lists:
from rapidfuzz import fuzz, utils
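# preprocess each string only once (lowercase, strip non-alphanumeric characters)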
dataset2_processed = [utils.default_process(x) for x in dataset2]
dataset1_processed = [utils.default_process(x) for x in dataset1]
matched_pair = []
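# compare every pair; processor=None skips re-preprocessing inside the scorer,
# and score_cutoff=85 makes it return 0 (falsy) for scores below 85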
for word1, word1_processed in zip(dataset1, dataset1_processed):
    for word2, word2_processed in zip(dataset2, dataset2_processed):
        if fuzz.token_sort_ratio(word1_processed, word2_processed, processor=None, score_cutoff=85):
            matched_pair.append((word1, word2))
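For reference, here is a minimal sketch of how the two variants can be timed against each other. It assumes both fuzzywuzzy and rapidfuzz are installed and reuses dataset1/dataset2 from above; the helper names run_fuzzywuzzy and run_rapidfuzz are just illustrative:

import timeit
from fuzzywuzzy import fuzz as fw_fuzz
from rapidfuzz import fuzz as rf_fuzz, utils

def run_fuzzywuzzy():
    # original approach: every call preprocesses both strings again
    return [(w1, w2) for w1 in dataset1 for w2 in dataset2
            if fw_fuzz.token_sort_ratio(w1, w2) >= 85]

def run_rapidfuzz():
    # preprocess each string once, then skip preprocessing in the scorer
    d1 = [utils.default_process(x) for x in dataset1]
    d2 = [utils.default_process(x) for x in dataset2]
    return [(w1, w2)
            for w1, p1 in zip(dataset1, d1)
            for w2, p2 in zip(dataset2, d2)
            if rf_fuzz.token_sort_ratio(p1, p2, processor=None, score_cutoff=85)]

print("fuzzywuzzy:", timeit.timeit(run_fuzzywuzzy, number=1))
print("rapidfuzz: ", timeit.timeit(run_rapidfuzz, number=1))

Since both lists have 1000 elements, each variant scores 1 million string pairs, so the difference in runtime becomes clearly visible.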