I have data in the following format:
+----+-------+--------+-----------+-----------+-----------+
| id | title | author | keyword_1 | keyword_2 | keyword_3 |
+----+-------+--------+-----------+-----------+-----------+
I am looking to store it in a database so I can search by title, keyword_1, keyword_2, or keyword_3.
An example row would be:
+-----+----------------+----------+-----------+-------------+-----------+
| id  | title          | author   | keyword_1 | keyword_2   | keyword_3 |
+-----+----------------+----------+-----------+-------------+-----------+
| 123 | Learn Java 101 | John Doe | java      | programming | software  |
+-----+----------------+----------+-----------+-------------+-----------+
On the front end, there is a form where the user inputs a title and/or keywords, and the database needs to be queried with that input. The input will likely not be an exact match, so we need some kind of pattern or fuzzy matching. The payload might look something like this:
{
  title: "Learn Java",
  author: "Jon Doee",
  keyword1: "computers",
  keyword2: "softwar",
  keyword3: null,
}
I realize there are some ready-made operations for this; for example, Postgres has LIKE, and levenshtein() is available through the fuzzystrmatch extension. However, I'm not sure this is the right approach. Comparing each keyword against all three columns for every row seems like a very expensive operation.
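To make the concern concrete, here is a rough sketch of that brute-force comparison, written in plain Python rather than SQL so it runs without a database. Everything in it is an assumption for illustration: the `levenshtein` helper is a hand-written edit distance (not the Postgres function), and `ROWS`, `KEYWORD_COLUMNS`, `search_by_keywords`, and the `max_distance` threshold of 2 are hypothetical names and values I made up.

```python
from typing import Iterable, Optional

def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# Hypothetical in-memory stand-in for the table shown earlier.
ROWS = [
    {"id": 123, "title": "Learn Java 101", "author": "John Doe",
     "keyword_1": "java", "keyword_2": "programming", "keyword_3": "software"},
]

KEYWORD_COLUMNS = ("keyword_1", "keyword_2", "keyword_3")

def row_matches(row: dict, keyword: str, max_distance: int) -> bool:
    """True if the keyword is within max_distance edits of any keyword column."""
    for col in KEYWORD_COLUMNS:
        value = row.get(col)
        if value is not None and levenshtein(keyword, value.lower()) <= max_distance:
            return True
    return False

def search_by_keywords(rows: Iterable[dict],
                       keywords: Iterable[Optional[str]],
                       max_distance: int = 2) -> list:
    """Full scan: every non-null keyword is compared with every column of every row."""
    wanted = [k.lower() for k in keywords if k]
    return [row for row in rows
            if any(row_matches(row, kw, max_distance) for kw in wanted)]
```

With the payload above, `search_by_keywords(ROWS, ["computers", "softwar", None])` returns row 123, because "softwar" is one edit away from "software". The cost is what worries me: the work grows with rows × input keywords × 3 columns, and each comparison is itself quadratic in the string lengths.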
Surely there must be a clean way to do this. I am posting here to check whether this is the path I should go down.
From an architectural standpoint, is this the correct way to store the data? I thought about using a document-based system, but I'm not sure whether that would be any better or worse.
I'm somewhat new to all this and would appreciate some guidance on what is recommended. Thanks!