My friends and I are doing an NLP project on song recommendation.
Context: We originally planned on giving the model a recommended song playlist that has the most similar lyrics based on the random input corpus(from the literature etc), however we didn't really have a concrete idea of its implementation.
Currently our task is to find similar lyrics to a random lyric fed as a string input. We are using sentence BERT model(sbert) and cosine similarity to find the similarity between the songs and it seems like the output numbers are meaningful enough to find the most similar song lyrics.
Is there any other way that we can improve this approach?
We'd like to use BERT model and are open to suggestions that can be used on top of BERT if possible, but if there is any other models that should be used instead of BERT, we'd be happy to learn. Thanks.