I have two data sets. The first has approximately 50,000 movie and song titles, and the second has 20,000 blacklisted strings. I am looking for the best algorithm to detect movie/song titles that contain one or more blacklisted strings.
Example: Dataset #1
The Lord Of The Rings
E.T.
Star Wars
...
(50k items)
Dataset #2 (blacklist)
Lord
Home Alone
Matrix
ar
...
(20k items)
An item in either data set may be a single character or a few words. Single-pattern string search algorithms like Boyer-Moore are not helping me here, since I have more than one needle to search for in each haystack. I (probably) need an algorithm that finds all candidate combinations efficiently, and then runs a string search (regex, maybe?) for each combination.
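To make the goal concrete, here is a minimal sketch of the brute-force baseline I would like to improve on, using only the sample items above. The names `find_blacklisted_naive` and `is_blacklisted` are made up for illustration, and I am assuming plain case-sensitive substring matching (so "ar" matches "Star Wars"). At full size the first function performs roughly 50,000 × 20,000 = 10^9 substring checks, which is what I am hoping to avoid. The second part shows the "regex maybe?" idea as a single combined pattern; I have not measured how it behaves with 20,000 alternatives.

```python
import re

# Sample items taken from the two data sets above.
titles = ["The Lord Of The Rings", "E.T.", "Star Wars"]
blacklist = ["Lord", "Home Alone", "Matrix", "ar"]


def find_blacklisted_naive(titles, blacklist):
    """Map each offending title to the blacklisted strings it contains.

    Checks every needle against every title (case-sensitive substring test).
    """
    hits = {}
    for title in titles:
        found = [needle for needle in blacklist if needle in title]
        if found:
            hits[title] = found
    return hits


# One combined pattern; re.escape makes characters such as "." literal.
combined = re.compile("|".join(re.escape(needle) for needle in blacklist))


def is_blacklisted(title):
    """Return True if any blacklisted string occurs anywhere in the title."""
    return combined.search(title) is not None


print(find_blacklisted_naive(titles, blacklist))
print([t for t in titles if is_blacklisted(t)])
```

With the sample data, both approaches flag "The Lord Of The Rings" (because of "Lord") and "Star Wars" (because of "ar"), and leave "E.T." alone.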