Given a set of strings, find the longest common subtring. It is assumed that the input set of strings will never have more than one substring in common.
Input: file containing a vector of strings on each line.
Output: a string representing the common string. If the strings have no common substring, the output should be 0.
Example:
For:
orchestra
check
chelsea
The output is che (common sub-sequences are not considered valid outputs; that is, ch and he in the given example, are not considered valid outputs).
Solution
Given only two strings, a very fast dynamic programming approach is known, which solves the problem in , where is the length of the longest string. Generalizing the problem to a finite set of strings with cardinality is a little harder since remembering every valid path would require a tensor and filling it would lead to a complexity of . A faster and more intuitive way of solving this problem is using a suffix-tree as well as a bit vector.
When we insert every suffix of a given string into a tree, like "orchestra", "rchestra", "chestra", ... "a" and mark every node as touched by the kth word, using a bit vector, we can start at the root node and check for the longest path for which all bits are set. Creating the tree can be done in . We avoid the higher computational complexity by storing every combination in the tree and therefore going into space complexity. Since we expect we can use an integer as the bit vector and don't need to use an infinity bit set implementation like BitSet.js.
function insert(T, str, start, iter) {
for (var i = start; i < str.length; i++) {
var c = str[i];
if (T[c] === undefined) {
T[c] = {next: {}, mask: 1 << iter};
} else {
T[c].mask|= 1 << iter;
}
T = T[c].next;
}
}
var arr = ["orchestra", "check", "chelsea"];
var T = {};
for (var k = 0; k < arr.length; k++) {
var str = arr[k];
for (var i = 0; i < str.length; i++) {
insert(T, str, i, k);
}
}
Having the filled tree allows us to search for all valid substrings and extract the longest:
function findLCS(T, goal) {
var ret = [];
for (var k in T) {
if (T[k].mask == goal) {
ret.push(k + findLCS(T[k].next, goal));
}
}
return ret;
}
var ret = findLCS(T, Math.pow(2, arr.length) - 1);
if (ret.length === 0) {
console.log(0);
} else {
console.log(ret.reduce((a, b) => a.length > b.length ? a : b))
}