Fri Dec 24, 2010 4:26 pm by tartle 


Say you have one string of alphabetic characters, and say you have another, guaranteed smaller string of alphabetic characters. Algorithmically speaking, what's the fastest way to find out if all the characters in the smaller string are in the larger string?
For example, if the two strings were:
String 1: ABCDEFGHLMNOPQRS
String 2: DCGSRQPOM
You'd get true as every character in string2 is in string1. If the two strings were:
String 1: ABCDEFGHLMNOPQRS
String 2: DCGSRQPOZ
you'd get false as Z isn't in the first string.
The naive way to do this operation would be to iterate over the 2nd string once for each character in the 1st string. That'd be O(n*m) in algorithm parlance where n is the length of string1 and m is the length of string2. Given the strings in our above example, thats 16*8 = 128 operations in the worst case.
A slightly better way would be to sort each string and then do a stepwise iteration of both sorted strings simultaneously. Sorting both strings would be (in the general case) O(m log m) + O(n log n) and the linear scan after that is O(m+n). Again for our strings above, that would be 16*4 + 8*3 = 88 plus a linear scan of both strings at a cost of 16 + 8 = 24. Thats 88 + 24 = 112 total operations. Slightly better. (As the size of the strings grow, this method would start to look better and better)
Finally, the best method would simply be O(n+m). That is, iterate through the first string and put each character in a hashtable (cost of O(n) or 16). Then iterate the 2nd string and query the hashtable for each character you find. If its not found, you don't have a match. That would cost 8 operations  so both operations together is a total of 24 operations. Not bad and way better than the other solutions.
Is there another solution?
What if  given that we have a limited range of possible characters  I assigned each character of the alphabet to a prime number starting with 2 and going up from there. So A would be 2, and B would be 3, and C would be 5, etc. And then I went through the first string and 'multiplied' each character's prime number together. You'd end up with some big number right? And then  what if I iterated through the 2nd string and 'divided' by every character in there. If any division gave a remainder  you knew you didn't have a match. If there was no remainders through the whole process, you knew you had a subset. Would that work? 




Fri Dec 13, 2013 3:46 pm by simon155 


Can you define the boundaries of this problem? It all looks a bit vague. Ordinarily, posting a problem and a solution in one block, which I can only equate to providing someone with an already filled in crossword and asking them to go for it, wouldn't attract much interest.
You refer to a hashtable in your solution, which suggests a database problem? Yet there are no rules around the methods available defined.
On flicking to the end, I note however your preferred solution appears to revolve around assigning prime numbers to each character of the alphabet, yet with no indication of the overheads for doing so  deriving primes in itself unless you have them precalculated isn't usually what one would consider an "efficient" operation, especially when compared with your string operation, which is in theory far faster. Either way, working through your text to "assign primes" which I assume you're either working out on the fly, or looking up elsewhere, then performing a calculation at the end would be IMMENSELY inefficient and definitely NOT something I'd encourage. A simple stepthrough string operation would be more efficient than that.
If you're deadset on a mathematical operation and it's a programming exercise, not intended for efficiency, you could try assigning binary values to letters ie where A=1, B=2, C=4, D=8 etc. Then do a bitwise AND, and see if your small string MINUS the bitwise AND is zero. 




Tue Sep 16, 2014 12:43 pm by mahendrashravan 


As we know that the we have 26 alphabets take an 32bit integer.
whatever the alphabet encounters in string 1, set that particular alphabet index number in that integer. example A1, B2, C3..........Z26. whenever such alphabet encounters in String1 set that corresponding bit.
In String2 for each alphabet check whether that bit in integer value is set or not. If every alpahbet in String2 is set then it means String2 is taken from Sting1, else both strings are different.
With this method we can achieve time and space complexity. 




Fri Oct 10, 2014 12:44 am by Decius 


You can't do it in less than O(n+m). That's the time it takes to iterate over both strings. 




Thu May 07, 2015 4:48 am by zekta 


Agreed with Decius, you can't beat O(n+m) (Worst Case)
But you can more efficient on average case.
First thing first, as we know that the String 2 is always shorter, we should go for string 2 first. and map string 1 afterward
(so we can have an early exit if all the string 2 char have seen)
But you still need to go through the whole sting 1 as the worse case. 




Thu Jul 21, 2016 2:24 pm by bathfilms 


Tartle,
Simon155 has a point if the context requires a detailed algorithm for computer coding.
Nonetheless, I like the ingenuity of your mathematical algorithm concept.
In essence it is the same as:
1) Assign a different prime number to each character
2) Let x = the product of all the characters of the first string
3) Let y = the product of all the characters of the second string
4) If x/y yields a remainder, then FALSE, else TRUE
Step 3 (theoretically) avoids the iteration process previously mentioned.
Please note that Simon155 is correct about the complexity of implementing this. 






All times are GMT

Page 1 of 1 
