I've made a hashing algorithm from within mmf2. It's obviously a lot slower than the MD5 hashing of the quick hash object, but the point is to be able to create an unlock key generator for Klik games.
Anyways, I need help with collision testing. Try out random strings and see if you get repeating patterns or matching hashes for 2 or more inputs. The point is to have as many possible solutions as the computer can compute without the hash being reversible.
The idea behind this is that if your hashing algorithm isn't widely used then it'll take longer to hack and be almost foolproof against brute-forcing because the attacker doesn't have the original algorithm. Also the programmer can invent his or her own key format (ex: xxx-xxx-xxx or xxxxxx-xx-xxxxxx)
The keys can be generated from the user's username, so they don't require a database.
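To illustrate the idea of a database-free key in a custom format, here is a minimal Python sketch. It is not the MMF2 algorithm from this thread (that source isn't posted); it swaps in HMAC-SHA256 from the standard library, and `SECRET`, `ALPHABET`, and `make_key` are hypothetical names made up for the example.

```python
import hashlib
import hmac

SECRET = b"my-game-secret"  # hypothetical per-game secret (assumption)
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"  # capitals and digits

def make_key(username: str, groups=(3, 3, 3)) -> str:
    """Derive a key like XXX-XXX-XXX from a username, no database needed."""
    digest = hmac.new(SECRET, username.encode("utf-8"), hashlib.sha256).digest()
    # Map each digest byte onto the output alphabet (slightly biased, fine for a sketch).
    chars = [ALPHABET[b % len(ALPHABET)] for b in digest]
    parts, pos = [], 0
    for size in groups:  # the 32-byte digest allows up to 32 characters in total
        parts.append("".join(chars[pos:pos + size]))
        pos += size
    return "-".join(parts)

print(make_key("UrbanMonkey"))             # xxx-xxx-xxx format
print(make_key("UrbanMonkey", (6, 2, 6)))  # xxxxxx-xx-xxxxxx format
```

The game would run the same function on the entered username and compare the result with the key the user typed in.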
So to the point! If anyone can crack this or reverse engineer the algorithm without having the source I'll give you 600 DC points and the source code.
Well it's really only meant for small inputs, like username. It's meant to generate keys for games and whatnot.
I already found a bunch of collisions though, and after another revision I got it to hash inputs of up to 3 characters before it hit another collision.
Something like
aba = !ag or something.
I guess offering 600 DC points to reverse engineer it is kinda silly though, because if you are smart enough to do that then you wouldn't need the source.
Hint: I had to use the int64 object to do the math because the numbers wouldn't fit in a 32bit int.
OK, new version. This version has bruteforce collision testing.
I've already found a few collisions in a test group of 248,516 hashes!
Out of that many hashes I've found 219 collisions. That's pretty good for a homegrown hashing algorithm.
I've only tested up to 3 character combinations.
The only 2 character collisions I found are as follows:
!7=d6
09=m8
Z$=m#
5&=i$
{(=G*
0)=U(
#,=o*
All of these generate the same hash for different inputs.
For the complete list download the file below, I've included them in the archive.
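For anyone who wants to run the same kind of brute-force collision test outside MMF2, here is a rough Python sketch. `toy_hash` is a deliberately weak stand-in that I made up for the example, not the algorithm from this thread, and the alphabet and length limit are assumptions chosen to keep the run short.

```python
from collections import defaultdict
from itertools import product
import string

def toy_hash(text: str) -> int:
    """Deliberately weak stand-in hash (NOT the algorithm from this thread)."""
    value = 7
    for ch in text:
        value = (value * 31 + ord(ch)) % 65521
    return value

def find_collisions(hash_fn, alphabet, max_len):
    """Hash every string of length 1..max_len and group inputs by hash value."""
    buckets = defaultdict(list)
    for length in range(1, max_len + 1):
        for combo in product(alphabet, repeat=length):
            text = "".join(combo)
            buckets[hash_fn(text)].append(text)
    # Keep only the hash values that more than one input maps to.
    return {h: texts for h, texts in buckets.items() if len(texts) > 1}

collisions = find_collisions(toy_hash, string.ascii_lowercase + string.digits, 2)
print(len(collisions), "hash values are shared by more than one input")
```

Swapping `toy_hash` for another function and raising `max_len` to 3 gives the same kind of test described above, though the run time grows quickly with the alphabet size.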
I also want to make it clear that I didn't use any code from any other algorithm, and the only extensions I used were the int64 object and the string parser (for converting characters to ASCII).
I've also changed the algorithm quite a bit from the original version, so it should be faster now. It also now uses only capital letters and numbers in the output hash.
Here's the new version. I also included the log file from my tests.
Yes, I've already noticed those patterns, and yet I'm not generating many collisions.
Those patterns are the result of multiplying and dividing prime numbers.
I chose prime numbers to make sure that it never lands the same pattern in the same spot.
However, my algorithm doesn't have a very good avalanche effect either, so small changes to the input cause only small changes in the hash. http://en.wikipedia.org/wiki/Avalanche_effect
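One way to put a number on the avalanche effect is to flip each input bit in turn and count how many output bits change; a good hash changes about half of them. The Python sketch below does that for MD5 and for a weak byte-sum hash. Both are stand-ins chosen for illustration, since the thread's algorithm isn't posted, and the function names are made up.

```python
import hashlib

def md5_bytes(data: bytes) -> bytes:
    return hashlib.md5(data).digest()

def weak_sum(data: bytes) -> bytes:
    """Weak stand-in: just the sum of the input bytes as a 32-bit value."""
    return (sum(data) % 2**32).to_bytes(4, "big")

def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def avalanche_ratio(data: bytes, hash_fn) -> float:
    """Average fraction of output bits that change when one input bit flips."""
    base = hash_fn(data)
    changed = 0
    trials = len(data) * 8
    for i in range(trials):
        mutated = bytearray(data)
        mutated[i // 8] ^= 1 << (i % 8)  # flip exactly one input bit
        changed += bit_difference(base, hash_fn(bytes(mutated)))
    return changed / (trials * len(base) * 8)

print("md5     :", avalanche_ratio(b"UrbanMonkey", md5_bytes))
print("byte sum:", avalanche_ratio(b"UrbanMonkey", weak_sum))
```

A ratio close to 0.5 means a good avalanche effect; a ratio far below that means similar inputs produce similar hashes.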
But I guess my pseudo-random number generator, which is seeded from the original string, is a bit shaky. However, as soon as I fix it other problems arise.
Originally Posted by Adam Phant BTW, 5320 characters, uses all of those things.
I'd never be able to crack it.
Originally Posted by Adam Phant If it's supposed to be irreversible, then why use it? What is the end application for a Klik game?
It's not used "in game"; it's used for making unlock keys and whatnot.
If there is only 1 hash for 1 input then the unlock key will be different for each user. Therefore if someone were to buy your game and then distribute the unlock key, it wouldn't work for anyone else. Another example could be a time-locked key. The hash changes based on the date, so the key would only work for a space of time before it no longer matches.
To make each hash/unlock key computer specific you could do this:
Username + harddrive serial # -> Hashed key.
The other users would need the same username and harddrive for it to work. This is a bit extreme however and isn't used very often.
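Here is a rough Python sketch of a machine-specific, time-locked key along those lines. It uses HMAC-SHA256 as a stand-in for the hash; `locked_key`, the `secret` value, and the choice of a one-month validity window are all assumptions for the example, and reading the real drive serial is left out.

```python
import hashlib
import hmac
from datetime import date

def locked_key(username: str, drive_serial: str, day: date,
               secret: bytes = b"my-game-secret") -> str:
    """Key that depends on the user, the machine, and the current month."""
    period = f"{day.year:04d}-{day.month:02d}"  # key stops matching next month
    material = "|".join([username, drive_serial, period]).encode("utf-8")
    digest = hmac.new(secret, material, hashlib.sha256).hexdigest()
    return digest[:12].upper()  # 12 characters, capitals and digits only

print(locked_key("UrbanMonkey", "SERIAL-1234", date.today()))
```

Dropping `drive_serial` or `period` from `material` gives the plain per-user key or a key that never expires.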
It's kinda hard to explain, but here is the basic idea.
Data -> Hash
The Hash is much smaller than the Data, so it's easier to store.
If the Data is changed any then the Hash changes, so then you know the Data can't be the same.
Most websites use hashes to store passwords.
1.User enters password
2.Password is hashed and sent to the server
3.If the hash matches the one in the database then the user has entered the correct password
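The steps above can be sketched in Python roughly like this. Note the assumptions: the sketch hashes on the server side and adds a random salt plus a slow hash (PBKDF2 from the standard library), which the steps above don't mention, and `register` and `verify` are made-up names.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2 is a slow, salted hash from the standard library.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def register(password: str):
    """Return the (salt, hash) pair the database would store instead of the password."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)

def verify(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the entered password and compare it with the stored hash."""
    return hmac.compare_digest(hash_password(password, salt), stored)

salt, stored = register("correct horse")
print(verify("correct horse", salt, stored))  # matching password
print(verify("wrong guess", salt, stored))    # non-matching password
```

The database only ever holds the salt and the hash, so a leaked table doesn't directly reveal the passwords.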
(they are all the same length; the font just messes them up)
I discovered the reason the randomness was bad last time as well. It appears that the int64 object doesn't handle floats! As a workaround I just multiplied the value by 1000, did my "other math", and divided it by 1000!
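That workaround is basically fixed-point arithmetic: keep everything as integers scaled by 1000 and only convert back at the end. A minimal Python sketch of the idea follows; the function names are made up, and the exact math done inside the MMF2 version isn't posted, so multiplication is used as the example operation.

```python
SCALE = 1000  # three decimal places, as in the workaround above

def to_fixed(x: float) -> int:
    """Store a fractional value as an integer number of thousandths."""
    return round(x * SCALE)

def fixed_mul(a: int, b: int) -> int:
    """Multiply two scaled integers and rescale the result."""
    return a * b // SCALE

def from_fixed(a: int) -> float:
    """Convert back to a normal number at the very end."""
    return a / SCALE

product = fixed_mul(to_fixed(1.5), to_fixed(2.25))
print(from_fixed(product))  # 1.5 * 2.25 computed with integers only
```

Anything smaller than 0.001 is lost with this scale, so a bigger `SCALE` trades range for precision, which matters when the values also have to fit in 64 bits.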
UrbanMonkey, although it's nice that you've written your own hashing function, it's no use for anything other than academic purposes. Unless you can prove that the collision rate is equal to or less than that of an MD5 hash I wouldn't use it. There are 2 schools of thought that I simply don't agree with:
1) MD5 being freely available means it's easier to crack the hashes
Since it's a 1 way hashing function, knowing the algorithm means nothing, other than you can hash things for yourself. If someone wanted your hashing function they would just have to use a debugger and watch what happens in memory (if they can get past the MMF bloat). Ok, so there are MD5 hash databases online that people can use to find matching key values for a given hash, but if you modify your string in a specific way, or rehash your hash n times, then these databases are useless.
2) Using my own (or even an MD5) hashing algorithm makes the copy protection more secure
People don't attack the hashing algorithms, they attack the comparison method. Just modify the code that compares the 2 hashes and trick the code into thinking it's always right.
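To make the point under 1) concrete, here is a small Python sketch of modifying the string and rehashing the hash n times so that a lookup in an online MD5 database no longer finds anything. The salt value, the round count, and the name `stretched_md5` are assumptions for the example.

```python
import hashlib

def stretched_md5(text: str, salt: str = "game-specific-salt",
                  rounds: int = 1000) -> str:
    """Prefix a salt, then feed the hash back into MD5 `rounds` times in total."""
    digest = hashlib.md5((salt + text).encode("utf-8")).hexdigest()
    for _ in range(rounds - 1):
        digest = hashlib.md5((salt + digest).encode("utf-8")).hexdigest()
    return digest

print(hashlib.md5(b"password").hexdigest())  # plain MD5, likely in public lookup tables
print(stretched_md5("password"))             # salted and rehashed value
```

None of this helps against the attack under 2): if the comparison itself is patched to always succeed, it doesn't matter how the hash was computed.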
You are absolutely right. I was just having some fun. I'm not seriously thinking that my function is better than md5, nor do I want you to use it.
Just an experiment. It's also very slow due to the way mmf2 works, so it obviously wouldn't be suitable for anything that requires larger values.
But let me ask you a question. Do you really think that someone would go to all the trouble of hacking an mmf2 app, especially in this community? Also, you seem to know what you're talking about, so could you give me an example? Checking through memory to discover how something works is a lot easier said than done.
Also, using rainbow tables you can double hash or triple hash something while bruteforcing to bypass that protection. I don't think that invalidates md5 though, simply because a specific md5 hash still cannot be targeted and created from viable data; that is, data that's actually useful.