There is a challenge on Codewars that asks you to decipher SHA-1 hashes.
(I think) I need to require Crypto in order to do so. I found this example of how to create SHA-1 Hash from a String in JS using Crypto:
let x = "Geek";
function createSha1(x) {
  const crypto = require('crypto');
  let hashPwd = crypto.createHash('sha1').update(x).digest('hex');
  return hashPwd;
}
So then basically I need to do the opposite, correct? The argument is given to the function as a SHA-1 hash, so we need to take that and turn it into a string - How can you decode a SHA-1 hash? Because according to this article, you can't.
It was suggested by a friend that I would need to
generate the hash, and then start looping and generating it for “a”, “b”, “c”, …“aa”, “ab”, “ac”… etc, and stop when the hashes match and return it.
So then to generate 'normal' hash, I would do something like:
function stringToHash(string) {
  let hash = 0;
  if (string.length == 0) return hash;
  for (let i = 0; i < string.length; i++) {
    let char = string.charCodeAt(i);
    hash = ((hash << 5) - hash) + char;
    hash = hash & hash; // convert to a 32-bit integer
  }
  return hash;
}
But of course, I don't know what string I should use for the argument there, because Codewars is giving us SHA-1 hashes as arguments already: e6fb06210fafc02fd7479ddbed2d042cc3a5155e.
It's the nature of this site, but I hate to post an answer to this challenge right away. Please give hints / direction prior to a working solution first if possible.
Thanks!
Brute force or dictionary attack
In general, you can't reverse hashes. What you can do instead is try various plausible candidates for the input data, hash them, and check whether you got the correct answer. This is how most 'password hash cracking' works in practice, with various well-established tools (hashcat, John the Ripper, etc.) and online services. However, that's probably not relevant, since you're expected to code the solution yourself.
The main thing that determines how fast you'll find the solution is your choice of candidate solutions. Looping through all the letters and trying out all possible combinations as your friend suggested - the brute force solution - is a possibility. It's worth noting that it will succeed only if you're looping through the correct 'alphabet' - if you're trying only the lowercase letters, then you will never find a password that includes a number, an uppercase letter or some other symbol.
Another commonly used option is to assume that the password is weak and run a 'dictionary attack', trying out various options that are more likely to be chosen by humans than a random sequence of letters. It may be a literal dictionary, trying English words; it may be a list of known frequently used passwords (e.g. https://github.com/danielmiessler/SecLists has some lists like that), or it may be a dictionary combined with some 'mutation rules' - e.g. if the list includes 'password' a cracking tool might automatically generate and try also 'password2' and 'Password' and 'passw0rd' among other options.
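As a rough illustration of a dictionary attack with mutation rules, here is a minimal Node.js sketch. The function names (sha1, mutations, dictionaryAttack), the word list and the specific rules are all made up for this example; real cracking tools ship far larger rule sets.

```javascript
const crypto = require('crypto');

// SHA-1 of a string as lowercase hex.
const sha1 = (s) => crypto.createHash('sha1').update(s).digest('hex');

// Hypothetical mutation rules: the word itself, capitalised,
// with one digit appended, and with every "o" replaced by "0".
function* mutations(word) {
  yield word;
  yield word[0].toUpperCase() + word.slice(1);
  for (let d = 0; d <= 9; d++) yield word + d;
  yield word.replace(/o/g, '0');
}

// Returns the first candidate whose hash matches, or null if none does.
function dictionaryAttack(targetHash, words) {
  for (const word of words) {
    if (!word) continue; // skip empty entries
    for (const candidate of mutations(word)) {
      if (sha1(candidate) === targetHash) return candidate;
    }
  }
  return null;
}
```

Called with a list such as ['letmein', 'password', 'test'], this also tries 'Password', 'password2' and 'passw0rd' without those being in the list.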
In your particular case the password is weak enough so that both brute force and a dictionary attack will work. For more difficult hashes that won't be the case.
Ideal cryptographic hashes are designed specifically as one-way functions, making it infeasible to simply apply an inverse transform to get the plain text back out. Something else you should be aware of: because hash functions compress their input, the pigeonhole principle applies, meaning that multiple inputs can hash to the same value.
There are several ways of solving this problem though that don't require you to create an inverse.
Bruteforce
As your friend mentioned, the easiest thing to do would be just to try to brute force the hash. Assuming that they picked something that is say 6 characters long, it should take less than a second. If you know anything about the nature of the password then your work is easier. For example, if you are told the password may be between 4 and 6 characters long and contains at least one number and one uppercase letter, you can create a character set to use in your brute force. This assumes that the password is a straight hash or uses some known salt.
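A minimal sketch of brute forcing with a chosen character set and length range, assuming Node.js. The names bruteForce and sha1 and the parameters are hypothetical, and no claim is made about how fast this runs on a given machine.

```javascript
const crypto = require('crypto');

const sha1 = (s) => crypto.createHash('sha1').update(s).digest('hex');

// Tries every string over `charset` with length minLen..maxLen, shortest first.
// Returns the matching plaintext, or null if nothing in that space matches.
function bruteForce(targetHash, charset, minLen, maxLen) {
  const chars = [...charset];
  if (chars.length === 0) return null;
  for (let len = minLen; len <= maxLen; len++) {
    // idx works like an odometer: one digit per position, base chars.length.
    const idx = new Array(len).fill(0);
    while (true) {
      const candidate = idx.map((i) => chars[i]).join('');
      if (sha1(candidate) === targetHash) return candidate;
      // Advance from the rightmost position, carrying leftwards.
      let pos = len - 1;
      while (pos >= 0 && ++idx[pos] === chars.length) {
        idx[pos] = 0;
        pos--;
      }
      if (pos < 0) break; // wrapped around: every string of this length was tried
    }
  }
  return null;
}
```

The search space grows as charset size to the power of the length, so adding digits and uppercase letters to the charset makes each extra character much more expensive.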
Rainbow Tables
This is somewhat related to brute forcing, but someone has spent the time and precomputed as many possible hashes as they can. You can find these tables online and simply have to look up your hash. Most common passwords, words, phrases, and alphanumeric combinations have been enumerated and can be looked up within several milliseconds. Again, this assumes that the values are run through the hash as is, without salts etc.
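The precompute-then-look-up idea can be sketched as below, assuming Node.js. This is a plain hash-to-plaintext lookup table with a made-up candidate list, not a true rainbow table: real rainbow tables store chains built with reduction functions to save space.

```javascript
const crypto = require('crypto');

const sha1 = (s) => crypto.createHash('sha1').update(s).digest('hex');

// Precompute hash -> plaintext once; each later lookup is a single Map access.
function buildLookupTable(candidates) {
  const table = new Map();
  for (const c of candidates) table.set(sha1(c), c);
  return table;
}

const table = buildLookupTable(['code', 'test', 'password']);
table.get('a94a8fe5ccb19ba61c4c0873d391e987982fbbd3'); // 'test'
```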
Length Extensions
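SHA-1 is also open to something called a length extension attack: given the hash of an unknown message and that message's length, you can compute the hash of the message followed by padding and data of your choice, without knowing the message itself. It does not recover the original input, so it is probably outside of what you need, but it is worth a look if you are really interested in cryptanalysis.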
I don't use that site so not familiar with the format.
The challenge wants you to brute force it. I can get it to pass on Test but not on Attempt; it always times out.
Here is the code. Knowing the answer is only ever going to be either code or test, we can reduce the charset, but it still times out. Maybe you know why.
const crypto = require('crypto');
const alpha = 'abcdefghijklmnopqrstuvwxyz';
// Yields every string of exactly `length` characters over `alpha`.
function* permutator(length = 4, prev = "") {
  if (length <= 0) {
    yield prev;
    return;
  }
  for (const char of alpha)
    yield* permutator(length - 1, prev + char);
}
function passwordCracker(hash) {
  // for...of ends when the generator is exhausted, so an unknown hash
  // returns undefined instead of looping forever.
  for (const v of permutator()) {
    if (crypto.createHash('sha1').update(v).digest('hex') === hash)
      return v;
  }
}
Edit: the tests hint that they expect to see at least 5 characters. If you fake it with something like:
function passwordCracker(hash) {
  // Good Luck
  let map = {
    'e6fb06210fafc02fd7479ddbed2d042cc3a5155e': 'code',
    'a94a8fe5ccb19ba61c4c0873d391e987982fbbd3': 'test',
  };
  return map[hash];
}
it errors with:
Test Results:
Fixed tests
testing...
expected undefined to equal 'try'
...
harder testing...
expected undefined to equal 'cgggq'
So that means you can't cheat with a reduced charset, it will need to be a-z, work backwards (because it will pass before 5) and be at least 5 long.
On my computer, the above code takes way longer than 12 seconds, which is the limit :(
I want to hide my personal data (email and phone) from scrapers and bots. This data is in the href of anchor tags. Of course, actual users should still be able to have functional clickable links.
I was thinking of using a simple JavaScript function to encrypt and decrypt the data so that pattern matchers (*@*.* etc.) that only get the HTML code won't find the actual email address.
So my encryption function is to convert the string to a list of character codes, incrementing all list elements by 1, and converting it back to a string. See below for the code.
My question: Is this an adequate way to hide data from scrapers? Or does every scraper render JS nowadays?
The code:
function stringToCharCodes(string) {
  // Returns a list of the character codes of a string
  return [...string].map(c => c.charCodeAt(0));
}
function deobfuscate(obfString) {
  // String to character codes
  let obfCharCodes = stringToCharCodes(obfString);
  // Deobfuscate function (-1)
  let deobfCharCodes = obfCharCodes.map(e => e - 1);
  // Character codes back to string
  // Use spread operator ...
  return String.fromCharCode(...deobfCharCodes);
}
// Result of obfuscate("example@example.com")
let obfEmail = "fybnqmfAfybnqmf/dpn";
document.getElementById("email").href = "mailto:" + deobfuscate(obfEmail);
// Result of obfuscate("31612345678")
let obfPhone = "42723456789";
document.getElementById("whatsapp").href = "https://wa.me/" + deobfuscate(obfPhone);
function obfuscate(string) {
  // Obfuscate - Use developer tools F12 to run this once and then use the obfuscated string in your website
  // String to character codes
  let charCodes = stringToCharCodes(string);
  // Obfuscate function (+1)
  let obfCharCodes = charCodes.map(e => e + 1);
  // Character codes back to string
  // Use spread operator ...
  return String.fromCharCode(...obfCharCodes);
}
<h1>Obfuscate Email And Phone</h1>
<p>Scrapers without Javascript will not be able to harvest your personal data.</p>
<ul>
<li><a id="email">Mail</a></li>
<li><a id="whatsapp">WhatsApp</a></li>
</ul>
The question is hard to answer, because there is no absolute truth, but let me try it.
You'll never get your email hidden 100% securely. Anything that renders the email address in a way that the user can read it, can also be rendered by sophisticated email scraper programs.
Once we accept that, what remains is the challenge to find a reasonable balance between the effort to hide the email address and the damage caused by a scraped email address.
In my experience, obfuscating the email and the href=mailto tag using html character encoding for a few characters is extremely simple but still effective in most cases. In addition to that, it renders without Javascript.
Example:
peter.pan@neverland.org
may become something like
peter.pan&#64;neverland.org
It's supposedly even enough to hide the mailto: and the @.
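A minimal sketch of the character-encoding idea, run once to produce the markup you paste into the page. The helper name toHtmlEntities is made up, and encoding every character (rather than just a few) is a choice made here for simplicity.

```javascript
// Encodes every character as a decimal HTML character reference.
function toHtmlEntities(text) {
  return [...text].map((ch) => `&#${ch.codePointAt(0)};`).join('');
}

toHtmlEntities('a@b'); // '&#97;&#64;&#98;'
```

Browsers decode character references in both text and attribute values, so the encoded string can be used in the href as well as in the visible link text.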
I would guess that, because there are so many email addresses that can be collected too easily, email scrapers don't tend to use a lot of highly sophisticated techniques for that purpose. It's just not necessary.
Remember, however good you try to hide your email address on a publicly accessible website, if it's in one of the many address leakages, you've lost anyway. I use custom email addresses for different services, and only for those services, and still I get spam sent to some of these addresses, so I'm sure they were leaked in some way.
With regard to your approach, I'd say yes, it's good enough.
I've been searching around for a simple-lightweight hashing algorithm for JavaScript. I did find this numerically-based answer on Stack Overflow here.
Unfortunately, I am unable to use this since it's numerically based and I'll need to use this hash as a unique index elsewhere in my code. Often this function returns negative numbers and that would be a big no-no (try 'hello world!'.hashCode() from the snippet linked above to see what I mean).
I've been tempted to use the MD5 hashing libraries out there for JS, but they're simply too bulky for my purpose, and encryption libraries (such as this) are overkill.
It's worth noting that the information within this hash isn't sensitive in any way and it wouldn't necessarily matter if it was decrypted. The purpose of this function would be to simply generate fixed-length output data that acts as a shortened reference to the original data that I would pass in.
Any help, tips and comments are much appreciated :)
The solution proposed by Kooilnc, to use the absolute value, should do the trick for you. However, if you want to use a hashing function to generate a reference, I assume that the reference should be unique, so that it matches exactly the element it was generated from. If this is the case, be aware of collisions: a hashing function can produce the same hash for different original messages, and we call this a collision. If I remember correctly, SHA-1 is also available for JavaScript and is not all that bulky. Good luck.
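As an alternative to taking the absolute value, here is a minimal sketch that reinterprets the 32-bit result as unsigned and prints it as fixed-width hex. The function name hashToHex is made up, and the hash itself is assumed to be the common 31-multiplier string hash. Unlike Math.abs, which folds h and -h onto the same value, the unsigned shift keeps all 32 bits distinct; collisions between different strings are still possible.

```javascript
// 32-bit string hash rendered as an unsigned, fixed-width hex string.
function hashToHex(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    // (hash << 5) - hash equals hash * 31; "| 0" keeps the value in 32-bit range.
    hash = ((hash << 5) - hash + str.charCodeAt(i)) | 0;
  }
  // ">>> 0" reinterprets the signed result as unsigned, so it is never negative.
  return (hash >>> 0).toString(16).padStart(8, '0');
}
```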
I am unable to use this since it's numerically based and I'll need to use this hash as a unique index elsewhere in my code.
Hash functions are normally numerically based and are rarely perfect (produce unique keys). I think you need something different:
function GuidGen()
{
  this.items = {};
  this.size = 0;
}
GuidGen.prototype.get = function(str)
{
  // hasOwnProperty avoids false hits on inherited keys such as "toString"
  if (!Object.prototype.hasOwnProperty.call(this.items, str))
  {
    this.items[str] = this.size++;
  }
  return this.items[str];
};
// usage:
var id = new GuidGen();
id.get("hello world"); // 0
id.get("spam"); // 1
id.get("eggs"); // 2
id.get("hello world"); // 0
One problem:
I want to process a string (str) so that any parenthesised digits (matched by rgx) are replaced by values taken from the appropriate place in an array (sub):
var rgx = /\((\d+)\)/,
    str = "this (0) a (1) sentence",
    sub = [
      "is",
      "test"
    ],
    result;
The result, given the variables declared above, should be 'this is a test sentence'.
Two solutions:
This works:
var mch,
    parsed = '',
    remainder = str;
while (mch = rgx.exec(remainder)) { // Not JSLint approved.
  parsed += remainder.substring(0, mch.index) + sub[mch[1]];
  remainder = remainder.substring(mch.index + mch[0].length);
}
result = (parsed) ? parsed + remainder : str;
But I thought the following code would be faster. It has fewer variables, is much more concise, and uses an anonymous function expression (or lambda):
result = str.replace(rgx, function() {
  return sub[arguments[1]];
});
This works too, but I was wrong about the speed; in Chrome it's surprisingly (~50%, last time I checked) slower!
...
Three questions:
Why does this process appear to be slower in Chrome and (for example) faster in Firefox?
Is there a chance that the replace() method will be faster compared to the while() loop given a bigger string or array? If not, what are its benefits outside Code Golf?
Is there a way to optimise this process, making it both more efficient and as fuss-free as the functional second approach?
I'd welcome any insights into what's going on behind these processes.
...
[Fo(u)r the record: I'm happy to be called out on my uses of the words 'lambda' and/or 'functional'. I'm still learning about the concepts, so don't assume I know exactly what I'm talking about and feel free to correct me if I'm misapplying the terms here.]
Why does this process appear to be slower in Chrome and (for example) faster in Firefox?
Because it has to call a (non-native) function, which is costly. Firefox's engine might be able to optimize that away by recognizing and inlining the lookup.
Is there a chance that the replace() method will be faster compared to the while() loop given a bigger string or array?
Yes, it has to do less string concatenation and fewer assignments, and, as you said, it has fewer variables to initialize. Still, these are only assumptions, and only testing can confirm them (also have a look at http://jsperf.com/match-and-substitute/4 for other snippets; for example, you can see Opera optimizing the lambda-replace2, which does not use arguments).
If not, what are its benefits outside Code Golf?
I don't think code golf is the right term. Software quality is about readability and comprehensibility, and in those terms the conciseness and elegance (which is subjective though) of the functional code are the reasons to use this approach (actually I've never seen a replace with exec, substring and re-concatenation).
Is there a way to optimise this process, making it both more efficient and as fuss-free as the functional second approach?
You don't need that remainder variable. The rgx has a lastIndex property which will automatically advance the match through str.
Your while loop with exec() is slightly slower than it should be, since you are doing extra work (substring) as you use exec() on a non-global regex. If you need to loop through all matches, you should use a while loop on a global regex (g flag enabled); this way, you avoid doing extra work trimming the processed part of the string.
var rgR = /\((\d+)\)/g;
var mch,
    result = '',
    lastAppend = 0;
while ((mch = rgR.exec(str)) !== null) {
  result += str.substring(lastAppend, mch.index) + sub[mch[1]];
  lastAppend = rgR.lastIndex;
}
result += str.substring(lastAppend);
This factor doesn't change the performance disparity between different browsers, though.
It seems the performance difference comes from the browsers' implementations. Since I am unfamiliar with those implementations, I cannot say where the difference comes from.
In terms of power, exec() and replace() have the same power. This includes the cases where you don't use the returned value from replace(). Example 1. Example 2.
replace() method is more readable (the intention is clearer) than a while loop with exec() if you are using the value returned by the function (i.e. you are doing real replacement in the anonymous function). You also don't have to reconstruct the replaced string yourself. This is where replace is preferred over exec(). (I hope this answers the second part of question 2).
I would imagine exec() to be used for the purposes other than replacement (except for very special cases such as this). Replacement, if possible, should be done with replace().
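For reference, a minimal sketch of the replace() form with named callback parameters instead of arguments. It assumes a global copy of the pattern (rgxAll is a name made up here), because with a non-global regex replace() only substitutes the first match.

```javascript
const rgxAll = /\((\d+)\)/g; // same pattern as rgx, plus the g flag
const str = 'this (0) a (1) sentence';
const sub = ['is', 'test'];

// The callback receives the whole match first, then each capture group.
const result = str.replace(rgxAll, (match, index) => sub[index]);
// result === 'this is a test sentence'
```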
Optimization is only necessary if performance degrades badly on actual input. I don't have any optimization to show, since the only 2 possible options are already analyzed, with contradicting performance between 2 different browsers. This may change in the future, but for now, you can choose the one that has the better worst-case performance across browsers.
I'm trying to sort an array of objects based on the objects' name property. Some of the names start with 'Æ', and I'd like for them to be sorted as though they were 'Ae'. My current solution is the following:
myArray.sort(function(a, b) {
  var aName = a.name.replace(/Æ/gi, 'Ae'),
      bName = b.name.replace(/Æ/gi, 'Ae');
  return aName.localeCompare(bName);
});
I feel like there should be a better way of handling this without having to manually replace each and every special character. Is this possible?
I'm doing this in Node.js if it makes any difference.
There is no simpler way. Unfortunately, even the way described in the question is too simple, at least if portability is of any concern.
The localeCompare method is by definition implementation-dependent, and it usually depends on the UI language of the underlying operating system, though it may also differ between browsers (or other JavaScript implementations) on the same computer. It can be hard to find any documentation on it, so even if you aim at writing non-portable code, you might need to do a lot of testing to see which collation order is applied. Cf. Sorting strings is much harder than you thought!
So to have a controlled and portable comparison, you need to code it yourself, unless you are lucky enough to find someone else’s code that happens to suit your needs. On the positive side, the case conversion methods are one of the few parts of JavaScript that are localization-ready: they apply Unicode case mapping rules, so e.g. 'æ'.toUpperCase() yields Æ in any implementation.
In general, sorting strings requires a complicated function that applies specific sorting rules as defined for a language or by some other rules, such as the Pan-European sorting rules (intended for multilingual content). But if we can limit ourselves to sorting rules that deal with just a handful of letters in addition to ASCII, we can use code like the following simplified sorting for German (extract from my book Going Global with JavaScript and Globalize.js):
String.prototype.removeUmlauts = function () {
  return this.replace(/Ä/g,'A').replace(/Ö/g,'O').replace(/Ü/g,'U');
};
function alphabetic(str1, str2) {
  var a = str1.toUpperCase().removeUmlauts();
  var b = str2.toUpperCase().removeUmlauts();
  return a < b ? -1 : a > b ? 1 : 0;
}
You could add other mappings, like replace(/Æ/gi, 'Ae'), to this, after analyzing the characters that may appear and deciding how to deal with them. Removing diacritic marks (e.g. mapping É to E) is simplistic but often good enough, and surely better than leaving it to implementations to decide whether É is somewhere after Z. And at least you would get consistent results across implementations, and you would see what things go wrong and need fixing, instead of waiting for other users to complain that your code sorts all wrong (in their environment).
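One way to extend the mappings without chaining more replace() calls is a lookup table, sketched below. SORT_MAP and sortKey are names made up for this example, the table holds only a few illustrative letters, and the code assumes precomposed characters (not a base letter plus a combining mark).

```javascript
// Hypothetical mapping table; extend it after deciding how each letter should sort.
const SORT_MAP = { 'Ä': 'A', 'Ö': 'O', 'Ü': 'U', 'É': 'E', 'Æ': 'AE' };

// Upper-cases the string, then replaces every mapped letter with its sort equivalent.
function sortKey(str) {
  return str.toUpperCase().replace(/[ÄÖÜÉÆ]/g, (ch) => SORT_MAP[ch]);
}

function alphabetic(str1, str2) {
  const a = sortKey(str1);
  const b = sortKey(str2);
  return a < b ? -1 : a > b ? 1 : 0;
}

const names = [{ name: 'Zed' }, { name: 'Æther' }, { name: 'Adam' }, { name: 'Afro' }];
names.sort((x, y) => alphabetic(x.name, y.name));
// order: Adam, Æther, Afro, Zed
```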
Basically, I have a user input field where a user can enter a number. They would like to also be able to enter equations in the input field as well.
Something like "874.45 * 0.825 + 4000" and have that converted to its real numeric value.
All the solutions I found point to the dreaded eval() method. Especially with this being a user entered field, I'm concerned about just running eval("874.45 * 0.825 + 4000") and hoping to get a number out the back end.
I suppose I could do a web service call back to the server (ASP.NET), but I'm afraid a slight delay will create some frustration from the user.
Does anyone know of either a good technique or existing libraries?
What you really need here is an "expression parser", because you're trying to allow users to express their values using a small domain-specific language.
The basic mechanics work like this:
Tokenize their expression into operators and operands.
Based on the order of operations (e.g, where multiplication is evaluated with higher priority than addition), push the operators and operands onto a stack.
Pop the operands from the stack and push intermediate results back onto the stack. Repeat until the stack is empty.
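The steps above can be sketched in plain JavaScript with two stacks. This is a minimal illustration, not a full parser: the names tokenize, applyTop and evaluate are made up, only + - * / and parentheses are handled, and unary minus is not supported.

```javascript
// Operator precedence: higher binds tighter. All four operators are left-associative.
const PRECEDENCE = { '+': 1, '-': 1, '*': 2, '/': 2 };

// Splits the input into number tokens and operator/parenthesis tokens.
function tokenize(expr) {
  const tokens = expr.match(/\d+(?:\.\d+)?|[-+*/()]/g) || [];
  // Reject input containing anything the token pattern did not consume.
  if (tokens.join('') !== expr.replace(/\s+/g, '')) {
    throw new SyntaxError('Unexpected character in expression');
  }
  return tokens;
}

// Pops one operator and two operands, then pushes the result.
function applyTop(operators, operands) {
  const op = operators.pop();
  const right = operands.pop();
  const left = operands.pop();
  if (left === undefined || right === undefined) throw new SyntaxError('Malformed expression');
  if (op === '+') operands.push(left + right);
  else if (op === '-') operands.push(left - right);
  else if (op === '*') operands.push(left * right);
  else operands.push(left / right);
}

function evaluate(expr) {
  const operators = [];
  const operands = [];
  for (const token of tokenize(expr)) {
    if (token === '(') {
      operators.push(token);
    } else if (token === ')') {
      while (operators.length && operators[operators.length - 1] !== '(') {
        applyTop(operators, operands);
      }
      if (operators.pop() !== '(') throw new SyntaxError('Unbalanced parentheses');
    } else if (token in PRECEDENCE) {
      // Apply pending operators of equal or higher precedence first.
      while (operators.length && PRECEDENCE[operators[operators.length - 1]] >= PRECEDENCE[token]) {
        applyTop(operators, operands);
      }
      operators.push(token);
    } else {
      operands.push(parseFloat(token));
    }
  }
  while (operators.length) {
    if (operators[operators.length - 1] === '(') throw new SyntaxError('Unbalanced parentheses');
    applyTop(operators, operands);
  }
  if (operands.length !== 1) throw new SyntaxError('Malformed expression');
  return operands[0];
}
```

Because nothing is ever handed to eval(), input such as a function call fails in tokenize instead of being executed.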
I've done this a gazillion times for different "little languages" throughout my projects. But never in javascript. So I can't directly recommend a suitable parsing library.
But a quick googling reveals "PEG.js". Check it out here:
http://pegjs.majda.cz/
All of the examples in their documentation are for exactly the kind of "expression parser" you're trying to build.
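Simply multiply it by 1 and it will force JavaScript to treat it as a number from then on. Note that this only converts a string that already is a plain number; a string containing an expression, such as '2 + 3', becomes NaN.
E.g.
var num = '2343.34' * 1;
num = input * 1;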
And what is so wrong about the eval in this case?
As for me it perfectly fits in your task. If you want to shield its execution context then you can define function like:
function calc(str) {
  var window = null, self = null, document = null;
  // other globals like: $ = null, jQuery = null, etc.
  try { return eval(str); } catch(e) {...}
}
and use it where you need to interpret the string. Simple and effective.
I think eval can pose a lesser security risk if you parse the resulting string and validate its content to be only digits and operators and execute the evaluation by faking the outer scope variables like document etc as 'var document = null'.
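A minimal sketch of the validate-then-evaluate idea. The name calcValidated and the whitelist pattern are made up here, and the Function constructor is swapped in for eval so that the code runs in global scope without seeing local variables. Treat it as an illustration of the idea rather than a vetted sandbox.

```javascript
// Accepts only digits, whitespace, decimal points, parentheses and + - * /.
const ARITHMETIC_ONLY = /^[\d\s.()+\-*\/]+$/;

function calcValidated(str) {
  if (!ARITHMETIC_ONLY.test(str)) {
    throw new SyntaxError('Only numbers and + - * / ( ) are allowed');
  }
  // No identifier can appear in str after the check above, so nothing
  // such as document or window can be referenced by name.
  const value = Function('"use strict"; return (' + str + ');')();
  // Input like "/1/" passes the whitelist but evaluates to a RegExp, so check the type.
  if (typeof value !== 'number') throw new SyntaxError('Not an arithmetic expression');
  return value;
}
```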