Simple string-based one-way hashing algorithm for JavaScript

Simple string-based one-way hashing algorithm for JavaScript - javascript

I've been searching around for a simple-lightweight hashing algorithm for JavaScript. I did find this numerically-based answer on Stack Overflow here.
Unfortunately, I am unable to use this since it's numerically based and I'll need to use this hash as a unique index elsewhere in my code. Often this function returns negative numbers and that would be a big no-no (try 'hello world!'.hashCode() from the snippet linked above to see what I mean).
I've been tempted to use the md5 hashing libraries out there for JS but they're simply to bulky for my purpose and encryption libraries (such as this) are overkill.
It's worth noting that the information within this hash isn't sensitive in anyway and it wouldn't necessarily matter if it was decrypted. The purpose of this function would be to simply generate fixed-length output data that acts as a shortened reference to the original data that I would pass in.
Any help, tips and comments are much appreciated :)

The solution proposed by Kooilnc, to use the absolute value, should do the tric for you. However, if you want to use a hashing function to generate a reference, i assume that the reference you get should be unique as to match the exact element it was generated from. If this is the case, be aware of collisions. Hashing function can create hashes that are similar even though the original messages are different and we call this a collision. If i remember correctly, SHA-1 is also available for java script and is not all that bulk. Good luck

I am unable to use this since it's numerically based and I'll need to use this hash as a unique index elsewhere in my code.
Hash functions are normally numerically based and are rarely perfect (produce unique keys). I think you need something different:
function GuidGen()
{
this.items = {};
this.size = 0;
}
GuidGen.prototype.get = function(str)
{
if (!(str in this.items))
{
this.items[str] = this.size++;
}
return this.items[str];
}
// usage:
id = new GuidGen();
id.get("hello world"); // 0
id.get("spam"); // 1
id.get("eggs"); // 2
id.get("hello world"); // 0

Related

Whats up with eval()?

I've seen a lot about how the eval() function is evil; not a wise choice for HTML/JavaScript programming. I would like to use a function where I can pass in a string to have it read as a variable name, and eval() seems to do just that, but I don't want to use a destructive function.
From what I understand, the issue with eval() is that it can read third-party input as actual code, which opens a door for malicious activity. I have a map element that keeps track of location using strings for the location names. I also have large blocks of text assigned to variables so I can pull up a description of the current location easily. This seems like an acceptable time to use eval, as the strings that I would be passing in would be provided by other parts of the code. Is this a fair judgement, or is there some other function that I should be using?

(Moving my comment as an answer)
An easy way to get around that is to save whatever variable you're interested in accessing in a javascript Object (i.e. key-value pairs), and access them via indexing. This simple use case doesn't need eval.

From what I understand, the issue with eval() is that it can read third-party input as actual code, which opens a door for malicious activity.
This is not the only reason. One could argue that by today's standards the performance of JavaScript code is negligible.
However, one has to take into account that eval() actually invokes the JavaScript interpreter which is significantly slower than writing the code upfront. ¹
I would like to use a function where I can pass in a string to have it read as a variable name, and eval() seems to do just that, but I don't want to use a destructive function.
This does not warrant the use of eval(). As mentioned in the comments, you can achieve this with keeping track of variables in an object:
let vars = {}
vars["some_variable_name"] = "test"
const var_name = "some_variable_name"
console.log(vars[var_name]) // "test"
as the strings that I would be passing in would be provided by other parts of the code
Might be, but what if in the future some piece of that code actually does process some user input?
Not worth the performance penality and obvious security risk in my opinion.

For a simple way to use a variable name as a string is to use an Object (called a dictionary or map in some languages)
const stringToGrade = {
"freshman": 9,
"sophomore": 10,
"junior": 11,
"senior": 12,
};
document.querySelector("#btn").addEventListener("click", () => {
const asString = document.querySelector("#inp").value.toLowerCase();
const grade = stringToGrade[asString];
console.log(`Your grade number is ${grade}`);
});
<input id="inp" placeholder="Enter your grade level (freshman, sophomore, etc.)" />
<button id="btn">Submit</button>

What is the default “tag” function for template literals?

What is the name of the native function that handles template literals?
That is, I know that when you write tag`Foo ${'bar'}.`;, that’s just syntactic sugar for tag(['Foo ', '.'], 'bar');.¹
But what about just `Foo ${'bar'}.`;? I can’t just “call” (['Foo ', '.'], 'bar');. If I already have arguments in that form, what function should I pass them to?
I am only interested in the native function that implements the template literal functionality. I am quite capable of rolling my own, but the purpose of this question is to avoid that and do it “properly”—even if my implementation is a perfect match of current native functionality, the native functionality can change and I want my usage to still match. So answers to this question should take on one of the following forms:
The name of the native function to use, ideally with links to and/or quotes from documentation of it.
Links to and/or quotes from the spec that defines precisely what the implementation of this function is, so that if I roll my own at least I can be sure it’s up to the (current) specifications.
A backed-up statement that the native implementation is unavailable and unspecified. Ideally this is backed up by, again, links to and/or quotes from documentation, but if that’s unavailable, I’ll accept other sources or argumentation that backs this claim up.
Actually, the first argument needs a raw property, since it’s a TemplateStringsArray rather than a regular array, but I’m skipping that here for the sake of making the example more readable.
Motivation
I am trying to create a tag function (tag, say) that, internally, performs the default template literal concatenation on the input. That is, I am taking the TemplateStringsArray and the remaining arguments, and turning them into a single string that has already had its templating sorted out. (This is for passing the result into another tag function, otherTag perhaps, where I want the second function to treat everything as a single string literal rather than a broken up template.)
For example, tag`Something ${'cooked'}.`; would be equivalent to otherTag`Something cooked.`;.
My current approach
The definition of tag would look something like this:
function tag(textParts, ...expressions) {
const cooked = // an array with a single string value
const raw = // an array with a single string value
return otherTag({ ...cooked, raw });
}
Defining the value of raw is fairly straightforward: I know that String.raw is the tag function I need to call here, so const raw = [String.raw(textParts.raw, ...expressions)];.
But I cannot find anywhere on the internet what function I would call for the cooked part of it. What I want is, if I have tag`Something ${'cooked'}.`;, I want const cooked = `Something ${cooked}.`; in my function. But I can’t find the name of whatever function accomplishes that.
The closest I’ve found was a claim that it could be implemented as
const cooked = [expressions.map((exp, i) => textParts[i] + exp).join('')];
This is wrong—textParts may be longer than expressions, since tag`Something ${'cooked'}.`; gets ['Something ', '.'] and ['cooked'] as its arguments.
Improving this expression to handle that isn’t a problem:
const cooked = [
textParts
.map((text, i) => (i > 0 ? expressions[i-1] : '') + text)
.join(''),
];
But that’s not the point—I don’t want to roll my own here and risk it being inconsistent with the native implementation, particularly if that changes.

The name of the native function to use, ideally with links to and/or quotes from documentation of it.
There isn't one. It is syntax, not a function.
Links to and/or quotes from the spec that defines precisely what the implementation of this function is, so that if I roll my own at least I can be sure it’s up to the (current) specifications.
Section 13.2.8 Template Literals of the specification explains how to process the syntax.

o[str] vs (o => o.str)

I'm coding a function that takes an object and a projection to know on which field it has to work.
I'm wondering if I should use a string like this :
const o = {
a: 'Hello There'
};
function foo(o, str) {
const a = o[str];
/* ... */
}
foo(o, 'a');
Or with a function :
function bar(o, proj) {
const a = proj(o);
/* ... */
}
bar(o, o => o.a);
I think V8 is creating classes with my javascript objects. If I use a string to access dynamically a field, will it be still able to create a class with my object and not a hashtable or something else ?

V8 developer here. The answer to "which pattern should I use?" is probably "it depends". I can think of scenarios where the one or the other would (likely) be (a bit) faster, depending on your app's behavior. So I would suggest that you either try both (in real code, not a microbenchmark!) and measure yourself, or simply pick whichever you prefer and/or makes more sense in the larger context, and not worry about it until profiling shows that this is an actual bottleneck that's worth spending time on.
If the properties are indeed known at the call site, then the fastest option is probably to load the property before the call:
function baz(o, str, a) {
/* ... */
}
baz(o, "a", o.a);
I realize that if things actually were this simple, you probably wouldn't be asking this question; if that assumption is true then this is a great example for how simplifications in microbenchmarks can easily change what the right answer is.
The answer to the classes question is that this decision has no impact on how V8 represents your objects under the hood -- that mostly depends on how you modify your objects, not on how you read from them. Also, for the record:
every object has a "hidden class"; whether or not it uses hash table representation is orthogonal to that
whether hash table mode or shape-tracking mode is better for any given object is one of the things that depend on the use case, which is precisely why both modes exist. I wouldn't worry too much about it, unless you know (from profiling) that it happens to be a problem in your case (more often than not, V8's heuristics get it right; manual intervention is rarely necessary).

Sorting Special Characters in Javascript (Æ)

I'm trying to sort an array of objects based on the objects' name property. Some of the names start with 'Æ', and I'd like for them to be sorted as though they were 'Ae'. My current solution is the following:
myArray.sort(function(a, b) {
var aName = a.name.replace(/Æ/gi, 'Ae'),
bName = b.name.replace(/Æ/gi, 'Ae');
return aName.localeCompare(bName);
});
I feel like there should be a better way of handling this without having to manually replace each and every special character. Is this possible?
I'm doing this in Node.js if it makes any difference.

There is no simpler way. Unfortunately, even the way described in the question is too simple, at least if portability is of any concern.
The localeCompare method is by definition implementation-dependent, and it usually depends on the UI language of the underlying operating system, though it may also differ between browsers (or other JavaScript implementations) in the same computer. It can be hard to find any documentation on it, so even if you aim at writing non-portable code, you might need to do a lot of testing to see which collation order is applied. Cf. to Sorting strings is much harder than you thought!
So to have a controlled and portable comparison, you need to code it yourself, unless you are lucky enough to find someone else’s code that happens to suit your needs. On the positive side, the case conversion methods are one of the few parts of JavaScript that are localization-ready: they apply Unicode case mapping rules, so e.g. 'æ'.toUpperCase() yields Æ in any implementation.
In general, sorting strings requires a complicated function that applies specific sorting rules as defined for a language or by some other rules, such as the Pan-European sorting rules (intended for multilingual content). But if we can limit ourselves to sorting rules that deal with just a handful of letters in addition to Ascii, we can use code like the following simplified sorting for German (extract from by book Going Global with JavaScript and Globalize.js):
String.prototype.removeUmlauts = function () {
return this.replace(/Ä/g,'A').replace(/Ö/g,'O').replace(/Ü/g,'U');
};
function alphabetic(str1, str2) {
var a = str1.toUpperCase().removeUmlauts();
var b = str2.toUpperCase().removeUmlauts();
return a < b ? -1 : a > b ? 1 : 0;
}
You could adds other mappings, like replace(/Æ/gi, 'Ae'), to this, after analyzing the characters that may appear and deciding how to deal with them. Removing diacritic marks (e.g. mapping É to E) is simplistic but often good enough, and surely better than leaving it to implementations to decide whether É is somewhere after Z. And at least you would get consistent results across implementations, and you would see what things go wrong and need fixing, instead of waiting for other users complain that your code sorts all wrong (in their environment).

Implementing a complicated decision table in JavaScript

Here's an implementation details question for JavaScript gurus.
I have a UI with a number of fields in which the values of the fields depend in a complicated fashion on the values of seven bits of inputs. Exactly what should be displayed for any one of the possible 128 values that is changing regularly as users see more of the application?
Right now, I've for this being implemented as a decision tree through an if-then-else comb, but it's brittle under the requirements changes and sort of hard to get right.
One implementation approach I've thought about is to make an array of values from 0x0 to 0x7F and then store a closure at each location --
var tbl; // initialize it with the values
...
tbl[0x42] = function (){ doAThing(); doAnotherThing(); }
and then invoke them with
tbl[bitsIn]();
This, at least makes the decision logic into a bunch of assignments.
Question: is there a better way?
(Update: holy crap, how'd that line about 'ajax iphone tags' get in there? No wonder it was a little puzzling.)
Update
So what happened? Basically I took a fourth option, although similar to the one I've checked. The logic was sufficiently complex that I finally built a Python program to generate a truth table in the server (generating Groovy code, in fact, the host is a Grails application) and move the decision logic into the server completely. Now the JavaScript side simply interprets a JSON object that contains the values for the various fields.
Eventually, this will probably go through one more iteration, and become data in a database table, indexed by the vector of bits.
The table driven part certainly came out to be the way to go; there have already been a half dozen new changes in the specific requirements for display.

I see two options...
Common to both solutions are the following named functions:
function aThing() {}
function anotherThing() {}
function aThirdThing() {}
The switch way
function exec(bits) {
switch(bits) {
case 0x00: aThing(); anotherThing(); break;
case 0x01: aThing(); anotherThing(); aThirdThing(); break;
case 0x02: aThing(); aThirdThing(); break;
case 0x03: anotherThing(); aThirdThing(); break;
...
case 0x42: aThirdThing(); break;
...
case 0x7f: ... break;
default: throw 'There is only 128 options :P';
}
}
The map way
function exec(bits) {
var actions = map[bits];
for(var i=0, action; action=actions[i]; i++)
action();
}
var map = {
0x00: [aThing, anotherThing],
0x01: [aThing, anotherThing, aThirdThing],
0x02: [aThing, aThirdThing],
0x03: [anotherThing, aThirdThing],
...
0x42: [aThirdThing],
...
};
in both cases you'd call
exec(0x42);

Since the situation (as you have described) is so irregular, there doesn't seem to be a better way. Although, I can suggest an improvement to your jump table. You mentioned that you have errors and duplicates. So instead of explicitly assigning them to a closure, you can assign them to named functions so that you don't have to duplicate the explicit closure.
var doAThingAndAnother = function (){ doAThing(); doAnotherThing(); }
var tbl; // initialize it with the values
...
tbl[0x42] = doAThingAndAnother;
tbl[0x43] = doAThingAndAnother;
Not that much of an improvement, but it's the only thing I could think of! You seem to have covered most of the other issues. Since it looks like the requirements change so much, I think you might have to forgo elegance and have a design that is not as elegant, but is still easy to change.

Have you considered generating your decision tree on the server rather than writing it by hand? Use whatever representation is clean, easy to work with, and modify and then compile that to ugly yet efficient javascript for the client side.
A decision tree is fairly easy to represent as data and it is easy to understand and work with as a traditional tree data structure. You can store said tree in whatever form makes sense for you. Validating and modifying it as data should also be straight forward.
Then, when you need to use the decision tree, just compile/serialize it to JavaScript as a big if-the-else, switch, or hash mess. This should also be fairly straight forward and probably a lot easier than trying to maintain a switch with a couple hundred elements.

I've got a rough example of a JavaScript decision tree tool if you want to take a look:
http://jsfiddle.net/danw/h8CFe/

Develop Reference

JavaScript is the programming language of the Web.

Simple string-based one-way hashing algorithm for JavaScript - javascript

Related

Whats up with eval()?

What is the default “tag” function for template literals?

o[str] vs (o => o.str)

Sorting Special Characters in Javascript (Æ)

Implementing a complicated decision table in JavaScript

Categories

Resources