I found this code and wondered what the reasons behind this coding style are.
Why define identifiers like _0x3384x4, which are barely readable for a human?
Or write object properties like this:
{
"\x63\x68\x61\x72\x73": ' \uD83D\uDE23 ',
"\x63\x6C\x61\x73\x73": '_1az _1a- _2gc',
"\x6E\x61\x6D\x65": 'Bi\u1EC3u t\u01B0\u1EE3ng vui 18'
}
This could be written like this, couldn't it?
{ chars: " 😣 ", class: "_1az _1a- _2gc", name: "Biểu tượng vui 18" }
Is it because some old computers cannot display these characters? Is it a kind of uglifying, meant to protect the JavaScript code?
What kind of format is 0x7892x8? It looks like hex, but what does it represent? eval("0x7892") evaluates to 30866, but does 0x7892x8 mean the 8th version of 30866? That doesn't make sense to me.
It is not a coding style. It is obfuscation.
From Wikipedia:
In software development, obfuscation is the deliberate act of creating
obfuscated code, i.e. source or machine code that is difficult for
humans to understand. Like obfuscation in natural language, it may use
needlessly roundabout expressions to compose statements.
Programmers may deliberately obfuscate code to conceal its purpose
(security through obscurity) or its logic, in order to prevent
tampering, deter reverse engineering, or as a puzzle or recreational
challenge for someone reading the source code.
Programs known as obfuscators transform readable code into obfuscated
code using various techniques.
There are many tools, called obfuscators, that obfuscate code. Here is a JavaScript obfuscator, for example:
http://www.jsobfuscate.com/
As you already guessed correctly, it is hexadecimal. For example, \x63 means 99 in decimal.
Now we take a look at the code table:
http://www.codetable.net/
And we see that 99 decimal represents the character c. So \x63 is basically c.
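To see that these escapes are only another spelling of ordinary text, the small sketch below compares the escaped keys from the question with their plain equivalents. The variable names are made up for illustration. It also shows that an identifier such as _0x3384x4 is just a legal variable name that begins with an underscore, not a hexadecimal number.

```javascript
// "\x63" is a hex escape inside a string literal: 0x63 is 99, the code for "c".
const escapedKey = "\x63\x68\x61\x72\x73";
console.log(escapedKey === "chars"); // true

// The obfuscated object from the question has perfectly ordinary keys.
const obfuscated = {
  "\x63\x68\x61\x72\x73": " \uD83D\uDE23 ",
  "\x63\x6C\x61\x73\x73": "_1az _1a- _2gc",
  "\x6E\x61\x6D\x65": "Bi\u1EC3u t\u01B0\u1EE3ng vui 18"
};
console.log(Object.keys(obfuscated).join(",")); // chars,class,name

// \uD83D\uDE23 is a surrogate pair that encodes one code point, U+1F623.
const emojiCodePoint = obfuscated.chars.trim().codePointAt(0);
console.log(emojiCodePoint.toString(16)); // 1f623

// An identifier may start with "_" and contain letters and digits,
// so _0x3384x4 is simply a hard-to-read variable name.
const _0x3384x4 = String.fromCharCode(0x63);
console.log(_0x3384x4); // c
```

The engine sees exactly the same object either way; only a human reader is slowed down.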
I'm just wondering whether there is a difference in performance from removing the spaces before and after equals signs, as in these two code snippets.
first
int i = 0;
second
int i=0;
I'm using the first one, but my friend who is learning HTML/JavaScript told me that my coding is inefficient. Is that true in HTML/JavaScript? And is it a huge bump in performance? Is it the same in C++/C# and other programming languages? And about indentation, he said 3 spaces are better than a tab. But I'm already used to coding like this, so I just want to know if he is correct.
Your friend is a bit misguided.
The extra spaces in the code will make a small difference in the size of the JS file which could make a small difference in the download speed, though I'd be surprised if it was noticeable or meaningful.
The extra spaces are unlikely to make a meaningful difference in the time to parse the file.
Once the file is parsed, the extra spaces will not make any difference in execution speed since they are not part of the parsed code.
If you really want to optimize download or parse speed, the way to do that is to write your code in the most readable fashion possible for best maintainability and then use a minimizer for the deployed code and this is a standard practice by many web sites. This will give you the best of both worlds - maintainable, readable code and minimum deployed size.
A minimizer will remove all unnecessary spacing, shorten the names of variables, remove comments, collapse lines, etc... all designed to make the deployed code as small as possible without changing the run-time meaning of the code at all.
C++ is a compiled language. As such, only the compiler that the developer uses sees any extra spaces (same with comments). Those spaces are gone once the code has been compiled into native code which is what the end-user gets and runs. So, issues about spaces between elements in a line are simply not applicable at all for C++.
Javascript is an interpreted language. That means the source code is downloaded to the browser and the browser then parses the code at runtime into some opcode form that the interpreter can run. The spaces in Javascript will be part of the downloaded code (if you don't use a minimizer to remove them), but once the code is parsed, those extra spaces are not part of the run-time performance of the code. Thus, the spaces could have a small influence on the download time and perhaps an even smaller influence on the parse time (though I'm guessing unlikely to be measurable or meaningful). As I said above, the way to optimize this for Javascript is to use spaces to enhance readability in the source code and then run a minimizer over the code to generate a deployed version of the code to minimize the deployed size of the file. This preserves maximum readability and minimizes download size.
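As a rough illustration of the point that whitespace disappears at parse time, the sketch below parses two spellings of the same statements with new Function and compares them. The source strings and variable names are invented for this example, and it says nothing about timing; it only shows that the two versions behave identically while differing in size.

```javascript
// Two spellings of the same program; only the whitespace differs.
const spacedSource = "var i = 0; i = i + 1; return i;";
const compactSource = "var i=0;i=i+1;return i;";

// new Function parses each string into a function body.
const runSpaced = new Function(spacedSource);
const runCompact = new Function(compactSource);

// Same behavior after parsing...
console.log(runSpaced() === runCompact()); // true

// ...and the only difference is the number of characters to download.
console.log(spacedSource.length > compactSource.length); // true
```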
There is little (javascript) to no (c#, c++, Java) difference in performance. In the compiled languages in particular, the source code compiles to the exact same machine code.
Using spaces instead of tabs can be a good idea, but not because of performance. Rather, if you aren't careful, use of tabs can result in "tab rot", where there are tabs in some places and spaces in others, and the indentation of the source code depends on your tab settings, making it hard to read.
I have read the question How to test and develop with asm.js?, and the accepted answer gives a link to http://kripken.github.com/mloc_emscripten_talk/#/.
The conclusion of that slide show is that "Statically-typed languages and especially C/C++ can be compiled effectively to JavaScript", so we can "expect the speed of compiled C/C++ to get to just 2X slower than native code, or better, later this year".
But what about non-statically-typed languages, such as regular JavaScript itself? Can it be compiled to asm.js?
Can JavaScript itself be compiled to asm.js?
Not really, because of its dynamic nature. It's the same problem as when trying to compile it to C or even to native code - you actually would need to ship a VM with it to take care of those non-static aspects. At least, such a VM is possible:
js.js is a JavaScript interpreter in JavaScript. Instead of trying to create an interpreter from scratch, SpiderMonkey is compiled into LLVM and then emscripten translates the output into JavaScript.
But if asmjs code runs faster than regular JS, then it makes sense to compile JS to asmjs, no?
No. asm.js is a quite restricted subset of JS that can be easily translated to bytecode. Yet you first would need to break down all the advanced features of JS to that subset for getting this advantage - a quite complicated task imo. But JavaScript engines are designed and optimized to translate all those advanced features directly into bytecode - so why bother about an intermediate step like asm.js? Js.js claims to be around 200 times slower than "native" JS.
And what about non-statically-typed languages in general?
The slideshow talks about that from …Just C/C++? onwards. Specifically:
Dynamic Languages
Entire C/C++ runtimes can be compiled and the original language
interpreted with proper semantics, but this is not lightweight
Source-to-source compilers from such languages to JavaScript ignore
semantic differences (for example, numeric types)
Actually, these languages depend on special VMs to be efficient
Source-to-source compilers for them lose out on the optimizations done in those VMs
In response to the general question "is it possible?" then the answer is that sure, both JavaScript and the asm.js subset are Turing complete so a translation exists.
Whether one should do this and expect a performance benefit is a different question. The short answer is "no, you shouldn't." I liken this to trying to compress a compressed file; yes, it is possible to run the compression algorithm, but in general you should not expect the resulting file to be smaller.
The short answer: The performance cost of dynamically-typed languages comes from the meaning of the code; a statically-typed program with an equivalent meaning would carry the same costs.
To understand this, it is important to understand why asm.js offers a performance benefit at all; or, more generally, why statically-typed languages perform better than dynamically-typed ones. The short answer is "run-time type checking takes time," and a longer answer would include the improved feasibility of optimizing statically-typed code. For example:
function a(x) { return x + 1; }
function b(x) { return x - 1; }
function c(x, y) { return a(x) + b(y); }
If x and y are both known to be integers, I can optimize function c to a couple of machine code instructions. If they could be integers or strings, the optimization problem becomes much harder; I have to treat these as string appends in some cases, and addition in other cases. In particular, there are four possible interpretations of the addition operation that occurs in c; it could be addition, or string append, or two different variants of coerce-to-string-and-append. As you add more possible types, the number of possible permutations grows; in the worst case for a dynamically-typed language, you have k^n possible interpretations of an expression involving n terms, each of which could have any one of k types. In a statically typed language, k=1, so there is always 1 interpretation of any given expression. Because of this, optimizers are fundamentally more efficient at optimizing statically-typed code than dynamically-typed code: There are fewer permutations to consider when searching for opportunities to optimize.
The point here is that when converting from dynamically-typed code to statically-typed code (as you'd be doing when going from JavaScript to asm.js), you have to account for the semantics of the original code. This means the type-checking still occurs (it has just been spelled out as statically-typed code) and all those permutations are still present to stifle the compiler.
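A small sketch may make this concrete. The functions a, b and c are the ones from the example above; cTyped is a hypothetical variant that uses asm.js-style integer coercions (x | 0). It is ordinary JavaScript written in that style, not a validated asm.js module.

```javascript
function a(x) { return x + 1; }
function b(x) { return x - 1; }
function c(x, y) { return a(x) + b(y); }

// The same call site means different things depending on run-time types.
console.log(c(1, 2));   // 3   (integer additions only)
console.log(c("1", 2)); // 111 (a string: "1" + 1 is "11", then "11" + 1)

// Hypothetical asm.js-style version: every value is forced to an int32,
// so each + has exactly one meaning.
function cTyped(x, y) {
  x = x | 0;
  y = y | 0;
  return (((x + 1) | 0) + ((y - 1) | 0)) | 0;
}

console.log(cTyped(1, 2));   // 3
console.log(cTyped("1", 2)); // 3, which is no longer what c("1", 2) returns
```

The typed version is easy to optimize precisely because it no longer behaves like the original for string inputs. A faithful translation of c would have to keep the string cases, and with them the run-time checks.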
A few facts about asm.js, which hopefully make the concept clear:
Yes you can write the asm.js dialect by hand.
If you did look at the examples for asm.js, they are very far from being user friendly. Obviously Javascript is not the front end language for creating this code.
Translating vanilla Javascript to asm.js dialect is not possible.
Think about it - if you could already translate standard Javascript in a fully static manner, why would there be a need for asm.js? The sole existence of asm.js means that the Javascript JIT people at some point gave up on their promise that Javascript will get faster without any effort from the developer.
There are several reasons for this, but let's just say it would be really hard for the JIT to understand a dynamic language as well as a static compiler does. And then probably for the developers to fully understand the JIT.
In the end it boils down to using the right tool for the task. If you want static, very performant code, use C / C++ ( / Java ) - if you want a dynamic language, use Javascript, Python, ...
asm.js was created out of the need for a small subset of JavaScript that can be easily optimized. If there were a way to convert JavaScript to asm.js, asm.js would no longer be needed: that method could be built into JS interpreters directly.
In theory, it is possible to convert / compile / transpile any JavaScript operation to asm.js if it can be expressed with the limited subset of the language present in asm.js. In practice, however, there is no tool capable of converting ordinary JavaScript to asm.js at the moment (June, 2017).
Either way, it would make more sense to convert a language with static typing to asm.js, because static typing is a requirement of asm.js and the lack thereof one of the features of ordinary JavaScript that makes it exceptionally hard to compile to asm.js.
Back in 2013, when asm.js was hot, there was an attempt to compile a statically typed language similar to JavaScript, but both the language and the compiler seem to have been abandoned.
Today, in 2017, the typed JavaScript supersets TypeScript and Flow would be suitable candidates for conversion to asm.js, but the core dev team of neither language is interested in such a conversion. LLJS had a fork that could compile to asm.js, but that project is pretty much dead. ThinScript is a much more recent attempt and is based on TypeScript, but it doesn't appear to be active either.
So, the best and easiest way to produce asm.js code is still to write your code in C/C++ and convert / compile / transpile it. However, it remains to be seen whether we'll even want to do this in the foreseeable future. WebAssembly may soon replace asm.js altogether, and TypeScript-like languages such as TurboScript and AssemblyScript that compile to WebAssembly are already popping up. In fact, TurboScript was originally based on ThinScript and used to compile to asm.js, but they appear to have abandoned this feature.
It may be possible to convert regular JavaScript to asm.js by first compiling it to C or C++, and then compiling the generated code to asm.js using Emscripten. I'm not sure if this would be practical, but it's an interesting concept nonetheless.
There is also a compiler called NectarJS that compiles JavaScript to WebAssembly and ASM.js.
Check this: http://badassjs.com/post/43420901994/asm-js-a-low-level-highly-optimizable-subset-of
Basically, you need to check that your code is asm.js compatible (no coercion or type casting, you need to manage the memory yourself, etc.). The idea is to write your code in JavaScript, detect the bottleneck, and change that part of the code to use asm.js and AOT compilation instead of JIT and dynamic compilation. It is a bit of a PITA, but you can still use JavaScript, or other languages like C++, or better, in the near future, LLJS.
Lua is small and can be easily embedded. The current JavaScript VMs are quite big and hard to integrate into existing applications.
So wouldn't it be possible to compile JavaScript to Lua or Lua bytecode?
Especially for the constraints in mobile applications this seems like a good fit. Being able to easily integrate one of the most popular scripting languages into any iPhone or Android app would be great.
I'm not very familiar with Lua so I don't know if this is technically feasible.
With Luvit there is an active project trying to port the Node.js architecture to Lua. So the evented JavaScript world can't be too far away from what's possible in Lua.
The wins of compiling Javascript to Lua are not as great as you might first imagine. Javascript's semantics are very different to Lua's (the LuaJIT author cites Lua's design as one of the main reasons LuaJIT can compete so favourably with Javascript JIT compilers).
Take this code:
if("1" == 1)
{
print("Yes");
}
This prints "Yes" in Javascript. The equivalent code in Lua does not, as strings are never equal to numbers in Lua. This may seem like a small point, but it has a fundamental consequence: we can no longer use Lua's built-in equality testing.
There are two solutions we could take. We could rewrite 1 == "1" to javascript_equals(1, "1"). Or we could wrap every Javascript value in Lua, and use Lua's metatables to override the == operator behaviour.
So we have already lost some efficiency from Lua by mapping Javascript to it. This is a simple example, but it continues like this all the way down. For example, all the operator rules are different between Javascript and Lua.
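To give a feel for what a helper like javascript_equals would have to do, here is a rough sketch of Javascript's loose equality for primitive values, written using only strict comparisons (which is roughly what Lua's == gives you). The name looseEquals is hypothetical, it is written in Javascript rather than Lua purely for illustration, and objects are deliberately left out.

```javascript
// Sketch of "==" for primitives, built only from "===" and explicit coercions.
// Objects (which need ToPrimitive), symbols and bigints are out of scope.
function looseEquals(a, b) {
  if (typeof a === typeof b) return a === b;

  // null and undefined are equal to each other and to nothing else.
  const aNullish = a === null || a === undefined;
  const bNullish = b === null || b === undefined;
  if (aNullish || bNullish) return aNullish && bNullish;

  // Booleans are converted to numbers first.
  if (typeof a === "boolean") return looseEquals(Number(a), b);
  if (typeof b === "boolean") return looseEquals(a, Number(b));

  // A string compared with a number is converted to a number.
  if (typeof a === "number" && typeof b === "string") return a === Number(b);
  if (typeof a === "string" && typeof b === "number") return Number(a) === b;

  return false;
}

console.log(looseEquals("1", 1));          // true
console.log(looseEquals(null, undefined)); // true
console.log(looseEquals(null, 0));         // false
```

Every == in the translated program would have to go through a function like this, which is where the efficiency goes.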
We would even have to wrap Javascript objects, because they aren't the same as Lua tables. For example Javascript objects only support string keys, and coerce any index to a string:
> a = {}
{}
> a[1] = "Hello"
'Hello'
> a["1"]
'Hello'
You also have to watch out for Javascript's scoping rules, vararg functions, and so on.
Now, all of these things are surmountable, if someone were to put the effort into a full compiler. However any efficiency gains would soon be drowned out. You would essentially end up building a Javascript interpreter in Lua. Most Javascript interpreters are written in C and already optimised for Javascript's semantics.
So, doing it for efficiency is a lost cause. There may be other reasons - such as supporting Javascript in a Lua-only environment, though even then if possible just writing Lua bindings to an existing Javascript interpreter would probably be less work.
If you want to have a play with a Javascript->Lua source-to-source translator, take a look at js2lua, which is a toy project I created some time back. It's nowhere near complete, but playing with it would certainly give some food for thought. It already includes a Javascript lexer, so that hard work is done already.
Is there any C interpreter written in javascript or java ?
I don't need a full interpreter but I need to be able to do a step by step execution of the program and being able to see the values of variables, the stack...all that in a web interface.
The idea is to help C beginners by showing them the step by step execution of the program.
We are using GWT to build the interface so if something exists in Java we should be able to use it.
I can modify it to suit my needs, but if I can avoid writing the parser / abstract-syntax tree walker / stack manipulation... that would be great.
Edit :
To be clear I don't want to simulate the complete C because some programs can be really tricky.
By step I mean a basic operation such as: expression evaluation, assignment, function call.
The C I want to simulate will contain: variables, for, while, functions, arrays, pointers, maths functions.
No goto, string functions, ctypes.h, setjmp.h... (at least for now).
Here is a prototype : http://www.di.ens.fr/~fevrier/war/simu.html
In this example we have manually converted the C code to a javascript representation, but it's limited (expressions such as a == 2 || a = 1 are not handled) and only works for programs that were converted by hand.
We have at our disposal a C compiler on a remote server so we can check if the code is correct (and doesn't have any undefined behavior). The parsing / AST construction can also be done remotely (so in any language) but the AST walking needs to be in javascript in order to run on the client side.
There's a C grammar available for antlr that you can use to generate a C parser in Java, and possibly JavaScript too.
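Once a parser has produced an AST, the step-by-step walker on the client side can be fairly small. The sketch below uses a JavaScript generator so that the UI can advance one statement at a time. The AST node shapes (num, var, bin, assign, while, seq) are hypothetical and only cover a tiny fragment of C-like code; they are not the output format of any particular parser.

```javascript
// Evaluate an expression node against the current variable environment.
function evalExpr(node, env) {
  switch (node.type) {
    case "num": return node.value;
    case "var": return env[node.name];
    case "bin": {
      const l = evalExpr(node.left, env);
      const r = evalExpr(node.right, env);
      switch (node.op) {
        case "+": return l + r;
        case "-": return l - r;
        case "*": return l * r;
        case "<": return l < r ? 1 : 0; // C-style truth value
        default: throw new Error("unsupported operator " + node.op);
      }
    }
    default: throw new Error("unsupported expression " + node.type);
  }
}

// Walk statements, yielding a snapshot after every executed assignment.
function* run(node, env) {
  switch (node.type) {
    case "assign":
      env[node.name] = evalExpr(node.value, env);
      yield { stmt: node, env: { ...env } };
      break;
    case "while":
      while (evalExpr(node.cond, env) !== 0) yield* run(node.body, env);
      break;
    case "seq":
      for (const stmt of node.body) yield* run(stmt, env);
      break;
    default: throw new Error("unsupported statement " + node.type);
  }
}

// Hand-built AST for: i = 0; while (i < 3) i = i + 1;
const i = { type: "var", name: "i" };
const num = (value) => ({ type: "num", value });
const program = { type: "seq", body: [
  { type: "assign", name: "i", value: num(0) },
  { type: "while",
    cond: { type: "bin", op: "<", left: i, right: num(3) },
    body: { type: "assign", name: "i",
            value: { type: "bin", op: "+", left: i, right: num(1) } } }
] };

const steps = [...run(program, {})];
console.log(steps.map((s) => s.env.i).join(" ")); // 0 1 2 3
```

In a web interface you would keep the generator object around and call next() on each click instead of collecting all steps at once.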
There is Emscripten, which converts LLVM languages to JS. With a little hacking on it, you may be able to produce a C interpreter.
felixh's JSCPP project provides a C++ interpreter in Javascript, though with some limitations.
https://github.com/felixhao28/JSCPP
So an example program might look like this:
var JSCPP = require('JSCPP');
var launcher = JSCPP.launcher;
var code = 'int main(){int a;cin>>a;cout<<a;return 0;}';
var input = '4321';
var exitcode = launcher.run(code, input);
console.info('program exited with code ' + exitcode);
As of March 2015 this is under active development, so while it's usable there are still areas where it may continue to expand. Check the documentation for limitations. It looks like you can use it as a straight C interpreter with limited library support for now with no further issues.
I don't know of any C interpreters written in JavaScript, but here is a discussion of available C interpreters:
Is there an interpreter for C?
You might do better to look for any sort of virtual machine that runs on top of JavaScript, and then see if you can find a C compiler that emits the proper machine code for the VM. A likely one would seem to be LLVM; if you can find a JavaScript VM that can run LLVM, then you will be in great shape.
I did a few Google searches and found Emscripten, which translates C code into JavaScript directly! Perhaps you can do something with this:
https://github.com/kripken/emscripten/wiki
Perhaps you can modify Emscripten to emit a "sequence point" after each compiled line of C, and then you can make your simulated environment single-step from sequence point to sequence point.
I believe Emscripten is implementing LLVM, so it may actually have virtual registers; if so it might be ideal for your purposes.
I know you specified C code, but you might want to consider a JavaScript emulation of a simpler language. In particular, please consider FORTH.
FORTH runs on an extremely simple virtual machine. In FORTH there are two stacks, a data stack and a flow-of-control stack (called the "return" stack); plus some global memory. Originally FORTH was a 16-bit language but there are plenty of 32-bit FORTH implementations out there now.
Because FORTH code is sort of "close to the machine" it is easy to understand how it all works when you see it working. I learned FORTH before I learned C, and I found it to be a valuable learning experience.
There are several FORTH interpreters available in JavaScript already. The FORTH virtual machine is so simple, it doesn't take very long to implement it!
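To show how little is needed, here is a minimal sketch of a FORTH-like evaluator in JavaScript. It only implements the data stack and a handful of words (+, -, *, dup, swap, drop); there is no return stack, no word definitions and no memory, and the function name runForth is made up for this example. It records the stack after every token, which is the kind of trace a teaching tool could display.

```javascript
function runForth(source) {
  const stack = [];
  const trace = [];
  const pop = () => {
    if (stack.length === 0) throw new Error("stack underflow");
    return stack.pop();
  };
  // Built-in words; each one manipulates the data stack directly.
  const words = new Map([
    ["+", () => { const b = pop(), a = pop(); stack.push(a + b); }],
    ["-", () => { const b = pop(), a = pop(); stack.push(a - b); }],
    ["*", () => { const b = pop(), a = pop(); stack.push(a * b); }],
    ["dup", () => { const a = pop(); stack.push(a, a); }],
    ["swap", () => { const b = pop(), a = pop(); stack.push(b, a); }],
    ["drop", () => { pop(); }]
  ]);
  for (const token of source.trim().split(/\s+/).filter(Boolean)) {
    const word = words.get(token.toLowerCase());
    if (word) word();
    else if (/^-?\d+$/.test(token)) stack.push(parseInt(token, 10));
    else throw new Error("unknown word: " + token);
    // Snapshot of the stack after this token, for step-by-step display.
    trace.push({ token, stack: [...stack] });
  }
  return { stack, trace };
}

const result = runForth("2 3 + dup *");
console.log(result.stack.join(" ")); // 25
console.log(result.trace.map((t) => "[" + t.stack.join(" ") + "]").join(" "));
// [2] [2 3] [5] [5 5] [25]
```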
You could even then get a C-to-FORTH translator and let the students watch the FORTH virtual machine interpret compiled C code.
I consider this answer a long shot for you, so I'll stop writing here. If you are in fact interested in the idea, comment below and ask for more details and I will be happy to share them. It's been a long time since I wrote any FORTH code but I still remember it fondly, and I'd be happy to talk more about FORTH.
EDIT: Despite this answer being downvoted to a negative score, I am going to leave it up here. A simulation for educational purposes is IMHO more valuable if the simulation is simple and easy to understand. The simple stack-based virtual machine for FORTH is very simple, yet you could compile C code to run on it. (In the 80's there was even a CPU chip made that had FORTH instructions as its native machine code.) And, as I said, I personally studied FORTH when I was a complete beginner and it helped me to understand assembly language and C.
The question has no accepted answer, now over two years after it was asked. It could be that Loïc Février didn't find any suitable JavaScript interpreter. As I said, there already exist several JavaScript interpreters for the FORTH virtual machine. Therefore, this answer is a practical one.
C is a compiled language, not an interpreted language, and has features like pointers which JS doesn't support at all, so interpreting C in Javascript doesn't really make sense.
I'm building a solution for a client which allows them to create very basic code.
I've done some basic syntax validation, but I'm stuck at variable verification.
I know JSLint does this using Javascript, and I was wondering if anyone knew of a good way to do this.
So for example say the user wrote the code
moose = "barry"
base = 0
if(moose == "barry"){base += 100}
I'm then trying to find a way to verify that the "if" expression has correct syntax, that the variable moose has been initialized, and so on,
but I want to do this without scanning character by character.
The code is a mini language built just for this application, so it is very basic and doesn't need to manage memory or anything like that.
I had thought about splitting first by carriage return and then by space, but there is nothing to say the user won't write something like moose="barry" or if(moose=="barry"),
and there is nothing to say the user won't keep the result of a condition inline.
Obviously compilers and interpreters do this on a much more extensive scale, but I'm not sure whether they do it character by character and, if they do, how they have optimized it.
(The other option is to send it back to PHP to process, which would relieve the browser of the responsibility.)
Any suggestions?
Thanks
The use case is limited and the syntax will never be extended. The language is a simple scripting language that lets the client compute a unique cost based on their users' input. The end result will be processed by PHP regardless, to ensure the calculation can't be adjusted by the end user and to ensure some consistency.
So for example, say there is a base cost of £1.00
and there is a field on the form called "Additional Cost"; the language will allow them to manipulate the base cost relative to the "additional cost" field.
So
base = 1;
if(additional > 100 && additional < 150){base += 50}
elseif(additional == 150){base *= 150}
else{base += additional;}
This is a basic example of how the language would be used.
Thank you for all your answers,
I've investigated writing a parser, and creating one would be far more complex than is required.
I ran several tests with thousands of lines of code and found that processing character by character only takes a few seconds, even on a single-core P4 with 512 MB of memory (which is far less than the customer uses).
I've decided to build a PHP-based syntax checker which will check the input and convert the variables etc. into valid PHP code while it's checking (so that it's ready to be called later without recompilation). Using this instead of javascript seems more appropriate and will allow for more complex code to arise without hindering the validation process.
It's only taken an hour and I have code which is able to check the validity of an if statement and isn't confused by nested ifs, spaces or odd expressions. There is very little left to be checked, whereas a parser and full-blown scripting language would have taken a lot longer.
You've all given me a lot to think about and I've rated the relevant answers. Thank you.
If you really want to do this — and by that I mean if you really want your software to work properly and predictably, without a bunch of weird "don't do this" special cases — you're going to have to write a real parser for your language. Once you have that, you can transform any program in your language into a data structure. With that data structure you'll be able to conduct all sorts of analyses of the code, including procedures that at least used to be called use-definition and definition-use chain analysis.
If you concoct a "programming language" that enables some scripting in an application, then no matter how trivial you think it is, somebody will eventually write a shockingly large program with it.
I don't know of any readily-available parser generators that generate JavaScript parsers. Recursive descent parsers are not too hard to write, but they can get ugly to maintain and they make it a little difficult to extend the syntax (esp. if you're not very experienced crafting the original version).
You might want to look at JS/CC, which is a parser generator that generates a parser for a grammar, in Javascript. You will need to figure out how to describe your language using BNF or EBNF. Also, JS/CC has its own syntax (which is somewhat close to actual BNF/EBNF) for specifying the grammar. Given the grammar, JS/CC will generate a parser for it.
Your other option, as Pointy said, is to write your own lexer and recursive-descent parser from scratch. Once you have a BNF/EBNF, it's not that hard. I recently wrote a parser from an EBNF in Javascript (the grammar was pretty simple, so it wasn't that hard to write one; YMMV).
To address your comments about it being "client specific". I will also add my own experience here. If you're providing a scripting language and a scripting environment, there is no better route than an actual parser.
Handling special cases through a bunch of if-elses is going to be horribly painful and a maintenance nightmare. When I was a freshman in college, I tried to write my own language. This was before I knew anything about recursive-descent parsers, or just parsers in general. I figured out by myself that code can be broken down into tokens. From there, I wrote an extremely unwieldy parser using a bunch of if-elses, and also splitting the tokens by spaces and other characters (exactly what you described). The end result was terrible.
Once I read about recursive-descent parsers, I wrote a grammar for my language and easily created a parser in a 10th of the time it took me to write my original parser. Seriously, if you want to save yourself a lot of pain, write an actual parser. If you go down your current route, you're going to be fixing issues forever. You're going to have to handle cases where people put the space in the wrong place, or perhaps they have one too many (or one too few) spaces. The only other alternative is to provide an extremely rigid structure (i.e., you must have exactly x number of spaces following this statement) which is liable to make your scripting environment extremely unattractive. An actual parser will automatically fix all these problems.
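To give an idea of the size of such a parser, here is a minimal sketch of a tokenizer plus recursive-descent checker for a language like the one in the question. The grammar is guessed from the two example programs (assignments with =, += and *=, comparisons with ==, < and >, && and ||, if / elseif / else blocks), and the names tokenize and check are made up. It reports variables that are used before being assigned and throws a SyntaxError on malformed input. As a simplification, a variable assigned inside any block counts as defined afterwards.

```javascript
function tokenize(src) {
  // One sticky regex: number | "string" | identifier | operator or bracket.
  const re = /\s*(?:(\d+(?:\.\d+)?)|("[^"]*")|([A-Za-z_]\w*)|(==|\+=|\*=|&&|\|\||[=<>(){};]))/y;
  const text = src.trimEnd();
  const tokens = [];
  while (re.lastIndex < text.length) {
    const at = re.lastIndex;
    const m = re.exec(text);
    if (!m) throw new SyntaxError("unexpected character near offset " + at);
    if (m[1]) tokens.push({ type: "number", text: m[1] });
    else if (m[2]) tokens.push({ type: "string", text: m[2] });
    else if (m[3]) tokens.push({ type: "ident", text: m[3] });
    else tokens.push({ type: "punct", text: m[4] });
  }
  return tokens;
}

// Returns a list of "used before assignment" messages; throws on bad syntax.
function check(src, predefined = []) {
  const tokens = tokenize(src);
  const defined = new Set(predefined);
  const errors = [];
  let i = 0;
  const peek = () => (i < tokens.length ? tokens[i].text : null);
  const expect = (text) => {
    if (peek() !== text) throw new SyntaxError('expected "' + text + '" but found "' + peek() + '"');
    i++;
  };
  const use = (name) => {
    if (!defined.has(name)) errors.push('variable "' + name + '" used before assignment');
  };
  function term() {
    const t = tokens[i++];
    if (!t) throw new SyntaxError("unexpected end of input");
    if (t.type === "punct") throw new SyntaxError('unexpected "' + t.text + '"');
    if (t.type === "ident") use(t.text);
  }
  function comparison() {
    term();
    if (["==", "<", ">"].includes(peek())) { i++; term(); }
  }
  function expr() {
    comparison();
    while (peek() === "&&" || peek() === "||") { i++; comparison(); }
  }
  function condition() { expect("("); expr(); expect(")"); }
  function block() {
    expect("{");
    while (peek() !== "}") statement();
    expect("}");
  }
  function statement() {
    const t = tokens[i];
    if (!t) throw new SyntaxError("unexpected end of input");
    if (t.text === "if") {
      i++; condition(); block();
      while (peek() === "elseif") { i++; condition(); block(); }
      if (peek() === "else") { i++; block(); }
      return;
    }
    if (t.type !== "ident") throw new SyntaxError('unexpected "' + t.text + '"');
    i++;
    const op = peek();
    if (op !== "=" && op !== "+=" && op !== "*=") throw new SyntaxError('expected an assignment after "' + t.text + '"');
    i++;
    if (op !== "=") use(t.text); // += and *= read the variable first
    expr();
    defined.add(t.text);
    if (peek() === ";") i++; // semicolons are optional in the examples
  }
  while (i < tokens.length) statement();
  return errors;
}

console.log(check('moose = "barry"\nbase = 0\nif(moose == "barry"){base += 100}').length); // 0
console.log(check('if(moose=="barry"){base+=100}')); // two "used before assignment" messages
```

Because the tokenizer skips whitespace itself, moose="barry" and moose = "barry" produce the same tokens, which removes the spacing problems described above.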
Javascript has a function 'eval'.
var code = 'alert(1);';
eval(code);
This will show an alert. You can use 'eval' to execute basic code.