Extract function names in python code using javascript regex - javascript

I am using prism.js to highlight Python code, but for some reason it does not highlight the names of functions that are called, for example deep_flatten(), flatten([1,2,3]), or flatten(list(map(lambda x: x*2,[1,2,3]))).
So I wrote the following code to work around this problem:
[...document.getElementsByClassName('code')].forEach(x => {
  x.innerText = x.innerText.replace(/(\w+)\([^\(\)]*\)/gi, match => {
    // Skip matches that start with a digit; they are not function names
    if (match.match(/^\d/)) {
      return match
    }
    else {
      return `<span class="called-function">${match}</span>`
    }
  })
})
It works fine for the simple calls but fails for the nested ones.
After searching on Google, I found that this kind of matching is recursive and can only be done with a parser. Searching for Python parsers in JavaScript, I found many of them, but they are very large and meant for parsing whole programs.
How can I write a parser or regex that extracts all function names and encloses them within span tags?
I don't need the exact code, just some pseudocode or an algorithm showing how to proceed.

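Python's standard re module can't handle recursive regexes; however, it seems the third-party regex package can. With it, the pattern
/(\w+)\([^\(\)]*\)/gi
can be changed to
/(\w+)(\((?:[^\(\)]|(?2))*\))/gi
Note that the (?2) recursion syntax is not supported by JavaScript's built-in RegExp.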
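If only the function names need to be wrapped (not the whole call with its arguments), balanced-parenthesis matching can be avoided altogether. The following is a minimal sketch, assuming a called function is any identifier directly followed by an opening parenthesis. The helper name wrapCalledFunctions and the keyword list are hypothetical, and the sketch does not distinguish definitions (def foo(...)) from calls or skip strings and comments.

```javascript
// Python keywords that may be followed by "(" without being calls.
// This list is an illustrative assumption and is not exhaustive.
const NOT_CALLS = new Set([
  "if", "elif", "else", "while", "for", "in", "not", "and", "or",
  "return", "lambda", "yield", "assert", "del", "with", "is",
]);

// Hypothetical helper: wraps each identifier that is followed by "(" in a span.
// The lookahead (?=\s*\() checks for the parenthesis without consuming it,
// so nested calls are found independently of each other.
function wrapCalledFunctions(code) {
  return code.replace(/\b([A-Za-z_]\w*)(?=\s*\()/g, (name) =>
    NOT_CALLS.has(name)
      ? name
      : `<span class="called-function">${name}</span>`
  );
}

console.log(wrapCalledFunctions("flatten(list(map(lambda x: x*2,[1,2,3])))"));
```

Because each name is matched on its own, flatten, list, and map are all wrapped in the nested example, which the original pattern could not do.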

Related

Is there an alternative to .join() that works in Google App Script?

I am using Google Apps Script to use input from a Google Form (shown on the response spreadsheet) on a template Google Doc to generate a report.
I am using the code below to loop through all the answers and summarize certain answers in a list. (For example, if the user answered Daily, the text of that question will show in the list.)
However, by default, the results generated are all in one line separated by a comma. I'd like to use the .join() function to change the delimiter from a comma to a line break.
Google keeps returning this error:
TypeError: Cannot find function join in object Wipe down kitchen counters.
at onFormSubmit(Code:41)
(Object "Wipe down kitchen counters" refers to the text of one of the questions, shown in the heading of the spreadsheet.)
I've read that join is a core JavaScript function, not specific to Google Apps Script. So, is there something wrong with my code? (I'm a beginner.) Or does this function not work in Apps Script, and if so, is there another function that would work?
Huge thanks for any help!
I've tried using join with other simple symbols (% or *) instead of the line break, in case that was the issue, but it returns the same error.
for (var key in e.namedValues) {
  if (e.namedValues[key][0] === 'Daily') {
    dailyItems.push(key.replace('[','').replace(']','').trim().join("\n"));
  }
}
To answer your title question: Array.prototype.join() works fine in Apps Script, and you could convince yourself by looking at the editor Logs after running
function verifyJoin() {
Logger.log(["a","b","c"].join("\n"))
}
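The error occurs because join() is being called on a string (the result of trim()), not on an array. Try modifying your code like this: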
for (var key in e.namedValues) {
if (e.namedValues[key][0] === 'Daily') {
dailyItems.push(e.namedValues[key].join("\n"));
}
}
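If the goal is a newline-separated list of the question texts themselves, another option is to push the cleaned-up keys into the array and call join() once on the array after the loop. This is a minimal sketch under that assumption; the function name listDailyItems is hypothetical, and namedValues stands for e.namedValues from the form-submit event.

```javascript
// Hypothetical helper: collects the question texts answered 'Daily'
// and joins them with line breaks. join() is called on the array,
// not on the individual strings.
function listDailyItems(namedValues) {
  var dailyItems = [];
  for (var key in namedValues) {
    if (namedValues[key][0] === 'Daily') {
      dailyItems.push(key.replace('[', '').replace(']', '').trim());
    }
  }
  return dailyItems.join("\n");
}
```

Inside onFormSubmit this would be called as listDailyItems(e.namedValues), and the returned string could then be placed into the template document.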

Using Emscripten to compile C++ code to JavaScript to sum two numbers (practice)

I want to make a short example to understand how Emscripten works.
I want to make an HTML page where I can enter two numbers in two different text boxes. I also added a button and a third text box where the result should be printed after I enter the two numbers and press the button.
I see a few issues with your project. First, I think you should tag the C++ functions with EMSCRIPTEN_KEEPALIVE like this:
int EMSCRIPTEN_KEEPALIVE int_sum_of_two_numbers(int number1, int number2)
{
int sum;
sum = number1 + number2;
return sum;
}
From the Emscripten documentation:
If your function is used in other functions, LLVM may inline it and it will not appear as a unique function in the JavaScript. Prevent inlining by defining the function with EMSCRIPTEN_KEEPALIVE
That allows Javascript code to find your C++ functions.
Besides that, the command you use to compile your project seems to have an extra underscore _ character when you mention the exported functions _int_sum_of_two_numbers => int_sum_of_two_numbers. So you should have used:
EXPORTED_FUNCTIONS='["int_sum_of_two_numbers"]'
As a final note, you could leave the main() function empty. The code inside that function is irrelevant to your web application.
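For the button handler on the JavaScript side, a minimal sketch might look like the following. It assumes the Emscripten-generated Module object exposes ccall (depending on the Emscripten version and build flags, ccall may need to be exported explicitly) and that int_sum_of_two_numbers is exported under its plain, unmangled name. The helper name sumFromTextBoxes and the wasmModule parameter are hypothetical.

```javascript
// Hypothetical helper: reads two text-box values, calls the compiled
// C++ function through ccall, and returns the text for the result box.
// `wasmModule` stands for the Emscripten-generated Module object (assumption).
function sumFromTextBoxes(wasmModule, firstText, secondText) {
  const a = parseInt(firstText, 10);
  const b = parseInt(secondText, 10);
  if (Number.isNaN(a) || Number.isNaN(b)) {
    return ""; // leave the result box empty on invalid input
  }
  const sum = wasmModule.ccall(
    "int_sum_of_two_numbers", // C function name, without a leading underscore
    "number",                 // return type
    ["number", "number"],     // argument types
    [a, b]                    // argument values
  );
  return String(sum);
}
```

In the page, the button's click handler would pass Module and the values of the two input elements, then write the returned string into the third text box.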
I wrote an article a while ago about integrating WebAssembly with Angular and it is very similar to what you are trying to achieve. I think this would be worth reading.

Decoding/reading json part of complex text file

I am starting to develop a desktop application using Electron. This app will parse some files and display data from them. These files contain complex data.
Now, I am trying to get JSON data from a complex text file. This text file contains some strings and JSON objects. A sample file looks like this:
...strings that I'm not interested in...
{
"partOneA":0,
"partOneB":7,
....
}
...randomly strings may stand between json sections...
{
"partTwoA":7,
"partTwoB":4,
"partTwoC":4,
...
}
{
"differentPartA":3,
"differentPartB":5,
"differentPartC":6,
...
}
...somemoretext....
The problem is: how can I get the JSON parts from this complex file using JavaScript? The performance of the solution should be considered.
Additionally, consider that the JSON structure can be nested like this:
{
"partOneA":0,
"partOneB" :{
"partOneBnode1":0,
"partOneBnode2":7,
}
}
Regular expressions are not applicable to this problem.
I am trying to find a JavaScript-based solution.
As long as you can rely on { and } as starting and closing tags you could use a regex like:
var jsonRegex = new RegExp(/({(?:(.|\n)*?(?:[^\\])){0,1}?})/g);
var result = jsonRegex.exec(text);
var firstMatch= result[1];
Each call to exec returns the next match, with the captured piece at index 1 of the result; call it repeatedly to get the subsequent matches. You can read the docs on MDN.
You can play around with regex on sites like http://regexr.com/
Note
This approach does not work with nested JSON because you would need to match the same number of opening and closing braces (see this answer).
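For nested objects, one alternative to a regex is a single pass over the text that counts brace depth. The following is a minimal sketch, assuming the surrounding non-JSON text contains no stray braces; the function name extractJsonBlocks is hypothetical. Blocks that JSON.parse rejects (for example because of a trailing comma) are skipped.

```javascript
// Hypothetical helper: scans the text once, tracks brace depth, and
// parses every balanced top-level {...} block it finds.
function extractJsonBlocks(text) {
  const blocks = [];
  let depth = 0;        // current nesting level of braces
  let start = -1;       // index of the opening brace of the current block
  let inString = false; // true while inside a JSON string literal
  let escaped = false;  // true right after a backslash inside a string

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (depth > 0 && inString) {
      // Braces inside string values must not affect the depth
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (depth > 0 && ch === '"') {
      inString = true;
    } else if (ch === "{") {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) {
        try {
          blocks.push(JSON.parse(text.slice(start, i + 1)));
        } catch (err) {
          // Not valid JSON; skip this block
        }
      }
    }
  }
  return blocks;
}
```

The scan is linear in the length of the input, so it should scale reasonably to large files, although no measurements are claimed here.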

Intellij Javascript multiline structural search and replace

In our project, a lot of Angular unit tests contain the following syntax:
inject(['dependency1', 'dependency2', function(_dependency1_, _dependency2_) {
dependency1 = _dependency1_;
dependency2 = _dependency2_;
}]);
In tests the array which lists the dependencies with string values is obsolete, since this is only useful when using minification. So we issued a coding convention to change this syntax to:
inject(function(_dependency1_, _dependency2_) {
dependency1 = _dependency1_;
dependency2 = _dependency2_;
});
Now I've been replacing a couple of these in existing code when I came across them, but I've gotten really tired of doing this manually. So I'm trying to solve this in IntelliJ by using structural search and replace. This is my search template so far:
inject([$injection$, function($argument$) {
$statement$;
}]);
with occurrences:
$injection$: 1 to infinite
$argument$: 1 to infinite
$statement$: 1 to infinite
The replace template is defined as follows:
inject(function($argument$) {
$statement$;
});
This does not work for the example I gave at the beginning, however; it only matches and replaces correctly for a single-line statement in the function body, so the following example is replaced correctly:
inject(['dependency1', 'dependency2', function(_dependency1_, _dependency2_) {
dependency1 = _dependency1_;
}]);
Am I missing something? When I check out the simple if-else example on the Jetbrains website I get the feeling that this should work.
I have tried removing the semicolon after the $statement$ variable; this didn't match multiple lines and resulted in the semicolons being removed after replacement. I've also tried applying regular expressions to the $statement$ variable, but these didn't help either.
((.*)=(.*);\n)+
didn't match, probably because the semicolon is filtered out by the IntelliJ structural search before the actual regex matching is performed.
(.*)=(.*)
matched, but it replaced with the same behaviour as without the regex.
Matching multiple statements with a variable in JavaScript is currently broken because of a bug.

I need a Javascript literal syntax converter/deobfuscation tools

I have searched Google for a converter but did not find anything. Are there any tools available, or must I write one to decode my obfuscated JavaScript code?
I presume there is such a tool but I'm not searching Google with the right keywords.
The code is 3 pages long, which is why I need a tool.
Here is an example of the code:
<script>([][(![]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[!+[]+!+[]]]()[(!![]+[])[!+[]+!+[]+!+[]]+(+(+[])+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[!+[]+!+[]+!+[]+[+[]]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]])(([]+[])[([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+
Thank you
This code is fascinating because it seems to use only nine characters ("[]()!+,;" and empty space U+0020) yet has some sophisticated functionality. It appears to use JavaScript's implicit type conversion to coerce arrays into various primitive types and their string representations and then use the characters from those strings to compose other strings which type out the names of functions which are then called.
Consider the following snippet which evaluates to the array filter function:
([][
(![]+[])[+[]] // => "f"
+ ([![]]+[][[]])[+!+[]+[+[]]] // => "i"
+ (![]+[])[!+[]+!+[]] // => "l"
+ (!![]+[])[+[]] // => "t"
+ (!![]+[])[!+[]+!+[]+!+[]] // => "e"
+ (!![]+[])[+!+[]] // => "r"
]) // => function filter() { /* native code */ }
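The individual pieces can be checked by evaluating them one at a time in any JavaScript console. The sketch below simply splits the snippet above into named steps; the variable names are introduced here for illustration only.

```javascript
// Each expression coerces a value to a string and indexes into it.
const f = (![] + [])[+[]];                  // "false"[0]
const i = ([![]] + [][[]])[+!+[] + [+[]]];  // "falseundefined"["10"]
const l = (![] + [])[!+[] + !+[]];          // "false"[2]
const t = (!![] + [])[+[]];                 // "true"[0]
const e = (!![] + [])[!+[] + !+[] + !+[]];  // "true"[3]
const r = (!![] + [])[+!+[]];               // "true"[1]

const name = f + i + l + t + e + r;
console.log(name);                                  // prints "filter"
console.log([][name] === Array.prototype.filter);   // prints true
```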
Reconstructing the code this way is time-consuming and error-prone, so an automated solution is obviously desirable. However, the behavior of this code is so tightly bound to the JavaScript runtime that deobfuscating it seems to require a JS interpreter to evaluate the code.
I haven't been able to find any tools that will work generally with this sort of encoding. It seems as though you'll have to study the code further and determine any patterns of usage (e.g. reliance on array methods) and figure out how to capture their usage (e.g. by wrapping high-level functions [such as Function.prototype.call]) to trace the code execution for you.
This question already has an accepted answer, but I will still post to clear some things up.
When this idea came up, someone made a generator to encode JavaScript in this way. It is based on doing []["sort"]["call"]()["eval"](/* big blob of code here */). Therefore, you can decode the results of this encoder easily by removing the sort-call-eval part (i.e. the first 1628 bytes). In this case it produces:
if (document.cookie=="6ffe613e2919f074e477a0a80f95d6a1"){ alert("bravo"); }
else{ document.location="http://www.youtube.com/watch?v=oHg5SJYRHA0"; }
(Funnily enough, the creator of this code was not even able to compress it properly and save a kilobyte.)
There is also an explanation of why this code no longer works in newer browsers: they changed Array.prototype.sort so that it does not return a reference to window. As far as I remember, this was the only way to get a reference to window, so this code is kind of broken now.
