Converting exponential LaTeX syntax to PHP's pow() using JavaScript - javascript

I would like to use JS to convert a nested exponential LaTeX expression such as
2^{3^{4^5}}
to PHP's pow() syntax
pow(2,pow(3,pow(4,5)))
I know JS doesn't support recursive RegExp. The expression is part of an equation, so I expect the solution to work with something like
\frac{3}{9}+\frac{2^{\sqrt{4^2}}}{6}
which should output
\frac{3}{9}+\frac{pow(2,\sqrt{pow(4,2)})}{6}
like in @AbcAeffchen's solution.
I don't need conversion for the non-exponential parts.
Notice: The solution must not require resorting to PHP 5.6, which introduced the ** operator

An ugly hack,
> foo
'2,(3^(4^5))'
> var foo = "2^(3^(4^5))".replace(/\^/g, ",");
undefined
> foo
'2,(3,(4,5))'
> var bar = foo.replace(/(\d,)(?=[^\d])/g, "$1pow");
undefined
> var foobar = bar.replace(/^(.*)$/, "pow($1)")
undefined
> foobar
'pow(2,pow(3,pow(4,5)))'

Try this function I wrote. It uses only ordinary string functions, so it is a little longer.
It works as follows:
find the first occurrence of ^
find the part before ^ that belongs to the base
find the part after ^ that belongs to the exponent
call the function recursively on the exponent and on the part after the exponent
put all the parts together.
In JavaScript it looks like this:
function convertLatexPow(str)
{
// contains no pow
var posOfPow = str.indexOf('^');
if(posOfPow == -1)
return str;
var head = str.substr(0,posOfPow);
var tail = str.substr(posOfPow+1);
// find the beginning of pow
var headLen = posOfPow;
var beginning = 0;
var counter, i; // declare i locally so the loops below do not create an implicit global
if(head[headLen-1] == '}') // find the opening brace
{
counter = 1;
for(i = headLen-2; i >= 0; i--)
{
if(head[i] == '}')
counter++;
else if(head[i] == '{')
counter--;
if(counter == 0)
{
beginning = i;
break;
}
}
}
else if(head[headLen-1].match('[0-9]{1}')) // find the beginning of the number
{
for(i = headLen-2; i >= 0; i--)
{
if(!head[i].match('[0-9]{1}'))
{
beginning = i+1;
break;
}
}
}
else // the string looks like ...abc^{..}.. so the basis is only one character ('c' in this case)
beginning = headLen-1;
var untouchedHead = head.substr(0,beginning);
var firstArg = head.substr(beginning);
// find the end of pow
var secondArg, untouchedTail;
if(tail[0] != '{')
{
secondArg = tail[0];
untouchedTail = tail.substr(1);
}
else
{
counter = 1;
var len = tail.length;
var end = len+1;
for(i = 1; i < len; i++)
{
if(tail[i] == '{')
counter++;
else if(tail[i] == '}')
counter--;
if(counter == 0)
{
end = i;
break;
}
}
secondArg = tail.substr(1,end-1);
if(end < len-1)
untouchedTail = tail.substr(end+1);
else
untouchedTail = '';
}
return untouchedHead
+ 'pow(' + firstArg + ',' + convertLatexPow(secondArg) + ')'
+ convertLatexPow(untouchedTail);
}
alert(convertLatexPow('2^{3^{4^5}}'));
alert(convertLatexPow('\\frac{3}{9}+\\frac{2^{\\sqrt{4^2}}}{6}'));
alert(convertLatexPow('{a + 2 \\cdot (b + c)}^2'));
Input: '2^{3^{4^5}}'
Output: pow(2,pow(3,pow(4,5)))
Input: '\\frac{3}{9}+\\frac{2^{\\sqrt{4^2}}}{6}'
Output: \frac{3}{9}+\frac{pow(2,\sqrt{pow(4,2)})}{6}
Input: '{a + 2 \\cdot (b + c)}^2'
Output: pow({a + 2 \cdot (b + c)},2)
Notice: It does not parse \sqrt. You have to handle that separately.
Feel free to improve it :)
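As a starting point for the \sqrt part, here is a minimal sketch that uses the same brace-counting idea. The name convertLatexSqrt and the target syntax sqrt(...) are assumptions of mine, not part of the function above, and the optional-root form \sqrt[3]{...} is deliberately not handled.

```javascript
// Hypothetical helper: rewrites \sqrt{...} as sqrt(...) by matching braces.
// Assumes the argument is always wrapped in { }.
function convertLatexSqrt(str) {
  var marker = '\\sqrt{';
  var start = str.indexOf(marker);
  if (start === -1) return str; // nothing to convert

  var open = start + marker.length; // first character inside the braces
  var depth = 1;
  var i = open;
  // walk forward until the brace opened by \sqrt{ is closed
  while (i < str.length && depth > 0) {
    if (str[i] === '{') depth++;
    else if (str[i] === '}') depth--;
    i++;
  }
  if (depth !== 0) return str; // unbalanced braces: leave the input untouched

  var inner = str.slice(open, i - 1); // text between the braces
  return str.slice(0, start)
    + 'sqrt(' + convertLatexSqrt(inner) + ')'
    + convertLatexSqrt(str.slice(i));
}
```

Run on the output of the second test case, convertLatexSqrt('\\frac{pow(2,\\sqrt{pow(4,2)})}{6}') gives \frac{pow(2,sqrt(pow(4,2)))}{6}.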
Notice: ^ in LaTeX does not mean power. It just means superscript. So 2^3 becomes 2³ (and looks like "2 to the power of 3"), but \sum_{i=1}^n just becomes better formatted. You can extend the function above to ignore a ^ that directly follows a }.
Notice: As Lucas Trzesniewski mentioned in the comments, 2^3^4 is not converted correctly, but it is also not a valid LaTeX expression.
Edit: Improved the function to convert '{a + 2 \\cdot (b + c)}^2' correctly.
Notice: LaTeX has many ways to write a bracket (e.g. (, \left(, [, \lbrace, ...).
To be sure this function works with all of them, convert every bracket to { and } first. Alternatively, convert them to ordinary parentheses (, but then the function has to be edited to look for ( instead of {.
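A rough sketch of that normalisation step, limited to round parentheses. The name normalizeParens is hypothetical, and the sketch assumes every ( and ) in the input is a grouping bracket, so it has to run before the conversion (whose pow(...) output contains parentheses of its own).

```javascript
// Hypothetical pre-processing step: turn \left( ... \right) and plain ( ... )
// into { ... } so a brace-matching converter can find the base and exponent.
function normalizeParens(latex) {
  return latex
    .replace(/\\left\(|\(/g, '{')   // opening forms
    .replace(/\\right\)|\)/g, '}'); // closing forms
}
```

For example, normalizeParens('\\left(a+b\\right)^2') gives {a+b}^2. Square brackets are left alone here because they also mark optional arguments such as \sqrt[3]{x}.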
Notice: The complexity of this function is O(n⋅k), where n is the length of the input and k is the number of ^ in the input. A worst-case input would be the first test case, 2^{3^{4^{...}}}. In most cases the function will be much faster, roughly O(n).

You can do it iteratively:
var foo = "2^(3^(4^5))";
while (/\([^^]+\^[^^]+\)/.test(foo)) {
foo = foo.replace(/\(([^^]+)\^([^^]+)\)/, "pow($1,$2)");
}
if (/(.+)\^(.+)/.test(foo)) {
foo = "pow(" + RegExp.$1 + "," + RegExp.$2 + ")";
}
// foo == "pow(2,pow(3,pow(4,5)))"

Related

Undefined in Split String

I have a function that splits a string into two parts, front and back, and then swaps them so the back comes first. Here is my code:
function reverseString(string) {
let splitString = ""
let firstString = ""
for(i = 0; i <= string.length/2 - 1; i++) {
firstString += string[i]
}
for(i = string.length/2; i <= string.length; i++) {
splitString += string[i]
}
return splitString + firstString
}
Sorry for the poor explanation. Here are the test cases, with the expected result first and my actual result second:
console.log(reverseString("aaabccc")); // "cccbaaa" "undefinedundefinedundefinedundefinedaaa"
console.log(reverseString("aab")); // "baa" "undefinedundefineda"
console.log(reverseString("aaaacccc")); // "ccccaaaa" "ccccundefinedaaa"
console.log(reverseString("abcdefghabcdef")); // "habcdefabcdefg" "habcdefundefinedabcdefg"
Could you help me find what's wrong with it? Thank you.
You could try another approach and use the slice function
function reverseString(string)
{
if (string.length < 2) { return string; }
let stringHalfLength = string.length / 2;
let isLengthOdd = stringHalfLength % 1 !== 0;
if (isLengthOdd) {
return string.slice(Math.ceil(stringHalfLength), string.length + 1) + string[Math.floor(stringHalfLength)] + string.slice(0, Math.floor(stringHalfLength));
}
return string.slice(stringHalfLength, string.length + 1) + string.slice(0, stringHalfLength);
}
console.log(reverseString("aaabccc") === "cccbaaa");
console.log(reverseString("aab") === "baa");
console.log(reverseString("aaaacccc") === "ccccaaaa");
console.log(reverseString("abcdefghabcdef") === "habcdefabcdefg");
A more efficient way to reverse the string would be to split the string, then use the built-in JavaScript reverse function (which reverses the elements of the split string), and then re-join the elements using the join function. No need to re-invent the wheel.
You can chain the calls (.split().reverse().join()), so your function would look something like this:
function reverseString(string) {
return string.split("").reverse().join("");
}
Try it out!
function reverseString(string) {
return string.split("").reverse().join("");
}
console.log(reverseString("hello"));
console.log(reverseString("aaabbbccc"));
If there's a particular reason you're opting not to use the in-built functions (i.e. if I've missed something?) , feel free to comment.
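The short version of what you need:
function reverseString(string) {
const half = Math.floor(string.length / 2);
// back half + middle character (empty for even lengths) + front half
return string.substring(string.length - half) + string.substring(half, string.length - half) + string.substring(0, half);
}
The key to your question is the middle element. To keep it in place, use Math.floor, which rounds down, so an odd-length string leaves one middle character between the two halves.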
console.log(reverseString("aaabccc")); // "cccbaaa"
console.log(reverseString("abcdefghabcdef")); // "habcdefabcdefg"
function reverseString (str) {
if (str.length<2) {
return str
}
var half = Math.floor(str.length / 2);
return (str.slice(-half) + (str.length%2?str[half]:"") + str.slice(0,half));
}
reverseString('')
> ""
reverseString('1')
> "1"
reverseString('12')
> "21"
reverseString('123')
> "321"
reverseString('1234')
> "3412"
reverseString('12345')
> "45312"
reverseString("aaabccc")
> "cccbaaa"
reverseString("abcdefghabcdef")
> "habcdefabcdefg"
So basically your problem is not to grab two parts of the string and rearrange them; it is to grab three parts.
Part 1: str.slice(0,half)
Part 2: str.length%2 ? str[half] : ""
Part 3: str.slice(-half)
The second part is empty if the string length is even, and the middle character if it is odd.
So here is the same code in a longer, self-explanatory form:
function reverseString (str) {
if (str.length<2) {
return str
}
var half = Math.floor(str.length / 2);
var firstPart = str.slice(0,half);
var middlePart = str.length % 2 ? str[half] : ""; // we could expand also this
var endPart = str.slice(-half);
return endPart + middlePart + firstPart;
}
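Also, notice the precondition at the top, so I don't have to deal with the trivial cases.
In your code, you got undefined because the last loop reads string[string.length], which is one past the last character; you need to change <= to <.
For odd lengths, string.length/2 is also not an integer, so every index in that loop (3.5, 4.5, ...) yields undefined; round the split position with Math.floor or Math.ceil.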
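To make those two fixes concrete, here is a sketch that keeps the loop style of the question. The name reverseHalves is mine, and using three loops (front, middle, back) is just one way to apply the fixes.

```javascript
// Loop-based sketch: integer split position and strict < bounds.
function reverseHalves(string) {
  const half = Math.floor(string.length / 2); // always an integer index
  let front = "";
  let middle = "";
  let back = "";
  for (let i = 0; i < half; i++) {
    front += string[i];
  }
  // runs once for odd lengths, never for even lengths
  for (let i = half; i < string.length - half; i++) {
    middle += string[i];
  }
  for (let i = string.length - half; i < string.length; i++) {
    back += string[i];
  }
  return back + middle + front;
}
```

With this version reverseHalves("aaabccc") gives "cccbaaa" and reverseHalves("abcdefghabcdef") gives "habcdefabcdefg".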

Regex split on comma don't split on comma between double quotes [duplicate]

I'm looking for [a, b, c, "d, e, f", g, h] to turn into an array of 6 elements: a, b, c, "d, e, f", g, h. I'm trying to do this in JavaScript. This is what I have so far:
str = str.split(/,+|"[^"]+"/g);
But right now it's splitting out everything that's in the double-quotes, which is incorrect.
Edit: Okay sorry I worded this question really poorly. I'm being given a string not an array.
var str = 'a, b, c, "d, e, f", g, h';
And I want to turn that into an array using something like the "split" function.
Here's what I would do.
var str = 'a, b, c, "d, e, f", g, h';
var arr = str.match(/(".*?"|[^",\s]+)(?=\s*,|\s*$)/g);
/* will match:
(
".*?" double quotes + anything but double quotes + double quotes
| OR
[^",\s]+ 1 or more characters excl. double quotes, comma or spaces of any kind
)
(?= FOLLOWED BY
\s*, 0 or more empty spaces and a comma
| OR
\s*$ 0 or more empty spaces and nothing else (end of string)
)
*/
arr = arr || [];
// this will prevent JS from throwing an error in
// the below loop when there are no matches
for (var i = 0; i < arr.length; i++) console.log('arr['+i+'] =',arr[i]);
regex: /,(?=(?:(?:[^"]*"){2})*[^"]*$)/
const input_line = '"2C95699FFC68","201 S BOULEVARDRICHMOND, VA 23220","8299600062754882","2018-09-23"'
let my_split = input_line.split(/,(?=(?:(?:[^"]*"){2})*[^"]*$)/)
Output:
my_split[0]: "2C95699FFC68",
my_split[1]: "201 S BOULEVARDRICHMOND, VA 23220",
my_split[2]: "8299600062754882",
my_split[3]: "2018-09-23"
Reference following link for an explanation: regexr.com/44u6o
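In short, the lookahead only accepts a comma that is followed by an even number of double quotes up to the end of the string, which means the comma itself sits outside any quoted field. A small sketch that wraps it in a function (the names SPLIT_OUTSIDE_QUOTES and splitOutsideQuotes are hypothetical, and the trim step is my addition):

```javascript
// A comma qualifies only if the rest of the string contains quotes in pairs:
// (?:[^"]*"){2} consumes two quotes at a time, [^"]*$ allows no stray quote.
const SPLIT_OUTSIDE_QUOTES = /,(?=(?:(?:[^"]*"){2})*[^"]*$)/;

function splitOutsideQuotes(line) {
  // split, then trim the whitespace around each field
  return line.split(SPLIT_OUTSIDE_QUOTES).map((field) => field.trim());
}
```

For the string from the question, splitOutsideQuotes('a, b, c, "d, e, f", g, h') returns six elements, with "d, e, f" kept as one field including its quotes. It assumes the quotes are balanced; a stray quote changes which commas qualify.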
Here is a JavaScript function to do it:
function splitCSVButIgnoreCommasInDoublequotes(str) {
//split the str first
//then merge the elements between two double quotes
var delimiter = ',';
var quotes = '"';
var elements = str.split(delimiter);
var newElements = [];
for (var i = 0; i < elements.length; ++i) {
if (elements[i].indexOf(quotes) >= 0) {//the left double quotes is found
var indexOfRightQuotes = -1;
var tmp = elements[i];
//find the right double quotes
for (var j = i + 1; j < elements.length; ++j) {
if (elements[j].indexOf(quotes) >= 0) {
indexOfRightQuotes = j;
break;
}
}
//found the right double quotes
//merge all the elements between double quotes
if (-1 != indexOfRightQuotes) {
for (var j = i + 1; j <= indexOfRightQuotes; ++j) {
tmp = tmp + delimiter + elements[j];
}
newElements.push(tmp);
i = indexOfRightQuotes;
}
else { //right double quotes is not found
newElements.push(elements[i]);
}
}
else {//no left double quotes is found
newElements.push(elements[i]);
}
}
return newElements;
}
Here's a non-regex one that assumes doublequotes will come in pairs:
function splitCsv(str) {
return str.split(',').reduce((accum,curr)=>{
if(accum.isConcatting) {
accum.soFar[accum.soFar.length-1] += ','+curr
} else {
accum.soFar.push(curr)
}
if(curr.split('"').length % 2 == 0) {
accum.isConcatting= !accum.isConcatting
}
return accum;
},{soFar:[],isConcatting:false}).soFar
}
console.log(splitCsv('asdf,"a,d",fdsa'),' should be ',['asdf','"a,d"','fdsa'])
console.log(splitCsv(',asdf,,fds,'),' should be ',['','asdf','','fds',''])
console.log(splitCsv('asdf,"a,,,d",fdsa'),' should be ',['asdf','"a,,,d"','fdsa'])
This works well for me. (I used semicolons so the alert message would show the difference between commas added when turning the array into a string and the actual captured values.)
REGEX
/("[^"]*")|[^;]+/
var str = 'a; b; c; "d; e; f"; g; h; "i"';
var array = str.match(/("[^"]*")|[^;]+/g);
alert(array);
Here's the regex we're using to extract valid arguments from a comma-separated argument list, supporting double-quoted arguments. It works for the outlined edge cases. E.g.
doesn't include quotes in the matches
works with white spaces in matches
works with empty fields
(?<=")[^"]+?(?="(?:\s*?,|\s*?$))|(?<=(?:^|,)\s*?)(?:[^,"\s][^,"]*[^,"\s])|(?:[^,"\s])(?![^"]*?"(?:\s*?,|\s*?$))(?=\s*?(?:,|$))
Proof: https://regex101.com/r/UL8kyy/3/tests (Note: currently only works in Chrome because the regex uses lookbehinds, which are only supported as of ECMAScript 2018)
According to our guidelines it avoids capturing groups and greedy matching.
I'm sure it can be simplified, I'm open to suggestions / additional test cases.
For anyone interested, the first part matches double-quoted, comma-delimited arguments:
(?<=")[^"]+?(?="(?:\s*?,|\s*?$))
And the second part matches comma-delimited arguments by themselves:
(?<=(?:^|,)\s*?)(?:[^,"\s][^,"]*[^,"\s])|(?:[^,"\s])(?![^"]*?"(?:\s*?,|\s*?$))(?=\s*?(?:,|$))
I almost liked the accepted answer, but it didn't parse the space correctly, and/or it left the double quotes untrimmed, so here is my function:
/**
* Splits the given string into components, and returns the components array.
* Each component must be separated by a comma.
* If the component contains one or more comma(s), it must be wrapped with double quotes.
* The double quote must not be used inside components (replace it with a special string like __double__quotes__ for instance, then transform it again into double quotes later...).
*
* https://stackoverflow.com/questions/11456850/split-a-string-by-commas-but-ignore-commas-within-double-quotes-using-javascript
*/
function splitComponentsByComma(str){
var ret = [];
var arr = str.match(/(".*?"|[^",]+)(?=\s*,|\s*$)/g);
for (let i in arr) {
let element = arr[i];
if ('"' === element[0]) {
element = element.substr(1, element.length - 2);
} else {
element = arr[i].trim();
}
ret.push(element);
}
return ret;
}
console.log(splitComponentsByComma('Hello World, b, c, "d, e, f", c')); // [ 'Hello World', 'b', 'c', 'd, e, f', 'c' ]
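Parse a CSV file or CSV string (TypeScript):
function parseCSV(content: string): string[][] {
// split into lines, split each line on commas outside double quotes, then strip control and non-ASCII characters
return content.split("\n").map(ar=>ar.split(/,(?=(?:(?:[^"]*"){2})*[^"]*$)/).map(refi=>refi.replace(/[\x00-\x08\x0E-\x1F\x7F-\uFFFF]/g, "").trim()));
}
var str='"abc",jkl,1000,qwerty6000';
parseCSV(str);
Output (one inner array per line; the surrounding double quotes are kept):
[
['"abc"', "jkl", "1000", "qwerty6000"]
]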
I know it's a bit long, but here's my take:
var sample="[a, b, c, \"d, e, f\", g, h]";
var inQuotes = false, items = [], currentItem = '';
for(var i = 0; i < sample.length; i++) {
if (sample[i] == '"') {
inQuotes = !inQuotes;
if (!inQuotes) {
if (currentItem.length) items.push(currentItem);
currentItem = '';
}
continue;
}
if ((/^[\"\[\]\,\s]$/gi).test(sample[i]) && !inQuotes) {
if (currentItem.length) items.push(currentItem);
currentItem = '';
continue;
}
currentItem += sample[i];
}
if (currentItem.length) items.push(currentItem);
console.log(items);
As a side note, it will work both with and without the square brackets at the start and end.
This takes a CSV file one line at a time and returns an array with the commas inside quotation marks intact. If no quotation marks are detected, it just does .split(",") as normal. The second loop could probably be replaced with something simpler, but it does the job as is.
function parseCSVLine(str){
if(str.indexOf("\"")>-1){
var aInputSplit = str.split(",");
var aOutput = [];
var iMatch = 0;
//var adding = 0;
for(var i=0;i<aInputSplit.length;i++){
if(aInputSplit[i].indexOf("\"")>-1){
var sWithCommas = aInputSplit[i];
for(var z=i;z<aInputSplit.length;z++){
if(z !== i && aInputSplit[z].indexOf("\"") === -1){
sWithCommas+= ","+aInputSplit[z];
}else if(z !== i && aInputSplit[z].indexOf("\"") > -1){
sWithCommas+= ","+aInputSplit[z];
sWithCommas = sWithCommas.replace(new RegExp("\"", 'g'), ""); // replace() returns a new string, so assign it
aOutput.push(sWithCommas);
i=z;
z=aInputSplit.length+1;
iMatch++;
}
if(z === aInputSplit.length-1){
if(iMatch === 0){
aOutput.push(aInputSplit[z]);
}
iMatch = 0;
}
}
}else{
aOutput.push(aInputSplit[i]);
}
}
return aOutput
}else{
return str.split(",")
}
}
Use the npm library csv-string to parse the strings instead of split: https://www.npmjs.com/package/csv-string
This will handle the empty entries
Something like a stack should do the trick. Here I vaguely use marker boolean as stack (just getting my purpose served with it).
var str = "a,b,c,blah\"d,=,f\"blah,\"g,h,";
var getAttributes = function(str){
var result = [];
var strBuf = '';
var start = 0 ;
var marker = false;
for (var i = 0; i< str.length; i++){
if (str[i] === '"'){
marker = !marker;
}
if (str[i] === ',' && !marker){
result.push(str.substr(start, i - start));
start = i+1;
}
}
if (start <= str.length){
result.push(str.substr(start, i - start));
}
return result;
};
console.log(getAttributes(str));
The code works if your input string is in the format of stringTocompare.
Run the code on https://jsfiddle.net/ to see the output.
Please refer to the screenshot.
Alternatively, you can use the split call that is commented out in the code below, and tweak the code to suit your needs.
If you don't want a comma re-inserted when the pieces are joined, remove the +"," from attach=attach+","+actualString[t+1].
var stringTocompare='"Manufacturer","12345","6001","00",,"Calfe,eto,lin","Calfe,edin","4","20","10","07/01/2018","01/01/2006",,,,,,,,"03/31/2004"';
console.log(stringTocompare);
var actualString=stringTocompare.split(',');
console.log("Before");
for(var i=0;i<actualString.length;i++){
console.log(actualString[i]);
}
//var actualString=stringTocompare.split(/,(?=(?:(?:[^"]*"){2})*[^"]*$)/);
for(var i=0;i<actualString.length;i++){
var flag=0;
var x=actualString[i];
if(x!==null)
{
if(x[0]=='"' && x[x.length-1]!=='"'){
var p=0;
var t=i;
var b=i;
for(var k=i;k<actualString.length;k++){
var y=actualString[k];
if(y[y.length-1]!=='"'){
p++;
}
if(y[y.length-1]=='"'){
flag=1;
}
if(flag==1)
break;
}
var attach=actualString[t];
for(var s=p;s>0;s--){
attach=attach+","+actualString[t+1];
t++;
}
actualString[i]=attach;
actualString.splice(b+1,p);
}
}
}
console.log("After");
for(var i=0;i<actualString.length;i++){
console.log(actualString[i]);
}
[1]: https://i.stack.imgur.com/3FcxM.png
I solved this with a simple parser.
It simply goes through the string char by char, splitting off a segment when it finds the split_char (e.g. comma), but also has an on/off flag which is switched by finding the encapsulator_char (e.g. quote). It doesn't require the encapsulator to be at the start of the field/segment (a,b","c,d would produce 3 segments, with 'b","c' as the second), but it should work for a well formed CSV with escaped encapsulator chars.
function split_except_within(text, split_char, encapsulator_char, escape_char) {
var start = 0
var encapsulated = false
var fields = []
for (var c = 0; c < text.length; c++) {
var char = text[c]
if (char === split_char && ! encapsulated) {
fields.push(text.substring(start, c))
start = c+1
}
if (char === encapsulator_char && (c === 0 || text[c-1] !== escape_char) )
encapsulated = ! encapsulated
}
fields.push(text.substring(start))
return fields
}
https://jsfiddle.net/7hty8Lvr/1/
const csvSplit = (line) => {
let splitLine = [];
var quotesplit = line.split('"');
var lastindex = quotesplit.length - 1;
// split evens removing outside quotes, push odds
quotesplit.forEach((val, index) => {
if (index % 2 === 0) {
var firstchar = (index == 0) ? 0 : 1;
var trimmed = (index == lastindex)
? val.substring(firstchar)
: val.slice(firstchar, -1);
trimmed.split(",").forEach(v => splitLine.push(v));
} else {
splitLine.push(val);
}
});
return splitLine;
}
This works as long as quotes always come on the outside of values that contain the commas that need to be excluded (i.e. a CSV file).
If you have input like '1,2,4"2,6",8', it will not work.
Assuming your string really looks like '[a, b, c, "d, e, f", g, h]', I believe this would be an acceptable use case for eval():
myString = 'var myArr = ' + myString;
eval(myString);
console.log(myArr); // will now be an array of elements: a, b, c, "d, e, f", g, h
Edit: As Rocket pointed out, strict mode removes eval's ability to inject variables into the local scope, meaning you'd want to do this:
var myArr = eval(myString);
I've had similar issues with this, and I found no good .NET solution, so I wrote my own. NOTE: This was also used to reply to
Splitting comma separated string, ignore commas in quotes, but allow strings with one double quotation
but it seems more applicable here (though still useful over there).
In my application I'm parsing a CSV, so my split delimiter is ",". I suppose this method only works when the split argument is a single character.
So I've written a function that ignores commas within double quotes. It does so by converting the input string into a character array and parsing it character by character:
public static string[] Splitter_IgnoreQuotes(string stringToSplit)
{
char[] CharsOfData = stringToSplit.ToCharArray();
//enter your expected array size here or alloc.
string[] dataArray = new string[37];
int arrayIndex = 0;
bool DoubleQuotesJustSeen = false;
foreach (char theChar in CharsOfData)
{
//did we just see double quotes, and no command? dont split then. you could make ',' a variable for your split parameters I'm working with a csv.
if ((theChar != ',' || DoubleQuotesJustSeen) && theChar != '"')
{
dataArray[arrayIndex] = dataArray[arrayIndex] + theChar;
}
else if (theChar == '"')
{
if (DoubleQuotesJustSeen)
{
DoubleQuotesJustSeen = false;
}
else
{
DoubleQuotesJustSeen = true;
}
}
else if (theChar == ',' && !DoubleQuotesJustSeen)
{
arrayIndex++;
}
}
return dataArray;
}
This function, to suit my application, also drops the double quotes (") from the input, as they are unneeded in my data.

How do get input 2^3 to Math.pow(2, 3)?

I have this simple calculator script, but it doesn't allow power ^.
function getValues() {
var input = document.getElementById('value').value;
document.getElementById('result').innerHTML = eval(input);
}
<label for="value">Enter: </label><input id="value">
<div id="result">Results</div>
<button onclick="getValues()">Get Results</button>
I tried using input = input.replace( '^', 'Math.pow(,)');
But I do not know how to get the values before '^' and after into the brackets.
Example: (1+2)^3^3 should give 7,625,597,484,987
Use a regular expression with capture groups:
input = '3 + 2 ^3';
input = input.replace(/(\d+)\s*\^\s*(\d+)/g, 'Math.pow($1, $2)');
console.log(input);
This will only work when the arguments are just numbers. It won't work with sub-expressions or when you repeat it, like
(1+2)^3^3
This will require writing a recursive-descent parser, and that's far more work than I'm willing to put into an answer here. Get a textbook on compiler design to learn how to do this.
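For a sense of scale, here is a rough sketch of such a parser that only rewrites the expression, turning ^ into Math.pow calls. It is not a full calculator grammar: the name toMathPow is hypothetical, only non-negative numbers, + - * / ^ and parentheses are handled, unary minus is not supported, and any other characters are silently dropped by the tokenizer.

```javascript
// Hypothetical recursive-descent rewriter: "(1+2)^3^3" -> Math.pow((1 + 2), Math.pow(3, 3))
function toMathPow(input) {
  // numbers (optionally with a decimal part) or single-character operators/parentheses
  const tokens = input.match(/\d+(?:\.\d+)?|[-+*/^()]/g) || [];
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];

  // expr := term (('+' | '-') term)*
  function parseExpr() {
    let left = parseTerm();
    while (peek() === '+' || peek() === '-') {
      const op = next();
      left = left + ' ' + op + ' ' + parseTerm();
    }
    return left;
  }

  // term := power (('*' | '/') power)*
  function parseTerm() {
    let left = parsePower();
    while (peek() === '*' || peek() === '/') {
      const op = next();
      left = left + ' ' + op + ' ' + parsePower();
    }
    return left;
  }

  // power := atom ('^' power)?   (the recursion makes ^ right-associative)
  function parsePower() {
    const base = parseAtom();
    if (peek() === '^') {
      next();
      return 'Math.pow(' + base + ', ' + parsePower() + ')';
    }
    return base;
  }

  // atom := number | '(' expr ')'
  function parseAtom() {
    const tok = next();
    if (tok === '(') {
      const inner = parseExpr();
      if (next() !== ')') throw new SyntaxError('Expected )');
      return '(' + inner + ')';
    }
    if (tok === undefined || !/^\d/.test(tok)) {
      throw new SyntaxError('Unexpected token: ' + tok);
    }
    return tok;
  }

  const result = parseExpr();
  if (pos < tokens.length) throw new SyntaxError('Unexpected token: ' + peek());
  return result;
}
```

toMathPow('(1+2)^3^3') returns the string Math.pow((1 + 2), Math.pow(3, 3)), and toMathPow('3 + 2 ^3') returns 3 + Math.pow(2, 3). The result can then go through the same eval call the calculator already uses.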
I don't think you'll be able to do this with simple replace.
If you want to parse infix operators, you build two stacks, one for symbols and one for numbers. Then walk the formula sequentially, ignoring everything except symbols, numbers and closing parentheses. Push symbols and numbers onto their stacks, and when you encounter a closing parenthesis, pop the last symbol and apply it to the last two numbers. (This was invented by Dijkstra, I think.)
const formula = '(1+2)^3^3'
const symbols = []
const numbers = []
function apply(n2, n1, s) {
if (s === '^') {
return Math.pow(parseInt(n1, 10), parseInt(n2, 10))
}
return eval(`${n1} ${s} ${n2}`)
}
const applyLast = () => apply(numbers.pop(), numbers.pop(), symbols.pop())
const tokenize = formula => formula.split(/(\d+)|([\^\/\)\(+\-\*])/).filter(t => t !== undefined && t !== '')
const solver = (formula) => {
const tf = tokenize(formula)
for (let l of tf) { // iterate over the tokens, not single characters, so multi-digit numbers work
const parsedL = parseInt(l, 10)
if (isNaN(parsedL)) {
if (l === ')') {
numbers.push(applyLast())
continue
} else {
if (~['+', '-', '*', '/', '^'].indexOf(l))
symbols.push(l)
continue
}
}
numbers.push(l)
}
while (symbols.length > 0)
numbers.push(applyLast())
return numbers.pop()
}
console.log(solver(formula))
Get your input into a string and do...
var input = document.getElementById('value').value;
var values = input.split('^'); //will save an array with [value1, value 2]
var result = Math.pow(values[0], values[1]);
console.log(result);
This only if your only operation is a '^'
EDIT: Saw example after edit, this no longer works.
function getValues() {
var input = document.getElementById('value').value;
// code to make ^ work like Math.pow
input = input.replace(/\^/g, '**'); // global flag: a string pattern would replace only the first ^
document.getElementById('result').innerHTML = eval(input);
}
The ** operator can replace the Math.pow function in most modern browsers. The next version of Safari (v10.1) coming out any day supports it.
As said in other answers here, you need a real parser to solve this correctly. A regex will solve simple cases, but for nested statements you need a recursive parser. For Javascript one library that offers this is peg.js.
In your case, the example given in the online version can be quickly extended to handle powers:
Expression
= head:Term tail:(_ ("+" / "-") _ Term)* {
var result = head, i;
for (i = 0; i < tail.length; i++) {
if (tail[i][1] === "+") { result += tail[i][3]; }
if (tail[i][1] === "-") { result -= tail[i][3]; }
}
return result;
}
Term
= head:Pow tail:(_ ("*" / "/") _ Pow)* { // Here I replaced Factor with Pow
var result = head, i;
for (i = 0; i < tail.length; i++) {
if (tail[i][1] === "*") { result *= tail[i][3]; }
if (tail[i][1] === "/") { result /= tail[i][3]; }
}
return result;
}
// This is the new part I added
Pow
= head:Factor tail:(_ "^" _ Factor)* {
var result = 1;
for (var i = tail.length - 1; 0 <= i; i--) {
result = Math.pow(tail[i][3], result);
}
return Math.pow(head, result);
}
Factor
= "(" _ expr:Expression _ ")" { return expr; }
/ Integer
Integer "integer"
= [0-9]+ { return parseInt(text(), 10); }
_ "whitespace"
= [ \t\n\r]*
It returns the expected output 7625597484987 for the input string (1+2)^3^3.
Here is a Python-based version of this question, with solution using pyparsing: changing ** operator to power function using parsing?

Using two for loops to compare two strings

I am working through exercises on exercism.io and the third one asks us to compare two DNA strings and return the difference (hamming distance) between them.
So for example:
GAGCCTACTAACGGGAT
CATCGTAATGACGGCCT
^ ^ ^  ^ ^    ^^
There are 7 different characters lined up in that comparison. My question is whether I'm taking the right approach to solve this. I created two empty arrays, created a function that loops through both strings and pushes the different letters when they meet.
I tried running it through a console and I always get an unexpected input error.
var diff = [];
var same = [];
function ham(dna1, dna2) {
for (var i = 0; i < dna1.length; i++)
for (var j = 0; j < dna2.length; i++){
if (dna1[i] !== dna2[j]) {
console.log(dna1[i]);
diff.push(dna1[i]);
}
else {
console.log(dna1[i]);
same.push(dna1[i]);
}
return diff.length;
}
ham("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT");
console.log("The Hamming distance between both DNA types is " + diff.length + ".");
Do not use globals.
Do not use nested loops if you don't have to.
Do not store useless things in arrays.
function ham(dna1, dna2) {
if (dna1.length !== dna2.length) throw new Error("Strings have different length.");
var diff = 0;
for (var i = 0; i < dna1.length; ++i) {
if (dna1[i] !== dna2[i]) {
++diff;
}
}
return diff;
}
var diff = ham("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT");
console.log("The Hamming distance between both DNA types is " + diff + ".");
The first problem is that you're missing a closing }. I think you want it right before the return statement.
Secondly, there's a problem with your algorithm. You compare every item in dna1 (i) with every item in dna2 instead of comparing the items in the same position.
To use a shorter example so we can step through it, consider comparing 'CAT' and 'CBT'. You want to compare the characters in the same position in each string, so you don't actually want 2 for loops, you only want 1. You'd compare C to C ([0]), A to B ([1]), and T to T ([2]) to find the 1 difference at [1]. Now step through that with your 2 for loops in your head, and you'll see that you get many more differences than exist.
Once you use the same offset for the characters in each string, you have to start worrying that one string might be shorter than the other. Indexing past the end of the shorter string yields undefined, so we have to take that into account too, and presumably count the difference in length as differences. But perhaps this is out of scope for you, and the strings will always be the same length.
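The 'CAT' versus 'CBT' walk-through can be checked with a short sketch. Both function names are hypothetical, and counting the extra characters of the longer string as differences is an assumption, as discussed above.

```javascript
// Nested loops: compares every character of a with every character of b.
function countMismatchedPairs(a, b) {
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    for (let j = 0; j < b.length; j++) {
      if (a[i] !== b[j]) count++;
    }
  }
  return count;
}

// Single loop: compares only the characters at the same position.
// Assumption: extra characters in the longer string each count as one difference.
function hammingDistance(a, b) {
  const shorter = Math.min(a.length, b.length);
  let count = Math.abs(a.length - b.length);
  for (let i = 0; i < shorter; i++) {
    if (a[i] !== b[i]) count++;
  }
  return count;
}
```

For 'CAT' and 'CBT' the nested version counts 7 mismatched pairs out of 9, while the single loop finds the 1 real difference.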
You only need to have one single loop like below:
var diff = [];
var same = [];
function ham(dna1, dna2) {
for (var i = 0; i < dna1.length; i++) {
if (dna1[i] !== dna2[i]) {
console.log("not same");
diff.push(dna1[i]);
} else {
console.log("same");
same.push(dna1[i]);
}
}
return diff.length;
}
ham("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT");
console.log("The Hamming distance between both DNA types is " + diff.length + ".");
The edit distance is not really hard to calculate. More code is needed to cover the edge cases in parameter values.
function hamming(str1, str2) {
var i, len, distance = 0;
// argument validity check
if (typeof str1 === "undefined" || typeof str2 === "undefined") return;
if (str1 === null || str2 === null) return;
// all other argument types are assumed to be meant as strings
str1 = str1.toString();
str2 = str2.toString();
// the longer string governs the maximum edit distance
len = str1.length > str2.length ? str1.length : str2.length;
// now we can compare
for (i = 0; i < len; i++) {
if ( !(str1[i] === str2[i]) ) distance++;
}
return distance;
}
Calling
ham( "GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT" );
with the following function definition:
function ham(A,B){
var D = [], i = 0;
i = A.length > B.length ? A : B;
for( var x in i)
A[x] == B[x] ? D.push(" ") : D.push("^");
console.log( A + "\n" + B +"\n" + D.join("") );
}
will log:
GAGCCTACTAACGGGAT
CATCGTAATGACGGCCT
^ ^ ^  ^ ^    ^^
It accepts strings of different lengths; depending on the requirements and the data representation, it can be modified to fill the gap with a suitable placeholder symbol.
I think that you would want to do something like this:
var dna1 = "GAGCCTACTAACGGGAT";
var dna2 = "CATCGTAATGACGGCCT";
function ham(string1, string2) {
var counter = 0;
for (var i = 0;i < string1.length;i++) {
if (string1.slice(i, i + 1) != string2.slice(i, i + 1)) {
counter++
};
};
return(counter);
};
console.log("insert text here " + ham(dna1, dna2));
It checks each character of the string against the corresponding character of the other string, and adds 1 to the counter whenever the 2 characters are not equal.
You can use Array#reduce to iterate the 1st string, by using Function#call, and compare each letter to the letter of the corresponding index in the 2nd string.
function ham(dna1, dna2) {
return [].reduce.call(dna1, function(count, l, i) {
return l !== dna2[i] ? count + 1 : count;
}, 0);
}
var diff =ham("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT");
console.log("The Hamming distance between both DNA types is " + diff + ".");

JavaScript strings outside of the BMP

BMP being Basic Multilingual Plane
According to JavaScript: the Good Parts:
JavaScript was built at a time when Unicode was a 16-bit character set, so all characters in JavaScript are 16 bits wide.
This leads me to believe that JavaScript uses UCS-2 (not UTF-16!) and can only handle characters up to U+FFFF.
Further investigation confirms this:
> String.fromCharCode(0x20001);
The fromCharCode method seems to only use the lowest 16 bits when returning the Unicode character. Trying to get U+20001 (CJK unified ideograph 20001) instead returns U+0001.
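A small check of the truncation described above (a sketch added for illustration, not part of the original post). String.fromCharCode converts each argument to a 16-bit unsigned integer, so the upper bits of 0x20001 are discarded:

```javascript
// 0x20001 does not fit in 16 bits; only the low 16 bits (0x0001) survive.
var truncated = String.fromCharCode(0x20001);
console.log(truncated === "\u0001");           // true
console.log(truncated.length);                 // 1
console.log((0x20001 & 0xFFFF).toString(16));  // "1"
```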
Question: is it at all possible to handle post-BMP characters in JavaScript?
2011-07-31: slide twelve from Unicode Support Shootout: The Good, The Bad, & the (mostly) Ugly covers issues related to this quite well:
Depends what you mean by ‘support’. You can certainly put non-UCS-2 characters in a JS string using surrogates, and browsers will display them if they can.
But each item in a JS string is a separate UTF-16 code unit. There is no language-level support for handling full characters: all the standard String members (length, split, slice etc) deal with code units, not characters, so they will quite happily split surrogate pairs or hold invalid surrogate sequences.
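To make the code-unit behaviour concrete, here is a minimal sketch (added for illustration) that writes U+20001 as its surrogate pair and shows length, split and slice all operating on code units:

```javascript
// U+20001 written as its surrogate pair: high 0xD840, low 0xDC01.
var s = "a\uD840\uDC01b";
console.log(s.length);           // 4 (code units), although there are 3 characters
console.log(s.split("").length); // 4
// slice cuts between the two halves of the pair without complaint.
var broken = s.slice(0, 2);      // "a" followed by a lone high surrogate
console.log(broken.charCodeAt(1).toString(16)); // "d840"
```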
If you want surrogate-aware methods, I'm afraid you're going to have to start writing them yourself! For example:
String.prototype.getCodePointLength= function() {
return this.length-this.split(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g).length+1;
};
String.fromCodePoint= function() {
var chars= Array.prototype.slice.call(arguments);
for (var i= chars.length; i-->0;) {
var n = chars[i]-0x10000;
if (n>=0)
chars.splice(i, 1, 0xD800+(n>>10), 0xDC00+(n&0x3FF));
}
return String.fromCharCode.apply(null, chars);
};
I came to the same conclusion as bobince. If you want to work with strings containing Unicode characters outside of the BMP, you have to reimplement JavaScript's String methods. This is because JavaScript counts each 16-bit code unit as a character. Symbols outside of the BMP need two code units to be represented. You therefore run into a case where some symbols count as two characters and some count only as one.
I've reimplemented the following methods to treat each Unicode code point as a single character: .length, .charCodeAt, .fromCharCode, .charAt, .indexOf, .lastIndexOf, .slice, and .split.
You can check it out on jsfiddle: http://jsfiddle.net/Y89Du/
Here's the code without comments. I tested it, but it may still have errors. Comments are welcome.
if (!String.prototype.ucLength) {
String.prototype.ucLength = function() {
// this solution was taken from
// http://stackoverflow.com/questions/3744721/javascript-strings-outside-of-the-bmp
return this.length - this.split(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g).length + 1;
};
}
if (!String.prototype.codePointAt) {
String.prototype.codePointAt = function (ucPos) {
if (isNaN(ucPos)){
ucPos = 0;
}
var str = String(this);
var codePoint = null;
var pairFound = false;
var ucIndex = -1;
var i = 0;
while (i < str.length){
ucIndex += 1;
var code = str.charCodeAt(i);
var next = str.charCodeAt(i + 1);
pairFound = (0xD800 <= code && code <= 0xDBFF && 0xDC00 <= next && next <= 0xDFFF);
if (ucIndex == ucPos){
codePoint = pairFound ? ((code - 0xD800) * 0x400) + (next - 0xDC00) + 0x10000 : code;
break;
} else{
i += pairFound ? 2 : 1;
}
}
return codePoint;
};
}
if (!String.fromCodePoint) {
String.fromCodePoint = function () {
var strChars = [], codePoint, offset, codeValues, i;
for (i = 0; i < arguments.length; ++i) {
codePoint = arguments[i];
offset = codePoint - 0x10000;
if (codePoint > 0xFFFF){
codeValues = [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)];
} else{
codeValues = [codePoint];
}
strChars.push(String.fromCharCode.apply(null, codeValues));
}
return strChars.join("");
};
}
if (!String.prototype.ucCharAt) {
String.prototype.ucCharAt = function (ucIndex) {
var str = String(this);
var codePoint = str.codePointAt(ucIndex);
var ucChar = String.fromCodePoint(codePoint);
return ucChar;
};
}
if (!String.prototype.ucIndexOf) {
String.prototype.ucIndexOf = function (searchStr, ucStart) {
if (isNaN(ucStart)){
ucStart = 0;
}
if (ucStart < 0){
ucStart = 0;
}
var str = String(this);
var strUCLength = str.ucLength();
searchStr = String(searchStr);
var ucSearchLength = searchStr.ucLength();
var i = ucStart;
while (i < strUCLength){
var ucSlice = str.ucSlice(i,i+ucSearchLength);
if (ucSlice == searchStr){
return i;
}
i++;
}
return -1;
};
}
if (!String.prototype.ucLastIndexOf) {
String.prototype.ucLastIndexOf = function (searchStr, ucStart) {
var str = String(this);
var strUCLength = str.ucLength();
if (isNaN(ucStart)){
ucStart = strUCLength - 1;
}
if (ucStart >= strUCLength){
ucStart = strUCLength - 1;
}
searchStr = String(searchStr);
var ucSearchLength = searchStr.ucLength();
var i = ucStart;
while (i >= 0){
var ucSlice = str.ucSlice(i,i+ucSearchLength);
if (ucSlice == searchStr){
return i;
}
i--;
}
return -1;
};
}
if (!String.prototype.ucSlice) {
String.prototype.ucSlice = function (ucStart, ucStop) {
var str = String(this);
var strUCLength = str.ucLength();
if (isNaN(ucStart)){
ucStart = 0;
}
if (ucStart < 0){
ucStart = strUCLength + ucStart;
if (ucStart < 0){ ucStart = 0;}
}
if (typeof(ucStop) == 'undefined'){
ucStop = strUCLength; // the stop index is exclusive, so default to the full length
}
if (ucStop < 0){
ucStop = strUCLength + ucStop;
if (ucStop < 0){ ucStop = 0;}
}
var ucChars = [];
var i = ucStart;
while (i < ucStop){
ucChars.push(str.ucCharAt(i));
i++;
}
return ucChars.join("");
};
}
if (!String.prototype.ucSplit) {
String.prototype.ucSplit = function (delimiter, limit) {
var str = String(this);
var strUCLength = str.ucLength();
var ucChars = [];
if (delimiter == ''){
for (var i = 0; i < strUCLength; i++){
ucChars.push(str.ucCharAt(i));
}
// only truncate when a limit was passed; slice(0, NaN) would return []
if (typeof(limit) != 'undefined'){
ucChars = ucChars.slice(0, limit);
}
} else{
ucChars = str.split(delimiter, limit);
}
return ucChars;
};
}
More recent JavaScript engines have String.fromCodePoint.
const ideograph = String.fromCodePoint( 0x20001 ); // outside the BMP
There is also a code-point iterator, which lets you count the code points in a string.
function countCodePoints( str )
{
const i = str[Symbol.iterator]();
let count = 0;
while( !i.next().done ) ++count;
return count;
}
console.log( ideograph.length ); // gives '2'
console.log( countCodePoints(ideograph) ); // '1'
Yes, you can. Although support for non-BMP characters directly in source documents is optional according to the ECMAScript standard, modern browsers let you use them. Naturally, the document encoding must be properly declared, and for most practical purposes you would need to use the UTF-8 encoding. Moreover, you need an editor that can handle UTF-8, and you need some input method(s); see e.g. my Full Unicode Input utility.
Using suitable tools and settings, you can write var foo = '𠀁'.
The non-BMP characters will be internally represented as surrogate pairs, so each non-BMP character counts as 2 in the string length.
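As a worked example of that representation, the arithmetic for U+20001 can be sketched like this. The helper name toSurrogatePair is hypothetical and used only for illustration; it assumes its argument is a code point above U+FFFF.

```javascript
// toSurrogatePair is a hypothetical helper; it assumes codePoint > 0xFFFF.
function toSurrogatePair(codePoint) {
  var offset = codePoint - 0x10000;    // 20-bit value
  var high = 0xD800 + (offset >> 10);  // top 10 bits
  var low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits
  return [high, low];
}

var pair = toSurrogatePair(0x20001);
console.log(pair.map(function (n) { return n.toString(16); }).join(" ")); // "d840 dc01"

var foo = String.fromCharCode(pair[0], pair[1]);
console.log(foo.length); // 2
```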
Using a for (c of this) loop, one can perform various computations on a string that contains non-BMP characters. For instance, to compute the string length, and to get the nth character of the string:
String.prototype.magicLength = function()
{
var c, k;
k = 0;
for (c of this) // iterate each char of this
{
k++;
}
return k;
}
String.prototype.magicCharAt = function(n)
{
var c, k;
k = 0;
for (c of this) // iterate each char of this
{
if (k == n) return c + "";
k++;
}
return "";
}
This old topic now has a simple solution in ES6:
Split characters into an array
simple version
[..."😴😄😃⛔🎠🚓🚇"] // ["😴", "😄", "😃", "⛔", "🎠", "🚓", "🚇"]
Then having each one separated you can handle them easily for most common cases.
Credit: DownGoat
Full solution
To handle special emoji such as the one in the comment, one can search for the joining character (the zero-width joiner, U+200D, char code 8205) and merge the pieces back together. Here is how:
let myStr = "👩‍👩‍👧‍👧😃𝌆"
let arr = [...myStr]
for (let i = arr.length - 2; i >= 1; i--) { // a joiner always has a neighbour on each side
if (arr[i].charCodeAt(0) == 8205) { // special combination character
arr[i-1] += arr[i] + arr[i+1]; // combine them back to a single emoji
arr.splice(i, 2)
}
}
console.log(arr.length) //3
Haven't found a case where this doesn't work. Comment if you do.
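To conclude
Char code 8205 is the zero-width joiner (U+200D), which Unicode uses to combine several emoji into a single displayed glyph. It is separate from the surrogate pairs that UTF-16 uses to encode characters outside the BMP.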
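As an alternative to merging on char code 8205 by hand, newer engines provide Intl.Segmenter, which splits a string into user-perceived characters (grapheme clusters). The sketch below assumes an engine that ships Intl.Segmenter; the helper name splitGraphemes is hypothetical, and the exact grouping depends on the Unicode data bundled with the engine.

```javascript
// Assumes Intl.Segmenter is available (it is missing in older engines).
// splitGraphemes is a hypothetical helper name used for this illustration.
function splitGraphemes(str) {
  var segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(str), function (part) {
    return part.segment;
  });
}

// Family emoji (four emoji joined by U+200D), then U+1F603 and U+1D306.
var family = "\u{1F469}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F467}";
var parts = splitGraphemes(family + "\u{1F603}\u{1D306}");
console.log(parts.length);
console.log(parts[0] === family);
```

This also covers clusters that do not involve the joiner at all, such as a letter followed by a combining accent.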
