How to catch empty string when using split length - javascript

Using the JS split function on an empty string returns an array with one element (the empty string), so its length is 1. That makes sense of course.
I had a situation in which I needed to count the number of IDs inside a comma-separated string. Calling string.split(',').length on an empty string returns 1, which doesn't correspond to the actual number of IDs inside the string (it is just the default behavior of split, since a single, empty element is returned).
To catch this, I wrote the code below, but something tells me it isn't the best solution. I would like to improve it and get a better understanding of best practice.
Hopefully someone could help out here and provide some feedback on my issue:
What's the best way to count the number of ID's inside a comma-separated string, with respect to empty strings?
var str1 = '12,16,91,89,43';
var str2 = '';
if(!str1)
    countRight = 0;
else
    countRight = str1.split(',').length;
if(!str2)
    countWrong = 0;
else
    countWrong = str2.split(',').length;

Your code is a little verbose. How about
count = str ? str.split (',').length : 0;
or shorter but a little more obscure:
count = +(str && str.split(',').length);
or even
count = str.length && str.split (',').length;
which can be shortened to
count = (str && str.split (',')).length;
Off topic : If your strings are coming from user input I would recommend allowing spaces by splitting using the regular expression /\s*,\s*/
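Putting the suggestions above together, here is a minimal sketch rather than a definitive implementation. The helper name countIds is hypothetical, and it assumes that empty entries (for example from a doubled comma) should not be counted as IDs:

```javascript
// Hypothetical helper: counts the non-empty IDs in a comma-separated string.
// Assumption: blank entries (e.g. from ",,") are not valid IDs.
function countIds(str) {
  if (!str) return 0; // covers '', null and undefined
  return str
    .split(/\s*,\s*/) // tolerate spaces around the commas
    .map(function (id) { return id.trim(); })
    .filter(function (id) { return id.length > 0; })
    .length;
}

console.log(countIds('12,16,91,89,43')); // 5
console.log(countIds(''));               // 0
console.log(countIds('12, 16 ,,91'));    // 3
```

Filtering out blank entries is a design choice; drop the filter step if an empty slot should still count.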

Related

Performance issue using regex to replace/clear substring

I have a string containing things like this:
<a#{style}#{class}#{data} id="#{attr:id}">#{child:content} #{child:whatever}</a>
Everything to do here is just clear #{xxx}, except sub-strings starting with #{child: .
I used str.match() to get all sub-strings "#{*}" in an array to search and keep all #{child: substrings:
var matches = str.match(new RegExp("#\{(.*?)\}",'g'));
if (matches && matches.length){
    for(var i=0; i<matches.length; i++){
        if (matches[i].search("#{child:") == -1) str = str.replace(matches[i],'');
    }
}
I got it running ok, but it's too slow when string becomes bigger (~2 seconds / +1000 nodes like this one on top)
Is there some alternative to do it, maybe using a rule (if exists) to escape #{child: direct in regex and improve performance?
If I understand your question correctly, you don't want to remove the #{child:...} sub-strings, but everything else of the format #{...} should go. In that case you could change the regular expression to check that child: is not matched when you perform the replace:
var str = '<a#{style}#{class}#{data} id="#{attr:id}">#{child:content} #{child:whatever}</a>';
str = str.replace(/#\{((?!child:)[\s\S])+?\}/g, '');
This seems pretty fast.
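As an alternative sketch (not the approach used above), the same single-pass cleanup can be written with a replacer callback instead of a negative lookahead. The function name stripPlaceholders is hypothetical, and the pattern assumes placeholders never contain a nested } character. Unlike the lookahead version, it also removes an empty #{} placeholder.

```javascript
// Hypothetical helper: removes every #{...} placeholder except those
// starting with "child:". Assumes placeholders contain no "}" inside.
function stripPlaceholders(template) {
  return template.replace(/#\{([^}]*)\}/g, function (whole, inner) {
    // Keep the placeholder untouched when it is a child reference.
    return inner.indexOf('child:') === 0 ? whole : '';
  });
}

var input = '<a#{style}#{class}#{data} id="#{attr:id}">#{child:content} #{child:whatever}</a>';
console.log(stripPlaceholders(input));
// <a id="">#{child:content} #{child:whatever}</a>
```

Both versions scan the string once, which avoids the repeated str.replace calls in the original loop.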

How to compare two Strings and get Different part

now I have two strings,
var str1 = "A10B1C101D11";
var str2 = "A1B22C101D110E1";
What I intend to do is to tell the difference between them, the result will look like
A10B1C101D11
A10 B22 C101 D110E1
It follows the same pattern, one character and a number. And if the character doesn't exist or the number is different between them, I will say they are different, and highlight the different part. Can regular expression do it or any other good solution? thanks in advance!
Let me start by stating that regexp might not be the best tool for this. As the strings have a simple format that you are aware of it will be faster and safer to parse the strings into tokens and then compare the tokens.
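A minimal sketch of that token-based approach, assuming every token is one non-digit character followed by zero or more digits and that each letter appears at most once per string. The names tokenize and diffTokens are hypothetical:

```javascript
// Hypothetical helper: turns "A10B1" into { A: "10", B: "1" }.
// Assumes each letter occurs at most once in the string.
function tokenize(str) {
  var tokens = {};
  var re = /(\D)(\d*)/g; // one non-digit, then its number
  var m;
  while ((m = re.exec(str)) !== null) {
    tokens[m[1]] = m[2];
  }
  return tokens;
}

// Returns the tokens of `b` that are missing from `a` or carry a different number.
function diffTokens(a, b) {
  var ta = tokenize(a);
  var tb = tokenize(b);
  return Object.keys(tb).filter(function (key) {
    return !Object.prototype.hasOwnProperty.call(ta, key) || ta[key] !== tb[key];
  }).map(function (key) {
    return key + tb[key];
  });
}

console.log(diffTokens("A10B1C101D11", "A1B22C101D110E1"));
// [ 'A1', 'B22', 'D110', 'E1' ]
```

Swap the arguments to get the tokens of the first string that differ from the second.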
However you can do this with Regexp, although in javascript you are hampered by the lack of lookbehind.
The way to do this is to use negative lookahead to prevent matches that are included in the other string. However since javascript does not support lookbehind you might need to go search from both directions.
We do this by concatenating the strings, with a delimiter that we can test for.
If '|' is used as the delimiter, the regexp becomes:
/(\D\d*)(?=(?:\||\D.*\|))(?!.*\|(.*\d)?\1(\D|$))/g
To find the tokens in the second string that are not present in the first, you do:
var bothstring=str2.concat("|",str1);
var re=/(\D\d*)(?=(?:\||\D.*\|))(?!.*\|(.*\d)?\1(\D|$))/g;
var match=re.exec(bothstring);
Subsequent calls to re.exec will return later matches, so you can iterate over them as in the following example:
while (match != null) {
    alert("\"" + match[1] + "\" at position " + match.index);
    match = re.exec(bothstring);
}
As stated this gives tokens in str2 that are different in str1. To get the tokens in str1 that are different use the same code but change the order of str1 and str2 when you concatenate the strings.
The above code might not be safe when dealing with potentially dirty input. In particular, it might misbehave if fed a string like "A100|A100": the first A100 will not be reported as missing, because the regexp is not aware that the source is supposed to be two different strings. If this is a potential issue, search the input for occurrences of the delimiting character first.
You can break each string into an array
var aStr1 = str1.split('');
var aStr2 = str2.split('');
Then check which one has more characters, and save the smaller number
var totalCharacters;
if(aStr1.length > aStr2.length) {
    totalCharacters = aStr2.length
} else {
    totalCharacters = aStr1.length
}
And loop comparing both
var diff = [];
for(var i = 0; i<totalCharacters; i++) {
    if(aStr1[i] != aStr2[i]) {
        diff.push(aStr1[i]); // or something else
    }
}
At the very end you can append the remaining characters from the longer string (since they obviously have no counterpart in the other one).
Does that help?

Parsing with or without regular expressions? Which one is faster?

Say I have an array of strings of the following format:
"array[5] = 10"
What would be the best solution to parse it in JavaScript?
Ashamedly not being familiar with regular expressions, I can come up only with something like this:
for (var i in lines) {
    var start = lines[i].indexOf("array[");
    if (start >= 0) {
        var pair = lines[i].substring(start + 6).trim().split('=');
        var index = pair[0].trim().substring(0, pair[0].trim().length - 1);
        var value = pair[1].trim();
    }
}
Is there a more elegant way to parse something like this? If the answer is using regex, would it make the code slower?
Don't ask which approach is faster; measure it!
This is a regular expression that should match what you've implemented in your code:
/array\[(\d+)]\s*=\s*(.+)/
To help you learn regular expressions, you can use a tool like Regexper to visualize the expression.
Note how for the index I assumed it should be an integer, but for the value any characters are accepted. Your code doesn't specify that either the index or value should be numbers, but I made some assumptions to that effect. I leave it as an exercise to the reader to tweak the expression to something more fitting if need be.
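As a rough sketch of how the expression above could be applied, the capture groups can be read from the result of exec. The function name parseLine and the returned object shape are assumptions made for illustration:

```javascript
// The expression from above: group 1 is the index, group 2 is the value.
var pattern = /array\[(\d+)]\s*=\s*(.+)/;

// Hypothetical helper: returns { index, value } or null when the line does not match.
function parseLine(line) {
  var m = pattern.exec(line);
  if (!m) return null;
  return {
    index: Number(m[1]), // the index is assumed to be an integer
    value: m[2].trim()   // the value is kept as a string
  };
}

console.log(parseLine("array[5] = 10")); // { index: 5, value: '10' }
console.log(parseLine("something else")); // null
```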
If you want a regular expression approach, then, something like so will do the trick: ^".*?\[(\d+)\]\s*=\s*(\d+)"$. This will match and extract the number you have in your square brackets (\[(\d+)\]) and also any numbers you will have at the end just before the " sign.
Once matched, it will put them into a group which you can then eventually access. Please check this previous SO post to see how you can access said groups.
I can't comment on speed, but usually regular expressions make string processing code more compact, the drawback of which is that the code is usually more difficult to read (depending on the complexity of the expression).
Regex is usually slower than working by finding the index of a given char, in most languages.
In your case, don't use split but only substring at given index.
Moreover, some hints to improve performance: pair[0].trim() is called twice, and the first trim (before the split) is redundant because you already trim each part afterwards.
It's all about algorithms…
Here is a faster implementation :
for (var i = 0; i < lines.length; i++) {
    var i1 = lines[i].indexOf("[");
    var i2 = lines[i].indexOf("]");
    var i3 = lines[i].indexOf("=");
    if (i1 >= 0) {
        // i1 + 1 skips the "[" itself; i3 + 1 skips the "="
        var index = lines[i].substring(i1 + 1, i2);
        var value = lines[i].substring(i3 + 1).trim();
    }
}
If all you want to do is extract the index and value, you don't need to parse the string (which implies tokenising and processing). Just find the bits you want and extract them.
If your strings are always like "array[5] = 10" and the values are always integers, then:
var nums = s.match(/\d+/g); // the g flag is needed to get every number, not just the first
var index = nums[0];
var value = nums[1];
should do the trick. If there is a chance that there will be no matches, then you might want:
var index = nums && nums[0];
var value = nums && nums[1];
and deal with cases where index or value are null to avoid errors.
If you genuinely want to parse the string, there's a bit more work to do.

Can someone explain what this regex do?

It's part of some code where JavaScript should watch a price and match if it's lower than required, but I don't understand regex very well and it's obvious that the error is in there.
So on a website I have a price like
<div class="item_price_now"> $ 1,34 </div>
And on the JavaScript side the code looks like this
var maxprice = '0.98';
var itemprice = document.getElementByClassName('item_price_now');
var i = 0;
var currentprice = itemprice[i].innerHTML.replace(/\s+/g, ' ');
currentprice = currentprice.substring(2);
if (currentprice > maxprice)
{ do some code }
else
{ do some other code }
But this doesn't work. I assume that part of the error is in the regex: with this I don't get any values. I tried to change it to something like this
(\S+\w)
And it outputs something (actually I get an output of 1,34), but I still can't match it against the maxprice variable.
Can someone explain to me what the regex above means, or at least point me in some direction? Thanks.
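/\s+/g means "match any whitespace character (space, tab, newline) that is repeated one or more times, everywhere in the string".
Hence it's replacing every run of whitespace with a single space.
It seems that your problem is that you use locale-formatted strings to describe your values: you're comparing the string 0.98 (which JS can convert to a number) with 1,34 (which JS cannot convert as-is, because it does not treat , as a decimal separator). Note also that when both operands are strings, > compares them character by character rather than numerically, so both values should be converted to numbers first.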
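A minimal sketch of a numeric comparison, assuming the comma is always a decimal separator and the price contains no thousands separators. The helper name parsePrice is hypothetical. In the browser, the text would come from the element itself; note that the DOM method is spelled document.getElementsByClassName (plural), not getElementByClassName.

```javascript
// Hypothetical helper: converts text such as " $ 1,34 " to the number 1.34.
// Assumption: "," is the decimal separator and there are no thousands separators.
function parsePrice(text) {
  var cleaned = text
    .replace(/[^\d,.-]/g, '') // drop currency symbols and whitespace
    .replace(',', '.');       // use the decimal point that parseFloat expects
  return parseFloat(cleaned);
}

var maxPrice = 0.98; // a number, not the string '0.98'
var currentPrice = parsePrice(' $ 1,34 ');

console.log(currentPrice);            // 1.34
console.log(currentPrice > maxPrice); // true
```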

regex - get numbers after certain character string

I have a text string of any length to which I would like to attach an order number at the end. Then I can pluck off the order number when I need to use it again. Since the number can vary in length, I would like a regular expression that catches everything after the = sign in the string ?order_num=
So the whole string would be
"aijfoi aodsifj adofija afdoiajd?order_num=3216545"
I've tried to use an online regular expression generator, but with no luck. Can someone please help me extract the number at the end into a variable, and also put what comes before ?order_num=203823 into its own variable?
I'll post some attempts of my own, but I foresee failure and confusion.
var s = "aijfoi aodsifj adofija afdoiajd?order_num=3216545";
var m = s.match(/([^\?]*)\?order_num=(\d*)/);
var num = m[2], rest = m[1];
But remember that regular expressions can be slower than plain string methods. Use indexOf and substring/slice when you can. For example:
var p = s.indexOf("?");
var num = s.substring(p + "?order_num=".length), rest = s.substring(0, p);
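A slightly more defensive sketch of the same indexOf idea, in case the marker may be missing. The function name splitOrder and the returned object shape are assumptions for illustration. It searches for the whole ?order_num= marker rather than a bare ?, so a question mark elsewhere in the text does not confuse it:

```javascript
// Hypothetical helper: splits "text?order_num=123" into its two parts.
// Returns num === null when the marker is not present.
function splitOrder(s) {
  var marker = '?order_num=';
  var p = s.lastIndexOf(marker); // the order number is appended at the end
  if (p === -1) {
    return { rest: s, num: null };
  }
  return {
    rest: s.substring(0, p),
    num: s.substring(p + marker.length)
  };
}

var parts = splitOrder("aijfoi aodsifj adofija afdoiajd?order_num=3216545");
console.log(parts.rest); // aijfoi aodsifj adofija afdoiajd
console.log(parts.num);  // 3216545
```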
I see no need for regex for this:
var str="aijfoi aodsifj adofija afdoiajd?order_num=3216545";
var n=str.split("?");
n will then be an array, where index 0 is before the ? and index 1 is after.
Another example:
var str="aijfoi aodsifj adofija afdoiajd?order_num=3216545";
var n=str.split("?order_num=");
Will give you the result:
n[0] = aijfoi aodsifj adofija afdoiajd and
n[1] = 3216545
You can take the substring from the first instance of ? onward and then apply the regex to it. This gets rid of most of the complexity in the expression and may improve performance (the gain is probably negligible anyway and not something to worry about unless you are doing this over thousands of iterations). In addition, this will match order_num= at any point within the query string, not just at the very end.
var match = s.substr(s.indexOf('?')).match(/order_num=(\d+)/);
if (match) {
    alert(match[1]);
}
