JavaScript regexp replace, multiline

I have some text content (read in from the HTML using jQuery) that looks like either of these examples:
<span>39.98</span><br />USD
or across multiple lines with an additional price, like:
<del>47.14</del>
<span>39.98</span><br />USD
The numbers could be formatted like
1,234.99
1239,99
1 239,99
etc (i.e. not just a normal decimal number). What I want to do is get just whatever value is inside the <span></span>.
This is what I've come up with so far, but I'm having problems with the multiline approach, and also the fact that there's potentially two numbers and I want to ignore the first one. I've tried variations of using ^ and $, and the "m" multiline modifier, but no luck.
var strRegex = new RegExp(".*<span>(.*?)</span>.*", "g");
var strPrice = strContent.replace(strRegex, '$1');
I could use jQuery here if there's a way to target the span tag inside a string (i.e. it's not the DOM we're dealing with at this point).

You could remove all line breaks from the string first and then run your regex:
strContent = strContent.replace(/(\r\n|\n|\r)/gm,"");
var strRegex = new RegExp(".*<span>(.*?)</span>.*", "g");
var strPrice = strContent.replace(strRegex, '$1');

This is pretty easy with jQuery. Simply wrap your HTML string inside a div and use jQuery as usual:
var myHTML = "<span>Span 1 HTML</span><span>Span 2 HTML</span><br />USD";
var $myHTML = $("<div>" + myHTML + "</div>");
$myHTML.find("span").each(function() {
alert($(this).html());
});

Try using
"[\s\S]*<span>(.*?)</span>[\s\S]*"
instead of
".*<span>(.*?)</span>.*"
EDIT: since you're using a string to define your regex, don't forget to escape your backslashes, so
[\s\S]
would be
[\\s\\S]
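As a rough sketch of the escaping point, the two forms below should build equivalent patterns. The sample input is the two-line example from the question; the variable names are only illustrative.

```javascript
// Sample input from the question: two lines, two prices.
const strContent = "<del>47.14</del>\n<span>39.98</span><br />USD";

// Regex literal: each backslash is written once.
const fromLiteral = /[\s\S]*<span>(.*?)<\/span>[\s\S]*/;

// String form: each backslash is doubled so one survives string parsing.
const fromString = new RegExp("[\\s\\S]*<span>(.*?)</span>[\\s\\S]*");

// [\s\S] matches any character including line breaks, so the whole
// input is consumed and replaced by the captured span content.
const priceFromLiteral = strContent.replace(fromLiteral, "$1");
const priceFromString = strContent.replace(fromString, "$1");

console.log(priceFromLiteral, priceFromString);
```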

You want this?
var str = "<span>39.98</span><br />USD\n<del>47.14</del>\n\n<span>40.00</span><br />USD";
var regex = /<span>([^<]*?)<\/span>/g;
var matches = str.match(regex);
for (var i = 0; i < matches.length; i++)
{
document.write(matches[i]);
document.write("<br>");
}
Test here: http://jsfiddle.net/9LQGK/
The matches array will contain the matches. But it isn't really clear what you want. What does "there's potentially two numbers and I want to ignore the first one" mean?
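Note that match() with the g flag returns the full matches, including the surrounding tags. If only the text inside each span is wanted, one possible sketch uses matchAll (ES2020) and reads capture group 1. The name spanValues is illustrative.

```javascript
const str = "<span>39.98</span><br />USD\n<del>47.14</del>\n\n<span>40.00</span><br />USD";

// matchAll yields one match object per occurrence; index 1 is the
// first capture group, i.e. the text between <span> and </span>.
const spanValues = Array.from(
  str.matchAll(/<span>([^<]*)<\/span>/g),
  (match) => match[1]
);

console.log(spanValues);
```

Because [^<]* never depends on how . treats line breaks, this should work the same whether the input is on one line or several.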

Related

How do I pass a variable into regex with Node js?

So basically, I have a regular expression which is
var regex1 = /10661\" class=\"fauxBlockLink-linkRow u-concealed\">([\s\S]*?)<\/a>/;
var result=text.match(regex1);
user_activity = result[1].replace(/\s/g, "")
console.log(user_activity);
What I'm trying to do is this
var number = 1234;
var regex1 = /${number}\" class=\"fauxBlockLink-linkRow u-concealed\">([\s\S]*?)<\/a>/;
but it is not working, and when I tried with RegExp, I kept getting errors.
You can use RegExp to create regexp from a string and use variables in that string.
var number = 1234;
var regex1 = new RegExp(`${number}aa`);
console.log("1234aa".match(regex1));
You can build the regex string with templates and/or string addition and then pass it to the RegExp constructor. One key to doing that is getting the escaping correct: you need an extra level of escaping for backslashes, because the interpretation of the string takes one level of backslash, but you need one to survive as it gets to the RegExp constructor. Here's a working example:
function match(number, str) {
let r = new RegExp(`${number}" class="fauxBlockLink-linkRow u-concealed">([\\s\\S]*?)<\\/a>`);
return str.match(r);
}
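// sample anchor shaped like the one the regex targets
const exampleHTML = '<a href="/forum/search/member?user_id=1234" class="fauxBlockLink-linkRow u-concealed">Some link text</a>';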
console.log(match(1234, exampleHTML));
Note, using regex to match HTML like this becomes very order-sensitive (whereas the HTML itself isn't order-sensitive). And your regex requires exactly one space between classes, which HTML doesn't. If the class names were in a slightly different order, or the spacing in the <a> tag were different, then it would not match. Depending upon what you're really trying to do, there may be better ways to parse and use the HTML that aren't order-sensitive.
I solved it with the method of Adem,
function escapeRegExp(string) {
return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}
var number = 1234;
var firstPart = `<a href="/forum/search/member?user_id=${number}" class="fauxBlockLink-linkRow u-concealed">`
var regexpString = escapeRegExp(firstPart) + '([\\s\\S]*?)' + escapeRegExp('</a>');
console.log(regexpString)
var sample = ` `
var regex1 = new RegExp(regexpString);
console.log(sample.match(regex1));
In the first place, the issue was actually the way I was reading the file: the data I was applying the match to was undefined.
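Since the root cause here was undefined input, a small guard can make that failure easier to spot. This is only a sketch; firstGroup is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: returns capture group 1, or null when the input
// is not a string or the pattern does not match.
function firstGroup(text, regex) {
  if (typeof text !== "string") {
    return null; // e.g. the file read failed and text is undefined
  }
  const result = text.match(regex);
  return result ? result[1] : null;
}

console.log(firstGroup(undefined, /id=(\d+)/));
console.log(firstGroup("user_id=1234", /id=(\d+)/));
```

Calling match on undefined would throw a TypeError, so checking the type first turns that crash into a value the caller can test.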

Javascript regex For Removing height/width from style

Using HTMLFilter addrules in CKEDITOR, I'm trying to remove the height/width from the STYLE of plain text.
They don't return the actual object just plain text style so I really can't use jQuery or other DOM manipulation tools.
I have the below regex code that successfully removes HEIGHT and WIDTH but still leaves the actual dimensions.
I'm new to regular expressions so I'm sure it's something rather simple. Just not sure what.
Thank you.
var str = "width:100px;height:200px;float:left;";
var regex = /(height|width):(?=(.*?);)/gi;
console.log(str.replace(regex,""));
You used a lookahead, which is a non-consuming pattern, i.e. the text it matches does not become part of the whole match value. Thus, it does not get removed.
Use a pattern like
/(?:height|width):[^;]*;/gi
Details
(?:height|width) - a non-capturing group matching either height or width
: - a colon
[^;]* - a negated character class matching 0+ chars other than ;
; - a semi-colon.
See JS demo:
var str = "width:100px;height:200px;float:left;";
var regex = /(?:height|width):[^;]*;/gi;
console.log(str.replace(regex,""));
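To see the lookahead and the consuming pattern side by side, a short comparison on the question's input may help. The variable names are illustrative.

```javascript
const style = "width:100px;height:200px;float:left;";

// Lookahead: only "width:" / "height:" are part of the match,
// so the values after them stay in the string.
const lookaheadResult = style.replace(/(height|width):(?=(.*?);)/gi, "");

// Consuming pattern: the value and the semicolon are matched as well,
// so the whole declaration is removed.
const consumingResult = style.replace(/(?:height|width):[^;]*;/gi, "");

console.log(lookaheadResult); // 100px;200px;float:left;
console.log(consumingResult); // float:left;
```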
A non-regex solution using JavaScript built-in methods to remove the height/width from the STYLE of plain text:
function isNotWidthHeight(style) {
return style.toLowerCase().indexOf("width") === -1 && style.toLowerCase().indexOf("height") === -1 && style;
}
var str = "margin:0 auto;width:100px;height:200px;float:left;";
var array = str.split(';').filter(isNotWidthHeight);
console.log(array.join(';'));
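One caveat with the indexOf filter above: it drops any declaration that merely contains "width" or "height", so max-width or line-height would be removed too. A possible variant compares the property name exactly. The function name stripDimensions is hypothetical, and the sketch assumes no declaration value contains a semicolon.

```javascript
// Hypothetical helper: removes only the exact "width" and "height"
// properties from an inline style string.
function stripDimensions(style) {
  return style
    .split(";")
    .map((declaration) => declaration.trim())
    .filter((declaration) => {
      // Property name is everything before the first colon.
      const property = declaration.split(":")[0].trim().toLowerCase();
      return property !== "" && property !== "width" && property !== "height";
    })
    .join(";");
}

console.log(stripDimensions("margin:0 auto;max-width:50%;width:100px;height:200px;float:left;"));
```

The result has no trailing semicolon, because the empty piece after the last ";" is filtered out.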
You need to consume the values too.
Using .*?; in place of the lookahead (?=(.*?);) is enough.
var str = "width:100px;height:200px;float:left;";
var regex = /(height|width):.*?;/gi;
console.log(str.replace(regex,""));
Pretty close; you just need to consume everything up to either the ; or the end of the string. This will grab any value, including calc() or whatever else can follow, until the ; or the end of the inline style.
var str = "width:100px;height:200px;float:left;";
var str2 = "width:calc(100vh - 20px);height:100%;float:left;";
var regex = /(?:width|height):[^;]*(?:;|$)/gi;
console.log(str.replace(regex,""));
console.log(str2.replace(regex,""));

How to remove strings before nth character in a text?

I have a dynamically generated text like this
xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0
How can I remove everything before Map ...? I know there is a hard coded way to do this by using substring() but as I said these strings are dynamic and before Map .. can change so I need to do this dynamically by removing everything before 4th index of - character.
You could remove the first four hyphens, together with the characters before each of them, from the start of the string.
var string = 'xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0',
stripped = string.replace(/^([^-]*-){4}/, '');
console.log(stripped);
I would just find the index of Map and use it to slice the string:
let str = "xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0"
let ind = str.indexOf("Map")
console.log(str.slice(ind))
If you prefer a regex (or you may have occurrences of Map in the prefix), you can match exactly what you want with:
let str = "xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0"
let arr = str.match(/^(?:.+?-){4}(.*)/)
console.log(arr[1])
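One thing to watch with the indexOf approach above: when the marker is missing, indexOf returns -1 and slice(-1) returns only the last character. A guarded sketch follows; fromMarker is a hypothetical helper name, and returning the input unchanged on a miss is just one possible choice.

```javascript
// Hypothetical helper: returns the text starting at the first
// occurrence of marker, or the text unchanged when marker is absent.
function fromMarker(text, marker) {
  const index = text.indexOf(marker);
  return index === -1 ? text : text.slice(index);
}

console.log(fromMarker("xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0", "Map"));
console.log(fromMarker("no marker here", "Map"));
```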
I would just split on the word Map, drop the first piece, and put back the Map that split() removes:
var splitUp = 'xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0'.split('Map')
var result = 'Map' + splitUp.slice(1).join('Map')
Using String.replace with a regular expression is probably the most popular solution.
But the OP states: "so I need to do this dynamically by removing everything before 4th index of - character."
Based on that, I think another solution is to split('-') first, then join the pieces after the 4th -.
let test = 'xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0'
console.log(test.split('-').slice(4).join('-'))
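The split/slice/join idea generalises to any delimiter and count. A minimal sketch, where dropSegments is a hypothetical name:

```javascript
// Hypothetical helper: removes the first `count` delimiter-terminated
// segments from `text`.
function dropSegments(text, delimiter, count) {
  return text.split(delimiter).slice(count).join(delimiter);
}

console.log(dropSegments("xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0", "-", 4));
```

If the text contains fewer than `count` delimiters, the slice is empty and the result is an empty string, so callers may want to check for that case.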

Trimming whitespace without affecting strings

So, I recently found this example on trimming whitespace, but I've found that it also affects strings in code. For instance, say I'm doing a lesson on string comparison, and to demonstrate that two strings such as "Hello World!" and "Hello  World!", which differ only in whitespace, are different, I need the code compression to not have any effect on those two strings.
I'm using the whitespace compression so that people with different formatting styles won't be punished for using something that I don't use. For instance, I like to format my functions like this:
function foo(){
    return 0;
};
While others may format it like this:
function foo()
{
    return 0;
};
So I use whitespace compression around punctuation to make sure it always comes out the same, but I don't want it to affect anything within a string. Is there a way to add exceptions in JavaScript's replace() function?
UPDATE:
check this jsfiddle
var str='dfgdfg fdgfd fd gfd g print("Hello World!"); sadfds dsfgsgdf'
var regex=/(?:(".*"))|(\s+)/g;
var newStr=str.replace(regex, '$1 ');
console.log(newStr);
console.log(str);
This code processes everything except the quoted strings.
To play with the code more comfortably, you can see how the regex works here:
https://regex101.com/r/tG5qH2/1
I made a jsfiddle here: https://jsfiddle.net/cuywha8t/2/
var stringSplitRegExp = /(".+?"|'.+?')/g;
var whitespaceRegExp = /\s+\{/g;
var whitespaceReplacement = "{"
var exampleCode = `var str = "test test test" + 'asdasd "sd"';\n`+
`var test2 = function()\n{\nconsole.log("This is a string with 'single quotes'")\n}\n`+
`console.log('this is a string with "double quotes"')`;
console.log(exampleCode)
var separatedStrings =(exampleCode.split(stringSplitRegExp))
for(var i = 0; i < separatedStrings.length; i++){
if (i%2 === 1){
continue;
}
var oldString = separatedStrings[i];
separatedStrings[i] = oldString.replace(whitespaceRegExp, whitespaceReplacement)
}
console.log(separatedStrings.join(""))
I believe this is what you are looking for. It handles cases where a string contains double quotes, etc., without modifying them. This example just does the formatting of the curly braces as you mentioned in your post.
Basically, the behavior of split allows the inclusion of the splitter in the array. And since you know the split is always between two non-string elements you can leverage this by looping over and modifying only every even-indexed array element.
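The split behaviour described above can be seen in isolation with a tiny example. The input string is made up for illustration.

```javascript
const source = 'a = "b  c" ;';

// Because the pattern has a capture group, split() keeps the captured
// separators in the result: code at even indexes, strings at odd ones.
const pieces = source.split(/(".+?"|'.+?')/);

console.log(pieces); // [ 'a = ', '"b  c"', ' ;' ]
```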
If you want to do general whitespace replacement you can of course modify the regexp or do multiple passes, etc.
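For general whitespace compression, another option is a single replace() with a callback, which is roughly how "exceptions" can be expressed: the pattern matches string literals first and the callback returns them untouched. This is a sketch under assumptions. compressOutsideStrings is a hypothetical name, and template literals, comments and regex literals are not handled.

```javascript
// Hypothetical helper: collapses each whitespace run to one space,
// except inside single- or double-quoted string literals.
function compressOutsideStrings(code) {
  // Alternatives, tried in order: a double-quoted string, a
  // single-quoted string (both allowing backslash escapes), or a
  // run of whitespace.
  const pattern = /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|\s+/g;
  return code.replace(pattern, (match) =>
    // Whitespace runs start with whitespace; strings start with a quote.
    /^\s/.test(match) ? " " : match
  );
}

console.log(compressOutsideStrings('a   =  "Hello   World!" ;\n\nb'));
```

Once a quoted string has been matched as a whole, the whitespace inside it is never visited by the \s+ alternative, which is what protects it.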

javascript dynamic css styling

I'm trying to get all characters after "."(dot) and set some styling to them with JavaScript.
Example: $10.12 . I want to set some styling to numbers "12".
I have this number dynamically created in phtml file inside span.
I tried something like this, but without success:
var aa = document.getElementById('testdiv').innerHTML; // gets my span
var bb = aa.toString().split(".")[1]; // gets all numbers after "."
bb.setAttribute("style","width: 500px;");
Thanks to everyone! You really helped me. I would vote for every answer, but unfortunately I can't vote yet.
Your mistake begins here:
var aa = document.getElementById('testdiv').innerHTML; // gets my span
That's not your span, but its HTML contents. To take care of setting the width, you need something like this instead:
var aa = document.getElementById('testdiv'); // gets my span
aa.style.width = "500px";
You can only apply styling to HTML elements, not text nodes.
Try this instead:
var elem = document.getElementById('testdiv');
var parts = elem.innerHTML.toString().split(".");
parts[1] = "<div style=\"width: 500px\">" + parts[1] + "</div>";
elem.innerHTML = parts.join(".");
I've used <div> because it's immediately apparent that a style has been applied, but if you want the number to appear consistent, as in "$10.12" without the "12" on a new line, you will probably need to apply additional styles or rethink how you're outputting the HTML.
You cannot set a style on a text node. The workaround is to wrap the characters after the "." in an element such as a span. The idea is simple: split the content on ".", check whether there was a "." at all, and if so wrap the second part in an element and style it. Finally, join the parts back together with ".".
var inner = document.getElementById('testdiv').innerHTML;
var arr = inner.toString().split(".");
if(arr.length > 1)
{
arr[1] = '<span style="display: inline-block; width: 500px;">' + arr[1] + '</span>';
}
var newContent = arr.join(".");
document.getElementById('testdiv').innerHTML = newContent;
You could do something like this:
document.getElementById('testdiv').innerHTML = document.getElementById('testdiv').innerHTML.replace( /(\d+)\.(\d+)/, '$1.<span id="end">$2</span>' );
document.getElementById('end').style.fontWeight = 'bold';
Your example fails at bb.setAttribute since you're trying to set an attribute on a string instead of a node. What you need to do is essentially rebuild the 10.12 with <span> elements surrounding the text you want to alter, and then you can use other JavaScript methods to modify the styling. The method you were using was almost correct, except the last part won't work because the split() method returns an array of strings, not nodes.
You can do this with regexp:
onlyDigitsText = text.replace(/\.([0-9]*)/g, ".<span class='highlighted'>$1</span>");
Try
var elem = document.getElementById('testdiv');
elem.innerHTML = elem.innerHTML.replace( /(\d+)\.(\d+)/g, '$1.<span class="decimalPart">$2</span>' );
// (\d+) one or more digits
// \. a dot character. Must be escaped otherwise it means ANY character
// (\d+) one or more digits
// g regex flag to replace all instances, not just one.
Then in your css add styling for the decimalPart class
.decimalPart {
display: inline-block; /* width has no effect on a plain inline span */
width: 500px;
}
This has the added advantage of separating your styles from your html.
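The string transformation can also be tried without a DOM. The sketch below applies the same pattern to plain strings; wrapDecimals is a hypothetical name, and the second call shows why the dot needs escaping.

```javascript
// Hypothetical helper: wraps the digits after a decimal point in a span.
function wrapDecimals(html) {
  return html.replace(/(\d+)\.(\d+)/g, '$1.<span class="decimalPart">$2</span>');
}

console.log(wrapDecimals("$10.12"));

// With an unescaped dot, "." matches any character, so a plain number
// such as "1012" would be split and rewritten. The escaped version
// leaves it alone.
console.log(wrapDecimals("1012"));
```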
UPDATE
Following your comment to get the character just before the number use
elem.innerHTML = elem.innerHTML.replace( /(\s)([^\s\d]*?)(\d+)\.(\d+)/g, '$1<span class="currencySymbol">$2</span>$3.<span class="decimalPart">$4</span>' );
// (\s) space, tab, carriage return, new line, vertical tab, form feed
// ([^\s\d]*?) zero or more characters that are neither whitespace nor a digit
// (\d+) one or more digits
// \. a dot character. Must be escaped otherwise it means ANY character
// (\d+) one or more digits
// g regex flag to replace all instances, not just one.
Please note I have made an allowance for currency symbols that take up more than a single character.
