Let's say I have this in my HTML:
<div id="myDiv" class="a b-32"/>
I'm trying to get the index of the 'b' class (in my example, '32').
The only way I know how to do it is:
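var index;
var myDiv = $("#myDiv").attr("class");
var classes = myDiv.split(' ');
// Arrays have a length property, not a size() method
for (var i = 0; i < classes.length; i++) {
  // Test each class name against a regex literal of the form b-<digits>
  if (/^b-\d+$/.test(classes[i])) {
    index = classes[i].replace("b-", "");
  }
}
alert(index);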
Is there any solution that doesn't involve iterating over all the classes manually? My solution seems dull. Surely there must be a better way, I just can't find it.
Y'know, for all that people claim jQuery makes it so you have to write less code, Vanilla JS is surprisingly good at one-liners :p
alert((document.getElementById('myDiv').className
.match(/(?:^| )b-(\d+)/) || [0,0])[1]);
(Whitespace added for readability)
Returns 0 if myDiv doesn't have a b-number class.
EDIT: As @A.Wolff pointed out in a comment on your question, you may wish to consider this:
<div id="myDiv" class="a" data-b="32"></div>
Then you can get:
alert(document.getElementById('myDiv').getAttribute("data-b"));
A regular expression can help:
var index;
var myDivClasses = $("#myDiv").attr("class");
var cls = myDivClasses.match(/(?:^| )b-(\d+)(?:$| )/);
if (cls) {
index = cls[1];
}
(Use parseInt if you want it as a number.)
That looks for b-### where ### is one or more digits, with either a space or a boundary (start of string, end of string) on either side, and extracts the digits.
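As a rough sketch of the same idea packaged for reuse, the function below works on a plain class string, so it can be tried without a DOM. The name `getClassIndex` and its `prefix` parameter are hypothetical (they are not part of jQuery or the DOM API), and it uses `\s` instead of a literal space on the assumption that tabs or newlines between class names should also count as separators.

```javascript
// Hypothetical helper: extracts the number from a "<prefix>-<digits>" class name.
// Returns the number, or null when no such class is present.
function getClassIndex(className, prefix) {
  // Assumes prefix contains no regex metacharacters (for example "b").
  var re = new RegExp("(?:^|\\s)" + prefix + "-(\\d+)(?:\\s|$)");
  var match = re.exec(className);
  return match ? parseInt(match[1], 10) : null;
}

console.log(getClassIndex("a b-32", "b")); // 32
console.log(getClassIndex("a ab-7", "b")); // null
```

With jQuery this could be called as `getClassIndex($("#myDiv").attr("class"), "b")`.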
Related
I've a problem. I'm currently looking for a way to remove any HTML elements from a string. But there are two conditions:
The content of the elements should be kept
Special elements with a defined class should not be removed
I've already tried lots of things and looked at plenty of questions and answers on SO, but I can't really make sense of any of the answers. This exceeds my abilities by far, but I would like to know how something like this works.
Question/Answers I've tried:
How to strip HTML tags from string in JavaScript?,
Strip HTML from Text JavaScript
So when I have for example a string like this:
You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span>
It should look like this after stripping:
You have to pay <div class="keep-this">$200</div> per month for your car
I've actually tried the following things:
jQuery(document).ready(function ($) {
let string = 'You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span>';
console.log(string);
function removeHTMLfromString(string) {
let tmp = document.createElement("DIV");
tmp.innerHTML = string;
return tmp.textContent || tmp.innerText || "";
}
console.log(removeHTMLfromString(string));
console.log(string.replace(/<[^>]*>?/gm, ''));
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
And I've also tried out a regex tool to see what gets removed, but unfortunately, I'm not making much progress here either:
https://www.regexr.com/50qar
I would love if someone can help me with this task. Thanks a lot!
Update
Maybe there is a way doing it with just a regex? If yes, how can I exclude my elements with a special class when using this regex: /<\/?[^>]+(>|$)/g?
The code is a little long, but I think it may help you.
let str = 'You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span> <div class="keep-this">$500</div> also';
const el = document.createElement("div");
el.innerHTML = str;
// Get all the elements to keep
const keep = el.querySelectorAll(".keep-this");
// Replace the keeping element from the original string
// With special pattern and index so that we can replace
// the pattern with original keeping element
keep.forEach((v, i) => {
const keepStr = v.outerHTML;
str = str.replace(keepStr, `_k${i}_`);
});
// Replace created element's innerHTML by patternised string.
el.innerHTML = str;
// Get the text only
let stringify = el.innerText;
// Replace patterns from the text string by keeping element
keep.forEach((v,i) => {
const keepStr = v.outerHTML;
stringify = stringify.replace(`_k${i}_`, keepStr);
});
console.log(stringify);
Leave me a comment if anything is unclear.
Update: Regular Expression approach
The same task can be done with regular expressions. The approach is:
Find all the elements to keep with a regex and store them.
Replace each of those elements in the input string with a unique placeholder pattern.
Remove all the HTML tags from the string.
Replace the placeholder patterns with the stored elements.
let htmlString = 'You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span> Another <div class="keep-this">$400</div> here';
// RegExp for keep elements
const keepRegex = /<([a-z1-6]+)\s+(class=[\'\"](keep-this\s*.*?)[\'\"])[^>]*>.*?<\/\1>/ig;
// RegExp for opening tag
const openRegex = /<([a-z1-6]+)\b[^>]*>/ig;
// RegExp for closing tag
const closeRegex = /<\/[a-z1-6]+>/ig;
// Find all the matches for the keeping elements
const matches = [...htmlString.matchAll(keepRegex)];
// Replace the input string with any pattern so that it could be replaced later
matches.forEach((match, i) => {
htmlString = htmlString.replace(match[0], `_k${i}_`);
});
// Remove opening tags from the input string
htmlString = htmlString.replace(openRegex, '');
// Remove closing tags from the input string
htmlString = htmlString.replace(closeRegex, '');
// Replace the previously created pattern by keeping element
matches.forEach((match, index) => {
htmlString = htmlString.replace(`_k${index}_`, match[0]);
})
console.log(htmlString);
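The four steps can also be folded into one function, shown here as a minimal sketch rather than a robust HTML sanitizer. The name `stripTagsExcept` and the `\u0000` placeholder are assumptions made for illustration. It assumes kept elements are not nested in one another, contain no newlines, and use quoted class attributes. The restore step uses a callback, so a `$` inside a kept element (as in `$200`) is never treated as a replacement pattern.

```javascript
// Hypothetical helper: strips every tag except elements carrying keepClass.
function stripTagsExcept(html, keepClass) {
  // Escape regex metacharacters in the class name.
  const cls = keepClass.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Matches <tag ... class="... keepClass ..."> ... </tag> for the same tag name.
  const keepRegex = new RegExp(
    "<([a-z][a-z0-9]*)\\b[^>]*\\bclass=[\"'](?:[^\"']*\\s)?" + cls +
      "(?:\\s[^\"']*)?[\"'][^>]*>.*?</\\1>",
    "gi"
  );
  const kept = [];
  // Steps 1 and 2: store each kept element and leave a numbered placeholder.
  const marked = html.replace(keepRegex, (element) => {
    kept.push(element);
    return "\u0000" + (kept.length - 1) + "\u0000";
  });
  // Step 3: remove all remaining opening and closing tags.
  const stripped = marked.replace(/<\/?[a-z][^>]*>/gi, "");
  // Step 4: put the stored elements back in place of the placeholders.
  return stripped.replace(/\u0000(\d+)\u0000/g, (_, i) => kept[Number(i)]);
}

const input = 'You have to pay <div class="keep-this">$200</div> per ' +
  '<span class="date">month</span> for your <span class="vehicle">car</span>';
console.log(stripTagsExcept(input, "keep-this"));
// You have to pay <div class="keep-this">$200</div> per month for your car
```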
If the date and vehicle elements and their classes come from another function, you should just remove them there.
How do I replace all instances of digits within a string pattern with that digit plus an offset?
Say I want to replace the number in every <ol start="N"> tag with that number plus an offset:
strRegEx = /<ol start="(\d+)">/gi;
strContent = strContent.replace(strRegEx, function() {
/* return $1 + numOffset; */
});
@Tomalak is right, you shouldn't really use regexes with HTML; you should use the browser's own HTML DOM or an XML parser.
For example, if that tag also had another attribute assigned to it, such as a class, the regex will not match it.
<ol start="#" > does not equal <ol class="foo" start="#">.
There is no way to use regexes for this, you should just go through the DOM to find the element you are looking for, grab its attributes, check to see if they match, and then go from there.
function replaceWithOffset(offset) {
  var elements = document.getElementsByTagName("ol");
  for (var i = 0; i < elements.length; i++) {
    if (elements[i].hasAttribute("start")) {
      // Parse the current start value as a base-10 integer, then add the offset
      elements[i].setAttribute("start", parseInt(elements[i].getAttribute("start"), 10) + offset);
    }
  }
}
A plain replacement string doesn't allow that, since a $1 reference can't be used in arithmetic, so doing what you need requires a bit more effort.
Executing a global regex multiple times (with .exec()) returns subsequent results until no more matches are available and null is returned. You can use that in a while loop, then use the returned match to substring the original input and perform your modifications manually.
var strContent = "<ol start=\"1\"><ol start=\"2\"><ol start=\"3\"><ol start=\"4\">"
var strRegEx = /<ol start="(\d+)">/g;
var match = null
while (match = strRegEx.exec(strContent)) {
var tag = match[0]
var value = match[1]
var rightMark = match.index + tag.length - 2
var leftMark = rightMark - value.length
strContent = strContent.substr(0, leftMark) + (1 + +value) + strContent.substr(rightMark)
}
console.log(strContent)
Note: as @tomalak said, parsing HTML with regexes is generally a bad idea. But if you're parsing just a piece of content of which you know the precise structure beforehand, I don't see any particular issue ...
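For comparison, String.prototype.replace also accepts a function as its second argument. The function receives the whole match followed by each captured group, and its return value becomes the replacement. The sketch below uses that to do the arithmetic directly. The wrapper name `offsetOlStart` is hypothetical (`strContent` and `numOffset` are borrowed from the question), and the same caveat applies: it only matches tags written exactly as `<ol start="N">`.

```javascript
// Sketch: add numOffset to the start value of every <ol start="N"> tag in a string.
function offsetOlStart(strContent, numOffset) {
  return strContent.replace(/<ol start="(\d+)">/gi, function (tag, digits) {
    // digits is the text captured by (\d+); convert it to a number before adding
    return '<ol start="' + (parseInt(digits, 10) + numOffset) + '">';
  });
}

console.log(offsetOlStart('<ol start="1"><ol start="9">', 5));
// <ol start="6"><ol start="14">
```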
Regular expressions are very powerful. However, the result they return is sometimes useless:
For example:
I want to manage a CSV string using semicolons.
I define a string like:
var data = "John;Paul;Pete;Stuart;George";
If I use the instruction:
var tab = data.match(/;/g)
after which "tab" contains an array of 4 ";":
tab[0]=";", tab[1]=";", tab[2]=";", tab[3]=";"
This array is not useful in the present case, because I knew it even before using the regular expression.
Indeed, what I want to do is 2 things:
First: suppress the 4th element (not "Stuart" as "Stuart", but "Stuart" as the 4th element).
Second: replace the 3rd element with "Ringo" so as to get back (to where you once belonged!) the following result:
data == "John;Paul;Ringo;George";
In this case, I would greatly prefer to obtain an array giving the positions of semicolons:
tab[0]=4, tab[1]=9, tab[2]=14, tab[3]=21
instead of the useless (in this specific case)
tab[0]=";", tab[1]=";", tab[2]=";", tab[3]=";"
So, here's my question: Is there a way to obtain this numeric array using regular expressions?
To get tab[0]=4, tab[1]=9, tab[2]=14, tab[3]=21, you can do:
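var tab = [];
var startPos = 0;
var data = "John;Paul;Pete;Stuart;George";
while (true) {
  var currentIndex = data.indexOf(";", startPos);
  if (currentIndex == -1) {
    break;
  }
  tab.push(currentIndex);
  // Resume just past the semicolon that was found; otherwise indexOf
  // returns the same position again and the loop never ends
  startPos = currentIndex + 1;
}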
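Since the question asks specifically about regular expressions: every match produced by a global regex carries an `index` property, so the positions can be collected that way as well. This is a small sketch assuming an environment that supports String.prototype.matchAll (ES2020); the helper name `findPositions` is made up for illustration.

```javascript
// Hypothetical helper: returns the start index of every match of pattern in text.
function findPositions(text, pattern) {
  // matchAll requires the g flag, so build a global copy of the pattern if needed
  var flags = pattern.flags.indexOf("g") === -1 ? pattern.flags + "g" : pattern.flags;
  var globalPattern = new RegExp(pattern.source, flags);
  return Array.from(text.matchAll(globalPattern), function (m) {
    return m.index;
  });
}

console.log(findPositions("John;Paul;Pete;Stuart;George", /;/));
// [4, 9, 14, 21]
```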
But if the result wanted is "John;Paul;Ringo;George", you can do
var tab = data.split(';'); // Split the string into an array of strings
tab.splice(3, 1); // Suppress the 4th element
tab[2] = "Ringo"; // Replace the 3rd element by "Ringo"
var str = tab.join(';'); // Join the elements of the array into a string
The second approach is probably better in your case.
String.split
Array.splice
Array.join
You should try a different approach, using split.
tab = data.split(';') will return an array of the form
tab[0]="John", tab[1]="Paul", tab[2]="Pete", tab[3]="Stuart", tab[4]="George"
You should be able to achieve your goal with this array.
Why use a regex to perform this operation? You have a built-in function split, which can split your string based on the delimiter you pass.
var data = "John;Paul;Pete;Stuart;George";
var temp=data.split(';');
// temp[0], temp[1], ... now hold the individual names
I'm trying to cut out some text from a scraped site and I'm not sure what functions or libraries I can use to make this easier.
Example of code I run from PhantomJS:
var latest_release = page.evaluate(function () {
// everything inside this function is executed inside our
// headless browser, not PhantomJS.
var links = $('[class="interesting"]');
var releases = {};
for (var i=0; i<links.length; i++) {
releases[links[i].innerHTML] = links[i].getAttribute("href");
}
// it's important to note that page.evaluate needs
// to return a simple object, meaning DOM elements won't work.
return JSON.stringify(releases);
});
The elements with class interesting have what I need, surrounded by newlines and tabs and whatnot.
Here it is:
{"\n\t\t\t\n\t\t\t\tI_Am_Interesting\n\t\t\t\n\t\t":null,"\n\t\t\t\n\t\t\t\tI_Am_Interesting\n\t\t\t\n\t\t":null,"\n\t\t\t\n\t\t\t\tI_Am_Interesting\n\t\t\t\n\t\t":null}
I tried string.split("\n"); and nothing happened. I really want an effective way to cut out strings like this, based on their relationship to those \n's and \t's.
By the way this was my split code:
var x = latest_release.split('\n');
Cheers.
It's a simple case of stripping out all whitespace, a job that regexes do beautifully.
var s = " \n\t\t\t\n\t\t\t\tI Am Interesting\n\t\t \t \n\t\t";
s = s.replace(/[\r\t\n]+/g, ''); // remove all non space whitespace
s = s.replace(/^\s+/, ''); // remove all space from the front
s = s.replace(/\s+$/, ''); // remove all space at the end :)
console.log(s);
Further reading: https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/RegExp
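Applied to the JSON string from the question, the cleanup might look like the sketch below. It leans on the standard String.prototype.trim, which removes leading and trailing whitespace (including \n and \t) in one call. The helper name `trimKeys` and the sample data are assumptions for illustration. Keys that become identical after trimming would overwrite each other.

```javascript
// Hypothetical helper: takes the JSON string returned from page.evaluate and
// returns an object whose keys have their surrounding whitespace removed.
function trimKeys(json) {
  var raw = JSON.parse(json);
  var cleaned = {};
  Object.keys(raw).forEach(function (key) {
    cleaned[key.trim()] = raw[key];
  });
  return cleaned;
}

// Made-up sample in the same shape as the output shown in the question
var sample = '{"\\n\\t\\t\\tI_Am_Interesting1\\n\\t\\t":null,"\\n\\tI_Am_Interesting2\\n":"/x"}';
console.log(Object.keys(trimKeys(sample)));
```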
var interesting = {
"\n\t\t\t\n\t\t\t\tI_Am_Interesting1\n\t\t\t\n\t\t":null,
"\n\t\t\t\n\t\t\t\tI_Am_Interesting2\n\t\t\t\n\t\t":null,
"\n\t\t\t\n\t\t\t\tI_Am_Interesting3\n\t\t\t\n\t\t":null
};
var found = [];
for (var x in interesting) {
  // Each key holds a single run of word characters, so match() returns a one-element array
  found.push(x.match(/\w+/g));
}
alert(found);
Could you try "\\n" as the pattern? Your \n may be interpreted as a plain string rather than a special character.
new_string = string.replace(/\n/g, "").replace(/\t/g, "");
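One detail worth knowing when chaining replace calls: when the pattern is a string, replace swaps only the first occurrence, whereas a regex with the g flag swaps every occurrence. A quick sketch of the difference, with variable names chosen purely for illustration:

```javascript
var text = "\n\t\tI_Am_Interesting\n\t";

// String patterns: only the first "\n" and the first "\t" are removed
var firstOnly = text.replace("\n", "").replace("\t", "");

// Global regex: every newline and tab is removed
var all = text.replace(/[\n\t]/g, "");

console.log(JSON.stringify(firstOnly)); // "\tI_Am_Interesting\n\t"
console.log(JSON.stringify(all));       // "I_Am_Interesting"
```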
I can't seem to build a good regular expression (in JavaScript) that extracts each attribute from an XML node. For example,
<Node attribute="one" attribute2="two" n="nth"></node>
I need an expression to give me an array of
['attribute="one"', 'attribute2="two"' ,'n="nth"']
...
Any help would be appreciated. Thank you
In case you missed Kerrek's comment:
you can't parse XML with a regular expression.
And the link: RegEx match open tags except XHTML self-contained tags
You can get the attributes of a node by iterating over its attributes property:
function getAttributes(el) {
var r = [];
var a, atts = el.attributes;
for (var i=0, iLen=atts.length; i<iLen; i++) {
a = atts[i];
r.push(a.name + ': ' + a.value);
}
alert(r.join('\n'));
}
Of course, you probably want to do something other than just put them in an alert.
Here is an article on MDN that includes links to relevant standards:
https://developer.mozilla.org/En/DOM/Node.attributes
Try this:
<script type="text/javascript">
var myregexp = /<node((\s+\w+=\"[^\"]+\")+)><\/node>/im;
var match = myregexp.exec("<Node attribute=\"one\" attribute2=\"two\" n=\"nth\"></node>");
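if (match != null) {
  // match[1] holds the run of attributes, for example ' attribute="one" attribute2="two" n="nth"'
  var result = match[1].trim();
  var arrayAttrs = result.split(/\s+/);
  alert(arrayAttrs);
}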
</script>
I think you could get it using the following. You would want the second and third matching group.
<[\w\d\-_]+\s+(([\w\d\-_]+)="(.*?)")*>
The regex is /\w+="\w+"/g (note the g flag for global matching).
You can try it right now in your Firebug / Chrome console by doing:
var matches = '<Node attribute="one" attribute2="two" n="nth"></node>'.match(/\w+="\w+"/g)
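A slightly more tolerant variation, offered as a sketch rather than a real XML parser: it first isolates the opening tag so text content is never scanned, and it accepts attribute names containing hyphens, colons or dots, and values containing anything but a double quote. The name `extractAttributes` is hypothetical, and the sketch assumes double-quoted values that do not contain a `>` character.

```javascript
// Hypothetical helper: returns the name="value" pairs of the first opening tag.
function extractAttributes(xml) {
  // Isolate the first opening tag (skips closing tags, comments and declarations)
  var tag = /<[^\/!?][^>]*>/.exec(xml);
  if (!tag) {
    return [];
  }
  // match() returns null when there are no attributes, so fall back to an empty array
  return tag[0].match(/[\w:.-]+="[^"]*"/g) || [];
}

console.log(extractAttributes('<Node attribute="one" attribute2="two" n="nth"></node>'));
```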