Matching a JS string with regex - javascript

I have a long raw XML message stored as a string. A sample is below.
<tag1>val</tag1><tag2>val</tag2><tagSomeNameXYZ/>
I want to search this string and find out whether it contains an empty tag such as <tagSomeNameXYZ/>. The thing is, the value of SomeName can change depending on context. I've tried Str.match(/tagSomeNameXYZ/g) and Str.match(/<tag.*.XYZ\/>/g) to find out if it contains exactly that string, but I am unable to get either to return anything useful. I'm having trouble writing a regex that matches something like <tag*XYZ/>, where * stands for SomeName (which I'm not interested in).
Tl;dr: How do I pick out <tagSomeNameXYZ/> from the string? The format is <constant variableName constant/>.
Example patterns that it should match:
<tagGetIndexXYZ/>
<tagGetAllIndexXYZ/>
<tagGetFooterXYZ/>

The issue with Str.match(/<tag.*.XYZ\/>/g) is that .* is greedy: it takes everything it can and does not stop at the first XYZ as you wish. So you need a way to stop early (e.g. [^/]* means keep taking characters until you reach a /) and then work back from there (the slice, which drops the trailing XYZ).
Does this help?
const testString = "<tagGetIndexXYZ/>";
const res = testString.match(/<tag([^/]*)\/>/)[1].slice(0, -3);
console.log(res); // "GetIndex"
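If the goal is to find every empty tag of this shape in the full message rather than in a single test string, a minimal sketch could look like the following. It assumes the prefix is literally tag and the suffix is literally XYZ, as in the question; the names xml, pattern, and names are only for illustration.

```javascript
// Sample message from the question (with the first closing tag written as </tag1>).
const xml = "<tag1>val</tag1><tag2>val</tag2><tagSomeNameXYZ/>";

// [^<>\/]* cannot cross a tag boundary, so each match stays inside one tag.
const pattern = /<tag([^<>\/]*)XYZ\/>/g;

// matchAll yields one match per tag; group 1 is the variable middle part.
const names = [...xml.matchAll(pattern)].map((m) => m[1]);

console.log(names);            // ["SomeName"]
console.log(names.length > 0); // true, the message contains such a tag
```

Capturing the middle part directly avoids the slice step, at the cost of hard-coding XYZ in the pattern.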

Related

How do I replace string within quotes in javascript?

I have this in a javascript/jQuery string (This string is grabbed from an html ($('#shortcode')) elements value which could be changed if user clicks some buttons)
[csvtohtml_create include_rows="1-10"
debug_mode="no" source_type="visualizer_plugin" path="map"
source_files="bundeslander_staple.csv" include cols="1,2,4" exclude cols="3"]
In a textbox (named incl_sc) I have the value:
include cols="2,4"
I want to replace include_cols="1,2,4" from the above string with the value from the textbox.
so basically:
How do I replace include_cols values here? (include_cols="2,4" instead of include_cols="1,2,4") I'm great at many things but regex is not one of them. I guess regex is the thing to use here?
I'm trying this:
var s = $('#shortcode').html();
//I want to replace include cols="1,2,4" exclude cols="3"
//with include_cols="1,2" exclude_cols="3" for example
s.replace('/([include="])[^]*?\1/g', incl_sc.val() );
but I don't get any replacement at all (the string s is the same string as $("#shortcode").html()). Obviously I'm doing something really dumb. Please help :-)
In short, what you need is
s = s.replace(/include cols="[^"]+"/g, incl_sc.val());
There were a few problems with your code:
To use a regex with String.prototype.replace, you must pass a regex as the first argument, but you were passing a string.
This is a regex literal: /regex/, while this is not: '/actually a string/'
replace does not modify s; it returns a new string, so you have to assign the result.
In the text you supplied in your question, include_cols is written as include cols (with a space).
And your regex was malformed. I recommend testing patterns in an online regex tester, where you can also learn more if you want.
The code above will replace the part include cols="1,2,4" with whatever is in the textbox, regardless of what's between the quotes (as long as it doesn't contain another quote).
First of all, I think you need to remove the quotes around the regex and fix the pattern a little. Note that [^"]* stops at the closing quote, whereas .* would run on to the last quote on the line.
const r = /(include_cols=")([^"]*)(")/g;
s = s.replace(r, `$1${incl_sc.val()}$3`);
Basically, I group the first and last parts so I can include them in the replacement. You can also skip the first and last groups and write them literally in the second argument of replace, like this:
const r = /include_cols="[^"]*"/g;
s = s.replace(r, `include_cols="${incl_sc.val()}"`);
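The same idea can be tried without jQuery. The sketch below is only an illustration: replaceIncludeCols is a hypothetical helper, the shortcode is a shortened version of the one in the question, and the pattern accepts both spellings (include cols and include_cols) because the question uses both. It assumes the textbox holds the whole attribute, as in the question.

```javascript
// Hypothetical helper: swap the whole include_cols attribute for newAttr.
function replaceIncludeCols(shortcode, newAttr) {
  // [ _] accepts either "include cols" or "include_cols".
  // A replacer function is used so that a "$" in newAttr is inserted literally.
  return shortcode.replace(/include[ _]cols="[^"]*"/g, () => newAttr);
}

const shortcode =
  '[csvtohtml_create include_rows="1-10" include_cols="1,2,4" exclude_cols="3"]';

// Strings are immutable, so the result has to be captured.
const updated = replaceIncludeCols(shortcode, 'include_cols="2,4"');

console.log(updated);
// [csvtohtml_create include_rows="1-10" include_cols="2,4" exclude_cols="3"]
```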

Match words that contain specific symbols

It seems that I'm stuck on something simple, but I was unable to find a similar question on Stack Overflow.
Using JavaScript/jQuery/regexp, I want to match a word in a string that contains specific symbols.
I.e. given the string 'check out mydomain/folder/#something', if I run this kind of search with the symbols 'folder/#', it must return the whole mydomain/folder/#something.
In fact I want to use this to replace the whole link in a string with some kind of widget button, but as those links are pretty specific (i.e. I know that they will contain folder/#), using a library for this task would be overkill.
Here is the regexp you are looking for: /[\w\/]*\/folder\/#[\w\/]*/
var str = 'check out mydomain/folder/#something';
// returns ["mydomain/folder/#something"]
str.match(/[\w\/]*\/folder\/#[\w\/]*/)
Or, for a more robust version of it: /[\w\/\.]*\/folder\/#(?:[\w\/]|\.\w+)*/
That one accepts dots inside the names but ignores a trailing dot, such as a full stop at the end of a sentence.
For instance, 'check out my.domain/foo/folder/#some.thing.' will return ["my.domain/foo/folder/#some.thing"]
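Since the stated goal is to replace the whole link with a widget, here is a minimal sketch of that step using the more robust pattern with the g flag. The button markup and the name replaceLinks are hypothetical placeholders, not part of the original answer.

```javascript
// Same pattern as above, with the g flag so every link in the text is replaced.
const linkPattern = /[\w\/.]*\/folder\/#(?:[\w\/]|\.\w+)*/g;

// Hypothetical helper: wrap each matched link in placeholder button markup.
function replaceLinks(text) {
  return text.replace(
    linkPattern,
    (link) => `<button data-link="${link}">Open</button>`
  );
}

console.log(replaceLinks("check out mydomain/folder/#something now"));
// check out <button data-link="mydomain/folder/#something">Open</button> now
```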

Bookmarklet - Verify URL format and extract substring

I'm trying to build a bookmarklet that performs a service client side, but I'm not really fluent in JavaScript. In my code below I want to take the current page URL and first verify that it follows a specific format after the domain, which is...
/photos/[any alphanumeric string]/[any numeric string]
After that 3rd "/" there should always be the numeric string that I need to extract into a var. Also, I can't just start from the end and work backwards, because sometimes there is another "/" after the numeric string, followed by other things I don't need.
Is indexOf() the right function to verify if the url is the specific format and how would I write that expression? I've tried several things related to indexOf() and Regex(), but had no success. I seem to always end up with an unexpected character or it just doesn't work.
And of course the second part of my question is once I know the url is the right format, how do I extract the numeric string into a variable?
Thank you for any help!
javascript:(function(){
// Retrieve the url of the current page
var photoUrl = window.location.pathname;
if(photoUrl.indexOf(/photos/[any alphanumeric string]/[any numeric string]) == true) {
// Extract the numeric substring into a var and do something with it
} else {
// Do something else
}
})();
var id = window.location.pathname.match(/\/photos\/(\w+)\/(\d+)/i);
if (id) alert(id[1]); // use 1 or 2 depending on what you want
else alert('url did not fit expected format');
(EDIT: changed first \d* to \w+ and second \d* to \d+ and dig to id.)
To test strings for patterns and get their parts, you can use regular expressions. The expression for your criteria would look like this:
/^\/photos\/\w+\/(\d+)\/?$/
It matches any string that starts with /photos/, followed by one or more alphanumeric characters (or underscores), a /, and then one or more digits wrapped in a capture group, with an optional / at the end of the string.
So, if we do this:
"/photos/abc123/123".match(/^\/photos\/\w+\/(\d+)\/?$/)
the result will be ["/photos/abc123/123", "123"]. As you might have noticed, capture group is the second array element.
Ready to use function:
var extractNumeric = function (string) {
    var exp = /^\/photos\/\w+\/(\d+)\/?$/,
        out = string.match(exp);
    return out ? out[1] : false;
};
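The question mentions that the numeric string is sometimes followed by another "/" and more path segments, which the $ anchor above would reject. A possible variant, offered only as a sketch, replaces the ending with (?:\/|$) so that the digits must be followed by either a slash or the end of the path. The function name extractPhotoId is made up for this example.

```javascript
// Hypothetical variant: tolerate extra path segments after the numeric id.
function extractPhotoId(pathname) {
  // (?:\/|$) requires the digits to end at a "/" or at the end of the path,
  // so "/photos/abc/12a" is rejected rather than partially matched.
  var match = pathname.match(/^\/photos\/\w+\/(\d+)(?:\/|$)/);
  return match ? match[1] : null;
}

console.log(extractPhotoId("/photos/abc123/123"));        // "123"
console.log(extractPhotoId("/photos/abc123/123/in/set")); // "123"
console.log(extractPhotoId("/photos/abc123/12a"));        // null
console.log(extractPhotoId("/videos/abc123/123"));        // null
```

In a bookmarklet, the argument would be window.location.pathname.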
So, the answers:
Is indexOf() the right function to verify if the url is the specific
format and how would I write that expression? I've tried several
things related to indexOf() and Regex(), but had no success. I seem to
always end up with an unexpected character or it just doesn't work.
indexOf isn't the best choice for the job: it only searches for a literal substring and returns a position, not true or false. You were right to reach for a regular expression; it just needed the correct syntax.
And of course the second part of my question is once I know the url is
the right format, how do I extract the numeric string into a variable?
A regular expression together with the match function lets you test a string for the desired format and get its parts at the same time.

How can I split text on commas not within double quotes, while keeping the quotes?

So I'm trying to split a string in JavaScript, something that looks like this:
"foo","super,foo"
Now, if I use .split(","), it will turn the string into an array containing [0]"foo" [1]"super [2]foo"
However, I only want to split on a comma that sits between quotes, and if I use .split('","'), it will turn into [0]"foo [1]super,foo"
Is there a way to split on a delimiter but keep parts of that delimiter, WITHOUT having to write code to concatenate a value back onto the string?
EDIT:
I'm looking to get [0]"foo",[1]"super,foo" as my result. Essentially, for the way I need to edit certain data, what is in [0] must never change, but the contents of [1] will change depending on what it contains. It will get concatenated back to look like "foo", "I WAS CHANGED", or it will stay the same if the contents of [1] were not something that required a change.
Try this:
'"foo","super,foo"'.replace('","', '"",""').split('","');
For the sake of discussion and for the benefit of everyone finding this page in search of help, here is a more flexible solution:
var text = '"foo","super,foo"';
console.log(text.match(/"[^"]+"/g));
outputs
['"foo"', '"super,foo"']
This works by adding the g (global) flag to the regular expression, which causes match to return an array of all occurrences of /"[^"]+"/. It catches "foo,bar", "foo", and even "['foo', 'bar']". It does not handle nested double quotes: "["foo", "bar"]" returns ['"["', '", "', '"]"'].
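To connect this back to the edit in the question (change one field, then put the string back together), a minimal sketch might look like this. It uses [^"]* instead of [^"]+ so that empty fields ("") are kept as well; splitQuoted is a hypothetical name, and escaped quotes inside a field are assumed not to occur.

```javascript
// Hypothetical helper: return every double-quoted field, quotes included.
function splitQuoted(text) {
  // match returns null when nothing matches, so fall back to an empty array.
  return text.match(/"[^"]*"/g) || [];
}

const fields = splitQuoted('"foo","super,foo"');
console.log(fields); // ['"foo"', '"super,foo"']

// Leave fields[0] untouched, change fields[1], then join with commas again.
fields[1] = '"I WAS CHANGED"';
const rejoined = fields.join(",");
console.log(rejoined); // "foo","I WAS CHANGED"
```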

How do I extract the title value from a string using Javascript regexp?

I have a string variable from which I would like to extract the title value of the element with id="resultcount". The output should be 2.
var str = '<table cellpadding=0 cellspacing=0 width="99%" id="addrResults"><tr></tr></table><span id="resultcount" title="2" style="display:none;">2</span><span style="font-size: 10pt">2 matching results. Please select your address to proceed, or refine your search.</span>';
I tried the following regex but it is not working:
/id=\"resultcount\" title=['\"][^'\"](+['\"][^>]*)>/
Since var str = ... is JavaScript syntax, I assume you need a JavaScript solution. As Peter Corlett said, you can't reliably parse HTML using regular expressions, but if you are using jQuery you can take advantage of the browser's own parser with little effort:
$('#resultcount', '<div>'+str+'</div>').attr('title')
It will return undefined if resultcount is not found or if it has no title attribute.
To make sure it doesn't matter which attribute (id or title) comes first in the string, first take the entire HTML element with the required id:
var tag = str.replace(/^.*(<[^<]+?id=\"resultcount\".+?\/.+?>).*$/, "$1")
Then find the title in that string:
var res = tag.replace(/^.*title=\"(\d+)\".*$/, "$1");
// res is 2
But, as people have previously mentioned, it is unreliable to use regex for parsing HTML: something as trivial as a different quote (single instead of double) or a space in the "wrong" place will break it.
Please see this earlier response, entitled "You can't parse [X]HTML with regex":
RegEx match open tags except XHTML self-contained tags
Well, since no one else is jumping in on this and I'm assuming you're just looking for a value and not trying to create a parser, I'll give you what works for me with PCRE. I'm not sure how to put it into the JavaScript format for you, but I think you'll be able to do that.
span id="resultcount" title="(\d+)"
The part you're looking for is the capturing group $1, which is the '\d+' part. It matches one or more digits between the quote marks.
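For reference, one way the PCRE pattern above might be written in JavaScript is sketched below. It assumes, like that pattern, that id comes before title inside the same tag and that both use double quotes; extractResultCount is a hypothetical name, and the html string is a shortened version of the one in the question.

```javascript
// Hypothetical helper: read the numeric title of the span with id="resultcount".
function extractResultCount(html) {
  // [^>]* keeps the search inside a single opening tag.
  var match = html.match(/<span\b[^>]*\bid="resultcount"[^>]*\btitle="(\d+)"/);
  return match ? Number(match[1]) : null;
}

var html =
  '<table id="addrResults"><tr></tr></table>' +
  '<span id="resultcount" title="2" style="display:none;">2</span>';

console.log(extractResultCount(html));            // 2
console.log(extractResultCount("<span>2</span>")); // null
```

The same caveat applies as in the answers above: this is pattern matching on one known string shape, not HTML parsing.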
