Substring specific text using javascript - javascript

<li class="item">hello?james</a></li>
<li class="item">goodbye?michael</a></li>
I want to extract the text that is after the sign "?" -> james, michael
I tried using a substring method but it only works if I specify the starting and the ending like substr(5,10) or substr(5) etc.
I'm using this when I extract from another file in a foreach php method, so I need everything that is after "?".
Is there any method with which I can take a substring starting at a character (e.g. "?") or at a specific string?
Many thanks!

Use a regex with capture groups
Regular expressions make it easy to match your HTML, and by using capture groups (parentheses) you can extract the names from the string:
var myString = YOUR_HTML_HERE;
var myRegexp = /<li .*\?([a-zA-Z0-9]*)</g;
var match = myRegexp.exec(myString);
alert(match[1]); // james.
Execute the regex multiple times to get all matches
You can run the regex again in the original string to get the next match. To get all of them, do this:
// exec returns null once there are no more matches, so test before using the result
while ((match = myRegexp.exec(myString)) !== null) {
  alert(match[1]); // michael
}
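If the text of each item is already available as a plain string (an assumption here, for example read from each element's text), a regex is not strictly needed. The following is a minimal sketch using indexOf and slice; the helper name textAfter and the items array are hypothetical and only serve the illustration.

```javascript
// Hypothetical helper: returns everything after the first occurrence of
// `marker`, or an empty string when the marker is absent.
function textAfter(text, marker) {
  const index = text.indexOf(marker);
  return index === -1 ? "" : text.slice(index + marker.length);
}

// Assumed input: the text content of each list item.
const items = ["hello?james", "goodbye?michael"];
const names = items.map((item) => textAfter(item, "?"));
console.log(names); // [ 'james', 'michael' ]
```

Because marker.length is added to the index, the same helper also works when the marker is a longer string rather than a single character.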

Related

How to capture group in Javascript Regex?

I am trying to capture customer.name from hello #customer.name from the end of the text string.
However, I can't seem to get the # character out of it. It gives #customer.name.
Right now my regex expression is:
#([0-9a-zA-Z.]+)$
Use the .match method of the string. The result will be null if there was no match with the given regex, otherwise it will be an array where:
The first element is the entire matched string, #customer.name in your case
The remaining elements are the capture groups, in order. You have one capture group, so it ends up at index 1. In your case this will be customer.name, the string you want.
See this snippet. Your regex already looks correct, you just need to pull only the capture group from it instead of the entire matched string.
const str = "hello #customer.name"
const reg = /#([0-9a-zA-Z.]+)$/;
const match = str.match(reg)
console.log(match)
console.log("Matched string:", match[0]);
console.log("First capture group:", match[1]);
Your regex works fine, here's some code to use it to access your capture group using the regex .exec function. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec
let testString ='hello #customer.name',
pattern = /#([0-9a-zA-Z.]+)$/,
match = pattern.exec(testString);
console.log(match[1]); // customer.name
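Both snippets index into the result directly, which throws when the input does not match, because .match and .exec return null in that case. A minimal sketch of a guarded version follows; the helper name firstGroup is hypothetical.

```javascript
// Hypothetical helper: returns the first capture group, or null when
// the pattern does not match at all.
function firstGroup(text, pattern) {
  const match = text.match(pattern);
  return match === null ? null : match[1];
}

const pattern = /#([0-9a-zA-Z.]+)$/;
console.log(firstGroup("hello #customer.name", pattern)); // customer.name
console.log(firstGroup("hello customer.name", pattern));  // null
```

This relies on the pattern not having the g flag: with g, .match returns only the full matches and drops the capture groups.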

javascript find 3rd occurrence of a substring in a string

I am trying to extract the substring between the 3rd occurrence of the '|' character and the ';GTSet' string within a string.
For Example, if my string is "AP0|#c7477474-376c-abab-2990-918aac222213;L0|#0a4a23b12-125a-2ac2-3939-333aav111111|ABC xxx;pATeND|#222222ANCJ-VCVC-2262-737373-3838383";
I would like to extract "ABC xxx" from above string using javascript.
I have tried following options
var str = "AP0|#c7477474-376c-abab-2990-918aac222213;L0|#0a4a23b12-125a-2ac2-3939-333aav111111|ABC xxx;pATeND|#222222ANCJ-VCVC-2262-737373-3838383";
alert(str.match(/^|\;pATeND(.*)$/gm));
//var n = str.search(";pATeND");
//to get the 3rd occurance of | character
//var m = str.search("s/\(.\{-}\z|\)\{3}");
This lookahead regex should work:
/[^|;]+(?=;pATeND)/
Or, if the pATeND text is not known, grab the value after the 3rd |:
^(?:[^|]*\|){3}([^|;]+)
and use captured group #1.
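As a quick check, here is a minimal sketch that runs both patterns against the string from the question. The variable names are chosen for illustration only.

```javascript
const str = "AP0|#c7477474-376c-abab-2990-918aac222213;L0|#0a4a23b12-125a-2ac2-3939-333aav111111|ABC xxx;pATeND|#222222ANCJ-VCVC-2262-737373-3838383";

// Lookahead variant: a run of characters without | or ; that is
// immediately followed by ";pATeND". The whole match is the value.
const byLookahead = str.match(/[^|;]+(?=;pATeND)/);
console.log(byLookahead[0]); // ABC xxx

// Positional variant: skip three "anything up to and including |"
// segments, then capture up to the next | or ; in group 1.
const byPosition = str.match(/^(?:[^|]*\|){3}([^|;]+)/);
console.log(byPosition[1]); // ABC xxx
```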

converting a string to stringified JSON

I'm getting a string which looks like following
"{option:{name:angshu,title:guha}}"
Now I have to make a valid JSON string from this. Is there any smart way to convert it? I tried string handling, but that takes a lot of conditions and is still case-specific. I even tried eval(), but that doesn't work either.
This regex will do the trick for the provided example string:
/:([^{},]+)/g
Regex101 analysis of it:
: matches the character : literally
1st Capturing group ([^{},]+)
[^{,}]+ match a single character not present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
{}, a single character in the list {}, literally
g modifier: global. All matches (don't return on first match)
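Basically, it looks for every run of characters following a : that contains none of {, } or ,. Those "words" are saved in the 1st capturing group, which allows .replace to re-use them with $1. Note that only the values get quoted this way; the keys are left unquoted.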
You can use the regex like this:
var raw = "{option:{name:angshu,title:guha}}",
regex = /:([^{,}]+)/g,
replacement = ':"$1"';
var jsonString = raw.replace(regex, replacement);
alert(jsonString);
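Try this if you are looking for a JavaScript object. JSON.parse only accepts valid JSON, so the keys and string values must already be quoted:
var inputString = '{"option":{"name":"angshu","title":"guha"}}';
var obj = JSON.parse(inputString);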
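The two answers can be combined. The sketch below quotes every bare token, keys as well as values, and then parses the result. It assumes that tokens never contain {, }, :, a comma, or whitespace, and it turns every value into a string; the helper name quoteBareTokens is hypothetical.

```javascript
// Hypothetical helper: wraps each run of non-structural characters in
// double quotes. "$&" in the replacement stands for the matched text.
function quoteBareTokens(raw) {
  return raw.replace(/[^{}:,\s]+/g, '"$&"');
}

const raw = "{option:{name:angshu,title:guha}}";
const jsonString = quoteBareTokens(raw);
console.log(jsonString); // {"option":{"name":"angshu","title":"guha"}}

const parsed = JSON.parse(jsonString);
console.log(parsed.option.name); // angshu
```

Input with nested quotes, numbers that should stay numeric, or values containing punctuation would need a real parser rather than a single replace.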

match a string not after another string

This
var re = /[^<a]b/;
var str = "<a>b";
console.log(str.match(re)[0]);
matches >b.
However, I don't understand why this pattern /[^<a>]b/ doesn't match anything. I want to capture only the "b".
The reason why /[^<a>]b/ doesn't match anything is that the class excludes <, a, and > as individual characters, so rewriting it as /[^><a]b/ would do the same thing. I doubt this is what you want, though. Try the following:
var re = /<a>(b)/;
var str = "<a>b";
console.log(str.match(re)[1]);
This regex looks for a string that looks like <a>b first, but it captures the b with the parentheses. To access the b, simply use [1] when you call .match instead of [0], which would return the entire string (<a>b).
What you're using here matches a b preceded by any character that is not listed in the brackets. The syntax is [^a-z+-], where a-z+- is the set of characters to exclude (in this case, the range of lowercase Latin letters, a plus sign and a minus sign). So your regex pattern matches any b preceded by a character that is NOT < or a. Since > is not in that set, it matches >b.
The range selector basically works the same as a list of characters separated by OR pipes: [abcd] matches the same as (a|b|c|d). Range selectors have the extra ability to express that same set as [a-d], using a dash between the endpoints of a range. Putting a ^ at the start turns this positive range selector into a negative one, so it matches anything BUT the characters listed.
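To make the difference between the two patterns from the question concrete, here is a small sketch that tests both against a few two-character samples. The sample strings and variable names are made up for illustration.

```javascript
const excludeTwo = /[^<a]b/;    // b preceded by anything except < or a
const excludeThree = /[^<a>]b/; // b preceded by anything except <, a or >

const samples = ["<b", "ab", ">b", "xb"];
const results = samples.map((s) => [s, excludeTwo.test(s), excludeThree.test(s)]);
console.log(results);
// "<b" and "ab" fail both patterns, ">b" passes only the first,
// and "xb" passes both.
```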
What you are looking for is a negative lookahead. Those can exclude something from matching longer strings. Those work in this format: (?!do not match) where do not match uses the normal regex syntax. In this case, you want to test if the preceding string does not match <a>, so just use:
(?!<a>)(.{3}|^.{0,2})b
That will match the b when it is either preceded by three characters that are not <a>, or by fewer characters that are at the start of the line.
PS: what you are probably looking for is the "negative lookbehind", written (?<!<a>)b. It was added to JavaScript in ES2018, so current engines support it; in older engines that lack lookbehind, you'll have to use the alternative regex above.
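As a minimal sketch, assuming an engine that implements ES2018 lookbehind assertions (recent versions of Node.js and the major browsers do), the (?<!<a>)b form can be used directly. The test strings are made up for illustration.

```javascript
// Matches a b that is NOT immediately preceded by the text "<a>".
const notAfterTag = /(?<!<a>)b/;

console.log(notAfterTag.test("<a>b")); // false
console.log(notAfterTag.test("xb"));   // true

// In a longer string, the first b is skipped and the second one matches.
const found = "<a>b and cb".match(notAfterTag);
console.log(found.index); // 10
```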
You could write a pattern that matches the anchor tags and replace them with an empty string:
var str = "<a>b</a>";
str = str.replace(/((<a[\w\s=\[\]\'\"\-]*>)|<\/a>)/gi, '');
This reduces each of the following strings to 'b':
<a>b</a>
<a class='link-l3'>b</a>
To get more familiar with regex patterns, you may find the website regExPal very useful.
Your code :
var re = /[^<a>]b/;
var str = "<a>b";
console.log(str.match(re));
Why does [^<a>]b not match anything?
[^<a>]b means: any character except <, a, or >, followed by b.
Here b is preceded by >, which is one of the excluded characters, so it will not match.
If you want to match b, write it like this:
var re = /(?:[\<a\>])(b)/;
var str = "<a>b";
console.log(str.match(re)[1]);

Javascript regular expression matching prior and trailing characters

I have this string in an object:
<FLD>dsfgsdfgdsfg;NEW-7db5-32a8-c907-82cd82206788</FLD><FLD>dsfgsdfgsd;NEW-480e-e87c-75dc-d70cd731c664</FLD><FLD>dfsgsdfgdfsgfd;NEW-0aad-440a-629c-3e8f7eda4632</FLD>
this.model.get('value_long').match(/[<FLD>\w+;](NEW[-|\d|\w]+)[</FLD>]/g)
Returns:
[";NEW-7db5-32a8-c907-82cd82206788<", ";NEW-480e-e87c-75dc-d70cd731c664<", ";NEW-0aad-440a-629c-3e8f7eda4632<"]
What is wrong with my regular expression that it is picking up the preceding ; and the trailing <?
here is a link to the regex
http://regexr.com?30k3m
Updated:
this is what I would like returned:
["NEW-7db5-32a8-c907-82cd82206788", "NEW-480e-e87c-75dc-d70cd731c664", "NEW-0aad-440a-629c-3e8f7eda4632"]
here is a JSfiddle for it
http://jsfiddle.net/mwagner72/HHMLK/
Square brackets create a character class, which you do not want here, try changing your regex to the following:
<FLD>\w+;(NEW[-\d\w]+)</FLD>
Since it looks like you want to grab the capture group from each match, you can use the following code to construct an array with the capture group in it:
var regex = /<FLD>\w+;(NEW[\-\d\w]+)<\/FLD>/g;
var match = regex.exec(string);
var matches = [];
while (match !== null) {
matches.push(match[1]);
match = regex.exec(string);
}
[<FLD>\w+;] matches a single character from the set inside the square brackets, whereas you want to match that whole sequence. In the other character class, [-|\d|\w], you can remove the | characters: alternation is already implied inside a character class (there | is just a literal pipe), and | should only be used for alternation outside a class, typically inside a group.
Here is an updated link with the new regex: http://jsfiddle.net/RTkzx/1
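In engines that support String.prototype.matchAll (ES2020), the exec loop can be written more compactly. This is a sketch rather than the answer's own method; the variable names are illustrative, and the input is the string from the question.

```javascript
const input =
  "<FLD>dsfgsdfgdsfg;NEW-7db5-32a8-c907-82cd82206788</FLD>" +
  "<FLD>dsfgsdfgsd;NEW-480e-e87c-75dc-d70cd731c664</FLD>" +
  "<FLD>dfsgsdfgdfsgfd;NEW-0aad-440a-629c-3e8f7eda4632</FLD>";

// matchAll requires the g flag and yields one match object per hit;
// the mapping function keeps only capture group 1 of each.
const ids = Array.from(
  input.matchAll(/<FLD>\w+;(NEW[-\w]+)<\/FLD>/g),
  (m) => m[1]
);
console.log(ids);
```

The class [-\w] is equivalent to [-\d\w] here, since \w already includes the digits.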
