Extract specific chars from a string using a regex - javascript

I need to split an email address and take out the first character and the first character after the '#'
I can do this as follows:
'bar#foo'.split('#').map(function(a){ return a.charAt(0); }).join('')
--> bf
Now I was wondering if it can be done using a regex match, something like this
'bar#foo'.match(/^(\w).*?#(\w)/).join('')
--> bar#fbf
Not really what I want, but I'm sure I miss something here! Any suggestions ?

Why use a regex for this? just use indexOf to get the char at any given position:
var addr = 'foo#bar';
console.log(addr[0], addr[addr.indexOf('#')+1])
To ensure your code works on all browsers, you might want to use charAt instead of []:
console.log(addr.charAt(0), addr.charAt(addr.indexOf('#')+1));
Either way, It'll work just fine, and This is undeniably the fastest approach
If you are going to persist, and choose a regex, then you should realize that the match method returns an array containing 3 strings, in your case:
/^(\w).*?#(\w)/
["the whole match",//start of string + first char + .*?# + first string after #
"groupw 1 \w",//first char
"group 2 \w"//first char after #
]
So addr.match(/^(\w).*?#(\w)/).slice(1).join('') is probably what you want.

If I understand correctly, you are quite close. Just don't join everything returned by match because the first element is the entire matched string.
'bar#foo'.match(/^(\w).*?#(\w)/).splice(1).join('')
--> bf

Using regex:
matched="",
'abc#xyz'.replace(/(?:^|#)(\w)/g, function($0, $1) { matched += $1; return $0; });
console.log(matched);
// ax

The regex match function returns an array of all matches, where the first one is the 'full text' of the match, followed by every sub-group. In your case, it returns this:
bar#f
b
f
To get rid of the first item (the full match), use slice:
'bar#foo'.match(/^(\w).*?#(\w)/).slice(1).join('\r')

Use String.prototype.replace with regular expression:
'bar#foo'.replace(/^(\w).*#(\w).*$/, '$1$2'); // "bf"

Or using RegEx
^([a-zA-Z0-9])[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+#([a-zA-Z0-9-])[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$
Fiddle

Related

Javascript Regex match everything after last occurrence of string

I am trying to match everything after (but not including!) the last occurrence of a string in JavaScript.
The search, for example, is:
[quote="user1"]this is the first quote[/quote]\n[quote="user2"]this is the 2nd quote and some url https://www.google.com/[/quote]\nThis is all the text I\'m wirting about myself.\n\nLook at me ma. Javascript.
Edit: I'm looking to match everything after the last quote block. So I was trying to match everything after the last occurrence of "quote]" ? Idk if this is the best solution but its what i've been trying.
I'll be honest, i suck at this Regex stuff.. here is what i've been trying with the results..
regex = /(quote\].+)(.*)/ig; // Returns null
regex = /.+((quote\]).+)$/ig // Returns null
regex = /( .* (quote\]) .*)$/ig // Returns null
I have made a JSfiddle for anyone to have a play with here:
https://jsfiddle.net/au4bpk0e/
One option would be to match everything up until the last [/quote], and then get anything following it. (example)
/.*\[\/quote\](.*)$/i
This works since .* is inherently greedy, and it will match every up until the last \[\/quote\].
Based on the string you provided, this would be the first capturing group match:
\nThis is all the text I\'m wirting about myself.\n\nLook at me ma. Javascript.
But since your string contains new lines, and . doesn't match newlines, you could use [\s\S] in place of . in order to match anything.
Updated Example
/[\s\S]*\[\/quote\]([\s\S]*)$/i
You could also avoid regex and use the .lastIndexOf() method along with .slice():
Updated Example
var match = '[\/quote]';
var textAfterLastQuote = str.slice(str.lastIndexOf(match) + match.length);
document.getElementById('res').innerHTML = "Results: " + textAfterLastQuote;
Alternatively, you could also use .split() and then get the last value in the array:
Updated Example
var textAfterLastQuote = str.split('[\/quote]').pop();
document.getElementById('res').innerHTML = "Results: " + textAfterLastQuote;

Regex thinks I'm nesting, but I'm not

I wrote this regexp to capture the strings below.
\!\[(.*?)?\]
All the strings below should match and return an optional string that's inside the first set of square brackets.
![]
![caption]
![]()
![caption]()
![caption][]
The problem is that this string also matches and returns ][ because the regex thinks it's between the first [ and last ].
![][] // Should not match, but does and returns "]["
How do I fix this?
Just remove the ? outside (.*?), that is redundant.
var myArray = ["![abc]","![caption]", "![def]()", "![caption]()","![caption][]"];
myArray.forEach(function(current) {
console.log(/!\[(.*?)\]/.exec(current)[1]);
});
Output
abc
caption
def
caption
caption
Check how the RegEx works here
Use this regex:
\!\[([^\]]*)\]
It means that it expects a "last" ] but makes internal ones invalid.
This should solve your issue.
My preference is this if you want to ignore catching the things like this ![[]]
\!\[([^\[\]]*)\]

Remove everything after the first instance of one of several characters

Say I have a string like:
var str = "Good morningX Would you care for some tea?"
Where the X could be one of several characters, like a ., ?, or !.
How can I remove everything after that character?
If it could only be one type of character, I would use indexOf and substr, but it looks like I need a different method to find the position in this case. Perhaps a regular expression?
Clarification: I do not know what character X is. I'd like to cut the string off at the first occurrence of any one of the specified characters.
Ok, further clarification:
What I'm actually doing is scrubbing posts from a website. I'm taking the first bit from each post and stitching them together. By 'bit', I mean characters before the first piece of punctuation. I need to cut everything off after that punctuation. Does that make sense?
Just replace everything within the [ and ] with your delimiters. Escape if necessary.
var str = "Good morning! Would you care for some tea?";
var beginning = str.split(/[.?!]/)[0];
// "Good morning"
Try this, If the X have this ',' character , then try below
var s = 'Good morning, would you care for some tea?';
s = s.substring(0, s.indexOf(','));
document.write(s);
Demo : http://jsfiddle.net/L4hna/490/
and if the X have '!' , then try below
var s = 'Good morning! would you care for some tea?';
s = s.substring(0, s.indexOf('!'));
document.write(s);
Demo : http://jsfiddle.net/L4hna/491/
Try this way for your requirement string.
Both are will return Good Morning
The below code will do as you expect:
var s = "Good morningX Would you care for some tea?";
s = s.substring(X, n != -1 ? n : s.length);
document.write(s);
http://jsfiddle.net/JEFnY/
The regex would be
str.replace(/(.*?)([\.\?\!])(.*)/i, '$1$2');
The first capturing group is a lazy expression to match everything before the next capturing group.
The second capturing group only looks for the characters that you specify - which in this case are .!?, all escaped.
The last capturing group is discarded. Hence the substitution string is $1$2, or the first two capturing groups together.

Is there a way to do a substring in Javascript but use string characters as the parameters for what you want to select?

So a substring can take two parameters, the index to start at and the index to stop at like so
var str="Hello beautiful world!";
document.write(str.substring(3,7));
but is there a way to designate the start and stopping points as a set of characters to grab, so instead of the starting point being 3 I would want it to be "lo" and instead of the end point being 7 I would want it to be "wo" so I would be grabbing "lo beautiful wo". Is there a Javascript function that serves that purpose already?
Sounds like you want to use regular expressions and string.match() instead:
var str="Hello beautiful world!";
document.write(str.match(/lo.*wo/)[0]); // document.write("lo beautiful wo");
Note, match() returns an array of matches, which might be null if there is no match. So you should include a null check.
If you're not familiar with regexes, this is a pretty good source:
http://www.w3schools.com/jsref/jsref_obj_regexp.asp
use the method indexOf: document.write(str.substring(3,str.indexOf('wo')+2));
Yup, you can do this easily with regular expressions:
var substr = /lo.+wo/.exec( 'Hello beautiful world!' )[0];
console.log( substr ); //=> 'lo beautiful wo'
Use a regex brother:
if (/(lo.+wo)/.test("Hello beautiful world!")) {
document.write(RegExp.$1);
}
You need a backup plan in case the string does not match. Hence the use of test.
Regular expression may be able to achieve this to some extent, but there are many details that you must be aware of.
For example, if you want to find all the substrings that starts with "lo", and ends with the nearest "wo" after "lo". (If there are more than 1 match, the subsequent matches will pick up the first "lo" after the "wo" of last match).
"Hello beautiful world!".match(/lo.*?wo/g);
Using the RegExp constructor, you can make it more flexible (you can substitute "lo" and "wo" with the actual string you want to find):
"Hello beautiful world!".match(new RegExp("lo" + ".*?" + "wo", "g"));
Important: The downside of the RegExp approach above is that, you need to know what characters are special to escape them - otherwise, they will not match the actual substring you want to find.
It can also be achieve with indexOf, albeit a little bit dirty. For the first substring:
var startIndex = str.indexOf(startString);
var endIndex = str.indexOf(endString, startIndex);
if (startIndex >= 0 && endIndex >= 0)
str.substring(startIndex, endIndex + endString.length)
If you want to find the substring that starts with the first "lo" and ends with the last "wo" in the string, you can use indexOf and lastIndexOf to find it (with a small modification to the code above). RegExp can also do it, by changing .*? to .* in the two example above (there will be at most 1 match, so the "g" flag at the end is redundant).

java script Regular Expressions patterns problem

My problem start with like-
var str='0|31|2|03|.....|4|2007'
str=str.replace(/[^|]\d*[^|]/,'5');
so the output becomes like:"0|5|2|03|....|4|2007" so it replaces 31->5
But this doesn't work for replacing other segments when i change code like this:
str=str.replace(/[^|]{2}\d*[^|]/,'6');
doesn't change 2->6.
What actually i am missing here.Any help?
I think a regular expression is a bad solution for that problem. I'd rather do something like this:
var str = '0|31|2|03|4|2007';
var segments = str.split("|");
segments[1] = "35";
segments[2] = "123";
Can't think of a good way to solve this with a regexp.
Here is a specific regex solution which replaces the number following the first | pipe symbol with the number 5:
var re = /^((?:\d+\|){1})\d+/;
return text.replace(re, '$15');
If you want to replace the digits following the third |, simply change the {1} portion of the regex to {3}
Here is a generalized function that will replace any given number slot (zero-based index), with a specified new number:
function replaceNthNumber(text, n, newnum) {
var re = new RegExp("^((?:\\d+\\|){"+ n +'})\\d+');
return text.replace(re, '$1'+ newnum);
}
Firstly, you don't have to escape | in the character set, because it doesn't have any special meaning in character sets.
Secondly, you don't put quantifiers in character sets.
And finally, to create a global matching expression, you have to use the g flag.
[^\|] means anything but a '|', so in your case it only matches a digit. So it will only match anything with 2 or more digits.
Second you should put the {2} outside of the []-brackets
I'm not sure what you want to achieve here.

Categories

Resources