Extract Twitter handlers from string using regex in JavaScript - javascript

I Would like to extract the Twitter handler names from a text string, using a regex. I believe I am almost there, except for the ">" that I am including in my output. How can I change my regex to be better, and drop the ">" from my output?
Here is an example of a text string value:
"PlaymakersZA, Absa, DiepslootMTB"
The desired output would be an array consisting of the following:
PlaymakersZA, Absa, DiepslootMTB
Here is an example of my regex:
var array = str.match(/>[a-z-_]+/ig)
Thank you!

You can use match groups in your regex to indicate the part you wish to extract.
I set up this JSFiddle to demonstrate.
Basically, you surround the part of the regex that you want to extract in parenthesis: />([a-z-_]+)/ig, save it as an object, and execute .exec() as long as there are still values. Using index 1 from the resulting array, you can find the first match group's result. Index 0 is the whole regex, and next indices would be subsequent match groups, if available.
var str = "PlaymakersZA, Absa, DiepslootMTB";
var regex = />([a-z-_]+)/ig
var array = regex.exec(str);
while (array != null) {
alert(array[1]);
array = regex.exec(str);
}

You could just strip all the HTML
var str = "PlaymakersZA, Absa, DiepslootMTB";
$handlers = str.replace(/<[^>]*>|\s/g,'').split(",");

Related

How to remove strings before nth character in a text?

I have a dynamically generated text like this
xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0
How can I remove everything before Map ...? I know there is a hard coded way to do this by using substring() but as I said these strings are dynamic and before Map .. can change so I need to do this dynamically by removing everything before 4th index of - character.
You could remove all four minuses and the characters between from start of the string.
var string = 'xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0',
stripped = string.replace(/^([^-]*-){4}/, '');
console.log(stripped);
I would just find the index of Map and use it to slice the string:
let str = "xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0"
let ind = str.indexOf("Map")
console.log(str.slice(ind))
If you prefer a regex (or you may have occurrences of Map in the prefix) you man match exactly what you want with:
let str = "xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0"
let arr = str.match(/^(?:.+?-){4}(.*)/)
console.log(arr[1])
I would just split on the word Map and take the first index
var splitUp = 'xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0'.split('Map')
var firstPart = splitUp[0]
Uses String.replace with regex expression should be the popular solution.
Based on the OP states: so I need to do this dynamically by removing everything before 4th index of - character.,
I think another solution is split('-') first, then join the strings after 4th -.
let test = 'xxxxxx-xxxx-xxxxx-xxxxx-Map-B-844-0'
console.log(test.split('-').slice(4).join('-'))

Javascript RegEx contains

I'm using Javascript RegEx to compare if a string matches a standart format.
I have this variable called inputName, which has the following format (sample):
input[name='data[product][tool_team]']
And what I want to achieve with Javascript's regex is to determine if the string has the following but contains _team in between those brackets.
I tried the following:
var inputName = "input[name='data[product][tool_team]']";
var teamPattern = /\input[name='data[product][[_team]]']/g;
var matches = inputName.match(teamPattern);
console.log(matches);
I just get null with the result I gave as an example.
To be honest, RegEx isn't really my area, so I suppose it's wrong.
A couple of things:
You need to escape [ and ] as they have special meaning in regex
You need .* (or perhaps [^[]*) in front of _team if you want to allow anything there ([^[]* means "anything but a [ repeated zero or more times)
Example if you just want to know if it matches:
var string = "input[name='data[product][tool_team]']";
var teamPattern = /input\[name='data\[product\]\[[^[]*_team\]'\]/;
console.log(teamPattern.test(string));
Example if you need to capture the xyz_team bit:
var string = "input[name='data[product][tool_team]']";
var teamPattern = /input\[name='data\[product\]\[([^[]*_team)\]'\]/;
var match = string.match(teamPattern);
console.log(match ? match[1] : "no match");
If you are trying to check for DOM elements you can use attribute contains or attribute equals selector
document.querySelectorAll("input[name*='[_team]']")

Match a string between two other strings with regex in javascript

How can I use regex in javascript to match the phone number and only the phone number in the sample string below? The way I have it written below matches "PHONE=9878906756", I need it to only match "9878906756". I think this should be relatively simple, but I've tried putting negating like characters around "PHONE=" with no luck. I can get the phone number in its own group, but that doesn't help when assigning to the javascript var, which only cares what matches.
REGEX:
/PHONE=([^,]*)/g
DATA:
3={STATE=, SSN=, STREET2=, STREET1=, PHONE=9878906756,
MIDDLENAME=, FIRSTNAME=Dexter, POSTALCODE=, DATEOFBIRTH=19650802,
GENDER=0, CITY=, LASTNAME=Morgan
The way you're doing it is right, you just have to get the value of the capture group rather than the value of the whole match:
var result = str.match(/PHONE=([^,]*)/); // Or result = /PHONE=([^,]*)/.exec(str);
if (result) {
console.log(result[1]); // "9878906756"
}
In the array you get back from match, the first entry is the whole match, and then there are additional entries for each capture group.
You also don't need the g flag.
Just use dataAfterRegex.substring(6) to take out the first 6 characters (i.e.: the PHONE= part).
Try
var str = "3={STATE=, SSN=, STREET2=, STREET1=, PHONE=9878906756, MIDDLENAME=, FIRSTNAME=Dexter, POSTALCODE=, DATEOFBIRTH=19650802, GENDER=0, CITY=, LASTNAME=Morgan";
var ph = str.match(/PHONE\=\d+/)[0].slice(-10);
console.log(ph);

Regular Expression to get the last word from TitleCase, camelCase

I'm trying to split a TitleCase (or camelCase) string into precisely two parts using javascript. I know I can split it into multiple parts by using the lookahead:
"StringToSplit".split(/(?=[A-Z])/);
And it will make an array ['String', 'To', 'Split']
But what I need is to break it into precisely TWO parts, to produce an array like this:
['StringTo', 'Split']
Where the second element is always the last word in the TitleCase, and the first element is everything else that precedes it.
Is this what you are looking for ?
"StringToSplit".split(/(?=[A-Z][a-z]+$)/); // ["StringTo", "Split"]
Improved based on lolol answer :
"StringToSplit".split(/(?=[A-Z][^A-Z]+$)/); // ["StringTo", "Split"]
Use it like this:
s = "StringToSplit";
last = s.replace(/^.*?([A-Z][a-z]+)(?=$)/, '$1'); // Split
first = s.replace(last, ''); // StringTo
tok = [first, last]; // ["StringTo", "Split"]
You could use
(function(){
return [this.slice(0,this.length-1).join(''), this[this.length-1]];
}).call("StringToSplit".split(/(?=[A-Z])/));
//=> ["StringTo", "Split"]
In [other] words:
create the Array using split from a String
join a slice of that Array without the last element of that
Array
add that and the last element to a final Array

Regex to extract substring, returning 2 results for some reason

I need to do a lot of regex things in javascript but am having some issues with the syntax and I can't seem to find a definitive resource on this.. for some reason when I do:
var tesst = "afskfsd33j"
var test = tesst.match(/a(.*)j/);
alert (test)
it shows
"afskfsd33j, fskfsd33"
I'm not sure why its giving this output of original and the matched string, I am wondering how I can get it to just give the match (essentially extracting the part I want from the original string)
Thanks for any advice
match returns an array.
The default string representation of an array in JavaScript is the elements of the array separated by commas. In this case the desired result is in the second element of the array:
var tesst = "afskfsd33j"
var test = tesst.match(/a(.*)j/);
alert (test[1]);
Each group defined by parenthesis () is captured during processing and each captured group content is pushed into result array in same order as groups within pattern starts. See more on http://www.regular-expressions.info/brackets.html and http://www.regular-expressions.info/refcapture.html (choose right language to see supported features)
var source = "afskfsd33j"
var result = source.match(/a(.*)j/);
result: ["afskfsd33j", "fskfsd33"]
The reason why you received this exact result is following:
First value in array is the first found string which confirms the entire pattern. So it should definitely start with "a" followed by any number of any characters and ends with first "j" char after starting "a".
Second value in array is captured group defined by parenthesis. In your case group contain entire pattern match without content defined outside parenthesis, so exactly "fskfsd33".
If you want to get rid of second value in array you may define pattern like this:
/a(?:.*)j/
where "?:" means that group of chars which match the content in parenthesis will not be part of resulting array.
Other options might be in this simple case to write pattern without any group because it is not necessary to use group at all:
/a.*j/
If you want to just check whether source text matches the pattern and does not care about which text it found than you may try:
var result = /a.*j/.test(source);
The result should return then only true|false values. For more info see http://www.javascriptkit.com/javatutors/re3.shtml
I think your problem is that the match method is returning an array. The 0th item in the array is the original string, the 1st thru nth items correspond to the 1st through nth matched parenthesised items. Your "alert()" call is showing the entire array.
Just get rid of the parenthesis and that will give you an array with one element and:
Change this line
var test = tesst.match(/a(.*)j/);
To this
var test = tesst.match(/a.*j/);
If you add parenthesis the match() function will find two match for you one for whole expression and one for the expression inside the parenthesis
Also according to developer.mozilla.org docs :
If you only want the first match found, you might want to use
RegExp.exec() instead.
You can use the below code:
RegExp(/a.*j/).exec("afskfsd33j")
I've just had the same problem.
You only get the text twice in your result if you include a match group (in brackets) and the 'g' (global) modifier.
The first item always is the first result, normally OK when using match(reg) on a short string, however when using a construct like:
while ((result = reg.exec(string)) !== null){
console.log(result);
}
the results are a little different.
Try the following code:
var regEx = new RegExp('([0-9]+ (cat|fish))','g'), sampleString="1 cat and 2 fish";
var result = sample_string.match(regEx);
console.log(JSON.stringify(result));
// ["1 cat","2 fish"]
var reg = new RegExp('[0-9]+ (cat|fish)','g'), sampleString="1 cat and 2 fish";
while ((result = reg.exec(sampleString)) !== null) {
console.dir(JSON.stringify(result))
};
// '["1 cat","cat"]'
// '["2 fish","fish"]'
var reg = new RegExp('([0-9]+ (cat|fish))','g'), sampleString="1 cat and 2 fish";
while ((result = reg.exec(sampleString)) !== null){
console.dir(JSON.stringify(result))
};
// '["1 cat","1 cat","cat"]'
// '["2 fish","2 fish","fish"]'
(tested on recent V8 - Chrome, Node.js)
The best answer is currently a comment which I can't upvote, so credit to #Mic.

Categories

Resources