Match words separated by punctuation characters using regex - javascript

The sample string:
this!is.an?example
I want to match: this is an example.
I tried this:
<script type="text/javascript">
var string="this!is.an?example";
var pattern=/^\W/g;
alert(string.match(pattern));
</script>

Try this:
var words = "this!is.an?example".split(/[!.?,;:'"-]/);
This will create a string array containing each word.
If you want to turn it into a single string with the words separated by spaces, you can call words.join(" ").
EDIT: You could also split on \W (str.split(/\W/)), but that may match more characters than you want.
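One caveat with splitting: adjacent punctuation marks, or punctuation at the very start or end of the input, leave empty strings in the result. A minimal sketch that avoids this, assuming you only want the non-empty words (the helper name splitWords is hypothetical):

```javascript
// Hypothetical helper: split on runs of the listed punctuation and drop empty pieces.
function splitWords(text) {
  return text
    .split(/[!.?,;:'"-]+/)          // "+" treats "?!" as a single separator
    .filter((word) => word !== ""); // leading/trailing punctuation leaves "" behind
}

console.log(splitWords("this!is.an?example").join(" ")); // this is an example
console.log(splitWords("wait...what?!").join(" "));      // wait what
```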

I can't understand why you want to explicitly match, but if your goal is to strip all punctuation, this would work:
var words = "this!is.an?example".split(/\W/);
words = words.join(' ');
\W will match any character except letters, digits and underscore.
If you want to split also on underscores, use this:
var words = "this!is.an?example_with|underscore".split(/\W|_/);

If you just want to match:
(\w|\.|!|\?)+

If you want to replace all punctuation with a space, you could do this:
str = str.replace(/[^A-Za-z0-9]/g, " ");
This replaces every character that is not a letter or digit with a space.
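A runnable sketch of the replacement approach. It assumes that a run of several punctuation characters should collapse into a single space and that spaces at the ends should be trimmed, neither of which the answer states:

```javascript
const input = "this!is.an?example...";

// Replace each run of non-alphanumeric characters with one space, then trim the ends.
const cleaned = input.replace(/[^A-Za-z0-9]+/g, " ").trim();

console.log(cleaned); // this is an example
```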

/^\W/g means: match a string whose first character is not a letter, digit, or underscore,
and the string "this!is.an?example" obviously does not begin with such a character.
Remember that ^ anchors the match to the start of the whole string, not to the start of each piece you want to match. Also remember that uppercase \W matches everything that lowercase \w does not. With that in mind, what you probably want is:
var string="this!is.an?example";
var pattern=/(\w+)/g; // parens are optional: with the g flag, match() returns whole matches only
alert(string.match(pattern).join(' ')); // if you don't join,
// the array is shown
// comma-separated:
// "this,is,an,example"
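One detail worth guarding against: match() returns null rather than an empty array when nothing matches, so calling join() on the result can throw. A small sketch with a hypothetical helper, extractWords, that falls back to an empty array:

```javascript
// Hypothetical helper: return the words in a string, or [] when there are none.
function extractWords(text) {
  // With the g flag, match() returns every full match, or null if nothing matched.
  return text.match(/\w+/g) || [];
}

console.log(extractWords("this!is.an?example").join(" ")); // this is an example
console.log(extractWords("?!.").length);                    // 0
```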

Related

regex custom length but no whitespace allowed [duplicate]

I have a username field in my form. I want to not allow spaces anywhere in the string. I have used this regex:
var regexp = /^\S/;
This works for me if there are spaces between the characters. That is if username is ABC DEF. It doesn't work if a space is in the beginning, e.g. <space><space>ABC. What should the regex be?
While you have specified the start anchor and the first letter, you have not done anything for the rest of the string. You seem to want repetition of that character class until the end of the string:
var regexp = /^\S*$/; // a string consisting only of non-whitespaces
Use the + quantifier (match one or more of the preceding item) so that an empty string is rejected:
var regexp = /^\S+$/
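The two quantifiers differ only for the empty string: * accepts it and + rejects it. A quick check, with sample usernames chosen purely for illustration:

```javascript
const allowEmpty = /^\S*$/; // zero or more non-whitespace characters
const requireOne = /^\S+$/; // at least one non-whitespace character

for (const name of ["ABC", "ABC DEF", "  ABC", ""]) {
  console.log(JSON.stringify(name), allowEmpty.test(name), requireOne.test(name));
}
// "ABC" true true
// "ABC DEF" false false
// "  ABC" false false
// "" true false
```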
If you're using a plugin that takes a string and builds the regex object from it, i.e. with new RegExp(),
then the string below will work:
'^\\S*$'
It's the same regex @Bergi mentioned, just in the string form needed by the RegExp constructor.
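The backslash is doubled because the string literal consumes one level of escaping before RegExp ever sees the pattern. A small sketch of what happens with and without the extra backslash:

```javascript
const fromString = new RegExp("^\\S*$"); // the string's value is ^\S*$
const fromLiteral = /^\S*$/;

console.log(fromString.source === fromLiteral.source); // true
console.log(fromString.test("ABC DEF"));               // false

// With a single backslash, "\S" is just "S" inside a string literal,
// so this pattern means "zero or more S characters".
const broken = new RegExp("^\S*$");
console.log(broken.source);      // ^S*$
console.log(broken.test("ABC")); // false
```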
This will help to find the spaces in the beginning, middle and ending:
var regexp = /\s/g
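To validate a username with this approach, test for any whitespace and negate the result. One thing to watch for: a regex with the g flag remembers its position between test() calls, so for a simple yes/no check the flag is probably better left off. A sketch (hasSpace is a hypothetical helper name):

```javascript
// Hypothetical helper: true if the text contains whitespace anywhere.
function hasSpace(text) {
  return /\s/.test(text); // no g flag, so every call starts from index 0
}

console.log(hasSpace("  ABC"));   // true
console.log(hasSpace("ABC DEF")); // true
console.log(hasSpace("ABC"));     // false

// The g flag makes test() stateful through lastIndex:
const globalRe = /\s/g;
const first = globalRe.test("a b");  // true, lastIndex is now 2
const second = globalRe.test("a b"); // false, the search resumed after the space
console.log(first, second);          // true false
```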
This one will only match the input field or string if there are no spaces. If there are any spaces, it will not match at all.
/^([A-Za-z0-9!@#$%^&*().,<>{}[\]<>?_=+\-|;:\'\"\/])*[^\s]\1*$/
Matches from the beginning of the line to the end. Accepts letters, digits, and most special characters.
If you want just letters and digits, then change what is in the [] like so:
/^([A-Za-z0-9])*[^\s]\1*$/

JS\TS - unable to add a white space in a string based on multiple special characters

I'm looking for a smart way to add white space after special character in a long string.
let str = "this\\is\\an\\example\\for\\a\\long\\string"; // a backslash must be escaped inside a string literal
str = str.split("\\").join("\\ ");
This would produce:
"this\ is\ an\ example\ for\ a\ long\ string";
I am looking for something more generic to capture multiple special chars at once, something like this:
let str = "this.is.a\\long-mixed.string\\with\\many.special/characters";
str = str.split(/[.\-_]/).join(/[. \- _ ]/); //note the white spaces after the dot, hyphen and slash. I need to cover as much special chars as possible.
EDIT
I need this to support multi languages. So basically English\Arabic\Hebrew words should not be whitespaced, But only insert a whitespace after a special char.
You can do it like this.
Here replace matches any character that is not an ASCII letter or digit and appends a space after it.
let str = "this.is.a\\long-mixed.string\\with\\many.special/characters";
str = str.replace(/([\W_])/g, "$1 ");
console.log(str);
([\W_]) - Matches any character other than ASCII letters and digits.
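Because \W is ASCII-only in JavaScript, the pattern above also treats Arabic and Hebrew letters as special characters and puts a space after each one. A possible variant for the multi-language requirement, assuming an engine with Unicode property escapes (ES2018 or later) and assuming "special" means anything that is not a letter, combining mark, number, or whitespace. The helper name spaceAfterSpecial is hypothetical:

```javascript
// Hypothetical helper: insert one space after every "special" character.
function spaceAfterSpecial(text) {
  // \p{L} letters, \p{M} combining marks, \p{N} numbers; \s keeps existing spaces untouched.
  return text.replace(/([^\p{L}\p{M}\p{N}\s])/gu, "$1 ");
}

// String.raw keeps the backslashes literal.
const mixed = String.raw`this.is.a\long-mixed.string\with\many.special/characters`;
console.log(spaceAfterSpecial(mixed));
// this. is. a\ long- mixed. string\ with\ many. special/ characters

// Two Hebrew words joined by a dot, written with escapes to avoid encoding surprises.
const hebrew = "\u05E9\u05DC\u05D5\u05DD.\u05E2\u05D5\u05DC\u05DD";
console.log(spaceAfterSpecial(hebrew) === "\u05E9\u05DC\u05D5\u05DD. \u05E2\u05D5\u05DC\u05DD"); // true
```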

JS & Regex: how to replace punctuation pattern properly?

Given an input text where all spaces are replaced by n underscores (_):
Hello_world_?. Hello_other_sentenc3___. World___________.
I want to keep the _ between words, but I want to stick each punctuation mark back onto the last word of a sentence, without any space between the last word and the punctuation. I want to use the punctuation as the pivot of my regex.
I wrote the following JS-Regex:
str = str.replace(/(_| )*([:punct:])*( |_)/g, "$2$3");
This fails, since it returns :
Hello_world_?. Hello_other_sentenc3_. World_._
Why doesn't it work? How can I delete all "_" between the last word and the punctuation?
http://jsfiddle.net/9c4z5/
Try the following regex, which makes use of a positive lookahead:
str = str.replace(/_+(?=\.)/g, "");
It replaces all underscores that are immediately followed by a period with the empty string, thus removing them.
If you want to match other punctuation characters than just the period, replace the \. part with an appropriate character class.
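A sketch of the lookahead with a character class in place of the single period. The set [.,;:!?] is an assumption; adjust it to the marks you expect in your text:

```javascript
const input = "Hello_world_?. Hello_other_sentenc3___. World___________.";

// Remove runs of "_" only when one of the listed punctuation marks follows.
// The lookahead checks the next character without consuming it.
const fixed = input.replace(/_+(?=[.,;:!?])/g, "");

console.log(fixed); // Hello_world?. Hello_other_sentenc3. World.
```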
JavaScript doesn't have :punct: in its regex implementation. I believe you'd have to list out the punctuation characters you care about, perhaps something like this:
str = str.replace(/(_| )+([.,?])/g, "$2");
That is, replace any group of _ or space that is immediately followed by punctuation with just the punctuation.
Demo: http://jsfiddle.net/9c4z5/2/
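Engines that support ES2018 Unicode property escapes offer something close to :punct:. With the u flag, \p{P} matches any Unicode punctuation character. Since _ is itself classified as punctuation, the sketch below excludes it with a negative lookahead. This is an alternative under those assumptions, not what the answer above uses, and note that symbols such as $ or + are not covered by \p{P}:

```javascript
const text = "Hello_world_?. Hello_other_sentenc3___. World___________.";

// Drop runs of "_" or space that are followed by a punctuation mark other than "_".
const joined = text.replace(/[_ ]+(?=(?!_)\p{P})/gu, "");

console.log(joined); // Hello_world?. Hello_other_sentenc3. World.
```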

Regex does not apply to whole string

the following regex
var str = "1234,john smith,jack jone";
var match = str.match(/([^,]*,[^,]*,[^ ]*)/g);
alert(match);
returns
1234,john smith,jack
But what I am trying to get is the whole string which is
1234,john smith,jack jone
Basically, my script handles only the first whitespace between commas, but I want it to work every time there is whitespace between commas.
Can anyone help me out, please?
Your pattern excludes spaces from the last section, so as soon as it encounters a space after the second comma, that's the end of the match. You might want to try this instead:
var match = str.match(/[^,]*,[^,]*,.*/g);
This will allow anything after the second comma, including spaces or more commas (since your original pattern allowed commas after the second).
If you'd like to match the pattern only on a single line, use start / end anchors (^ / $) as well as the multiline flag (m), like this:
var match = str.match(/^[^,]*,[^,]*,.*$/mg);
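A sketch of the anchored, multiline version applied to input with more than one line. The second record is made up for illustration:

```javascript
// Two records, one per line; the second line is invented sample data.
const records = "1234,john smith,jack jone\n5678,jane doe,mary ann lee";

// With the m flag, ^ and $ match at the start and end of every line.
const matches = records.match(/^[^,]*,[^,]*,.*$/mg);

console.log(matches);
// [ '1234,john smith,jack jone', '5678,jane doe,mary ann lee' ]
```

Keep in mind that [^,]* can also match a newline, so a line with fewer than two commas may let a match spill into the next line. Writing [^,\n]* instead prevents that.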
You can try it out with this simple demo.
Why aren't you using split?
"1234,john smith,jack jone".split(/,/)
or
"1234,john smith,jack jone".split(",")

Regex from character until end of string

Hey. First question here, probably extremely lame, but I totally suck in regular expressions :(
I want to extract the text from a series of strings that always have only alphabetic characters before and after a hyphen:
string = "some-text"
I need to generate separate strings that include the text before AND after the hyphen. So for the example above I would need string1 = "some" and string2 = "text"
I found this and it works for the text before the hyphen, now I only need the regex for the one after the hyphen.
Thanks.
You don't need regex for that, you can just split it instead.
var myString = "some-text";
var splitWords = myString.split("-");
splitWords[0] would then be "some", and splitWords[1] will be "text".
If you actually have to use regex for whatever reason though - the $ character marks the end of a string in regex, so -(.*)$ is a regex that will match everything after the first hyphen it finds till the end of the string. That could actually be simplified to just -(.*) too, as the .* will match till the end of the string anyway.
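If regex is required, both parts can be captured in one pass. A minimal sketch, assuming the input is exactly letters, one hyphen, then letters, as the question describes (splitOnHyphen is a hypothetical helper name):

```javascript
// Hypothetical helper: return [before, after] for "letters-letters", or null otherwise.
function splitOnHyphen(text) {
  // Group 1 captures the letters before the hyphen, group 2 the letters after it.
  const result = text.match(/^([A-Za-z]+)-([A-Za-z]+)$/);
  return result ? [result[1], result[2]] : null;
}

const parts = splitOnHyphen("some-text");
console.log(parts[0], parts[1]);        // some text
console.log(splitOnHyphen("sometext")); // null
```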
