removing space and retaining the new line? - javascript

I want to remove all the spaces from a string, but I need to keep the newline characters as they are.
choiceText=choiceText.replace(/\s/g,'');
india
aus // the two words are on different lines
gives indiaaus.
I want to keep the newline and remove only the spaces.

\s means any whitespace, including newlines and tabs. A literal space character in the pattern matches only a space. To remove just spaces:
choiceText=choiceText.replace(/ /g,''); // remove spaces
You could instead remove "any whitespace except newlines". Many regex flavours define \s as roughly [ \t\r\n], so take out the \n and the \r and you get:
choiceText=choiceText.replace(/[ \t]/g,''); // remove spaces and tabs
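To make the difference between the three patterns concrete, here is a small sketch. The sample string is made up for illustration and is not the asker's real data.

```javascript
// Hypothetical sample input: two lines with stray spaces and a tab.
const choiceText = "in dia\n a\tus ";

// \s also matches \n, so the line break is lost.
const allWhitespaceRemoved = choiceText.replace(/\s/g, "");

// A literal space leaves tabs and newlines in place.
const spacesRemoved = choiceText.replace(/ /g, "");

// Spaces and tabs are removed; newlines stay.
const spacesAndTabsRemoved = choiceText.replace(/[ \t]/g, "");

// JSON.stringify makes the remaining control characters visible.
console.log(JSON.stringify(allWhitespaceRemoved));  // "indiaaus"
console.log(JSON.stringify(spacesRemoved));         // "india\na\tus"
console.log(JSON.stringify(spacesAndTabsRemoved));  // "india\naus"
```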

You can't use \s (any whitespace) for this. Use a character set instead: [ \t\f\v]

The \s regex pattern matches all whitespace chars. According to MDN, \s is "equivalent to [ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]".
The easiest way to subtract the line feed (newline) character from \s is to take the inverse shorthand class, \S, put it into a negated character class, [^\S], and add \n to it, giving [^\S\n].
See a JavaScript demo:
console.log(
"india\naus \f\r\t\v\u00a0\u1680\u2000\u200a\u2028\u2029\u202f\u205f\u3000\ufeff."
.replace(/[^\S\n]+/g, ''))
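If the text may contain Windows line endings, the same trick extends to the carriage return. The helper name below is hypothetical, and treating \r as part of the line break is an assumption about what should be kept.

```javascript
// Hypothetical helper: remove every whitespace character except \r and \n.
function stripExceptLineBreaks(text) {
  // [^\S\r\n] reads as "not non-whitespace, and not \r or \n".
  return text.replace(/[^\S\r\n]+/g, "");
}

// Same idea with only \n excluded, as in the demo above: \r is removed too.
function stripExceptLineFeed(text) {
  return text.replace(/[^\S\n]+/g, "");
}

const windowsText = "india \r\n aus\u00a0";
console.log(JSON.stringify(stripExceptLineBreaks(windowsText))); // "india\r\naus"
console.log(JSON.stringify(stripExceptLineFeed(windowsText)));   // "india\naus"
```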

Related

How to extract the last word in a string with a JavaScript regex?

What I need is the last match. In the case below, that is the word test, without the $ signs or any other special character:
Test String:
$this$ $is$ $a$ $test$
Regex:
\b(\w+)\b
The $ represents the end of the string, so...
\b(\w+)$
However, your test string seems to have dollar sign delimiters, so if those are always there, then you can use that instead of \b.
\$(\w+)\$$
var s = "$this$ $is$ $a$ $test$";
document.body.textContent = /\$(\w+)\$$/.exec(s)[1];
If there could be trailing spaces, then add \s* before the end.
\$(\w+)\$\s*$
And finally, if there could be other non-word stuff at the end, then use \W* instead.
\b(\w+)\W*$
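The most general pattern above can be wrapped in a small function. This is only a sketch; the name lastWord and the choice to return null when nothing matches are assumptions, not part of the answer.

```javascript
// Hypothetical helper: return the last run of word characters, or null.
function lastWord(text) {
  // \W*$ lets any non-word characters sit between the word and the end.
  const match = /\b(\w+)\W*$/.exec(text);
  return match ? match[1] : null;
}

console.log(lastWord("$this$ $is$ $a$ $test$")); // "test"
console.log(lastWord("a very talented boxer!"));  // "boxer"
console.log(lastWord("$$$"));                     // null
```

Checking for null before reading index 1 avoids a TypeError on input that contains no word characters at all.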
In some cases a word may be followed by non-word characters. For example, take the following sentence:
Marvelous Marvin Hagler was a very talented boxer!
If we want to match the word boxer, answers that anchor the word directly to the end of the string will not suffice, because an exclamation mark follows the word. The following expression captures it and also takes into account extraneous whitespace, newlines and any non-word characters.
[a-zA-Z]+?(?=\s*?[^\w]*?$)
https://regex101.com/r/D3bRHW/1
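The expression breaks down as follows:
[a-zA-Z] matches letters only, either uppercase or lowercase.
+? expands only as far as necessary (lazy).
(?= ... ) is a positive lookahead.
\s*? inside the lookahead allows optional whitespace after the word.
[^\w]*? then allows any run of non-word characters.
$ asserts the end of the input (or the end of a line when the m flag is set).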
The benefit here is that we do not need any flags or word boundaries, the expression takes non-word characters into account, and we do not need to reach for negation to find the word itself.
var input = "$this$ $is$ $a$ $test$";
If you use var result = input.match(/\b(\w+)\b/g), an array of all the matches is returned. You can then get the last one by calling pop() on the result or by reading result[result.length - 1].
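A sketch of the "match everything, then take the last" idea. It uses a regex literal with the g flag, because a plain string such as "\b(\w+)\b" would turn \b into a backspace character before the regex engine ever sees it. The fallback to an empty array is an added assumption to cover input with no words.

```javascript
const input = "$this$ $is$ $a$ $test$";

// With the g flag, match() returns every full match (or null if there are none).
const words = input.match(/\b\w+\b/g) || [];

// The last element sits at length - 1; index `length` itself is undefined.
const last = words[words.length - 1];

console.log(words); // [ 'this', 'is', 'a', 'test' ]
console.log(last);  // "test"
```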
Your regex will find a word, and since regexes operate left to right it will find the first word.
A \w+ matches as many consecutive word characters (letters, digits and underscore) as it can, but it must match at least one.
A \b matches the position between a word character and a non-word character, without consuming anything. In your case these are the positions next to the '$' characters.
What you need is to anchor your regex to the end of the input which is denoted in a regex by the $ character.
To support an input that may have more than just a '$' character at the end of the line, spaces or a period for instance, you can use \W+ which matches as many non-alphanumeric characters as it can:
\$(\w+)\W+$
Avoid regex - use .split and .pop the result. Use .replace to remove the special characters:
var match = str.split(' ').pop().replace(/[^\w\s]/gi, '');
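The split-and-pop approach as a function, for comparison. The function name is hypothetical, and the trim() call is an addition not in the answer: without it, a trailing space makes pop() return an empty string.

```javascript
// Hypothetical helper built on split/pop instead of an anchored regex.
function lastWordBySplit(str) {
  return str
    .trim()                     // added assumption: ignore leading/trailing whitespace
    .split(" ")                 // break on single spaces
    .pop()                      // keep the last piece
    .replace(/[^\w\s]/g, "");   // drop anything that is not a word character or whitespace
}

console.log(lastWordBySplit("$this$ $is$ $a$ $test$")); // "test"
console.log(lastWordBySplit("talented boxer! "));        // "boxer"
```

Note that this only splits on the space character, so words separated by tabs or newlines would need a different separator.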

How to match a string containing specific text preceded by any combination of alphanumerics and other characters in regex

I want to match strings that start with a specific text, followed by any combination of alphanumeric and other characters, and end with a double quote ("). Here is a sample string:
fixed_words_/abcd123/"
In this string, fixed_words_ will always be the same and the end will always be ", but in between there can be digits, letters, underscores and slashes.
I tried mystring.match(/fixed_words_\w*"/g) but it's not working. I am sorry, I am new to regex, so don't mind if it's a stupid question.
Instead of \w, have a character class that can match either \w (alphanumerics/underscores) or a slash:
mystring.match(/fixed_words_[\/\w]*"/g)
The above assumes that your expression can appear anywhere (or multiple times!) in mystring. If you want mystring to contain only your expression, add a start-of-string anchor (^) at the beginning, an end-of-string anchor ($) at the end, and get rid of the g flag permitting multiple matches:
mystring.match(/^fixed_words_[\/\w]*"$/)
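A short sketch comparing the original attempt with the two patterns above. The sample strings are invented for illustration.

```javascript
// Hypothetical inputs: one bare expression, one with two occurrences in running text.
const exact = 'fixed_words_/abcd123/"';
const embedded = 'see fixed_words_/abcd123/" and fixed_words_x_1/y" here';

// The original attempt fails because \w cannot match the slashes.
const originalAttempt = exact.match(/fixed_words_\w*"/g);

// Allowing slashes in the class finds every occurrence.
const anywhere = embedded.match(/fixed_words_[\/\w]*"/g);

// Anchored version: the whole string must be the expression.
const wholeString = /^fixed_words_[\/\w]*"$/;

console.log(originalAttempt);            // null
console.log(anywhere);                   // [ 'fixed_words_/abcd123/"', 'fixed_words_x_1/y"' ]
console.log(wholeString.test(exact));    // true
console.log(wholeString.test(embedded)); // false
```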
Use the following regex to match your string: ^\s*fixed_words_[^"]*"\s*$
[^"]* will match all the characters until it finds a double quote (") character.

How to trim all non-alphanumeric characters from start and end of a string in Javascript?

I have some strings that I want to clean up by removing all non-alphanumeric characters from the beginning and end.
It should work on these strings:
)&*#^#*^#&^%$text-is.clean,--^2*%#**)(#&^ --->> text-is.clean,--^2
-+~!##$%,.-"^&example-text#is.clean,--^#*%#**)(#&^ --->> example-text#is.clean
I have this regex, which removes them from the whole string:
val.replace(/[^a-zA-Z0-9]/g,'')
How would I change it to only remove from the beginning and end of string?
Modify your current RegExp to specify the start or end of string with ^ or $ and make it greedy. You can then link the two together with an OR |.
val.replace(/^[^a-zA-Z0-9]*|[^a-zA-Z0-9]*$/g, '');
This can be simplified to a-z with i flag for all letters and \d for numbers
val.replace(/^[^a-z\d]*|[^a-z\d]*$/gi, '');
You need to use anchors - ^ and $. And also, you would need a quantifier - *:
val.replace(/^[^a-zA-Z0-9]*|[^a-zA-Z0-9]*$/g,'')
Use anchors to match the start and end of the string:
val.replace(/^[^A-Z0-9]+|[^A-Z0-9]+$/ig, '')
Use anchors ^ and $ to match positions before first character and after last character in the string.
val.replace(/(^[^A-Za-z0-9]*)|([^A-Za-z0-9]*$)/g, '');
You can also shorten your code using \W, which matches any non-word character. It is shorthand for [^a-zA-Z0-9_], so use it only if you want to keep leading and trailing underscores as well.
val.replace(/(^\W*)|(\W*$)/g, '');
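The anchored patterns can be checked against the two strings from the question. The function names are hypothetical; the + quantifier is used here instead of * simply to avoid empty matches, which does not change the result.

```javascript
// Hypothetical helper: strip non-alphanumerics from both ends only.
function trimNonAlphanumeric(val) {
  return val.replace(/^[^a-z\d]+|[^a-z\d]+$/gi, "");
}

// Variant using \W: underscores count as word characters and are kept.
function trimNonWord(val) {
  return val.replace(/^\W+|\W+$/g, "");
}

console.log(trimNonAlphanumeric(")&*#^#*^#&^%$text-is.clean,--^2*%#**)(#&^"));
// "text-is.clean,--^2"
console.log(trimNonAlphanumeric('-+~!##$%,.-"^&example-text#is.clean,--^#*%#**)(#&^'));
// "example-text#is.clean"
console.log(trimNonAlphanumeric("__a-b__")); // "a-b"
console.log(trimNonWord("__a-b__"));         // "__a-b__"
```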

Regex to allow special characters

I need a regex that allows letters, hyphen (-), quote ('), dot (.), comma (,) and space. This is what I have now:
^[A-Za-z\s\-]$
Thanks
I removed \s from your regex since you said space, not whitespace. Feel free to put it back by replacing the space at the end with \s. Otherwise it is pretty simple:
^[A-Za-z\-'., ]+$
It matches the start of the string, then any character in the set one or more times, then the end of the string. You don't have to escape . in a set, in case you were wondering.
You probably tried new RegExp("^[A-Za-z\s\-\.\'\"\,]$"). Yet, you have a string literal there, and the backslashes just escape the following characters - necessary only for the delimiting quote (and for backslashes).
"^[A-Za-z\s\-\.\'\"\,]$" === "^[A-Za-zs-.'\",]$" === '^[A-Za-zs-.\'",]$'
Yet, the range s-. is invalid. So you would need to escape the backslash to pass a string with a backslash in the RegExp constructor:
new RegExp("^[A-Za-z\\s\\-\\.\\'\\\"\\,]$")
Instead, regex literals are easier to read and write, as you do not need to string-escape regex escape characters. Also, they are parsed only once during script "compilation", so nothing needs to be executed each time the line is evaluated. The RegExp constructor only needs to be used if you want to build regexes dynamically. So use
/^[A-Za-z\s\-\.\'\"\,]$/
and it will work. Also, you don't need to escape any of these chars in a character class - so it's just
/^[A-Za-z\s\-.'",]$/
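The escaping difference can be observed directly. This sketch assumes a + quantifier is wanted after the class (the patterns above match exactly one character without it); the variable names are illustrative only.

```javascript
// Regex literal: backslashes reach the regex engine untouched.
const fromLiteral = /^[A-Za-z\s\-.'",]+$/;

// Constructor: each regex backslash must be doubled inside the string literal.
const fromString = new RegExp("^[A-Za-z\\s\\-.'\",]+$");

// Both spellings produce the same pattern text.
console.log(fromLiteral.source === fromString.source); // true

// Single backslashes are consumed by the string literal, leaving the
// out-of-order range s-. in the class, which the constructor rejects.
let constructorError = null;
try {
  new RegExp("^[A-Za-z\s\-\.\'\"\,]$");
} catch (err) {
  constructorError = err;
}
console.log(constructorError instanceof SyntaxError); // true
```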
You are pretty close, try the following:
^[A-Za-z\s\-'.,]+$
Note that I assumed that you want to match strings that contain one or more of any of these characters, so I added + after the character class which mean "repeat the previous element one or more times".
Note that this will currently also allow tabs and line breaks in addition to spaces because \s will match any whitespace character. If you only want to allow spaces, change it to ^[A-Za-z \-'.,]+$ (just replaced \s with a space).
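A quick sketch of the difference between the two character classes. The test strings are made up for illustration.

```javascript
// Space only versus any whitespace inside the allowed set.
const spacesOnly = /^[A-Za-z \-'.,]+$/;
const anyWhitespace = /^[A-Za-z\s\-'.,]+$/;

console.log(spacesOnly.test("O'Neil, J.-P."));  // true
console.log(spacesOnly.test("abc123"));         // false: digits are not in the set
console.log(spacesOnly.test(""));               // false: + requires at least one character
console.log(spacesOnly.test("two\nlines"));     // false: \n is not a space
console.log(anyWhitespace.test("two\nlines"));  // true: \s also matches \n
```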

How can my regex for finding multiple spaces not match newlines?

I run this regex on #q keyup event to avoid extra spaces in a string.
$('#q').val($('#q').val().replace(/\s+/g,' '));
The problem is that it is also deleting all new lines. How can I delete extra spaces but keep new lines intact?
The issue is that \s represents all whitespace including newlines. If you just want spaces, you can have a literal space:
$('#q').val($('#q').val().replace(/ +/g,' '));
If you want spaces and tabs, you could use a character class instead:
$('#q').val($('#q').val().replace(/[\t ]+/g,' '));
Looking for \x20+ does the trick:
$('#q').val($('#q').val().replace(/\x20+/g,' '));
20 is the hexadecimal code for the space character. Your original \s+ was matching all whitespace characters, including newlines.
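The same replacement can be tried outside the keyup handler. This sketch leaves out jQuery and the #q field entirely; the function names and sample text are assumptions for illustration.

```javascript
// Hypothetical helper: collapse runs of spaces and tabs, leave newlines alone.
function collapseSpaces(text) {
  return text.replace(/[ \t]+/g, " ");
}

// The original pattern, for comparison: \s+ swallows the newlines too.
function collapseAllWhitespace(text) {
  return text.replace(/\s+/g, " ");
}

const typed = "a   b\n\nc \t d";
console.log(JSON.stringify(collapseSpaces(typed)));        // "a b\n\nc d"
console.log(JSON.stringify(collapseAllWhitespace(typed))); // "a b c d"
```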
