I am trying to handle input groups similar to:
'...A.B.' and want to output '.....AB'.
Another example:
'.C..Z..B.' ==> '......CZB'
I have been working with the following:
'...A.B.'.replace(/(\.*)([A-Z]*)/g, "$1")
returns:
"....."
and
'...A.B.'.replace(/(\.*)([A-Z]*)/g, "$2")
returns:
"AB"
but
'...A.B.'.replace(/(\.*)([A-Z]*)/g, "$1$2")
returns
"...A.B."
Is there a way to return
".....AB"
with a single regexp?
I have only been able to accomplish this with:
'...A.B.'.replace(/(\.*)([A-Z]*)/g, "$1") + '...A.B.'.replace(/(\.*)([A-Z]*)/g, "$2")
==> ".....AB"
If the goal is to move all of the . to the beginning and all of the A-Z to the end, then I believe the answer to
with a single regexp?
is "no."
Separately, I don't think there's a simpler, more efficient way than two replace calls, though not the two you've shown. Instead:
var str = "...A..B...C.";
var result = str.replace(/[A-Z]/g, "") + str.replace(/\./g, "");
console.log(result);
(I don't know what you want to do with non-., non-A-Z characters, so I've ignored them.)
If you really want to do it with a single call to replace (e.g., a single pass through the string matters), you can, but I'm fairly sure you'd have to use the function callback and state variables:
var str = "...A..B...C.";
var dots = "";
var nondots = "";
var result = str.replace(/\.|[A-Z]|$/g, function(m) {
    if (!m) {
        // Matched the end of input; return the
        // strings we've been building up
        return dots + nondots;
    }
    // Matched a dot or letter, add to relevant
    // string and return nothing
    if (m === ".") {
        dots += m;
    } else {
        nondots += m;
    }
    return "";
});
console.log(result);
That is, of course, incredibly ugly. :-)
I need to get rid of any text inside < and >, including the two delimiters themselves.
So for example, from string
<brev-y>th</brev-y><sw-ex>a</sw-ex><sl>t</sl>
I would like to get this one
that
This is what I've tried so far:
var str = annotation.split(' ');
str.substring(str.lastIndexOf("<") + 1, str.lastIndexOf(">"))
But it doesn't work for every < and >.
I'd rather not use RegEx if possible, but I'm happy to hear if it's the only option.
You can simply use the replace method with /<[^>]*>/g. It matches a <, followed by any number of characters that are not > ([^>]*), up to the closing >, and the g flag applies it to every tag in the string.
var str = '<brev-y>th</brev-y><sw-ex>a</sw-ex><sl>t</sl>';
str = str.replace(/<[^>]*>/g, "");
alert(str);
Using a RegExp is fine for removing the tags:
"<brev-y>th</brev-y><sw-ex>a</sw-ex><sl>t</sl>".replace(/<\/?[^>]+>/g, "")
Since the text you want always follows a > character, you could split the string at each > and keep whatever precedes the next < in every piece. For example (in Java):
String[] strings = stringName.split(">");
String word = "";
for (int i = 0; i < strings.length; i++) {
    // Keep only the text before the next tag opens
    int lt = strings[i].indexOf('<');
    word += (lt >= 0) ? strings[i].substring(0, lt) : strings[i];
}
I haven't tested this thoroughly, but I think it would work. You don't need to remove the text between the < and > explicitly: just collect the characters that sit between a > and the following <.
Using a regular expression is not the only option, but it's a pretty good option.
You can easily parse the string to remove the tags, for example by using a state machine where the < and > characters turn a state of ignoring characters on and off. There are other methods of course, some shorter, some more efficient, but they will all be a few lines of code, while a regular expression solution is just a single replace.
Example:
function removeHtml1(str) {
    return str.replace(/<[^>]*>/g, '');
}
function removeHtml2(str) {
    var result = '';
    var ignore = false;
    for (var i = 0; i < str.length; i++) {
        var c = str.charAt(i);
        switch (c) {
            case '<': ignore = true; break;
            case '>': ignore = false; break;
            default: if (!ignore) result += c;
        }
    }
    return result;
}
var s = "<brev-y>th</brev-y><sw-ex>a</sw-ex><sl>t</sl>";
console.log(removeHtml1(s));
console.log(removeHtml2(s));
There are several ways to do this, some better than others. I haven't written one lately for these two specific characters, so I took a minute and wrote some code that may work. Here is how it works. Create a function with a loop that copies an incoming string, character by character, to an outgoing string. Give the function a string return type so it returns the modified string. The loop scans the incoming string from index 0 while the index is less than string.length(). Within the loop, add an if statement: when it sees a "<" character in the incoming string, copying stops, but the loop keeps examining every character until it sees the ">" character. When the ">" is found, copying starts again. It's that simple.
The following code may need some refinement, but it should get you started on the method described above. It's not the fastest and not the most elegant but the basic idea is there. This did compile, and it ran correctly, here, with no errors. In my test program it produced the correct output. However, you may need to test it further in the context of your program.
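#include <string>
using std::string;

// Copies str1 to a new string, skipping everything from '<' through '>'.
string filter_on_brackets(string str1)
{
    string str2 = "";
    int copy_flag = 1; // 1 = copying, 0 = inside a tag, 2 = just saw '>'
    for (size_t i = 0; i < str1.length(); i++)
    {
        if (str1[i] == '<')
        {
            copy_flag = 0;
        }
        if (str1[i] == '>')
        {
            copy_flag = 2;
        }
        if (copy_flag == 1)
        {
            str2 += str1[i];
        }
        if (copy_flag == 2)
        {
            // Resume copying from the next character
            copy_flag = 1;
        }
    }
    return str2;
}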
I have a string like the following:
"[a,b,c],[d,e,f],[g,h,i]"
I was wondering how can I separate the string by ],[ in JavaScript. .split("],[") will remove the brackets. I want to preserve them.
Expected output:
["[a,b,c]","[d,e,f]","[g,h,i]"]
Edit:
Here is a more complicated case that I highlighted in a comment on @Leo's answer (wherein a ],[-delimited string contains ],):
"[dfs[dfs],dfs],[dfs,df,sdfs]]"
Expected output:
["[dfs[dfs],dfs]","[dfs,df,sdfs]]"]
Try this:
"[a,b,c],[d,e,f],[g,h,i]".match(/(\[[^\]]+\])/g)
// ["[a,b,c]", "[d,e,f]", "[g,h,i]"]
EDIT: For the OP's new case, here's the trick:
"[dfs[dfs],dfs],[dfs,df,sdfs]]".match(/(?!,\[).+?\](?=,\[|$)/g)
// ["[dfs[dfs],dfs]", "[dfs,df,sdfs]]"]
It works for even more complicated cases:
"[dfs[aa,[a],dfs],[dfs[dfs],dfs],[dfs,df,sdfs]]".match(/(?!,\[).+?\](?=,\[|$)/g)
// ["[dfs[aa,[a],dfs]", "[dfs[dfs],dfs]", "[dfs,df,sdfs]]"]
"[dfs[aa,[a],dfs],[dfs[dfs],dfs],[dfs,df,sdfs]],[dfs,df,sdfs]]".match(/(?!,\[).+?\](?=,\[|$)/g)
// ["[dfs[aa,[a],dfs]", "[dfs[dfs],dfs]", "[dfs,df,sdfs]]", "[dfs,df,sdfs]]"]
Below is my personal opinion
However, JavaScript's RegExp has historically lacked lookbehind ((?<=...), which is very handy for requirements like this; it was only added in ES2018), so a RegExp solution may become a maintainability nightmare. In this situation I'd suggest an approach like @alienchow's delimiter replacement: not as neat, but more maintainable.
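For engines that do support lookbehind (it was added to the language in ES2018), here is a minimal sketch of what a lookbehind-based split could look like. The helper name splitGroups is made up for illustration, and the pattern assumes the delimiter is always exactly a comma sitting between "]" and "[".

```javascript
// Hypothetical helper (not from the answers above): split on a comma
// only when it is preceded by "]" and followed by "[". The lookarounds
// consume nothing, so both brackets stay attached to their groups.
// Assumes an engine with lookbehind support (ES2018 or later).
function splitGroups(input) {
  return input.split(/(?<=\]),(?=\[)/);
}

console.log(splitGroups("[a,b,c],[d,e,f],[g,h,i]"));
// ["[a,b,c]", "[d,e,f]", "[g,h,i]"]
console.log(splitGroups("[dfs[dfs],dfs],[dfs,df,sdfs]]"));
// ["[dfs[dfs],dfs]", "[dfs,df,sdfs]]"]
```

Like the other approaches here, this still treats any "],[" sequence as a separator, so it is only suitable when that sequence never occurs inside a group.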
Personally I'd do
"[dfs[dfs],dfs],[dfs,df,sdfs]]".split("],[");
then loop through it to:
Append the first string with a "]".
Prepend the last string with a "[".
Prepend a "[" and append a "]" to all strings in between.
However, if you know what kind of strings and characters you will be receiving and you reaaaaally want a one-liner approach, you could try the hack below.
Replace all instances of "],[" with "]unlikely_string_or_special_unicode[", then split by "unlikely_string_or_special_unicode" - for example:
"[dfs[dfs],dfs],[dfs,df,sdfs]]".replace(/\],\[/g,"]~I_have_a_dream~[").split("~I_have_a_dream~");
Warning: Not 100% foolproof. If your input string contains the unlikely string you used as a delimiter, then it implodes and the universe comes to an end.
TMTOWTDI
I prefer doing this with a regex as @Leo explained, but another way to do it in the spirit of TMTOWTDI & completeness is with the map function following the split:
var test = "[a,b,c],[d,e,f],[g,h,i]";
var splitTest = test.split("],[").map(
    function(str) {
        if (str[0] !== '[') {
            str = '[' + str;
        }
        if (str[str.length - 1] !== ']') {
            str += ']';
        }
        return str;
    });
// FORNOW: to see the results
for (var i = 0; i < splitTest.length; i++) {
    alert(splitTest[i]);
}
Afterthought:
If you perchance have an empty pair of square brackets in your ],[-delimited string (i.e. "[a,b,c],[d,e,f],[],[g,h,i]" for example), this approach will preserve it too (as would changing @Leo's regex from /(\[[^\]]+\])/g to /(\[[^\]]*\])/g).
TMTOWTDI Redux
With the curveball that ] and [ may be within the ],[-delimited strings (per your comment on @Leo's answer), here is a rehash of my initial approach that is more robust:
var test = "[dfs[dfs],dfs],[dfs,df,sdfs]]";
var splitTest = test.split("],[").map(
    function(str, arrIndex, arr) {
        if (arrIndex !== 0) {
            str = '[' + str;
        }
        if (arrIndex !== arr.length - 1) {
            str += ']';
        }
        return str;
    });
// FORNOW: to see the results
for (var i = 0; i < splitTest.length; i++) {
    alert(splitTest[i]);
}
I'm trying to check if a string contains any of these words:
AB|AG|AS|Ltd|KB|University
My current code:
var acceptedwords = '/AB|AG|AS|Ltd|KB|University/g'
var str = 'Hello AB';
var matchAccepted = str.match(acceptedwords);
console.log(matchAccepted);
if (matchAccepted !== null) { // Contains the accepted word
    console.log("Contains accepted word: " + str);
} else {
    console.log("Does not contain accepted word: " + str);
}
But for some strange reason this does not match.
Any ideas what I'm doing wrong?
That's not the right way to define a regular expression literal in JavaScript: the quotes make it a plain string, not a regex.
Change
var acceptedwords = '/AB|AG|AS|Ltd|KB|University/g'
to
var acceptedwords = /AB|AG|AS|Ltd|KB|University/;
You might notice I removed the g flag: it's useless here, as you only want to know whether there's a match, not to get them all. You don't even have to use match here; you could use test:
var str = 'Hello AB';
if (/AB|AG|AS|Ltd|KB|University/.test(str)) { // Contains the accepted word
    console.log("Contains accepted word: " + str);
} else {
    console.log("Does not contain accepted word: " + str);
}
If you want to build a regex with strings, assuming none of them contains any special character, you could do
var words = ['AB','AG', ...
var regex = new RegExp(words.join('|'));
If your names may contain special characters, you'll have to use a function to escape them.
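As a rough sketch of such an escaping function: the name escapeRegExp and the sample words below are assumptions made for illustration, not something from the question. The helper puts a backslash in front of every regex metacharacter so each word is matched literally.

```javascript
// Hypothetical helper: escape regex metacharacters so that a plain
// string can be embedded safely in a pattern.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Illustrative word list containing special characters (an assumption).
var specialWords = ['AB', 'A.G', 'C++'];
var escapedRegex = new RegExp(specialWords.map(escapeRegExp).join('|'));

console.log(escapedRegex.test('I like C++')); // true
console.log(escapedRegex.test('AxG'));        // false: the dot is literal
```

Without the escaping, 'C++' would throw a SyntaxError in the RegExp constructor and 'A.G' would also match 'AxG'.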
If you don't want your words to match as parts of other words (meaning you don't want to match "ABC"), then you should check for word boundaries:
regex = new RegExp(words.map(function(w){ return '\\b'+w+'\\b' }).join('|'),'g');
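To see the effect of the word boundaries, here is a small sketch using the word list from the question. The variable names are chosen for illustration, and the g flag is left out on purpose because this regex is only used with test.

```javascript
// Word list taken from the question.
var acceptedWords = ['AB', 'AG', 'AS', 'Ltd', 'KB', 'University'];

// Wrap each word in \b so it only matches as a whole word.
var boundedRegex = new RegExp(
  acceptedWords.map(function (w) { return '\\b' + w + '\\b'; }).join('|')
);

console.log(boundedRegex.test('Hello AB'));  // true
console.log(boundedRegex.test('Hello ABC')); // false: no boundary after "AB"
console.log(boundedRegex.test('Acme Ltd.')); // true: "." is a non-word character
```

With the g flag, a regex object remembers lastIndex between calls to test, so repeated calls on the same object can give surprising results. Omitting the flag avoids that when all you need is a yes/no answer.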
I have a problem I'm trying to solve. I have a JavaScript string (yes, this is the string I have):
<div class="stories-title" onclick="fun(4,'this is test'); navigate(1)
What i want to achieve are the following points:
1) cut characters from start until the first ' character (cut the ' too)
2) cut characters from second ' character until the end of the string
3) put what's remaining in a variable
For example, the result of this example would be the string "this is test"
I would be very grateful if anyone has a solution, especially a simple one so I can understand it.
Thanks all in advance
You can use split() function:
var mystr = str.split("'")[1];
var newstr = str.replace(/[^']+'([^']+).*/,'$1');
No need to cut anything, you just want to match the string between the first ' and the second ' - see similar questions like Javascript RegExp to find all occurences of a a quoted word in an array
var string = "<div class=\"stories-title\" onclick=\"fun(4,'this is test'); navigate(1)";
var m = string.match(/'(.+?)'/);
// m[1] is the captured group; null if there was no quoted text
var result = m ? m[1] : null;
You can use regular expressions
/\'(.+)\'/
http://rubular.com/r/RcVmejJOmU
http://www.regular-expressions.info/javascript.html
If you want to do the work yourself:
var str = "<div class=\"stories-title\" onclick=\"fun(4,'this is test'); navigate(1)";
var newstr = "";
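for (var i = 0; i < str.length; i++) {
    if (str[i] == '\'') {
        // Copy characters up to the closing quote; the length check
        // prevents an endless loop if the closing quote is missing
        while (++i < str.length && str[i] != '\'') {
            newstr += str[i];
        }
        break;
    }
}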