JavaScript - Regular expressions: inconsistent match

I am going crazy with this problem. I am trying to extract "Latin" sentences from the following string (shortened from the original):
My Name is Yoann
ホームインサイト最新のインサイト運用チームからの最新のマーケ
ットアップデート、運用アップデートマーケットアップデート市場環境情報に関す
るレポート投資アップデート投資環境情報レポート及び出版物運用戦略運用戦略
ファーストステート・スチュワートアジア・パシフィック、グローバル・エマージング・マーケット、ワールドワイド株式
Hello World
Here is the regular expression :
...
//text=My Name is Yoann...
pattern = new RegExp("([A-Za-z]+[\s]?[A-Za-z])+", "g");
results= text.match(pattern);
When I run that, I get :
//results[0]="My"
//results[1]="Name"
//results[2]="is"
//results[3]="Yoann"
//...
The words are split on spaces. When I try the same pattern on the http://regexlib.com tester (JS client engine) or on http://regexpal.com, the results are:
//results[0]="My Name is Yoann"
//results[1]="Hello World"
So I don't understand what I am doing wrong in my code, or why I don't get the same results.
Thanks for your help.
Yoann

> "foo bar".match(new RegExp("[a-z]\s[a-z]"))
null
> "foo bar".match(new RegExp("[a-z]\\s[a-z]"))
["o b"]
When using the RegExp constructor, double your backslashes: in a string literal "\s" is just "s", so the regex engine never sees the whitespace class. Better yet, use regex literals, e.g.
/[A-Z][A-Z\s]*[A-Z]|[A-Z]/gi
http://jsfiddle.net/4PdJh/1
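To make the difference concrete, here is a minimal sketch that runs the question's pattern three ways: with the single backslash, with the doubled backslash, and as the regex literal from this answer. The sample text is shortened, and the variable names (broken, fixed, literal) are only illustrative.

```javascript
// Two Latin sentences separated by Japanese text (shortened from the question).
const text = "My Name is Yoann\nホームインサイト最新のインサイト\nHello World";

// In a string literal "\s" collapses to "s", so this pattern contains [s], not [\s].
const broken = new RegExp("([A-Za-z]+[\s]?[A-Za-z])+", "g");

// Doubling the backslash passes a real \s through to the regex engine.
const fixed = new RegExp("([A-Za-z]+[\\s]?[A-Za-z])+", "g");

// A regex literal needs no doubling at all.
const literal = /[A-Z][A-Z\s]*[A-Z]|[A-Z]/gi;

const brokenResults = text.match(broken);   // one entry per word
const fixedResults = text.match(fixed);     // one entry per sentence
const literalResults = text.match(literal); // one entry per sentence

console.log(brokenResults, fixedResults, literalResults);
```

The online testers take the pattern as typed, with no string-literal step in between, which is presumably why they returned whole sentences.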

Related

Or Condition in a regular expression

I have a string in which I need to get the value between either "[ValueToBeFetched]" or "{ValueToBeFetched}".
var test = "I am \"{now}\" doing \"[well]\"";
test.match(/"\[.*?]\"/g)
the above regex serves the purpose and gets the value between square brackets and I can use the same for curly brackets also.
test.match(/"\{.*?}\"/g)
Is there a way to do this with a single regex, using something like an "or" operator ({ or [)?
I tried some scenarios but they don't seem to work.
Thanks in advance.
You could try following regex:
(?:{|\[).*?(?:}|\])
Details:
(?:{|\[): Non-capturing group, gets character { or [
.*?: gets as few as possible
(?:}|\]): Non-capturing group, gets character } or ]
Code in JavaScript:
var test = "I am \"{now}\" doing \"[well]\"";
var result = test.match(/"(?:{|\[).*?(?:}|\])"/g);
console.log(result);
Result (the double quotes are part of each match):
["\"{now}\"", "\"[well]\""]
As you said, there is an or operator which is |:
[Edited as suggested] Let's catch all sentences that begin with an "a" or a "b":
/^((a|b).*)/gm
In this example, if the parsed line begins with a or b, the entire sentence is captured in the first group.
You may test your regex with an online regex tester.
For your particular case, try something like the following, and use the online regex tester I mentioned above to understand how it works:
((\[|\{)\w*(\]|\}))
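The patterns above accept any opening bracket with any closing bracket, and they return the delimiters along with the value. As a sketch of one alternative (not taken from the answers), the version below uses one alternative per bracket type, so only matching pairs are accepted, and capture groups return just the inner value. The names pairPattern and values are illustrative.

```javascript
const test = "I am \"{now}\" doing \"[well]\"";

// One alternative per bracket type, so "{x]" is not accepted.
// Group 1 holds the value for {...}, group 2 the value for [...].
const pairPattern = /"(?:\{([^}]*)\}|\[([^\]]*)\])"/g;

// matchAll yields one match object per hit; pick whichever group participated.
const values = Array.from(test.matchAll(pairPattern), (m) => m[1] ?? m[2]);

console.log(values); // ["now", "well"]
```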

How to convert string from PHP to javascript regular expression?

This is my string converted into javascript object.
{"text" : "Must consist of alphabetical characters and spaces only", regexp:"/^[a-z\\s]+$/i"}
I need regexp to use it for validation but it won’t work because of the double quotes and \s escape sequence.
To make it work the value of regexp must be {"text" : "Must consist of alphabetical characters and spaces only", regexp : /^[a-z\s]+$/i}.
I also tried new RegExp(object.regexp) and every other way I could think of, but with no luck at all.
Any help is appreciated!
Try splitting out the part that you want before passing it to the RegExp constructor:
var regexVariable = new RegExp(object.regexp.split("/")[1]);
That will trim off the string representation of the regex "boundaries", as well as the "i" flag, and leave you with just the "guts" of the regex.
Pushing the result of that to the console results in the following regex: /^[a-z\s]+$/
Edit:
Not sure if you want to "read" the case insensitivity from the value in the object or not, but, if you do, you can expand the use of the split a little more to get any flags included automatically:
var aRegexParts = object.regexp.split("/");
var regexVariable = new RegExp(aRegexParts[1], aRegexParts[2]);
Logging that in the console results in the first regex that I posted, but with the addition of the "i" flag: /^[a-z\s]+$/i
Borrowing the example @RoryMcCrossan made, you can use a regular expression to parse your regular expression.
var object = {
"text": "Must consist of alphabetical characters and spaces only",
"regexp": "/^[a-z\\s]+$/i"
}
// parse out the main regex and any additional flags.
var extracted_regex = object.regexp.match(/\/(.*?)\/([ig]+)?/);
var re = new RegExp(extracted_regex[1], extracted_regex[2]);
// don't use document.write in production! this is just so that it's
// easier to see the values in stackoverflow's editor.
document.write('<b>regular expression:</b> ' + re + '<br>');
document.write('<b>string:</b> ' + object.text + '<br>');
document.write('<b>evaluation:</b> ' + re.test(object.text));
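Both approaches above assume the pattern itself contains no "/" character. As a sketch under that caveat, the hypothetical helper below (parseRegexString is not part of any library) cuts the string at its first and last slash instead, so an escaped slash inside the pattern survives and any flags are kept.

```javascript
// Hypothetical helper: turns a string such as "/^[a-z\\s]+$/i" into a RegExp.
function parseRegexString(source) {
  const lastSlash = source.lastIndexOf("/");
  if (!source.startsWith("/") || lastSlash <= 0) {
    throw new SyntaxError("Expected a string of the form /pattern/flags");
  }
  const pattern = source.slice(1, lastSlash); // everything between the outer slashes
  const flags = source.slice(lastSlash + 1);  // may be an empty string
  return new RegExp(pattern, flags);
}

const rule = {
  text: "Must consist of alphabetical characters and spaces only",
  regexp: "/^[a-z\\s]+$/i"
};

const parsed = parseRegexString(rule.regexp);
console.log(parsed.test(rule.text)); // true

// A pattern containing an escaped slash is still parsed as one piece.
const fraction = parseRegexString("/^\\d+\\/\\d+$/");
console.log(fraction.test("3/4")); // true
```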
I have not used regex much in JavaScript, but the regular expression itself should look something like:
"^([a-zA-Z]|\s)*$"
Without a case-insensitive flag, [a-z] only matches lowercase characters, hence the explicit a-zA-Z range.
Hope this helps, even if it's just a little (I would have added this as a comment instead of an answer, but that needs 50 rep).

Splitting string by {0}, {1}...{n} and replace with empty string

I have the following String:
var text = "Hello world! My name is {0}. How {1} can you be?"
I want to find each {n} and replace it with an empty string. I'm totally useless with regex and tried this:
text = text.split("/^\{\d+\}$/").join("");
I'm sure this has an easy answer, and one probably already exists on SO, but I'm not sure what to search for. I'm not even sure what the "{" characters are called in English.
Please (if possible) maintain the use of "split" and "join".
Thanks!
You could achieve this through string.replace function.
string.replace(/\{\d+\}/g, "")
Example:
> var text = "Hello world! My name is {0}. How {1} can you be?"
undefined
> text.replace(/\{\d+\}/g, "")
'Hello world! My name is . How  can you be?'
The same works through string.split; you just need to remove the anchors. ^ asserts that we are at the start of the string and $ asserts that we are at the end. Because {num} is not the only thing in the string, your regex fails to match. Also remove the quotes around the regex delimiters /, otherwise split receives a plain string instead of a regex.
> text.split(/\{\d+\}/).join("");
'Hello world! My name is . How  can you be?'
You were close. What you want is: text = text.split(/\{\d+\}/).join("");
The two things that you missed were:
1. ^ means the start of the string and $ means the end of the string. Since there are other characters around the pattern that you are trying to match, you don't want those anchors.
2. If you are using a regular expression in the split() method, you need to pass a RegExp object, not a string. Removing the double quotes and just having the expression start and end with / defines a RegExp object.
I would actually agree with the others that the replace() method would be a better way to do what you are trying to accomplish, but if you want to use split() and join(), as you've stated, this is the change that you need.
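The two points above can be checked side by side. This is a small sketch using the question's string; the variable names are illustrative.

```javascript
const text = "Hello world! My name is {0}. How {1} can you be?";

// A string separator is searched for literally. This text never occurs
// in the input, so split returns the whole string as a single element.
const literalSplit = text.split("/^\{\d+\}$/");

// A RegExp without anchors matches every {n} wherever it appears.
const parts = text.split(/\{\d+\}/);
const joined = parts.join("");

// replace with the g flag gives the same result in one step.
const replaced = text.replace(/\{\d+\}/g, "");

console.log(literalSplit.length); // 1
console.log(parts);               // ["Hello world! My name is ", ". How ", " can you be?"]
console.log(joined === replaced); // true
```

Note that removing {1} leaves the spaces on both sides of it, so the result contains two consecutive spaces after "How".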
You can just use the replace method:
var repl = text.replace(/\{\d+\} */g, '');
//=> "Hello world! My name is . How can you be?"

Replace string using regular expression at specific position dynamically set

I want to use regular expression to replace a string from the matching pattern string.
Here is my string :
"this is just a simple text. this is just a simple text. this is just a simple text. this is just a simple text. How are you man today. I Have been working on this."
Now I want to replace "just a simple" with, say, "hello", but only in the third occurrence. Can anyone guide me through this? The twist is that the string above is dynamic: the user can modify or change the text.
So how can I handle the case where the user adds "this is just a simple text" one or more times at the start, or anywhere before the third occurrence, which shifts the position of my replacement?
Sorry if I am unclear; any guidance, help, or alternative method will be appreciated.
You can use this regex:
(?:.*?(just a simple)){3}
You can use replace with a dynamically built regular expression and a callback in which you count the occurrences of the searched pattern :
var s = "this is just a simple text. this is just a simple text. this is just a simple text. this is just a simple text. How are you man today. I Have been working on this.",
pattern = "just a simple",
escapedPattern = pattern.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'),
i=0;
s = s.replace(new RegExp(escapedPattern,'g'), function(t){ return ++i===3 ? "hello" : t });
Note that I used this related QA to escape any "special" characters in the pattern.
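The counting callback above can be wrapped into a reusable function. The sketch below is one way to do that; replaceNth and its parameter names are hypothetical, and the escaping helper follows the same idea as the escape step in the answer. Because occurrences are counted at call time, it keeps working when the user edits the text.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: replace only the n-th (1-based) occurrence of `search`.
function replaceNth(text, search, n, replacement) {
  let count = 0;
  const pattern = new RegExp(escapeRegExp(search), "g");
  // The callback runs once per match; only the n-th match is swapped out.
  return text.replace(pattern, (match) => (++count === n ? replacement : match));
}

const input = "this is just a simple text. this is just a simple text. " +
  "this is just a simple text. this is just a simple text.";
const output = replaceNth(input, "just a simple", 3, "hello");
console.log(output);
```

If the text contains fewer than n occurrences, the string is returned unchanged.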
Try
$(selector)
.data("r", ["simple text", 3, "hello"])
.text(function (i, o) {
var r = $(this).data("r");
return o.replace(new RegExp(r[0], "g"), function (m) {
++i;
return (i === r[1]) ? r[2] : m
})
}).data("r", []);
jsfiddle http://jsfiddle.net/guest271314/2fm91qox/
See
Find and replace nth occurrence of [bracketed] expression in string
Replacing the nth instance of a regex match in Javascript
JavaScript: how can I replace only Nth match in the string?

RegExp using regex + variables

I know there are many questions about variables in regex. Some of them for instance:
concat variable in regexp pattern
Variables in regexp
How to properly escape characters in regexp
Matching string using variable in regular expression with $ and ^
Unfortunately none of them explains in detail how to escape my RegExp.
Let's say I want to find all files that have this string before them:
file:///storage/sdcard0/
I tried this with regex:
(?:file:\/\/\/storage\/sdcard0\/(.*))(?:\"|\')
which correctly got my image1.jpg and image2.jpg in certain json file. (tried with http://regex101.com/#javascript)
For the life of me I can't get this to work inside JS. I know you should use RegExp to solve this, but I'm having issues.
var findStr = "file:///storage/sdcard0/";
var regex = "(?:"+ findStr +"(.*))(?:\"|\')";
var re = new RegExp(regex,"g");
var result = <mySearchStringVariable>.match(re);
With this I get one result, and it is wrong (a bunch of text). I reckon I should escape this, as is said all over the web. I tried to escape findStr with both functions below and the result was the same, so I thought I needed to escape some characters inside the regex as well.
I tried to manually escape them and the result was no matches.
I tried to escape the whole regex variable before passing it to RegExp constructor and the result was the same: no matches.
function quote(regex) {
return regex.replace(/([()[{*+.$^\\|?])/g, '\\$1');
}
function escapeRegExp(str) {
return str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
}
What the hell am I doing wrong, please?
Is there any good documentation on how to write RegExp with variables in it?
All I needed to do was use a lazy quantifier instead of a greedy one:
var regex = "(?:"+ findStr +"(.*?))(?:\"|\')"; // added ? in (.*?)
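To illustrate the greedy versus lazy difference with a variable in the pattern, here is a minimal sketch. The JSON-like sample string and the names greedyFiles and lazyFiles are assumptions made for the example. Escaping findStr is still good practice for arbitrary input, although this particular prefix contains no characters that are special inside a RegExp built from a string.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const findStr = "file:///storage/sdcard0/";
// Assumed sample input with two file references on one line.
const json = '{"a":"file:///storage/sdcard0/image1.jpg","b":"file:///storage/sdcard0/image2.jpg"}';

// Greedy: (.*) runs to the last quote on the line, swallowing both entries.
const greedy = new RegExp(escapeRegExp(findStr) + "(.*)[\"']", "g");
// Lazy: (.*?) stops at the first quote after each prefix.
const lazy = new RegExp(escapeRegExp(findStr) + "(.*?)[\"']", "g");

// Collect only the captured file names (group 1) from every match.
const greedyFiles = Array.from(json.matchAll(greedy), (m) => m[1]);
const lazyFiles = Array.from(json.matchAll(lazy), (m) => m[1]);

console.log(greedyFiles.length); // 1
console.log(lazyFiles);          // ["image1.jpg", "image2.jpg"]
```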
