Regular expression for JavaScript - match text that begins with a specific word and continues until a certain character - javascript

I need a regular expression that extracts a list of substrings of a particular form from a string. Example input:
str = 'this is extra data which i do not need /type/123456/weqweqweqweqw/ these are more extra data which i dont need /'
Result needed:
/type/123456/weqweqweqweqw/
Here the /type/ prefix is constant, the rest (i.e. 123456/weqweqweqweqw) is dynamic, and the match always ends with /.
I tried:
var myRe = /\/type\/(.*)\//g
But this matches everything from /type/ to the last / in the string (here, the end of the string), because .* is greedy.

Instead of repeating ., which will match anything, repeat anything but a space via \S+, so that only the URL part of the string will be matched:
const str = 'this is extra data which i do not need /type/123456/weqweqweqweqw/ these are more extra data which i dont need /';
console.log(str.match(/\/type\S+/));

It's tagged Python, so here is a solution:
import re
re.search(r"/type/[^/]*/[^/]*/",str)
Out: <_sre.SRE_Match object; span=(39, 66), match='/type/123456/weqweqweqweqw/'>
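For completeness, the idea behind the Python answer can be sketched in JavaScript as well. This is only a sketch: it assumes the wanted part always consists of exactly two slash-delimited segments after /type/, and the variable names are illustrative.

```javascript
const str = 'this is extra data which i do not need /type/123456/weqweqweqweqw/ these are more extra data which i dont need /';

// [^\/]+ matches one or more characters that are not a slash, so each
// segment stops at the next "/" instead of running to the end of the string.
const pattern = /\/type\/[^\/]+\/[^\/]+\//g;

// With the g flag, match() returns every occurrence (or null if there is none).
const matches = str.match(pattern) || [];
console.log(matches); // [ '/type/123456/weqweqweqweqw/' ]
```

Because the g flag is set, the result is already a list, which fits the requirement of collecting several such substrings from one input.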

Related

Extracting a complicated part of the string with plain Javascript

I have a following string:
https://my.domain.com/personal/jan_kowalski_pl_company_com/Documents/Forms/All.aspx
Using JavaScript, I want to extract 'pl' or 'pl_company_com' from this string.
A few parts vary:
jan_kowalski is a first name and surname; it can change, and sometimes has 3 elements
the country code (in this example 'pl') can change to others such as en / de / fr (this is the part of the string I want to get)
the rest of the string stays the same in every case (the beginning and everything from _company_com onward)
P.S. I tried to do it with split, but my knowledge of JS is very basic and I can't get what I want. Please help.
An alternative to Randy Casburn's solution using regex
let out = new URL('https://my.domain.com/personal/jan_kowalski_pl_company_com/Documents/Forms/All.aspx').href.match('.*_(.*_company_com)')[1];
console.log(out);
Or, if you only want to match the country codes you specified:
let out = new URL('https://my.domain.com/personal/jan_kowalski_pl_company_com/Documents/Forms/All.aspx').href.match('.*_((en|de|fr|pl)_company_com)')[1];
console.log(out);
A proof of concept that this solution also works for other combinations
let urls = [
new URL('https://my.domain.com/personal/jan_kowalski_pl_company_com/Documents/Forms/All.aspx'),
new URL('https://my.domain.com/personal/firstname_middlename_lastname_pl_company_com/Documents/Forms/All.aspx')
]
urls.forEach(url => console.log(url.href.match('.*_(en|de|fr|pl).*')[1]))
I have had good results with regular expressions for this kind of problem:
var string = 'https://my.domain.com/personal/jan_kowalski_pl_company_com/Documents/Forms/All.aspx';
var regExp = /([\w]{2})_company_com/;
var find = string.match(regExp);
console.log(find); // array with the full match and its capture groups
console.log(find[1]); // first group of regExp = country code
First you have your given string. Second you have a regular expression literal, which is delimited by a slash at the beginning and at the end. Regular expressions are mostly used for string searches (most major editors also support them for search and replace, which can be VERY useful).
In this case it matches exactly two word characters, [\w]{2}, followed directly by _company_com (\w denotes a word character, that is a letter, digit, or underscore; the [] form a character class listing the allowed characters, here only word characters; and the {} give the number of repetitions). To find the wanted part, call string.match(regExp). For a regular expression without the g flag it returns an array holding the whole matched string followed by the contents of each capture group (denoted by ()), or null if nothing matches. So in this case you get the country code with find[1], the first and only capture group of the regular expression.
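The explanation above can be wrapped into a small function that also handles the no-match case. This is a minimal sketch: the helper name extractCountryCode is hypothetical, the second and third URLs are made-up test inputs, and the pattern assumes the country code is always two lowercase letters directly before _company_com.

```javascript
// Hypothetical helper: returns the two-letter code before "_company_com",
// or null when the URL does not contain such a part.
function extractCountryCode(url) {
  // The leading "_" ensures the two letters form a whole segment of the name.
  const match = url.match(/_([a-z]{2})_company_com/);
  return match ? match[1] : null;
}

const urls = [
  'https://my.domain.com/personal/jan_kowalski_pl_company_com/Documents/Forms/All.aspx',
  'https://my.domain.com/personal/firstname_middlename_lastname_de_company_com/Documents/Forms/All.aspx',
  'https://my.domain.com/personal/no_match_here/Documents/Forms/All.aspx',
];

const codes = urls.map(extractCountryCode);
console.log(codes); // [ 'pl', 'de', null ]
```

Checking for null before reading match[1] avoids a TypeError when the input does not follow the expected shape.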

javascript regex insert new element into expression

I am passing a URL to a block of code in which I need to insert a new element into the regex. Pretty sure the regex is valid and the code seems right but no matter what I can't seem to execute the match for regex!
//** Incoming url's
//** url e.g. api/223344
//** api/11aa/page/2017
//** Need to match to the following
//** dir/api/12ab/page/1999
//** Hence the need to add dir at the front
var url = req.url;
//** pass in: /^\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/
var re = myregex.toString();
//** Insert dir into regex: /^dir\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/
var regVar = re.substr(0, 2) + 'dir' + re.substr(2);
var matchedData = url.match(regVar);
matchedData === null ? console.log('NO') : console.log('Yay');
I hope I am just missing the obvious but can anyone see why I can't match and always returns NO?
Thanks
Let's break down your regex:
^\/api\/ matches the beginning of the string, followed by exactly the string "/api/"
([a-zA-Z0-9-_~ %]+) is a capturing group: it captures one or more (the +) of the characters listed inside the brackets, so for example this section will match abAB25-_ %
(?:\/page\/([a-zA-Z0-9-_~ %]+)) also groups multiple tokens together, but does not create a capturing group like the one above (the ?: makes it non-capturing). It first matches exactly the string "/page/", followed by a capturing group like the one described in the previous paragraph (matching a-z, A-Z, 0-9, etc.).
?$ is at the end: the ? makes the preceding group optional (zero or one occurrence), and the $ matches the end of the string
This regex will match this string, for example: /api/abAB25-_ %/page/abAB25-_ %
You may be able to take advantage of capturing groups, however, and use something like this instead to get similar results: ^\/api\/([a-zA-Z0-9-_~ %]+)\/page\/\1?$. Here, \1 is a backreference to the first capturing group and matches exactly the same text that group matched. EDIT: actually, this probably won't work, since the text after /api/ and the text after /page/ will most likely be different. Carrying on...
Afterwards, you are adding "dir" to the beginning of the pattern, so that it can match something like this: dir/api/abAB25-_ %/page/abAB25-_ %
However, you have also converted the regex to a string with toString(), so, as Crayon Violent pointed out in their comment, this breaks the expected functionality: the string still contains the delimiting slashes, and url.match() then treats them as literal characters to match. You can fix this by building the new pattern from .source, which holds the pattern text without the delimiters: var re = myregex.source; var regVar = re.substr(0, 1) + 'dir' + re.substr(1); var matchedData = url.match(regVar); https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/source
Now you can properly match a string like this: dir/api/11aa/page/2017. See this example: https://repl.it/Mj8h
As mentioned by Crayon Violent in the comments, it seems you're passing a string rather than a regular expression to .match(). Maybe try the following:
url.match(new RegExp(regVar, "i"));
to convert the string to a regular expression explicitly. The "i" flag ignores case; I don't know whether that's what you want. Note that regVar must not contain the leading and trailing / that toString() adds, otherwise they are matched literally. Learn more here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
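Putting both answers together, one possible way to insert dir into an existing regex is to work on its .source and rebuild a RegExp from it. This is a sketch under assumptions: the helper name withPrefix is hypothetical, the original pattern is assumed to start with ^, and the prefix is assumed to contain no regex metacharacters.

```javascript
// Hypothetical helper: builds a new RegExp that requires `prefix`
// right after the ^ anchor of an existing anchored regex.
function withPrefix(regex, prefix) {
  // .source is the pattern text without the delimiting slashes or flags.
  const source = regex.source;
  if (!source.startsWith('^')) {
    throw new Error('expected a pattern anchored with ^');
  }
  // Keep the anchor, insert the prefix, then append the rest of the pattern.
  return new RegExp('^' + prefix + source.slice(1), regex.flags);
}

const apiRegex = /^\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/;
const dirRegex = withPrefix(apiRegex, 'dir');

const matchedData = 'dir/api/11aa/page/2017'.match(dirRegex);
console.log(matchedData === null ? 'NO' : 'Yay'); // Yay
console.log(matchedData[1], matchedData[2]);      // 11aa 2017
```

Passing regex.flags through keeps any flags of the original expression, which toString() plus string slicing would otherwise lose or mangle.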

Parse JSON but preserve \n in strings

I have this JSON string:
{\"text\":\"Line 1\\nLine 2\",\"color\":\"black\"}
I can parse it when I do this:
pg = JSON.parse(myJSONString.replace(/\\/g, ""));
But when I access pg.text the value is:
Line 1nLine 2.
But I want the value to be exactly:
Line 1\nLine 2
The JSON string is valid as far as the target program is concerned, which interprets it as part of a larger command. It's Minecraft, actually. Minecraft will render this as you would expect, with Line 1 and Line 2 on separate lines.
But I'm making an editor that needs to read the \n back in as is, to be displayed in an HTML input field.
For context, here is the full command, which contains some JSON code.
/summon zombie ~ ~1 ~ {HandItems:[{id:"minecraft:written_book",Count:1b,tag:{title:"",author:"",pages:["{\"text\":\"Line 1\\nLine 2\",\"color\":\"black\"}"]}},{}]}
You could try adding [1], as in /\\[1]/g, but note that [1] is a character class matching the digit 1, not a quantifier, so that pattern only removes a backslash that is followed by 1. More importantly, when the quoted JSON is written as a JavaScript string literal, the literal itself already turns \" into " and \\ into \, so you don't need replace at all, and the \n escape is still there for JSON.parse to handle.
var myString ='{\"text\":\"Line 1\\nLine 2\",\"color\":\"black\"}';
console.log(JSON.parse(myString.replace(/\\[1]/g, ""))); // the replace is a no-op here: no backslash is followed by 1
var parsed = JSON.parse(myString);
console.log(parsed.text); // the \n escape has been parsed into a real line break
Your string is not valid JSON, and ideally you should fix the code that generates it, or contact the provider of it.
If the issue is that there is always one backslash too many, then you could do this:
// Need to escape the backslashes in this string literal to get the actual input:
var myJSONString = '{\\"text\\":\\"Line 1\\\\nLine 2\\",\\"color\\":\\"black\\"}';
console.log(myJSONString);
// Only replace backslashes that are not preceded by another:
var fixedJSON = myJSONString.replace(/([^\\])\\/g, "$1");
console.log(fixedJSON);
var pg = JSON.parse(fixedJSON);
console.log(pg);
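Since the goal is to show the text with a visible \n in an input field, a possible alternative is to unescape the text once as a JSON string, parse it, and then re-escape line breaks for display. This is only a sketch: it assumes the raw text is escaped exactly the way the contents of a JSON string literal would be (Minecraft's own escaping rules may differ in other cases), and the variable names are illustrative.

```javascript
// The raw text exactly as it appears inside the command.
// String.raw keeps every backslash as written.
const raw = String.raw`{\"text\":\"Line 1\\nLine 2\",\"color\":\"black\"}`;

// Step 1: treat the raw text as the contents of a JSON string literal
// and unescape it once (\" becomes ", \\ becomes \).
const innerJson = JSON.parse('"' + raw + '"');

// Step 2: parse the unescaped text as JSON; the \n escape becomes a real line break.
const pg = JSON.parse(innerJson);

// Step 3: for display, turn each line break back into the two characters \ and n.
const displayText = pg.text.replace(/\n/g, '\\n');
console.log(displayText); // Line 1\nLine 2
```

When saving the edited value back, the reverse replacement (the two characters \ and n back to a line break) followed by JSON.stringify would restore the escaped form.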

string replace using a regex

I have a string produced by JSON.stringify in JavaScript (Node). I want to replace every piece of text that starts with 'ab' followed by some digits (at least one) with 'ab^^^^^^', where the number of '^' characters equals the number of digits after ab. Text starting with ab occurs at least once; in this example it occurs twice. I need help with the regex and with replacing the string.
The string, in which text starting with ab occurs twice:
var str = JSON.stringify({"abc":{"idcardno":"ertyuiop","form":{"somestring":"This string:\n- can have multiple \nab12345ab5677\n","flag":"true","flag2":"false"},"anothertext":"samplestring","numbetstr":"7"}});
After the regex replace it should look like this:
{"abc":{"idcardno":"ertyuiop","form":{"somestring":"This string:\n- can have multiple \nab^^^^^ab^^^^\n","flag":"true","flag2":"false"},"anothertext":"samplestring","numbetstr":"7"}}
Edit
As per the answer below, the following is the content of obj.abc.form.somestring, which spans multiple lines. How do I apply the regex replace described above to this object?
This string:
- can have multiple
ab12345ab56778
Don't process stringified JSON with a regexp. Process the JavaScript object itself, then stringify. In your case, assuming obj is the input:
obj.abc.form.somestring = transform(obj.abc.form.somestring);
str = JSON.stringify(obj);
where transform is a regexp/replace making the transformation you want.
@torazaburo is right; it's bad practice to manipulate JSON directly. Once you get hold of the string in obj.abc.form.somestring, though, you can use replace, passing a function:
str.replace(/ab\d+/g, function(match) {return match.replace(/\d/g,'^')})
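The two answers combine into the following sketch: transform the property on the object first, then stringify. The function name maskDigitsAfterAb is hypothetical, and the object is the one from the question.

```javascript
// Hypothetical helper: replaces the digits after each "ab" with the same number of "^".
function maskDigitsAfterAb(text) {
  // match is "ab" plus its digits, so match.length - 2 is the digit count.
  return text.replace(/ab\d+/g, (match) => 'ab' + '^'.repeat(match.length - 2));
}

const obj = {
  abc: {
    idcardno: 'ertyuiop',
    form: {
      somestring: 'This string:\n- can have multiple \nab12345ab5677\n',
      flag: 'true',
      flag2: 'false',
    },
    anothertext: 'samplestring',
    numbetstr: '7',
  },
};

// Work on the real string value, then serialize the whole object.
obj.abc.form.somestring = maskDigitsAfterAb(obj.abc.form.somestring);
const str = JSON.stringify(obj);
console.log(str);
```

Working on the object avoids accidentally touching keys or other values that happen to contain ab followed by digits, and the line breaks in the multi-line value need no special handling.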

Javascript:Replace single characters after the string

I'm trying to do something which seems fairly basic, but can't seem to get it working.
I'm trying to strip the characters after the last instance of an underscore.
I have this long Query String:
json_data=demo_title=Demo+title&proc1_script=script.sh+parameters&proc1_chk_make=on&outputp2_value=&demo_input_description=hola+mundo&outputp4_visible=on&outputp4_info=&inputdata1_max_pixels=1024000&tag=&outputp1_id=nanana&proc1_src_compresion=zip&proc1_chk_cmake=off&outputp3_description=&outputp3_value=&inputdata1_description=input+data+description&inputp2_description=bien%3F&inputp3_description=funciona&proc1_cmake=-D+CMAKE_BUILD_TYPE%3Astring%3DRelease+&outputp2_visible=on&outputp3_visible=on&outputp1_type=header&inputp1_type=text&demo_params_description=va+bien&outputp1_description=&inputdata1_type=image2d&proc1_chk_script=off&demo_result_description=win%3F&outputp2_id=nanfdsvfa&inputp1_description=funciona&demo_wait_description=boh&outputp4_description=&inputp2_type=integer&inputp2_id=papapa&outputp1_value=&outputp3_id=nananartrtrt&inputp3_id=pepepe&outputp3_type=header&inputp3_visible=+off&outputp1_visible=on&inputdata1_id=id_lsd&outputp4_value=&inputp2_visible=on&proc1_source=lsd-1.5.zip&inputp3_value=si&proc1_make=-j4+-C+&images_config_file=cfgmydemo.cfg&outputp2_type=header&proc1_subdir=xxx-1.5&proc1_url=http%3A%2F%2Fwww.ipol.im%2Fpub%2Falgo%2F...&inputdata1_image_depth=1x8i&inputp1_id=popopo&inputp1_value=si&inputp2_value=no&demo_data_filename=data_saved.cfg&inputdata1_info=info_lsd&outputp3_info=&inputdata1_image_format=.pgm&outputp1_info=&inputdata1_compress=False&inputp1_visible=on&proc1_id=lsd&outputp4_id=nana&outputp2_description=&outputp4_type=header&outputp2_info=&inputp3_type=float&&tag&inputp4_iddcksmdclk&inputp4_typetext&inputp4_descriptionkldmsclk&inputp4_valueklcdmkl&inputp4_infoclkdmscdl
Now I replace the separator = with %24+ and the separator & with +%23+, using for example fr=fr.replace(/\&/g,"+%23+");
Separator:
javascript    Mako
=             %24+
&             +%23+
But the result is:
json_data%24+demo_title%24+Demo+title+%23+proc1_script%24+script.sh+parameters+%23+proc1_chk_make%24+on+%23+outputp2_value%24++%23+demo_input_description%24+hola+mundo+%23+outputp4_visible%24+on+%23+outputp4_info%24++%23+inputdata1_max_pixels%24+1024000+%23+tag%24++%23+outputp1_id%24+nanana+%23+proc1_src_compresion%24+zip+%23+proc1_chk_cmake%24+off+%23+outputp3_description%24++%23+outputp3_value%24++%23+inputdata1_description%24+input+data+description+%23+inputp2_description%24+bien%3F+%23+inputp3_description%24+funciona+%23+proc1_cmake%24+-D+CMAKE_BUILD_TYPE%3Astring%3DRelease++%23+outputp2_visible%24+on+%23+outputp3_visible%24+on+%23+outputp1_type%24+header+%23+inputp1_type%24+text+%23+demo_params_description%24+va+bien+%23+outputp1_description%24++%23+inputdata1_type%24+image2d+%23+proc1_chk_script%24+off+%23+demo_result_description%24+win%3F+%23+outputp2_id%24+nanfdsvfa+%23+inputp1_description%24+funciona+%23+demo_wait_description%24+boh+%23+outputp4_description%24++%23+inputp2_type%24+integer+%23+inputp2_id%24+papapa+%23+outputp1_value%24++%23+outputp3_id%24+nananartrtrt+%23+inputp3_id%24+pepepe+%23+outputp3_type%24+header+%23+inputp3_visible%24++off+%23+outputp1_visible%24+on+%23+inputdata1_id%24+id_lsd+%23+outputp4_value%24++%23+inputp2_visible%24+on+%23+proc1_source%24+lsd-1.5.zip+%23+inputp3_value%24+si+%23+proc1_make%24+-j4+-C++%23+images_config_file%24+cfgmydemo.cfg+%23+outputp2_type%24+header+%23+proc1_subdir%24+xxx-1.5+%23+proc1_url%24+http%3A%2F%2Fwww.ipol.im%2Fpub%2Falgo%2F...+%23+inputdata1_image_depth%24+1x8i+%23+inputp1_id%24+popopo+%23+inputp1_value%24+si+%23+inputp2_value%24+no+%23+demo_data_filename%24+data_saved.cfg+%23+inputdata1_info%24+info_lsd+%23+outputp3_info%24++%23+inputdata1_image_format%24+.pgm+%23+outputp1_info%24++%23+inputdata1_compress%24+False+%23+inputp1_visible%24+on+%23+proc1_id%24+lsd+%23+outputp4_id%24+nana+%23+outputp2_description%24++%23+outputp4_type%24+header+%23+outputp2_info%24++%23+inputp3_type%24+float+%23++%23+tag+%
23+inputp4_iddcksmdclk+%23+inputp4_typetext+%23+inputp4_descriptionkldmsclk+%23+inputp4_valueklcdmkl+%23+inputp4_infoclkdmscdl
Now I would like to know how to restore the = after the value json_data.
To explain:
The query string contains json_data+%23+, and I want to replace this +%23+ with =.
How?
To strip the characters after the last instance of an underscore:
json_data.substring(0, json_data.lastIndexOf("_"));
To replace the first +%23+ with =:
json_data.replace("+%23+", "=");
However, if you're trying to turn all the %xx sequences into the characters they stand for, you should URL-decode the string instead.
That would probably have to be something like:
decodeURIComponent(json_data.replace(/\+/g, '%20'));
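As a sketch of the decoding route, the standard URLSearchParams class can parse a form-encoded query string directly, treating + as a space and decoding %xx sequences. The short query below is an excerpt of the one in the question, and the variable names are illustrative.

```javascript
// A short excerpt of the query string from the question.
const query = 'demo_title=Demo+title&proc1_script=script.sh+parameters&inputp2_description=bien%3F';

// URLSearchParams splits on & and =, decodes %xx, and turns + into a space.
const params = new URLSearchParams(query);
const title = params.get('demo_title');                // 'Demo title'
const description = params.get('inputp2_description'); // 'bien?'

// Manual equivalent for a single value: every + must be replaced, hence the g flag.
const manual = decodeURIComponent('hola+mundo'.replace(/\+/g, '%20'));

// Keep only the part of a key before its last underscore.
const key = 'proc1_chk_make';
const prefix = key.substring(0, key.lastIndexOf('_'));

console.log(title, description, manual, prefix);
```

Parsing into key/value pairs first and re-joining them with custom separators afterwards avoids having to undo a replacement for the first = later on.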
