Split string using regex javascript - javascript

I'm trying to split the following type of string using the String.prototype.split() method:
"#flat flat#flat #flat# flat#"
The condition for splitting is a '#' positioned at the beginning of a word, plus the words that follow it, until the next such '#'. For example, the string above should be split like this:
["#flat flat#flat","#flat# flat#"]
I tried a lot of different variants but none of them is correct.

It depends a little on what you call a word (or rather, what ends a word), but I take it that words are separated by a space character.
In that case, this code splits the string at the right places: myString.split(/ #|^#|#$/).
The result contains an empty string at the start and at the end, which may be a bit inconvenient; to remove them, extend the code to: myString.split(/ #|^#|#$/).filter(match => match !== ''). This returns the list with the empty strings removed. Note that split() drops the matched delimiters, so the '#' at the start of each chunk and the final '#' are not part of the result.
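If the '#' markers should stay in the output, as in the expected result in the question, one option is to split on the space that sits right before a '#' by using a lookahead. This is a minimal sketch, assuming words are separated by single spaces; splitOnHashWords is a hypothetical name, not something from the question.

```javascript
// Hypothetical helper: split before every word that starts with '#'.
// The lookahead (?=#) checks for '#' without consuming it,
// so only the separating space is removed from the result.
function splitOnHashWords(input) {
  return input.split(/ (?=#)/);
}

const parts = splitOnHashWords("#flat flat#flat #flat# flat#");
console.log(parts); // [ '#flat flat#flat', '#flat# flat#' ]
```

Because nothing is matched at the start or end of the string, no empty strings appear and no filter step is needed.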
Hope this helps!

Related

How do I replace string within quotes in javascript?

I have this in a JavaScript/jQuery string (it is read from the value of an HTML element, $('#shortcode'), and can change if the user clicks some buttons):
[csvtohtml_create include_rows="1-10"
debug_mode="no" source_type="visualizer_plugin" path="map"
source_files="bundeslander_staple.csv" include cols="1,2,4" exclude cols="3"]
In a textbox (named incl_sc) I have the value:
include cols="2,4"
I want to replace include_cols="1,2,4" from the above string with the value from the textbox.
so basically:
How do I replace include_cols values here? (include_cols="2,4" instead of include_cols="1,2,4") I'm great at many things but regex is not one of them. I guess regex is the thing to use here?
I'm trying this:
var s = $('#shortcode').html();
//I want to replace include cols="1,2,4" exclude cols="3"
//with include_cols="1,2" exclude_cols="3" for example
s.replace('/([include="])[^]*?\1/g', incl_sc.val() );
but I don't get any replacement at all (the string s is the same as $("#shortcode").html()). Obviously I'm doing something really dumb. Please help :-)
In short, what you need is
s = s.replace(/include cols="[^"]+"/g, incl_sc.val());
There were a couple of problems with your code:
To use a regex with String.prototype.replace, you must pass a regex as the first argument, but you were actually passing a string.
/regex/ is a regex literal, while '/actually a string/' is just a string.
In the text you supplied in your question, include_cols is written as include cols (with a space).
Your regex was also malformed. I recommend testing your regexes in an online regex tester, where you can also learn more if you want.
The code above will replace the part include cols="1,2,4" with whatever is in the textbox, regardless of what's between the quotes (as long as it doesn't contain another quote). Note that replace returns a new string rather than modifying s, so the result has to be assigned.
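The points above can be tried without jQuery. The following is only a sketch: the two string constants are assumed stand-ins for $('#shortcode').html() and incl_sc.val(), and the attribute is spelled include_cols (with an underscore) here.

```javascript
// Assumed stand-ins for $('#shortcode').html() and incl_sc.val().
const shortcode =
  '[csvtohtml_create include_rows="1-10" include_cols="1,2,4" exclude_cols="3"]';
const replacement = 'include_cols="2,4"';

// A regex literal (no surrounding quotes). [^"]* stops at the closing
// quote, so the exclude_cols attribute is left alone.
// replace() returns a new string; the original is not modified.
const updated = shortcode.replace(/include_cols="[^"]*"/g, replacement);

console.log(updated);
// [csvtohtml_create include_rows="1-10" include_cols="2,4" exclude_cols="3"]
```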
First of all, I think you need to remove the quotes around the regex and fix the regex a little.
const r = /(include_cols=")([^"]*)(")/g;
s = s.replace(r, `$1${incl_sc.val()}$3`);
Basically, I group the first and last parts in order to include them in the replacement. You can also skip the first and last groups and write those parts literally in the second argument of replace, like this:
const r = /include_cols="[^"]*"/g;
s = s.replace(r, `include_cols="${incl_sc.val()}"`);

JavaScript split string by specific character string

I have a text box with a bunch of comments, all separated by a specific character string as a means of splitting them to display each comment individually.
The string in question is | but I can change this to accommodate whatever will work. My only requirement is that it is not likely to be a string of characters someone will type in an everyday sentence.
I believe I need to use the split method and possibly some regex but all the other questions I've seen only seem to mention splitting by one character or a number of different characters, not a specific set of characters in a row.
Can anyone point me in the right direction?
.split() should work for that purpose:
var comments = "this is a comment|and here is another comment|and yet another one";
var parsedComments = comments.split('|');
This will give you all comments in an array which you can then loop over or do whatever you have to do.
Keep in mind you could also change | to something like <--NEWCOMMENT--> and it will still work fine inside the split('<--NEWCOMMENT-->') method.
Remember that split() removes the delimiter it splits on, so the resulting array won't contain any instances of <--NEWCOMMENT-->.
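As a small sketch of splitting on a multi-character delimiter: parseComments is a hypothetical helper, and the trimming and empty-entry filtering are assumptions about what is wanted, not something the question asked for.

```javascript
const DELIMITER = "<--NEWCOMMENT-->";

// Hypothetical helper: split on the whole delimiter string,
// then trim whitespace and drop entries that end up empty.
function parseComments(text, delimiter) {
  return text
    .split(delimiter)
    .map((comment) => comment.trim())
    .filter((comment) => comment.length > 0);
}

const raw = "first comment<--NEWCOMMENT--> second comment <--NEWCOMMENT-->";
console.log(parseComments(raw, DELIMITER));
// [ 'first comment', 'second comment' ]
```

Passing a plain string to split() matches it literally, so no regex escaping is needed for characters like | or -.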

Why would the replace with regex not work even though the regex does?

There may be a very simple answer to this, probably down to my familiarity (or possibly lack thereof) with the replace method and how it works with regex.
Let's say I have the following string: abcdefHellowxyz
I just want to strip the first six characters and the last four, to return Hello, using regex... Yes, I know there may be other ways, but I'm trying to explore the boundaries of what these methods are capable of doing...
Anyway, I've tinkered on http://regex101.com and got the following Regex worked out:
/^(.{6}).+(.{4})$/
This seems to parse the string well, and shows abcdef captured as group 1 and wxyz captured as group 2. But when I try to run the following:
"abcdefHellowxyz".replace(/^(.{6}).+(.{4})$/,"")
to replace those captured groups with "" I receive an empty string as my final output... Am I doing something wrong with this syntax? And if so, how does one correct it, keeping my original stance on wanting to use Regex in this manner...
Thanks so much everyone in advance...
The code below works as you wish:
"abcdefHellowxyz".replace(/^.{6}(.+).{4}$/,"$1")
The idea is to put () only around the text you want to keep; in the second parameter of replace(), you can use $1, $2, ... to refer to group 1, group 2, and so on.
You can also pass a function as the second parameter of replace and transform the captured text into whatever you want in that function.
For more detail, as @Akxe recommends, you can find the documentation at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace.
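Both forms of the second parameter can be compared side by side. This is a sketch only; the variable names and the upper-casing are illustrative assumptions.

```javascript
const input = "abcdefHellowxyz";
const pattern = /^.{6}(.+).{4}$/;

// String replacement: "$1" is substituted with the first capture group.
const viaGroup = input.replace(pattern, "$1");

// Function replacement: called with the full match followed by each group;
// whatever it returns becomes the replacement text.
const viaFunction = input.replace(pattern, (match, middle) => middle.toUpperCase());

console.log(viaGroup);    // Hello
console.log(viaFunction); // HELLO
```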
You are replacing any substring that matches /^(.{6}).+(.{4})$/, with this line of code:
"abcdefHellowxyz".replace(/^(.{6}).+(.{4})$/,"")
The regex matches the whole string "abcdefHellowxyz"; thus, the whole string is replaced. Instead, if you are strictly stripping by the lengths of the extraneous substrings, you could simply use substring or substr.
Edit
The answer you're probably looking for is capturing the middle token, instead of the outer ones:
var str = "abcdefHellowxyz";
var matches = str.match(/^.{6}(.+).{4}$/);
str = matches[1]; // index 0 is entire match
console.log(str);
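The length-based alternative mentioned in this answer needs no regex at all. A minimal sketch, where stripEnds is a hypothetical helper name and slice is used in place of substring or substr:

```javascript
// Hypothetical helper: drop `head` characters from the front
// and `tail` characters from the back of a string.
function stripEnds(text, head, tail) {
  return text.slice(head, text.length - tail);
}

console.log(stripEnds("abcdefHellowxyz", 6, 4)); // Hello
```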

regex to match all keywords in a string

Being a noob at regex, I need some support from the community.
Let's say I have this string str:
1. www.anysite.com hello demo try this link
2. anysite.com indeed demo link
3. http://www.anysite.com another one
4. www.anysite.com
5. http://anysite.com
Consider lines 1-5 together as the whole string str.
I want to convert all 'anysite.com' into clickable html links, for which I am using:
str = str.replace(/((http|https|ftp):\/\/[\w?=&.\/-;#~%-]+(?![\w\s?&.\/;#~%"=-]*>))/g, '$1');
This converts every space-separated word starting with http/https/ftp into a link for that URL.
So lines 3 and 5 have been converted correctly. Now, to convert all www.anysite.com occurrences into links, I again used
str = str.replace(/(\b^(http|https|ftp)?(www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig, '$1');
However, it only converts www.anysite.com into a link if it is found at the very beginning of str. So it converts line 1 but not line 4.
Note that I used ^(http|https|ftp)?(www.) to find all www not starting with http/https/ftp, since the http ones have already been converted.
There is also the link on line 2, which starts with neither http nor www but does end with .com; what would the regex be for that?
For reference, you can try posting this whole string to your Facebook timeline; it converts all five lines into links.
Thanks for the help. The final regex that worked for me is:
//remove all http:// and https://
str = str.replace(/(http|https):\/\//ig, "");
//replace all string ending with .com or .in only into link
str = str.replace( /((www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.(com|in))/ig, '$1');
I used .com and .in for my specific requirement; otherwise the solution at http://regexr.com/39i0i will work.
There is still an issue, though: it doesn't convert shortened URLs into links correctly, e.g. http://s.ly/qhdfTyuiOP only gives a link up to s.ly.
Any further suggestions?
^(http|https|ftp)?(www\.) does not mean "all www not starting with http/https/ftp" but rather "a string that starts with an optional http/https/ftp followed by www.".
Indeed, ^ in this context isn't a negation but an anchor representing the start of the string. I suppose you used it this way because of its meaning inside a character class ([^...]); it is rather tricky, since its meaning changes depending on the context it is found in.
You could just remove it and you should be fine, as I see no point in making sure the string does not start with http/https/ftp (you transformed those occurrences just before, so there should be none left).
If you wanted some kind of negation, the easiest way would be a negative lookbehind:
(?<!(?:https?|ftp):\/\/)www\.
This matches "www." only when it is not directly preceded by http://, https:// or ftp://.
Edit: lookbehind was not available in JavaScript when this answer was written. It was added to the language in ES2018, so it only works in engines that support it.
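A sketch of the negative-lookbehind idea, assuming an engine with ES2018 lookbehind support. The pattern used here includes the :// separator so that http://www. is actually excluded. linkifyBareWww and the http:// prefix in the generated href are illustrative assumptions, not part of the question.

```javascript
// Hypothetical helper: wrap "www." hosts in anchor tags, but only when
// they are not already part of an http://, https:// or ftp:// URL.
function linkifyBareWww(text) {
  const bareWww = /(?<!(?:https?|ftp):\/\/)www\.[^\s<]+/g;
  return text.replace(bareWww, (host) => `<a href="http://${host}">${host}</a>`);
}

console.log(linkifyBareWww("www.anysite.com"));
// <a href="http://www.anysite.com">www.anysite.com</a>

console.log(linkifyBareWww("http://www.anysite.com another one"));
// http://www.anysite.com another one
```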

Breaking a String into Chunks based on Pattern

I have one string, that looks like this:
a[abcdefghi,2,3,jklmnopqr]
The beginning "a" is fixed and non-changing; the content within the brackets varies but follows a pattern. It will always be an alphabetical string, possibly followed by numbers separated by commas, or by more strings and/or numbers.
I'd like to be able to break it into chunks of the string and any numbers that follow it until the "]" or another string is met.
Probably best explained through examples and expected ideal results:
a[abcdefghi] -> "abcdefghi"
a[abcdefghi,2] -> "abcdefghi,2"
a[abcdefghi,2,3,jklmnopqr] -> "abcdefghi,2,3" and "jklmnopqr"
a[abcdefghi,2,3,jklmnopqr,stuvwxyz] -> "abcdefghi,2,3" and "jklmnopqr" and "stuvwxyz"
a[abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz] -> "abcdefghi,2,3" and "jklmnopqr,1,9" and "stuvwxyz"
a[abcdefghi,1,jklmnopqr,2,stuvwxyz,3,4] -> "abcdefghi,1" and "jklmnopqr,2" and "stuvwxyz,3,4"
Ideally a malformed string would be partially caught (but this is a nice extra):
a[2,3,jklmnopqr,1,9,stuvwxyz] -> "jklmnopqr,1,9" and "stuvwxyz"
I'm using JavaScript and I realize a regex won't bring me all the way to the solution I'd like, but it could be a big help. The alternative is to do a lot of manual string parsing, which I can do but doesn't seem like the best answer.
Advice, tips appreciated.
UPDATE: Yes, I did mean alphabetical (A-Za-z) instead of alphanumeric. Edited to reflect that. Thanks for letting me know.
You'd probably want to do this in 2 steps. First, match against:
a\[([^[\]]*)\]
and extract group 1. That'll be the stuff in the square brackets.
Next, repeatedly match against:
[a-z]+(,[0-9]+)*
That'll match things like "abcdefghi,2,3". After the first match you'll need to see if the next character is a comma and if so skip over it. (BTW: if you really meant alphanumeric rather than alphabetic like your examples, use [a-z0-9]*[a-z][a-z0-9]* instead of [a-z]+.)
Alternatively, split the string on commas and reassemble into your word with number groups.
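The two-step approach can be sketched as follows. extractChunks is a hypothetical name; the sketch assumes the whole input has the form a[...], and it uses a global match for the second step, which finds every chunk in one call so the separating commas never need to be skipped by hand.

```javascript
// Hypothetical helper implementing the two steps.
function extractChunks(input) {
  // Step 1: pull out everything between the square brackets.
  const outer = /^a\[([^[\]]*)\]$/.exec(input);
  if (outer === null) {
    return [];
  }
  // Step 2: a word followed by any number of ",<digits>" groups.
  // match() with the g flag returns all matches, or null if there are none.
  return outer[1].match(/[a-z]+(?:,[0-9]+)*/gi) || [];
}

console.log(extractChunks("a[abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz]"));
// [ 'abcdefghi,2,3', 'jklmnopqr,1,9', 'stuvwxyz' ]

console.log(extractChunks("a[2,3,jklmnopqr,1,9,stuvwxyz]"));
// [ 'jklmnopqr,1,9', 'stuvwxyz' ]
```

Leading numbers in the malformed case are simply never matched, which gives the partial result described in the question.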
Why wouldn't a regex bring you all the way to a solution?
The following regex works against the given data, but it makes a few assumptions (at least two letters, followed by comma-separated single digits).
([a-z]{2,}(?:,\d)*)
Example:
re = new RegExp('[a-z]{2,}(?:,\\d)*', 'g')
// match() with the g flag returns every match; a single exec() call returns only the first
matches = "a[abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz]".match(re)
Assuming you can easily break out the string between the brackets, something like this might be what you're after:
> re = new RegExp('[a-z]+(?:,\\d)*(?:,?)', 'gi')
> while (match = re.exec("abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz")) { print(match[0]) }
abcdefghi,2,3,
jklmnopqr,1,9,
stuvwxyz
This has the advantage of working partially in your malformed case:
> while (match = re.exec("2,3,jklmnopqr,1,9,stuvwxyz")) { print(match[0]) }
jklmnopqr,1,9,
stuvwxyz
The first character class [a-z] can be modified if you meant for it to be truly alphanumeric.
