extracting middle OR final part of a string

I want to extract only the first fontname out of a URL-string from the Google Webfont Directory. Here are some examples of possible strings and what part should be returned:
fonts.googleapis.com/css?family=Raleway // "Raleway"
fonts.googleapis.com/css?family=Caesar+Dressing // "Caesar Dressing"
fonts.googleapis.com/css?family=Raleway:300,400 // "Raleway"
fonts.googleapis.com/css?family=Raleway|Fondamento // "Raleway"
fonts.googleapis.com/css?family=Caesar+Dressing|Raleway:300,400|Fondamento // "Caesar Dressing"
So sometimes it's just one fontname, sometimes it has a weight indicated by a colon (:) and sometimes there are more fontnames divided by a pipe (|).
I have tried /family=(\S*)[:|]/ but it only matches the strings that contain : or |. I could do it like this, but it's not a nice solution:
var fontUrl = "fonts.googleapis.com/css?family=Caesar+Dressing|Raleway:300,400|Fondamento";
var fontName = /family=(\S*)/.exec(fontUrl)[1].replace(/\+/, " ");
if (fontName.indexOf(':') != -1){
fontName = fontName.split(':')[0];
}
if (fontName.indexOf('|') != -1){
fontName = fontName.split('|')[0];
}
console.log(fontName);
Is there a nice regex solution to this?

Instead of matching the character that (might) follow the string you want, match only the string you want except those characters:
/family=([^\s:|]*)/
Alternatively, you could use a lookahead like this:
/family=(\S*?)(?=$|[:|])/

That should be better:
/family=([^:|]*)/
Of course for the + case, you'll have to replace it afterwards (or before maybe).
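As a rough sketch of how the negated character class and the + replacement fit together, the snippet below runs the pattern from the answer above against the sample URLs from the question. The helper name firstFontName is hypothetical, and the sketch assumes the family parameter is followed by nothing other than whitespace, : or | (a trailing &subset=... parameter, for example, would need & added to the character class).

```javascript
// Hypothetical helper; the regex itself is the one suggested in the answer above.
function firstFontName(fontUrl) {
  // Capture everything after "family=" up to whitespace, ":" or "|".
  const match = /family=([^\s:|]*)/.exec(fontUrl);
  if (!match) {
    return null; // no family parameter found
  }
  // Spaces are encoded as "+", so replace every "+" (note the g flag).
  return match[1].replace(/\+/g, " ");
}

const samples = [
  "fonts.googleapis.com/css?family=Raleway",
  "fonts.googleapis.com/css?family=Caesar+Dressing",
  "fonts.googleapis.com/css?family=Raleway:300,400",
  "fonts.googleapis.com/css?family=Raleway|Fondamento",
  "fonts.googleapis.com/css?family=Caesar+Dressing|Raleway:300,400|Fondamento"
];

samples.forEach(function (url) {
  console.log(firstFontName(url));
});
// Raleway, Caesar Dressing, Raleway, Raleway, Caesar Dressing
```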

You can use one of the following (with the i and m modifiers in every case):
family=([a-z]+\+?[a-z]+)
or more simply
family=([a-z+]+)
or, to avoid matching the + char:
family=([a-z]+)\+?([a-z]+)?
but it is easier to use the second solution and replace the + chars with a space afterwards.

try this:
/family\=(\S+?)[\:\|,]{0,2}\S*/ims

No regex is required in this case. Unless you are good with regexes or test them thoroughly, you are likely to make mistakes.
var fontUrls = [];
fontUrls.push("fonts.googleapis.com/css?family=Raleway");
fontUrls.push("fonts.googleapis.com/css?family=Caesar+Dressing");
fontUrls.push("fonts.googleapis.com/css?family=Raleway:300,400");
fontUrls.push("fonts.googleapis.com/css?family=Raleway|Fondamento");
fontUrls.push("fonts.googleapis.com/css?family=Caesar+Dressing|Raleway:300,400|Fondamento");
function getFirstFont(url) {
return url.split("=")[1].split("|")[0].split(":")[0];
}
fontUrls.forEach(function (fontUrl) {
console.log(getFirstFont(fontUrl));
});
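Another regex-free variant, offered only as a sketch: the standard URLSearchParams class (available in browsers and Node.js) parses the query string and decodes + to a space, so the "Caesar+Dressing" case needs no separate replacement. The function name getFirstFontFromQuery is hypothetical, and the sketch assumes the URL contains a ? before the family parameter.

```javascript
// Hypothetical helper; URLSearchParams is a standard global.
function getFirstFontFromQuery(url) {
  const queryStart = url.indexOf("?");
  if (queryStart === -1) {
    return null; // no query string at all
  }
  const params = new URLSearchParams(url.slice(queryStart + 1));
  const family = params.get("family"); // "+" has already been decoded to " "
  if (family === null) {
    return null; // no family parameter
  }
  // Keep only the first font, then drop any ":weights" suffix.
  return family.split("|")[0].split(":")[0];
}

console.log(getFirstFontFromQuery(
  "fonts.googleapis.com/css?family=Caesar+Dressing|Raleway:300,400|Fondamento"
)); // Caesar Dressing
```

Unlike splitting on "=", this keeps working if other query parameters are added before or after family.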

Related

replace \r\n with < br /> as text

Trying for 2 hours to replace \r\n with < br/> but it seems to be impossible.
I don't know what i'm doing! Please help!
const text = '"Hello!\r\n\r\nThis is a dog!'
const checkText = str=> {
const match = /\r|\n/.exec(text);
if (match) {
//return str.replace(/(?:\\[rn]|[\r\n]+)+/g, '<br/>');
return str.replace('/r/n', '<br/>');
}
return str;
};
checkText(text)

Just do this:
text.replace(/\r\n/g, '<br/>');

Covering all the possible new line character combinations.
String tmp = s.replaceAll("\r\n", "<br>"); // Windows
tmp = tmp.replaceAll("\r", "<br>"); // Old MAC
return tmp.replaceAll("\n", "<br>"); // Linux / UNIX
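The snippet above is written in Java. A possible JavaScript equivalent, as a minimal sketch, handles all three line-ending styles in one pass with an alternation. The function name newlinesToBr is hypothetical. Putting \r\n first in the alternation matters, so that a Windows line ending becomes one <br> rather than two.

```javascript
// Hypothetical helper covering Windows (\r\n), old Mac (\r) and Unix (\n) line endings.
function newlinesToBr(text) {
  // "\r\n" is tried first so it is replaced as a single unit.
  return text.replace(/\r\n|\r|\n/g, "<br>");
}

console.log(newlinesToBr("Hello!\r\n\r\nThis is a dog!"));
// Hello!<br><br>This is a dog!
```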

You may try:
(text+ '').replace(/([^>\r\n]?)(\r\n|\n\r|\r|\n)/g, '$1<br/>$2');

There are multiple things wrong with your code:
When given a string pattern, String.prototype.replace only replaces the first occurrence. You need to use a regex argument with the /g flag to replace all occurrences.
Escapes use a backslash, not a forward slash: Use \r\n, not /r/n.
checkText returns a string, but your call-site doesn't do anything with the returned string - it's just dropped. Strings are immutable in JavaScript.
I don't recommend using strings to hold HTML because it can (very easily) cause HTML-injection (including <script>-injection) attacks.
Instead, do one of the following:
Use String.prototype.split and HTML-encode each string in the array and join with "<br />".
Add the string directly to the document with .textContent (don't use innerText anymore) and give the parent element the CSS style white-space: pre-wrap;.
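The split, encode and join approach from the first option could look roughly like this. Both function names (escapeHtml, textToHtmlLines) are hypothetical, and the escaping covers only the five characters that are usually significant in HTML text and attribute values. Treat it as a sketch, not a vetted sanitizer.

```javascript
// Hypothetical minimal HTML-encoder for text content.
function escapeHtml(str) {
  return str
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Split on any line ending, encode each line, then join with <br />.
function textToHtmlLines(text) {
  return text.split(/\r\n|\r|\n/).map(escapeHtml).join("<br />");
}

console.log(textToHtmlLines("Hello!\r\n\r\nThis is a <dog>!"));
// Hello!<br /><br />This is a &lt;dog&gt;!
```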

regex to remove certain characters at the beginning and end of a string

Let's say I have a string like this:
...hello world.bye
But I want to remove the first three dots and replace .bye with !
So the output should be
hello world!
it should only match if both conditions apply (... at the beginning and .bye at the end)
And I'm trying to use js replace method. Could you please help? Thanks

First match the dots, capture and lazy-repeat any character until you get to .bye, and match the .bye. Then, you can replace with the first captured group, plus an exclamation mark:
const str = '...hello world.bye';
console.log(str.replace(/\.\.\.(.*?)\.bye/, '$1!'));
The lazy-repeat is there to ensure you don't match too much, for example:
const str = `...hello world.bye
...Hello again! Goodbye.`;
console.log(str.replace(/\.\.\.(.*?)\.bye/g, '$1!'));

You don't actually need a regex to do this. Although it's a bit inelegant, the following should work fine (obviously the function can be called whatever makes sense in the context of your application):
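function manipulate(string) {
  // Only rewrite when the string starts with "..." and ends with ".bye".
  if (string.slice(0, 3) === "..." && string.slice(-4) === ".bye") {
    // Drop the 3 leading dots and the 4 trailing characters (".bye").
    return string.slice(3, -4) + "!";
  }
  return string;
}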
(Apologies if I made any stupid errors with indexing there, but the basic idea should be obvious.)
This, to me at least, has the advantage of being easier to reason about than a regex. Of course if you need to deal with more complicated cases you may reach the point where a regex is best - but I personally wouldn't bother for a simple use-case like the one mentioned in the OP.

Your regex would be
const rx = /\.\.\.([\s\S]*?)\.bye/g
const out = '\n\nfoobar...hello world.bye\nfoobar...ok.bye\n...line\nbreak.bye\n'.replace(rx, `$1!`)
console.log(out)
In English: find three dots, then anything (matched lazily) in a group, ending with .bye.
The replacement uses the first captured group $1 and appends ! using a template literal.

An arguably simpler solution:
const str = '...hello world.bye'
const newStr = /\.\.\.(.+)\.bye/.exec(str)
const formatted = newStr ? newStr[1] + '!' : str
console.log(formatted)
If the string doesn't match the regex it will just return the string.
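The question asks for a match only when the ... is at the beginning and the .bye is at the end. One way to express that, sketched here under the assumption that the whole input is a single value to check, is to anchor the pattern with ^ and $. The function name trimDotsAndBye is hypothetical.

```javascript
// Hypothetical helper: rewrites only when both conditions hold.
function trimDotsAndBye(str) {
  // ^ and $ (without the m flag) anchor to the start and end of the whole string;
  // [\s\S]* also allows line breaks inside the captured text.
  return str.replace(/^\.\.\.([\s\S]*)\.bye$/, "$1!");
}

console.log(trimDotsAndBye("...hello world.bye")); // hello world!
console.log(trimDotsAndBye("hello world.bye"));    // hello world.bye (unchanged)
console.log(trimDotsAndBye("...hello world"));     // ...hello world (unchanged)
```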

How to split a string by a character not directly preceded by a character of the same type?

Let's say I have a string: "We.need..to...split.asap". What I would like to do is to split the string by the delimiter ., but I only wish to split by the first . and include any recurring .s in the succeeding token.
Expected output:
["We", "need", ".to", "..split", "asap"]
In other languages, I know that this is possible with a look-behind /(?<!\.)\./ but Javascript unfortunately does not support such a feature.
I am curious to see your answers to this question. Perhaps there is a clever use of look-aheads that presently evades me?
I was considering reversing the string, then re-reversing the tokens, but that seems like too much work for what I am after... plus controversy: How do you reverse a string in place in JavaScript?
Thanks for the help!

Here's a variation of the answer by guest271314 that handles more than two consecutive delimiters:
var text = "We.need.to...split.asap";
var re = /(\.*[^.]+)\./;
var items = text.split(re).filter(function(val) { return val.length > 0; });
It uses the detail that if the split expression includes a capture group, the captured items are included in the returned array. These capture groups are actually the only thing we are interested in; the tokens are all empty strings, which we filter out.
EDIT: Unfortunately there's perhaps one slight bug with this. If the text to be split starts with a delimiter, that will be included in the first token. If that's an issue, it can be remedied with:
var re = /(?:^|(\.*[^.]+))\./;
var items = text.split(re).filter(function(val) { return !!val; });
(I think this regex is ugly and would welcome an improvement.)

You can do this without any lookaheads:
var subject = "We.need.to....split.asap";
var regex = /\.?(\.*[^.]+)/g;
var matches, output = [];
while(matches = regex.exec(subject)) {
output.push(matches[1]);
}
document.write(JSON.stringify(output));
It seemed like it'd work in one line, as it did on https://regex101.com/r/cO1dP3/1, but had to be expanded in the code above because the /g option by default prevents capturing groups from returning with .match (i.e. the correct data was in the capturing groups, but we couldn't immediately access them without doing the above).
See: JavaScript Regex Global Match Groups
An alternative solution with the original one liner (plus one line) is:
document.write(JSON.stringify(
"We.need.to....split.asap".match(/\.?(\.*[^.]+)/g)
.map(function(s) { return s.replace(/^\./, ''); })
));
Take your pick!

Note: This answer can't handle more than 2 consecutive delimiters, since it was written according to the example in the revision 1 of the question, which was not very clear about such cases.
var text = "We.need.to..split.asap";
// split "." if followed by "."
var res = text.split(/\.(?=\.)/).map(function(val, key) {
// if `val[0]` does not begin with "." split "."
// else split "." if not followed by "."
return val[0] !== "." ? val.split(/\./) : val.split(/\.(?!.*\.)/)
});
// concat arrays `res[0]` , `res[1]`
res = res[0].concat(res[1]);
document.write(JSON.stringify(res));
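Since ES2018, JavaScript regular expressions do support lookbehind assertions, so in engines that implement them the look-behind pattern mentioned in the question can be used directly. A minimal sketch, assuming such an engine (recent Node.js and current browsers):

```javascript
// Split on a "." only when it is not directly preceded by another ".".
const text = "We.need..to...split.asap";
const tokens = text.split(/(?<!\.)\./);

console.log(JSON.stringify(tokens));
// ["We","need",".to","..split","asap"]
```

In older engines without lookbehind support, the regex literal is a syntax error, so the workarounds above remain relevant there.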

Regex to detect a string that contains a URL or file extension

I'm trying to create a small script that detects whether the string input is either:
1) a URL (which will hold a filename): 'http://ajax.googleapis.com/html5shiv.js'
2) just a filename: 'html5shiv.js'
So far I've found this but I think it just checks the URL and file extension. Is there an easy way to make it so it uses an 'or' check? I'm not very experienced with RegExp.
var myRegExp = /[^\\]*\.(\w+)$/i;
Thank you in advance.

How about this regex?
(\.js)$
It checks whether the end of the line has a .js on it.
$ denotes end of line.

Basically, to use 'OR' in regex, simply use the 'pipe' delimiter.
(aaa|bbb)
will match
aaa
or
bbb
For regex to match a url, I'd suggest the following:
\w+://[\w\._~:/?#\[\]#!$&'()*+,;=%]*
This is based on the allowed character set for a url.
For the file, what's your definition of a filename?
If you want to search for strings, that match "(at least) one to many non-fullstop characters, followed by a fullstop, followed by (at least) one to many non-fullstop characters", I'd suggest the following regex:
[^\.]+\.[^\.]+
And altogether:
(\w+://[\w\._~:/?#\[\]#!$&'()*+,;=%]*|[^\.]+\.[^\.]+)
You can test regexes online here: http://gskinner.com/RegExr/

If it is for the purpose of flow control you can do the following:
var test = "http://ajax.googleapis.com/html5shiv.js";
// to recognize http & https
var regex = /^https?:\/\/.*/i;
var result = regex.exec(test);
if (result == null){
// no URL found code
} else {
// URL found code
}
For the purpose of capturing the file name you could use:
var test = "http://ajax.googleapis.com/html5shiv.js";
var regex = /(\w+\.\w+)$/i;
var filename = regex.exec(test);

Yes, you can use the alternation operator |. Be careful, though, because its priority is very low. Lower than sequencing. You will need to write things like /(cat)|(dog)/.
It's very hard to understand what you exactly want with so few use/test cases, but
(http://[a-zA-Z0-9\./]+)|([a-zA-Z0-9\.]+)
should give you a starting point.

If it's a URL, strip it down to the last part and treat it the same way as "just a filename".
function isFile(fileOrUrl) {
// This will return everything after the last '/'; if there's
// no forward slash in the string, the unmodified string is used
var filename = fileOrUrl.split('/').pop();
return (/.+\..+/).test(filename);
}
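Building on the idea of reducing a URL to its last path segment, the sketch below also reports which of the two input kinds was seen. The function name classifyInput and the shape of its return value are hypothetical. It uses the standard URL class (available in browsers and Node.js) so that the host name itself is not mistaken for a file name, and it assumes only http and https URLs need to be recognized.

```javascript
// Hypothetical helper: returns the kind of input and the file name, if any.
function classifyInput(input) {
  let kind = "filename";
  let lastSegment = input;

  if (/^https?:\/\//i.test(input)) {
    kind = "url";
    try {
      // pathname excludes the host and the query string.
      lastSegment = new URL(input).pathname.split("/").pop();
    } catch (e) {
      return { kind: "invalid", filename: null }; // malformed URL
    }
  }

  // Something, a dot, then a word-character extension at the end.
  const hasExtension = /.+\.\w+$/.test(lastSegment);
  return { kind: kind, filename: hasExtension ? lastSegment : null };
}

console.log(classifyInput("http://ajax.googleapis.com/html5shiv.js"));
// { kind: 'url', filename: 'html5shiv.js' }
console.log(classifyInput("html5shiv.js"));
// { kind: 'filename', filename: 'html5shiv.js' }
console.log(classifyInput("http://ajax.googleapis.com"));
// { kind: 'url', filename: null }
```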

Try this:
var ajx = 'http://ajax.googleapis.com/html5shiv.js';
function isURL(str){
return /((\/\w+)|(^\w+))\.\w{2,}$/.test(str);
}
console.log(isURL(ajx));

Have a look at this (requires no regex at all):
var filename = string.indexOf('/') == -1
? string
: string.split('/').slice(-1)[0];

Here is the program!
<script>
var url="Home/this/example/file.js";
var condition=0;
var result="";
for(var i=url.length; i>0 && condition<2 ;i--)
{
if(url[i]!="/" && url[i]!="."){result= (condition==1)? (url[i]+result):(result);}
else{condition++;}
}
document.write(result);
</script>

How to remove comma from number which comes dynamically in .tpl file

I want to remove the comma from a number (e.g. change 1,125 to 1125) in a .tpl file.
The value comes dynamically like ${variableMap[key]}

var a='1,125';
a=a.replace(/\,/g,''); // 1125, but a string, so convert it to number
a=parseInt(a,10);
Hope it helps.

var a='1,125'
a=a.replace(/\,/g,'')
a=Number(a)

You can use the below function. This function can also handle larger numbers like 123,123,123.
function removeCommas(str) {
while (str.search(",") >= 0) {
str = (str + "").replace(',', '');
}
return str;
};

var s = '1,125';
s = s.split(',').join('');
Hope that helps.

✨ ES2021 ✨ added replaceAll, so no need for regular expression:
const str = '1,125,100.05';
const number = parseFloat(str.replaceAll(",", ""));

You can use a regular expression for the replacement, which may be faster than split and join:
var s = '1,125';
s = s.replace(/,/g, '');
//output 1125

The incoming value may not always be a string. If it is already numeric, the replace method won't be available and you'll get an error.
I suggest using isNaN to check: if the value is not numeric, assume it is a string and do the replacement.
if (isNaN(x)) {
  x = parseInt(x.replace(/[,]/g, ''), 10);
}
(Not foolproof because 'not number' doesn't prove it is a string, but unless you're doing something very weird should be good enough).
You can also add other symbols to the character group to remove other stray chars (such as currency symbols).
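A slightly more defensive variant of the same idea, as a sketch: check the type explicitly instead of relying on isNaN, so that values which are already numbers pass through untouched and everything else is converted to a string first. The function name toNumber is hypothetical, and it assumes commas are used only as thousands separators.

```javascript
// Hypothetical helper: accepts a number or a comma-formatted string.
function toNumber(value) {
  if (typeof value === "number") {
    return value; // already numeric, nothing to strip
  }
  // Remove every comma, then convert; Number() yields NaN for non-numeric text.
  return Number(String(value).replace(/,/g, ""));
}

console.log(toNumber("1,125"));         // 1125
console.log(toNumber(1125));            // 1125
console.log(toNumber("123,123,123.5")); // 123123123.5
```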
