Regex for filename at end of href attribute (replace filepath with filename) - javascript

I am trying to replace the filepath with just the filename using regex, and struggling.
I take a list of text like this:
xlink:href="file://C:\x\y & DRAWINGS\z - CONTROLLED\a \testsvg-A01.svg"
and I just want to output
xlink:href="testsvg-A01.svg"
I can get there in JavaScript by chaining several regexes, like this:
let inQuotes = hrefs[0].match(/"(.*?)"/gm);
inQuotes = inQuotes[0].match(/([^\\]+$)/gm);
inQuotes = inQuotes[0].replace(/"/g, "");
This returns just the filename, but I was wondering if there is a way to use it to replace the original text with the desired form.
EDIT:
I can get it for a single occurrence with this line of code:
let testHrefs = outText.match(/xlink:href="(.*?)"/gm)[0].match(/"(.*?)"/gm)[0].match(/([^\\]+$)/gm)[0].replace(/^/, 'xlink:href="');
but it looks awful and doesn't completely do what I want. Any advice?

You could use a regex to remove the text you don't want to keep (i.e. everything between the href=" and the filename):
let href = 'xlink:href="file://C:\\x\\y & DRAWINGS\\z - CONTROLLED\\a \\testsvg-A01.svg"';
console.log(href);
console.log(href.replace(/href=".*?([^\\\/]+)$/, 'href="$1'));
Note I've used [\\\/] to allow for Unix and Windows style filenames, if you only want Windows style filenames that can simply be [\\].
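If the input contains many xlink:href attributes, a global replace can rewrite them all in one pass. This is a minimal sketch, assuming the paths never contain a double quote; the function name stripHrefPaths and the sample markup are made up for illustration:

```javascript
// Hypothetical helper (not from the answer above): rewrite every
// xlink:href="...path...\file.ext" to xlink:href="file.ext".
function stripHrefPaths(text) {
  // (xlink:href=")  keep the attribute name and opening quote
  // [^"]*[\\\/]     consume the path up to the last \ or / before the closing quote
  // ([^"\\\/]+)"    capture the filename, then match the closing quote
  return text.replace(/(xlink:href=")[^"]*[\\\/]([^"\\\/]+)"/g, '$1$2"');
}

// Assumed sample input with two attributes.
const sample =
  '<image xlink:href="file://C:\\x\\y & DRAWINGS\\z - CONTROLLED\\a \\testsvg-A01.svg"/>\n' +
  '<image xlink:href="file://C:\\other\\second.svg"/>';

console.log(stripHrefPaths(sample));
```

An href that contains no path separator does not match the pattern, so it is left unchanged.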

First, keep in mind that backslashes in strings need to be escaped, as the backslash itself is an escape character. You'll likely want to escape these as \\.
After doing this, you can make use of a positive look-behind on the \ with the regex (?<=\\)[^\\]*:
var xlink = "file://C:\\x\\y & DRAWINGS\\z - CONTROLLED\\a \\testsvg-A01.svg";
var regex = /(?<=\\)[^\\]*/gm;
console.log(xlink.match(regex));
This splits the string into each of the folders (which may be useful in itself). If you only want the filename, you can use xlink.match(regex)[4] (assuming the number of segments is consistent), or xlink.match(regex)[xlink.match(regex).length - 1] if it isn't.
var xlink = "file://C:\\x\\y & DRAWINGS\\z - CONTROLLED\\a \\testsvg-A01.svg";
var regex = /(?<=\\)[^\\]*/gm;
console.log(xlink.match(regex)[xlink.match(regex).length - 1]);
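To avoid running the match twice, the result can be stored once and indexed. The sketch below also shows an end-anchored pattern that skips the lookbehind entirely; it assumes an engine with lookbehind support for the first part, and the variable names are illustrative:

```javascript
const xlink = "file://C:\\x\\y & DRAWINGS\\z - CONTROLLED\\a \\testsvg-A01.svg";

// Run the lookbehind match once and keep the segments.
const segments = xlink.match(/(?<=\\)[^\\]*/g) || [];
const fileName = segments[segments.length - 1];

// Alternative without lookbehind: the last run of non-backslash characters.
const direct = xlink.match(/[^\\]*$/)[0];

console.log(segments);
console.log(fileName, direct);
```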

Related

Replace file path in Node JS

I am having a ridiculous time trying to simply remove a path from a pathname in Node JS. I think the problem is that replace is not working because the base string has slashes. But I can't seem to figure out any way to operate on the string properly. When I do replace sometimes it just removes the slashes entirely and doesn't even replace with what I asked it to.
Example.... where the heck did the slashes even go.
'C:\path\build\test\subfolder'.replace('b', 'z')
// "C:path\build\testsuzfolder"
Anyway what I'm actually trying to do is this.
Given this path I get.
C:\path\build\test\subfolder
Remove
C:\path\build\test\
But no amount of attempts with replace is working, even if I escape slashes.
Node was giving me __dirname with those slashes so there was no string function to do what I needed. So I had to use split with Node's special path.sep and then rejoin with the other slash type.
let formatted_folder = folder.split(path.sep).join('/');
According to your description of what you want, you should do
const path = require('path');
const idx = __dirname.lastIndexOf(path.sep);
const res = __dirname.slice(idx);
(or idx + 1, if you don't need the separator itself in the result).
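The vanishing slashes in the question come from escape processing in the string literal, not from replace. A small sketch of that, assuming Node.js; it uses path.win32 so the Windows rules apply on any host OS, and the variable names are illustrative:

```javascript
const path = require('path');

// Unescaped literal: \p, \b, \t and \s are parsed as escape sequences,
// so the resulting string contains no backslash at all.
const broken = 'C:\path\build\test\subfolder';

// Escaping each backslash (or using String.raw) keeps them.
const dir = 'C:\\path\\build\\test\\subfolder';
const same = String.raw`C:\path\build\test\subfolder`;

// Last path segment, with Windows separator rules regardless of host OS.
const viaBasename = path.win32.basename(dir);

// Same result with plain string functions.
const viaSlice = dir.slice(dir.lastIndexOf('\\') + 1);

console.log(broken.includes('\\'), dir === same, viaBasename, viaSlice);
```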

Getting element from filename using continous split or regex

I currently have the following string :
AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv
But I would like to split it to only get the following result (removing all tree directories + removing timestamp before the file):
1564416946615-file-test.dsv
I currently have the following code, but it doesn't work when the filename itself contains a '-', as in the example.
getFilename(str){
return(str.split('\\').pop().split('/').pop().split('-')[1]);
}
I don't want to use a loop for performance reasons (I may have lots of files to work with...). So is there another solution (maybe regex?)
We can try doing a regex replacement with the following pattern:
.*\/\d+-\b
Replacing the match with empty string should leave you with the result you want.
var filename = "AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv";
var output = filename.replace(/.*\/\d+-\b/, "");
console.log(output);
The pattern works by using .*\/ to first consume everything up to and including the final path separator. Then, \d+- consumes the timestamp as well as the dash that follows, leaving only the portion you want.
You may use this regex and get captured group #1:
/[^\/-]+-(.+)$/
RegEx Details:
[^\/-]+: Match 1+ of any character that is not / and not -
-: Match literal -
(.+): Match 1+ of any characters
$: End
Code:
var filename = "AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv";
var m = filename.match(/[^\/-]+-(.+)$/);
console.log(m[1]);
//=> 1564416946615-file-test.dsv
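The asker's getFilename also handled backslash-separated paths. A sketch that keeps that behaviour with two replaces and no loop, assuming the leading timestamp is always a run of digits followed by a dash:

```javascript
// Sketch only: strips the directory part, then the first "digits-" prefix.
function getFilename(str) {
  return str
    .replace(/^.*[\\\/]/, '') // drop everything up to the last \ or /
    .replace(/^\d+-/, '');    // drop the leading run of digits and its dash
}

console.log(getFilename("AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv"));
```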

How would I write a Regular Expression to capture the value between Last Slash and Query String?

Problem:
Extract image file name from CDN address similar to the following:
https://cdnstorage.api.com/v0/b/my-app.com/o/photo%2FB%_2.jpeg?alt=media&token=4e32-a1a2-c48e6c91a2ba
Two-stage Solution:
I am using two regular expressions to retrieve the file name:
var postLastSlashRegEx = /[^\/]+$/,
preQueryRegEx = /^([^?]+)/;
var fileFromURL = urlString.match(postLastSlashRegEx)[0].match(preQueryRegEx)[0];
// fileFromURL = "photo%2FB%_2.jpeg"
Question:
Is there a way I can combine both regular expressions?
I've tried using capture groups, but haven't been able to produce a working solution.
From my comment
You can use a lookahead to find the "?" and use [^/] to match any non-slash characters.
/[^/]+(?=\?)/
To remove the dependency on the URL needing a "?", you can make the lookahead match a question mark or the end of line indicator (represented by $), but make sure the first glob is non-greedy.
/[^/]+?(?=\?|$)/
You don't have to use regex, you can just use split and substr.
var str = "https://cdnstorage.api.com/v0/b/my-app.com/o/photo%2FB%_2.jpeg?alt=media&token=4e32-a1a2-c48e6c91a2ba".split("?")[0];
var fileName = str.substr(str.lastIndexOf('/')+1);
but if regex is important to you, then:
str.match(/[^?]*\/([^?]+)/)[1]
The code using the substring method would look like the following -
var fileFromURL = urlString.substring(urlString.lastIndexOf('/') + 1, urlString.lastIndexOf('?'))
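The substring version above relies on the URL containing a "?". A hedged sketch that also copes with a missing query string; the function name fileNameFromUrl and the example.com URL are assumptions for illustration:

```javascript
// Hypothetical helper: text between the last "/" of the path and the "?" (if any).
function fileNameFromUrl(urlString) {
  const q = urlString.indexOf('?');
  // Cut the query string off first so a "/" inside it cannot confuse lastIndexOf.
  const withoutQuery = q === -1 ? urlString : urlString.slice(0, q);
  return withoutQuery.slice(withoutQuery.lastIndexOf('/') + 1);
}

console.log(fileNameFromUrl(
  "https://cdnstorage.api.com/v0/b/my-app.com/o/photo%2FB%_2.jpeg?alt=media&token=4e32-a1a2-c48e6c91a2ba"
));
console.log(fileNameFromUrl("https://example.com/a/b.png"));
```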

Any way to extract string within 2 different special characters using javascript?

Hi I have a varying URL similar to:
http://farm4.staticflickr.com/3877/[image_id]_[secret].jpg
e.g. http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg
I need to extract image_id that's first set of numbers (i.e. 14628998490) before an underscore from 14628998490_233a15c423_q.jpg between the whole URL
Is there a good way to extract image_id?
Right now I am going to use:
var image_id = image_url.match(/[\/]([0-9]+)_/)[1]
As I said in the comment, you don't need to escape the / symbol inside a character class, and you don't even need a character class here; just \/ is enough. The regex below captures one or more digits that are preceded by / and followed by _.
\/(\d+)_
DEMO
> var image_id = image_url.match(/\/(\d+)_/)[1]
undefined
> image_id
'14628998490'
OR
You could try this also, if you don't want to give \d+ in your pattern.
\/([^/]*?)_
DEMO
> var image_id = image_url.match(/\/([^/]*?)_/)[1]
undefined
> image_id
'14628998490'
Not sure that it's a better way, but you can do it like this:
var str = 'http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg';
var image_id = str.split('/').pop().split('.')[0].split('_')[0];
If the special character is always the same (_), you could first obtain the last part (with substr and lastIndexOf) and then use split():
var url = "http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg";
var splittedUrl = url.substr(url.lastIndexOf('/')+1).split("_");
var image_id = splittedUrl[0];
console.log(image_id);
I've read somewhere that string functions are faster than regexes, so it's an option you might consider.
String splitting is faster than regex. You can just find the last index of / and take the string between it and the first occurrence of _ after it. I think that would be a better idea.
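The index-based approach described above can be sketched as follows; the function name imageIdFromUrl is made up for illustration, and no performance claim is implied:

```javascript
// Hypothetical helper: text between the last "/" and the first "_" after it.
function imageIdFromUrl(url) {
  const start = url.lastIndexOf('/') + 1;
  const end = url.indexOf('_', start);
  // If there is no underscore, fall back to the whole last segment.
  return end === -1 ? url.slice(start) : url.slice(start, end);
}

console.log(imageIdFromUrl("http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg"));
```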

How to use href.replace in extjs

How do I use href.replace in ExtJS?
This is my sample:
'iconCls': 'icon_' + href.replace(/[^.]+\./, '')
href = http://localhost:1649/SFM/Default.aspx#/SFM/config/release_history.png
Now I want to get the text "release_history.png". How do I get it?
Thanks
If you just want the filename, it's probably easier to do:
var href = "http://localhost:1649/SFM/Default.aspx#/SFM/config/release_history.png";
var iconCls = 'icon_' + href.split('/').pop();
Update
To get the filename without the extension, you can do something similar:
var filename = "release_history.png";
var without_ext = filename.split('.');
// Get rid of the extension
without_ext.pop()
// Join the filename back together, in case
// there were any other periods in the filename
// and to get a string
without_ext = without_ext.join('.')
Some regex solutions (shown with the / delimiters):
As in your example code, match the start of the URL, which can be dropped:
href.replace(/^.*\//, '')
Or use a regex to get the last part of the URL that you want to keep:
/(?<=\/)[^.\/]+\.[^.]+$/
Update
Or get the icon name without .png (this uses the lookbehind and lookahead features of regex):
(?<=\/)[^.\/]+(?=\.png)
Not all regex flavors support all lookaround features. JavaScript has long supported lookahead, but lookbehind was only added in ES2018, so for older engines your solution is probably this:
[^.\/]+(?=\.png)
code examples here:
http://www.myregextester.com/?r=6acb5d23
http://www.myregextester.com/?r=b0a88a0a
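Putting the replace-based variants together, a minimal sketch that builds the icon class without any lookaround; the variable names are illustrative, and dropping the extension is an assumption about what the class name should look like:

```javascript
const href = "http://localhost:1649/SFM/Default.aspx#/SFM/config/release_history.png";

// Drop everything up to and including the last "/".
const fileName = href.replace(/^.*\//, '');

// Drop the final ".ext" if there is one.
const baseName = fileName.replace(/\.[^.\/]+$/, '');

const iconCls = 'icon_' + baseName;
console.log(fileName, baseName, iconCls);
```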
