Regex find string and replace that line and following lines - javascript

I am trying to find a regex to achieve the following criteria which I need to use in javascript.
Input file
some string is here and above this line
:62M:C111111EUR1211498,00
:20:0000/11111000000
:25:1111111111
:28C:00001/00002
:60M:C170926EUR1211498,06
:61:1710050926C167,XXNCHKXXXXX 11111//111111/111111
Output has to be
some string is here and above this line
:61:1710050926C167,XXNCHKXXXXX 11111//111111/111111
Briefly, find :62M: and then replace (and delete) the line starting with :62M: and the following lines starting with :20:, :25:, :28C: and :60M:.
Or, find :62M: and replace (and delete) until the line starting with :61:.
Each line has fixed length of 80 characters followed by newline (CR LF).
Is this really possible with regex?
I know how to find a string and replace the line that contains it. But here multiple lines need to be removed, which is quite hard for me.
Please could someone help me out if it is possible with regex.

Here it is. The regex matches from the line starting with :62M: up to, but not including, the line starting with :61: (note that I'm using [^]* instead of .* because, unlike ., [^] also matches newlines). The m flag makes ^ match at the start of each line. Then I'm replacing the match with an empty string, which deletes those lines.
var regex = /^:62M:[^]*?(?=^:61:)/m;
var text = `some string is here and above this line
:62M:C111111EUR1211498,00
:20:0000/11111000000
:25:1111111111
:28C:00001/00002
:60M:C170926EUR1211498,06
:61:1710050926C167,XXNCHKXXXXX 11111//111111/111111`;
// Remove the :62M: line and everything after it up to the :61: line.
var result = text.replace(regex, '');
console.log(result);
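Since the question mentions CR LF line endings, here is a minimal sketch of the same idea as a reusable function. The name stripStatementHeaders is made up for illustration, and the sketch assumes every :62M: block is eventually followed by a line starting with :61: (a block with no following :61: line is left untouched). It uses [\s\S] in place of [^], which should behave the same way here.

```javascript
// Hypothetical helper (not from the original answer): removes every block
// that starts at a ":62M:" line and ends just before the next ":61:" line.
function stripStatementHeaders(text) {
  // m: ^ matches at the start of each line; g: handle every block.
  // [\s\S]*? lazily matches any characters, including CR and LF.
  return text.replace(/^:62M:[\s\S]*?(?=^:61:)/gm, '');
}

// Sample input joined with CR LF, as described in the question.
const sample = [
  'some string is here and above this line',
  ':62M:C111111EUR1211498,00',
  ':20:0000/11111000000',
  ':25:1111111111',
  ':28C:00001/00002',
  ':60M:C170926EUR1211498,06',
  ':61:1710050926C167,XXNCHKXXXXX 11111//111111/111111'
].join('\r\n');

console.log(stripStatementHeaders(sample));
```

The lookahead (?=^:61:) keeps the :61: line itself out of the match, so only the unwanted lines and their line endings are deleted.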

Related

Regex Match End of Line Unless it Ends with a Closed Bracket

I'm trying to write a JavaScript Regex that will grab the end of a line unless said line ends with a closing bracket, example:
[word]
lengthy text line
[other word]
even lengthier text line! Whoo!
That part I have down pat writing up this Regex new RegExp(/[\n]\n|(?![^\]])$/gm)
But I also need to be able to grab the end of the line even where there isn't a double space, and that is proving to be SUPER difficult since I don't really know a ton about Regex.
-- [word]
These two lines need to be grouped -- lengthy text line
-- [other word]
These two lines need to be grouped -- even lengthier text line! Whoo!
This needs to be its own group -- This text line is the longest of them all!
-- [more words]
These two lines need to be grouped -- The last guy can win...
What's annoying is that there is a very simple regex that accomplishes this goal, the negative lookbehind assertion (?<!])\n, but it's not currently supported in Firefox, and that's a problem.
EDIT: The method used for the information is splitting, it splits the value placed into a textarea and matches it to array[i].match(/^\[(.*?)\]\n/). It'd look something like this:
var regex = new RegExp(/[\n]\n|(?![^\]])$/gm);
var array = $('#textar').val().split(regex);
for (var i = 0; i < array.length; i++) {
    var match = array[i].match(/^\[(.*?)\]\n/);
}
but with a lot more code taking those variables and using them.
SOLUTION:
Wiktor Stribiżew had the solution. Changing .split(regex) to .match(regex) and adding their regex fixed the problem
var regex = new RegExp(/^.*[^\]\n](?:\]\n.*[^\]\n])*$/gm);
var array = $('#textar').val().match(regex);
for (var i = 0; i < array.length; i++) {
    var match = array[i].match(/^\[(.*?)\]\n/);
}
You may use String#match:
text.match(/^.*[^\]\n](?:\]\n.*[^\]\n])*$/gm)
Regex details
^ - start of a line
.*[^\]\n] - 0 or more chars other than line break chars, as many as possible, and then a char other than a newline and ]
(?:\]\n.*[^\]\n])* - 0 or more repetitions of
    \]\n - ] and a newline (LF) char
    .*[^\]\n] - 0 or more chars other than line break chars, as many as possible, and then a char other than a newline and ]
$ - end of a line.
See the JS demo:
var text = "[word]\nlengthy text line\n\n[other word]\neven lengthier text line! Whoo!\nThis text is the longest of them all!\n[more words]\nThe last guy can win...";
console.log(text.match(/^.*[^\]\n](?:\]\n.*[^\]\n])*$/gm));
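To connect this with the loop from the question, here is a small sketch that runs the same match and then separates each group into its bracketed tag and the text that follows. The function name groupEntries and the { tag, body } shape are assumptions made for illustration, not part of the original answer.

```javascript
// Hypothetical helper: groups lines with the regex from the answer above,
// then separates the optional "[tag]" first line from the text that follows.
function groupEntries(text) {
  // match() returns null when nothing matches, so fall back to [].
  const groups = text.match(/^.*[^\]\n](?:\]\n.*[^\]\n])*$/gm) || [];
  return groups.map(function (group) {
    const m = group.match(/^\[(.*?)\]\n/);
    return m
      ? { tag: m[1], body: group.slice(m[0].length) }
      : { tag: null, body: group }; // a line with no bracketed tag above it
  });
}

const entries = groupEntries(
  "[word]\nlengthy text line\n\n[other word]\neven lengthier text line! Whoo!\n" +
  "This text is the longest of them all!\n[more words]\nThe last guy can win..."
);
console.log(entries);
```

Lines that stand alone, such as the "longest of them all" line, come back with tag set to null, so the caller can tell the two cases apart.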
You are looking for a regex like this:
/^\[.+(\n+[^\[]+)/gm
^ at the beginning of a line (because of the m flag),
\[ look for a literal [
.+ followed by one or more characters other than line breaks
(\n+[^\[]+) one or more newlines, then one or more characters as long as they are not [
Demo: https://regex101.com/r/c1giqu/3
For your convenience, the full match keeps the bracketed line together with the text that follows it. The first group includes only the following text, without the bracketed line.

Getting element from filename using continous split or regex

I currently have the following string :
AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv
But I would like to split it to only get the following result (removing all tree directories + removing timestamp before the file):
1564416946615-file-test.dsv
I currently have the following code, but it's not working when the filename itself contains a '-' like in the example.
getFilename(str){
return(str.split('\\').pop().split('/').pop().split('-')[1]);
}
I don't want to use a loop for performance reasons (I may have lots of files to work with...). So is there another solution (maybe regex?)
We can try doing a regex replacement with the following pattern:
.*\/\d+-\b
Replacing the match with empty string should leave you with the result you want.
var filename = "AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv";
var output = filename.replace(/.*\/\d+-\b/, "");
console.log(output);
The pattern works by using .*\/ to first consume everything up to, and including, the final path separator. Then, \d+- consumes the timestamp as well as the dash that follows, leaving only the portion you want.
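As a sketch of how this could replace the getFilename function from the question, the version below splits the work into two replacements so that it also covers backslash separators, which the question's code handled via split('\\'). The two-step split and the Windows-style test path are assumptions added for illustration.

```javascript
// Sketch of the question's getFilename using replacements instead of split.
// Assumption: the prefix to drop is a run of digits followed by a dash.
function getFilename(path) {
  return path
    .replace(/^.*[\\/]/, '') // drop everything up to the last "/" or "\"
    .replace(/^\d+-/, '');   // drop the leading "<digits>-" timestamp
}

console.log(getFilename("AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv"));
console.log(getFilename("C:\\dir\\1565079415419-report.csv"));
```

Because only the first "<digits>-" prefix is removed, any further dashes in the filename are left alone.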
You may use this regex and get captured group #1:
/[^\/-]+-(.+)$/
RegEx Details:
[^\/-]+: Match 1+ of any characters that are not / and not -
-: Match literal -
(.+): Match 1+ of any characters
$: End
Code:
var filename = "AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv";
var m = filename.match(/[^\/-]+-(.+)$/);
console.log(m[1]);
//=> 1564416946615-file-test.dsv

JS conditional RegEx that removes different parts of a string between two delimiters

I have a string of text with HTML line breaks. Some of the <br> immediately follow a number between two delimiters «...» and some do not.
Here's the string:
var str = ("«1»<br>«2»some text<br>«3»<br>«4»more text<br>«5»<br>«6»even more text<br>");
I’m looking for a conditional regex that’ll remove the number and delimiters (ex. «1») as well as the line break itself without removing all of the line breaks in the string.
So for instance, at the beginning of my example string, when the script encounters »<br> it’ll remove everything between and including the first « to the left, to »<br> (ex. «1»<br>). However it would not remove «2»some text<br>.
I’ve had some help removing the entire number/delimiters (ex. «1») using the following:
var regex = new RegExp(UsedKeys.join('|'), 'g');
var nextStr = str.replace(/«[^»]*»/g, " ");
I sure hope that makes sense.
Just to be super clear, when the string is rendered in a browser, I’d like to go from this…
«1»
«2»some text
«3»
«4»more text
«5»
«6»even more text
To this…
«2»some text
«4»more text
«6»even more text
Many thanks!
Maybe I'm missing a subtlety here, if so I apologize. But it seems that you can just replace with the regex: /«\d+»<br>/g. This will replace all occurrences of a number between « & » followed by <br>
var str = "«1»<br>«2»some text<br>«3»<br>«4»more text<br>«5»<br>«6»even more text<br>"
var newStr = str.replace(/«\d+»<br>/g, '')
console.log(newStr)
To match letters and digits (and underscores) you can use \w instead of \d
var str = "«a»<br>«b»some text<br>«hel»<br>«4»more text<br>«5»<br>«6»even more text<br>"
var newStr = str.replace(/«\w+?»<br>/g, '')
console.log(newStr)
This snippet assumes that the input within the brackets will always be a number but I think it solves the problem you're trying to solve.
const str = "«1»<br>«2»some text<br>«3»<br>«4»more text<br>«5»<br>«6»even more text<br>";
console.log(str.replace(/(«(\d+)»<br>)/g, ""));
/(«(\d+)»<br>)/g
«(\d+)» Will match any brackets containing 1 or more digits in a row
If you would prefer to match alphanumeric you could use «(\w+)» or for any characters including symbols you could use «([^»]+)»
<br> Will match a line break
The g flag matches globally so that it can find every instance of the substring
Basically we are only removing the bracketed numbers if they are immediately followed by a line break.
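Putting the variants above together, a minimal sketch could look like the following. The function name removeEmptyMarkers is hypothetical, and tolerating <br/>, <br /> and uppercase <BR> is an extra assumption that goes beyond the question's input.

```javascript
// Hypothetical helper: removes a «...» marker only when a line break
// immediately follows it, whatever the marker contains.
function removeEmptyMarkers(html) {
  // [^»]+ : any marker content; <br\s*\/?> : <br>, <br/> or <br />
  // g: every occurrence; i: also match <BR>
  return html.replace(/«[^»]+»<br\s*\/?>/gi, '');
}

const input = "«1»<br>«2»some text<br>«3»<br>«4»more text<br>«5»<br>«6»even more text<br>";
console.log(removeEmptyMarkers(input));
```

Markers that are followed by text, such as «2»some text<br>, do not match because the pattern requires the line break to come directly after ».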

Get Newline character using javascript

In my html page I have to split user input based on newline character.
How to get newline character using javascript?
Please see the below code :
var str=document.getElementById('nwline').value;
var lines = str.split(/\r\n|\r|\n/g);
console.log(lines);
http://jsfiddle.net/asimshahiddIT/0yog7v83/
To summarize the possible duplicate: using a regex lets you ignore which OS, and therefore which line-ending convention, the text came from:
I don't think you really need to do much of any determining, though. If you just want to split the text on newlines, you could do something like this:
lines = foo.value.split(/\r\n|\r|\n/g);
In your case:
var splittedValues = originalTxt.split(/\r\n|\r|\n/g);
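The order of the alternatives in that pattern matters, which the short sketch below tries to show. The function name splitLines and the sample strings are made up for illustration.

```javascript
// Hypothetical helper: splits text on Windows (\r\n), old Mac (\r)
// and Unix (\n) line endings.
function splitLines(text) {
  // \r\n must come first; otherwise "\r\n" is split as two separate breaks.
  return text.split(/\r\n|\r|\n/);
}

console.log(splitLines("a\r\nb\nc\rd"));

// For comparison: with the alternatives in the wrong order, a CR LF pair
// produces an extra empty string between the two lines.
console.log("a\r\nb".split(/\r|\n|\r\n/));
```

The g flag used in the snippets above is harmless but not required, since split already applies the separator throughout the string.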

Javascript Regex only replacing first match occurence

I am using regular expressions to do some basic converting of wiki markup code into copy-pastable plain text, and I'm using javascript to do the work.
However, javascript's regex engine behaves much differently to the ones I've used previously as well as the regex in Notepad++ that I use on a daily basis.
For example- given a test string:
==Section Header==
===Subsection 1===
# Content begins here.
## Content continues here.
I want to end up with:
Section Header
Subsection 1
# Content begins here.
## Content continues here.
Simply remove all equals signs.
I began with the regex setup of:
var reg_titles = /(^)(=+)(.+)(=+)/
This regex searches for lines that begin with one or more equals signs and contain another set of one or more equals signs. Rubular shows that it matches my lines accurately and does not catch equals signs in the middle of content. http://www.rubular.com/r/46PrkPx8OB
The code to replace the string based on regex
var lines = $('.tb_in').val().split('\n'); // use jQuery to grab text in a textarea, and split into an array of lines based on the \n
for (var i = 0; i < lines.length; i++) {
    var line_temp = lines[i].replace(reg_titles, "");
    lines[i] = line_temp; // replace line with temp
}
$('.tb_out').val(lines.join("\n")); // rejoin and print result
My result is unfortunately:
Section Header==
Subsection 1===
# Content begins here.
## Content continues here.
I cannot figure out why the regex replace function, when it finds multiple matches, seems to only replace the first instance it finds, not all instances.
Even when my regex is updated to:
var reg_titles = /(={2,})/
"Find any two or more equals", the output is still identical. It makes a single replacement and ignores all other matches.
No other regex engine behaves this way for me. Running the same replace multiple times has no effect.
Any advice on how to get my string replace function to replace ALL instances of the matched regex instead of just the first one?
^=+|=+$
You can use this. Do not forget to add the g and m flags. Replace with an empty string. See the demo.
http://regex101.com/r/nA6hN9/28
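A minimal sketch of that suggestion applied to the whole text at once, without splitting into lines first. The function name stripTitleMarkers is hypothetical. The last line illustrates the behavior the question ran into: without the g flag, replace only changes the first match.

```javascript
// Hypothetical helper: strips leading and trailing runs of "=" on every line.
function stripTitleMarkers(text) {
  // m: ^ and $ apply to each line; g: replace every match, not just the first.
  return text.replace(/^=+|=+$/gm, '');
}

const wiki = [
  '==Section Header==',
  '===Subsection 1===',
  '# Content begins here.',
  '## Content continues here.'
].join('\n');

console.log(stripTitleMarkers(wiki));

// Without the g flag only the first run of equals signs is removed.
console.log('==Section Header=='.replace(/=+/, ''));
```

Equals signs in the middle of a line are left alone, because each alternative is anchored to the start or the end of a line.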
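Add the g modifier to do a global search, and replace with '$2' rather than an empty string so that the title text captured by the second group is kept:
var reg_titles = /^(=+)(.+?)(=+)/g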
Your regex is needlessly complex, and yet doesn't actually accomplish what you set out to do. :) You might try something like this instead:
var reg_titles = /^=+(.+?)=+$/;
var lines = $('.tb_in').val().split('\n');
lines.forEach(function(v, i, a) {
    a[i] = v.replace(reg_titles, '$1'); // keep only the text between the equals signs
});
$('.tb_out').val(lines.join("\n"));
