User customizable regex expression for string formatting - javascript

We have a web-app that allows clients to import some CSV data into our database, e.g. a list of products they sell.
The problem is, we'd like to store their data as-is but let the user specify a custom expression so that when they view the data it looks a bit nicer.
Some imported data might look like this:
product_label,quantity
A: Product1- 001,50
A: Product2- 001,80
A: Product3- 001,150
B: Product5- 001,100
In this case, the client might want to strip the prefix 'A: ' and the suffix '- 001' from the string 'A: Product1- 001' so that just 'Product1' is displayed.
The problem is, every client seems to have a different string format and desired output format.
I was thinking of providing the ability to specify a regex to format the string purely on the client side using JavaScript, but I'm not sure how I would use this regex or how to let them specify grouping or back-references.
Any suggestions on how to let them specify their own format? e.g. something like:
match_pattern = ... // input field text (escaped into a regex)
output_pattern = ... // How to let them specify the output from the match results?
display_string = applyFormatting(string, match_pattern, output_pattern);

Here's some Regex to split the string up.
// Set the original string
var strOriginal = 'B: Product5- 001,100';
// Settings to specify which parts they want
var bln = [];
bln[0] = true;
bln[1] = true;
bln[2] = false;
bln[3] = false;
// Split the original string up with a single match; each capture group is one part
var parts = /([A-Z]:\s)([A-Za-z0-9]+?)(-\s\d+?)(,\d+)/.exec(strOriginal);
var str = parts ? parts.slice(1) : [];
var strOutput = '';
for (var i = 0; i < str.length; i++) {
  if (bln[i]) {
    strOutput += str[i] + '<br />';
  }
}
document.getElementById('test').innerHTML = strOutput;
<div id="test"></div>
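The applyFormatting function the question asks about could be sketched as follows. This is only an illustration under stated assumptions: the user types the regex source into one field and a replacement template with $1-style back-references into another, and a string that does not match is shown unchanged. The function body and the sample patterns are hypothetical, not part of the original post.

```javascript
// Sketch: apply a user-supplied pattern and output template to one value.
// matchPattern is the regex source typed by the user (assumption),
// outputPattern is a replacement template such as '$1' or '$2 ($1)'.
function applyFormatting(string, matchPattern, outputPattern) {
  var regex;
  try {
    regex = new RegExp(matchPattern);
  } catch (e) {
    return string; // invalid pattern: fall back to the stored value
  }
  // Only rewrite values that match; everything else is displayed as-is.
  return regex.test(string) ? string.replace(regex, outputPattern) : string;
}

var pattern = '^[A-Z]: (.+?)- \\d+$';
console.log(applyFormatting('A: Product1- 001', pattern, '$1')); // Product1
console.log(applyFormatting('no match here', pattern, '$1'));    // no match here
```

Because String.prototype.replace already understands $1, $2 and so on in the replacement string, the output field needs no parsing of its own.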

Related

Data extraction from a generated <script> and process the results

string Url= "https://www.audiusa.com/dealers-webapp/map/dealer/423E99";
HtmlWeb web = new HtmlWeb();
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
HtmlDocument doc = web.Load(Url);
var scriptGoogleTagManager = doc.DocumentNode.SelectNodes("//script").Where(x => x.InnerHtml.Contains("window.Audi.Vars.searchType"));
if (scriptGoogleTagManager )
{
foreach(var tag in scriptGoogleTagManager)
{
var s = tag.InnerText;
Regex r = new Regex("\\s+window\\.Audi\\.Vars\\.searchResult\\s+\\=\\s+");
Match m = r.Match(s.ToLower());
}
}
In the script above, I want to extract the values that follow window.Audi.Vars.searchResult = and window.Audi.Vars.dealers =. I am having trouble with the regex because I don't know much about it. Kindly help me.
I understand you want to get rid of e.g.
window.Audi.Vars.searchResult =
var extract = s.slice(31); // since the string "window.Audi.Vars.searchResult =" has 31 chars
The slice() method extracts part of a string and returns it as a new string. Use the start and end parameters to specify the part you want to extract; here only the start parameter is given, so it extracts to the end of the string. The first character has position 0, the second has position 1, and so on. Regex is, in my opinion, good for replacing or removing characters in a string; here a simpler method works.
Modify your code and post the console result:
var scriptGoogleTagManager = doc.DocumentNode.SelectNodes("//script").Where(x => x.InnerHtml.Contains("window.Audi.Vars.searchType"));
if (scriptGoogleTagManager )
{
foreach(var tag in scriptGoogleTagManager)
{
var s = tag.InnerText;
console.debug("[content of s] " + s);
var extract = s.slice(31); // since the string
}
}
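A hard-coded offset of 31 only works when the marker sits at the very start of the string. As a sketch in JavaScript (the language slice() belongs to), the offset can be looked up with indexOf instead. The sample script text and the helper name valueAfter are assumptions made for illustration; the real page content may differ.

```javascript
// Hypothetical sample of the script text; the real page content may differ.
var s = 'window.Audi.Vars.searchType = "dealer";\n' +
        'window.Audi.Vars.searchResult = {"count": 1};\n';

// Return the text between `marker` and the end of that line, or null if absent.
function valueAfter(source, marker) {
  var start = source.indexOf(marker);
  if (start === -1) { return null; }
  var rest = source.slice(start + marker.length);
  var lineEnd = rest.indexOf('\n');
  var line = lineEnd === -1 ? rest : rest.slice(0, lineEnd);
  // Drop surrounding whitespace and one trailing semicolon.
  return line.trim().replace(/;$/, '');
}

console.log(valueAfter(s, 'window.Audi.Vars.searchResult =')); // {"count": 1}
```

This assumes the value fits on one line; a value spanning several lines would need a different end condition.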

How to define a line break in extendscript for Adobe Indesign

I am using extendscript to build some invoices from downloaded plaintext emails (.txt)
At points in the file there are lines of text that look like "Order Number: 123456" and then the line ends. I have a script made from parts I found on this site that finds the end of "Order Number:" in order to get a starting position of a substring. I want to use where the return key was hit to go to the next line as the second index number to finish the substring. To do this, I have another piece of script from the helpful people of this site that makes an array out of the indexes of every instance of a character. I will then use whichever array object is a higher number than the first number for the substring.
It's a bit convoluted, but I'm not great with Javascript yet, and if there is an easier way, I don't know it.
What is the character I need to use to emulate a return key in a txt file in javascript for extendscript for indesign?
Thank you.
I have tried things like \n and \r\n and ^p both with and without quotes around them but none of those seem to show up in the array when I try them.
//Load Email as String
var b = new File("~/Desktop/Test/email.txt");
b.open('r');
var str = "";
while (!b.eof)
str += b.readln();
b.close();
var orderNumberLocation = str.search("Order Number: ") + 14;
var orderNumber = str.substring(orderNumberLocation, ARRAY NUMBER GOES HERE)
var loc = orderNumberLocation.lineNumber
function indexes(source, find) {
var result = [];
for (i = 0; i < source.length; ++i) {
// If you want to search case insensitive use
// if (source.substring(i, i + find.length).toLowerCase() == find) {
if (source.substring(i, i + find.length) == find) {
result.push(i);
}
}
alert(result)
}
indexes(str, NEW PARAGRAPH CHARACTER GOES HERE)
I want all my line breaks to show up as an array of indexes in the variable "result".
Edit: My method of importing stripped all line breaks from the document. Using the code below instead works better. Now \n works.
var file = File("~/Desktop/Test/email.txt", "utf-8");
file.open("r");
var str = file.read();
file.close();
You need to use regular expressions. Depending on the fields you need to search for, you'll need to tweak the regular expression, but I can give you a starting point. If the fields in the email are separated by new lines, something like this will work:
var str; //your string
var fields = {}
var lookFor = /(Order Number:|Adress:).*?\n/g;
str.replace(lookFor, function(match){
var order = match.split(':');
var field = order[0].replace(/\s/g, '');//remove all spaces
var value = order[1];
fields[field]= value;
})
With (Order Number:|Adress:) you are looking for the fields; you can add more fields inside the parentheses, separated by the or character |. The .*?\n part matches any characters up to the first line break. The g flag indicates that you want to look for all matches. Then you call str.replace, because it lets you perform a task on each match. If the separator between the field and the value is a colon ':', you split the match into an array of two values, ['Order Number', ' 12345\n'], and then store the value in an object under the field name. That code will produce (apart from the whitespace around each value, which is kept as-is):
fields = {
OrderNumber: "12345",
Adress: "my fake adress 000"
}
Please try \n and \r
Example: indexes(str, "\r");
If I've understood correctly, what you need is str.split():
function indexes(source, find) {
var result = [];
var orders = source.split('\n'); // returns an array of lines: ["order: 12345", "order:54321", ...]
for (var i = 0, l = orders.length; i < l; i++)
{
// indexOf treats find as plain text; /find/ would only match the literal word "find"
if (orders[i].indexOf(find) !== -1) {
result.push(i);
}
}
return result;
}
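For the original goal of reading the text between "Order Number: " and the end of its line, a sketch using indexOf may be enough once the line breaks are preserved. The helper name readField and the sample text are hypothetical; the code sticks to old-style JavaScript on the assumption that ExtendScript lacks newer helpers such as trim().

```javascript
// Return the text after `label` up to the end of that line, or null if absent.
function readField(text, label) {
  var start = text.indexOf(label);
  if (start === -1) { return null; }
  start += label.length;
  var end = text.indexOf('\n', start); // next line break after the label
  if (end === -1) { end = text.length; }
  // Strip surrounding whitespace, including a '\r' left by Windows line endings.
  return text.substring(start, end).replace(/^\s+|\s+$/g, '');
}

// Hypothetical email text with Windows-style line endings.
var sample = 'Hello\r\nOrder Number: 123456\r\nAdress: my fake adress 000\r\n';
var orderNumber = readField(sample, 'Order Number: '); // "123456"
```

Passing the start position as the second argument to indexOf avoids building an array of every line-break index first.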

Safest delimiter or convert from javascript hashmap to java linked map?

I have data like this, TITLE is the name of variable and below it is the string data (Please note the data can be different sometimes)
TITLE1
abcd
TITLE2
abcde
TITLE3
acd,sdssds!###$#$#%$%$^&**()aas
Now, I want to send these three to Java and build a linked map from them. I did it like this:
JAVASCRIPT
var string = "TITLE1=abcd, TITLE2=abcde, TITLE3=acd,sdssds!###$#$#%$%$^&**()aas"
and used it in java as
JAVA
LinkedHashMap <String, String> selectedCheckBoxMap = new LinkedHashMap <String, String> ();
String[] pairs = selectedCheckBoxTokens.split (",");
for (int i = 0; i < pairs.length; i++)
{
String pair = pairs[i];
String[] keyValue = pair.split ("=");
selectedCheckBoxMap.put (keyValue[0], keyValue[1]);
}
But this breaks on "," as the delimiter, because TITLE3 already contains the character ','.
I then used the character "‡" instead of "," as the delimiter, but this is not a good approach.
What should I do? Should I change it to a hashmap in JavaScript? But how do I convert a hashmap from JavaScript to Java directly? Or should I use some other delimiter?
The delimiter should be character that we cannot type or ever come across in text.
Thanks,
Aiden
If your data is that regular, you could split on ", " instead.
If spaces can appear in your values, then you need to mark your data values, be that with single/double quotes or some other unique marker.
Building a JSON object and then delivering it would likely be a more robust solution.
If you use JSON:
var s = {};
s["TITLE1"] = "skladjdklsajdsla";
s["TITLE2"] = "*&^&^%&*,,()*&";
s["TITLE3"] = "acd,sdssds!###$#$#%$%$^&**()aas";
That code will create a JavaScript object:
Object {
TITLE1="skladjdklsajdsla",
TITLE2="*&^&^%&*,,()*&",
TITLE3="acd,sdssds!###$#$#%$%$^&**()aas"
}
You can parse the Object using a JSON library for Java:
http://www.oracle.com/technetwork/articles/java/json-1973242.html
See: How to parse JSON in Java
JSON Standard: http://www.json.org/
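On the JavaScript side, the JSON approach could be sketched as below. The variable names are illustrative, and how the payload reaches the Java side (form field, request body) is left open. For string keys like these, JSON.stringify writes properties in insertion order, so a Java parser that fills a LinkedHashMap in document order would see the same order; that last part depends on the Java library used and is an assumption here.

```javascript
// Build the data as an object instead of a delimited string.
var data = {};
data['TITLE1'] = 'abcd';
data['TITLE2'] = 'abcde';
data['TITLE3'] = 'acd,sdssds!###$#$#%$%$^&**()aas';

// Serialize once; commas and equals signs inside values need no special handling.
var payload = JSON.stringify(data);

// Round trip on the JavaScript side to show that nothing is lost.
var roundTrip = JSON.parse(payload);
console.log(payload);
console.log(Object.keys(roundTrip).join(','));
```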
When generating the string to be sent, escape the commas and the equals signs in the values by replacing them with something else, like %2C and %3D.
Then, on the server side, unescape by doing:
selectedCheckBoxMap.put (keyValue[0], keyValue[1].replace("%2C",",").replace("%3D","="));
Using common characters as delimiters has always been error-prone. You have to escape each delimiter you use and then parse the string yourself. This means that you have to call value.replace(/([\\=,])/g, '\\$1') on each entry before appending it to your data string.
Although I would recommend using JSON, as Alzoid proposed, here is an untested implementation you could use to decode the input (assuming '\' is your escape character):
boolean escaped = false;
boolean waitingForKey = true;
String key = "";
String current = "";
for (int i = 0; i < data.length(); i++) {
char character = data.charAt(i);
if (escaped) {
current += character;
escaped = false;
continue;
}
if (waitingForKey && Character.isWhitespace(character)) {
continue;
} else if (waitingForKey) {
waitingForKey = false;
}
switch (character) {
case '\\':
escaped = true;
break;
case '=':
key = current;
current = "";
break;
case ',':
map.put(key, current);
current = "";
key = "";
waitingForKey = true;
break;
default:
current += character;
}
}
if (!data.isEmpty()) {
map.put(key, current);
}
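The encoding side that such a decoder expects could be sketched in JavaScript like this. The function names escapeToken and encodeMap are hypothetical; the escaping itself is the value.replace call quoted in the answer, applied to keys as well as values.

```javascript
// Prefix every backslash, equals sign and comma with a backslash.
function escapeToken(value) {
  return String(value).replace(/([\\=,])/g, '\\$1');
}

// Join escaped key=value pairs with ", " (the decoder skips whitespace before a key).
function encodeMap(obj) {
  return Object.keys(obj).map(function (key) {
    return escapeToken(key) + '=' + escapeToken(obj[key]);
  }).join(', ');
}

var encoded = encodeMap({ TITLE1: 'abcd', TITLE3: 'acd,sd=s' });
console.log(encoded); // TITLE1=abcd, TITLE3=acd\,sd\=s
```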

regex to find specific strings in javascript

disclaimer - absolutely new to regexes....
I have a string like this:
subject=something||x-access-token=something
For this I need to extract two values. Subject and x-access-token.
As a starting point, I wanted to collect two strings: subject= and x-access-token=. For this here is what I did:
/[a-z,-]+=/g.exec(mystring)
It returns only one element, subject=. I expected both of them. What am I doing wrong?
The g modifier does not make exec return every match: by specification, each call to exec returns a single match (with g, the next call continues from where the previous one stopped). What you want is the match method:
mystring.match(/[a-z,-]+=/g)
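The difference between the two calls can be seen side by side in a short sketch (variable names are illustrative):

```javascript
var mystring = 'subject=something||x-access-token=something';

// match with the g flag returns every matched substring at once.
var viaMatch = mystring.match(/[a-z,-]+=/g);
console.log(viaMatch); // [ 'subject=', 'x-access-token=' ]

// exec returns one match per call; with g the regex remembers its position.
var re = /[a-z,-]+=/g;
var first = re.exec(mystring);
var second = re.exec(mystring);
var third = re.exec(mystring); // no further match, so this is null
console.log(first[0], second[0], third);
```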
No regex necessary. Write a tiny parser, it's easy.
function parseValues(str) {
var result = {};
str.split("||").forEach(function (item) {
var parts = item.split("=");
result[ parts[0] /* key */ ] = parts[1]; /* value */
});
return result;
}
usage
var obj = parseValues("subject=something||x-access-token=something-else");
// -> {subject: "something", "x-access-token": "something-else"}
var subj = obj.subject;
// -> "something"
var token = obj["x-access-token"];
// -> "something-else"
Additional complications may arise when there is an escaping scheme involved that allows you to have || inside a value, or when a value can contain an =.
You will hit these complications with regex approach as well, but with a parser-based approach they will be much easier to solve.
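As an illustration of how the parser-based approach absorbs one of those complications, the sketch below splits each item on its first = only, so a value may itself contain =. The function name parsePairs is hypothetical, and values containing || are still not handled.

```javascript
function parsePairs(str) {
  var result = {};
  str.split('||').forEach(function (item) {
    var eq = item.indexOf('='); // position of the first '=' only
    if (eq === -1) { return; }  // skip items without a key/value separator
    result[item.slice(0, eq)] = item.slice(eq + 1);
  });
  return result;
}

var parsed = parsePairs('subject=a=b||x-access-token=abc123');
console.log(parsed.subject);           // a=b
console.log(parsed['x-access-token']); // abc123
```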
You have to execute exec twice to get 2 extracted strings.
According to MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec
If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string.
Usually, people extract all strings matching the pattern one by one with a while loop. Please execute following code in browser console to see how it works.
var regex = /[a-z,-]+=/g;
var string = "subject=something||x-access-token=something";
var matched;
while ((matched = regex.exec(string)) !== null) console.log(matched);
You can convert the string into a valid JSON string, then parse it to retrieve an object containing the expected data.
var str = 'subject=something||x-access-token=something';
var obj = JSON.parse('{"' + str.replace(/=/g, '":"').replace(/\|\|/g, '","') + '"}');
console.log(obj);
I don't think you need regexp here, just use the javascript builtin function "split".
var s = "subject=something1||x-access-token=something2";
var r = s.split('||'); // r now is an array: ["subject=something1", "x-access-token=something2"]
var i;
for(i=0; i<r.length; i++){
// for each array's item, split again
r[i] = r[i].split('=');
}
At the end you have a matrix like the following:
y \ x   0                 1
0       subject           something1
1       x-access-token    something2
And you can access the elements using x and y:
"subject" == r[0][0]
"x-access-token" == r[1][0]
"something2" == r[1][1]
If you really want to do it with a pure regexp:
var input = 'subject=something1||x-access-token=something2'
var m = /subject=(.*)\|\|x-access-token=(.*)/.exec(input)
var subject = m[1]
var xAccessToken = m[2]
console.log(subject);
console.log(xAccessToken);
However, it would probably be cleaner to split it instead:
console.log('subject=something||x-access-token=something'
.split(/\|\|/)
.map(function(a) {
a = a.split(/=/);
return { key: a[0], val: a[1] }
}));

Split the date! (It's a string actually)

I want to split this kind of String :
"14:30 - 19:30" or "14:30-19:30"
inside a javascript array like ["14:30", "19:30"]
so I have my variable
var stringa = "14:30 - 19:30";
var stringes = [];
Should I do it with regular expressions? I think I need some help.
You can just use str.split :
var stringa = "14:30 - 19:30";
var res = stringa.split("-"); // ["14:30 ", " 19:30"]; trim each part to drop the spaces
If you know that the only '-' present will be the delimiter, you can start by splitting on that:
let parts = input.split('-');
If you need to get rid of whitespace surrounding that, you should trim each part:
parts = parts.map(function (it) { return it.trim(); });
To validate those parts, you can use a regex:
parts = parts.filter(function (it) { return /^\d\d:\d\d$/.test(it); });
Combined:
var input = "14:30 - 19:30";
var parts = input.split('-').map(function(it) {
return it.trim();
}).filter(function(it) {
return /^\d\d:\d\d$/.test(it);
});
document.getElementById('results').textContent = JSON.stringify(parts);
<pre id="results"></pre>
Try this :
var stringa = "14:30 - 19:30";
var stringes = stringa.split("-"); // for the "14:30-19:30" style
or
var stringes = stringa.split(" - "); // for the "14:30 - 19:30" style, so the spaces around the '-' character are consumed too
The split function breaks the string into substrings at each occurrence of the separator you pass to it. The first call splits on "-" alone; the second splits on " - ", including the surrounding spaces.
Also, this looks more like a 24-hour clock time format than a date, as you called it in your question.
var stringa = '14:30 - 19:30';
var stringes = stringa.split("-");
.split is probably the best way to go, though you will want to prep the string first. I would go with str.replace(/\s*-\s*/g, '-').split('-'). To demonstrate:
var str = "14:30 - 19:30"
var str2 = "14:30-19:30"
console.log(str.replace(/\s*-\s*/g, '-').split('-')) //outputs ["14:30", "19:30"]
console.log(str2.replace(/\s*-\s*/g, '-').split('-')) //outputs ["14:30", "19:30"]
Don't forget that you can pass a RegExp into str.split
'14:30 - 19:30'.split(/\s*-\s*/); // ["14:30", "19:30"]
'14:30-19:30'.split(/\s*-\s*/); // ["14:30", "19:30"]
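If the input should also be validated, splitting and checking can be combined in one regex with two capture groups. This is a sketch; the function name parseTimeRange is hypothetical, and it only checks the shape dd:dd, not whether the hours and minutes are in range.

```javascript
// Return ["HH:MM", "HH:MM"] for "HH:MM - HH:MM" or "HH:MM-HH:MM", else null.
function parseTimeRange(text) {
  var m = /^\s*(\d{2}:\d{2})\s*-\s*(\d{2}:\d{2})\s*$/.exec(text);
  return m ? [m[1], m[2]] : null;
}

console.log(parseTimeRange('14:30 - 19:30')); // [ '14:30', '19:30' ]
console.log(parseTimeRange('14:30-19:30'));   // [ '14:30', '19:30' ]
console.log(parseTimeRange('not a range'));   // null
```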
