Javascript regex replace with different values - javascript

I'd like to know if it is possible to replace every matching pattern in the string with not one but different values each time.
Let's say I found 5 matches in a text and I want to replace first match with a string, second match with another string, third match with another and so on... is it achievable?
var synonyms = ["extremely", "exceedingly", "exceptionally", "especially", "tremendously"];
"I'm very upset, very distress, very agitated, very annoyed and very pissed".replace(/very/g, function() {
//replace 5 matches of the keyword "very" with 5 synonyms in the array
});

You may try to replace the matches inside a replace callback function:
var synonyms = ["extremely", "exceedingly", "exceptionally", "especially", "tremendously"];
var cnt = 0;
console.log("I'm very upset, very distress, very agitated, very annoyed and very pissed (and very anxious)".replace(/very/g, function($0) {
if (cnt === synonyms.length) cnt = 0;
return synonyms[cnt++]; //replace each match of the keyword "very" with the next synonym in the array
}));
If there are more matches than items in the array, cnt is reset to 0 so that the items are reused from the first one again.
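The same wrap-around can be sketched with the modulo operator instead of an explicit reset. The helper name `replaceCycling` is made up for illustration, and the pattern is assumed to carry the `g` flag:

```javascript
// Hypothetical helper: replaces each match with the next item of `values`,
// wrapping around to the start when the list is exhausted.
function replaceCycling(text, pattern, values) {
  let i = 0;
  // `pattern` is assumed to have the g flag so that every match is visited.
  return text.replace(pattern, () => values[i++ % values.length]);
}

const cycled = replaceCycling("very a, very b, very c", /very/g, ["extremely", "exceedingly"]);
console.log(cycled); // "extremely a, exceedingly b, extremely c"
```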

A simple recursive approach. Be sure your synonyms array has enough elements to cover all matches in your string.
let synonyms = ["extremely", "exceedingly", "exceptionally"]
let yourString = "I'm very happy, very joyful, and very handsome."
let rex = /very/
function r (s, i) {
let newStr = s.replace(rex, synonyms[i])
if (newStr === s)
return s
return r(newStr, i+1)
}
r(yourString, 0)
I would caution that if your replacement would also match your regex, you need to add an additional check.
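One way to sidestep that caution, sketched here as an alternative rather than as a fix to the recursive function, is a single global pass: `String.prototype.replace` does not rescan text it has just inserted. The name `replaceInOrder` is hypothetical:

```javascript
// Single pass: replace() never rescans text it has already inserted, so a
// replacement that itself matches the pattern is not replaced again.
function replaceInOrder(text, globalPattern, replacements) {
  let i = 0;
  return text.replace(globalPattern, (match) =>
    // Leave the match untouched once the list of replacements runs out.
    i < replacements.length ? replacements[i++] : match
  );
}

const safe = replaceInOrder("very good, very bad", /very/g, ["very, very", "not very"]);
console.log(safe); // "very, very good, not very bad"
```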

function replaceExpressionWithSynonymsInText(text, regX, synonymList) {
var
list = [];
function getSynonym() {
if (list.length <= 0) {
list = Array.from(synonymList);
}
return list.shift();
}
return text.replace(regX, getSynonym);
}
var
synonymList = ["extremely", "exceedingly", "exceptionally", "especially", "tremendously"],
textSource = "I'm very upset, very distress, very agitated, very annoyed and very pissed",
finalText = replaceExpressionWithSynonymsInText(textSource, (/very/g), synonymList);
console.log("synonymList : ", synonymList);
console.log("textSource : ", textSource);
console.log("finalText : ", finalText);
This approach has two advantages. First, the list of synonyms is never altered.
Second, working internally on a fresh copy of the provided list and shifting items off it
makes a separate counter unnecessary. It also leaves room to shuffle each new copy
when the previous one has been emptied, which gives a more random replacement.
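The shuffling idea mentioned above could look roughly like this. It is a sketch, not part of the original answer: `shuffledCopy` and `replaceWithShuffledSynonyms` are illustrative names, and the shuffle is a standard Fisher-Yates pass over a copy.

```javascript
// Fisher-Yates shuffle on a copy, so the caller's list is left untouched.
function shuffledCopy(items) {
  const copy = Array.from(items);
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Variant of the function above: refill the working list with a shuffled copy
// whenever it has been emptied.
function replaceWithShuffledSynonyms(text, regX, synonymList) {
  let list = [];
  return text.replace(regX, () => {
    if (list.length === 0) list = shuffledCopy(synonymList);
    return list.shift();
  });
}
```

With five matches and five synonyms, each synonym is used exactly once, but in a different order on each run.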

Using the example you've provided, here's what I would do.
First I would set up some variables
var text = "I'm very upset, very distress, very agitated, very annoyed and very pissed";
var regex = /very/;
var synonyms = ["extremely", "exceedingly", "exceptionally", "especially", "tremendously"];
Then count the number of matches
var count = (text.match(/very/g) || []).length; // match() returns null when there are no matches
Then I would run a loop to replace the matches with the values from the array
for(var x = 0; x < count; x++) {
text = text.replace(regex, synonyms[x]);
}

You can do it with the replace() function, using the 'g' flag for global matching (it finds all occurrences of the searched expression). As the second argument you can pass a function that returns values from your predefined array.
Here is a little fiddle where you can try it out.
var str = "test test test";
var rep = ["one", "two", "three"];
var ix = 0;
var res = str.replace(/test/g, function() {
if (ix == rep.length)
ix = 0;
return rep[ix++];
});
$("#result").text(res);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<p id="result">
Result...
</p>

Yes, it is achievable. There may be a more efficient answer than this, but the brute-force way is to double the length of your regex, i.e. instead of searching for just A, search for (A)(optional text)(A) and then replace the captured groups ($1, $2, ...) as needed. If you need help with the regex itself, provide some code for what you're searching for and someone with more rep than me can probably comment the actual regexp.
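One possible reading of this suggestion, sketched for two occurrences only: match both occurrences in one pattern, capture the text between them, and write different replacements around the captured group. The sample sentence is invented for the example.

```javascript
// Match two occurrences at once and keep the text between them via a capture group.
const doubled = "very upset, very distressed".replace(
  /very(.*?)very/,
  "extremely$1exceedingly"
);
console.log(doubled); // "extremely upset, exceedingly distressed"
```

This grows unwieldy as the number of matches increases, which is why the callback-based answers above scale better.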

Related

what is best way to match words from list to words from sentence in javascript?

I have two sentences and I would like to find all the words they share, regardless of capitalization or punctuation.
Currently this is what I am doing:
searchWords = sentence1.split(" ");
var wordList = sentence2.split(" ");
const matchList = wordList.filter(value => -1 !== searchWords.indexOf(value));
It works OK, but obviously capitalization and punctuation cause issues.
I know I need to incorporate something like .match() in there, but I don't know how to work with it. I am sure this is something someone has done before; I just haven't found the code yet. Any references are also appreciated.
Thank you,
Best
This dude.
If you're looking for any words that match you can use RegExp with String.prototype.replace and verify a match using String.prototype.search with a created RegExp and an i flag to allow case insensitivity.
function compare(str1, str2, matches = []) {
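// \b on both sides matches whole words only, so "is" does not count as found inside "this"
str1.replace(/(\w+)/g, m => str2.search(new RegExp("\\b" + m + "\\b", "i")) >= 0 && matches.push(m));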
return matches;
}
console.log( compare("Hello there this is a test", "Hello Test this is a world") );
If you're looking for specific words that match you can use functional composition to split each string into an Array, filter each by possible matches, and then filter one against the other.
function compare(str1, str2, matchables) {
let containFilter = (a) => (i) => a.includes(i),
matchFilter = s => s.toLowerCase().split(" ").filter(containFilter(matchables));
return matchFilter(str1).filter(containFilter( matchFilter(str2) ));
}
let matchables = ["hello", "test", "world"];
console.log( compare("Hello there this is a test", "Hi Test this is a world", matchables) );
I think you may be over-thinking this. Would just converting both sentences to an array and using a for loop to cycle through the words work? For example:
var searchWords = sentence1.split(" ");
var wordList = sentence2.toLowerCase().split(" ");
var commonWords = [];
for(var i = 0; i < searchWords.length; i++){
if(wordList.includes(searchWords[i].toLowerCase())){
commonWords.push(searchWords[i])
}
}
console.log(commonWords);
Or some variation of that.
As for the punctuation, you could probably add .replace(/[^A-Za-z0-9\s]/g,"") to the end of searchWords[i].toLowerCase() as mentioned in the following answer: https://stackoverflow.com/a/33408855/10601203
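Putting the lowercase and punctuation steps together, a minimal sketch might look like this. The helper names `toWords` and `commonWords` are made up, and the character class assumes plain ASCII letters and digits:

```javascript
// Hypothetical helper: lowercases a sentence, strips punctuation, splits on whitespace.
function toWords(sentence) {
  return sentence
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "") // assumes ASCII text; accented letters would be removed
    .split(/\s+/)
    .filter(Boolean);
}

function commonWords(sentence1, sentence2) {
  const lookup = new Set(toWords(sentence2));
  // Going through a Set also removes duplicate words from the result.
  return [...new Set(toWords(sentence1))].filter((word) => lookup.has(word));
}

console.log(commonWords("Hello, there! This is a test.", "hello TEST: this is a world"));
// ["hello", "this", "is", "a", "test"]
```

Using a Set for the lookup avoids rescanning the second word list for every word of the first sentence.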

How to define a line break in extendscript for Adobe Indesign

I am using extendscript to build some invoices from downloaded plaintext emails (.txt)
At points in the file there are lines of text that look like "Order Number: 123456" and then the line ends. I have a script made from parts I found on this site that finds the end of "Order Number:" in order to get a starting position of a substring. I want to use where the return key was hit to go to the next line as the second index number to finish the substring. To do this, I have another piece of script from the helpful people of this site that makes an array out of the indexes of every instance of a character. I will then use whichever array object is a higher number than the first number for the substring.
It's a bit convoluted, but I'm not great with Javascript yet, and if there is an easier way, I don't know it.
What is the character I need to use to emulate a return key in a txt file in javascript for extendscript for indesign?
Thank you.
I have tried things like \n and \r\n and ^p both with and without quotes around them but none of those seem to show up in the array when I try them.
//Load Email as String
var b = new File("~/Desktop/Test/email.txt");
b.open('r');
var str = "";
while (!b.eof)
str += b.readln();
b.close();
var orderNumberLocation = str.search("Order Number: ") + 14;
var orderNumber = str.substring(orderNumberLocation, ARRAY NUMBER GOES HERE)
var loc = orderNumberLocation.lineNumber
function indexes(source, find) {
var result = [];
for (i = 0; i < source.length; ++i) {
// If you want to search case insensitive use
// if (source.substring(i, i + find.length).toLowerCase() == find) {
if (source.substring(i, i + find.length) == find) {
result.push(i);
}
}
alert(result)
}
indexes(str, NEW PARAGRAPH CHARACTER GOES HERE)
I want all my line breaks to show up as an array of indexes in the variable "result".
Edit: My method of importing stripped all line breaks from the document. Using the code below instead works better. Now \n works.
var file = File("~/Desktop/Test/email.txt", "utf-8");
file.open("r");
var str = file.read();
file.close();
You need to use regular expressions. Depending on the fields you need to search for, you'll have to tweak the regular expression, but I can give you a starting point. If the fields in the email are separated by new lines, something like this will work:
var str; //your string
var fields = {};
var lookFor = /(Order Number:|Address:).*?\n/g;
str.replace(lookFor, function(match){
var order = match.split(':');
var field = order[0].replace(/\s/g, '');//remove all spaces
var value = order[1].replace(/^\s+|\s+$/g, '');//trim the leading space and the trailing line break
fields[field] = value;
});
With (Order Number:|Address:) you are looking for the fields; you can add more fields inside the parentheses, separated by the "or" character |. The .*?\n part matches any characters up to the first line break. The g flag indicates that you want to find all matches. Then you call str.replace because it lets you run a function on each match. So, if the separator between the field and the value is a colon ':', you split the match into an array of two values, ['Order Number', ' 12345\n'], and then store the trimmed value in an object. That code will produce:
fields = {
OrderNumber: "12345",
Address: "my fake address 000"
}
Please try \n and \r
Example: indexes(str, "\r");
If I've understood correctly, what you need is str.split():
function indexes(source, find) {
var order;
var result = [];
var orders = source.split('\n'); //returns an array of strings: ["order: 12345", "order:54321", ...]
for (var i = 0, l = orders.length; i < l; i++)
{
order = orders[i];
if (order.indexOf(find) !== -1){ // /find/ would match the literal text "find", not the argument
result.push(i)
}
}
return result;
}
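For the original goal (the text between a label and the next line break), a shorter route is sketched below. It is written in ES3 style on the assumption that it should also run in ExtendScript, it assumes the file was read with file.read() so that line breaks are preserved, and `valueAfter` is a hypothetical helper name:

```javascript
// Returns the text between `label` and the next line break, or null if the
// label is not present in `source`.
function valueAfter(source, label) {
  var start = source.indexOf(label);
  if (start === -1) return null;
  start += label.length;
  // Look for the next line break, whichever of \r or \n comes first.
  var rest = source.substring(start);
  var end = rest.search(/[\r\n]/);
  return end === -1 ? rest : rest.substring(0, end);
}

var sample = "Name: Jane\r\nOrder Number: 123456\r\nTotal: 10";
var orderNumberValue = valueAfter(sample, "Order Number: "); // "123456"
```

Searching for either \r or \n means the helper does not need to know which line-ending convention the email file uses.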

Looping through a String

I want to get every word that is shown after the word and.
var s = "you have a good day and time works for you and I'll make sure to get the kids together and that's why I was asking you to do the needful and confirm"
for (var  i= 0 ; i <= 3; i++){
 var body = s;
 var and = body.split("and ")[1].split(" ")[0];
 body = body.split("and ")[1].split(" ")[1];
 console.log(and);
}
How do I do this?!
The simplest thing is probably to use a regular expression looking for "and" followed by whitespace followed by the "word" after it, for instance something like /\band\s+([^\s]+)/g (with \s+ rather than \s*, so that a word such as "android" is not treated as "and" plus "roid"):
var s = "you have a good day and time works for you and I'll make sure to get the kids together and that's why I was asking you to do the needful and confirm";
var rex = /\band\s+([^\s]+)/g;
var match;
while ((match = rex.exec(s)) != null) {
console.log(match[1]);
}
You may well need to tweak that a bit (for instance, \b ["word boundary"] considers - a boundary, which you may not want; and separately, your definition of "word" may be different from [^\s]+, etc.).
First split the whole string on "and ". Every element after the first then begins with the word that followed an "and", so split each of those elements on spaces and take its first item.
var s = "you have a good day and time works for you and I'll make sure to get the kids together and that's why I was asking you to do the needful and confirm"
var body = s;
var and = body.split("and ");
// start at 1: the first element is the text before the first "and"
for(var i = 1; i<and.length;i++){
console.log(and[i].split(" ")[0]);
}
You can split, check for "and" word, and get next one:
var s = "you have a good day and time works for you and I'll make sure to get the kids together and that's why I was asking you to do the needful and confirm";
var a = s.split(' ');
var cont = 0;
var and = false;
while (cont < a.length) {
if (and) {
console.log(a[cont]);
}
and = (a[cont] == 'and');
cont++;
}
Another way to do this using replace
var s = "you have a good day and time works for you and I'll make sure to get the kids together and that's why I was asking you to do the needful and confirm"
s.replace(/and\s+([^\s]+)/ig, (match, word) => console.log(word))
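If the words are needed as an array rather than logged one by one, `String.prototype.matchAll` (ES2020) is another option. A small sketch on a shortened sample sentence:

```javascript
const sentence = "you have a good day and time works for you and I'll make sure";
// matchAll requires the g flag; each match exposes its capture groups by index.
const wordsAfterAnd = Array.from(sentence.matchAll(/\band\s+(\S+)/gi), (m) => m[1]);
console.log(wordsAfterAnd); // ["time", "I'll"]
```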

Find all matches in a concatenated string of same-length words?

I have a long Javascript string with letters like :
"aapaalaakaaiartaxealpyaaraa"
This string is actually a chained list of 3-letter-words : "aap","aal","aak","aai", "art", "axe","alp", "yaa" and "raa"
In reality I have many of these strings, with different word lengths, and they can be up to 2000 words long, so I need the fastest way to get all the words that start with a certain string. So when searching for all words that start with "aa" it should return :
"aap","aal","aak" and "aai"
Is there a way to do this with a regex ? It's very important that it only matches on each 3-letter word, so matches in between words should not be counted, so "aar" should not be returned, and also not "yaa" or "raa".
The simple way:
var results = [];
for (var i = 0; i < str.length; i += 3) {
if (str.substring(i, i + 2) === "aa") {
results.push(str.substring(i, i + 3));
}
}
Don’t ask whether it’s the fastest – just check whether it’s fast enough, first. :)
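Since the question mentions several strings with different word lengths, the loop above can be generalized as sketched here. `wordsWithPrefix` and its parameters are illustrative names, and the prefix is assumed to be no longer than one word:

```javascript
// `wordLength` is the fixed width of every word in `str`.
function wordsWithPrefix(str, wordLength, prefix) {
  const results = [];
  // Step through word boundaries only, so matches that straddle two words are skipped.
  for (let i = 0; i + wordLength <= str.length; i += wordLength) {
    if (str.startsWith(prefix, i)) {
      results.push(str.slice(i, i + wordLength));
    }
  }
  return results;
}

console.log(wordsWithPrefix("aapaalaakaaiartaxealpyaaraa", 3, "aa"));
// ["aap", "aal", "aak", "aai"]
```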
How about:
var str = 'aapaalaakaaiartaxealpyaaraa';
var pattern = /^aa/;
var result = str.match(/.{3}/g).filter(function(word) {
return pattern.test(word);
});
console.log(result); //=> ["aap","aal","aak","aai"]
"aapaalaakaaiartaxealpyaaraa".replace(/\w{3}|\w+/g,function(m){return m.match(/^aa/)?m+',':','}).split(',').filter(Boolean)

Javascript / jQuery faster alternative to $.inArray when pattern matching strings

I've got a large array of words in Javascript (~100,000), and I'd like to be able to quickly return a subset of them based on a text pattern.
For example, I'd like to return all the words that begin with a pattern so typing hap should give me ["happy", "happiness", "happening", etc, etc], as a result.
If it's possible I'd like to do this without iterating over the entire array.
Something like this is not working fast enough:
// data contains an array of beginnings of words e.g. 'hap'
$.each(data, function(key, possibleWord) {
found = $.inArray(possibleWord, words);
// do something if found
});
Any ideas on how I could quickly reduce the set to possible matches without iterating over the whole word set? The word array is in alphabetical order if that helps.
If you just want to search for prefixes there are data structures just for that, such as the Trie and Ternary search trees
A quick Google search turns up some promising JavaScript trie and autocomplete implementations:
http://ejohn.org/blog/javascript-trie-performance-analysis/
Autocomplete using a trie
http://odhyan.com/blog/2010/11/trie-implementation-in-javascript/
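To give an idea of what such a structure looks like, here is a minimal trie sketch. It is not taken from any of the linked implementations; the node layout and the function names are illustrative only.

```javascript
// Each node maps a character to a child node and flags whether a word ends there.
function buildTrie(words) {
  const root = { children: {}, isWord: false };
  for (const word of words) {
    let node = root;
    for (const ch of word) {
      node = node.children[ch] || (node.children[ch] = { children: {}, isWord: false });
    }
    node.isWord = true;
  }
  return root;
}

function wordsStartingWith(root, prefix) {
  // Walk down to the node for the prefix; give up if any character is missing.
  let node = root;
  for (const ch of prefix) {
    node = node.children[ch];
    if (!node) return [];
  }
  const found = [];
  // Depth-first walk below the prefix node collects every complete word.
  (function collect(current, text) {
    if (current.isWord) found.push(text);
    for (const ch of Object.keys(current.children)) {
      collect(current.children[ch], text + ch);
    }
  })(node, prefix);
  return found;
}
```

The lookup cost depends on the prefix length and the number of results, not on the total size of the word list.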
I have absolutely no idea if this is any faster (a jsperf test is probably in order...), but you can do it with one giant string and a RegExp search instead of arrays:
var giantStringOfWords = giantArrayOfWords.join(' ');
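function searchForBeginning(beginning, str) {
// build the pattern from the prefix (not from the giant string) and use the g flag to collect every match;
// this assumes beginning contains only word characters, otherwise it must be escaped first
var pattern = new RegExp('\\b' + beginning + '\\w*', 'g'),
matches = str.match(pattern); // null when nothing matches
return matches;
}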
var hapResults = searchForBeginning('hap', giantStringOfWords);
The best approach is to structure the data better. Make an object with keys like "hap". That member holds an array of words (or word suffixes if you want to save space) or a separated string of words for regexp searching.
This means you will have shorter objects to iterate/search. Another way is to sort the arrays and use a binary search pattern. There's a good conversation about techniques and optimizations here: http://ejohn.org/blog/revised-javascript-dictionary-search/
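The binary search idea can be sketched as follows, assuming the array is sorted in the default code-unit order of `Array.prototype.sort`. The function names are hypothetical:

```javascript
// Returns the first index whose word is not smaller than `target`.
function lowerBound(sortedWords, target) {
  let lo = 0;
  let hi = sortedWords.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedWords[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

function prefixRange(sortedWords, prefix) {
  const start = lowerBound(sortedWords, prefix);
  // "\uffff" sorts after any ordinary character, so this bounds the end of the prefix block.
  const end = lowerBound(sortedWords, prefix + "\uffff");
  return sortedWords.slice(start, end);
}

console.log(prefixRange(["apple", "hap", "happen", "happy", "hat", "zoo"], "hap"));
// ["hap", "happen", "happy"]
```

Two binary searches locate the block of matching words in roughly 2 * log2(n) comparisons, so the whole array is never scanned.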
I suppose that using raw javascript can help a bit, you can do:
var arr = ["happy", "happiness", "nothere", "notHereEither", "happening"], subset = [];
for(var i = 0, len = arr.length; i < len; i ++) {
if(arr[i].indexOf("hap") === 0) { // index 0 means the word starts with "hap"
subset.push(arr[i]);
}
}
//subset === ["happy", "happiness","happening"]
Also, if the array is ordered you could break early if the first letter is bigger than the first of your search, instead of looping the entire array.
var data = ['foo', 'happy', 'happiness', 'foohap'];
jQuery.each(data, function(i, item) {
if(item.match(/^hap/))
console.log(item)
});
If you have the data in an array, you're going to have to loop through the whole thing.
A really simple optimization is on page load go through your big words array and make a note of what index ranges apply to each starting letter. E.g., in my example below the "a" words go from 0 to 2, "b" words go from 3 to 4, etc. Then when actually doing a pattern match only look through the applicable range. Although obviously some letters will have more words than others, a given search will only have to look through an average of 100,000/26 words.
// words array assumed to be lowercase and in alphabetical order
var words = ["a","an","and","be","blue","cast","etc."];
// figure out the index for the first and last word starting with
// each letter of the alphabet, so that later searches can use
// just the appropriate range instead of searching the whole array
var letterIndexes = {},
i,
l,
letterIndex = 0,
firstLetter;
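for (i=0, l=words.length; i<l; i++) {
if (words[i].charAt(0) === firstLetter)
continue;
if (firstLetter)
letterIndexes[firstLetter] = {first : letterIndex, last : i-1};
letterIndex = i;
firstLetter = words[i].charAt(0);
}
// record the range for the last letter, which the loop itself never closes
if (firstLetter)
letterIndexes[firstLetter] = {first : letterIndex, last : l-1};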
function getSubset(pattern) {
pattern = pattern.toLowerCase()
var subset = [],
fl = pattern.charAt(0),
matched = false;
if (letterIndexes[fl])
for (var i = letterIndexes[fl].first, l = letterIndexes[fl].last; i <= l; i++) {
if (pattern === words[i].substr(0, pattern.length)) {
subset.push(words[i]);
matched = true;
} else if (matched) {
break;
}
}
return subset;
}
Note also that when searching through the (range within the) words array, once a match is found I set a flag, which indicates we've gone past all of the words that are alphabetically before the pattern and are now making our way through the matching words. That way as soon as the pattern no longer matches we can break out of the loop. If the pattern doesn't match at all we still end up going through all the words for that first letter though.
Also, if you're doing this as a user types, when letters are added to the end of the pattern you only have to search through the previous subset, not through the whole list.
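The "search only the previous subset" idea might be sketched like this. `createPrefixSearch` is a made-up name, and the sketch filters with `startsWith` rather than using the letter index above:

```javascript
// Incremental search: reuses the previous result when the new pattern merely
// extends the previous one, and falls back to the full list otherwise.
function createPrefixSearch(words) {
  let lastPattern = "";
  let lastSubset = words;
  return function search(pattern) {
    const pool = pattern.startsWith(lastPattern) ? lastSubset : words;
    lastSubset = pool.filter((word) => word.startsWith(pattern));
    lastPattern = pattern;
    return lastSubset;
  };
}

const search = createPrefixSearch(["happen", "happy", "hat", "sad"]);
search("ha");  // filters the full list
search("hap"); // filters only the three "ha" words
```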
P.S. Of course if you want to break the word list up by first letter you could easily do that server-side.
