Match String of Input to text/element and highlight reactive - javascript

HTML(JADE)
p#result Lorem ipsum is javascript j s lo 1 2 4 this meteor thismeteor. meteor
input.search
JS
Template.pg.events({
'keyup .search': function(e){
e.preventDefault();
var text = $('p#result').text();
var splitText = text.match(/\S+\s*/g);
var input = $(e.target).val();
var splitInput = input.match(/\S+\s*/g);
if(_.intersection(splitText, splitInput)) {
var match = _.intersection(splitText, splitInput);
var matchToString = match.toString();
$('p#result').text().replace(matchToString, '<b>'+matchToString+ '</b>')
}
console.log(splitText); //check what I get
console.log(splitInput); //check what I get
}
})
I have the above code.
What I'm trying to do is match the input field's value against the text. I attached the function to keyup so that it is reactive.
When the input and the text match, it should add bold tags to the matched strings.
I think I'm almost there, but not quite yet.
How would I proceed on from here?
MeteorPad
Here

In your code, you seem to only be matching on whole words, although your question does not specify that. If you want to match on any text in the input (e.g., if you type "a", all "a"s in the text are made bold), you can do that relatively easily using the javascript split and join String methods:
Template.pg.events({
'keyup .search': function(e){
e.preventDefault();
var text = $('p#result').text();
var input = $(e.target).val();
var splitText = text.split(input); // Produces an array without whatever's in the input
console.log(splitText);
var rep = splitText.join("<b>" + input + "</b>"); // Produces a string with inputs replaced by boldened inputs
console.log(rep);
$('p#result').html(rep);
}
});
Notably, you have to replace the text on the page using $('p#result').html(), which was missing in your MeteorPad example. Note also that this is a case-sensitive implementation; you can use a regex to do the split, but it gets a bit more complicated when you want to replace the text in the join. You can play around with it on this MeteorPad.
To do this case-insensitively, the split is very straightforward -- you can use a RegExp like so:
...
var regex = new RegExp($(e.target).val(), 'gi'); // global and case-insensitive, where `input` used to be
The tricky thing is to extract the correct case of what you want to pull out, then put it back in -- you can't do this with a simple join, so you'll have to interleave the two arrays. You can see an example of interleaved arrays here, which was taken from this question. I've amended that a bit to deal with the uneven array lengths, here:
var interleave = function(array1, array2) {
return $.map(array1, function(v, i) {
if (array2[i]) { return [v, array2[i]]; } // deals with uneven array lengths
else { return [v]; }
});
}
I've also created another MeteorPad that you can play around with that does all of this. lo is a good test string to check out.
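As an alternative to split-and-interleave, the case-insensitive highlight can be sketched with String.replace and a callback, which hands back each match in its original case. This is only a sketch under assumptions: highlight and escapeRegExp are hypothetical helper names, not part of the MeteorPad, and the text is assumed to be plain, trusted text with no markup of its own. Escaping the input also keeps characters such as ( or + from being read as regex syntax.

```javascript
// Hypothetical helper: escape characters that are special inside a RegExp,
// so user input is always treated as literal text.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap every case-insensitive occurrence of `query`
// in <b> tags, keeping the case found in `text`.
function highlight(text, query) {
  if (!query) { return text; } // empty input: nothing to highlight
  var regex = new RegExp(escapeRegExp(query), 'gi');
  return text.replace(regex, function (match) {
    return '<b>' + match + '</b>';
  });
}
```

In the keyup handler, the result would then be written back with $('p#result').html(highlight(text, input)).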

Related

Extract links in a string and return an array of objects

I receive a string from a server and this string contains text and links (mainly starting with http://, https:// and www., very rarely different but if they are different they don't matter).
Example:
"simple text simple text simple text domain.ext/subdir again text text text youbank.com/transfertomealltheirmoney/witharegex text text text and again text"
I need a JS function that does the following:
- finds all the links (no matter if there are duplicates);
- returns an array of objects, each representing a link, together with keys that return where the link starts in the text and where it ends, something like:
[{link:"http://www.dom.ext/dir",startsAt:25,endsAt:47},
{link:"https://www.dom2.ext/dir/subdir",startsAt:57,endsAt:88},
{link:"www.dom.ext/dir",startsAt:176,endsAt:192}]
Is this possible? How?
EDIT: @Touffy: I tried this, but I could only get the starting index of each match, not its length. Moreover, this does not detect www:
var str = "string with many links (SO does not let me post them)";
var regex = /(\b(https?|ftp|file|www):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
var result, indices = [];
while ( (result = regex.exec(str)) ) {
indices.push({startsAt:result.index});
}
console.log(indices[0].link);
console.log(indices[1].link);
One way to approach this would be with the use of regular expressions. Assuming whatever input, you can do something like
var expression = /(https?:\/\/(?:www\.|(?!www))[^\s\.]+\.[^\s]{2,}|www\.[^\s]+\.[^\s]{2,})/gi;
var matches = input.match(expression);
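Then you can iterate through the matches to find their starting and ending points with indexOf, searching from the end of the previous match so that duplicate links are not all reported at the first occurrence:
var results = [];
var searchFrom = 0;
for (var i = 0; matches && i < matches.length; i++) {
var start = input.indexOf(matches[i], searchFrom);
results.push({
link: matches[i],
startsAt: start,
endsAt: start + matches[i].length
});
searchFrom = start + matches[i].length;
}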
Of course, you may have to tinker with the regular expression itself to suit your specific needs.
You can see the results logged by console in this fiddle
const getLinksPool = (links) => {
//you can replace the https with any links like http or www
const linksplit = links.replace(/https:/g, " https:");
let linksarray = linksplit.split(" ");
let linkspools = linksarray.filter((array) => {
return array !== "";
});
return linkspools;
};
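The exec loop from the question's edit already exposes everything that was asked for: result.index is where the match starts, and result[0] is the matched text, so its length gives the end. A minimal sketch, assuming a deliberately simple pattern (anything starting with http://, https:// or www. up to the next whitespace); findLinks is a hypothetical name and the pattern would likely need tuning for real input:

```javascript
// Hypothetical helper: return every link with its start and end offsets.
function findLinks(str) {
  // Assumed pattern: http(s):// or www. followed by non-whitespace characters.
  var regex = /(?:https?:\/\/|www\.)\S+/gi;
  var links = [];
  var result;
  while ((result = regex.exec(str)) !== null) {
    links.push({
      link: result[0],                          // the matched text
      startsAt: result.index,                   // offset of the first character
      endsAt: result.index + result[0].length   // offset just past the last character
    });
  }
  return links;
}
```

Because each position comes from the match itself, duplicate links are reported at their own offsets.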

Renumber the integers inside a string in Javascript

I want to renumber the integers inside a string, that has this format (letters and int numbers): "e1b2xx4d3".
In this example, I want to get: "e1b2xx3d4";
I have written the following JS code:
var count = 0;
var matches;
var transcript = "e1b2xx4d3";
var transcript1 = transcript;
regex = /\d+/g;
while ((matches = regex.exec(transcript)) !== null) {
transcript1 = transcript1.replace(matches[0], ++count);
}
console.log(transcript1);
The idea is to replace each number in the string by its sequence number (count), but it does not work because of destructive replaces (here, we get "e1b2xx4d3", because "xx4" is replaced with "xx3", but at the next iteration by "xx4" back).
I need to do this with regex because the case that I deal with is more complex than the one shown and requires using regex.
I think that I have to do it in two passes (iterations): 1. compiling replacements and 2. applying replacements simultaneously.
Out of curiosity, can someone find a way to do this in one pass?
Fiddle: http://jsfiddle.net/0frru6fr/
This is usually done with a replacing function:
var n = 0;
var result = "e1b2xx4d3".replace(/\d+/g, function() { return ++n; });
alert(result);
See docs for more info.
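Wrapped in a function, the same replacing-function idea keeps the counter local, so repeated calls each start from 1. A small sketch; renumber is a hypothetical name:

```javascript
// Hypothetical helper: replace the k-th run of digits in `s` with k.
function renumber(s) {
  var n = 0; // counter is local, so every call starts fresh
  return s.replace(/\d+/g, function () {
    return String(++n);
  });
}
```

This works in one pass because replace scans the original string and builds a new one, so earlier substitutions are never re-matched.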

Regular expression in Javascript: table of positions instead of table of occurrences

Regular expressions are most powerful. However, the result they return is sometimes useless:
For example:
I want to manage a CSV string using semicolons.
I define a string like:
var data = "John;Paul;Pete;Stuart;George";
If I use the instruction:
var tab = data.match(/;/g)
after which, "tab" contains an array of 4 ";":
tab[0]=";", tab[1]=";", tab[2]=";", tab[3]=";"
This array is not useful in the present case, because I knew it even before using the regular expression.
Indeed, what I want to do is 2 things:
First: Suppress the 4th element (not "Stuart" as "Stuart", but "Stuart" as 4th element)
Second: Replace the 3rd element by "Ringo" so as to get back (to where you once belonged!) the following result:
data == "John;Paul;Ringo;George";
In this case, I would greatly prefer to obtain an array giving the positions of semicolons:
tab[0]=4, tab[1]=9, tab[2]=14, tab[3]=21
instead of the useless (in this specific case)
tab[0]=";", tab[1]=";", tab[2]=";", tab[3]=";"
So, here's my question: Is there a way to obtain this numeric array using regular expressions?
To get tab[0]=4, tab[1]=9, tab[2]=14 tab[3]=21, you can do
var tab = [];
var startPos = 0;
var data = "John;Paul;Pete;Stuart;George";
while (true) {
var currentIndex = data.indexOf(";", startPos);
if (currentIndex == -1) {
break;
}
tab.push(currentIndex);
startPos = currentIndex + 1; // continue after this semicolon, otherwise the loop never ends
}
But if the result wanted is "John;Paul;Ringo;George", you can do
var tab = data.split(';'); // Split the string into an array of strings
tab.splice(3, 1); // Suppress the 4th element
tab[2] = "Ringo"; // Replace the 3rd element by "Ringo"
var str = tab.join(';'); // Join the elements of the array into a string
The second approach is probably the better fit for your case.
String.split
Array.splice
Array.join
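To answer the question as literally asked, positions can also be collected with a regular expression: each result of exec carries an index property. A minimal sketch, assuming the caller passes a regex with the g flag; matchPositions is a hypothetical name:

```javascript
// Hypothetical helper: return the start index of every match of `regex` in `str`.
function matchPositions(str, regex) {
  if (!regex.global) {
    throw new Error('regex needs the g flag');
  }
  regex.lastIndex = 0; // start from the beginning even if the regex was used before
  var positions = [];
  var m;
  while ((m = regex.exec(str)) !== null) {
    positions.push(m.index);
    if (m[0] === '') { regex.lastIndex++; } // avoid looping forever on empty matches
  }
  return positions;
}
```

For the sample data, matchPositions(data, /;/g) gives the semicolon offsets listed in the question.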
You should try a different approach, using split.
tab = data.split(';') will return an array of the form
tab[0]="John", tab[1]="Paul", tab[2]="Pete", tab[3]="Stuart", tab[4]="George"
You should be able to achieve your goal with this array.
Why use a regex to perform this operation? You have a built-in function split, which can split your string based on the delimiter you pass.
var data = "John;Paul;Pete;Stuart;George";
var temp=data.split(';');
temp[0],temp[1]...

How to remove the last matched regex pattern in javascript

I have a text which goes like this...
var string = '~a=123~b=234~c=345~b=456'
I need to extract the string such that it splits into
['~a=123~b=234~c=345','']
That is, I need to split the string with /b=.*/ pattern but it should match the last found pattern. How to achieve this using RegEx?
Note: The numbers present after the equal is randomly generated.
Edit:
The above one was just an example. I did not make the question clear I guess.
Generalized String being...
<word1>=<random_alphanumeric_word>~<word2>=<random_alphanumeric_word>..~..~..<word2>=<random_alphanumeric_word>
All have random length and all wordi are alphabets, the whole string length is not fixed. the only text known would be <word2>. Hence I needed RegEx for it and pattern being /<word2>=.*/
This doesn't sound like a job for regexen considering that you want to extract a specific piece. Instead, you can just use lastIndexOf to split the string in two:
var lio = str.lastIndexOf('b=');
var arr = [];
arr[0] = str.substr(0, lio);
arr[1] = str.substr(lio);
http://jsfiddle.net/NJn6j/
I don't think I'd personally use a regex for this type of problem, but you can extract the last option pair with a regex like this:
var str = '~a=123~b=234~c=345~b=456';
var matches = str.match(/^(.*)~([^=]+=[^=]+)$/);
// matches[1] = "~a=123~b=234~c=345"
// matches[2] = "b=456"
Demo: http://jsfiddle.net/jfriend00/SGMRC/
This assumes the format is (~, alphanumeric name, =, and numbers) repeated an arbitrary number of times. The most important assumption here is that ~ appears once for each name-value pair and does not appear in the name.
You can remove the last token by a simple replacement:
str.replace(/(.*)~.*/, '$1')
This works by using the greedy property of * to force it to match the last ~ in the input.
This can also be achieved with lastIndexOf, since you only need to know the index of the last ~:
str.substring(0, (str.lastIndexOf('~') + 1 || str.length + 1) - 1)
(Well, I don't know if the code above is good JS or not... I would rather write in a few lines. The above is just for showing one-liner solution).
A RegExp that gives a result you could use is:
string.match(/[a-z]*?=(.*?((?=~)|$))/gi);
// ["a=123", "b=234", "c=345", "b=456"]
But in your case the simplest solution is to split the string before extracting the content:
var results = string.split('~'); // ["", "a=123", "b=234", "c=345", "b=456"]
Now it will be easy to extract each key and value and add them to an object:
var myObj = {};
results.forEach(function (item) {
if(item) {
var r = item.split('=');
if (!myObj[r[0]]) {
myObj[r[0]] = [r[1]];
} else {
myObj[r[0]].push(r[1]);
}
}
});
console.log(myObj);
Object:
a: ["123"]
b: ["234", "456"]
c: ["345"]
(?=.*(~b=[^~]*))\1
will get it done in one match, but if there are duplicate entries it will go to the first. Performance also isn't great and if you string.replace it will destroy all duplicates. It would pass your example, but against '~a=123~b=234~c=345~b=234' it would go to the first 'b=234'.
.*(~b=[^~]*)
will run a lot faster, but it requires another step because the match comes out in a group:
var re = /.*(~b=[^~]*)/.exec(string);
var result = re[1]; //~b=234
var array = string.split(re[1]);
This method will also have the same problem with exact duplicates. Another option is:
var regex = /.*(~b=[^~]*)/g;
var re = regex.exec(string);
var result = re[1];
// if you want an array from either side of the string:
var array = [string.slice(0, regex.lastIndex - re[1].length), string.slice(regex.lastIndex, string.length)];
This finds the exact location of the last match and removes it. Because the captured group is the last thing in the pattern, regex.lastIndex - re[1].length is the index where the group starts, so the first slice ends just before it.
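The lastIndexOf idea from the earlier answers can be combined with the known key to produce the two-element array the question asks for. A minimal sketch, assuming every pair is preceded by ~ (as in the example string) and that values never contain ~; splitAtLastKey is a hypothetical name:

```javascript
// Hypothetical helper: cut the last "~key=value" pair out of `str` and
// return the text before it and the text after it.
function splitAtLastKey(str, key) {
  var marker = '~' + key + '=';
  var start = str.lastIndexOf(marker);
  if (start === -1) { return [str]; }      // key not present: nothing to split
  var end = str.indexOf('~', start + 1);   // the pair ends at the next ~ ...
  if (end === -1) { end = str.length; }    // ... or at the end of the string
  return [str.slice(0, start), str.slice(end)];
}
```

Searching for the marker with its leading ~ avoids matching a longer key that merely ends in the same letters.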

Fastest / most efficient way to compare two string arrays Javascript

Hi, I was wondering whether anyone could offer some advice on the fastest / most efficient way to compare two arrays of strings in JavaScript.
I am developing a kind of tag cloud based on a user's input - the input being in the form of a written piece of text such as a blog article or the like.
I therefore keep an array of words to exclude - is, a, the, etc.
At the moment I am doing the following:
Remove all punctuation from the input string, tokenize it, compare each word to the exclude array and then remove any duplicates.
The comparisons are performed by looping over each item in the exclude array for every word in the input text - this seems kind of brute force and is crashing Internet Explorer on arrays of more than a few hundred words.
I should also mention my exclude list has around 300 items.
Any help would really be appreciated.
Thanks
I'm not sure about the whole approach, but rather than building a huge array then iterating over it, why not put the "keys" into a map-"like" object for easier comparison?
e.g.
var excludes = {};//object
//set keys into the "map"
excludes['bad'] = true;
excludes['words'] = true;
excludes['exclude'] = true;
excludes['all'] = true;
excludes['these'] = true;
Then when you want to compare... just do
var wordsToTest = ['these','are','all','my','words','to','check','for'];
var checkWord;
for(var i=0;i<wordsToTest.length;i++){
checkWord = wordsToTest[i];
if(excludes[checkWord]){
//bad word, ignore...
} else {
//good word... do something with it
}
}
allows these words through ['are','my','to','check','for']
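The same lookup idea can be sketched with a Set, which also handles the duplicate-removal step from the question in the same pass. This assumes an environment with ES2015 Set (not the older Internet Explorer versions mentioned in the question); filterWords is a hypothetical name:

```javascript
// Hypothetical helper: drop excluded words and duplicates, keeping first-seen order.
function filterWords(words, excludeList) {
  var excluded = new Set(excludeList); // constant-time membership checks
  var seen = new Set();                // words already kept
  return words.filter(function (w) {
    if (excluded.has(w) || seen.has(w)) { return false; }
    seen.add(w);
    return true;
  });
}
```

Unlike a plain object used as a map, a Set has no inherited keys, so a word such as "constructor" is not treated as excluded by accident.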
It would be worth a try to combine the words into a single regex, and then compare with that. The regex engine's optimizations might allow the search to skip forward through the search text a lot more efficiently than you could do by iterating yourself over separate strings.
You could use a hashing function for strings (I don't know if JS has one but I'm sure uncle Google can help ;] ). Then you would calculate hashes for all the words in your exclude list and create an array of booleans indexed by those hashes. Then just iterate through the text and check the word hashes against that array.
I have taken scunliffe's answer and modified it as follows:
var excludes = ['bad','words','exclude','all','these']; //array
Now let's prototype a function that checks if a value is inside an Array:
Array.prototype.hasValue= function(value) {
for (var i=0; i<this.length; i++)
if (this[i] === value) return true;
return false;
}
Let's test some words:
var wordsToTest = ['these','are','all','my','words','to','check','for'];
var checkWord;
for(var i=0; i< wordsToTest.length; i++){
checkWord = wordsToTest[i];
if( excludes.hasValue(checkWord) ){
//is bad word
} else {
//is good word
console.log( checkWord );
}
}
output:
['are','my','to','check','for']
I'd opt for the regex version
text = 'This is a text that contains the words to delete. It has some <b>HTML</b> code in it, and punctuation!';
deleteWords = ['is', 'a', 'that', 'the', 'to', 'this', 'it', 'in', 'and', 'has'];
// clear punctuation and HTML code
onlyWordsReg = /\<[^>]*\>|\W/g;
onlyWordsText = text.replace(onlyWordsReg, ' ');
reg = new RegExp('\\b' + deleteWords.join('\\b|\\b') + '\\b', 'ig');
cleanText = onlyWordsText.replace(reg, '');
// tokenize after this
