JavaScript find names in strings - javascript

What's a good JavaScript library for searching a given string for a large list of names?
For example, given a list of 1000 politicians' names, find every instance in a string and wrap it in a span.
Priorities are performance with a growing list of names, and accuracy in telling apart names such as "Tony Blair" and "Tony Blair III".
For example, this:
["Tony Blair", "Margaret Thatcher", "Tony Blairite", "Tony Blair III", etc...]
"The best PM after Tony Blair was Margaret Thatcher."
Becomes:
"The best PM after <span class="mp">Tony Blair</span> was <span class="mp">Margaret Thatcher</span>."

var names = ['foo','bar'];
var content = "this foo is bar foobar foo ";
for (var c=0,l=names.length;c<l;c++) {
var r = new RegExp("\\b("+names[c]+")\\b","gi");
content = content.replace(r,'<span class="mp">$1</span>');
}
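The loop above builds one regular expression per name and inserts each name into the pattern unescaped. A possible alternative, sketched here under the assumption that the names are plain strings, is to escape every name and combine them into a single alternation, longest name first, so that "Tony Blair III" is tried before "Tony Blair". The helper names escapeRegExp and wrapNames are hypothetical and not part of any library.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wraps every whole-name match in a span, in one pass.
function wrapNames(content, names) {
  if (names.length === 0) return content;
  // Longest names first, so "Tony Blair III" is tried before "Tony Blair".
  var alternatives = names
    .slice()
    .sort(function (a, b) { return b.length - a.length; })
    .map(escapeRegExp)
    .join("|");
  var pattern = new RegExp("\\b(" + alternatives + ")\\b", "gi");
  return content.replace(pattern, '<span class="mp">$1</span>');
}

var politicians = ["Tony Blair", "Margaret Thatcher", "Tony Blairite", "Tony Blair III"];
var sentence = "The best PM after Tony Blair was Margaret Thatcher. Tony Blair III agrees.";
console.log(wrapNames(sentence, politicians));
```

Because the text is scanned once, an already inserted span is never matched again by a later name. Note that \b only recognises ASCII word characters, so names that start or end with accented letters may need a different boundary test.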

Related

JavaScript get first two words from string in Array

I am new to JavaScript so I am struggling to even know where to start. Please can someone help me.
I have this array of ingredients:
const ingris = [
"1 cup heavy cream",
"8 ounces paprika",
"1 Chopped Tomato",
"1/2 Cup yogurt",
"1 packet pasta ",
"1/2 teaspoon freshly ground black pepper, divided",
]
I am trying to take out, for example, the 1 cup or 1/2 teaspoon (the first 2 words of each string) and add it to a new array of objects like below:
const ShoppingList = [
{
val: "heavy cream",
amount: "1 cup",
},
{
val: "Tomato",
amount: "1 Chopped ",
},
{
val: "yogurt",
amount: "1/2 Cup",
},
];
I would probably use .map() first to iterate through the array of strings and convert it into a new array of objects. On each iteration you can .split() the string by spaces; most likely the first 2 elements of the resulting array form the amount property and the rest is the value.
From the documentation:
The map() method creates a new array populated with the results of calling a provided function on every element in the calling array.
The split() method divides a String into an ordered list of substrings, puts these substrings into an array, and returns the array. The division is done by searching for a pattern; where the pattern is provided as the first parameter in the method's call.
Try the following:
const ingris = [
"1 cup heavy cream",
"8 ounces paprika",
"1 Chopped Tomato",
"1/2 Cup yogurt",
"1 packet pasta",
"1/2 teaspoon freshly ground black pepper, divided",
];
const result = ingris.map(e => {
const split = e.split(' ');
const amount = `${split[0]} ${split[1]}`;
return { val: e.replace(`${amount} `, ''), amount };
});
console.log(result);
You will probably need to add a fallback once the input strings come in different formats, such as checking that the string has at least 3 words.
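A minimal sketch of such a fallback follows. It assumes that a string with fewer than three words has no amount at all and should be kept whole as val; that rule, and the helper name parseIngredient, are assumptions made for illustration.

```javascript
// Hypothetical helper: splits "1 cup heavy cream" into amount and val.
// Assumption: the first two words are the amount only when at least
// three words are present; shorter strings are kept whole as val.
function parseIngredient(line) {
  const words = line.trim().split(/\s+/);
  if (words.length < 3) {
    return { val: words.join(" "), amount: "" };
  }
  return {
    val: words.slice(2).join(" "),
    amount: words.slice(0, 2).join(" "),
  };
}

console.log(parseIngredient("1 packet pasta "));
console.log(parseIngredient("salt"));
```

Trimming and splitting on /\s+/ also keeps a trailing or doubled space from leaking into the result.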
Use Array#map to map the given array to a new one. Split each string into an array at the spaces, then return a new object with the first 2 array elements as amount and the rest as val. To get the remaining elements, use Array#slice, and Array#join with a space as glue to connect them into a string.
const ingris = [
"1 cup heavy cream",
"8 ounces paprika",
"1 Chopped Tomato",
"1/2 Cup yogurt",
"1 packet pasta ",
"1/2 teaspoon freshly ground black pepper, divided",
];
let result = ingris.map(str => {
// trim first so a trailing space does not end up in val
let arr = str.trim().split(' ');
return {amount: arr[0] + ' ' + arr[1], val: arr.slice(2).join(' ')};
});
console.log(result);
There are multiple approaches, but using Array.prototype.map to loop over the array, String.prototype.split to turn each string into an array inside the loop, and Array.prototype.slice() to take slices of that array will help you create what you need. Keep in mind that this always takes the first 2 words as the amount and everything after them as the value, so your ingredients have to follow the same format every time.
const ingris = [
"1 cup heavy cream",
"8 ounces paprika",
"1 Chopped Tomato",
"1/2 Cup yogurt",
"1 packet pasta ",
"1/2 teaspoon freshly ground black pepper, divided",
];
const shoppingList = ingris.map(ingredient => {
const splitIngredient = ingredient.split(' ');
const amount = splitIngredient.slice(0, 2).join(' ');
const val = splitIngredient.slice(2, splitIngredient.length).join(' ');
return { val, amount };
});
console.log(shoppingList);

How can I split a list of strings based on quantity and name?

For example if I have string str that looks like the following:
5 apples
7x pine apples
10 oranges
14x corn on the cob
apple pie
I could do,
var list = str.split(/\r?\n/);
So now I have each line in an array list but now I still need to get the quantity and name from each element in the list.
For list[0] which is '5 apples' I could do,
var breakdown = list[0].split(' ');
For list[1] I'd have to remove the 'x' from '7x' and it would incorrectly be split into 3 rather than just the quantity and name , etc.
For 'apple pie' the quantity should be 1.
The expected result is always,
breakdown[0]: quantity
breakdown[1]: name
How can I get the quantity and name regardless of how it's entered?
A regex on each line would do it. This follows with a second .map() to convert the numeric (or empty) string to a number.
var data = `5 apples
7x pine apples
10 oranges
14x corn on the cob
apple pie`;
var result = data.split(/\s*?(?:\r?\n)+\s*/g).map(s =>
/^(?:(\d+)x?\s+)?(.+)$/.exec(s).slice(1)
).map(([q, d]) => [+q || 1, d]);
console.log(result);
It could actually be done with just a regex too, if you include the m modifier.
var data = `5 apples
7x pine apples
10 oranges
14x corn on the cob
apple pie`;
var re = /^(?:(\d+)x?\s+)?(.+)$/gm;
var m;
var result = [];
while((m = re.exec(data))) {
result.push([+m[1] || 1, m[2]]);
}
console.log(result);
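In the first version above, exec() returns null for an empty string, so input with a blank line or a trailing newline would throw. A possible guard, sketched here with a hypothetical helper named parseList, is to trim each line and drop the empty ones before applying the same regex.

```javascript
// Hypothetical helper: parses "7x pine apples" style lines into
// [quantity, name] pairs, skipping blank lines.
function parseList(text) {
  return text
    .split(/\r?\n/)
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line !== ""; })
    .map(function (line) {
      // Optional leading digits with an optional "x", then the name.
      var m = /^(?:(\d+)x?\s+)?(.+)$/.exec(line);
      // A missing quantity defaults to 1; an explicit 0 is kept as 0.
      return [m[1] ? Number(m[1]) : 1, m[2]];
    });
}

console.log(parseList("5 apples\n7x pine apples\n\napple pie\n"));
```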

How to count occurrence of multiple sub-string in a long string with JavaScript

I am new to JavaScript. I have tried a lot, but could not find out how to count the occurrences of multiple substrings in a long string in one pass.
Further information: I need the number of occurrences of each substring, and if a substring occurs too many times I need to replace it, so I need all the counts at once.
Here is an example:
The long text is as follows:
Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24–10 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the "golden anniversary" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as "Super Bowl L"), so that the logo could prominently feature the Arabic numerals 50.
The substring is a question, but what I need is to count the occurrences of each word of that question in one pass, for example the words "name", "NFL", "championship", "game", "is" and "the" in this string:
What is the name of NFL championship game?
One problem is that some words do not appear in the text at all, while others appear many times (those are the ones I might replace).
The code I have tried is below. It is wrong; I have tried many different ways but with no good results.
$(".showMoreFeatures").click(function(){
var text= $(".article p").text(); // This is to get the text.
var textCount = new Array();
// Because I use match, so for the first word "what", will return null, so
// this is to avoid this null. and I was plan to get the count number, if it is
// more than 7 or even more, I will replace them.
var qus = item2.question; //This is to get the sub-string
var checkQus = qus.split(" "); // I split the question to words
var newCheckQus = new Array();
// This is the array I was plan put the sub-string which count number less than 7, which I really needed words.
var count = new Array();
// Because it is a question as sub-string and have many words, so I wan plan to get their number and put them in a array.
for(var k =0; k < checkQus.length; k++){
textCount = text.match(checkQus[k],"g")
if(textCount == null){
continue;
}
for(var j =0; j<checkQus.length;j++){
count[j] = textCount.length;
}
//count++;
}
I have tried many different ways and searched a lot, but with no good results. The code above only shows what I have tried and my thinking (which might be totally wrong), and it does not work. If you know how to implement this and solve my problem, please just tell me; there is no need to correct my code.
Thanks very much.
If I have understood the question correctly then it seems you need to count the number of times the words in the question (que) appear in the text (txt)...
var txt = "Super Bowl 50 was an American ...etc... Arabic numerals 50.";
var que = "What is the name of NFL championship game?";
I'll go through this in vanilla JavaScript and you can adapt it for jQuery as required.
First of all, to focus on the text we can make things a little simpler by changing the strings to lowercase and removing some of the punctuation.
// both strings to lowercase
txt = txt.toLowerCase();
que = que.toLowerCase();
// remove punctuation
// the double backslash (\\) in each string produces a single escaping backslash in the RegExp
var puncArray = ["\\,", "\\.", "\\(", "\\)", "\\!", "\\?"];
puncArray.forEach(function(P) {
// create a regular expression from each punctuation 'P'
var rEx = new RegExp( P, "g");
// replace every 'P' with empty string (nothing)
txt = txt.replace(rEx, '');
que = que.replace(rEx, '');
});
Now we can create a cleaner array from txt and que, as well as a hash table from que, like so...
// Arrays: split at every space
var txtArray = txt.split(" ");
var queArray = que.split(" ");
// Object, for storing 'que' counts
var queObject = {};
queArray.forEach(function(S) {
// create 'queObject' keys from 'queArray'
// and set value to zero (0)
queObject[S] = 0;
});
queObject will be used to hold the words counted. If you were to console.debug(queObject) at this point it would look something like this...
console.debug(queObject);
/* =>
queObject = {
what: 0,
is: 0,
the: 0,
name: 0,
of: 0,
nfl: 0,
championship: 0,
game: 0
}
*/
Now we want to test each element in txtArray to see if it contains any of the elements in queArray. If the test is true we'll add +1 to the equivalent queObject property, like this...
// go through each element in 'queArray'
queArray.forEach(function(A) {
// create regular expression for testing
var rEx = new RegExp( A );
// test 'rEx' against elements in 'txtArray'
txtArray.forEach(function(B) {
// is 'A' in 'B'?
if (rEx.test(B)) {
// increase 'queObject' property 'A' by 1.
queObject[A]++;
}
});
});
We use the RegExp test method here rather than the String match method because we only need to know whether A occurs in B. If it does, we increase the corresponding queObject property by 1. Note that this approach also counts words inside words, such as 'is' in 'San Francisco'.
All being well, logging queObject to the console will show you how many times each word in the question appeared in the text.
console.debug(queObject);
/* =>
queObject = {
what: 0
is: 2
the: 17
name: 0
of: 2
nfl: 1
championship: 0
game: 4
}
*/
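If whole-word matches are wanted instead, so that 'is' inside 'francisco' is not counted, one option is to split both strings into words and look each text word up in the counts object. This is a minimal sketch, not the method used above; the helper name countQuestionWords is hypothetical, and the word pattern assumes plain ASCII letters, digits and apostrophes.

```javascript
// Hypothetical helper: counts whole-word, case-insensitive occurrences
// of each word of `question` in `text`.
function countQuestionWords(text, question) {
  // Split both strings into lowercase words (letters, digits, apostrophes).
  var textWords = text.toLowerCase().match(/[a-z0-9']+/g) || [];
  var questionWords = question.toLowerCase().match(/[a-z0-9']+/g) || [];

  // Start every question word at zero.
  var counts = {};
  questionWords.forEach(function (word) { counts[word] = 0; });

  // One pass over the text: bump the counter for words we are tracking.
  textWords.forEach(function (word) {
    if (Object.prototype.hasOwnProperty.call(counts, word)) {
      counts[word]++;
    }
  });
  return counts;
}

var sample = "The game was played in San Francisco. This is the game.";
console.log(countQuestionWords(sample, "What is the name of the game?"));
```

The text is walked only once regardless of how many words the question has, which may matter for long articles.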
Hope that helped. :)
See MDN for more information on:
Array.forEach()
Object.keys()
RegExp.test()

Parsing out Salutation & First Name from Full Name field

I have a string that contains Full Name.
The format of the full name may or may not have the salutation. Also there may or may not be a period after the salutation as well (could display as Mr. or Mr). For example, I could receive:
"Mrs. Ella Anderson"
"Ella Anderson"
"Miss Jennifer Sply"
"Mr. Dan Johnson"
"Damien Hearst"
My goal is to remove the salutation from the Full Name string. Once the salutation is removed, I want to parse out the First Name from the Full Name. I am kinda new to regex, but I do understand how to parse out the First Name. The one part I am just not sure how to do is get rid of the salutation.
var string = "Ella Anderson"
var first = string.replace(/\s.*$/, "").toUpperCase().trim();
This regex should work.
// anchored at the start and case-insensitive, so only a leading salutation is removed
var regex = /^(Mr|Ms|Miss|Mrs|Dr|Sir)\.?\s+/i,
fullNames = ["Mrs. Ella Anderson", "Ella Anderson", "Miss Jennifer Sply", "Mr. Dan Johnson", "Damien Hearst"];
var names = fullNames.map(function(name) {
// names without a salutation are returned unchanged
return name.replace(regex, "");
});
console.log(names);
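To cover the second half of the question as well, the two steps can be combined: strip the salutation, then take the first remaining word. The sketch below assumes a fixed list of salutations and that the first word after the salutation is the first name; the names SALUTATION and firstName are made up for illustration.

```javascript
// Assumed list of salutations; extend it to match the real data.
var SALUTATION = /^\s*(?:Mrs|Mr|Ms|Miss|Dr|Sir)\.?\s+/i;

// Hypothetical helper: removes a leading salutation, then returns the first word.
function firstName(fullName) {
  var withoutTitle = fullName.replace(SALUTATION, "").trim();
  return withoutTitle.split(/\s+/)[0];
}

var samples = ["Mrs. Ella Anderson", "Ella Anderson", "Miss Jennifer Sply", "Mr. Dan Johnson", "Damien Hearst"];
samples.forEach(function (name) {
  console.log(firstName(name));
});
```

Because the pattern requires whitespace after the salutation, a name such as "Missy Jennifer" is left untouched.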
The problem is that the full name is in a string in the first place. If at all possible, you should change that to just use separate fields.
There's no telling what users will enter in a text box. Nor is it reliably possible to determine what part of the remaining name is the first name, and what part is the surname.
If the input data is separated properly, you won't have to figure out what is what, any more.
So, if possible, change the way the name is entered to something like:
<select name="select">
<option>Miss</option>
<option>Mrs</option>
<option>Mr</option>
<option>etc...</option>
</select>
<input placeholder="First name" />
<input placeholder="Surname" />
You can use this regexp: /((Mrs|Mr|Miss)\.? )?([^ ]*) ?([^ ]*)/
Examples:
var regex = /((Mrs|Mr|Miss)\.? )?([^ ]*) ?([^ ]*)/;
regex.exec('Mrs. Ella Anderson'); // => ["Mrs. Ella Anderson", "Mrs. ", "Mrs", "Ella", "Anderson"]
regex.exec("Ella Anderson"); // => ["Ella Anderson", undefined, undefined, "Ella", "Anderson"]
regex.exec("Miss Jennifer Sply"); // => ["Miss Jennifer Sply", "Miss ", "Miss", "Jennifer", "Sply"]
regex.exec("Mr. Dan Johnson"); // => ["Mr. Dan Johnson", "Mr. ", "Mr", "Dan", "Johnson"]
regex.exec("Damien Hearst"); // => ["Damien Hearst", undefined, undefined, "Damien", "Hearst"]
regex.exec("Missy Jennifer"); // => ["Missy Jennifer", undefined, undefined, "Missy", "Jennifer"]
If you want the first name and the last name, you just have to look at the last two values of the array.
Of course, this regexp will not work with something like "Mr. John Smith Junior". If you want something generic, don't use a regexp.
It's a pretty complicated regex:
/^(?:(Miss|M[rs]{1,2})\.?\s+)?(\S+)\s+(\S+)$/
If you also want middle names or initials it gets a little trickier, as do suffixes like jr. or sr., but it's mostly all doable. There's some question about how to deal with hyphenated names.
You can use this regexp:
^[ \t]*(?<title>(Shri|Leu|DR|mrs|SMT|Major|Gen){1,10}(\.|,))?\s*(?<LstName>[A-Z][a-z-']{2,20}),? +(?<FstName>[A-Z,a-z]+)*[ \t]*[^\n]*
Tested on the following Test data:
Major. Amator Gary L
Mrs. Grundy Ronald
Dr. Domsky Alan
Shri. Worden Scott Allen
Rodriguez Howard W
NEHME ALLEN
RODRIGUEZ CHARLES G
VERGARA WILLIAM F J
EVELYN J
Leu. GLICK, JACOB L.
SMT. Taylor-garcia Dottielou

Using JavaScript/Jquery to parse JSON Data

How would I parse the JSON data below to output the following?
Starring: Will Smith, Bridget Moynahan, Bruce Greenwood
{"query":{"\n| starring = [[Will Smith]]<br />[[Bridget Moynahan]]<br />[[Bruce Greenwood]]<br />[[James Cromwell]]<br />[[Chi McBride]]<br />[[Alan Tudyk]]}}
This was taken from here
{
"query": {
"normalized": [
{
"from": "I,_Robot_(film)",
"to": "I, Robot (film)"
}
],
"pages": {
"564947": {
"pageid": 564947,
"ns": 0,
"title": "I, Robot (film)",
"revisions": [
{
"contentformat": "text/x-wiki",
"contentmodel": "wikitext",
"*": "{{Other uses|I, Robot (disambiguation)}}\n{{Infobox film\n| name = I, Robot\n| image = Movie poster i robot.jpg\n| caption = Theatrical release poster\n| director = [[Alex Proyas]]\n| producer = [[Laurence Mark]]<br />[[John Davis (producer)|John Davis]]<br />Topher Dow<br />Wyck Godfrey\n| screenplay = [[Jeff Vintar]]<br />[[Akiva Goldsman]]\n| story = Jeff Vintar\n| based on = {{Based on|premise suggested by ''[[I, Robot]]''|[[Isaac Asimov]]}}\n| starring = [[Will Smith]]<br />[[Bridget Moynahan]]<br />[[Bruce Greenwood]]<br />[[James Cromwell]]<br />[[Chi McBride]]<br />[[Alan Tudyk]]\n| music = [[Marco Beltrami]]\n| cinematography = Simon Duggan\n| editing = Richard Learoyd<br />Armen Minasian<br />[[William Hoy]]\n| studio = [[Davis Entertainment]]<br />[[Laurence Mark Productions]]<br />[[Overbrook Entertainment|Overbrook Films]]<br/>[[Rainmaker Digital Effects]] (Provided)\n| distributor = [[20th Century Fox]]\n| released = {{Film date|2004|7|15|international|2004|7|16|United States}}\n| runtime = 115 minutes\n| country = United States\n| language = English\n| budget = $120 million\n| gross = $347,234,916\n}}\n'''''I, Robot''''' is a 2004 American [[dystopia]]n [[science fiction film|science fiction]] [[action film]] directed by [[Alex Proyas]]. The screenplay was written by [[Jeff Vintar]] and [[Akiva Goldsman]], and is inspired by (\"suggested by\", according to the end credits) [[Isaac Asimov]]'s short-story collection [[I, Robot|of the same name]]. [[Will Smith]] stars in the lead role of the film as Detective Del Spooner. The supporting cast includes [[Bridget Moynahan]], [[Bruce Greenwood]], [[James Cromwell]], [[Chi McBride]], [[Alan Tudyk]], and [[Shia LaBeouf]]. \n\n''I, Robot'' was released in [[North America]] on July 16, 2004, in [[Australia]] on July 22, 2004, in the [[United Kingdom]] on August 6, 2004 and in other countries between July 2004 to October 2004. 
Produced with a budget of [[United States dollar|USD]] $120 million, the film grossed $144 million domestically and $202 million in foreign markets for a worldwide total of $347 million. The movie received favorable reviews, with critics praising the writing, visual effects, and acting; but other critics were mixed with the focus on the plot. It was nominated for the 2004 [[Academy Award for Best Visual Effects]], but lost to ''[[Spider-Man 2]]''."
}
]
}
}
}
}
With the url being:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=json&titles=I,Robot(film)&rvsection=0
Your help would be greatly appreciated.
Thank You,
Use:
// $.parseJSON expects a JSON string, not an object
var Jsonstring = '{"title": "Movie", "actors": ["actor1", "actor2"]}';
var movie = $.parseJSON(Jsonstring);
alert(movie.title); // will alert Movie
alert(movie.actors[0]); // will alert actor1
This function converts a JSON string to a JavaScript object. As of jQuery 3.0 it is deprecated in favor of the native JSON.parse.
http://api.jquery.com/jquery.parsejson/
You can parse it with RegExp:
var str = obj.query.pages[564947].revisions[0]['*'],
matches = str.match(/\|\s+(starring)\s+=\s+(.+)\n/),
result = matches[1] + ': ' + matches[2].replace(/<br\s+\/>/ig, ', ').replace(/[\[\]]/ig, '');
The result variable will then contain: starring: Will Smith, Bridget Moynahan, Bruce Greenwood, James Cromwell, Chi McBride, Alan Tudyk
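The snippet above hardcodes the page id and throws if the field is missing. A slightly more defensive sketch is shown below. It assumes the response has already been parsed into an object with the same shape as in the question; the helper name extractStarring and the shortened sample response are made up for illustration.

```javascript
// Hypothetical helper: pulls the "starring" infobox field out of a parsed
// response shaped like the one in the question.
function extractStarring(response) {
  var pages = response.query.pages;
  // The page id is not known in advance, so take the first key.
  var page = pages[Object.keys(pages)[0]];
  var wikitext = page.revisions[0]["*"];

  // "." does not match a newline, so the capture stops at the end of the line.
  var match = wikitext.match(/\|\s*starring\s*=\s*(.+)/);
  if (!match) return null;

  return match[1]
    .split(/<br\s*\/?>/i) // one entry per actor
    .map(function (entry) {
      // [[Target|Label]] -> Label, [[Name]] -> Name
      return entry.replace(/\[\[(?:[^\]|]*\|)?([^\]]*)\]\]/g, "$1").trim();
    })
    .filter(function (entry) { return entry !== ""; });
}

// Trimmed-down stand-in for the API response, for illustration only.
var sampleResponse = {
  query: { pages: { "564947": { revisions: [{
    "*": "{{Infobox film\n| starring = [[Will Smith]]<br />[[Bridget Moynahan]]\n| music = [[Marco Beltrami]]\n}}"
  }] } } }
};
console.log("Starring: " + extractStarring(sampleResponse).join(", "));
```

Returning an array rather than a joined string leaves the choice of separator and label to the caller.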
