Is there a library for auto-suggest/autocomplete that handles cases like the following?
Searching for "Vir" returns both "West Virginia" and "Virginia".
Thanks
EDIT
Sorry for not explaining it more. In the problem above, I do not want a "contains" search, but a prefix search on word boundaries. So "est" should not return "West Virginia", but "wes" or "vir" should.
The list is around 500 items large.
Proposed Solution
I modified the trie implementation by Mike de Boer https://github.com/mikedeboer/trie to solve this. I split each item on word boundaries and stored each word in the trie. On the trie node for the last letter of each word, I stored the index of the item that the word came from. When the user searches, I return a list of indices and then get the corresponding items from the main list.
What do you guys think?
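A minimal sketch of the idea described above, written from scratch rather than against Mike de Boer's trie (the class name `WordPrefixIndex` and its methods are made up for illustration). It assumes matching should be case-insensitive and that "word boundary" means any run of non-word characters.

```javascript
// Hypothetical word-boundary prefix index; not the API of mikedeboer/trie.
class WordPrefixIndex {
  constructor(items) {
    this.items = items;
    this.root = { children: new Map(), indices: new Set() };
    items.forEach((item, i) => {
      // Split each item on word boundaries and store every word.
      for (const word of item.toLowerCase().split(/\W+/).filter(Boolean)) {
        this.insert(word, i);
      }
    });
  }

  insert(word, index) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), indices: new Set() });
      }
      node = node.children.get(ch);
    }
    // The node for the word's last letter remembers which item it came from.
    node.indices.add(index);
  }

  search(prefix) {
    let node = this.root;
    for (const ch of prefix.toLowerCase()) {
      node = node.children.get(ch);
      if (!node) return [];
    }
    // Every word below this node starts with the prefix; collect their items.
    const found = new Set();
    const stack = [node];
    while (stack.length > 0) {
      const current = stack.pop();
      current.indices.forEach((i) => found.add(i));
      for (const child of current.children.values()) stack.push(child);
    }
    return [...found].sort((a, b) => a - b).map((i) => this.items[i]);
  }
}

const states = new WordPrefixIndex(['West Virginia', 'Virginia', 'Vermont']);
console.log(states.search('Vir')); // [ 'West Virginia', 'Virginia' ]
console.log(states.search('est')); // []
```

For a list of around 500 items the subtree walk on every search should be cheap, so no caching is attempted here.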
First, you should try Google or search previous questions before asking such a straightforward question.
To answer your question, you could use jQuery UI, which has, among many other widgets, one called Autocomplete.
If you're familiar with jQuery, this should be pretty easy to implement.
http://jqueryui.com/demos/autocomplete/
You can use jQuery Autocomplete. Ah, you've already answered that yourself!
http://bit.ly/uXHRR0
Try this:
http://code.drewwilson.com/entry/autosuggest-jquery-plugin
Related
I'm working on this simple, straightforward text content filtering mechanism on our post commenting module where people are prohibited from writing foul, expletive words.
So far I'm able to compare comment contents (word by word, using .includes()) against the blacklisted words we have in the database. But to save the space, time and effort of entering a database entry for each variant, such as 'Fucking' and 'Fuck', I want to create a mechanism that checks whether a word contains a blacklisted word.
This way, we just enter 'Fuck' in the database. When a visitor's comment contains 'Fucking' or 'Motherfucker', the function will automatically detect that a word in the comment contains 'fuck' and then perform the necessary actions.
I've been thinking of integrating .substring() but I guess that's not what I need.
Btw, I'm using React (in case you know of any built-in functions). As much as possible, I want to avoid using libraries for this mechanism.
Thanks a heap!
"handover".indexOf("hand")
This returns the index of the match if it exists, otherwise -1.
To ignore case, define all your blacklisted words in lowercase and then use this:
"HANDOVER".toLowerCase().indexOf("hand")
To detect whether a string has another string inside it, you can simply use the .includes method. It does not work on a word-by-word basis but checks for a sequence of characters, so it should meet your requirements. It returns a boolean indicating whether the string is inside the other string.
var sentence = 'Stackoverflow';
console.log(sentence.includes("flow"));
You were on the right track with .includes()
console.log('handover'.includes('hand'));
Returns true
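Putting the answers above together, here is a rough sketch of a case-insensitive check of a whole comment against a blacklist. The function name `findBlacklisted` and the sample word list are made up for illustration; in the question the list would come from the database.

```javascript
// Hypothetical helper: returns every blacklisted entry that occurs anywhere
// in the comment, ignoring case.
function findBlacklisted(comment, blacklistedWords) {
  const text = comment.toLowerCase();
  return blacklistedWords.filter((word) => text.includes(word.toLowerCase()));
}

// Placeholder entries standing in for the real blacklist.
const sampleBlacklist = ['hand', 'flow'];

console.log(findBlacklisted('Please HANDOVER the keys', sampleBlacklist)); // [ 'hand' ]
console.log(findBlacklisted('Nothing to see here', sampleBlacklist));      // []
```

Keep in mind that a plain substring check also flags harmless words that happen to contain a blacklisted sequence, so the result may need a manual review step.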
Given that I have a string such as 'message-ID: 1394.00 This is Henry.Lin',
I want to use Elasticsearch to find all the phrases or words that contain '.'. In this case, 1394.00 and Henry.Lin are the words I am looking for. However, when I index my document with the standard analyzer, it does not work. I understand that the standard analyzer will strip such characters, so I changed the analyzer to ngram. Unfortunately, it is still not working. It would be great if someone could help me out.
You can add a custom character filter for the dot that replaces "." with "dot". Use a custom mapping like this:
"char_filter": {
  "dot_to_word": {
    "type": "mapping",
    "mappings": [".=>dot"]
  }
}
Please check this documentation for more details.
Now, as for why ngram is not working:
The question is how you are using ngram: as a tokenizer, or as a token filter with some other analyzer? What are the min_gram and max_gram sizes? Check out this example to clarify the difference between the two.
Also, to understand more about how your data is being indexed in Elasticsearch and why it does not match your query, try using the term vectors API.
Finally, I would not suggest using ngrams to solve this issue, for the following reasons: 1) n-grams are going to make your index larger, and 2) they have a totally different use case.
This is a follow-up of:
javascript regex - look behind alternative?
In my situation, I want to match the second word only when a specific word does not precede it. As with the prior issue, I need a solution that doesn't use lookbehind.
I'm looking to exclude mentions such as the following:
patient has a history of pulmonary edema
Using the expression:
((?!pulmonary ).{10})\bedema
But given the following sentence:
Continuing dyspnea and lower-extremity edema
I would like the match to only return edema instead of extremity edema.
Please try this pattern:
(?!pulmonary).{10}\b(edema)\b
The demo is here.
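A different way to get only "edema" back without lookbehind is to let the regex optionally capture the unwanted preceding word and then discard those matches in code. This is a swapped-in technique, not the pattern above, and the function name `findUnprecededEdema` is made up for illustration. It assumes "pulmonary" is the only word to exclude and that matching should be case-insensitive.

```javascript
// Hypothetical helper: find "edema" occurrences that are NOT directly
// preceded by the word "pulmonary", without using lookbehind.
function findUnprecededEdema(text) {
  // Group 1 is filled only when "pulmonary" comes right before "edema".
  const pattern = /(\bpulmonary\s+)?\bedema\b/gi;
  const hits = [];
  for (const m of text.matchAll(pattern)) {
    if (m[1] === undefined) {
      hits.push({ match: m[0], index: m.index });
    }
  }
  return hits;
}

console.log(findUnprecededEdema('patient has a history of pulmonary edema').length); // 0
console.log(findUnprecededEdema('Continuing dyspnea and lower-extremity edema')[0].match); // edema
```

Because the excluded word is consumed as part of the match, the filter in the loop is what decides whether a hit is kept.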
I'm looking for a basic search functionality with JavaScript.
The scenario: the user enters one or more words and hits a button. JavaScript looks through an array of strings for items that probably relate to the entered search sentence.
The only function I'm aware of right now is "string.search", which returns the position of the string you are searching for. This is good, but not for all cases. Here are a few examples the search function should cover.
Let's assume I have the following string in my array: "This is a good day". The following search terms should return true when tested against my search function.
Search Term 1: This a good day
Search Term 2: This day
Search Term 3: This was good
Search Term 4: good dy -the user made a typo-
So nothing particular or specific. Just a basic search functionality that predicts (at a low level, and language-agnostically) whether the search term relates to the strings in the tested array.
Was the last a typo for 'day'?
If not, you could simply split the search sentence, as well as the original string using the split() function.
Then you would iterate over the search words and for each make sure they appear in the source string. As soon as you don't find the word, you stop the search.
This is assuming that all the search words should be AND'ed, not OR'ed.
Does that help?
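Roughly, the split-and-check approach above could look like the sketch below. The function name `matchesAllWords` is made up, and it assumes case should be ignored and that words are separated by whitespace. Note that it handles search terms 1 and 2 from the question but, being a strict AND with exact words, not terms 3 and 4.

```javascript
// Hypothetical helper: true when every search word appears as a word in source.
function matchesAllWords(searchTerm, source) {
  const sourceWords = new Set(source.toLowerCase().split(/\s+/).filter(Boolean));
  const searchWords = searchTerm.toLowerCase().split(/\s+/).filter(Boolean);
  // every() stops at the first search word that is missing (AND semantics).
  return searchWords.every((word) => sourceWords.has(word));
}

const sentence = 'This is a good day';
console.log(matchesAllWords('This a good day', sentence)); // true
console.log(matchesAllWords('This day', sentence));        // true
console.log(matchesAllWords('This was good', sentence));   // false
console.log(matchesAllWords('good dy', sentence));         // false
```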
I guess what you are looking for is a pattern matching based live search similar to finite-state-automata-like (FSA) searching:
This link shows an example that'll allow you to search case-insensitively:
Example: Array contains 'This is a good day'
Searching for any (or all) of the following is valid:
THis a Day
Thagd (Th is a g oo d day)
good dy -intended typo-
etc.
A case-sensitive (albeit not perfectly FSA-based) version can be found here. There is also one by John Resig; I don't have a link to his demo, but it'd be worth looking at. It's a JavaScript/jQuery port of the first link I mentioned.
Hope this helps!
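One reading of the behaviour in the examples above is a case-insensitive subsequence match: the typed characters must all appear in the target, in order, but not necessarily next to each other. The sketch below is an assumption about what the linked demos do, not their actual code, and `isSubsequence` is a made-up name.

```javascript
// Hypothetical helper: true when the characters of query appear in target
// in the same order (gaps allowed), ignoring case.
function isSubsequence(query, target) {
  const q = query.toLowerCase();
  const t = target.toLowerCase();
  let matched = 0; // number of query characters matched so far
  for (let i = 0; i < t.length && matched < q.length; i++) {
    if (t[i] === q[matched]) matched++;
  }
  return matched === q.length;
}

const entry = 'This is a good day';
console.log(isSubsequence('THis a Day', entry)); // true
console.log(isSubsequence('Thagd', entry));      // true
console.log(isSubsequence('good dy', entry));    // true
console.log(isSubsequence('day good', entry));   // false
```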
This is not as simple as one would think. We're talking fuzzy matching and Levenshtein distance / algorithm.
See this past question:
Getting the closest string match
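For reference, a small sketch of Levenshtein distance and one possible way to use it for typo-tolerant word matching. The `fuzzyMatchesAllWords` helper and its `maxDistance` threshold of 1 are assumptions for illustration, not something taken from the linked question.

```javascript
// Classic dynamic-programming Levenshtein distance, keeping only two rows.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: every search word must be within maxDistance edits
// of some word in the source string.
function fuzzyMatchesAllWords(searchTerm, source, maxDistance = 1) {
  const sourceWords = source.toLowerCase().split(/\s+/).filter(Boolean);
  return searchTerm
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .every((word) => sourceWords.some((s) => levenshtein(word, s) <= maxDistance));
}

console.log(levenshtein('dy', 'day'));                              // 1
console.log(fuzzyMatchesAllWords('good dy', 'This is a good day')); // true
```

The threshold is a trade-off: a larger `maxDistance` forgives more typos but also lets short, unrelated words match each other.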
Ok, this is a multipart concept. However I'm sure if I can figure this piece out, the rest will follow.
What I have is an array of words and phrases, and a textarea where people can type. What I want to do is search the array for matches or similarities to what the user is typing. The closest thing I can think of is an autocomplete function. But that's not entirely what I want. Yes, part of what I want is autocomplete functionality, but I want so much more in the long run that an existing autocomplete is a bit bulky for my needs.
My aim is to trigger the search as the user types, each time they hit the spacebar. Up to this point I am good. My issue is that my logic is flawed from here. I want to take the entire body of text up to the point of hitting the spacebar and check it against my array of words and phrases, but I'm not sure how. Currently I call split() on the textarea's contents with a space as the delimiter, but I realize now that that's not right. My initial thinking was to split it, check it against the other array, and it would be a happy day if something matched. Then I realized I have phrases, and if I am trying to check a phrase for a match after splitting on spaces, I won't match one.
Well, hopefully this makes sense. I need to walk through the logic on this. There really isn't code currently, as I am not debugging; I am trying to figure out logic that works so I can move forward.
UPDATE:
Check this fiddle: http://jsfiddle.net/VwNHN/
You will need to tweak it to your requirements, but it will give a fair idea of how the below logic can be implemented.
Well, the logic upon keypress (probably any key and not just spacebar) can be something like:
1) Get your current cursor position - say X
Refer for example: http://demo.vishalon.net/getset.htm
2) Get N characters to the left of X, i.e. a substring of the whole text from index X-N to X, and store it in Y. You will need to settle on a value for N (for ex: 100). N is the length of the longest word/phrase you are looking to match.
ex: if full text is "hello world i am a sentence", and cursor is at the end, and N is 10, Y would be "a sentence"
3) Split Y on the space character, build up the phrases incrementally (first word, first two words, and so on), store them in an array and then reverse it. Let's call the array PHRASES.
ex: if Y is "this is a sentence" - then PHRASES would be
[ "this is a sentence", "this is a", "this is", "this" ]
4) Check your array of words/phrases with each item in the PHRASES - the longest matching parts will come first and the short matching ones will come last - this set of matches is your auto-complete list.
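Steps 2 to 4 above can be sketched roughly as follows. The function names `candidatePhrases` and `suggest` are made up, and step 4 is interpreted here as "a dictionary entry starts with the candidate phrase, ignoring case", which is an assumption; swap in whatever comparison fits your data. The cursor position from step 1 is passed in as a plain number.

```javascript
// Steps 2 and 3: take up to maxLength characters left of the cursor and
// build the incremental phrases, longest first.
function candidatePhrases(text, cursor, maxLength) {
  const y = text.slice(Math.max(0, cursor - maxLength), cursor);
  const words = y.trim().split(/\s+/).filter(Boolean);
  const phrases = words.map((_, i) => words.slice(0, i + 1).join(' '));
  return phrases.reverse();
}

// Step 4 (assumed matching rule): collect dictionary entries that start
// with a candidate phrase; matches for longer candidates come first.
function suggest(text, cursor, dictionary, maxLength = 100) {
  const matches = new Set(); // a Set keeps insertion order and drops repeats
  for (const phrase of candidatePhrases(text, cursor, maxLength)) {
    for (const entry of dictionary) {
      if (entry.toLowerCase().startsWith(phrase.toLowerCase())) {
        matches.add(entry);
      }
    }
  }
  return [...matches];
}

const typed = 'this is a sentence';
console.log(candidatePhrases(typed, typed.length, 100));
// [ 'this is a sentence', 'this is a', 'this is', 'this' ]
```

In a real page the `text` and `cursor` arguments would come from the textarea's value and its selection position on each key event.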
I would split the problem into at least three parts:
Search event triggered by the user
Search function
Visualization of results
If I understood what you're trying to implement, I would trigger a search on every 'onkeypress' event, provided your array is not too big (otherwise it will hang on every key press).
Then, the search function: you have to search in an array, so I would search element by element. jQuery provides a nice jQuery.each() function. Also, I would consider _.each(list, iterator, [context]) from the Underscore library.
Visualization of results: it's not clear to me what you want to show (a grid, a table...?), but if every element of the array is associated with a different DOM object, then you could modify its properties at runtime, maybe with jQuery.
Let me know if you need more.