VueJS detect if string contains words - javascript

I'm looking to detect whether or not a string has any word.
"DD8BC606-E0C0-41A2-8E7E-FCB2A1D66D76.jpeg" // false
"Image of Logo JPG" // true
I'd imagine there needs to be some sort of reference to the English dictionary on what constitutes a word, but in my scenario it could be in any language.
I've tried to split the string into an array of words and check the word length for non absurd lengths (though some words especially in German are absurdly long).
string_to_array = function (str) {
return str.trim().split(" ");
};
In the same vein, I tried counting how many words there are in the array and requiring at least 2 (which would be a strong indicator that the string was typed by a human, i.e. that it contains a word), but this invalidates one-word strings (although this isn't a stringent requirement).
What's the fastest ad-hoc way to make sure that there is at least a word in a string?
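For reference, the two heuristics described above (word count and word length) could be combined roughly like this. It is only a sketch: the thresholds MIN_WORDS and MAX_WORD_LENGTH and the helper name looksLikeWords are made up for illustration, and nothing here consults a dictionary.

```javascript
// Hypothetical thresholds; tune them for your own data.
const MIN_WORDS = 2;
const MAX_WORD_LENGTH = 20;

// Split on runs of whitespace, dropping empty tokens.
function stringToArray(str) {
  return str.trim().split(/\s+/).filter(Boolean);
}

// Heuristic only: true when the string has at least MIN_WORDS tokens
// and none of them is longer than MAX_WORD_LENGTH.
function looksLikeWords(str) {
  const tokens = stringToArray(str);
  return tokens.length >= MIN_WORDS &&
    tokens.every((token) => token.length <= MAX_WORD_LENGTH);
}

console.log(looksLikeWords("DD8BC606-E0C0-41A2-8E7E-FCB2A1D66D76.jpeg")); // false
console.log(looksLikeWords("Image of Logo JPG")); // true
```

As noted in the question, this rejects one-word strings and would also reject very long (e.g. German) words.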

Related

regex for minimum 8 character,beside letters add number or symbol or both [duplicate]

Examples that should satisfy the rule:
test#1234 (accept)
TestTo^12 (accept)
test!5655 (accept)
Test!#$& (accept)
testtesttest (should not accept)
Besides letters, I want at least one number or symbol. If both are present, that is fine.
Updated with a one-regex solution:
Because any character could be the one that is non-alphabetic, testing for 8 characters and ensuring that at least one of them isn't alphabetic simultaneously requires a bit more complex regular expression. The "dumb and straight-forward" solution is to use a lot of "OR" operations for the 8 possibilities:
The first character is not alphabetic
The second character is not alphabetic
The third character is not alphabetic
...
It would look like so:
var myRegex = /([^A-Za-z].{7})|(.[^A-Za-z].{6})|(.{2}[^A-Za-z].{5})|(.{3}[^A-Za-z].{4})|(.{4}[^A-Za-z].{3})|(.{5}[^A-Za-z].{2})|(.{6}[^A-Za-z].)|(.{7}[^A-Za-z])/
A smarter solution uses Regular Expression look-ahead operations. We basically match an empty string followed by 8 characters, so long as that empty string is followed by a pattern that contains (anywhere within it) a non-alphabetic character:
var myRegex = /(?=.*[^A-Za-z]).{8}/;
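A quick way to convince yourself that the look-ahead version behaves as described is to run it against the examples from the question. The last sample ("ab#1") is not from the question; it is added here only to show the length requirement.

```javascript
// The look-ahead regex from the answer.
const myRegex = /(?=.*[^A-Za-z]).{8}/;

// Inputs from the question, plus one short string added for illustration.
const samples = ["test#1234", "TestTo^12", "test!5655", "Test!#$&", "testtesttest", "ab#1"];

const results = samples.map((value) => ({ value, accepted: myRegex.test(value) }));
console.log(results);
```

The first four samples are accepted; "testtesttest" fails the look-ahead and "ab#1" is too short.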
Look-aheads can make regular expressions more difficult to understand and debug, so it's sometimes better to rephrase the problem as a multi-part check (e.g. first check for 8 characters, then check for a non-alphabetic character). Thus, my original answer:
Original (and still totally valid):
It may not be the preferred solution, but this would probably be easiest to solve with two regular expressions:
var myValue = "testtesttest";
if (
/.{8}/.test(myValue) &&
/[^A-Za-z]/.test(myValue)
)
{
// At least 8 characters and contains one non-alphabetic character
}
else
{
// One of the two tests failed
}
Of course /.{8}/.test(myValue) is just myValue.length >= 8 which runs a lot faster and is easier to read:
if (
myValue.length >= 8 &&
/[^A-Za-z]/.test(myValue)
)
...
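Wrapped up as a reusable function, the two-part check might look like the sketch below. The name meetsRule and the minLength parameter are assumptions made for this example; only the length of 8 and the non-alphabetic test come from the answer.

```javascript
// Hypothetical helper: at least minLength characters and
// at least one character that is not an ASCII letter.
function meetsRule(value, minLength = 8) {
  return value.length >= minLength && /[^A-Za-z]/.test(value);
}

console.log(meetsRule("test#1234"));    // true
console.log(meetsRule("testtesttest")); // false
```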
A note on password security:
Complex passwords are not secure passwords
Let me reiterate:
Complex passwords are not secure passwords
When a password is complex, people can't remember it. Because they can't remember it, they either have to use a password management tool (if they're smart) or, more likely, they write it down on a post-it note. But it gets worse. When you require symbols and numbers, a password becomes hard to type, so people often make it as short as possible. You require at least 8 characters? Their password will be exactly 8 characters. But it gets worse. Many people, to make passwords more memorable, follow some pattern like:
Take a word and tack on a number ("honeybee326")
Take a word and replace some letters with symbols ("unf#th0m#ble")
Take a word and tack on a symbol ("mypassword!")
People trying to guess a password don't need to try every possible combination of letters, numbers, and symbols. They just need to try these patterns. This greatly reduces the time it takes to crack passwords.
Any security expert who actually knows their stuff will tell you: the secret to a strong password is a long password. This means:
Hello my name is Steven and I like security
is vastly more secure than:
8j3#vk;8]
Instead of testing for symbols or numbers, I recommend just putting a large length requirement (something like 15 characters) if you want to really keep people secure. Or, just tell them what you recommend and let them do what they feel is best (if they choose a bad password, only they suffer the consequences). Unless this is an admin password, in which case enforce a massive length.

JavaScript: How to remove last word from a string only if the word contains an integer 0-9?

So I'm aware that I can remove the last word from a string of words by using lastIndexOf(" "), but I'd like to add the condition that the word should only be removed if it contains an integer 0-9.
I'm asking because I'd like to separate company names from their reference tags (if such tags exist) for a list of data. These reference tags are guaranteed to contain at least one integer 0-9. For example, I have the string "Cisco Systems RX4510", and I'd like to remove the "RX4510" to just get the company name, "Cisco Systems." However, for another string "Electronic Arts", which has no reference tag, I just leave it alone.
Any help would be appreciated, thanks.
Use a regular expression which, after the last space, looks ahead for a digit character, and then matches word characters until the end of the string, $:
const str1 = 'Cisco Systems RX4510';
const str2 = 'Electronic Arts';
const re = / (?=.*\d)\w+$/;
console.log(str1.replace(re, ''));
console.log(str2.replace(re, ''));
Note that this will replace the space before the last word too. If you want to preserve the last space, use a word boundary instead:
/\b(?=.*\d)\w+$/
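To see the difference between the two variants, here is a small comparison on the strings from the question. The variable names are made up for this sketch.

```javascript
const names = ["Cisco Systems RX4510", "Electronic Arts"];

// Variant 1: consumes the space before the tag.
const withSpace = / (?=.*\d)\w+$/;
// Variant 2: word boundary, leaves the preceding space in place.
const withBoundary = /\b(?=.*\d)\w+$/;

const stripped = names.map((name) => ({
  spaceVariant: name.replace(withSpace, ""),
  boundaryVariant: name.replace(withBoundary, ""),
}));

// JSON.stringify makes the trailing space of the boundary variant visible.
console.log(JSON.stringify(stripped));
```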
You can get the last character with s.charAt(s.length - 1) (there are cleaner ways, but this one supports really old browsers) and then check whether it's a digit with parseInt and isNaN. Note that this only inspects the final character, so it assumes the tag ends in a digit. Example:
if (!isNaN(parseInt(s.charAt(s.length - 1), 10))) {
// do something
}
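If you would rather stay with lastIndexOf(" ") as mentioned in the question, a rough sketch could look like this. The function name stripReferenceTag is hypothetical, and it assumes words are separated by single spaces.

```javascript
// Hypothetical helper: drops the last word only when it contains a digit.
function stripReferenceTag(name) {
  const lastSpace = name.lastIndexOf(" ");
  if (lastSpace === -1) {
    return name; // single word: nothing to strip
  }
  const lastWord = name.slice(lastSpace + 1);
  return /\d/.test(lastWord) ? name.slice(0, lastSpace) : name;
}

console.log(stripReferenceTag("Cisco Systems RX4510")); // Cisco Systems
console.log(stripReferenceTag("Electronic Arts"));      // Electronic Arts
```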

is there an algorithm that indexes spaces instead of characters?

I'm new to this and was wondering, after reading up on indexing and character counts: wouldn't it be more applicable to index spaces instead of characters to improve matching of words?
Looking at the example below, it selects/counts the white spaces at the end of every word. But I want it to count or recognize the space at the end of a word and the beginning of the following word, essentially noticing/collating white space characters. Does that make any sense?
var str = 'This is a string',
index = 0,
res = [];
while ((index = str.indexOf(' ', index + 1)) > 0) {
res.push(index);
}
console.log(res)
Short answer: No, but you can split it into an array of strings by spaces with String.prototype.split().
Long answer:
You could think about this in a few ways... and this applies to more languages than just JS:
Implementation
Strings are more akin to an array of characters. This is easier visualized if you think about how a string would be recognized by a computer: as a number (wait what?).
As I'm sure you know, computers can really only see 1's and 0's, so some smart people came up with a way to map each character to a number, and then defined some special behavior for when certain characters appear in a certain order (but that's a whole other can of worms). These definitions are what we call character sets and encodings. Note that the number of bits used per character is defined by the encoding, e.g. UTF-8 or UTF-16.
So returning to the original question "Why index by characters and not spaces?": not because we're lazy and that was easy, but because it's actually pretty convenient, for reasons I'm about to elaborate on.
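To make the "characters are numbers" idea concrete, here is a small sketch. It assumes a runtime where TextEncoder is available as a global (modern browsers and recent Node.js versions); the sample string is arbitrary.

```javascript
// "A", "é" and "€", written with escapes so the source encoding doesn't matter.
const text = "A\u00e9\u20ac";

// Code points: the number each character is mapped to.
const codePoints = Array.from(text, (ch) => ch.codePointAt(0));
console.log(codePoints); // [ 65, 233, 8364 ]

// UTF-8 uses a variable number of bytes per character.
const encoder = new TextEncoder();
const utf8Lengths = Array.from(text, (ch) => encoder.encode(ch).length);
console.log(utf8Lengths); // [ 1, 2, 3 ]

// JavaScript strings are indexed by UTF-16 code units.
console.log(text.length); // 3
```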
Mathematics
Let's be honest with ourselves, this is Computer Science, so we should probably back up our reasons with math... which is where Formal Languages come in.
A formal language L over an alphabet Σ is a subset of Σ*, that is, a set of words over that alphabet.
A formal grammar is a set of production rules for strings in a formal language.
A word over an alphabet can be any finite sequence (i.e., string) of letters.
In mathematics a sequence is an enumerated collection of objects in which repetitions are allowed.
Note: Be aware that this post is in no way a proper description of formal languages, just a collection of generalized descriptions provided by Wikipedia.
Which leads me to grammars: your case of a string being indexed by " " is really something that would be defined by a grammar, and is a concept derived from those nonsensical human languages.
So what this means is if you boil down what a string really is and what defines it, you can see that it is defined all the way down on the level of mathematics.
Practicality
But wait, I am human, so why does that still apply?
Well think about it this way, a string can hold more than just a sentence right? Take JSON for example "{\"key1\":\"hello world\",\"key2\":0}" would it make sense to have this string indexed by spaces? There's also the issue that substrings are a lot more tricky because we can't reference the individual characters anymore, so even iterating over a string would become a complicated task.
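As a quick illustration of the JSON point, using the same string as above:

```javascript
const json = "{\"key1\":\"hello world\",\"key2\":0}";

// Splitting on spaces cuts the JSON text in the middle of a value.
const pieces = json.split(" ");
console.log(pieces); // [ '{"key1":"hello', 'world","key2":0}' ]

// Character indexing still lets a parser walk the text one symbol at a time.
const first = json[0];
const last = json[json.length - 1];
console.log(first, last); // { }

// And the string as a whole is still meaningful to JSON.parse.
const parsed = JSON.parse(json);
console.log(parsed.key1); // hello world
```

Neither of the two pieces is valid JSON on its own, which is why a space-indexed view of this string wouldn't be of much use.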
So why not make another data type?
Honestly this is just a reiteration of before: is it really necessary? Is it common enough of a problem to warrant an entirely new datatype when the key difference is more or less splitting the string and not allowing the programmer to look up the characters individually?
An Algorithm
Well this is probably the easiest part of the answer... now that we have a general understanding of what a string is. As with all programming, there are a ton of ways to go about it; I'm specifically sticking with JS functions, because that was tagged in the question.
As @Redu mentioned, there's the String.prototype.split() method (which is the easiest way I know of). It allows you to split a string based on another string or a regular expression (these are also defined in formal languages; most programming languages that have them offer much more "featured" regular expressions, and they can also be used to describe some grammars). So to split a string on ' ' we can go one of three ways (2 are regular expression approaches).
console.log("Hello World!".split(" ")); // Split on all instances of a single space
console.log("Foo bar".split(/ /g)); // RegEx split on all single spaces
console.log("A   RegEx".split(/\s+/g)); // RegEx split on one or more whitespaces (not necessarily just a space)
console.log("A   RegEx".split(" ")); // For comparison (multiple spaces)
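If what you actually want is the position of each stretch of whitespace (where it starts and where the next word begins), one possible sketch uses String.prototype.matchAll. The helper name whitespaceRuns is made up for this example.

```javascript
// Hypothetical helper: returns the start and end offsets of every whitespace run.
function whitespaceRuns(str) {
  return Array.from(str.matchAll(/\s+/g), (m) => ({
    start: m.index,              // index of the first whitespace character
    end: m.index + m[0].length,  // index where the next word begins (exclusive end)
  }));
}

console.log(whitespaceRuns("This is a string"));
// [ { start: 4, end: 5 }, { start: 7, end: 8 }, { start: 9, end: 10 } ]
```

The start values are the same indices the loop in the question collects; end additionally tells you where the following word starts, even when several whitespace characters appear in a row.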
TL;DR
There are quite a few reasons (blame math), but essentially it really doesn't make sense to. Why not split the string?

Length of a Regex Match

I have an array of data that is being filtered into different arrays via regular expressions. One of these arrays is for containing data that is considered "too long" for my program. Not all of these "too long" instances are the same length, but I would like to shorten them.
I want something like DRB1*01:02.
Too long is anything like DRB1*01:02:03 or longer, including things like DRB1*01:02:03:abc:29
However, the letters at the front will not always be the same length. I will be dealing with things such as A*1:01:02 or TIM*01:02. So I am specifically looking at the sets of two integers and their preceding colon, and perhaps any letters that may follow in data that is "too long". I want the letters out front, the star, and 2 sets of numbers and the colon between them.
I want to use a regular expression to find pieces of data that are "too long", and then measure the length of the data it matches, and slice backward to remove it.
Something so that it will inform me that DRB1*01:02:03 matches *01:02:03 and the length of that is 9. Same for anything like DRB1*01:02:03:abc:29, where it matches *01:02:03:abc:29 and tells me the length is 16. NOT matching a word by its length.
Is there any way to find the length of what part of the data the regular expression has matched? Including cases where the regular expression does not mark a definite end?
I am using JavaScript.
Use a capture group to get the part that matches after the *:
// str is assumed to hold one data item; digits are allowed in the prefix (e.g. DRB1)
var matches = str.match(/^[A-Z0-9]+(\*.*)$/);
if (matches) {
var len = matches[1].length;
alert("It's "+len+" characters long");
}
perlish regex
if (/^([A-Z0-9]+\*\d+:\d+)(:.+)$/) {
print "too long, prefix:$1 extra stuff:$2 length:".length($2)."\n";
}
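A JavaScript sketch along the same lines as the Perl snippet might look like this. The function name splitTooLong and the shape of the returned object are assumptions for illustration; in this version the surplus part is required to start with a colon, so that an item that already has exactly two number fields is not reported as too long.

```javascript
// Hypothetical helper: splits an item into the part to keep and the surplus.
function splitTooLong(item) {
  const m = /^([A-Z0-9]+\*\d+:\d+)(:.+)$/.exec(item);
  if (!m) {
    return null; // not "too long" (or not in the expected shape)
  }
  return { kept: m[1], extra: m[2], extraLength: m[2].length };
}

console.log(splitTooLong("DRB1*01:02:03"));        // kept: DRB1*01:02, extra: :03
console.log(splitTooLong("DRB1*01:02:03:abc:29")); // kept: DRB1*01:02, extra: :03:abc:29
console.log(splitTooLong("TIM*01:02"));            // null
```

With extraLength in hand you can also slice backward yourself: item.slice(0, item.length - extraLength) gives the same string as kept.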

New to Regular Expressions need help

I need a form with one button and an input field
that will check an array via a regular expression
and find an exact match of letters + numbers. Example: wxyz [some space between] 0960000
or a mix of numbers and letters [some space between] + numbers, e.g. 01xg [some space between] 0960000
The array has four objects for now.
Once a match is found, I need a function that will open a new page or window.
Thank you for your help.
Michael
To answer the Javascript part, here's one way to "grep" through the array to find matching elements:
var matches = [];
var re = /whatever/;
foo.forEach(
function(el) {
if( re.exec(el) )
matches.push(el);
}
);
To attempt to answer the regular expression part: I don't know what "exact match" means to you, and I'm assuming "some space" belongs only in between the other terms, and I'm assuming letters means the English alphabet from 'a' to 'z' in lower and upper case and the digits should be 0-9 (otherwise, other language characters might be matched).
The first pattern would be /[a-zA-Z0-9]+\s*0960000/. Change "\s*" to "\s+" if there is at least one space, instead of zero or more space characters. Change "\s" to " " if matching the tab character (and some lesser-used space chars) is not desirable.
For the second pattern, I don't know what "numbers 01xg" means, but if it means numbers followed by that string, then the pattern would be /[a-zA-Z0-9]+\s*[0-9]+\s*01xg\s*0960000/. The same caveats apply as above.
Additionally, this will match a partial string. If the string must be matched in its entirety (if nothing may exist in the string except that which is matched), add "^" to the beginning of the pattern to anchor it to the beginning of the string, and "$" at the end to anchor it to the end of the string. For example, /[a-zA-Z0-9]+\s*0960000/ matches "foo_bar 5 0960000", but /^[a-zA-Z0-9]+\s*0960000$/ does not.
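The effect of the anchors can be checked with a few lines; the second input ("wxyz 0960000") is the example from the question, and the variable names are made up for this sketch.

```javascript
const partial = /[a-zA-Z0-9]+\s*0960000/;
const whole = /^[a-zA-Z0-9]+\s*0960000$/;

const inputs = ["foo_bar 5 0960000", "wxyz 0960000"];

const table = inputs.map((input) => ({
  input,
  partial: partial.test(input), // matches anywhere inside the string
  whole: whole.test(input),     // must match the entire string
}));
console.log(table);
```

The first input only matches the unanchored pattern (via its "5 0960000" tail), while the second matches both.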
For more on regular expressions in Javascript, take a look at developer.mozilla.org's article on the RegExp object (the link takes you to JS version 1.5 reference, which should apply to all JS-capable browsers).
(edited to add): To match either situation, since they have overlapping parts, you could use the following pattern: /[a-zA-Z0-9]+(?:\s*[0-9]+\s*01xg)?\s*0960000/. The question mark says to match the part that differs -- in a non-capturing group (?:foo) -- once or zero times. (?:foo)? and (?:foo|) do the same thing in this case, but I'm not sure whether there is a performance difference; I recommend using the one that makes the most sense to you, so you can read it later.
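Putting the combined pattern together with the array search, a compact sketch could use Array.prototype.filter. The contents of entries are invented sample data standing in for the array from the question.

```javascript
// Hypothetical sample data; replace with the real array.
const entries = [
  "wxyz 0960000",
  "01xg   0960000",
  "abc 12 01xg 0960000",
  "no match here",
];

// The combined pattern from the answer.
const re = /[a-zA-Z0-9]+(?:\s*[0-9]+\s*01xg)?\s*0960000/;

// Keep only the elements the pattern matches.
const matches = entries.filter((el) => re.test(el));
console.log(matches);
```

In a browser, you could then call something like window.open when matches is non-empty; which URL to open is up to your application.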
