Run Database Stored RegEx against DOM

Run Database Stored RegEx against DOM - javascript

I have a question about how to approach a certain scenario before I get halfway through it and figure out it was not the best option.
I work for a large company that has a team that creates tools for the team mates to use that aren’t official enterprise tools. We have no access to the database directly, just access to an internal server to store our files to run and be able to access the main site with javascript etc (same domain).
What I am working on is a tool that has a ton of options in it that allow you to select that I will call “data points” on a page.
There are things like “Account status, Balance, Name, Phone number, email etc” and have it save those to an excel sheet.
So you input account numbers, choose what you need and then using IE Objects it navigates to the page and scrapes data you request.
My question is as follows..
I want to make the scraping part pretty Dynamic in the way it works. I want to be able to add new datapoints on the fly.
My goal or idea is so store the regular expression needed to get the specific piece of data in the table with the “data point option”.
If I choose “Name” it knows the expression for name in the database to run again the DOM.
What would be the best way about creating that type of function in Javascript / Jquery?
I need to pass a Regex to a function, have it run against the DOM and then return the result.
I have a feeling that there will be things that require more than 1 step to get the information etc.
I am just trying to think of the best way to approach it without having to hardcode 200+ expressions into the file as the page may get updated and need to be changed.
Any ideas?

IRobotSoft scraper may be the tool you are looking for. Check this forum and see if questions are similar to what you are doing: http://irobotsoft.org/bb/YaBB.pl?board=newcomer. It is free.
What it uses is not regular expression but a language called HTQL, which may be more suitable for extracting web pages. It also supports regular expression, but not as the main language.
It organizes all your actions well with a visual interface, so you can dynamically compose actions or tasks for changing needs.

Related

How to transfer Values(items) from one Netsuit business system form to another Netsuit form automatically?

Currently, I have to fill out 2 of the same forms with the same language & part numbers on the Web Based business system "Netsuit" made by Oracle (extremely annoying & waste of time). I need to use a software/code system to read one form entry and duplicate it to the other automatically, just still feeling out the best way to do this and get it to transfer/skim properly.
This is between 2 sister companies, each value(Part) has a different part number linked to them, but internally they cannot be linked due to reporting purposes and which company sales what.
One company starts with 100XXX-XX numbers and the other starts with 300XXX-XX numbers for the parts. Again, they are basically the same Parts.
Not sure if Tampermonkey or java will be able to do this properly as I don't even know where to start.
Any recommendations or walkthough on the best way to do this would be awesome, I know it might be a little hard since its 2 different item systems.
Maybe just pull the description of the items since they will be almost the same?

You can create a user event script on the first company and RESTlet on the second one.
The user event script on the first company will create a JSON object of the item that is being created and pre-process the changes in the part number(or any other changes that is required) for the second company and send it to the second company's RESTlet. The RESTlet will now then create the item for the second company.
By using this approach you don't need any 3rd party application to deal with.

Save Value State for Public Sharing (Add to URL)

http://liveweave.com/xfOKga
I'm trying to figure out how to save code similar to Liveweave.
Basically whatever you code you click the save button and it generates a hash after the url. When you go to this url you can see the saved code. (I been trying to learn this, I just keep having trouble finding the right sources. My search results end up with references completely unrelated to what I'm looking for, example )
I spent the past two days researching into this and I've gotten no where.
Can anyone can help direct me to a tutorial or article that explains this type of save event thoroughly?

To understand the functionality, it is best to try and identify everything that is happening. Dissect this feature according to the technology that would typically be used for each distinguishable component. That dissected overview will then make it easier to see how the underlying technologies work together. I suspect you may lack the experience or nomenclature to see at a glance how a site like liveweave works or how to search for the individual pieces, so I will break it down for you. It will be up to you to research the individual components that I will name. Knowing this, here are the keys you need to research:
Note that without being the actual developer of liveweave, knowing all the backend technology is not possible, but intelligent guesses will suffice. The practice is all the same. This is a cursory breakdown.
1) A marked up page, with HTML, CSS, and JavaScript. This is the user-facing part of the application, where content can be typed, and how the user interacts with the application.
2) JavaScript to asynchronously (AJAX) submit the page's form to the backend for processing.
3) A backend programming/scripting language to process the incoming form. In the case of liveweave, the form is POSTed. It is also using PHP to process the form.
4) A database table with a column for each language (liveweave has HTML, CSS, and JavaScript). This database will insert the current data from each textarea submitted in the form and processed by PHP as a new row. Each row will generate a new hash and store it alongside the data just inserted. A popular database is MySQL.
5) When the database insert is complete, the scripting language takes over again, and send its response back to the marked up page (1). That page is waiting for a response from the backend. JavaScript handles the response. In the case of liveweave, the response is the latest hash to be used in the URL.
6) The URL magic happens with JavaScript. You want to look up JavaScript's latest History API, where methods like pushState will be used to update the URL in the browser without actually refreshing the page.
When a URL with a given hash is navigated to, the scripting language processes the request, grabs the hash, searches for the hash in the database table, finds a matching row, and populates the page's textareas with the data just found.
Throughout all this, there should be checks to avoid duplication and a multitude of exploits. This is also up to you to research.
It should be noted that currently there are two comments for your question. Darren's link will indeed allow the URL to change, but it is a redirect, and not what you want. ksealey's answer is not wrong; that is one way of doing it, but it is not the most robust or scalable, and would not be the recommended approach for solving this.

how to avoid cross site scripting in javascript used in tcl script? [duplicate]

This question already has answers here:
What are the common defenses against XSS? [closed]
(4 answers)
Closed 8 years ago.
while working on project open which is open source application, the url http://[host_ip]:8000/register/ includes Java Scripts which are vulnerable to cross-site scripting and Authentication Bypass Using SQL Injection.
I want to know that how can I avoid it? do I have to insert filter for that? and how should I do that?
please let me know if the problem is not clear to understand.

SQL Injection
The universal answer to SQL Injection problems is “never send any user input to the database as part of an SQL string”. Anything that can go as a parameter should do so. Thus, instead of (in some dialect that might not exactly match what you're looking at):
db eval "SELECT userid FROM users WHERE username = '$user' AND password = '$pass'"
you do:
db eval "SELECT userid FROM users WHERE username = ? AND password = ?" $user $pass
# I personally prefer to put SQL inside {braces}… but that's your call
The key is that because the database engine just understands that these are parameters, it never tries to interpret them as SQL. Injection Impossible. (Unless you're using badly-written stored procedures.)
It gets much more complex where you want to have a table or column name specified by a user. That's a case where you can't send it as a parameter; such SQL identifiers must be interpreted by the SQL engine. Your only alternatives there are to either remap from user-supplied terms to ones that you control, or to rigorously validate.
Remapping is done by having a separate trivial table that maps from user-supplied names to ones you've generated:
db eval {SELECT realname FROM namemap WHERE externalname = ?} $externalname
Because the generated name is easy to guarantee to be free of nasty characters and not to be one of SQLs keywords, it can be safely used in SQL text without further quoting. You can also try doing the mapping per request (factor out the mapping code to a procedure of course) by stripping all bad characters from it. A suitable regsub might be:
regsub -all {\W+} $externalname "" realname
but then we need additional checks to see that it isn't “evil”:
# You'll need to create an array, SQLidentifiers, first, perhaps like:
# array set SQLidentifers {UPDATE - SELECT - REPLACE - DELETE - ALTER - INSERT -}
# But you can do that once, as a global "constant"
if {[regexp {^\d} $realname] || [info exist SQLidentifiers([string toupper $realname])]} {
error "Bad identifier, $externalname"
}
As you can see, it's a good idea to factor out such transforms and checks into their own procedure so you get them right, once.
And you must test your code extensively. I cannot stress that hard enough. Your tests must try really hard to break things, to make SQL injections via every possible field that anyone could pass into the software; not one of them should ever result in anything happening that your code ever expects.
It's probably a good idea to get someone else to write at least some of the tests; experience from the security community suggests that it is relatively easy to write code that you can't break yourself, but much harder to write code that someone else can't break. Also consider doing fuzz testing, sending computer-generated random data at the interface. In all cases, either things should give a graceful error or should succeed, but never ever cause the application to outright fail.
(You might well allow highly-authenticated users — system/database administrators — to outright specify SQL to evaluate so they can do things like setting the system up, but they're the minority case.)
Cross-site Scripting
This is actually conceptually quite similar: it's caused (principally) by someone putting something in your site that unexpectedly gets interpreted as HTML (or CSS, or Javascript) rather than as human-readable text (with SQL injection, it's something getting interpreted as SQL rather than as data). Because you can't do the equivalent of parameterised queries when going back to the client, you have to use careful quoting. You're strongly recommended to do the careful quoting by using a proper templating library that constructs a DOM tree (with data coming from users or from the database being only ever inserted as text nodes).
If you want users to supply a marked up piece of text, consider either delivering it back as plain text before using Javascript to render it as, say, Markdown, or completely parsing the user-supplied text on the server to construct a model (e.g., DOM tree) of what should be delivered, before sending it back as HTML generated from that model.
You must not allow users to specify a location where you load a script or frame from. Even allowing them to specify links is worrying, but you probably have to permit that if you can't restrict things to straight plain text. (Consider adding a mechanism for listing all links that have been supplied by users. Consider marking all external links with rel=nofollow unless you can positively detect that they go to somewhere that you whitelist.)
Direct supply of HTML is a “highly-authenticated users only” operation.
(I told a lie above. You can do the equivalent of SQL parameterised queries. You write JS that the client executes to fetch the user data using an AJAX query, perhaps serialized as JSON, and then do DOM manipulations there to render it; in effect, you're moving the DOM construction from the server to the client, but you're still doing DOM construction as that's the core of how you get this right. You have to remember to never insert the things retrieved as straight HTML though. Clients must not trust the server too much.)
The comments I made above above about testing apply here too. With testing for XSS, you're looking to inject something like <script>alert("boom!")</script>; any time you can get that in and cause a popup dialog — except by being a system administrator with direct permission to edit HTML directly — you've got a massive dangerous hole to plug. (It's quite a good thing to try to inject, as it is very noticeable and yet fairly benign in itself.)
Don't try to just filter out <script> using regular expressions. It's far too hard to get that right.

How to implement different languages on html page

I am just a newcomer developing an app with html/css/js via phonegap. I've been searching info on how to make my app be displayed in different languages and Google doesn't understand me.
So the idea is to have a button on index.html that let the user choose the language in which the app will be displayed, in this case Spanish/English, nothing strange like arabic blablabla....
So I guess that the solution must be related to transform all the text that I load in html to variables and then depending on the language selected display the correct one. I have no idea how to make this, and Im not able to find examples. So that's what Im asking for... if someone could give some code snipet to see how html variables works and how should I save user language selection...
Appreciated guys!

This can be done by internationalization (such as i18N). To do this you need separate file for each language and put all your text in it. Search Google for internationalization.
Otherwise you can look into embeding Google Translate.

This depends on the complexity of language-dependencies in the application. If you have just a handful of short texts in a strongly graphic application, you can just store the texts in JavaScript variables or, better, in properties of an object, with one object per language.
But if you expect to encounter deeper language-dependencies as well (e.g., displaying dynamically computed decimal numbers, which should be e.g. 1.5 in English and 1,5 in Spanish), then it’s probably better to use a library like Globalize.js (described in some detail in my book Going Global with JavaScript and Globalize.js). That way you could use a unified approach, writing e.g. a string using Globalize.localize('greeting') and a number using Globalize.format(x, 'n1') and a date using Globalize.format(date, 'MMM d').

Building a Quote Database without Server Side

I want to build a searchable database of quotes. The idea is I would type in a keyword to a search box and I would have quotes with those keywords. I would assign key words to the quotes. I am using a hosted CMS (Adobe Business Catalyst) and cannot use server side scripting. How is the best way to go about this? Is it possible to do this with javascript and jquery?

You could put all of the quotes on to the page statically in a JSON object, or even just as HTML elements, ready to be shown, but hidden. Then search through them with your keywords, and un-hide the ones that are relevant to the search.
Depending on how many quotes you have, the page could become large and take a long time to load, but that's just something to keep in mind for performance.

The way I would go about doing this is build a quotes web app. Then construct a web apps search form and only include a text box for searching by keyword. BC will automatically search the description of the item or the custom field in your web app, whichever you choose.
This WILL take less time than creating JSON objects are parsing HTML code. This utilizes server side logic and only returns to the browser the results that match your criteria so this will have better performance.
The only downside is the results page will not be SEO friendly. In the case where you want to create a pre-defined search, I would Ajax in the results of the search into your static page.

After a bit more research I discovered that Business Catalyst will allow you to build "Web Apps". This can operate as a database and you can incorporate a nice search into the webapp that will allow you to search for keywords etc.
Other than that I believe you would be required to follow #ctcherry 's method.

Develop Reference

JavaScript is the programming language of the Web.