Is there a length limitation when using replace method of a string? - javascript

I have a large string (1,116,902 characters) that I want to process with a fairly simple regex. I get a response from a SOAP server that is encoded in base64, so I extract the content between the appropriate XML tags and then decode it.
This works for small requests. But when I get a big response back, the callback function of the replace() method is never called. I tested the string on the regex101 website and it finds the result, so I wonder whether there is a limitation in my JavaScript engine. I'm working on Wakanda Server V10, which uses WebKit as its JavaScript engine. I cannot provide the string because it contains enterprise information.
Here is my regex: /xsd:base64Binary">((.|\n)*?)<\/responseData>/
I thought it might be a special character that is not covered by the ((.|\n)*?) group. But then why does regex101 find the result? Maybe the problem is the JavaScript engine.
Can anybody help me?
Thanks

If you can guarantee that there are no tags between your start and end delimiter, which sounds like it might be the case, you could just change your RE to
/xsd:base64Binary">([^<]*)<\/responseData>/
which shouldn't require any backtracking and might work for you.
[^<] simply means everything but the < character. Since there shouldn't be any tags between the open and closing tags of your section (at least that's what I understand) that will accept everything until you hit your closing tag. The important thing is that the RE engine can tell immediately whether something matches that or not, so no branching or backtracking is required.
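As a rough illustration of the negated-class pattern on input of about the size mentioned in the question, here is a minimal sketch. The helper name extractBase64, the xsi:type attribute on the opening tag, and the synthetic payload are assumptions made for the example; they are not taken from the real response.

```javascript
// Hypothetical helper: returns the text between the opening tag and
// </responseData>, or null when the pattern does not match.
function extractBase64(xml) {
  var match = /xsd:base64Binary">([^<]*)<\/responseData>/.exec(xml);
  return match ? match[1] : null;
}

// Synthetic stand-in for the real payload: 280000 copies of "QUJD"
// (base64 for "ABC"), i.e. 1,120,000 characters.
var payload = new Array(280001).join('QUJD');
var xml = '<responseData xsi:type="xsd:base64Binary">' + payload + '</responseData>';

var extracted = extractBase64(xml);
console.log(extracted !== null && extracted.length === payload.length);
```

Note that [^<] also matches newlines, so a line-wrapped base64 payload would still be captured without the (.|\n) alternation.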

Related

Javascript RegExp being interpreted different from a string vs from a data-attribute

Long story short, I'm trying to "fix" my system so I'm using the same regular expressions on the backend as we are the front (validating both sides for obvious security reasons). I've got my regex server side working just fine, but getting it down to the client is a pain. My quickest thought was to simply store it in a data attribute on a tag, grab it, and then validate against it.
Well, me, think again! JS is throwing me for a loop because apparently RegExp interprets the string differently depending on how it's pulled in. Can anyone shed some light on what is happening here, or on how I might go about resolving this issue?
HTML
<span data-regex="(^\\d{5}$)|(^\\d{5}-\\d{4}$)"></span>
Javascript
new RegExp($0.dataset.regex)
//returns /(^\\d{5}$)|(^\\d{5}-\\d{4}$)/
new RegExp($($0).data('regex'))
//returns /(^\\d{5}$)|(^\\d{5}-\\d{4}$)/
new RegExp("(^\\d{5}$)|(^\\d{5}-\\d{4}$)");
//returns /(^\d{5}$)|(^\d{5}-\d{4}$)/
Note that in the first two cases, where I pull the value from the data attribute dynamically, the RegExp constructor for some reason doesn't interpret the double backslash correctly. If, however, I copy and paste the value as a string and call RegExp on it, it correctly interprets the double backslash and returns the right pattern.
I've also tried not doubling the backslash before the d on the server side, but as you might (or might not) have guessed, the opposite happens. When pulled from attributes/dataset, the \ is completely removed, leading the regex to think I'm looking for the "d" character rather than digits. I'm at a loss for understanding what JS is thinking here. Please send help, Internet.
Your data attribute has redundant backslashes. There's no need to escape backslashes in HTML attributes, so you'll actually get a double-backslash where you don't want one. When writing regular expressions as strings in JavaScript you have to escape backslashes, of course.
So you don't actually have the same string on both sides, simply because escaping works differently.
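The difference can be reproduced without a DOM. The sketch below is only an illustration: it uses String.raw to stand in for the characters an HTML attribute hands back unchanged, and the variable names are made up for the example.

```javascript
// The characters the DOM returns for data-regex="(^\\d{5}$)|(^\\d{5}-\\d{4}$)".
// String.raw keeps every backslash, just as the HTML attribute does.
var fromAttribute = String.raw`(^\\d{5}$)|(^\\d{5}-\\d{4}$)`;

// A JavaScript string literal collapses each "\\" into a single backslash.
var fromLiteral = "(^\\d{5}$)|(^\\d{5}-\\d{4}$)";

console.log(fromAttribute === fromLiteral);           // false
console.log(new RegExp(fromAttribute).test("12345")); // false
console.log(new RegExp(fromLiteral).test("12345"));   // true

// Attribute written with single backslashes: data-regex="(^\d{5}$)|(^\d{5}-\d{4}$)"
var fixedAttribute = String.raw`(^\d{5}$)|(^\d{5}-\d{4}$)`;
console.log(fixedAttribute === fromLiteral);                // true
console.log(new RegExp(fixedAttribute).test("12345-6789")); // true
```

In other words, the attribute should carry the pattern exactly as the regex engine needs it, with one backslash per escape.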

JSONP and a very long response with escaped characters

I've got the following problem: I am sending an AJAX request to a service which returns HTML code. There are Unicode characters in this code, which are escaped with the usual \u....
The problem is that this response is very long and jQuery splits the JSONP response into several function calls. That would not be a problem in itself, except that the split sometimes falls inside an escaped character, like jsonp463827("...blabhalbha\ud0");jsonp546114("0x8blablabla...");
Then I get an error saying Hexcode expected, because the escape sequence has been split in two.
Is there any solution to prevent this?
What exactly is being passed back? Example address?
I don't think jQuery is doing the splitting here. It is in the nature of JSONP that the service must return a block of JavaScript statements for direct execution in a <script> tag. The client side can't get hold of that content to split or otherwise process it because that would be a cross-site-scripting hole, the very issue JSONP is designed to get around.
I think you'll probably need to look at that service. I'm not sure why it would be trying to split a response into several function calls as there is no limit on the length of the string passed in. The limit that you might hit is Firefox's script parser stack limit (see bug 420869), but that applies to the whole of the returned script block, so splitting into several function calls won't help.
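For comparison, this is roughly what a service would emit if it kept the whole payload in a single callback invocation. It is only a sketch under assumptions: buildJsonpBody is a hypothetical helper, the sample HTML is invented, and new Function merely simulates the browser executing the returned script block.

```javascript
// Hypothetical server-side helper: serialise the whole HTML string into
// one callback invocation, escaping non-ASCII characters as \uXXXX.
function buildJsonpBody(callbackName, html) {
  var literal = JSON.stringify(html).replace(/[\u0080-\uffff]/g, function (c) {
    return '\\u' + ('0000' + c.charCodeAt(0).toString(16)).slice(-4);
  });
  return callbackName + '(' + literal + ');';
}

var html = '<p>caf\u00e9 \ud55c\uae00</p>';
var body = buildJsonpBody('jsonp463827', html);
console.log(body);

// Simulate the browser executing the returned script block.
var received = null;
new Function('jsonp463827', body)(function (data) { received = data; });
console.log(received === html);
```

Because each \uXXXX sequence is produced whole inside one string literal, there is no point at which an escape could be cut in half.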

How to encode periods for URLs in Javascript?

The SO post below is comprehensive, but all three methods described fail to encode for periods.
Post: Encode URL in JavaScript?
For instance, if I run the three methods (i.e., escape, encodeURI, encodeURIComponent), none of them encode periods.
So "food.store" comes out as "food.store," which breaks the URL. It breaks the URL because the Rails app cannot recognize the URL as valid and displays the 404 error page. Perhaps it's a configuration mistake in the Rails routes file?
What's the best way to encode periods with Javascript for URLs?
I know this is an old thread, but I didn't see anywhere here any examples of URLs that were causing the original problem. I encountered a similar problem myself a couple of days ago with a Java application. In my case, the string with the period was at the end of the path element of the URL eg.
http://myserver.com/app/servlet/test.string
In this case, the Spring library I'm using was only passing me the 'test' part of that string to the relevant annotated method parameter of my controller class, presumably because it was treating the '.string' as a file extension and stripping it away. Perhaps this is the same underlying issue with the original problem above?
Anyway, I was able to work around this simply by adding a trailing slash to the URL. Just throwing this out there in case it is useful to anybody else.
John
Periods shouldn't break the URL, but I don't know how you are using the period, so I can't really say. None of the functions I know of encode the '.' for a URL, meaning you will have to use your own function to encode it.
You could base64 encode the data, but I don't believe there is a native way to do that in js. You could also replace all periods with their percent-encoded form (%2E) on both the client and server side.
Basically, it's not generally necessary to encode '.', so if you need to do it, you'll need to come up with your own solution. You may want to also do further testing to be sure the '.' will actually break the url.
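A quick check of that claim, as a small sketch using the sample value from the question:

```javascript
var value = 'food.store';

// None of the three built-in functions touches the period.
console.log(escape(value));             // food.store
console.log(encodeURI(value));          // food.store
console.log(encodeURIComponent(value)); // food.store

// A manually written %2E decodes back to a period with the standard decoder.
console.log(decodeURIComponent('food%2Estore')); // food.store
```

The last line also hints at why manual encoding may not help if the server decodes the URL before routing it: the period is back by the time the router sees it.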
hth
I had this same problem where my .htaccess was breaking input values with .
Since I did not want to change what the .htaccess was doing I used this to fix it:
var val="foo.bar";
var safevalue=encodeURIComponent(val).replace(/\./g, '%2E');
This does all the standard encoding, then replaces . with its percent-encoded form, %2E. PHP automatically converts it back to . in the $_REQUEST value, but the .htaccess doesn't see it as a period, so things are all good.
Periods do not have to be encoded in URLs. Here is the RFC to look at.
If a period is "breaking" something, it may be that your server is making its own interpretation of the URL, which is a fine thing to do of course but it means that you have to come up with some encoding scheme of your own when your own metacharacters need escaping.
I had the same question and maybe my solution can help someone else in the future.
In my case the url was generated using javascript. Periods are used to separate values in the url (sling selectors), so the selectors themselves weren't allowed to have periods.
My solution was to replace all periods with the HTML entity &#46;, as in Figure 1:
Figure 1: Solution
var urlPart = 'foo.bar';
var safeUrlPart = encodeURIComponent(urlPart.replace(/\./g, '&#46;'));
console.log(safeUrlPart); // foo%26%2346%3Bbar
console.log(decodeURIComponent(safeUrlPart)); // foo&#46;bar
I had problems with periods in REST API URLs. They are interpreted as file extensions, which in its own way makes sense. Escaping doesn't help because they are unescaped before the call (as already noted). Adding a trailing / didn't help either. I got around this by passing the value as a named argument instead, e.g. changing api/Id/Text.string to api/Id?arg=Text.string. You'll need to modify the routing on the controller, but the handler itself can stay the same.
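On the client side, that change amounts to building the URL with the value in the query string rather than in the path. A minimal sketch, where buildApiUrl is a hypothetical helper and the path and argument name are the ones from the example above:

```javascript
// Hypothetical helper: moves the value out of the path and into a query argument.
function buildApiUrl(basePath, argName, value) {
  return basePath + '?' + encodeURIComponent(argName) + '=' + encodeURIComponent(value);
}

console.log(buildApiUrl('api/Id', 'arg', 'Text.string')); // api/Id?arg=Text.string
```

The period stays as it is; it simply no longer sits where the router looks for an extension.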
If it's possible, using a .htaccess file would make this really easy. Just add a \ before the period, like this: \.
It is a rails problem, see Rails REST routing: dots in the resource item ID for an explanation (and Rails routing guide, Sec. 3.2)
You shouldn't be using encodeURI() or encodeURIComponent() anyway.
console.log(encodeURIComponent('%^&*'));
Input: %^&*. Output: %25%5E%26*. So, to be clear, this doesn't convert *. Hopefully you know this before you run rm * after "cleansing" that input server-side!
Luckily, MDN gave us a work-around to fix this glaring problem, fixedEncodeURI() and fixedEncodeURIComponent(), which is based on this regex: [!'()*]. (Source: MDN Web Docs: encodeURIComponent().) Just rewrite it to add in a period and you'll be fine:
function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[\.!'()*]/g, function(c) {
    return '%' + c.charCodeAt(0).toString(16);
  });
}
console.log(fixedEncodeURIComponent('hello.'));

Javascript Special Characters coming back incorrectly

There is a page containing certain special characters, and when I retrieve their values via JavaScript I get an odd conversion. The character 'Œ' comes back as 'R' and its lowercase version 'œ' comes back as 'S'. Is this a limitation of JavaScript, or could it be the browser? This is from testing in Firefox. The values are retrieved via a REPL client (Jssh/MozRepl), so it could be an issue with these clients themselves rather than the browser.
You likely have an encoding problem somewhere. There are many opportunities to mis-handle the encoding of text. If you post some code, we might be able to help you find it.
Output streams aren't scriptably safe for non-ASCII characters, so you will need to wrap the stream in an nsIBinaryOutputStream, an nsIUnicharOutputStream or an nsIConverterOutputStream.

Escaping double hyphens in Javascript?

I have a Javascript bookmarklet that, when clicked on, redirects the user to a new webpage and supplies the URL of the old webpage as a parameter in the query string.
I'm running into a problem when the original webpage has a double hyphen in the URL (e.g. page--1--of--3.html). Stupid, I know, but I can't control the original page. The JavaScript escape function I'm using does not escape the hyphen, and IIS 6 gives a file not found error if asked to serve resource.aspx?original=page--1--of--3.html
Is there an alternative JavaScript escape function I can use? What is the best way to solve this problem? Does anybody know why IIS chokes on resource.aspx?original=page--1 and not page-1?
"escape" and "unescape" are deprecated precisely because they don't encode all the relevant characters. DO NOT USE ESCAPE OR UNESCAPE. Use "encodeURIComponent" and "decodeURIComponent" instead; they are supported in all but the oldest, most decrepit browsers. It's really a huge shame this knowledge isn't more common.
(see also encodeURI and decodeURI)
edit: err just tested, but this doesn't really cover the double hyphens still. Sorry.
Can you expand the escape function with some custom logic to encode the hyphens manually?
resource.aspx?original=page%2d%2d1%2d%2dof%2d%2d3.html
Something like this:
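function customEscape(value) {
  // escape() leaves "-" untouched, so encode hyphens manually afterwards.
  value = escape(value);
  value = value.replace(/-/g, '%2d');
  return value;
}
// Escape only the parameter value; escaping the whole URL would also encode "?" and "=".
location.href = "resource.axd?original=" + customEscape("test--page.html");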
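The same idea can be combined with encodeURIComponent, as recommended earlier in the thread. This is a sketch only: encodeWithHyphens is a hypothetical name, and whether IIS 6 accepts the result was not tested here.

```javascript
// Hypothetical helper: standard component encoding plus manual hyphen encoding,
// since encodeURIComponent leaves "-" as it is.
function encodeWithHyphens(value) {
  return encodeURIComponent(value).replace(/-/g, '%2D');
}

var original = 'page--1--of--3.html';
var encoded = encodeWithHyphens(original);

console.log('resource.aspx?original=' + encoded);
// resource.aspx?original=page%2D%2D1%2D%2Dof%2D%2D3.html
console.log(decodeURIComponent(encoded) === original); // true
```

Only the parameter value is encoded, so the ? and = of the query string are left intact.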
Update, for a bookmarklet:
Link
You're doing something else wrong. -- is legal in URLs and filenames. Maybe the file really isn't found?
-- is used to comment out text in a few scripting languages. SQL Server uses it to add comments. Do you use any database logic to store those filenames? Or create any queries where this name is part of the query string instead of using query parameters?
