Escaping double hyphens in Javascript? - javascript

I have a Javascript bookmarklet that, when clicked on, redirects the user to a new webpage and supplies the URL of the old webpage as a parameter in the query string.
I'm running into a problem when the original webpage has a double hyphen in the URL (ex. page--1--of--3.html). Stupid, I know - I can't control the original page The javascript escape function I'm using does not escape the hyphen, and IIS 6 gives a file not found error if asked to serve resource.aspx?original=page--1--of--3.html
Is there an alternative javascript escape function I can use? What is the best way to solve this problem? Does anybody know why IIS chokes on resource.aspx?original=page--1 and not page-1?

"escape" and "unescape" are deprecated precisely because it doesn't encode all the relevant characters. DO NOT USE ESCAPE OR UNESCAPE. use "encodeURIComponent" and "decodeURIComponent" instead. Supported in all but the oldest most decrepit browsers. It's really a huge shame this knowledge isn't much more common.
(see also encodeURI and decodeURI)
edit: err just tested, but this doesn't really cover the double hyphens still. Sorry.

Can you expand the escape function with some custom logic to encode the hypen's manually?
resource.aspx?original=page%2d%2d1%2d%2dof%2d%2d3.html
Something like this:
function customEscape(url) {
url = escape(url);
url = url.replace(/-/g, '%2d');
return url;
}
location.href = customEscape("resource.axd?original=test--page.html");
Update, for a bookmarklet:
Link

You're doing something else wrong. -- is legal in URLs and filenames. Maybe the file really isn't found?

-- is used to comment out text in a few scripting languages. SQL Server uses it to add comments. Do you use any database logic to store those filenames? Or create any queries where this name is part of the query string instead of using query parameters?

Related

Is there a length limitation when using replace method of a string?

I have a big string (1116902 char length) that I want to process with a regex (pretty simple one). I get a response from a soap server that is encoded in base64. So I just get the result between the appropriate xml tags and then decode the response.
This working for a small request. But when I get a big response back, the callback function of the replace() method is never called. I have tried to test the string on the regex101 website and it can find the result. So I wonder if there is a limitation in my JavaScript engine. I'm working on a Wakanda Server V10 that use Webkit as JavaScript engine. I cannot provide the string because it contains some enterprise information.
Here is my regex : /xsd:base64Binary">((.|\n)*?)<\/responseData>/
I taught it is maybe a special character that is not included in the ((.|\n)*?) group. But then why the regex101 find out the result (then maybe is the JavaScript engine)
Maybe anybody can help me?
Thanks
If you can guarantee that there are no tags between your start and end delimiter, which sounds like it might be the case, you could just change your RE to
/xsd:base64Binary">([^<]*)<\/responseData>/
which shouldn't require any backtracking and might work for you.
[^<] simply means everything but the < character. Since there shouldn't be any tags between the open and closing tags of your section (at least that's what I understand) that will accept everything until you hit your closing tag. The important thing is that the RE engine can tell immediately whether something matches that or not, so no branching or backtracking is required.

Escape HTML tags. Any issue possible with charset encoding?

I have a function to escape HTML tags, to be able to insert text into HTML.
Very similar to:
Can I escape html special chars in javascript?
I know that Javascript use Unicode internally, but HTML pages may be encoded in different charsets like UTF-8 or ISO8859-1, etc..
My question is: There is any issue with this very simple conversion? or should I take into consideration the page charset?
If yes, how to handle that?
PS: For example, the equivalente PHP function (http://php.net/manual/en/function.htmlspecialchars.php) has a parameter to select a charset.
No, JavaScript lives in the Unicode world so encoding issues are generally invisible to it. escapeHtml in the linked question is fine.
The only place I can think of where JavaScript gets to see bytes would be data: URLs (typically hidden beneath base64). So this:
var markup = '<p>Hello, '+escapeHtml(user_supplied_data);
var url = 'data:text/html;base64,'+btoa(markup);
iframe.src = url;
is in principle a bad thing. Although I don't know of any browsers that will guess UTF-7 in this situation, a charset=... parameter should be supplied to ensure that the browser uses the appropriate encoding for the data. (btoa uses ISO-8859-1, for what it's worth.)

Replacing special characters like dots in javascript

I have a search query from the user and I want to process it before applying to browser. since I'm using SEO with htaccess and the search url looks like this : /search/[user query] I should do something to prevent user from doing naughty things.. :) Like searching ../include/conf.php which will result in giving away my configuration file. I want to process the query like removing spaces, removing dots(which will cause problems), commas,etc.
var q = document.getElementById('q').value;
var q = q.replace(/ /gi,"+");
var q = q.replace(/../gi,"");
document.location='search/'+q;
the first replace works just fine but the second one messes with my query.. any solution to replacing this risky characters safely?
So if I disable JavaScript or use curl I still can do "naughty things"? On the client side do the sanity escaping with:
encodeURIComponent(document.getElementById('q').value)
and leave security checks to the server. You would be amazed what malicious user can do (using some escape sequences instead of plain . is the simplest example).
I'd do this server-side - it's too easy for someone to alter your JS in the page or switch it off altogether. Your search script that runs server-side can't (as) easily be tampered with and can then filter the search consistently.
You might also want to restrict what the search returns... if it's able to show sensitive config files, your search may have a little too much reach.
Dots in regular expressions match anything. You need to escape them with a back-slash ('\'):
var q = q.replace(/\.\./gi,"");

How to encode periods for URLs in Javascript?

The SO post below is comprehensive, but all three methods described fail to encode for periods.
Post: Encode URL in JavaScript?
For instance, if I run the three methods (i.e., escape, encodeURI, encodeURIComponent), none of them encode periods.
So "food.store" comes out as "food.store," which breaks the URL. It breaks the URL because the Rails app cannot recognize the URL as valid and displays the 404 error page. Perhaps it's a configuration mistake in the Rails routes file?
What's the best way to encode periods with Javascript for URLs?
I know this is an old thread, but I didn't see anywhere here any examples of URLs that were causing the original problem. I encountered a similar problem myself a couple of days ago with a Java application. In my case, the string with the period was at the end of the path element of the URL eg.
http://myserver.com/app/servlet/test.string
In this case, the Spring library I'm using was only passing me the 'test' part of that string to the relevant annotated method parameter of my controller class, presumably because it was treating the '.string' as a file extension and stripping it away. Perhaps this is the same underlying issue with the original problem above?
Anyway, I was able to workaround this simply by adding a trailing slash to the URL. Just throwing this out there in case it is useful to anybody else.
John
Periods shouldn't break the url, but I don't know how you are using the period, so I can't really say. None of the functions I know of encode the '.' for a url, meaning you will have to use your own function to encode the '.' .
You could base64 encode the data, but I don't believe there is a native way to do that in js. You could also replace all periods with their ASCII equivalent (%2E) on both the client and server side.
Basically, it's not generally necessary to encode '.', so if you need to do it, you'll need to come up with your own solution. You may want to also do further testing to be sure the '.' will actually break the url.
hth
I had this same problem where my .htaccess was breaking input values with .
Since I did not want to change what the .htaccess was doing I used this to fix it:
var val="foo.bar";
var safevalue=encodeURIComponent(val).replace(/\./g, '%2E');
this does all the standard encoding then replaces . with there ascii equivalent %2E. PHP automatically converts back to . in the $_REQUEST value but the .htaccess doesn't see it as a period so things are all good.
Periods do not have to be encoded in URLs. Here is the RFC to look at.
If a period is "breaking" something, it may be that your server is making its own interpretation of the URL, which is a fine thing to do of course but it means that you have to come up with some encoding scheme of your own when your own metacharacters need escaping.
I had the same question and maybe my solution can help someone else in the future.
In my case the url was generated using javascript. Periods are used to separate values in the url (sling selectors), so the selectors themselves weren't allowed to have periods.
My solution was to replace all periods with the html entity as is Figure 1:
Figure 1: Solution
var urlPart = 'foo.bar';
var safeUrlPart = encodeURIComponent(urlPart.replace(/\./g, '.'));
console.log(safeUrlPart); // foo%26%2346%3Bbar
console.log(decodeURIComponent(safeUrlPart)); // foo.bar
I had problems with .s in rest api urls. It is the fact that they are interpreted as extensions which in it's own way makes sense. Escaping doesn't help because they are unescaped before the call (as already noted). Adding a trailing / didn't help either. I got around this by passing the value as a named argument instead. e.g. api/Id/Text.string to api/Id?arg=Text.string. You'll need to modify the routing on the controller but the handler itself can stay the same.
If its possible using a .htaccess file would make it really cool and easy. Just add a \ before the period. Something like:\.
It is a rails problem, see Rails REST routing: dots in the resource item ID for an explanation (and Rails routing guide, Sec. 3.2)
You shouldn't be using encodeURI() or encodeURIComponent() anyway.
console.log(encodeURIComponent('%^&*'));
Input: %^&*. Output: %25%5E%26*. So, to be clear, this doesn't convert *. Hopefully you know this before you run rm * after "cleansing" that input server-side!
Luckily, MDN gave us a work-around to fix this glaring problem, fixedEncodeURI() and fixedEncodeURIComponent(), which is based on this regex: [!'()*]. (Source: MDN Web Docs: encodeURIComponent().) Just rewrite it to add in a period and you'll be fine:
function fixedEncodeURIComponent(str) {
return encodeURIComponent(str).replace(/[\.!'()*]/g, function(c) {
return '%' + c.charCodeAt(0).toString(16);
});
}
console.log(fixedEncodeURIComponent('hello.'));

where did the other/ come from?

why some thing like this
url="http:/"+"www.pathname.com"+"/"+"anotherString";
turns into http://www.pathname.com/anotherString.
Notice the // after the http:.
I am not sure what you mean exactly, but if it is in the URL bar of the web-browser, then it is just correcting you. you are meant to have the two forward slashes their. Its just the way you are meant to do it.
An interesting anecdote: Nokia older phones write http:/ instead of http://.
Its just the way urls are designed, and the way browsers interpret them. For more details see this blog post
Double slash in Web addresses 'a bit of a mistake' and
Berners-Lee 'sorry' for slashes
As a string in your file, it does not add an extra / to it, but if you type that string into the browser address bar, the browser will automatically add the additional / to the URL to make it valid. I do believe it will do this for images and other links on the page to make them valid.

Categories

Resources