Remove personal information from mixpanel javascript tracking call


I'm currently integrating with MixPanel's JavaScript library and have run into an issue that MixPanel does not seem to have thought about. Our company deals with personally identifiable information(PII), and therefore some data that we pass as params are not appropriate for storage on a third-party service. But MixPanel's default behavior is to include the full URL of the current page and the referrer with every tracking event. This makes sense to a degree, but we need to scrub some query parameters out of these fields.
It seems like MixPanel's documentation does not discuss an API for accomplishing this, so any advice from someone more experienced with MixPanel integration would be helpful.

"Our company deals with personally identifiable information(PII), and
therefore some data that we pass as params are not appropriate for
storage on a third-party service."
You should not be passing sensitive data as URL parameters. This is a security no-no: the values end up in browser history (and thus can be retrieved by somebody looking through the browser history) and in server-side logs. Always pass sensitive data through the HTTP body or headers instead.
References:
https://blog.httpwatch.com/2009/02/20/how-secure-are-query-strings-over-https/
Is an HTTPS query string secure?
https://blog.gaborszathmari.me/2015/05/05/session-ids-as-query-parameters-must-die/
http://owasp-aasvs.readthedocs.org/en/latest/requirement-9.3.html

You can actually override the Current URL by setting $current_url in your event, e.g.
mixpanel.track('My Event', {
'$current_url': 'http://www.example.com'
});
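Building on that override, one way to scrub query parameters is to clean the URL yourself before handing it to the tracking call. The following is only a minimal sketch under stated assumptions: `SENSITIVE_PARAMS`, `scrubUrl`, and `trackScrubbed` are hypothetical names, the parameter list is made up, and overriding `$referrer` in the same way as `$current_url` is an assumption that should be checked against the Mixpanel documentation.

```javascript
// Hypothetical list of query parameters that must never leave the site.
const SENSITIVE_PARAMS = ["email", "ssn", "token"];

// Return rawUrl with the sensitive query parameters removed.
function scrubUrl(rawUrl, sensitive = SENSITIVE_PARAMS) {
  if (!rawUrl) return rawUrl; // e.g. an empty referrer
  let url;
  try {
    url = new URL(rawUrl);
  } catch (err) {
    return ""; // unparseable value: drop it rather than risk leaking it
  }
  for (const name of sensitive) {
    url.searchParams.delete(name);
  }
  return url.toString();
}

// Send an event with scrubbed URL properties. `client` is expected to
// expose track(eventName, properties), as the mixpanel object does.
function trackScrubbed(client, eventName, pageUrl, referrer, props = {}) {
  client.track(eventName, {
    ...props,
    $current_url: scrubUrl(pageUrl),
    // Assumption: $referrer can be overridden like $current_url.
    $referrer: scrubUrl(referrer),
  });
}
```

In the browser this could be called as `trackScrubbed(mixpanel, 'My Event', window.location.href, document.referrer);`.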

Related

Javascript redirect vulnerability

Say I store the url path in a query parameter like /?return_back_to=/foo/bar
Then pass this to some external auth service like Microsoft, which does the login and returns to the same url with my query parameter.
At this point, is it safe to get the value from the query parameter and redirect using React navigate() to this url? Or is this considered an "open redirect vulnerability" ?

On the surface, as long as you follow best practices and validate the query parameter, it should be safe to use, and it would not be an "open redirect vulnerability".
You mentioned using Microsoft's auth service, which I personally don't have much experience with, but I have used Firebase and Google auth a lot, and I know that they check the redirect URL automatically: if it is not whitelisted, the redirect will not work. Firebase automatically adds localhost and your app domain to the whitelist, and you can add more domains if you have external links that you would like your users to be redirected to.
source 1: https://support.google.com/firebase/answer/6400741?hl=en
source 2: https://support.google.com/firebase/answer/9021429?hl=en
As for whether it is safe to use React's navigate() once users are back in your app, you should either check the URL against a local whitelist or prepend your app domain to the path before redirecting the users:
navigate({safeDomain} + {query parameter})
Although I should mention that if by navigate() you are referring to the useNavigate() hook, I don't think you can use it for that, and you need to use redirect().
some more useful information for mitigating against open redirect vulnerability
I hope this was helpful!

It depends on who is calling that endpoint. Well known identity providers will require you to set allowed redirect urls, and will only send back the authorized ones (the ones you set up). So they will only call the callback if it's ok, so you can redirect securely.
However, anybody else might use this URL (link to it) with a different parameter, pointing somewhere you don't want to navigate to, and that would be an open redirect. So you need to make sure that the request actually originates from a trusted source, i.e. from Azure AD. Depending on what flow you are implementing, you can either validate a token you received to make sure it is a valid request, or at the very least you can check the Origin / Referer header to see who the caller is (it's not possible to alter Origin or Referer in JavaScript, so an attacker cannot have a legitimate user visit a link with a malicious redirect and an Origin from Microsoft).
Also if you only redirect in your own origin (domain), you can and should add validation that the redirect path (return_back_to) is internal, like for example starts with a / and/or does not contain ://.
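The internal-path check described above could look roughly like the sketch below. It is an illustration only: `safeReturnPath` and the `appOrigin` argument are hypothetical, and instead of searching for `://` it resolves the value against the app's origin and compares origins, which also rejects protocol-relative values such as `//malicious.example.com`.

```javascript
// Return a same-origin path to navigate to, or the fallback when the
// supplied value is missing, absolute, or points at another origin.
function safeReturnPath(rawValue, appOrigin, fallback = "/") {
  if (typeof rawValue !== "string" || !rawValue.startsWith("/")) {
    return fallback;
  }
  let resolved;
  try {
    // Resolve the way a browser would, relative to our own origin.
    resolved = new URL(rawValue, appOrigin);
  } catch (err) {
    return fallback;
  }
  if (resolved.origin !== new URL(appOrigin).origin) {
    return fallback; // e.g. "//malicious.example.com"
  }
  // Hand back only the path, query string, and fragment.
  return resolved.pathname + resolved.search + resolved.hash;
}
```

The result, rather than the raw query parameter, is what would then be passed to navigate().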

I don't completely agree with what has been said in the other answers.
Indeed, a return URL passed through this mechanism may contain information, and that information is then exposed to all kinds of attacks.
Let's take an example, if the return url is for example the user's dashboard, the url could be of the type /users/<id or name of the user>
Moreover, all this also depends on the checks that are implemented on your side.
Let's say an attacker tries to use your url to override the url legitimately returned by your authentication service: http://callback.url/?return_back_to=http://malicious.example.com
If you do not handle the data received correctly, this will lead to potential security breaches.
For all of these reasons (and many others), I think you should avoid passing web (or filesystem) paths directly in a GET or POST variable and using them as-is.
Instead, you can consider passing only the information you need to identify the target, and then construct the URL in your callback script before passing it to navigate().
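A minimal sketch of that idea, assuming the query parameter carries a short key instead of a path. The keys, the paths, and the names `RETURN_TARGETS` and `resolveReturnTarget` are all hypothetical.

```javascript
// Hypothetical allowlist: the query parameter carries only a key,
// and the application decides which path each key maps to.
const RETURN_TARGETS = new Map([
  ["dashboard", "/dashboard"],
  ["settings", "/account/settings"],
]);

// Build the destination from the key; unknown keys fall back to "/".
function resolveReturnTarget(key, fallback = "/") {
  return RETURN_TARGETS.get(key) ?? fallback;
}
```

With this, a link such as /?return_back_to=dashboard can only ever lead to a path the application already knows about.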
Also, in case you are not doing so already, avoid passing any numerical ID in POST and GET variables. Instead, generate a unique alphanumeric ID so that attackers cannot predict IDs.
For more information about web security, I recommend learning the OWASP Top Ten: https://owasp.org/www-project-top-ten/
This is a good entry point to building secured web apps.

How to map a custom URL to a REST resource in an SPA?

I have a design issue with my SPA, and hope someone can give me some direction. A user profile page is rendered like this:
The browser fetches /some-username.
The server checks to see if the request was a XMLHTTPRequest or not. It is not, and so it simply returns the bundled javascript app to the browser to execute.
The javascript bundle is executed in the browser, it sees the current URL and makes an AJAX request, again to /some-username.
The server sees the XMLHTTPRequest header, looks up the user who has the custom URL "/some-username" and returns the JSON data about the user back to the javascript to render.
This feels wrong. The app should be making RESTful requests to /users/:id to fetch the user data. But how can it know the id that corresponds to the user with the URL /some-username?
Is it worth adding an extra HTTP request just to look up the resource identifier? Something like /get_user_id?url=/some-username.

Are you flexible about your API? If so, you may change /some-username to /user-id or, if you want to include the username, to /user-id/username, ignoring the username part.
As an alternative, it is also common to make requests in filter form, like /users?username=peter
And feel free to use /users/peter if the username identifies the user, because then it is actually the id (an id doesn't have to be an integer) and your URL is exactly /users/:id
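On the client side, the filter form might be used as in the sketch below. The names `userLookupUrl` and `fetchUserByUsername` are hypothetical, and it is an assumption that the endpoint answers with JSON describing the user.

```javascript
// Build the filter-style lookup URL for a username.
function userLookupUrl(username) {
  return "/users?username=" + encodeURIComponent(username);
}

// Fetch the user data. `fetchImpl` defaults to the global fetch and
// can be swapped out, for example in tests.
async function fetchUserByUsername(username, fetchImpl = fetch) {
  const response = await fetchImpl(userLookupUrl(username), {
    headers: { Accept: "application/json" },
  });
  if (!response.ok) {
    throw new Error("User lookup failed with status " + response.status);
  }
  return response.json();
}
```

The SPA would take the username from the current path (/some-username) and call fetchUserByUsername with it, so the page URL and the API URL no longer have to be the same.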

There is nothing "unRESTful" about /some-username. It's just another resource. The response - I hope - contains the canonical URL /user/id anyway, either as a header or as some kind of "self" link.
That's also how you could achieve your goal. Embed the URL in the page either as JavaScript or as a header equivalent (unfortunately you cannot read the headers of the page request with JavaScript):
<!-- Header equivalent. Can also use a custom header like X-User-Location -->
<meta http-equiv="Location" content="/user/id">
<!-- JavaScript -->
<script>
var userURL = '/user/id';
</script>
I recommend keeping your current approach.

Retrieving the Omniture tracking call url before calling s.t()

I'd like to use an old s_code.js to generate a url that I can use to track an event, but I'd like not to use s.t() as it provides no way of keeping tabs on the request. Is there any way to get the url as a String before sending off the tracking request with s.t()?

You'd need to pull from a mixture of methods in order to get this data. There is no one method (at least in AppMeasurement) that will get this for you. You'd need a combination of s.pb (builds the domain and path, but also triggers the creation of the request), s.gb for building the URL parameters, and s.t for building the cache buster. YMMV, given that these are internal methods of the core library and there is no guarantee they won't be renamed.
I guess my question is why you're wanting to do this. There are options of preventing the call from being made, but needing to inspect the URL is a first for me.

Protecting my REST service, which I will use on the client side, from others to use

Let's assume that I have created my REST service smoothly and I am returning json results.
I also implemented an API key for my users to communicate with my service.
Then Company A started using my service and I gave them an API key.
Then they created an HttpHandler as a bridge (I am not sure what the right term is here) in order not to expose the API key (I am also not sure that this is the right way).
For example, lets assume that my service url is as follows :
www.myservice.com/service?apikey={key_comes_here}
Company A is using this service from client side like below :
www.companyA.com/services/service1.ashx
Then they start using it on the client side.
Company A protected the api key here. That's fine.
But there is another problem here. Somebody else can still grab the www.companyA.com/services/service1.ashx URL and start using my service.
What is the way of preventing others from doing that?
For the record, I am using WCF Web API in order to create my REST services.
UPDATE :
Company A's HttpHandler (second link) only looks at the host header in order to see whether the request is coming from www.companyA.com or not, but it can be faked easily, I guess.
UPDATE 2 :
Is there any known way of implementing a token for the URL? For example, let's say that www.companyA.com/services/service1.ashx will carry a query string parameter representing a TOKEN so that the HttpHandler can check whether the request is the right one.
But there are many things here to think about I guess.

You could always require the client to authenticate, using HTTP Basic Auth or some custom scheme. If your client requires the user to login, you can at least restrict the general public from obtaining the www.companyA.com/services/service1.ashx URL, since they will need to login to find out about it.
It gets harder if you are also trying to protect the URL from unintended use by people who legitimately have access to the official client. You could try changing the service password at regular intervals, and updating the client along with it. That way a refresh of the client in-browser would pull the new password, but anyone who built custom code would be out of date. Of course, a really determined user could just write code to rip the password from the client JS programmatically when it changes, but you would at least protect against casual infringers.
With regard to the URL token idea you mentioned in update 2, it could work something like this. Imagine every month, the www.companyA.com/services/service1.ashx URL requires a new token to work, e.g. www.companyA.com/services/service1.ashx?token=January. Once it's February, 'January' will stop working. The server will have to know to only accept current month, and client will have to know to send a token (determined at the time the client web page loads from the server in the browser)
(All pseudo-code since I don't know C# and which JS framework you will use)
Server-side code:
if (request.urlVars.token == Date.now.month) then
render "This is the real data: [2,5,3,5,3]"
else
render "401 Unauthorized"
Client code (dynamic version served by your service)
www.companyA.com/client/myajaxcode.js.asp
var dataUrl = 'www.companyA.com/services/service1.ashx?token=' + <%= Date.now.month %>
// below is JS code that does ajax call using dataUrl
...
So now we have service code that will only accept the current month as a token, and client code that gets the latest token (set dynamically as the current month) when you refresh it in the browser. Since this scheme is really predictable and could be hacked, the remaining step is to apply a salted hash to the token so no one can guess what it is going to be.
if (request.urlVars.token == mySaltedHashMethod(Date.now.month)) then
and
var dataUrl = 'www.companyA.com/services/service1.ashx?token=' + <%= mySaltedHashMethod(Date.now.month) %>
Which would leave you with a URL like www.companyA.com/services/service1.ashx?token=gy4dc8dgf3f and would change tokens every month.
You would probably want tokens to expire faster than every month as well, which you could do by using the epoch hour instead of the month.
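As a rough illustration of the hourly variant, here is a Node.js sketch. It swaps the salted hash for an HMAC (a keyed hash), which is a substitution on my part, and the function names and the way the secret is passed in are hypothetical. The secret has to stay on the server.

```javascript
const crypto = require("crypto");

const MS_PER_HOUR = 60 * 60 * 1000;

// Number of whole hours since the Unix epoch.
function epochHour(nowMs = Date.now()) {
  return Math.floor(nowMs / MS_PER_HOUR);
}

// Token for a given hour: HMAC-SHA256 of the hour number under the secret.
function tokenFor(hour, secret) {
  return crypto.createHmac("sha256", secret).update(String(hour)).digest("hex");
}

// True only if the token matches the one for the current hour.
function isCurrentToken(token, secret, nowMs = Date.now()) {
  const expected = Buffer.from(tokenFor(epochHour(nowMs), secret));
  const given = Buffer.from(String(token));
  // timingSafeEqual requires equal lengths, so check that first.
  return given.length === expected.length && crypto.timingSafeEqual(given, expected);
}
```

The page that serves the client code would embed `tokenFor(epochHour(), secret)` in the data URL, and the handler would call `isCurrentToken` on the incoming token. As with the monthly version, a token issued just before the hour rolls over stops working right after it.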
I'd be interested to see if someone out there has solved this with some kind of encrypted client code!

What you're describing is generally referred to as a "proxy" -- companyA's public page is available to anyone, and behind the scenes, it makes the right calls to your system. It's not uncommon for applications to use proxies to get around security -- for example, the same-origin policy means that your javascript can't make Ajax calls to, say, Amazon -- but if you proxy it on your own system, you can get around this.
I can't really think of a technical way to prevent this; once they've pulled data from your service, they can use that data however they want. You have legal options, of course; you can make it a term of service that proxying isn't allowed, and pull their API key if they don't comply. But most likely, if you haven't already included that in the TOS, you'd have to wait for, say, a renewal of their subscription to your service.
Presumably if they're making server-side HTTP requests to your service, those requests are all coming from the same IP address, so you could block that address. You'd probably want to tell them first, and they could certainly get around that if they wanted to.

With the second link exposed by Company A I don't think you can do much. As I understand it, you can only check whether the incoming request comes from Company A or not.
But each request issued to www.companyA.com/.. can't be distinguished from original request from Company A. Everyone they let in uses their referrer as a disguise.

How to prevent direct access to my JSON service?

I have a JSON web service to return home markers to be displayed on my Google Map.
Essentially, http://example.com calls the web service to find out the location of all map markers to display like so:
http://example.com/json/?zipcode=12345
And it returns a JSON string such as:
{"address": "321 Main St, Mountain View, CA, USA", ...}
So on my index.html page, I take that JSON string and place the map markers.
However, what I don't want to have happen is people calling out to my JSON web service directly.
I only want http://example.com/index.html to be able to call my http://example.com/json/ web service ... and not some random dude calling the /json/ directly.
Question: how do I prevent direct calling/access to my http://example.com/json/ web service?
UPDATE:
To give more clarity, http://example.com/index.html calls http://example.com/json/?zipcode=12345 ... and the JSON service
- returns semi-sensitive data,
- returns a JSON array,
- responds to GET requests,
- the browser making the request has JavaScript enabled
Again, what I don't want to have happen is people simply look at my index.html source code and then call the JSON service directly.

There are a few good ways to authenticate clients.
By IP address. In Apache, use the Allow / Deny directives.
By HTTP auth: basic or digest. This is nice and standardized, and uses usernames/passwords to authenticate.
By cookie. You'll have to come up with the cookie.
By a custom HTTP header that you invent.
Edit:
I didn't catch at first that your web service is being called by client-side code. It is literally NOT POSSIBLE to prevent people from calling your web service directly, if you let client-side Javascript do it. Someone could just read the source code.

Some more specific answers here, but I'd like to make the following general point:
Anything done over AJAX is being loaded by the user's browser. You could make a hacker's life hard if you wanted to, but, ultimately, there is no way of stopping me from getting data that you already freely make available to me. Any service that is publicly available is publicly available, plain and simple.

If you are using Apache you can set allow/deny on locations.
http://www.apachesecurity.net/
or here is a link to the apache docs on the Deny directive
http://httpd.apache.org/docs/2.0/mod/mod_access.html#deny
EDITS (responding to the new info).
The Deny directive also works with environment variables. You can restrict access based on browser string (not really secure, but discourages casual browsing) which would still allow XHR calls.
I would suggest the best way to accomplish this is to have a token of some kind that validates the request is a 'good' request. You can do that with a cookie, a session store of some kind, or a parameter (or some combination).
What I would suggest for something like this is to generate a unique url for the service that expires after a short period of time. You could do something like this pretty easily with Memcache. This strategy could also be used to obfuscate the service url (which would not provide any actual security, but would raise the bar for someone wanting to make direct calls).
Lastly, you could also use public key crypto to do this, but that would be very heavy. You would need to generate a new pub/priv key pair for each request and return the pubkey to the js client (here is a link to an implementation in javascript) http://www.cs.pitt.edu/~kirk/cs1501/notes/rsademo/

You can add a random number as a flag to determine whether the request is coming from the page just sent:
1) When the server generates index.html, add a random number to the JSON request URL:
Old: http://example.com/json/?zipcode=12345
New: http://example.com/json/?zipcode=12345&f=234234234234234234
Add this number to the Session Context as well.
2) The client browser renders index.html and requests the JSON data using the new URL.
3) Your server gets the JSON request and checks the flag number against the Session Context. If they match, it responds with the data. Otherwise, it returns an error message.
4) Clear the Session Context at the end of the response, or when a timeout is triggered.
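The four steps above can be sketched in Node.js as follows. This is an illustration under assumptions: `session` stands for whatever per-user session object your framework provides, and `issueFlag`, `consumeFlag`, and the `jsonFlag` property are hypothetical names.

```javascript
const crypto = require("crypto");

// Step 1: when rendering index.html, create a random flag, remember it
// in the session, and return it so it can be appended as &f=... .
function issueFlag(session) {
  const flag = crypto.randomBytes(16).toString("hex");
  session.jsonFlag = flag;
  return flag;
}

// Steps 3 and 4: compare the flag from the JSON request with the one
// stored in the session, and clear it so it cannot be reused.
function consumeFlag(session, candidate) {
  const expected = session.jsonFlag;
  delete session.jsonFlag;
  return typeof expected === "string" && candidate === expected;
}
```

The JSON handler would return the data only when `consumeFlag(session, request.query.f)` is true (`request.query.f` being however your framework exposes the parameter) and an error otherwise. A timeout, as mentioned in step 4, would need an issue timestamp stored alongside the flag.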

Accept only POST requests to the JSON-yielding URL. That won't prevent determined people from getting to it, but it will prevent casual browsing.

I know this is old but for anyone getting here later this is the easiest way to do this. You need to protect the AJAX subpage with a password that you can set on the container page before calling the include.
The easiest way to do this is to require HTTPS on the AJAX call and pass a POST variable. HTTPS + POST ensures the password is always encrypted.
So on the AJAX/sub-page do something like
if ($_POST["access"] == "makeupapassword")
{
...
}
else
{
echo "You can't access this directly";
}
When you call the AJAX page, make sure to include the POST variable and password in your payload. Since it is sent in the POST body over HTTPS, it will be encrypted in transit, and since it is random (hopefully), nobody will be able to guess it.
If you want to include or require the PHP directly on another page, just set the POST variable to the password before including it.
$_POST["access"] = "makeupapassword";
require("path/to/the/ajax/file.php");
This is a lot better than maintaining a global variable, session variable, or cookie because some of those are persistent across page loads so you have to make sure to reset the state after checking so users can't get accidental access.
Also, I think it is better than page headers because it can't be sniffed, since it is secured by HTTPS.

You'll probably have to have some kind of cookie-based authentication. In addition, Ignacio has a good point about using POST. This can help prevent JSON hijacking if you have untrusted scripts running on your domain. However, I don't think using POST is strictly necessary unless the outermost JSON type is an array. In your example it is an object.
