Could you please help me with a rule that can exclude users who have already been exposed to one of the other experiments in Google Optimize?
What is the best approach?
I am thinking about using a 1st-party cookie variable or some other custom variable that would mark the user as "exposed" so that another experiment will not affect them.
In addition, I can use "Run custom JavaScript" in Optimize's visual editor to create such a cookie. Will that solve the problem?
Also, I can't understand how to prevent 2 experiments from running simultaneously, so that a user who sees experiment A will not see experiment B or C (the free version is limited to 3 experiments). Are there any rules or configuration that can help with that?
Just had to tackle this! The approach by #swapnil-jain seemed to work on the surface, but unfortunately had some issues.
When Optimize evaluates whether or not a new user should be opted in to a list of experiments, it creates the _gaexp cookie once for all opt-ins; it doesn't create it and then update it between opt-ins.
So it looks for the _gaexp cookie, evaluates opt-in for Exp A (the cookie does not contain <expBId>) and opts the user in to Exp A. Then it evaluates opt-in for Exp B (the cookie does not contain <expAId>) and opts the user in to Exp B. Only then does it create a cookie reading something like GAX1.3.<expAId>.<expDate>.<value>.<expBId>.<expDate>.<value>.
The problem now is that on their second visit, the user will be excluded from seeing the variation for either experiment, because they now fail the audience targeting conditions. Their cookie now contains both <expAId> and <expBId>!
I had similar problems trying to target the _gaexp cookie with regex, since the cookie is created all at once for both experiments after the opt-in is decided.
My current working solution is to create a custom JavaScript rule called rand100. For a first time user, it generates a random number 0-100. If that number is below 50, they are evaluated for Exp A, and a cookie is stored containing the rand100 value. If equal to or above 50, they are evaluated for Exp B. For returning users, the cookie is retrieved, and the previous value of rand100 is returned instead of a new one, and so they still meet the targeting conditions for the experiment they've been opted into.
The one cookie is shared between opt-in evaluations: Optimize runs the custom JavaScript during the Exp A evaluation, so the cookie is already available for the Exp B evaluation. The cookie is set to expire after 90 days, which is the default expiration for Optimize tracking.
I'm running these experiments at 100% traffic, but technically they are only seeing 50% traffic because of rand100. Traffic is split 25%/25%/25%/25% between control/v1/control/v1.
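For reference, here is a minimal sketch of what that rand100 custom JavaScript variable can look like (the cookie name and the 0-99 range are my own choices; adjust as needed):

```javascript
function rand100() {
  var match = document.cookie.match(/(?:^|;\s*)rand100=(\d+)/);
  if (match) {
    return Number(match[1]); // returning user: reuse the stored value
  }
  var value = Math.floor(Math.random() * 100); // 0-99 for a first-time user
  var expires = new Date(Date.now() + 90 * 24 * 60 * 60 * 1000); // roughly 90 days
  document.cookie = 'rand100=' + value + '; expires=' + expires.toUTCString() + '; path=/';
  return value;
}
```

In audience targeting, Exp A then uses a condition like "rand100 is less than 50" and Exp B uses "rand100 is greater than or equal to 50".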
When we create any experiment, Google creates an experiment id, which we can find in the details section. Also, when an experiment is triggered for a user, it sets a _gaexp cookie which contains that experiment id (along with other identifiers).
So, if you want to run two mutually exclusive experiments, all you need to do is exclude a user from an experiment if _gaexp contains the id of the other. Here are the steps:
In audience targeting, add a rule and choose First-party cookie
Create a variable and set its value to _gaexp
Choose the does not contain option and add the other experiment's id as the value
Save
Repeat the same steps for the other experiment
This is one of the reasons that I'm still using Google Experiments. It provides a lot more control with its API. With that said, you should be able to achieve the result that you're looking for by setting a cookie in the user's browser. Here is how I see it playing out:
All experiment cookies have the same name but differing values, to avoid creating multiple cookies.
Upon a new session, check to see if cookie exists
Exists - fire tag to initialize appropriate experiment.
Doesn't Exist - determine which experiment bucket to put user in and fire tag to initialize appropriate experiment.
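A rough sketch of that flow (the cookie name, experiment names, and the dataLayer event used to fire the tag are placeholders; adapt it to however you actually activate your experiments):

```javascript
var EXPERIMENTS = ['expA', 'expB'];

function getBucket() {
  var match = document.cookie.match(/(?:^|;\s*)exp_bucket=([^;]+)/);
  if (match) {
    return match[1]; // cookie exists: keep the user's original bucket
  }
  // cookie doesn't exist: pick a bucket and remember it
  var bucket = EXPERIMENTS[Math.floor(Math.random() * EXPERIMENTS.length)];
  document.cookie = 'exp_bucket=' + bucket + '; max-age=' + 90 * 24 * 60 * 60 + '; path=/';
  return bucket;
}

// Fire the tag that initializes the chosen experiment only
window.dataLayer = window.dataLayer || [];
window.dataLayer.push({ event: 'activate-' + getBucket() });
```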
I know that Optimizely has an algorithm to bucket experiment users in a way that each user can be part of multiple experiments, but I don't believe that Google Optimize has that sort of functionality yet.
I am looking for a way to take window.location.hostname and end up with the base domain name, regardless of situation. So from all of:
http://example.domain.com
https://www.example.domain.com
https://www.example.domain.com/stuff/index.php
I would end up with:
domain.com
And from example.domain.co.uk/ I would end up with: domain.co.uk
I have searched many questions on this topic here on SO and it seems like none ever really result in a complete answer. Some involve complicated regexes that are hard to verify and may sometimes fail. Other answers only net a result of example.domain.com.
I am utterly floored that there is not just a simple way of getting this value in JavaScript. I am writing a plugin for websites that uses cookies to store user preferences. I am concerned that if a user sets preferences while using the plugin on a page from one hostname, say one.domain.com, on the off chance they go to use the plugin on another page hosted on two.domain.com, they would need to set their preferences all over again. When I set the cookies, I would like to be able to set them site-wide (at the domain.com level). Because this is a plugin, the domain name will not be known and needs to be calculated starting from window.location.hostname.
So is there a standard way of arriving at what I'm looking for? Or am I just approaching this the wrong way? I suppose I could just have a configuration setting for the website owner to input their base domain name and get it that way, but I'd prefer not to go that route if possible. And honestly, I'd still like to know how to do this anyway. Thanks in advance for any suggestions!
window.location.host
will return the host (the hostname plus the port, if any). In your case it would return "example.domain.com".
Then you need to do some manipulation on it, for example using split(), to get the desired value.
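To make that concrete: a plain split only works if you special-case multi-part public suffixes, so this sketch keeps the last two labels, or three when the ending is on a (deliberately incomplete) list of multi-part TLDs. A fully reliable solution needs the public suffix list.

```javascript
function getBaseDomain(hostname) {
  var multiPartTlds = ['co.uk', 'com.au', 'co.jp']; // illustrative only, not complete
  var parts = hostname.split('.');
  var keep = (multiPartTlds.indexOf(parts.slice(-2).join('.')) !== -1) ? 3 : 2;
  return parts.slice(-keep).join('.');
}

console.log(getBaseDomain('www.example.domain.com')); // "domain.com"
console.log(getBaseDomain('example.domain.co.uk'));   // "domain.co.uk"
```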
I have an experiment with multiple conditions. In each experiment, I am asking Turkers to rate a piece of news article, but I am changing the presentation across experimental conditions. I need to post all conditions to Turk simultaneously, but allow each worker to see only one of the many conditions. The first time a worker attempts a HIT, he is randomly assigned to one of the experimental conditions (say A), and if he attempts any subsequent HITs he should be shown tasks related to the same experimental condition (A) and not to others (B, C, etc.).
I know that one can use Qualification requirements to prevent workers from repeating surveys, but this is not possible in the case I described above, because I need to post the different experimental conditions simultaneously, unlike surveys where you do phase 1, retrieve the worker IDs of those who did the task, and then block them (or assign a qualification) so that they cannot attempt the survey in phase 2.
Anyway, the short version of the question is: how do I randomly assign a worker to an experimental condition and then ensure he is assigned to the same one in any subsequent HITs, provided I post all conditions simultaneously? Is there a way to retrieve worker IDs while they see the HIT and then assign them the corresponding experimental condition's task? The closest I can find is this paper (page 5), where they use an IFrame to find the worker ID.
What I would recommend here is not using true randomization. Instead, use JavaScript to retrieve the worker's workerId parameter, which will be passed as a URL query argument to the HIT. Then, in the HIT itself, use JavaScript to nonrandomly show workers an article based upon some portion of their workerId (e.g., the last two characters).
As a fuller example, create an array of all two-element alphanumeric combinations. Randomly assign each of those combinations to a condition. Then, when the worker accepts the HIT, extract their workerId, match the last two characters thereof to the condition based on the array, and then display the appropriate content to the worker. So, in essence, you're nonrandomly assigning people based on their workerId, but that identifier is basically assigned randomly.
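A simplified sketch of that idea, assuming the HIT is served as an external page so the workerId query parameter is present. Instead of the full lookup table of two-character combinations, it hashes the last two characters, which gives the same deterministic assignment; showCondition() is a stand-in for whatever renders the article variant.

```javascript
var CONDITIONS = ['A', 'B', 'C'];

function getWorkerId() {
  var match = window.location.search.match(/[?&]workerId=([^&]*)/);
  return match ? decodeURIComponent(match[1]) : null;
}

// Deterministic: the same workerId always maps to the same condition
function assignCondition(workerId) {
  var suffix = workerId.slice(-2); // last two characters, as described above
  var hash = 0;
  for (var i = 0; i < suffix.length; i++) {
    hash += suffix.charCodeAt(i);
  }
  return CONDITIONS[hash % CONDITIONS.length];
}

var workerId = getWorkerId();
if (workerId) {
  showCondition(assignCondition(workerId)); // hypothetical rendering function
}
```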
I have a question about how to approach a certain scenario before I get halfway through it and figure out it was not the best option.
I work for a large company, on a team that creates tools for teammates to use that aren't official enterprise tools. We have no direct access to the database, just access to an internal server to store the files we run, and we can access the main site with JavaScript etc. (same domain).
What I am working on is a tool that has a ton of options that let you select what I will call "data points" on a page.
These are things like account status, balance, name, phone number, email, etc., and the tool saves them to an Excel sheet.
So you input account numbers, choose what you need, and then, using IE objects, it navigates to the page and scrapes the data you request.
My question is as follows:
I want to make the scraping part pretty dynamic in the way it works. I want to be able to add new data points on the fly.
My goal, or idea, is to store the regular expression needed to get each specific piece of data in the table alongside the "data point" option.
If I choose "Name", the tool looks up the expression for name in the database and runs it against the DOM.
What would be the best way to create that type of function in JavaScript / jQuery?
I need to pass a Regex to a function, have it run against the DOM and then return the result.
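Roughly what I have in mind, as a sketch (the data point names and patterns below are made up):

```javascript
// Each entry would come from the table that pairs a data point with its expression
var dataPoints = {
  'Name':    /Name:\s*([^\r\n<]+)/,
  'Balance': /Balance:\s*\$([\d,.]+)/
};

// Run the stored expression against the scraped page's document and return the capture
function extractDataPoint(doc, key) {
  var pattern = dataPoints[key];
  if (!pattern) { return null; }
  var match = (doc.body.innerText || doc.body.textContent).match(pattern);
  return match ? match[1].replace(/\s+$/, '') : null;
}

// e.g. extractDataPoint(ieDocument, 'Balance'), where ieDocument is the IE object's document
```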
I have a feeling that there will be things that require more than one step to get the information, etc.
I am just trying to think of the best way to approach it without having to hardcode 200+ expressions into the file, since the page may get updated and the expressions may need to change.
Any ideas?
IRobotSoft scraper may be the tool you are looking for. Check this forum and see if the questions there are similar to what you are doing: http://irobotsoft.org/bb/YaBB.pl?board=newcomer. It is free.
What it uses is not regular expressions but a language called HTQL, which may be more suitable for extracting data from web pages. It also supports regular expressions, but not as the main language.
It organizes all your actions well with a visual interface, so you can dynamically compose actions or tasks for changing needs.
I have a server with tons of lines of JavaScript and I need to find the lines that set new cookies.
All JS files are minified (variable names are abbreviated to one character, etc.), so searching by the name of the cookie is almost impossible.
Is there any software/debugger/browser/approach/whatever that is able to track down the lines of code that set cookies?
I've tried Chrome's built-in WebKit debugger, which allows me to set "Event Listener Breakpoints". Unfortunately, it cannot listen for a new cookie being set.
If I had to do this I would first beautify the source code to make it more readable, then find any lines which set a cookie (e.g. by searching for the regex /(document)?\.cookie\s*=\s*/), then trace the origin of the value which is assigned.
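As a complementary runtime trick (not a replacement for the search above), you can intercept writes to document.cookie so the debugger pauses on the exact minified line doing the assignment. The accessor lives on Document.prototype in current browsers, so this may need adjusting for older ones:

```javascript
(function () {
  var descriptor = Object.getOwnPropertyDescriptor(Document.prototype, 'cookie');
  Object.defineProperty(document, 'cookie', {
    configurable: true,
    get: function () {
      return descriptor.get.call(document);
    },
    set: function (value) {
      console.trace('document.cookie set to:', value); // logs the call stack
      debugger;                                        // pauses with the caller in view
      descriptor.set.call(document, value);            // still actually set the cookie
    }
  });
})();
```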
I have recently been inspired to write spam filters in JavaScript, Greasemonkey-style, for several websites I use that are prone to spam (especially in comments). When considering my options about how to go about this, I realize I have several options, each with pros/cons. My goal for this question is to expand on this list I have created, and hopefully determine the best way of client-side spam filtering with JavaScript.
As for what makes a spam filter the "best", I would say these are the criteria:
Most accurate
Least vulnerable to attacks
Fastest
Most transparent
Also, please note that I am trying to filter content that already exists on websites that aren't mine, using Greasemonkey Userscripts. In other words, I can't prevent spam; I can only filter it.
Here is my attempt, so far, to compile a list of the various methods along with their shortcomings and benefits:
Rule-based filters:
What it does: "Grades" a message by assigning a point value to different criteria (i.e. all uppercase, all non-alphanumeric, etc.) Depending on the score, the message is discarded or kept.
Benefits:
Easy to implement
Mostly transparent
Shortcomings:
Transparent: it's usually easy to reverse engineer the code to discover the rules, and thereby craft messages which won't be picked up
Hard to balance point values (false positives)
Can be slow; multiple rules have to be executed on each message, often using regular expressions
In a client-side environment, server interaction or user interaction is required to update the rules
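For illustration, a minimal sketch of the rule-based idea (the rules, weights, and threshold are invented and would need tuning):

```javascript
var rules = [
  { score: 3, test: function (msg) { return msg === msg.toUpperCase() && /[A-Z]/.test(msg); } }, // shouting
  { score: 2, test: function (msg) { return /https?:\/\//i.test(msg); } },                       // contains a link
  { score: 2, test: function (msg) { return /[^\w\s]{5,}/.test(msg); } }                         // runs of symbols
];

function gradeMessage(msg) {
  return rules.reduce(function (total, rule) {
    return total + (rule.test(msg) ? rule.score : 0);
  }, 0);
}

var SPAM_THRESHOLD = 4; // the hard-to-balance part mentioned above
console.log(gradeMessage('BUY NOW http://example.com !!!!!') >= SPAM_THRESHOLD); // true
```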
Bayesian filtering:
What it does: Analyzes word frequency (or trigram frequency) and compares it against the data it has been trained with.
Benefits:
No need to craft rules
Fast (relatively)
Tougher to reverse engineer
Shortcomings:
Requires training to be effective
Trained data must still be accessible to JavaScript, usually in the form of human-readable JSON, XML, or a flat file
Data set can get pretty large
Poorly designed filters are easy to confuse with a good helping of common words to lower the spamacity rating
Words that haven't been seen before can't be accurately classified, sometimes resulting in incorrect classification of the entire message
In a client-side environment, server interaction or user interaction is required to update the rules
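A toy sketch of the word-frequency approach (the training counts are tiny and invented; a real filter would load a much larger trained data set, e.g. as JSON):

```javascript
var training = {
  spam: { messages: 2, words: { viagra: 2, free: 1, click: 1 } },
  ham:  { messages: 2, words: { meeting: 1, thanks: 1, article: 1 } }
};

// Laplace smoothing so previously unseen words don't zero out the whole score
function wordProb(word, label) {
  var data = training[label];
  return ((data.words[word] || 0) + 1) / (data.messages + 2);
}

function looksLikeSpam(message) {
  var words = message.toLowerCase().match(/[a-z']+/g) || [];
  var logSpam = Math.log(0.5); // equal priors for spam and ham
  var logHam = Math.log(0.5);
  words.forEach(function (word) {
    logSpam += Math.log(wordProb(word, 'spam'));
    logHam += Math.log(wordProb(word, 'ham'));
  });
  return logSpam > logHam;
}

console.log(looksLikeSpam('click here for free viagra')); // true with this toy data
```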
Bayesian filtering- server-side:
What it does: Applies Bayesian filtering server side by submitting each message to a remote server for analysis.
Benefits:
All the benefits of regular Bayesian filtering
Training data is not revealed to users/reverse engineers
Shortcomings:
Heavy traffic
Still vulnerable to uncommon words
Still vulnerable to adding common words to decrease spamacity
The service itself may be abused
To train the classifier, it may be desirable to allow users to submit spam samples for training. Attackers may abuse this service
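For the server-side variant, the round trip inside a userscript could look roughly like this (the endpoint URL and the response shape are hypothetical; GM_xmlhttpRequest is used because it allows cross-origin requests):

```javascript
function classifyRemotely(message, callback) {
  GM_xmlhttpRequest({
    method: 'POST',
    url: 'https://filter.example.com/classify', // hypothetical filtering service
    headers: { 'Content-Type': 'application/json' },
    data: JSON.stringify({ text: message }),
    onload: function (response) {
      var result = JSON.parse(response.responseText); // e.g. { "isSpam": true }
      callback(result.isSpam);
    },
    onerror: function () {
      callback(false); // if the service is unreachable, err on the side of showing the message
    }
  });
}
```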
Blacklisting:
What it does: Applies a set of criteria to a message or some attribute of it. If one or more (or a specific number of) criteria match, the message is rejected. A lot like rule-based filtering, so see its description for details.
CAPTCHAs, and the like:
Not feasible for this type of application. I am trying to apply these methods to sites that already exist. Greasemonkey will be used to do this; I can't start requiring CAPTCHAs in places that they weren't before someone installed my script.
Can anyone help me fill in the blanks? Thank you!
There is no "best" way, especially for all users or all situations.
Keep it simple:
Have the GM script initially hide all comments that contain links and maybe universally bad words (F*ck, Presbyterian, etc.). ;)
Then the script contacts your server and lets the server judge each comment by X criteria (more on that, below).
Show or hide comments based on the server response. In the event of a timeout, show or hide based on a user preference setting ("What to do when the filter server is down? Show/hide comments with links").
That's it for the GM script; the rest is handled by the server.
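A rough sketch of that flow, assuming a generic .comment selector and a made-up server endpoint:

```javascript
var comments = document.querySelectorAll('.comment');

// Step 1: initially hide comments that contain links
Array.prototype.forEach.call(comments, function (node) {
  if (/https?:\/\//i.test(node.textContent)) {
    node.style.display = 'none';
  }
});

// Step 2: ask the server to judge each comment, then show or hide it accordingly
Array.prototype.forEach.call(comments, function (node) {
  GM_xmlhttpRequest({
    method: 'POST',
    url: 'https://filter.example.com/judge', // your server
    data: JSON.stringify({ text: node.textContent }),
    onload: function (response) {
      var verdict = JSON.parse(response.responseText); // e.g. { "spam": false }
      node.style.display = verdict.spam ? 'none' : '';
    },
    onerror: function () {
      node.style.display = ''; // timeout: fall back to the user's "server down" preference
    }
  });
});
```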
As for the actual server/filtering criteria...
Most important: do not assume that you can guess what a user will want filtered! This will vary wildly from person to person, or even mood to mood.
Set up the server to use a combination of bad words, bad link destinations (.ru and .cn domains, for example), and public spam-filtering services.
The most important thing is to offer users some way to choose and ideally adjust what is applied, for them.