I have the below string that I need help pulling an ID from in Presto. Presto uses the javascript regex. I've searched multiple options including:
JavaScript text between double quotes
Javascript regex to extract all characters between quotation marks following a specific word
I need to pull the GA Client ID which looks like this:
75714ae471df63202106404675dasd800097erer1849995367
Below is a snippet showing where it sits in the string.
The struggle is that the "s:38:" is not constant. The number can be anything. For example, it could be s:40: or s:1000: etc. I need it to return just the alphanumeric id.
String Snippet
"GA_ClientID__c";s:38:"75714ae471df63202106404675dasd800097erer1849995367";
Full string listed below
99524";s:9:"FirstName";s:2:"John";s:8:"LastName";s:8:"Doe";s:7:"Company";s:10:"Sample";s:5:"Email";s:20:"xxxxx#gmail.com";s:5:"Phone";s:10:"8888888888";s:7:"Country";s:13:"United States";s:5:"Title";s:8:"Creative";s:5:"State";s:2:"NC";s:13:"Last_Asset__c";s:40:"White Paper: Be a More Strategic Partner";s:16:"Last_Campaign__c";s:18:"70160000000q6TgAAI";s:16:"Referring_URL__c";s:8:"[direct]";s:19:"leadPriorityMarketo";s:2:"P2";s:18:"ProductInterest__c";s:9:"sample";s:14:"landingpageurl";s:359:"https://www.sample.com;mkt_tok=samplesamplesamplesample";s:14:"GA_ClientID__c";s:38:"75714ae471df63202106404675dasd800097erer1849995367";s:13:"Drupal_SID__c";s:36:"e1380c07-0258-47de-aaf8-82d4d8061e1a";s:4:"form";s:4:"1046";}
This works for your sample
"GA_ClientID__c";[^"]*"([^"]*)"
https://regex101.com/r/Q4Orj6/1
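To show why the pattern does not care about the length marker, here is a minimal JavaScript sketch. The function name extractClientId and the shortened input string are made up for illustration. In Presto the same pattern could presumably be passed to regexp_extract with group index 1, but that is not tested here.

```javascript
// Shortened version of the serialized string from the question.
const input =
  '"Referring_URL__c";s:8:"[direct]";s:14:"GA_ClientID__c";s:38:"75714ae471df63202106404675dasd800097erer1849995367";s:13:"Drupal_SID__c";s:36:"e1380c07-0258-47de-aaf8-82d4d8061e1a";';

// "GA_ClientID__c";   the literal key
// [^"]*               skips the length marker, whatever its number (s:38:, s:1000:, ...)
// "([^"]*)"           captures everything between the next pair of double quotes
const clientIdPattern = /"GA_ClientID__c";[^"]*"([^"]*)"/;

// Hypothetical helper: returns the captured ID, or null when the key is absent.
function extractClientId(text) {
  const match = clientIdPattern.exec(text);
  return match ? match[1] : null;
}

console.log(extractClientId(input));
```

Because the marker is skipped with [^"]* rather than matched literally, the digits after s: can change freely without affecting the capture.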
Related
I want to parse a file containing multiple lines. I will loop through the lines to extract the information I need. I am trying to write an m3u parser which uses regex; I already successfully made one using string position and substring.
I want to parse this line:
#EXTINF:-1 tvg-ID="NPO 2 HD NL" tvg-name="NL : Npo 2" tvg-logo="http://www.iptv-plus.net:25461/images/Astra23picon/1_0_19_17C0_C82_3_EB0000_0_0_0.png" group-title="Holland",NL : Npo 2
So I came up with these regular expressions:
(?<=group-title=")(.*?)(?=")
(?<=tvg-name=")(.*?)(?=")
(?<=tvg-logo=")(.*?)(?=")
(?<=tvg-ID=")(.*?)(?=")
(?<=,)(.*?)$
These will extract the tvg tags, and extract the name from the end of the line (always after the last occurring comma).
What I would like is to put all these regular expressions in one regex, so I can get an array containing all elements. I tried using |, but that is an OR, I believe. I can use them separately, but I think it might be faster to put them all in one? Also, I would like to create named groups. But when I change it to
(?<group><=group-title=")(.*?)(?=")
it takes the lookbehind as text (<=). Can I combine lookbehind with named groups, and if so, how? I would like to use the regex in JavaScript, but I could also implement a PHP script that parses the m3u file and returns JSON.
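One possible way to get everything in a single pass is sketched below. It assumes the attributes always appear in the order shown in the sample line, and that the JavaScript engine supports named groups (ES2018). The function name parseExtinf and the group names are made up for illustration. Note that (?<name>...) is a named group while (?<=...) is a lookbehind; they are separate constructs, so a combined form would be (?<=group-title=")(?<group>.*?)(?="). The sketch avoids lookbehind entirely by matching the attribute names as literal text.

```javascript
// Sample line from the question.
const line =
  '#EXTINF:-1 tvg-ID="NPO 2 HD NL" tvg-name="NL : Npo 2" tvg-logo="http://www.iptv-plus.net:25461/images/Astra23picon/1_0_19_17C0_C82_3_EB0000_0_0_0.png" group-title="Holland",NL : Npo 2';

// One named group per attribute, plus the title after the comma that
// follows group-title. Assumes this fixed attribute order.
const extinfPattern =
  /tvg-ID="(?<id>[^"]*)"\s+tvg-name="(?<name>[^"]*)"\s+tvg-logo="(?<logo>[^"]*)"\s+group-title="(?<group>[^"]*)",(?<title>.*)$/;

// Hypothetical helper: returns a plain object with the five fields, or null.
function parseExtinf(text) {
  const match = extinfPattern.exec(text);
  return match ? { ...match.groups } : null;
}

console.log(parseExtinf(line));
```

If the attribute order can vary between lines, running the separate expressions one after another is probably the simpler option.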
I was wondering if there is a way to hide a string of characters inside a string. I found control characters, which work for hiding themselves:
>var hidden = "\26"
undefined
>hidden
""
>hidden.replace("\26","yolo");
"yolo"
>"".replace("\26","yolo");
""
but what I would like is to escape a whole string of characters and have them not show up, like this:
>var hidden = "\26cantseethis\26"
undefined
>hidden
""
Is there any such method using ASCII characters?
edit:
What I am trying to do is give state to a Google Doc. I have a workflow-type Google Apps Script attached to a form that creates a doc. The doc is immediately viewable by the administrator, so I don't want to put a bunch of special strings like &UserOneAgreed in the doc, mostly because someone could go in and modify that string. I have another script that will go in and modify the related text once some user input is gathered.
You cannot do that. A control-character escape stands for a single character only, so you would need to escape each character separately to hide them.
I am using .NET:
fulltext = File.ReadAllText(@location);
to read the text content of any file at a given location.
I got this result:
fulltext="# vdk10.syx\t1.1 - 3/15/94\r# #(#)Copyright (C) 1987-1993 Verity, Inc.\r#\r# Synonym-list Database Descriptor\r#\r$control: 1\rdescriptor:\r{\r data-table: syu\r {\r worm:\tTHDSTAMP\t\tdate\r worm:\tQPARSER\t\t\ttext\r\t/_hexdata = yes\r varwidth:\tWORD\t\tsyv\r fixwidth:\tEXPLEN\t\t\t2 unsigned-integer\r varwidth:\tEXPLIST\t\tsyx\r\t/_hexdata = yes\r }\r data-table: syw\r {\r varwidth:\tSYNONYMS\tsyz\r }\r}\r\r ";
Now, I want this fulltext to be displayed in an HTML page so that special characters are rendered properly. For example, \r should be replaced by a line break tag, so that the text is properly formatted when the page is displayed.
Is there any .NET class to do this? I am looking for a universal method, since I am reading a file and it can contain any special characters. Thanks in advance for any help or direction.
You're trying to solve two problems:
1. Ensure special characters are properly encoded
2. Pretty-print your text
Solve them in this order:
1. First, encode the text, by importing the System.Web namespace and using HttpUtility (asked on StackOverflow). Use the result in step 2.
2. Pretty-printing is trickier, depending on the amount of pretty-printing that you want. Here are a few approaches, in increasing order of difficulty:
   - Put the text in a pre element. This should preserve newlines, tabs, spaces. You can still adjust the font used using CSS if you first slap a CSS class on the pre.
   - Replace all \r, \r\n and remaining \n with <br/>.
   - Study the structure of your text, parse it according to this structure, and provide specific tags in specific contexts. For example, the tab characters in your example may be indicative of a list of items. HTML provides the ol and ul elements for lists. Similarly, consecutive line breaks may indicate paragraphs, for which HTML provides the well known p element.
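The encode-then-replace order can be sketched as follows. This is written in JavaScript rather than .NET, to match the client-side code elsewhere in this thread, and the hand-written escapeHtml only stands in for what HttpUtility would do on the server. Both function names are made up for illustration.

```javascript
// Step 1: encode the characters that have special meaning in HTML.
// The ampersand must be replaced first so later entities are not re-encoded.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Step 2: turn every line break (\r\n, lone \r, lone \n) into a <br/> tag.
// \r\n is listed first so a Windows line ending yields one tag, not two.
function toHtmlLines(text) {
  return escapeHtml(text).replace(/\r\n|\r|\n/g, "<br/>");
}

console.log(toHtmlLines("# vdk10.syx\t1.1 - 3/15/94\r# <descriptor>\r"));
```

Doing the steps the other way round would encode the inserted <br/> tags themselves, which is why encoding comes first.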
Thanks, everyone, for your valuable comments. I solved my formatting problem on the client side with the following code.
document.getElementById('textView').innerText = fulltext;
Here textView is the div where I want to display my fulltext. I don't think I need to replace special characters in the fulltext string. The output is as shown in the figure.
I've got some data from DBpedia using Jena, and since Jena's output is based on XML, there are cases where XML characters are treated specially, like the following:
Guns n & amp;#39; Roses
I just want to know what kind of encoding this is.
I want to decode/encode my input using the same encoding in JavaScript and send it back to a servlet.
(Edited post: if you remove the space between & and amp you get the actual text. I couldn't find a way to show it on Stack Overflow otherwise, so I wrote it like that.)
Seems to be XML entity encoding, and a numeric character reference (decimal).
A numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the format &#nnnn; or &#xhhhh;
You can get some info here: List of XML and HTML character entity references on Wikipedia.
Your character is number 39, being the apostrophe: ', which can also be referenced with a character entity reference: &apos;.
To decode this using Javascript, you could use for example php.js, which has an html_entity_decode() function (note that it depends on get_html_translation_table()).
UPDATE: in reply to your edit: Basically that is the same, the only difference is that it was encoded twice (possibly by mistake). &amp; is the ampersand: &.
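If pulling in a library is not wanted, numeric character references alone can be decoded with a small replace callback. This is only a sketch: it handles decimal and hexadecimal numeric references and the doubly encoded ampersand from the question, not the full set of named entities. Both function names are made up for illustration.

```javascript
// Decode &#nnnn; (decimal) and &#xhhhh; (hexadecimal) references.
// String.fromCodePoint throws a RangeError for values outside the Unicode range.
function decodeNumericRefs(text) {
  return text
    .replace(/&#x([0-9a-fA-F]+);/g, (_, hex) => String.fromCodePoint(parseInt(hex, 16)))
    .replace(/&#(\d+);/g, (_, dec) => String.fromCodePoint(parseInt(dec, 10)));
}

// For input that was encoded twice: undo the outer &amp; first,
// then decode the numeric reference that is left.
function decodeDoubleEncoded(text) {
  return decodeNumericRefs(text.replace(/&amp;/g, "&"));
}

console.log(decodeNumericRefs("Guns n &#39; Roses"));
console.log(decodeDoubleEncoded("Guns n &amp;#39; Roses"));
```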
This is an SGML/HTML/XML numeric character entity reference.
In this case for an apostrophe '.
I use jquery.autocomplete, which uses a JavaScript regexp to highlight substrings in the list of suggestions that match the autocomplete key string. So if the user types "Beat" and one of the autocomplete suggestions the server returns is "The Beatles", then the plugin displays that suggestion as "The Beatles" with "Beat" highlighted.
I'm trying to think of ways to make this work with string matching that isn't sensitive to accents, diacriticals and the rest. So if the user typed "Huske" and the server suggested "Hüsker Dü", then this would be displayed as "Hüsker Dü" with "Hüske" highlighted.
The principle is the same as string comparison with specified collations such as in MySql or ICU, or with Oracle's sorts. In SphinxSearch a charset_table works for this. A collation such as utf8_general_ci would be ideal for my purposes.
The only thing I can think of is pretty brute-force. If any character in the input string is known to have one or more accented forms, replace it with a character class containing all of the forms when you create the regex. For example, for the input string Huske, the regex might be /H[uùúûü]sk[eèéêë]/.
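A minimal sketch of that brute-force idea follows. The ACCENTED_FORMS table is a small made-up sample, not a complete collation, and the function names are hypothetical. It also assumes the suggestions use precomposed characters such as ü rather than a base letter followed by a combining mark.

```javascript
// Sample table: base letter -> character class body with its accented forms.
const ACCENTED_FORMS = {
  a: "aàáâãäå",
  e: "eèéêë",
  i: "iìíîï",
  o: "oòóôõö",
  u: "uùúûü",
};

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a case-insensitive regex in which every letter with known accented
// forms is replaced by a character class containing all of them.
function accentInsensitivePattern(key) {
  const source = Array.from(key, (ch) => {
    const forms = ACCENTED_FORMS[ch.toLowerCase()];
    return forms ? `[${forms}]` : escapeRegExp(ch);
  }).join("");
  return new RegExp(source, "i");
}

const huskePattern = accentInsensitivePattern("Huske");
const huskeMatch = huskePattern.exec("Hüsker Dü");
console.log(huskePattern.source);
console.log(huskeMatch ? huskeMatch[0] : null); // the still-accented substring to highlight
```

The matched substring keeps its original accents, so the highlighting code can wrap it as it already does for exact matches.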