How to query NodeJS stream 'meta data'? - javascript

I have a program with several long pipes with several transforms.
e.g.
socket.pipe(ta).pipe(tb).pipe(tc);
...
tc.pipe(other_socket);
What is the best way of adding/reading meta data to/from the pipe?
For example: ta accumulates and breaks packets into lines. tb needs to prefix each line with data based on the originating IP address (if any).
How can tb get the remoteAddress from its input?
There seem to be some similarities with prototypal inheritance here: tb should ask ta (which lacks the property), and ta should then ask socket (which has the property).
I'm looking for a general approach to adding and reading metadata from pipes, as I have other more complex, but analogous issues.
I'm currently solving this issue by using 'Object Streams': streams of objects with meta and payload properties. Each transform does its work on payload, and most leave meta alone. This solution is ugly, especially as I've had to create a new xnet module that looks like net but produces these augmented objects rather than plain buffers or strings.
(Haskellers might recognise this solution as a Monad, where I'm lifting most of the stream transforms I use into a "meta" Monad. I'm still learning Haskell, so this observation may be incorrect.)

You can use the pipe event:
tb.on('pipe', function(ta) {
  console.log('getting data from', ta.remoteAddress);
});
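Note that the 'pipe' event only hands you the immediate upstream stream, which in the chain above is ta (a transform with no remoteAddress of its own). Here is a minimal sketch of the chain-walking idea from the question; the upstream property and both helper names are made up for illustration:
function trackSource(stream) {
  stream.on('pipe', function(src) {
    stream.upstream = src; // remember the immediate upstream stream
  });
  return stream;
}
function findRemoteAddress(stream) {
  // walk back up the chain until a stream exposes remoteAddress
  for (var s = stream; s; s = s.upstream) {
    if (s.remoteAddress) return s.remoteAddress;
  }
}
socket.pipe(trackSource(ta)).pipe(trackSource(tb)).pipe(tc);
// later, e.g. inside tb's transform logic:
var addr = findRemoteAddress(tb); // tb -> ta -> socket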

If the metadata is read-only for a particular instance of the pipeline, then why not pass it in when creating the individual transforms? Something like: socket.pipe(new Ta(address))
In Haskell terms, this sounds like a Reader monad, where the pipeline execution function takes a Reader that fulfills all the metadata requirements of the individual pipes in the pipeline.
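For illustration, here is a minimal sketch of that suggestion using Node's stream.Transform; Tb, its line-based input and the wiring are assumptions, not code from the question:
var Transform = require('stream').Transform;
var util = require('util');
function Tb(remoteAddress) {
  Transform.call(this);
  this.remoteAddress = remoteAddress; // fixed for this pipeline instance
}
util.inherits(Tb, Transform);
Tb.prototype._transform = function(line, enc, done) {
  // prefix each incoming line with the address supplied at construction
  this.push(this.remoteAddress + ': ' + line);
  done();
};
// the metadata is supplied once, when the pipeline is built
socket.pipe(ta).pipe(new Tb(socket.remoteAddress)).pipe(tc);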

Related

Why is serialisation necessary for sending data?

I've been reading up on JSON and serialization. From what I understand, JSON is a format often used for transferring data over a network (e.g. from/to a web server) or storing data to disk.
The data could be strings, numbers, objects, etc. I haven't found a clear explanation for why serialization is needed. For instance, when sending a string to a web server or saving it to disk, isn't the string already stored as a series of bits and bytes by the computer? Isn't that the most basic form of the data? So why can't these be sent/stored as they are?
Why does it need to be stringified into JSON, i.e. serialised, which turns it into a string?
To be clear, I'm asking why serialization is needed, and for a simple, clear explanation of that.
Thanks
Broadly speaking, serialization does two important, mostly independent jobs:
1. it collects all the information into a single "chunk" (stream) of data that's self-contained, and
2. it turns all the information into an agreed-on format (usually optimized for either compactness or ease of parsing).
#1 is important because a single object with many properties and sub-objects can be spread all over the memory of a running program.
For example, a JavaScript runtime could have a dedicated memory pool for string constants. An object that uses some constant as a key would then just reference into that pool from its data structure. That means the object is no longer a single self-contained block in memory: it's spread out over multiple areas. This kind of spreading-out is actually the norm: objects don't usually contain complex data directly, and depending on the language even "primitive" values such as numbers could be stored as references to another place in memory.
#2 is important mostly because the format used to quickly access data in memory might not be suitable for transfer (it might contain unnecessary redundancy, or memory pointers that make no sense when transferred to another computer, which partially ties back to reason #1).
An example of that would be a map (or dictionary): the in-memory representation will usually involve multiple buckets that hold hashed-values and some kind of collision-handling structure inside those buckets (a linked list or a tree, for example). That structure helps with efficient access to separate keys, but transferring that structure directly over the wire is pointless: it's very easy to re-build and there's no guarantee that the receiving end uses the exact same way to represent a map. So instead we just send each key and the associated value and let the receiving end deal with re-constructing any data structures it needs for efficient access.
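For instance, in JavaScript a Map is typically serialized by flattening it to its key/value pairs rather than shipping its internal hash-table layout; a minimal sketch:
var m = new Map([['a', 1], ['b', 2]]);
// send only the entries, not the buckets or collision structures
var wire = JSON.stringify(Array.from(m)); // '[["a",1],["b",2]]'
// the receiving end rebuilds whatever structure it prefers
var rebuilt = new Map(JSON.parse(wire));
console.log(rebuilt.get('b')); // 2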
The simple reason is that data can be stored differently in memory on different computers, or even by programs on the same computer written in different programming languages.
Serialization formats like JSON provide a defined way for exchanging data between computers or programs.
Not everyone knows how to parse or interpret those series of bits. Sometimes you need some general structure, some format, that can be passed around so that other people understand what it is you're trying to tell them.

Listing Canvas and WebAudio contexts methods and properties under Node.js

I am working on a tool dedicated to compression for the demoscene, initiated for js1k and targeted at prods in the 1k-4k categories.
The current difficulty I am facing is getting it to work and produce exactly the same results in both browser and Node.js environments.
One of its features requires knowing all methods and properties of 2D, GL and Audio contexts. It also needs the values of GL constants.
No method is ever invoked though, so the actual implementation is not needed.
EDIT - an example to give a better understanding of what is going on
The original uncompressed code given to the packer looks like this (after stripping lines not relevant here, such as those adding colors to n)
c=a.getContext("2d");
e=c.getImageData(0,0,150,150);
c.fillStyle=n=c.createRadialGradient(225,75,25,225,75,60);
c.fillRect(150,0,150,150);
The packer computes the best hash, in this case i[0]+i[6]. It then replaces the methods in the code and prepends a loop to perform the hashing (the output is standalone, so it contains the decompression routine). Otherwise, at runtime, the JS interpreter would have no way to understand that c.cR() is actually context.createRadialGradient(). Here is the resulting code:
for(i in c=a.getContext("2d"))c[i[0]+i[6]]=c[i];
e=c.gg(0,0,150,150);
c.fillStyle=n=c.cR(225,75,25,225,75,60);
c.fc(150,0,150,150);
In case of a collision (several methods resulting in the same hashed string), the replacement is not performed.
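A hypothetical sketch of that collision check (ctx stands in for a context instance; the i[0]+i[6] hash matches the example above):
var counts = {};
for (var name in ctx) {
  var h = name[0] + name[6]; // same hash as in the generated loop
  counts[h] = (counts[h] || 0) + 1;
}
// only rewrite calls whose hash maps to exactly one property
function isSafeToRename(name) {
  return counts[name[0] + name[6]] === 1;
}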
Inside the browser, one can simply create an instance of the appropriate context and iterate on its methods/properties. However, Node.js does not provide this possibility. I need another way to obtain that information.
The answers to similar questions (2D canvas or WebAudio) suggested using the Canvas module or the Node WebAudio API. However, these modules are not a perfect mirror of their browser counterparts, having either additional methods or only a subset thereof. In some cases this will cause the hashing algorithm to produce a different output.
Unfortunately, this rules out that solution, as the same result is needed in both environments. What other options are possible? Thanks in advance.

How would you explain Javascript Typed Arrays to someone with no programming experience outside of Javascript?

I have been messing with Canvas a lot lately, developing some ideas I have for a web-based game. As such I've recently run into Javascript Typed Arrays. I've done some reading for example at MDN and I just can't understand anything I'm finding. It seems most often, when someone is explaining Typed Arrays, they use analogies to other languages that are a little beyond my understanding.
My experience with "programming," if you can call it that (and not just front-end scripting), is pretty much limited to Javascript. I do feel as though I understand Javascript pretty well outside of this instance, however. I have deeply investigated and used the Object.prototype structure of Javascript, and more subtle factors such as variable referencing and the value of this, but when I look at any information I've found about Typed Arrays, I'm just lost.
With this frame-of-reference in mind, can you describe Typed Arrays in a simple, usable way? The most effective depicted use-case, for me, would be something to do with Canvas image data. Also, a well-commented Fiddle would be most appreciated.
In typed programming languages (to which JavaScript kinda belongs) we usually have variables of fixed declared type that can be dynamically assigned values.
With Typed Arrays it's quite the opposite.
You have a fixed chunk of data (represented by ArrayBuffer) that you do not access directly. Instead, this data is accessed through views. Views are created at run time, and they effectively declare some portion of the buffer to be of a certain type. These views are sub-classes of ArrayBufferView. A view defines a contiguous portion of the chunk of data as elements of an array of a certain type. Once the type is declared, the browser knows the length and content of each element, as well as the number of such elements. With this knowledge, browsers can access individual elements much more efficiently.
So we are dynamically assigning a type to a portion of what is actually just a buffer. We can assign multiple views to the same buffer.
From the Specs:
Multiple typed array views can refer to the same ArrayBuffer, of different types,
lengths, and offsets.
This allows for complex data structures to be built up in the ArrayBuffer.
As an example, given the following code:
// create an 8-byte ArrayBuffer
var b = new ArrayBuffer(8);
// create a view v1 referring to b, of type Int32, starting at
// the default byte index (0) and extending until the end of the buffer
var v1 = new Int32Array(b);
// create a view v2 referring to b, of type Uint8, starting at
// byte index 2 and extending until the end of the buffer
var v2 = new Uint8Array(b, 2);
// create a view v3 referring to b, of type Int16, starting at
// byte index 2 and having a length of 2
var v3 = new Int16Array(b, 2, 2);
The following buffer and view layout is created (v1 covers bytes 0-3 and 4-7 as two Int32 elements, v2 covers bytes 2 through 7 as six Uint8 elements, and v3 covers bytes 2-3 and 4-5 as two Int16 elements):
This defines an 8-byte buffer b, and three views of that buffer, v1,
v2, and v3. Each of the views refers to the same buffer -- so v1[0]
refers to bytes 0..3 as a signed 32-bit integer, v2[0] refers to byte
2 as a unsigned 8-bit integer, and v3[0] refers to bytes 2..3 as a
signed 16-bit integer. Any modification to one view is immediately
visible in the other: for example, after v2[0] = 0xff; v2[1] = 0xff;
then v3[0] == -1 (where -1 is represented as 0xffff).
So instead of declaring data structures and filling them with data, we take data and overlay it with different data types.
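Since the question asks about Canvas image data specifically: ImageData.data is itself a typed array (a Uint8ClampedArray, four bytes per pixel in RGBA order). A minimal sketch, assuming a canvas with id "c" that already has something drawn on it:
var ctx = document.getElementById('c').getContext('2d');
var img = ctx.getImageData(0, 0, 150, 150);
var px = img.data; // typed-array view over the raw pixel bytes
for (var i = 0; i < px.length; i += 4) {
  px[i]     = 255 - px[i];     // invert red
  px[i + 1] = 255 - px[i + 1]; // invert green
  px[i + 2] = 255 - px[i + 2]; // invert blue
  // px[i + 3] is alpha; leave it alone
}
ctx.putImageData(img, 0, 0);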
I spend all my time in JavaScript these days, but I'll take a stab at a quick summary, since I've used typed arrays in other languages, like Java.
The closest thing I think you'll find in the way of comparison, when it comes to typed arrays, is a performance comparison. In my head, typed arrays enable compilers to make assumptions they can't normally make. If someone is optimizing things at the low level of a JavaScript engine like V8, those assumptions become valuable. If you can say, "Data will always be of size X" (or something similar), then you can, for instance, allocate memory more efficiently, which lets you (getting more jargon-y now) reduce how many times you go to access memory only to find it's not in a CPU cache. Accessing the CPU cache is much faster than having to go to RAM, I believe. When doing things at a large scale, those time savings add up quickly.
If I were to do up a jsfiddle (no time, sorry), I'd be comparing the time it takes to perform certain operations on typed arrays vs non-typed arrays. For example, I imagine "adding 100,000 items" being a performance benchmark I'd try, to compare how the structures handle things.
What I can do is link you to: http://jsperf.com/typed-arrays-vs-arrays/7
All I did to get that was google "typed arrays javascript performance" and clicked the first item (I'm familiar with jsperf, too, so that helped me decide).
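A rough sketch of the kind of comparison such a fiddle would make (timings vary a lot between engines; a jsperf-style harness does this more rigorously):
var N = 100000;
var t0 = Date.now();
var plain = [];
for (var i = 0; i < N; i++) plain[i] = i * 0.5; // ordinary array
var t1 = Date.now();
var typed = new Float64Array(N);
for (var j = 0; j < N; j++) typed[j] = j * 0.5; // typed array
var t2 = Date.now();
console.log('plain:', t1 - t0, 'ms; typed:', t2 - t1, 'ms');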

Flash Twitter API with JSON

I have read a lot about parsing JSON with ActionScript. Originally the advice was to use this library: http://code.google.com/p/as3corelib/ but it seems Flash Player 11 has native support for it now.
My problem is that I cannot find examples or help that takes you from beginning to end of the process. Everything I have read seems to start in the middle. I have no real experience with JSON so this is a problem. I don't even know how to point ActionScript to the JSON file it needs to read.
I have a project with a tight deadline that requires me to read Twitter through JSON. I need to get the three most recent tweets, along with the user who posted them, their Twitter name and the time those tweets were posted.
The back end for this is already set up by the development team here, I believe; therefore my JSON or XML files just need to be pointed to, and then I need to display the values in the interface text boxes I have already designed and created.
Any help will be greatly appreciated...I do know that there are a lot of threads on here I just do not understand them as they all have some understanding of it to begin with.
You need to:
Load the data, whatever it is.
Parse the data from a particular format.
For this you would normally:
Use the URLLoader class to load any data. (Just go to the language reference and look at the example of how to use this class.)
Use whatever parser handles the particular format you need. http://help.adobe.com/en_US/FlashPlatform/beta/reference/actionscript/3/JSON.html is the reference for the JSON API, and it also shows usage examples. I'm not aware of these API being in the production version of the player yet, and there may still be quite a few FP 10.X players out there, so I'd keep a fallback JSON parser; for that I would recommend this library: http://www.blooddy.by/en/crypto/ over as3corelib, because it is faster. The built-in API are no different from those you would find in a browser, so if you look up JSON JavaScript entries, the usage should in general be similar in Flash.
After you parse the JSON, you will end up with a number of objects of the following types: Object, Array, Boolean, Number, String. JSON also has literals for null and undefined. Basically, you will be working with data structures native to Flash; you only need to take extra care because they are constructed dynamically, meaning you cannot make assumptions about the existence of parts of the data - you must always check availability.
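Since the built-in API mirrors the browser's, here is a hedged JavaScript-style sketch (the AS3 call is analogous); it assumes a user_timeline-style response, i.e. an array of status objects with text, created_at and a nested user object:
var tweets = JSON.parse(responseText); // responseText: the JSON you loaded
for (var i = 0; i < 3 && i < tweets.length; i++) {
  var t = tweets[i];
  // always check availability: the structure is built dynamically
  if (t && t.user) {
    console.log(t.user.name + ' (@' + t.user.screen_name + ') ' +
                t.created_at + ': ' + t.text);
  }
}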
wvxvw's answer is good, but I think it skips over a much-needed explanation of what JSON itself is. JSON is plain text - JavaScript Object Notation. When you read the text on screen it looks something like this:
http://www.json.org/example.html
There you can see JSON and XML side by side (both plain-text formats); essentially, JSON is a bunch of name/value pairs.
When you use JSON.parse("your JSON string goes here"), it does the conversion to AS3 "dynamic objects", which are just plain objects (whose properties can be assigned without previously being defined, hence dynamic). To make a long story short, take the example you see in the link above, copy and paste the JSON as a string variable in AS3, and call JSON.parse on it:
var str:String = '{"glossary": {"title": "example glossary","GlossDiv": {"title": "S","GlossList": {"GlossEntry": {"ID": "SGML","SortAs": "SGML","GlossTerm": "Standard Generalized Markup Language","Acronym": "SGML","Abbrev": "ISO 8879:1986","GlossDef": {"para": "A meta-markup language, used to create markup languages such as DocBook.","GlossSeeAlso": ["GML", "XML"]},"GlossSee": "markup"}}}}}';
var test:Object = JSON.parse(str);
Store the result in a variable and use the debugger to see what the resulting object is. As far as I know, there's really nothing else to JSON: it's simply a format for storing data. You can't use E4X on it since it's not XML-based, and because of that it's slightly more concise than XML (no closing tags), but in my opinion slightly less readable... though it is valid JavaScript. For a nice breakdown of the performance gains/losses between AMF, JSON and XML, check out this page: http://www.jamesward.com/census2/ Though you often don't have a choice about the delivery format or protocol when you're not building the service yourself, it's good to understand their performance costs.

Why is JSON important?

I've only recently heard about JSON (Javascript Object Notation).
Can anybody explain why it is considered (by some websites/blogs/etc) to be important?
We already have XML, why is JSON better (apart from being 'native to Javascript')?
Edit: Hmm, the main answer theme seems to be 'it is smaller'. However, the fact that it allows data fetching across domains, seems important to me. Or is this in practice not (yet) much used?
XML has several drawbacks:
It's heavy!
It provides a hierarchical representation of content which is not exactly the same as (but pretty similar to) the JavaScript object model.
JavaScript is available everywhere. Without any external parsers, you can process JSON directly with the JS interpreter.
Clearly JSON is not meant to replace XML completely, but for JS-based web apps its advantages can be useful.
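To illustrate that last point: in engines with native JSON support (older browsers needed a library or eval, as other answers note), parsing takes a single call:
var obj = JSON.parse('{"name": "John Doe", "tags": ["friend", "male"]}');
console.log(obj.tags[0]); // "friend" - no external parser involved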
JSON is generally much smaller than its XML equivalent. Smaller transfer means faster transfer, which results in a better user experience.
JSON is much more concise. XML:
<person>
<name>John Doe</name>
<tags>
<tag>friend</tag>
<tag>male</tag>
</tags>
</person>
JSON:
{"name": "John Doe", "tags": ["friend", "male"]}
There are fewer overlapping features, too. For example, in XML there's tension between choosing to use elements (as above) versus attributes (<person name="John Doe">).
JSON came into popular use primarily because it offers a way to circumvent the same-origin policy used in web browsers and thereby allow mashups.
Let's say you're writing a web service on domain A. You can't load XML data from domain B and parse it because the only way to do that would be XMLHttpRequest, and XMLHttpRequest was originally limited by the same-origin policy to talking to only URLs at the same domain as the containing page.
It turns out that for a variety of reasons, you are allowed to request <script> tags across origins. Clever people realized this was a good way to work around the limitation with XMLHttpRequest. Instead of the server returning XML, it can return a series of JavaScript object and array literals.
(bonus question left as an exercise to the reader: why is <script src="..."> allowed across domains without server opt-in but XHR isn't?)
Of course, returning a <script> which consists of nothing more than object literals is not useful because without assigning the values to some variable, you can't do anything with it. Thus, most services use a variant of JSON, called JSONP (http://bob.pythonmac.org/archives/2005/12/05/remote-json-jsonp/).
With the rise in popularity of mashups, people realized that JSON was a convenient data interchange format in general, especially when JavaScript is one end of the channel. For example, JSON is used extensively in Chromium, even in cases where C++ is on both sides. It's just a nice lightweight way to represent simple data, that good parsers exist for in many languages.
Amusingly, using <script> tags to do mashups is incredibly insecure because it is essentially XSS'ing yourself on purpose. So native JSON (http://ejohn.org/blog/native-json-support-is-required/) had to be introduced, which obviates the original benefits of the format. But by that time, it was already super popular :)
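A minimal sketch of the JSONP mechanism described above (the URL and the callback parameter name are placeholders; real services document their own):
function handleData(data) { // the "P" (padding): a global callback
  console.log('got', data.length, 'items from another origin');
}
var s = document.createElement('script');
// the server wraps its JSON in a call: handleData([...]);
s.src = 'http://api.example.com/items?callback=handleData';
document.getElementsByTagName('head')[0].appendChild(s);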
If you are working in JavaScript, it is much easier to use JSON. This is because JSON can be directly evaluated into a JavaScript object, which is much easier to work with than the DOM.
Borrowing and slightly altering the XML and JSON from above
XML:
<person>
<name>John Doe</name>
<tag>friend</tag>
<tag>male</tag>
</person>
JSON:
{ "person": {"name": "John Doe", "tag": ["friend", "male"]} }
If you wanted to get the second tag object with XML, you'd need to use the powerful but verbose DOM APIs:
var tag2=xmlObj.getElementsByTagName("person")[0].getElementsByTagName("tag")[1];
Whereas with a Javascript object that came in via JSON, you could simply use:
var tag2=jsonObj.person.tag[1];
Of course, jQuery makes the DOM example much simpler:
var tag2=$("person tag",xmlObj).get(1);
However, JSON just "fits" in a Javascript world. If you work with it for a while, you will find that you have much less mental overhead than involving XML based data.
All the above examples ignore the possibility that nodes are missing or duplicated, or that a node has just one or no children. However, to illustrate the native-ness of JSON, to handle that with the jsonObj you'd just have to:
var tag2=(jsonObj.person && jsonObj.person.tag && jsonObj.person.tag.sort && jsonObj.person.tag.length==2 ? jsonObj.person.tag[1] : null);
(Some people might not like a ternary that long, but it works.) XML would be (in my opinion) nastier. I don't think you'd want to take the ternary approach there, because you'd keep calling the DOM methods, which may have to redo the work depending on the implementation:
var tag2=null;
var persons=xmlObj.getElementsByTagName("person");
if(persons.length==1) {
  var tags=persons[0].getElementsByTagName("tag");
  if(tags.length==2) { tag2=tags[1]; }
}
jQuery (untested):
var tag2=$("person:only-child tag:nth-child(1)",xmlObj).get(0);
These web pages may help:
JSON - The Fat Free alternative to xml
Why JSON is Important to You!
It depends on what you are going to do. There are a lot of answers here that prefer JSON over XML. If you take a deeper look there isn't a big difference.
If you have a tree of objects, you get only a tree of plain JavaScript objects back. This is where the tension with OOP-style access turns back on you. Assume you have objects of types A, B and C arranged in a tree. You can easily enable them to be serialized to JSON. But if you read them back in, you only get a tree of plain JavaScript objects. To reconstruct your A, B and C you have to stuff the values manually into manually created objects, or do some hacks. Sounds like parsing XML and creating objects? Well, yes :)
These days only the newest browsers come with native support for JSON. To support more browsers you have two options: a) load a JSON parser written in JavaScript that helps you parse - so how fat-free does that sound? The other option, which I often see, is eval: you can just call eval() on a JSON string to get the objects back. But that introduces a whole new set of security problems. JSON is specified so that it can't contain functions, but if you don't check the parsed objects for functions, someone can easily send you code that gets executed.
So it might depend on what you like more: JSON or XML. The biggest difference is probably the way of accessing things, be it script tags or XMLHttpRequest; I would decide what to use based on that. In my opinion, if there were proper support for XPath in browsers, I would often decide to use XML. But fashion points towards JSON and loading additional JSON parsers written in JavaScript.
If you can't decide, and you know you need something really powerful, you might have to take a look at YAML. Reading about YAML is very interesting and gives more insight into the topic. But it really depends on what you are trying to do.
JSON is a way to serialize data in Javascript objects. The syntax is taken from the language, so it should be familiar to the developer dealing with Javascript, and -- being the stringification of an object -- it's a more-natural serialization method for interaction within the browser than a full-fledged XML derivative (with all the arbitrary design decisions that implies).
It's light and intuitive.
JSON's a text-based object serialization format that's more lightweight than XML and that directly integrates with JavaScript's object model. That's most of its advantages right there.
Its disadvantages (compared to XML) are, roughly: fewer available tools (forget about standard validation and/or transformation, to say nothing of syntax highlighting or well-formedness checking in most editors); less likely to be human-readable (there are huge variations in the readability of both JSON and XML, so that's a necessarily fuzzy statement); and its tight integration with JavaScript makes for not-so-tight integration with other environments.
It's not that it is better, but that it can tie many things together to allow seamless data transfer without manual parsing!
For example javascript -> C# web service -> javascript
