Strange issue with AES CTR mode with Python and Javascript - javascript

I'm trying to decrypt a ciphertext created by CryptoJS using PyCrypto. I am using AES-256-CTR, with a 12-byte random prefix and 4-byte counter. So far, I've had limited success. Please read this previous post where I made a first attempt.
This works in Javascript:
Install the CryptoCat extension
Run CryptoCat
Fire up the developer console (F12 in Chrome/Firefox)
Run these lines of code
key = 'b1df40bc2e4a1d4e31c50574735e1c909aa3c8fda58eca09bf2681ce4d117e11';
msg = 'LwFUZbKzuarvPR6pmXM2AiYVD2iL0/Ww2gs/9OpcMy+MWasvvzA2UEmRM8dq4loB\ndfPaYOe65JqGQMWoLOTWo1TreBd9vmPUZt72nFs=';
iv = 'gpG388l8rT02vBH4';
opts = {mode: CryptoJS.mode.CTR, iv: CryptoJS.enc.Base64.parse(iv), padding: CryptoJS.pad.NoPadding};
CryptoJS.AES.decrypt(msg, CryptoJS.enc.Hex.parse(key), opts).toString(CryptoJS.enc.Utf8);
Expected output: "Hello, world!ImiAq7aVLlmZDM9RfhDQgPp0CrAyZE0lyzJ6HDq4VoUmIiKUg7i2xpTSPs28USU8".
Here is a script I wrote in Python that partially(!) decrypts the ciphertext:
import struct
import base64
import Crypto.Cipher.AES
import Crypto.Util.Counter
def bytestring_to_int(s):
    r = 0
    for b in s:
        r = r * 256 + ord(b)
    return r
class IVCounter(object):
    def __init__(self, prefix="", start_val=0):
        self.prefix = prefix
        self.first = True
        self.current_val = start_val
    def __call__(self):
        if self.first:
            self.first = False
        else:
            self.current_val += 1
        postfix = struct.pack(">I", self.current_val)
        n = base64.b64decode(self.prefix) + postfix
        return n
def decrypt_msg(key, msg, iv):
    k = base64.b16decode(key.upper())
    ctr = IVCounter(prefix=iv)
    #ctr = Crypto.Util.Counter.new(32, prefix=base64.b64decode(iv), initial_value=0, little_endian=False)
    aes = Crypto.Cipher.AES.new(k, mode=Crypto.Cipher.AES.MODE_CTR, counter=ctr)
    plaintext = aes.decrypt(base64.b64decode(msg))
    return plaintext
if __name__ == "__main__":
    #original:
    key = 'b1df40bc2e4a1d4e31c50574735e1c909aa3c8fda58eca09bf2681ce4d117e11'
    msg = 'LwFUZbKzuarvPR6pmXM2AiYVD2iL0/Ww2gs/9OpcMy+MWasvvzA2UEmRM8dq4loB\ndfPaYOe65JqGQMWoLOTWo1TreBd9vmPUZt72nFs='
    iv = 'gpG388l8rT02vBH4'
    decrypted = decrypt_msg(key, msg, iv)
    print "Decrypted message:", repr(decrypt_msg(key, msg, iv))
    print decrypted
The output is:
'Hello, world!Imi\xfb+\xf47\x04\xa0\xb1\xa1\xea\xc0I\x03\xec\xc7\x13d\xcf\xe25>l\xdc\xbd\x9f\xa2\x98\x9f$\x13a\xbb\xcb\x13\xd2#\xc9T\xf4|\xd8\xcbaO)\x94\x9aq<\xa7\x7f\x14\x11\xb5\xb0\xb6\xb5GQ\x92'
The problem is, only the first 16 bytes of the output match the first 16 bytes of the expected output!
Hello, world!ImiAq7aVLlmZDM9RfhDQgPp0CrAyZE0lyzJ6HDq4VoUmIiKUg7i2xpTSPs28USU8
When I modify the script to do this:
def __init__(self, prefix="", start_val=1):
and
self.current_val += 0 #do not increment
which makes the counter output the same value (\x00\x00\x00\x01) every time it is called, the plaintext is:
\xf2?\xaf:=\xc0\xfd\xbb\xdf\xf6h^\x9f\xe8\x16I\xfb+\xf47\x04\xa0\xb1\xa1\xea\xc0I\x03\xec\xc7\x13dQgPp0CrAyZE0lyzJ\xa8\xcd!?h\xc9\xa0\x8b\xb6\x8b\xb3_*\x7f\xf6\xe8\x89\xd5\x83H\xf2\xcd'\xc5V\x15\x80k]
where the third block of 16 bytes (QgPp0CrAyZE0lyzJ) matches the expected output.
When I set the counter to emit \x00\x00\x00\x02 and \x00\x00\x00\x03, I get similar results- subsequent 16-byte blocks are revealed. The only exception is that with 0s, the first 32 bytes are revealed.
All 0s: reveals first 32 bytes.
'Hello, world!ImiAq7aVLlmZDM9RfhD\xeb=\x93&b\xaf\xaf\x8d\xc9\xdeA\n\xd2\xd8\x01j\x12\x97\xe2i:%}G\x06\x0f\xb7e\x94\xde\x8d\xc83\x8f#\x1e\xa0!\xfa\t\xe6\x91\x84Q\xe3'
All 1s: reveals next 16 bytes.
"\xf2?\xaf:=\xc0\xfd\xbb\xdf\xf6h^\x9f\xe8\x16I\xfb+\xf47\x04\xa0\xb1\xa1\xea\xc0I\x03\xec\xc7\x13dQgPp0CrAyZE0lyzJ\xa8\xcd!?h\xc9\xa0\x8b\xb6\x8b\xb3_*\x7f\xf6\xe8\x89\xd5\x83H\xf2\xcd'\xc5V\x15\x80k]"
All 2s: reveals next 16 bytes.
'l\xba\xcata_2e\x044\xb2J\xe0\xf0\xd7\xc8e\xae\x91yX?~\x7f1\x02\x93\x17\x93\xdf\xd2\xe5\xcf\xe25>l\xdc\xbd\x9f\xa2\x98\x9f$\x13a\xbb\xcb6HDq4VoUmIiKUg7i\x17P\xe6\x06\xaeR\xe8\x1b\x8d\xd7Z\x7f"'
All 3s: reveals next 13 bytes.
'I\x92\\&\x9c]\xa9L\xb1\xb6\xbb`\xfa\xbet;#\x86\x07+\xa5=\xe5V\x84\x80\x9a=\x89\x91q\x16\xea\xca\xa3l\x91\xde&\xb6\x17\x1a\x96\x0e\t/\x188\x13`\xd2#\xc9T\xf4|\xd8\xcb`aO)\x94\x9a2xpTSPs28USU8'
If you concat the "correct" blocks, you'll get the expected plaintext:
Hello, world!ImiAq7aVLlmZDM9RfhDQgPp0CrAyZE0lyzJ6HDq4VoUmIiKUg7i2xpTSPs28USU8
This is really strange. I am definitely doing something wrong on the Python end as things can be decrypted, but not all in one go. If anyone can help, I would be really grateful. Thank you.

There are a couple of problems here. First, the message is not a multiple of the block size, and you're not using padding. And second, and most crucial to this issue, the IV is also not the correct size. It should be 16 bytes, but you only have 12. Probably both implementations should fail with an exception, and in the next major revision of CryptoJS, this will be the case.
Here's what happens due to this mistake: When the counter increments for the first time, it tries to increment an undefined value, because the last 32-bit word of the IV is missing. undefined + 1 is NaN, and NaN | 0 is 0. That's how you end up getting 0 twice.
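To make the resulting counter sequence concrete, here is a minimal Python 3 sketch of the counter blocks that the behaviour described above would produce for a 12-byte prefix. The function name short_iv_counter_blocks is made up for this illustration; it only generates the blocks and does not call any cipher library.

```python
import struct
from itertools import islice

def short_iv_counter_blocks(prefix: bytes):
    """Yield 16-byte counter blocks with suffixes 0, 0, 1, 2, ... (hypothetical helper)."""
    if len(prefix) != 12:
        raise ValueError("this sketch assumes a 12-byte prefix")
    # Block 0: the missing last word is treated as 0.
    yield prefix + struct.pack(">I", 0)
    # The first "increment" also lands on 0, then counting proceeds normally.
    value = 0
    while True:
        yield prefix + struct.pack(">I", value)
        value += 1

prefix = bytes(range(12))  # placeholder prefix, not the IV from the question
blocks = list(islice(short_iv_counter_blocks(prefix), 5))
suffixes = [struct.unpack(">I", block[12:])[0] for block in blocks]
print(suffixes)  # [0, 0, 1, 2, 3]
```

This matches what the question observed: a counter stuck at 0 reveals the first 32 bytes, and each later block needs the next value.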

When using the mode CryptoJS.mode.CTR (where CTR stands for counter), the initialization vector together with a counter is encrypted and the result is then XORed with the data. This is done for each block of data you encrypt.
You explain that different parts of the message are decrypted correctly when you apply different values to start_val, so I suspect that the counter is simply not being increased correctly with each decrypted block.
Take a look at Block Cipher Mode: CTR on Wikipedia.
Caution: Please note that when using CTR mode, the combination of the initialization vector + the counter should never be repeated.
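As a rough illustration of why a repeated counter block matters, the sketch below builds a toy CTR construction in Python 3. Big assumption: keystream_block uses truncated SHA-256 as a stand-in for AES so the example runs with the standard library only. It is not a real cipher, and all names here are invented for the example.

```python
import hashlib
import struct

BLOCK = 16

def keystream_block(key: bytes, counter_block: bytes) -> bytes:
    # Stand-in for AES-encrypting the counter block. NOT a real cipher.
    return hashlib.sha256(key + counter_block).digest()[:BLOCK]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def ctr_apply(key: bytes, prefix: bytes, counters, data: bytes) -> bytes:
    """XOR each 16-byte chunk of data with the keystream for its counter value."""
    out = bytearray()
    for index, value in enumerate(counters):
        chunk = data[index * BLOCK:(index + 1) * BLOCK]
        pad = keystream_block(key, prefix + struct.pack(">I", value))
        out += xor_bytes(chunk, pad)
    return bytes(out)

key = b"k" * 32
prefix = b"p" * 12
plaintext = b"first block 0123" + b"second block 456"

# Counter value 0 is used for both blocks, so both share one keystream block.
ciphertext = ctr_apply(key, prefix, [0, 0], plaintext)
leak = xor_bytes(ciphertext[:BLOCK], ciphertext[BLOCK:])

print(leak == xor_bytes(plaintext[:BLOCK], plaintext[BLOCK:]))   # True
print(ctr_apply(key, prefix, [0, 0], ciphertext) == plaintext)   # True
```

With the same keystream used twice, XORing the two ciphertext blocks cancels the keystream and leaves the XOR of the two plaintext blocks, without any knowledge of the key.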

Fixed. I simply made the counter start with 0 twice. Does anyone know if this is a vulnerability?
import struct
import base64
import Crypto.Cipher.AES
import Crypto.Util.Counter
import pdb
def bytestring_to_int(s):
    r = 0
    for b in s:
        r = r * 256 + ord(b)
    return r
class IVCounter(object):
    def __init__(self, prefix="", start_val=0):
        self.prefix = prefix
        self.zeroth = True
        self.first = False
        self.current_val = start_val
    def __call__(self):
        if self.zeroth:
            self.zeroth = False
            self.first = True
        elif self.first:
            self.first = False
        else:
            self.current_val += 1
        postfix = struct.pack(">I", self.current_val)
        n = base64.b64decode(self.prefix) + postfix
        return n
def decrypt_msg(key, msg, iv):
    k = base64.b16decode(key.upper())
    ctr = IVCounter(prefix=iv)
    #ctr = Crypto.Util.Counter.new(32, prefix=base64.b64decode(iv), initial_value=0, little_endian=False)
    aes = Crypto.Cipher.AES.new(k, mode=Crypto.Cipher.AES.MODE_CTR, counter=ctr)
    plaintext = aes.decrypt(base64.b64decode(msg))
    return plaintext
if __name__ == "__main__":
    #original:
    key = 'b1df40bc2e4a1d4e31c50574735e1c909aa3c8fda58eca09bf2681ce4d117e11'
    msg = 'LwFUZbKzuarvPR6pmXM2AiYVD2iL0/Ww2gs/9OpcMy+MWasvvzA2UEmRM8dq4loB\ndfPaYOe65JqGQMWoLOTWo1TreBd9vmPUZt72nFs='
    iv = 'gpG388l8rT02vBH4'
    decrypted = decrypt_msg(key, msg, iv)
    print "Decrypted message:", repr(decrypt_msg(key, msg, iv))
    print decrypted

Related

Transform Hash in javascript (Buffer) to python SHA256

I have a problem understanding this specific code and converting it from Javascript to Python. The problem lies in the Buffer method used by Javascript, which creates a different hash output than in Python. The main goal is to get the merkleRoot of the transactions ["a","b"].
Javascript: The hashes of "a" and "b" individually are the same as with a Python SHA256 implementation. However, Buffer.concat([hashA, hashB]) apparently makes the difference, and I cannot figure out how to implement it in Python. In Python I get a merkleRoot of "ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb3e23e8160039594a33894f6564e1b1348bbd7a0088d42c4acb73eeaed59c009d", which is not correct. I posted the correct merkleRoot below.
const crypto = require("crypto");
// digest() without an encoding returns a Buffer of raw bytes
const sha256 = (tx) => crypto.createHash("sha256").update(tx).digest();
const hashPair = (hashA, hashB, hashFunction = sha256) =>
    hashFunction(Buffer.concat([hashA, hashB]));
const a = sha256("a");
const b = sha256("b");
hashPair(a, b).toString("hex");
e5a01fee14e0ed5c48714f22180f25ad8365b53f9779f79dc4a3d7e93963f94a
├─ ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb
└─ 3e23e8160039594a33894f6564e1b1348bbd7a0088d42c4acb73eeaed59c009d
I have tried some approaches like with base64 and encodings, however due to my limitation in cryptography knowledge I can't seem to figure out the right approach. My approach in python was:
Get SHA256 of the string "a"
Get SHA256 of the string "b"
Get SHA256 of the concatenated hashes of "a"+"b":
ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb3e23e8160039594a33894f6564e1b1348bbd7a0088d42c4acb73eeaed59c009d
Here is the Python Implementation from: https://www.geeksforgeeks.org/introduction-to-merkle-tree/
Python:
# Python code for implementing Merkle Tree
from typing import List
import hashlib
class Node:
    def __init__(self, left, right, value: str, content, is_copied=False) -> None:
        self.left: Node = left
        self.right: Node = right
        self.value = value
        self.content = content
        self.is_copied = is_copied
    @staticmethod
    def hash(val: str) -> str:
        return hashlib.sha256(val.encode('utf-8')).hexdigest()
    def __str__(self):
        return (str(self.value))
    def copy(self):
        """
        class copy function
        """
        return Node(self.left, self.right, self.value, self.content, True)
class MerkleTree:
    def __init__(self, values: List[str]) -> None:
        self.__buildTree(values)
    def __buildTree(self, values: List[str]) -> None:
        leaves: List[Node] = [Node(None, None, Node.hash(e), e) for e in values]
        if len(leaves) % 2 == 1:
            leaves.append(leaves[-1].copy()) # duplicate last elem if odd number of elements
        self.root: Node = self.__buildTreeRec(leaves)
    def __buildTreeRec(self, nodes: List[Node]) -> Node:
        if len(nodes) % 2 == 1:
            nodes.append(nodes[-1].copy()) # duplicate last elem if odd number of elements
        half: int = len(nodes) // 2
        if len(nodes) == 2:
            return Node(nodes[0], nodes[1], Node.hash(nodes[0].value + nodes[1].value), nodes[0].content+"+"+nodes[1].content)
        left: Node = self.__buildTreeRec(nodes[:half])
        right: Node = self.__buildTreeRec(nodes[half:])
        value: str = Node.hash(left.value + right.value)
        content: str = f'{left.content}+{right.content}'
        return Node(left, right, value, content)
    def printTree(self) -> None:
        self.__printTreeRec(self.root)
    def __printTreeRec(self, node: Node) -> None:
        if node != None:
            if node.left != None:
                print("Left: "+str(node.left))
                print("Right: "+str(node.right))
            else:
                print("Input")
            if node.is_copied:
                print('(Padding)')
            print("Value: "+str(node.value))
            print("Content: "+str(node.content))
            print("")
            self.__printTreeRec(node.left)
            self.__printTreeRec(node.right)
    def getRootHash(self) -> str:
        return self.root.value
def mixmerkletree() -> None:
    elems = ["a", "b"]
    #as there are odd number of inputs, the last input is repeated
    print("Inputs: ")
    print(*elems, sep=" | ")
    print("")
    mtree = MerkleTree(elems)
    print("Root Hash: "+mtree.getRootHash()+"\n")
    mtree.printTree()
mixmerkletree()
#This code was contributed by Pranay Arora (TSEC-2023).
Python Output:
Inputs:
a | b
Root Hash: 62af5c3cb8da3e4f25061e829ebeea5c7513c54949115b1acc225930a90154da
Left: ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb
Right: 3e23e8160039594a33894f6564e1b1348bbd7a0088d42c4acb73eeaed59c009d
Value: 62af5c3cb8da3e4f25061e829ebeea5c7513c54949115b1acc225930a90154da
Content: a+b
So my main question is, how can I correctly implement the Buffer method from javascript into Python to get the same hash of when combining the hashes of "a" and "b". The correct merkleRoot as shown above should be: e5a01fee14e0ed5c48714f22180f25ad8365b53f9779f79dc4a3d7e93963f94a
SOLVED, thanks to the great explanation of Michael Butscher above.
The JavaScript "Buffer" objects translate to Python "bytes" (or sometimes "bytearray") objects. Ask the hash object for its "digest" instead of the hexadecimal representation "hexdigest". The digest is a "bytes" object you can concatenate with another with a simple plus sign. A "bytes" object can be fed into a hash function (as the code does already) and has a "hex" method to return a hexadecimal string representation for printing. –
Michael Butscher
Here is a simplified python solution to get the same MerkleRoot as with the Javascript Buffer method:
import hashlib
def hashab(string):
    x = string.encode()
    return hashlib.sha256(x).digest()
a = hashab("a")
b = hashab("b")
ab = a+b
print(ab)
print(hashlib.sha256(ab).hexdigest())
Output:
b'\xca\x97\x81\x12\xca\x1b\xbd\xca\xfa\xc21\xb3\x9a#\xdcM\xa7\x86\xef\xf8\x14|Nr\xb9\x80w\x85\xaf\xeeH\xbb>#\xe8\x16\x009YJ3\x89Oed\xe1\xb14\x8b\xbdz\x00\x88\xd4,J\xcbs\xee\xae\xd5\x9c\x00\x9d'
e5a01fee14e0ed5c48714f22180f25ad8365b53f9779f79dc4a3d7e93963f94a
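Building on the two-leaf solution above, here is a small sketch of how the same bytes-based hashing might extend to a list of transactions. The name merkle_root is hypothetical, and duplicating the last node on odd-sized levels is an assumption borrowed from the padding idea in the GeeksforGeeks code, not something the JavaScript snippet specifies.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    # Raw 32-byte digest, the counterpart of a Node.js Buffer
    return hashlib.sha256(data).digest()

def merkle_root(items):
    """Return the hex Merkle root of a list of strings (hypothetical helper)."""
    if not items:
        raise ValueError("need at least one item")
    level = [sha256(item.encode("utf-8")) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad odd levels by repeating the last hash
            level.append(level[-1])
        # Concatenate raw bytes (not hex strings) before hashing each pair
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()

print(merkle_root(["a", "b"]))
# e5a01fee14e0ed5c48714f22180f25ad8365b53f9779f79dc4a3d7e93963f94a
```

The only difference from the hexdigest-based version is that the pair is joined as 64 raw bytes rather than as a 128-character hex string, which is what Buffer.concat does on the JavaScript side.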

Can I encrypt a file multiple times without an exponential file increase?

The below results in an exponential increase in size for encrypted:
let original = 'something'
let passphrase = 'whatever'
let times = 100
let i = 0
let encrypted = CryptoJS.AES.encrypt(original, passphrase).toString()
while (i < times) {
encrypted = CryptoJS.AES.encrypt(encrypted, passphrase).toString()
i++
}
Is there some other CryptoJS algorithm/method/approach I can use that will not result in an exponential size increase?
Or is this not possible?
NOTE: If I don't use toString() it breaks when I try to re-encrypt what has already been encrypted. I get an UnhandledPromiseRejectionWarning: RangeError: Invalid array length.
Running your code timed out for me. The encrypted string apparently gets very long because it is base64 encoded on every pass.
We can reduce how much it grows by encrypting the WordArray instead of the base64-encoded version of the WordArray:
let original = 'something'
let passphrase = 'whatever'
let times = 100
let i = 0
let encrypted = CryptoJS.AES.encrypt(original, passphrase).toString()
encrypted = CryptoJS.enc.Base64.parse(encrypted)
while (i < times) {
encrypted = CryptoJS.AES.encrypt(encrypted, passphrase).toString()
i++
encrypted = CryptoJS.enc.Base64.parse(encrypted)
}
http://jsfiddle.net/dwvxua96/
This runs fast and creates a string that grows by only a few bytes each iteration. You can probably reduce that more by setting padding options, or passing in a key/iv pair which may prevent the addition of a salt parameter.
The decryption would look like:
i = 0
while (i <= times) {
encrypted = CryptoJS.AES.decrypt(encrypted, passphrase)
encrypted = CryptoJS.enc.Base64.stringify(encrypted);
i++
}
encrypted = CryptoJS.enc.Base64.parse(encrypted);
encrypted = CryptoJS.enc.Utf8.stringify(encrypted)
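The difference in growth between the two approaches can be estimated with some arithmetic. The sketch below is in Python and rests on assumptions that are not stated in the thread: that passphrase-based CryptoJS.AES.encrypt emits an OpenSSL-style 16-byte header ("Salted__" plus an 8-byte salt) and pads with PKCS#7 to 16-byte blocks. The function names are made up for the illustration.

```python
def encrypted_raw_len(n: int) -> int:
    # Assumed layout: 16-byte "Salted__" + salt header, then PKCS#7-padded data
    # (PKCS#7 always adds between 1 and 16 bytes).
    return 16 + (n // 16 + 1) * 16

def b64_len(n: int) -> int:
    # Base64 turns every 3 bytes (rounded up) into 4 characters
    return 4 * ((n + 2) // 3)

def size_reencrypting_base64(n: int, rounds: int) -> int:
    """Each round encrypts the base64 text of the previous round."""
    for _ in range(rounds):
        n = b64_len(encrypted_raw_len(n))
    return n

def size_reencrypting_raw(n: int, rounds: int) -> int:
    """Each round encrypts the raw bytes; base64 is applied once at the end."""
    for _ in range(rounds):
        n = encrypted_raw_len(n)
    return b64_len(n)

for rounds in (1, 2, 10, 30):
    print(rounds, size_reencrypting_base64(9, rounds), size_reencrypting_raw(9, rounds))
```

Under these assumptions the base64 route multiplies the size by roughly 4/3 per round, which is the exponential growth from the question, while the raw route adds a fixed number of bytes per round.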

Extracting binary data from string with sign bit set in javascript

I have a string that looks like this "��is some test text"
The first 2 bytes are the length of the rest of the string (this is legacy network traffic that cannot be changed).
It decodes ok when the sign bit isn't set in the binary data but if it is I get rubbish.
Here is a little test program:
var decode = '';
//decode = new Buffer('ffff', 'hex');
//decode = new Buffer('85ff', 'hex');
decode = new Buffer('364e', 'hex');
//decode = new Buffer('7f5a', 'hex');
//decode = new Buffer('00ff', 'hex');
decode += '<this>is some test text</this>';
console.log(decode);
var testbuf = decode.slice(0,2);
console.log('testbuf =' + testbuf);
var myLength1 = decode.slice(0,1).charCodeAt('hex').toString(16);
console.log('ml1 ' + myLength1.toString());
var myLength2 = decode.slice(1,2).charCodeAt('hex').toString(16);
console.log('ml2 ' + myLength2.toString());
var myLength = myLength1 + myLength2;
console.log('mylength =' + myLength1 + myLength2);
var foo2 = parseInt(myLength, 16);
console.log('ml3 ' + foo2.toString());
Using the code above the output looks like this;
node foo.js
6N<this>is some test text</this>
testbuf =6N
ml1 36
ml2 4e
mylength =364e
ml3 13902
The answer is correct when the sign bit isn't set, but if the data contains a value with the high bit set, I end up with fffd for each character with the high order bit set (the replacement character).
This is output with high order bit data (using the 0x85ff data line above):
node foo.js
��<this>is some test text</this>
testbuf =��
ml1 fffd
ml2 fffd
mylength =fffdfffd
ml3 4294836221
I know it's because the charCodeAt() function desires to return 'fffd' as the replacement character for what it sees as non ascii, the question is, what is the alternative for extracting binary data from a string?
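One general way around this is to never turn the length bytes into text in the first place: the replacement characters appear once the raw bytes are decoded as a string. The Python sketch below shows the idea of reading the prefix straight from the bytes. It assumes a big-endian unsigned 16-bit length, which matches the byte order the test program uses when it joins the two hex values; the helper name is invented. In Node, Buffer's readUInt16BE method plays the same role.

```python
import struct

def split_length_prefixed(packet: bytes):
    """Return (length, body) from a packet with a 2-byte big-endian length prefix."""
    if len(packet) < 2:
        raise ValueError("packet too short for a length prefix")
    # ">H" = big-endian unsigned 16-bit, so values above 0x7fff stay positive
    (length,) = struct.unpack(">H", packet[:2])
    return length, packet[2:]

for prefix in ("364e", "85ff", "ffff"):
    packet = bytes.fromhex(prefix) + b"<this>is some test text</this>"
    length, body = split_length_prefixed(packet)
    print(prefix, length)
# 364e 13902
# 85ff 34303
# ffff 65535
```

The body can still be decoded to text afterwards; only the binary header has to be read before any string conversion happens.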
Thanks

ajax returns empty string instead of json [python cgi]

Basically, I have a CGI script that prints out valid JSON. I have checked it, and I have similar functions that work the same way, but this one doesn't for some reason and I can't find the cause.
Javascript:
function updateChat(){
    $.ajax({
        type: "get",
        url: "cgi-bin/main.py",
        data: {'ajax':'1', 'chat':'1'},
        datatype:"html",
        async: false,
        success: function(response) {
            alert(response); //Returns an empty string
        },
        error:function(xhr,err)
        {
            alert("Error connecting to server, please contact system administator.");
        }
    });
}
Here is the JSON that python prints out:
[
"jon: Hi.",
"bob: Hello."
]
I used json.dumps to create the JSON; it worked in previous functions that have pretty much the same JSON layout, only with different content.
There is a whole bunch more server code; I tried to copy out the relevant parts. Basically I'm just trying to filter an ugly chat log for learning purposes. I filter it with regex and then create JSON out of it.
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
print "Content-type: text/html\n\n"
print
import cgi, sys, cgitb, datetime, re, time, random, json
cgitb.enable()
formdata = cgi.FieldStorage()
def tail( f, window=20 ):
    BUFSIZ = 1024
    f.seek(0, 2)
    bytes = f.tell()
    size = window
    block = -1
    data = []
    while size > 0 and bytes > 0:
        if (bytes - BUFSIZ > 0):
            # Seek back one whole BUFSIZ
            f.seek(block*BUFSIZ, 2)
            # read BUFFER
            data.append(f.read(BUFSIZ))
        else:
            # file too small, start from begining
            f.seek(0,0)
            # only read what was not read
            data.append(f.read(bytes))
        linesFound = data[-1].count('\n')
        size -= linesFound
        bytes -= BUFSIZ
        block -= 1
    return '\n'.join(''.join(data).splitlines()[-window:])
def updateChatBox():
    try:
        f = open('test.txt', 'r')
        lines = tail(f, window = 20)
        chat_array = lines.split("\n")
        f.close()
    except:
        print "Failed to access data"
        sys.exit(4)
    i = 0
    while i < len(chat_array):
        #remove timer
        time = re.search("(\[).*(\])", chat_array[i])
        result_time = time.group()
        chat_array[i] = chat_array[i].replace(result_time, "")
        #Removes braces around user
        user = re.search("(\\().*?(_)", chat_array[i])
        result_user = user.group()
        chat_array[i] = chat_array[i].replace("(", "")
        chat_array[i] = chat_array[i].replace(")", "")
        #Removes underscore and message end marker
        message = re.search("(_).*?(\|)", chat_array[i])
        result_message = message.group()
        chat_array[i] = chat_array[i].replace("_", ":")
        chat_array[i] = chat_array[i].replace("|", "")
        data += chat_array[i] + "\n"
        i = i + 1
    data_array = data.split("\n")
    json_string = json.dumps(data_array)
    print json_string
if formdata.has_key("ajax"):
    ajax = formdata["ajax"].value
    if ajax == "1": #ajax happens
        if formdata.has_key("chat"):
            chat = formdata["chat"].value
            if chat == 1:
                updateChatBox()
            else:
                print "ERROR"
        elif formdata.has_key("get_all_stats"):
            get_all_stats = formdata["get_all_stats"].value
            if get_all_stats == "1":
                getTopScores()
            else:
                print "ERROR"
Here is also a function that works perfectly and is in the same python file
def getTopScores():
    try:
        f = open('test_stats.txt', 'r')
        stats = f.read()
        stats_list = stats.split("\n")
        f.close()
    except:
        print "Failed reading file"
        sys.exit(4)
    json_string = json.dumps(stats_list)
    print json_string
The only difference is using the tail function and regex, the end result JSON actually looks identical.
Are you certain that updateChatBox is even getting called? Note that you compare ajax to the string "1" but you compare chat to the integer 1. I bet one of those doesn't match (in particular the chat one). If that doesn't match, your script will fall through without ever returning a value.
Also, though it isn't the root cause, you should clean up your content types for correctness. Your Javascript AJAX call is declared as expecting html in response, and your cgi script is also set to return content-type:text/html. These should be changed to json and content-type:application/json, respectively.
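Both points can be illustrated with a short Python 3 sketch. The helper names wants_chat_update and json_response are invented for this example, and the plain dict stands in for the parsed form data; it is not the cgi.FieldStorage API.

```python
import json

def wants_chat_update(params: dict) -> bool:
    # Query-string values arrive as strings, so compare against "1", not 1
    return params.get("ajax") == "1" and params.get("chat") == "1"

def json_response(payload) -> str:
    # Header block, blank line, then the JSON body
    return "Content-Type: application/json\r\n\r\n" + json.dumps(payload)

params = {"ajax": "1", "chat": "1"}  # what data: {'ajax':'1', 'chat':'1'} would send
print(params["chat"] == 1)           # False: a str never equals an int
print(wants_chat_update(params))     # True
print(json_response(["jon: Hi.", "bob: Hello."]))
```

Because "1" == 1 is simply False rather than an error, the mismatch is silent, which makes this kind of bug easy to miss.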

websocket handshake problem

I'm using python to implement a simple websocket server.
The handshake I'm using comes from http://en.wikipedia.org/w/index.php?title=WebSockets&oldid=372387414.
The handshake itself seems to work, but when I hit send, I get a javascript error:
Uncaught Error: INVALID_STATE_ERR: DOM Exception 11
Here's the html:
<!doctype html>
<html>
<head>
<title>ws_json</title>
</head>
<body onload="handleLoad();" onunload="handleUnload();">
<input type="text" id='input' />
<input type="button" value="submit" onclick="handleSubmit()" />
<div id="display"></div>
<script type="text/javascript">
function showmsg(str){
display = document.getElementById("display");
display.innerHTML += "<p>" + str + "</p>";
}
function send(str){
ws.send(str.length);
ws.send(str);
}
function handleSubmit(){
input = document.getElementById('input');
send(input.value);
input.focus();
input.value = '';
}
function handleLoad(){
ws = new WebSocket("ws://localhost:8888/");
ws.onopen = function(){
showmsg("websocket opened.");
}
ws.onclose = function(){
showmsg("websocket closed.");
}
}
function handleUnload(){
ws.close();
}
</script>
</body>
</html>
And here's the python code:
import socket
import threading
import json
PORT = 8888
LOCATION = "localhost:8888"
def handler(s):
    print " in handler "
    ip, _ = s.getpeername()
    print "New connection from %s" % ip
    request = s.recv(1024)
    print "\n%s\n" % request
    print s.getpeername()
    # send response
    response = "HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
    response += "Upgrade: WebSocket\r\n"
    response += "Connection: Upgrade\r\n"
    try:
        peername = s.getpeername()
        response += "Sec-WebSocket-Origin: http://%s\r\n" % peername[0] # % request[request.index("Origin: ")+8:-4]
    except ValueError:
        print "Bad Request"
        raise socket.error
    response += "Sec-WebSocket-Location: ws://%s\r\n" % LOCATION
    response += "Sec-WebSocket-Protocol: sample"
    response = response.strip() + "\r\n\r\n"
    print response
    s.send(response)
    while True:
        length = s.recv(1)
        print length
        if not length:
            break
        length = int(length)
        print "Length: %i" % length
        data = s.recv(length)
        print "Received: %s" % data
        print ""
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('localhost', PORT))
s.listen(5)
print "server is running..."
while True:
    sock, addr = s.accept()
    threading.Thread(target=handler, args=(sock, )).start()
Does anyone know what I'm doing wrong here?
I tested your code on Firefox 4 and got the same error upon hitting send, however before that I got
Firefox can't establish a connection
to the server at ws://localhost:8888/.
which is probably why the WebSocket object was destroyed. I suspect your handshake response is missing something, so Firefox is closing the socket.
From the Wikipedia article on Websockets:
The Sec-WebSocket-Key1 and
Sec-WebSocket-Key2 fields and the
eight bytes after the fields are
random tokens which the server uses to
construct a 16 byte token at the end
of its handshake to prove that it has
read the client's handshake.
Your server's response does not have this special number at the bottom, so I think we need to figure out how to generate it and include it.
EDIT: How to generate that number
Let's start with key1, key2, and the 8 bytes at the end of the handshake:
key1 = "18x 6]8vM;54 *(5: { U1]8 z [ 8"
key2 = "1_ tx7X d < nw 334J702) 7]o}` 0"
end8 = "Tm[K T2u"
We make a number for each key by ignoring every character that is not a digit 0-9. In Python:
def numFromKey(key):
    return int(filter(lambda c: c in map(str,range(10)),key))
Next we divide that number by the number of spaces in the original key string, so here is a function that counts the spaces in a string:
def spacesIn(key):
    return len(filter(lambda c: c==' ',key))
The two numbers resulting from the keys are:
pkey1 = numFromKey(key1)/spacesIn(key1)
pkey2 = numFromKey(key2)/spacesIn(key2)
Now we need to concatenate the bytes of pkey1, pkey2, and end8. The processed keys need to be represented as 32-bit big-endian numbers.
from struct import pack
catstring = pack('>L',pkey1) + pack('>L',pkey2) + end8
Then we take the MD5 hash of those bytes to get the magic number that we tack on to the end of the handshake:
import md5
magic = md5.new(catstring).digest()
That's how I think it works, at least.
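For readers on Python 3, the steps above could be sketched roughly as follows. This is an illustrative rewrite rather than the answerer's code: key_number and challenge_response are made-up names, hashlib replaces the old md5 module, and the divisibility check is an added assumption about how a malformed key should be treated. The sample keys in the usage lines are arbitrary values chosen so the arithmetic is easy to follow.

```python
import hashlib
import struct

def key_number(key: str) -> int:
    """Digits of the key read as one integer, divided by the number of spaces."""
    digits = "".join(c for c in key if c in "0123456789")
    spaces = key.count(" ")
    if not digits or spaces == 0:
        raise ValueError("key needs at least one digit and one space")
    number = int(digits)
    if number % spaces != 0:
        # Assumption: treat a non-integral quotient as a malformed key
        raise ValueError("digits are not divisible by the space count")
    return number // spaces

def challenge_response(key1: str, key2: str, end8: bytes) -> bytes:
    """MD5 over two 32-bit big-endian key numbers followed by the 8 trailing bytes."""
    if len(end8) != 8:
        raise ValueError("expected exactly 8 trailing bytes")
    packed = struct.pack(">II", key_number(key1), key_number(key2)) + end8
    return hashlib.md5(packed).digest()

print(key_number("4 @1  46546xW%0l 1 5"))  # 829309203
print(key_number("12998 5 Y3 1  .P00"))    # 259970620
token = challenge_response("4 @1  46546xW%0l 1 5", "12998 5 Y3 1  .P00", b"^n:ds[4U")
print(len(token))                           # 16
```

The 16-byte token would be sent as the body of the handshake response, after the blank line that ends the headers. Note that the number of spaces in each key is significant, so the keys must be taken from the request exactly as received.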
As of version 8, this protocol is deprecated. Please refer to:
http://tools.ietf.org/id/draft-ietf-hybi-thewebsocketprotocol-12.txt
for the new version of the protocol.
