Neural Network, gradient descent only finds the average of the outputs?


This problem is more conceptual than in the code, so the fact that this is written in JS shouldn't matter very much.
So I'm trying to make a Neural Network and I'm testing it by trying to train it to do a simple task - an OR gate (or, really, just any logic gate). I'm using Gradient Descent without any batches for the sake of simplicity (batches seem unnecessary for this task, and the less unnecessary code I have the easier it is to debug).
However, after many iterations the output always converges to the average of the outputs. For example, given this training set:
[0,0] = 0
[0,1] = 1
[1,0] = 1
[1,1] = 0
The outputs, no matter the inputs, always converge around 0.5. If the training set is:
[0,0] = 0,
[0,1] = 1,
[1,0] = 1,
[1,1] = 1
The outputs always converge around 0.75 - the average of all the training outputs. This appears to be true for all combinations of outputs.
It seems like this is happening because whenever it's given something with an output of 0, it changes the weights to get closer to that, and whenever it's given something with an output of 1, it changes the weights to get closer to that, meaning that over time it will converge around the average.
Here's the backpropagation code (written in JavaScript):
this.backpropigate = function(data){
    //Sets the inputs
    for(var i = 0; i < this.layers[0].length; i ++){
        if(i < data[0].length){
            this.layers[0][i].output = data[0][i];
        }
        else{
            this.layers[0][i].output = 0;
        }
    }
    this.feedForward(); //Rerun through the NN with the new set outputs
    for(var i = this.layers.length-1; i >= 1; i --){
        for(var j = 0; j < this.layers[i].length; j ++){
            var ref = this.layers[i][j];
            //Calculate the gradients for each Neuron
            if(i == this.layers.length-1){ //Output layer neurons
                var error = ref.output - data[1][j]; //Error
                ref.gradient = error * ref.output * (1 - ref.output);
            }
            else{ //Hidden layer neurons
                var gradSum = 0; //Find sum from the next layer
                for(var m = 0; m < this.layers[i+1].length; m ++){
                    var ref2 = this.layers[i+1][m];
                    gradSum += (ref2.gradient * ref2.weights[j]);
                }
                ref.gradient = gradSum * ref.output * (1-ref.output);
            }
            //Update each of the weights based off of the gradient
            for(var m = 0; m < ref.weights.length; m ++){
                //Find the corresponding neuron in the previous layer
                var ref2 = this.layers[i-1][m];
                ref.weights[m] -= LEARNING_RATE*ref2.output*ref.gradient;
            }
        }
    }
    this.feedForward();
};
Here, the NN is in a structure where each Neuron is an object with inputs, weights, and an output which is calculated based on the inputs/weights, and the Neurons are stored in a 2D 'layers' array where the x dimension is the layer (so, the first layer is the inputs, second hidden, etc.) and the y dimension is a list of the Neuron objects inside of that layer. The 'data' inputted is given in the form [data,correct-output] so like [[0,1],[1]].
Also my LEARNING_RATE is 1 and my hidden layer has 2 Neurons.
I feel like there must be some conceptual issue with my backpropagation method, as I've tested the other parts of my code (like the feedForward part) and they work fine. I tried to use various sources, though I mostly relied on the Wikipedia article on backpropagation and the equations it gave me.
I know it may be confusing to read my code, though I tried to make it as simple to understand as possible, but any help would be greatly appreciated.
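For comparison, here is a minimal sketch of the same output-layer delta rule (error * output * (1 - output)) applied to a single sigmoid neuron learning the OR set from above. The post does not show whether the neurons have a bias term, so the bias here is an assumption, as are the zero initial weights and the names trainNeuron and predictOr. A single neuron can only learn linearly separable gates such as OR, not XOR, so this is a reference point to test against rather than a replacement for the network in the question.

```javascript
// Minimal sketch: one sigmoid neuron with a bias term, trained online (no batches).
// The bias and the zero initial weights are assumptions, not taken from the post.
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

function trainNeuron(samples, epochs, learningRate) {
  const weights = [0, 0];
  let bias = 0;
  for (let epoch = 0; epoch < epochs; epoch++) {
    for (const [inputs, target] of samples) {
      const output = sigmoid(inputs[0] * weights[0] + inputs[1] * weights[1] + bias);
      // Same delta as the output-layer branch in the question: error * sigmoid derivative.
      const gradient = (output - target) * output * (1 - output);
      weights[0] -= learningRate * inputs[0] * gradient;
      weights[1] -= learningRate * inputs[1] * gradient;
      bias -= learningRate * gradient; // the bias acts like a weight on a constant input of 1
    }
  }
  // Return a prediction function that closes over the trained parameters.
  return (inputs) => sigmoid(inputs[0] * weights[0] + inputs[1] * weights[1] + bias);
}

const orSamples = [[[0, 0], 0], [[0, 1], 1], [[1, 0], 1], [[1, 1], 1]];
const predictOr = trainNeuron(orSamples, 5000, 1);
console.log(orSamples.map(([inputs]) => predictOr(inputs).toFixed(2)));
```

If a neuron like this learns OR but the full network still settles on the average, the difference between the two update loops is a reasonable place to start looking.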

Related

Using a grid of points to detect if a object is in range

I have a grid of points and I made them using a for loop at the beginning of my app. Each point has two arrays, one is named objectsAroundMe and the other is called pointsAroundMe.
The objective is to detect if the object is near the point (using for loop for both objects and points)
After detection, if the object is in range we push it to the point.objectsAroundMe array.
All of this works fine, but the problem is getting the point to release the reference when the object is no longer near. I've tried running an if statement that sets the reference to null, but it doesn't work. If there were an efficient way of doing this so that only one reference moved from array to array, that would be perfect. Next I'm going to try using array.splice and slice to copy and paste references. So far I've tried array.filter, indexOf and findIndex, and none worked. I'm newish to classes, so if there is a difference between using a for-loop iteration index and using "this" to identify the object, please give me an example of how I would find the index of a "this" object and delete its reference from the point array.
onHitTest(){
    for (let ii = 0; ii < jsEngine.pointGrid.length; ii++) {
        let point = jsEngine.pointGrid[ii];
        let distanceBetween = calcTotalDistance(this.transform.x,this.transform.y,point.x,point.y);
        let pointPosition = point.x + point.y;
        if (!point.objectsAroundMe.includes(this)) {
            if ( distanceBetween < mapWidth/density*1.4) {
                point.objectsAroundMe.push(this);
                this.hitTestArray = point.objectsAroundMe;
                this.pointArray = point.pointsAroundMe;
                //console.log(this.hitTestArray);
            }
            if(point.objectsAroundMe.includes(this)) {
                if (pointPosition - distanceBetween > 100000) {
                    let indx= point.objectsAroundMe.indexOf(this);
                    point.objectsAroundMe[indx] = null;
                }
            }
        }
    }
    //// second for loop for hit testing the passed array from the point to the object.
    for (let i = 0 ; i < this.hitTestArray.length; i++){
        let hitTestObject = this.hitTestArray[i];
        if(hitTestObject.transform=== null)
            continue;
        if(hitTestObject === this)
            continue;
        let distance = calc_distance(this.transform,hitTestObject.transform);
        if (distance < hitTestObject.transform.width + hitTestObject.transform.width
            && distance < this.transform.height + this.transform.height){
            //console.log("hit!")
        }
    }
}
mapWidth = 1000000 and density = 10.
distanceBetween = the distance between the object and the point, computed with: return Math.sqrt((x1 - x2)**2 + (y1 - y2)**2);
this = the object in question (to avoid a double for loop)
pointGrid = a grid of points with a total of 90 points equally spaced by mapWidth/density
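As a sketch of the indexOf-and-splice approach asked about above: indexOf compares object references by identity, so it finds "this" just as it would find any other object, and splice removes the entry instead of leaving a null behind. The helper name updateMembership and the inRange flag are hypothetical, introduced only for illustration; the range test itself is left to the caller.

```javascript
// Hypothetical helper: keep point.objectsAroundMe in sync with whether obj is in range.
function updateMembership(point, obj, inRange) {
  const index = point.objectsAroundMe.indexOf(obj); // compares references, so `this` works too
  if (inRange && index === -1) {
    point.objectsAroundMe.push(obj);          // entered range: store one reference
  } else if (!inRange && index !== -1) {
    point.objectsAroundMe.splice(index, 1);   // left range: remove it and close the gap
  }
}

// Usage with stand-in objects (the real ones come from jsEngine.pointGrid).
const gridPoint = { x: 0, y: 0, objectsAroundMe: [] };
const ship = { name: "ship" };
updateMembership(gridPoint, ship, true);
updateMembership(gridPoint, ship, true);   // second call does not add a duplicate
console.log(gridPoint.objectsAroundMe.length);
updateMembership(gridPoint, ship, false);
console.log(gridPoint.objectsAroundMe.length);
```

Removing the entry rather than assigning null also matters for the second loop in the question: a null element would make hitTestObject.transform throw a TypeError before the null check is reached.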

After two weeks I finally gave up on this technique, because it was not performing as well as expected. I am now going to take a similar approach and will upload my code (functions and order of operations) shortly.

Mathematical Concept In Multi-Dimensional Array Sorting

I have found this code written online(not by me), and was hoping to get an answer on what the mathematical formula or concept is that makes this function work. I am curious as to how this person designed this. First, I will explain the requirements that the function must produce, then I will supply the code, and a link to a working code pen for further hacking. P.S. The problem uses the word "vector", but since this is Javascript, vector just means array.
Function Requirements
Given a vector of vectors of words, ex.
[['quick', 'lazy'], ['brown', 'black', 'grey'], ['fox', 'dog']].
Write a function that prints all combinations of one word from the first vector, one word from the second vector, etc.
The solution may not use recursion. The number of vectors and number of elements within each vector may vary.
Example output: 'quick, brown, dog', 'lazy black fox' etc.
My Current Level Of Understanding
I am already aware that, by the principle of multiplication, the number of possible combinations in this scenario is found by multiplying the lengths of the inner vectors together. For this specific example, we get a total of 12 (2x3x2) different possible combinations. Where I fall off, however, is inside the 4 nested for loops section of the program.
Whoever wrote the code clearly understands some concept or formula that I do not. Two examples are the "previous" variable used inside the loops, and the strategic placement of where they decide to increment the j variable. It seems to me that they might be aware of some mathematical formula.
Code
Below is the code without comments. If you however go to this codepen, I have included the same code with plenty of comments that explain how the program works, so you don't have to trace everything out from scratch. You can also test the output in the built-in console.
function comboMaker(vector) {
    var length = vector.length;
    var solutions = 1;
    for (var i = 0; i < length; i++) {
        solutions *= vector[i].length;
    }
    var combinations = [];
    for (var i = 0; i < solutions; i++) {
        combinations[i] = [];
    }
    var previous = 1;
    for (var i = 0; i < length; i++) {
        for (var j = 0; j < solutions;) {
            var wordCount = vector[i].length;
            previous *= vector[i].length;
            for (var l = 0; l < wordCount; l++) {
                for (var k = 0; k < (solutions/previous); k++) {
                    combinations[j][i] = vector[i][l];
                    j++
                }
            }
        }
    }
    for (var i = 0; i < solutions; i++) {
        console.log(combinations[i].join(" "));
    }
}
comboMaker([['quick', 'lazy'], ['brown', 'black', 'grey'], ['fox', 'dog']]);

You can consider each combination of items as a number in a mixed radix numeral system.
The radix for every position is equal to the number of items in the corresponding array (here {2,3,2}). The overall number of combinations M is the product of all radixes.
You can generate the combinations in either of two ways:
1. By making a for-loop with a counter in the range 0..M-1, separating every digit from this counter and getting the corresponding item. Pseudocode:
M = ProductOfLengthsOfArrays
for c = 0..M-1
    t = c
    combination = {}
    for i = 0..NumOfArrays-1
        d = t % Array[i].Length //modulo operation
        t = t / Array[i].Length //integer division
        combination.add(Array[i][d])
    output combination
2. By counting in mixed radix from 0 to M-1:
    if the item is the last in its array, take the first item and increment the next array
    else take the next item in the same array
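The first approach (peeling digits off a counter) can be sketched in JavaScript as follows. The function name mixedRadixCombinations is hypothetical; it follows the pseudocode above, with the first array acting as the least significant digit, so the output order differs from comboMaker even though the set of combinations is the same.

```javascript
// Sketch of the counter-based approach: treat each combination as a mixed radix number.
function mixedRadixCombinations(arrays) {
  // M = product of the lengths of all arrays.
  const total = arrays.reduce((product, arr) => product * arr.length, 1);
  const result = [];
  for (let counter = 0; counter < total; counter++) {
    let rest = counter;
    const combination = [];
    for (const arr of arrays) {
      combination.push(arr[rest % arr.length]); // current digit selects the item
      rest = Math.floor(rest / arr.length);     // integer division moves to the next digit
    }
    result.push(combination.join(" "));
  }
  return result;
}

const words = [['quick', 'lazy'], ['brown', 'black', 'grey'], ['fox', 'dog']];
const allCombinations = mixedRadixCombinations(words);
console.log(allCombinations.length); // 12
console.log(allCombinations[0]);     // quick brown fox
console.log(allCombinations[1]);     // lazy brown fox
```

No recursion is involved, and the number and size of the inner arrays can vary freely, which matches the stated requirements.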

Combining .reduce() and .map() gives a compact one-line answer to this question.
var data = [['quick', 'lazy'], ['brown', 'black', 'grey'], ['fox', 'dog'],['jumps','runs']],
result = data.reduce((p,c) => p.reduce((r,fw) => r.concat(c.map(sw => fw + " " + sw)),[]));
console.log(result);

Difficult to solve the Phaser sliding puzzle as some parts of the original image are missing

I'm trying to recreate the sliding puzzle game from the Phaser examples.
Live example is demonstrated here
But in the resulting game, some parts of the original image are missing, so it is difficult to solve the puzzle.
I suspect that the algorithm for cutting the image into pieces is not correct.
The code for the pieces is:
function prepareBoard() {
    var piecesIndex = 0,
        i, j,
        piece;
    BOARD_COLS = Math.floor(game.world.width / PIECE_WIDTH);
    BOARD_ROWS = Math.floor(game.world.height / PIECE_HEIGHT);
    piecesAmount = BOARD_COLS * BOARD_ROWS;
    shuffledIndexArray = createShuffledIndexArray();
    piecesGroup = game.add.group();
    for (i = 0; i < BOARD_ROWS; i++)
    {
        for (j = 0; j < BOARD_COLS; j++)
        {
            if (shuffledIndexArray[piecesIndex]) {
                piece = piecesGroup.create(j * PIECE_WIDTH, i * PIECE_HEIGHT, "background", shuffledIndexArray[piecesIndex]);
            }
            else { //initial position of black piece
                piece = piecesGroup.create(j * PIECE_WIDTH, i * PIECE_HEIGHT);
                piece.black = true;
            }
            piece.name = 'piece' + i.toString() + 'x' + j.toString();
            piece.currentIndex = piecesIndex;
            piece.destIndex = shuffledIndexArray[piecesIndex];
            piece.inputEnabled = true;
            piece.events.onInputDown.add(selectPiece, this);
            piece.posX = j;
            piece.posY = i;
            piecesIndex++;
        }
    }
}
function createShuffledIndexArray() {
    var indexArray = [];
    for (var i = 0; i < piecesAmount; i++)
    {
        indexArray.push(i);
    }
    return shuffle(indexArray);
}
function shuffle(array) {
    var counter = array.length,
        temp,
        index;
    while (counter > 0)
    {
        index = Math.floor(Math.random() * counter);
        counter--;
        temp = array[counter];
        array[counter] = array[index];
        array[index] = temp;
    }
    return array;
}
Does anyone have any idea? Please share any algorithm that cuts the pieces correctly.
Thanks in advance
iijb

This is the classic 15-puzzle, so called because it traditionally has a 4x4 grid with 1 tile missing (4x4-1=15 tiles). However, the puzzle can practically be any grid size (4x3, 5x4, 6x6 etc).
You're using a .destIndex property to keep track of their position, but you could just give each tile a numbered index. I think that way it's easier because when all the tiles are ordered the puzzle is solved and it would also help the check-if-solvable-algorithm.
With this kind of sliding tile puzzle, there are two things to consider which are a little tricky, especially the second point:
1. There is always one tile missing, because that is the empty spot the player uses to slide tiles into. Most commonly, the missing tile is the bottom-right tile of the image. In your algorithm the blank tile is always the top-left tile of the image. This is unusual and players might not expect it; however, in theory it doesn't really matter and you could make a workable puzzle that way. You then keep track of the empty tile in code by value 1 (or maybe 0 for zero-indexed) because it's the first tile.
2. Some configurations are unsolvable, i.e. not every randomly scrambled arrangement can be solved by sliding the tiles around. A puzzle is solvable when the number of inversions (switches) needed to solve it is even, not odd. So count the number of pairs where a bigger number comes before a smaller one (each such pair is one inversion). For example, in a 3x3 puzzle with the bottom-right tile missing:
5 3 4
2 6 1
8 7
As an array it looks like this: [5,3,4,2,6,1,8,7,9]. The inverted pairs are 5-3, 5-4, 5-2, 5-1, 3-2, 3-1, 4-2, 4-1, 2-1, 6-1, 8-7. That is 11 pairs, so 11 inversions are needed. This is not an even number, thus this configuration is unsolvable. Note that the missing tile internally has the highest possible number, which is 9 in this case.
You can use this method to detect an unsolvable puzzle. All you need to do to make it solvable again is switch any two tiles, for example the first two tiles of the top row (5 and 3 in the example). When the number of switches needed is an even number, the puzzle is already solvable and you don't need to do anything.
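The inversion count described above can be sketched like this. The function names countInversions and isSolvable are hypothetical, and the check assumes the layout used in the example: the empty tile carries the highest number and sits in the last (bottom-right) position.

```javascript
// Count pairs where a bigger number comes before a smaller one.
function countInversions(tiles) {
  let inversions = 0;
  for (let i = 0; i < tiles.length; i++) {
    for (let j = i + 1; j < tiles.length; j++) {
      if (tiles[i] > tiles[j]) {
        inversions++;
      }
    }
  }
  return inversions;
}

// Assumption: the blank is the highest number and is stored last, as in the example.
function isSolvable(tiles) {
  return countInversions(tiles) % 2 === 0;
}

const scrambled = [5, 3, 4, 2, 6, 1, 8, 7, 9];
console.log(countInversions(scrambled)); // 11
console.log(isSolvable(scrambled));      // false

// Swapping the first two tiles removes the 5-3 inversion and flips the parity.
const repaired = [3, 5, 4, 2, 6, 1, 8, 7, 9];
console.log(countInversions(repaired));  // 10
console.log(isSolvable(repaired));       // true
```

A shuffle routine could call isSolvable after scrambling and swap two non-blank tiles when it returns false.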
I've made similar puzzle games, you can see the source code here to see how it works:
Photoscramble v2 (download incl Delphi source)
Photoscramble v1 (download incl BlitzBasic source)

Bilateral filter algorithm

I'm trying to implement a simple bilateral filter in javascript. This is what I've come up with so far:
// For each pixel
for (var y = kernelSize; y < height-kernelSize; y++) {
    for (var x = kernelSize; x < width-kernelSize; x++) {
        var pixel = (y*width + x)*4;
        var sumWeight = 0;
        outputData[pixel] = 0;
        outputData[pixel+1] = 0;
        outputData[pixel+2] = 0;
        outputData[pixel+3] = inputData[pixel+3];
        // For each neighbouring pixel
        for(var i=-kernelSize; i<=kernelSize; i++) {
            for(var j=-kernelSize; j<=kernelSize; j++) {
                var kernel = ((y+i)*width+x+j)*4;
                var dist = Math.sqrt(i*i+j*j);
                var colourDist = Math.sqrt((inputData[kernel]-inputData[pixel])*(inputData[kernel]-inputData[pixel])+
                    (inputData[kernel+1]-inputData[pixel+1])*(inputData[kernel+1]-inputData[pixel+1])+
                    (inputData[kernel+2]-inputData[pixel+2])*(inputData[kernel+2]-inputData[pixel+2]));
                var curWeight = 1/(Math.exp(dist*dist/72)*Math.exp(colourDist*colourDist*8));
                sumWeight += curWeight;
                outputData[pixel] += curWeight*inputData[pixel];
                outputData[pixel+1] += curWeight*inputData[pixel+1];
                outputData[pixel+2] += curWeight*inputData[pixel+2];
            }
        }
        outputData[pixel] /= sumWeight;
        outputData[pixel+1] /= sumWeight;
        outputData[pixel+2] /= sumWeight;
    }
}
inputData comes from an HTML5 canvas object and is in RGBA form.
My images either come out unchanged or with patches of black around edges, depending on how I change this formula:
var curWeight = 1/(Math.exp(dist*dist/72)*Math.exp(colourDist*colourDist*8));
Unfortunately I'm still new to HTML/JavaScript and computer vision algorithms, and my searches have come up with no answers. My guess is that there is something wrong with the way curWeight is calculated. What am I doing wrong here? Should I have converted the input image to CIELAB/HSV first?

I'm no JavaScript expert: are the RGB values 0..255? If so, Math.exp(colourDist*colourDist*8) will yield extremely large values. You'll probably want to scale colourDist to the range [0..1].
BTW: Why do you calculate the sqrt of dist and colourDist if you only need the squared distance afterwards?

First of all, your images turn out black/weird at the edges because you don't filter the edges. A short look at your code shows that you begin at (kernelSize,kernelSize) and finish at (width-kernelSize,height-kernelSize). This means that you only filter a smaller rectangle inside the image, leaving an unfiltered margin of kernelSize on each side. Without knowing your JavaScript/HTML5, I would assume that your outputData array is initialized with zeros (which means black), and not touching those pixels leaves them black. See the link in my comment on your post for code that does handle the edges.
Other than that, follow @nikie's answer: you probably want to make sure the color distance is clamped to the range [0,1]. You can do this by adding the line colourDist = colourDist / (MAX_COMP * Math.sqrt(3)) directly after the line that calculates it, where MAX_COMP is the maximal value a color component in the image can have (usually 255).

I've found the error in the code. The problem was I was adding each pixel to itself instead of its surrounding neighbours. I'll leave the corrected code here in case anyone needs a bilateral filter algorithm.
outputData[pixel] += curWeight*inputData[kernel];
outputData[pixel+1] += curWeight*inputData[kernel+1];
outputData[pixel+2] += curWeight*inputData[kernel+2];
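Putting the three suggestions from this thread together (sum the neighbour values, scale the colour distance to [0,1], and filter the border pixels too) could look like the sketch below. The function name bilateralFilter and the parameters radius, sigmaSpace and sigmaColour are hypothetical; the Gaussian form of the weights and the clamping of out-of-range neighbours to the nearest edge pixel are choices made here, not something the posts above prescribe.

```javascript
// Sketch of a bilateral filter over a flat RGBA array (the layout of canvas ImageData.data).
// sigmaSpace and sigmaColour are hypothetical tuning parameters, not values from the post.
function bilateralFilter(input, width, height, radius, sigmaSpace, sigmaColour) {
  const output = new Uint8ClampedArray(input.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const pixel = (y * width + x) * 4;
      let sumWeight = 0;
      const sum = [0, 0, 0];
      for (let i = -radius; i <= radius; i++) {
        for (let j = -radius; j <= radius; j++) {
          // Clamp neighbour coordinates so border pixels are filtered too.
          const ny = Math.min(height - 1, Math.max(0, y + i));
          const nx = Math.min(width - 1, Math.max(0, x + j));
          const kernel = (ny * width + nx) * 4;
          // Squared colour distance with each channel scaled to [0, 1].
          let colourDist2 = 0;
          for (let c = 0; c < 3; c++) {
            const diff = (input[kernel + c] - input[pixel + c]) / 255;
            colourDist2 += diff * diff;
          }
          const dist2 = i * i + j * j; // squared spatial distance, no sqrt needed
          const weight = Math.exp(
            -dist2 / (2 * sigmaSpace * sigmaSpace) -
            colourDist2 / (2 * sigmaColour * sigmaColour)
          );
          sumWeight += weight;
          for (let c = 0; c < 3; c++) {
            sum[c] += weight * input[kernel + c]; // neighbour value, not the centre pixel
          }
        }
      }
      for (let c = 0; c < 3; c++) {
        output[pixel + c] = sum[c] / sumWeight; // Uint8ClampedArray rounds and clamps
      }
      output[pixel + 3] = input[pixel + 3]; // keep alpha unchanged
    }
  }
  return output;
}
```

With canvas data this would be called as bilateralFilter(imageData.data, imageData.width, imageData.height, 3, 3, 0.1), where the last three values are only placeholders to tune. Working with squared distances also avoids the square roots questioned in the first answer.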

Need help with backtracking algorithm for generating Sudoku board

I have written an algorithm for generating a Sudoku board but it is failing. I have written it based on this though it does differ as I had written a lot of my code before I stumbled upon this.
The Code
I have a multidimensional array set up for holding the values called matrix. matrix consists of 9 arrays which are the rows and each of these hold the 9 columns. So to get the value at row 4 column 7 I would use
matrix[3][6];
The function for solving all the squares:
var populateMatrix = function() {
    var possibles = generatePossibleNumbersArray();
    var found = false;
    for(var i=0; i< matrix.length; i++) {
        for(var j=0; j< matrix[i].length; j++) {
            while(possibles[i][j].length > 0) {
                var rnd = Math.floor(Math.random() * possibles[i][j].length);
                var num = possibles[i][j].splice(rnd, 1)[0];
                if(isValid(i, j, num)) {
                    matrix[i][j] = num;
                    found = true;
                    break;
                } else {
                    found = false;
                    continue;
                }
            }
            if(!found) {
                possibles[i][j] = [1,2,3,4,5,6,7,8,9];
                j -= 2;
            }
        }
    }
}
The generatePossibleNumbersArray() is just a helper function for creating a multidimensional array exactly like matrix except it is initialised to hold an array of integers 1-9 for each cell. During the populateMatrix() function these possible numbers get whittled down for each cell.
The Problem
It fails before completing the matrix every time because j ends up being -1. This is because as more cells get solved it becomes harder for the algorithm to find a value for a cell so it backtracks. But it eventually ends up backtracking all the way back until j == -1.
I really thought this algorithm would work and I've spent all day trying to get my head around this but I'm stumped so any light anyone could shed on this would be very much appreciated.
I thought 'I know, I'll write a javascript function for solving Sudoku. How hard can it be?'. How wrong I was.
[SOLUTION]
Based on a comment by @Steve314 (which he's now deleted!) I added matrix[i][j] = undefined into the if(!found) { ... block, and the algorithm now works and is lightning fast.
If anyone is interested, here is the complete code.

Backtracking algorithms usually restore the state if a branch fails and do the next possible move. So if the random filling of a field creates a failed branch just write back what was originally there.
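A recursive sketch of that restore-on-failure idea is shown below. It is not the asker's loop-based code: the function names (isValidPlacement, shuffled, fillBoard, generateBoard) are hypothetical, empty cells are represented by 0, and the cells are filled in row-major order with the candidate digits tried in random order.

```javascript
// Check whether num can go at (row, col) without clashing in its row, column or 3x3 box.
function isValidPlacement(grid, row, col, num) {
  for (let k = 0; k < 9; k++) {
    if (grid[row][k] === num || grid[k][col] === num) return false;
  }
  const boxRow = row - (row % 3);
  const boxCol = col - (col % 3);
  for (let r = boxRow; r < boxRow + 3; r++) {
    for (let c = boxCol; c < boxCol + 3; c++) {
      if (grid[r][c] === num) return false;
    }
  }
  return true;
}

// Fisher-Yates shuffle on a copy, so candidates are tried in random order.
function shuffled(values) {
  const copy = values.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Fill cells 0..80 in row-major order; undo a placement when the branch below it fails.
function fillBoard(grid, cell = 0) {
  if (cell === 81) return true; // every cell is filled
  const row = Math.floor(cell / 9);
  const col = cell % 9;
  for (const num of shuffled([1, 2, 3, 4, 5, 6, 7, 8, 9])) {
    if (isValidPlacement(grid, row, col, num)) {
      grid[row][col] = num;
      if (fillBoard(grid, cell + 1)) return true;
      grid[row][col] = 0; // restore the state before trying the next candidate
    }
  }
  return false; // no candidate worked, so the caller must change its own choice
}

function generateBoard() {
  const grid = Array.from({ length: 9 }, () => new Array(9).fill(0));
  fillBoard(grid);
  return grid;
}

console.log(generateBoard().map((row) => row.join(" ")).join("\n"));
```

The grid[row][col] = 0 line plays the same role as the matrix[i][j] = undefined fix in the question: a cell that is being retried must not still hold its old value while later validity checks run.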
