Canvas doesn’t render backslash and other chars with Topaz font

Never thought I'd come up with something new on Stack Overflow but here we are. Should this already have a solution then please feel free to point me towards it.
I'm working on this interactive desktop environment modelled after the Amiga Workbench and of course I would incorporate a recreation of the Topaz font.
Here is the thing though. Text and signs like hyphens, commas and periods display correctly when drawn through the Canvas API, but signs like the copyright sign or the backslash do not.
Amiga Workbench screen where the copyright sign doesn't display correctly
I figured that the font I'm using maybe doesn't include the copyright sign but that struck me as weird. Why go through the trouble of making a font and then omit a couple of signs right?
If you open the font with FontForge you can clearly see that it's there, along with a range of other exotic glyphs.
FontForge showing set of signs saved in font data
Looking at the set also rules out a mix-up between Unicode code points.
I used the same font in GIMP, where I made a visual mockup/prototype, and the copyright sign is displayed with no problem. That rules out any keyboard layout shenanigans, right?
Screenshot of Gimp where the copyright sign displays correctly
The only thing left is my code which I guess is unproblematic. Here is how I weave in the font and draw text:
const amigaFont = new FontFace(
'Topaz A1200',
'url(./media/font/Topaz_a1200_v1.0.ttf)'
);
amigaFont.load().then(function (font) {
document.fonts.add(amigaFont);
});
// between here is code that essentially waits for all media to have
// loaded like images and including the font before it is
// used to draw text, setting up the canvas etc...
const splashScreenText = [
'AMIGA ROM Operating System and Libraries',
'Copyright © 1985-1992 Commodore-Amiga, Inc.',
'All Rights Reserved.',
'1>'
];
let fontSize = Math.round(amigaDOSwindow.bottom * (24 / 356));
ctx.font = fontSize.toString() + 'px Topaz A1200';
textCursor.x = amigaDOSinput.left;
textCursor.y = amigaDOSinput.top + fontSize;
for (let index = 0; index < splashScreenText.length; index++) {
ctx.fillText(splashScreenText[index], textCursor.x, textCursor.y);
textCursor.y += fontSize;
}
Edit: I was asked to share runnable code that reproduces the issue:
const mediaCount = 1;
const mediaArray = [];
const mediaManager = setInterval(checkArray, 15);
function checkArray() {
switch (true) {
case mediaArray.length == mediaCount:
clearInterval(mediaManager);
initCanvas();
break;
default:
break;
}
}
const amigaFont = new FontFace(
'Topaz A1200',
'url(https://cdn.jsdelivr.net/gh/rewtnull/amigafonts@master/ttf/Topaz_a1200_v1.0.ttf)'
);
amigaFont.load().then(function (font) {
document.fonts.add(amigaFont);
mediaArray.push(amigaFont);
});
const splashScreenText = [
'AMIGA ROM Operating System and Libraries',
'Copyright © 1985-1992 Commodore-Amiga, Inc.',
'All Rights Reserved.',
'1>'
];
const textCursor = {
x: 0,
y: 0,
};
function initCanvas() {
let canvas = document.createElement('canvas');
canvas.id = 'canvas';
document.body.appendChild(canvas);
processCanvas();
}
function processCanvas() {
let canvas = document.getElementById('canvas');
canvas.width = 320;
canvas.height = 180;
drawText();
}
function drawText() {
let canvas = document.getElementById('canvas');
let ctx = canvas.getContext('2d');
let fontSize = 32;
ctx.font = fontSize.toString() + 'px Topaz A1200';
textCursor.x = 0;
textCursor.y = 0 + fontSize;
for (let i = 0; i < splashScreenText.length; i++) {
ctx.fillText(splashScreenText[i], textCursor.x, textCursor.y);
textCursor.y += fontSize;
}
}
<canvas id="canvas"></canvas>
But somehow here it works...
Is there something (I clearly don't know) about fonts and the Canvas API that may lead to this behaviour?

Given that you get a replacement character (U+FFFD), this is not a font issue: a missing glyph would make the browser fall back to another font, not draw the replacement glyph. Instead, you have an encoding issue.
I don't know which charset your document declares, nor which charset your HTML file was saved with, but clearly the two don't match.
If you both save your HTML page as UTF-8 and add a <meta charset="utf-8"> to the head of your document, the glyph will render correctly, as it does in your snippet.
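To see how such a mismatch produces U+FFFD, here is a minimal sketch. It assumes one plausible scenario (the file was saved as ISO-8859-1/Windows-1252 but read as UTF-8); the actual encodings involved in the question are not known.

```javascript
// The copyright sign is the single byte 0xA9 in ISO-8859-1 / Windows-1252,
// but the two bytes 0xC2 0xA9 in UTF-8.
const latin1Bytes = Uint8Array.of(0xa9);
const utf8Bytes = new TextEncoder().encode('\u00A9');

const utf8Decoder = new TextDecoder('utf-8');

// A lone 0xA9 is not a valid UTF-8 sequence, so the decoder substitutes U+FFFD.
const misread = utf8Decoder.decode(latin1Bytes);
// Bytes that really are UTF-8 decode back to the copyright sign.
const correct = utf8Decoder.decode(utf8Bytes);

console.log(misread === '\uFFFD'); // true
console.log(correct === '\u00A9'); // true
```

Plain ASCII characters are encoded identically in both charsets, which would explain why letters, digits and basic punctuation survive while the copyright sign does not.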

Related

Google Sheets. How to get the real range size in pixels

My script converts the selected range into an image. It first creates a public PDF URL and then converts it to PNG.
It works well for small ranges (10-20 rows) and creates a shot including images, charts, sparklines, and formatting.
The problem is with big ranges (100-1000 rows). They contain a border of unknown size and I cannot calculate it.
Heavy borders make rows higher so the image does not fit.
If we have no borders or thin borders, the real image size appears a bit smaller than calculated. This creates an empty space below the image.
My code sample for getting the range size in pixels:
// get row height in pixels
var h = 0;
var size = 0; // height of the most recently measured row
for (var i = rownum; i <= rownum2; i++) {
  // measure only the first rows; later rows reuse the last measured height
  if (i <= options.measure_limit) {
    size = sheet.getRowHeight(i);
  }
  h += size;
  /** manual correction */
  if (size === 2) {
    h -= 1;
  } else {
    // h -= 0.42; /** TODO → test the range to make it fit any range */
  }
  if ((i % 50) === 0 && i <= options.measure_limit) {
    file.toast(
      'Done ' + i + ' rows of ' + rownum2,
      '↕📐Measuring height...');
  }
}
if (i > options.measure_limit) {
  file.toast(
    'Estimation: all other rows are the same size',
    '↕📐Measuring height...');
}
As you can see, I have to loop over all rows, which is extremely inefficient. I'd be glad to hear your ideas for optimizing the code. Currently it measures the first 150 rows and then assumes all remaining rows have the same height.
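The "measure the first rows, then extrapolate" idea can be sketched as a pure function, which avoids iterating over the unmeasured rows at all. This is only an illustration: `estimateRangeHeight` and its parameters are hypothetical names, `getRowHeight` stands in for `sheet.getRowHeight`, and the manual border correction from the snippet above is left out.

```javascript
// Hypothetical helper: measures rows up to `measureLimit` and assumes every
// remaining row has the same height as the last measured one.
// Assumes firstRow <= lastRow.
function estimateRangeHeight(firstRow, lastRow, measureLimit, getRowHeight) {
  // Always measure at least the first row so there is a height to extrapolate.
  const measuredUntil = Math.max(firstRow, Math.min(lastRow, measureLimit));
  let total = 0;
  let lastMeasured = 0;
  for (let row = firstRow; row <= measuredUntil; row++) {
    lastMeasured = getRowHeight(row);
    total += lastMeasured;
  }
  // Extrapolate in one step instead of looping over the remaining rows.
  const remainingRows = lastRow - measuredUntil;
  return total + remainingRows * lastMeasured;
}

// Usage with a stand-in for sheet.getRowHeight: every row is 21 px except row 3.
const fakeRowHeight = row => (row === 3 ? 40 : 21);
console.log(estimateRangeHeight(1, 1000, 150, fakeRowHeight)); // 21019
```

This only reduces the number of calls; it does not address the mismatch between the reported row heights and the rendered size.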
Sample Situations
"Small" ranges are those you can see on screen. "Big" ranges have 100+ rows, so they do not fit on a normal screen. As I create screenshots, I tested all possible range sizes.
Case 1 - no borders or thin borders
If I select a big range, I get the image and see that it has white space at the bottom. This means the real size of the image was slightly smaller than the one I get from the script by calling sheet.getRowHeight(i).
Case 2 - heavy borders
If I select a big range, I get the image and see that not all the rows I selected are on it. Some rows at the bottom of the range are missing. This means that when I add heavy borders, the real size of the rows is bigger than the one I get from the script by calling sheet.getRowHeight(i).
Conclusion
I'd be glad to hear any ideas including JavaScript hacks to remove empty space below the image. If it is currently not possible, please also answer with links to docs.

I believe your goal is as follows.
You want to export the range as an image using Google Apps Script and Javascript.
In order to achieve this, in this question, you want to calculate the row height of the selected cell range.
Issue and workaround:
As we discussed in the comments, at the current stage there are several problems with obtaining the correct row height of a cell range:
When borders are used on the cells, the row height plus the border size seems to differ from the exported result. Ref
The pixel size might not change linearly with the row height and border size. Ref
When I tested the cell size including the borders, the way the size changes seemed to differ between height and width. Ref
When the row height is the default (21 from getRowHeight) and the font size of the text in the cell is increased, the value returned by getRowHeight stays at 21. Ref
There is also an issue with wrapped text inside a cell, which in my experience also causes errors in the pixel size of the cell. Ref
From your question, when the selected cell range is large, the exported PDF has two or more pages. In this case, the pages cannot be correctly merged into one image.
Given the above, I'm worried that obtaining the correct size of the selected cells might be difficult. So I proposed handling this with image processing instead. Ref I thought that this approach might avoid the issues above.
Unfortunately, Google Apps Script has no built-in method for image processing. Fortunately, in your situation it seems that Javascript can be used in a dialog, so I created a Javascript library that performs this image processing. Ref
A sample demonstration using this Javascript library follows.
Usage:
1. Prepare a Spreadsheet.
Please create a new Spreadsheet and put several values to the cells.
2. Sample script.
Please copy and paste the following script to the script editor of Spreadsheet.
Google Apps Script side: Code.gs
function getActiveRange_(ss, borderColor) {
const space = 5;
const sheet = ss.getActiveSheet();
const range = sheet.getActiveRange();
const obj = { startRow: range.getRow(), startCol: range.getColumn(), endRow: range.getLastRow(), endCol: range.getLastColumn() };
const temp = sheet.copyTo(ss);
const r = temp.getDataRange();
r.copyTo(r, { contentsOnly: true });
temp.insertRowAfter(obj.endRow).insertRowBefore(obj.startRow).insertColumnAfter(obj.endCol).insertColumnBefore(obj.startCol);
obj.startRow += 1;
obj.endRow += 1;
obj.startCol += 1;
obj.endCol += 1;
temp.setRowHeight(obj.startRow - 1, space).setColumnWidth(obj.startCol - 1, space).setRowHeight(obj.endRow + 1, space).setColumnWidth(obj.endCol + 1, space);
const maxRow = temp.getMaxRows();
const maxCol = temp.getMaxColumns();
if (obj.startRow + 1 < maxRow) {
temp.deleteRows(obj.endRow + 2, maxRow - (obj.endRow + 1));
}
if (obj.startCol + 1 < maxCol) {
temp.deleteColumns(obj.endCol + 2, maxCol - (obj.endCol + 1));
}
if (obj.startRow - 1 > 1) {
temp.deleteRows(1, obj.startRow - 2);
}
if (obj.startCol - 1 > 1) {
temp.deleteColumns(1, obj.startCol - 2);
}
const mRow = temp.getMaxRows();
const mCol = temp.getMaxColumns();
const clearRanges = [[1, 1, mRow], [1, obj.endCol, mRow], [1, 1, 1, mCol], [obj.endRow, 1, 1, mCol]];
temp.getRangeList(clearRanges.map(r => temp.getRange(...r).getA1Notation())).clear();
temp.getRange(1, 1, 1, mCol).setBorder(true, null, null, null, null, null, borderColor, SpreadsheetApp.BorderStyle.SOLID);
temp.getRange(mRow, 1, 1, mCol).setBorder(null, null, true, null, null, null, borderColor, SpreadsheetApp.BorderStyle.SOLID);
SpreadsheetApp.flush();
return temp;
}
function getPDF_(ss, temp) {
const url = ss.getUrl().replace(/\/edit.*$/, '')
+ '/export?exportFormat=pdf&format=pdf'
// + '&size=20x20' // If you want to increase the size of one page, please use this. But, when the page size is increased, the process time becomes long. Please be careful about this.
+ '&scale=2'
+ '&top_margin=0.05'
+ '&bottom_margin=0'
+ '&left_margin=0.05'
+ '&right_margin=0'
+ '&sheetnames=false'
+ '&printtitle=false'
+ '&pagenum=UNDEFINED'
+ '&horizontal_alignment=LEFT'
+ '&gridlines=false'
+ "&fmcmd=12"
+ '&fzr=FALSE'
+ '&gid=' + temp.getSheetId();
const res = UrlFetchApp.fetch(url, { headers: { authorization: "Bearer " + ScriptApp.getOAuthToken() } });
return "data:application/pdf;base64," + Utilities.base64Encode(res.getContent());
}
// Please run this function.
function main() {
const ss = SpreadsheetApp.getActiveSpreadsheet();
const temp = getActiveRange_(ss, "#000000");
const base64 = getPDF_(ss, temp);
let htmltext = HtmlService.createTemplateFromFile('index').evaluate().getContent(); // let, because it is reassigned on the next line
htmltext = htmltext.replace(/IMPORT_PDF_URL/m, base64);
const html = HtmlService.createTemplate(htmltext).evaluate().setSandboxMode(HtmlService.SandboxMode.NATIVE);
SpreadsheetApp.getUi().showModalDialog(html, 'sample');
ss.deleteSheet(temp);
}
function saveFile(data) {
const blob = Utilities.newBlob(Utilities.base64Decode(data), MimeType.PNG, "sample.png");
return DriveApp.createFile(blob).getId();
}
HTML & Javascript side: index.html
Here, I used the Javascript library CropImageByBorder_js to do the image processing.
<script src="//mozilla.github.io/pdf.js/build/pdf.js"></script>
<script src="https://cdn.jsdelivr.net/gh/tanaikech/CropImageByBorder_js@latest/cropImageByBorder_js.min.js"></script>
<canvas id="canvas"></canvas>
<script>
var pdfjsLib = window['pdfjs-dist/build/pdf'];
pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.js';
const base64 = 'IMPORT_PDF_URL'; // Loading the PDF from a data URL
const cvs = document.getElementById("canvas");
pdfjsLib.getDocument(base64).promise.then(pdf => {
const {numPages} = pdf;
if (numPages > 1) {
throw new Error("Sorry. In the current stage, this sample script can be used for one page of PDF data. So, please change the selected range to smaller.")
}
pdf.getPage(1).then(page => {
const viewport = page.getViewport({scale: 2});
cvs.height = viewport.height;
cvs.width = viewport.width;
const ctx = cvs.getContext('2d');
const renderContext = { canvasContext: ctx, viewport: viewport };
page.render(renderContext).promise.then(async function() {
const obj = { borderColor: "#000000", base64Data: cvs.toDataURL() };
const base64 = await CropImageByBorder.getInnerImage(obj).catch(err => console.log(err));
const img = new Image();
img.src = base64;
img.onload = function () {
cvs.width = img.naturalWidth;
cvs.height = img.naturalHeight;
ctx.drawImage(img, 0, 0);
}
google.script.run.withSuccessHandler(id => console.log(id)).saveFile(base64.split(",").pop());
});
});
});
</script>
3. Testing
When you test this script, please select the cells and run main(). The selected cells are then exported as an image (PNG) to the root folder, as in the demonstration above.
4. Flow.
In this sample script, the following flow is used.
Manually select the cells, and run main().
In the script, the selected cells, enclosed by a single extra row and column on each side, are copied to a temporary sheet.
Export the temporary sheet as PDF data encoded as base64. The PDF data is then sent to the Javascript side.
Convert the first page of the PDF data to an image using PDF.js.
Crop the selected cells using CropImageByBorder_js, and return the resulting image to the Google Apps Script side.
Save the image as a file to Google Drive.
LIMITATION:
This sample script assumes that the selected range fits on one PDF page. So, when you select a large range that spans two or more PDF pages, this script unfortunately cannot be used. Please be careful about this.
Also, Javascript is used in a dialog here. So, to use this sample script, you have to open the Spreadsheet, select the cells, and run the script.
Note:
In the script you showed, the Spreadsheet has to be publicly shared in order to use the created PDF data with PDF.js. But it seems that PDF.js can use a data URL directly. So in this sample script, the created PDF is passed as a data URL (base64), and the Spreadsheet does not need to be publicly shared.
References:
PDF.js
CropImageByBorder_js

Javascript canvas draw line strange behavior using algorithm

There are plenty of examples on how to draw lines on canvas, in js.
But for purely educational purposes I want to draw a line using an algorithm. Basically, the method gets two Vector2 points, finds the midpoint between them, and then continues recursively until a minimum distance of 2 pixels is reached.
I have a DrawPoint method that draws one point on the canvas, and a DrawLine method that does all the work.
For now I have 2 problems:
1: The points are not colored red, as they should be.
2: It doesn't look like a line.
For Vector2 I used the "Victor.js" plugin, and it seems to be working well.
This is the code I have:
JS:
var point2 = new Victor(100, 100);
var point3 = new Victor(150, 150);
DrawLine(point2, point3);
function DrawLine(vec0, vec1)
{
var point0 = new Victor(vec0.x, vec0.y);
var point1 = new Victor(vec1.x, vec1.y);
var dist = point1.distance(point0);
if (dist < 2)
return;
//this is how it should look like in c# var middlePoint = point0 + (point1 - point0)/2; But looks like i cant just divide by 2 using victor js because i can only divide vector by vector.
var middlePoint = point0.add(point1.subtract(point0).divide(new Victor(2,2)));
DrawPoint(middlePoint);
DrawLine(point0, middlePoint);
DrawLine(middlePoint, point1);
}
function DrawPoint(point){
var c = document.getElementById("screen");
var ctx = c.getContext("2d");
ctx.fillStyle = "FF0000";
ctx.fillRect(point.x, point.y, 3,1);
}
I really appreciate any help you can provide.

The victor.js documentation shows that most functions of Victors do not return new Victors, but operate on the current instance. In a way, v1.add(v2) is semantically more like v1 += v2 and not v1 + v2.
The problem is with calculating the midpoint. You could use the mix() method, which blends two vectors with a weight. You must clone() the Victor first, otherwise point0 will be modified:
var middlePoint = point0.clone().mix(point1, 0.5);
If you don't change the original Vectors, you don't need to create new instances of Victors from the arguments, you can use the arguments directly:
function DrawLine(point0, point1)
{
var dist = point1.distance(point0);
if (dist < 2) return;
var middlePoint = point0.clone().mix(point1, 0.5);
DrawPoint(middlePoint);
DrawLine(point0, middlePoint);
DrawLine(middlePoint, point1);
}
Finally, as Sven the Surfer has already said in a comment, "FF0000" isn't a valid colour. Use "#FF0000", note the hash mark, or one of the named web colours such as "crimson".
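The mutation pitfall and the clone-first fix can be reproduced without a canvas or Victor.js. The sketch below uses a hypothetical `Vec` class whose methods change the instance in place, mirroring the behaviour described above; it is not Victor.js itself, and it collects the midpoints in an array instead of drawing them.

```javascript
// Hypothetical stand-in for a vector type with in-place (mutating) methods.
class Vec {
  constructor(x, y) { this.x = x; this.y = y; }
  clone() { return new Vec(this.x, this.y); }
  add(v) { this.x += v.x; this.y += v.y; return this; }
  subtract(v) { this.x -= v.x; this.y -= v.y; return this; }
  scale(s) { this.x *= s; this.y *= s; return this; }
  distance(v) { return Math.hypot(this.x - v.x, this.y - v.y); }
}

// a + (b - a) / 2, computed on a clone so neither argument is changed.
function midpoint(a, b) {
  return b.clone().subtract(a).scale(0.5).add(a);
}

// Same recursion as DrawLine, but the points are collected rather than drawn.
function collectLinePoints(p0, p1, out = []) {
  if (p1.distance(p0) < 2) return out;
  const mid = midpoint(p0, p1);
  out.push(mid);
  collectLinePoints(p0, mid, out);
  collectLinePoints(mid, p1, out);
  return out;
}

const start = new Vec(100, 100);
const end = new Vec(150, 150);
const points = collectLinePoints(start, end);

console.log(points.length);      // 63
console.log(start.x, end.x);     // 100 150 (the endpoints are untouched)
```

Dropping the `clone()` call in `midpoint` would overwrite `b` on every call, which is the same kind of corruption the original `DrawLine` suffers from.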

pdf.js: Get the text colour

I have a simple pdf file, containing the words "Hello world", each in a different colour.
I'm loading the PDF, like this:
PDFJS.getDocument('test.pdf').then( onPDF );
function onPDF( pdf )
{
pdf.getPage( 1 ).then( onPage );
}
function onPage( page )
{
page.getTextContent().then( onText );
}
function onText( text )
{
console.log( JSON.stringify( text ) );
}
And I get a JSON output like this:
{
"items" : [{
"str" : "Hello ",
"dir" : "ltr",
"width" : 29.592,
"height" : 12,
"transform" : [12, 0, 0, 12, 56.8, 774.1],
"fontName" : "g_font_1"
}, {
"str" : "world",
"dir" : "ltr",
"width" : 27.983999999999998,
"height" : 12,
"transform" : [12, 0, 0, 12, 86.5, 774.1],
"fontName" : "g_font_1"
}
],
"styles" : {
"g_font_1" : {
"fontFamily" : "serif",
"ascent" : 0.891,
"descent" : 0.216
}
}
}
However, I've not been able to find a way to determine the colour of each word. When I render it, it renders properly, so I know the information is in there somewhere. Is there somewhere I can access this?

As Respawned alluded to, there is no easy answer that will work in all cases. That being said, here are two approaches which seem to work fairly well. Both having upsides and downsides.
Approach 1
Internally, the getTextContent method uses what's called an EvaluatorPreprocessor to parse the PDF operators and maintain the graphic state. So what we can do is implement a custom EvaluatorPreprocessor, override the preprocessCommand method, and use it to add the current text color to the graphic state. Once this is in place, any time a new text chunk is created, we can add a color attribute and set it to the current color state.
The downsides to this approach are:
Requires modifying the PDFJS source code. It also depends heavily on the current implementation of PDFJS, and could break if this is changed.
It will fail in cases where the text is used as a path to be filled with an image. Some PDF creators (such as Photoshop) create colored text by first building a clipping path from all the given text characters and then painting a solid image over the path. So the only way to deduce the fill color is by reading the pixel values from the image, which would require painting it to a canvas. Even hooking into paintChar won't be of much help here, since the fill color will only emerge at a later time.
The upside is, it's fairly robust and works irrespective of the page background. It also does not require rendering anything to canvas, so it can be done entirely in the background thread.
Code
All the modifications are made in the core/evaluator.js file.
First you must define the custom evaluator, after the EvaluatorPreprocessor definition.
var CustomEvaluatorPreprocessor = (function() {
function CustomEvaluatorPreprocessor(stream, xref, stateManager, resources) {
EvaluatorPreprocessor.call(this, stream, xref, stateManager);
this.resources = resources;
this.xref = xref;
// set initial color state
var state = this.stateManager.state;
state.textRenderingMode = TextRenderingMode.FILL;
state.fillColorSpace = ColorSpace.singletons.gray;
state.fillColor = [0,0,0];
}
CustomEvaluatorPreprocessor.prototype = Object.create(EvaluatorPreprocessor.prototype);
CustomEvaluatorPreprocessor.prototype.preprocessCommand = function(fn, args) {
EvaluatorPreprocessor.prototype.preprocessCommand.call(this, fn, args);
var state = this.stateManager.state;
switch(fn) {
case OPS.setFillColorSpace:
state.fillColorSpace = ColorSpace.parse(args[0], this.xref, this.resources);
break;
case OPS.setFillColor:
var cs = state.fillColorSpace;
state.fillColor = cs.getRgb(args, 0);
break;
case OPS.setFillGray:
state.fillColorSpace = ColorSpace.singletons.gray;
state.fillColor = ColorSpace.singletons.gray.getRgb(args, 0);
break;
case OPS.setFillCMYKColor:
state.fillColorSpace = ColorSpace.singletons.cmyk;
state.fillColor = ColorSpace.singletons.cmyk.getRgb(args, 0);
break;
case OPS.setFillRGBColor:
state.fillColorSpace = ColorSpace.singletons.rgb;
state.fillColor = ColorSpace.singletons.rgb.getRgb(args, 0);
break;
}
};
return CustomEvaluatorPreprocessor;
})();
Next, you need to modify the getTextContent method to use the new evaluator:
var preprocessor = new CustomEvaluatorPreprocessor(stream, xref, stateManager, resources);
And lastly, in the newTextChunk method, add a color attribute:
color: stateManager.state.fillColor
Approach 2
Another approach would be to extract the text bounding boxes via getTextContent, render the page, and for each text, get the pixel values which reside within its bounds, and take that to be the fill color.
The downsides to this approach are:
The computed text bounding boxes are not always correct, and in some cases may even be off completely (e.g. rotated text). If the bounding box does not at least partially cover the actual text on the canvas, then this method will fail. We can recover from complete failures by checking that the text pixels have a color variance greater than a threshold. The rationale is that if the bounding box is completely background, it will have little variance, in which case we can fall back to a default text color (or maybe even the color of the k nearest neighbors).
The method assumes the text is darker than the background. Otherwise, the background could be mistaken for the fill color. This won't be a problem in most cases, as most docs have white backgrounds.
The upside is, it's simple, and does not require messing with the PDFJS source code. Also, it will work in cases where the text is used as a clipping path and filled with an image. Though this can become hazy when you have complex image fills, in which case the choice of text color becomes ambiguous.
Demo
http://jsfiddle.net/x2rajt5g/
Sample PDFs to test:
https://www.dropbox.com/s/0t5vtu6qqsdm1d4/color-test.pdf?dl=1
https://www.dropbox.com/s/cq0067u80o79o7x/testTextColour.pdf?dl=1
Code
function parseColors(canvasImgData, texts) {
var data = canvasImgData.data,
width = canvasImgData.width,
height = canvasImgData.height,
defaultColor = [0, 0, 0],
minVariance = 20;
texts.forEach(function (t) {
var left = Math.floor(t.transform[4]),
w = Math.round(t.width),
h = Math.round(t.height),
bottom = Math.round(height - t.transform[5]),
top = bottom - h,
start = (left + (top * width)) * 4,
color = [],
best = Infinity,
stat = new ImageStats();
for (var i, v, row = 0; row < h; row++) {
i = start + (row * width * 4);
for (var col = 0; col < w; col++) {
if ((v = data[i] + data[i + 1] + data[i + 2]) < best) { // the darker the "better"
best = v;
color[0] = data[i];
color[1] = data[i + 1];
color[2] = data[i + 2];
}
stat.addPixel(data[i], data[i+1], data[i+2]);
i += 4;
}
}
var stdDev = stat.getStdDev();
t.color = stdDev < minVariance ? defaultColor : color;
});
}
function ImageStats() {
this.pixelCount = 0;
this.pixels = [];
this.rgb = [];
this.mean = 0;
this.stdDev = 0;
}
ImageStats.prototype = {
addPixel: function (r, g, b) {
if (!this.rgb.length) {
this.rgb[0] = r;
this.rgb[1] = g;
this.rgb[2] = b;
} else {
this.rgb[0] += r;
this.rgb[1] += g;
this.rgb[2] += b;
}
this.pixelCount++;
this.pixels.push([r,g,b]);
},
getStdDev: function() {
var mean = [
this.rgb[0] / this.pixelCount,
this.rgb[1] / this.pixelCount,
this.rgb[2] / this.pixelCount
];
var diff = [0,0,0];
this.pixels.forEach(function(p) {
diff[0] += Math.pow(mean[0] - p[0], 2);
diff[1] += Math.pow(mean[1] - p[1], 2);
diff[2] += Math.pow(mean[2] - p[2], 2);
});
diff[0] = Math.sqrt(diff[0] / this.pixelCount);
diff[1] = Math.sqrt(diff[1] / this.pixelCount);
diff[2] = Math.sqrt(diff[2] / this.pixelCount);
return diff[0] + diff[1] + diff[2];
}
};

This question is actually extremely hard if you want to do it to perfection... or it can be relatively easy if you can live with solutions that work only some of the time.
First of all, realize that getTextContent is intended for searchable text extraction and that's all it's intended to do.
It's been suggested in the comments above that you use page.getOperatorList(), but that's basically re-implementing the whole PDF drawing model in your code... which is basically silly because the largest chunk of PDFJS does exactly that... except not for the purpose of text extraction but for the purpose of rendering to canvas. So what you want to do is to hack canvas.js so that instead of just setting its internal knobs it also does some callbacks to your code. Alas, if you go this way, you won't be able to use stock PDFJS, and I rather doubt that your goal of color extraction will be seen as very useful for PDFJS' main purpose, so your changes are likely not going to get accepted upstream, so you'll likely have to maintain your own fork of PDFJS.
After this dire warning, what you'd minimally need to change are the functions where PDFJS has parsed the PDF color operators and sets its own canvas painting color. That happens around line 1566 (of canvas.js) in function setFillColorN. You'll also need to hook the text render... which is really a character renderer at the canvas.js level, namely CanvasGraphics_paintChar around line 1270. With these two hooked, you'll get a stream of callbacks for color changes interspersed between character drawing sequences. So you can reconstruct the color of character sequences reasonably easily from this... in the simple color cases.
And now I'm getting to the really ugly part: the fact that PDF has an extremely complex color model. First there are two colors for drawing anything, including text: a fill color and stroke (outline) color. So far not too scary, but the color is an index in a ColorSpace... of which there are several, RGB being only one possibility. Then there's also alpha and compositing modes, so the layers (of various alphas) can result in a different final color depending on the compositing mode. And the PDFJS has not a single place where it accumulates color from layers.. it simply [over]paints them as they come. So if you only extract the fill color changes and ignore alpha, compositing etc.. it will work but not for complex documents.
Hope this helps.

There's no need to patch pdfjs, the transform property gives the x and y, so you can go through the operator list and find the setFillColor op that precedes the text op at that point.
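A rough sketch of that idea follows. In pdf.js, `page.getOperatorList()` resolves to an object with parallel `fnArray` and `argsArray` arrays, and the operator codes are exposed as `OPS`. To keep the example runnable on its own, the `OPS` table and the operator list below are made-up placeholders: real code would read the codes from `pdfjsLib.OPS`, and the real argument formats (glyph arrays for text, and a colour representation that depends on the pdf.js version) differ from this mock. Only RGB fill operators are tracked here; other colour spaces, stroke colour, and matching text ops to positions are left out.

```javascript
// Walk an operator list and remember the most recent RGB fill colour
// in effect at each text-showing operator.
function fillColorsForText(opList, OPS) {
  let currentFill = [0, 0, 0]; // assume black until a fill colour is set
  const result = [];
  opList.fnArray.forEach((fn, index) => {
    if (fn === OPS.setFillRGBColor) {
      currentFill = Array.from(opList.argsArray[index]);
    } else if (fn === OPS.showText) {
      result.push({ index, fillColor: currentFill });
    }
  });
  return result;
}

// Placeholder operator codes and a mocked operator list (assumptions, not
// real pdf.js output).
const MOCK_OPS = { setFillRGBColor: 1, showText: 2 };
const mockOpList = {
  fnArray: [1, 2, 1, 2],
  argsArray: [[255, 0, 0], ['Hello '], [0, 0, 255], ['world']],
};

console.log(fillColorsForText(mockOpList, MOCK_OPS));
// [ { index: 1, fillColor: [ 255, 0, 0 ] }, { index: 3, fillColor: [ 0, 0, 255 ] } ]
```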

Getting the size of text without using the boundingBox function in Raphael javascript library

I am trying to create some buttons with text in JavaScript using the Raphael library. I would like to know the size of the styled text before drawing it, so that I can create a proper background (the button). I would also like to avoid drawing the text outside of the canvas/paper (the position of the text is the position of its center).
I can use Raphael's getBBox() method, but I have to create (draw) the text in the first place to do this. So I draw the text to get its size in order to draw it at the right position. This is ugly. So I am searching for a general method to estimate the metrics of styled text for a given font, font size, family ...
It is possible to do this using the HTML5 canvas http://www.html5canvastutorials.com/tutorials/html5-canvas-text-metrics/ but Raphael does not use canvas. Is there any way to get the text metrics using Raphael or plain JavaScript?

There are a couple of ways to slice up this cat. Two obvious ones come readily to mind.
Out-of-view Ruler Technique
This one's the easiest. Just create a text element outside of the canvas's viewBox, populate it with your font information and text, and measure it.
// set it up -- put the variable somewhere globally or contextually accessible, as necessary
var textRuler = paper.text( -10000, -10000, '' ).attr( { fill: 'none', stroke: 'none' } );
function getTextWidth( text, fontFamily, fontSize )
{
textRuler.attr( { text: text, 'font-family': fontFamily, 'font-size': fontSize } );
var bbox = textRuler.getBBox();
return bbox.width;
}
It's not elegant by any stretch of the imagination. But it will do what you want with relatively little overhead and no complexity.
Cufonized Font
If you were willing to consider using a cufonized font, you could calculate the size of a given text string without needing to mess with the DOM at all. In fact, this is probably approximately what the canvas element's measureText method does behind the scenes. Given an imported font, you would simply do something like this (consider this protocode!)
// font should be the result of a call to paper.getFont for a cufonized font
function getCufonWidth( text, font, fontSize )
{
var textLength = text.length, cufonWidth = 0;
for ( var i = 0; i < textLength; i++ )
{
var thisChar = text[i];
if ( ! font.glyphs[thisChar] || ! font.glyphs[thisChar].w )
continue; // skip missing glyphs and/or 0-width entities
cufonWidth += font.glyphs[thisChar].w / font.face['units-per-em'] * fontSize;
}
return cufonWidth;
}
I really like working with cufonized fonts -- in terms of their capacity for being animated in interesting ways, they are far more useful than text. But this second approach may be more additional complexity than you need or want.

I know it seems sloppy, but you should have no problem drawing the text, measuring it, then drawing the button and moving the label. To be safe, just draw it off the screen:
var paper = Raphael(0, 0, 500, 500);
var text = paper.text(-100, -100, "My name is Chris");
//outputs 80 12
console.log(text.getBBox().width, text.getBBox().height);
If this REALLY offends your sensibilities, however -- and I would understand! -- you can easily generate an object to remember the width of each character for a given font:
var paper = Raphael(0, 0, 500, 500),
alphabet = "abcdefghijklmnopqrstuvwxyz",
font = "Arial",
charLengths = {},
ascii_lower_bound = 32,
ascii_upper_bound = 126;
document.getElementById("widths").style.fontFamily = font;
for (var c = ascii_lower_bound; c <= ascii_upper_bound; c += 1) {
var letter = String.fromCharCode(c);
var L = paper.text(-50, -50, letter).attr("font-family", font);
charLengths[letter] = L.getBBox().width;
}
//output
for (var key in charLengths) if (charLengths.hasOwnProperty(key)) {
var row = document.createElement("tr");
row.innerHTML = "<td>" + key + "</td><td>" + charLengths[key] + "</td>";
document.getElementById("widths").appendChild(row);
}
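Once such a table exists, a string's width can be estimated by summing the per-character entries. The sketch below is an assumption about how the table might be used, not part of the answer above: `estimateTextWidth` and the sample widths are hypothetical, and a plain sum ignores kerning, so treat the result as an approximation.

```javascript
// Sum per-character widths from a lookup table such as charLengths.
// Characters missing from the table contribute `fallbackWidth`.
function estimateTextWidth(text, widths, fallbackWidth = 0) {
  let total = 0;
  for (const ch of text) {
    const known = Object.prototype.hasOwnProperty.call(widths, ch);
    total += known ? widths[ch] : fallbackWidth;
  }
  return total;
}

// Made-up widths in pixels, standing in for measured values.
const sampleWidths = { M: 10, y: 6, ' ': 3 };

console.log(estimateTextWidth('My My', sampleWidths)); // 35
```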

Shadow map appearing on wrong place

I'm trying to make use of the built-in shadow map plugin in three.js. After some initial difficulties I have a more or less acceptable image, with one last glitch: a shadow appears on top of some (all?) surfaces with normal 0,0,1. Below are pictures of the same model.
Three.js
Preview.app (Mac)
And the code used to setup shadows:
var shadowLight = new THREE.DirectionalLight(0xFFFFFF);
shadowLight.position.x = cx + dmax/2;
shadowLight.position.y = cy - dmax/2;
shadowLight.position.z = dmax*1.5;
shadowLight.lookAt(new THREE.Vector3(cx, cy, 0));
shadowLight.target.position.set(cx, cy, 0);
shadowLight.castShadow = true;
shadowLight.onlyShadow = true;
shadowLight.shadowCameraNear = dmax;
shadowLight.shadowCameraFar = dmax*2;
shadowLight.shadowCameraLeft = -dmax/2;
shadowLight.shadowCameraRight = dmax/2;
shadowLight.shadowCameraBottom = -dmax/2;
shadowLight.shadowCameraTop = dmax/2;
shadowLight.shadowBias = 0.005;
shadowLight.shadowDarkness = 0.3;
shadowLight.shadowMapWidth = 2048;
shadowLight.shadowMapHeight = 2048;
// shadowLight.shadowCameraVisible = true;
scene.add(shadowLight);
UPDATE: And a live example over here: http://jsbin.com/okobum/1/edit

Your code looks fine. You just need to play with the shadowLight.shadowBias parameter. This is always a bit tricky. (Note that the bias parameter can be negative.)
EDIT: Tighten up your shadow-camera near and far planes. This will help reduce both shadow acne and peter-panning. For example, in your live link, set shadowLight.shadowCameraNear = 3*dmax;. This worked for me.
You can also try adding depth to your table tops, if it's not already there.
You can try setting renderer.shadowMapCullFrontFaces = false. This will cull back faces instead of front ones.
