How to include tesseract library in node-gyp build process - javascript

I'm trying to create a simple Node addon with the Tesseract library as a dependency, but I'm a C++ beginner.
Whole code at: https://github.com/q-nick/node-tesseract
binding.cc:
#include <node.h>
#include <v8.h>
// #include <tesseract/baseapi.h>
// #include <leptonica/allheaders.h>
void Method(const v8::FunctionCallbackInfo<v8::Value>& args) {
v8::Isolate* isolate = args.GetIsolate();
args.GetReturnValue().Set(v8::String::NewFromUtf8(isolate, "world"));
}
void init(v8::Local<v8::Object> exports) {
NODE_SET_METHOD(exports, "hello", Method);
}
NODE_MODULE(NODE_GYP_MODULE_NAME, init)
binding.gyp:
{
"targets": [
{
"target_name": "binding",
"sources": [
"src/binding.cc"
],
'defines': [ 'V8_DEPRECATION_WARNINGS=1' ],
'include_dirs': [
],
'libraries': [
# '-lpvt.cppan.demo.google.tesseract.libtesseract',
# '-lleptonica'
]
}
]
}
I found a project that could help me compile dependencies such as Tesseract and Leptonica: https://cppan.org/
Unfortunately, I can't figure out how to connect it to the node-gyp build process. CPPAN has a single config file named cppan.yml (similar to package.json in npm).
cppan.yml:
dependencies:
pvt.cppan.demo.google.tesseract.libtesseract: master
pvt.cppan.demo.danbloomberg.leptonica: 1
I want to build my Node addon and all its dependencies (like Tesseract) with one command, but I don't know how to link C++ dependencies in a node-gyp build.
I want to use the latest Tesseract version, so I can't use pre-compiled libraries. Currently I'm working on Windows, but I want the process to be cross-platform.
My example GitHub project (https://github.com/q-nick/node-tesseract) must compile successfully after uncommenting the Tesseract includes.
If there is some other easy way to accomplish this, please share.

I want it too!
The solution is to build all of the Tesseract C++ code (and Leptonica) as dependencies, so the first step is to find out how Tesseract itself is built (which arguments, variables, defines, ...).
Just check this example: https://github.com/istex/popplonode/blob/master/binding.gyp
There is a dependency file for poppler in the lib folder.
It could be good to work together on this!

I will answer my own question.
I found a project, https://github.com/cmake-js/cmake-js, which gives several reasons for moving away from gyp:
...First of all, Google, the creator of the gyp platform is moving towards its new build system called gn, which means gyp's days of support are counted...
I also found: https://github.com/nodejs/nan/
...The goal of this project is to store all logic necessary to develop native Node.js addons without having to inspect NODE_MODULE_VERSION and get yourself into a macro-tangle...
So I gave it a try.
binding.cc:
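#include <nan.h>
#include <baseapi.h>
#include <allheaders.h>
// Define the method before InitAll so that it is declared when referenced.
NAN_METHOD(MyMethod) {
  info.GetReturnValue().Set(Nan::New<v8::String>("world").ToLocalChecked());
}
// Register MyMethod on the module exports under the name "myMethod".
NAN_MODULE_INIT(InitAll) {
  Nan::Set(target, Nan::New<v8::String>("myMethod").ToLocalChecked(),
    Nan::GetFunction(Nan::New<v8::FunctionTemplate>(MyMethod)).ToLocalChecked());
}
NODE_MODULE(addon, InitAll)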
The next step is to create a CMakeLists.txt file with a few modifications. I want to use cppan as the dependency installer, so I have to add some extra lines to the default CMakeLists.txt file:
add_subdirectory(.cppan)
...
target_link_libraries(${PROJECT_NAME} ${CMAKE_JS_LIB}
pvt.cppan.demo.google.tesseract.libtesseract
pvt.cppan.demo.danbloomberg.leptonica
)
CMakeLists.txt:
project(addon)
file(GLOB SOURCE_FILES "src/**/*.cc" "src/**/*.h")
add_library(${PROJECT_NAME} SHARED ${SOURCE_FILES})
add_subdirectory(.cppan)
set_target_properties(${PROJECT_NAME} PROPERTIES PREFIX "" SUFFIX ".node")
target_include_directories(${PROJECT_NAME} PRIVATE ${CMAKE_JS_INC})
target_link_libraries(${PROJECT_NAME} ${CMAKE_JS_LIB}
pvt.cppan.demo.google.tesseract.libtesseract
pvt.cppan.demo.danbloomberg.leptonica
)
cppan.yml
dependencies:
pvt.cppan.demo.google.tesseract.libtesseract: master
pvt.cppan.demo.danbloomberg.leptonica: 1
Now everything is set up and we can run the install and build commands:
cppan
and
cmake-js build
Good luck!
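Once the build succeeds, the addon can be loaded from JavaScript. The following is a minimal sketch, assuming cmake-js placed the binary at build/Release/addon.node (the file name follows from project(addon) above; the output directory may differ depending on platform and configuration). loadAddon is a hypothetical helper written for this example and is not part of cmake-js or nan.

```javascript
const path = require("path");

// Hypothetical helper: try each candidate path and return the first addon
// that loads, or null when none of the files exists yet.
function loadAddon(candidates) {
  for (const candidate of candidates) {
    try {
      return require(candidate);
    } catch (err) {
      // Only swallow "file is not there". A binary that exists but fails to
      // load (for example because of a missing shared library) should surface.
      if (err.code !== "MODULE_NOT_FOUND") throw err;
    }
  }
  return null;
}

// Assumed output locations; adjust to wherever your build puts the binary.
const addon = loadAddon([
  path.resolve("build/Release/addon.node"),
  path.resolve("build/Debug/addon.node"),
]);

console.log(
  addon ? addon.myMethod() : "addon not built yet; run cppan and cmake-js build first"
);
```

Rethrowing everything except MODULE_NOT_FOUND keeps real load failures visible instead of hiding them behind the "not built yet" message.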

Related

How to use the .js and .wasm artifacts emitted by Emscripten and Embind?

I have two fairly simple C++ class definitions and their interfaces, uuid.{hpp,cpp} and uuid_util.{hpp,cpp}, and I have one more file uuid_bind.cpp with #include <emscripten/bind.h> to bind the C++ classes, function and static function definitions to JavaScript.
The two classes are first built as a static library uuid_lib.a, which is then linked against the latter C++ source file and built with em++ --bind -o uuid_module.js uuid_bind.cpp uuid_lib.a (which CMake generates) to generate uuid_module.js and uuid_module.wasm. Now, what do I do with these?
The documentation on Embind is somewhat sparse, and only says
The resulting quick_example.js file can be loaded as a node module or via a <script> tag:
...
I found this Google tutorial on combining Emscripten/Embind and node.js, and I have replicated it as much as possible (excluding the bit on Docker, as my Linux distribution serves Emscripten directly). I have both an index.html and a package.json file, and npm test launches http-server, which I run from Chrome.
I was under the impression that Emscripten/Embind would simply serve as a translation layer for any bound classes, functions (static or otherwise), variables, and would be able to be called straightforwardly from JavaScript, but it turns out this is not the case. Am I missing out on something here? I'm not very familiar with JS. All I want to do in index.js is something like:
index.js
import uuid_module from './uuid_lib.js';
console.log(uuid_module.uuid_util.generate_type1_uuid("BLA"));
// ... other calls
and run this with node index.js which would print a UUID string to the console.
For background, I've provided my CMakeLists.txt and uuid_bind.cpp below.
CMakeLists.txt
cmake_minimum_required(VERSION 3.16)
# Set project file
project(
uuid_generator
VERSION 1.0
DESCRIPTION "A UUID generator"
LANGUAGES CXX)
# Export compile commands for clangd
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
# Add Emscripten CMake toolchain file
if(EMSCRIPTEN)
message("Using Emscripten.")
set(CMAKE_TOOLCHAIN_FILE
${PROJECT_SOURCE_DIR}/deps/emsdk/upstream/emscripten/cmake/Modules/Platform/Emscripten.cmake)
else()
message("Using system compilers.")
endif(EMSCRIPTEN)
# C++20 guaranteed, no extensions eg. GNU/MSVC
set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)
# Strict compilation for MSVC or GCC/Clang
if(MSVC)
add_compile_options(/W4 /WX)
else()
add_compile_options(-Wall -Wextra -pedantic)
endif()
# Add uuid_lib as a static library
add_library(uuid_lib STATIC src/uuid.cpp src/uuid_util.cpp)
# Set C++ library properties: output name, location, etc
set_target_properties(
uuid_lib
PROPERTIES OUTPUT_NAME uuid_lib
PREFIX ""
ARCHIVE_OUTPUT_DIRECTORY ${PROJECT_SOURCE_DIR}/dist)
# Set JS binding application properties: set bind, output type, etc. Runs only if Emscripten is enabled.
if(DEFINED ENV{EMSDK})
add_executable(uuid_module src/uuid_bind.cpp)
set_target_properties(
uuid_module
PROPERTIES OUTPUT_NAME uuid_module
SUFFIX ".js"
RUNTIME_OUTPUT_DIRECTORY ${PROJECT_SOURCE_DIR}/dist)
target_include_directories(uuid_module
PRIVATE ${PROJECT_SOURCE_DIR}/deps/emsdk/upstream/emscripten/system/include)
target_link_libraries(uuid_module uuid_lib)
target_link_options(
uuid_module
PUBLIC
$<$<CONFIG:DEBUG>:
-v
--bind
-sEXCEPTION_DEBUG=1
-sDEMANGLE_SUPPORT=1
-sDISABLE_EXCEPTION_CATCHING=1
-sWASM_BIGINT=1
-sMODULARIZE=1
-sEXPORT_ES6=1
-sEXPORT_NAME=uuid_module
>)
endif(DEFINED ENV{EMSDK})
uuid_bind.cpp
#include "uuid.hpp"
#include "uuid_util.hpp"
#include <cstdint>
#include <emscripten.h>
#include <emscripten/bind.h>
#include <string>
#include <iostream>
EMSCRIPTEN_BINDINGS(uuid_generator)
{
emscripten::constant("UINT29_MAX", uuid_gen::uuid::UINT29_MAX);
emscripten::constant("ICAO_ADDRESS_MAX", uuid_gen::uuid_util::ICAO_ADDRESS_MAX);
emscripten::constant("MAC_ADDRESS_MAX", uuid_gen::uuid_util::MAC_ADDRESS_MAX);
emscripten::constant("RANDOM_NAMESPACE_MAX", uuid_gen::uuid_util::RANDOM_NAMESPACE_MAX);
emscripten::class_<uuid_gen::uuid>("uuid")
.property("namespace_type", &uuid_gen::uuid::get_namespace_type)
.property("namespace_value", &uuid_gen::uuid::get_namespace_value)
.property("namespace_string", &uuid_gen::uuid::get_namespace_string)
.property("to_string", &uuid_gen::uuid::to_string)
.class_function("uniform_int_distr", &uuid_gen::uuid::uniform_int_dist);
emscripten::class_<uuid_gen::uuid_util>("uuid_util")
.class_function("generate_type1_uuid", &uuid_gen::uuid_util::generate_type1_uuid)
.class_function("generate_type2_uuid", &uuid_gen::uuid_util::generate_type2_uuid)
.class_function("generate_type3_uuid", &uuid_gen::uuid_util::generate_type3_uuid)
.class_function("generate_type4_uuid", &uuid_gen::uuid_util::generate_type4_uuid)
.class_function("generate_type5_uuid_str", emscripten::select_overload<uuid_gen::uuid(const std::string &)>(
&uuid_gen::uuid_util::generate_type5_uuid))
.class_function("generate_type5_uuid_int", emscripten::select_overload<uuid_gen::uuid(std::uint32_t)>(
&uuid_gen::uuid_util::generate_type5_uuid))
.class_function("generate_type6_uuid_str", emscripten::select_overload<uuid_gen::uuid(const std::string &)>(
&uuid_gen::uuid_util::generate_type6_uuid))
.class_function("generate_type6_uuid_int", emscripten::select_overload<uuid_gen::uuid(std::uint64_t)>(
&uuid_gen::uuid_util::generate_type6_uuid))
.class_function("generate_type7_uuid", &uuid_gen::uuid_util::generate_type7_uuid)
.class_function("from_string", &uuid_gen::uuid_util::from_string)
.class_function("is_valid_oper_agency", &uuid_gen::uuid_util::is_valid_oper_agency)
.class_function("is_valid_loc_atm_id", &uuid_gen::uuid_util::is_valid_loc_atm_id)
.class_function("is_valid_mac_address_string", &uuid_gen::uuid_util::is_valid_mac_address_string)
.class_function("namespace_char_encode", &uuid_gen::uuid_util::namespace_char_encode)
.class_function("namespace_char_decode", &uuid_gen::uuid_util::namespace_char_decode)
.class_function("namespace_encode", &uuid_gen::uuid_util::namespace_encode)
.class_function("namespace_decode", &uuid_gen::uuid_util::namespace_decode)
.class_function("mac_address_encode", &uuid_gen::uuid_util::mac_address_encode)
.class_function("mac_address_decode", &uuid_gen::uuid_util::mac_address_decode);
};
Am I missing out on something here?
What you're missing is that in -s MODULARIZE=1 mode, the default export is a factory, not the module object itself.
You'll need to create a module instance first. In current Emscripten versions the factory returns a Promise, so await it before accessing the exposed properties:
import uuidModuleFactory from './uuid_lib.js';
const uuid_module = await uuidModuleFactory(/* optional Emscripten config goes here */);
console.log(uuid_module.uuid_util.generate_type1_uuid("BLA"));
// ... other calls
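The factory pattern can be tried without a wasm build. The sketch below uses a hypothetical stand-in, fakeUuidModuleFactory, in place of the Emscripten-generated factory; it assumes the real factory resolves to the module object asynchronously, as recent Emscripten releases do in MODULARIZE mode. The stub return value is made up and is not what the real generate_type1_uuid produces.

```javascript
// Hypothetical stand-in for the factory exported by the generated .js file.
// It resolves to an object shaped like the bound module.
function fakeUuidModuleFactory(config = {}) {
  return Promise.resolve({
    uuid_util: {
      // Stub only; the real binding calls into the compiled C++ code.
      generate_type1_uuid: (ns) => `stub-uuid-for-${ns}`,
    },
  });
}

// Instantiate the module first, then call the bound class function.
async function main(factory) {
  const uuidModule = await factory(/* optional Emscripten config */);
  return uuidModule.uuid_util.generate_type1_uuid("BLA");
}

main(fakeUuidModuleFactory).then((id) => console.log(id));
```

Passing the factory into main makes it easy to swap the stub for the real default export once the build output is available.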

How to debug Rollup build output with a massive encoding log?

I recently built a library with Rollup that has a few unusual bits, for instance loading a wasm module, workers with importScripts, and a few occurrences of eval() in the global scope.
Now I used the rollup-starter-app to create a demonstrator and client app for that library. The repo is https://github.com/frantic0/sema-engine-rollup
I managed to get everything working, after hitting a few walls and adding the following rollup plugins
import { wasm } from "@rollup/plugin-wasm";
import workerLoader from "rollup-plugin-web-worker-loader";
import dynamicImportVars from "@rollup/plugin-dynamic-import-vars";
import copy from "rollup-plugin-copy";
However, in the build output, I'm getting this massive log of what seems to be some encoding...
I'm not sure where this log is coming from and it is so massive that it clears out all the information of the build in the terminal...
What is the best way to tackle this issue and how to debug it effectively?
Based on a suggestion from @lukastaegert in the Rollup issues, one solution is to redirect stderr into a file so you can read the log.
To do that, you can append the redirections to your rollup command:
"rollup -cw 2>err.log 1>out.log"
This lets you inspect the build log further, but it doesn't solve the underlying problem.
[EDIT]
After a bit of peeking around Rollup's GitHub issues and source, I found the warning categories and how to deactivate warnings.
Basically, we need to add an onwarn function to rollup.config.js. The first code section below shows the function. The second one shows where to add it in rollup.config.js.
const onwarn = (warning) => {
// Silence warning
if (
warning.code === 'CIRCULAR_DEPENDENCY' ||
warning.code === 'EVAL'
) {
return
}
console.warn(`(!) ${warning.message}`)
}
export default {
inlineDynamicImports: !dynamicImports,
preserveEntrySignatures: false,
onwarn,
input: `src/main.js`,
output: {
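A variant of the handler can forward everything that is not silenced to Rollup's default handler, which Rollup passes as the second argument to onwarn. The sketch below is self-contained so the filtering can be checked without running a build; the record function and the sample warning objects are made up for the demonstration.

```javascript
// Warning codes to drop; everything else goes to the default handler.
const SILENCED_CODES = new Set(["CIRCULAR_DEPENDENCY", "EVAL"]);

function onwarn(warning, defaultHandler) {
  if (SILENCED_CODES.has(warning.code)) return;
  defaultHandler(warning);
}

// Demonstration with a fake default handler that records what gets through.
const forwarded = [];
const record = (warning) => forwarded.push(warning.code);

onwarn({ code: "EVAL", message: "Use of eval is discouraged" }, record);
onwarn({ code: "CIRCULAR_DEPENDENCY", message: "a.js -> b.js -> a.js" }, record);
onwarn({ code: "UNRESOLVED_IMPORT", message: "could not resolve 'x'" }, record);

console.log(forwarded);
```

Keeping the default handler for the remaining warnings preserves Rollup's own formatting, so only the noisy categories disappear from the terminal.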

Resolving dynamic module with Webpack and Laravel Mix

Good time of the day,
Recently I've been trying to implement dynamic module loading for my project, but I've been failing for the past few hours. To give you an idea of what I'm trying to achieve, here is the structure of the project:
plugins
  developer
    assets
      scss
        developer.scss
      js
        developer.js
themes
  theme_name
    webpack.mix.js
    node_modules/
    source
      js
        application.js
        bootstrap.js
      scss
        application.scss
        _variables.scss
So, in order to get the available plugins, I've made the following function
/**
* Get all plugins for specified developer
* which have 'assets' folder
* @param developerPath
* @param plugins
*/
function getDeveloperPlugins(developerPath, plugins) {
if (fs.existsSync(developerPath)) {
fs.readdirSync(developerPath).forEach(entry => {
let pluginPath = path.resolve(developerPath, entry),
assetsPath = path.resolve(pluginPath, 'assets');
if (fs.existsSync(assetsPath))
plugins[entry] = assetsPath;
});
}
}
This function scans all the available plugins for the specified developer and, for each plugin that has an assets folder, records that folder's path in plugins so we can work with it later.
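The scan can be tried in isolation. The sketch below is a variant that returns a new object instead of filling one passed in, and it runs against a throwaway directory tree; the directory names (withassets, noassets) are made up for the demonstration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Return { pluginName: assetsPath } for every plugin directory under
// developerPath that contains an "assets" folder.
function getDeveloperPlugins(developerPath) {
  const plugins = {};
  if (!fs.existsSync(developerPath)) return plugins;
  for (const entry of fs.readdirSync(developerPath, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const assetsPath = path.join(developerPath, entry.name, "assets");
    if (fs.existsSync(assetsPath)) plugins[entry.name] = assetsPath;
  }
  return plugins;
}

// Demo on a temporary tree: one plugin with assets, one without.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "plugins-"));
fs.mkdirSync(path.join(root, "withassets", "assets"), { recursive: true });
fs.mkdirSync(path.join(root, "noassets"), { recursive: true });

const found = getDeveloperPlugins(root);
console.log(Object.keys(found));
```

Returning the object rather than mutating an argument makes the function easier to test, at the cost of merging results yourself when scanning several developers.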
The next step is to generate the reference for every plugin (direct path to the developer_name.js file) which later should be 'mixed' into one plugins.bundle.js file.
In order to achieve this, the following piece of code 'emerged'
_.forEach(plugins, (directory, plugin) => {
let jsFolder = path.resolve(directory, 'js'),
scssFolder = path.resolve(directory, 'scss');
if (fs.existsSync(jsFolder)) {
webpackModules.push(jsFolder);
let possibleFile = path.resolve(jsFolder, plugin + '.js');
if (fs.existsSync(possibleFile))
pluginsBundle.js[plugin] = possibleFile;
}
if (fs.existsSync(scssFolder)) {
webpackModules.push(scssFolder);
let possibleFile = path.resolve(scssFolder, plugin + '.scss');
if (fs.existsSync(possibleFile))
pluginsBundle.scss[plugin] = possibleFile;
}
});
And the last step before I start editing the Webpack configuration is to get the paths of both the scss and js files for all plugins and all developers:
let jsPluginsBundle = _.values(pluginsBundle.js),
scssPluginsBundle = _.values(pluginsBundle.scss);
And here is where the problems start to appear. I've tried many solutions offered either here or on GitHub (in the respective repositories), but I've failed every time.
The only error I'm having now is this one:
ERROR in F:/Web/Projects/TestProject/plugins/developer/testplugin/assets/js/testplugin.js
Module build failed: ReferenceError: Unknown plugin "transform-object-rest-spread" specified in "base" at 0, attempted to resolve relative to "F:\\Web\\Projects\\TestProject\\plugins\\developer\\testplugin\\assets\\js"
Yes, I know that the webpack.mix.js file should be in the root folder of the project; however, I'm just developing a theme, which uses modules developed by other members of the team.
So, the idea was to:
Start build process: npm run dev|prod
Load plugins for all needed developers automatically
Use methods and html tags provided by the plugin (it is a mix of PHP for API routing and Vue.js for Components, etc) as follows: <test-component></test-component>
Any help is really appreciated; I just can't get my head around that error. If you need extra information, I'm happy to provide it, since I need help solving this issue myself =)
Update: The latest Webpack config used by mix.webpackConfig() (still failing though)
let webpackConfiguration = {
module: {
rules: [{
test: /\.js$/,
exclude: /(node_modules|bower_components)/,
use: {
loader: require.resolve('babel-loader'),
options: {
presets: [
'babel-preset-env'
].map(require.resolve),
plugins: [
'babel-plugin-transform-object-rest-spread'
].map(require.resolve)
}
}
}]
},
resolve: {
modules: webpackModules
}
};
mix.webpackConfig(webpackConfiguration);
And this is the content of the webpackModules variable:
[
'F:\\Web\\Projects\\TestProject\\themes\\testtheme\\node_modules',
'F:\\Web\\Projects\\TestProject\\themes\\testtheme',
'F:\\Web\\Projects\\TestProject\\plugins\\developer\\testplugin\\assets\\js',
'F:\\Web\\Projects\\TestProject\\plugins\\developer\\testplugin\\assets\\scss'
]
Okay, after 7 hours I decided to try the most obvious way to solve the problem: create a node_modules folder in the root of the project and install laravel-mix there. It worked like a charm.
It looks like, when a module can't be found for a file in a directory outside Webpack's root scope, resolution walks up the directory tree from that file looking for a node_modules folder.
The developers should let us set the root folder from which Webpack fetches all modules, I guess, but the problem is solved anyway.
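The behaviour described above matches a Node-style lookup that starts at the importing file's directory and checks a node_modules folder at every level up to the filesystem root. The sketch below is a simplified illustration of that walk, not the resolver's actual implementation; it uses a shortened POSIX-style version of the plugin path from the error message, and that Babel's plugin lookup followed exactly this scheme in the version used here is an assumption.

```javascript
const path = require("path");

// List the node_modules directories a Node-style lookup would try,
// starting at startDir and walking up to the filesystem root.
function candidateNodeModulesDirs(startDir) {
  const candidates = [];
  let dir = startDir;
  while (true) {
    candidates.push(path.posix.join(dir, "node_modules"));
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the root
    dir = parent;
  }
  return candidates;
}

const dirs = candidateNodeModulesDirs(
  "/TestProject/plugins/developer/testplugin/assets/js"
);
console.log(dirs.join("\n"));
```

The list contains /TestProject/node_modules but never /TestProject/themes/testtheme/node_modules, which is consistent with the fix: packages installed only inside the theme are invisible to files under plugins, while packages installed in the project root are found.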

Cannot call unknown function

I'm trying to compile a medium-size existing code base with emscripten. Everything currently compiles, but when I try to call it from javascript I'm getting the error:
Assertion failed: Cannot call unknown function InitHOG (perhaps LLVM optimizations or closure removed it?)
I've declared this as:
extern "C" {
void EMSCRIPTEN_KEEPALIVE InitHOG()
{ ... }
}
I'm linking the function from javascript with:
InitHog = Module.cwrap('InitHOG', 'void', []);
My code base is being compiled into libraries; the function call into the library is in my guihtml library, where the final linking command is:
emcc -o ../../../../html/debug/bidirnecessary.js ../../../../objs_html/bidirnecessary.js/debug/demos/bidirnecessary/Driver.o -lenvironments -lmapalgorithms -lalgorithms -lgraphalgorithms -lgraph -lutils -lguihtml -L../../../../html/debug -Lapps/libs -Ldemos/libs -lpthread -g
Any ideas on why it can't find my function from javascript?
While the EMSCRIPTEN_KEEPALIVE macro works when you are compiling a single file to .js output, it doesn't work in my makefile setup, where I compile individual files, use emar to make a library, and then link everything together at the end.
Instead, you need to use the -s option to specify which functions you want to export. Something like this works:
emcc -o ../../../../html/debug/bidirnecessary.js ../../../../objs_html/bidirnecessary.js/debug/demos/bidirnecessary/Driver.o -lenvironments -lmapalgorithms -lalgorithms -lgraphalgorithms -lgraph -lutils -lguihtml -lgui -L../../../../html/debug -Lapps/libs -Ldemos/libs -lpthread -g -s EXPORTED_FUNCTIONS="['_InitHOG', '_DoFrame', '_MouseEvent']"
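Note that each name in EXPORTED_FUNCTIONS is the C function name with a leading underscore, while cwrap takes the name without it. If the list grows, the flag can be generated from the plain C names. The helper below, exportedFunctionsFlag, is a hypothetical convenience written for this example and is not part of Emscripten.

```javascript
// Build the value for emcc's -s option from plain C function names.
// Emscripten expects each exported symbol to carry a leading underscore.
function exportedFunctionsFlag(cNames) {
  const symbols = cNames.map((name) => `'_${name}'`).join(", ");
  return `EXPORTED_FUNCTIONS=[${symbols}]`;
}

const flag = exportedFunctionsFlag(["InitHOG", "DoFrame", "MouseEvent"]);

// Quote the whole setting so the shell passes it to emcc as one argument.
console.log(`-s "${flag}"`);
```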

swig - c++ to javascript

I'm trying to build a simple JavaScript module from my C++ files using SWIG. I ran all the right commands, but it seems like nothing is working.
This is my .h file:
#pragma once
class Die
{
public:
Die();
Die(int a);
~Die();
int foo(int a) ;
Die* getDie(int a);
int myVar;
};
my .cpp file:
#include <iostream>
#include "example.h"
int Die::foo(int a) {
std::cout << "foo: running fact from simple_ex" <<std::endl;
return 1;
}
Die::Die(){}
Die::Die(int a){myVar = a;}
Die::~Die(){}
Die* Die::getDie(int a) {
return new Die (a);
}
my .i file:
%module example
%{
#include "example.h"
%}
%include "example.h"
my binding.gyp file:
{
"targets": [
{
"target_name": "example",
"sources": ["example.cpp", "example_wrap.cxx" ]
}
]
}
I followed all the commands from the SWIG docs.
I ran:
sudo apt-get install libv8-dev
sudo apt-get install libjavascriptcoregtk-1.0-dev
swig -c++ -javascript -node example.i
node-gyp configure build
After I run the last command I get all sorts of errors:
error: ‘NewSymbol’ is not a member of ‘v8::String’
and many many more..
Any help will do.
Thanks!
I tried that example to learn this interface myself.
To help others who may stumble upon this, here is an example of how to work with SWIG and JavaScript.
First we write the C++ class and its logic using the object-based approach SWIG teaches us.
#pragma once
class Die
{
public:
Die(int a);
~Die();
int foo(int a);
int myVar;
};
extern "C"
{
Die* getDie(int a);
}
The interesting thing here is that we don't always create a new instance directly; instead we use an external function to hand us a pointer to the class, which can then be used from our JavaScript. This is essentially what SWIG is all about.
Here is the implementation:
#include <iostream>
#include "example.h"
int Die::foo(int a)
{
std::cout << "foo: running fact from simple_ex" << std::endl;
return 1;
}
Die::Die(int a)
{
myVar = a;
}
Die::~Die()
{
}
extern "C"
{
Die* getDie(int a)
{
return new Die(a);
}
}
Here, too, the function that returns the pointer is wrapped in extern "C", which separates it from the class implementation and gives it C linkage.
The SWIG interface is the same as in the question. It is used to generate the wrap file that provides the interface between JavaScript and our C++ library.
%module example
%{
#include "example.h"
%}
%include "example.h"
The wrap file is created by running the following command in a terminal:
swig -c++ -javascript -node example.i
Now we need some JavaScript tooling to build this; you need Node.js and npm installed for the following steps.
First we need a package.json file:
{
"name": "SwigJS",
"version": "0.0.1",
"scripts": {
"start": "node index.js",
"install": "node-gyp clean configure build"
},
"dependencies": {
"nan": "^2.16.0",
"node-gyp": "^9.0.0"
}
}
This gives the build program the information it needs about the package and its dependencies.
After that we create a file called binding.gyp:
{
"targets": [
{
"target_name": "SwigJS",
"sources": [ "example_wrap.cxx", "example.cpp" ],
"include_dirs" : [ "<!(node -e \"require('nan')\")" ]
}
]
}
This holds information about our build target and points node-gyp at nan.
To get this working we now need to create the .node file.
This is done either by running:
node-gyp configure
node-gyp build
or by running:
npm i
Both do nearly the same thing, as far as I can tell (correct me if I am wrong): npm i installs the dependencies and then runs the install script from package.json, which calls node-gyp clean configure build.
Finally, we write our JavaScript and use the library there.
There are ways to avoid the explicit path so that you could write just require("modulename"), but that is beyond the scope of this example.
const Swigjs = require("./build/Release/SwigJS.node");
console.log("exports :", Swigjs); //show exports to see if we have a working library
const die = Swigjs.getDie(5); //get the Class pointer
console.log("foo:" + die.foo(5)); //call a function from the class
I hope this helps to give a clear picture of how SWIG and JavaScript work together.
