ssl: Promoting Existing Client Socket to SSL in C/C++

You may be in a situation where something else produces the sockets for you (such as an event-loop), or where you otherwise need to manage the socket yourself rather than allowing something else to.

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <openssl/ssl.h>

int main(int argc, char *argv[])
{
    // socket() returns -1 on failure (0 is a valid descriptor).
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) {
        printf("Error creating socket.\n");
        return -1;
    }

    struct sockaddr_in sa;
    memset (&sa, 0, sizeof(sa));

    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = inet_addr("");
    sa.sin_port = htons (443); 

    socklen_t socklen = sizeof(sa);
    if (connect(sockfd, (struct sockaddr *)&sa, socklen)) {
        printf("Error connecting to server.\n");
        return -1;
    }

    // OpenSSL older than 1.1.0 also needs SSL_library_init() before this.
    const SSL_METHOD *meth = TLSv1_2_client_method();
    SSL_CTX *ctx = SSL_CTX_new (meth);

    SSL *ssl = SSL_new (ctx);
    if (ssl == NULL) {
        printf("Could not create SSL context.\n");
        return -1;
    }

    // Attach the existing, already-connected socket to the SSL object.
    SSL_set_fd(ssl, sockfd);

    // Run the TLS handshake over that socket.
    int err = SSL_connect(ssl);
    if (err <= 0) {
        printf("Could not connect.\n");
        return -1;
    }

    printf ("SSL connection using %s\n", SSL_get_cipher (ssl));

    // Do send/receive here.

    return 0;
}

Adapted from openssl-in-c-socket-connection-https-client, and works with both OpenSSL and BoringSSL.
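
For comparison, the same promotion pattern can be sketched in Python with the standard "ssl" module. This is only an illustration under assumptions: the helper name promote_to_tls and the host name "example.org" are made up for the example, and the handshake is deferred so that the snippet does not need a live server.

```python
import socket
import ssl


def promote_to_tls(sock, server_hostname):
    """Wrap an already-created TCP socket in TLS (hypothetical helper)."""
    context = ssl.create_default_context()
    # Defer the handshake so the caller decides when it happens.
    return context.wrap_socket(
        sock,
        server_hostname=server_hostname,
        do_handshake_on_connect=False,
    )


# The plain socket could equally have been handed to us by an event loop.
plain = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
plain_fd = plain.fileno()

tls = promote_to_tls(plain, "example.org")  # placeholder host name

# The TLS object takes over the same file descriptor.
same_descriptor = tls.fileno() == plain_fd
tls.close()
```

In real use the socket would already be connected before it is wrapped, and a later call to tls.do_handshake() would play the role that SSL_connect() plays in the C version.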

C++: Embedding the V8 JavaScript Engine

V8 is Chromium’s JavaScript engine. Not only can you use the included D8 utility to open a console and execute JavaScript code directly, but it is also not as tough as you would think to call your native functions from your JavaScript and your JavaScript functions from your native code. Of course, you have to get your head around all of the layers of isolates, contexts, and scoping.

For a walkthrough of embedding V8, start here.

In order to build a quick example of how to implement a native function for consumption from your JavaScript, go ahead and build V8. You will have to choose how to build the project. You’ll provide the name of a particular build configuration, which will write a few build parameters (called “args”) to the “” file. This is a GN thing. GN is a build tool from the Chromium depot-tools project that generates Ninja scripts for doing the actual build.

For a list of build configurations, run:

$ tools/dev/ list

(list shortened for simplicity)

We’ve used the “x64.release.sample” configuration for our example because it is the most straightforward for this purpose. Once you have V8 downloaded and the dependencies installed, the actual project builds in about fifteen minutes (with sixteen threads).

Drop this example into a file named “native_function.cpp” (or update the build command below to whatever you go with). This is an amalgam of excerpts from the samples in the project, examples in the documentation, and some customization on our part.

#include <cstdio>
#include <memory>
#include <sstream>

#include "libplatform/libplatform.h"
#include "v8.h"

// This is the function that we'll register into JS as a global function.
static void TestCallback(const v8::FunctionCallbackInfo<v8::Value>& args) {
  v8::Isolate* isolate = args.GetIsolate();
  v8::HandleScope scope(isolate);

  // Return [the arbitrary integer] 55 as our result.
  auto result = v8::Integer::New(isolate, 55);

  // Set the result.
  args.GetReturnValue().Set(result);
}

int main(int argc, char* argv[]) {

  // Initialize V8.
  v8::V8::InitializeICUDefaultLocation(argv[0]);
  v8::V8::InitializeExternalStartupData(argv[0]);
  std::unique_ptr<v8::Platform> platform = v8::platform::NewDefaultPlatform();
  v8::V8::InitializePlatform(platform.get());
  v8::V8::Initialize();

  // Create a new Isolate and make it the current one.
  v8::Isolate::CreateParams create_params;
  create_params.array_buffer_allocator =
    v8::ArrayBuffer::Allocator::NewDefaultAllocator();

  v8::Isolate* isolate = v8::Isolate::New(create_params);

  // Scoping in the C++ is tightly related to scoping in the JS. So, we'll 
  // retain the organizational blocks from the samples.
  {
    v8::Isolate::Scope isolate_scope(isolate);

    // Create a stack-allocated handle scope.
    v8::HandleScope handle_scope(isolate);

    // Create a global object to install our function into.
    v8::Local<v8::ObjectTemplate> global = v8::ObjectTemplate::New(isolate);

    // Register our global function.
    global->Set(
      v8::String::NewFromUtf8(isolate, "testCall", v8::NewStringType::kNormal).ToLocalChecked(),
      v8::FunctionTemplate::New(isolate, TestCallback)
    );

    // Create a new context and apply the global object.
    v8::Local<v8::Context> context = v8::Context::New(isolate, NULL, global);

    // Enter the context for compiling and running the script.
    v8::Context::Scope context_scope(context);

    // Load the script.

    std::stringstream sourceStream;

    sourceStream
      << "testCall();" << std::endl;

    v8::Local<v8::String> source =
      v8::String::NewFromUtf8(isolate, sourceStream.str().c_str(), v8::NewStringType::kNormal).ToLocalChecked();

    // Compile the source code.
    v8::Local<v8::Script> script =
      v8::Script::Compile(context, source).ToLocalChecked();

    // Run the script.
    v8::Local<v8::Value> result = script->Run(context).ToLocalChecked();

    // Print the screen output.
    v8::String::Utf8Value output(isolate, result);
    printf("%s\n", *output);
  }

  // Dispose the isolate and tear down V8.
  isolate->Dispose();
  v8::V8::Dispose();
  v8::V8::ShutdownPlatform();
  delete create_params.array_buffer_allocator;

  return 0;
}

To build, set V8PATH to the absolute path of your “v8/v8” directory (the main V8 path established in the getting-started document above), and run the following:

$ g++ "-I${V8PATH}/include" -o native_function native_function.cpp -lv8_monolith "-L${V8PATH}/" -pthread -std=c++0x

The example implements a simple, native function that just returns (55), registers it as a global JS function, creates a simple JS script that calls it, and then runs the script and prints the screen output. The screen output is just the result of that function printed to the screen since it was not otherwise assigned to a variable:

$ ./native_function 
55

For a couple more examples, see the project repository.

MSBuild/C#: How to Manage the Application Version Using a Text-File


C# applications have an “AssemblyInfo.cs” file that describes the assembly and executable versions of a project. Unfortunately, sometimes it is not possible to access this from the code. Other times, you need to drive this version from external sources (like a build system) and then use it for the build.

We approach this by keeping the version in a text-file:

  1. Manually set/update the version in a text-file.
  2. Install a package that helps us with string-replacements.
  3. Inject this to AssemblyInfo.cs during the build.
  4. Embed this file into the executing assembly.
  5. Extract this file from the executing assembly when you need to know the version during execution.

The title of this post is a simplification for lack of an easy way to succinctly describe five steps in a couple of words.

Do It

Feel free to modify/customize these steps as suits your needs.

1. Create the Version File

Create a file called “executable.version” in the “Properties\” folder of your executable project. Make sure to include this in your project. In the “Properties” window, set “Build Action” to “Embedded Resource”.

2. Install the “MSBuild Community Tasks” NuGet Package

This is the “MSBuildTasks” package. This provides us a regular-expression string-replacement MSBuild task.

3. Create a Template “AssemblyInfo.cs” File

Copy “Properties\AssemblyInfo.cs” to “Properties\AssemblyInfo.cs.use_this” and update the two version attributes at the bottom to be the following:

[assembly: AssemblyVersion("__EXECUTABLE_VERSION__")]
[assembly: AssemblyFileVersion("__EXECUTABLE_VERSION__")]

Make sure to include this new file in the project. Note that we name this so as to not have the “.cs” extension because, otherwise, Visual Studio will try to parse it and complain about the attributes being duplicated from the “AssemblyInfo.cs” file.
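
To make the role of the __EXECUTABLE_VERSION__ token concrete, here is a rough Python sketch of the substitution that the build performs on this template. It is not part of the MSBuild setup; the function name inject_version and the version "1.2.3.4" are invented for the illustration, and it works on throwaway files rather than a real project.

```python
import tempfile
from pathlib import Path

TOKEN = "__EXECUTABLE_VERSION__"


def inject_version(version_file, template_file, output_file):
    """Copy the template to the output file with the token replaced."""
    # Ignore surrounding whitespace, such as a trailing newline.
    version = Path(version_file).read_text().strip()
    template = Path(template_file).read_text()
    Path(output_file).write_text(template.replace(TOKEN, version))
    return version


# Demonstrate against temporary files; "1.2.3.4" is a made-up version.
with tempfile.TemporaryDirectory() as work:
    props = Path(work)
    (props / "executable.version").write_text("1.2.3.4\n")
    (props / "AssemblyInfo.cs.use_this").write_text(
        '[assembly: AssemblyVersion("__EXECUTABLE_VERSION__")]\n'
        '[assembly: AssemblyFileVersion("__EXECUTABLE_VERSION__")]\n'
    )
    injected = inject_version(
        props / "executable.version",
        props / "AssemblyInfo.cs.use_this",
        props / "AssemblyInfo.cs",
    )
    generated = (props / "AssemblyInfo.cs").read_text()

print(generated)
```

The template is never modified; only the generated “AssemblyInfo.cs” carries the real version, which is why edits belong in the template.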

4. Add the Build Step

We are going to add a custom build target to inject the version. We personally chose to put this into a separate rules file in order to make it clear which of the build-logic was ours, but this is up to you. It would just as easily work if it were included at the bottom of your project-file. Create “Properties\build.targets” with the following:


<?xml version="1.0" encoding="utf-8" ?>
<Project ToolsVersion="12.0" xmlns="">

  <!-- Inject a version from a text-file into AssemblyInfo.cs . We do this 
       so that it's easier for the application to know its own version [by 
       reading the text file]. -->
  <Import Project="$(ProjectDir)..\packages\MSBuildTasks.\tools\MSBuild.Community.Tasks.Targets" /> 
  <Target Name="InjectVersion" BeforeTargets="BeforeBuild">
    <!-- Read the version from our text file. This appears to automatically 
         trim (probably per line). This is located in the project root so 
         that we copy the file to the output-path rather than establishing 
         a whole Properties/ directory in the output path. -->
    <ReadLinesFromFile File="$(ProjectDir)Properties\executable.version">
      <Output TaskParameter="Lines" PropertyName="ExecutableVersion" />
    </ReadLinesFromFile>

    <!-- Print it to the build output whether we're in debug-mode or not. -->
    <Message Importance="High" Text="Executable version is [$(ExecutableVersion)]"/>

    <!-- Copy our template file to the output file. -->
    <Copy SourceFiles="$(ProjectDir)Properties/AssemblyInfo.cs.use_this" DestinationFiles="$(ProjectDir)Properties/AssemblyInfo.cs"/>

    <!-- Do an RX replace of the version on to the token. -->
    <ItemGroup>
      <WriteFiles Include='$(ProjectDir)Properties/AssemblyInfo.cs' />
    </ItemGroup>
    <FileUpdate
      Files="@(WriteFiles)"
      Regex="__EXECUTABLE_VERSION__"
      ReplacementText="$(ExecutableVersion)"
    />

    <!-- Replace the cautionary note about how to use the file with one 
         saying that any changes will be lost (if made to the output file). -->
    <FileUpdate
      Files="@(WriteFiles)"
      Regex="// TEMPLATE:.+"
      ReplacementText="// THIS FILE IS GENERATED! Apply any changes to 'AssemblyInfo.cs.use_this', instead."
    />
  </Target>
</Project>

IMPORTANT: Notice that we have to import the build targets provided by the “MSBuildTasks” package:

 <Import Project="$(ProjectDir)..\packages\MSBuildTasks.\tools\MSBuild.Community.Tasks.Targets" />

For us, NuGet packages go into the “packages” directory that is in the parent directory of our project directory. Also notice that we have to embed the version for this NuGet package. If your package is a different version or is located in a different place, you will have to update the example to be accurate.

NOTE: One way to get around having to embed the version is to leave this package out of your “packages.config” file and, instead, do a manual NuGet install of this package from a build-task to your packages directory (wherever it is) while also passing the “-ExcludeVersion” argument so as to not put the version in the package’s directory name.

Now, import the “build.targets” file from your project file. Put it somewhere near the bottom. Since it will run before the “BeforeBuild” target, we put it before that (which will be commented-out unless you use it):

 <Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />
 <Import Project="Properties\build.targets" />

5. Reading the Version From the Application

At this point, you should be able to build your project. The only thing that might be considered a disadvantage to this method is that, every time you build your project from inside Visual Studio, you will be prompted to reload the “AssemblyInfo.cs” file because it has been updated from outside of VS even if it has not changed (which is no stupider than the amount of work that we are required to do in order to find our own version). It would be easiest to check the box in this popup that says to only tell you if you happen to have unsaved changes to a file that has been changed from outside VS.

In our case, we are using the CLAP command-line parser. So, we added a private “ExecutableVersion” getter on the class that we are using to handle our subcommands. Then, we added a “version” subcommand that reads and prints the new property. Code for the property:


// Requires: using System; using System.IO; using System.Reflection;
private string executableVersion = null;

private string ExecutableVersion
{
    get
    {
        // Read the embedded file once and cache the result.
        if (executableVersion == null)
        {
            Assembly assembly = Assembly.GetExecutingAssembly();
            string assemblyName = assembly.GetName().Name;

            // "Properties" is required since it is located in the 
            // Properties folder of the project and was thusly embedded 
            // as such.
            string filepath = assemblyName + @".Properties.executable.version";

            string[] names = assembly.GetManifestResourceNames();
            var stream = assembly.GetManifestResourceStream(filepath);
            if (stream == null)
                throw new Exception(String.Format("Could not get resource-stream with name [{0}] for version content from assembly [{1}]. Available: {2}", filepath, assembly.FullName, String.Join(",", names)));

            using (TextReader tr = new StreamReader(stream))
            {
                executableVersion = tr.ReadToEnd().Trim();
            }
        }

        return executableVersion;
    }
}


C#: Parsing a CSPROJ (Project) File Using XPath

Using XPath in C# can be done several different ways through several built-in libraries, and none of them work unless you are a lot more familiar with the file than would be required in many other languages. To make matters worse, you might be further required to do some unintuitive shenanigans. By way of an example, this is how to retrieve the assembly-name:

XNamespace xmlns = "";
XDocument projDefinition = XDocument.Load(projectFilepath);

IEnumerable<XNode> assemblyResultsEnumerable = projDefinition
	.Element(xmlns + "Project")
	.Elements(xmlns + "PropertyGroup")
	.Elements(xmlns + "AssemblyName").Nodes<XContainer>();

IList<XNode> assemblyResults = new List<XNode>(assemblyResultsEnumerable);
if(assemblyResults.Count == 0)
	throw new Exception(String.Format("The project file isn't correctly structured: [{0}]", projectFilepath));

string assemblyName = assemblyResults[0].ToString();

Notice that we have to mash the namespace URL with the node-name in order to find the node.
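
The same requirement shows up in other XML libraries. As a rough cross-check, here is how the lookup might look with Python's xml.etree.ElementTree. The namespace "urn:example:project" and the inline project document are placeholders made up for this sketch; a real project file declares its own namespace in its xmlns attribute.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace (an assumption for this sketch).
NS = "urn:example:project"

PROJECT_XML = f"""<Project xmlns="{NS}">
  <PropertyGroup>
    <AssemblyName>MyTool</AssemblyName>
  </PropertyGroup>
</Project>"""


def assembly_name(project_xml):
    """Return the text of the first AssemblyName element."""
    root = ET.fromstring(project_xml)
    # Every step of the path must carry the namespace in {uri}name form.
    node = root.find(f"{{{NS}}}PropertyGroup/{{{NS}}}AssemblyName")
    if node is None:
        raise ValueError("The project file isn't correctly structured.")
    return node.text


name = assembly_name(PROJECT_XML)

# Without the namespace, the same path finds nothing.
unqualified = ET.fromstring(PROJECT_XML).find("PropertyGroup/AssemblyName")

print(name)
```

The unqualified lookup returning nothing is the trap: the element is visibly there in the file, but its full name includes the namespace.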

Best Argument-Processing for .NET/C#

After trying NDesk.Options and Fluent, I am nothing but impressed with CLAP (“Command-Line Auto-Parser”). It completely relies on reflection and parameter attributes (usually just one or two) to automatically marshal your values, assign defaults, enforce requiredness, and provide command-line documentation. It’s beautiful and, so far, flawless. Well done.

using CLAP;

namespace MyNamespace
{
    class Program
    {
        [Verb(IsDefault = true, Description = "Print the current version of the given package and, optionally, increment it.")]
        public void Version(
            [Description("Project path")]
            string projectPath,

            [Description("Package name")]
            string packageName,

            [Description("Base version to increment from (if lower than current, else use current)")]
            string baseVersion = null,

            [Description("Increment the version before returning")]
            bool increment = false
        )
        {
            // ...
        }
    }
}

If you don’t decorate a parameter with the “Required” attribute and don’t provide a default value, the parameter will default to null. I explicitly set baseVersion to a default of null because I prefer being explicit.
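
To show the general idea of deriving a command-line interface from a method signature, here is a loose Python analogue built on inspect and argparse. It is not CLAP's API and does not mirror its behavior exactly (for instance, it treats a parameter without a default as required); the helper name parser_for and the sample arguments are invented for the sketch.

```python
import argparse
import inspect


def parser_for(func):
    """Build an argparse parser from a function's signature."""
    parser = argparse.ArgumentParser(description=inspect.getdoc(func))
    for name, param in inspect.signature(func).parameters.items():
        flag = "--" + name.replace("_", "-")
        if param.default is inspect.Parameter.empty:
            # No default value: treat the parameter as required.
            parser.add_argument(flag, required=True)
        elif param.default is False:
            # Boolean switch that stays off unless it is given.
            parser.add_argument(flag, action="store_true")
        else:
            parser.add_argument(flag, default=param.default)
    return parser


def version(project_path, package_name, base_version=None, increment=False):
    """Print the current version of the given package."""
    return (project_path, package_name, base_version, increment)


args = parser_for(version).parse_args(
    ["--project-path", "src", "--package-name", "core", "--increment"]
)
result = version(**vars(args))
```

The point of the approach is that the signature is the single source of truth: adding a parameter to the function adds the option, its default, and its requiredness in one place.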

Embedded SQL

A nostalgic visit from the past: Embedded SQL, where you can inject live SQL directly into your C code.

Introduction to Pro*C

The second refers to such development using Oracle. Example from the second:

for (;;) {
    printf("Give student id number : ");
    scanf("%d", &id);
    EXEC SQL WHENEVER NOT FOUND GOTO notfound;
    EXEC SQL SELECT studentname INTO :st_name
             FROM   student
             WHERE  studentid = :id;
    printf("Name of student is %s.\n", st_name);
    continue;
notfound:
    printf("No record exists for id %d!\n", id);
}

It’s worth mentioning just to have some central place to search for it later.

Scriptable C++

What if C++ were a scripting language that you could eval from your native C++?


Example (from the homepage):

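#include <chaiscript/chaiscript.hpp>

std::string helloWorld(const std::string &t_name)
{
  return "Hello " + t_name + "!";
}

int main()
{
  chaiscript::ChaiScript chai;
  // Expose the native function to scripts, then call it from one.
  chai.add(chaiscript::fun(&helloWorld), "helloWorld");
  chai.eval("puts(helloWorld(\"Bob\"));");
}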


Use TightOCR for Easy OCR from Python

When it comes to recognizing documents from images in Python, there are precious few options, and a couple of good reasons why.

Tesseract is the world’s best OCR solution, and is currently maintained by Google. Unlike other solutions, it comes prepackaged with knowledge for a bunch of languages, so the machine-learning aspects of OCR don’t necessarily have to be a concern of yours, unless you want to recognize for an unknown language, font, potential set of distortions, etc…

However, Tesseract comes as a C++ library, which basically takes it out of the running for use with Python’s ctypes. This isn’t a fault of ctypes, but rather of a lack of standardization in symbol-naming among the C++ compilers (there’s no way to know how to determine the naming for a symbol in the library from Python).
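
As a small illustration of why a plain C interface matters, this sketch uses ctypes to call strlen from the C runtime. It assumes a POSIX system where the C library can be located (or is already loaded into the process); nothing here is specific to Tesseract.

```python
import ctypes
import ctypes.util

# Locate the C runtime. On POSIX, CDLL(None) falls back to the symbols
# already loaded into the current process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C symbols are exported under their plain names, so lookup by name works.
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

length = strlen(b"tesseract")
print(length)
```

A C++ method, by contrast, is exported under a compiler-specific mangled name, so there is no portable string to look up from Python. That is the gap a C wrapper fills.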

There is an existing Python solution, which comes in the form of a very heavy Python wrapper called python-tesseract, which is built on SWIG. It also requires a couple of extra libraries, like OpenCV and numpy, even if you don’t seem to be using them.

Even if you decide to go the python-tesseract route, you will only have the ability to return the complete document as text, as their support for iteration through the parts of the document is broken (see the bug).

So, with all of that said, we accomplished lightweight access to Tesseract from Python by first building CTesseract (which produces a C wrapper for Tesseract.. see here), and then writing TightOCR (for Python) around that.

This is the result:

from tightocr.adapters.api_adapter import TessApi
from tightocr.adapters.lept_adapter import pix_read
from tightocr.constants import RIL_PARA

t = TessApi(None, 'eng');
p = pix_read('receipt.png')

if t.mean_text_confidence() < 85:
    raise Exception("Too much error.")

for block in t.iterate(RIL_PARA):
    print(block)

Of course, you can still recognize the document in one pass, too:

from tightocr.adapters.api_adapter import TessApi
from tightocr.adapters.lept_adapter import pix_read
from tightocr.constants import RIL_PARA

t = TessApi(None, 'eng');
p = pix_read('receipt.png')

if t.mean_text_confidence() < 85:
    raise Exception("Too much error.")


With the exception of renaming “mean_text_conf” to “mean_text_confidence”, the library keeps most of the names from the original Tesseract API. So, if you’re comfortable with that, you should have no problem with this (if you even have to do more than the above).

I should mention that the original Tesseract library, though a universal and popular OCR solution, is very dismally documented. Therefore, there are many functions that I’ve left scaffolding for in the project, without being entirely sure how to use/test them nor having any need for them myself. So, I could use help in that area. Just submit issues or pull-requests if you want to contribute.

OCR a Document by Section

A previous post described how to use Tesseract to OCR a document to a single string. This post will describe how to take advantage of Tesseract’s internal layout processing to iterate through the document’s sections (as determined by Tesseract).

This is the core logic to iterate through the document. Each section is referred to as a “paragraph”:

if(api->Recognize(NULL) < 0)
{
    return 3;
}

tesseract::ResultIterator *it = api->GetIterator();

do
{
    char *para_text = it->GetUTF8Text(tesseract::RIL_PARA);

    // para_text is the recognized text. It [usually] has a 
    // newline on the end.

    delete[] para_text;
} while (it->Next(tesseract::RIL_PARA));

delete it;

To add some validation to the recognized content, we’ll also check the recognition confidence. In my experience, it seems like any recognition scoring less than 80% turned out as gibberish. In that case, you’ll have to do some preprocessing to remove some of the risk.

int confidence = api->MeanTextConf();
printf("Confidence: %d\n", confidence);
if(confidence < 80)
    printf("Confidence is low!\n");

The whole program:

#include <stdio.h>

#include <tesseract/baseapi.h>
#include <tesseract/resultiterator.h>

#include <leptonica/allheaders.h>

int main()
{
    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();

    // Initialize tesseract-ocr with English, without specifying tessdata path.
    if (api->Init(NULL, "eng")) 
    {
        api->End();
        return 1;
    }

    // Open input image with leptonica library
    Pix *image;
    if((image = pixRead("receipt.png")) == NULL)
    {
        api->End();
        return 2;
    }

    // Hand the image to Tesseract before recognizing.
    api->SetImage(image);

    if(api->Recognize(NULL) < 0)
    {
        api->End();
        pixDestroy(&image);
        return 3;
    }

    int confidence = api->MeanTextConf();
    printf("Confidence: %d\n", confidence);
    if(confidence < 80)
        printf("Confidence is low!\n");

    tesseract::ResultIterator *it = api->GetIterator();

    // Print the document one paragraph at a time.
    do
    {
        char *para_text = it->GetUTF8Text(tesseract::RIL_PARA);
        printf("%s", para_text);
        delete[] para_text;
    } while (it->Next(tesseract::RIL_PARA));

    delete it;

    // Release the API's resources and the image.
    api->End();
    pixDestroy(&image);

    return 0;
}

Applying this routine to an invoice (randomly found with Google), it is far easier to identify the address, tax, total, etc. than with the previous method (which was naive about layout):

Confidence: 89
Invoice |NV0010 '
Jackie Kensington
18 High St
Sevices Limited

Certificate Number CER/123-34 From
  1”” E'e°‘''°‘'’‘'Se”'°‘*’
17 Harold Grove, Woodhouse, Leeds,

West Yorkshire, LS6 2EZ

Email: info@ mj-e|ectrcia|

Tel: 441132816781

Due Date : 17th Mar 2012

Invoice Date : 16th Feb 2012

Item Quantity Unit Price Line Price
Electrical Labour 4 £33.00 £132.00

Installation carried out on flat 18. Installed 3 new
brass effect plug fittings. Checked fuses.
Reconnected light switch and fitted dimmer in living
room. 4 hours on site at £33 per hour.

Volex 13A 2G DP Sw Skt Wht Ins BB Round Edge Brass Effect 3 £15.57 £46.71
Volex 4G 1W 250W Dimmer Brushed Brass Round Edge 1 £32.00 £32.00
Subtotal £210.71

VAT £42.14

Total £252.85

Thank you for your business — Please make all cheques payable to ‘Company Name’. For bank transfers ‘HSBC’, sort code
00-00-00, Account no. 01010101.
MJ Electrical Services, Registered in England & Wales, VAT number 9584 158 35
 Q '|'.~..a::
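
Once the text arrives paragraph by paragraph, picking out individual fields becomes a small post-processing job. The following Python sketch pulls the labelled amounts out of a few paragraphs copied from the output above. The function name extract_amounts and the pattern are assumptions made for this illustration, and real OCR output will need a more forgiving pattern.

```python
import re
from decimal import Decimal

# Paragraphs copied from the output above (a paragraph may span lines).
paragraphs = [
    "Volex 4G 1W 250W Dimmer Brushed Brass Round Edge 1 £32.00 £32.00\n"
    "Subtotal £210.71\n",
    "VAT £42.14\n",
    "Total £252.85\n",
]

# A line consisting of a known label followed by a pound amount.
LABELLED_AMOUNT = re.compile(r"^(Subtotal|VAT|Total) £(\d+\.\d{2})$")


def extract_amounts(blocks):
    """Map each recognized label to its amount."""
    amounts = {}
    for block in blocks:
        for line in block.splitlines():
            match = LABELLED_AMOUNT.match(line.strip())
            if match:
                amounts[match.group(1)] = Decimal(match.group(2))
    return amounts


amounts = extract_amounts(paragraphs)
for label, amount in amounts.items():
    print(label, amount)
```

Decimal is used rather than float so that the currency values keep exactly the digits that were recognized.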

OCR a Document with C++

There is an OCR library developed by HP and maintained by Google called Tesseract. It works immediately, and does not require training.

Building it is trivial. What’s more trivial is just installing it from packages:

$ sudo apt-get install libtesseract3 libtesseract-dev
$ sudo apt-get install liblept3 libleptonica-dev
$ sudo apt-get install tesseract-ocr-eng

Note that this installs the data for recognizing English.

Now, go and get the example code from the Google Code wiki for the project, and paste it into a file called ocr-test.cpp . Also, right-click and save the example document image (a random image I found with Google). You don’t have to use this particular document, as long as what is used is sufficiently clear at a high-enough resolution (the example is about 1500×2000).

Now, change the location of the file referred-to by the example code:

Pix *image = pixRead("letter.jpg");

Compile/link it:

$ g++ -o ocr-test ocr-test.cpp -ltesseract -llept

Run the example:

$ ./ocr-test

You’re done. The following will be displayed:

OCR output:
fie’  1?/2440
BARROSO (2012) 1300171
BARROSO (2012)

Dear Lord Tugendhat.

Thank you for your letter of 29 October and for inviting the European Commission to

contribute in the context of the Economic Aflairs Committee's inquiry into "The

Economic lmplicationsfirr the United Kingdom of Scottish Independence ".

The Committee will understand that it is not the role of the European Commission to

express a position on questions of internal organisation related to the constitutional

arrangements of a particular Member State.

Whilst refraining from comment on possible fitture scenarios. the European Commission

has expressed its views in general in response to several parliamentary questions from

Members of the European Parliament. In these replies the European Commission has 
noted that scenarios such as the separation of one part of a Member State or the creation 
of a new state would not be neutral as regards the EU Treaties. The European 
Commission would express its opinion on the legal consequences under EU law upon ;
requestfiom a Member State detailing a precise scenario. :
The EU is founded on the Treaties which apply only to the Member States who have 3
agreed and ratified them. if part of the territory of a Member State would cease to be ,
part of that state because it were to become a new independent state, the Treaties would

no longer apply to that territory. In other words, a new independent state would, by the
fact of its independence, become a third country with respect to the E U and the Treaties

would no longer apply on its territory. ‘


Acting Chairman

House of Lords q
Committee Oflice
E-mail: economicaflairs@par1igment.ttk