Thrift / c_glib and Cassandra

Thrift

Thrift is an Apache tool. It can generate client and server code from a file written in its own interface description language.

At first I was thrilled at how easy it would be to write a Cassandra client with it: "you just have to generate the C files, #include them, call a few functions and it's done".

Yeah. Like anything in the world works like that. And this particular thing is no exception.



Thrift comes with documentation that is... wait! It doesn't really come with any documentation at all. What's in the package, or scattered around the Net as blog posts and bug reports, is outdated and only good for keeping the enemy from using this great weapon.

That last remark isn't sarcastic: Thrift would be great if I could wield it correctly.

Cassandra

Cassandra is a NoSQL database. How exactly it works doesn't matter here; it's enough to know that you can store and fetch data with it, and that you connect to it over the network.

Coincidentally, it uses Thrift to describe its interface, so people of any sex, religion and programming language can generate their own interface libraries. First, I tried to put together a client in C++ based on this article. Cassandra, Thrift and gcc have evolved somewhat since 2010, and / or I might be using an exotic combination of software (Ubuntu Oneiric), or the Gods might be angry at me for some strange reason, or I may simply be too dumb to follow a slightly outdated tutorial and solve a few problems along the way; either way, I could not get the code to compile.

C and GLib

I have much more experience with C than with C++, so I decided to throw away the C++ code my co-worker had been writing and start from scratch in C. I was prepared to read and interpret the Thrift interface descriptor file with my already melting brain, and write the C code myself.

As I started to work, I discovered that Thrift CAN generate C interface libraries. The generator is a bit incomplete in 0.8.0, since it does not generate the server skeleton file, but that didn't really matter to me.

I cd'd into Cassandra's interface directory and issued the thrift -gen c_glib cassandra.thrift command, and found the generated sources under the gen-c_glib directory.

The sources were clean and readable despite being auto-generated.

I even found a small example, and it compiled OK.

I had to replace a few lines to make it work with Cassandra instead of the calculator example. The following is the rewrite for Cassandra: it connects to a server on localhost on the default port and executes a query that fetches the value of the "bar" column for key "foo" from column family "examplecf" in keyspace "example". Error handling might be incomplete.

Warning: I don't know a thing about GLib, and I suspect that the code below is NOT the way to use it. It works here though. Maybe I will improve it in the distant future.

#include <stdio.h>
#include <string.h>

#include "gen-c_glib/cassandra.h"
#include "protocol/thrift_protocol.h"
#include "protocol/thrift_binary_protocol.h"
#include "transport/thrift_framed_transport.h"
#include "transport/thrift_transport.h"
#include "transport/thrift_socket.h"

int main(int argc, char** argv) {
  ThriftSocket *tsocket;
  ThriftTransport *transport;
  ThriftProtocol *protocol;
  CassandraClient *client;
  CassandraIf *service;
  InvalidRequestException *ire = NULL;
  NotFoundException *nfe = NULL;
  UnavailableException *ue = NULL;
  TimedOutException *te = NULL;
  ColumnOrSuperColumn *result;
  GError *error = NULL;

  GByteArray column = {
    .data = (unsigned char *)"bar",
    .len  = 3
  };

  ColumnPath *cp = NULL;
  
  GByteArray key = {
    .data = (unsigned char *)"foo",
    .len  = 3
  };
 
  g_type_init();

  /* g_object_new() takes a NULL-terminated list of property name/value pairs. */
  tsocket = THRIFT_SOCKET(
    g_object_new(
      THRIFT_TYPE_SOCKET, "hostname",
      "localhost", "port", 9160, NULL
    )
  );
  transport = THRIFT_TRANSPORT(
    g_object_new(
      THRIFT_TYPE_FRAMED_TRANSPORT, "transport", tsocket, NULL
    )
  );
  protocol = THRIFT_PROTOCOL(
    g_object_new(
      THRIFT_TYPE_BINARY_PROTOCOL, "transport", transport, NULL
    )
  );
  client = CASSANDRA_CLIENT(
    g_object_new(
      TYPE_CASSANDRA_CLIENT, "input_protocol",
      protocol, "output_protocol", protocol, NULL
    )
  );
  service = CASSANDRA_IF(client);

  /* Pass &error so that a failed connect can report the reason. */
  if (
    !thrift_transport_open(transport, &error) ||
    !thrift_transport_is_open(transport)
  ) {
    printf(
      "Could not connect to server: %s\n",
      error ? error->message : "unknown error"
    );
    return 1;
  }
  printf("Connected to cassandra at localhost:9160\n");

  cassandra_client_set_keyspace(
    service, "example", &ire, &error
  );
  if (ire) {
    printf("Invalid request exception: %s\n", ire->why);
    return 1;
  }
  if (error) {
    printf("An error has occurred: %s\n", error->message);
    return 1;
  }
  printf("Selected keyspace example\n");

  cp = g_object_new(TYPE_COLUMN_PATH, NULL);
  cp->column_family = "examplecf";
  cp->column = &column;
  cp->__isset_column = TRUE;

  cassandra_client_get(
    service, &result, &key, cp, CONSISTENCY_LEVEL_QUORUM,
    &ire, &nfe, &ue, &te, &error
  );

  if (ire) {
    printf("Invalid request exception: %s\n", ire->why);
    return 1;
  }
  if (nfe) {
    printf("Row not found\n");
    return 1;
  }
  if (ue) {
    printf("Unavailable exception\n");
    return 1;
  }
  if (te) {
    printf("Timed out exception\n");
    return 1;
  }
  if (error) {
    printf("An error has occurred: %s\n", error->message);
    return 1;
  }
  
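  /* The value is not zero-terminated, so print exactly len bytes
   * (this also avoids leaking a strndup() copy). */
  printf(
    "The result is %.*s\n",
    (int)result->column->value->len,
    (const char *)result->column->value->data
  );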

  /* Don't forget to free resources if
   * your program runs longer than this */

  return 0;
}



I compiled the code with the following commands:

gcc -c `pkg-config --cflags thrift_c_glib` test.c -o test.o

gcc -c `pkg-config --cflags thrift_c_glib` \
 gen-c_glib/cassandra.c -o cassandra.o

gcc -c `pkg-config --cflags thrift_c_glib` \
 gen-c_glib/cassandra_types.c -o cassandra_types.o

libtool --tag=CC --mode=link gcc `pkg-config --libs thrift_c_glib` -o test test.o cassandra.o cassandra_types.o

The last command is even more cryptic than the others, so here's the explanation:

The pkg-config command queries the flags needed to build programs that use an installed library: --cflags prints the compiler flags and --libs prints the linker flags. It's the responsibility of the library's make install step to install this information. If a package is installed from your distribution's repository, the package manager installs it. The rest of the command line should be clear.

UPDATE:

Note that the key, column name and value do NOT contain a trailing zero byte.
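
Because these byte arrays carry an explicit length instead of a terminator, converting between them and C strings takes a little care. Here is a minimal sketch in plain C, assuming a hypothetical byte_buf struct that stands in for the data and len fields of GByteArray; buf_from_cstr and buf_to_cstr are illustrative helper names, not part of GLib or Thrift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for GByteArray: a pointer plus an explicit length. */
struct byte_buf {
  const unsigned char *data;
  size_t len;
};

/* Wrap a C string WITHOUT its trailing zero byte. The buffer only borrows
 * the pointer, so the string must outlive it. */
struct byte_buf buf_from_cstr(const char *s) {
  struct byte_buf b;
  assert(s != NULL);
  b.data = (const unsigned char *)s;
  b.len = strlen(s); /* strlen() does not count the terminator */
  return b;
}

/* Copy a length-delimited value into a freshly allocated, zero-terminated
 * string. Returns NULL if allocation fails; the caller frees the result. */
char *buf_to_cstr(struct byte_buf b) {
  char *s = malloc(b.len + 1);
  if (s == NULL) {
    return NULL;
  }
  if (b.len > 0) {
    memcpy(s, b.data, b.len);
  }
  s[b.len] = '\0';
  return s;
}
```

With a real GByteArray you would pass its data and len fields the same way. If you only need to print a value, printf's "%.*s" with the length cast to int does the job without a copy.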

Comments

  1. Hi Tamás,

    First of all, thanks a lot. I have found the post extremely useful, since I am also trying to generate a C-based client for Cassandra. I followed the same steps you mentioned, but I am running into big trouble at the libtool step:

    [root@localhost gen-c_glib]# libtool --tag=CC --mode=link gcc `pkg-config --libs thrift_c_glib` -o test test.o cassandra.o cassandra_types.o
    libtool: link: gcc -o test test.o cassandra.o cassandra_types.o -L/usr/local/lib -lthrift_c_glib -lgobject-2.0 -lglib-2.0
    /usr/bin/ld: cannot find -lthrift_c_glib
    collect2: ld returned 1 exit status

    Do you know why I am facing this issue? It seems it does not find the thrift_c_glib library, but I installed it without errors by doing make install.

    Any idea?

    Thanks a lot in advance!
    Alvaro

    Replies
    1. Do a

      sudo ldconfig -v | grep thrift

      to see if the libs got installed.

      Try

      locate libthrift_c_glib.so

      to see if the lib file is present. It's possible that libthrift_c_glib.so.0 or libthrift_c_glib.so.0.0.0 is there, but there's no libthrift_c_glib.so. If this is the case, try creating it as a symlink to the existing lib file.

  2. Hi!

    I'll take a look at thrift to see what might have caused this.

    In the meantime, try to issue ldconfig as root.

  3. I am getting "could not connect to the server", what might be the possible
    reasons behind it?

    Replies
    1. Well, that's a pretty general error, it can be almost anything. Hard to tell from here.

  4. What is the basic difference between THRIFT_TYPE_BUFFERED_TRANSPORT and THRIFT_TYPE_FRAMED_TRANSPORT in the client?

    What difference is there, related to the multiple connections? (My basic aim is to have multiple app connections to our server)

    With the BUFFERED transport, I am getting the following:
    CRITICAL **: thrift_socket_open: assertion 'tsocket->sd == THRIFT_INVALID_SOCKET' failed
    Could not connect to server

    But with the FRAMED transport, I am able to connect but get the following messages:
    ** (process:2337): WARNING **: error reading start of message: failed to read 4 bytes - Success
    ** Message: thrift_simple_server_serve: failed to read 4 bytes - Success

    ** (process:2337): WARNING **: error reading start of message: failed to read 4 bytes - Bad file descriptor
    ** Message: thrift_simple_server_serve: failed to read 4 bytes - Bad file descriptor

    ** (process:2337): WARNING **: error reading start of message: failed to read 4 bytes - Bad file descriptor
    ** Message: thrift_simple_server_serve: failed to read 4 bytes - Bad file descriptor

    ** (process:2337): WARNING **: error reading start of message: failed to read 4 bytes - Bad file descriptor
    ** Message: thrift_simple_server_serve: failed to read 4 bytes - Bad file descriptor

    ** (process:2337): WARNING **: error reading start of message: failed to read 4 bytes - Bad file descriptor
    ** Message: thrift_simple_server_serve: failed to read 4 bytes - Bad file descriptor

    ** (process:2337): WARNING **: error reading start of message: failed to read 4 bytes - Bad file descriptor
    ** Message: thrift_simple_server_serve: failed to read 4 bytes - Bad file descriptor

    ** (process:2337): WARNING **: received invalid message type -275953940 from client

    ** (process:2337): WARNING **: received invalid message type -275953940 from client

    ** (process:2337): WARNING **: received invalid message type -275953940 from client

    Please guide, thanks.

    Replies
    1. The correct choice depends on the version of the server. See https://wiki.apache.org/cassandra/ThriftExamples for examples.
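
For background on the difference: as far as I know, Thrift's framed transport prefixes every message with a 4-byte big-endian length, while the buffered transport sends the same bytes without such a header, so both ends have to agree on one of them. That may be what the "failed to read 4 bytes" warnings above refer to, but I can't tell for sure from here. A minimal sketch of the framing in plain C, with frame_message and frame_length as hypothetical helper names that are not part of the Thrift library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build one frame: a 4-byte big-endian payload length followed by the
 * payload itself. Returns the number of bytes written to out, or 0 if
 * out is too small. */
size_t frame_message(const unsigned char *payload, uint32_t len,
                     unsigned char *out, size_t out_cap) {
  assert(out != NULL);
  assert(payload != NULL || len == 0);
  if (out_cap < 4 || out_cap - 4 < len) {
    return 0;
  }
  out[0] = (unsigned char)(len >> 24);
  out[1] = (unsigned char)(len >> 16);
  out[2] = (unsigned char)(len >> 8);
  out[3] = (unsigned char)len;
  if (len > 0) {
    memcpy(out + 4, payload, len);
  }
  return (size_t)len + 4;
}

/* Decode the length header that a framed reader consumes first. */
uint32_t frame_length(const unsigned char *header) {
  assert(header != NULL);
  return ((uint32_t)header[0] << 24) | ((uint32_t)header[1] << 16) |
         ((uint32_t)header[2] << 8) | (uint32_t)header[3];
}
```

A reader that expects frames will interpret the first four bytes of an unframed message as a length, which is one way mismatched transports can produce nonsense values on the receiving side.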
