"Drawing Git Graphs with Graphviz and Org-Mode"

Alessandro Balzano

2019-05-23 21:21

Drawing Git Graphs with Graphviz and Org-Mode - correl.phoenixinquis.net

Today I needed something like TortoiseGit's "Revision Graphs": a simple graph that shows tags and branches of a git repository in topological order.

While searching on the net, I found this cool blog post about generating this kind of graphs with elisp and graphviz. Even though the author uses Lisp, the code is very simple and approachable, and it can be easily translated to Python or other languages.

How to write generic dissectors in Wireshark

Alessandro Balzano

2019-04-27 10:00

Wireshark is a flexible network analyzer, that can also be extended via plugins or dissectors.

A dissector is a kind of plugin that lets Wireshark understand a protocol - in our case, a protocol that is only used by a certain application. There are several reasons to create your own (application-level) protocol over UDP/IP or TCP/IP, such as efficiency (by sending only binary data, formatted in a certain application-specific format).

Wireshark is a very helpful tool during system integration tests, or while developing a networked application. A dissector helps developers and testers check if the applications under test are sending (or receiving) data correctly - if the structure of a certain message is as defined by the protocol, if some fields have invalid values, if an application is sending more (or fewer) messages than expected in a certain timeframe.

WireShark Generic Dissectors - a declarative approach

Wireshark Generic Dissectors (WSGD) is a plugin that lets you define a dissector for your custom protocol, in a declarative manner.

Being declarative is a cool idea - by just saying what the protocol looks like, the content of the dissector is clear to a technical, but non-developer, user. Such protocol descriptions can also be used as documentation, without having to manage different Wireshark API versions (as it may happen with Lua-based dissectors). It's not all fun and games though: this plugin has some (reasonable) limitations, such as not managing text protocols, or requiring an header common to every kind of message described in the protocol.

Let's write a generic dissector

Let's start with the Wireshark Generic Dissector file: it contains some metadata about the protocol. These metadata, consisting of details such as the protocol name, the structure that sketches the header of all messages in the protocol and the main message, are necessary to be efficient when parsing the messages during the capture.

# file custom.wsgd

# protocol metadata
PROTONAME Custom Protocol over UDP
PROTOSHORTNAME Custom
PROTOABBREV custom

# conditions on which the dissector is applied:
# the protocol will be applied on all UDP messages with port = 8756
PARENT_SUBFIELD udp.port
PARENT_SUBFIELD_VALUES 8756

# the name of the header structure
MSG_HEADER_TYPE                    T_custom_header
# field which permits to identify the message type.
MSG_ID_FIELD_NAME                  msg_id
# the main message type - usually it is a fake message, built of one
#    of the possible messages
MSG_MAIN_TYPE                      T_custom_switch(msg_id)

# this token marks the end of the protocol description
PROTO_TYPE_DEFINITIONS

# refer to the description of the data format
include custom.fdesc;

The second file is the data format description: it described the messages of the protocol we're writing a dissector for.

# file custom.fdesc

# here, we define an enumerated type to list the type of messages
#   defined in our protocol
enum8 T_custom_msg_type
{
    word_message   0
    number_message 1
}

# here, we define the structure of the header.
# The header (the same for each message type) must...
struct T_custom_header
{
    # ... define the order of the data
    byte_order big_endian;
    uint32 counter;
    uint8  size_after_header;
    # ... contain the field defined as MSG_ID_FIELD_NAME
    T_custom_msg_type msg_id;
}

struct T_word_message
{
    T_custom_header header;
    uint8           word_len;
    # array of characters
    char[word_len]  word;
    # "word" messages will always have some unused trailing bytes:
    #   they can be marked as raw(*) - the size is calculated at runtime
    raw(*)          spare;
}

struct T_number_message
{
    T_custom_header header;
    uint8           number;
    bool8           is_even;
}

# T_custom_switch is the main message (as defined in the protocol description)
# according to the parameter msg_id (of type T_custom_msg_type), we define
# the main message to be defined by a single message: either T_word_message or T_number_message.
switch T_custom_switch T_custom_msg_type
{
case T_custom_msg_type::word_message:   T_word_message "";
case T_custom_msg_type::number_message: T_number_message "";
}

Generating some network traffic...

To verify that the dissector we've written is correct, we are going to build a small client to send some UDP messages to a very simple server.

Let's start with the server: it just receives UDP messages on port 8756, and prints the contents of those messages.

import socketserver

class CustomHandler(socketserver.DatagramRequestHandler):
    def handle(self):
        data = self.request[0].strip()
        print(data)

if __name__ == "__main__":
    serv = socketserver.UDPServer(("127.0.0.1", 8756), CustomHandler)
    serv.serve_forever()

The client sends some data to our server - we just need it to generate some traffic to sniff on Wireshark.

import socket
import struct
import random
import string
import time

HOST, PORT = "localhost", 8756

# SOCK_DGRAM is the socket type to use for UDP sockets
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# refer to `pydoc struct`
HEADER_STRUCT = "".join([
    ">",  # network byte order
    "L",  # counter
    "B",  # message size
    "B",  # message type (0: word, 1: number)
])

PAYLOAD_WORD_TYPE = HEADER_STRUCT + "".join([
    "B",    # word length
    "100s", # string (at most 100 characters)
])
word_struct = struct.Struct(PAYLOAD_WORD_TYPE)

PAYLOAD_NUMBER_TYPE = HEADER_STRUCT + "".join([
    "B",  # number
    "B",  # 0: even, 1: odd
])
number_struct = struct.Struct(PAYLOAD_NUMBER_TYPE)

msg_counter = 0
while True:
    msg_counter += 1

    # prepare data to send
    if random.random() < 0.70:
        num = random.choice(range(256))
        is_even = num & 1
        data = number_struct.pack(msg_counter, 2, 1, num, is_even)
    else:
        string_len = random.choice(range(100))
        the_string = bytes("".join(random.choice(string.ascii_letters+" ") for i in range(string_len)), "ascii")
        data = word_struct.pack(msg_counter, 101, 0, string_len, the_string)

    # send the message
    sock.sendto(data, (HOST, PORT))

    # wait 200ms
    time.sleep(0.2)

Set it up

Wireshark Generic Dissector is a binary plugin, distributed as a .so file - please read the installation procedure. I've summarized what I did to install the plugin and the files we've written so far:

# download the plugin - be sure it's the right one for
# the version of Wireshark installed on your system
wget http://wsgd.free.fr/300X/generic.so.ubuntu.64.300X.tar.gz
# extract the file generic.so
unzip ./generic.so.ubuntu.64.300X.tar.gz
# install the shared object globally by putting in the right folder
sudo cp generic.so  /usr/lib/wireshark/plugins/3.0/epan
# install the dissector files in the right folder - the same of the shared object
sudo cp custom.wsgd /usr/lib/wireshark/plugins/3.0/epan
sudo cp custom.fdesc /usr/lib/wireshark/plugins/3.0/epan

Test drive

/images/wireshark-wsgd-with-dissector.png

As we can see by the screenshot, we are now able to see the content of the messages our application is sending to the server, without writing a single line of code (other than our application, obviously).

References

Wireshark Generic Dissector project homepage: http://wsgd.free.fr/index.html
WSGD file description: http://wsgd.free.fr/wsgd_format.html
Data format description (fdesc): http://wsgd.free.fr/fdesc_format.html

If you feel that this article helped you, feel free to share it! If you have questions, ask on Twitter, or offer me a coffee to let me keep writing these notes!

Stadia: the future of gaming?

Alessandro Balzano

2019-03-19 21:49

Google unveiled its new game streaming service: Stadia.

Stadia, which tagline is "gather around", recognized that there are two "disconnected universes": streamers -people who play games for their audience- and viewers, that maybe cannot play the same games or just enjoy looking someone else's performances.

The company tries to combine both worlds by creating a game streaming service that is also integrated in Youtube.

Interesting parts

Other websites already covered the conference: I will write down something I found interesting or exciting.

You can access the platform by just pressing a "Play" button at the end of a videogame Youtube video - if you're using Google Chrome, obviously.

AMD designed a custom GPU, just for Stadia. Judging by its 10.7 teraflops, it is more powerful than the GPUs on current gen consoles. Developers can also use more than one GPU in their games, to make the games even more detailed in a transparent way for the user.

Stadia promises an "up to 4k 60fps" experience for the player, and all plays will be streamed on Youtube. The special "share" button on the custom controller should let creators (or random players) share their play and create a "state share" link, to let other people play the same portion of the creator's gameplay. Creators can also use Crowd Play to let their Youtube viewers join their games and better interact with them.

Every game on Stadia will be playable with existing controllers and every device that the user already owns -they will just interact with a streaming, so they won't need a powerful device to use the service.

This new platform now explains two major features of Google products I never really understood: Youtube's videogame channels (containers that automatically gather and categorize videos about specific games - see Sekiro's automatically generated channel as an example) and the WebUSB standard, that is only implemented by Chrome.

What's missing?

Stadia is not the first game streaming service in the market, and won't be the last. Hopefully it won't fail as hard as OnLive, but there are several issues that should be resolved, or mitigated, before the launch (in US, UK, Canada and most of Europe).

Let's start with the one I find more pressing: PS Now launched a week ago in Italy, with varying results. Dadobax (an Italian videogame youtuber) experienced a very important input lag while testing Bloodborne on a 100Mbps fiber connection. Will Stadia suffer the same problem? During the presentation, the Stadia representative said there will be a direct link between the ISP and the Stadia data centers, but I won't believe that everything works fine until I can try it. Other commenters note that, even if there will be no input lag, there is a risk of seeing video artifacts due to video streaming compression algorithms.

Another issue is the access to the service: We don't know how much it will cost, and which titles will be there. At least we know that Doom Eternal, Assassin's Creed Odyssey, NBA 2K19 and Shadow of the Tomb Raider will be playable on the platform. We don't know if users must buy titles on the Stadia Store even if they bought it on other stores - such as Odyssey's UPlay store-.

I'm also worried that Youtube is going to fill with a lot of digital waste: no one is interested in my gameplays (I have to admit I'm bad at videogames), so that footage won't be ever seen, but will still take some space on a random hard drive in the cloud. I hope the service won't store every gameplay ever played on Stadia.

Did you like the conference? Are you hyped? Are you critical? Let me know on Twitter !

Your blog should be rendered on the server

Alessandro Balzano

2019-02-17 16:37

"You probably don't need a single-page application" - Plausible"

This blog post from Plausible just reminded me of a a pet peeve of mine: blogs that are built as single page applications. I don't like being greeted with a blank page because I don't want to execute whatever code you send to my browser.

There are several use cases for single page applications, but blogs are not one of them, for several reasons. Some of them are already explained in the article, and I'm going to reiterate them too, but I want to add a different perspective to the Plausible's short essay.

I just want to read content

A blog just contains text and some media (images, audio or video). We don't need to download some Javascript, then execute it to request the real content and then show it to the user - why can't I just download the content?

Another advantage is that you don't need to a lot of work to let search engines index your work. Do you think it is a non-issue? Hulu would like to have a word with you. Due to some problems with their client-side-only rendering, Google could not index them anymore, destroying Hulu's previous work on their targeted keywords.

I don't care about the tool you use to generate your blog, whether you build it on your laptop and push via FTP or use Wordpress to write and publish your new eessays - it's just fine, all those workflows are valid - they let your readers (me!) just get the content they want to read.

If you really need to build your entire website using javascript, then please, I beg you: setup Server-side Static Rendering (SSR) to create a static "base" version of the website. Most libraries/frameworks support SSR, so just study your framework's documentation and set it up.

Javascript should enhance the experience

I really love New York Times'-style interactive data visualizations: they are part of the content they want to show you, so it's OK to enable Javascript to enjoy those applets. Even New York Times articles, though, are just text that are enhanced by the interactive applets - you still get the text. It's ok to require Javascript to load comments - especially if you use third-party services such as Disqus - because, y'know, comments are not a core part of the experience.

So, please, stop forcing javascript on your blog readers. Just let me read your content.

I hope you liked this new issue of Interesting Links, a column where I highlight interesting articles, with my own comments and thoughts. If you appreciated this small rant, tell me that on Twitter, or support me on Ko-fi.

Emptying Cashless Keys with Minizinc

Alessandro Balzano

2019-01-26 16:25

Since I bought my first cashless key to buy coffee at the company's vending machine, I never managed to spend every cent in the key - always fell short of 3 or 5 cents. I always wondered: which/how many drinks should I buy to reach zero? We will use Minizinc to answer this question.

What is Minizinc?

Minizinc is a free and open source constraint modeling language. It can be used to formally describe (model) constraint satisfaction and optimization problems in a high-level, solver-independent way. It is available for all major platforms as a command line tool, or bundled with an IDE, and several solvers.

In our case, we're going to solve an optimization problem! An optimization problem is the problem of finding the best solution from all solutions that respect the problem's constraints (feasible solutions). How do we define the best solution, though? We use an objective function: a function that, given a feasible solution, returns its value in the system we're studying.

Don't worry if you don't understand every single word right now, I'm going to explain them in a bit!

Let's write the model!

So, let's start by defining the parameters of our problem. We have a given amount in our key, and a list of beverages and their cost.

% the initial amount in our cashless key
int: START_AMOUNT;
% the list of beverages (coffee, tea, chocolate, ...)
enum BEVERAGES;
% how much does a beverage cost?
array[BEVERAGES] of int: COST;

Let's then define how we should spend our money! Our solution is defined as an array of integers - for each beverage b, quantity[b] is the number of times I have to drink b.

/* how many times do I have to drink a beverage B? */
array[BEVERAGES] of var int: quantity;

After defining the structure of the solution, we need to limit the values we can assign to the solution. Assignments to quantity must follow some rules - in this case, we must drink all beverages at least zero times, and we must spend every single cent we have. Every assignment that follows those rules is a feasible solution for the problem.

/* constraint: cannot drink a negative amount of drinks! */
constraint forall(b in BEVERAGES)(quantity[b] >= 0);

/* constraint: we must spend exactly START_AMOUNT */
var int: spent = sum(b in BEVERAGES)(quantity[b] * COST[b]);
constraint START_AMOUNT - spent = 0;

At the end, we want to check how we are going to spend our money. Our objective function is how many drinks we are going to buy. We want to find an assignment of quantity that maximizes the number of coffees we have to drink - hard earned money must not be wasted!

/* the objective function! */
var int: how_many_drinks;
constraint how_many_drinks = sum(b in BEVERAGES)(quantity[b]);

% use `maximize` to drink as many coffees as possible
% use `minimize` to cut down on coffees and get to zero euro as fast as possible
solve maximize how_many_drinks;

Results

Let's feed our model with a simplified instance of the problem: it will tell us that we can buy 8 coffees and 2 chocolates, for a total of 10 drinks. If we want to cut down on coffees, we can order 3 tea cups and 4 ginseng coffees.

shell % cat data.dzn
START_AMOUNT = 260; % 2.60 euro

BEVERAGES = { coffee, tea, ginseng, chocolate };
COST      = [     25,  40,      35,        30 ];
shell % minizinc model.mzn --data data.dzn
quantity = array1d(BEVERAGES, [8, 0, 0, 2]);
how_many_drinks = 10;
----------
==========
shell % # update the model to solve the minimization problem
shell % minizinc model.mzn --data data.dzn --all
quantity = array1d(BEVERAGES, [8, 0, 0, 2]);
how_many_drinks = 10;
----------
quantity = array1d(BEVERAGES, [2, 0, 0, 7]);
how_many_drinks = 9;
----------
quantity = array1d(BEVERAGES, [0, 0, 4, 4]);
how_many_drinks = 8;
----------
    quantity = array1d(BEVERAGES, [0, 3, 4, 0]);
how_many_drinks = 7;
----------
==========

Nothing stops us from adding more constraints, such as "I need to drink at least one chocolate", or "I don't want to buy more than one cup of tea", or that the number of drinks must be even because I offer coffees to a colleague. Try to experiment, add your own constraints, add more beverages and modify the prices!

References

Example gist: https://gist.github.com/alfateam123/3241bf8070d490677a08b13059d4c833
Basic Modeling on Coursera: https://www.coursera.org/learn/basic-modeling
Wikipedia "Optimization Problem": https://en.wikipedia.org/wiki/Optimization_problem

If you liked this article, Tag me in a photo of your favorite morning beverage, or consider offering me a Ko-fi !