Saturday, April 26, 2014

ns-3 Testing - If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.

Testing...

A lot of books have been written about software testing, and there are university courses about software testing. Hell, it's a full research field, and I can't pretend to say anything close to meaningful about it in one post. Why should I write something about it?

The answer is simple: to explain (once more) why testing is important, and how to approach the testing phase in ns-3.

First and foremost: there's an ns-3 manual section about testing, and it even explains what the generic goals of software testing are (Correctness, Validation and Verification, Robustness, Performance, and Maintainability). Amazing, isn't it? (The most amazing part is that nobody seems to read the manual, or even know what's inside it.)

Anyway... to test or not to test, that's the question. Not really, tests are necessary, so there's no question. The question may be about what to test.

Before writing a test, one has to consider that ns-3 is a network simulator. If you're writing tests, you probably wrote a protocol, and a protocol involves two (or more) network entities, exchanging data (packets) with delays and random errors / losses.
For a software engineer this is often shocking. Testing is much like any other kind of software test, with the twist that API calls can be deferred, mangled, etc. You transmit 10 and the receiver understands 100.

And this is the first lesson: it's not all about positive tests (i.e., conformance testing), it's just as much about negative tests: you have to test what happens when the "normal" operation fails.
You have to test this: A sends [x] to B, B replies [f(x)] to A.
But also: A sends [x] to B, B understands [x'] and replies [f(x')] to A, which understands [g(x')]. A conversation between the deaf, and nothing should go crazy in this (besides the developer, of course).

Ok, this is good, but it's too generic. So... some rules. In order to make an effective test for a protocol you should do the following:
  1. Static test (format): check that the packets are well formed.
  2. Positive test (conformance): check that the behaviour you have is the expected one.
  3. Negative test: try shuffling the packet order, losing a packet and so on.
  4. Vulnerability test (the paranoid's one): try messing with the packet's data. Packet fields representing lengths are a good start.
Point 3 is often not performed in simulators, but it may be worth considering if the simulator has to be used in emulation mode. In this case... more to follow.
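
To make the four points a bit less abstract, here's a minimal sketch. It is not ns-3 code: ToyMessage and ParseToyMessage are made-up names, and the [type][length][payload] layout is an assumption chosen only for illustration. The idea is that the parser refuses any buffer whose length field disagrees with what actually arrived, which is the kind of thing the negative and vulnerability tests should poke at.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical toy message, laid out on the wire as:
//   [type: 1 byte][length: 1 byte][payload: 'length' bytes]
struct ToyMessage
{
  uint8_t type;
  std::vector<uint8_t> payload;
};

// Returns the parsed message, or std::nullopt if the buffer is malformed.
std::optional<ToyMessage>
ParseToyMessage (const std::vector<uint8_t> &buf)
{
  if (buf.size () < 2)
    {
      return std::nullopt; // truncated: not even a full header
    }
  const std::size_t declared = buf[1];
  if (buf.size () - 2 != declared)
    {
      return std::nullopt; // the length field lies about the payload
    }
  ToyMessage msg;
  msg.type = buf[0];
  msg.payload.assign (buf.begin () + 2, buf.end ());
  return msg;
}
```

A well-formed buffer such as {1, 2, 0xAA, 0xBB} should parse (positive test). A one-byte buffer, or one that declares 200 payload bytes but carries a single one, should be rejected without anything going crazy (negative and vulnerability tests).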

Point 1 is important, but you need something able to understand the packet's format. A popular choice is Wireshark. However, Wireshark has its limits too. If the protocol is new, chances are that Wireshark cannot "dissect" its packets. Moreover, Wireshark itself may have bugs, or may not understand all the protocol's options. E.g., right now Wireshark understands only the ETX metric in RPL. It can't understand the HC metric. In this case, the only option is inspecting the packet byte by byte. Terribly boring, but necessary.

There's a last kind of test, the duck test: If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.
In other words, compare against a real system.

This is the most advanced possible test. And it's one of the most valuable and most useless tests at the same time: the comparison against a reference implementation.
Before even thinking of doing it, consider the following:

  1. You need to have an implementation to compare against.
  2. You need to set up a test environment with both your stack and the "other" stack.
And this is where doubts can arise.
  • If you have a working implementation, why the hell did you develop the protocol in the simulator? (false problem, maybe the simulator is the reference implementation, maybe you want to change stuff and in the real system you can't, maybe you need the power of the simulator to perform massive tests).
  • Something is not working as expected, who's the bad guy? (welcome to cross-implementation debugging).
  • How can I force the "other" system to send me "wrong" packets?
Well, I'm not going to explain all the possible cases, let's just say that testing against a real system is like doing normal tests, just much harder.

BUT, there's the duck.
As in: you're calling your protocol a duck because it behaves like a duck, but... who told you that the thing you're calling a duck is a duck?
If you're comparing against a "bad" example, you'll tune your system to that bad example, and both will be consistent with each other... in doing the bad thing. "Follow my steps" [Marty Feldman in Frankenstein Junior].

The bottom line: tests are important. Do them, and make sure they cover the most important parts of the protocol. Don't avoid them just because they're hard to write. Bugs are more often than not spotted by tests rather than by examples, because an "example" only shows you how the system performs when all is right.

Tuesday, April 22, 2014

Silence is golden

My granny used to say: if you don't have anything nice to say, don't talk.

This is not the case. I have plenty of stuff to say, but I've been a bit busy. In strictly random order:

  • A house to buy
  • GSoC and SOCIS to apply for, students' applications to review, scores to give and so on
  • Financed projects to apply for (they'll be financed... if they're accepted!)
  • Papers to write (yes, I have to write papers as well)
  • Sleep, drink and eat. Sometimes breathe as well.
I have some busy days. But it could be worse.

Anyway, the next posts will be about useful but terribly boring things.
  1. Debugging - How I Learned to Stop Worrying and Love the Bug
  2. Valgrind - O Bug, Where Art Thou?
  3. Testing - If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.
  4. Documentation - <put here your quote>
Have fun, all!

Thursday, March 27, 2014

An effective path to learn C++ (or any other programming language)

~~ I have all the answers, but I don't remember the questions. ~~

Yes, this post is about learning C++, but the suggestions apply to any other programming language.
I could have named it "The True System to Learn C++ in an Easy and Painless Way in less than 30 minutes", but I'm a bit shy.

First and foremost, a little bit of history.
I didn't learn programming in a class, I'm self-taught.
Well, that's not totally true. I took a programming 101 course at University. I learnt FORTRAN (it was The language for scientific applications back then, and no, I'm not that old).

Anyway, at one point I decided I needed a more "modern" language, and I decided to learn C.
I hate from the bottom of my heart the "X for Dummies" kind of books, so I took The Source: The K&R (Brian W. Kernighan and Dennis Ritchie, The C Programming Language).
I loved and hated that book. It's a beautiful manual, it tells you exactly all you need to know about C, but it tells you nothing about how to use it. Well, it does... but just a little bit. It's as if somebody explained to you exactly how an engine works and how to build all the parts, but not how to actually build one.
It's the sort of book assuming that, if you know exactly how the parts work, then putting it all together is easy. It isn't.

Then I moved to C++, and since I learn from experience, I grabbed The Book: Bjarne Stroustrup, The C++ Programming Language. Ok, that wasn't a smart move.
I loved that book, but it's a manual as well. I had to read it THREE times to start understanding something. It makes it clear from the beginning that C++ is powerful but it's not simple. It explains everything (and by everything, I mean it). It's a reference manual. But definitely it's not a simple way to learn C++.
I still suggest buying it, and keeping it on your desk as the Reference Manual. Not as a learning book, tho.

I've learnt C++ the hard way, and I was able to write fairly complex programs. By fairly complex I mean a whole network simulation framework and a good part of the DVB-RCS protocol stack.

Still, it wasn't the end. A friend suggested taking a look at a new thing coming out: Design Patterns. So I grabbed The Book (it's a habit, a bad one): E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software.
Great book. A punch in the groin, tho. It was like "Oh, great, I did everything wrong so far". But no good advice on how to do it right.

So, 2 books and some hundreds of pages later, I had the theory, but not the practice.

Then, something changed. I started using the things I had studied. And slowly the pieces went where they were meant to be.

How? Well, that's the point. The sooner you start this phase, the sooner you'll master the programming language (it took me years, just because nobody told me the trick).

So, here are the tricks. Follow them and you'll master C++ in no time (and any programming language).

Trick #1: grab a GOOD book.

This one has been suggested by Mathieu Lacage, and it's a very good one: A. Shalloway and J. Trott, Design Patterns Explained: A New Perspective on Object-oriented Design.

The real trick is: don't try to learn Object Oriented programming and move to Design Patterns once you have mastered OO. Learn them at the same time. The reason? They're two sides of the same coin.

Trick #2: start programming for real.

Don't do the book's exercises. Well, do them, but don't do just those.
Find a good excuse to program (e.g., join an Open Source project like ns-3) and use your skills.

Sure thing, you'll find yourself like a fish out of water at first, but you'll learn quickly. You don't learn how to drive a car by reading a book, you have to drive.
At first you'll make mistakes, and your code will be ugly. However, in due time, you'll learn the tricks and you'll start becoming better and better.
Remember, practice makes perfect, not studying alone. Practice without study is bad, study without practice is equally bad.

The real suggestion here? Find something that matches your interests. You have to find it interesting, not boring. And if you choose something only because it matches your job, it may be boring as hell.

If you follow the two simple tricks I told you, you'll become a programming master in no time: less than 1 year.

But... but... you said 30 minutes...
I lied.

Sunday, March 16, 2014

How to develop a new protocol

Today a not-so-small post about protocol development.

One of the recurring themes in the ns-3 users group is:
"How do I develop [my own, a new, a modification] [routing, scheduling, MAC, etc.] protocol."

There are many possible answers, depending on the case, but basically the root is always the same. When you reach the point of asking this, you have made a mistake in your development process.

I'll limit the scope of this post to a subset of the above possibilities. However, I'll try to be as generic as possible, in order to fit also the other cases.
Let's assume the question is:
"How do I develop a new routing protocol."
and let's assume you want to use ns-3. Because this blog is about ns-3, mainly.

In order to write a new routing protocol (any protocol, indeed), the following steps are required:
  1. Requirement analysis
  2. Protocol Design
  3. Development
  4. Test
Usually you'll want to iterate through these steps (at least through 2-4), much like in agile programming.
Let's see what one should expect from each phase.

Requirement analysis

The requirement analysis should fix what your goals are, the nodes' capabilities and so on. Don't ever say "it's obvious". Requirements are never obvious. Check RFC 5826 for an example of a routing requirements document.

Protocol Design

This is the phase where you write down your protocol. Remember that you have to write (at least) THREE things:
  1. Format - how the packet is made, the length of each field, the encoding, etc.
  2. Syntax - what's the meaning of the data carried by the packet (again, it's not "obvious")
  3. Semantic - what's the behaviour of a node upon receiving a packet, and what, when and why a packet has to be sent.
A lot of people forget one of these elements, basically making it impossible to implement the idea.

One very important thing: remember the data. Data are not "known", they must be measured. A node doesn't know the channel quality, it measures the channel quality. If you plan to use something in your protocol, make sure you're able to measure it!
I've seen far too many "scientific" papers with this mistake. A success, for the authors. A fail, for the reviewers. A shame, for the scientific community.

Summarising:
  • Collect data and extract measures from data.
  • Send the measures to other nodes when something happens.
  • React upon receiving a message.
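
Just to fix the idea of "measured, not known", here's a minimal sketch of a node estimating a link's quality from what it observes. Everything in it is an assumption made up for this example: the ToyLinkEstimator name, the choice of an exponentially weighted moving average, and the 0..1 scale. A real protocol would define its own metric.

```cpp
#include <vector>

// Hypothetical estimator: the node does not "know" the link quality,
// it derives it from the delivery outcomes it actually observes.
class ToyLinkEstimator
{
public:
  explicit ToyLinkEstimator (double alpha) : m_alpha (alpha) {}

  // Feed one observation: true if the packet was delivered.
  void Record (bool delivered)
  {
    const double sample = delivered ? 1.0 : 0.0;
    if (!m_hasMeasure)
      {
        m_quality = sample; // first observation seeds the estimate
        m_hasMeasure = true;
      }
    else
      {
        m_quality = m_alpha * sample + (1.0 - m_alpha) * m_quality;
      }
  }

  // Until something has been observed there is nothing to advertise.
  bool HasMeasure () const { return m_hasMeasure; }
  double Quality () const { return m_quality; }

private:
  double m_alpha;
  double m_quality = 0.0;
  bool m_hasMeasure = false;
};

// Convenience helper for trying the estimator on a fixed sequence.
double
QualityAfter (const std::vector<bool> &outcomes, double alpha)
{
  ToyLinkEstimator est (alpha);
  for (bool delivered : outcomes)
    {
      est.Record (delivered);
    }
  return est.Quality ();
}
```

Note the HasMeasure () check: before the first observation the node has no measure at all, and the protocol has to say what happens in that case.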

Development

If you reach this stage, and your design was done in the right way, it's simply a matter of coding.

The format: define a new set of headers (or data packets) carrying your information. Remember that measures will have to be device-independent. As an example, suppose you want to send "3.14". You will not send a double in the packet, you'll send a given number of bits representing "3.14". There are a number of ways to do this, just choose the one that suits you.
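
One of the many possible ways, as a sketch: unsigned fixed point. The 16-bit width and the 8 fractional bits are assumptions picked for the example, and EncodeFixed / DecodeFixed are made-up names, not ns-3 functions.

```cpp
#include <cmath>
#include <cstdint>

// Assumed convention: unsigned 16-bit fixed point with 8 fractional
// bits, i.e. a resolution of 1/256 and a range of [0, 255.996].
constexpr double kFixedScale = 256.0;

uint16_t
EncodeFixed (double value)
{
  double scaled = std::round (value * kFixedScale);
  if (!(scaled >= 0.0))
    {
      scaled = 0.0; // negative values and NaN are clamped to zero
    }
  if (scaled > 65535.0)
    {
      scaled = 65535.0; // too large: saturate instead of wrapping
    }
  return static_cast<uint16_t> (scaled);
}

double
DecodeFixed (uint16_t wire)
{
  return wire / kFixedScale;
}
```

With this convention "3.14" travels as the integer 804 and is read back as 3.140625. The receiver never sees the exact value, only something within half a step (1/512) of it, and the protocol specification should state that.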

The syntax: it's how you write and read the packets. What's the meaning of each field. In ns-3 terms, it's the Serialize and Deserialize functions.
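
A standalone sketch of what such a pair of functions boils down to. This is deliberately not the ns-3 Header API: ToyHeader, its three fields and the two function names are assumptions for illustration only. What matters is that the byte order is fixed by the protocol (network byte order here), not by the machine.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical header: 1 byte type, 2 bytes metric, 4 bytes sequence.
struct ToyHeader
{
  uint8_t type;
  uint16_t metric;
  uint32_t seq;
};

constexpr std::size_t kToyHeaderSize = 7;
using ToyWire = std::array<uint8_t, kToyHeaderSize>;

// Write the fields most significant byte first (network byte order).
ToyWire
SerializeToy (const ToyHeader &h)
{
  ToyWire out{};
  out[0] = h.type;
  out[1] = static_cast<uint8_t> (h.metric >> 8);
  out[2] = static_cast<uint8_t> (h.metric & 0xff);
  out[3] = static_cast<uint8_t> (h.seq >> 24);
  out[4] = static_cast<uint8_t> ((h.seq >> 16) & 0xff);
  out[5] = static_cast<uint8_t> ((h.seq >> 8) & 0xff);
  out[6] = static_cast<uint8_t> (h.seq & 0xff);
  return out;
}

// Read the fields back in exactly the same order.
ToyHeader
DeserializeToy (const ToyWire &in)
{
  ToyHeader h;
  h.type = in[0];
  h.metric = static_cast<uint16_t> ((in[1] << 8) | in[2]);
  h.seq = (static_cast<uint32_t> (in[3]) << 24)
          | (static_cast<uint32_t> (in[4]) << 16)
          | (static_cast<uint32_t> (in[5]) << 8)
          | static_cast<uint32_t> (in[6]);
  return h;
}
```

A round trip (serialize, then deserialize) must give back the same fields. That's the first static test to write.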

The semantic: open some sockets to receive the messages from other nodes, open some sockets to send the packets to other nodes (they could as well be the same ones), and write the logic of each function.

Testing

Once everything is done, remember to test your shiny new protocol. I'd strongly suggest writing specific tests to check all the functionalities, especially the error cases. You know, shit happens, so better to check that your protocol gracefully recovers from it.

ns-3 specific tricks

This is quite specific to ns-3, but any simulation (or real) system will have something similar.
Let's not forget that we were talking about a routing protocol. Thus, it's logical that the protocol semantic will build a routing table. Not a big issue. Usually a routing table is little more than: 1) destination (e.g., 2001:db8:f00d:cafe::/60), 2) next hop (e.g., fe80::21b:63ff:fef0:6acd), 3) interface, 4) prefix to use (if necessary).

The routing table will be used in two places: RouteOutput and RouteInput.
  • RouteOutput: the obvious one. It's called when a packet is being sent.
  • RouteInput: the not-so-obvious one. It's called when the packet is received, to decide if it has to be forwarded or it's for the node itself.
Indeed, RouteInput is the real "routing" one: it decides if the packet has to be forwarded, what outgoing interface has to be used in the forwarding, etc.
RouteOutput "simply" finds out the optimal routing when a packet is sent from the local node.
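
The two roles can be sketched with a toy table. Heavy assumptions here: this is not the ns-3 routing API, addresses are plain 32-bit numbers instead of IPv6 addresses to keep it short, and all the names (ToyRoute, ToyRoutingTable, Lookup, OnReceive, MakeExampleTable) are invented for the example.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

struct ToyRoute
{
  uint32_t prefix;    // destination network
  uint8_t prefixLen;  // number of significant bits, 0..32
  uint32_t nextHop;
  uint32_t ifIndex;   // outgoing interface
};

inline uint32_t
MaskFor (uint8_t len)
{
  if (len == 0) return 0u;
  if (len >= 32) return 0xFFFFFFFFu;
  return 0xFFFFFFFFu << (32 - len);
}

class ToyRoutingTable
{
public:
  enum class Verdict { LocalDeliver, Forward, Drop };

  void AddRoute (const ToyRoute &r) { m_routes.push_back (r); }
  void AddLocalAddress (uint32_t a) { m_local.push_back (a); }

  // The "RouteOutput" side: longest-prefix match on the destination.
  std::optional<ToyRoute> Lookup (uint32_t dst) const
  {
    std::optional<ToyRoute> best;
    for (const ToyRoute &r : m_routes)
      {
        const uint32_t mask = MaskFor (r.prefixLen);
        if ((dst & mask) == (r.prefix & mask)
            && (!best || r.prefixLen > best->prefixLen))
          {
            best = r;
          }
      }
    return best;
  }

  // The "RouteInput" side: is the packet for us, or to be forwarded?
  Verdict OnReceive (uint32_t dst) const
  {
    for (uint32_t a : m_local)
      {
        if (a == dst) return Verdict::LocalDeliver;
      }
    return Lookup (dst) ? Verdict::Forward : Verdict::Drop;
  }

private:
  std::vector<ToyRoute> m_routes;
  std::vector<uint32_t> m_local;
};

// Example table: 10.0.0.0/8 via interface 1, 10.0.1.0/24 via
// interface 2, and 10.0.0.42 as the node's own address.
inline ToyRoutingTable
MakeExampleTable ()
{
  ToyRoutingTable t;
  t.AddRoute ({0x0A000000u, 8, 0x0A000001u, 1});
  t.AddRoute ({0x0A000100u, 24, 0x0A000101u, 2});
  t.AddLocalAddress (0x0A00002Au);
  return t;
}
```

In the example table a packet for 10.0.1.5 matches both routes and takes the more specific one (interface 2), a packet for 10.0.0.42 is delivered locally, and a packet for 192.168.0.1 has no route and is dropped.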

When a packet is created and sent, it will pass through RouteOutput multiple times (yes, more than once, thanks to UDP, IP, etc.), then through RouteInput (once for each router it passes through and once for the destination node).


Suggestions to understand all this? Find a "simple" routing protocol and study its specification and the implementation.
Example: RIPng for IPv6 (RFC 2080). It's short, simple and well written. The implementation for ns-3 is expected to be available starting from ns-3.20.

Saturday, March 15, 2014

Multiple IPv6 global addresses on an interface

Today no rants. A how-to instead.

Ns-3 is a great simulation framework. It's powerful, complex, and the documentation is good. However, we can't document everything. As a consequence, sometimes a how-to is useful.

Of course ns-3 has a how-to wiki page, but sometimes it's better to write interesting stuff also elsewhere.

Anyway, no more chitchatting. The how-to.

Suppose you have 2 nodes, each one with its NetDevice. You install IP and assign addresses. Business as usual.
Now you decide that you want to assign a specific address to a node. Not just a specific one, one extra address.

Let's say that you want this:

Node 0:
1- fe80::200:ff:fe00:1/64
2- 2001:db8::200:ff:fe00:1/64
3- 2001:1:f00d:cafe::42/64

Node 1:
1- fe80::200:ff:fe00:2/64
2- 2001:db8::200:ff:fe00:2/64
3- 2001:1:f00d:cafe::666/64

Note the SLAAC configured addresses (1 and 2) and the manually added addresses (3).

How to do that?

Easy... sort of. It's easy once you know how.
  // d is the NetDeviceContainer
  Ipv6AddressHelper ipv6;
  NS_LOG_INFO ("Assign IPv6 Addresses.");
  Ipv6InterfaceContainer i = ipv6.Assign (d);

  // arbitrary address assignment
  {
    Ptr<NetDevice> device = d.Get (0);
    Ptr<Node> node = device->GetNode ();
    Ptr<Ipv6> ipv6proto = node->GetObject<Ipv6> ();
    // ask the Ipv6 object which interface index this NetDevice has
    int32_t ifIndex = ipv6proto->GetInterfaceForDevice (device);
    Ipv6InterfaceAddress ipv6Addr = 
       Ipv6InterfaceAddress (
          Ipv6Address ("2001:1:f00d:cafe::42"), Ipv6Prefix (64));
    ipv6proto->AddAddress (ifIndex, ipv6Addr);
  }


That's for Node 0. Node 1 is similar.

Easy? Yes.
Obvious? Nope.

What's the problem, and why isn't it easier?
The answer is: you need the interface index. Unfortunately, only the Ipv6 class knows the interface index for a given NetDevice, so you first have to find what that index is, then it's a matter of one instruction.

If you think that the interface is at index 1 (because it is), then you're going to have a bad time.
That's because the interface 0 is the Loopback, and you can't be totally sure that a given NetDevice will have a specific index at IP level.
You can, of course, put a static number there (e.g., if the Node has just one NetDevice, it's kinda safe to assume that the index is 1). However, this assumption will not hold anymore if you add more NetDevices.

Anyway, adding multiple IPv6 global addresses to a NetDevice is possible, and it's not too hard once you know how to do it.

BTW... let's try a little test. Why do the two nodes need two global addresses? What can't address 2 do?

Thursday, March 13, 2014

Mercurial on OS X, tips

Extremely small post on something useful.

The ns-3 source control management of choice is Mercurial.
Installing Mercurial on OS X is not an issue, just download the package and install it. Simple as that.
Once installed, you can use a terminal to give all the commands you want, or even use a GUI (see Mercurial Mac native GUIs).

What I missed the most, however, is the ability to auto-complete commands from the terminal.
In Linux you can write "hg p" and hit tab. The result is:
tommaso:ns-3-dev pecos$ hg p
parents  patch    paths    phase    pull     push     

or even... "hg qpush --move" + tab:
tommaso:ns-3-dev pecos$ hg qpush --move 
Ndisc.diff               TagFragmentation.diff    issue6821106_36001.diff

Handy... but it's not working on OS X. However there is a way to enable it. And it's easy too.
  1. Download the mercurial source tarball (http://mercurial.selenic.com/downloads) and choose "Mercurial xxxx source release" where xxxx is the latest version;
  2. Unpack the file, and open the folder. Open the folder named "contrib";
  3. Copy the file named bash_completion somewhere (e.g., $HOME/bin);
  4. Modify the bash profile to source it;
  5. Reopen your terminals and enjoy.
Let's do a practical example. Let's assume the latest Mercurial version is 2.9.1, and let's assume my bash profile is named ~/.profile
The commands to give are:
tommaso:~ pecos$ cd ~
tommaso:~ pecos$ cp Downloads/mercurial-2.9.1/contrib/bash_completion ~/bin
tommaso:~ pecos$ nano ~/.profile
and add the following line to .profile:
source $HOME/bin/bash_completion

Note that in some systems ".profile" could be named ".bash_profile". It's the same stuff.

If you're a terminal guy (like me) you'll really enjoy it.

Tuesday, March 4, 2014

How to ask for help in a user forum

Today no rants, I'll write a how-to.

The occasion for this comes from the ns-3 user forum. It's a kinda peculiar forum, as "users" are not really simply users.

Ns-3 is a network simulator, and is GUI-less. There are some graphical front ends, but the primary way to work with it is by writing a C++ or Python program.
As a consequence, its users are (mostly) able to develop a program, sometimes even very complex ones.

No matter how good the user is, often they find issues, and the forum is where they look for help.
Some posts in the forum are "easy" to answer to, some are... less. And this is why I'm writing this how-to.

  1. The post topic: state your issue briefly.
  2. The post: describe your issue in detail.
  3. The attachments: use them, if needed.
  4. The code in the post: don't do it.
  5. Reply to old posts: avoid it.
  6. Patience is a virtue: don't expect a reply in hours.

Let's see what I mean with some examples.

You have an issue. The first thing is to search the forum for old posts. Maybe somebody had the same issue and it was solved. Let's suppose you found something similar, but no definitive answer was given (or you can't fix your issue following the thread you found).
This is a case where you may reply in the thread, but keep in mind that ns-3 is an evolving system (around 4 releases/year), and the code base may be very different from, let's say, 4-5 releases ago.
A better option is to start a new thread and link the relevant threads in your post.

You don't find anything. You start a new thread. Do not post a message with a "help needed" topic. Be specific. The more specific, the better.

The post body: state exactly what your problem is, along with relevant info, like: ns-3 version, operating system, compiler version, whether you modified the code, etc.
If you have issues with one of your programs, do not post snippets, attach the code to the message. Don't copy it in the body, attachments are there for a reason.

And then... patience. There are a few people checking the forum almost daily, but they're not paid for that. So, wait. If you don't get an answer, don't give up. Try to write to the code maintainers, and be polite, they're not paid either.

The last thing is a suggestion (a strong one). Be polite. If you're planning to ask somebody to give you his/her code, remember to ask 'em kindly. "Please" is just 6 keystrokes. Moreover, especially if the posts are old (let's say, more than 6 months old), chances are that the original poster isn't following the forum anymore. Try writing to them directly, it might work.

Have fun coding!