The HeavyMole communication protocol

A. Overview

The HeavyMole network is composed of nodes, i.e. every participant is both a client and a server. Each node is uniquely identified by a Global User Identifier (GUID), a randomly chosen 64-bit integer.

Each node knows a limited number of neighbours with which it can communicate. These neighbours are stored in what we call the 'usertable'. The first task of the communication protocol is thus to build this usertable and to keep it up-to-date.

Each node can share a number of files. The set of files shared by one user is stored in a data structure that we will call a 'filetree'. To get better global search results, the filetree of a given node is duplicated on several other nodes. A second task of the protocol is thus to duplicate our filetree, to keep the copies up-to-date when our filetree changes, and to detect when a node that hosts our filetree, or whose filetree we host, goes on- or offline.

The third task of the protocol (in fact its main goal) is to send requests, to search hosted filetrees and to send answers.

B. The usertable

The usertable is a data structure containing a number of users. A set of coefficients that we call preference coefficients (PCs) is associated with each user (in fact, there is one PC per message type). They are used to choose the best addressees w.r.t. the type of the message to send, and to keep only the best neighbours. The usertable maintains an index by user GUID, and an index for each PC.
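
As an illustration, here is a minimal Python sketch of such a usertable (all names are hypothetical; the document does not specify the actual implementation):

    MESSAGE_TYPES = ("REQUEST", "LOGIN", "FILETREEPROPOSE")  # one PC per type

    class UserTable:
        def __init__(self, capacity):
            self.capacity = capacity
            self.users = {}                                  # index by GUID

        def insert_or_update(self, guid, address):
            if guid in self.users:
                self.users[guid]["address"] = address        # update existing user
            elif len(self.users) < self.capacity:            # insert if not full
                self.users[guid] = {
                    "address": address,
                    "pc": {t: 0 for t in MESSAGE_TYPES},     # one PC per message type
                }

        def best_users(self, msg_type, n):
            # Index by PC: the n users with the highest PC for this message type.
            ranked = sorted(self.users.items(),
                            key=lambda kv: kv[1]["pc"][msg_type], reverse=True)
            return [(guid, rec["pc"][msg_type]) for guid, rec in ranked[:n]]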

When the application is launched for the first time, the usertable is empty. To get started, we ask a central server for a number of users, randomly chosen among all those logged on to this server. Once this is done, we never have to ask the server again, unless all the users we know are offline.

When a LOGIN message (or any other message containing user information) is received from the network, the user it describes is put in the usertable if the table is not full. If the user already exists, its information is updated.

The usertable is constantly kept up-to-date as follows. Every N seconds, the next user in the table is pinged (PING). When the user replies (PONG), the round-trip time of the message is used to compute the PCs for this user. In fact, each PC has its own computation method based on the ping time and on other information (number of hosted files and request handling capacity) contained in the PONG message (see below for details about computing PCs). If the user has still not replied the next time it is pinged, all its PCs are set to 0.
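
A sketch of this maintenance loop, reusing the UserTable sketch above (send_ping, wait_pong and compute_pc are hypothetical callables, and the synchronous wait is a simplification of the real, asynchronous behaviour):

    import itertools
    import time

    TIMEOUT = 2.0   # seconds; also used by SCALED_PING_PONG below

    def ping_loop(table, send_ping, wait_pong, compute_pc, period_n):
        # Every N seconds, ping the next user in the table and recompute its
        # PCs from the round-trip time and the data carried by the PONG
        # (number of hosted files, request handling capacity).
        for guid in itertools.cycle(list(table.users.keys())):
            started = time.monotonic()
            send_ping(guid)                          # PING
            pong = wait_pong(guid, TIMEOUT)          # PONG, or None on timeout
            user = table.users[guid]
            if pong is None:
                if user.get("missed_ping"):          # still no reply since last turn:
                    user["pc"] = {t: 0 for t in user["pc"]}   # all PCs set to 0
                user["missed_ping"] = True
            else:
                user["missed_ping"] = False
                rtt = time.monotonic() - started
                for t in user["pc"]:                 # each PC has its own formula
                    user["pc"][t] = compute_pc(t, rtt, pong)
            time.sleep(period_n)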

When all users have been pinged, the usertable is checked. If there are not enough users or if the mean PC is too low, we ask other users on the network for some users from their own tables (ASKUSERTABLE/SENDUSERTABLE) and we remove bad users. If there are no online users at all (i.e. if the mean PC is 0), we fall back to a central server. Then we start pinging the users again.

NB: To avoid contacting central servers too often, and to cope with the case where all servers are unreachable, we also keep a second data structure on disk, containing all users ever received (with a maximum size of several thousand users). When we have no online users in the table, we first look in this structure and try its users until we have found M online users. This ensures that after some time using HeavyMole, the probability of knowing no online user becomes very low.

C. Filetree Hosting

As said before, each node tries to duplicate its filetree on several other nodes. More precisely, each node tries to keep a fixed number H of online copies of its filetree. To achieve this, we keep a list of online hosters, which is checked at regular intervals. When the number of online hosters drops below H, we send a FILETREEPROPOSE message on the network to ask other nodes to host our filetree. When a node receives such a message, it checks whether its hosting capacity is reached, and if not it replies to accept the filetree (FILETREEACCEPT). When the sender gets the reply, a TCP connection is established and the filetree is uploaded.
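
A minimal sketch of the periodic check (H and the helper name are hypothetical):

    def check_hosters(online_hosters, h, send_filetreepropose):
        # Called at regular intervals: when fewer than H copies of our
        # filetree are online, ask the network for new hosters with an
        # ARROW FILETREEPROPOSE message (see E.2.2). Nodes with spare
        # hosting capacity reply with FILETREEACCEPT.
        if len(online_hosters) < h:
            send_filetreepropose()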

Each node must therefore keep a list of hosted filetrees. When a filetree is downloaded, it is put in this list along with information about the hosted user. The list is updated in the same way as the usertable: every N seconds, the next hosted user is pinged (HOSTERPING). If the user does not reply (HOSTERPONG), we mark it as offline in the list. Conversely, when a node is pinged by one of its hosters, it knows that this hoster is online and adds it to its list of online hosters, if it is not already there.

When our filetree changes, we must warn the hosters. Each time our filetree is scanned, the difference with the previous version (the delta) is computed and saved, and a new version is created. In this way, we keep a trace of the history of all changes to the filetree. When a filetree is downloaded, the version number is also given. When a node pings a hosted user, it also reports the filetree version it holds. If a newer version is available, the update, composed of all deltas computed since that version, is uploaded and applied to the hosted filetree.
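
A sketch of this versioning scheme, assuming versions are numbered 0, 1, 2, ... and that deltas[v] turns version v into version v+1 (the delta format itself is not specified by this document):

    def build_update(deltas, hoster_version, current_version):
        # The update for a hoster is the list of all deltas computed since
        # the version it reported in its HOSTERPING.
        return [deltas[v] for v in range(hoster_version, current_version)]

    def apply_update(filetree, update, apply_delta):
        # The hoster replays the deltas in order on its copy of the filetree.
        for delta in update:
            filetree = apply_delta(filetree, delta)
        return filetree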

D. Searching

When we want to search the HeavyMole network, we first search the filetrees we host locally to get some immediate answers, then we send a REQUEST message. When a node receives a request, the request is put in a buffer and forwarded to other users (see below for more explanations on message forwarding). If the buffer is full, the request is dropped. When the request is extracted from the buffer and handled, all or part of the hosted filetrees are searched (searchingcapacity <= hostingcapacity). An ANSWER message is sent back to the sender of the request when one or more matching files are found (see A.? for details on request syntax and on searching a filetree). If there are too many answers to fit in one UDP packet, the answer is split over several packets.
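
A sketch of the request buffer described above (the buffer size and helper names are hypothetical):

    from collections import deque

    class RequestBuffer:
        def __init__(self, size=64):                 # size is hypothetical
            self.size = size
            self.pending = deque()

        def offer(self, request):
            if len(self.pending) >= self.size:
                return False                         # buffer full: request dropped
            self.pending.append(request)
            return True

        def handle_next(self, hosted_filetrees, searching_capacity,
                        search, send_answer):
            # Only searchingcapacity (<= hostingcapacity) trees are searched.
            if not self.pending:
                return
            request = self.pending.popleft()
            for tree in hosted_filetrees[:searching_capacity]:
                matches = search(tree, request)
                if matches:
                    send_answer(request.sender, matches)   # ANSWER via P2P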

E. Message transmission and forwarding

Now that we have seen all message types, why they are sent and how they are handled, let us see how these messages are sent and forwarded on the network.

First, messages can be split into two groups. The first group contains all messages sent by a node to another well-defined node: these are what we call the peer-to-peer messages. The second group contains all messages sent by a node with no well-defined destination, but intended to be received by many other users on the network: these are what we call the flood messages.

The HeavyMole communication protocol thus has two separate components:

  • the Peer-2-peer protocol (P2P): messages exchanged between 2 nodes of the HeavyMole network
  • the Flood Protocol: messages sent by one node to many other nodes

E.1 P2P messages transmission

There are 7 types of P2P messages currently used: PING, PONG, ANSWER, FILETREEACCEPT, HOSTERPING, HOSTERPONG, and SENDUSERTABLE.

P2P messages are simply sent by UDP to one other user. UDP has been chosen for several reasons:

  • Almost all messages fit into one UDP packet. Only ANSWER messages can span multiple packets, and even then the packets need not arrive in order.
  • It is never critical if packets are lost.
  • UDP allows on-the-fly communication without any performance penalty: we can send a single packet to another user without any connection-establishment overhead.

E.2 Flood messages transmission

Flood messages are also sent over UDP, for the same reasons as above. They can again be divided into two groups. One group contains messages that are to be sent to many other nodes: these are called FLOOD messages. The other group contains messages that are to be sent to a few random users on the network outside our immediate neighbourhood: these are called ARROW messages.

All these messages have in common that they are first sent to some users from the usertable, and are then forwarded by the receivers, and so on. All of them carry a time-to-live (TTL) that is decreased each time the message is forwarded. When a received message has TTL = 0, it is never forwarded. All of them also carry information about the sender. This allows the receiver to reply directly through P2P, but also to insert or update the sender in its usertable.

This is where the preference coefficients associated with the users are used. We said above that for each message type (excluding P2P messages, of course), a PC is defined, each with its own computation method, to choose the best addressees w.r.t. the message type. For each PC, there is a pool containing users chosen among the best w.r.t. this PC. This pool is recomputed each time the table is checked (see A.? for details about the algorithm used for extracting a pool from the usertable). When a message of a given type has to be sent or forwarded, the neighbours to which we send the message are chosen randomly from the current pool for this message type.
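
A sketch of pool extraction and addressee selection, reusing the UserTable sketch from section B (the real pool-extraction algorithm is the one referred to above; this sketch simply takes the n best users):

    import random

    def compute_pool(table, msg_type, pool_size):
        # Recomputed at each table check: users chosen among the best
        # w.r.t. the PC for this message type, as (user, PC) pairs.
        return table.best_users(msg_type, pool_size)

    def pick_addressees(pool, count):
        # Addressees are drawn at random from the current pool.
        return random.sample(pool, min(count, len(pool)))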

E.2.1 FLOOD messages

There are two types of FLOOD messages currently used: REQUEST and LOGIN.

All FLOOD messages carry a reference PC (REF_PC), and a minimum and maximum growing factor (MIN_GF and MAX_GF). When a node forwards a FLOOD message, it looks at the mean PC of its current sending pool. If this mean is under REF_PC, the message is forwarded to MIN_GF neighbours; otherwise it is forwarded to MAX_GF neighbours.

Thus the size of the neighbourhood reached by a message can be chosen with appropriate values of the initial TTL and of MIN_GF and MAX_GF. The REF_PC allows us to favour directions with more good nodes.

FLOOD messages originating from ourselves, or from a user from which we already received a FLOOD message very recently (within the last 10 seconds), are never handled nor forwarded. This prevents explosions in small networks.
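
Putting these rules together, a sketch of the FLOOD forwarding decision (msg is assumed to expose sender, ttl, ref_pc, min_gf and max_gf; the pool is a list of (user, PC) pairs; the same recently-seen check also suppresses handling, which is omitted here):

    import random
    import time

    recently_seen = {}    # sender GUID -> time we last saw a FLOOD message from it

    def forward_flood(msg, my_guid, pool, send_to):
        now = time.monotonic()
        if msg.sender == my_guid:
            return                                   # our own flood: drop
        last = recently_seen.get(msg.sender)
        recently_seen[msg.sender] = now
        if last is not None and now - last < 10.0:
            return                                   # seen in the last 10 s: drop
        if msg.ttl == 0:
            return                                   # TTL exhausted: never forwarded
        msg.ttl -= 1
        mean_pc = sum(pc for _, pc in pool) / len(pool)
        gf = msg.min_gf if mean_pc < msg.ref_pc else msg.max_gf
        for user, _ in random.sample(pool, min(gf, len(pool))):
            send_to(user, msg)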

E.2.2 ARROW messages

There are two types of ARROW messages currently used: FILETREEPROPOSE and ASKUSERTABLE.

ARROW messages, as already said, are intended to reach a few random users on the network outside our immediate neighbourhood. This is useful for FILETREEPROPOSE messages. Indeed, we have just seen that FLOOD messages (in particular REQUEST messages) are sent to a neighbourhood whose approximate size is chosen at sending time by appropriate parameter values. The main point of duplicating filetrees is to get better results when searching this neighbourhood, and it is of course useless for a given filetree to be hosted by 10 users inside the reached neighbourhood. What we want is to have our filetree hosted by nodes "far" from us, and this is what ARROW messages do. ARROW messages are also used for ASKUSERTABLE, to prevent the formation of unconnected sub-networks.

All ARROW messages carry a minimum TTL (MIN_TTL). When a node receives an ARROW message whose TTL is greater than MIN_TTL, the message is immediately forwarded to one random user from the pool for this message type. Otherwise, the message is handled; if it was not successfully handled and the TTL is not 0, the message is also forwarded to one user. Thus the number of nodes that will successfully handle the message is always less than or equal to the number of users to which we initially sent it.
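
A sketch of the ARROW rule (msg is assumed to expose ttl and min_ttl; try_handle is a hypothetical callable returning whether handling succeeded):

    import random

    def on_arrow_message(msg, pool, send_to, try_handle):
        if msg.ttl > msg.min_ttl:
            msg.ttl -= 1                             # still too close: fly on
            send_to(random.choice(pool), msg)
        elif not try_handle(msg) and msg.ttl > 0:
            msg.ttl -= 1                             # handling failed: pass it on
            send_to(random.choice(pool), msg)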

E.2.3 PC computing method

We saw above that when a message of a given type has to be sent or forwarded, the neighbours to which we send it are chosen randomly from the current pool for this message type. We are now going to see, for each message type, how the PC is computed.

Let's begin by defining all factors that will be used when computing the PC for each message type:

1. SCALED_NB_FILES scales the number of hosted files linearly between 0 [0 files] and 100 [>= 10000 files]

SCALED_NB_FILES = MIN(100, NB_HOSTED_FILES/100)

2. SCALED_PING_PONG scales the ping-pong time between 0 [TIMEOUT = 2 s] and 100 [0 s]

SCALED_PING_PONG = MAX(0, 100 * (1 - ((exp(PING_PONG/TIMEOUT) - 1) / (e - 1))))

3. SCALED_REQUEST_HANDLING_CAPACITY scales the request handling capacity linearly between -100 (buffer full) and 100 (buffer empty)

SCALED_REQUEST_HANDLING_CAPACITY = 2 * (50 - REQUEST_HANDLING_CAPACITY)

REQUEST_HANDLING_CAPACITY is defined as the fill ratio of the user's request buffer (between 0 and 100)
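
These three factors translate directly into code; a sketch in Python:

    import math

    TIMEOUT = 2.0   # seconds

    def scaled_nb_files(nb_hosted_files):
        # 0 for 0 files, 100 for >= 10000 files, linear in between.
        return min(100, nb_hosted_files / 100)

    def scaled_ping_pong(ping_pong):
        # 100 for an instant reply, down to 0 at TIMEOUT (exponential curve).
        return max(0, 100 * (1 - (math.exp(ping_pong / TIMEOUT) - 1)
                                 / (math.e - 1)))

    def scaled_request_handling_capacity(fill_ratio):
        # fill_ratio of the request buffer in [0, 100]:
        # +100 when empty, -100 when full.
        return 2 * (50 - fill_ratio)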

Now let's see how the PC is computed for each message type:

1. REQUEST messages

PC = SCALED_PING_PONG * 0.65
+ SCALED_REQUEST_HANDLING_CAPACITY * 0.50 (if < 0); * 0.25 (if > 0)
+ SCALED_NB_FILES * 0.10

The main component is the ping (adds [0, +65] to the PC). If the scaled request handling capacity is negative (the user's request buffer is more than half-full), the PC is decreased; otherwise it is increased (adds [-50, +25] to the PC). As a result, requests will be sent to users whose buffer is not too busy. The number of hosted files is also taken into account, but only for a small part (adds [0, +10] to the PC).

2. LOGIN and ASKUSERTABLE messages

PC = SCALED_PING_PONG

Only the PING-PONG time is taken into account.

3. FILETREEPROPOSE messages

PC = SCALED_PING_PONG * 0.50 + (100 - SCALED_NB_FILES) * 0.50

This will favour users with a good ping and not too many hosted files.

Note: when we must compute a pool of 'globally' good or bad users, we use a GLOBAL_PC, which is a combination of all the PCs:

GLOBAL_PC = PC_REQUEST * 0.75 + PC_LOGIN * 0.20 + PC_FILETREEPROPOSE * 0.05

This is used:

  • when bad users have to be removed
  • when we send a pool of users in reply to an ASKUSERTABLE message
  • when computing statistics about our usertable
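
Putting it all together, a sketch of the per-type PCs and of GLOBAL_PC, using the scaling functions above (a user is summarized here by its ping-pong time, buffer fill ratio and number of hosted files):

    def pc_request(ping_pong, fill_ratio, nb_files):
        rhc = scaled_request_handling_capacity(fill_ratio)
        return (scaled_ping_pong(ping_pong) * 0.65
                + rhc * (0.50 if rhc < 0 else 0.25)
                + scaled_nb_files(nb_files) * 0.10)

    def pc_login(ping_pong, fill_ratio, nb_files):
        return scaled_ping_pong(ping_pong)           # also used for ASKUSERTABLE

    def pc_filetreepropose(ping_pong, fill_ratio, nb_files):
        return (scaled_ping_pong(ping_pong) * 0.50
                + (100 - scaled_nb_files(nb_files)) * 0.50)

    def global_pc(ping_pong, fill_ratio, nb_files):
        args = (ping_pong, fill_ratio, nb_files)
        return (pc_request(*args) * 0.75
                + pc_login(*args) * 0.20
                + pc_filetreepropose(*args) * 0.05)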

F. Summary

The main features of the HeavyMole communication protocol are:

  • the network is composed of nodes, which are at the same time server and client
  • all messages are sent over UDP, which eliminates any connection overhead
  • each node has a table containing neighbours, used for transmitting and forwarding messages;
    when you send a message to one neighbour, it forwards it to some of its own neighbours, and so on
  • neighbours are rated with coefficients and, depending on the type of the message to forward, only the best neighbours are used for message transmission
  • the table containing neighbours is dynamic: it is constantly kept up to date by pinging users to get data about their current state, getting new neighbours from the network, and removing bad or offline neighbours
  • each user shares a number of files; references to one's files are duplicated in several places on the network in order to give better search results; these copies are constantly kept up-to-date: when files change, the deltas with respect to the previous version are sent to all the hosters
  • central servers are used for finding online neighbours if we don't know any;
    they are used ONLY for that purpose: searching does not imply any access to a central server,
    and there is no trace of shared files on these servers

The main advantages are:

  • network dynamics:
    • each node knows the current state of its neighbours and can act accordingly
    • it introduces a lot of randomness in the network, which prevents the formation of disconnected islands and gives a good dispersion of messages
    • the functioning does not depend on the size of the network; in other words, the protocol adapts its behaviour to it
  • message transport:
    • central servers are used to find the first neighbours, so no technical manipulation is needed before it works (contrary to Gnutella)
    • depending on the type of message to send, the best addressees are chosen among the known neighbours
  • query handling:
    • no monstrous central servers are needed to handle queries (contrary to Napster)
    • queries are forwarded to (and handled by) applications whose request buffer is not very busy, i.e. those that receive few queries or have lots of CPU power; in other words, it is CPU sharing in addition to file sharing
    • when you launch a query, you immediately get answers from the part of the network you hold in main memory
    • data are duplicated, so you can reach a big part of the network by sending messages to only a few other applications
  • security:
    • everybody knows only a small part of the network and there is no central place holding data about who shares what
    • operations performed are known only by the applications concerned

Appendix. Format of messages

1 P2P Protocol

1.1 P2P Header

- All messages of the P2P protocol have a common header, with the following format:

AAABUUUU

(A) Message type: 3 bits
the message type can take the following values:
- PING 0
- PONG 1
- ANSWER 2
- FILETREEACCEPT 3
- HOSTERPING 4
- HOSTERPONG 5
- SENDUSERTABLE 6

(B) Sender attached: 1 bit
set to 1 if the GUID of the sender is attached after the header

(U) 4 bits unused

- If Sender Attached is 1, the user GUID is attached immediately after the header, and has the following format:

AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA

(A) GUID: 64 bits
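
A sketch of packing and parsing this header in Python (that field A occupies the most significant bits, and network byte order for the GUID, are assumptions not specified by the diagrams):

    import struct

    P2P_TYPES = {"PING": 0, "PONG": 1, "ANSWER": 2, "FILETREEACCEPT": 3,
                 "HOSTERPING": 4, "HOSTERPONG": 5, "SENDUSERTABLE": 6}

    def pack_p2p_header(msg_type, guid=None):
        # AAABUUUU: message type in the top 3 bits, sender flag in bit 4.
        first = (P2P_TYPES[msg_type] << 5) | ((1 if guid is not None else 0) << 4)
        data = struct.pack("!B", first)
        if guid is not None:
            data += struct.pack("!Q", guid)          # 64-bit sender GUID
        return data

    def unpack_p2p_header(data):
        msg_type = data[0] >> 5
        if (data[0] >> 4) & 1:                       # sender attached?
            (guid,) = struct.unpack_from("!Q", data, 1)
            return msg_type, guid, data[9:]          # rest is the message body
        return msg_type, None, data[1:]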

1.2 PING message

PING messages carry no data

1.3 PONG message

AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
BBBBBBBB

(A) Number of hosted files: 32 bits

(B) Request handling capacity: 8 bits

1.4 ANSWER message

AAAAAAAA

(A) Request number: 8 bits

This header is followed by:
- a list of files (short n + n * File)
- information about the owner of these files (MainUser)
- the owner's state: online or offline (char)

1.5 FILETREEACCEPT message

AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBB

(A) IP address of the TCP server where the filetree can be uploaded: 32 bits
(B) port of this server: 16 bits

1.6 HOSTERPING message

AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCC

(A) IP address of the TCP server where the update can be uploaded: 32 bits
(B) port of this server: 16 bits
(C) version number of the filetree I host: 32 bits

1.7 HOSTERPONG message

AAAAAAAA

(A) Upload Handling Capacity: 8 bits

1.8 SENDUSERTABLE message

N * the following record (one per user):

AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCC
DDDDDDDDDDDDDDDD
EEEEEEEEEEEEEEEE
EEEEEEEEEEEEEEEE
FFFFFFFF

(A) GUID: 64 bits
(B) IP Address: 32 bits
(C) P2P port: 16 bits
(D) FLOOD port: 16 bits
(E) Number of hosted files: 32 bits
(F) Request Handling Capacity: 8 bits
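
A sketch of encoding and decoding this list of records (network byte order is an assumption; the record layout follows the diagram above, with a 32-bit file count as the diagram shows):

    import struct

    # GUID (64) + IP (32) + P2P port (16) + FLOOD port (16)
    # + hosted files (32) + request handling capacity (8) = 21 bytes.
    USER_RECORD = struct.Struct("!QIHHIB")

    def pack_sendusertable(users):
        # users: iterable of (guid, ip, p2p_port, flood_port, nb_files, rhc)
        return b"".join(USER_RECORD.pack(*u) for u in users)

    def unpack_sendusertable(data):
        return [USER_RECORD.unpack_from(data, off)
                for off in range(0, len(data), USER_RECORD.size)]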

2 Flood Protocol

2.1 Flood Header

- All messages of the flood protocol have a common header, with the following format:

ABBCDDDD

(A) Header type: 1 bit
the header type can take the following values:
- FLOOD 0
- ARROW 1

(B) Message type: 2 bits
the message type can take the following values:
- LOGIN 0
- REQUEST 1
- FILETREEPROPOSE 2
- ASKUSERTABLE 3

(C) Sender attached: 1 bit
set to 1 if sender information is attached after the header
currently, all messages have this bit set

(D) Time-to-live (TTL): 4 bits
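
A sketch of packing and parsing this common byte (again assuming field A occupies the most significant bit):

    FLOOD_TYPES = {"LOGIN": 0, "REQUEST": 1, "FILETREEPROPOSE": 2,
                   "ASKUSERTABLE": 3}

    def pack_flood_header(header_type, msg_type, sender_attached, ttl):
        # ABBCDDDD: header type (1 bit), message type (2 bits),
        # sender flag (1 bit), TTL (4 bits, so at most 15).
        assert 0 <= ttl <= 15
        return bytes([(header_type << 7) | (FLOOD_TYPES[msg_type] << 5)
                      | (sender_attached << 4) | ttl])

    def unpack_flood_header(byte):
        return (byte >> 7, (byte >> 5) & 0b11, (byte >> 4) & 1, byte & 0b1111)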

- Messages with the FLOOD header type have a second header with the following format:

AAAAAAAA
BBBBCCCC

(A) Reference PC (REF_PC): 8 bits, value between 0 and 100

(B) Minimum growing factor (MIN_GF): 4 bits

(C) Maximum growing factor (MAX_GF): 4 bits

- Messages with the ARROW header type have a second header with the following format:

AAAAAAAA

(A) Minimum TTL: 8 bits

- If Sender Attached is 1, user information is attached immediately after the header, and has the following format:

AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCC
DDDDDDDDDDDDDDDD
EEEEEEEEEEEEEEEE
EEEEEEEEEEEEEEEE
FFFFFFFF

(A) GUID: 64 bits
(B) IP Address: 32 bits
(C) P2P port: 16 bits
(D) FLOOD port: 16 bits
(E) Number of hosted files: 32 bits
(F) Request Handling Capacity: 8 bits

2.2 LOGIN message

LOGIN messages carry no data

2.3 REQUEST message

AAAAAAAA
BBBBBBBB
...

(A) Request number: 8 bits
(B) Request string: variable length

2.4 ASKUSERTABLE message

AAAAAAAAAAAAAAAA

(A) Maximum number of wanted users: 16 bits

2.5 FILETREEPROPOSE message

AAAAAAAAAAAAAAAA
BBBBBBBB

(A) Size of the filetree: 16 bits

(B) Upload Handling Capacity: 8 bits
