Version 16 (Sputnick, 01/30/2014 12:30 AM) → Version 17/20 (Sputnick, 01/30/2014 12:30 AM)
h1. The Quassel Protocol
h2. Overview
When we talk about the "Quassel protocol", we mean the format of data sent between a Quassel core and connected Quassel clients. At the moment (i.e., as of version 0.9), only one protocol - the "legacy protocol" - is in use. It has evolved from Quassel's early days and hasn't really changed all that much over the years. However, back then we didn't really expect Quassel to ever become popular, much less other developers writing alternate clients such as QuasselDroid or iQuassel. Accordingly, instead of designing (and documenting) a well-defined and easy-to-use format, we chose a rather pragmatic approach for sending data over the wire: Because Qt already had a facility to (de)serialize arbitrary data types over a binary stream - using QDataStream - we simply went with that.
While straightforward and easy to implement in Quassel, this choice turned out to be rather unfortunate in retrospect:
* QDataStream's serialization format is not the most efficient one. In particular, strings are serialized as UTF-16, which means that almost half of the data exchanged between client and core consists of null bytes. However, Quassel partially compensates for this by using compression where possible.
* Speaking of which, we don't use streaming compression, which means that lots of potential for benefitting from recurring strings is not used. And since many of the objects we send are key/value maps which tend to have the same set of keys every time, this does matter in practice.
* And to add insult to injury, we waste even more space all over the place because we simply didn't think about optimizing the protocol. Mobile use of Quassel was just not on our radar in 2005.
* The serialization format is not documented anywhere in a concise and complete way. Yes, there's documentation somewhere in Qt for built-in types; for Quassel's own types, however, one would have to hunt through the source. And without reading (and understanding) some rather icky parts of Quassel code, it's close to impossible to understand what's going on even if one manages to deserialize the binary data into proper objects. Bad news for people wanting to write alternate clients. Amazingly, some smart people still managed to reverse-engineer the protocol...
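To make the first point above concrete, here is a minimal Python sketch of how a string ends up on the wire. It assumes the QString layout described in Qt's QDataStream documentation (a 32 bit big-endian length in bytes, followed by the UTF-16 big-endian data). The function names are made up for this illustration and are not part of Quassel.

```python
import struct


def serialize_qstring(text):
    """Serialize text the way QDataStream is documented to write a QString.

    Assumed layout: 32 bit big-endian byte count, then UTF-16BE payload.
    """
    payload = text.encode("utf-16-be")
    return struct.pack(">I", len(payload)) + payload


def null_byte_share(data):
    """Return the fraction of bytes in data that are zero."""
    return data.count(0) / len(data) if data else 0.0


if __name__ == "__main__":
    name = "PRIVMSG"
    utf16 = name.encode("utf-16-be")
    print(serialize_qstring(name).hex())
    print("UTF-16 payload bytes:", len(utf16))
    print("UTF-8 payload bytes: ", len(name.encode("utf-8")))
    print("share of null bytes: ", null_byte_share(utf16))
```

For plain ASCII text such as IRC commands and most key names, every second payload byte is zero, which is where the "almost half" figure comes from.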
To fix these and more issues, we're now planning to replace the legacy protocol with something more sensible. As the first (and most complicated) step, we implemented a protocol abstraction that will allow us to support alternative formats much more easily. As a neat side effect, the resulting refactoring also makes some core parts of the code (e.g. SignalProxy and the initial handshake) much nicer to understand.
h2. The Master Plan
# [DONE] Refactor the code base to have all protocol-related stuff centralized at one location.
# [WiP] Implement a way to probe a core for the supported protocols and options. This will allow for supporting additional features or another format later without relying on fragile guesswork; in particular, we can enable things like compression or encryption before starting the real handshake (in the legacy protocol, this information is sent as properties in QVariantMaps during the handshake phase). It would be beneficial to get this completed prior to the release of Ubuntu 14.04 LTS.
# [WiP] Optimize the current (legacy) protocol in easy ways for clients/cores connecting with the new handshake. "Easy" means that neither the semantics nor the QDataStream-based serialization change, so 3rd party clients won't have to change much to support this; basically we want to take the opportunity to fix some stupid things in the legacy protocol in a straightforward way. The list of planned optimizations includes removing the current overhead in the per-message serialization (nesting multiple layers of QVariants and QByteArrays and sending the block size several times as a result); serializing strings in the fixed message headers (e.g. method and object names) as UTF-8 instead of UTF-16; and changing the InitData format for networks (in particular the usersAndChannels part of it) to significantly cut down the size of the initial sync data. We'd also want to switch from per-message compression to streaming compression, which should increase the compression ratio of the protocol significantly, considering that in particular key names of QVariantMaps are repeated all the time.
# [NOT STARTED] Evaluate different wire formats as alternatives to QDataStream, without changing the protocol semantics. This should allow for a more efficient data exchange without immediately breaking 3rd party or older clients (or cores); it will also show if the protocol abstraction done in Step 1 is sane and working. Google Protobuf seems like a good contender for an additional wire format.
# [NOT STARTED] Refactor the protocol semantics. Most importantly, this includes removal of side effects for object syncing, and switching to events. It may also include moving the client state into the core. Note that this will completely break compatibility, and we are not sure if it's feasible to retain backwards compatibility at least for a while.
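The expected gain from streaming compression mentioned in step 3 can be illustrated with a small Python experiment. This is only a sketch: zlib stands in for whatever compressor the protocol ends up using, and the sample messages are invented strings with recurring key names, not real Quassel messages.

```python
import zlib


def sample_messages(count=50):
    """Build hypothetical messages that repeat the same key names."""
    template = "sender=user%d;bufferId=%d;type=1;flags=0;content=hello"
    return [(template % (i, i)).encode("utf-16-be") for i in range(count)]


def per_message_size(messages):
    """Total size when every message is compressed on its own."""
    return sum(len(zlib.compress(m)) for m in messages)


def streaming_size(messages):
    """Total size when one compressor is kept for the whole connection.

    Z_SYNC_FLUSH emits all pending output after each message, so the
    receiver can decode it right away while the dictionary is retained.
    """
    comp = zlib.compressobj()
    total = 0
    for m in messages:
        total += len(comp.compress(m))
        total += len(comp.flush(zlib.Z_SYNC_FLUSH))
    return total


if __name__ == "__main__":
    msgs = sample_messages()
    print("uncompressed:", sum(len(m) for m in msgs))
    print("per-message: ", per_message_size(msgs))
    print("streaming:   ", streaming_size(msgs))
```

With per-message compression, each message starts from an empty dictionary, so recurring key names are paid for again every time. A streaming compressor can refer back to earlier messages instead.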
h3. Requirements for new protocols
h3. Keeping compatibility
TBD: for how long?
h2. Abstract View [DRAFT]
h3. Handshake
h4. Probing
Because we might want to support more than one protocol, we cannot start to send messages right away. First, both client and core need to agree on which protocol to use and whether to enable things like compression or SSL. Therefore, right after the connection has been established, a few well-defined bytes are exchanged to probe for the capabilities on both ends and to determine in which way the real data is going to be exchanged. Note that the probing data is sent in network byte order (big endian), as is customary for network protocols.
# The client sends a 32 bit value to the core to initiate the connection. The upper 24 bits contain the magic number 0x42b33f. The lower 8 bits contain a set of global connection features (such as compression or SSL support) as defined in the Protocol::Feature enum. Since the resulting value is larger than 0x00400000, legacy (pre-0.10) cores will immediately close the connection. The client can detect this and reconnect in compatibility mode.
# Immediately afterwards, the client sends a list of the protocols it supports, in order of preference. For each protocol, a 32 bit value is sent, where the lower 8 bits contain the protocol type according to the Protocol::Type enum, and bits 8-23 hold protocol-specific data (for example, the protocol version). Bit 31 being set indicates the end of the list; the client then waits for a response from the core.
# Based on this information, the core will select the protocol to use for the connection. It will then reply with a 32 bit value of its own similar to the one it just received; it will contain the chosen protocol in the lowest byte, and protocol-specific data in bits 8-23. The upper byte holds the global connection features (Protocol::Feature) to be enabled.
# Immediately afterwards, compression and encryption will be enabled on both ends if applicable, and the socket is handed over to the appropriate protocol handler, ending the probing phase.
_Note: The legacy protocol determines the supported and enabled feature set, as well as the protocol version, only during the handshake phase. Therefore, both compression and encryption are turned on later in the process. Also, a client reconnecting in compatibility mode will skip the probing phase and proceed directly with the legacy handshake._
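The bit layout of the probing steps above can be sketched in Python as follows. This is a minimal illustration, not Quassel's implementation. The numeric values given for the Protocol::Feature and Protocol::Type constants are assumptions made for the example and should be checked against protocol.h. The sketch also assumes that bit 31 is set on the last entry of the protocol list.

```python
import struct

MAGIC = 0x42b33f           # from the text above (upper 24 bits)
END_OF_LIST = 0x80000000   # bit 31

# Assumed enum values, for illustration only; see protocol.h.
FEATURE_ENCRYPTION = 0x01
FEATURE_COMPRESSION = 0x02
PROTO_LEGACY = 0x01
PROTO_DATASTREAM = 0x02


def build_probe(features, protocols):
    """Build the client's probing data (steps 1 and 2).

    protocols is a list of (type, data) pairs in order of preference.
    """
    if not protocols:
        raise ValueError("at least one protocol is required")
    words = [(MAGIC << 8) | (features & 0xFF)]
    for index, (ptype, pdata) in enumerate(protocols):
        word = (ptype & 0xFF) | ((pdata & 0xFFFF) << 8)
        if index == len(protocols) - 1:
            word |= END_OF_LIST  # assumption: flag marks the last entry
        words.append(word)
    # Network byte order: big-endian unsigned 32 bit values.
    return struct.pack(">%dI" % len(words), *words)


def parse_reply(data):
    """Split the core's 32 bit reply (step 3) into its fields."""
    (word,) = struct.unpack(">I", data[:4])
    return {
        "type": word & 0xFF,
        "data": (word >> 8) & 0xFFFF,
        "features": (word >> 24) & 0xFF,
    }


if __name__ == "__main__":
    probe = build_probe(FEATURE_COMPRESSION,
                        [(PROTO_DATASTREAM, 1), (PROTO_LEGACY, 0)])
    print(probe.hex())
    print(parse_reply(bytes.fromhex("02000102")))
```

Note that the first value is always larger than 0x00400000 regardless of the feature bits, which is what makes a legacy core drop the connection.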
h4. Init and Authentication
Immediately after the probing phase, client and core start exchanging a set of handshake messages. The handshake is performed by ClientAuthHandler and CoreAuthHandler respectively, based on a set of "abstract messages":https://github.com/quassel/quassel/blob/master/src/common/protocol.h#L56. The sequence of messages exchanged should be fairly straightforward to deduce from the code (and message names), so for now this documentation must suffice.
h3. SigProxy Mode
The handshake phase ends successfully with the core sending a SessionState message to the client. After that, communication switches to using "four different SignalProxy messages":https://github.com/quassel/quassel/blob/master/src/common/protocol.h#L170, plus a pair of heartbeat messages that are sent in regular intervals. Note that the semantics of the legacy protocol are too complex to be documented here for the moment, and the Master Plan intends to make this much easier in the future.
h2. On-Wire Format
The full communication between client and core is expressed semantically by the messages declared in "protocol.h":https://github.com/quassel/quassel/blob/master/src/common/protocol.h. However, these messages need to be serialized somehow to be sent over the wire. Starting with Quassel 0.10, the architecture supports offering more than one serialization format; serializers are implemented as subclasses of RemotePeer (which handles the high-level socket communication). A serializer, or _peer_ as we call it, basically translates the abstract messages into something that can be sent and received over the network.
During the probing phase described above, client and core negotiate which protocol (i.e., serialization format) to use. Following is a list of the available protocols.
h3. Legacy Protocol
This is the protocol that has been in use unmodified since Quassel 0.5 or even earlier. It is implemented by the "LegacyPeer":https://github.com/quassel/quassel/tree/master/src/common/protocols/legacy.
As mentioned before, the legacy protocol evolved over time and has several deficiencies besides being based on the QDataStream format (which is easy to use as long as you're using Qt, but requires lots of effort to support in other languages and frameworks). However, Quassel 0.10 still supports it.
h3. DataStream Protocol
The DataStream Protocol as implemented by the DataStreamPeer is intended to be a streamlined version of the legacy protocol. The basic idea is to take the opportunity to break the format with the advent of the new probing phase, while at the same time not placing too much of a burden on third-party client developers to encourage migration. This means that
* the serialization of individual data types is still based on QDataStream and unchanged from the legacy protocol, and
* the semantics of the protocol (i.e. the contents, meaning and sequence of messages) are identical to the legacy protocol, with a few exceptions described below.
However, the DataStream protocol _does_ change the serialization of messages (as opposed to types) in order to reduce overhead, and it features some straightforward optimizations.
In order to ease the work for third-party client developers, here is the full list of differences from the legacy protocol.
h4. Differences between the DataStream and the legacy protocols
_TBW_
h2. Compatibility Notes
Both client and core are fully backwards-compatible with older cores and clients. During the probing phase, a communication attempt from an older version can be detected; the rest of the probing phase is then skipped, and the legacy protocol is enabled instead. Developers of third-party clients are encouraged to keep supporting older cores by detecting them as follows: Upon receiving the magic number (0x42b33fXX), a legacy core will immediately close the connection (without sending any data back). This should be detected by the client, and a reconnect using the legacy protocol (and no probing) should be performed.
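A third-party client could implement this detection roughly as in the following Python sketch. It is an illustration under stated assumptions rather than a reference implementation: the function names and the timeout are made up, and the probing data is passed in as ready-made bytes.

```python
import socket


def recv_exact(sock, count):
    """Read exactly count bytes, or return None if the peer closes first."""
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf


def probe_core(host, port, probe, timeout=5.0):
    """Send the probing data and return the core's 4 byte reply.

    Returns None if the core closed the connection without answering,
    which is how a legacy (pre-0.10) core reacts to the magic number.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            sock.sendall(probe)
            return recv_exact(sock, 4)
        except ConnectionError:
            # A reset while sending or reading also means "closed on us".
            return None
```

If @probe_core@ returns None, the client would open a fresh connection and start the legacy handshake directly, without sending any probing data. Timeouts and other socket errors are deliberately left to the caller here, since they indicate a network problem rather than a legacy core.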
When connecting to a 0.10+ core that supports probing, the protocol to be used, as well as SSL encryption and compression, are negotiated. It is of course recommended to provide support for the DataStream protocol, as it is more efficient than the legacy protocol; however, the legacy protocol can still be selected by clients which do not yet support the DataStream protocol. Note that the legacy protocol, even if negotiated by probing a 0.10 core, behaves _exactly_ as in older versions. This implies that SSL encryption and compression are enabled only during the handshake phase based on deprecated properties in handshake messages, and that the support for those features negotiated during the probing phase is ignored by the core.