Difference between revisions of "Pyogp/Client Lib/Packet"

From Second Life Wiki

Revision as of 08:36, 21 August 2008

This information can also be found in a couple of places. It is listed here as another explanation, hopefully to make it more understandable.

General Knowledge

Bit manipulation

1 Byte = 8 Bits
One Hex character uses 4 bits, or half a Byte. So it takes 2 Hex characters to make up a Byte.

  • bits 11110000 = F0 hex (or 0xF0)
  • in other words, hex 0xF0 can be turned into 11110000

This means that a given Byte can be broken up into two groups of 4 bits, with each group of 4 bits being represented by a single Hex value. The first 4 bits (starting from the left) are the most significant bits. The last 4 bits (the four bits at the right) are the least significant bits.

  • In the binary number 11110000, the first four bits (1111) are the most significant bits and the last four bits (0000) are the least significant bits
  • Shown in hex, 0xF0, where F are the most significant bits, 0 are the least significant bits

Examples

  • 00010001 binary (bits) = 11 hex = 17 decimal
  • 00010001 = 0x11 = 17
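
The conversions above can be checked quickly in Python. This is only a small sketch; the variable names are made up for illustration.

```python
value = 0xF0

# Split the byte into its two groups of 4 bits.
high_bits = (value >> 4) & 0x0F   # most significant 4 bits
low_bits = value & 0x0F           # least significant 4 bits

# Show the same byte as a string of 8 bits.
bits = format(value, "08b")

# Go from bits to decimal and to hex text.
seventeen = int("00010001", 2)
hex_text = format(seventeen, "02x")

print(bits, hex(high_bits), hex(low_bits), seventeen, hex_text)
```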

Zero-encoding

Some packets can be zero-encoded. Zero-encoding means that any run of consecutive zero bytes in the packet is replaced by 2 bytes: the first is a 0 byte ('\x00') and the second is how many zeros there are. This is used to compress messages and lower network bandwidth a bit.
For instance:
'\x01\x00\x00\x00\x00' becomes
'\x01\x00\x04'
You'll notice that the last 4 bytes, which are a series of zeros, are replaced by 2 bytes: a '\x00' followed by a '\x04'. The first byte signals a series of zeros, and the second gives how many zeros there are.
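
A minimal sketch of zero-encoding and decoding on a raw byte string might look like the following. The function names zero_encode and zero_decode are made up for this page and are not taken from Pyogp. Runs longer than 255 zeros are split into several pairs, and the decoder assumes well-formed input (every 0 byte is followed by a count).

```python
def zero_encode(data: bytes) -> bytes:
    """Replace each run of 1 to 255 zero bytes with b'\\x00' + count."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            run = 0
            # Count zeros, but never more than fits in one count byte.
            while i < len(data) and data[i] == 0 and run < 255:
                run += 1
                i += 1
            out += bytes([0, run])
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def zero_decode(data: bytes) -> bytes:
    """Expand each b'\\x00' + count pair back into a run of zeros."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            out += b"\x00" * data[i + 1]   # assumes a count byte follows
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


encoded = zero_encode(b"\x01\x00\x00\x00\x00")
print(encoded)
print(zero_decode(encoded))
```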


Packets

This is an attempt to make packets more clear, using specific examples in Python.

Packets are made up of a header and a body. To construct the packets in Python, use the Python struct module. The struct module has a function called pack, which takes a format string describing how the values will be laid out, followed by the values themselves (they can be written as hex literals). The pack function returns a byte string (a str in Python 2, bytes in Python 3) that can be sent over the network through a socket. On the receiving end, the unpack function turns the bytes back into values that are easier to read (hex, decimal, etc). For more information, see Struct module.
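
As a quick sketch of pack and unpack, here is the same kind of 4-byte unsigned integer written in both byte orders. The ">" prefix selects big-endian and "<" selects little-endian; the variable names are only for illustration.

```python
import struct

# A 4-byte unsigned integer in big-endian form (as used in the header).
big = struct.pack(">I", 2)

# A 4-byte unsigned integer in little-endian form (as used in the payload).
little = struct.pack("<I", 3)

# unpack always returns a tuple, even for a single value.
(sequence,) = struct.unpack(">I", big)
(ack_id,) = struct.unpack("<I", little)

print(big, little, sequence, ack_id)
```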

Overview

Byte 0: packet flags
Bytes 1 - 4: sequence number
Byte 5: extra header information byte (thought about as the offset length to the data/payload)
Bytes 6 - n: packet payload. This is a sequence of block information and block variable information.
    START (REPEAT FOR EACH BLOCK)
        If block type is VARIABLE, one byte dedicated to number of repeats for the block
            START (REPEAT FOR EACH VARIABLE)
                If variable type is VARIABLE, either 1,2, or 4 bytes is dedicated to the size of the variable
                Variable data
            END
    END

Something you may want to know is that a packet mixes big-endian and little-endian data. The packet header, which is made up of the flags, the sequence number, and the extra header offset (as well as the extra header, if there is any), is always in big-endian form, and so are the frequency and message number bytes that start the body. The payload that follows is usually in little-endian form. There are some exceptions: any data added to the payload that is of the type VARIABLE, UUID, IP_ADDRESS, or IP_PORT is always encoded in big-endian form.

Header

The header is usually made up of 6 bytes (Bytes 0 through 5), all in Big Endian form.

Byte 0 (unsigned char)

This should be a 2-digit hex value. The 4 most significant bits are used to indicate characteristics of the packet. The 4 least significant bits are unused. So, as a hex value, the value should always be of the form 0xV0, where V holds the flags that are set. The following flags are available:

  • 0x80 - LL_ZERO_CODE_FLAG - 0's in packet body are run length encoded, such that series of 1 to 255 zero bytes are encoded to take 2 bytes.
  • 0x40 - LL_RELIABLE_FLAG - This packet was sent reliably (implies please ack this packet)
  • 0x20 - LL_RESENT_FLAG - This packet is a resend from the source.
  • 0x10 - LL_ACK_FLAG - This packet contains appended acks.

Note that these can be combined in a single packet by using bit manipulation. E.g. 0x80 | 0x40 is a zero-coded and reliable packet. It seems these flags can be added onto any packet. The zero-code flag gets added when a packet can be compressed (it has a string of zeros in it that can be zero-coded to save bytes). The reliable flag is added when we want an ack from the destination to tell us that it received our packet. The resent flag is added to indicate that this message was resent because we didn't get an ack when we wanted one (meaning this packet could be a duplicate if the server got our packet but just took too long to ack us). The ack flag is added if we have attached acks to the end of the packet we are sending (this can be done for any packet and saves some network traffic).
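
Combining and testing the flags can be sketched as below. The flag values come from the list above; the helper describe_flags is a made-up name for illustration.

```python
LL_ZERO_CODE_FLAG = 0x80
LL_RELIABLE_FLAG = 0x40
LL_RESENT_FLAG = 0x20
LL_ACK_FLAG = 0x10

FLAG_NAMES = [
    (LL_ZERO_CODE_FLAG, "LL_ZERO_CODE_FLAG"),
    (LL_RELIABLE_FLAG, "LL_RELIABLE_FLAG"),
    (LL_RESENT_FLAG, "LL_RESENT_FLAG"),
    (LL_ACK_FLAG, "LL_ACK_FLAG"),
]


def describe_flags(flags: int) -> list:
    """Return the names of all flags set in the flags byte."""
    return [name for bit, name in FLAG_NAMES if flags & bit]


# Combine flags with bitwise OR, test them with bitwise AND.
zero_coded_reliable = LL_ZERO_CODE_FLAG | LL_RELIABLE_FLAG
print(hex(zero_coded_reliable))
print(describe_flags(ord("P")))   # 'P' is the byte 0x50
```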

Bytes 1-4 (unsigned long)

Bytes 1 through 4 (4 Bytes is the size of a typical integer, or long) are the sequence number. This number is passed back and forth between the client and server to note which packet is being sent in the sequence. The sequence number essentially represents where the packet is in the sequence of communication between client and server, so that we can be sure we are receiving messages in order. The first packet, then, should have the sequence number 1, or 0x00000001.

Byte 5 (unsigned char)

This is a byte that signifies if there is any extra header information. It is usually 0 to signify that there is not an extra header. If it is non-zero, the number represents how many extra bytes are in the extra header.
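
Putting the three header fields together, a sketch using the struct module could look like this. The format ">BIB" means big-endian, one unsigned char, one 4-byte unsigned integer, one unsigned char. The names pack_header and unpack_header are made up for this page.

```python
import struct

HEADER = struct.Struct(">BIB")   # flags, sequence number, extra header length


def pack_header(flags: int, sequence: int, extra: bytes = b"") -> bytes:
    """Build the header, followed by any extra header bytes."""
    return HEADER.pack(flags, sequence, len(extra)) + extra


def unpack_header(packet: bytes):
    """Return (flags, sequence, extra_length, body_start)."""
    flags, sequence, extra_length = HEADER.unpack_from(packet, 0)
    # The body starts right after the fixed header and any extra header.
    body_start = HEADER.size + extra_length
    return flags, sequence, extra_length, body_start


header = pack_header(0x00, 2)
print(header)
print(unpack_header(header))
```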

Body

Bytes 6-n (sequences of unsigned chars) aka Packet ID

The next series of Bytes are used to represent the Packet ID (or Message ID). The Packet ID is used to determine which message is being sent, and therefore find out what the message data format will be. However, the Packet ID is actually logically broken up into 2 parts: the message frequency and a message ID number. The combination of the frequency and the Packet ID can be used as a key in the Message Template to find exactly what message is being sent. They take the form (frequency, message ID number). The Message Template can be found in scripts\messages\message_template.msg of the release, which will be included in the Pyogp project repo.

  • As an example, a key to get the AddCircuitCode packet will be ('Low', 2), with 'Low' being the frequency and 2 being the message ID number. However, this combination is actually represented by a single sequence of Bytes, 0xFFFF0002. The AddCircuitCode packet can be determined either directly by the value the Bytes represent, or we can parse the Bytes to find its frequency and message ID. The Message Template represents the packets by the form ('Low', 2) and not by the hex value, so we usually must parse the Bytes to find the frequency and ID in them.
  • ('Low', 2) is the AddCircuitCode packet, aka, 0xFFFF0002
  • ('Medium', 2) is the MultipleObjectUpdate packet, aka, 0xFF02
  • ('High', 2) is the CompletePingCheck packet, aka, 0x02

In order to determine the Packet ID to be used as a look-up into the Message Template, we have to parse Bytes 6-n. The Packet ID takes up 1, 2, or 4 Bytes, hence the 6-n. The number of Bytes used is determined by the frequency of the packet. Every message has a particular frequency (also called its category), which reflects how often it is typically sent over the network. There are 4 different frequencies of messages: "Low", "Medium", "High", and "Fixed". Due to their frequency, messages are assigned a certain number of Bytes to represent their Packet IDs (a High frequency packet should need fewer Bytes so as not to clog the network). High frequency packets have a single Byte to represent the Packet ID. Medium frequency packets use 2 Bytes to represent their Packet ID. Low frequency packets use 4 Bytes to represent their Packet ID. Knowing how many Bytes each category of messages uses, we also have to know how to parse the Bytes to determine their frequency and message ID number.

Fixed:  FF FF FF xx 
Low:    FF FF xx xx 
Medium: FF xx .. .. 
High:   xx .. .. ..

Where FF means that the Byte must be FF, and where xx represents where the message ID number will be in the sequence of Bytes.
High frequency messages

  • Bytes: 1 Byte
  • How to identify: does not have 0xFF as Byte 6
  • Values: assigned (from the template file) numbers 0x01 - 0xFE
  • Example: 0x01 - ('High', 1)
  • Example: 0xFE - ('High', 254)

Medium frequency messages

  • Bytes: 2 Bytes
  • How to identify: 0xFF as Byte 6, does not have an 0xFF as Byte 7
  • Values: assigned (derived from numbers in the template file) numbers 0xFF01 - 0xFFFE
  • Example: 0xFF01 - ('Medium', 1)
  • Example: 0xFFFE - ('Medium', 254)
  • Notice that in both cases, the first FF is not used for the message ID number. The first FF Byte is used only to indicate that it is a medium frequency packet. The following Byte (or 2 Hex values) represents the message ID.

Low frequency messages

  • Bytes: 4 Bytes
  • How to identify: 0xFFFF as Byte 6 and Byte 7
  • Values: assigned (derived from numbers in the template file) numbers 0xFFFF0001 and up
  • Example: 0xFFFF0001 - ('Low', 1)
  • Example: 0xFFFFFFF9 - ('Low', 65529)
  • Notice that in both cases, the first FFFF Bytes are not used for the message ID number. The first FFFF Bytes are only used to indicate that it is a low frequency packet. The following 2 Bytes (or 4 Hex values) actually represent the message ID.

Fixed Frequency

Messages with "Fixed" frequency are really those with fixed message numbers, i.e. the numbers are assigned in the message_template.msg file itself. There are currently 6 of these, 0xFFFFFFFA - 0xFFFFFFFF. Even though all messages have numbers assigned to them in the template file, the Fixed frequency is kept around for legacy reasons.
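
The rules above can be sketched as one small function. The name parse_message_id is made up for this page. As an assumption made to agree with the value ranges listed above, a 4-byte ID is treated as Fixed only when it falls in 0xFFFFFFFA - 0xFFFFFFFF, and the Fixed message number is taken from the last byte; everything else that starts with FF FF is treated as Low. The function also assumes the packet is long enough to hold the whole ID.

```python
def parse_message_id(packet: bytes, offset: int = 6):
    """Return (frequency, message_number, id_length) for the ID at offset."""
    first = packet[offset]
    if first != 0xFF:
        return "High", first, 1

    second = packet[offset + 1]
    if second != 0xFF:
        return "Medium", second, 2

    # The last two bytes are read as one big-endian number.
    number = (packet[offset + 2] << 8) | packet[offset + 3]
    if number >= 0xFFFA:
        # Assumption: 0xFFFFFFFA - 0xFFFFFFFF are the Fixed messages.
        return "Fixed", packet[offset + 3], 4
    return "Low", number, 4


print(parse_message_id(b"\xff\xff\x00\x02", 0))
print(parse_message_id(b"\xff\x02", 0))
print(parse_message_id(b"\x02", 0))
print(parse_message_id(b"\xff\xff\xff\xfb", 0))
```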

Bytes after Packet ID aka Payload

After we get and decode the Bytes for the Packet ID, we can look up the packet in the Message Template and determine what its payload will be. We can figure out what blocks and data will be in the payload and so we can read in all the data.
What is written as payload depends on the message. Each message has a different payload, based on the message's block specifications as well as each of the blocks' variable specifications.
After the Packet ID is decoded each of the blocks for the message are written to the payload. A message can have any number of blocks. The first thing written for a block depends on the block type. If the block type is VARIABLE, then the first byte written for the block is the number of repeats for the block. Note: if it is MULTIPLE we don't have to write it because it is stored in the template (and is therefore FIXED). Also, if it is SINGLE, we don't have to write the number because it is always 1.
Then, the variables for the block are written to the payload. Again, the first thing written depends on the type of the variable. If the variable is type VARIABLE (confusing, I know), then the first thing written for the variable is the size of its data. The size field is 1, 2, or 4 bytes long, as determined by the message template (and hence read from the template). In other words, if the template lists the variable as VARIABLE 1, then the length is stored in 1 byte, so the data can be at most 255 bytes long. If it is VARIABLE 2, then the length is stored in 2 bytes (max length 65535).
Finally, the actual data is packed. The data is packed based on the variable type. For instance, if the type is U8, then a single unsigned byte is written to the payload and so on.

Also note that the entire payload (everything after the frequency and message number bytes) is encoded in little-endian format. Everything before that is encoded in big-endian format.
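
As a rough sketch of writing a payload, the helpers below write a block repeat count, a U32, and a VARIABLE-type variable. All function names are made up for this page. The byte order of a 2 or 4 byte size field is not spelled out here, so little-endian is an assumption; for a 1-byte size it makes no difference.

```python
import struct

# Assumed formats for the size prefix of a VARIABLE 1, 2, or 4 variable.
SIZE_FORMATS = {1: "<B", 2: "<H", 4: "<I"}


def pack_variable(data: bytes, size_bytes: int) -> bytes:
    """Write the size of the data, then the data itself."""
    limit = (1 << (8 * size_bytes)) - 1
    if len(data) > limit:
        raise ValueError("data too long for VARIABLE %d" % size_bytes)
    return struct.pack(SIZE_FORMATS[size_bytes], len(data)) + data


def pack_u32(value: int) -> bytes:
    """A U32 is 4 bytes, little-endian like most of the payload."""
    return struct.pack("<I", value)


def pack_variable_block(repeats: list) -> bytes:
    """A VARIABLE block starts with one byte holding the repeat count."""
    return bytes([len(repeats)]) + b"".join(repeats)


# One block with a single U32 variable set to 3.
print(pack_variable_block([pack_u32(3)]))
print(pack_variable(b"Locklainn", 1))
```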

Bytes after Payload aka Acks

After the payload, there may be some acks added to the end of the packet. For instance, if we received any packets that had the reliable flag, then it means they are waiting for us to ack that packet. To save some network traffic, instead of sending a completely new PacketAck packet, we can just add these acks onto the end of any packet. Note that to ack a packet, we simply have to send the sequence number of the packet we want to ack.
The way this is done is by:

  1. Setting LL_ACK_FLAG in the packet's flags
  2. Adding the list of packet sequence numbers that we want to ack to the end of the packet
  3. Adding the number of acks we have added to the end of the list.

Note: when reading a packet with acks attached, we first determine if the flag is set, then we read the last byte in the packet to determine how many acks are attached, and then read the acks starting from the back. Then we go to the front and read the normal payload.
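
Reading the attached acks can be sketched as below. The name split_acks is made up for this page. The acks are read as 4-byte little-endian numbers, which is an assumption taken from the worked example further down this page.

```python
import struct

LL_ACK_FLAG = 0x10


def split_acks(packet: bytes):
    """Return (packet_without_acks, list_of_acked_sequence_numbers)."""
    if not packet[0] & LL_ACK_FLAG:
        return packet, []

    count = packet[-1]                         # last byte: number of acks
    acks_start = len(packet) - 1 - 4 * count   # each ack takes 4 bytes
    acks = list(struct.unpack_from("<%dI" % count, packet, acks_start))
    return packet[:acks_start], acks


# Hypothetical packet: ack flag set, 6-byte header, 1-byte High message ID,
# then two acks (3 and 4) and the ack count.
example = b"\x10\x00\x00\x00\x01\x00\x01" + b"\x03\x00\x00\x00\x04\x00\x00\x00\x02"
rest, acks = split_acks(example)
print(rest, acks)
```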

Examples

No flags, Fixed frequency, Variable block type, U32 data type

'\x00\x00\x00\x00\x02\x00\xff\xff\xff\xfb\x01\x03\x00\x00\x00'
Byte 0 - send flag

  • '\x00' - means that no flag was set (note: this is in big-endian format, but since it is 1 byte it really doesn't matter)

Bytes 1-4 - sequence number

  • '\x00\x00\x00\x02' - 2 - So, this packet was sent as the 2nd packet out of the circuit. (note: this is in big-endian format)

Byte 5 - offset

  • '\x00' - 0 - means that the next byte is the beginning of the body, there is no extra header information. (note: this is in big-endian format, but since it is 1 byte it really doesn't matter)

Parse message information
We now have to go through the next few bytes to determine what the frequency and message number are. Note: the message number is in big-endian format. The bytes used to make up the frequency are not counted for anything other than markers indicating the frequency. When we are trying to determine the message number, we have to consider the remaining bytes to be in big-endian format.
Byte 6

  • '\xff' - we know that it can't be a high frequency message

Byte 7

  • '\xff' - we know that it can't be a medium frequency message

Byte 8

  • '\xff' - we know that it can't be a low frequency message, so it is a fixed frequency message

Byte 9

  • '\xfb' - 251 - because we know it is fixed, we interpret this as the message number

So, we have determined that the message is a (Fixed, 251) message
Parse payload
Now we can figure out which message it is by looking at the message template. If we look at the message template for the (Fixed, 251) combination, we find that it is the PacketAck message. This tells us a few things about how to parse the rest of the body. First, it tells us that this message can have any number of the block called "Packets". This means that the next byte in the payload will tell us how many "Packets" blocks have been added to this particular message.
Byte 10

  • '\x01' - 1 - there is only 1 "Packets" block

Now we know there is only 1 "Packets" block, so now we can start to read the actual data. To do that, we have to look at the message template and find out what sort of data the "Packets" block has. The template tells us that it has a variable called "ID" and that it is a U32 (meaning, this variable is 4 bytes long).
Bytes 11-14

  • '\x03\x00\x00\x00' - 3 - note that this is in little-endian form. It means that this message is acking the packet that had the sequence number of 3.
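
The walk-through above can be repeated in a few lines of Python. This is only a sketch that hard-codes what we learned from the template (one "Packets" block type with a U32 "ID"); it does not read the template itself.

```python
import struct

packet = b"\x00\x00\x00\x00\x02\x00\xff\xff\xff\xfb\x01\x03\x00\x00\x00"

# Header: flags, sequence number, extra header length (big-endian).
flags, sequence, extra = struct.unpack_from(">BIB", packet, 0)

# Bytes 6-8 are the FF markers, Byte 9 is the message number.
markers = packet[6:9]
message_number = packet[9]

# Byte 10: how many "Packets" blocks follow.
block_count = packet[10]

# Each block holds one U32 "ID" in little-endian form.
ids = [struct.unpack_from("<I", packet, 11 + 4 * i)[0] for i in range(block_count)]

print(flags, sequence, extra, markers, message_number, block_count, ids)
```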

Ack flag, Low frequency, Variable block type, Variable data types

'P\x00\x00\x00\x01\x00\xff\xff\x00\xEC\x01U\x0e\x84\x00\xe2\x9bA\xd4\xa7\x16DfUD\x00\x00\x09Locklainn\x06Linden\x03\x00\x00\x00\x04\x00\x00\x00\x02'
Byte 0 - send flag

  • 'P' - P is actually a combination of some flags. Doing bit testing, we find that P is a combination of the ack flag and the reliable flag (0x10 and 0x40). This means that sender wants us to ack this packet and that the sender has put some acks onto the end of this packet. (note: this is in big-endian format, but since it is 1 byte it really doesn't matter)

Bytes 1-4 - sequence number

  • '\x00\x00\x00\x01' - 1 - So, this packet was sent as the 1st packet out of the circuit. (note: this is in big-endian format)

Byte 5 - offset

  • '\x00' - 0 - means that the next byte is the beginning of the body, there is no extra header information. (note: this is in big-endian format, but since it is 1 byte it really doesn't matter)

Parse message information
Byte 6

  • '\xff' - we know that it can't be a high frequency message

Byte 7

  • '\xff' - we know that it can't be a medium frequency message

Byte 8

  • '\x00' - we know that it is a Low frequency, because this isn't '\xff'

Byte 9

  • '\xEC' - 236 - combining Byte 8 and 9 gives us the message num of '\x00\xEC', which is 236

So, we have determined that the message is a (Low, 236) message
Parse payload
Now we can figure out which message it is by looking at the message template. If we look at the message template for the (Low, 236) combination, we find that it is the UUIDNameReply message. This tells us a few things about how to parse the rest of the body. First, it tells us that this message can have any number of the block called "UUIDNameBlock". This means that the next byte in the payload will tell us how many "UUIDNameBlock" blocks have been added to this particular message.
Byte 10

  • '\x01' - 1 - there is only 1 "UUIDNameBlock" block

Now we know there is only 1 "UUIDNameBlock" block, so now we can start to read the actual data. To do that, we have to look at the message template and find out what sort of data the "UUIDNameBlock" block has. The template tells us that it has a variable called "ID" and that it is a UUID (meaning, this variable is 16 bytes long).
Bytes 11-26

  • 'U\x0e\x84\x00\xe2\x9bA\xd4\xa7\x16DfUD\x00\x00' - 550e8400-e29b-41d4-a716-446655440000 - note that this is in big-endian form (uuid's are special cases)

The next variable that will be in the packet is "FirstName" variable, which is of type Variable. So, we have to read the length of the variable in the next byte.
Byte 27

  • '\x09' - 9 - means that the first name is made up of 9 characters

Bytes 28-36

  • 'Locklainn' - Locklainn - note that this is in Big-endian form. Strings are exceptions just like uuids.

The next variable that will be in the packet is "LastName" variable, which is of type Variable. So, we have to read the length of the variable in the next byte.
Byte 37

  • '\x06' - 6 - means that the last name is made up of 6 characters

Bytes 38-43

  • 'Linden' - Linden - note that this is in Big-endian form. Strings are exceptions just like uuids.

Now, we have read all the payload according to the message template. However, because the ack flag was set in the flags byte, we know that there is still more data to be read. We must now read all the acks attached to the packet. First, let's find out how many acks are attached by reading the last byte:
Byte 52

  • '\x02' - 2 - means that 2 acks have been attached to the packet

Bytes 48-51

  • '\x04\x00\x00\x00' - 4 - they acked the packet with sequence number of 4

Bytes 44-47

  • '\x03\x00\x00\x00' - 3 - they also acked the packet with sequence number of 3
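
The whole second example can be checked with a short sketch. The layout (one UUID "ID", then "FirstName" and "LastName" with 1-byte lengths) is hard-coded from the walk-through above instead of being read from the template, and the variable names are only for illustration.

```python
import struct
import uuid

packet = (b"P\x00\x00\x00\x01\x00\xff\xff\x00\xEC\x01"
          b"U\x0e\x84\x00\xe2\x9bA\xd4\xa7\x16DfUD\x00\x00"
          b"\x09Locklainn\x06Linden"
          b"\x03\x00\x00\x00\x04\x00\x00\x00\x02")

flags, sequence, extra = struct.unpack_from(">BIB", packet, 0)

# Bytes 6-7 are the FF FF markers; Bytes 8-9 are the big-endian message number.
(message_number,) = struct.unpack_from(">H", packet, 8)

block_count = packet[10]
pos = 11

# UUIDs are 16 bytes in big-endian form.
agent_id = uuid.UUID(bytes=packet[pos:pos + 16])
pos += 16

# Each name is a 1-byte length followed by that many characters.
length = packet[pos]
first_name = packet[pos + 1:pos + 1 + length].decode("ascii")
pos += 1 + length

length = packet[pos]
last_name = packet[pos + 1:pos + 1 + length].decode("ascii")
pos += 1 + length

# Acks: the count is the last byte, the acks sit just before it.
ack_count = packet[-1]
acks = list(struct.unpack_from("<%dI" % ack_count, packet, len(packet) - 1 - 4 * ack_count))

print(hex(flags), sequence, message_number, agent_id, first_name, last_name, acks)
```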