Pyogp/Documentation/Specification/pyogp.lib.base
This needs a significant overhaul and may be replaced with online API documentation (courtesy of Sphinx). Consider most information below deprecated.
[[User:Enus Linden|Enus Linden]] 20:45, 22 April 2009 (UTC)
=UDP Messaging API=
The UDP messaging system is broken up into a few parts. They are: UDPDispatcher, Circuit, Message (with Block), Packet, UDPSerializer, UDPDeserializer, and NetUDPClient. I will discuss each of these to explain how each component should be used.
==Message Template==
The message template can be found at [https://svn.secondlife.com/svn/linden/projects/2008/pyogp/pyogp.lib.base/trunk/pyogp/lib/base/data/message_template.msg message_template.msg] <br>
Before I can discuss any of the design components, I should first explain the message template. The message template is a file that outlines all the different UDP messages that can be sent over a circuit. It breaks each message down into its header information, its blocks, and the block data. <br>
In order to communicate between the client and sim (or anything else) we need to be able to parse the message template, determine which messages can be sent, build messages in the format the template specifies, and read incoming messages. This is what the Message Template Parser is for.
==Message Template Parser==
In order to build or read (and therefore send or receive) any UDP messages, we have to first parse the message template. The parser parses the message template, searching for each message listed in it, and then iterates through the message's blocks and the block data. <br>
The parser goes through the message template and constructs data objects of type MessageTemplate, MessageTemplateBlock, and MessageTemplateVariable. These types store general information about each message as read from the message template. In other words, the objects created by the parser hold no incoming or outgoing data. They are simply used as templates to build data from. They tell us the header information for the message, the blocks it is supposed to have, and the data that the blocks are supposed to have.<br>
The output of the parser is a list of message template objects, where each object has its list of blocks, and where each block has its list of variables.
==Message Template Dict==
The parser outputs a list of message templates. In order to make accessing this list easier and more efficient, a dict has been created. This takes the list and makes dicts out of it, one that maps the template name to the template, and one that maps the frequency/num combination to the template. This way, we can get any template by its name or frequency/num combination. This class is stored as a utility (like a Singleton that can be retrieved with ZCA's getUtility() function).
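The two lookups described above can be sketched roughly as follows. This is a minimal illustration, not the Pyogp code: the field names, the TemplateDictionary class, and its method names are assumptions made for the example, and the ZCA utility registration is left out.

```python
from dataclasses import dataclass, field


@dataclass
class MessageTemplateVariable:
    name: str
    type: str   # type name as written in the template (illustrative)
    size: int   # size in bytes (illustrative)


@dataclass
class MessageTemplateBlock:
    name: str
    block_type: str                      # e.g. single, multiple, or variable
    variables: list = field(default_factory=list)


@dataclass
class MessageTemplate:
    name: str
    frequency: str
    number: int
    blocks: list = field(default_factory=list)


class TemplateDictionary:
    """Hypothetical index built from the parser's output list."""

    def __init__(self, templates):
        # One dict keyed by name, one keyed by the (frequency, number) pair.
        self.by_name = {t.name: t for t in templates}
        self.by_number = {(t.frequency, t.number): t for t in templates}

    def get_template(self, name):
        return self.by_name.get(name)

    def get_template_by_pair(self, frequency, number):
        return self.by_number.get((frequency, number))
```

A deserializer would typically use the frequency/num lookup (it only sees numbers on the wire), while code that builds messages would use the name lookup.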
==Messages and Packets==
<b>MsgData</b><br>
In order to create a message that can be sent over a socket, we have at the very lowest level the classes MsgData, MsgBlockData, and MsgVariableData. These classes represent the components that are used when building a packet's message data. A message is made up of blocks, and each of the blocks is made up of some variables. These classes are similar to the MessageTemplate, MessageTemplateBlock, and MessageTemplateVariable classes, except these classes are designed to hold actual data that can be serialized and sent over a network. <br>
To create a message, one could create a new MsgData object, create the corresponding blocks and variables that make up the message, and add them to the message. This produces an object that knows about its blocks, where each block knows about its variables. One thing to know is that the MsgData object has a dictionary of blocks, mapped by name (the variable called "blocks" in MsgData). Each value in this dictionary is actually a list of blocks. The reason is that a block can be of the type MULTIPLE or VARIABLE, so a given message can have the same block repeated; if the dictionary mapped each name to a single block, the repeated blocks would overwrite one another. So, if you look up a block by name in message_data.blocks, just remember that what you get back is a list. <br>
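The dictionary-of-lists layout can be illustrated with a small stand-in. The class names follow the text, but the constructor arguments and the add_block / add_variable methods are assumptions for this sketch rather than the actual Pyogp signatures.

```python
class MsgVariableData:
    def __init__(self, name, data):
        self.name = name
        self.data = data


class MsgBlockData:
    def __init__(self, name):
        self.name = name
        self.vars = {}            # variable name -> MsgVariableData

    def add_variable(self, var):
        self.vars[var.name] = var


class MsgData:
    def __init__(self, name):
        self.name = name
        self.blocks = {}          # block name -> list of MsgBlockData

    def add_block(self, block):
        # Append instead of assign, so a repeated (MULTIPLE or VARIABLE)
        # block does not overwrite the one added before it.
        self.blocks.setdefault(block.name, []).append(block)


# Two blocks with the same (illustrative) name end up side by side.
msg = MsgData("ExampleMessage")
for value in (1, 2):
    block = MsgBlockData("ObjectData")
    block.add_variable(MsgVariableData("ID", value))
    msg.add_block(block)
```

Here msg.blocks["ObjectData"] is a list of two blocks, which is why a lookup by name always needs an index as well.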
<b>Packets</b><br>
Now, the MsgData object is just the message data for something to be sent through UDP. It is the payload for the packet. There are some other things we need in order to send a message. For this, we have a Packet. Packets hold information such as the flags that the packet will be sent with, the id of the packet (the sequence number, that is, the order in which the packet was sent over a UDP connection), its allowed number of retries, and the time at which it will expire. Packets also keep track of the acks that have been attached to the packet, and of course they hold the message data, or payload. Again, just like the MsgData, the Packet object is a high-level object that still needs to be serialized.
<b>UDPSerializer and UDPDeserializer</b><br>
With a Packet object in hand, one can then use the UDPSerializer to serialize that packet into a series of bytes that can finally be sent over a network. The UDPSerializer takes a Packet, determines which type of message it is by looking at the message template and reading information from the message data in the Packet, and attempts to serialize the Packet based on how the message template says it should look. If there is any discrepancy between the data the Packet has and the data the message template says it should have, an error is raised and the serialization fails. Note that the UDPSerializer also packs the flags onto the front, followed by the message header information and the payload, and appends the acks to the end of the packet. <br>
The UDPDeserializer does just the opposite. It takes a string of bytes and attempts to reconstruct the Packet object. It reads the header bytes, matches them to a message template, and reads the payload based on how the template says it should be formatted. If there is any discrepancy between the construction of the string of bytes and what the template says should be in it, the deserialization fails. <br>
The way Pyogp's serialization works is that (just like everything else) we have an interface defined for ISerialization and IDeserialization, then we implement that interface with our serializers and deserializers (in this case, UDPSerializer and UDPDeserializer), and tell ZCA the type of object we are attempting to (de)serialize. So, for instance, what we would do to serialize a Packet is: <br>
 serializer = ISerializer(myPacket)
 data = serializer.serialize()
This tells ZCA to look for the serializer that adapts our Packet object (in other words, knows how to change our Packet into a string of bytes). Then we just serialize it and get our byte payload. This serialized data has everything we need to send it over the network, including all the flags, the message template information, headers, payload, and any acks attached onto the end.
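To make the "flags on the front, acks on the end" idea concrete, here is a deliberately simplified round trip. It is a sketch under stated assumptions, not the UDPSerializer itself: the payload is treated as opaque bytes (no template lookup, no message number), the header is reduced to one flags byte plus a 4-byte big-endian sequence number, and the ACK_FLAG value and the stand-in Packet fields are assumed for the example.

```python
import struct
from dataclasses import dataclass, field

ACK_FLAG = 0x10  # assumed bit meaning "acks are appended to this packet"


@dataclass
class Packet:
    sequence: int
    payload: bytes
    flags: int = 0
    acks: list = field(default_factory=list)


def serialize(packet):
    flags = packet.flags | (ACK_FLAG if packet.acks else 0)
    out = struct.pack(">BI", flags, packet.sequence) + packet.payload
    if packet.acks:
        # Each ack is a 4-byte sequence number; a count byte goes last.
        out += b"".join(struct.pack(">I", ack) for ack in packet.acks)
        out += struct.pack(">B", len(packet.acks))
    return out


def deserialize(data):
    if len(data) < 5:
        raise ValueError("buffer too short for a header")
    flags, sequence = struct.unpack_from(">BI", data, 0)
    body = data[5:]
    acks = []
    if flags & ACK_FLAG:
        if not body:
            raise ValueError("ack flag set but no ack count present")
        count = body[-1]
        tail = 1 + 4 * count
        if tail > len(body):
            raise ValueError("ack count does not fit in the buffer")
        ack_bytes = body[-tail:-1]
        acks = [struct.unpack_from(">I", ack_bytes, 4 * i)[0] for i in range(count)]
        body = body[:-tail]
    return Packet(sequence, body, flags, acks)
```

As in the text, any mismatch between what the bytes claim and what is actually there makes deserialization fail, here by raising ValueError.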
==Messaging==
<b>Circuit</b><br>
In order to send or receive a UDP packet to/from a sim or anywhere else, we must establish a UDP connection called a "circuit". Pyogp has a Circuit object for just this connection. A Circuit is defined as a connection between one Host and another, with a Host being defined as an IP address and port combination. <br>
The Circuit is in charge of tracking and managing the Packets flowing into and out of this connection. This means that the Circuit keeps track of all the Packets it has received that need to be acked, and of all the Packets sent out on it that we want acked (so that it can resend them if they don't get acked in time). Circuits also apply the final touches to Packets before they get sent out. Packets need to have their flags and sequence number set, so the Circuit keeps track of the id for the next packet that will be sent out over it and sets each Packet's id accordingly. All functionality related to last-minute additions to the Packet, such as the flags, the sequence number, and even appending acks to the end of Packets (to save network bandwidth), is done through the Circuit. <br>
However, the actual sending and receiving is not done through the Circuit. It is done through a NetUDPClient.
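The bookkeeping a Circuit does can be sketched as below. This is an illustration of the idea only: the method names prepare_packet and handle_packet, the attribute names, the RELIABLE flag value, and the new_packet helper are all assumptions made for the example.

```python
from types import SimpleNamespace

RELIABLE = 0x40  # assumed value of the "please ack this" flag


class Circuit:
    def __init__(self, host):
        self.host = host          # (ip, port) pair identifying the remote end
        self.next_sequence = 1    # id for the next outgoing packet
        self.unacked = {}         # sequence -> packet we sent and want acked
        self.pending_acks = []    # sequences we received and still have to ack

    def prepare_packet(self, packet, reliable=False):
        """Apply the last-minute additions before a packet is sent."""
        packet.sequence = self.next_sequence
        self.next_sequence += 1
        if reliable:
            packet.flags |= RELIABLE
            self.unacked[packet.sequence] = packet
        # Piggyback every ack we owe onto this outgoing packet.
        packet.acks, self.pending_acks = self.pending_acks, []
        return packet

    def handle_packet(self, packet):
        """Record what an incoming packet acks and whether it wants an ack."""
        for sequence in packet.acks:
            self.unacked.pop(sequence, None)
        if packet.flags & RELIABLE:
            self.pending_acks.append(packet.sequence)


def new_packet(payload):
    # Hypothetical stand-in for a Packet object.
    return SimpleNamespace(payload=payload, flags=0, sequence=None, acks=[])
```

Note that nothing here touches a socket; the Circuit only decorates and tracks packets, which matches the split of responsibilities described above.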
<b>NetUDPClient</b><br>
The NetUDPClient handles the actual sending and receiving of data over a network. It is the object that uses sockets directly. When the NetUDPClient receives a message on its socket, it simply passes the data along to something that can process it (see UDPDispatcher). It does nothing other than physically send and receive the data coming in and going out over a socket. <br>
The NetUDPClient is used by calling the <b>start_udp_connection</b> function, which creates a new UDP socket for us and returns it. Whoever stores that socket can then call the NetUDPClient's <b>send_packet</b> and <b>receive_packet</b> functions. These functions both take the socket to send or receive on. <br>
Something to keep note of is that when we are sending and receiving, we actually pass the socket to the corresponding functions. Another design might have the client object store the socket itself and automatically use that socket when sending and receiving. It was done this way because it was intended that only a single NetUDPClient object would be needed even if the user wanted to use multiple sockets.
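A socket-passing client along these lines might look like the following sketch. The three function names come from the text and get_sender appears in the receiving example further down; the parameter order of send_packet, the buffer size, and the return shape of receive_packet are assumptions for illustration.

```python
import socket


class NetUDPClient:
    """Minimal sketch: owns no socket, every call is handed one."""

    def __init__(self):
        self.sender = None   # (ip, port) of the most recently received datagram

    def start_udp_connection(self):
        # The caller keeps the returned socket and passes it back in later.
        return socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send_packet(self, sock, data, host):
        sock.sendto(data, host)

    def receive_packet(self, sock, bufsize=4096):
        data, self.sender = sock.recvfrom(bufsize)
        return data, len(data)

    def get_sender(self):
        return self.sender


# One client object serving two different sockets over loopback.
client = NetUDPClient()
receiver = client.start_udp_connection()
receiver.bind(("127.0.0.1", 0))      # let the OS pick a free port
receiver.settimeout(2.0)
sender = client.start_udp_connection()
client.send_packet(sender, b"ping", receiver.getsockname())
msg_buf, msg_size = client.receive_packet(receiver)
sender.close()
receiver.close()
```

Because the socket is an argument rather than state, the same client instance works for both ends here, which is the reuse the design was aiming for.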
<b>UDPDispatcher</b><br>
The way that Pyogp currently does UDP messaging is through what is called the UDPDispatcher. The UDPDispatcher provides the API through which you create a connection (or connections) and send and receive messages. The UDPDispatcher is intended to simply be an interface between the objects that send and receive information on a socket (NetUDPClients), and those that handle (respond to) Packets. <br>
The UDPDispatcher is in charge of using the NetUDPClient to create a UDP socket on which to send and receive messages. It is also responsible for serializing Packets that are being sent out, deserializing Packets that come in, and managing all the Circuits that have been established through it (this means that the UDPDispatcher sends out acks and resends Packets when the Circuits have tracked such things). <br>
The UDPDispatcher should really be the only object that a user of Pyogp needs to deal with when sending or receiving messages. The user shouldn't have to establish any connections or do any direct sending or receiving of messages over a socket or connection. The user should always go through the UDPDispatcher. The UDPDispatcher is meant to be driven by some outside source, either a client, a test, or just another application. It has none of its own loops or threads. <br>
Although the user should really be using our UDPDispatcher object, we have separated concerns enough that a user who would like to use a Circuit and NetUDPClient directly has the freedom to do so. <br>
Another design decision we have considered is that the UDPDispatcher uses only a single socket, even if it manages many Circuits. This means that even if the client that uses the UDPDispatcher is connected to many different sims (has many Circuits established), all network traffic will flow over a single socket. There is not a 1-1 Circuit-to-socket relationship. However, if the user would like such a relationship, he or she simply has to create a UDPDispatcher for every Circuit. So, in other words, a UDPDispatcher can manage as many Circuits as the user would like, either one or many, but the UDPDispatcher always has only one socket.
==Using the system==
===Initializing the system===
<b>UDPDispatcher()</b> - this will create a dispatcher for us to be able to send and receive messages on.<br>
===Building a message===
When someone wants to create a message, one way is to use the MsgData, MsgBlockData, and MsgVariableData objects directly, building them up step by step. The old method is to use a builder that constructs the message for us. Unfortunately, this method is really C++-ish and is complicated to use. We ARE using Python, aren't we? Therefore, we have created a wrapper around MsgData to make construction of a message more Pythonic. <br>
<b>Message and Block</b><br>
The Message and Block classes are derivations of the MsgData and MsgBlockData classes. They allow you to fill in the message data by passing Python objects. For instance, to create a message one would do:<br>
 msg = Message('TestPacket',
               Block('CircuitCode', ID=1234, Code=531)
               )
This is creating a message called ‘TestPacket’, giving it a Block named ‘CircuitCode’, and giving the Block the variables ‘ID’ and ‘Code’. Message and Block then parse the passed information and construct the MsgData out of it (Message is a subclass of MsgData, and Block is a subclass of MsgBlockData). <br>
Note that although we send Packets and not Messages or MsgDatas, we don’t actually have to create a Packet; we simply create the Message. The UDPDispatcher will create the Packet out of your message data and fill in the Packet fields with the necessary information. <br>
Also note that at this time there is no checking to make sure you are constructing a message the way the template specifies. Building a message is just constructing the MsgData data. The serializer will check that you have everything that is needed and that it is formatted properly.
===Sending a message===
Once a message has been built, the user can then send the message to a given host (with a host being a combination of the IP address and port).
<b>send_message(message, host)</b> - this method constructs a Packet out of your MsgData and sends the message to the specified host. It also gets the Circuit to <b>prepare</b> the Packet, that is, to add any packet flags and the sequence number to the Packet and to append any acks to the end. Finally, the dispatcher makes sure that the message is serialized. Eventually the Circuit, or the dispatcher, should also compress the message using zero-coding, but this functionality is not yet in the system (however, we do decode zero-coded messages). This method returns the string buffer that was just sent. The message is sent using a UDP network client, the NetUDPClient. <br>
<b> send_retry(message, host)</b> - sends a message using the RETRY packet flag, but delegates sending to send_message.<br>
<b> send_reliable(message, host, retries)</b> - sends a message using the RELIABLE flag, and sets the packet's retry count (meaning how many times this Packet can be resent before it is deleted if it isn’t acked), but delegates sending to send_message.<br>
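The delegation between the three send methods can be sketched as follows. Everything beyond the three method names and their arguments is an assumption for this example: the message is taken to be already-serialized bytes, the Circuit is reduced to a per-host sequence counter, the flag values are illustrative, and a plain callable stands in for the NetUDPClient.

```python
RELIABLE = 0x40   # assumed flag values, for illustration only
RETRY = 0x20


class SketchDispatcher:
    def __init__(self, send_func):
        self.send_func = send_func   # callable(data, host), stands in for the UDP client
        self.sequences = {}          # host -> next sequence number (stand-in for Circuits)
        self.retries_left = {}       # (host, sequence) -> remaining resend attempts

    def send_message(self, message, host, flags=0, retries=0):
        sequence = self.sequences.get(host, 1)
        self.sequences[host] = sequence + 1
        if retries:
            self.retries_left[(host, sequence)] = retries
        buf = bytes([flags]) + sequence.to_bytes(4, "big") + message
        self.send_func(buf, host)
        return buf                   # the buffer that was just sent

    def send_reliable(self, message, host, retries):
        # Only chooses the flag and retry count; sending is delegated.
        return self.send_message(message, host, flags=RELIABLE, retries=retries)

    def send_retry(self, message, host):
        return self.send_message(message, host, flags=RETRY)
```

Keeping one real send path means flags, sequence numbers, and ack handling only have to be right in one place.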
===Receiving a message===
Receiving is a bit trickier than sending a message. To receive a message, we don’t go directly through the dispatcher, but use the dispatcher’s UDP client to get the data from the socket and pass it to the dispatcher to, well, dispatch it to a handler. So for instance, we could do:
 msg_buf, msg_size = dispatcher.udp_client.receive_packet(dispatcher.socket)
 packet = dispatcher.receive_check(dispatcher.udp_client.get_sender(),
                                   msg_buf, msg_size)
This seems a bit confusing at first, but the reason for this abstraction is that an application could (and most likely will) run the socket receiving in a thread or some non-blocking routine. Then, when it gets data, it will just hand off the data to a dispatcher and the dispatcher will take care of it (including deserializing the message and having Circuits check for acks and such). This was designed to work with things like Twisted or Eventlet in such a way that it is swappable, using the dispatcher’s receive_check as a callback. <br>
<b> receive_check(host, msg_buf, msg_size) </b> - this method processes the message, reading its binary form into a deserialized form, and passes the Packet to the Circuit object that corresponds to the IP address and port that the data came in on. The Circuit then processes the message to determine whether it needs to be acked, whether it has acks attached, whether it is zero-coded, etc. This function also returns the received Packet for someone else to handle. Note that the dispatcher doesn’t do the Packet handling just yet. It only determines the acks for the Packet. Actually handling the Packet, such as changing client state or responding to the message, should be done elsewhere. In the future, the dispatcher may do this as well, but it does not currently do it.
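Since zero-coded messages are decoded on receipt, a decoder sketch may help. It assumes the usual run-length scheme in which a zero byte is followed by a count byte giving the number of zeros it stands for; the function name is hypothetical, and for simplicity the whole buffer is decoded, whereas a real implementation would typically leave the packet header untouched.

```python
def zero_decode(data):
    """Expand each (0x00, count) pair into `count` zero bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 0x00:
            if i + 1 >= len(data):
                raise ValueError("zero marker without a count byte")
            out.extend(b"\x00" * data[i + 1])
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)
```

For example, the four bytes 01 00 03 02 expand to 01 00 00 00 02.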
===Maintenance===
The dispatcher also has some other features that give users an easy way to make sure all maintenance is handled properly. Maintenance includes acking packets and removing stale packets (ones that cannot be resent anymore and haven't been acked in time).
<b>process_acks()</b> - this function resends all packets that we have sent out that haven't been acked in time, and sends out all the acks for the messages we have received from the server that expect to be acked. It checks each Circuit in the dispatcher and determines whether it has any Packets that can be resent, any Packets that need acking, and any Packets that are stale and have expired.
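One maintenance pass could be sketched like this. It is an illustration under assumptions rather than the dispatcher's code: the Pending and CircuitState structures, the explicit now and send arguments, and the tuples handed to send are all invented for the example so that it can run without sockets or real clocks.

```python
from dataclasses import dataclass, field


@dataclass
class Pending:
    data: bytes
    retries: int       # resend attempts still allowed
    expires: float     # time after which the packet counts as unacked


@dataclass
class CircuitState:
    unacked: dict = field(default_factory=dict)        # sequence -> Pending
    pending_acks: list = field(default_factory=list)   # sequences we owe acks for


def process_acks(circuits, now, send, timeout=1.0):
    """Resend overdue packets, drop stale ones, and flush owed acks."""
    for host, circuit in circuits.items():
        for sequence, pending in list(circuit.unacked.items()):
            if pending.expires > now:
                continue                      # still waiting for its ack
            if pending.retries > 0:
                pending.retries -= 1
                pending.expires = now + timeout
                send(host, ("resend", sequence, pending.data))
            else:
                del circuit.unacked[sequence]  # stale: no retries left
        if circuit.pending_acks:
            send(host, ("acks", list(circuit.pending_acks)))
            circuit.pending_acks.clear()
```

Because the dispatcher has no loop or thread of its own, the caller would be expected to invoke something like this periodically.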
===Design Decision===
When sending a message, the user specifies the host that will receive the message.
*Advantages:
#The same dispatcher can be used to send to multiple hosts without any reconfiguration. The other method would be to couple the dispatcher with its targeted receiver so that when sending a message it will always go to that receiver. This binds the dispatcher to communicating with only a single host, unless there is added functionality to allow a list of target hosts that one can send to, which effectively brings us back to allowing the dispatcher to send to any host (the original design). This is also not the design of the dispatcher. It was designed to “dispatch” a message to the correct Circuit to be handled properly.
=Old UDP Messaging=
==Message System/API==
The way that Pyogp currently does messaging is through the Message System. The Message System provides the API through which you create a connection (or connections) and build, send, and receive messages. The Message System encapsulates all the other functionality that is needed to start working with messages. That is, it handles all parsing of the message template (message_template.msg) and the message list (message.xml), creates dictionaries out of them, creates a UDP socket on which to send and receive messages, sets up HTTP connections, and has the builders and readers necessary to build and read the template-formatted messages and the llsd-formatted messages. It also handles all the maintenance of packets that need to be acked, keeping track of which ones we (being the client) need to ack and which ones we want acked by the server, as well as sending the acks or resending unacked packets.
The Message System should really be the only object that a user of Pyogp needs to deal with when sending or receiving messages. The user shouldn't need to use a reader or a builder directly. The user also shouldn't have to establish any connections or do any direct sending or receiving of messages over a socket or connection. The user should always go through the Message System. The Message System is meant to be driven by some outside source, either a client, a test, or just another application. It has none of its own loops or threads.
===Initializing the system===
<b>MessageSystem(port)</b> - the port on which the Message System will receive messages. This is currently not used and may even be removed later.<br>
===Building a message===
There are a few methods in the Message System that are available as high-level API calls. When a user wants to create a message, he or she does not have to know which format the server expects the message to be in. For this reason, the Message System determines which format to use and delegates building the message to the corresponding builder.
<b>new_message(message_name)</b> - begins creating a new message. The Message System determines which type of message is being built (either the template flavor or the llsd flavor). This determines which builder should be used. The creation of the new message is then delegated to the corresponding builder (transparent to the user).<br>
<b>next_block(block_name)</b> - this tells the builder that we want to begin building the named block of the message. This is just delegated to a builder (see below for the details of how it works). <br>
<b>add_data(var_name, data, data_type)</b> - adds data to the variable named var_name in the current block, delegated to the builder. <br>
===Sending a message===
Once a message has been built, the user can then send the message to a given host (with a host being a combination of the IP address and port).
<b>send_message(host, message_buf=None)</b> - this sends the message that we most recently built using the new_message, next_block, and add_data functions. It sends the message to the host specified. The message_buf parameter allows the user to pass in a message to send that wasn't built using the Message System (such as one created by directly using the builder of choice). This function also makes sure the created message is serialized, and adds on any packet flags, the sequence number for the packet, the packet identification, and finally the payload. It also appends any acks to the end and compresses the message using zero-coding.<br>
<b>send_retry(host, message_buf=None)</b> - sends a message using the RETRY packet flag, but delegates sending to send_message.<br>
<b>send_reliable(host, retries, message_buf=None)</b> - sends a message using the RELIABLE flag, and sets the packet's retry count, but delegates sending to send_message.<br>
===Receiving a message===
<b>receive_check()</b> - determines if there is a message waiting on the socket. Does a single pass to get a single message. If there is a message waiting, it processes the message, reading its binary form into a deserialized form, and determining its flags (that is, does it need to be acked, does it have acks attached, is it zero-coded, etc). The read message can then be accessed through the methods:<br>
<b>get_received_message()</b> - this returns the whole message, in the form of a MsgData object<br>
<b>get_data(block_name, var_name, data_type, block_number=0)</b> - this gets data from a particular block. The block_number is used when the message has multiple or variable blocks. <br>
===Maintenance===
The Message System also has some other features that give users an easy way to make sure all maintenance is handled properly. Maintenance includes acking packets and removing stale packets (ones that cannot be resent anymore and haven't been acked in time).
<b>process_acks()</b> - this function resends all packets that we have sent out that haven't been acked in time, and sends out all the acks for the messages we have received from the server that expect to be acked.
===Design Decision===
You'll notice that, in most cases, the user never has direct access to a created or received message. Although the user can, of course, get the message directly by going through the Message System's builder and reader (messagesystem.builder.current_msg), the design is not meant to be used in such a way. The user is not meant to manipulate a message directly but to use builders and readers for such a thing. Even then, the user shouldn't be using builders and readers directly but going through the Message System, which uses them.
*Advantages:
#Separation of concerns: this means the MsgData class represents a message, the builder builds the message, and the reader reads a message. Each has its own particular function and only that function. There is no object that has more than one piece of functionality.
#Ease of use: the MessageSystem provides the high-level functionality that brings all the pieces together, so the user does not need to build any message by hand, do any serialization or connection handling, keep track of circuit information (which sequence # is next, packet acking, etc), or build or read packet header information (flags). The user also does not need to know what formats the server expects to receive messages in. The MessageSystem handles this distinction based on message type.
#Generic messages: any message can be built, read, sent, and received in this manner. All messages are read from the message template and so they can all be built with this generic representation of messages, builders, and readers.
*Disadvantages:
#No message object: the messages being built and sent are stored within the builders, readers, and MessageSystem, and so it may not be what users expect. For instance, users may be used to using a message OBJECT directly and passing a given message to something else to send it (rather than just calling send_message()). Along the same lines, people may be used to having explicit connections that they send/receive on, e.g. conn1.send(message1) rather than messageSystem.send_message(), which determines which connection to send the message to.
#Related to number 1, because we have no message object, there is no way to save and reuse objects. We have to reconstruct the message every time we wish to send the same message. This isn't strictly true, because we ARE currently saving packets that have been sent (so that we can resend them if they don't get acked in time). They DO exist; the design just keeps the user from ever needing to have direct access. This may be one case where the Message System could give the user a message directly if he or she asks for it. This is currently possible; it's just not explicitly in the API.
#Sequential building of message: the messages are built by doing a series of new_message, next_block, add_data calls. Some may want to build a message by having an object (see note above) that they can make direct calls on, e.g. message.name = "Locklainn" and message.agent_id = uuid.UUID('blahblah'). The problem here lies in the MULTIPLE and VARIABLE type blocks. For instance, a single message can have the same variable multiple times in the message. So you can't simply do message.name = "NAME", because the variable might exist in multiple blocks, and so which block does that variable belong to? Another way one might do it is by having something such as message.name_1 = ... and message.name_2 = ..., but this doesn't seem to gain us anything in terms of ease of use in building a message. A given message can have any number of these blocks, so to handle the same thing one would have to use lists or some other container structure. One might have to do the following: message.blocks = [{'name': "Locklainn", 'agent_id': UUID("blahblah")}, {'session_id': 123531, 'circuit_code': 123656}].
When sending a message, the user specifies the host that will receive the message.
*Advantages:
#The same messaging system can be used to send to multiple hosts without any reconfiguration. The other method would be to couple the messaging system with its targeted receiver so that when sending a message it will always go to that receiver. This binds the messaging system to communicating with only a single host, unless there is added functionality to allow a list of target hosts that one can send to, which effectively brings us back to allowing the message system to send to any host (the original design).
==UDP Template Messaging== | |||
There are a few main components to sending a UDP message, the message template, the parser, the builder, and the reader. | |||
===Message Template Builder=== | ===Message Template Builder=== | ||
The builder is used to create messages that can be sent through UDP. It is used to make sure that the messages being built are in accordance with the message template and have all the necessary blocks and data that go along with the message. The builder creates message objects using the MsgData, MsgBlockData, and MsgVariableData classes. These objects differ from the template versions in that they hold the actual data. They don't hold general information but only exactly what will be sent through UDP. However, they hold the data in object form, and therefore are not serialized. The builder also serializes the message once it is finished being built. <br> | |||
Messages are built in a sequence of steps: | |||
#<b>new_message(message_name)</b> - this method sets up a new message to begin being built. The message is filled in with all the block stubs that it needs to have. | |||
#<b>next_block(block_name)</b> - sets the block that we are building to be the block with the given block name. It also fills in the stubbed block with all the variables that the block needs to have. | |||
#*Note that if we are trying to set the block to one that doesn't exist, or that has already been created, we will get errors. However, if the block is of type multiple or variable, then we can create more than one block with the same name. A multiple type block means that there is a fixed number of blocks that the message must have, so we can add that number of blocks to the message (but no more than that number). A variable type block means that there can be any number of blocks, so we can add any number of this block to the message (meaning, we can call next_block() with the same block_name any number of times). | |||
#<b>add_data(var_name, data, data_type)</b> - this adds the data to the block. There are a couple checks to make sure that the data you pass it matches the data it is expecting (from the template). | |||
#*When we add data we store the size of the data. Now, normally we know the size directly from the data type (where both type and size are stored in the template). But when the data is of type variable then the template doesn't store the size of the data (because it can't, the data can be any size). However, the template stores how many bytes the size can be. So, for type variable, we have to determine the size of the actual data being written and store that instead. | |||
#<b>build_message()</b> - this goes through the message, each of its blocks and data, and serializes the data into a string that can be sent over UDP. Before it does so, it makes sure the data added to the message is correct, that it has all the blocks and variables it is supposed to have, and in the right format. This returns the message and size that has been serialized. | |||
#*To figure out the format of the message and what the serializer is doing, check out the [[Pyogp/Client_Lib/Notes]] page. | |||
#*This uses the DataPacker to pack the data correctly. The DataPacker takes the data and the data_type and determines how to serialize the data given the type. | |||
===Message Template Reader=== | |||
The reader attempts to read a message that has been received through UDP. The reader can only process one message at a time. | |||
There are a few steps to processing and using the read data: | |||
#<b>validate_message(buffer,size)</b> - this attempts to decode the message and figure out what type of message it is. It also checks to make sure the message is valid (in the message template), and keeps track of the template if it is. | |||
#<b>read_message(buffer)</b> - goes through the message, skipping over the header and pre-header information (that gets added on by some other process) and processes the blocks and the data. This also makes sure that the message was validated first. Simply iterates over the buffer, reading block information and data information and constructs a MsgData object out of it. This deserializes the data back into binary form. | |||
#*Note that if the block is multiple or variable, it repeatedly reads the block data until it has processed all of the message data. | |||
#*The DataUnpacker is used to deserialize the data. It is given the data and type. | |||
#<b>get_data(block_name, var_name, data_type, block_number=0)</b> - this returns the data that was deserialized for the message. We give it the name of the block in which to find the variable, the variable name, the data type to make sure that the user is aware of the type (error checks) and, as an optional argument, the block_number (which is only used when the message has many blocks of the same type, that is, the block type is multiple or variable). | |||
#<b>{optional}clear_message()</b> - this gets the reader ready to read a new message. The reader won't crash without doing this, but warnings will be issued to make sure this is the desired behavior. | |||
[[Category:Pyogp_Documentation]] | |||
[[Category:Pyogp_Kitchen_Sink]] |
Latest revision as of 12:46, 22 April 2009
This needs a significant overhaul and may be replaced with online api documentation (care of sphinx). Consider most information below deprecated. Enus Linden 20:45, 22 April 2009 (UTC)
UDP Messaging API
The UDP messaging system is broken up into a few parts. They are: UDPDispatcher, Circuit, Message (with Block), Packet, UDPSerializer, UDPDeserializer, and NetUDPClient. I will discuss each of these to explain how each component should be used.
Message Template
Can be found message_template.msg
Before I can discuss any of the design components, I should first explain the message template. The message template is a file that outlines all the different UDP messages that can be sent over a circuit. It breaks each message down by the message header information, its blocks, and the block data.
In order to communicate between the client and sim (or anything else) we need to be able to parse the message template, determine which messages can be sent, build messages based on the format they are specified to be in, and then read incoming messages. This is what the Message Template Parser is for.
Message Template Parser
In order to build or read (and therefore send or receive) any UDP messages, we have to first parse the message template. The parser parses the message template, searching for each message listed in it, and then iterates through the message's blocks and the block data.
The parser goes through the message template and constructs a data object of type MessageTemplate, MessageTemplateBlock, and MessageTemplateVariable. These types are used to store general information about the message that is read from the message template. In other words, these objects created by the parser hold no incoming or outgoing data. They are simply used as templates to build data out of. They allow us to know the header information for the message, the blocks it is supposed to have, and the data that the blocks are supposed to have.
The parser reads things such as the message frequency, the message number (which is a unique value that is matched with frequency), its trust, encoding, and deprecation.
- The template also stores something called the hex num, which is the combination of the frequency and the message number stored as a hex number. It is stored here because the hex value never changes for the templates, and rather than building the hex value every time a message is going to be sent, it is just stored in the template for quicker access.
It also reads the block information such as its type (one of single, multiple, or variable)
- If the block is of type multiple, then something called the block number is also stored. This number represents how many of the given block MUST be written for a message.
- If the block is of type variable, then any number of the given block can be written for a message.
Finally, it reads the block data for things such as the data type (one of many types) and its size.
- The string type read from the template is converted to a class variable (like an enum)
- The size of the variable is obtained from our sizeof function in message_types.py
- However, if the variable is of type variable or fixed, the size is taken from the third parameter listed in the message template. For a fixed variable, that parameter is the size of the data itself. For a variable type, it is the number of bytes used to store the size of the data: { Data Variable 2 } says the variable called "Data" is of type "Variable" and that its size is stored in 2 bytes.
The output of the parser is a list of message template objects, where each object has its list of blocks, and where each block has its list of variables.
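The nesting described above (templates holding blocks, blocks holding variables) can be sketched roughly as follows. The class names come from this page, but the attribute names and the sample values are assumptions made for illustration, not the actual pyogp definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MessageTemplateVariable:
    name: str
    var_type: str   # e.g. "U32", "Variable", "Fixed" (assumed spellings)
    size: int       # size in bytes, as read from the template


@dataclass
class MessageTemplateBlock:
    name: str
    block_type: str          # "Single", "Multiple", or "Variable"
    block_number: int = 1    # only meaningful for "Multiple" blocks
    variables: List[MessageTemplateVariable] = field(default_factory=list)


@dataclass
class MessageTemplate:
    name: str
    frequency: str
    number: int
    trust: str
    encoding: str
    blocks: List[MessageTemplateBlock] = field(default_factory=list)


# Hypothetical entry, roughly what the parser might produce for one message.
template = MessageTemplate(
    name="TestPacket", frequency="Low", number=1,
    trust="NotTrusted", encoding="Unencoded",
    blocks=[MessageTemplateBlock(
        name="CircuitCode", block_type="Single",
        variables=[MessageTemplateVariable("ID", "U32", 4),
                   MessageTemplateVariable("Code", "U32", 4)])])
```

Note that none of these objects carries message data; they only describe what a message of that name must contain.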
Message Template Dict
The parser outputs a list of message templates. In order to make accessing this list easier and more efficient, a dict has been created. This takes the list and makes dicts out of it, one that maps the template name to the template, and one that maps the frequency/num combination to the template. This way, we can get any template by its name or frequency/num combination. This class is stored as a utility (like a Singleton that can be gotten with ZCA's getUtility() function).
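A minimal sketch of the two lookups, assuming a hypothetical TemplateDictionary class and a stand-in Template tuple (the real class and its method names may differ, and the ZCA utility registration is left out):

```python
from collections import namedtuple

# Stand-in for the parser's template objects (hypothetical fields).
Template = namedtuple("Template", "name frequency number")


class TemplateDictionary:
    """Index a list of templates by name and by (frequency, number)."""

    def __init__(self, templates):
        self.by_name = {t.name: t for t in templates}
        self.by_number = {(t.frequency, t.number): t for t in templates}

    def get_template(self, name):
        return self.by_name.get(name)

    def get_template_by_pair(self, frequency, number):
        return self.by_number.get((frequency, number))


# Two made-up templates that share a number but differ in frequency.
templates = [Template("TestPacket", "Low", 1), Template("OtherPacket", "High", 1)]
lookup = TemplateDictionary(templates)
```

Keying the second dict on the (frequency, number) pair matters because the message number is only unique within a frequency.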
Messages and Packets
MsgData
In order to create a message that can be sent over a socket, we have at the very lowest level the classes MsgData, MsgBlockData, and MsgVariableData. These classes represent the components that are used when building a packet's message data. A message is made up of blocks, and each of the blocks is made up of some variables. These classes are similar to the MessageTemplate, MessageTemplateBlock, and MessageTemplateVariable classes, except these classes are designed to hold actual data that can be serialized and sent over a network.
To create a message, one could create a new MsgData object, create the corresponding blocks and variables that make up the message, and add them to the message. This will create an object that knows about its blocks, and where the blocks know about their variables. One thing to know is that the MsgData object has a dictionary of blocks, mapped by name (the variable called "blocks" in MsgData). Each entry in this dictionary is actually a list of blocks. The reason is that a block can be of the type MULTIPLE or VARIABLE, so any given message can have the same block repeated; if the dictionary mapped a name to a single block, each repeated block would overwrite the previous one. So, if you look up a block by name in message_data.blocks, just remember that what you get back is a list.
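The dictionary-of-lists layout can be sketched as below. The class names are the ones used on this page; the method names (add_variable, add_block) and the message and block names are assumptions for illustration.

```python
class MsgVariableData:
    def __init__(self, name, data):
        self.name = name
        self.data = data


class MsgBlockData:
    def __init__(self, name):
        self.name = name
        self.vars = {}   # variable name -> MsgVariableData

    def add_variable(self, var):
        self.vars[var.name] = var


class MsgData:
    def __init__(self, name):
        self.name = name
        self.blocks = {}   # block name -> list of MsgBlockData

    def add_block(self, block):
        # Append rather than assign, so a repeated block does not overwrite.
        self.blocks.setdefault(block.name, []).append(block)


# A made-up message with the same block repeated twice.
msg = MsgData("ExampleMessage")
for packet_id in (7, 8):
    block = MsgBlockData("Packets")
    block.add_variable(MsgVariableData("ID", packet_id))
    msg.add_block(block)
```

Here msg.blocks["Packets"] is a list of two blocks, which is why callers index into it (for example msg.blocks["Packets"][0]) even when a block is of type single.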
Packets
Now, the MsgData object is just the message data for something to be sent through udp. It is the payload for the packet. There are some other things we need in order to send a message. For this, we have a Packet. Packets hold information such as the flags that the packet will be sent with, the id of the packet (id being the sequence number, or the order that the packet was sent over a udp connection), its allowed number of retries, and the time it will expire. Packets also keep track of the acks that have been attached to the packet, and of course, it has the message data or payload for the Packet. Again, just like the MsgData, the Packet object is a high-level object that still needs to be serialized.
UDPSerializer and UDPDeserializer
With a Packet object in hand, one can then use the UDPSerializer to serialize that packet into a series of bytes that can finally be sent over a network. The UDPSerializer takes a Packet, determines which type of message it is by looking at the message template and reading information from the message data in the Packet, and attempts to serialize the Packet based on how the message template says it should look. If there is any discrepancy in what data the Packet has and what data the message template says it should have, an error will be thrown and the serialization will fail. Note that the UDPSerializer also packs things such as the flags onto the front, the message header information, the payload, and even adds the acks onto the end of the packet.
The UDPDeserializer does just the opposite. It takes a string of bytes and attempts to reconstruct the Packet object. It attempts to read the string of bytes by reading the header information bytes, matching it to a message template, and reading the payload based on how the template says it should be formatted. If there is any discrepancy between the construction of the string of bytes and what the template says should be in it, the deserialization will fail.
The way Pyogp's serialization works is that (just like everything else) we have an interface defined for ISerialization and IDeserialization, then we implement that interface with our serializers and deserializers (in this case, UDPSerializer and UDPDeserializer), and tell ZCA the type of object we are attempting to (de)serialize. So, for instance, what we would do to serialize a Packet is:
serializer = ISerializer(myPacket)
data = serializer.serialize()
This tells ZCA to look for the serializer that adapts our Packet object (in other words, knows how to change our Packet into a string of bytes). Then we just serialize it and get our byte payload. This serialized data has everything we need to send it over the network, including all the flags, the message template information, headers, payload, and any acks attached onto the end.
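To make the adapter idea concrete without depending on ZCA, here is a rough stand-alone sketch. The plain dict stands in for the adapter lookup, and the byte layout (one flags byte, a four-byte sequence number, the payload, then acks followed by an ack count) is an assumption chosen for illustration; the template matching and error checks described above are omitted.

```python
import struct


class Packet:
    def __init__(self, flags, sequence, payload, acks=()):
        self.flags = flags
        self.sequence = sequence
        self.payload = payload      # already-serialized message body (bytes)
        self.acks = list(acks)


class UDPSerializerSketch:
    """Adapts a Packet; the layout below is an assumption for illustration."""

    def __init__(self, packet):
        self.packet = packet

    def serialize(self):
        p = self.packet
        # Flags go on the front, followed by the sequence number.
        header = struct.pack(">BI", p.flags, p.sequence)
        # Acks are attached onto the end, followed by their count.
        acks = b"".join(struct.pack(">I", a) for a in p.acks)
        if p.acks:
            acks += struct.pack(">B", len(p.acks))
        return header + p.payload + acks


# Stand-in for the ZCA adapter lookup: object type -> serializer class.
ADAPTERS = {Packet: UDPSerializerSketch}


def get_serializer(obj):
    return ADAPTERS[type(obj)](obj)


data = get_serializer(Packet(0x40, 1, b"\x01\x02", acks=[5])).serialize()
```

The caller only asks for "whatever serializes this object", which is what lets other serializers be swapped in for other object types.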
Messaging
Circuit
In order to send or receive a udp packet to/from a sim or anywhere else, we must establish a udp connection called a "circuit". Pyogp has a Circuit object for just this connection. A Circuit is defined as a connection between one Host and another, with a Host being defined as an ip address and port combination.
The Circuit is in charge of tracking and managing the Packets flowing into and out of this connection. This means that the Circuit keeps track of all the Packets it has received that need to be acked, and of all the Packets sent out on it that we want acked (so it can resend them if they don't get acked in time). Circuits are also used as the final touch on the Packets before they get sent out. Packets need to have their flags and sequence number set. So, the Circuit keeps track of the id for the next packet that will be sent out over it, and sets the Packets accordingly. It also adds things such as the flags. All functionality that is related to the last-minute additions to the Packet, such as the flags, the sequence number, and even adding acks to the end of Packets (to save network bandwidth), is done through the Circuit.
However, the actual sending and receiving is not done through the Circuit. It is done through a UDPClient.
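The bookkeeping a Circuit does can be sketched like this. CircuitSketch, prepare_packet, and handle_incoming are hypothetical names, and the simplified Packet here carries only the fields the sketch needs.

```python
class Packet:
    def __init__(self, name, reliable=False):
        self.name = name
        self.reliable = reliable
        self.sequence = None
        self.acks = []


class CircuitSketch:
    def __init__(self, host):
        self.host = host          # (ip, port)
        self.next_sequence = 1
        self.unacked = {}         # sequence -> Packet we sent and want acked
        self.acks_to_send = []    # sequences we received and must ack

    def prepare_packet(self, packet):
        """Last-minute additions: sequence number and piggybacked acks."""
        packet.sequence = self.next_sequence
        self.next_sequence += 1
        packet.acks, self.acks_to_send = self.acks_to_send, []
        if packet.reliable:
            self.unacked[packet.sequence] = packet
        return packet

    def handle_incoming(self, sequence, reliable, acks=()):
        if reliable:
            self.acks_to_send.append(sequence)
        for acked in acks:
            self.unacked.pop(acked, None)


circuit = CircuitSketch(("127.0.0.1", 13000))
first = circuit.prepare_packet(Packet("A", reliable=True))
circuit.handle_incoming(sequence=10, reliable=True)
second = circuit.prepare_packet(Packet("B"))
circuit.handle_incoming(sequence=11, reliable=False, acks=[first.sequence])
```

In this run the second outgoing packet carries the ack for incoming packet 10, and the ack attached to incoming packet 11 clears the first packet from the unacked table.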
NetUDPClient
The NetUDPClient handles the actual sending and receiving of data over a network. It is the object that uses sockets directly. When the UDPClient receives a message on its socket, it simply passes the data along to something that can process it (see UDPDispatcher). It does nothing other than physically send and receive the data coming in and going out over a socket.
The NetUDPClient is used by calling the start_udp_connection function, which will create a new udp socket for us and return it. Then, whoever stores that socket can call the NetUDPClient's send_packet and receive_packet functions. Both functions take the socket to send or receive on.
Something to keep note of is that when we are sending and receiving, we actually pass the socket to the corresponding functions. Another design might be that the client object stores the socket itself and automatically uses that socket when sending and receiving. The reason it was done this way is because it was intended that only a single NetUDPClient object was needed even if the user wanted to use multiple sockets.
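A rough sketch of a client that takes the socket as an argument, using Python's standard socket module. The function names come from this page; the exact parameters (port, host, buffer_size), the loopback binding, and the timeout are assumptions made so the example can run on one machine.

```python
import socket


class NetUDPClientSketch:
    def __init__(self):
        self.sender = None

    def start_udp_connection(self, port=0):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("127.0.0.1", port))   # port 0 lets the OS choose a port
        sock.settimeout(2.0)             # avoid blocking forever in the example
        return sock

    def send_packet(self, sock, data, host):
        sock.sendto(data, host)

    def receive_packet(self, sock, buffer_size=4096):
        data, self.sender = sock.recvfrom(buffer_size)
        return data, len(data)

    def get_sender(self):
        return self.sender


# One client object, two sockets: the client holds no socket of its own.
client = NetUDPClientSketch()
sock_a = client.start_udp_connection()
sock_b = client.start_udp_connection()
addr_a = sock_a.getsockname()
client.send_packet(sock_a, b"hello", sock_b.getsockname())
msg_buf, msg_size = client.receive_packet(sock_b)
sock_a.close()
sock_b.close()
```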
UDPDispatcher
The way that Pyogp currently does udp messaging is through what is called the UDPDispatcher. The UDPDispatcher provides the API through which you create a connection (or connections) and send and receive messages. The UDPDispatcher is intended to simply be an interface between the objects that send and receive information on a socket (NetUDPClients), and those that handle (respond to) Packets.
The UDPDispatcher is in charge of using the NetUDPClient to create a UDP socket to send and receive messages on. It is also responsible for serializing Packets that are being sent out, deserializing Packets that come in, and managing all the circuits that have been established through it (this means that the UDPDispatcher sends out acks and resends Packets if the Circuits have tracked such things).
The UDPDispatcher object should really be all that any user of Pyogp needs to deal with when sending or receiving messages. The user shouldn't have to establish any connections or do any direct sending or receiving of messages over a socket or connection. The user should always go through the UDPDispatcher. The UDPDispatcher is meant to be driven by some outside source, either a client, a test, or just another application. It has none of its own loops or threads.
Although the user should really be using our UDPDispatcher object, we have separated concerns enough that if the user would like to use a Circuit and UDPClient directly, he or she has the freedom to do so.
Another design decision we have considered is that the UDPDispatcher uses only a single socket, even if it manages many Circuits. This means that even if the client that uses the UDPDispatcher is connected to many different sims (has many Circuits established) all network traffic will flow over a single socket. There is not a 1-1 Circuit-to-socket relationship. However, if the user would like such a relationship, he or she simply has to create a UDPDispatcher for every Circuit. So, in other words, a UDPDispatcher can be managing as many Circuits as the user would like, either 1 or many, but the UDPDispatcher always has only one socket.
Using the system
Initializing the system
UDPDispatcher() - this will create a dispatcher for us to be able to send and receive messages on.
Building a message
When someone wants to create a message, one way is to use the MsgData, MsgBlockData, and MsgVariableData objects directly, building them as they go. The old method is to use a builder that will construct the message for us. Unfortunately, this method is really c++-ish and is complicated to use. We ARE using Python, aren't we? Therefore, we have created a wrapper around MsgData to make construction of a message more Pythonic.
Message and Block
The Message and Block classes are derivations of the MsgData and MsgBlockData classes. They allow you to fill in the message data by giving it Python objects. For instance, to create a message one would do:
msg = Message('TestPacket', Block('CircuitCode', ID=1234, Code=531) )
This is creating a message called ‘TestPacket’, giving it a Block named ‘CircuitCode’, and giving the Block the variables ‘ID’ and ‘Code’. Message and Block then parse the passed information and construct the MsgData out of it (Message is a subclass of MsgData, and Block is a subclass of MsgBlockData).
Note that although we send Packets and not Messages or MsgDatas, we don’t actually have to create a Packet, we simply create the Message. The UDPDispatcher will create the Packet out of your message data and fill in the Packet fields with the necessary information.
Also note that at this time there is no checking to make sure you are constructing a message how the template specifies. Building a message is just constructing the MsgData data. The serializer will check to make sure you have everything that is needed and formatted properly.
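As a rough illustration of how such a wrapper can turn keyword arguments into block data, here is a stand-alone sketch. In Pyogp the classes derive from MsgData and MsgBlockData; the versions below do not, and their attribute names are assumptions.

```python
class Block:
    def __init__(self, name, **variables):
        self.name = name
        self.vars = dict(variables)   # variable name -> value


class Message:
    def __init__(self, name, *blocks):
        self.name = name
        self.blocks = {}              # block name -> list of Block
        for block in blocks:
            self.blocks.setdefault(block.name, []).append(block)


msg = Message('TestPacket', Block('CircuitCode', ID=1234, Code=531))
```

As in the text above, nothing here checks the message against the template; that is left to the serializer.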
Sending a message
Once a message has been built, the user can then send the message to a given host (with a host being a combination of the ip address and port).
send_message(message, host) - this method will construct a Packet out of your MsgData and send the message to the host specified. This function also gets the circuit to prepare the Packet, that is, add on any packet flags and add the sequence number to the Packet, adds any acks on to the end. The dispatcher finally makes sure that the created message is serialized. Eventually, the Circuit, or dispatcher, should also compress the message using a zero-coding, but this functionality is not yet in the system (however, we do decode zero-coded messages). This method will return the string buffer that was just sent. The message will be sent using a udp network client, or NetUDPClient.
send_retry(message, host) - sends a message using the RETRY packet flag, but delegates sending to send_message.
send_reliable(message, host, retries) - sends a message using the RELIABLE flag, as well as sets the packet's retry count (meaning, how many times this Packet can attempt to be resent before it is deleted if it isn’t acked) but delegates sending to send_message.
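The delegation between the three send methods can be sketched as follows. The flag values, the extra flags and retries keyword arguments, and the sent list (used here in place of a real socket) are all assumptions for illustration.

```python
RELIABLE = 0x40   # assumed flag values, for illustration only
RETRY = 0x20


class DispatcherSketch:
    def __init__(self):
        # Records what would have been sent instead of using a socket.
        self.sent = []

    def send_message(self, message, host, flags=0, retries=0):
        self.sent.append((host, flags, retries, message))
        return message

    def send_reliable(self, message, host, retries):
        # Only sets the flag and retry count, then delegates.
        return self.send_message(message, host, flags=RELIABLE, retries=retries)

    def send_retry(self, message, host):
        return self.send_message(message, host, flags=RETRY)


dispatcher = DispatcherSketch()
host = ("127.0.0.1", 13000)
dispatcher.send_message("plain", host)
dispatcher.send_reliable("important", host, retries=3)
dispatcher.send_retry("again", host)
```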
Receiving a message
Receiving is a bit trickier than sending a message. To receive a message, we don’t go directly through the dispatcher, but use the dispatcher’s udp client to get the data from the socket and pass it to the dispatcher to, well, dispatch it to a handler. So for instance, we could do:
msg_buf, msg_size = dispatcher.udp_client.receive_packet(dispatcher.socket)
packet = dispatcher.receive_check(dispatcher.udp_client.get_sender(), msg_buf, msg_size)
This seems a bit confusing at first, but the reason for this abstraction is that an application could (and most likely will) run the socket receiving in a thread or some non-blocking routine. Then, when it gets data, it will just hand off the data to a dispatcher and the dispatcher will take care of it (including deserializing the message and having Circuits check for acks and such). This was designed to work with things like Twisted or Eventlet in such a way that it is swappable, with using the dispatcher’s receive_check as a callback.
receive_check(host, msg_buf, msg_size) - this method processes the message, reading its binary form into a deserialized form, and passes the Packet to the Circuit object that corresponds to the ip address and port that the data came in on. The Circuit will then process the message to determine whether it needs to be acked, whether it has acks attached, whether it is zero-coded, etc. This function will also return the received Packet for someone else to handle. Note that the dispatcher doesn’t do the Packet handling just yet. It only determines the acks for the Packet. Actually handling the Packet, such as changing client state or responding to the message, should be done elsewhere. In the future, the dispatcher may do this as well, but it does not currently do it.
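For the zero-coded case mentioned above, decoding might look like the sketch below. It assumes the usual run-length scheme in which each zero byte in the encoded data is followed by a count of how many zeros it stands for; this page does not spell out the scheme, so treat the details as an assumption.

```python
def zero_decode(data: bytes) -> bytes:
    """Expand runs of zeros encoded as a zero byte followed by a count."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 0 and i + 1 < len(data):
            out.extend(b"\x00" * data[i + 1])   # next byte is the run length
            i += 2
        else:
            out.append(byte)                    # ordinary byte, copied as-is
            i += 1
    return bytes(out)
```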
Maintenance
The Message System also has some other features that give users an easy way to make sure all maintenance is handled properly. Maintenance includes acking packets and removing stale packets (ones that cannot be resent anymore and haven't been acked in time).
process_acks() - this function resends all packets that we have sent out that haven't been acked in time, as well as sending out all the acks of the messages we have received from the server that expect to be acked. It checks each Circuit in the dispatcher and determines if it has any Packets that can be resent, any Packets that need acking, and any Packets that are stale and have expired.
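The resend and expiry half of this maintenance pass can be sketched as a plain function. SentPacket, the timeout argument, and the resend callback are hypothetical; sending out the acks we owe the other side is left out of the sketch.

```python
class SentPacket:
    def __init__(self, sequence, retries, expire_time):
        self.sequence = sequence
        self.retries = retries
        self.expire_time = expire_time


def process_acks(unacked, now, timeout, resend):
    """Resend timed-out packets; drop those with no retries left.

    unacked maps sequence -> SentPacket and resend is a callback.
    Returns the list of sequences dropped as stale.
    """
    stale = []
    for sequence, packet in list(unacked.items()):
        if now < packet.expire_time:
            continue                      # still waiting for its ack
        if packet.retries > 0:
            packet.retries -= 1
            packet.expire_time = now + timeout
            resend(packet)
        else:
            stale.append(sequence)
            del unacked[sequence]
    return stale


unacked = {
    1: SentPacket(1, retries=1, expire_time=5.0),
    2: SentPacket(2, retries=0, expire_time=5.0),
    3: SentPacket(3, retries=2, expire_time=50.0),
}
resent = []
stale = process_acks(unacked, now=10.0, timeout=3.0, resend=resent.append)
```

Because the function takes the current time as an argument and runs a single pass, it fits the design above in which the dispatcher has no loops or threads of its own and is driven from outside.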
Design Decision
When sending a message, the user specifies the host that will receive the message.
- Advantages:
- The same dispatcher can be used to send to multiple hosts without any reconfiguration. The other method would be to couple the dispatcher with its targeted receiver so that when sending a message it will always go to that receiver. This binds the dispatcher to communicating with only a single host, unless there is added functionality to allow a list of target hosts that one can send to, which effectively brings us back to allowing the dispatcher to send to any host (the original design). This is also not the design of the dispatcher. It was designed to “dispatch” a message to the correct Circuit to be handled properly.
Old UDP Messaging
Message System/API
The way that Pyogp currently does messaging is through the Message System. The Message System provides the API through which you create a connection (or connections), and build, send, and receive messages. The Message System encapsulates all the other functionality that is needed to start working with messages. That is, it handles all parsing of the message template (message_template.msg), the message list (message.xml), creates dictionaries out of them, creates a UDP socket to send and receive messages on, sets up HTTP connections, and has the builders and readers necessary to build and read the template formatted messages and the llsd formatted messages. It also handles all the maintaining of packets that need to be acked, keeping track of which we (being the client) need to ack and which we want acked by the server, as well as sending the acks or resending unacked packets.
The Message System object should really be all that any user of Pyogp needs to deal with when sending or receiving messages. The user shouldn't need to use a reader or a builder directly. The user also shouldn't have to establish any connections or do any direct sending or receiving of messages over a socket or connection. The user should always go through the Message System. The Message System is meant to be driven by some outside source, either a client, a test, or just another application. It has none of its own loops or threads.
Initializing the system
MessageSystem(port) - the port on which the Message System will receive messages. This is currently not used and may even be removed later.
Building a message
There are a few methods in the Message System that are available as high-level api calls. When a user wants to create a message, he or she does not have to know which format the server expects the message to be in. For this reason, the Message System determines which format to use and delegates building the message to the corresponding builder.
new_message(message_name) - begins creating a new message. The Message System determines which type of message is being built (either the template flavor or the llsd flavor). This determines which builder should be used. The creation of the new message is then delegated to the corresponding builder (transparent to the user).
next_block(block_name) - this tells the builder that we want to begin building the named block of the message. This is just delegated to a builder (see below for the details of how it works).
add_data(var_name, data, data_type) - adds data to the variable in the current block with the var_name name, delegated to the builder.
Sending a message
Once a message has been built, the user can then send the message to a given host (with a host being a combination of the ip address and port).
send_message(host, message_buf=None) - this sends the message that we most recently built using the new_message, next_block, add_data functions. It sends the message to the host specified. The message_buf parameter allows for the user to pass in a message to send that wasn't built using the Message System (such as those created by directly using the builder of choice). This function also makes sure the created message is serialized, adds on any packet flags, adds the sequence number for the packet, adds the packet identification, and finally the payload. It also adds any acks on to the end and compresses the message using a zero-coding.
send_retry(host, message_buf=None) - sends a message using the RETRY packet flag, but delegates sending to send_message.
send_reliable(host, retries, message_buf=None) - sends a message using the RELIABLE flag, as well as sets the packet's retry count, but delegates sending to send_message.
Receiving a message
receive_check() - determines if there is a message waiting on the socket. Does a single pass to get a single message. If there is a message waiting, it processes the message, reading its binary form into a deserialized form, and determining its flags (that is, does it need to be acked, does it have acks attached, is it zero-coded, etc). The read message can then be accessed through the methods:
get_received_message() - this returns the whole message, in the form of a MsgData object
get_data(block_name, var_name, data_type, block_number=0) - this gets data from a particular block. The block_number is used when the message has multiple or variable blocks.
Maintenance
The Message System also has some other features that give users an easy way to make sure all maintenance is handled properly. Maintenance includes acking packets and removing stale packets (ones that cannot be resent anymore and haven't been acked in time).
process_acks() - this function resends all packets that we have sent out that haven't been acked in time, as well as sending out all the acks of the messages we have received from the server that expect to be acked.
Design Decision
You'll notice that, in most cases, the user never has direct access to a created or received message. The user can, of course, get the message directly by going through the Message System's builder and reader (messagesystem.builder.current_msg), but the design is not meant to be used in such a way. The user is not meant to manipulate a message directly but to use builders and readers for such a thing. Even then, the user shouldn't even be using builders and readers but going through the message system, which uses them.
- Advantages:
- Separation of concerns: this means the MsgData class represents a message, the builder builds the message, and the reader reads a message. Each has its own particular function and only that function. There is no object that has more than one piece of functionality.
- Ease of use: the MessageSystem provides the high-level functionality to bring all the pieces together to allow the user to not need to build any message by hand, do any serialization or connections, keep track of circuit information (which sequence # is next, packet acking, etc), build or read packet header information (flags). Also allows the user to not need to know what formats the server expects to receive messages in. The MessageSystem handles this distinction based on message type.
- Generic messages: any message can be built, read, sent, and received through this manner. All messages are read from the message template and so they can all be built with this generic representation of messages, builders, and readers.
- Disadvantages:
- No message object: the messages being built and sent are stored within the builders, readers, and MessageSystem, and so it may not be what users expect. For instance, users may be used to using a message OBJECT directly and passing a given message to something else to send it (rather than just calling send_message()). On the same line, people may be used to having explicit connections that they send/receive on. E.g. conn1.send(message1) rather than messageSystem.send_message() which determines which connection to send the message to.
- Related to number 1, because we have no message object, there is no way to save and reuse objects. We have to reconstruct the message every time we wish to send the same message. This isn't necessarily true because we ARE currently saving packets that have been sent (so that we can resend them if they don't get acked in time). They DO exist, the design just keeps the user from ever needing to have direct access. This may be one case that the Message System could give the user a direct message if he or she so asks for it. This is currently possible; it's just not explicitly in the API.
- Sequential building of message: the messages are built by doing a series of new_message, next_block, add_data calls. Some may want to build a message by having an object (see note above) that they can do direct calls to, e.g. message.name = "Locklainn" and message.agent_id = uuid.UUID('blahblah'). The problem here lies in the MULTIPLE and VARIABLE type blocks. For instance, a single message can have the same variable multiple times in the message. So you can't simply do message.name = "NAME" because the variable might exist in multiple blocks, and so which block does that variable belong to? Another way one might do it is by having something such as message.name_1 and message.name_2, but this doesn't seem to gain us anything in terms of ease of use in building a message. A given message can have any number of these blocks, so to handle the same thing one would have to use lists or some other container structure. One might have to do the following: message.blocks = [{'name': "Locklainn", 'agent_id': UUID("blahblah")}, {'session_id': 123531, 'circuit_code': 123656}].
When sending a message, the user specifies the host that will receive the message.
- Advantages:
- The same messaging system can be used to send to multiple hosts without any reconfiguration. The other method would be to couple the messaging system with its targeted receiver so that when sending a message it will always go to that receiver. This binds the messaging system to communicating with only a single host, unless there is added functionality to allow a list of target hosts that one can send to, which effectively brings us back to allowing the message system to send to any host (the original design).
UDP Template Messaging
There are a few main components to sending a UDP message: the message template, the parser, the builder, and the reader.
Message Template Builder
The builder is used to create messages that can be sent through UDP. It is used to make sure that the messages being built are in accordance with the message template and have all the necessary blocks and data that go along with the message. The builder creates message objects using the MsgData, MsgBlockData, and MsgVariableData classes. These objects differ from the template versions in that they hold the actual data. They don't hold general information but only exactly what will be sent through UDP. However, they hold the data in object form, and therefore are not serialized. The builder also serializes the message once it is finished being built.
Messages are built in a sequence of steps:
- new_message(message_name) - this method sets up a new message to begin being built. The message is filled in with all the block stubs that it needs to have.
- next_block(block_name) - sets the block that we are building to be the block with the given block name. It also fills in the stubbed block with all the variables that the block needs to have.
- Note that if we are trying to set the block to one that doesn't exist, or that has already been created, we will get errors. However, if the block is of type multiple or variable, then we can create more than one block with the same name. A multiple type block means that there is a fixed number of blocks that the message must have, so we can add that number of blocks to the message (but no more than that number). A variable type block means that there can be any number of blocks, so we can add any number of this block to the message (meaning, we can call next_block() with the same block_name any number of times).
- add_data(var_name, data, data_type) - this adds the data to the current block. There are a couple of checks to make sure that the data you pass in matches what the template expects.
- When we add data we store the size of the data. Now, normally we know the size directly from the data type (where both type and size are stored in the template). But when the data is of type variable then the template doesn't store the size of the data (because it can't, the data can be any size). However, the template stores how many bytes the size can be. So, for type variable, we have to determine the size of the actual data being written and store that instead.
- build_message() - this goes through the message, each of its blocks, and their data, and serializes the data into a string that can be sent over UDP. Before it does so, it makes sure that the data added to the message is correct, and that the message has all the blocks and variables it is supposed to have, in the right format. It returns the serialized message and its size.
- To figure out the format of the message and what the serializer is doing, check out the Pyogp/Client_Lib/Notes page.
- This uses the DataPacker to pack the data correctly. The DataPacker takes the data and the data_type and determines how to serialize the data given the type.
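The build sequence above can be sketched with a toy builder. Everything here is an assumption made for illustration: the ToyBuilder class, the TEMPLATE dictionary, the type names U32 and variable1, and the byte layout (a one-byte repeat count for variable blocks, a one-byte size prefix for variable data) are simplified stand-ins, not the actual pyogp classes or wire format.

```python
import struct

# Hypothetical, heavily simplified template: block name -> (block type,
# repeat count, {variable name: data type}). Not the real message template.
TEMPLATE = {
    "TestMessage": {
        "AgentData": ("single", 1, {"Code": "U32"}),
        "Names": ("variable", None, {"Name": "variable1"}),
    }
}


def pack(data, data_type):
    if data_type == "U32":
        return struct.pack("<I", data)
    if data_type == "variable1":
        # Variable data: the template fixes only the width of the size field
        # (one byte here); the actual size is measured from the data.
        return struct.pack("<B", len(data)) + data
    raise TypeError("unknown data type: " + data_type)


class ToyBuilder:
    def new_message(self, message_name):
        self.template = TEMPLATE[message_name]
        # Stub out every block the message is allowed to have.
        self.blocks = {block_name: [] for block_name in self.template}
        self.current_name = None
        self.current = None

    def next_block(self, block_name):
        if block_name not in self.template:
            raise KeyError("no such block: " + block_name)
        block_type, count, _ = self.template[block_name]
        # Variable blocks may repeat freely; other blocks have a fixed count.
        if block_type != "variable" and len(self.blocks[block_name]) >= count:
            raise ValueError("too many blocks: " + block_name)
        self.current_name = block_name
        self.current = {}
        self.blocks[block_name].append(self.current)

    def add_data(self, var_name, data, data_type):
        expected = self.template[self.current_name][2][var_name]
        if data_type != expected:
            raise TypeError("expected %s, got %s" % (expected, data_type))
        self.current[var_name] = data

    def build_message(self):
        out = b""
        for block_name, (block_type, count, variables) in self.template.items():
            instances = self.blocks[block_name]
            if block_type == "variable":
                out += struct.pack("<B", len(instances))  # repeat count
            elif len(instances) != count:
                raise ValueError("missing blocks: " + block_name)
            for instance in instances:
                for var_name, data_type in variables.items():
                    out += pack(instance[var_name], data_type)
        return out, len(out)


builder = ToyBuilder()
builder.new_message("TestMessage")
builder.next_block("AgentData")
builder.add_data("Code", 7, "U32")
builder.next_block("Names")
builder.add_data("Name", b"ab", "variable1")
builder.next_block("Names")  # a variable block can be added again
builder.add_data("Name", b"c", "variable1")
serialized, size = builder.build_message()
```

In this sketch the pack function plays the role the text assigns to the DataPacker: it receives the data and its type and decides how to serialize it.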
Message Template Reader
The reader attempts to read a message that has been received through UDP. The reader can only process one message at a time. There are a few steps to processing and using the read data:
- validate_message(buffer,size) - this attempts to decode the message and figure out what type of message it is. It also checks to make sure the message is valid (in the message template), and keeps track of the template if it is.
- read_message(buffer) - goes through the message, skipping over the header and pre-header information (which gets added by some other process), and processes the blocks and the data. It also makes sure that the message was validated first. It iterates over the buffer, reading block and variable data and constructing a MsgData object out of it, deserializing each value from its serialized form.
- Note that if the block is multiple or variable, it repeatedly reads the block data until it has processed all of the message data.
- The DataUnpacker is used to deserialize the data. It is given the data and type.
- get_data(block_name, var_name, data_type, block_number=0) - this returns the data that was deserialized for the message. We give it the name of the block in which to find the variable, the variable name, the data type (an error check to make sure that the user is aware of the type), and, as an optional argument, the block_number (which is only used when the message has many blocks of the same name, i.e. the block type is multiple or variable).
- clear_message() (optional) - this gets the reader ready to read a new message. The reader won't crash without doing this, but warnings will be issued to make sure this is the desired behavior.
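The reading steps above can be sketched with a toy reader. As with any such illustration, the names and formats are assumptions: ToyReader, the TEMPLATES dictionary, the one-byte message number used as a header, and the one-byte repeat count and size prefixes are simplified stand-ins, not the actual pyogp reader or packet layout.

```python
import struct

# Hypothetical one-byte message numbers mapped to simplified templates:
# block name -> (block type, repeat count, {variable name: data type}).
TEMPLATES = {
    1: ("TestMessage", {
        "AgentData": ("single", 1, {"Code": "U32"}),
        "Names": ("variable", None, {"Name": "variable1"}),
    }),
}


def unpack(buffer, pos, data_type):
    """Return (value, new position) for one variable of the given type."""
    if data_type == "U32":
        return struct.unpack_from("<I", buffer, pos)[0], pos + 4
    if data_type == "variable1":
        size = buffer[pos]  # one-byte size prefix
        return bytes(buffer[pos + 1:pos + 1 + size]), pos + 1 + size
    raise TypeError("unknown data type: " + data_type)


class ToyReader:
    def __init__(self):
        self.clear_message()

    def clear_message(self):
        self.name = None
        self.template = None
        self.blocks = {}

    def validate_message(self, buffer, size):
        # Toy header: a single byte holding the message number.
        if size < 1 or buffer[0] not in TEMPLATES:
            return False
        self.name, self.template = TEMPLATES[buffer[0]]
        return True

    def read_message(self, buffer):
        if self.template is None:
            raise RuntimeError("validate_message() must succeed first")
        pos = 1  # skip the toy header
        for block_name, (block_type, count, variables) in self.template.items():
            if block_type == "variable":
                count = buffer[pos]  # repeat count precedes the instances
                pos += 1
            instances = []
            for _ in range(count):
                instance = {}
                for var_name, data_type in variables.items():
                    value, pos = unpack(buffer, pos, data_type)
                    instance[var_name] = (data_type, value)
                instances.append(instance)
            self.blocks[block_name] = instances

    def get_data(self, block_name, var_name, data_type, block_number=0):
        stored_type, value = self.blocks[block_name][block_number][var_name]
        if stored_type != data_type:
            raise TypeError("expected %s, got %s" % (stored_type, data_type))
        return value


packet = b"\x01" + b"\x07\x00\x00\x00" + b"\x02" + b"\x02ab" + b"\x01c"
reader = ToyReader()
is_valid = reader.validate_message(packet, len(packet))
reader.read_message(packet)
code = reader.get_data("AgentData", "Code", "U32")
second_name = reader.get_data("Names", "Name", "variable1", block_number=1)
reader.clear_message()  # ready for the next message
```

The block_number argument only matters for the repeated Names block; for the single AgentData block the default of 0 is enough.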