Sunday, February 20, 2011

SOAP Encoding

This article is intended for readers who already understand the syntax of SOAP and want to know more about what SOAP can do.

The real power of SOAP stems from some of the capabilities discussed earlier. Let us look at one of them here: the SOAP encoding mechanism.

Let us first understand the meaning and utility of encoding. The power of SOAP comes from its ability to transfer messages in XML format, and encoding is the process of converting user-specific data into that XML format. This data could be any of the following:

  1. A remote procedure call (RPC) to a function or procedure on the remote computer.
  2. A request or order that is valid in the business sense and that the remote computer is capable of understanding.
  3. Java objects that are to be transferred to the other end.
  4. A physical document (e.g. a mail message or text document) that has to be transferred.

For this exchange to be flexible, both sides need a standard way of "encoding" the data so that either computer can easily understand it. This is where SOAP Encoding comes into the picture: it allows us to transfer messages of any type within the SOAP envelope. There are two types of encoding:

  1. SOAP encoding.
  2. Literal encoding.

SOAP encoding is a set of rules defined in the SOAP specification that tell us how to convert data into XML format. It follows the SOAP Data Model, which views all data as object graphs and provides a way to "encode" such a graph into XML. Some of its salient features are:

  1. All data is viewed as nodes which have some value.
  2. Nodes can be connected to other nodes via directed edges, each of which carries a label.
  3. A node can have incoming edges, outgoing edges, or both.
  4. A simple value node with a labeled inbound edge is serialized into a single XML element, with the label as the element name and the value as the element's text content.
  5. When serializing a node into an XML element, an xsi:type attribute can be added to specify the type of the node.
  6. A compound value node with labeled outbound edges (a data structure) is serialized into a single XML element with child elements. Each outbound edge is serialized into one child element whose name equals the edge's label.

     
Example of a Data Model:


class Product{
     String desc;
     String type;
     String sku;
     double price;
}
A Java class and its graph notation (figure not available)


According to the rules defined in the SOAP specification, the above can be represented in XML as follows:

<product>
<type xsi:type="xs:string">type of prod</type>
<desc xsi:type="xs:string">desc of product</desc>
<price xsi:type="xs:double">100.00</price>
<sku xsi:type="xs:string">xyz</sku>
</product>
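
The serialization rules listed earlier can be sketched in a few lines of Java. This is only a rough illustration, not how any SOAP toolkit actually does it: the class name SoapEncodingSketch and its methods are made up for this example, the compound node is modeled as a simple ordered map of edge label to value, and only a handful of Java types are mapped to XML Schema types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a hand-rolled encoder for flat objects, not a SOAP library.
class SoapEncodingSketch {

    // Maps a Java value to an XML Schema type name (only the types needed here).
    static String xsdType(Object value) {
        if (value instanceof String) return "xs:string";
        if (value instanceof Double) return "xs:double";
        if (value instanceof Integer) return "xs:int";
        if (value instanceof Boolean) return "xs:boolean";
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    // Escapes the characters that must not appear raw in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // A compound node becomes one element; each labeled outbound edge
    // becomes one child element named after the edge's label.
    static String encode(String label, Map<String, Object> edges) {
        StringBuilder xml = new StringBuilder("<" + label + ">\n");
        for (Map.Entry<String, Object> edge : edges.entrySet()) {
            Object value = edge.getValue();
            xml.append("<").append(edge.getKey())
               .append(" xsi:type=\"").append(xsdType(value)).append("\">")
               .append(escape(String.valueOf(value)))
               .append("</").append(edge.getKey()).append(">\n");
        }
        return xml.append("</").append(label).append(">").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the edges in insertion order.
        Map<String, Object> product = new LinkedHashMap<>();
        product.put("type", "type of prod");
        product.put("desc", "desc of product");
        product.put("price", 100.00);
        product.put("sku", "xyz");
        System.out.println(encode("product", product));
    }
}
```

Note that Java renders the double 100.00 as 100.0, so the price element will not match the hand-written XML above character for character.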


This form of encoding has certain disadvantages:

  1. There is no way to tell if the encoding is valid or not (since we can't perform any validation on the message)
  2. The presence of xsi:type is not recommended by WS-I Basic Profile 1.0
  3. SOAP encoding is not supported by WS-I Basic Profile 1.0


"Literal" means an XML document fragment that can be validated against its XML Schema. The biggest advantage of literal encoding is that an XML message can be validated at every stage of its processing. Literal supports three messaging modes:

  1. Document/literal
  2. Document wrapped/literal
  3. RPC/literal

Document/Literal – In the Document/Literal SOAP messaging mode, the Body element of a SOAP message contains an XML document fragment, i.e. a well-formed XML element that contains arbitrary data and belongs to an XML schema and namespace separate from those of the SOAP message itself.

    The term "arbitrary data" signifies that the content does not belong to any standardized (e.g. W3C, WS-I) namespace. It should, however, have some business significance, something that makes logical sense to both the requester and the service provider. The second thing to note about Document/Literal is that the SOAP Body can have more than one child element.

Document/Literal:

<soap:Envelope><soap:Body>
<product>
<price xsi:type="xs:double">100.00</price>
<sku xsi:type="xs:string">xyz</sku>
</product>
<product>
<price xsi:type="xs:double">200.00</price>
<sku xsi:type="xs:string">abc</sku>
</product>
</soap:Body></soap:Envelope>

Document wrapped/literal:

<soap:Envelope><soap:Body>
<getProducts>
<product>
<price xsi:type="xs:double">100.00</price>
<sku xsi:type="xs:string">xyz</sku>
</product>
<product>
<price xsi:type="xs:double">200.00</price>
<sku xsi:type="xs:string">abc</sku>
</product>
</getProducts>
</soap:Body></soap:Envelope>


Document Wrapped/Literal – This differs from Document/Literal in only one respect: the SOAP Body can contain only one child element. If multiple elements are to be sent, this style takes care of "wrapping" them, using the name of the operation/method to create a parent element for the "document".
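
To make the wrapping step concrete, here is a minimal sketch using the DOM API that ships with the JDK. The class name WrappedBodySketch and its wrap method are invented for this illustration; getProducts, product, price and sku come from the example above. A real toolkit would derive the wrapper from the service description rather than from a plain string, and would normally put the wrapper in the service's own namespace, which this sketch leaves out.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative only: builds a wrapped body by hand with the JDK's DOM API.
class WrappedBodySketch {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    // Returns an envelope whose Body has exactly one child, named after the
    // operation. Each entry of products is a {price, sku} pair.
    static Document wrap(String operation, String[][] products) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element envelope = doc.createElementNS(SOAP_NS, "soap:Envelope");
        envelope.setAttributeNS(XMLNS_NS, "xmlns:soap", SOAP_NS);
        Element body = doc.createElementNS(SOAP_NS, "soap:Body");
        doc.appendChild(envelope);
        envelope.appendChild(body);

        // The wrapper element is the single child of the Body.
        Element wrapper = doc.createElementNS(null, operation);
        body.appendChild(wrapper);
        for (String[] p : products) {
            Element product = doc.createElementNS(null, "product");
            Element price = doc.createElementNS(null, "price");
            price.setTextContent(p[0]);
            Element sku = doc.createElementNS(null, "sku");
            sku.setTextContent(p[1]);
            product.appendChild(price);
            product.appendChild(sku);
            wrapper.appendChild(product);
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = wrap("getProducts",
                new String[][] {{"100.00", "xyz"}, {"200.00", "abc"}});
        // Print the envelope without the XML declaration.
        Transformer printer = TransformerFactory.newInstance().newTransformer();
        printer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        printer.transform(new DOMSource(doc), new StreamResult(System.out));
    }
}
```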

RPC/literal – The RPC/literal style is used to expose traditional components as web services. Such components do not explicitly exchange XML data but have methods with parameters and return values. Unlike the document styles above, which may contain arbitrary data, RPC/literal follows some rules fixed by the SOAP specification:

  1. An element named after the method to be invoked will be the child element of the SOAP:Body element.
  2. For the response, the child element of SOAP:Body will be named after the method with "Response" appended.
  3. The method parameters will be child elements of the method element, in the order in which they appear.
  4. The response element will contain a child called "return".
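
The four rules above can be sketched as plain string building in Java. Everything named here is hypothetical: the class RpcLiteralSketch, the method getPrice and its parameter sku are invented for illustration. The sketch also assumes parameter values are already XML-safe and leaves out the namespace that a real toolkit would normally put on the method element.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: assembles RPC/literal messages by string concatenation.
class RpcLiteralSketch {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Places the given element inside a SOAP envelope and body.
    static String envelope(String bodyChild) {
        return "<soap:Envelope xmlns:soap=\"" + SOAP_NS + "\"><soap:Body>"
                + bodyChild + "</soap:Body></soap:Envelope>";
    }

    // Rules 1 and 3: the Body child is named after the method, and the
    // parameters are its children in the order they were added to the map.
    static String request(String method, Map<String, String> params) {
        StringBuilder call = new StringBuilder("<" + method + ">");
        for (Map.Entry<String, String> p : params.entrySet()) {
            call.append("<").append(p.getKey()).append(">")
                .append(p.getValue())
                .append("</").append(p.getKey()).append(">");
        }
        call.append("</").append(method).append(">");
        return envelope(call.toString());
    }

    // Rules 2 and 4: the Body child is the method name plus "Response",
    // and the result sits in a child element called "return".
    static String response(String method, String returnValue) {
        String name = method + "Response";
        return envelope("<" + name + "><return>" + returnValue
                + "</return></" + name + ">");
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("sku", "xyz");   // hypothetical parameter
        System.out.println(request("getPrice", params));
        System.out.println(response("getPrice", "100.00"));
    }
}
```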

The biggest difference between RPC/literal and RPC/encoded, however, is the use of xsi:type attributes, which are used with SOAP encoding. Also, under literal encoding it is possible, and advisable, to have an XSD available to validate the XML request.
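
As a rough illustration of validating a literal fragment against an XSD, the sketch below uses the javax.xml.validation API from the JDK. The schema is an assumption written for this example (the article does not define one), and the class name LiteralValidationSketch is invented. It describes the product fragment used earlier, without the xsi:type attributes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Illustrative only: validates a literal fragment against an assumed schema.
class LiteralValidationSketch {

    // Hypothetical schema: a product has a price (double) followed by a sku (string).
    static final String PRODUCT_XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"product\"><xs:complexType><xs:sequence>"
        + "<xs:element name=\"price\" type=\"xs:double\"/>"
        + "<xs:element name=\"sku\" type=\"xs:string\"/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns true if the fragment conforms to PRODUCT_XSD, false if it is
    // malformed or violates the schema.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(
                    new StreamSource(new StringReader(PRODUCT_XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;   // parse error or schema violation
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<product><price>100.00</price><sku>xyz</sku></product>";
        String bad = "<product><price>cheap</price><sku>xyz</sku></product>";
        System.out.println(isValid(good));  // true
        System.out.println(isValid(bad));   // false: "cheap" is not an xs:double
    }
}
```

In practice the schema would usually come from the types section of the service's WSDL rather than from a string constant.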



Though all of the above encoding styles can be used, the most recommended one is Document/Literal with only one child element of SOAP:Body. It provides the full flexibility of XML messaging together with validation mechanisms such as XSD. One more important thing to understand is that these encoding styles matter mostly to an API that implements the SOAP specification. For an end user there is very little difference between them, other than in predicting the sample request/response for an operation. However, if we specify one of the above encoding styles to the API, it will handle the request according to the rules defined above.

