ovs-fields(7)
ovs-fields - protocol header fields in OpenFlow and Open vSwitch
Description
This document aims to comprehensively describe all of the fields, both standard and non-standard, supported by OpenFlow or Open vSwitch, regardless of origin\[char46]
A field is a property of a packet\[char46] Most familiarly, data fields are fields that can be extracted from a packet\[char46] Most data fields are copied directly from protocol headers, e\[char46]g\[char46] at layer 2, the Ethernet source and destination addresses, or the VLAN ID; at layer 3, the IPv4 or IPv6 source and destination; and at layer 4, the TCP or UDP ports\[char46] Other data fields are computed, e\[char46]g\[char46] ip_frag describes whether a packet is a fragment but it is not copied directly from the IP header\[char46]
Data fields that are always present as a consequence of the basic networking technology in use are called root fields\[char46] Open vSwitch 2\[char46]7 and earlier considered Ethernet fields to be root fields, and this remains the default mode of operation for Open vSwitch bridges\[char46] When a packet is received from a non-Ethernet interface, such as a layer-3 LISP tunnel, Open vSwitch 2\[char46]7 and earlier force-fit the packet to this Ethernet-centric point of view by pretending that an Ethernet header is present whose Ethernet type indicates the packet\(cqs actual type (and whose source and destination addresses are all-zero)\[char46]
Open vSwitch 2\[char46]8 and later implement the ``packet type-aware pipeline\(cq\(cq concept introduced in OpenFlow 1\[char46]5\[char46] Such a pipeline does not have any root fields\[char46] Instead, a new metadata field, packet_type, indicates the basic type of the packet, which can be Ethernet, IPv4, IPv6, or another type\[char46] For backward compatibility, by default Open vSwitch 2\[char46]8 imitates the behavior of Open vSwitch 2\[char46]7 and earlier\[char46] Later versions of Open vSwitch may change the default, and in the meantime controllers can turn off this legacy behavior, on a port-by-port basis, by setting options:packet_type to ptap in the Interface table\[char46] This is significant only for ports that can handle non-Ethernet packets, which are currently just LISP, VXLAN-GPE, and GRE tunnel ports\[char46] See ovs-vswitchd\[char46]conf\[char46]db(5) for more information\[char46]
Non-root data fields are not always present\[char46] A packet contains ARP fields, for example, only when its packet type is ARP or when it is an Ethernet packet whose Ethernet header indicates the Ethertype for ARP, 0x0806\[char46] In this documentation, we say that a field is applicable when it is present in a packet, and inapplicable when it is not\[char46] (These are not standard terms\[char46]) We refer to the conditions that determine whether a field is applicable as prerequisites\[char46] Some VLAN-related fields are a special case: these fields are always applicable for Ethernet packets, but have a designated value or bit that indicates whether a VLAN header is present, with the remaining values or bits indicating the VLAN header\(cqs content (if it is present)\[char46]
An inapplicable field does not have a value, not even a nominal ``value\(cq\(cq such as all-zero-bits\[char46] In many circumstances, OpenFlow and Open vSwitch allow references only to applicable fields\[char46] For example, one may match (see Matching, below) a given field only if the match includes the field\(cqs prerequisite, e\[char46]g\[char46] matching an ARP field is only allowed if one also matches on Ethertype 0x0806 or the packet_type for ARP in a packet type-aware bridge\[char46]
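The prerequisite rule can be illustrated with a minimal Python sketch. The table and the function name are hypothetical and cover only a few fields; the table is also simplified to IPv4 and ignores the packet_type alternative, so it is not how Open vSwitch itself stores or checks prerequisites.

```python
# Hypothetical, simplified prerequisite table: each field maps to the
# constraints that must also appear in the match. IPv4 only; the
# packet_type alternative for packet type-aware bridges is ignored.
PREREQS = {
    "arp_spa": {"eth_type": 0x0806},
    "arp_tpa": {"eth_type": 0x0806},
    "nw_src": {"eth_type": 0x0800},
    "tcp_dst": {"eth_type": 0x0800, "ip_proto": 6},
}


def missing_prerequisites(match):
    """Return (field, prerequisite) pairs that the match does not satisfy."""
    problems = []
    for field in match:
        for prereq, required in PREREQS.get(field, {}).items():
            if match.get(prereq) != required:
                problems.append((field, prereq))
    return problems


# An ARP field without the ARP Ethertype is reported as a problem.
print(missing_prerequisites({"arp_spa": 0x0A000001}))
```

A match that names the Ethertype 0x0806 alongside arp_spa yields an empty list, which corresponds to a match that refers only to applicable fields.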
Sometimes a packet may contain multiple instances of a header\[char46] For example, a packet may contain multiple VLAN or MPLS headers, and tunnels can cause any data field to recur\[char46] OpenFlow and Open vSwitch do not address these cases uniformly\[char46] For VLAN and MPLS headers, only the outermost header is accessible, so that inner headers may be accessed only by ``popping\(cq\(cq (removing) the outer header\[char46] (Open vSwitch supports only a single VLAN header in any case\[char46]) For tunnels, e\[char46]g\[char46] GRE or VXLAN, the outer header and inner headers are treated as different data fields\[char46]
Many network protocols are built in layers as a stack of concatenated headers\[char46] Each header typically contains a ``next type\(cq\(cq field that indicates the type of the protocol header that follows, e\[char46]g\[char46] Ethernet contains an Ethertype and IPv4 contains an IP protocol type\[char46] The exceptional cases, where protocols are layered but an outer layer does not indicate the protocol type for the inner layer, or gives only an ambiguous indication, are troublesome\[char46] An MPLS header, for example, only indicates whether another MPLS header or some other protocol follows, and in the latter case the inner protocol must be known from the context\[char46] In these exceptional cases, OpenFlow and Open vSwitch cannot provide insight into the inner protocol data fields without additional context, and thus they treat all later data fields as inapplicable until an OpenFlow action explicitly specifies what protocol follows\[char46] In the case of MPLS, the OpenFlow ``pop MPLS\(cq\(cq action that removes the last MPLS header from a packet provides this context, as the Ethertype of the payload\[char46] See Layer 2\[char46]5: MPLS for more information\[char46]
OpenFlow and Open vSwitch support some fields other than data fields\[char46] Metadata fields relate to the origin or treatment of a packet, but they are not extracted from the packet data itself\[char46] One example is the physical port on which a packet arrived at the switch\[char46] Register fields act like variables: they give an OpenFlow switch space for temporary storage while processing a packet\[char46] Existing metadata and register fields have no prerequisites\[char46]
A field\(cqs value consists of an integral number of bytes\[char46] For data fields, sometimes those bytes are taken directly from the packet\[char46] Other data fields are copied from a packet with padding (usually with zeros and in the most significant positions)\[char46] The remaining data fields are transformed in other ways as they are copied from the packets, to make them more useful for matching\[char46]
The most important use of fields in OpenFlow is matching, to determine whether particular field values agree with a set of constraints called a match\[char46] A match consists of zero or more constraints on individual fields, all of which must be met to satisfy the match\[char46] (A match that contains no constraints is always satisfied\[char46]) OpenFlow and Open vSwitch support a number of forms of matching on individual fields:
Exact match, e\[char46]g\[char46] nw_src=10\[char46]1\[char46]2\[char46]3
Only a particular value of the field is matched; for example, only one particular source IP address\[char46] Exact matches are written as field=value\[char46] The forms accepted for value depend on the field\[char46]
All fields support exact matches\[char46]
Bitwise match, e\[char46]g\[char46] nw_src=10\[char46]1\[char46]0\[char46]0/255\[char46]255\[char46]0\[char46]0
Specific bits in the field must have specified values; for example, only source IP addresses in a particular subnet\[char46] Bitwise matches are written as field=value/mask, where value and mask take one of the forms accepted for an exact match on field\[char46] Some fields accept other forms for bitwise matches; for example, nw_src=10\[char46]1\[char46]0\[char46]0/255\[char46]255\[char46]0\[char46]0 may also be written nw_src=10\[char46]1\[char46]0\[char46]0/16\[char46]
Most OpenFlow switches do not allow bitwise matching on every field (and before OpenFlow 1\[char46]2, the protocol did not even provide for the possibility for most fields)\[char46] Even switches that do allow bitwise matching on a given field may restrict the masks that are allowed, e\[char46]g\[char46] by allowing matches only on contiguous sets of bits starting from the most significant bit, that is, ``CIDR\(cq\(cq masks [RFC 4632]\[char46] Open vSwitch does not allow bitwise matching on every field, but it allows arbitrary bitwise masks on any field that does support bitwise matching\[char46] (Older versions had some restrictions, as documented in the descriptions of individual fields\[char46])
Wildcard, e\[char46]g\[char46] ``any nw_src\(cq\(cq
The value of the field is not constrained\[char46] Wildcarded fields may be written as field=*, although it is unusual to mention them at all\[char46] (When specifying a wildcard explicitly in a command invocation, be sure to use quoting to protect against shell expansion\[char46])
There is a tiny difference between wildcarding a field and not specifying any match on a field: wildcarding a field requires satisfying the field\(cqs prerequisites\[char46]
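The three forms above all reduce to one value-and-mask test, which the following Python sketch illustrates. The helper names are made up for this example and are not part of any Open vSwitch API.

```python
import ipaddress


def bitwise_match(field_value, value, mask):
    """True if field_value agrees with value on every bit set in mask."""
    return (field_value & mask) == (value & mask)


def prefix_to_mask(prefix_len, width=32):
    """Convert a CIDR prefix length such as /16 into a bit mask."""
    return ((1 << prefix_len) - 1) << (width - prefix_len)


def ip(text):
    """Parse a dotted-quad IPv4 address into an integer."""
    return int(ipaddress.IPv4Address(text))


src = ip("10.1.2.3")
# Exact match: every bit is significant.
print(bitwise_match(src, ip("10.1.2.3"), 0xFFFFFFFF))
# Bitwise match: nw_src=10.1.0.0/255.255.0.0, also written 10.1.0.0/16.
print(bitwise_match(src, ip("10.1.0.0"), prefix_to_mask(16)))
# Wildcard: an all-zero mask constrains nothing.
print(bitwise_match(src, 0, 0))
```

An all-ones mask gives an exact match and an all-zero mask gives a wildcard, so a single comparison covers every form in the list.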
Some types of matches on individual fields cannot be expressed directly with OpenFlow and Open vSwitch\[char46] These can be expressed indirectly:
Set match, e\[char46]g\[char46] ``tcp_dst \[mo] {80, 443, 8080}\(cq\(cq
The value of a field is one of a specified set of values; for example, the TCP destination port is 80, 443, or 8080\[char46]
For matches used in flows (see Flows, below), multiple flows can simulate set matches\[char46]
Range match, e\[char46]g\[char46] ``1000 \[<=] tcp_dst \[<=] 1999\(cq\(cq
The value of the field must lie within a numerical range, for example, TCP destination ports between 1000 and 1999\[char46]
Range matches can be expressed as a collection of bitwise matches\[char46] For example, suppose that the goal is to match TCP source ports 1000 to 1999, inclusive\[char46] The binary representations of 1000 and 1999 are:
01111101000
11111001111
The following series of bitwise matches will match 1000 and 1999 and all the values in between:
01111101xxx
0111111xxxx
10xxxxxxxxx
110xxxxxxxx
1110xxxxxxx
11110xxxxxx
1111100xxxx
which can be written as the following matches:
tcp,tp_src=0x03e8/0xfff8
tcp,tp_src=0x03f0/0xfff0
tcp,tp_src=0x0400/0xfe00
tcp,tp_src=0x0600/0xff00
tcp,tp_src=0x0700/0xff80
tcp,tp_src=0x0780/0xffc0
tcp,tp_src=0x07c0/0xfff0
Inequality match, e\[char46]g\[char46] ``tcp_dst \[!=] 80\(cq\(cq
The value of the field differs from a specified value, for example, all TCP destination ports except 80\[char46]
An inequality match on an n-bit field can be expressed as a disjunction of n 1-bit matches\[char46] For example, the inequality match ``vlan_pcp \[!=] 5\(cq\(cq can be expressed as ``vlan_pcp = 0/4 or vlan_pcp = 2/2 or vlan_pcp = 0/1\[char46]\(cq\(cq For matches used in flows (see Flows, below), sometimes one can more compactly express inequality as a higher-priority flow that matches the exceptional case paired with a lower-priority flow that matches the general case\[char46]
Alternatively, an inequality match may be converted to a pair of range matches, e\[char46]g\[char46] tcp_src \[!=] 80 may be expressed as ``0 \[<=] tcp_src < 80 or 80 < tcp_src \[<=] 65535\(cq\(cq, and then each range match may in turn be converted to a bitwise match\[char46]
Conjunctive match, e\[char46]g\[char46] ``tcp_src \[mo] {80, 443, 8080} and tcp_dst \[mo] {80, 443, 8080}\(cq\(cq
As an OpenFlow extension, Open vSwitch supports matching on conditions on conjunctions of the previously mentioned forms of matching\[char46] See the documentation for conj_id for more information\[char46]
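The conversion of a range match into bitwise matches, described under range matches above, can be sketched in Python. This is one common greedy decomposition, not necessarily what any particular tool does, and the function name is invented for the example. For the range 1000 to 1999 it yields the same seven value/mask pairs as the worked example.

```python
def range_to_bitwise(lo, hi, width=16):
    """Cover the inclusive range [lo, hi] with (value, mask) pairs."""
    full = (1 << width) - 1
    matches = []
    while lo <= hi:
        # Largest power-of-two block that starts on an aligned boundary at lo.
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it no longer runs past hi.
        while size > hi - lo + 1:
            size >>= 1
        # The mask keeps every bit above the block's free low-order bits.
        matches.append((lo, full & ~(size - 1)))
        lo += size
    return matches


for value, mask in range_to_bitwise(1000, 1999):
    print("tcp,tp_src=0x%04x/0x%04x" % (value, mask))
```

Each pass emits the largest aligned block that still fits in the remaining range, so the number of matches grows with the field width rather than with the size of the range.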
All of these supported forms of matching are special cases of bitwise matching\[char46] In some cases this influences the design of field values\[char46] ip_frag is the most prominent example: it is designed to make all of the practically useful checks for IP fragmentation possible as a single bitwise match\[char46]
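The expansion of an inequality match into n 1-bit matches, described under inequality matches above, can also be sketched in Python. The function name is hypothetical. For ``vlan_pcp != 5'' on a 3-bit field it reproduces the three value/mask pairs given in the text.

```python
def inequality_to_bitwise(excluded, width):
    """Return (value, mask) pairs for "field != excluded".

    A field value differs from `excluded` exactly when it satisfies
    at least one of the returned 1-bit matches.
    """
    matches = []
    for bit in reversed(range(width)):
        mask = 1 << bit
        # Require this bit to be the opposite of the excluded value's bit.
        matches.append(((excluded ^ mask) & mask, mask))
    return matches


# vlan_pcp != 5 on a 3-bit field.
for value, mask in inequality_to_bitwise(5, 3):
    print("vlan_pcp=%d/%d" % (value, mask))
```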
Some matches are very commonly used, so Open vSwitch accepts shorthand notations\[char46] In some cases, Open vSwitch also uses shorthand notations when it displays matches\[char46] The following shorthands are defined, with their long forms shown on the right side:
eth packet_type=(0,0) (Open vSwitch 2\[char46]8 and later)
ip eth_type=0x0800
ipv6 eth_type=0x86dd
icmp eth_type=0x0800,ip_proto=1
icmp6 eth_type=0x86dd,ip_proto=58
tcp eth_type=0x0800,ip_proto=6
tcp6 eth_type=0x86dd,ip_proto=6
udp eth_type=0x0800,ip_proto=17
udp6 eth_type=0x86dd,ip_proto=17
sctp eth_type=0x0800,ip_proto=132
sctp6 eth_type=0x86dd,ip_proto=132
arp eth_type=0x0806
rarp eth_type=0x8035
mpls eth_type=0x8847
mplsm eth_type=0x8848
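As an illustration only, the shorthand table above can be treated as a lookup from shorthand to long form. The Python sketch below copies the table and expands shorthands in a comma-separated match string; the function name and the simple string handling are assumptions of this example, not a description of how Open vSwitch parses matches.

```python
# Long forms copied from the shorthand table above.
SHORTHANDS = {
    "eth": "packet_type=(0,0)",
    "ip": "eth_type=0x0800",
    "ipv6": "eth_type=0x86dd",
    "icmp": "eth_type=0x0800,ip_proto=1",
    "icmp6": "eth_type=0x86dd,ip_proto=58",
    "tcp": "eth_type=0x0800,ip_proto=6",
    "tcp6": "eth_type=0x86dd,ip_proto=6",
    "udp": "eth_type=0x0800,ip_proto=17",
    "udp6": "eth_type=0x86dd,ip_proto=17",
    "sctp": "eth_type=0x0800,ip_proto=132",
    "sctp6": "eth_type=0x86dd,ip_proto=132",
    "arp": "eth_type=0x0806",
    "rarp": "eth_type=0x8035",
    "mpls": "eth_type=0x8847",
    "mplsm": "eth_type=0x8848",
}


def expand_shorthands(match):
    """Replace each shorthand token in a comma-separated match string."""
    tokens = [token.strip() for token in match.split(",")]
    return ",".join(SHORTHANDS.get(token, token) for token in tokens)


print(expand_shorthands("tcp,tp_src=0x03e8/0xfff8"))
```

Tokens that are not shorthands, such as field=value constraints, pass through unchanged.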
The discussion so far applies to all OpenFlow and Open vSwitch versions\[char46] This section starts to draw in specific information by explaining, in broad terms, the treatment of fields and matches in each OpenFlow version\[char46]
OpenFlow 1\[char46]0 defined the OpenFlow protocol format of a match as a fixed-length data structure that could match on the following fields:
Ingress port\[char46]
Ethernet source and destination MAC\[char46]
Ethertype (with a special value to match frames that lack an Ethertype)\[char46]
VLAN ID and priority\[char46]
IPv4 source, destination, protocol, and DSCP\[char46]
TCP source and destination port\[char46]
UDP source and destination port\[char46]
ICMPv4 type and code\[char46]
ARP IPv4 addresses (SPA and TPA) and opcode\[char46]
Each supported field corresponded to some member of the data structure\[char46] Some members represented multiple fields, in the case of the TCP, UDP, ICMPv4, and ARP fields whose presence is mutually exclusive\[char46] This also meant that some members were poor fits for their fields: only the low 8 bits of the 16-bit ARP opcode could be represented, and the ICMPv4 type and code were padded with 8 bits of zeros to fit in the 16-bit members primarily meant for TCP and UDP ports\[char46] An additional bitmap member indicated, for each member, whether its field should be an ``exact\(cq\(cq or ``wildcarded\(cq\(cq match (see Matching), with additional support for CIDR prefix matching on the IPv4 source and destination fields\[char46]
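The shared members can be made concrete with a Python sketch of the 40-byte layout. The member names and their order follow the ofp_match structure of the OpenFlow 1.0 specification as best recalled here, so treat them as an assumption and check them against the specification before relying on them.

```python
import struct

# Assumed layout of the OpenFlow 1.0 match structure (40 bytes).
OFP10_MATCH = struct.Struct(
    "!"    # network byte order, no implicit padding
    "I"    # wildcards bitmap
    "H"    # in_port
    "6s"   # dl_src
    "6s"   # dl_dst
    "H"    # dl_vlan
    "B"    # dl_vlan_pcp
    "x"    # padding
    "H"    # dl_type (Ethertype)
    "B"    # nw_tos
    "B"    # nw_proto, shared with the low 8 bits of the ARP opcode
    "2x"   # padding
    "I"    # nw_src, shared with ARP SPA
    "I"    # nw_dst, shared with ARP TPA
    "H"    # tp_src, shared with ICMPv4 type (zero-padded to 16 bits)
    "H"    # tp_dst, shared with ICMPv4 code (zero-padded to 16 bits)
)

# An ICMPv4 echo request (type 8, code 0) reuses the TCP/UDP port members.
example = OFP10_MATCH.pack(
    0, 1, bytes(6), bytes(6), 100, 0, 0x0800, 0, 1,
    0x0A010203, 0x0A010204, 8, 0)
print(OFP10_MATCH.size, len(example))
```

Because tp_src and tp_dst serve TCP, UDP, and ICMPv4 alike, the meaning of those two members depends on the nw_proto value in the same structure.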
Simplicity was recognized early on as the main virtue of this approach\[char46] Obviously, any fixed-length data structure cannot support matching new protocols that do not fit\[char46] There was no room, for example, for matching IPv6 fields, which was not a priority at the time\[char46] Lack of room to support matching the Ethernet addresses inside ARP packets actually caused more of a design problem later, leading to an Open vSwitch extension action specialized for dropping ``spoofed\(cq\(cq ARP packets in which the frame and ARP Ethernet source addresses differed\[char46] (This extension was never standardized\[char46] Open vSwitch dropped support for it a few releases after it added support for full ARP matching\[char46])
The design of the OpenFlow fixed-length matches also illustrates compromises, in both directions, between the strengths and weaknesses of software and hardware that have always influenced the design of OpenFlow\[char46] Support for matching ARP fields that do fit in the data structure was only added late in the design process (and remained optional in OpenFlow 1\[char46]0), for example, because common switch ASICs did not support matching these fields\[char46]
The compromises in favor of software occurred for more complicated reasons\[char46] The OpenFlow designers did not know how to implement matching in software that was fast, dynamic, and general\[char46] (A way was later found [Srinivasan]\[char46]) Thus, the designers sought to support dynamic, general matching that would be fast in realistic special cases, in particular when all of the matches were microflows, that is, matches that specify every field present in a packet, because such matches can be implemented as a single hash table lookup\[char46] Contemporary research supported the feasibility of this approach: the number of microflows in a campus network had been measured to peak at about 10,000 [Casado, section 3\[char46]2]\[char46] (Calculations show that this can only be true in a lightly loaded network [Pepelnjak]\[char46])
As a result, OpenFlow 1\[char46]0 required switches to treat microflow matches as the highest possible priority\[char46] This let software switches perform the microflow hash table lookup first\[char46] Only on failure to match a microflow did the switch need to fall back to checking the more general and presumed slower matches\[char46] Also, the OpenFlow 1\[char46]0 flow match was minimally flexible, with no support for general bitwise matching, partly on the basis that this seemed more likely amenable to relatively efficient software implementation\[char46] (CIDR masking for IPv4 addresses was added relatively late in the OpenFlow 1\[char46]0 design process\[char46])
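The lookup order described above can be sketched as a two-stage table in Python. This is a toy model under simplifying assumptions (packets are dictionaries of field values and wildcard flows are scanned linearly); the class and method names are invented and do not reflect any real switch implementation.

```python
class TwoStageTable:
    """Microflow hash lookup first, prioritized wildcard scan second."""

    def __init__(self):
        self.microflows = {}   # key built from every field -> action
        self.wildcards = []    # (priority, constraints, action) entries

    @staticmethod
    def _key(packet):
        # A microflow specifies every field, so all items form the key.
        return tuple(sorted(packet.items()))

    def add_microflow(self, packet, action):
        self.microflows[self._key(packet)] = action

    def add_wildcard(self, priority, constraints, action):
        self.wildcards.append((priority, constraints, action))
        # Keep the highest priority first for the fallback scan.
        self.wildcards.sort(key=lambda entry: entry[0], reverse=True)

    def lookup(self, packet):
        # Stage 1: a single hash table lookup, treated as highest priority.
        action = self.microflows.get(self._key(packet))
        if action is not None:
            return action
        # Stage 2: the slower, more general matches.
        for _priority, constraints, action in self.wildcards:
            if all(packet.get(f) == v for f, v in constraints.items()):
                return action
        return None


table = TwoStageTable()
table.add_wildcard(10, {"eth_type": 0x0800}, "normal")
table.add_wildcard(5, {}, "drop")
web = {"eth_type": 0x0800, "ip_proto": 6, "tcp_dst": 80}
table.add_microflow(web, "output:2")
print(table.lookup(web))
```

Only packets that miss the hash table pay for the linear scan, which is the property that made the microflow-first rule attractive for software switches.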
Microflow matching was later discovered to aid some hardware implementations\[char46] The TCAM chips used for matching in hardware do not support priority in the same way as OpenFlow but instead tie priority to ordering [Pagiamtzis]\[char46] Thus, adding a new match with a priority between the priorities of existing matches can require reordering an arbitrary number of TCAM entries\[char46] On the other hand, when microflows are highest priority, they can be managed as a set-aside portion of the TCAM entries\[char46]
The emphasis on matching microflows also led designers to carefully consider the bandwidth requirements between switch and controller: to maximize the number of microflow setups per second, one must minimize the size of each flow\(cqs description\[char46] This favored the fixed-length format in use, because it expressed common TCP and UDP microflows in fewer bytes than more flexible ``type-length-value\(cq\(cq (TLV) formats\[char46] (Early versions of OpenFlow also avoided TLVs in general to head off protocol fragmentation\[char46])
OpenFlow 1\[char46]0 does not clearly specify how to treat inapplicable fields\[char46] The members for inapplicable fields are always present in the match data structure, as are the bits that indicate whether the fields are matched, and the ``correct\(cq\(cq member and bit values for inapplicable fields are unclear\[char46] OpenFlow 1\[char46]0 implementations changed their behavior over time as priorities shifted\[char46] The early OpenFlow reference implementation, motivated to make every flow a microflow to enable hashing, treated inapplicable fields as exact matches on a value of 0\[char46] Initially, this behavior was implemented in the reference controller only\[char46]
Later, the reference switch was also changed to actually force any wildcarded inapplicable fields into exact matches on 0\[char46] The latter behavior sometimes caused problems, because the modified flow was the one reported back to the controller later when it queried the flow table, and the modifications sometimes meant that the controller could not properly recognize the flow that it had added\[char46] In retrospect, perhaps this problem should have alerted the designers to a design error, but the ability to use a single hash table was held to be more important than almost every other consideration at the time\[char46]
When more flexible match formats were introduced much later, they disallowed any mention of inapplicable fields as part of a match\[char46] This raised the question of how to translate between this new format and the OpenFlow 1\[char46]0 fixed format\[char46] It seemed somewhat inconsistent and backward to treat fields as exact-match in one format and forbid matching them in the other, so instead the treatment of inapplicable fields in the fixed-length format was changed from exact match on 0 to wildcarding\[char46] (A better classifier had by now eliminated software performance problems with wildcards\[char46])
The OpenFlow 1\[char46]0\[char46]1 errata (released only in 2012) added some additional explanation [OpenFlow 1\[char46]0\[char46]1, section 3\[char46]4], but it did not mandate specific behavior because of variation among implementations\[char46]
The OpenFlow 1\[char46]1 protocol match format was designed as a type/length/value (TLV) format to allow for future flexibility\[char46] The specification standardized only a single type OFPMT_STANDARD (0) with a fixed-size payload, described here\[char46] The additional fields and bitwise masks in OpenFlow 1\[char46]1 cause this match structure to be over twice as large as in OpenFlow 1\[char46]0, 88 bytes versus 40\[char46]
OpenFlow 1\[char46]1 added support for the following fields:
SCTP source and destination port\[char46]
MPLS label and traffic class (TC) fields\[char46]
One 64-bit register (named ``metadata\(cq\(cq)\[char46]
OpenFlow 1\[char46]1 increased the width of the ingress port number field (and all other port numbers in the protocol) from 16 bits to 32 bits\[char46]