Routing Group Master
One server in each routing group is defined as the Routing Master. By default, the
first server installed into a routing group takes on the role of the routing
master and remains as such unless altered by an administrator. Any server can
be given the role, whose most important and critical task is to maintain the
link state table (LST). The LST is automatically shared between all the servers
in a routing group and contains details of the current connectivity between
servers in the group as well as connections to other routing groups. As
bridgehead servers become aware of information about connections (their link
state), they relay the data to the Routing Master, which creates an updated LST
every time an update arrives from any source. After the LST is updated, the
Routing Master broadcasts change information to all the servers in the routing
group, and the servers adjust their own copy of the link state table. The
Routing Master also allows the entire routing group to be managed as a single
entity by providing a single point of contact to make administrative changes
for the group.
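The relay-and-broadcast flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Exchange implements it: the class names, the "up"/"down" strings, and the server names DUB-EX1 and DUB-EX2 are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """A server in the routing group holding its own copy of the LST."""
    name: str
    lst: dict = field(default_factory=dict)  # link name -> "up" / "down"

    def apply(self, changes: dict) -> None:
        # Adjust the local copy with the change information from the master.
        self.lst.update(changes)


@dataclass
class RoutingMaster:
    """Holds the master LST and pushes changes to every member (hypothetical model)."""
    members: list
    lst: dict = field(default_factory=dict)

    def report(self, link: str, state: str) -> bool:
        """Called when a bridgehead relays link state; True if the LST changed."""
        if self.lst.get(link) == state:
            return False  # nothing new, so nothing to broadcast
        self.lst[link] = state
        for member in self.members:
            member.apply({link: state})  # send only the change, not the whole table
        return True


servers = [Member("DUB-EX1"), Member("DUB-EX2")]
master = RoutingMaster(members=servers)
master.report("Dublin-Belfast RGC", "up")
master.report("Dublin-Belfast RGC", "down")
```

After the second report, every member's copy shows the link as down, and a repeated report of the same state causes no further broadcast.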
The Routing Master role is easily moved
between servers, as shown in Figure 13. Select the server that you want to take
on the role and right-click to bring up a context-sensitive menu. Then select
the “Set as Master” option.
Occasionally, the Routing Master is
unavailable due to maintenance or a system outage. When this happens the other
servers in the routing group continue to route based on the last known LST.
Routing in this situation may not follow an optimal path because messages may
well be sent to a server whose link is unavailable, or there may be a better
route available. However, even in its sub-optimal state, the last LST is likely
to be far more up to date with network information than the GWART used by the
MTA to route messages in Exchange 5.5. The GWART is normally generated once a
day, after a new server or connector is added to a site, or when an
administrator explicitly requests that it be generated. As such, the
GWART is relatively static. Problems don’t usually surface with the GWART when
the network is stable, but if connectors or servers go offline it can be
difficult to regenerate an accurate picture of the network and get messages to
flow optimally again.
Creating New Routing Groups
Creating a new Routing Group is easy, mostly because a new Routing Group is just a
container in the Exchange organization immediately after it is created. Unlike
an Exchange 5.5 site, which is instantiated when a server is installed into the
site, a Routing Group can be defined and present long before the first server
is moved into the group. This is a useful feature, because it allows the
Routing Group structure for an Exchange organization to be configured with just
a few servers in place. It also avoids the complications that arise in Exchange
5.5 when the original first server is removed from a site and all the site
objects (like the default set of system folders) need to be reconfigured.
To create a new Routing Group, select the
Routing Groups container and take the “New” option from the right-click menu,
as shown in Figure 14. This can be done even when your organization is
operating in mixed mode — the sole requirement is that a routing group
container already exists in the Administrative Group. If you haven’t yet
created a routing group container, click on the Administrative Group; take the
“New” option from the context-sensitive menu, and then the “Routing Group
Container” choice.
This makes it obvious that a Routing Group
is just another container within the configuration container for an
Exchange organization, so the properties aren’t very exciting. As with Windows
2000 sites, I favour naming Routing Groups after geographic terms so that their
purpose and coverage are immediately obvious. The alternative is to come up with
another naming scheme, but I don’t see the point unless you really want to sit
down and construct a set of complex names that only you and your fellow
administrators understand. However, consider the case of a new administrator
who comes on board. Will they understand the naming and purpose of Routing
Groups if you don’t use simple names? In all cases, the important thing is that
users won’t be bothered with whatever you do, because all of the organizational
detail about Exchange is hidden well away from them. Figure 15 shows an example
of the type of naming convention I recommend for Routing Groups: simple and
easy to understand.
After the new Routing Group is created,
you can add servers to it by selecting a server and dragging it to the
“Members” container of the new Routing Group. You can also install servers
directly into the new Routing Group.
Dragging and dropping servers between
Routing Groups just to see what happens seems like a fun way to idle away a
rainy afternoon, but it’s really only an exercise that should be carried out in
a software laboratory for test purposes. Never move a server unless you have
to, and only perform the operation after everyone involved in the
administration group has approved. In other words, it’s still good to have the
discipline to plan server moves rather than move servers around on a whim.
Exchange imposes some restrictions on
server moves. You can’t move a server to another Routing Group if the server is
a bridgehead for a connector. Connectors and Routing Groups are fundamental
inputs to the routing tables, and moving a server that hosts a number of
connectors would have a huge impact on the way messages are routed. While
Exchange 2000 is much more amenable to change and has a mechanism to discover
changes in the network very quickly (link state routing) as well as taking
updates through changes made to configuration data managed by the AD, it’s
still not a good idea to make wholesale changes. Figure 16 illustrates the type
of error you’ll see if you attempt to move a server that still has active
bridgeheads. In this instance the server is in the Dublin Routing Group and is
pointed to by two servers in the Belfast Routing Group. The server also acts as
a bridgehead for two connectors in its current Routing Group. If a server is
the only member of a Routing Group, you will have to delete all connectors in
the group (a connector must have at least one bridgehead server allocated to
it) before you can move the server to a new Routing Group.
LINK STATE ROUTING
Any large messaging infrastructure is usually in a state of flux. While network
links are normally available all the time, human, computer, or network error
can conspire to interrupt traffic on a circuit and block the flow of messages.
The more distributed and extensive the network, the more likely it is that some
part of the network is currently unavailable.
As mentioned earlier, Exchange 5.5 uses
the GWART to maintain a list of the routes messages can take to a final
destination, including gateways. The GWART does not attempt to keep track of
temporary network outages and merely consists of information about routes. If
the route is blocked for any reason, large queues can quickly build up in the
MTA or connectors.
Exchange 2000 sends link state information
between servers to provide an up-to-date picture of available routes messages
can take. Two methods are used to send link state information.
Within a Routing Group, the Routing Service on each bridgehead server binds to port
691 via IIS on the routing master to send and receive link state table
updates. Communication occurs using a protocol called LSA,
developed by Microsoft for this purpose. In turn, after receiving updates
from bridgehead servers, the routing master broadcasts changes to all the
servers in the routing group. The architecture allows another protocol to be
inserted instead, if one is agreed by the IETF.
Between Routing Groups, link state table updates are sent between bridgehead servers
whenever an update is available. The bridgehead server then passes the data on
to the Routing Master. If the RGC or SMTP connector is used, SMTP messages are
sent between the servers using port 25 instead of the LSA protocol across port
691. The connection starts with an “EHLO” to tell the server that ESMTP is
going to be used, and then an “X-LINK2STATE” command to advertise the fact that
the server is capable of exchanging link state information. If the receiving
server acknowledges the command, the two servers then trade link state
information. The link state data is passed in a highly compressed format and
only requires a single DWORD to pass the up/down information, so little overhead
is required to accommodate the basic data. Configuration data updates take up a
little more space, and the GUID and digest for the organization are also passed,
but the overhead remains small. X.400 connections use a field to store and
transmit link state information. Before any information is exchanged, servers
check that they are connected to another server in the same Exchange
organization by verifying that both share the same organizational GUID and
digest. The digest contains a hash of the organization name and version number
and is used to provide a string value that can be quickly checked against a
value generated by another server. If the check is passed, they proceed with
the update.
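The check that two servers belong to the same organization can be sketched as follows. The text only says that the digest is a hash of the organization name and version number, so the choice of SHA-256, the way the two values are joined, and the GUID and organization name below are assumptions made purely for illustration.

```python
import hashlib
import uuid


def org_digest(org_name: str, version: int) -> str:
    """Hash of organization name and version number.

    SHA-256 and the "name|version" layout are assumptions; the real
    digest format is not described in the text.
    """
    return hashlib.sha256(f"{org_name}|{version}".encode("utf-8")).hexdigest()


def same_organization(local: dict, remote: dict) -> bool:
    """Both the organizational GUID and the digest must match."""
    return local["guid"] == remote["guid"] and local["digest"] == remote["digest"]


# Made-up GUID and organization name for the example.
ORG_GUID = uuid.UUID("12345678-1234-5678-1234-567812345678")

local = {"guid": ORG_GUID, "digest": org_digest("Example Org", 7)}
peer = {"guid": ORG_GUID, "digest": org_digest("Example Org", 7)}
stranger = {"guid": uuid.uuid4(), "digest": org_digest("Other Org", 1)}

print(same_organization(local, peer))      # True: proceed with the update
print(same_organization(local, stranger))  # False: no link state is traded
```

Comparing a short string value rather than the full configuration is what makes the check quick enough to run before every exchange.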
The SMTP virtual server log
(\WINNT\SYSTEM32\LOGFILES\SMTPSVC1 is the location for the default virtual
server) is a good place to look to gain an insight into the way that link state
routing information is passed around. The log file extract in Figure 17 shows
two typical transactions between servers. The first transaction is with the
server with IP address 19.209.12.154 and consists of a pretty standard
HELO/MAIL/RCPT/DATA/QUIT command sequence. These commands establish a link,
say whom a message is from, state whom it is going to, send some
message data, and then terminate the connection. We know that the server we’re
corresponding with is not running Exchange 2000 as the initial connect is made
with the HELO command rather than EHLO. The next transaction is with the server
with IP address 19.40.65.204. Note that the conversation begins with EHLO. This
doesn’t automatically mean that the remote server is running Exchange 2000, as
there are many other SMTP servers that support extended SMTP, but it’s a good
start. The log doesn’t tell us how the remote server responded to the EHLO
command, but the fact that Exchange then issues an X-LINK2STATE command is a
very good indicator that we’ve connected to another Exchange 2000 server. As it
happens, I know that the remote server is a bridgehead for another routing group.
Bridgeheads always take the opportunity to update each other every time they
talk.
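The reasoning applied to the log extract can be turned into a small classifier. The sketch below does not parse the real log file format; it assumes the entries have already been reduced to (remote IP, SMTP verb) pairs, and the function name and category labels are invented for the example.

```python
from collections import defaultdict


def classify_peers(entries):
    """Classify remote servers from (remote_ip, smtp_verb) pairs in log order.

    The input shape is a simplification; a real SMTP virtual server log
    carries many more fields per line.
    """
    verbs = defaultdict(set)
    for ip, verb in entries:
        verbs[ip].add(verb.upper())

    result = {}
    for ip, seen in verbs.items():
        if "X-LINK2STATE" in seen:
            result[ip] = "exchange2000-likely"  # link state was offered
        elif "EHLO" in seen:
            result[ip] = "esmtp"                # extended SMTP, but no link state
        else:
            result[ip] = "smtp"                 # HELO only
    return result


log = [
    ("19.209.12.154", "HELO"), ("19.209.12.154", "MAIL"),
    ("19.209.12.154", "RCPT"), ("19.209.12.154", "DATA"),
    ("19.209.12.154", "QUIT"),
    ("19.40.65.204", "EHLO"), ("19.40.65.204", "X-LINK2STATE"),
    ("19.40.65.204", "MAIL"), ("19.40.65.204", "RCPT"),
    ("19.40.65.204", "DATA"), ("19.40.65.204", "QUIT"),
]
peers = classify_peers(log)
```

With the two transactions described above, the first address is classified as plain SMTP and the second as a likely Exchange 2000 server.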
Sending link state information doesn’t
take very much time (less than a second in this case), but it is always done
first to allow the remote server to update its LST if the need arises. After
the link state information is passed, the two servers settle down to the normal
sequence of commands necessary to send a message and the transaction is then
terminated.
Like the GWART, the LST is held in memory.
Unfortunately, unlike the GWART, there is no on-disk representation. While its
format is a little esoteric, the GWART can be analysed on an Exchange 5.5
server by editing or viewing the \MTADATA\GWART0.MTA file. Sometimes, especially
when you’re trying to work out just how messages are being routed, the GWART
can deliver an insight that helps to solve a problem. An extract from the GWART
on an Exchange 5.5 server in the Compaq organization can be seen in Figure 18.
The extract shows the X.400 routing information that is used to direct messages
from the site that generated the GWART to other sites in the organization. As
it happens, Compaq uses a hub and spoke organization, so all messages are
directed to a site known as the global hub first before they are routed to a
connector from the global hub to the destination site. Multiple hops are
sometimes required. For instance, you can see that messages sent to the CKK
site (o=<CKK>) must first pass to the global hub (X400HUB) and then along
a connector called “X.400 Connector 2 — CORP” before arriving at “X.400
Connector 1 to CKK”. The CKK site is in Greater China, so the network
connections are reasonably complex and result in the multi-hop route.
The LST holds information about connection
availability and cost for an entire organization. However, the LST is organized
quite differently from the GWART. As discussed earlier, the GWART is usually
generated once a day and remains static thereafter, unless a new connector or
site is added to the organization. The LST can also remain static, and will do so if
the Exchange organization does not change and no network outages force
adjustments, but it is often in a state of flux as the Routing Group
Master updates it with information coming in from other servers in the same
Routing Group plus information from other Routing Groups.
Think of the organization LST as being
built from many different tables, one for each Routing Group, as shown in
Figure 19. The concept of availability extends beyond the boundaries of
Exchange, as external connections are also included. This prevents Exchange
attempting to continually send messages across a connection such as an external
Internet gateway when a network link is down. When a particular link fails, a
retry is attempted. If the retry fails, an event is fired to tell the server
that it must issue a link state update to the Routing Group Master. This is a good example
of how events are fully integrated into Exchange 2000 rather than being an
interesting programming interface on the side as was the case in Exchange 5.5.
A version number is maintained for the
link state information for each Routing Group. The version number can only be
incremented by the Routing Group Master, and this happens whenever a change is
made to the information as a result of a link state update. The version number
is used when two Routing Groups compare information about the state of the
network. The data held by each Routing Group can be compared to a view of the
network, and if the versions don’t match during link state operations, then the
servers know that they have to update each other.
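The version comparison can be pictured as below. The text says only that mismatched versions trigger an update; treating the higher number as the newer view, and a missing entry as version 0, are assumptions that follow from the version number only ever being incremented. The function name and sample numbers are made up.

```python
def groups_needing_update(local_versions: dict, remote_versions: dict) -> dict:
    """Return, per routing group, which side holds the stale view.

    Assumes a higher version number is newer and that an unknown
    routing group counts as version 0.
    """
    stale = {}
    for group in local_versions.keys() | remote_versions.keys():
        mine = local_versions.get(group, 0)
        theirs = remote_versions.get(group, 0)
        if mine < theirs:
            stale[group] = "local"    # we must take the remote data
        elif mine > theirs:
            stale[group] = "remote"   # the remote side must take ours
    return stale


# Views held by a Dublin bridgehead and a Belfast bridgehead.
dublin_view = {"Dublin": 12, "Belfast": 4}
belfast_view = {"Dublin": 11, "Belfast": 5}

print(groups_needing_update(dublin_view, belfast_view))
```

In this example each side is ahead for its own routing group, so the two bridgeheads update each other, which matches the behaviour described above.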
The reason for maintaining dynamic link
state information is to allow Exchange 2000 to find the optimum path for
messages. The LST contains the data used to make the decision, which is based
on a modified form of Dijkstra’s algorithm, a commonly used method to determine
the shortest path between two points in a network. OSPF (Open Shortest Path First),
a link state routing protocol built on the same algorithm, is employed in many network
routers to get packets sent between systems in the most efficient manner.
Inside a network, the optimum route is
determined by factors such as delay, throughput, and connectivity. Messaging is
a little different because other factors come into play, like the eventual
destination of a message (does it have to be routed out across a specific
connector), its size, the sender, and message priority. During the decision
process, the Exchange organization is modeled as a network with each Routing
Group represented as a network node and each connector as a link between nodes.
The basic decision that has to be made is: Given a message and its properties
(current location, sender, recipient, priority, and size) and the network
infrastructure (link state and cost), what is the best next hop for the
message? As already discussed, Exchange 2000 allows connectors to be limited to
handling messages of particular sizes, or to accepting messages only from specific
e-mail addresses.
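To make the model concrete, here is a minimal sketch of the next-hop decision using plain Dijkstra over routing groups and connectors. It is not the modified algorithm that Exchange uses: it considers only link state, cost, and a size limit, and it ignores sender restrictions and priority. The Connector class, the next_hop function, and the London routing group are invented for the example.

```python
import heapq
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Connector:
    """A one-way link between two routing groups (hypothetical model)."""
    source: str
    target: str
    cost: int
    up: bool = True
    max_size_kb: Optional[int] = None  # None means no size restriction


def next_hop(connectors, start, destination, message_size_kb):
    """Return (first routing group to send to, total cost), or None if no route."""
    usable = {}
    for c in connectors:
        if not c.up:
            continue  # link state says this connector is down
        if c.max_size_kb is not None and message_size_kb > c.max_size_kb:
            continue  # the connector refuses messages of this size
        usable.setdefault(c.source, []).append(c)

    # Heap entries: (cost so far, routing group, first hop taken from start).
    heap = [(0, start, None)]
    settled = set()
    while heap:
        cost, node, first = heapq.heappop(heap)
        if node in settled:
            continue
        settled.add(node)
        if node == destination:
            return first, cost
        for c in usable.get(node, []):
            if c.target not in settled:
                heapq.heappush(heap, (cost + c.cost, c.target, first or c.target))
    return None


links = [
    Connector("Dublin", "Belfast", 10),
    Connector("Dublin", "London", 5),
    Connector("London", "Belfast", 2, max_size_kb=500),
]

print(next_hop(links, "Dublin", "Belfast", 100))   # ('London', 7)
print(next_hop(links, "Dublin", "Belfast", 2000))  # ('Belfast', 10)
```

The small message takes the cheaper two-hop route through London, while the large one is forced onto the direct connector because the London to Belfast connector rejects it on size. Marking a connector as down has the same effect as a size rejection: the link simply drops out of the graph before the shortest path is calculated.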
Avoiding message “ping-pong” and the type
of rerouting that occurs in Exchange 5.5 are major reasons for implementing link
state routing. The GWART is static, so messages can be routed to an inoperative
connector. When this happens, the MTA checks the GWART to discover whether
another route exists and attempts to send the messages across the alternate
route. If this connector is also unavailable, the MTA will attempt other routes
until all possible routes are exhausted, in which case the messages remain
queued until a route becomes available. It sounds OK to reroute messages in
this fashion, but the messages ping-pong around sites and connectors until all
available routes are exhausted, and in a large organization containing many
sites and connectors, it can take some time before the MTA decides that all
routes have been tried. Another complication is introduced by the fact that
connectors built with the Exchange Development Kit (EDK) maintain their own
queues, and once the MTA has passed responsibility for a message to a connector
by placing it onto the connector’s queue, no further rerouting can take place.
The IMS is the best example of how this can cause a problem. If an Internet
connection is down, then all of the messages queued to the IMS that serves the
connection will remain on that queue until the connection comes back up.
Because manual intervention is required to force a GWART update (by increasing
the cost to use the IMS that is down), and time is required to replicate the
GWART to all sites, messages continue to accumulate on the queue until every
site is updated.