Dynamic updates address the problem. The
key advance is to propagate updates quickly so that all Routing Groups learn about
downed connectors, and messages don’t have to travel to a connector only
to discover that they need to be rerouted. Each Routing Group maintains its own
LST and has a copy of the LST from every other Routing Group. Updates can occur
literally each time a bridgehead server communicates with another bridgehead
server, and the LST is updated as quickly as the bridgehead server can make a
connection across port 691 to the Routing Master. The updated LST is
immediately used as the basis for routing decisions, so message queues don’t
accumulate. Best of all, although the Routing Group Connector and
SMTP Connector are both SMTP-based, both allow rerouting. Even if an
SMTP connection is unavailable, messages can be rerouted as soon as a
connector is deemed to be unavailable, if an alternate path exists. This makes
the whole routing system very efficient because less processing is expended to
route messages to their final destination.
The MTA doesn’t have the same central role
in message routing in Exchange 2000 as it had in previous versions. However,
the MTA continues to perform this role for Exchange 5.5 servers, even in mixed
mode sites, and it is able to take advantage of the faster notification of
downed connectors to make better routing decisions. Think of the new role of
the MTA as the protocol interface for X.400 and the gateway (via the Store) to
EDK connectors such as those for IBM PROFS and SNADS, and the wide variety of
FAX connectors that are available.
Microsoft originally intended that the Exchange
2000 routing master should automatically take over the RID Master role (also
known as the Routing Calculation Master) within sites in mixed mode
organizations. The RID Master is the server that builds a GWART within a site
and publishes it to the other servers. Acting in this role, the Routing Master
would be able to combine its knowledge of the routing table and GWART data
provided by other Exchange 5.5 sites (obtained by replication through the Site
Replication Service and the ADC) and generate a GWART that is then fed back to
the Exchange 5.5 servers in the site. The advantage of this mechanism is that
the Exchange 5.5 servers are able to take some advantage of the dynamic nature
of the LST. Static routing will be performed to other Exchange 5.5 sites
because the GWART data from those sites remains essentially static, but the
updates flowing into the Exchange 2000 link state routing table will be fed
into the GWART used in mixed mode sites.
One little flaw affected the plan.
Exchange 5.5 supports the concept of sub-sites through locations that are
assigned as properties of servers. The availability of connectors can be
limited by setting their scope, and one scope option limits a connector to the sub-site
or location in which the server is installed. The Exchange 2000 Routing Master
supports connector scopes, but administrative and routing groups have replaced
sites, so the concept of sub-sites has gone away. If the Routing Master is used
to generate the GWART, it will ignore sub-site scopes, so if you use this
feature in Exchange 5.5 you don’t want the Routing Master to generate the
GWART. Fortunately, you can assign any server in a mixed mode site to act as
the Exchange 5.5 routing calculation master. This is done through the Site
Addressing properties of the site object, as shown in Figure 20.
There’s obviously a big difference between
the way the MTA handles address spaces in the GWART and the new link state
routing mechanism. If you’ve deployed connectors that limit their scope to a
single site, you may find that you need to review the connectors before you
start to deploy Exchange 2000 servers. Here’s an example from the Compaq
deployment.
A small site located close to Microsoft’s
HQ in Redmond deployed an Exchange 5.5 IMS to handle SMTP traffic to
microsoft.com instead of channelling the messages across one of the general IMS
connectors deployed around the corporation. The scope for the IMS was set to
“site only”, meaning that the connector was invisible to other sites in the
organization. The Exchange 2000 routing engine is far more dynamic than
Exchange 5.5 and ignores the old scope restrictions (remember, there are no
sites anymore). So when Exchange 2000 servers joined the organization and the
time came to send a message to microsoft.com, their routing engines looked for
all available SMTP connectors. The route via the IMS dedicated to microsoft.com was
selected as the best way to send the message because its address space is more
specific than the general “SMTP:*” space assigned to the other connectors. This
all seems OK, but any server outside the site hosting the IMS cannot find a path
to the connector, so the message is not sent.
The workaround is to add the relevant
address space (in this case microsoft.com) to another connector in the
organization. Now the routing engine will have a choice of connectors and will
be able to transfer messages to the “right” connector if a routing failure
occurs as a result of attempting the scoped connector. Microsoft is considering
whether this behaviour is a bug or feature and may address it in a future
release or service pack. It’s still wise to review connectors and scopes as you
prepare to migrate, just in case.
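The selection behaviour in the Compaq example, and the effect of the workaround, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Exchange's routing code: the Connector class, the connector and site names, and the rule that "more specific" means a longer matching address space are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Connector:
    name: str
    address_spaces: Tuple[str, ...]   # "*" stands for the general SMTP:* space
    site: str                         # site that hosts the connector
    site_scoped: bool = False         # Exchange 5.5 "site only" scope


def best_match(connector: Connector, domain: str) -> int:
    """Score the most specific address space that covers domain (-1 if none)."""
    scores = [
        0 if space == "*" else len(space)
        for space in connector.address_spaces
        if space == "*" or domain == space or domain.endswith("." + space)
    ]
    return max(scores, default=-1)


def select_connector(connectors: Sequence[Connector], domain: str,
                     sender_site: str) -> Optional[Connector]:
    """Choose among the most specific matches; None means the message is stuck."""
    top = max((best_match(c, domain) for c in connectors), default=-1)
    if top < 0:
        return None
    for c in connectors:
        if best_match(c, domain) != top:
            continue  # a general space never competes with a more specific one
        if c.site_scoped and c.site != sender_site:
            continue  # selected on address space, but no path from this site
        return c
    return None


general = Connector("General IMS", ("*",), "Hub")
scoped = Connector("Redmond IMS", ("microsoft.com",), "Redmond", site_scoped=True)

# Before the workaround: a server outside the Redmond site has nowhere to go.
print(select_connector([general, scoped], "microsoft.com", "Dublin"))

# After the workaround: the general connector also carries microsoft.com.
general_fixed = Connector("General IMS", ("*", "microsoft.com"), "Hub")
print(select_connector([general_fixed, scoped], "microsoft.com", "Dublin").name)
```

In this model the scoped connector wins on address space alone, which is why a sender in another site ends up with no usable route until a second connector advertises the same space.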
Routing, Retries, and Updates
To explain what happens when a failure occurs on the network and how Exchange 2000
updates the LST, let’s use the LST data outlined in Figure 19 to follow the
path of a message generated on a server in the Dublin Routing Group sent to a
mailbox on a server in the Boston Routing Group. A schematic showing how the
routing groups are connected is shown in Figure 21. Network connectivity is
reasonably similar to the type of links used in large corporate deployments.
The network is organized into a series of hubs (New York, London, and
Copenhagen), with the major links running between New York and London and
between London and Copenhagen. A backup transatlantic link is available between
Copenhagen and New York, but its cost is set high enough to keep traffic off the
link unless no other route is available.
The message starts by being routed to the
bridgehead server in the Dublin Routing Group. A direct SMTP link is
automatically generated to route the message from the originating server to the
bridgehead, which then attempts to create a connection to the bridgehead server
in the London Routing Group. After the message is successfully received, its
address is analyzed and a determination is made for the next hop. London then
attempts to open a connection to a bridgehead server in New York, but the
attempt fails because of a network outage. If there are multiple bridgehead
servers defined for New York, the London bridgehead will attempt to open a
connection to each. All attempts fail.
The London bridgehead now goes into a
“glitch-retry” state. This means that the server has recognized that a problem
exists, but will try to establish a connection in 60 seconds in case the fault
is temporary. After 60 seconds, an event fires to tell the server to try again.
An attempt is made to contact each bridgehead server in New York but fails due
to a continuing network problem. The London bridgehead goes through the
“glitch-retry” sequence three times before applying the retry schedule set on
the SMTP Virtual Server. The messages that caused the retry are rerouted
as soon as a problem is detected and don’t have to wait for the “glitch-retry”
sequence to finish. The connection is then marked as “Down” and the Routing
Group Master for London is informed by the bridgehead server, which connects to
port 691 before sending the server down status via LSA.
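The “glitch-retry” sequence can be pictured as a small loop. The sketch below is a simulation only: the function and host names are hypothetical, the connection attempt and the 60-second wait are passed in as callables so nothing touches the network, and the hand-off to the SMTP Virtual Server retry schedule is left out.

```python
from typing import Callable, List, Optional, Sequence, Tuple

GLITCH_RETRY_SECONDS = 60   # delay quoted in the text
GLITCH_RETRY_ROUNDS = 3     # retry rounds before the link is marked "Down"


def probe_link(bridgeheads: Sequence[str],
               try_connect: Callable[[str], bool],
               wait: Callable[[int], None]) -> Tuple[str, Optional[str]]:
    """Return ("Up", host) on the first successful connection, else ("Down", None)."""
    for round_number in range(1 + GLITCH_RETRY_ROUNDS):
        if round_number > 0:
            wait(GLITCH_RETRY_SECONDS)   # glitch-retry: pause, then try again
        for host in bridgeheads:         # every target bridgehead is tried each round
            if try_connect(host):
                return ("Up", host)
    return ("Down", None)


# Simulated outage: no bridgehead in New York ever answers.
waits: List[int] = []
state = probe_link(["NY-BH1", "NY-BH2"], lambda host: False, waits.append)
print(state, waits)
```

Passing the wait function in keeps the example instant to run; a real service would sleep or schedule a timer event instead.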
After receiving the update, the Routing
Group Master updates its LST and sends updates to all of the other servers in
the London Routing Group. The bridgehead server consults the updated LST
(Figure 19) and decides that an alternative, higher-cost route is available via
Copenhagen. Link state updates are also sent to the other Routing Groups via
the ESMTP X-LINK2STATE command to inform them that the London to New York link
is currently unavailable. The update occurs before an attempt is made to send
any other messages, to prevent servers in the Dublin, Frankfurt, Paris,
Copenhagen, and Stockholm Routing Groups from attempting to send messages to
London for onward processing. The Routing Groups that receive the link state update
compare the version number on the update against the data held in their own
tables. If the version number is higher, the update is applied and a new LST is
created for the Routing Group.
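The version check is simple enough to show directly. The table layout below (one version number and one set of link states per routing group) is an assumption made for illustration; the text does not describe how Exchange stores the data internally.

```python
from typing import Dict, Tuple

# Hypothetical layout: routing group -> (version number, {link name: "Up" or "Down"})
LinkStateTable = Dict[str, Tuple[int, Dict[str, str]]]


def apply_update(table: LinkStateTable, group: str, version: int,
                 links: Dict[str, str]) -> bool:
    """Apply the update only if it is newer than what the table already holds."""
    current_version = table[group][0] if group in table else -1
    if version <= current_version:
        return False              # stale or duplicate update: keep existing data
    table[group] = (version, dict(links))
    return True


lst: LinkStateTable = {"London": (7, {"London-New York": "Up"})}
print(apply_update(lst, "London", 8, {"London-New York": "Down"}))  # newer version
print(apply_update(lst, "London", 8, {"London-New York": "Up"}))    # same version
print(lst["London"])
```

Rejecting updates whose version is not higher means a delayed or repeated update cannot overwrite fresher information.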
Connectors are one-way, so the problem is
first detected in London. At the other side of the Atlantic, a message sent to
London will prompt the bridgehead server in New York to go through the same
discovery process and update its own Routing Group Master with a down status.
The updated link state information will then be published to servers in the New
York, Boston, and San Francisco Routing Groups, which proceed to update their
copies of the LST.
The link between New York and Copenhagen
becomes the preferred transatlantic connection until a bridgehead server in
either London or New York determines that the link between the two Routing
Groups is now available. The retry schedule on the SMTP virtual server
determines when attempts are made to investigate the current status of the
connector, and as soon as a connection is successful, the link is marked as
“Up,” in which case a series of LST updates begins again to inform all routing
groups that the connection is back and available for routing.
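To see how a single “Down” entry changes the chosen route, here is a minimal lowest-cost search over a cut-down version of the example network. The costs are invented (the real values are in Figure 19, which is not reproduced here), only the routing groups needed for the Dublin to Boston path are included, and a standard Dijkstra search stands in for whatever algorithm Exchange actually uses.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

Link = Tuple[str, str]   # one-way connector: (from routing group, to routing group)

# Assumed costs; each pair of routing groups gets one connector per direction.
LINKS: Dict[Link, int] = {}
for a, b, cost in [("Dublin", "London", 10), ("London", "New York", 10),
                   ("London", "Copenhagen", 10), ("Copenhagen", "New York", 50),
                   ("New York", "Boston", 10)]:
    LINKS[(a, b)] = cost
    LINKS[(b, a)] = cost


def lowest_cost_route(links: Dict[Link, int], down: Set[Link],
                      source: str, target: str) -> Optional[Tuple[int, List[str]]]:
    """Dijkstra search that skips connectors currently marked "Down"."""
    queue: List[Tuple[int, str, List[str]]] = [(0, source, [source])]
    visited: Set[str] = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for (start, end), link_cost in links.items():
            if start == node and (start, end) not in down and end not in visited:
                heapq.heappush(queue, (cost + link_cost, end, path + [end]))
    return None   # no route is available at all


print(lowest_cost_route(LINKS, set(), "Dublin", "Boston"))
print(lowest_cost_route(LINKS, {("London", "New York")}, "Dublin", "Boston"))
print(lowest_cost_route(LINKS, {("London", "New York")}, "Boston", "Dublin"))
```

Because links are stored per direction, marking only the London to New York connector as down leaves the reverse route untouched, which mirrors the point that each side discovers the outage separately.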
Looking at Routing Information
Short of trawling through memory and making some excellent guesses about what you
find there, there’s no out-of-the-box way of getting a detailed view of the LST
on a server. The WinRoute utility and the Status section of the Exchange System Manager
snap-in provide the next best thing. As Figure 22 shows, the Status option
lists all the servers in the Routing Group that your server is connected to
plus their status. In this case, our server (QEMEA-ES1) is monitoring servers
in both the “Ireland” and “French Servers” Administrative Groups because the
routing group spans a number of Administrative Groups. We can also see that
QEMEA-DC1 is unavailable for some reason, possibly because the set of Exchange
services is not running. In any case, messages cannot be routed to QEMEA-DC1
now.
The top portion of the display lists all
the connectors available to the servers. Lack of attention to naming
conventions has made the output in Figure 22 a real mess and there is no
immediate indication of what each connector is intended to do. However, a
little work will swiftly address the problem and create a much more informative
view.
Details of connectors are stored as AD
objects, and they can be renamed at any time. A short delay occurs before the
rename is effective and displayed in the status window. This is because
Exchange System Manager keeps a cache of configuration data to stop it having
to go back to the AD each time it repaints a window. Within ten minutes of
renaming your connectors, the new names should appear in the list, which is
shown in Figure 23. Compare the information conveyed by simply following a
naming convention. All of the connectors are clearly identified with their
location and purpose. The naming convention doesn’t have to be too strict and
it is possible to rename the connectors after they are created by clicking on
the name within the routing group and typing a new name in. Nevertheless, if a
naming convention isn’t followed from the start, administrators will forget and
won’t go back and clean up the names afterwards, which may result in confusion
later on. According to Murphy’s 233rd law of computing, that confusion will
inevitably arise during a crisis, just as you’re trying to debug an onerous
routing problem. By the way, don’t assume that I always follow my own good
advice. Sometimes you just need to get something done quickly and then things
maybe aren’t as finished as they should be.
If you’re unsure about the route that
Exchange is currently using to send messages, select a message that has been
recently sent between two Routing Groups and examine the message header. All of
the servers that handled the message en route will be detailed in the header.
Figure 24 illustrates the point. In this instance, we’re using Outlook Express
to examine the properties of a message that had some delivery problems. The
“details” tab of the properties reveals the route information, and it is often
easier to follow this data by clicking the “Message Source” button to view the
complete message in a resizable window.
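If you save the message source to a file, the same route information can be pulled out with a short script. This sketch uses Python's standard email package to read the Received headers; the sample message and its host names are made up for the example.

```python
from email import message_from_string
from typing import List


def relay_hops(raw_message: str) -> List[str]:
    """Return the Received headers in the order the message travelled."""
    message = message_from_string(raw_message)
    received = message.get_all("Received", [])
    # Each server adds its header at the top, so the oldest hop comes last.
    # Splitting and rejoining collapses folded header lines into one line.
    return [" ".join(header.split()) for header in reversed(received)]


SAMPLE = (
    "Received: from london-bh.example.com by boston-mbx.example.com;\n"
    "\tMon, 1 Jan 2001 10:00:09 +0000\n"
    "Received: from dublin-bh.example.com by london-bh.example.com;\n"
    "\tMon, 1 Jan 2001 10:00:05 +0000\n"
    "Subject: Routing test\n"
    "\n"
    "Body text.\n"
)

for hop in relay_hops(SAMPLE):
    print(hop)
```

Reading the list from first to last gives the path from the originating server to the final destination, which makes an unexpected detour easy to spot.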
CONNECTING ROUTING GROUPS
Routing groups can be connected by Routing Group Connectors (RGCs), as well as
X.400 or standard SMTP connections. The RGC is very similar to the RPC-driven
“Site” connector in Exchange 5.5 in that it is the fastest and easiest connector to
set up. As with the Site connector, convenience comes at the expense of
a certain amount of control over how messages are transmitted. For example,
the RGC cannot be configured to prevent messages of a certain
size being delivered. Both of the other connectors can be configured to a much
tighter degree. This is probably not important if you simply want to route
messages between two routing groups located in a well-established and stable
network, but it might be if you wanted to route mail between two routing groups
over the public Internet.
The RGC uses SMTP to pass messages between
servers, and so can an SMTP connector. Is there a conflict here? In many ways
it is a similar situation to the decision to deploy Site or X.400 connectors in
Exchange 5.5. Like the RGC, the Site connector is easiest to set up and manage
whereas the X.400 connector has more property pages and places to tweak
settings that might be needed to get messages through. The SMTP connector is
used whenever you need to attain the same level of control and granularity
previously delivered in the X.400 connector. You should use the SMTP connector
instead of the RGC when:
You need to connect to Exchange 5.5 sites that use the IMS as their connection
mechanism.
You need to authenticate a remote bridgehead server before sending messages.
You need to schedule message exchange with another server, perhaps to pick up
messages across a link that is only available at particular times. Connecting
to an ISP for message pickup is perhaps the best example of such a scenario.
You want to use DNS MX records as the basis for routing messages instead of the
Exchange configuration data held in the AD. In an Exchange 2000-pure
environment, DNS is never consulted to route messages because all of the
information about servers and routing groups is held in the AD and the Link
State Table contains all the data required to route messages along the lowest
cost path that’s available. DNS is used to translate server names to IP
addresses, but the MX records are ignored.
Some people will use SMTP or X.400
connectors to tie Routing Groups together, but the vast majority of connections
will be made with the RGC. Let’s have a look at the connector in some detail.
Creating a Routing Group Connector
An RGC links one or more bridgehead servers in one routing group to one or more
bridgehead servers in another. This differs from the Site connector in
Exchange 5.5, which allows you to select either one server to serve as a
bridgehead for a site or use any server in a site. The “any server in a site”
option is often not desirable when you deal with very large sites. Multiple
bridgehead servers should be configured whenever you want to achieve better
resilience across a link or the volume of messages requires the load to be
balanced across multiple servers. Exchange 2000 automatically balances message
load if multiple bridgehead servers are configured in a routing group.
The RGC is unidirectional, which means
that two connectors must be configured before messages flow in both directions
between two routing groups. As with the Site connector in Exchange 5.5, when you
set up an RGC on one server, you can have the RGC configured in the target
routing group at the same time, providing you hold the appropriate
administrative permissions for that routing group. The RGC is protocol
independent. SMTP is used to connect Exchange 2000 routing groups together, but
in a mixed mode environment where an RGC is used to connect an Exchange 2000
routing group to an Exchange 5.5 site, RPCs flow across the connector. This is
logical because SMTP is an optional protocol on an Exchange 5.5 server and RPC
is the only protocol that is absolutely guaranteed to be available for
inter-server communications.