This section describes how dns2xml transforms existing BIND data into XML. To demonstrate what happens at each stage of the process, some (pseudo) DNS resource records are listed below. Assume they represent data from both forward and reverse zone files. For clarity's sake, the DNS names have been reduced from the a.mydomain.com. form to simply a, and the IP addresses from 10.0.0.1 to 1. In other words, the system names are alphabetic and the IPs are numeric.
a IN A 1
a IN A 2
x IN CNAME a
b IN A 3
b IN A 4
c IN A 1
d IN A 2
1 IN PTR a
2 IN PTR a
3 IN PTR b
4 IN PTR b
a        b        c        d
|        |        |        |
+-IPs    +-IPs    +-IPs    +-IPs
| |        |        |        |
| +-1      +-3      +-1      +-2
| |        |
| +-2      +-4
|
+-CNAMEs
  |
  +-x

1        2        3        4
|        |        |        |
+-PTRs   +-PTRs   +-PTRs   +-PTRs
  |        |        |        |
  +-a      +-a      +-b      +-b
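As a rough illustration of the structure above, the records can be indexed into two lookups, one keyed by DNS name and one keyed by IP address. This is only a sketch in Python, not dns2xml's own code: the names RECORDS and index_records are hypothetical, and the CNAME is assumed to be written alias-first (x IN CNAME a), matching how the alias is annotated in the generated XML.

```python
from collections import defaultdict

# Abbreviated example records as (owner, type, value) triples.
# Assumption: the CNAME is alias-first, i.e. "x IN CNAME a".
RECORDS = [
    ("a", "A", "1"), ("a", "A", "2"), ("x", "CNAME", "a"),
    ("b", "A", "3"), ("b", "A", "4"),
    ("c", "A", "1"), ("d", "A", "2"),
    ("1", "PTR", "a"), ("2", "PTR", "a"),
    ("3", "PTR", "b"), ("4", "PTR", "b"),
]


def index_records(records):
    """Group records into name -> IPs/CNAMEs and ip -> PTRs lookups."""
    names = defaultdict(lambda: {"IPs": [], "CNAMEs": []})
    ips = defaultdict(lambda: {"PTRs": []})
    for owner, rtype, value in records:
        if rtype == "A":
            names[owner]["IPs"].append(value)
        elif rtype == "CNAME":
            # File the alias under the name it points to.
            names[value]["CNAMEs"].append(owner)
        elif rtype == "PTR":
            ips[owner]["PTRs"].append(value)
    return dict(names), dict(ips)


names, ips = index_records(RECORDS)
print(names["a"])
print(ips["1"])
```

Filing the alias under its target keeps x out of the set of top-level names, which mirrors the tree: only a, b, c, and d appear as roots.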
The goal is to group this data into the <SYSTEM> elements in the XML. We can do this by first thinking of the DNS names and IP addresses as nodes on a graph, with the A and PTR resource records representing the lines connecting them. Once in graph form, it becomes easy to determine the relationships among them and thereby separate them into systems. The first step is visualizing the resource record data in graph form, so here are some diagrams to help illustrate the process.
Here is the forward zone data. The arrows represent the links from DNS name to IP address indicated by the A resource records:
Here is the reverse zone data. The arrows in this case represent the links from IP address to DNS name indicated by the PTR resource records:
And here is the completed graph, showing how all the DNS names and IP addresses are related through the resource record data:
Hopefully, it is easy to see how the A and PTR resource record data translates to graph form. The next step is to find the connected components of the graph. These connected components are essentially individual subgraphs which, in this case, represent the groupings of related data that correspond to <SYSTEM> elements.
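The translation from records to graph edges can be sketched in a few lines of Python. This is an illustration rather than dns2xml's actual code: the function name build_edges and the ("name", ...) and ("ip", ...) node tags are assumptions introduced here so that a DNS name and an IP address can never be mistaken for each other.

```python
# (DNS name, IP) pairs from the A records of the example.
A_RECORDS = [("a", "1"), ("a", "2"), ("b", "3"), ("b", "4"),
             ("c", "1"), ("d", "2")]
# (IP, DNS name) pairs from the PTR records of the example.
PTR_RECORDS = [("1", "a"), ("2", "a"), ("3", "b"), ("4", "b")]


def build_edges(a_records, ptr_records):
    """Return directed edges as (source node, target node, record type)."""
    edges = []
    for name, ip in a_records:
        # An A record links a DNS name to an IP address.
        edges.append((("name", name), ("ip", ip), "A"))
    for ip, name in ptr_records:
        # A PTR record links an IP address back to a DNS name.
        edges.append((("ip", ip), ("name", name), "PTR"))
    return edges


edges = build_edges(A_RECORDS, PTR_RECORDS)
for source, target, rtype in edges:
    print(source[1], "->", target[1], "(" + rtype + ")")
```

Each record becomes exactly one directed edge, so the six A records and four PTR records of the example yield ten edges.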
The algorithm (based on Depth First Search) that finds the connected components ignores the direction of the arrows linking the nodes, so we can now visualize the graph as follows:
The following diagram represents the partitioning of the original graph into its connected components, each of which contains the members of an eventual <SYSTEM> element:
These results show that System A will consist of the DNS names a, c, and d, and the interfaces 1 and 2. Likewise, System B will consist of the DNS name b and the interfaces 3 and 4.
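The partitioning can be reproduced with a short depth-first traversal over the undirected graph. The following Python sketch is not taken from dns2xml; connected_components and split are hypothetical helper names, and nodes are tagged as ("name", ...) or ("ip", ...) purely for illustration.

```python
from collections import defaultdict

# (DNS name, IP) pairs from the A records, (IP, DNS name) pairs from the PTRs.
A_RECORDS = [("a", "1"), ("a", "2"), ("b", "3"), ("b", "4"),
             ("c", "1"), ("d", "2")]
PTR_RECORDS = [("1", "a"), ("2", "a"), ("3", "b"), ("4", "b")]


def connected_components(a_records, ptr_records):
    """Find connected components, ignoring the direction of each link."""
    adjacency = defaultdict(set)
    for name, ip in a_records:
        adjacency[("name", name)].add(("ip", ip))
        adjacency[("ip", ip)].add(("name", name))
    for ip, name in ptr_records:
        adjacency[("ip", ip)].add(("name", name))
        adjacency[("name", name)].add(("ip", ip))

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        component = set()
        stack = [start]  # explicit stack gives an iterative depth-first search
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components


def split(component):
    """Separate a component into sorted DNS names and sorted IPs."""
    names = sorted(value for kind, value in component if kind == "name")
    ips = sorted(value for kind, value in component if kind == "ip")
    return names, ips


systems = [split(c) for c in connected_components(A_RECORDS, PTR_RECORDS)]
for names, ips in systems:
    print("names:", names, "interfaces:", ips)
```

For the example data this yields two components: one holding a, c, d with interfaces 1 and 2, and one holding b with interfaces 3 and 4.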
Since System B contains only one DNS name, b, that name is automatically the system. (As an aside, b has two interfaces in this example, but by far the most common situation is probably a single system with a single interface.)
The XML generated for System B is (roughly) as follows:
<SYSTEM>
  <DNS NAME="b"></DNS>      <-- 'b' is the only choice, so definitely the 'system'
  <INTERFACE>
    <IP ADDR="3">           <-- representing b IN A 3
      <PTR>b</PTR>          <-- representing 3 IN PTR b
    </IP>
  </INTERFACE>
  <INTERFACE>
    <IP ADDR="4">           <-- representing b IN A 4
      <PTR>b</PTR>          <-- representing 4 IN PTR b
    </IP>
  </INTERFACE>
</SYSTEM>
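For a single-name system such as System B, emitting this structure is mechanical. Here is a minimal sketch using Python's standard xml.etree.ElementTree module; it is not how dns2xml itself writes XML, the helper name single_name_system is hypothetical, and the element and attribute names are simply copied from the listing above.

```python
import xml.etree.ElementTree as ET


def single_name_system(name, ptrs_by_ip):
    """Build a <SYSTEM> element for a system with exactly one DNS name.

    ptrs_by_ip maps each interface IP to the list of names its PTR
    records point to.
    """
    system = ET.Element("SYSTEM")
    ET.SubElement(system, "DNS", NAME=name)
    for ip, ptrs in ptrs_by_ip.items():
        interface = ET.SubElement(system, "INTERFACE")
        ip_element = ET.SubElement(interface, "IP", ADDR=ip)
        for ptr in ptrs:
            ET.SubElement(ip_element, "PTR").text = ptr
    return system


system_b = single_name_system("b", {"3": ["b"], "4": ["b"]})
xml_text = ET.tostring(system_b, encoding="unicode")
print(xml_text)
```

The serialized output is a single unindented line, so it matches the listing in structure rather than in layout.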
System A, on the other hand, has multiple DNS names: a, c, and d. Which of these should be the system? The answer is determined algorithmically, based upon a series of rules like "Given two DNS names that share an interface, if one of them has additional A records it is more likely to be the system than the one with only a single A record."
By inspecting the directed graph of System A, we can see that not only does a have more A records (black arrows) associated with it than c or d, but it is also referenced by a couple of PTR records (red arrows), which neither c nor d is. In this case, it is clear that a is the best candidate to be the actual system. The other two are just alternate names for the 1 and 2 interfaces of a. Here is the XML:
<SYSTEM>
  <DNS NAME="a">            <-- 'a' was determined to be the system
    <ALIAS>x</ALIAS>        <-- representing x IN CNAME a
  </DNS>
  <INTERFACE>
    <DNS NAME="c"></DNS>    <-- representing c IN A 1
    <IP ADDR="1">               (and implicitly) a IN A 1
      <PTR>a</PTR>          <-- representing 1 IN PTR a
    </IP>
  </INTERFACE>
  <INTERFACE>
    <DNS NAME="d"></DNS>    <-- representing d IN A 2
    <IP ADDR="2">               (and implicitly) a IN A 2
      <PTR>a</PTR>          <-- representing 2 IN PTR a
    </IP>
  </INTERFACE>
</SYSTEM>
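The choice of a as the system for System A can be illustrated with a simple scoring function. The actual rule set used by dns2xml is not spelled out here, so the following Python sketch is only an assumption: it ranks each candidate name first by its number of A records and then by the number of PTR records that point back to it. The name pick_system_name is hypothetical.

```python
# (DNS name, IP) pairs from the A records, (IP, DNS name) pairs from the PTRs.
A_RECORDS = [("a", "1"), ("a", "2"), ("b", "3"), ("b", "4"),
             ("c", "1"), ("d", "2")]
PTR_RECORDS = [("1", "a"), ("2", "a"), ("3", "b"), ("4", "b")]


def pick_system_name(names, a_records, ptr_records):
    """Pick the most likely system name among the names of one component.

    Assumed heuristic: more A records wins, then more PTR references.
    Ties fall back to alphabetical order so the result is deterministic.
    """
    def score(name):
        a_count = sum(1 for owner, _ in a_records if owner == name)
        ptr_count = sum(1 for _, target in ptr_records if target == name)
        return (a_count, ptr_count)

    # max() keeps the first of several equal maxima, hence the sort.
    return max(sorted(names), key=score)


system_a = pick_system_name(["a", "c", "d"], A_RECORDS, PTR_RECORDS)
print(system_a)
```

With the example data, a scores two A records and two PTR references against one A record and no PTR references for c and d, so a is selected. A component with a single name, such as System B, trivially returns that name.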
For most of the situations encountered when converting existing DNS data into XML, this process works fine. There are, however, times when the systems generated in this manner are somewhat ambiguous. dns2xml checks for these situations and, when it finds one, performs additional processing to try to resolve the ambiguity. An example of such a case and how it is resolved can be found here.