The focus is on some of the guts of the two main scripts, dns2xml and xml2dns, in order to provide any poor soul who has to work with and modify my code some insight into how it all currently works.
The d2xpp script was (unfortunately) written very quickly to accomplish something more "important" at the time. It works, but it should be rewritten to be less fragile and more correct (and to add sanity checking for the DNS data). It is a pretty simple script, and its role is spelled out pretty well in the User's Guide, so rewriting it should be relatively easy, and it won't be covered here.
Everything else is just details (covered shortly).
The global data structures maintained by dns2xml are the $Zones, $DNS_Names, and $Partition hash references. A brief discussion of these is necessary in order to talk about how the rest of dns2xml works:
$Zones is a hash ref that maps a zone name to that zone's data. The zones in $Zones are the zones listed in the BIND config file (named.conf), and the information comes from the corresponding zone files. There are also default forward and default reverse zones in the dns2xml.conf file that are added to $Zones.
    $Zones => {
        'mydomain1.com' => {
            'NAME'     => 'mydomain1.com.',
            'TYPE'     => 'MASTER',
            'FILE'     => 'db.mydomain1',
            'DIR'      => 'FORWARD',
            'SERIAL'   => '2000083101',               # \
            'REFRESH'  => 10800,                      # |
            'RETRY'    => 600,                        # |
            'MINTTL'   => 21600,                      # | SOA info
            'NEGTTL'   => 21600,                      # |
            'EXPIRE'   => 604800,                     # |
            'HOST'     => 'ns1.mydomain1.com.',       # |
            'MAILADDR' => 'root.ns1.mydomain1.com.',  # /
            'NS' => {
                'dns1.mydomain1.com.' => 1,           # nameservers
                'dns2.mydomain1.com.' => 1,
            },
            'MX' => {
                '10 mxhost1.mydomain1.com.' => 1,     # mx hosts
                '15 mxhost2.mydomain1.com.' => 1,
            },
        }
        'zone2.com.' => { ... info ... }
        'zone3.com.' => { ... info ... }
        'FORWARD'    => { ... default forward info ... }
        'REVERSE'    => { ... default reverse info ... }
    }
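To make the shape of $Zones a little more concrete, here is a minimal sketch of reading one zone's data back out. It is only an illustration: zone_summary is a hypothetical helper that does not exist in dns2xml, and the sample data is a trimmed copy of the example above.

```perl
use strict;
use warnings;

# Trimmed-down copy of the example $Zones entry shown above.
my $Zones = {
    'mydomain1.com' => {
        'NAME'   => 'mydomain1.com.',
        'DIR'    => 'FORWARD',
        'SERIAL' => '2000083101',
        'NS'     => { 'dns1.mydomain1.com.' => 1, 'dns2.mydomain1.com.' => 1 },
        'MX'     => { '10 mxhost1.mydomain1.com.' => 1 },
    },
};

# zone_summary is a hypothetical helper, not part of dns2xml.
sub zone_summary {
    my ($zones, $zone) = @_;
    my $z = $zones->{$zone} or return;    # unknown zone: return nothing
    # NS and MX are "set" hashes: the data lives in the keys, the values are just 1.
    my @ns = sort keys %{ $z->{NS} || {} };
    return sprintf '%s serial=%s ns=%s', $z->{NAME}, $z->{SERIAL}, join(',', @ns);
}

print zone_summary($Zones, 'mydomain1.com'), "\n";
```

The thing to notice is that multi-valued data such as NS and MX is stored in hash keys, so "keys" (usually sorted) is how you get the values back.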
$DNS_Names contains the graph structure and resource record data for all the dns names extracted from the zone files.
Given the following resource records:
    ; from zone file db.mydomain1
    ns1.mydomain1.com.     IN A     1.2.3.4
    ns1.mydomain1.com.     IN A     1.2.3.5
    ns1.mydomain1.com.     IN MX    10 mxhost1.mydomain1.com.
    ns1.mydomain1.com.     IN MX    15 mxhost2.mydomain1.com.
    www.mydomain1.com.     IN CNAME ns1.mydomain1.com.

    ; from zone file db.1.2.3
    4.3.2.1.in-addr.arpa.  IN PTR   ns1.mydomain1.com.

$DNS_Names would contain the following two entries:
    $DNS_Names => {
        'ns1.mydomain1.com.' => {
            'NAME'   => 'ns1',
            'DOMAIN' => 'mydomain1.com.',
            'P_TO'   => {'4.3.2.1.in-addr.arpa.' => 1,
                         '5.3.2.1.in-addr.arpa.' => 1,
                        },
            'P_FROM' => {'4.3.2.1.in-addr.arpa.' => 1},
            'A'      => {'4.3.2.1.in-addr.arpa.' => 1,
                         '5.3.2.1.in-addr.arpa.' => 1,
                        },
            'CNAME'  => {'www.mydomain1.com.' => 1},
            'MX'     => {
                '10 mxhost1.mydomain1.com.' => 1,
                '15 mxhost2.mydomain1.com.' => 1,
            }
        },
        '4.3.2.1.in-addr.arpa.' => {
            'NAME'   => '4',
            'DOMAIN' => '3.2.1.in-addr.arpa.',
            'P_TO'   => {'ns1.mydomain1.com.' => 1},
            'P_FROM' => {'ns1.mydomain1.com.' => 1},
            'PTR'    => {'ns1.mydomain1.com.' => 1},
        },
    }

The keys under an individual dns name entry in $DNS_Names fall into some basic categories:
NAME
    The host part of the dns name ('ns1' in the example above).

DOMAIN
    The domain part of the dns name ('mydomain1.com.' in the example above).

P_TO
    The names this dns name points to, taken from its A and PTR resource records:

        x.y.com. IN A 1.2.3.4  <-->  x.y.com. points to 1.2.3.4

P_FROM
    The names that point to this dns name. It is generated whenever a P_TO is generated: if we know that x P_TO y, then we generate y P_FROM x. The purpose of the P_FROM is to provide a complete view of what nodes are connected to what other ones for use in the partitioning and DFS algorithms.

Resource record types
    Each resource record type (A, PTR, MX, HINFO, etc.) will be a key for that type's data for that dns name. In the examples above, $DNS_Names->{'ns1.mydomain1.com.'}{MX} returns a hash ref whose keys are the mx host entries for ns1.mydomain1.com., while $DNS_Names->{'ns1.mydomain1.com.'}{A} returns the IP addresses it points to.

While $DNS_Names is the place that all the resource record information gets bundled up and associated with the appropriate dns name, $Partition is where all those dns names get bundled into individual systems.
Essentially, the partitioning routines assign a number to each entry in $DNS_Names and then move all those entries to their corresponding spot in $Partition. ($DNS_Names is empty at the end of all the moves.)
If $DNS_Names looked like this:

    $DNS_Names => {
        'dnsname1' => { ... info ... }
        'dnsname2' => { ... info ... }
        'dnsname3' => { ... info ... }
        'dnsname4' => { ... info ... }
    }
and the partitioning routines determined that dnsname1 and dnsname2 belonged to one system while dnsname3 and dnsname4 belonged to a different one, then $Partition might look like this:

    $Partition => {
        '1' => {        # <---- system number, assigned in processing order
            'dnsname1' => { ... info ... }
            'dnsname2' => { ... info ... }
        }
        '2' => {
            'dnsname3' => { ... info ... }
            'dnsname4' => { ... info ... }
        }
    }
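The move from $DNS_Names into $Partition can be sketched roughly as follows. This is an illustration rather than the actual dns2xml code: move_into_partition is a hypothetical name, and it assumes each entry already carries the SYSTEM_NUMBER key assigned during partitioning.

```perl
use strict;
use warnings;

# move_into_partition is a hypothetical name; it assumes every entry already
# carries the SYSTEM_NUMBER assigned by the partitioning pass.
sub move_into_partition {
    my ($dns_names) = @_;
    my %partition;
    # keys() builds its list up front, so deleting inside the loop is safe.
    for my $name (keys %$dns_names) {
        my $entry = delete $dns_names->{$name};
        $partition{ $entry->{SYSTEM_NUMBER} }{$name} = $entry;
    }
    return \%partition;
}

my $DNS_Names = {
    'dnsname1' => { SYSTEM_NUMBER => 1 },
    'dnsname2' => { SYSTEM_NUMBER => 1 },
    'dnsname3' => { SYSTEM_NUMBER => 2 },
    'dnsname4' => { SYSTEM_NUMBER => 2 },
};

my $Partition = move_into_partition($DNS_Names);

printf "system %s: %s\n", $_, join(', ', sort keys %{ $Partition->{$_} })
    for sort keys %$Partition;
```

Because each entry is deleted from $DNS_Names as it is filed under its system number, $DNS_Names ends up empty, which matches the behavior described above.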
Remember: Whenever there is confusion with these multiple hashes of hashes, inserting a print Dumper $variable; (Dumper comes from the Data::Dumper module) will list the contents of the variable and can be invaluable in determining what's going on. It can substantially increase the running time of the script, though.
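One way to keep the cost of dumping down is to dump only the piece you care about, and only when asked. The sketch below uses two real Data::Dumper settings (Sortkeys and Indent); the DNS2XML_DEBUG environment variable is a made-up switch for illustration, not an existing dns2xml option.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order makes two dumps easy to diff
$Data::Dumper::Indent   = 1;    # compact indentation for deeply nested hashes

my $DEBUG = $ENV{DNS2XML_DEBUG};    # hypothetical switch, not a dns2xml option

my $Partition = {
    '1' => {
        'dnsname1' => { 'NAME' => 'dnsname1' },
        'dnsname2' => { 'NAME' => 'dnsname2' },
    },
    '2' => {
        'dnsname3' => { 'NAME' => 'dnsname3' },
    },
};

# Dump a single system instead of the whole structure, and only when asked to.
print STDERR Dumper($Partition->{'1'}) if $DEBUG;
```

Writing to STDERR keeps the debugging output out of whatever the script normally prints to STDOUT.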
The systems produced by partitioning correspond to the <SYSTEM> elements in the XML document.
The PartitionsIntoSystems subroutine partitions the original set of dns names in $DNS_Names into numbered systems in $Partition. See the Data Structures section at the start of this chapter for an example of the transformation and resulting structures.
The following is the basic algorithm for grouping the dns names into systems, in pseudocode. It doesn't correspond exactly to the code, but it clearly shows the process:
    To begin with, no dns name entry in $DNS_Names has an assigned SYSTEM_NUMBER

    PartitionIntoSystems {
        $current_system_number = 0;
        foreach ($dns_name in $DNS_Names) {
            if ($dns_name doesn't have an assigned SYSTEM_NUMBER) {
                increment the $current_system_number;
                DFS($current_system_number, $dns_name);
            }
        }
    }

    DFS($system_num, $dns_name) {
        $dns_name->{SYSTEM_NUMBER} = $system_num;
        foreach ($adjacent_node of $dns_name) {
            if ($adjacent_node doesn't have an assigned SYSTEM_NUMBER) {
                DFS($system_num, $adjacent_node);
            }
        }
    }
The end result is that we find the connected components of the graph such that:

    if   $DNS_Names->{$dns_name1}{SYSTEM_NUMBER}
         is equal to
         $DNS_Names->{$dns_name2}{SYSTEM_NUMBER}
    then $dns_name1 and $dns_name2 belong to the same connected
         component (the same system) in $Partition.
So, if the SYSTEM_NUMBER for $dns_name1 and $dns_name2 was 1, then

    $Partition->{'1'} => {
        $dns_name1 => {...info...},
        $dns_name2 => {...info...},
    }

An illustrated walkthrough of the process is also available.
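For readers who prefer running code to pseudocode, here is a minimal Perl sketch of the same connected-components idea. It is not the dns2xml implementation. The sub names are hypothetical, it assumes a node's adjacent nodes are the union of its P_TO and P_FROM keys (as the description of P_FROM suggests), and it swaps the recursive DFS for an explicit stack so that large graphs do not trigger Perl's deep recursion warning.

```perl
use strict;
use warnings;

# Hypothetical helper: a node's neighbours are assumed to be the union of its
# P_TO and P_FROM keys, restricted to names that actually have an entry.
sub neighbours {
    my ($names, $n) = @_;
    my %adj = (%{ $names->{$n}{P_TO} || {} }, %{ $names->{$n}{P_FROM} || {} });
    return grep { exists $names->{$_} } sort keys %adj;
}

# Assign a SYSTEM_NUMBER to every entry; returns the number of systems found.
sub partition_into_systems {
    my ($names) = @_;
    my $current = 0;
    for my $start (sort keys %$names) {
        next if defined $names->{$start}{SYSTEM_NUMBER};
        $current++;
        my @stack = ($start);    # explicit stack instead of recursive DFS
        while (@stack) {
            my $n = pop @stack;
            next if defined $names->{$n}{SYSTEM_NUMBER};
            $names->{$n}{SYSTEM_NUMBER} = $current;
            push @stack,
                grep { !defined $names->{$_}{SYSTEM_NUMBER} } neighbours($names, $n);
        }
    }
    return $current;
}

# Made-up sample data: two unrelated host/address pairs.
my $DNS_Names = {
    'ns1.mydomain1.com.'    => { P_TO   => { '4.3.2.1.in-addr.arpa.' => 1 } },
    '4.3.2.1.in-addr.arpa.' => { P_FROM => { 'ns1.mydomain1.com.'    => 1 } },
    'host.zone2.com.'       => { P_TO   => { '8.7.6.5.in-addr.arpa.' => 1 } },
    '8.7.6.5.in-addr.arpa.' => { P_FROM => { 'host.zone2.com.'       => 1 } },
};

my $count = partition_into_systems($DNS_Names);
print "$_ => system $DNS_Names->{$_}{SYSTEM_NUMBER}\n" for sort keys %$DNS_Names;
```

Names are visited in sorted order here only to make the numbering repeatable; the pseudocode above does not require any particular order.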
Not every system found by the partitioning maps cleanly onto a single <SYSTEM> element; some are ambiguous. An ambiguous system is one where there is an interface in the system for which there is no corresponding A resource record linking the system's dns name with the interface IP. The IP in question is pulled into the system through a cross-link with one of the interface DNS names:
    x IN A 1   |  The DFS will pull together {x,y,1,2,3,4} as a system.
    x IN A 2   |  Then x is determined to be the system name, and
    x IN A 3   |  y is the name of the '2' interface of system x.
    y IN A 2   |  The problem is that y also points to '4', which has
    y IN A 4   |  nothing to do with system x from a DNS standpoint.
We need to figure out how to split y and 4 out so that they make the most sense. Here's the basic algorithm:
    foreach $dns_name that points to one or more of $system_name's IPs:
            ($dns_name can be $system_name)
        if all $dns_name's IPs are pointed to by $system_name
            MOVE $dns_name
            MOVE all $dns_name's IPs
        else    // $dns_name points to some IP that $system_name doesn't
            if subpartition type == ip_unique
                COPY $dns_name
                MOVE $dns_name's IPs that are pointed to by $system_name
            else    // subpartition type == sys_unique
                SKIP $dns_name
                COPY $dns_name's IPs that are pointed to by $system_name

    MOVE = remove from original system and copy to new one
    COPY = leave in original system and copy to new one
    SKIP = leave in original system and don't copy to new one

The processing of this algorithm actually occurs in two stages. The first is the loop shown above, where each $dns_name is actually "marked" with the action to take. Once the loop is finished, and all the $dns_names have been marked with an action, we sweep through all those $dns_names and take the appropriate action.
During the looping of the marking phase, a $dns_name may be marked more than once. That's OK, but the following rule is observed:

    A COPY can overwrite a MOVE, but a MOVE cannot overwrite a COPY.

For another view of this problem and what is done to fix it, see the walkthrough of subpartitioning.
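The two-stage mark-and-sweep process, including the COPY-over-MOVE rule, might be sketched like this. This is an illustration under several assumptions, not the dns2xml code: the sub names and the simplified $points_to structure (name to list of IPs) are made up, SKIP is represented by simply leaving an item unmarked, and systems are modeled as plain sets of names and IPs.

```perl
use strict;
use warnings;

# Hypothetical helper: record an action for $item, honouring the rule that
# a COPY can overwrite a MOVE but a MOVE cannot overwrite a COPY.
sub mark {
    my ($marks, $item, $action) = @_;
    return if ($marks->{$item} // '') eq 'COPY' && $action eq 'MOVE';
    $marks->{$item} = $action;
}

# $points_to maps each dns name to the IPs it points to (simplified model).
# Returns two set hashes: what stays in the original system, and the new one.
sub subpartition {
    my ($points_to, $system_name, $type) = @_;
    my %sys_ip = map { ($_ => 1) } @{ $points_to->{$system_name} };
    my %marks;

    # Stage 1: mark every name and IP with the action to take.
    for my $name (sort keys %$points_to) {
        my @ips    = @{ $points_to->{$name} };
        my @shared = grep { $sys_ip{$_} } @ips;
        next unless @shared;                  # unrelated to $system_name
        if (@shared == @ips) {                # all IPs belong to the system
            mark(\%marks, $_, 'MOVE') for $name, @ips;
        }
        elsif ($type eq 'ip_unique') {
            mark(\%marks, $name, 'COPY');
            mark(\%marks, $_, 'MOVE') for @shared;
        }
        else {                                # sys_unique: SKIP = no mark on $name
            mark(\%marks, $_, 'COPY') for @shared;
        }
    }

    # Stage 2: sweep through the marks and carry out the actions.
    my %original = map { ($_ => 1) }
        (keys %$points_to), (map { @$_ } values %$points_to);
    my %new;
    for my $item (keys %marks) {
        $new{$item} = 1;                      # MOVE and COPY both copy
        delete $original{$item} if $marks{$item} eq 'MOVE';
    }
    return (\%original, \%new);
}

# The x/y example from above.
my $points_to = { x => [1, 2, 3], y => [2, 4] };
my ($orig, $new) = subpartition($points_to, 'x', 'sys_unique');
print 'new system:      ', join(' ', sort keys %$new),  "\n";
print 'original system: ', join(' ', sort keys %$orig), "\n";
```

With sys_unique, IP 2 is first marked MOVE (because of x) and then COPY (because of y), so it ends up in both systems. With ip_unique, y is copied into the new system instead and IP 2 is moved.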
If there are any questions on any parts of these scripts, please feel free to email me at bom@alumni.utexas.net.