Next: Site-specific information Up: Analyzers and Events Previous: General Processing Events   Contents   Index



Generic Connection Analysis

The conn analyzer performs generic connection analysis: connection start time, duration, sizes, hosts, and the like. You don't in general load conn directly, but instead do so implicitly by loading the tcp, udp, or icmp analyzers. Consequently, conn doesn't load a capture_filter value by itself, but instead uses whatever is set up by these more specific analyzers.

conn analyzes a number of events related to connections beginning or ending. We first describe the connection record data type that keeps track of the state associated with each connection (see ``The connection record''), and then we detail the events in ``Generic TCP connection events''. The main output of its analysis is a set of one-line connection summaries, which we describe in ``Connection summaries''; in ``Connection functions'' we give an overview of the different callable functions provided by conn.

conn also loads three other Bro modules: the hot and scan analyzers, and the port-name utility module.


The connection record

Figure 7.3: Definition of conn_id and connection records.
type conn_id: record {
    orig_h: addr;  ...

A connection record holds the state associated with a connection, as shown in Figure 7.3. Its first field, id, is defined in terms of the conn_id record, which has the following fields:

[orig_h] The IP address of the host that originated (initiated) the connection. In ``client/server'' terminology, this is the ``client.''

[orig_p] The TCP or UDP port used by the connection originator (client). For ICMP ``connections'', it is set to 0 (§ ).

[resp_h] The IP address of the host that responded (received) the connection. In ``client/server'' terminology, this is the ``server.''

[resp_p] The TCP or UDP port used by the connection responder (server). For ICMP ``connections'', it is set to 0 (§ ).

The orig and resp fields of a connection record both hold endpoint record values, which consist of the following fields:

[size] How many bytes the given endpoint has transmitted so far. Note that for some types of filtering, the size will be zero until the connection terminates, because the nature of the filtering is to discard the connection's intermediary packets and only capture its start/stop packets (§ ).


Table 7.1: TCP and UDP connection states, as stored in an endpoint record.
State             Meaning
TCP_INACTIVE      The endpoint has not sent any traffic.
TCP_SYN_SENT      It has sent a SYN to initiate a connection.
TCP_SYN_ACK_SENT  It has sent a SYN ACK to respond to a connection request.
TCP_PARTIAL       The endpoint has been active, but we did not see the beginning of the connection.
TCP_ESTABLISHED   The two endpoints have established a connection.
TCP_CLOSED        The endpoint has sent a FIN in order to close its end of the connection.
TCP_RESET         The endpoint has sent a RST to abruptly terminate the connection.
UDP_INACTIVE      The endpoint has not sent any traffic.
UDP_ACTIVE        The endpoint has sent some traffic.


[state] The current state the endpoint is in with respect to the connection. Table 7.1 defines the different possible states for TCP and UDP connections. Deficiency: The states are currently defined as count values; they should instead be an enumerated type, but Bro does not yet support enumerated types.

Note: UDP ``connections'' do not have a well-defined structure, so the states for them are quite simplistic. See ``Definitions of connections'' for further discussion.

The remaining fields in a connection record are:

[start_time] The time at which the first packet associated with this connection was seen.

[duration] How long the connection lasted, or, if it is still active, how long since it began.

[service] The name of the service associated with the connection. For example, if $id$resp_p is tcp/80, then the service will be "http". Usually, this mapping is provided by the port_names global variable, perhaps via the endpoint_id function; but the service does not always directly correspond to $id$resp_p, which is why it's a separate field. In particular, an FTP data connection can have a service of "ftp-data" even though its $id$resp_p is something other than tcp/20 (which is not consistently used by FTP servers).

If the name of the service has not yet been determined, then this field is set to an empty string.

[addl] Additional information associated with the connection. For example, for a login connection, this is the username associated with the login.

Deficiency: A significant deficiency associated with the addl field is that it is simply a string without any further structure. In practice, this has proven too restrictive. For example, we may well want to associate an unambiguous username with a login session, and also keep track of the names associated with failed login attempts. (See the login analyzer for an example of how this is implemented presently.) What's needed is a notion of union types which can then take on a variety of values in a type-safe manner.

If no additional information is yet associated with this connection, then this field is set to an empty string.

[hot] How many times this connection has been marked as potentially sensitive or reflecting a break-in. The default value of 0 means that so far the connection has not been regarded as ``hot''.

Note: Bro does not presently make fine-grained use of this field. The standard scripts log connections with a non-zero hot field, and in general do not log those whose hot field is zero, though there are exceptions. In particular, the hot field is not rigorously maintained as an indicator of trouble; it is instead used loosely as an indicator of particular types of trouble (access to sensitive hosts or usernames).


Definitions of connections

Connections for TCP are well-defined, because establishing and terminating a connection is a central part of the TCP protocol. For UDP and ICMP, however, the notion is much looser.

For UDP, a connection begins when host $A$ sends a packet to host $B$ for the first time, $B$ never having sent anything to $A$. This transmission is termed a request, even if in fact the application protocol being used is not based on requests and replies. If $B$ sends a packet back, then that packet is termed a reply. Each subsequent packet that $A$ sends is another request, and each subsequent packet that $B$ sends is another reply. Deficiency: There is presently no mechanism by which generic (non-RPC) UDP connections are terminated; Bro holds the state indefinitely. There should probably be a generic timeout for UDP connections that don't correspond to some higher-level protocol (such as RPC), and a user-accessible function to mark connections with particular timeouts.

For ICMP, Bro likewise creates a connection the first time it sees an ICMP packet from $A$ to $B$, even if $B$ previously sent a packet to $A$, because that earlier packet would have been for a different transport connection than the ICMP itself--the ICMP will likely refer to that connection, but it itself is not part of the connection. For simplicity, this holds even for ICMP ECHOs and ECHO_REPLYs; if you want to pair them up, you need to do so explicitly in the policy script. Deficiency: As with UDP, Bro does not time out ICMP connections.
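The UDP request/reply rule described above can be illustrated with a short sketch. This is Python written for this illustration, not Bro policy script and not Bro's implementation; the class name UdpTracker, the classify method, and the choice to key connections on the address/port quadruple are all assumptions. As in the Deficiency noted above, the sketch never expires state.

```python
from typing import Dict, Tuple

# Key: (originator address, originator port, responder address, responder port).
# Keying on ports as well as hosts is an assumption of this sketch.
ConnKey = Tuple[str, int, str, int]


class UdpTracker:
    """Toy model of the UDP request/reply rule; not Bro's implementation."""

    def __init__(self) -> None:
        # Per-connection packet counts, e.g. {"request": 2, "reply": 1}.
        self.conns: Dict[ConnKey, Dict[str, int]] = {}

    def classify(self, src: str, sport: int, dst: str, dport: int) -> str:
        """Return "request" or "reply" for one observed UDP packet."""
        forward = (src, sport, dst, dport)
        reverse = (dst, dport, src, sport)
        if reverse in self.conns:
            # The sender is the responder of an already-known connection.
            self.conns[reverse]["reply"] += 1
            return "reply"
        # Either the known originator is sending again, or this is the
        # first packet seen between the two endpoints (a new connection).
        counts = self.conns.setdefault(forward, {"request": 0, "reply": 0})
        counts["request"] += 1
        return "request"


tracker = UdpTracker()
first = tracker.classify("10.0.0.1", 5353, "10.0.0.2", 53)
second = tracker.classify("10.0.0.2", 53, "10.0.0.1", 5353)
third = tracker.classify("10.0.0.1", 5353, "10.0.0.2", 53)
```

Here the first and third packets are classified as requests and the second as a reply, and all three belong to a single connection entry.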


Generic TCP connection events

There are a number of generic events associated with TCP connections, all of which have a single connection record as their argument:

[new_connection] Generated whenever state for a new (TCP) connection is instantiated.

Note: Handling this event is potentially expensive. For example, during a SYN flooding attack, every spoofed SYN packet will lead to a new new_connection event.

[connection_established] Generated when a connection has become established, i.e., both participating endpoints have agreed to open the connection.

[connection_attempt] Generated when the originator (client) has unsuccessfully attempted to establish a connection. ``Unsuccessful'' is defined as at least ATTEMPT_INTERVAL seconds having elapsed since the client first sent a connection establishment packet to the responder (server), where ATTEMPT_INTERVAL is an internal Bro variable which is presently set to 300 seconds. Deficiency: This variable should be user-settable. If you want to immediately detect that a client is attempting to connect to a server, regardless of whether it may soon succeed, then you want to handle the new_connection event instead.

Note: Handling this event is potentially expensive. For example, during a SYN flooding attack, every spoofed SYN packet will lead to a new connection_attempt event, albeit delayed by ATTEMPT_INTERVAL.

[partial_connection] Generated when both connection endpoints enter the TCP_PARTIAL state (Table 7.1). This means that we have seen traffic generated by each endpoint, but the activity did not begin with the usual connection establishment. Deficiency: For completeness, Bro's event engine should generate another form of partial_connection event when a single endpoint becomes active (see new_connection above). This hasn't been implemented because our experience is that network traffic often contains a great deal of ``crud'', which would lead to a large number of these really-partial events. However, by not providing the event handler, we miss an opportunity to detect certain forms of stealth scans until they begin to elicit some form of reply.

[connection_finished] Generated when a connection has gracefully closed.

[connection_rejected] Generated when a server rejects a connection attempt by a client.

Note: This event is only generated as the client attempts to establish a connection. If the server instead accepts the connection and then later aborts it, a connection_reset event is generated (see below). This can happen, for example, due to use of TCP Wrappers.

Note: Per the discussion above, a client attempting to connect to a server will result in one of connection_attempt, connection_established, or connection_rejected; they are mutually exclusive.

[connection_half_finished] Generated when Bro sees one endpoint of a connection attempt to gracefully close the connection, but the other endpoint is in the TCP_INACTIVE state. This can happen due to split routing (§ ), in which Bro only sees one side of a connection.

[connection_reset] Generated when one endpoint of an established connection terminates the connection abruptly by sending a TCP RST packet.

[connection_partial_close] Generated when a previously inactive endpoint attempts to close a connection via a normal FIN handshake or an abort RST sequence. When it sends one of these packets, Bro waits PARTIAL_CLOSE_INTERVAL (an internal Bro variable set to 10 seconds) prior to generating the event, to give the other endpoint a chance to close the connection normally.

[connection_pending] Generated for each still-open connection when Bro terminates.


The tcp analyzer

The general tcp analyzer lets you specify that you're interested in generic connection analysis for TCP. It simply @load's conn and adds the following to capture_filter:

    tcp[13] & 0x7 != 0
which instructs Bro to capture all TCP SYN, FIN and RST packets; that is, the control packets that delineate the beginning (SYN) and end (FIN) or abnormal termination (RST) of a connection.
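To see why the mask 0x7 selects exactly these packets, recall that byte 13 of the TCP header holds the flag bits, with FIN, SYN and RST occupying the three low-order bits. The following Python sketch (an illustration only; the function name is_control_packet is hypothetical and not part of Bro) applies the same test to a flags byte:

```python
# TCP flag bits as they appear in byte 13 of the TCP header.
FIN = 0x01
SYN = 0x02
RST = 0x04
PSH = 0x08
ACK = 0x10

# FIN | SYN | RST == 0x07, the mask used in the capture filter.
CONTROL_MASK = FIN | SYN | RST


def is_control_packet(flags: int) -> bool:
    """Return True if the flags byte would match `tcp[13] & 0x7 != 0`."""
    return (flags & CONTROL_MASK) != 0
```

A SYN ACK (SYN | ACK) or a FIN ACK matches, while a pure ACK or a data segment carrying only PSH | ACK does not, which is why the filter discards a connection's intermediary packets.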


The udp analyzer

The general udp analyzer lets you specify that you're interested in generic connection analysis for UDP. It @load's both hot and conn, and defines two event handlers:

[udp_request (u: connection)] Invoked whenever a UDP packet is seen on the forward (request) direction of a UDP connection. See ``Definitions of connections'' for a discussion of how Bro defines UDP connections.

The analyzer invokes check_hot with a mode of CONN_ATTEMPTED and then record_connection to generate a connection summary (necessary because Bro does not time out UDP connections, and hence cannot generate a connection-attempt-failed event).

[udp_reply (u: connection)] Invoked whenever a UDP packet is seen on the reverse (reply) direction of a UDP connection. See ``Definitions of connections'' for a discussion of how Bro defines UDP connections.

The analyzer invokes check_hot with a mode of CONN_ESTABLISHED and then again with a mode of CONN_FINISHED to cover the general case that the reply reflects that the connection was both established and is now complete. Finally, it invokes record_connection to generate a connection summary.

Note: The standard script does not update capture_filter to capture UDP traffic. Unlike for TCP, where there is a natural generic filter that captures only a subset of the traffic, the only natural UDP filter would be simply to capture all UDP traffic, and that can often be a huge load.


Connection summaries

The main output of conn is a one-line ASCII summary of each connection. By tradition, these summaries are written to a file with the name red.tag, where tag uniquely identifies the Bro session generating the logs. (``red'' is mnemonic for ``reduced,'' from Bro's roots in performing protocol analysis for Internet traffic studies.)

The summaries are produced by the record_connection function, and have the following format:

<start> <duration> <service> $B_{o}$ $B_{r}$ $A_{l}$ $A_{r}$ <state> <flags> <addl>

start
corresponds to the connection's start time, as defined by start_time.

duration
gives the connection's duration, as defined by duration.

service
is the connection's service, as defined by service.

$B_{o}$, $B_{r}$
give the number of bytes sent by the originator and responder, respectively. These correspond to the size fields of the corresponding endpoint records.

$A_{l}$, $A_{r}$
correspond to the local and remote addresses that participated in the connection, respectively. The notion of which addresses are local is controlled by the local_nets global variable, if refined from its default value of empty. If local_nets has not been refined, then $A_{l}$ is the connection responder and $A_{r}$ is the connection originator.

Note: The format and defaults for $A_{l}$ and $A_{r}$ are unintuitive; they reflect the use of Bro's predecessor for analyzing Internet traffic patterns, and have not been changed so as to maintain compatibility with old, archived connection summaries.


Table 7.2: Summaries of connection states, as reported in red files.
Symbol  Name    Meaning
}       S0      Connection attempt seen, no reply.
>       S1      Connection established, not terminated.
>       SF      Normal establishment and termination. Note that this is the same symbol as for state S1. You can tell the two apart because for S1 there will not be any byte counts in the summary, while for SF there will be.
[       REJ     Connection attempt rejected.
}2      S2      Connection established and close attempt by originator seen (but no reply from responder).
}3      S3      Connection established and close attempt by responder seen (but no reply from originator).
>]      RSTO    Connection established, originator aborted (sent a RST).
>[      RSTR    Established, responder aborted.
}]      RSTOS0  Originator sent a SYN followed by a RST, we never saw a SYN ACK from the responder.
<[      RSTRH   Responder sent a SYN ACK followed by a RST, we never saw a SYN from the (purported) originator.
>h      SH      Originator sent a SYN followed by a FIN, we never saw a SYN ACK from the responder (hence the connection was ``half'' open).
<h      SHR     Responder sent a SYN ACK followed by a FIN, we never saw a SYN from the originator.
?>?     OTH     No SYN seen, just midstream traffic (a ``partial connection'' that was not later closed).


state
reflects the state of the connection at the time the summary was written (which is usually either when the connection terminated, or when Bro terminated). The different states are summarized in Table 7.2. The ASCII Name given in the table is what appears in the red file; it is returned by the conn_state function. The Symbol is used when generating human-readable versions of the file--see hot-report.

For UDP connections, the analyzer reports connections for which both endpoints have been active as SF; those for which just the originator was active as S0; those for which just the responder was active as SHR; and those for which neither was active as OTH (this latter shouldn't happen!).

flags
reports additional binary state associated with the connection:

L
indicates that the connection was initiated locally, i.e., the host corresponding to $A_{l}$ initiated the connection. If L is missing, then the host corresponding to $A_{r}$ initiated the connection.

U
indicates the connection involved one of the networks listed in the neighbor_nets variable. The use of ``U'' for this indication (rather than ``N'', say) is historical, as for the most part is the whole notion of ``neighbor network.''

Note that a connection can have both L and U set (see next item).

X
is used to indicate that neither the ``L'' nor the ``U'' flag is associated with this connection. An explicit negative indication is needed to disambiguate the flags field from the subsequent addl field.

addl
lists additional information associated with the connection, i.e., as defined by addl.

Putting all of this together, here is an example of a red connection summary:

931803523.006848 54.3776 http 7320 38891 206.132.179.35 128.32.162.134 RSTO X %103
The connection began at timestamp 931803523.006848 (18:18:43 hours GMT on July 12, 1999; see the cf utility for how to determine this) and lasted 54.3776 seconds. The service was HTTP (presumably; this conclusion is based just on the responder's use of port 80/tcp). The originator sent 7,320 bytes, and the responder sent 38,891 bytes. Because the ``L'' flag is absent, the connection was initiated by host 128.32.162.134, and the responding host was 206.132.179.35. When the summary was written, the connection was in the ``RSTO'' state, i.e., after establishing the connection and transferring data, the originator had terminated it with a RST (this is unfortunately common for Web clients). The connection had neither the L nor the U flag associated with it, and there was additional information, summarized by the string ``%103'' (see the http analyzer for an explanation of this information).
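The field layout described above is easy to post-process outside of Bro. The following Python sketch is only an illustration of the format, under the assumption that every line is well-formed and carries at least the first nine fields; the function names parse_red_line and start_as_utc and the dictionary keys are hypothetical and not part of Bro.

```python
from datetime import datetime, timezone
from typing import Dict


def parse_red_line(line: str) -> Dict[str, str]:
    """Split one connection summary into named fields (illustrative only)."""
    names = ["start", "duration", "service", "orig_bytes", "resp_bytes",
             "local_addr", "remote_addr", "state", "flags", "addl"]
    # addl comes last and may be absent; limiting the number of splits
    # keeps any spaces inside addl intact.
    parts = line.strip().split(None, len(names) - 1)
    fields = dict(zip(names, parts))
    fields.setdefault("addl", "")
    # The L flag means the local address originated the connection;
    # without it, the remote address did.
    if "L" in fields["flags"]:
        fields["originator"] = fields["local_addr"]
        fields["responder"] = fields["remote_addr"]
    else:
        fields["originator"] = fields["remote_addr"]
        fields["responder"] = fields["local_addr"]
    return fields


def start_as_utc(fields: Dict[str, str]) -> str:
    """Render the start timestamp as a UTC date and time."""
    when = datetime.fromtimestamp(float(fields["start"]), tz=timezone.utc)
    return when.strftime("%Y-%m-%d %H:%M:%S")


example = ("931803523.006848 54.3776 http 7320 38891 "
           "206.132.179.35 128.32.162.134 RSTO X %103")
summary = parse_red_line(example)
```

For the example summary, summary["originator"] is 128.32.162.134 and start_as_utc(summary) gives 1999-07-12 18:18:43. The byte counts are deliberately kept as strings, since conn_size may report "?" rather than a number.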


Connection functions

We finish our discussion of generic connection analysis with a brief summary of the different Bro functions provided by the conn analyzer:

[conn_size(e: endpoint, is_tcp: bool): string ] returns a string giving either the number of bytes the endpoint sent during the given connection, or "?" if from the connection state this can't be determined. The is_tcp parameter is needed so that the function can inspect the endpoint's state to determine whether the connection was closed.

[conn_state(c: connection, is_tcp: bool): string ] returns the name associated with the connection's state, as given in Table 7.2.

[determine_service(c: connection): bool ] sets the service field of the given connection, using port_names. If you are using the ftp analyzer, then it knows about FTP data connections and maps them to port_names[20/tcp], i.e., "ftp-data".

[full_id_string(c: connection): string ] returns a string identifying the connection in one of the two following forms. If the connection is in state S0, S1, or REJ, then no data has been transferred, and the format is:

$A_{o}$  <state>  $A_{r}$/<service>  <addl>
where $A_{o}$ is the IP address of the originator ($id$orig_h), state is as given in the Symbol column of Table 7.2, $A_{r}$ is the IP address of the responder ($id$resp_h), service gives the application service ($service) as set by determine_service, and addl is the contents of the $addl field (which may be an empty string).

Note that the ephemeral port used by the originator is not reported. If you want to display it, use id_string.

So, for example:

    128.3.6.55 > 131.243.88.10/telnet "luser"
identifies a connection originated by 128.3.6.55 to 131.243.88.10's Telnet server, for which the additional associated information is "luser", the username successfully used during the authentication dialog as determined by the login analyzer. From Table 7.2 we see that the connection must be in state S1, as that's the only state of S0, S1, or REJ that has a > symbol. (We can tell it's not in state SF because the format used for that state differs--see below.)

For connections in other states, Bro has size and duration information available, and the format returned by full_id_string is:

$A_{o}$  $S_{o}$b  <state>  $A_{r}$/<service>  $S_{r}$b  $D$s  <addl>
where $A_{o}$, $A_{r}$, state, service, and addl are as before, $S_{o}$ and $S_{r}$ give the number of bytes transmitted so far by the originator to the responder and vice versa, and $D$ gives the duration of the connection in seconds (reported with one decimal place) so far.

An example of this second format is:

    128.3.6.55 63b > 131.243.88.10/telnet 391b 39.1s "luser"
which reflects the same connection as before, but now 128.3.6.55 has transmitted 63 bytes to 131.243.88.10, which has transmitted 391 bytes in response, and the connection has been active for 39.1 seconds. The ``>'' indicates that the connection is in state SF.

[id_string(id: conn_id): string ] returns a string identifying the connection by its address/port quadruple. Regardless of the connection's state, the format is:

$A_{o}$/$P_{o}$  >  $A_{r}$/$P_{r}$
where $A_{o}$ and $A_{r}$ are the originator and responder addresses, respectively, and $P_{o}$ and $P_{r}$ are representations of the originator and responder ports as returned by the port-name module, i.e., either ``<number>/<tcp or udp>'' or a string like ``http'' for a well-known port such as 80/tcp.

An example:

    128.3.6.55/2244 > 131.243.88.10/telnet

Note: id_string is implemented using a pair of calls to endpoint_id.

Deficiency: It would be convenient to have a form of id_string that can incorporate a notion of directionality, for example 128.3.6.55/2244 < 131.243.88.10/telnet to indicate the same connection as before, but referring specifically to the flow from responder to originator in that connection (indicated by using ``<'' instead of ``>'').

[log_hot_conn(c: connection) ] logs a real-time alert of the form:

hot: <connection-id>
where connection-id is the format returned by full_id_string. log_hot_conn keeps track of which connections it has logged and will not log the same connection more than once.

[record_connection(c: connection, disposition: string) ] Generates a connection summary to the red file in the format described in ``Connection summaries''. If the connection's $hot field is positive, then also logs the connection using log_hot_conn. The disposition is a text description of the connection's state, such as "attempt" or "half_finished"; it is not presently used.

[service_name(c: connection): string ] returns a string describing the service associated with the connection, computed as follows. If the responder port ($id$resp_p), $p$, is well-known, that is, in the port_names table, then $p$'s entry in the table is returned (such as "http" for TCP port 80). Otherwise, for TCP connections, if the responder port is less than 1024, then priv-$p$ is returned, otherwise other-$p$. For UDP connections, the corresponding service names are upriv-$p$ and uother-$p$.

[terminate_connection(c: connection) ] Attempts to terminate the given connection using the rst utility in the current directory. It does not check to see whether the utility is actually present, so an unaesthetic shell error will appear if the utility is not available.

rst terminates connections by forging RST packets. It is not presently distributed with Bro, due to its potential for disruptive use.

If Bro is reading a trace file rather than live network traffic, then terminate_connection logs the rst invocation but does not actually invoke the utility. In either case, it finishes by logging that the connection is being terminated.
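As an illustration of two of the functions summarized above, the following Python sketch mimics the naming rule given for service_name and the two output formats given for full_id_string. It is not Bro policy script and not the analyzer's actual code. The PORT_NAMES dictionary is a hypothetical two-entry stand-in for Bro's port_names table, the Python signatures are invented for this sketch, and rendering $p$ as a bare port number and treating any non-UDP protocol like TCP are assumptions.

```python
from typing import Dict, Tuple

# Hypothetical excerpt standing in for Bro's port_names table.
PORT_NAMES: Dict[Tuple[int, str], str] = {
    (23, "tcp"): "telnet",
    (80, "tcp"): "http",
}

# Symbol column of the connection-state table (Table 7.2), keyed by name.
STATE_SYMBOLS: Dict[str, str] = {
    "S0": "}", "S1": ">", "SF": ">", "REJ": "[", "S2": "}2", "S3": "}3",
    "RSTO": ">]", "RSTR": ">[", "RSTOS0": "}]", "RSTRH": "<[",
    "SH": ">h", "SHR": "<h", "OTH": "?>?",
}


def service_name(port: int, proto: str) -> str:
    """Name a responder port following the rule given for service_name."""
    known = PORT_NAMES.get((port, proto))
    if known is not None:
        return known
    prefix = "priv" if port < 1024 else "other"
    if proto == "udp":
        prefix = "u" + prefix  # upriv-<p> / uother-<p> for UDP
    return f"{prefix}-{port}"


def full_id_string(orig_h: str, resp_h: str, service: str, state: str,
                   addl: str = "", orig_bytes: int = 0,
                   resp_bytes: int = 0, duration: float = 0.0) -> str:
    """Format a connection in the short or long full_id_string form."""
    symbol = STATE_SYMBOLS[state]
    if state in ("S0", "S1", "REJ"):
        # Short form: no size or duration information is reported.
        parts = [orig_h, symbol, f"{resp_h}/{service}"]
    else:
        # Long form: byte counts and duration with one decimal place.
        parts = [orig_h, f"{orig_bytes}b", symbol, f"{resp_h}/{service}",
                 f"{resp_bytes}b", f"{duration:.1f}s"]
    if addl:
        parts.append(addl)
    return " ".join(parts)


short_form = full_id_string("128.3.6.55", "131.243.88.10",
                            service_name(23, "tcp"), "S1", '"luser"')
long_form = full_id_string("128.3.6.55", "131.243.88.10",
                           service_name(23, "tcp"), "SF", '"luser"',
                           orig_bytes=63, resp_bytes=391, duration=39.1)
```

With these inputs, short_form and long_form reproduce the two Telnet examples shown earlier for full_id_string.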


Vern Paxson 2002-11-17