Details
In this section you will experiment with the rwcount, rwstats, and rwuniq tools. The goal is to understand how these tools function and how they can be used to answer the questions an analyst asks when researching a network, investigating network activity, or threat hunting.
Exercise 1
The rwstats tool aggregates information about a collection of flows according to user-specified criteria. Working with the data in this way can take some getting used to, and there are some pitfalls to watch out for. All of the questions in this exercise use the SiLK repository on the VM, in the date range from January 1, 2022 through July 8, 2022.
1. When beginning to analyze a network, whether to understand how it is used or to identify potentially “bad” activity, it is usual to start with some basic questions about that network. One of the very first series of questions attempts to identify who the “Top Talkers” are on the network, what protocols they are using, how much data is being sent, etc.
On the network in question, what is the IP address of the host that appears as the source in the greatest number of flows?

I begin with the rwfilter tool to query the repository for all flows in the date range. All of the flows that pass this filter are then processed by rwstats. Using the --fields argument, I can specify that the flows should be aggregated into bins, one for each unique source IP address. The IP address we are looking for is 10.200.223.4.
2. The last question showed that an internal address appears as the source in the greatest number of flows. While this is interesting, a more important question is likely, “Which source address sends the greatest number of bytes?” See if you can answer that question now: which source IP address sends the greatest number of bytes, and how many bytes does it send?
By default rwstats reports the number of flows. Using the --bytes option, I can override this, forcing it to aggregate and report based on the number of bytes.

172.28.10.5 sends the greatest number of bytes.
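A minimal sketch of the byte-based ranking, under the same selection assumptions as before. It uses the --values=bytes form of the switch, which should be equivalent to the --bytes option named above on SiLK 3.x; the function name is hypothetical.

```shell
# Top 10 source addresses ranked by bytes sent instead of flow count.
# Assumption: --type=all with --protocol=0-255 selects every flow in the date range.
top_sources_by_bytes() {
  rwfilter --start-date=2022/01/01 --end-date=2022/07/08 \
           --type=all --protocol=0-255 --pass=stdout \
    | rwstats --fields=sip --values=bytes --count=10
}
```

The Bytes column in the first row of the output answers the "how many bytes" part of the question.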
3. The last answer reveals that an internal host appears to be the source of the greatest number of bytes sent. Where were those bytes sent?
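One way to approach this, sketched under the assumption that --type=all covers the relevant flows: filter on the source address and let rwstats bin by destination address. The function name and its single argument are hypothetical.

```shell
# Top 10 destinations, by bytes, for flows sent by the source address in $1.
top_destinations_by_bytes() {
  rwfilter --start-date=2022/01/01 --end-date=2022/07/08 \
           --type=all --saddress="$1" --pass=stdout \
    | rwstats --fields=dip --values=bytes --count=10
}
```

For this question the call would be: top_destinations_by_bytes 172.28.10.5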

4. Let’s narrow this data down even further. We now know which source sends the most data and to which destination that data is sent. Which protocols are used? Which port numbers?
Use the SiLK tools to identify the top 10 destination ports and protocols, based on the number of bytes, used in flows originating from 172.28.10.5 and going to 10.200.223.6.
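A possible form of this query, assuming --type=all is appropriate for the repository: restrict rwfilter to the host pair and use a two-field key in rwstats so each bin is a destination port and protocol combination. The function name is hypothetical.

```shell
# Top 10 (destination port, protocol) pairs by bytes for the given host pair.
top_dports_by_bytes() {
  rwfilter --start-date=2022/01/01 --end-date=2022/07/08 \
           --type=all --saddress=172.28.10.5 --daddress=10.200.223.6 \
           --pass=stdout \
    | rwstats --fields=dport,protocol --values=bytes --count=10
}
```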

5. Consider the output of the last solution. Is 172.28.10.5 likely to be the client or the server in these flows? Since the destination port on 10.200.223.6 is in the 44,000 range, it is most likely an ephemeral port, which suggests that 172.28.10.5 is the server, not the client. This should make you wonder what the source port for these flows is and what the top ten ports and protocols look like when we view 172.28.10.5 as a server.
Since we are sure you are as curious about this as we are, please adjust your SiLK commands so that you are examining the top ten (based on bytes) source ports and protocols used by the source host 172.28.10.5 when 10.200.223.6 is the destination. What do you find?
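The adjustment amounts to swapping the port field in the rwstats key. A sketch, again assuming --type=all and using a hypothetical function name:

```shell
# Top 10 (source port, protocol) pairs by bytes for the same host pair,
# which shows the ports 172.28.10.5 is sending from.
top_sports_by_bytes() {
  rwfilter --start-date=2022/01/01 --end-date=2022/07/08 \
           --type=all --saddress=172.28.10.5 --daddress=10.200.223.6 \
           --pass=stdout \
    | rwstats --fields=sport,protocol --values=bytes --count=10
}
```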

6. Look carefully at the output from the commands in parts 4 and 5. What conclusion or conclusions can you draw about the port 22 activity? You may wish to run additional queries to identify the duration of one or more of these connections.
The host 172.28.10.5 is running a service on TCP port 22, probably SSH. The host 10.200.223.6 uses this service to establish a number of connections and transfer a large amount of data over them.
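To look at the duration of these connections, one option is to print the individual records with rwcut rather than aggregating them. This is only a sketch: the function name is hypothetical, and the choice of output fields is an assumption about what is useful to see.

```shell
# List the TCP port 22 flows from 172.28.10.5 to 10.200.223.6 with their
# start time, duration, and byte count.
ssh_flow_durations() {
  rwfilter --start-date=2022/01/01 --end-date=2022/07/08 \
           --type=all --protocol=6 --sport=22 \
           --saddress=172.28.10.5 --daddress=10.200.223.6 \
           --pass=stdout \
    | rwcut --fields=sip,dip,sport,dport,stime,dur,bytes
}
```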
Exercise 2
It is important to be able to reconstruct a complete session from NetFlow data. A single session may be composed of a series of NetFlow records. In this exercise we will examine how to find and reassemble these pieces.
1. Please begin by using rwfilter and rwstats to find the top ten (by bytes) outbound TCP connections from the 172.16.0.0/16 network between May 1, 2019 and May 4, 2019. Which source host sends the greatest number of bytes to which external destination host? (An external host, in this case, is one whose address is not in the 10/8, 192.168/16, or 172.16/12 networks.)
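A sketch of one way to express this query. It assumes --type=all is acceptable and relies on address filters, not on flow types, to define "outbound"; the function name is hypothetical.

```shell
# Top 10 (source, destination) pairs by bytes for TCP flows leaving
# 172.16.0.0/16 toward addresses outside the private ranges.
top_outbound_tcp_pairs() {
  rwfilter --start-date=2019/05/01 --end-date=2019/05/04 \
           --type=all --protocol=6 \
           --scidr=172.16.0.0/16 \
           --not-dcidr=10.0.0.0/8,192.168.0.0/16,172.16.0.0/12 \
           --pass=stdout \
    | rwstats --fields=sip,dip --values=bytes --count=10
}
```

Using sip,dip as the key gives one bin per host pair, which matches the "which source to which destination" wording of the question.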

2. We can see that the largest number of bytes transferred is between the internal host 172.16.20.14 and the external host 52.223.227.117. Extract the TCP flow records between these two hosts using 172.16.20.14 as the source and 52.223.227.117 as the destination within the same time range, displaying the source IP, destination IP, source port, destination port, and flow duration.
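The extraction could look like the following sketch, which assumes --type=all and uses a hypothetical function name. rwcut prints one row per flow record with the requested columns.

```shell
# Print the TCP flow records from 172.16.20.14 to 52.223.227.117.
host_pair_flows() {
  rwfilter --start-date=2019/05/01 --end-date=2019/05/04 \
           --type=all --protocol=6 \
           --saddress=172.16.20.14 --daddress=52.223.227.117 \
           --pass=stdout \
    | rwcut --fields=sip,dip,sport,dport,dur
}
```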

3. Consider the output from the last step, in particular the first seven rows. All of them have the same source and destination ports, and all but the last have the same duration. From that duration we can infer that the NetFlow sensor generating the flow information in the repository is most likely configured with a refresh interval (active timeout) of 1,800 seconds. Because all but the last flow are right at this threshold and the source and destination ports do not change, it seems reasonable that these are all flows from the same session.
Please extract all of the flow records related to this specific connection. Your output should include the source, destination, session flags, initial flags, and the number of bytes transferred.
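A sketch of the narrowed query. The function name is hypothetical, and the two port numbers are deliberately left as arguments: take them from the output of the previous step. Sorting by start time first keeps the pieces of the session in order.

```shell
# Print the records of one session between the two hosts.
# $1 = source port, $2 = destination port (read them from the previous output).
session_flows() {
  rwfilter --start-date=2019/05/01 --end-date=2019/05/04 \
           --type=all --protocol=6 \
           --saddress=172.16.20.14 --daddress=52.223.227.117 \
           --sport="$1" --dport="$2" \
           --pass=stdout \
    | rwsort --fields=stime \
    | rwcut --fields=sip,dip,sessionflags,initialflags,bytes
}
```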

4. According to the NetFlow repository, how many bytes, in total, were transferred between these two hosts in this session?
The first record indicates that host 172.16.20.14 initiated the connection by sending the initial SYN. The connection spans seven flow records, and the last record has the FIN flag set in its session flags. This implies that we have all of the information about the entire session.
To determine the total number of bytes transferred, I just need to sum the bytes column:
4513670 + 4542623 + 4510972 + 4512661 + 4536906 + 4534209 + 4272324 = 31,423,365 bytes
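Re-adding the seven byte counts in the shell is a quick check on the hand arithmetic. The variable name is arbitrary.

```shell
# Sum the bytes column of the seven flow records listed above.
total=$((4513670 + 4542623 + 4510972 + 4512661 + 4536906 + 4534209 + 4272324))
echo "$total"
```

As an alternative, rwuniq with --fields=sip,dip,sport,dport and --values=bytes should be able to produce the same total directly from the session's flow records, assuming those options are available in the installed SiLK version.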