




For my cybersecurity class project, I decided to set up a honeypot on a Raspberry Pi 5. This document chronicles my entire journey – the successes, the mistakes, and everything I learned along the way. Honestly, it was more challenging than I expected, but also really rewarding when everything finally worked!
What I Used for This Project
Hardware: Raspberry Pi 5 with 8GB RAM (borrowed from the lab)
OS: Raspberry Pi OS Lite (64-bit) – went with the lite version to save resources
Storage: 32GB Class 10 MicroSD Card
Network: My home network with an AT&T BGW320-500 router
Setting Up the Raspberry Pi
I followed this YouTube video: “CanaKit Raspberry Pi 5 8GB Starter Kit [Turbine] – Setup Guide” because I had never set up a Pi before and wanted to make sure I didn’t break anything.
Preparing the SD Card
Since I needed a completely fresh setup for this project, I started from scratch:
Downloaded the Raspberry Pi Imager from the official website
Installed it on my laptop and got everything ready
Put the MicroSD card into the USB reader and connected it
Used the imager to set everything up:
Selected “Raspberry Pi 5” as my device
Chose “Raspberry Pi OS Lite (64-bit)” since I didn’t need the desktop
Selected my SD card for storage
Hit “Write” and waited – it took about 15 minutes.
I finally ejected the card when it was done.
Installing the Heat Sinks
This part made me nervous because I’d never done hardware stuff like this before.
What I learned: Take your time with this step! The adhesive is really strong and you only get one shot.
I cleaned the main CPU chip with isopropyl alcohol (used a cotton swab)
I carefully peeled off the adhesive backing from the heat sinks
I placed the biggest heat sink on the main chip – held my breath the whole time
Pressed down firmly for about 15 seconds
Putting Together the Case
This was actually the easiest part. The case design is really well thought out.
I put the Pi board in the bottom piece, making sure everything lined up
I connected the cooling fan to the fan header on the Pi.
I positioned the fan in the top part of the case
I snapped everything together. No screws were needed.
SD Card Installation
My mistake: I put the SD card in upside down the first time and was trying to force it.
The SD card slot is on the bottom of the Pi. I flipped it over, made sure the label was facing up, and gently pushed it in until it clicked.
First Boot
I connected my keyboard and my monitor, plugged in the power supply, and connected everything to my router with an Ethernet cable. When I powered it on, the boot screen came up pretty quickly – faster than I expected.
The setup wizard was straightforward. I picked my country, created my username and password, and skipped the WiFi setup since I was using Ethernet. Then I ran the update commands:
sudo apt update
sudo apt -uy dist-upgrade
This took about 10 minutes on my connection. After that, I rebooted with sudo reboot and was ready for the honeypot installation.
Installing the Honeypot
For this part, I followed Dr. Ulrich’s YouTube video starting from the “First Connect to Pi” section. This was where things got really interesting (and challenging).
Getting the System Ready
First, I needed to make sure I could use the whole SD card and had all the tools I needed:
sudo raspi-config --expand-rootfs
Then I realized Git wasn’t installed by default, so I had to add it:
sudo apt -y install git
I created a directory called “Install” and went into it to start the real work.
Setting Up DShield
This is where the actual honeypot magic happens:
Cloned the DShield repository:
git clone https://github.com/DShield-ISC/dshield.git
Ran the installation script:
cd dshield/bin
sudo ./install.sh
Went through a bunch of dialog boxes – I just followed the video recommendations
Cowrie got installed automatically.
Connecting to ISC
I had to create an account on the ISC website to get an API key. Once I had that, I used my email and the key to authenticate my honeypot to their system. Then I went through checking all the configuration parameters to make sure everything was set up correctly.
I was then instructed to run a status command to make sure that everything was working properly, and this is when I ran into a couple of problems:
Problem #1: ISC-Server Wouldn’t Start
When I tested the honeypot status, the isc-server showed as “not running” and I had no idea why.
I was pretty frustrated at this point, but I found Guy Bruneau’s GitHub troubleshooting guide which saved me. The issue was that a log file was missing:
sudo touch /var/log/dshield.log
sudo chown syslog:adm /var/log/dshield.log
Then I checked if the service was running:
sudo systemctl status isc-agent
It still wasn’t working, so I manually started it:
bashsudo systemctl start isc-agent
This finally got the service running properly.
Problem #2: Nobody Could See My Honeypot
The honeypot was running, but it wasn’t exposed to the internet so no one could find it.
The people on the Slack channel told me I needed to set up port forwarding on my router. This was totally new to me, but I figured it out using the help of Claude:
I logged into my AT&T router’s web interface
I found the “NAT Gaming” section (took me a while to find this)
I set up port forwarding rules to redirect these ports to port 8000:
Port 80 (for web traffic)
Port 8080 (alternative web port)
Port 7547 (for CWMP)
Port 5555 (for personal agent)
Port 9000 (for SonarQube)
I applied all the changes and crossed my fingers
I waited for a couple of hours for everything to start working properly. This was the hardest part – just waiting and hoping I did it right.
After the waiting period, all the logs started populating and everything was working perfectly.
The SEC450 CTF network consisted of a simulated mixed Windows Active Directory and Linux server environment. There were 3 subnets with machines:
10.0.1.0/24 – Internal Servers (Active Directory Domain Controller, File share server)
10.0.2.0/24 – User devices (5 User laptops)
10.0.3.0/24 – DMZ (one Linux web server)
DNS Concepts 1

If the IP address is an IPv4 address, then this is an A query.
DNS Concepts 2

For a PTR record lookup of 8.8.4.4, the hostname used in the query would be:
4.4.8.8.in-addr.arpa.
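The PTR query name can be derived mechanically: reverse the four octets and append in-addr.arpa. A small shell sketch (the function name is mine):

```shell
# Build the reverse-DNS (PTR) query name for an IPv4 address:
# reverse the octets and append ".in-addr.arpa."
ptr_name() {
  local IFS=.
  set -- $1                      # split "8.8.4.4" into 8 8 4 4
  echo "$4.$3.$2.$1.in-addr.arpa."
}
ptr_name 8.8.4.4                 # prints 4.4.8.8.in-addr.arpa.
```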
DNS Concepts 3

That’s an SRV record (Service Record) query, where _sip is the symbolic name of the service, _tcp is the transport protocol, and mycompany.com is the domain name.
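The underscore-prefixed labels make SRV names easy to decompose with plain parameter expansion; a quick sketch:

```shell
# Split an SRV query name into its service, protocol, and domain labels.
name="_sip._tcp.mycompany.com"
service=${name%%.*}   # _sip  (service, with leading underscore)
rest=${name#*.}
proto=${rest%%.*}     # _tcp  (transport protocol)
domain=${rest#*.}     # mycompany.com
echo "$service $proto $domain"
```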
DNS Concepts 4

We can use the below command:

We find the below answers:

DNS Concepts 5

To find SPF records for admin@mail.sec450.com, I need to query the TXT records for the domain part of the email address (mail.sec450.com):

DNS Logs 1

We can filter the Bro-DNS dashboard for A record requests only:

DNS Logs 2

Using the same dashboard as above, we can easily find the client within the SEC450 domain that was the source of the highest count of DNS requests:

DNS Logs 3

We filter out DNS query types that are not A or CNAME queries and we also filter out the DC IP address:

We find the following list of external DNS servers that were queried:

DNS Logs 4

IDN domains use Punycode encoding – they start with xn-- when encoded. Filtering for queries starting with xn--, we find the following query:
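Outside the dashboard, the same filter is just a prefix match on the first label; a quick sketch (the sample domains are made up):

```shell
# IDN labels are encoded with Punycode and carry the "xn--" ACE prefix.
for q in example.com xn--fake-label.com mail.sec450.com; do
  case "$q" in
    xn--*) echo "$q: Punycode/IDN label" ;;
    *)     echo "$q: plain ASCII label" ;;
  esac
done
```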

HTTP Interpretation 1

The response code from the server is 200 OK, which indicates a successful request.
HTTP Interpretation 2

The answer can be found in the User-Agent section of the Request.

The User-Agent string shows “Firefox/101.0” which indicates Firefox version 101.0. The rest of the string provides additional system information – it’s running on Windows NT 10.0 (Windows 10) on a 64-bit architecture, with the Gecko rendering engine version 67.0.
HTTP Interpretation 3


The Server header displays the web server software used to provide the response, along with its version number:

Here, the software name and version number are not shared, which is unusual – this header typically reports something like Apache or nginx with a version number.
HTTP Logs 1.1

Using the Bro-HTTP dashboard in Opensearch, we can quickly filter for all the source IP addresses in the sec450.com domain. We can then look for the common User-Agent used by these addresses:

HTTP Logs 1.2

The CIDR for the DMZ subnet is 10.0.3.0/24
I can search for it in the Bro-HTTP dashboard:

I can then filter for it as the destination IP address:

We can filter for POST requests only:

We then find the result we were looking for:

HTTP Logs 1.3

Web scanning often creates a high volume of requests.
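That intuition can also be checked offline; a hedged sketch that counts requests per source IP from a simplified, invented http.log extract (real Zeek logs have many more columns):

```shell
# Count HTTP requests per source IP; scanners dominate the top of the list.
printf '%s\n' \
  '192.165.1.156 /cgi-bin/test' \
  '192.165.1.156 /admin/' \
  '192.165.1.156 /phpmyadmin/' \
  '10.0.2.5 /index.html' |
awk '{ n[$1]++ } END { for (ip in n) print n[ip], ip }' | sort -rn
# top line: 3 192.165.1.156
```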


The IP address doing the scanning is 192.165.1.156 and it is using a tool called Nikto which is a popular open-source tool used for web server scanning to identify potential security vulnerabilities, misconfigurations, and dangerous files/programs.
HTTP Logs 1.4

OpenSearch has a NIDS dashboard that may be useful. We can filter for the source IP address that was scanning the DMZ server in the last question:

We see that there are 86,312 alerts tied to this address.
The name of the most common alert is:

HTTP Logs 1.5

A brute force attack typically involves automated attempts to guess credentials by systematically trying different username/password combinations. We expect to see hundreds or thousands of requests to login endpoints in a short timeframe, requests coming from the same IP address (or a few IPs), different username/password combinations being attempted and a high frequency of HTTP 401/403 responses (authentication failures).
The primary HTTP method used for login attempts is the POST method.
Based on these general observations, we are going to use our Bro-HTTP dashboard and filter for POST requests that generated an “Unauthorized” status message:
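As a rough command-line analogue of that dashboard filter, assuming a simplified extract with source IP, method, and status columns (the sample data is invented):

```shell
# Count POST requests that drew a 401 response, per source IP.
printf '%s\n' \
  '10.0.2.5 POST 401' \
  '10.0.2.5 POST 401' \
  '10.0.3.7 GET 200' |
awk '$2 == "POST" && $3 == 401 { c[$1]++ } END { for (ip in c) print ip, c[ip] }'
# prints: 10.0.2.5 2
```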

The source IP address generating these messages is:

HTTP Logs 1.6

This IP address never generated anything but “Unauthorized” status messages from the web server, so we can safely assume the attacker never guessed their way into the site:

HTTP Phishing 1

Using the Visualization tab and the NIDS-Alert Summary visualization, we quickly find the below alert:

HTTP Phishing 2

To find the hostname, I used a DHCP visualization where I was able to map the source IP to a hostname (LPT05)

To map this hostname to a username, I used the Sysmon-logs dashboard, where I sorted by hostname and looked for the most frequent username associated with the hostname (link):

HTTP Phishing 3

We can use the Bro-HTTP dashboard and filter for POST methods that resulted in a successful connection:

There is only one connection associated with this filter, and we can easily retrieve the domain name and URI for it in the dashboard:


HTTP Phishing 4

Using Wireshark, we can filter for the POST request:

Looking in the HTML section of this packet, we can find the username and password used to authenticate:

HTTP Phishing 5

Looking at the first GET request of this HTTP session, we find the lookalike domain that led to the phishing site:

HTTP Phishing 6

Based on the question, I am looking for DNS queries that occur between the initial page load (ducussign.com) and the POST submission. More specifically I am looking for common redirect services: bit.ly, tinyurl.com, rebrandly.com, goo.gl, t.co, ow.ly, etc.
In Wireshark, I used a filter for DNS A records and PTR records that might reveal service domains. I looked specifically at the packets between the initial page load and the POST request:

I find a DNS request for bit.ly pretty quickly.
HTTP Phishing 7

I used a filter in Wireshark that searches through HTTP traffic for specific text (bit.ly):

Looking at the packet content, I quickly find the link:

HTTP Phishing 8

We already know that the IP address for the phishing site is 199.192.19.138. We use it as a filter in the NIDS dashboard:

3 IDS alerts are associated with this address:

Looking at the logs, we can find the SID for each one of these alerts:



TLS 1.3

I used the SSL dashboard to solve this question and applied the TLSv1.3 filter:

Looking at the logs, the organization these SSL connections are all associated with is Google:

Let’s Encrypt!

We used the X.509 – Certificate Subject dashboard and filtered for the Let’s Encrypt service.
We see six unique domain names that were visited from the sec450.com network:

IDS meets encryption

We are looking for encrypted traffic to a web-based service therefore it makes sense to filter our NIDS dashboard using port 443:

There are only 3 total alerts generated using this filter, and they are all under the same alert name:

How much data?

Using the Bro-Connections dashboard, we filter it using the source IP address found in the last question:

We see 4 connections and looking at the detailed logs for each one of them we can find the total number of bytes that was transferred:


We find a total of 28KB, which is not a number that points to the exfiltration of a large database.
Email Analysis 1.1

Looking at the content of this email:

We find that the email title is:

Email Analysis 1.2

To find this IP address, we have to look at the headers and remember they are listed in reverse chronological order, from newest to oldest. We see that the address that passed this email to the Gmail infrastructure is:

Email Analysis 1.3

I mentioned in the last question that email headers are in reverse chronological order; therefore, the solution to this question is located in the first “Received: from” headers at the bottom.
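Because each mail server prepends its own Received: header, the top line is the hop nearest the recipient and the bottom line is the origin. A tiny illustration with invented headers:

```shell
# Newest hop first, origin hop last.
headers='Received: from mx.google.com (209.85.1.1)
Received: from relay.example.net (198.51.100.7)
Received: from sender-pc (192.0.2.9)'
printf '%s\n' "$headers" | head -n1   # hop closest to the recipient
printf '%s\n' "$headers" | tail -n1   # first hop, closest to the sender
```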

Email Analysis 1.4

We can easily find this information in the “From” header:

Email Analysis 1.5

To answer this question, we have to look for the Return-Path header:

Email Analysis 1.6

Looking at the headers, we can see that both SPF and DKIM passed but there is no line explicitly showing that the DMARC check passed:

Email Analysis 1.7

Nothing appears to be malicious about this email based on what we have seen so far in our analysis from questions 1.1 to 1.7.
Email Analysis 2.1

We are moving on to analyze a different email:


Email Analysis 2.2


Email Analysis 2.3


Email Analysis 2.4

The SPF check failed:

Neither the DKIM nor the DMARC check results are shown in the headers.
Email Analysis 2.5

The SPF check failed, therefore this email does appear to have been spoofed. The domain of info@mail.com does not designate the IP address 170.210.54.131 as a permitted sender, so we can safely conclude that this email address was spoofed.
Email Analysis 2.6

Starting from the bottom of the headers, we find three hostnames with DNS entries that the email has passed through:


Email Analysis 2.7

Similar to the last question, we start at the bottom and write down all the IP addresses the mail passed through, ignoring the X-Received line:
10.7.155.185, 129.205.112.156, 172.20.4.2, 170.210.54.131, 2002:a6b:5a15:0:0:0:0:0
Email Analysis 2.8

I went to VirusTotal to solve this question. The country is Nigeria.

Rogue Device

Which logs would report the hostnames seen on the network? I used the DHCP dashboard, as it reports all the hostnames and associated IP addresses seen on the network. I then did a long-tail analysis of all these hostnames, and one of them stood out: it only had two log counts, which was odd and worth investigating:

Looking at the DHCP process for this address, we can see that it looks very odd compared to how the other addresses on the network interacted with the DHCP server.
We see two connections made to the server over a two-minute span:

The second one is the odd one, as the Offer and Ack steps normally emanate from the DHCP server itself, not from a system trying to get an IP address assigned. Looking at the logs, we find this system's hostname and IP address:

SSH Outbound

Looking at the SSH dashboard, we quickly find an SSH connection made to port 2222:

Filtering for this one connection, we quickly find the source and destination IP addresses:

An Attempt Was Made…

We know that the rogue device IP address is 10.0.2.18. Looking at our Connections dashboard, we are going to filter for this specific address. We then filter the dashboard for connections made to port 445. We see a total of 8 connections, all made to the same destination IP address over port 445:

This dashboard does not indicate the hostname or the username used to connect, though. We see that NTLM was the authentication protocol used, therefore it's probably a good idea to look at the NTLM dashboard hoping to find more info there. We quickly find the username in these NTLM logs:

Looking at the NTLM logs we can also find the name of the asset they tried to connect to:

Remote Administration

I first looked at the RDP dashboard, but there was no connection logged. I then looked at the Connections dashboard for connections to port 5985 (WinRM over HTTP) and found 6 connections:

These connections are all between the same two IP addresses:

A Virus You Say?

We know that the event number in Windows Defender corresponding to a malware detection is 1116. Using the Beats dashboard and filtering by this Event ID, we find the hostname as well as the threat name:


USB Device

Windows event 6416 is a plug-and-play event: it gets triggered every time a plug-and-play device is inserted into the system. Still using the Beats dashboard and filtering by this event number, we can see that this event was triggered 51 times.

We see that plug-and-play devices were inserted into 6 different machines.

We are looking for the insertion of a mass storage device. Each log gives the specific description and ID of the device that was inserted. We find our culprit in the second log:

File Sharing is Caring

Still using the Beats Dashboard, we filter for Event ID 4624 (successful logins) to the file share SRV02. There are 159 successful connections to this file share:

Looking at the logs, we find 3 usernames connected to these connections:
Mario, Dkong and Kirby.
Let Me In!

What Is It?

Looking at the USB Drive question log, we find the following line with the Vendor ID and Product ID for this USB device:

Searching for this VID and PID on the internet, we find that it is a flash drive made by Silicon Motion.
Love Letter 1

First, we generate the MD5 hash:

We then enter this file hash in VirusTotal and we look for the name given to it by Symantec:

Love Letter 2

We are looking for an executable file therefore we can assume that the string .exe will show up somewhere in our text file. We use the below command to find it:

We quickly find the URL using this technique.
Love Letter 3

Looking at the malware code, I can identify the non-HTTP protocol by examining the infectfiles() subroutine.
The code specifically checks for mIRC-related files:

When it finds these files, it creates a script.ini file that automatically executes IRC commands:

This script automatically sends the malicious file to anyone who joins an IRC channel where the infected user is present.
mIRC is an Internet Relay Chat client, and IRC has historically been a popular protocol for botnet command and control because it allows real-time communication with multiple infected machines through chat channels.
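For reference, the dropped script.ini looks roughly like this (reconstructed from public write-ups of the ILOVEYOU worm; treat the exact lines as illustrative, not as the file recovered in this lab):

```ini
[script]
n0=on 1:JOIN:#:{
n1=  /if ( $nick == $me ) { halt }
n2=  /.dcc send $nick LOVE-LETTER-FOR-YOU.HTM
n3=}
```

The on JOIN handler fires whenever someone enters a channel the infected user is in and DCC-sends them the malicious HTML file.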
Love Letter 4


Secure Document 1

We use the strings command to scrape out the URL without actually opening the file, and we grep for http to quickly find it.

Secure Document 2

We can use a simple command to find the metadata for this file:

Secure Document 3

The data was already contained in our previous answer:

Secure Document 4

Using the website viewdns.info/iphistory, I entered the domain name globalsmedical.com and searched for the IP address that this domain was pointing to at the time the document claimed to be made (January 26th, 2017):

Objectives
This exercise introduces you to a machine learning/AI pipeline solution that pushes data from Zeek through an AI model to produce alerts about network activity.
Details
This lab will require us to start several SSH connections to the virtual machine. One of these will run Zeek, monitoring the loopback interface. Another will run a Python script that will load a trained AI model and use it to generate alerts regarding network protocols that are present. The last connection will be used to replay packets over loopback so that Zeek has something to look at.
To begin with, we need to get three separate command lines established to the VM.
1. Use tmux to split your current connection into at least three panes. Use the cd command to change into the /sec503/Exercises/Day5/ai directory in each session.
2. The lab directory contains a classify.zeek script. This script will push the first bytes in every network stream into a Broker channel named /sec503/content. To use it, we need to start Zeek, ask it to listen on the lo or loopback interface, and configure it to run this script. When we run Zeek on loopback, we will also see warnings related to checksums. While we would never do so in production, we will tell Zeek to ignore checksums while running in this lab. Please execute the following command as root in one of the sessions:
3. Now that Zeek is running, we can use another one of our sessions to connect the AI classifier to the Broker channel. We will do this using the classify.py script in the lab directory. Please run it as follows:

4. Our final task is to send data over the loopback interface so that Zeek can see it, relay the sessions to the classifier, and the classifier can report what it is seeing. To do this, you will use your third session. This session must be running as root.

5. Observe the output in the session that is running the classification script:

While there are some protocols being misclassified, overall this tool is doing an excellent job identifying known protocols.
During class we discovered that there was unusual activity on January 2, 2021. This leads to several important questions. Which way was the data moving? What does the data appear to be? Is this likely data exfiltration? We will answer these questions in this lab.
In the course book, we saw that the anomalous activity occurs on Saturday, January 2, 2021.

If I scroll down to the January 2, 2021 line, I can easily see that there was an abnormal number of bytes transferred on this day, confirming our assumption:

2. Now that you have confirmed what the tool reported, it’s time to drill into January 2, 2021. Identify when the greatest amount of data is seen moving on the network. During which hour does the greatest amount of data move?
I just need to modify my last command by changing the start and end dates in rwfilter and the bin size in rwcount:

3. If you review the data that you find in step 2, you should see that the large amounts of data are being sent very early in the day. Please check the 24-hour period from 12:00 PM January 1 through 12:00 PM January 2 to check if the data flows begin on January 1.
I can specify a starting and ending hour on top of my dates in rwfilter:

This confirms that the large amount of data being sent started right after midnight on January 2nd.
4. Satisfied that the large data transfer(s) are occurring on January 2, it’s time to examine what’s happening on January 2.
Examine the data between 00:00 and 12:00 on January 2. Which IP protocols are present and which of those transfers the greatest number of bytes?

Out of the 4 protocols transferring data on that day, TCP (protocol 6) is the one transferring the greatest number of bytes.
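For readability, the protocol numbers reported by SiLK can be mapped back to names (a tiny lookup covering common IANA values; the helper name is mine):

```shell
# Translate IANA IP protocol numbers into familiar names.
proto_name() {
  case "$1" in
    1)  echo ICMP ;;
    6)  echo TCP ;;
    17) echo UDP ;;
    *)  echo "protocol $1" ;;
  esac
}
proto_name 6    # prints TCP
```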
5. The next task is to find the connection or connections involved. Using SiLK, determine which connection or connections are most likely creating this massive spike. You might find it most interesting to look at the top 20 connections.

6. Examine the connections between 172.217.10.74 and 192.168.2.163. Also examine the records between 172.217.10.10 and 192.168.2.163. Answer the following questions:


The first thing that I noticed was that the communication between these hosts – more than 95% and 92% of it, respectively – is over port 443.
• Is the data moving in or out of the enterprise?
• What seems to be the purpose of the communication?
whois lookups on 172.217.10.74 and 172.217.10.10 show that the addresses are assigned to Google. It could mean that somebody is streaming YouTube videos from inside the network, but it could also be some type of Google Drive synchronization.
• Does this appear to be something automated or something human driven?
Details
In this section you will experiment with the rwcount, rwstats, and rwuniq tools. The goal is to understand how these tools function and examine how they can be used to answer important questions that an analyst will ask when researching a network, investigating network activities, or engaging in threat hunting.
Exercise 1
The rwstats tool is used to aggregate information about a collection of flows according to a user specified aggregation criteria. Working with the data in this way can take some getting used to and there are some pitfalls to watch out for. All of the questions in this exercise make use of the SiLK repository on the VM in the date range from January 1, 2022 through July 8, 2022.

I begin with the rwfilter tool to query the repository for all flows in the date range. All of the flows that pass this filter are then processed by rwstats. Using the --fields argument, I can specify that the flows should be aggregated into bins for each unique source IP address. The IP address we are looking for is 10.200.223.4.
2. In the last question, it appears that an internal address appears in the greatest number of flows. While this is interesting, a more important question is likely, “Which source address sends the greatest number of bytes?” In fact, see if you can answer that question now: Which source IP address sends the greatest number of bytes and how many bytes does it send?
By default, rwstats reports the number of flows. Using the --bytes option, I can override this, forcing it to aggregate and report based on the number of bytes.

172.28.10.5 sends the greatest number of bytes.
3. The last answer reveals that an internal host appears to be the source of the greatest number of bytes sent. Where were those bytes sent?

4. Let’s narrow this data down even further. We now know which source sends the most data and to which destination that data is sent. Which protocols are used? Which port numbers?
Use the SiLK tools to identify the top 10 destination ports and protocols, based on the number of bytes, used in flows originating from 172.28.10.5 and going to 10.200.223.6.

5. Consider the output of the last solution. Is 172.28.10.5 likely to be the client or the server in these flows? Since the destination port on 10.200.223.6 is around the 44,000 range, it is most likely an ephemeral port. Since this is true, it is most likely that 172.28.10.5 is the server, not the client. This should make you wonder what the source port for these flows are and, possibly, what the top ten ports and protocols looks like when we view 172.28.10.5 as a server.
Since we are sure you are as curious about this as we are, please adjust your SiLK commands so that you are examining the top ten (based on bytes) source ports and protocols used by the source host 172.28.10.5 when 10.200.223.6 is the destination. What do you find?
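The port-range reasoning above can be captured in a one-line heuristic (the 32768 cutoff is the default Linux ephemeral-range floor; an assumption, since other stacks use different ranges):

```shell
# Crude classifier: ports at or above the Linux default ephemeral floor.
is_ephemeral() {
  if [ "$1" -ge 32768 ]; then echo ephemeral; else echo "well-known/registered"; fi
}
is_ephemeral 44012   # prints ephemeral
is_ephemeral 22      # prints well-known/registered
```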

6. Look carefully at the output from the commands in parts 4 and 5. What conclusion or conclusions can you draw about the port 22 activity? You may wish to run additional queries to identify the duration of one or more of these connections.
The host 172.28.10.5 is running a service on TCP port 22, probably an SSH service. The host 10.200.223.6 uses this service to establish a number of connections and transfer a large amount of data.
It is important that we are able to reconstruct a complete session out of NetFlow data. We know that a single session may be comprised of a series of NetFlow records. In this exercise we will examine how to find and reassemble these pieces.
1. Please begin by using rwfilter and rwstats to find the top ten (by bytes) outbound TCP connections from the 172.16.0.0/16 network between May 1, 2019 and May 4, 2019. Which source host sends the greatest number of bytes to which external destination host? (An external host, in this case, should not have an address in the 10/8, 192.168/16, or 172.16/12 networks)

2. We can see that the largest number of bytes transferred is between the internal host 172.16.20.14 and the external host 52.223.227.117. Extract the TCP flow records between these two hosts using 172.16.20.14 as the source and 52.223.227.117 as the destination within the same time range, displaying the source IP, destination IP, source port, destination port, and flow duration.

3. Consider the output from the last step. Please notice the first seven rows. All of these have the same source and destination port, in addition to having the same duration. Looking at the duration, we can derive that the NetFlow sensor that is generating this flow information in the repository is most likely configured to use a refresh interval of 1,800 seconds. Seeing that all but the last flow are right at this threshold and that the source and destination ports do not change, it seems reasonable that these are all flows from the same session.
Please extract all of the flow records related to this specific connection. Your output should include the source, destination, session flags, initial flags, and the number of bytes transferred.

4. According to the NetFlow repository, how many bytes, in total, were transferred between these two hosts in this session?
The first record indicates that host 172.16.20.14 initiated the connection, sending the initial SYN. We can see that this connection is seen in seven time intervals. The last record has the FIN session flag set. This implies that we have all of the information about this entire session.
To determine the total number of bytes transferred, I just need to sum the bytes column:
4513670 + 4542623 + 4510972 + 4512661 + 4536906 + 4534209 + 4272324 = 31,423,365 bytes
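To avoid arithmetic slips, the column sum can be recomputed with shell arithmetic:

```shell
# Total bytes across the seven flow records of this session.
echo $(( 4513670 + 4542623 + 4510972 + 4512661 + 4536906 + 4534209 + 4272324 ))
```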
Details
This section of exercises allows you to explore the use of SiLK with a NetFlow repository rather than using files generated from packet capture files. Using SiLK with packet captures is very useful during an incident response if a NetFlow repository isn’t available, but during normal day-to-day operations, you would typically use SiLK with a repository.
In our collective experience, even though NetFlow is generally already supported by the switches, routers, and other network devices that enterprises have installed, it is rare to find that an enterprise has a NetFlow repository configured unless they have a fairly knowledgeable network engineering staff. It is even more rare to find that it is being used for any type of security analytics or to identify potential indicators of compromise. A NetFlow repository, therefore, is one of the easiest and least expensive changes that can be made to a network infrastructure that will immediately provide greater insight into how the network is used and assist to identify anomalous behavior.
Exercise 1
SiLK Repository
When using SiLK with a repository, you have the ability to retrieve results covering long periods of time, from specific sensors, and more. SiLK relies on configuration files in the /data directory to determine what the names of the sensors are, how data will be collected, etc., in addition to which fields are displayed by default when using rwcut. There is absolutely no need for you to make any changes or directly work with the files in this directory, but you are welcome to explore.
1. Please query the repository stored on the class VM to determine the total number of flows seen between October 1, 2018, and October 15, 2018. How many flows are there?
I am going to use the rwfilter tool. With it, I can specify partitioning criteria, such as the type of data to retrieve, sensors of interest, and time ranges of interest. I can also specify query criteria, such as the protocols of interest.
On top of that, I can leverage options like --print-statistics, which will give me the number of flows:

There are 4143675 flows in this time range.
2. Please query the repository for flows occurring between October 1, 2018, and October 15, 2018. How many TCP flows were logged?
I can use the same command as above to answer. All I have to modify is the --proto option, as TCP is protocol number 6:

There are 2997358 TCP flows logged.
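The values passed to --proto come from the IANA protocol-number registry; a quick reference for the ones that come up in these exercises:

```python
# Common IP protocol numbers (IANA "Assigned Internet Protocol Numbers"),
# as used with rwfilter's --proto option.
PROTO = {"icmp": 1, "tcp": 6, "udp": 17}

print(PROTO["tcp"])   # 6  -- the value used above for TCP flows
print(PROTO["udp"])   # 17
```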
Exercise 2
Since I am only interested in hosts seen establishing a connection, I must select all of the flows that begin with a SYN. I can use the rwfilter option called --flags-initial:

The output contains 3 lines that are not part of the listed flows. Accounting for those, wc -l quickly shows how many unique source hosts are seen:

We have 48 unique source hosts.
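The dedup-and-count step can be sketched in stdlib Python; the rwcut-style sample output below is hypothetical, standing in for the real pipeline:

```python
# Counting unique source hosts from rwcut-style output: skip the header
# line, strip the column delimiter, and deduplicate with a set.
output = """\
            sip|
     10.0.0.1|
     10.0.0.2|
     10.0.0.1|
"""
lines = output.splitlines()
hosts = {line.strip().rstrip("|").strip() for line in lines[1:] if line.strip()}
print(len(hosts))  # 2 unique source hosts in this sample
```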
Exercise 3
Let’s switch to the repository data that does not have all of the flags data present. While it isn’t convenient to work with this data, it is not unusual to have sensors that will not properly populate these fields. This makes it important to have familiarity with working with this type of data.
The data of interest covers dates from February 8, 2022 through July 3, 2022.

Notice the behavior of the source hosts and source ports, in addition to the number of packets seen. How would you characterize this? Do these appear to be “real” connection attempts, or some type of spoofed scanning behavior?
I observed that the source IP address 172.28.30.4 appears to be initiating connections to six different destination hosts, first targeting port 9573, then port 10001. The presence of packets with only the SYN flag set strongly suggests these are the initial steps of a TCP three-way handshake. Additionally, I noticed that the source port changes with each destination, which indicates the packets are likely not spoofed but instead generated by a legitimate IP stack initiating connections.
Each flow consists of two or three packets, which is important. If there were only one packet per flow, it might suggest scanning behavior. However, two to three packets usually point to actual connection attempts, possibly with some retries involved. Taking all of this into account, the evidence supports that 172.28.30.4 is most likely making genuine TCP connection attempts rather than performing a spoofed scan.
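The packet-count heuristic above can be sketched as a small classifier; the thresholds are my reading of the analysis, not a SiLK feature:

```python
def characterize(packets_per_flow: int) -> str:
    """Heuristic from the analysis: single-packet flows look like scanning,
    while two or three packets suggest real connection attempts (a SYN plus
    retries from a genuine IP stack)."""
    if packets_per_flow == 1:
        return "possible scan"
    if packets_per_flow in (2, 3):
        return "likely real connection attempt"
    return "established or other"

print(characterize(1))  # possible scan
print(characterize(3))  # likely real connection attempt
```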
2. Extract all of the records from this data that involve hosts 172.28.30.5 and 192.225.158.2 and examine the sip, sport, dip, dport, flags, and packets fields.

Using the --any-address option and chaining two rwfilter commands together allows me to extract the records involving just these two hosts.
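The effect of chaining two --any-address passes can be sketched as set logic over flow endpoints; the sample records below are hypothetical:

```python
# Keep only flows where BOTH hosts of interest appear as an endpoint,
# mirroring rwfilter --any-address=A | rwfilter --any-address=B.
records = [
    ("172.28.30.5", "192.225.158.2"),
    ("172.28.30.5", "10.0.0.9"),
    ("192.225.158.2", "172.28.30.5"),
]
wanted = {"172.28.30.5", "192.225.158.2"}
matches = [r for r in records if wanted <= set(r)]
print(len(matches))  # 2 of the 3 sample records involve both hosts
```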
3. Examine all of the flows between 192.225.158.2 and 172.28.30.5 and explain why two of the flows have no flags set.
If I add the protocol field to my rwcut command, I can see that the two flows with the SYN flag are TCP flows (protocol 6) whereas the flows without flags are UDP flows (protocol 17):

Exercise 1
The frame specifications are: source MAC aa:bb:cc:dd:ee:ff, destination MAC ff:ff:ff:ff:ff:ff, source IP 192.168.1.1, destination IP 192.168.1.2, and ICMP sequence number 234. I am going to use a tool called Scapy to complete this lab:

The first thing that I need to do is to create an Ethernet header and an IP header, assigning each to a variable:

Let’s now create the ICMP sequence number:

Now that all the required headers have been built, I can assemble the frame:

The ICMP echo request is now crafted.
2. Display the frame you just created.

3. Write the frame you created to the output pcap file named /tmp/icmp.pcap.

4. Use ssh to connect to the virtual machine in a second terminal window. In the new terminal, use tcpdump to examine the packet in /tmp/icmp.pcap to make sure that the frame you crafted matches the specifications detailed. With tcpdump, use either the -XX, -X, or -v option to show the link layer.

Exercise 2
1. Read the /tmp/icmp.pcap file that you just created in the previous exercise into a Scapy session, alter the ICMP sequence number, and write the result to /tmp/icmp2.pcap. Examine /tmp/icmp2.pcap in a different terminal (new or from the previous exercise) using tcpdump, supplying it the -vv option to verify that you crafted a valid record.
We read /tmp/icmp.pcap into a list named r:

Next, I extract the only record in the list (r[0]) and assign it a name of echoreq:

I assign a sequence number value of 4321 to the ICMP layer of echoreq and display it:

Scapy displays the ICMP sequence number in hex, so I can validate that 0x10e1 is equivalent to decimal 4321:
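The hex-to-decimal check can be confirmed directly in Python:

```python
# Scapy displays the sequence number in hex; confirm the conversion.
print(0x10e1)      # 4321
print(hex(4321))   # 0x10e1
```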

Next, I use wrpcap() to write echoreq to /tmp/icmp2.pcap and use tcpdump in verbose mode to read the record.


2. When you view the resulting packet in the new /tmp/icmp2.pcap file with tcpdump, you should be able to identify an obvious problem with the packet. What is it?
The checksum is corrupted.
3. Why did this happen?
I altered the ICMP sequence number value but did not have Scapy recompute the checksum afterward. The checksum is only recalculated when the frame is sent or stored to a pcap file, and only if the existing checksum value has been deleted first; since the record read from the pcap still carried its original checksum, the stale value was written out.
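The checksum Scapy computes here is the standard Internet checksum from RFC 1071. As a stdlib-only sketch (not part of the lab; the sample header bytes are illustrative), it works like this:

```python
def inet_checksum(data: bytes) -> int:
    """RFC 1071 Internet checksum: ones'-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length data
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Sample ICMP echo request header with the checksum field zeroed:
# type=8, code=0, checksum=0x0000, id=10, seq=100
header = b"\x08\x00\x00\x00\x00\x0a\x00\x64"
csum = inet_checksum(header)

# Property check: a header carrying its correct checksum verifies to 0,
# which is exactly what changing the seq field without recomputing breaks.
patched = header[:2] + csum.to_bytes(2, "big") + header[4:]
print(hex(csum), inet_checksum(patched))
```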
4. Correct the issue by altering the record that still exists in your Scapy interactive session and writing it out again to /tmp/icmp2.pcap.
I need to delete the checksum value from the ICMP header:

Now, I can write it out again

5. Rerun tcpdump to make sure the error was corrected

Exercise 3
Description: This exercise requires you to craft and send traffic using Scapy. Specifically, you craft an ICMP echo request in one Scapy interactive session, listen for it in another Scapy interactive session, and respond with a crafted ICMP echo reply from the second session.
You need to open three different ssh connections to the virtual machine for this. If you still have Scapy running from the previous exercises, using sudo scapy, this can be the first ssh connection.
In a second terminal, use tcpdump to sniff for the traffic you will craft and send from the Scapy sessions in the other two terminals. Unlike simply reading a pcap as we have been doing, sniffing traffic with tcpdump requires elevated privileges. As with Scapy, use sudo to elevate your privileges when running tcpdump to sniff traffic off an interface. The tcpdump command below disables DNS name resolution with the -n option, prints raw (epoch) timestamps with the -tt option, shows the ASCII payload with the -A option, and filters for ICMP traffic only. You do not need to specify the interface to sniff on if you are sniffing on the first Ethernet interface.

In the third ssh session, invoke a second Scapy interactive interface and prepare Scapy to sniff an ICMP echo request that you will send from the first Scapy session.
Scapy's sniff() function listens on a given interface for packets, and you can add BPF filters with the filter option. Run the below command in Scapy:


This is what I see in the tcpdump window:

2. Return to the Scapy interface that sniffed the packet. Display the received ICMP echo request to find the ICMP ID value of 10, displayed as 0xa, and the ICMP sequence number of 100, displayed as 0x64.

3. Continuing in the Scapy session, craft and send an appropriate ICMP reply. Make use of the ICMP echo request that Scapy captured, modifying fields as necessary. You should build a new IP header, but reuse the ICMP header and payload from the captured packet.
First, I need to create a new IP header and stack it with the captured ICMP request and payload:

Next, I need to set the source of this new IP packet to be whatever the destination address was in the request. I also need to set the destination address for this new IP packet to be the source of the captured request.

Finally, since I want to send an echo-reply, I need to set the ICMP type to be 0. I also need to delete the ICMP checksum value, which was copied from the original packet. I want Scapy to automatically recalculate this value so that a checksum error does not get generated.
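The three modifications above can be sketched without Scapy, using a plain dict as a stand-in for the captured packet (the field names are illustrative, not Scapy attribute names):

```python
# Stand-in for the captured echo request; values are illustrative.
request = {"src": "192.168.1.1", "dst": "192.168.1.2",
           "icmp_type": 8, "icmp_chksum": 0xF791}

reply = dict(request)
# Swap addresses: the reply goes back to whoever sent the request.
reply["src"], reply["dst"] = request["dst"], request["src"]
# Echo reply is ICMP type 0 (the request is type 8).
reply["icmp_type"] = 0
# Drop the stale checksum so it would be recomputed on send.
reply["icmp_chksum"] = None

print(reply["src"], reply["dst"], reply["icmp_type"])  # 192.168.1.2 192.168.1.1 0
```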

Now, I can send my packet

4. Verify that your crafted echo reply was properly sent by checking the tcpdump output from the other window.

Description:
Examine the TCP session between hosts 192.168.1.103 and 192.168.1.104. There is something that is nonstandard about this session. What is it, and why might it cause an IDS evasion?

In packet 64, the client at 192.168.1.104 tried to establish a connection with the server at 192.168.1.103. Instead of acknowledging the connection, host 192.168.1.103 sent a TCP packet with a SYN flag to the host at 192.168.1.104. In packet 66, the client responded with a SYN/ACK packet. This packet is flagged as a retransmission because there was a lapse of 30 seconds between packets 65 and 66. What seems to have happened is that, having received no response to its SYN, the client retransmitted it while at the same time acknowledging the SYN sent by host 192.168.1.103. The server then sent an ACK in packet 67 to complete the handshake, and the session was established. This is called a "four-way handshake," and it might lead to IDS/IPS evasion because the session will not be tracked, since it is not a conventional three-way handshake.
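The evasion can be illustrated with a toy state machine; this is a sketch of the idea, not how any particular IDS is implemented:

```python
def tracks_session(flags_sequence):
    """A naive session tracker that only recognizes the canonical
    three-way handshake in strict order: SYN -> SYN/ACK -> ACK."""
    return flags_sequence[:3] == ["SYN", "SYN/ACK", "ACK"]

# Canonical handshake: the tracker marks the session established.
print(tracks_session(["SYN", "SYN/ACK", "ACK"]))         # True
# The four-way handshake from the capture: never matches, so the
# session goes untracked even though the endpoints establish it.
print(tracks_session(["SYN", "SYN", "SYN/ACK", "ACK"]))  # False
```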
2. Can Snort find the malicious content?
One of the connections that is present in the evade.pcap file has content that looks like this:
21:56:47.400000 IP 184.168.221.65.52342 > 10.1.15.80: Flags [P.], seq 143:463, ack 1, win 8192, length 421
HTTP: GET /EVILSTUFF HTTP/1.1..Host: example.com..User-Agent: curl/7.35.0..Accept: */*....
We can clearly see GET /EVILSTUFF HTTP/1.1 in the packet. Let’s see if we can alert on that using this alert:
alert http (msg:"Evil 1 in URI"; content:"EVIL"; sid:10000005; rev:1;)

Let’s run Snort and see how it does.

The alert was not triggered.
3. Can Zeek find it?
Let’s create a Zeek signature specifically designed to find that EVIL URL request. Please create a file named evil.sig that contains:
signature Evil {
ip-proto == tcp
dst-port == 80
payload /EVIL/
event "EVIL URL!"
}

Let’s run Zeek against the evade.pcap file and see if Zeek finds the known signature:

Zeek also fails to find the malicious content.
Exercise 2
Description: Look at the HTTP traffic between hosts 10.246.50.2 and 10.246.50.6.
Examine the HTTP headers on the GET request. What type of attack is this, and what does the code instruct the HTTP server to do? Was the attack successful? How do you know?

The User Agent looks abnormal. Normally it indicates the browser version used by the client, but in this case, it looks like an empty function followed by a ping command:
User-Agent: () { :;}; /bin/ping -c1 10.246.50.2
Searching for this kind of exploit online, I found that this is a Shellshock attack delivered via the User-Agent HTTP header value, which works because the web server passes the User-Agent value to Bash as an environment variable.
If the attack was successful, I should be able to find a ping request sent from the server (10.246.50.6) to the client (10.246.50.2):

The attack was successful.
Description:
Look at the traffic between hosts 192.168.1.105 and 192.168.1.103. The fourth record in the exchange between the hosts is a RST from the client 192.168.1.105 to the server 192.168.1.103. However, as you can observe, 192.168.1.105 continues to send traffic and 192.168.1.103 acknowledges it. Explain the reason why traffic is sent and acknowledged after the RST and why it might cause an IDS evasion.

The fourth packet has a bad TCP checksum, meaning that the receiving host 192.168.1.103 will silently drop it, which is why the subsequent packets are still sent and acknowledged. Some IDS/IPS systems do not validate the TCP checksum and may therefore stop tracking the session when they see the RST. This causes an evasion because the session actually continues, and the destination host receives the malicious traffic without the IDS/IPS being aware of it.
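The mismatch between the endpoint and a checksum-blind IDS can be sketched as two predicates; this is an illustration of the evasion, not any real IDS's logic:

```python
def endpoint_accepts(segment):
    """A real TCP stack validates the checksum and drops bad segments."""
    return segment["checksum_ok"]

def naive_ids_acts_on_rst(segment):
    """A naive IDS that skips checksum validation honors every RST."""
    return segment["flags"] == "RST"

rst_bad_csum = {"flags": "RST", "checksum_ok": False}

# The endpoint drops the RST, so the session continues...
print(endpoint_accepts(rst_bad_csum))       # False
# ...but the naive IDS tears down its session state: evasion.
print(naive_ids_acts_on_rst(rst_bad_csum))  # True
```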
In this lab, I will be developing another useful script that is a little more advanced than the ones created in parts 1 and 2.
Exercise – HTTP Exfiltration?
Description:
In this exercise, we will create a script that locates anomalous outbound data transfers based on the idea that, generally, we would expect to find that web connections have more data coming from the server to the client. This will potentially allow us to identify data exfiltrations.
Using the Zeek documentation, write a Zeek script that prints a message any time a connection involving TCP port 80 ends and the amount of data sent by the client was greater than that sent by the server.
The first problem is determining which event to subscribe to. Again, I need to review the Zeek documentation to find the events relevant to the problem I am trying to solve. I am looking for an event that fires when a connection ends, the counterpart to the new_connection() event: this is the connection_finished(c: connection) event.
I am interested in HTTP connections, and I want to limit my view to only connections involving TCP port 80. I first tried implementing the following condition in my script:
if(c$id$resp_p != 80) { return; }
but I received a 'type clash' error when I tried running a script using this condition. Zeek exposes the idea of a port as its own data type, which requires a number and a protocol name separated by a slash; typical HTTP would be 80/tcp. My condition should then be modified to if(c$id$resp_p != 80/tcp) { return; }
I also need to look at the number of bytes sent by the server and the client. The c$orig and c$resp sections of the connection record passed to connection_finished have useful data in them. There are fields that give the total number of IP bytes, but there are also size attributes, which report the total number of bytes of payload sent by either side of the connection.
Using these fields, I would then need to add some logic that will compare the number of bytes sent by the originator to the number of bytes sent by the respondent. If the originator (client) sent more than the respondent (server), I want to print a message.
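Putting the pieces above together, the script might look like the following sketch; the event, field names, and 80/tcp comparison are from the reasoning above, while the exact message text is my own:

```zeek
event connection_finished(c: connection)
    {
    # Only consider connections involving TCP port 80.
    if ( c$id$resp_p != 80/tcp )
        return;

    # Flag connections where the client sent more payload than the server.
    if ( c$orig$size > c$resp$size )
        print fmt("Possible exfiltration: %s sent %d bytes, received %d",
                  c$id$orig_h, c$orig$size, c$resp$size);
    }
```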

Let’s test this script out:
