Exercise 5: Identifying Reconnaissance
Estimated Time to Complete: 15 minutes
Objectives
- Study the various resources deployed in exercise 1 to determine log sources to aid the investigation of the attack techniques in exercises 2 through 4
- Download AWS CloudFront access logs to your CloudShell session
- Breakdown the structure of the AWS CloudFront access logs
- Use various Linux commands to determine the following for each CloudFront interaction:
- Time of the request
- Source IP address of the requestor
- HTTP Method of the request
- HTTP endpoint requested
- User-Agent of the requestor
Challenges
Challenge 1: Understand the Evidence-App Logging Resources
Look at the diagram outlined in the evidence-app's README.md file. What are the possible log sources you can use to find the various attack techniques you performed in exercises 2 through 4?
Solution
-
If you open up the
README.mdfile in the evidence-app's source code repository, you will find the following diagram:
-
The blue, dashed line indicates all of the logging interactions. Below is a list of the cloud resources, where the logs are written to in your AWS account, and what kind of data is logged:
Resource Logging Destination Log Description AWS CloudFront Distribution aws-logs-*S3 bucketAll HTTP requests to the evidence-app AWS CloudTrail aws-logs-*S3 bucketMost API calls made to your AWS account Amazon API Gateway /aws/api_gw/evidence_apiCloudWatch log groupAll interactions between CloudFront and API Gateway AWS Lambda /aws/lambda/evidenceCloudWatch log groupExecutions of evidenceLambda function
Challenge 2: Download CloudFront Log Data
Your CloudFront access logs can be found in an S3 bucket beginning with aws-logs-. Use the AWS CLI in your CloudShell session to download only the CloudFront access logs to a new cloudtrail-logs directory the cloudshell-user home directory.
Solution
-
Take a look at which S3 buckets are in your account.
aws s3 lsExpected Result
2022-07-17 13:33:27 awslogs-rprcf6nm0n42opsl 2022-07-17 13:33:27 evidence-rprcf6nm0n42opsl 2022-07-17 13:33:27 webcode-rprcf6nm0n42opsl -
Just as you did in exercise 4, set the S3 bucket beginning with
aws-logs-to theLOG_BUCKETenvironment variable as we will reference this bucket a few times.export LOG_BUCKET=$(aws s3 ls | egrep -o awslogs-.*) -
Now, take a look at the folder structure of this bucket.
aws s3 ls s3://$LOG_BUCKETExpected Result
PRE AWSLogs/ PRE CloudFront/ -
It appears that there are two directories:
AWSLogsandCloudFront.AWSLogscontains your CloudTrail data andCloudFrontcontains theCloudFrontinteractions. We will focus onCloudFrontfor now. -
Download the contents of the
CloudFrontdirectory to your CloudShell session in a new directory at/home/cloudshell-user/cloudfront-logs.aws s3 cp --recursive s3://$LOG_BUCKET/CloudFront /home/cloudshell-user/cloudfront-logs/ -
You should now have CloudFront data available at
/home/cloudshell-user/cloudfront-logs.
Challenge 3: Breakdown CloudWatch evidence_api Log Group Structure
Now that the CloudFront access log data is downloaded in gzip-compressed format, use zcat to view the raw content and break down what each value represents.
Solution
-
Since CloudShell has
zcatavailable, use it to extract and display the contents of all CloudFront log files you downloaded.zcat /home/cloudshell-user/cloudfront-logs/*gzSample Results
#Version: 1.0 #Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type cs-protocol-version fle-status fle-encrypted-fields c-port time-to-first-byte x-edge-detailed-result-type sc-content-type sc-content-len sc-range-start sc-range-end 2022-07-17 13:36:35 EWR53-C3 1253 203.0.113.41 GET dcq0rpclk4hxh.cloudfront.net / 200 - Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 - - Miss BrRUkKyw9Bc5tgUGbVGos-S_kbQtHVppO5JhZyYR4FuVjs9KMxl9lA== dcq0rpclk4hxh.cloudfront.net https 454 0.057 - TLSv1.3 TLS_AES_128_GCM_SHA256 Miss HTTP/2.0 - - 61449 0.057 Miss text/html 935 - - 2022-07-17 13:36:35 EWR53-C3 925 203.0.113.41 GET dcq0rpclk4hxh.cloudfront.net /styles.css 200 https://dcq0rpclk4hxh.cloudfront.net/ Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 Miss c4FcoNVGNrjfIXTlUVKsVChJbhzQFkQKzagLVD-tPaumq-Sfcx598g== dcq0rpclk4hxh.cloudfront.net https 99 0.074 - TLSv1.3 TLS_AES_128_GCM_SHA256 Miss HTTP/2.0 61449 0.074 Miss text/css 607 - - <snip> -
That is a lot of very cryptic data! To see the structure of this tab-delimited data, you can look at just the second line of the output using
sed.zcat /home/cloudshell-user/cloudfront-logs/*gz | sed -n '2p'Expected Results
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type cs-protocol-version fle-status fle-encrypted-fields c-port time-to-first-byte x-edge-detailed-result-type sc-content-type sc-content-len sc-range-start sc-range-end -
Now, you could look at each field name and count where it appears in the data, but here's a neat trick to show this information:
zcat /home/cloudshell-user/cloudfront-logs/*gz | sed -n '2p' | tr ' ' '\n' | \ egrep -v "^#Fields" | awk '{print NR "\t" $0}'Expected Results
1 date 2 time 3 x-edge-location 4 sc-bytes 5 c-ip 6 cs-method 7 cs(Host) 8 cs-uri-stem 9 sc-status 10 cs(Referer) 11 cs(User-Agent) 12 cs-uri-query 13 cs(Cookie) 14 x-edge-result-type 15 x-edge-request-id 16 x-host-header 17 cs-protocol 18 cs-bytes 19 time-taken 20 x-forwarded-for 21 ssl-protocol 22 ssl-cipher 23 x-edge-response-result-type 24 cs-protocol-version 25 fle-status 26 fle-encrypted-fields 27 c-port 28 time-to-first-byte 29 x-edge-detailed-result-type 30 sc-content-type 31 sc-content-len 32 sc-range-start 33 sc-range-end -
A breakdown of each of these fields can be found in AWS' documentation.
Challenge 4: Determine evidence_api Interaction Specifics
Now that you know the structure of the log data, extract the following details for each CloudFront request:
- Time of the request (
dateandtime) - Source IP address of the requestor (
c-ip) - HTTP Method of the request (
cs-method) - HTTP endpoint requested (
cs-uri-stem) - User-Agent of the requestor (
cs(User-Agent))
Solution
-
Given the above field names as well as the structure determined in the previous challenge, you will be looking for the 1st, 2nd, 5th, 6th, 8th, and 11th tab-delimited fields.
-
You can grab just these fields and output it in a readable format using a command like the one shown below:
zcat /home/cloudshell-user/cloudfront-logs/*gz | egrep -v "^#" | awk '{print $1","$2","$5","$6","$8","$11}' | sort -n | column -ts ","Sample Results
2022-07-17 13:36:35 107.3.2.240 GET /Cloud_Ace_Final.png Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:36:35 107.3.2.240 GET /favicon.ico Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:36:35 107.3.2.240 GET / Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:36:35 107.3.2.240 GET /script.js Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:36:35 107.3.2.240 GET /styles.css Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:36:36 107.3.2.240 GET /api/ Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:37:00 107.3.2.240 POST /api/ Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:37:01 107.3.2.240 GET /api/ Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:37:39 34.229.160.87 GET / TotallyNotWget 2022-07-17 13:37:39 34.229.160.87 HEAD /Cloud_Ace_Final.png TotallyNotWget 2022-07-17 13:37:39 34.229.160.87 HEAD /script.js TotallyNotWget 2022-07-17 13:37:39 34.229.160.87 HEAD /styles.css TotallyNotWget 2022-07-17 13:37:39 34.229.160.87 HEAD / TotallyNotWget 2022-07-17 13:38:00 34.229.160.87 GET /script.js curl/7.79.1 2022-07-17 13:38:08 34.229.160.87 GET /script.js curl/7.79.1 2022-07-17 13:38:15 34.229.160.87 GET /api/ curl/7.79.1 2022-07-17 13:39:34 34.229.160.87 POST /api/ curl/7.79.1 2022-07-17 13:39:43 34.229.160.87 HEAD / curl/7.79.1 2022-07-17 13:39:49 34.229.160.87 HEAD /api/ curl/7.79.1 2022-07-17 13:40:14 107.3.2.240 GET /styles.css Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:40:27 107.3.2.240 POST /api/ Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:40:29 107.3.2.240 GET /api/ Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML %20like%20Gecko)%20Chrome/103.0.0.0%20Safari/537.36 2022-07-17 13:41:39 34.229.160.87 POST /api/ python-requests/2.25.1 2022-07-17 13:41:39 34.229.160.87 POST /api/ python-requests/2.25.1 2022-07-17 13:41:40 34.229.160.87 POST /api/ python-requests/2.25.1 2022-07-17 13:41:40 34.229.160.87 POST /api/ python-requests/2.25.1 2022-07-17 13:41:41 34.229.160.87 POST /api/ python-requests/2.25.1 2022-07-17 13:41:42 34.229.160.87 GET /api/ python-requests/2.25.1 2022-07-17 13:41:42 34.229.160.87 POST /api/ python-requests/2.25.1 <snip> -
It appears, in the example shown above, that the IP address of
34.229.160.87is interacting with the application quite strangely (the IP address you find will be different, but just as suspicious). First, we see a weird User-Agent namedTotallyNotWget. Of course, a User-Agent does not always indicate malice, but take a look at just how quickly the entries that contain this strange User-Agent requesting HTTP headers from all web pages in the evidence-app? In the above example, all pages are visited within the same second with aHEADrequest. That is abnormal for a user of this application. -
Secondly, the user at that same IP address switches to
curlto communicate with thescript.jsand/api/endpoints and later topython-requests/2.25.1. This, again is quite odd. Most (if not all) legitimate application users will use a web browser to access the application and upload evidence data. -
This has many signs of possible reconnaissance, but we will learn more about this IP address and its interactions with the application in future exercises. So for now, we can just deem the IP address suspicious, but not yet malicious.
-
So that we can keep track of this suspect IP address when the CloudShell session terminates, add another entry to your
.bashrcfile.echo "export SUSPECT=$(zcat /home/cloudshell-user/cloudfront-logs/*gz | grep TotallyNotWget | awk '{print $5}' | head -1)" \ >> /home/cloudshell-user/.bashrc source /home/cloudshell-user/.bashrc cat /home/cloudshell-user/.bashrcExpected Result
# .bashrc # Source global definitions if [ -f /etc/bashrc ]; then . /etc/bashrc fi # Uncomment the following line if you don't like systemctl's auto-paging feature: # export SYSTEMD_PAGER= # User specific aliases and functions complete -C '/usr/local/bin/aws_completer' aws export AWS_EXECUTION_ENV=CloudShell export TARGET=https://dcq0rpclk4hxh.cloudfront.net export SUSPECT=34.229.160.87
ATT&CK
MITRE ATT&CK techniques potentially detected:
| Tactic | Technique | Description |
|---|---|---|
| Reconnaissance | Active Scanning (T1595) | Saw requests with TotallyNotWget User-Agent possibly crawling the evidence-app looking for additional targets |
| Discovery | Cloud Infrastructure Discovery (T1580) | Requests with curl as the User-Agent are interacting with the / and /api/ endpoints (possibly determining backend infrastructure) |