How to Protect Infrastructure During Penetration Testing
If you’re a penetration tester, you know that any test or phishing campaign begins with setting up your infrastructure, including a domain name and redirectors. You might also know that this step is straightforward, and many have published walkthroughs on different ways to architect and automate infrastructure deployments.
That being said, one of the more difficult steps in these campaigns is maintaining a clean domain name and keeping it off known block lists. As experienced penetration testers, we want to help you simplify this step as much as possible.
In this article, we’re going to explain how to protect infrastructure from unwanted visitors and reduce the likelihood of burning a domain name. Next time you begin a new engagement, you’ll be able to proceed more smoothly and provide your client with more efficient testing.
4 Ways to Protect Your Infrastructure Without Burning the Domain Name
1. Deploy a Honeypot to Identify Bots and Scanners
To start, domain names get categorized in multiple ways, including:
- Users manually report the domain name to services like Google Safe Browsing, WebPulse Site Review, etc.
- Automated services regularly scan the Internet and analyze websites.
Certain companies and security products will scan a website before users visit it to better protect them from malicious websites. For instance, when a link is shared through e-mail, these security products will first browse the link and check for malicious content before allowing the user to visit.
If one of these security products is in use, it will often send an unusual user agent. By identifying these user agents, you can preemptively either block them from analyzing a page or provide them with different content than what the user would see.
To do that, we deployed a honeypot to identify some of these security products, web crawlers, and bots. The honeypot was a simple web server that captured each request and logged the IP address and user agent that visited the website. The web server had a single page that acted as a login form, but was not actually functional.
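As a rough illustration of the kind of honeypot described above, the sketch below uses only Python's standard library. It is not the server we deployed: the log file name, the JSON field names, the port, and the `make_record` and `serve` helpers are all assumptions made for the example.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# A login form that looks plausible but is never processed.
LOGIN_PAGE = b"""<!doctype html>
<html><body>
<form method="post" action="/login">
  <input name="username" placeholder="Username">
  <input name="password" type="password" placeholder="Password">
  <button type="submit">Sign in</button>
</form>
</body></html>"""


def make_record(ip, method, path, user_agent):
    """Build one JSON log line; a missing User-Agent is recorded as BLANK."""
    return json.dumps({
        "time": int(time.time()),
        "ip": ip,
        "method": method,
        "path": path,
        "user_agent": user_agent or "BLANK",
    })


class HoneypotHandler(BaseHTTPRequestHandler):
    log_path = "honeypot.log"  # hypothetical log location

    def _capture(self):
        # Log the client IP address and user agent for every request.
        record = make_record(self.client_address[0], self.command,
                             self.path, self.headers.get("User-Agent"))
        with open(self.log_path, "a", encoding="utf-8") as fh:
            fh.write(record + "\n")
        # Always answer with the same non-functional login form.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(LOGIN_PAGE)))
        self.end_headers()
        if self.command != "HEAD":
            self.wfile.write(LOGIN_PAGE)

    do_GET = do_POST = do_HEAD = _capture

    def log_message(self, fmt, *args):
        pass  # silence the default stderr access log


def serve(host="0.0.0.0", port=8080):
    """Run the honeypot until interrupted."""
    ThreadingHTTPServer((host, port), HoneypotHandler).serve_forever()
```

Calling `serve()` starts the listener. In practice you would likely place it behind TLS on port 443 so it resembles a real login page.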
We deployed the honeypot for thirty-four days and logged each request. We did not submit the domain to any categorization websites for scanning, so these results are based only on automated crawlers that found the website:
Metric | Value |
---|---|
Days Online | 34 |
Requests | 568,698 |
Unique IP Addresses | 148,926 |
Unique User Agents | 194 |
Many of the user agents contained details related to the service that was performing the scans. Below are a few of the URLs that bots included in their user agents when scanning the honeypot.
- http://netsystemsresearch.com
- http://project-resonance.com
- http://tchelebi.io
- http://www.baidu.com/search/spider.html
- http://www.bing.com/bingbot.htm
- http://www.googlebot.com/bot.html
- https://about.censys.io
- https://best-proxies.ru/faq/#from
- https://gdnplus.com
- https://github.com/robertdavidgraham/masscan
- https://internet-measurement.com
- https://leakix.net
- https://nmap.org/book/nse.html
- https://opensiteexplorer.org/dotbot
- https://paloaltonetworks.com
Out of the 194 unique user agents, only 13 had 100 or more requests.
User Agent | Occurrences |
---|---|
BLANK | 283,621 |
Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko | 56,821 |
Opera/9.80 (Windows NT 5.1; U; en) Presto/2.10.289 Version/12.01 | 56,487 |
Mozilla/5.0 (Windows NT 6.0) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1 | 56,424 |
Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0 | 56,222 |
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36 | 55,958 |
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36 | 567 |
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36 | 345 |
curl/7.54.0 | 126 |
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36 | 119 |
Linux Gnu (cow) | 119 |
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36 | 102 |
python-requests/2.28.1 | 100 |
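Counts like the ones above can be produced with a short tally over the honeypot log. The sketch below assumes the log holds one JSON object per line with hypothetical `ip` and `user_agent` fields. Our actual log format may have differed, so treat the field names and the `summarize` and `frequent_agents` helpers as illustrative.

```python
import json
from collections import Counter


def summarize(log_lines):
    """Return (request count, unique IP count, Counter of user agents)."""
    ips = set()
    agents = Counter()
    total = 0
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        record = json.loads(line)
        total += 1
        ips.add(record["ip"])
        # Treat a missing or empty user agent as BLANK.
        agents[record.get("user_agent") or "BLANK"] += 1
    return total, len(ips), agents


def frequent_agents(agents, minimum=100):
    """List (user agent, count) pairs seen at least `minimum` times."""
    return [(ua, n) for ua, n in agents.most_common() if n >= minimum]
```

Passing an open file object, for example `summarize(open("honeypot.log"))`, works because the function only iterates over lines.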
2. Block User Agents
With that baseline established, we started blocking requests based on the user agent. Though there are free and paid services with a variety of built-in rules, we set up our own using the Caddy web server as a redirector.
Below is a simple Caddy configuration file that blocks requests based on their user agent and proxies all other requests to a local service running on port 8000:
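    :443 {
        # Requests that send no User-Agent header at all.
        # This needs its own matcher: different matcher types inside one
        # named matcher are AND-ed, and an empty regexp matches any value.
        @no_ua header !User-Agent
        # List of user agents to block
        @blocked {
            # Common tools
            header User-Agent curl*
            header User-Agent python*
            header User-Agent *masscan*
            header User-Agent *nmap*
            header User-Agent *paloalto*
            header User-Agent *censys*
        }
        # Block request based on user agent
        handle @no_ua {
            respond "Bad User-Agent!"
        }
        handle @blocked {
            respond "Bad User-Agent!"
        }
        # Proxy all other requests
        handle {
            reverse_proxy localhost:8000
        }
    }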
However, if you have a long list of user agents to block, this configuration file may also get long. As an alternative, you can split out the user agents into their own file, which will look like the following:
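    :443 {
        # Requests that send no User-Agent header at all
        # (kept separate because matchers of different types are AND-ed)
        @no_ua header !User-Agent
        # List of user agents to block
        @blocked {
            # Import the user agent file
            import blocked.txt
        }
        # Block request based on user agent
        handle @no_ua {
            respond "Bad User-Agent!"
        }
        handle @blocked {
            respond "Bad User-Agent!"
        }
        # Proxy all other requests
        handle {
            reverse_proxy localhost:8000
        }
    }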
The blocked.txt file then contains each of the user agent matchers:
    # Common tools
    header User-Agent curl*
    header User-Agent python*
    header User-Agent *masscan*
    header User-Agent *nmap*
    header User-Agent *paloalto*
    header User-Agent *censys*
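Before deploying a block list, it can help to check it against the user agents the honeypot recorded. The sketch below approximates how Caddy's `header` matcher treats wildcards (a leading `*` as a suffix match, a trailing `*` as a prefix match, both as a substring match). It is a simplified model for experimenting, not Caddy's implementation, and the `matches` and `is_blocked` helpers are names made up for this example. A missing header is treated as blocked, in line with the intent of blocking blank user agents.

```python
def matches(pattern, value):
    """Approximate Caddy's `header` value matching for one pattern."""
    if pattern == "*":
        return True
    starts, ends = pattern.startswith("*"), pattern.endswith("*")
    if starts and ends:
        return pattern[1:-1] in value          # *text* -> substring
    if starts:
        return value.endswith(pattern[1:])     # *text  -> suffix
    if ends:
        return value.startswith(pattern[:-1])  # text*  -> prefix
    return value == pattern                    # no wildcard -> exact


def is_blocked(user_agent, patterns):
    """True when the header is missing or any pattern matches."""
    if user_agent is None:
        return True
    return any(matches(p, user_agent) for p in patterns)


# The same patterns used in the configuration above.
PATTERNS = ["curl*", "python*", "*masscan*", "*nmap*", "*paloalto*", "*censys*"]
```

Running the honeypot's top user agents through `is_blocked` shows the limits of this approach: `curl/7.54.0` and `python-requests/2.28.1` are caught, but `Linux Gnu (cow)` and any scanner that presents a browser-like string pass through.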
3. Only Allow Certain Countries
Another way to reduce unwanted users is to only allow requests from certain countries. If the company you’re performing the testing against is only located in North America, you may want to only allow traffic from the United States and Canada.
Since Caddy is a modular framework, it can be built with additional functionality based on the user's needs. The MaxMind Geolocation module uses MaxMind's free database of IP address geolocation information to provide Caddy with the ability to block traffic based on country, county, or city.
Once Caddy is built with the additional modules, the configuration file can be updated to include which countries to allow based on their country code:
    :443 {
        # Requests that send no User-Agent header at all
        # (kept separate because matchers of different types are AND-ed)
        @no_ua header !User-Agent
        # List of user agents to block
        @blocked {
            # Import the user agent file
            import blocked.txt
        }
        # Block request based on user agent
        handle @no_ua {
            respond "Bad User-Agent!"
        }
        handle @blocked {
            respond "Bad User-Agent!"
        }
        # Match requests from outside the US and Canada
        @country {
            not maxmind_geolocation {
                db_path "/usr/share/GeoIP/GeoLite2-Country.mmdb"
                allow_countries US CA
            }
        }
        # Block requests outside of the US and Canada
        handle @country {
            respond "Country not allowed!"
        }
        # Proxy all other requests
        handle {
            reverse_proxy localhost:8000
        }
    }
4. Allow Only Certain Autonomous System Numbers (ASN)
MaxMind has another free database that provides the ASN and Autonomous System Organization (ASO) for an IP address.
Using this database, you can create an additional Caddy module to allow or block traffic based on the organization associated with the IP address. Adding the configuration below to our previous work means only traffic coming from Amazon or Cloudflare is proxied:
    :443 {
        # Requests that send no User-Agent header at all
        # (kept separate because matchers of different types are AND-ed)
        @no_ua header !User-Agent
        # List of user agents to block
        @blocked {
            # Import the user agent file
            import blocked.txt
        }
        # Block request based on user agent
        handle @no_ua {
            respond "Bad User-Agent!"
        }
        handle @blocked {
            respond "Bad User-Agent!"
        }
        # Match requests from outside the US and Canada
        @country {
            not maxmind_geolocation {
                db_path "/usr/share/GeoIP/GeoLite2-Country.mmdb"
                allow_countries US CA
            }
        }
        # Block requests outside of the US and Canada
        handle @country {
            respond "Country not allowed!"
        }
        # Allow requests from Amazon and Cloudflare
        @aso {
            maxmind_asn {
                db_path "/usr/share/GeoIP/GeoLite2-ASN.mmdb"
                allow_asos amazon cloudflare
            }
        }
        # Proxy requests
        handle @aso {
            reverse_proxy localhost:8000
        }
        # Handle all other requests
        handle /* {
            respond "Invalid request"
        }
    }
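The order of checks in this final configuration can be easier to reason about as plain code. The sketch below models the intended decision chain only. It performs no real GeoIP or ASN lookups, the `Request` class and `decide` function are hypothetical, and the case-insensitive substring comparison of organization names is an assumption about how an ASO allow list might behave, not a description of the module.

```python
from dataclasses import dataclass
from typing import Optional

BLOCKED_PREFIXES = ("curl", "python")
BLOCKED_SUBSTRINGS = ("masscan", "nmap", "paloalto", "censys")
ALLOWED_COUNTRIES = {"US", "CA"}
ALLOWED_ORGS = ("amazon", "cloudflare")


@dataclass
class Request:
    user_agent: Optional[str]  # None when the header is missing
    country: Optional[str]     # ISO code from a country lookup, None if unknown
    as_org: Optional[str]      # organization from an ASN lookup, None if unknown


def decide(req):
    """Return the response the final configuration is meant to give."""
    ua = req.user_agent
    # 1. User agent block list (a missing header counts as blocked).
    if ua is None or ua.startswith(BLOCKED_PREFIXES) \
            or any(s in ua for s in BLOCKED_SUBSTRINGS):
        return "Bad User-Agent!"
    # 2. Country allow list (assumption: unknown countries are rejected).
    if req.country not in ALLOWED_COUNTRIES:
        return "Country not allowed!"
    # 3. Organization allow list (assumption: case-insensitive substring).
    org = (req.as_org or "").lower()
    if any(name in org for name in ALLOWED_ORGS):
        return "proxy to localhost:8000"
    # 4. Everything else falls through to the catch-all handler.
    return "Invalid request"
```

For example, `decide(Request("Mozilla/5.0", "US", "AMAZON-02"))` reaches the proxy, while the same request from an organization that is not on the allow list ends at the catch-all response.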
Learn More About Penetration Testing
When conducting a penetration test campaign, it can be difficult to protect the infrastructure you’re working with and keep it clean for more accurate results.
But now you know at least four ways to do so without burning the domain name. While an allow list or block list can be hard to maintain and relatively easy to bypass, it does help reduce the traffic from automated services. Limiting traffic with the techniques discussed here can also help keep out unwanted visitors, extending the life of your domain during a campaign.
For even more insight into helpful penetration testing practices that could improve your experience as a tester and make for a more efficient process for your client, check out our other instructional content.
About Clint Mueller
Clint Mueller is a Lead Penetration Tester with Schellman based in the St. Louis, Missouri area. Prior to joining Schellman in 2021, Clint worked as the Senior Red Team Manager for a large health care company. During this time, Clint performed a variety of security assessments and threat emulations based on adversary tactics, techniques, and procedures (TTP) to help improve the company’s monitoring and detection capabilities. Clint has over seven years of experience serving clients in various industries, including health care, telecommunications, and financial services. Clint is now focused primarily on offensive security assessments, including internal and external network testing, phishing, and web application assessments for organizations across various industries.