theHarvester: Gather Emails, Subdomains & IPs from Public Sources
What is theHarvester?
theHarvester is a simple yet powerful OSINT tool designed for the early stages of a penetration test or red team engagement. It gathers emails, names, subdomains, IPs, and URLs from multiple public data sources — all without ever touching the target's infrastructure directly.
Created by Christian Martorella (Edge Security), theHarvester queries search engines, certificate transparency logs, DNS databases, and threat intelligence platforms to build a profile of a target organization's external footprint.
In OSINT and penetration testing, theHarvester is typically run right after defining the target scope. The email addresses it finds become phishing targets. The subdomains become attack surface. The employee names feed into social engineering.
Legal Notice: theHarvester only queries publicly available data sources. However, always ensure your reconnaissance activities are authorized and within scope.
Installation
On Kali Linux (Pre-installed)
theHarvester -h
Install via pip
pip install theHarvester
Install from GitHub (Latest)
git clone https://github.com/laramies/theHarvester.git
cd theHarvester
pip install -r requirements.txt
python3 -m theHarvester -h
Verify Installation
theHarvester -h
Expected Output:
*******************************************************************
* _ _ _ *
* | |_| |__ ___ /\ /\__ _ _ ____ _____ ___| |_ ___ _ __ *
* | __| '_ \ / _ \ / /_/ / _` | '__\ \ / / _ \/ __| __/ _ \ '__| *
* | |_| | | | __/ / __ / (_| | | \ V / __/\__ \ || __/ | *
* \__|_| |_|\___| \/ /_/ \__,_|_| \_/ \___||___/\__\___|_| *
* *
* theHarvester 4.6.0 *
*******************************************************************
usage: theHarvester [-h] -d DOMAIN [-l LIMIT] [-S START] [-p]
[-s] [--screenshot SCREENSHOT] [-v]
[-e DNS_SERVER] [-t] [-r [DNS_RESOLVE]]
[-n] [-c] [-f FILENAME] [-b SOURCE]
Data Sources
theHarvester can query many different public data sources. Each has different strengths:
Basic Usage
Search a Domain with a Single Source
theHarvester -d tesla.com -b crtsh
Expected Output:
*******************************************************************
* _ _ _ *
* | |_| |__ ___ /\ /\__ _ _ ____ _____ ___| |_ ___ _ __ *
* | __| '_ \ / _ \ / /_/ / _` | '__\ \ / / _ \/ __| __/ _ \ '__| *
* | |_| | | | __/ / __ / (_| | | \ V / __/\__ \ || __/ | *
* \__|_| |_|\___| \/ /_/ \__,_|_| \_/ \___||___/\__\___|_| *
* *
* theHarvester 4.6.0 *
*******************************************************************
[*] Target: tesla.com
[*] Searching crtsh.
[*] No IPs found.
[*] No emails found.
[*] Hosts found: 47
---------------------
aca.tesla.com
accounts.tesla.com
api.tesla.com
api-internal.tesla.com
auth.tesla.com
blog.tesla.com
ca.tesla.com
charging.tesla.com
cloud.tesla.com
cn.tesla.com
de.tesla.com
dev.tesla.com
engage.tesla.com
energydesk.tesla.com
factory.tesla.com
finance.tesla.com
fleet-api.tesla.com
gf.tesla.com
ir.tesla.com
learn.tesla.com
mail.tesla.com
nv.tesla.com
ny.tesla.com
obs.tesla.com
paloalto.tesla.com
service.tesla.com
shop.tesla.com
solarcity.tesla.com
staging.tesla.com
support.tesla.com
tesla.com
tx.tesla.com
www.tesla.com
Search Multiple Sources
theHarvester -d example.com -b crtsh,anubis,hackertarget,dnsdumpster
Search All Free Sources
theHarvester -d example.com -b all
Limit Results
theHarvester -d microsoft.com -b bing -l 200
The -l flag limits the number of search results to process.
Set Start Offset
theHarvester -d microsoft.com -b bing -l 200 -S 100
Starts at result 100, which is useful for paginating through large result sets.
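When the same basic commands are run repeatedly across several domains, it can help to assemble them programmatically. The following is a minimal sketch, not part of theHarvester itself: the helper name build_harvester_args is hypothetical, and it only uses the -d, -b, -l and -S flags shown in this section.

```python
import shlex


def build_harvester_args(domain, sources, limit=None, start=None):
    """Return an argv list for theHarvester (hypothetical helper).

    Only the flags covered above are used: -d, -b, -l and -S.
    """
    if not domain or not sources:
        raise ValueError("domain and at least one source are required")
    # Sources are passed to -b as a single comma-separated value.
    args = ["theHarvester", "-d", domain, "-b", ",".join(sources)]
    if limit is not None:
        args += ["-l", str(limit)]
    if start is not None:
        args += ["-S", str(start)]
    return args


if __name__ == "__main__":
    argv = build_harvester_args("microsoft.com", ["bing"], limit=200, start=100)
    # Prints: theHarvester -d microsoft.com -b bing -l 200 -S 100
    print(shlex.join(argv))
```

The list form can be handed to subprocess.run without shell quoting concerns; shlex.join is used here only to show the equivalent command line.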
Output Options
Save to XML and JSON
theHarvester -d tesla.com -b crtsh,anubis -f tesla_results
In the 4.x releases this creates tesla_results.xml and tesla_results.json; older 3.x releases wrote an HTML report alongside the XML.
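A saved report such as tesla_results.xml can be post-processed without depending on its exact layout. The sketch below is an illustration under that assumption: it treats the report as plain text and pulls out IPv4 addresses and email addresses with regular expressions, using Python's ipaddress module to discard impossible values such as 999.1.1.1. The function names are hypothetical and the report schema is deliberately not assumed.

```python
import ipaddress
import re

# Loose candidate patterns; IPv4 candidates are validated afterwards.
IP_CANDIDATE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_ips(text):
    """Return sorted, de-duplicated, valid IPv4 addresses found in text."""
    found = set()
    for candidate in IP_CANDIDATE.findall(text):
        try:
            found.add(ipaddress.IPv4Address(candidate))
        except ipaddress.AddressValueError:
            continue  # e.g. an octet above 255
    return [str(ip) for ip in sorted(found)]


def extract_emails(text):
    """Return sorted, de-duplicated, lower-cased email addresses in text."""
    return sorted({match.lower() for match in EMAIL.findall(text)})


if __name__ == "__main__":
    # Sample text made up for illustration; not real report content.
    sample = "<x>a.tesla.com:104.18.32.45</x><x>999.1.1.1</x><x>JSmith@microsoft.com</x>"
    print(extract_ips(sample))
    print(extract_emails(sample))
```

To use it on a real report, read the file with open("tesla_results.xml", encoding="utf-8").read() and pass the resulting string to either function.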
DNS Resolution on Discovered Hosts
theHarvester -d tesla.com -b crtsh -r
The -r flag resolves all discovered hostnames to IP addresses.
Additional Output:
[*] Hosts found: 47
---------------------
aca.tesla.com:104.18.32.45
accounts.tesla.com:199.66.9.47
api.tesla.com:199.66.9.46
auth.tesla.com:104.18.33.45
blog.tesla.com:199.66.9.48
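...
DNS TLD Expansion
theHarvester -d company -b bing -t
The -t flag performs DNS TLD expansion, searching for company.com, company.net, company.org, etc. Check theHarvester -h on your version first: flag meanings have changed between major releases, and recent 4.x help text describes -t as a takeover check instead.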
Virtual Host Detection
theHarvester -d tesla.com -b crtsh -v
The -v flag performs virtual host search using Bing.
Take Screenshots of Discovered Hosts
theHarvester -d tesla.com -b crtsh --screenshot /tmp/screenshots/
Captures screenshots of all discovered web servers.
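Returning to the host:ip lines that -r prints earlier in this section: grouping them by address shows which hostnames share a server, which helps when deciding what to scan. This is a minimal sketch that assumes IPv4 addresses and the host:ip line format shown in that output; the function name group_hosts_by_ip is hypothetical.

```python
import ipaddress
from collections import defaultdict


def group_hosts_by_ip(lines):
    """Group 'host:ip' lines by IPv4 address, ignoring everything else."""
    groups = defaultdict(list)
    for line in lines:
        host, sep, ip = line.strip().rpartition(":")
        if not sep or not host:
            continue  # separator rows, '...' and bare hostnames
        try:
            ipaddress.IPv4Address(ip)
        except ipaddress.AddressValueError:
            continue  # e.g. the '[*] Hosts found: 47' header line
        groups[ip].append(host)
    return {ip: sorted(hosts) for ip, hosts in groups.items()}


if __name__ == "__main__":
    output = [
        "[*] Hosts found: 47",
        "---------------------",
        "aca.tesla.com:104.18.32.45",
        "accounts.tesla.com:199.66.9.47",
        "api.tesla.com:199.66.9.46",
    ]
    for ip, hosts in group_hosts_by_ip(output).items():
        print(ip, hosts)
```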
Finding Emails
Email discovery is one of theHarvester's most valuable features for social engineering assessments.
Search Bing for Emails
theHarvester -d microsoft.com -b bing -l 500
Expected Output (emails section):
[*] Emails found: 12
--------------------
jsmith@microsoft.com
helpdesk@microsoft.com
press@microsoft.com
recruiting@microsoft.com
security@microsoft.com
partners@microsoft.com
support@microsoft.com
info@microsoft.com
legal@microsoft.com
abuse@microsoft.com
privacy@microsoft.com
msrc@microsoft.com
Search DuckDuckGo for Emails
theHarvester -d target.com -b duckduckgo -l 300
Search with Hunter.io (API Key Required)
theHarvester -d target.com -b hunter
Hunter.io is specifically designed for email finding and provides email patterns, confidence scores, and verified results.
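Harvested lists usually mix shared role mailboxes (helpdesk@, press@) with addresses that belong to individuals, and the two are handled differently in an assessment. The sketch below separates them by local part. The set of role names is an assumption chosen to match the sample output above, not something theHarvester or Hunter.io provides, and the function name is hypothetical.

```python
# Assumed list of role mailbox names; extend it for the engagement at hand.
ROLE_LOCAL_PARTS = frozenset({
    "abuse", "helpdesk", "info", "legal", "msrc", "partners",
    "press", "privacy", "recruiting", "security", "support",
})


def split_role_and_personal(emails, role_names=ROLE_LOCAL_PARTS):
    """Return (role_addresses, personal_addresses), each sorted and unique."""
    role, personal = [], []
    unique = {e.strip().lower() for e in emails if "@" in e}
    for email in sorted(unique):
        local_part = email.split("@", 1)[0]
        if local_part in role_names:
            role.append(email)
        else:
            personal.append(email)
    return role, personal


if __name__ == "__main__":
    found = ["jsmith@microsoft.com", "helpdesk@microsoft.com", "press@microsoft.com"]
    role, personal = split_role_and_personal(found)
    print("role:", role)
    print("personal:", personal)
```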
Certificate Transparency (CT) Logs
CT logs are one of the best subdomain discovery sources because publicly trusted certificate authorities record nearly every SSL/TLS certificate they issue there, and each certificate lists the hostnames it covers.
crt.sh Search
theHarvester -d tesla.com -b crtsh
This queries crt.sh (a Certificate Transparency log aggregator) and returns the hostnames found in certificates logged for the domain, including ones that are no longer in use.
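Names taken from CT logs often need cleaning before use: a certificate can carry wildcard entries such as *.cloud.tesla.com, and a single crt.sh name_value field can hold several newline-separated names. The following sketch shows one way to normalize such values into a unique host list. The function name normalize_ct_names is hypothetical, and dropping the wildcard label is a simplifying assumption.

```python
def normalize_ct_names(name_values, domain):
    """Flatten CT name entries into sorted, unique hostnames under domain."""
    domain = domain.lower()
    suffix = "." + domain
    hosts = set()
    for value in name_values:
        # One entry may contain several names separated by newlines.
        for name in value.splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # keep the parent of a wildcard entry
            # Keep only the domain itself and true subdomains of it.
            if name == domain or name.endswith(suffix):
                hosts.add(name)
    return sorted(hosts)


if __name__ == "__main__":
    raw = ["*.ca.tesla.com\nca.tesla.com", "*.tesla.com", "Cloud.Tesla.com"]
    for host in normalize_ct_names(raw, "tesla.com"):
        print(host)
```

The suffix check also filters out look-alike names such as evil-tesla.com that merely end with the same characters.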
CertSpotter
theHarvester -d tesla.com -b certspotter
Manual crt.sh API Query (Bonus)
You can also query crt.sh directly for deeper analysis:
curl -s "https://crt.sh/?q=%.tesla.com&output=json" | \
python3 -c "
import json, sys
data = json.load(sys.stdin)
names = set()
for cert in data:
    for name in cert.get('name_value','').split('\n'):
        names.add(name.strip())
for name in sorted(names):
    print(name)
"
Real Output (abbreviated):
*.ca.tesla.com
*.cloud.tesla.com
*.cn.tesla.com
*.de.tesla.com
*.engage.tesla.com
*.gf.tesla.com
*.nv.tesla.com
*.ny.tesla.com
*.obs.tesla.com
*.paloalto.tesla.com
*.tesla.com
*.tx.tesla.com
akamai-apigateway-einvoicing-stg.tesla.com
bettertime.tesla.com
ca.tesla.com
cloud.tesla.com
cn.tesla.com
Real-World OSINT Workflow
Step 1: Cast a Wide Net
# Run all free sources
theHarvester -d target.com -b anubis,crtsh,certspotter,hackertarget,rapiddns,urlscan,otx,threatminer -f initial_recon
Step 2: Resolve All Discovered Hosts
# Resolve hostnames to IPs
theHarvester -d target.com -b crtsh -r -f resolved_hosts
Step 3: Mine for Emails
# Search engines are best for email discovery
theHarvester -d target.com -b bing,duckduckgo,yahoo -l 500 -f email_harvest
Step 4: Feed Results into Nmap
# Extract IPs from theHarvester XML output
grep -oP '\d+\.\d+\.\d+\.\d+' resolved_hosts.xml | sort -u > target_ips.txt
# Scan discovered hosts with Nmap
nmap -sV -sC -iL target_ips.txt -oA target_scan
Step 5: Feed Emails into Sherlock
# Extract usernames from emails (part before @)
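# -o prints only the matched addresses, so surrounding XML markup is not carried into the usernames
grep -oE '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+' email_harvest.xml | cut -d@ -f1 | sort -u > usernames.txt
# Search social media for those usernames
while read -r user; do
    sherlock --print-found --csv "$user"
done < usernames.txt
API Key Configuration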
For sources that require API keys, create the config file:
cp ~/.theHarvester/api-keys.yaml.example ~/.theHarvester/api-keys.yaml
nano ~/.theHarvester/api-keys.yaml
Config structure:
apikeys:
  bing:
    key: YOUR_BING_API_KEY
  hunter:
    key: YOUR_HUNTER_API_KEY
  shodan:
    key: YOUR_SHODAN_API_KEY
  virustotal:
    key: YOUR_VT_API_KEY
  securityTrails:
    key: YOUR_ST_API_KEY
  github:
    key: YOUR_GITHUB_TOKEN
  censys:
    id: YOUR_CENSYS_API_ID
    secret: YOUR_CENSYS_SECRET
Useful Flags Reference
Summary
theHarvester is a strong fit for early-stage reconnaissance. It maps an organization's external footprint by pulling emails, subdomains, and hostnames from dozens of public sources. The source queries themselves are passive; only follow-up options such as --screenshot, which visits the discovered web servers, send traffic to the target.
Key Takeaways:
- Use -b crtsh,anubis,certspotter for subdomain discovery (no API keys needed)
- Use -b bing,duckduckgo for email harvesting
- Use -r to resolve discovered hosts to IPs
- Use -f to save results for later analysis
- Feed results into Nmap for port scanning and Sherlock for social media mapping
- Configure API keys for premium sources like Shodan, Hunter.io, and VirusTotal