How to Practise Threat Hunting With Real Data

Threat hunting is one of the most sought-after skills in defensive security, and one of the hardest to develop without access to a real environment. The problem is obvious: most learning resources teach you the concepts and frameworks of threat hunting without putting you inside actual data. You read about hypothesis-driven hunting, you study the MITRE ATT&CK framework, you learn what lateral movement looks like in theory. Then you open a blank SIEM and have no idea what to do next.

This guide is about closing that gap. It covers what threat hunting actually involves, where to find real data to practise on, how to structure a hunt from first principles, and how TryHackMe's hands-on environments give you guided practice before you attempt it independently.

What Threat Hunting Actually Is

Threat hunting is often described as proactive security, and that description is useful but incomplete. The more precise framing is this: threat hunting is the practice of forming a hypothesis about adversary behaviour and then searching through data to confirm or rule it out, without waiting for an automated alert to tell you something is wrong.

This distinguishes it from alert triage, which is reactive. A SOC analyst responds to alerts the SIEM generates. A threat hunter asks: what would an attacker be doing in this environment that our current detections would not catch? Then they look for it.

The distinction matters practically because it changes what you are looking for. Alert triage is about investigating known patterns. Threat hunting is about identifying unknown activity by understanding how attackers behave and looking for the traces that behaviour leaves behind. The MITRE ATT&CK framework is the reference library that makes this structured: it maps adversary tactics, techniques, and procedures (TTPs) against the artefacts and event data they produce, giving hunters a systematic way to form hypotheses and know where to look.

The Threat Hunting Process

A structured hunt follows a consistent sequence regardless of the specific hypothesis or data source.

Form a hypothesis. Every hunt starts with a question grounded in threat intelligence or knowledge of adversary behaviour. "Are there signs of credential dumping activity on endpoints in the last 30 days?" and "Has any process accessed LSASS memory in a way consistent with Mimikatz?" are testable hypotheses. "Look for anything suspicious" is not, because no result could confirm it or rule it out. MITRE ATT&CK technique pages are the most reliable source of hypothesis ideas for practice: each page describes what the behaviour looks like and which data sources record evidence of it.

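Identify the relevant data sources. Different techniques produce evidence in different places. Credential dumping leaves traces in Sysmon logs (Event ID 10 for process access to LSASS), in the Windows Security Event Log when object access auditing is enabled (Event IDs 4656 and 4663), and potentially in memory. The logons that follow credential theft show up in the Security log as Event IDs 4624 (successful logon), 4625 (failed logon), and 4648 (logon with explicit credentials). Network-based lateral movement appears in firewall logs, Zeek (formerly Bro) logs, and network flow data. Knowing which data source to query before you start prevents wasting time searching in the wrong place.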

Query and investigate. Write queries in your SIEM or analysis tool that surface events matching your hypothesis criteria. The initial query is rarely the final one: you iterate, filter noise, pivot on interesting findings, and follow threads. This is where the practical skill actually lives, and why it can only be developed through doing rather than reading.
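As a rough illustration of that iterate-and-filter loop outside a SIEM, the sketch below runs a broad first pass for process access to LSASS and then narrows it. The events are hand-written dictionaries shaped like parsed Sysmon Event ID 10 records, and the list of expected sources is an assumption made up for the example; in a real hunt you would build that list from what is normal in your own environment.

```python
from collections import Counter

# Hypothetical, hand-written events shaped like parsed Sysmon records.
EVENTS = [
    {"EventID": 10, "SourceImage": r"C:\Windows\System32\svchost.exe",
     "TargetImage": r"C:\Windows\System32\lsass.exe", "GrantedAccess": "0x1000"},
    {"EventID": 10, "SourceImage": r"C:\Users\bob\AppData\Local\Temp\m.exe",
     "TargetImage": r"C:\Windows\System32\lsass.exe", "GrantedAccess": "0x1010"},
    {"EventID": 1, "Image": r"C:\Windows\System32\cmd.exe"},
    {"EventID": 10, "SourceImage": r"C:\Windows\System32\svchost.exe",
     "TargetImage": r"C:\Windows\explorer.exe", "GrantedAccess": "0x1000"},
]

# Assumption: sources already judged normal in this imaginary environment.
EXPECTED_SOURCES = {r"c:\windows\system32\svchost.exe"}


def lsass_access(events):
    """First pass: every process-access event whose target is lsass.exe."""
    return [
        e for e in events
        if e.get("EventID") == 10
        and e.get("TargetImage", "").lower().endswith("\\lsass.exe")
    ]


def drop_expected(events, expected=EXPECTED_SOURCES):
    """Second pass: filter out known-benign sources, leaving leads to pivot on."""
    return [e for e in events if e["SourceImage"].lower() not in expected]


broad = lsass_access(EVENTS)      # initial query
leads = drop_expected(broad)      # refined query after filtering noise
by_source = Counter(e["SourceImage"] for e in leads)

for source, count in by_source.items():
    print(f"{count} access(es) to LSASS from {source}")
```

Each remaining lead is a thread to follow: what launched that process, what it did next, and whether it appears on other hosts.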

Document and conclude. Every hunt produces an outcome: either evidence of the suspected activity, or documented confidence that the activity is not present in the dataset. Both are valuable. Documenting methodology, queries used, and conclusions produces hunt runbooks that make future hunts faster and more repeatable.
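One way to keep that documentation consistent is to record every hunt in the same shape. The sketch below is a minimal example of such a record; the class name, field names, and outcome labels are illustrative choices rather than any standard format, and the query is written as a plain description so it can be translated into whichever tool you use.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class HuntRecord:
    """A hypothetical, minimal structure for writing up one hunt."""
    hypothesis: str
    technique: str                      # MITRE ATT&CK technique ID
    data_sources: list
    queries: list = field(default_factory=list)
    outcome: str = "inconclusive"       # e.g. "evidence found" or "not observed"
    notes: str = ""

    def to_json(self):
        """Serialise the record so it can be stored alongside other runbooks."""
        return json.dumps(asdict(self), indent=2)


record = HuntRecord(
    hypothesis="A process accessed LSASS memory in a way consistent with Mimikatz",
    technique="T1003.001",
    data_sources=["Sysmon Event ID 10"],
    queries=["Sysmon Event ID 10 where TargetImage ends with lsass.exe"],
    outcome="not observed",
    notes="No unexpected source processes in the dataset reviewed.",
)

print(record.to_json())
```

A "not observed" record is as useful as a positive one, because it states which data was checked and how.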

Where Real Data Comes From

This is the question most learning resources avoid. The honest answer is that several high-quality free datasets exist specifically for threat hunting practice, and many learners never come across them.

EVTX-ATTACK-SAMPLES is a GitHub repository of Windows Event Log samples mapped directly to MITRE ATT&CK techniques. Each sample is an EVTX file capturing a specific adversary technique, making the collection ideal for practising Windows-based hunts. You load the files into a SIEM or log analysis tool, hunt for the behaviour without looking at the label first, and then check your findings against the technique the sample is mapped to.

PCAP-ATTACK is a companion repository of packet capture files mapped to ATT&CK techniques, covering network-based adversary behaviour. Wireshark and Zeek are the primary analysis tools for working through these samples.

Splunk Boss of the SOC (BOTS) datasets are curated datasets simulating a realistic enterprise environment under attack, built specifically for security operations and threat hunting practice. The botsv2 and botsv3 datasets are freely available on GitHub and contain log data from multiple sources covering a simulated APT campaign. Loading these into a Splunk instance gives you weeks of realistic hunting practice on data that mirrors what real enterprise environments generate.

malware-traffic-analysis.net provides PCAP files and associated write-ups from real malware infections and network-based attacks. The site is maintained by Brad Duncan and is one of the most widely used resources among practitioners for realistic network traffic analysis practice.

TryHackMe's threat hunting rooms take a different approach: rather than making you set up your own analysis environment and ingest raw data, the platform puts you inside pre-configured environments with real data already loaded and guided scenarios that walk you through each phase of a hunt. This removes the setup barrier that stops most people from ever getting started with external datasets, and means your first experience with threat hunting data is structured enough to be genuinely educational rather than overwhelming.

Key Tools for Threat Hunting Practice

You do not need access to enterprise-grade tools to practise effectively. The following cover the core data types that threat hunters work with.

Splunk Free (up to 500MB/day ingestion) is the standard SIEM for self-directed threat hunting practice. The BOTS datasets are designed specifically for Splunk. SPL (Splunk Processing Language) is the query language that most threat hunting content references, and proficiency in it is consistently valued in hiring.

Sysmon is a Windows system monitor that generates detailed process creation, network connection, and file creation events that Windows' native logging does not capture at sufficient granularity for hunting. Installing Sysmon on a Windows VM and understanding which Event IDs it generates for specific adversary techniques is foundational practice. Key Event IDs for hunting: 1 (process creation), 3 (network connection), 7 (image loaded), 10 (process accessed), 13 (registry value set).
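To get comfortable with what those events look like, it helps to take one apart by hand. The sketch below parses a single event in the Windows event XML layout and pulls out the Event ID and the named data fields. The sample event is hand-written and trimmed for the example, not taken from a real log, and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# The Sysmon Event IDs highlighted above.
SYSMON_EVENT_NAMES = {
    1: "process creation",
    3: "network connection",
    7: "image loaded",
    10: "process accessed",
    13: "registry value set",
}

# Hand-written, trimmed sample for illustration only.
SAMPLE_XML = r"""<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID>10</EventID>
  </System>
  <EventData>
    <Data Name="SourceImage">C:\Temp\tool.exe</Data>
    <Data Name="TargetImage">C:\Windows\System32\lsass.exe</Data>
    <Data Name="GrantedAccess">0x1010</Data>
  </EventData>
</Event>"""


def parse_event(xml_text):
    """Return the Event ID, a readable label, and the EventData fields as a dict."""
    root = ET.fromstring(xml_text)
    event_id = int(root.findtext("e:System/e:EventID", namespaces=NS))
    fields = {
        d.get("Name"): (d.text or "")
        for d in root.findall("e:EventData/e:Data", NS)
    }
    return {
        "EventID": event_id,
        "kind": SYSMON_EVENT_NAMES.get(event_id, "other"),
        **fields,
    }


event = parse_event(SAMPLE_XML)
print(event["kind"], "->", event["TargetImage"])
```

Once events are plain dictionaries like this, filtering and counting them becomes ordinary Python.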

Wireshark is the standard tool for analysing PCAP data. Network-based threat hunting requires comfort reading packet captures to identify unusual protocols, beaconing behaviour, data exfiltration patterns, and command and control traffic.
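Beaconing is a good example of a pattern that is easier to spot with a little arithmetic than by eye. The sketch below groups connections by source and destination, measures the gaps between them, and flags pairs whose gaps are unusually regular. The flow tuples, the addresses, and both thresholds are assumptions chosen for the example. Real implants often add jitter, so treat this as a starting heuristic rather than a detector.

```python
from collections import defaultdict
from statistics import mean, pstdev


def beacon_candidates(flows, min_events=5, max_cv=0.1):
    """Flag (src, dst) pairs whose connection intervals are nearly constant.

    flows: iterable of (timestamp_seconds, src, dst) tuples.
    max_cv: highest coefficient of variation (stdev / mean) still called regular.
    Returns {(src, dst): average_interval_seconds}.
    """
    by_pair = defaultdict(list)
    for ts, src, dst in flows:
        by_pair[(src, dst)].append(ts)

    results = {}
    for pair, times in by_pair.items():
        if len(times) < min_events:
            continue                      # too few connections to judge
        times.sort()
        gaps = [b - a for a, b in zip(times, times[1:])]
        avg = mean(gaps)
        if avg == 0:
            continue                      # all connections share one timestamp
        if pstdev(gaps) / avg <= max_cv:
            results[pair] = round(avg, 1)
    return results


# Hypothetical flows: one host checking in about every 60 seconds, one browsing.
FLOWS = (
    [(t, "10.0.0.5", "203.0.113.7") for t in (0.0, 60.0, 121.0, 180.0, 241.0, 300.0)]
    + [(t, "10.0.0.9", "198.51.100.20") for t in (0.0, 5.0, 200.0, 210.0, 900.0, 905.0)]
)

found = beacon_candidates(FLOWS)
print(found)
```

Any pair this surfaces is a lead to open in Wireshark and inspect, not a verdict.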

YARA is a pattern-matching language used for identifying malware and suspicious file characteristics. Writing YARA rules to detect known bad patterns is a practical skill that TryHackMe's Threat Hunting with YARA room specifically covers.
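The core idea behind a YARA rule is a set of named strings plus a condition over which of them matched. The sketch below imitates that idea in plain Python so the logic is visible. It is an analogy only: it is not YARA syntax, it does not use the YARA engine, and the function name and sample bytes are made up for the example.

```python
def match_rule(data, strings, min_hits):
    """Return (matched, hits): matched is True when at least min_hits of the
    named byte patterns occur in data; hits lists the names that were found."""
    hits = sorted(name for name, pattern in strings.items() if pattern in data)
    return len(hits) >= min_hits, hits


# Named patterns, in the spirit of a rule's strings section.
RULE_STRINGS = {
    "$cmd_logon": b"sekurlsa::logonpasswords",
    "$cmd_debug": b"privilege::debug",
    "$name": b"mimikatz",
}

# Hypothetical file contents standing in for a script found on disk.
suspicious = b"echo start\nprivilege::debug\nsekurlsa::logonpasswords\n"
benign = b"echo privilege::debug is a string in a training document\n"

# Condition: any two of the three patterns.
print(match_rule(suspicious, RULE_STRINGS, min_hits=2))
print(match_rule(benign, RULE_STRINGS, min_hits=2))
```

Requiring two patterns rather than one is the kind of trade-off rule writing involves: fewer false positives on documents that merely mention a command, at the cost of missing samples that contain only one of the strings.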

Practise on TryHackMe

TryHackMe's Threat Hunting module is the most structured starting point for building practical threat hunting skills. It covers the foundational concepts and then puts you inside real environments to apply them.

The Threat Hunting: Introduction room establishes the mindset and methodology, covering the relationship between threat hunting and incident response, how to form intelligence-driven hypotheses, and the structure of a repeatable hunt process. The Threat Hunting: Foothold room then applies this in practice, working through the initial access and foothold stage of an attack from the hunter's perspective. The Threat Hunting with YARA room develops pattern-matching skills using real malware samples and exercises that build confidence with a tool that professional hunters use daily.

These rooms are part of a broader progression. The SOC Level 1 path builds the log analysis, SIEM proficiency, and Windows event log knowledge that threat hunting depends on. The SAL1 certification validates the SOC skills that threat hunting builds on top of, and is the natural credential for anyone developing toward a threat hunting or L2 analyst role.

The Skill That Separates Hunters From Analysts

Threat hunting is not a different set of tools from alert triage. It is a different way of using the same tools. The underlying skill is the ability to form a testable hypothesis about adversary behaviour, translate it into queries that surface relevant evidence, and follow threads through data until you reach a conclusion with confidence.

That skill develops through repetition with real data. Reading about MITRE ATT&CK techniques and then querying EVTX-ATTACK-SAMPLES to find the traces those techniques leave is one iteration. Working through a TryHackMe threat hunting room that puts you inside a simulated environment and asks you to find evidence of specific adversary behaviour is another. Every hunt you complete, document, and review builds the pattern recognition that makes the next one faster.
