Collection Plan and Types of Sources
After defining the Terms of Reference (ToR), Priority Intelligence Requirements (PIRs), and Requests for Information (RFIs), the CTI team must draw up an intelligence collection plan. The plan is recorded in a physical or digital document, typically a worksheet (such as an Excel spreadsheet). It fills several roles within the project:
- Strategic: provides an audit trail of the team’s production and gives senior management data on the project’s progress.
- Operational: serves as the central coordination point for the team’s effort and as the basis for associating PIRs with sources and agencies (SANDAs), further refining PIRs if necessary, and turning PIRs into RFIs.
- Tactical: attaches to each RFI the metadata needed to monitor and control its delivery. Examples include the RFI’s due date, format, associated SANDAs, and search terms.
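To make the tactical role concrete, the following is a minimal sketch of one worksheet row and its export to a spreadsheet-friendly CSV. The class name, field names, and sample values are hypothetical illustrations, not a prescribed schema; a real plan would follow the team's own template.

```python
import csv
import io
from dataclasses import dataclass, field, asdict
from datetime import date


@dataclass
class RfiEntry:
    """One worksheet row (hypothetical layout): an RFI plus its tracking metadata."""
    rfi_id: str
    pir_id: str                  # the PIR this RFI was refined from
    question: str
    due_date: date
    output_format: str           # e.g. "written report", "IOC list"
    sandas: list[str] = field(default_factory=list)        # sources and agencies
    search_terms: list[str] = field(default_factory=list)

    def is_overdue(self, today: date) -> bool:
        """True once the due date has passed."""
        return today > self.due_date


def to_csv(entries: list[RfiEntry]) -> str:
    """Serialize the plan to CSV text that a spreadsheet application can open."""
    columns = ["rfi_id", "pir_id", "question", "due_date",
               "output_format", "sandas", "search_terms"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for entry in entries:
        row = asdict(entry)
        row["due_date"] = entry.due_date.isoformat()
        # Flatten list-valued cells so each RFI stays on a single row.
        row["sandas"] = "; ".join(entry.sandas)
        row["search_terms"] = "; ".join(entry.search_terms)
        writer.writerow(row)
    return buffer.getvalue()


# Illustrative data only.
plan = [
    RfiEntry(
        rfi_id="RFI-001",
        pir_id="PIR-01",
        question="Which phishing kits target our sector?",
        due_date=date(2024, 5, 1),
        output_format="written report",
        sandas=["OSINT", "HUMINT"],
        search_terms=["phishing kit", "credential harvesting"],
    )
]
print(to_csv(plan))
```

Keeping the due date and SANDAs on the same row as the RFI is what lets the worksheet double as a monitoring tool: a simple filter on `is_overdue` shows which deliveries need attention.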
Just as the analyst must refine PIRs into RFIs, they must refine SANDAs in a similar way, so that each RFI has clearly defined sources. This process requires an adequate grasp of the different types of sources available and of the purpose this categorization serves within the intelligence project and, more broadly, within the organization.
Understanding the types of sources helps:
- to discriminate, from the management standpoint, the infrastructures, technological processes, skills, and costs necessary to access and work with a given source;
- to establish a closer view of the specific data each source may provide when compared to others;
- to better establish the security measures necessary to safeguard the team while interacting with each source; e.g., dealing with a well-known intelligence provider requires different policies and procedures than corresponding with potential cyber criminals on Internet Relay Chat (IRC);
- to provide more granular legal compliance for each source or agency, i.e., understanding the legal risks of interacting with each source and the necessary rules of engagement;
- to associate the correct handling and processing standards/codes (e.g., Police 5x5x5 and NATO Admiralty Code).
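As an illustration of the last point, the sketch below decodes a two-character Admiralty Code grade into its source-reliability and information-credibility parts. The labels follow the commonly published wording of the scale, which varies slightly between publications; the function name and the validation behaviour are assumptions made for this example.

```python
# Source reliability (letter) and information credibility (digit),
# using the commonly published wording of the Admiralty Code.
RELIABILITY = {
    "A": "Completely reliable",
    "B": "Usually reliable",
    "C": "Fairly reliable",
    "D": "Not usually reliable",
    "E": "Unreliable",
    "F": "Reliability cannot be judged",
}

CREDIBILITY = {
    "1": "Confirmed by other sources",
    "2": "Probably true",
    "3": "Possibly true",
    "4": "Doubtful",
    "5": "Improbable",
    "6": "Truth cannot be judged",
}


def decode_grade(grade: str) -> tuple[str, str]:
    """Split a grade such as 'B2' into (source reliability, information credibility)."""
    cleaned = grade.strip().upper()
    if len(cleaned) != 2 or cleaned[0] not in RELIABILITY or cleaned[1] not in CREDIBILITY:
        raise ValueError(f"not a valid Admiralty Code grade: {grade!r}")
    return RELIABILITY[cleaned[0]], CREDIBILITY[cleaned[1]]


reliability, credibility = decode_grade("B2")
print(f"Source: {reliability}; information: {credibility}")
```

The two axes are graded independently: a usually reliable source can still pass on doubtful information, which is why the grade keeps the letter and the digit separate.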
Intelligence Source Types
Each source type name carries the “INT” suffix, with the prefix varying accordingly: HUMINT, OSINT, SOCMINT, IMINT, SIGINT, and ELINT. The aim in the collection worksheet is to associate each source type (INT) with the relevant intelligence requirements (IRs), which, at a more specific level, means pairing SANDAs with RFIs.
Human Intelligence (HUMINT) involves direct or indirect contact with relevant human sources. It may be overt, i.e., both sides know each other’s identities, or covert, where the analyst and the source do not know each other’s identities. Different policies and controls must be in place depending on the organization's exposure and risk appetite.
Open-Source Intelligence (OSINT) encompasses all publicly and freely accessible information. It includes sources like news outlets, government websites, social media profiles, domain information, IoT device data (e.g., Shodan), and IRC channels. Web browsers and search engines are the typical tools for OSINT research. Given the volume of data, however, it’s helpful to use meta-search engines (e.g., Dogpile), which aggregate results from multiple search engines (e.g., Google, Bing, and Yahoo). Specialized collection tools (e.g., ThreatMiner) may help reach surface web locations and CTI-relevant data that are not readily or directly accessible.
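The aggregation idea behind a meta-search engine can be sketched as merging several result lists and removing duplicates. The example below is a simplified illustration and not how any particular product works: it calls no real search API, the engine names and URLs are hypothetical, and the normalization rules (dropping "www.", trailing slashes, and fragments) are assumptions chosen for brevity.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Reduce trivially different URLs to one key (assumed, simplified rules)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    # Drop the fragment: it points inside the same page.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


def merge_results(results_by_engine: dict[str, list[str]]) -> list[tuple[str, list[str]]]:
    """Merge per-engine URL lists, recording which engines returned each URL."""
    merged: dict[str, list[str]] = {}
    for engine, urls in results_by_engine.items():
        for url in urls:
            engines = merged.setdefault(normalize_url(url), [])
            if engine not in engines:
                engines.append(engine)
    # URLs returned by more engines come first; the sort is stable,
    # so ties keep their first-seen order.
    return sorted(merged.items(), key=lambda item: -len(item[1]))


# Hypothetical results, as if already fetched from two engines.
results = {
    "engine_a": ["https://www.example.com/report/", "https://example.org/x"],
    "engine_b": ["https://example.com/report#top"],
}
for url, engines in merge_results(results):
    print(url, engines)
```

Ranking by how many engines returned a URL is only one possible heuristic; an analyst may prefer to keep each engine's original ordering for traceability.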
OSINT also encompasses collecting dark web data, i.e., data from sites that standard search engines don’t index and that are reachable only through anonymizing overlay networks. Collection may involve accessing sites that host legal or illegal activity, like dark markets and hacking forums, so anonymity and encryption are necessary. Specialized web browsers like Tor Browser provide those functionalities.
Lastly, there’s the notion of the deep web, understood here as content that search engines can locate but not fully access, of which social media is the most relevant example. While search engines may find social media pages, they can’t provide access to much of their content, which typically sits behind a login and so remains hidden. Given such unique collection demands, social media intelligence is a legitimate OSINT derivation.
When researching OSINT, the CTI analyst must consider a few critical parameters. Depending on the question to be answered, the analyst might take a narrow approach (for specific RFIs) or broaden the collection effort, which may yield more data but also introduce barriers, like different languages. Some sources are easily accessible, less risky, and free (most open sources, like ThreatMiner), while others (closed sources) require payment and more deliberate and careful OPSEC (e.g., avoiding browser fingerprinting, creating covert personas, or using more technical access methods). Other parameters include the available budget and legal constraints.
For Social Media Intelligence (SOCMINT), aggregation tools like X Pro are necessary to consolidate, effectively and efficiently, the vast volumes of data these platforms generate. Hacktivists are threat actors often targeted by SOCMINT collection.
Operational Intelligence (OPINT) is a real-time collection method that gathers data from sources such as logs, sensors, and data feeds. It typically supports faster decision-making.
Other source types are electronic intelligence (ELINT), which collects data from electronic device emissions; imagery intelligence (IMINT), which is extracted from images (ranging from pictures found on the Internet to satellite imagery); and signals intelligence (SIGINT), which results from collecting data and metadata from telecommunications (e.g., telephones).
For CTI, HUMINT, OSINT, and SOCMINT are the most common sources. OPINT and IMINT may also be used. SIGINT and ELINT are rarer in CTI practice.