Agentless data collection

In the field of information technology, agentless data collection involves collecting data from computers without installing any new agents on them.

What is an agent?

For the purpose of this discussion, an agent is a software program (sometimes called a service or daemon) that runs on a computer primarily to collect information and push it over the network to a central location. Alternatively, an agent may re-publish the information in a standard format, such as SNMP, so that the central location can then collect it over the network.

The traditional approach to data collection involves installing agents on every computer from which data is needed. Sometimes this installation step is performed manually on each computer; at other times it is automated by a centralized installation server that pushes software to the other computers. In either case, the cost of installation (and of subsequent maintenance and upgrades) is typically proportional to the number of computers on which agents must be installed, which in turn equals the number of computers from which data is needed.

Agentless approach

In the agentless approach, data is collected from computers without installing additional agents. This is accomplished by obtaining data from software that is already installed on the computer, including the operating system and previously installed commercial products (or commercial products that can run without being installed). In many cases, the programs and protocols already present on a computer are more than sufficient to provide the desired information.
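As a minimal sketch of this idea, the Python example below gathers the output of a command from several hosts through an SSH server that is assumed to be running on each of them already, so nothing new is installed on the targets. The function names, the host names, and the choice of `uptime` as the remote command are illustrative assumptions, not part of any particular product. The sketch also assumes that an `ssh` client is available on the collecting machine and that key-based login has been set up.

```python
import subprocess
from typing import Callable, Dict, List


def build_ssh_command(host: str, remote_command: str) -> List[str]:
    """Build the argument list for running one command on a remote host.

    BatchMode=yes makes ssh fail instead of prompting for a password,
    which suits unattended collection.
    """
    return ["ssh", "-o", "BatchMode=yes", host, remote_command]


def run_command(argv: List[str]) -> str:
    """Run a local command and return its standard output as text."""
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=30, check=True
    )
    return result.stdout


def collect(
    hosts: List[str],
    remote_command: str,
    runner: Callable[[List[str]], str] = run_command,
) -> Dict[str, str]:
    """Collect the output of remote_command from every host.

    The runner is injectable so the function can be exercised without
    a network. A failure on one host is recorded and does not stop
    collection from the remaining hosts.
    """
    results: Dict[str, str] = {}
    for host in hosts:
        try:
            output = runner(build_ssh_command(host, remote_command))
            results[host] = output.strip()
        except (subprocess.SubprocessError, OSError) as exc:
            results[host] = f"error: {exc}"
    return results
```

A call such as `collect(["host-a", "host-b"], "uptime")` (hypothetical host names) would return a dictionary that maps each host to the command output or to an error string. The only software involved on the target side is the SSH server and the operating system's own utilities.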

The primary benefit of the agentless approach is that it is not necessary to install, upgrade, and maintain additional software programs on each computer from which information is needed. Software products that use this approach may have a faster rollout and a lower total cost of ownership (TCO) than software products that require agents on a substantial number of computers.

Relevant network protocols

Any network protocol that returns useful information can be employed, provided that the protocol server is already installed. The distinction between agentless and agent-based collection lies not in the specific protocol used but in whether a new protocol server (agent) must be installed.

In many cases, servers or interfaces for the following protocols and facilities are already present: log4j, CIFS, SSH, SNMP, Windows Management Instrumentation (on Windows platforms), and DTrace (on Solaris 10). Many other protocols may be helpful as well.
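Whether a given protocol can be used agentlessly depends on whether its server is already running on the target. The following Python sketch shows one way this might be checked: it attempts a TCP connection to the well-known ports for SSH (22) and CIFS (445). The function name and the port table are illustrative assumptions. SNMP is left out because it normally runs over UDP port 161, which a TCP connection attempt cannot probe.

```python
import socket
from typing import Dict

# Well-known TCP ports for two of the protocols named in the text.
# This table is an illustrative assumption; extend it as needed.
DEFAULT_PORTS: Dict[str, int] = {"SSH": 22, "CIFS": 445}


def find_available_services(
    host: str,
    ports: Dict[str, int] = DEFAULT_PORTS,
    timeout: float = 1.0,
) -> Dict[str, bool]:
    """Report which named services accept a TCP connection on host.

    A successful connection only shows that something is listening on
    the port; it does not prove that the expected protocol is spoken
    or that credentials for it are available.
    """
    available: Dict[str, bool] = {}
    for name, port in ports.items():
        try:
            # The connection is closed as soon as it is established.
            with socket.create_connection((host, port), timeout=timeout):
                available[name] = True
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            available[name] = False
    return available
```

For example, `find_available_services("server01.example.com")` (a hypothetical host name) would indicate which of the listed servers could be tried for collection before any decision is made to install an agent.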

Versus data mining

The phrase data mining is related to, but distinct from, data collection. Data mining is typically about finding useful patterns in data that is conveniently accessible in a relational database. In contrast, data collection involves extracting data from a variety of less convenient sources, although in some cases it may also involve identifying or leveraging useful patterns.

See also