Splunk Manual Study Notes: Knowledge Manager Manual, Part 2

2.3.Fields and field extractions

About fields
Fields often appear in events as key=value pairs such as user_name=Fred. In many events, however, field values appear in fixed, delimited positions without identifying keys, for example:

Nov 15 09:32:22 00224 johnz
Nov 15 09:39:12 01671 dmehta
Nov 15 09:45:23 00043 sting
Nov 15 10:02:54 00676 lscott

Splunk Enterprise can identify these fields using a custom field extraction.
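
As a rough illustration of what such a custom extraction does, the Python sketch below pulls fields out of the positional events above with a regular expression. The field names (timestamp, employee_id, user_name) are assumptions made for this example, since the events carry no keys of their own. Python writes named groups as (?P<name>...), where Splunk uses (?<name>...).

```python
import re

# Hypothetical field names; the sample events do not name their fields.
PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<employee_id>\d+)\s(?P<user_name>\w+)$"
)

def extract_fields(event):
    """Return a dict of extracted fields, or an empty dict if no match."""
    match = PATTERN.match(event)
    return match.groupdict() if match else {}

fields = extract_fields("Nov 15 09:32:22 00224 johnz")
print(fields)
```

An event that does not follow the positional layout simply yields no fields, which is why this kind of extraction is only reliable for consistently formatted data.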

About field extraction
As Splunk software processes events, it extracts fields from them. This process is called field extraction

  • Automatically-extracted fields
    Splunk software automatically extracts host, source, and sourcetype values, timestamps, and several other default fields when it indexes incoming events.
    It also extracts fields that appear in your event data as key=value pairs.
    This process of recognizing and extracting k/v pairs is called field discovery.
    When fields appear in events without their keys, Splunk software uses pattern-matching rules called regular expressions to extract those fields as complete k/v pairs.

  • To get all of the fields in your data, create custom field extractions
    Custom field extractions should take place at search time, but in certain rare circumstances you can arrange for some custom field extractions to take place at index time.

  • Before you create custom field extractions, get to know your data
    Consider two kinds of logs.

Web log:

- - [03/Jun/2014:20:49:53 -0700] "GET /wp-content/themes/aurora/style.css HTTP/1.1" 200 7464 "http://www.splunk.com/download" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0;  Trident/5.0)"
- - [03/Jun/2014:20:49:33 -0700] "GET / HTTP/1.1" 200 75017 "-" "Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)"

Cisco firewall log:

Jul 15 20:10:27 %ASA-6-113003: AAA group policy for user AmorAubrey is being set to Acme_techoutbound
Jul 15 20:12:42 %ASA-7-710006: IGMP request discarded from to outside:
Jul 15 20:13:52 %ASA-6-302014: Teardown TCP connection 517934 for Outside: to Inside: duration 0:05:02 bytes 297 Tunnel has been torn down (AMOSORTILEGIO)
Apr 19 11:24:32 PROD-MFS-002 %ASA-4-106103: access-list fmVPN-1300 denied udp for user 'sdewilde7' outside/ -> inside1/ hit-cnt 1 first hit [0x286364c7, 0x0] "

Comparing the two kinds of logs, the web log events follow a reliable, consistent format, but the firewall log events do not.
Because the firewall events differ so widely, it is difficult to create a single field extraction that applies to every event pattern and extracts the relevant field values.

  • Using required text in field extractions

Methods of custom field extraction

  • Let the field extractor build extractions for you
    The field extractor supports two methods: regular expressions and delimiter-based field extraction.
    The regular expression method is useful for extracting fields from unstructured event data, where events may follow a variety of different event patterns.
    The delimiter-based method is suited to structured event data, for example CSV files and SQL data.

  • Define field extractions with the Field extractions and Field transformations pages

  • Configure field extractions directly in configuration files
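
The delimiter-based method described above can be sketched in Python as follows. This is only an illustration of the idea (split on a delimiter, then name the positions), not how the field extractor is implemented; the sample event and the field names are hypothetical.

```python
import csv
import io

def extract_delimited(event, field_names, delimiter=","):
    """Split one structured event on a delimiter and pair values with names."""
    row = next(csv.reader(io.StringIO(event), delimiter=delimiter))
    return dict(zip(field_names, row))

# Hypothetical CSV event and field names, for illustration only.
event = '2014-06-03,alice,"Portland, OR",200'
fields = extract_delimited(event, ["date", "user", "city", "status"])
print(fields)
```

Using a CSV parser rather than a plain split keeps quoted values that contain the delimiter intact, which is the main thing a delimiter-based extraction has to get right.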

Use default fields
The fields that are extracted automatically at index time are known as default fields
Internal fields

  • _raw
    The _raw field contains the original raw data of an event. For example, to return sendmail events that contain an IP address beginning with 10:
    eventtype=sendmail | regex _raw="10\.\d\d\d\.\d\d\d\.\d\d\d"
  • _time
    The _time field contains an event's timestamp expressed in Unix time
  • _indextime
    The _indextime field contains the time that an event was indexed, expressed in Unix time
  • _cd
    Provides an address for an event within the index; it is a hidden field
  • _bkt
    Contains the ID of the bucket that an event is stored in; it is a hidden field

Default fields

  • linecount contains the number of lines an event contains
  • sourcetype
  • timestamp specifies the time at which the event occurred
  • punct field contains a punctuation pattern that is extracted from an event
  • index
  • host
  • source
  • splunk_server

When Splunk software extracts fields

  • Field extraction at index time
    At index time, Splunk software extracts a small set of default fields for each event, including host, source, and sourcetype
    Splunk software can also extract custom indexed fields at index time
    Avoid adding custom fields to the set of default fields that Splunk software extracts and indexes at index time unless absolutely necessary, because doing so can degrade indexing performance and search times

  • Field extraction at search time
    Splunk software can extract additional fields, depending on its Search Mode setting and whether that setting enables field discovery given the type of search being run

  • Example of automatic field extraction
    When you search on sourcetype, you are using a default field that Splunk software extracts for every event at index time

About regular expressions with field extractions

In inline field extractions, the regular expression is in props.conf, with one regular expression per field extraction configuration.
In transform extractions, the regular expression is in transforms.conf while the field extraction is in props.conf.

  • Regular expressions
  • Proper field name syntax
    Field names must conform to the field name syntax rules:
    Valid characters for field names are a-z, A-Z, 0-9, ., :, and _
    Field names cannot begin with 0-9 or _
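
The naming rules above translate directly into a small check. The Python sketch below encodes only the two rules stated in these notes; the function name is made up for this example and is not part of any Splunk API.

```python
import re

# Allowed characters per the notes above: a-z, A-Z, 0-9, '.', ':' and '_'.
# The first character may not be a digit or an underscore.
FIELD_NAME = re.compile(r"[A-Za-z.:][A-Za-z0-9.:_]*")

def is_valid_field_name(name):
    """Return True when the whole name satisfies both naming rules."""
    return FIELD_NAME.fullmatch(name) is not None

for candidate in ["err_code", "9lives", "_internal", "user-name"]:
    print(candidate, is_valid_field_name(candidate))
```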

2.4.Use the field extractor in Splunk Web

Build field extractions with the field extractor
Overview of the field extractor

Access the field extractor

  • Bottom of the fields sidebar
  • All Fields dialog box
  • Any event in the search results
  • Access the field extractor through the Field Extractions page in Settings
  • Access the field extractor through the Home page
  • Access the field extractor after you add data

Field Extractor: Select Sample step
Field Extractor: Select Method step

Field Extractor: Select Fields step
The Select Fields step of the field extractor is for regular-expression-based field extractions only

Field Extractor: Rename Fields step
The Rename Fields step of the field extractor is for delimiter-based field extractions only

Field Extractor: Validate step
The Validate step of the field extractor is for regular-expression-based field extractions only

Field Extractor: Save step
In the Save step of the field extractor you define the name of the new field extraction definition, set its permissions, and save the extraction
Note: The extraction name cannot include spaces.

2.5.Use the settings pages for field extractions in Splunk Web

Use the Field extractions page
Review search-time field extractions in Splunk Web
Field extractions can be set up entirely in props.conf, in which case they are identified on the Field extractions page as inline field extractions.
Some field extractions include a transforms.conf component; these are called transform field extractions.

Use the Field transformations page
Why set up a field transform for a field extraction?

  • Reuse the same field-extracting regular expression across multiple sources, source types, or hosts
  • Apply more than one field-extracting regular expression to the same source, source type, or host
  • Use a regular expression to extract fields from the values of another field

A typical field transform looks like this in transforms.conf:

[banner]
REGEX = /js/(?<license_type>[^/]*)/(?<version>[^/]*)/login/(?<login>[^/]*)

In props.conf, that transform is matched to the source .../banner_access_log* like so:

[source::.../banner_access_log*]
REPORT-banner = banner
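
To see what the transform's regular expression captures, it can be tried outside Splunk. The Python sketch below uses the same pattern rewritten with Python's (?P<name>...) group syntax. The sample path is hypothetical, since the actual format of the banner access log is not shown in these notes.

```python
import re

# Same pattern as the transform, using Python's named-group syntax.
BANNER = re.compile(
    r"/js/(?P<license_type>[^/]*)/(?P<version>[^/]*)/login/(?P<login>[^/]*)"
)

# Hypothetical event text, invented for this example.
event = "/js/enterprise/6.1/login/fred"
match = BANNER.search(event)
fields = match.groupdict() if match else {}
print(fields)
```

Each [^/]* group captures one path segment, so a single expression yields three fields at once.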


2.6.Use the configuration files to configure field extractions

Configure custom fields at search time
You can set up and manage search-time field extractions via Splunk Web. You cannot configure automatic key-value field extractions through Splunk Web
You can locate props.conf and transforms.conf in $SPLUNK_HOME/etc/system/local/, or your own custom app directory in $SPLUNK_HOME/etc/apps/

  • Types of field extraction
  • When to use inline or transform extractions

Configure inline extractions

  • Identify the source type, source, or host that provides the events that your field should be extracted from
  • Configure a regular expression that identifies the field in the event
  • Follow the format for the EXTRACT field extraction type to configure a field extraction stanza in props.conf
  • If your field value is a subtoken, you must also add an entry to fields.conf
  • Restart Splunk Enterprise

Configure advanced extractions with field transforms

  • Identify the source type, source, or host that provides the events that your field is extracted from.
  • Configure a regular expression that identifies the field in the event
  • Configure a field transform in transforms.conf that utilizes this regular expression or delimiter configuration.
    The transform can define a source key and event value formatting
  • Follow the format for the REPORT field extraction type to configure a field extraction stanza in props.conf that uses the host, source, or source type identified earlier
  • Restart your Splunk deployment for your changes to take effect

Configure automatic key-value field extraction
Automatic key-value field extraction is a search-time field extraction configuration that uses the KV_MODE attribute to automatically extract fields for events associated with a specific host, source, or source type
You can configure it to extract fields from structured data formats like JSON, CSV, and from table-formatted events. Automatic key-value field extraction cannot be configured in Splunk Web, and cannot be used for index-time field extractions

Automatic key-value field extraction format:

KV_MODE = [none|auto|auto_escaped|multi|json|xml]

Disabling automatic extractions for specific sources, source types, or hosts
Add KV_MODE = none for the appropriate [<spec>] in props.conf

KV_MODE = none
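
To make the effect of the setting concrete, the Python sketch below imitates just two of the modes, none and auto. It is a simplified stand-in written for these notes: Splunk's real key-value discovery rules are more elaborate, and the function name and sample event are assumptions.

```python
import re

# Simplified stand-in for key=value discovery: a key made of word characters,
# an equals sign, then a double-quoted or unquoted value.
KV_PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')

def discover_fields(event, kv_mode="auto"):
    """Return key/value pairs found in the event; nothing when kv_mode is 'none'."""
    if kv_mode == "none":
        return {}
    return {key: value.strip('"') for key, value in KV_PAIR.findall(event)}

event = 'user_name=Fred action="file upload" status=200'
print(discover_fields(event))
print(discover_fields(event, kv_mode="none"))
```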

Example inline field extraction configurations

EXTRACT-errors = device_id=\[\w+\](?<err_code>[^:]+)
EXTRACT-port_flapping = Interface\s(?<interface>(?<media>[^\d]+)(?<slot>\d+)\/(?<port>\d+))\,\schanged\sstate\sto\s(?<port_status>up|down)

The first extraction yields the err_code field. The second yields five fields as named groups: interface, media, slot, port, and port_status
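
The port-flapping extraction relies on nested named groups: interface wraps media, slot, and port. The Python sketch below shows how one match produces all five fields. The "changed state to up|down" tail of the pattern and the sample event are assumptions modeled on a typical Cisco IOS interface message, and the groups use Python's (?P<name>...) syntax.

```python
import re

PORT_FLAPPING = re.compile(
    r"Interface\s(?P<interface>(?P<media>[^\d]+)(?P<slot>\d+)/(?P<port>\d+)),"
    r"\schanged\sstate\sto\s(?P<port_status>up|down)"
)

# Hypothetical event, invented for this example.
event = "02-14-10 05:43:34 Interface GigabitEthernet9/16, changed state to down"
match = PORT_FLAPPING.search(event)
fields = match.groupdict() if match else {}
print(fields)
```

The outer group keeps the full interface name available as one field while the inner groups expose its parts for separate reporting.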

Create a field from a subtoken
Add an entry to fields.conf


Example transform field extraction configurations
Configure a field extraction that uses multiple field transforms
Configure delimiter-based field extractions

2.7.Configure extractions of multivalue fields with fields.conf

You can use the TOKENIZER setting to define a multivalue field in fields.conf
The TOKENIZER setting is used by the where, timeline, and stats commands
TOKENIZER multivalue field configuration syntax:

[<field name 1>]
TOKENIZER = <regular expression>

[<field name 2>]
TOKENIZER = <regular expression>
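
As I understand the setting, the tokenizer expression is matched repeatedly against the raw field value, and the first capture group of each match becomes one value of the multivalue field. The Python sketch below imitates that behavior under this assumption; the email-style expression and the sample value are hypothetical.

```python
import re

def tokenize(raw_value, tokenizer):
    """Return the first capture group of every match as one value of the field."""
    return [m.group(1) for m in re.finditer(tokenizer, raw_value)]

# Hypothetical tokenizer for a field holding several email recipients.
EMAIL_TOKENIZER = r"(\w[\w.\-]*@[\w.\-]*\w)"
values = tokenize("alice@example.com, bob.smith@example.org", EMAIL_TOKENIZER)
print(values)
```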

2.8.Calculated fields

Calculated fields are fields added to events at search time that perform calculations with the values of two or more fields already present in those events
Calculated fields enable you to define fields with eval expressions
Calculated fields come fifth in the search-time operations sequence
Calculated fields can reference all types of field extractions and field aliasing, but they cannot reference lookups, event types, or tags

  • Preventing overrides of existing fields
    If you do not want the calculated field to override existing fields when the eval statement returns a value, use:

EVAL-field = coalesce(field, <eval expression>)

    If you do not want the calculated field to override existing fields when the eval statement returns null, use:

EVAL-field = coalesce(<eval expression>, field)
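
The difference between the two forms comes down to argument order, because coalesce returns its first non-null argument. A minimal Python sketch of that behavior, with None standing in for null and placeholder strings standing in for real field values:

```python
def coalesce(*values):
    """Return the first argument that is not None, mimicking eval's coalesce()."""
    for value in values:
        if value is not None:
            return value
    return None

existing, computed = "from_event", "from_eval"

# coalesce(field, <eval expression>): an existing value always wins.
keep_existing = coalesce(existing, computed)
# coalesce(<eval expression>, field): the existing value is only a fallback.
prefer_computed = coalesce(computed, existing)
fallback = coalesce(None, existing)

print(keep_existing, prefer_computed, fallback)
```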

Calculated fields independence

EVAL-x = x * 2
EVAL-y = x * 2

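Calculated fields are evaluated independently of one another, so every expression sees the original value of x. For a specific event with x=4, these calculated fields would replace the value of x with 8 and would add y=8 to the event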
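
The independence rule can be modeled by evaluating every expression against a snapshot of the original event before merging any results. The Python sketch below does exactly that; the helper name is made up for illustration and says nothing about how Splunk implements it.

```python
def apply_calculated_fields(event, expressions):
    """Evaluate every expression against the original event, then merge results."""
    original = dict(event)  # snapshot so no expression sees another's output
    results = {name: expr(original) for name, expr in expressions.items()}
    return {**event, **results}

expressions = {
    "x": lambda e: e["x"] * 2,  # EVAL-x = x * 2
    "y": lambda e: e["x"] * 2,  # EVAL-y = x * 2
}
result = apply_calculated_fields({"x": 4}, expressions)
print(result)
```

If the expressions were chained instead, y would be computed from the already doubled x and come out as 16, which is not what happens.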

Create calculated fields with Splunk Web

  • Select Settings > Fields
  • Select Calculated Fields > New
  • Select the app that will use the calculated field
  • Select host, source, or sourcetype to apply to the calculated field and specify a name. You can also enter a wildcard if you want to apply this to all hosts, sources, or sourcetypes
  • Name the resultant calculated field
  • Define the eval expression

Configure calculated fields with props.conf
Add the calculated field to the appropriate [<spec>] stanza in props.conf, for example:

[<spec>]
EVAL-Description = case(Depth<=70, "Shallow", Depth>70 AND Depth<=300, "Mid", Depth>300 AND Depth<=700, "Deep")
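
The case() expression walks its conditions in order and returns the value paired with the first one that is true. The Python sketch below mirrors the expression above so the boundaries are easy to check; note that the expression has no default branch, so a Depth above 700 produces no Description value (None here).

```python
def describe_depth(depth):
    """Mirror the case() expression; returns None when no condition matches."""
    if depth <= 70:
        return "Shallow"
    if 70 < depth <= 300:
        return "Mid"
    if 300 < depth <= 700:
        return "Deep"
    return None

for depth in (70, 71, 300, 700, 701):
    print(depth, describe_depth(depth))
```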