Splunk

Splunk Enterprise Fundamentals Part 2 Module 5

1.Introduction

Using my earlier Fundamentals Part 1 notes, I passed the Core User certification exam. Next month I will prepare for the Power User certification exam. This time I have no training materials, so I can only use the official documentation to study.

2.Start Course

2.1.Using Transforming Commands for Visualizations

This course covers the following topics:

  • Explore data structure requirements
  • Explore visualization types
  • Create and format charts and timecharts
    Dashboards and Visualizations

2.1.1.Visualization reference

Compare the options and select a visualization that shows the data insights you need. The reference includes an overview table of all visualization types.

2.1.2.Data structure requirements for visualizations

Visualizations require search results in specific formats or data structures. Write queries that generate results in the correct format for the visualization that you are building.

2.1.2.1.Event list

Add an events list to a dashboard to give users access to the events, fields, and values generated by a search. Unlike a chart or other visualization, an events list does not abstract or process search results.

  • From the Search page, run a search
  • Select the Events tab to view the events list
  • (Optional) Select Save As > Dashboard Panel to add the events list to a dashboard
  • (Optional) Use the Format menu or Simple XML to configure the events list

Event display options

  • List (default)
  • Raw
  • Table (this "Table" is different from the Statistics Table visualization)

Format events

  • row number
  • wrapping
  • maximum lines

Drilldown option

  • Full
  • Inner
  • Outer
  • None(default)

Example scenario:
error OR failed OR severe OR ( sourcetype=access_* ( 404 OR 500 OR 503 ) )
Clicking /opt/splunk/var/log/splunk/splunkd.log in the results produces the following search:
* source="/opt/splunk/var/log/splunk/splunkd.log"

2.1.2.2.Table

Generate a table
To generate a table, write a search that includes a transforming command. From the Search page, run the search and select the Statistics tab to view and format the table.
Example 1: use the chart command
index=_internal | chart avg(bytes) over sourcetype
Example 2: use the stats command
index=_internal | stats count by action,host
Example 3: use the table command
index=_internal | stats count by action,host | table host count
Example 4: add the sparkline function
index=_internal | chart sparkline count by sourcetype
Format the table visualization
Summary and data row differences
A static summary row fits most use cases. If you generate a totals data row with the addcoltotals SPL command in a search, note the following effects on table behavior:

  • An addcoltotals row is treated as a data row in the table
  • Because they are handled as data rows, addcoltotals rows are included in a PDF or CSV dashboard export
  • Color scales or data overlay can be skewed if a table includes an addcoltotals data row
  • A table should not include both an addcoltotals data row and a column totals summary row. If you opt to include a totals summary row, adjust the search to remove the addcoltotals command

Summary row examples
Example:

... | chart count(itemId) over categoryId by action

Enabling the totals summary row in the Format menu adds a totals row to the table.
Percentage summary row
Example:

... | chart count(itemId) over action by categoryId

Enabling the percentage summary row in the Format menu adds a percentage row to the table.
Format table columns
Column formatting is not available for columns representing the _time field or for sparkline columns
column color

  • Scale
  • Ranges
  • Values

Scale color options

  • Sequential (use a sequential scale to show how results approach a high value in the column)
  • Divergent (use a divergent scale to show how results approach high and low values)

You can also configure a custom color scale.
Range Configuration options

  • Adjust the default range value and color settings
  • add or remove ranges

Values
Use automatic value coloring or define custom rules. Cells with the same value appear in the same color

number format

  • enable or disable number formatting
  • set decimal precision
  • opt to use thousand separators
  • Specify a measurement unit to add context to the values in this column. You can position the unit before or after each value

configure properties

  • The number of rows shown in each table page
  • Wrapping
  • Table row number display

Data overlay

  • Heat map
  • High and low value

Column color formatting overrides data overlay configurations, so use data overlay only on tables without column color formatting

Drilldown

2.1.2.3.charts

1. Pie chart
Data formatting
Pie charts represent a single data series.

... | stats count by source

Pie chart configuration options
You can use the Format menu to configure the following pie chart components:

  • Drilldown
  • Minimum size (set a minimum percentage size to apply when there are more than 10 slices; data values below the minimum percentage are combined into an "other" slice)

Create a pie chart

  • Write a search that uses a transforming command to aggregate values in a field
  • Run the search
  • Select the Statistics tab below the search bar. The statistics table here should have two columns
  • Select the Visualization tab and use the Visualization Picker to select the pie chart visualization
  • (Optional) Use the Format menu to configure the visualization

Example 1:

... | stats count by Code

Example 2:

index = _internal | chart avg(bytes) over source

2. Column and bar charts
Use column and bar charts to compare field values across a data set.
Data formatting
Column and bar charts represent one or more data series. To make sure that a search generates one or more series, check the Statistics tab. The table should have at least two columns. Search results not structured as a table with valid x-axis or y-axis values cannot generate column or bar charts. For example, using the eval or fields commands might change the search result structure.
Statistics table order and chart axes

  • Column charts get x-axis values from the first column in the table. The next table columns contain y-axis values
  • Bar charts get y-axis values from the first column in the table. The next table columns contain x-axis values

Example:
Any search using the timechart command generates a table where _time is the first column. A column chart generated with this search has a _time x-axis. A bar chart using this search has a _time y-axis.
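
To keep the axis rule straight, here is a minimal Python sketch of how the first Statistics column decides the axes. It only models the rule described above; assign_axes and its dictionary output are illustrative names of my own, not part of Splunk.

```python
def assign_axes(columns, chart_type):
    """Map Statistics-table columns (in table order) to chart axes.

    Hypothetical helper: models the rule that the first column feeds the
    x-axis of a column chart and the y-axis of a bar chart.
    """
    if len(columns) < 2:
        raise ValueError("need at least two columns: one category column and one series")
    first, series = columns[0], list(columns[1:])
    if chart_type == "column":
        return {"x": first, "y": series}
    if chart_type == "bar":
        return {"x": series, "y": first}
    raise ValueError(f"unsupported chart type: {chart_type}")


# A timechart search produces a table whose first column is _time.
timechart_columns = ["_time", "count"]
print(assign_axes(timechart_columns, "column"))  # _time lands on the x-axis
print(assign_axes(timechart_columns, "bar"))     # _time lands on the y-axis
```

The same function explains why reordering columns with the table command changes what a chart shows: only the position of the first column matters.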

Single and multiple data series
Single data series

... | chart avg(bytes) over source

Column and bar charts represent this single series differently:
Column chart
source values are used for the x-axis, and avg(bytes) values for the y-axis
Bar chart
source values are used for the y-axis, and avg(bytes) values for the x-axis
Multiple data series

... | chart avg(bytes) over source by clientip
Configuration options

  • Chart titles
  • Axis titles
  • Minimum and maximum axis values
  • Use a logarithmic unit scale. This option is helpful when there are very small and very large axis values
  • Specify whether to abbreviate y-axis numerical values. For example, a value of 20,000 will be abbreviated to 20K if you toggle this option to On. Only y-axis values can be abbreviated in column and bar charts
  • Chart legend placement and text truncation
  • Label rotation

Multiple series options
Multi-series mode: Compare trends across multiple series. Enable the mode to show independent axis ranges for each series
Stacked charts: Use a stacked chart to see more details for values in a particular field

Create a column or bar chart

  • Write a search that generates one or more data series
  • Run the search
  • Select the Statistics tab below the search bar. The statistics table here should have two or more columns
  • Select the Visualization tab and use the Visualization Picker to select the column or bar chart visualization
  • (Optional) Use the Format menu to configure the visualization

Example:
Bar chart

index=_internal "group=pipeline" | stats sum(cpu_seconds) as totalCpuSeconds by processor | sort 10 totalCpuSeconds desc

Stacked column chart

... | timechart count by Code | fields _time L B N

3. Line and area charts
Use line and area charts to track value trends over time. You can also use a line or area chart x-axis to represent a field value other than time.
Data formatting
Line charts can represent one or more data series. Area charts represent multiple data series. The Statistics table should have at least two columns for a single series, and three or more columns for multiple series.

  • Statistics table order and chart axes
    Any search using the timechart reporting command generates a table where _time is the first column. A line or area chart generated with this search has a _time x-axis.
    Search results not structured as a table with valid x-axis or y-axis values cannot generate line or area charts. For example, using the eval or fields commands might change the search result structure.
  • Single and multiple data series
    Line and area charts represent multiple series. Line charts can also be used for a single data series, but area charts cannot.

Single series

... | chart avg(bytes) over source

source values are used for the x-axis. The y-axis represents avg(bytes) values.
Multiple data series
To generate multiple data series, introduce the timechart command to add a _time field to search results.

Configuration options

  • Chart title
  • Axis titles
  • Show minimum and maximum y-axis values
  • Use a logarithmic unit scale for y-axis values. This option is helpful when there is a wide range in y-axis values
  • Specify whether to abbreviate y-axis numerical values. For example, a value of 20,000 will be abbreviated to 20K if you toggle this option to On. Only y-axis values can be abbreviated in area and line charts
  • Chart legend position and label truncation
  • Null y-axis value handling. Choose one of the following options:
    Show null data points as a gap. The chart shows markers for any disconnected data points in this case
    Connect null data points to zero data points
    Connect to the next positive data point

Multiple series options

  • Multi-series mode
  • Stacked area charts
    Stacked area charts are available when a search generates multiple data series. Stacking is not available for line charts

Create a line or area chart

  • Write a search that generates multiple data series. If you are building a line chart you can opt to generate a single data series
  • Run the search
  • Select the Statistics tab below the search bar. The statistics table here should have two or more columns
  • Select the Visualization tab and use the Visualization Picker to select the line or area chart visualization
  • (Optional) Use the Format menu to configure the visualization

Example:
Line chart

index=_internal | timechart count by sourcetype

Area chart

index=_internal source=*metrics.log group=search_concurrency "system total" NOT user=*
| timechart max(active_hist_searches) as "Historical Searches" max(active_realtime_searches) as "Real-time Searches"

Stacked area chart

sourcetype=access_* status=200 action=purchase categoryId!=NULL | timechart count(categoryId) by categoryId

4. Scatter chart
Data formatting
Scatter charts work best with two data series. Make sure that there are three columns in the Statistics table. Use the table command to change the order of the columns if needed.
Configuration options

  • Axis titles
  • Legend placement and truncation
  • Axis scale and interval values
  • Axis minimum and maximum values
  • Abbreviate y-axis and x-axis numerical values

Create a scatter chart

  • Write a search that generates two data series
  • Run the search
  • Select the Statistics tab below the search bar. The statistics table here should have three columns
  • Select the Visualization tab and use the Visualization Picker to select the scatter chart visualization
  • (Optional) Use the Format menu to configure the visualization

example:

source="earthquake.csv" | table Region Magnitude Depth

5. Bubble chart
Use a bubble chart to visualize multiple series data in three dimensions. Bubble position represents two dimensions of the data series. Bubble size represents the third dimension

create a Bubble chart

  • Write a search that generates three data series
  • Run the search
  • Select the Statistics tab below the search bar. The statistics table here should have four columns
  • Select the Visualization tab and use the Visualization Picker to select the bubble chart visualization
  • (Optional) Use the Format menu to configure the visualization

example:

source="earthquakes.csv" | stats count by Region, Magnitude, Depth


2.1.2.4.Single Value

Single value visualizations work best for queries that create a time series chart using the timechart command or aggregate data using the stats command

  • Use timechart to generate a single value

index=_internal source="*splunkd.log" log_level="error" | timechart count

Using timechart means that time series data becomes available to sparkline and trend indicator processing.
Note: If you pipe to stats as part of a full timechart query, the visualization does not include a sparkline or trend indicator.

  • Use stats to generate a single value

index = _internal source = "*splunkd.log" log_level = "error" | stats count

  • Queries and time ranges for single values
    Search for a single value to avoid unexpected results in the visualization. In the Dashboard Editor, you can select single value visualizations even if a search returns multiple values. In this case, the single value visualization uses the value in the first cell of the results table

A query using stats results in a visualization showing the aggregated total of results in the time range. A query using timechart generates a visualization showing the most recent result within that range

  • Queries to generate a sparkline and trend indicator
    Using the time range picker to select Week to date means that the sparkline reflects the data changes over the last seven days
    Using the time range picker to select Today means that the sparkline shows data changes over the past twenty-four hours
    To include sparklines and trend indicators in a visualization, it is important that the search includes a timechart command

2.1.2.5.Gauges

Gauge types

  • Radial gauge
  • Filler gauge
  • Marker gauge


example:

index=_internal source="*splunkd.log" log_level="error" | stats count as errors

Configuration options
Format > Color Ranges
Set the Color Ranges handling to Automatic if the query includes the gauge command for range configuration
If the query includes gauge, Format menu range configurations override the gauge command settings in the query
Create a gauge visualization

  • Write a search that generates a single aggregated value
  • Run the search
  • Select the Visualization tab and use the Visualization Picker to select a radial, filler, or marker gauge
  • (Optional) Use the Format menu to configure the visualization

2.1.2.6.Map

  • Components for building geographic visualizations
  • Use normalized data
    Choropleth maps work best when data is normalized. Normalization adjusts data to more accurately reflect the metric that you are visualizing
  • Test custom lookup files
    You can use the inputlookup command to make sure that custom lookup files are working properly before building a Choropleth map
  • Show all features on a map regardless of data coverage
    If you have a data set that does not include values to aggregate for every feature in a Choropleth map, you can use the geom command allFeatures parameter to show all shapes on the map when it renders

Create a Choropleth map
You need data with coordinates, a transforming search, and a geospatial lookup to build a Choropleth map

source=my_data_source.csv 
| lookup geo_us_states longitude as Longitude, latitude as Latitude 
| stats count by featureId 
| geom geo_us_states

2.1.3.Data and formatting requirements

Many visualizations require a search that uses a transforming command, such as stats, chart, timechart, or geostats, to render
Charts visualize one or more data series
Single value and gauge visualizations represent a single numerical value
Maps combine a query and other data components, including data with coordinates or place information, lookup definitions, and geographical markup files
When creating a visualization, you can check the Statistics table after running a search to make sure that result fields are generated correctly

2.1.4.chart and timechart

  • chart examples

1. Chart the max(delay) for each value of foo

... | chart max(delay) OVER foo

2. Chart the max(delay) for each value of foo, split by the value of bar

... | chart max(delay) OVER foo BY bar

3. Chart the ratio of the average "size" to the maximum "delay" for each distinct "host" and "user" pair

... | chart eval(avg(size)/max(delay)) AS ratio BY host user

4. Chart the maximum "delay" by "size" and separate "size" into bins

... | chart max(delay) BY size bins=10

5. Chart the average size for each distinct host

... | chart avg(size) BY host

6. Chart the number of events, grouped by date and hour

... | chart count BY date_mday span=3 date_hour span=12

Return the number of events, grouped by day of the month and hour of the day. Each span applies to the field immediately before it, so date_mday is grouped into 3-day bins and date_hour into 12-hour (half-day) bins

7. Align the chart time bins to local time

...| chart _time span=12h aligntime=@d+5h

Align the time bins to 5am (local time). Set the span to 12h. The bins will represent 5am - 5pm, then 5pm - 5am (the next day), and so on

8. In a multivalue BY field, remove duplicate values

...| chart avg(field) BY mvfield dedup_splitval=true

9. Specify <row-split> and <column-split> values with the chart command

source="addtotalsData.csv" | chart sum(sales) BY products quarter

10. Chart the number of different page requests for each web server

sourcetype=access_* | chart count(eval(method="GET")) AS GET,count(eval(method="POST")) AS POST by host

The results can be formatted as a column chart.

11. Chart the number of transactions by duration

sourcetype=access_* status=200 action=purchase | transaction clientip maxspan=10m | chart count BY duration span=log2

12. Chart the average number of events in a transaction, based on transaction duration

sourcetype=access_* status=200 action=purchase | transaction clientip maxspan=30m | chart avg(eventcount) by duration span=log2

13. Chart customer purchases

sourcetype=access_* status=200 action=purchase | chart dc(clientip) OVER date_hour BY categoryId usenull=f

The results can be formatted as a line chart or as a column chart.

14. Chart the number of earthquakes and the magnitude of each earthquake

source=all_month.csv place=*alaska* mag>=3.5 | chart count BY mag place useother=f | rename mag AS Magnitude
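
Returning to example 7 (aligning chart time bins to local time), the 5am - 5pm and 5pm - 5am bins can be worked out by hand. The Python sketch below only models that arithmetic, assuming naive local timestamps and ignoring daylight saving changes; bin_start is a hypothetical helper, not a Splunk function.

```python
from datetime import datetime, timedelta


def bin_start(ts, span=timedelta(hours=12), offset=timedelta(hours=5)):
    """Return the start of the span-wide bin that contains ts.

    Bins are anchored at midnight + offset (5am here), mirroring the idea
    behind span=12h aligntime=@d+5h.
    """
    anchor = datetime(ts.year, ts.month, ts.day) + offset  # 5am on the day of ts
    if ts < anchor:
        anchor -= timedelta(days=1)  # before 5am, the bin chain started yesterday
    whole_spans = (ts - anchor) // span  # number of complete spans since the anchor
    return anchor + whole_spans * span


print(bin_start(datetime(2024, 1, 2, 9, 30)))  # falls in the 5am - 5pm bin
print(bin_start(datetime(2024, 1, 2, 18, 0)))  # falls in the 5pm - 5am bin
print(bin_start(datetime(2024, 1, 2, 3, 0)))   # still in the previous day's 5pm bin
```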

2.2.Using Mapping and Single Value Commands

2.2.1.iplocation

Required arguments

  • ip-address-fieldname
    such as clientip

Optional arguments

  • allfields
    If true, adds the fields City, Continent, Country, lat (latitude), lon (longitude), MetroCode, Region, and Timezone
    If false (default), only the City, Country, lat, lon, and Region fields are added to the events
  • lang
  • prefix
    Adds a prefix to the added field names to avoid name collisions with existing fields
    If you specify prefix=iploc_, the field names that are added to the events become iploc_City, iploc_Country, iploc_lat, and so on
    Default: NULL

Usage
The iplocation command is a distributable streaming command. The Splunk software ships with a copy of the GeoLite2-City.mmdb database file. This file is located in the $SPLUNK_HOME/share/ directory.

examples

  • Add location information to web access events
sourcetype=access_* | iplocation clientip
  • Search for client errors and return the first 20 results
sourcetype=access_* status>=400 | head 20 | iplocation clientip | table clientip, status, City, Country


  • Add a prefix to the fields added by the iplocation command
sourcetype = access_* | iplocation prefix=iploc_ allfields=true clientip | fields iploc_*


2.2.2.geostats

Use the geostats command to generate statistics to display geographic data and summarize the data on maps
Required arguments

  • stats-agg-term

Optional arguments

  • binspanlat
    The size of the bins in latitude degrees at the lowest zoom level.
    Default: 22.5. If the default values for binspanlat and binspanlong are used, a grid size of 8x8 is generated
  • binspanlong
    Default: 45.0. If the default values for binspanlat and binspanlong are used, a grid size of 8x8 is generated
  • by-clause
  • globallimit
    Default: 10
  • locallimit
    Default: 10
  • latfield
    Default: lat
  • longfield
    Default: lon
  • maxzoomlevel
    Default: 9
  • outputlatfield
    Default: latitude
  • outputlongfield
    Default: longitude
  • translatetoxy
    Default: true

Stats function options

Usage
To display the information on a map, you must run a reporting search with the geostats command.

  • Memory and maximum results
    In the limits.conf file, the maxresultrows setting in the [searchresults] stanza specifies the maximum number of results to return. The default value is 50,000. Increasing this limit can result in more memory usage
    The max_mem_usage_mb setting in the [default] stanza is used to limit how much memory the geostats command uses to keep track of information. If the geostats command reaches this limit, the command stops adding the requested fields to the search results. You can increase the limit, contingent on the available system memory
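
The 8x8 grid quoted for the default binspanlat and binspanlong values follows from dividing the latitude range (180 degrees) and the longitude range (360 degrees) by the bin sizes. A small Python check, assuming the grid simply tiles the whole globe at the lowest zoom level; grid_size is a name I made up for illustration.

```python
import math


def grid_size(binspanlat=22.5, binspanlong=45.0):
    """Return (rows, columns) of bins needed to cover the globe."""
    rows = math.ceil(180 / binspanlat)   # latitude runs from -90 to 90
    cols = math.ceil(360 / binspanlong)  # longitude runs from -180 to 180
    return rows, cols


print(grid_size())            # (8, 8) with the default bin sizes
print(grid_size(11.25, 22.5))  # halving both bin sizes doubles each dimension
```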

examples

  • Use the default settings and calculate the count
... | geostats count
  • Specify the latfield and longfield and calculate the average of a field
... | geostats latfield=eventlat longfield=eventlong avg(rating) by gender
  • Count each product sold by a vendor and display the information on a map
sourcetype=vendor_sales | stats count by Code VendorID | lookup prices_lookup Code OUTPUTNEW product_name | table product_name VendorID | lookup vendors_lookup VendorID | geostats latfield=VendorLatitude longfield=VendorLongitude count by product_name


2.2.3.geom

The geom command adds a field, named geom, to each result. This field contains geographic data structures for polygon geometry in JSON. These geographic data structures are used to create choropleth map visualizations
Required arguments
None
Optional arguments

  • featureCollection
    Description: Specifies the geographic lookup file that you want to use. Two geographic lookup files are included by default with Splunk software: geo_us_states and geo_countries
  • allFeatures
    Default: false
  • featureIdField
  • gen
    Default: 0.1
  • min_x
    Default: -180
  • min_y
    Default: -90
  • max_x
    Default: 180
  • max_y
    Default: 90

Usage

  • Specifying no optional arguments
    When no arguments are specified, the geom command looks for a field named featureCollection and a field named featureId in the event
  • Testing lookup files

| inputlookup geo_us_states

  • Testing geometric features

| stats count | eval featureId="California" | eval count=10000 | geom geo_us_states allFeatures=true

Check the results on the Statistics tab, then open the Visualization tab and make sure that the map type is Choropleth Map.

examples

  • Use the default settings
    When no arguments are provided, the geom command looks for a field named featureCollection and a field named featureId in the event
...| geom
  • Use the built-in geospatial lookup
    This example uses the built-in geo_us_states lookup file for the featureCollection
...| geom geo_us_states
  • Specify a field that contains the featureId
...| geom geo_us_states featureIdField="state"
  • Show all geometric features in the output
...| geom geo_us_states allFeatures=true
  • Use the built-in countries lookup
... | lookup geo_countries latitude AS lat, longitude AS long | stats count BY featureIdField AS country | geom geo_countries featureIdField="country"
  • Specify the bounding box for the geometric shape
... | geom geo_us_states featureIdField="state" gen=0.1 min_x=-130.5 min_y=37.6 max_x=-130.1 max_y=37.7

2.2.4.addtotals

The addtotals command computes the arithmetic sum of all numeric fields for each search result. The results appear in the Statistics tab
If col=true, the addtotals command computes the column totals, which adds a new result at the end that represents the sum of each field. labelfield, if specified, is a field that will be added to this summary event with the value set by the 'label' option. Alternately, instead of using the addtotals col=true command, you can use the addcoltotals command to calculate a summary event
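
To make the row and column behaviour concrete, here is a rough Python model of the two kinds of totals. It is only a sketch of the arithmetic described above, not how Splunk implements the command; the function names and the sales figures are invented for illustration.

```python
def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def add_row_totals(rows, fieldname="Total"):
    """Model of row totals: add one field per result with the sum of its numeric fields."""
    return [{**row, fieldname: sum(v for v in row.values() if is_number(v))} for row in rows]


def add_column_totals(rows, labelfield=None, label="Total"):
    """Model of col=true: append one summary result holding the sum of each numeric field."""
    summary = {}
    for row in rows:
        for name, value in row.items():
            if is_number(value):
                summary[name] = summary.get(name, 0) + value
    if labelfield is not None:
        summary[labelfield] = label  # labelfield carries the label in the summary result
    return rows + [summary]


# Hypothetical data shaped like the output of: chart sum(sales) BY products quarter
sales = [
    {"products": "ProductA", "QTR1": 1200, "QTR2": 1350},
    {"products": "ProductB", "QTR1": 1400, "QTR2": 1100},
]
print(add_row_totals(sales))
print(add_column_totals(sales, labelfield="products", label="Quarterly Totals"))
```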
Required arguments
None
Optional arguments

  • field-list
    Default: All numeric fields are included in the sum
  • row
    Default: true
  • col
    Default: false
  • fieldname
    Default: Total
  • labelfield
    Default: none
  • label
    Default: Total

Usage
The addtotals command is a distributable streaming command, except when it is used to calculate column totals. When used to calculate column totals, the addtotals command is a transforming command
Examples

  • Calculate the sum of the numeric fields of each event

source="addtotalsData.csv" | chart sum(sales) BY products quarter

The products field is referred to as the <row-split> field.
The quarter field is referred to as the <column-split> field.
Adding the addtotals command appends a Total field with the sum of the numeric fields in each row:

source="addtotalsData.csv" | chart sum(sales) BY products quarter | addtotals

If you only need the total for each product, the stats command is simpler:

source="addtotalsData.csv" | stats sum(sales) BY products

  • Specify a name for the field that contains the sums for each event
... | addtotals fieldname=sum
  • Use wildcards to specify the names of the fields to sum
... | addtotals fieldname=TotalAmount amount* *size*
  • Calculate the sum for a specific field
source="addtotalsData.csv" | stats sum(quota) by quarter| addtotals row=f col=t labelfield=quarter sum(quota)


  • Calculate the field totals and add custom labels to the totals
source="addtotalsData.csv" | chart sum(sales) by products quarter| addtotals col=t labelfield=products label="Quarterly Totals" fieldname="Product Totals"


2.3.Filtering and Formatting Results

2.3.1.The eval command

  • Types of eval expressions
    An eval expression is a combination of literals, fields, operators, and functions that represent the value of your destination field
    Use an eval expression with a stats function:
index=* | stats count(eval(status="404")) as count_status by sourcetype   

Search all indexes and count the number of events where the status field value is 404. Rename the results to a field called count_status and organize the results by source type

Define a field that is the sum of the areas of two circles:

... | eval sum_of_areas = pi() * pow(radius_a, 2) + pi() * pow(radius_b, 2)

Define a location field using the city and state fields:

... | eval location=city.", ".state

Use eval functions to classify where an email came from

sourcetype="cisco:esa" mailfrom=*| eval accountname=split(mailfrom,"@"), from_domain=mvindex(accountname,-1), location=if(match(from_domain, "[^\n\r\s]+\.(com|net|org)"), "local", "abroad") | stats count BY location

The split() function breaks up the email address in the mailfrom field. The mvindex() function defines from_domain as the portion of the mailfrom field after the @ symbol.
If the from_domain value ends with .com, .net, or .org, the location field is assigned the value local.
If from_domain does not match, location is assigned the value abroad.
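
The same split, index, and match steps can be traced in a few lines of Python. This is only a sketch of the logic for checking my understanding, assuming match() succeeds when the regular expression matches anywhere in the value (mirrored here with re.search); classify is a hypothetical helper and the addresses are made up.

```python
import re

# Same pattern as in the eval expression above
DOMAIN_PATTERN = re.compile(r"[^\n\r\s]+\.(com|net|org)")


def classify(mailfrom):
    accountname = mailfrom.split("@")  # like split(mailfrom, "@")
    from_domain = accountname[-1]      # like mvindex(accountname, -1)
    # like if(match(from_domain, ...), "local", "abroad")
    return "local" if DOMAIN_PATTERN.search(from_domain) else "abroad"


print(classify("alice@example.com"))  # local
print(classify("bob@example.de"))     # abroad
```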

2.3.2.Use search and where commands to filter results

TIP: Inclusion is generally better than exclusion. Searching for "access denied" will yield faster results than NOT "access granted"

2.3.3.The fillnull command

Use fillnull to replace null field values with a string. If you do not specify a field list, fillnull replaces all null values with 0 (the default) or a user-supplied string
The fillnull command is a distributable streaming command when a field-list is specified
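
As a mental model, fillnull can be pictured as the small Python function below, where None stands for a null value. This is a sketch under my own assumptions (results as a list of dictionaries, an optional field list), not Splunk's implementation.

```python
def fillnull(rows, value=0, fields=None):
    """Replace null (None) values with value, optionally only in the listed fields."""
    if fields is None:
        # No field list: consider every field that appears anywhere in the results
        fields = sorted({name for row in rows for name in row})
    filled = []
    for row in rows:
        new_row = dict(row)
        for name in fields:
            if new_row.get(name) is None:
                new_row[name] = value
        filled.append(new_row)
    return filled


# Hypothetical timechart-style results with gaps
results = [
    {"_time": "10:00", "host1": 4, "host2": None},
    {"_time": "11:00", "host1": None, "host2": 7},
]
print(fillnull(results))                                  # every gap becomes 0
print(fillnull(results, value="NULL", fields=["host2"]))  # only host2 is filled
```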
Example 1: Fill all empty fields with NULL

... | fillnull value=NULL

Example 2: Fill all empty field values of "foo" and "bar" with NULL

... | fillnull value=NULL foo bar

Example 3: Fill all empty fields with zero

... | fillnull

Example 4: Build a time series chart of web events by host and fill all empty fields with NULL

sourcetype="web" | timechart count by host | fillnull value=NULL


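Author: linuxwt

My name is Wang Teng and I am from Wuhan. After graduating in 2016 I worked on a helpdesk in Shanghai for a year, then taught myself Linux and returned to Wuhan to work in system operations. I started blogging in 2017 to record what I learn and do at work, and I am currently migrating that material to this blog. I now work at 北京神州新桥科技有限公司. My motto: only by leaving your comfort zone can you truly enjoy your free time.

Wuhan Optics Valley https://linuxwt.com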
