Wednesday, January 20, 2016

Splunk Internal and Admin Index

Splunk's internal indexes such as _internal and _audit (together with the metadata command and, in this environment, a custom mail index) hold quite a lot of information that can assist an admin in troubleshooting or simply in getting more information.

These are administrative indexes, so elevated privileges are required to search them.

Below is a list of searches that have helped me troubleshoot a range of issues.
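
To quickly see which internal indexes exist in a deployment, an eventcount search can be used (a minimal sketch, not from the original list; available indexes vary by environment):
| eventcount summarize=false index=_* | dedup index | table index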

Objective

Splunk Search Samples

To check if all source types were sending logs
| metadata type=sourcetypes index=* | eval diff=now()-recentTime | eval fifteenMinutes=60*15 | eval fourWeeks=60*60*24*28 | where diff>fifteenMinutes| where diff<fourWeeks
| convert ctime(recentTime) | rename recentTime as "Last Seen"
| convert ctime(firstTime)  | rename firstTime as "First Seen"
| convert timeformat="%d(days)-%H(Hours)" ctime(diff) AS Difference
| table sourcetype "Last Seen" "First Seen" Difference | sort Difference
To check if all hosts were sending logs
| metadata type=hosts index=* | eval diff=now()-recentTime | eval oneDay=60*60*24 | eval fourWeeks=60*60*24*28 | where len(host)<50  | where diff>oneDay| where diff<fourWeeks
 | convert ctime(recentTime) | rename recentTime as "Last Seen"
| convert ctime(firstTime)  | rename firstTime as "First Seen"
| convert timeformat="%d(days)-%H(Hours)" ctime(diff) AS Difference
|table customer host "Last Seen" "First Seen" Difference | sort Difference
List of Hosts per Index
| metadata index=* type=hosts | search host=* | eval delta = now() - lastTime | convert ctime(lastTime) | convert ctime(firstTime) | fields host, delta, lastTime, firstTime
License Usage
 index=_internal source=*license_usage.log type=usage | eval h=coalesce(h,"-") | rename st as sourcetype h as host | typer | stats sum(b) as bytes count values(eval(if(customer="-",host,null()))) as hosts by customer sourcetype

index=_internal source=*license_usage.log type=usage  | rename st as sourcetype h as host  | stats sum(b) as bytes  count  by  index
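
To see usage trends over time, a per-day variant (sketched here, not from the original list) can be helpful:
index=_internal source=*license_usage.log type=Usage | timechart span=1d sum(b) as bytes | eval GB=round(bytes/1024/1024/1024,2) | fields _time GB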

To check the number of incidents raised and their descriptions
index=mail sourcetype=imap  earliest=-15d@d Subject="*incident Creation*" OR Subject="*Resolved*"  | stats count by  incident_no Subject _time | sort -incident_no

index=mail sourcetype=imap  earliest=-30d@d Subject="*incident Creation*" OR Subject="*Resolved*"  | table  Refno Subject Opened_at Resolved_at Incident_Description Incident_Record_No_

index=mail sourcetype=imap   Subject="*incident Creation*"   | table Incident_Record_No_ Subject Description

index=mail sourcetype=imap   Subject="*Resolved*"  | table  Refno Subject Opened_at Resolved_at Incident_Description Incident_Record_No_
Saved searches run time
index=_internal sourcetype=searches OR sourcetype=scheduler | convert timeformat="%H:%M:%S" ctime(search_run_time) AS search_run_time | stats values(search_run_time) as run_time by user

index=_internal sourcetype=scheduler | stats sum(run_time) as total_run_time avg(run_time) as avg_run_time max(run_time) as max_run_time by user, savedsearch_name

index=_internal  sourcetype=scheduler | stats min(run_time) as "Min runtime (seconds)", median(run_time) as median_runtime, max(run_time) as max_runtime, count(eval(status!="continued")) AS total_exec, count(eval(status=="success")) as "Successful executions", count(eval(status=="skipped")) AS "Skipped executions" by app, savedsearch_name, user
Adhoc searches Reporting
index=_audit action=search (id=* OR search_id=*) | eval search_id = if(isnull(search_id), id, search_id) | replace '*' with * in search_id |  rex "search='(?<search>.*?)', autojoin" | search search_id!=scheduler_* | convert num(total_run_time) | eval user = if(user="n/a", null(), user) | stats min(_time) as _time first(user) as user max(total_run_time) as total_run_time first(search) as search by search_id | search search=search* search!=*_internal* search!=*_audit* | chart median(total_run_time) as "Median search time" perc95(total_run_time) as "95th Percentile search time" sum(total_run_time) as "Total search time" count as "Search count" max(_time) as "Last use" by user | fieldformat "Last use" = strftime('Last use', "%F %T.%Q %:z")

index=_audit action=search (id=* OR search_id=*) | eval search_id = if(isnull(search_id), id, search_id) | replace '*' with * in search_id |  rex "search='(?<search>.*?)', autojoin" | search search_id!=scheduler_* | convert num(total_run_time) | eval user = if(user="n/a", null(), user) | stats min(_time) as _time first(user) as user max(total_run_time) as total_run_time first(search) as search by search_id | search search=search* search!=*_internal* search!=*_audit* | timechart count as "Search count" by user

Rules not alerted
index=_internal sourcetype="aggregate:search_command"  _raw="*Received 0 results for processing*" | stats count
index=_internal sourcetype="aggregate:search_command"  _raw="*Received 0 results for processing*" | dedup rule_name |stats count
Rules Alerted
index=_internal sourcetype="aggregate:search_command"  _raw!="*Received 0 results for processing*" | stats count
index=_internal sourcetype="aggregate:search_command"  _raw!="*Received 0 results for processing*" | dedup rule_name |stats count

Rare Punct
index=* | stats count(punct) as count first(punct) as punctuation first(_raw) as sample by sourcetype | sort count | head 10 | table count punctuation sample

Top Punct
index=* | stats count first(punct) as punctuation first(_raw) as sample first(sourcetype) as sourcetype by punct | sort - count | head 10 | table count punct sample sourcetype

Metadata search samples
| metasearch index=* | eval Date=strftime(_time,"%Y-%m-%d") | chart count over host by Date

| metadata type=hosts | eval FirstSeen=firstTime | eval RecentSeen=recentTime | eval seconds_since=now()-recentTime | convert ctime(RecentSeen) | convert ctime(FirstSeen) | eval hours=floor(seconds_since/60/60) | eval minutes=floor((seconds_since-hours*60*60)/60) | eval seconds=seconds_since-hours*60*60-minutes*60 | fields host, source, sourcetype, FirstSeen, RecentSeen, hours, minutes, seconds

| metadata type=hosts index=*  | convert ctime(recentTime) AS "Last Indexed"  | convert ctime(firstTime) AS "First Indexed" | search NOT (host="*:*:*")

| metadata type=hosts index=*  | rename totalCount as Count firstTime as "First Event" lastTime as "Last Event" recentTime as "Last Update" | fieldformat Count=tostring(Count, "commas") | fieldformat "First Event"=strftime('First Event', "%c") | fieldformat "Last Event"=strftime('Last Event', "%c") | fieldformat "Last Update"=strftime('Last Update', "%c") | search NOT (host="*:*:*")
  
| tstats latest(_time) as latest where index=* by host source sourcetype index | where latest < relative_time(now(), "-5m") | convert ctime(latest) AS "Latest"
  
| metadata type=hosts |where recentTime < now() - 86400 | eval lastSeen = strftime(recentTime, "%F %T") | fields + host lastSeen

 | metadata index=* type=hosts | eval age = now()-lastTime | where age > (2*86400) | sort age d | convert ctime(lastTime) | fields age,host, lastTime

Hosts that have not sent logs for more than 5 days
| metadata index=* type=hosts | search NOT (host=*:*) | eval age = now()-lastTime | where age > (5*86400) | sort age | convert ctime(lastTime) | eval age=tostring(age,"duration") | fields host, age, lastTime

 | metadata index=* type=hosts | search NOT (host=*:*) | eval age = now()-lastTime | where age > (5*86400) AND age < (30*86400) | sort age | convert ctime(lastTime)
Count of events
| eventcount summarize=false index=* | search NOT count=0
| eventcount summarize=false index=* | search NOT count=0 | eventstats sum(count) as Total | eval percent=round(count*100/Total,1)."%"
| eventcount summarize=false index=* | search NOT count=0 | fieldformat count=tostring(count,"commas") | eventstats sum(count) as Total | eval percent=round(count*100/Total,1)."%" | sort -count | table index count percent

| eventcount summarize=false index=* | search NOT count=0 | eval date=strftime(_time,"%Y-%m-%d") | stats sum(count) As count avg(count) as avg by index  | fieldformat count=tostring(count,"commas") | fieldformat avg=tostring(avg,"commas")

| eventcount summarize=false index=* |stats count by index |table index,count
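
Where available, tstats offers a faster way to get a per-index event count (a sketch, equivalent in spirit to the eventcount searches above):
| tstats count where index=* by index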

| dbinspect index="*" | table index state eventCount sizeOnDiskMB   | sort + endEpoch | eval age = now()-endEpoch | eval humanAge = tostring(age, "duration") | streamstats sum(sizeOnDiskMB) as recoverableDiskMB

| dbinspect index="*" | eval rawSizeMB = rawSize / 1048576 | table index  eventCount rawSizeMB sizeOnDiskMB

Index time vs actual time
index=_internal earliest=-3h@h latest=-2h@h
   | rename _indextime as IndexTime
   | eval diff=IndexTime-_time
   | convert ctime(IndexTime) as IndexTime
   | eval diff=if(diff < 0, 0, diff)
   |  table _time IndexTime diff

source="udp:514"
    | eval IndexTime=_indextime
    | eval TimeStamp=_time
    | eval delta=_indextime-_time
    | eval Raw=_raw
    | where delta > 3600
    | convert ctime(IndexTime)
    | convert ctime(TimeStamp)
    | table  TimeStamp IndexTime delta Raw

Events that happened within 120s of a Splunk restart
index=_internal  [ search index=_internal sourcetype="splunkd" "(build"
| eval earliest=_time-120 | eval latest=_time+120 | fields host earliest latest | FORMAT "(" "(" "" ")" "OR" ")" ]
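
The subsearch locates splunkd startup events (the "(build" banner line logged on restart), and the format command turns each into a host plus time-window clause, so it expands to a filter roughly of the form (hypothetical host names and epoch values):
( ( host="indexer01" earliest="1453240000" latest="1453240240" ) OR ( host="indexer02" earliest="1453250000" latest="1453250240" ) )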

List of searches ran by the users
index=_audit action=search info=granted search=* NOT "search_id='scheduler" NOT "search='|history" NOT ("user=splunk-system-user" OR user=admin) NOT "search='typeahead" NOT "search='| metadata type=* | search totalCount>0" | stats count by user search _time | sort _time | convert ctime(_time) | stats list(search) as search by user
Storage Calculator
earliest=-7d@m latest=now index=_internal source=*metrics.log* group=per_index_thruput series!=_* | eval MB=round(kb/1024,2) | bin _time span=1d | stats sum(MB) as TotalMB by series _time | stats avg(TotalMB) as AvgMBperDay by series | rename series as index | join index [| dbinspect index=* | search index!=_* | fields state,id,rawSize,sizeOnDiskMB,index | stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB by index | eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal | eval compressionrate = round(diskTotalinMB / rawTotalinMB, 2) | table index compressionrate] | join index [| rest /services/admin/indexes | search eai:acl.app!=system currentDBSizeMB>1 | table title maxTotalDataSizeMB frozenTimePeriodInSecs | eval frozenTimePeriodInDays=frozenTimePeriodInSecs/86400 | rename title as index | table index maxTotalDataSizeMB frozenTimePeriodInDays] | eval StorageRequiredMB=AvgMBperDay*frozenTimePeriodInDays*compressionrate | eval ActualStorageRequiredMB=case(StorageRequiredMB>maxTotalDataSizeMB, maxTotalDataSizeMB, StorageRequiredMB<maxTotalDataSizeMB, StorageRequiredMB) | eval ActualStorageRequiredGB=round(ActualStorageRequiredMB/1024,2) | eval StorageRequiredGB=round(StorageRequiredMB/1024,2) | eval maxTotalDataSizeGB=round(maxTotalDataSizeMB/1024,2) | eval AvgMBperDay=round(AvgMBperDay,2) | rename index as Index, AvgMBperDay as "Average Volume/Day (MB)", ActualStorageRequiredGB as "Actual Disk Space Required (GB)", StorageRequiredGB as "Disk Space Required (GB) (per Retention Time w/Compression)", compressionrate as "Compression Rate", frozenTimePeriodInDays as "Retention Time (Days)", maxTotalDataSizeGB as "Maximum Configured Index Size (GB)" | table Index, "Average Volume/Day (MB)", "Retention Time (Days)", "Compression Rate", "Disk Space Required (GB) (per Retention Time w/Compression)", "Maximum Configured Index Size (GB)", "Actual Disk Space Required (GB)" | addcoltotals "Actual Disk Space Required (GB)" labelfield="Maximum Configured Index Size (GB)" label="Total Disk Required:"

Utilization by user
index=_audit action=search (id=* OR search_id=*) host=$host$ | eval search_id = if(isnull(search_id), id, search_id) | replace '*' with * in search_id |  rex "search='(?<search>.*?)', autojoin" | search search_id!=scheduler_* | convert num(total_run_time) | eval user = if(user="n/a", null(), user) | stats min(_time) as _time first(user) as user max(total_run_time) as total_run_time first(search) as search by search_id | search search=search* search!=*_internal* search!=*_audit* | eval total_run_time_min=round(total_run_time/60,2) | stats sum(total_run_time_min) as "Total Run Time Minutes", count(search_id) as "Total Searches" by user | search user!="splunk-system-user" | rename user as User | sort -"Total Searches" | addcoltotals labelfield=User label="Total:"
Splunk forwarder Details
index=_internal sourcetype=splunkd group=tcpin_connections NOT eventType=*  fwdType=full | eval forwarder = guid | eval receiver = host.":".destPort  | stats  dc(forwarder) as count_forwarder, dc(receiver) as count_receiver

Current Splunk Local Time
index=* | stats count AS tnow | eval tnow = now() | convert ctime(tnow) | head 1

index=* | eval tnow = now() | fields tnow | fields - _* | convert ctime(tnow) AS time_now | table time_now | head 1

foo NOT foo | stats count | eval clock = strftime(time(), "%d-%m-%Y - %H:%M:%S") | table clock  (with single value visualisation)

foo NOT foo | stats count as Now | eval clock = strftime(time(), "%d-%m-%Y - %H:%M:%S") | eval Now = strftime(now(), "%+") | table clock Now
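
On Splunk 6.3 and later, the makeresults command gives a cleaner way to show the current time (a sketch along the same lines):
| makeresults | eval clock = strftime(now(), "%d-%m-%Y %H:%M:%S") | table clock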
Splunk Account Modification
index=_audit action=edit_user (operation=create OR operation=remove) | table _time, operation, user, object, info | rename info AS status, user AS src_user, object AS dest_user

index=_audit action=edit_roles (operation=create OR operation=remove) | table _time, operation, user, object, info | rename info AS status, user AS src_user, object AS role

Search Audit
index=_audit action=search search_id=* | replace '*' with * in search_id | rex "search='(?<search>.*?)', autojoin" | rex field=search "(?<search>.+?)(\|\s+summaryindex.*|$)" | eval user=if(user=="n/a" OR user=="-",null(),user) | transaction maxevents=2 maxspan=1d startswith=(info=granted) endswith=(info!=granted) keepevicted=true search_id | eval search_type=mvindex(search_type,0) | eval search=if(isnull(search),"No search string available. Search ID=".search_id,mvindex(search,0)) | eval user=if(isnull(user),"unknown",mvindex(user,0)) | eval info_count=mvcount(info) | eval search_type=case((id LIKE "DM_%" OR savedsearch_name LIKE "_ACCELERATE_DM%"), "dm_acceleration", search_id LIKE "scheduler%", "scheduled", search_id LIKE "rt%", "realtime", search_id LIKE "subsearch%", "subsearch", (search_id LIKE "SummaryDirector%" OR search_id LIKE "summarize_SummaryDirector%"), "summary_director", 1=1, "adhoc") | timechart count by search_type

Searches Running More than 15 Minutes
index=_audit  action=search (id=* OR search_id=*)   | eval search_id = if(isnull(search_id), id, search_id) | replace '*' with * in search_id |  rex "search='(?<search>.*?)', autojoin" | search search_id!=scheduler_* | convert num(total_run_time) | rex "\,\s+user=(?<user>\S+)," | eval user = if(user="n/a", null(), user) | stats min(_time) as _time first(user) as user max(total_run_time) as total_run_time first(search) as search by search_id host  | search search=search* search!=*_internal* search!=*_audit* | stats median(total_run_time) as "Median search time" perc95(total_run_time) as "95th Percentile search time" sum(total_run_time) as "Total search time" count as "Search count" max(_time) as "Last use" by user   | fieldformat "Last use" = strftime('Last use', "%F %T.%Q %:z") | rename host AS "Search Head" | sort - "Median search time"
Currently logged in Users
index="_audit" [search index=_internal source="*web_access.log" user!="-" | stats count by user | fields user] | search action="search" OR action="rtsearch" | stats values(action) as Action, values(info) as Info, max(timestamp) as lastTime, min(timestamp) as firstTime by user | table user

| rest /services/authentication/current-context | search NOT username="splunk-system-user" | table username roles updated

Individual User Searches Monitoring
index=_audit host=* action=search user!=splunk-system-user search_id=* (info=granted OR info=completed) user=$user$ | rex field=apiStartTime "'(?<start_time>[^']+)'" | rex field=apiEndTime "'(?<end_time>[^']+)'" | eval search_id = trim(if(isnull(search_id), id, search_id), "'") | eval run_time_min=round(total_run_time/60,2) | eval range=if(start_time=="ZERO_TIME","All Time", tostring(strptime(end_time, "%a %b %d %H:%M:%S %Y") - strptime(start_time, "%a %b %d %H:%M:%S %Y"),"duration")) | stats earliest(_time) AS "Start Time" latest(_time) AS "End Time" values(start_time) AS "Search Earliest" values(end_time) AS "Search Latest" count values(range) AS range values(search) AS Search values(user) AS User max(run_time_min) AS "RunTime(Min)" by search_id | convert ctime(*Time) | where count>1 | rename search_id AS SID range AS "Search Range" | table "Start Time" "End Time" SID Search User "RunTime(Min)" "Search Range" "Search Earliest" "Search Latest" | sort - "RunTime(Min)"

Individual users and their dashboard access
index=_internal host=* source=*web_access.log* /app/  | rex "GET /[^/]+/app/(?<app_view>[^/ ?]+/[^/ ?]+) " | search app_view="*/home*" | stats values(user) count by app_view



Refer: Rest API calls in Splunk (Frequently used)