ELK 6 Installation Notes
Logstash is a data-processing tool whose main job here is parsing log files. The stack as a whole maps loosely onto an MVC model: Logstash is the controller layer, Elasticsearch the model layer, and Kibana the view layer.
Data is first fed to Logstash, which filters and formats it (converting it to JSON) and then passes it to Elasticsearch for storage and search indexing; Kibana provides the front-end pages for searching and chart visualization, calling the Elasticsearch API and rendering the returned data. Logstash and Elasticsearch run on the JVM, while Kibana is built on Node.js.
1. Install the Java environment
apt install openjdk-8-jre-headless
2. Install Elasticsearch
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.2.deb
dpkg -i elasticsearch-6.3.2.deb
2.1. Install the elasticsearch-head plugin
git clone https://github.com/mobz/elasticsearch-head.git
cd elasticsearch-head
npm install
# If the install fails with "npm ERR! Failed at the phantomjs-prebuilt@2.1.16 install script", install it with scripts disabled:
npm install phantomjs-prebuilt@2.1.16 --ignore-scripts
npm run start
After installing the head plugin, enable cross-origin access in Elasticsearch by adding the following to elasticsearch.yml:
#allow origin
http.cors.enabled: true
http.cors.allow-origin: "*"
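After restarting Elasticsearch, you can check that the CORS header is actually returned; a minimal probe, assuming the default ports used throughout this guide:
curl -s -i -H "Origin: http://localhost:9100" http://localhost:9200/ | grep -i access-control
# expect a line like: access-control-allow-origin: *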
3. Install Logstash
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.3.2.deb
dpkg -i logstash-6.3.2.deb
Add a symlink to the config directory to avoid a startup warning:
ln -s /etc/logstash /usr/share/logstash/config
3.1. Add a Logstash configuration file
cat /etc/logstash/conf.d/logstash.conf
input {
  file {
    path => "/app/log/*.log"
    start_position => "beginning"
    sincedb_path => "/tmp/logstash.db"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    locale => "en"
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }
  geoip {
    source => "clientip"
  }
  # drop events from before this date
  if [@timestamp] < "2018-08-17" {
    drop {}
  }
}
output {
  elasticsearch {
    index => "logstash-16988.com"
    hosts => "localhost:9200"
  }
}
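Before starting Logstash, the pipeline can be syntax-checked with the --config.test_and_exit (-t) flag:
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf --config.test_and_exit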
Converting the parsed date into a Unix timestamp with Logstash:
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    locale => "en"
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
    target => "pubdate"
  }
  ruby {
    # overwrite pubdate with its value as epoch seconds
    code => 'event.set("pubdate", event.get("pubdate").to_i)'
  }
}
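For illustration, an event passes through these filters roughly as follows (values hypothetical):
# after grok: "timestamp" => "17/Aug/2018:12:00:00 +0800"
# after date: "pubdate"   => 2018-08-17T04:00:00.000Z
# after ruby: "pubdate"   => 1534478400   (epoch seconds)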
Filtering on the date inside the log file in logstash.conf, e.g. importing only the last 10 days of data:
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    locale => "en"
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }
  geoip {
    source => "clientip"
  }
  ruby {
    # 864000 seconds = 10 days; cancel (drop) anything older
    code => 'if event.get("[@timestamp]").to_i < (Time.now - 864000).to_i
               event.cancel
             end'
  }
}
Parsing Sina SAE access logs with Logstash:
filter {
  grok {
    match => { "message" => "%{HOSTNAME:domain} %{IPV4:clientip} %{INT:runtime} %{INT:cputime} \[%{HTTPDATE:timestamp}\] %{WORD:app} %{WORD:app_hash} %{INT:app_version} \"%{WORD:request_method} %{DATA:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:status} %{DATA:bytes} \"%{GREEDYDATA:referrer}\" \"%{GREEDYDATA:agent}\" *%{WORD:app_host}" }
  }
  date {
    locale => "en"
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
    timezone => "Asia/Shanghai"
  }
  geoip {
    source => "clientip"
  }
}
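A hypothetical line this pattern would match, with every field value invented for illustration:
example.sinaapp.com 121.14.1.10 35 4 [17/Aug/2018:12:00:00 +0800] myapp abc123 1 "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0" web47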
Script to automatically extract the compressed SAE logs:
#!/bin/bash
for file in *.tar.gz
do
    # If no archive matches, the glob stays as the literal string "*.tar.gz"; skip that case
    if [[ "$file" != "*.tar.gz" ]]; then
        tar xf "$file"
        DATA=$(echo "$file" | awk -F "_" '{print $1}')
        NAME_1=$(echo "$file" | awk -F "_" '{print $3}')
        mv "${NAME_1}_log" "${DATA}_${NAME_1}_log"
        echo "${DATA}_${NAME_1}_log"
        rm "$file"
    fi
done
Just drop this script into the same directory as the compressed logs and run it.
Sina Cloud log documentation: http://www.sinacloud.com/doc/sae/tutorial/app-management.html#ri-zhi-zhong-xin
Grok pattern debugging tools:
https://grokconstructor.appspot.com/do/match (recommended)
http://grok.qiexun.net/
http://grokdebug.herokuapp.com
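Patterns can also be tested locally with a throwaway stdin pipeline; a minimal sketch (paste log lines into the terminal and inspect the parsed fields):
input { stdin {} }
filter {
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
}
output { stdout { codec => rubydebug } }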
Corresponding grok patterns for nginx
| nginx log variable | sample value | grok pattern |
| --- | --- | --- |
| $time_local | 08/Jan/2016:08:27:43 +0800 | %{HTTPDATE:timestamp} |
| $upstream_addr | 10.10.6.212:8088 | %{HOSTPORT:upstream} |
| $server_addr:$server_port | 10.10.6.110:80 | %{HOSTPORT:request_server} |
| $request_method | GET | %{WORD:request_method} |
| $uri | /vvv/test/stat/index | %{URIPATH:uri} |
| $args | proptype=11&level=2&srtype=2&city=dz&region=XJ&begindate=2016-01-08&enddate=2016-01-08&apiKey=c2c959b203d669a9a21861379cb4523c&test=2 | %{URIPARAM:args} |
| $remote_addr | 10.10.6.10 | %{IPV4:clientip} |
| $server_protocol | HTTP/1.1 | HTTP/%{NUMBER:httpversion} |
| [$http_user_agent] | [Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Safari/537.36] | \[%{GREEDYDATA:agent}\] |
| [$http_cookie] | [JSESSIONID=kq3v6xi2b74j1a9yxvfvcq131] | \[%{GREEDYDATA:cookie}\] |
| $http_referer | http://10.10.6.110/test | %{GREEDYDATA:referrer} |
| $host | www.test.com | %{HOSTNAME:domain} |
| $status | 200 | %{NUMBER:status:int} |
| $bytes_sent | 485 | %{NUMBER:body_sent:int} |
| $request_length | 209 | %{NUMBER:request_length:int} |
| "$upstream_cache_status" | "-" | "-" |
| $request_time | 1.642 | %{NUMBER:request_time:float} |
| $upstream_response_time | 1.642 | %{NUMBER:response_time:float} |
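As an example of combining the rows above into one pattern, here is a sketch for a hypothetical log_format; adapt it to whatever your nginx actually logs:
# log_format elk '$host $remote_addr [$time_local] "$request" $status $bytes_sent $request_time';
%{HOSTNAME:domain} %{IPV4:clientip} \[%{HTTPDATE:timestamp}\] "%{WORD:request_method} %{URIPATH:uri}(?:%{URIPARAM:args})? HTTP/%{NUMBER:httpversion}" %{NUMBER:status:int} %{NUMBER:body_sent:int} %{NUMBER:request_time:float}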
3.2. Run Logstash
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/
3.3. Logstash sincedb read-progress records
/usr/share/logstash/data/plugins/inputs/file/
If the config symlink above was created, Logstash reads path.data from logstash.yml and the records end up under:
/var/lib/logstash/plugins/inputs/file/
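To re-import a file from scratch, stop Logstash and delete the sincedb record before restarting; with the sincedb_path used in the config above that would be:
cat /tmp/logstash.db  # inode, device numbers and byte offset per tracked file
rm /tmp/logstash.db   # with start_position => "beginning" the file is read again from the top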
3.4. Logstash performance tuning: adjust these parameters in logstash.yml
pipeline.workers: 1
pipeline.batch.size: 125
pipeline.batch.delay: 250
Command-line flag: -w / --pipeline.workers (called --filterworkers in old releases) sets the number of worker threads. Logstash runs multiple threads; bin/logstash -w 5 forces Logstash to run 5 worker threads for its filter plugins.
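Batch size and delay have shorthand flags as well (-b for --pipeline.batch.size, -u for --pipeline.batch.delay); for example, with illustrative values:
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/ -w 4 -b 250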
4. Install Kibana
wget https://artifacts.elastic.co/downloads/kibana/kibana-6.3.2-amd64.deb
dpkg -i kibana-6.3.2-amd64.deb
4.1. Change the address Kibana listens on
vim /etc/kibana/kibana.yml
server.host: "0.0.0.0"
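Restart Kibana afterwards and open port 5601 in a browser (hostname illustrative):
systemctl restart kibana
# http://your-server:5601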
Miscellaneous
Accessing an Elasticsearch protected by basic auth from elasticsearch-head:
http://localhost:9100/?base_uri=http://localhost:9200&auth_user=elastic&auth_password=password
Default ports
Elasticsearch cluster transport: 9300
Elasticsearch HTTP: 9200
elasticsearch-head: 9100
Kibana: 5601
Elasticsearch operations
Get the total document count:
GET /logstash-*/_count?pretty
Fetch a single document by index / type / ID:
GET /{index}/{type}/{id}
Index a document:
PUT /{index}/{type}/{id}
{
"field": "value",
...
}
Search:
POST /{index}/_search
{"query":{"range":{"@timestamp":{"lt":"2018-08-17"}}}}
Delete by query:
POST /{index}/_delete_by_query
{"query":{"range":{"@timestamp":{"lt":"2018-08-17"}}}}
Delete an index:
DELETE /{index}
Set the replica count (without an index name this applies to all indices):
PUT /_settings
{
"number_of_replicas": 0
}
Create a small index with just one primary shard and no replicas:
PUT /my_temp_index
{
"settings": {
"number_of_shards" : 1,
"number_of_replicas" : 0
}
}
number_of_shards
The number of primary shards for an index; the default is 5. It cannot be changed after the index has been created.
number_of_replicas
The number of replicas per primary shard; the default is 1. On a live index it can be changed at any time.
ES configuration / status
Basic index metrics (shard count, store size, memory usage) and information about the nodes that make up the cluster (number, roles, OS, JVM version, memory usage, CPU, and installed plugins):
GET /_cluster/stats
Node statistics:
GET /_nodes/stats
Configuration tuning
Set the JVM heap to roughly half of physical memory (and no more than ~32 GB):
cat /etc/elasticsearch/jvm.options
-Xms2g
-Xmx2g
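After restarting, confirm the heap size actually in effect:
GET /_cat/nodes?v&h=name,heap.percent,heap.current,heap.max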
Using Filebeat as the log collector
Installation
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.4.2-amd64.deb
dpkg -i filebeat-6.4.2-amd64.deb
filebeat modules list
filebeat modules enable nginx
Install the Elasticsearch plugins the nginx module depends on:
sudo bin/elasticsearch-plugin install ingest-geoip
sudo bin/elasticsearch-plugin install ingest-user-agent
Restart Elasticsearch:
/etc/init.d/elasticsearch restart
Edit the module configuration file:
vim /etc/filebeat/modules.d/nginx.yml
- module: nginx
  access:
    enabled: true
    var.paths: ["/app/log/nginx_ppt/access.log*"]
  error:
    enabled: true
Run Filebeat:
filebeat --modules nginx,mysql,system
Or simply start the service:
service filebeat start
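Either way, Filebeat can verify its configuration and its connection to Elasticsearch first:
filebeat test config
filebeat test output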
Default path of Filebeat's registry (read-progress) file:
/var/lib/filebeat/registry