Useful links:
Most useful:
A good blog:
Other 1:
Other 2:
The links above are somewhat dated. Newer link:
1. Query everything in an index
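# coding=utf8
from elasticsearch import Elasticsearch

# Connect to the cluster (replace x.x.x.x with the real host)
es = Elasticsearch([{'host': 'x.x.x.x', 'port': 9200}])
index = "test"
# match_all matches every document; search() returns only the first 10 hits by default
query = {"query": {"match_all": {}}}
# Pass index by keyword: the positional order of search() differs between client versions
resp = es.search(index=index, body=query)
resp_docs = resp["hits"]["hits"]
total = resp['hits']['total']
print(total)  # total number of matching documents
print(resp_docs[0]['_source']['@timestamp'])  # print one field of the first hit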
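One thing to watch when reading `resp['hits']['total']`: older Elasticsearch versions return a plain integer there, while Elasticsearch 7 and later return an object with `value` and `relation` keys. A minimal sketch of a helper that accepts both shapes follows; the name `total_hits` and the sample responses are hypothetical, written only for illustration.

```python
def total_hits(resp):
    """Return the hit count as an int for either response shape."""
    total = resp["hits"]["total"]
    if isinstance(total, dict):
        # Elasticsearch 7+ shape: {"value": 1234, "relation": "eq"}
        return total["value"]
    # Older shape: a plain integer
    return total


# Hand-written sample responses (not real cluster output)
old_style = {"hits": {"total": 3, "hits": []}}
new_style = {"hits": {"total": {"value": 3, "relation": "eq"}, "hits": []}}
print(total_hits(old_style), total_hits(new_style))
```

With this in place, `total = total_hits(resp)` can stand in for the direct dictionary lookup.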
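2. Fetch everything in batches with scroll, using a more complex condition
Filter: field A is not empty, field B is not empty, and the timestamp falls between 10 days ago and 2 days ago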
# coding=utf8
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'x.x.x.x', 'port': 9200}])
index = "test"
# Note: the "filtered" query only exists in old Elasticsearch releases (removed in 5.0)
query = {
    "query": {
        "filtered": {
            "query": {
                "bool": {
                    # A single must_not list: repeating the "must_not" key in a dict
                    # would silently keep only the last condition
                    "must_not": [
                        {"term": {"A": ""}},
                        {"term": {"B": ""}},
                    ]
                }
            },
            "filter": {
                "range": {"@timestamp": {"gte": "now-10d", "lt": "now-2d"}}
            }
        }
    }
}
# The first request opens a scroll context kept alive for 1 minute, 100 hits per page
resp = es.search(index=index, body=query, scroll="1m", size=100)
resp_docs = resp["hits"]["hits"]
total = resp['hits']['total']
count = len(resp_docs)
datas = list(resp_docs)  # copy, so datas is not an alias of resp_docs
while len(resp_docs) > 0:
    # Each response carries the scroll id to use for the next page
    scroll_id = resp['_scroll_id']
    resp = es.scroll(scroll_id=scroll_id, scroll="1m")
    resp_docs = resp["hits"]["hits"]
    datas.extend(resp_docs)
    count += len(resp_docs)
    if count >= total:
        break
print(len(datas))
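On Elasticsearch 5.0 and later the `filtered` query is gone, and the same condition is normally written as a single `bool` query with `must_not` and `filter` clauses. The sketch below shows that form together with a generator that walks the scroll pages. The names `build_query` and `iter_scroll` are hypothetical, and `client` is assumed to be any object whose `search` and `scroll` methods accept the same keyword arguments used in the original snippet.

```python
def build_query():
    """bool-query form of: A != "" and B != "" and now-10d <= @timestamp < now-2d."""
    return {
        "query": {
            "bool": {
                "must_not": [
                    {"term": {"A": ""}},
                    {"term": {"B": ""}},
                ],
                "filter": [
                    {"range": {"@timestamp": {"gte": "now-10d", "lt": "now-2d"}}}
                ],
            }
        }
    }


def iter_scroll(client, index, body, scroll="1m", size=100):
    """Yield every hit, fetching one scroll page at a time."""
    resp = client.search(index=index, body=body, scroll=scroll, size=size)
    while resp["hits"]["hits"]:
        for doc in resp["hits"]["hits"]:
            yield doc
        # Each response carries the scroll id to use for the next page
        resp = client.scroll(scroll_id=resp["_scroll_id"], scroll=scroll)
```

Usage would look like `datas = list(iter_scroll(es, "test", build_query()))`. The loop stops on the first empty page, so it does not depend on the shape of `hits.total`. The client library also ships `elasticsearch.helpers.scan`, which wraps a similar loop.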
3. Aggregation
See how many distinct values the @timestamp field has
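# coding=utf8
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'x.x.x.x', 'port': 9200}])
index = "test"
# terms aggregation: one bucket per distinct @timestamp value (top 10 buckets by default)
query = {"aggs": {"all_times": {"terms": {"field": "@timestamp"}}}}
resp = es.search(index=index, body=query)
total = resp['hits']['total']
print(total)
print(resp["aggregations"])  # buckets are under resp["aggregations"]["all_times"]["buckets"]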
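Since a `terms` aggregation returns only the top buckets, counting distinct values directly is usually done with a `cardinality` aggregation instead, which returns an approximate count. The sketch below builds such a request body and also shows one way to flatten `terms` buckets into a dictionary. The function names, the aggregation name `distinct_count`, and the sample response are all hypothetical and hand-written for illustration.

```python
def cardinality_query(field):
    """Body asking for an approximate count of distinct values of `field`."""
    return {
        "size": 0,  # skip the hits, only the aggregation result is needed
        "aggs": {"distinct_count": {"cardinality": {"field": field}}},
    }


def bucket_counts(resp, agg_name):
    """Map each terms bucket key to its document count."""
    buckets = resp["aggregations"][agg_name]["buckets"]
    return {b["key"]: b["doc_count"] for b in buckets}


# Hand-written sample in the shape of a terms aggregation response
sample = {
    "aggregations": {
        "all_times": {
            "buckets": [
                {"key": 1500000000000, "doc_count": 4},
                {"key": 1500000060000, "doc_count": 1},
            ]
        }
    }
}
print(bucket_counts(sample, "all_times"))
```

After `resp = es.search(index=index, body=cardinality_query("@timestamp"))`, the count would be read from `resp["aggregations"]["distinct_count"]["value"]`.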