
Using Elasticsearch and Kibana 5.6.9: how can I export raw data to CSV?

Exporting Raw Elasticsearch Data to CSV (5.6.9 + Kibana)

Hey there! I totally get the frustration when you can export aggregated visualizations but not the raw data you need. Let's walk through a few reliable methods tailored to your Elasticsearch 5.6.9 and Kibana setup:

Method 1: Use Kibana Discover (Simplest GUI Approach)

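Kibana's Discover tab is the view built for working with raw index data, so check it first. One caveat: as far as I know, CSV export from Discover is part of X-Pack Reporting and only arrived in Kibana 6.0, so a stock 5.6.9 install will likely not show it. If the option is missing, skip to Method 2 or 3. If your Kibana does offer it: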

  • Open Kibana and navigate to the Discover page (left sidebar)
  • Select your target index from the dropdown at the top left
  • Adjust the time range or add a query to filter exactly the raw data you want to export
  • Look for the Export button at the top right (it looks like a download icon)
  • Choose the CSV (raw data) option from the menu, then download the file

This exports all raw fields from your matching documents, not just the aggregated values from visualizations.

Method 2: Command Line with curl + jq (For Customization)

If you prefer terminal workflows, combining Elasticsearch's API with curl and jq (a lightweight JSON processor) gives you full control over which fields to export:

Step 1: Fetch raw data with curl

Retrieve your data using the _search API. Set size to the number of documents you need. In 5.6.9, index.max_result_window defaults to 10,000, so a single request cannot return more than that; for larger datasets use the scroll API below:

curl -X GET "http://your-es-host:9200/your-target-index/_search?size=1000&q=*" -H "Content-Type: application/json"

Step 2: Convert to CSV with jq

Pipe the output to jq to extract specific fields and format them as CSV. Replace field1, field2, etc., with your actual field names (including nested paths like user.email):

curl -X GET "http://your-es-host:9200/your-target-index/_search?size=1000&q=*" -H "Content-Type: application/json" | jq -r '["field1", "field2", "field3"], (.hits.hits[] | [._source.field1, ._source.field2, ._source.field3]) | @csv' > raw_data.csv
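If jq is not available, the same conversion can be sketched in Python with only the standard library. This is a minimal sketch, not part of the original answer: hits_to_csv and the sample dictionary are hypothetical names, and the sample only imitates the shape of a _search response (hits.hits[]._source). Missing fields are written as empty cells.

```python
import csv
import io


def hits_to_csv(response, fields):
    """Render the _source of each hit as CSV text, one row per document."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)  # header row
    for hit in response["hits"]["hits"]:
        source = hit.get("_source", {})
        # A field that is absent from a document becomes an empty cell
        writer.writerow([source.get(name, "") for name in fields])
    return buffer.getvalue()


# Hypothetical sample shaped like an Elasticsearch _search response
sample = {
    "hits": {
        "hits": [
            {"_source": {"field1": "a", "field2": 1, "field3": "x,y"}},
            {"_source": {"field1": "b", "field2": 2}},
        ]
    }
}

print(hits_to_csv(sample, ["field1", "field2", "field3"]))
```

Values that contain commas are quoted by the csv module, which is the same job @csv does in the jq pipeline.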

For large datasets (scroll API)

If you have more than 10,000 documents, use Elasticsearch's scroll API to batch fetch data. Here's a simplified shell script example:

# Initialize scroll; this first response already contains the first batch of hits
curl -s -X GET "http://your-es-host:9200/your-target-index/_search?scroll=1m&size=1000" -H "Content-Type: application/json" -d '{"query": {"match_all": {}}}' > scroll_batch.json

# Write header to CSV
echo '"field1","field2","field3"' > large_raw_data.csv

# Loop: write the current batch, then fetch the next one, until a batch comes back empty
while true; do
  HITS_COUNT=$(jq '.hits.hits | length' scroll_batch.json)
  if [ "$HITS_COUNT" -eq 0 ]; then
    break
  fi
  # Append batch data to CSV
  jq -r '.hits.hits[] | [._source.field1, ._source.field2, ._source.field3] | @csv' scroll_batch.json >> large_raw_data.csv
  # Each response carries the scroll ID to use for the next request
  SCROLL_ID=$(jq -r '._scroll_id' scroll_batch.json)
  curl -s -X GET "http://your-es-host:9200/_search/scroll" -H "Content-Type: application/json" -d "{\"scroll\": \"1m\", \"scroll_id\": \"$SCROLL_ID\"}" > scroll_batch.json
done

# Clean up temporary files
rm scroll_batch.json
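One detail of the scroll API is easy to miss: the initial _search?scroll=... response already carries the first batch of hits, so those documents have to be written out as well, not only the batches returned by _search/scroll. The control flow can be sketched in Python as below. This is an illustration under assumptions: iter_scroll_hits is a hypothetical helper, and the HTTP call is replaced by an in-memory lookup (pages) so the sketch runs without a cluster.

```python
def iter_scroll_hits(first_page, fetch_next):
    """Yield every hit of a scroll, starting with the initial search response.

    fetch_next is any callable that takes a scroll ID and returns the next
    response as a dict; in real use it would POST to /_search/scroll.
    """
    page = first_page
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit
        # Each response carries the scroll ID to use for the next request
        page = fetch_next(page["_scroll_id"])


# Hypothetical in-memory pages standing in for HTTP responses
pages = {
    "id-1": {"_scroll_id": "id-2", "hits": {"hits": [{"_source": {"field1": "c"}}]}},
    "id-2": {"_scroll_id": "id-3", "hits": {"hits": []}},
}
first = {
    "_scroll_id": "id-1",
    "hits": {"hits": [{"_source": {"field1": "a"}}, {"_source": {"field1": "b"}}]},
}

values = [hit["_source"]["field1"] for hit in iter_scroll_hits(first, pages.__getitem__)]
print(values)  # ['a', 'b', 'c']
```

The loop stops on the first empty batch, which is the same termination condition the shell script uses.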

Method 3: Logstash (Best for Very Large Datasets)

For exporting hundreds of thousands or millions of documents, Logstash handles pagination and batch processing automatically. Here's how to set it up:

  1. Create a Logstash configuration file (e.g., es-to-csv.conf):
input {
  elasticsearch {
    hosts => ["your-es-host:9200"]
    index => "your-target-index"
    query => '{"query": {"match_all": {}}}' # Adjust query if filtering is needed
    scroll => "5m"
    size => 5000
  }
}

output {
  csv {
    path => "/path/to/your/exported_data.csv"
    fields => ["field1", "field2", "field3"] # List your target fields here
    # Rows are comma-separated by default. As far as I know, separator and
    # write_headers are not settings of the csv output, so add a header line
    # to the file yourself if you need one.
  }
  stdout { codec => dots } # Optional: Shows progress as dots
}
  2. Run Logstash with the configuration:
bin/logstash -f es-to-csv.conf
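If the exported file ends up without a header row, which can happen depending on the plugin version, a small post-processing step can add one. The sketch below is an assumption-laden illustration rather than anything Logstash provides: ensure_header and header_line are hypothetical helpers, and the demo writes to a temporary directory instead of your real export path. The file is streamed into a temporary copy, so it should cope with large exports.

```python
import csv
import io
import os
import shutil
import tempfile


def header_line(fields):
    """Format the field names as a single CSV line."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue()


def ensure_header(path, fields):
    """Prepend a header row unless the file already starts with it.

    Returns True if the file was changed.
    """
    expected = header_line(fields)
    with open(path, "r", encoding="utf-8", newline="") as src:
        first = src.readline()
        if first.rstrip("\r\n") == expected.rstrip("\n"):
            return False
        src.seek(0)
        directory = os.path.dirname(os.path.abspath(path))
        with tempfile.NamedTemporaryFile(
            "w", encoding="utf-8", newline="", dir=directory, delete=False
        ) as tmp:
            tmp.write(expected)
            shutil.copyfileobj(src, tmp)  # stream the rows, no full read into memory
    os.replace(tmp.name, path)  # swap the new file in place of the old one
    return True


# Demo on a throwaway file (hypothetical content)
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "exported_data.csv")
with open(demo_path, "w", encoding="utf-8", newline="") as f:
    f.write("a,1,x\nb,2,y\n")

print(ensure_header(demo_path, ["field1", "field2", "field3"]))  # True
print(ensure_header(demo_path, ["field1", "field2", "field3"]))  # False
```

Run it only after Logstash has finished writing, since the file is replaced as a whole.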

Quick Notes:

  • Ensure your user account has read access to the target Elasticsearch index
  • For nested fields, use the full path: ._source.user.profile.name in jq, and the field reference [user][profile][name] in the Logstash fields list
  • If you hit permission issues, check Elasticsearch's role-based access control settings
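Since jq and Logstash spell nested paths differently, a tiny helper can translate one dotted field name into both forms. This is only a sketch: to_jq_path and to_logstash_ref are hypothetical names, and the dotted input assumes that no field name itself contains a dot.

```python
import re

# Name parts that jq accepts after a plain dot
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def to_jq_path(dotted):
    """'user.profile.name' -> '._source.user.profile.name'"""
    path = "._source"
    for part in dotted.split("."):
        # Names such as @timestamp need the quoted bracket form in jq
        path += "." + part if _IDENT.match(part) else '["' + part + '"]'
    return path


def to_logstash_ref(dotted):
    """'user.profile.name' -> '[user][profile][name]'"""
    return "".join("[" + part + "]" for part in dotted.split("."))


print(to_jq_path("user.profile.name"))       # ._source.user.profile.name
print(to_jq_path("@timestamp"))              # ._source["@timestamp"]
print(to_logstash_ref("user.profile.name"))  # [user][profile][name]
```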

The question comes from Stack Exchange; it was asked by kumar.
