Using Elasticsearch with Kibana 5.6.9: how to export raw data as CSV
Hey there! I totally get the frustration when you can export aggregated visualizations but not the raw data you need. Let's walk through a few reliable methods tailored to your Elasticsearch 5.6.9 and Kibana setup:
Method 1: Use Kibana Discover (Simplest GUI Approach)
You might have missed this, but Kibana's Discover tab is built specifically for working with raw index data:
- Open Kibana and navigate to the Discover page (left sidebar)
- Select your target index from the dropdown at the top left
- Adjust the time range or add a query to filter exactly the raw data you want to export
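- Look for an export option in the top bar. This depends on your version: as far as I know, Discover in Kibana 5.6.9 has no CSV export of its own. It arrived in later releases (6.x, through the Reporting feature on a saved search)
- If your Kibana does offer it, choose the CSV option and download the file; if not, skip to Method 2 or 3
Where it is available, this exports the raw document fields shown in the search, not just the aggregated values from visualizations.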
Method 2: Command Line with curl + jq (For Customization)
If you prefer terminal workflows, combining Elasticsearch's API with curl and jq (a lightweight JSON processor) gives you full control over which fields to export:
Step 1: Fetch raw data with curl
Retrieve your data using the _search API. Adjust size to match your document count. Note that in 5.6.9 the index.max_result_window setting defaults to 10,000, so a single request cannot page past that; use the scroll API below for larger datasets:
curl -X GET "http://your-es-host:9200/your-target-index/_search?size=1000&q=*" -H "Content-Type: application/json"
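Before picking a size, it can help to check how many documents actually match. Here's a small sketch, assuming the standard _count endpoint and the default 10,000 window. needs_scroll is a helper name made up for this example, not something Elasticsearch ships.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the document count
# exceeds the result window, i.e. one _search call cannot return it all.
# $1 = document count, $2 = window size (assumed default: 10000)
needs_scroll() {
  [ "$1" -gt "${2:-10000}" ]
}

# Example use (needs a reachable cluster, so it is left commented out):
# count=$(curl -s "http://your-es-host:9200/your-target-index/_count?q=*" | jq '.count')
# if needs_scroll "$count"; then echo "use the scroll API"; fi
```

If your index has a custom max_result_window, pass it as the second argument.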
Step 2: Convert to CSV with jq
Pipe the output to jq to extract specific fields and format them as CSV. Replace field1, field2, etc., with your actual field names (including nested paths like user.email):
curl -X GET "http://your-es-host:9200/your-target-index/_search?size=1000&q=*" -H "Content-Type: application/json" | jq -r '["field1", "field2", "field3"], (.hits.hits[] | [._source.field1, ._source.field2, ._source.field3]) | @csv' > raw_data.csv
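Keeping the header list and the row expression in sync by hand gets error-prone once you have more than a few fields. Here's a minimal sketch of a helper that builds the jq filter from a list of field names. build_jq_csv_filter is a name I made up, and it assumes a dot in a field name always means nesting (user.email is read as _source.user.email).

```shell
#!/bin/sh
# Hypothetical helper (not part of jq or Elasticsearch): prints a jq filter
# that emits a CSV header row followed by one row per hit.
# Assumption: a dot in a field name means nesting, e.g. user.email.
build_jq_csv_filter() (
  header=""
  row=""
  for f in "$@"; do
    # user.email -> ._source["user"]["email"]; the bracket form also
    # copes with names such as @timestamp
    path=$(printf '%s' "$f" | sed 's/\./"]["/g')
    header="${header:+$header, }\"$f\""
    row="${row:+$row, }._source[\"$path\"]"
  done
  printf '[%s], (.hits.hits[] | [%s]) | @csv\n' "$header" "$row"
)

# Example use (needs a reachable cluster, so it is left commented out):
# curl -s "http://your-es-host:9200/your-target-index/_search?size=1000&q=*" \
#   | jq -r "$(build_jq_csv_filter field1 user.email)" > raw_data.csv
```

Note that @csv only accepts scalar values, so fields holding arrays or objects would still need flattening first.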
For large datasets (scroll API)
If you have more than 10,000 documents, use Elasticsearch's scroll API to batch fetch data. Here's a simplified shell script example:
    # Initialize the scroll; this first response already contains the first batch of hits
    curl -s -X GET "http://your-es-host:9200/your-target-index/_search?scroll=1m&size=1000" -H "Content-Type: application/json" -d '{"query": {"match_all": {}}}' > scroll_batch.json

    # Write header to CSV
    echo '"field1","field2","field3"' > large_raw_data.csv

    # Loop until a batch comes back with no hits
    while true; do
      HITS_COUNT=$(jq '.hits.hits | length' scroll_batch.json)
      if [ "$HITS_COUNT" -eq 0 ]; then
        break
      fi
      # Append batch data to CSV
      jq -r '.hits.hits[] | [._source.field1, ._source.field2, ._source.field3] | @csv' scroll_batch.json >> large_raw_data.csv
      # Fetch the next batch using the most recent scroll ID
      SCROLL_ID=$(jq -r '._scroll_id' scroll_batch.json)
      curl -s -X GET "http://your-es-host:9200/_search/scroll" -H "Content-Type: application/json" -d "{\"scroll\": \"1m\", \"scroll_id\": \"$SCROLL_ID\"}" > scroll_batch.json
    done

    # Clean up temporary files
    rm scroll_batch.json
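Scroll contexts hold resources on the cluster until they time out, so it's good practice to release them once the export finishes. Here's a sketch, assuming the clear-scroll endpoint (DELETE /_search/scroll). clear_scroll and ES_URL are names I made up for this example.

```shell
#!/bin/sh
# ES_URL is an assumed variable; point it at your own cluster.
ES_URL="${ES_URL:-http://your-es-host:9200}"

# Hypothetical helper: releases the scroll context for the given scroll ID.
clear_scroll() {
  curl -s -X DELETE "$ES_URL/_search/scroll" \
    -H "Content-Type: application/json" \
    -d "{\"scroll_id\": [\"$1\"]}"
}

# In the script above, call it right before the cleanup step:
# clear_scroll "$SCROLL_ID"
```

With a 1m scroll timeout the context would expire on its own shortly anyway, so this matters mostly when you use longer timeouts or run many exports.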
Method 3: Logstash (Best for Very Large Datasets)
For exporting hundreds of thousands or millions of documents, Logstash handles pagination and batch processing automatically. Here's how to set it up:
- Create a Logstash configuration file (e.g., es-to-csv.conf):
    input {
      elasticsearch {
        hosts => ["your-es-host:9200"]
        index => "your-target-index"
        # Adjust query if filtering is needed
        query => '{"query": {"match_all": {}}}'
        scroll => "5m"
        size => 5000
      }
    }
    output {
      csv {
        path => "/path/to/your/exported_data.csv"
        # List your target fields here
        fields => ["field1", "field2", "field3"]
        # No separator or write_headers here: as far as I know the csv
        # output plugin does not accept those settings. Comma is the
        # default delimiter; add a header row to the file yourself.
      }
      # Optional: shows progress as dots
      stdout { codec => dots }
    }
- Run Logstash with the configuration:
bin/logstash -f es-to-csv.conf
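Depending on the plugin version, the CSV file Logstash writes may come out without a header row. If that happens, a small shell step can add one afterwards. This is just a sketch: prepend_header is a hypothetical helper, and the header text should match whatever field list you used in the config.

```shell
#!/bin/sh
# Hypothetical helper: prepend_header FILE HEADER_LINE
# Writes HEADER_LINE as the first line of FILE, keeping the existing rows.
prepend_header() (
  file=$1
  header=$2
  tmp="$file.tmp.$$"
  # Build the new file next to the original, then swap it in
  { printf '%s\n' "$header"; cat "$file"; } > "$tmp" && mv "$tmp" "$file"
)

# Example use once Logstash has finished:
# prepend_header /path/to/your/exported_data.csv 'field1,field2,field3'
```

Run it only after Logstash has exited, since the helper replaces the file Logstash is appending to.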
Quick Notes:
- Ensure your user account has read access to the target Elasticsearch index
- For nested fields, use the full path: ._source.user.profile.name in jq, or the field reference [user][profile][name] in Logstash
- If you hit permission issues, check Elasticsearch's role-based access control settings
The question in this post comes from Stack Exchange; it was asked by kumar.




