Technical requirement: extracting specific details from GitLab SSH logs with a shell script
Got it, let's tackle parsing those GitLab Shell SSH logs, whether you're working with the active, live log file or the archived tar files that get created once the log hits its rotation threshold. Below are practical shell scripts and explanations to extract the key details you need.
Start by understanding the structure of each log line, using your example:
I, [2018-03-07T10:21:26.315799 #7531] INFO -- : gitlab-shell: executing git command *git-upload-pack /var/opt/gitlab/git-data/repositories/GitLab-write-logs/write-logs.git> for user with key key-1
The critical fields we care about are:
- Timestamp: `2018-03-07T10:21:26.315799` (inside the `[]`, before the ` #7531` process ID)
- User SSH Key: `key-1` (the final segment of the line)
- Git Command: `git-upload-pack /var/opt/gitlab/git-data/repositories/GitLab-write-logs/write-logs.git` (wrapped between `*` and `>`)
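Before writing the full parser, it can help to check those three field boundaries against the sample line on its own. The following is a minimal sketch using `sed`; the variable names are illustrative, and the patterns assume every line follows the same layout as the example above.

```shell
#!/bin/sh
# Sample line copied from the example above.
line='I, [2018-03-07T10:21:26.315799 #7531] INFO -- : gitlab-shell: executing git command *git-upload-pack /var/opt/gitlab/git-data/repositories/GitLab-write-logs/write-logs.git> for user with key key-1'

# Timestamp: text after the first "[" up to the " #PID" part.
timestamp=$(printf '%s\n' "$line" | sed -n 's/^[^[]*\[\([^ ]*\) #.*/\1/p')

# Git command: text between "*" and "> for user with key".
git_command=$(printf '%s\n' "$line" | sed -n 's/.*\*\(.*\)> for user with key .*/\1/p')

# User key: the last space-separated token, if it starts with "key-".
user_key=$(printf '%s\n' "$line" | sed -n 's/.* \(key-[^ ]*\)$/\1/p')

printf 'timestamp=%s\nkey=%s\ncommand=%s\n' "$timestamp" "$user_key" "$git_command"
```

If any of the three variables comes back empty for one of your real log lines, the layout differs from the example and the patterns need adjusting before you rely on the scripts.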
Here's a reusable shell script that parses the active log file and outputs formatted, easy-to-read details:

    #!/bin/bash
    # Update this path to match your GitLab Shell log location
    ACTIVE_LOG="/var/log/gitlab/gitlab-shell/gitlab-shell.log"

    # Extract timestamp, user key, and git command from each log line.
    # The three-argument match() is a GNU awk extension, so call gawk explicitly.
    gawk '{
        # Reset per line so a malformed line cannot reuse earlier values
        timestamp = user_key = git_command = ""

        # Capture the timestamp: text after "[" up to the " #PID" part
        if (match($0, /\[([^ ]+) #/, timestamp_match)) {
            timestamp = timestamp_match[1]
        }
        # Capture the user SSH key at the end of the line
        if (match($0, /(key-[^ ]+)$/, key_match)) {
            user_key = key_match[1]
        }
        # Capture the full git command between "*" and ">"
        if (match($0, /\*(.*)>/, cmd_match)) {
            git_command = cmd_match[1]
        }
        # Only print if all fields are found (skip malformed lines)
        if (timestamp != "" && user_key != "" && git_command != "") {
            printf "📅 Timestamp: %s | 🔑 Key: %s | 🛠️ Command: %s\n", timestamp, user_key, git_command
        }
    }' "$ACTIVE_LOG"

How this works:
- Uses `awk`'s `match()` function, in its three-argument form (a GNU awk extension), with regex to target the specific, wrapped fields in the log line. Since the log uses non-standard separators, regex is more reliable than splitting on delimiters.
- Skips any malformed lines that don't have all three fields.
- Outputs a clean, labeled format for easy scanning.
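If GNU awk is not installed (many systems ship `mawk` or BusyBox `awk` as `awk`), the same extraction can be done with the two-argument `match()` plus `RSTART`/`RLENGTH`, which are part of POSIX awk. This is a sketch under that assumption; `parse_gitlab_shell_log` is a hypothetical helper name, and the offsets assume the line layout shown in the example.

```shell
#!/bin/sh
# Hypothetical helper: reads gitlab-shell log lines on stdin and prints
# "timestamp | key | command" for each line that has all three fields.
parse_gitlab_shell_log() {
    awk '
    {
        timestamp = ""; user_key = ""; git_command = ""

        # Timestamp: the match is "[" + timestamp + " #", so drop 3 characters.
        if (match($0, /\[[^ ]+ #/))
            timestamp = substr($0, RSTART + 1, RLENGTH - 3)

        # Key: the last whitespace-separated field, if it starts with "key-".
        if ($NF ~ /^key-/)
            user_key = $NF

        # Command: the match is "*" + command + "> for user" (1 + 10 extra characters).
        if (match($0, /\*.*> for user/))
            git_command = substr($0, RSTART + 1, RLENGTH - 11)

        if (timestamp != "" && user_key != "" && git_command != "")
            printf "%s | %s | %s\n", timestamp, user_key, git_command
    }'
}

# Try it on the sample line plus one malformed line (which is skipped).
sample='I, [2018-03-07T10:21:26.315799 #7531] INFO -- : gitlab-shell: executing git command *git-upload-pack /var/opt/gitlab/git-data/repositories/GitLab-write-logs/write-logs.git> for user with key key-1'
printf '%s\n' "$sample" "not a gitlab-shell line" | parse_gitlab_shell_log
```

For the live log, the call would be `parse_gitlab_shell_log < "$ACTIVE_LOG"`.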
Since GitLab rotates these logs into tar files, you'll need to parse those archives without extracting the entire thing (saves disk space). Here's a script that loops through all tar archives and aggregates parsed logs:

    #!/bin/bash
    # Update this to your archive directory
    ARCHIVE_DIR="/var/opt/gitlab/gitlab-shell/logs/archives"
    # Output file for aggregated parsed logs
    OUTPUT_FILE="parsed_gitlab_ssh_logs.txt"

    # Clear the output file if it exists
    : > "$OUTPUT_FILE"

    # Loop through each tar file in the archive directory
    for tar_archive in "$ARCHIVE_DIR"/*.tar; do
        # Skip the literal pattern when no archive matches
        [ -e "$tar_archive" ] || continue
        echo "Processing archive: $(basename "$tar_archive")" | tee -a "$OUTPUT_FILE"

        # Extract log content directly to stdout (no disk extraction) and parse.
        # The three-argument match() needs GNU awk.
        tar -xf "$tar_archive" -O | gawk '{
            timestamp = user_key = git_command = ""
            if (match($0, /\[([^ ]+) #/, ts_match)) timestamp = ts_match[1]
            if (match($0, /(key-[^ ]+)$/, key_match)) user_key = key_match[1]
            if (match($0, /\*(.*)>/, cmd_match)) git_command = cmd_match[1]
            if (timestamp != "" && user_key != "" && git_command != "") {
                printf "%s | %s | %s\n", timestamp, user_key, git_command
            }
        }' >> "$OUTPUT_FILE"
    done

    echo "Done! Parsed logs saved to $OUTPUT_FILE"

Key features:
- Uses `tar -O` to output the contents of log files directly to `awk` without writing them to disk.
- Aggregates all parsed logs into a single output file for easy analysis.
- Prints progress to the terminal and logs it to the output file.
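To see the `tar -O` streaming behaviour without touching real archives, you can build a throwaway archive and read it back. This is only a smoke-test sketch: the temporary directory, the `demo.tar` name, and the `/tmp/demo.git` repository path in the log line are made up for illustration.

```shell
#!/bin/sh
# Build a throwaway archive containing one made-up log line.
workdir=$(mktemp -d)
printf '%s\n' 'I, [2018-03-07T10:21:26.315799 #7531] INFO -- : gitlab-shell: executing git command *git-upload-pack /tmp/demo.git> for user with key key-1' > "$workdir/gitlab-shell.log"
tar -cf "$workdir/demo.tar" -C "$workdir" gitlab-shell.log

# Stream the archive member to stdout; no file is extracted to disk.
streamed_lines=$(tar -xOf "$workdir/demo.tar" | wc -l | tr -d ' ')
echo "lines streamed from archive: $streamed_lines"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Once this prints a non-zero line count, the same `tar -xOf ... |` pipe can feed the parsing step instead of `wc -l`.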
If you only care about certain commands or keys, modify the awk block to add filters. For example:
- Filter for `git-upload-pack` commands: add this line right before the `printf` statement: `if (git_command !~ /git-upload-pack/) next`
- Filter for a specific SSH key: add this line: `if (user_key != "key-1") next`
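If you would rather not edit the awk block each time, one option is to filter the already-parsed `timestamp | key | command` rows and pass the wanted values in with `awk -v`. The sketch below assumes that three-column output format; `filter_parsed` is a hypothetical helper, and the two sample rows are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper: keeps rows of "timestamp | key | command" that match
# an optional key ($1) and an optional command substring ($2).
# An empty argument means "do not filter on this column".
filter_parsed() {
    awk -F ' [|] ' -v want_key="$1" -v want_cmd="$2" '
        want_key != "" && $2 != want_key           { next }
        want_cmd != "" && index($3, want_cmd) == 0 { next }
        { print }
    '
}

# Made-up sample rows in the format written by the archive script.
rows='2018-03-07T10:21:26.315799 | key-1 | git-upload-pack /tmp/a.git
2018-03-07T10:22:00.000000 | key-2 | git-receive-pack /tmp/a.git'

# Keep only git-upload-pack commands run with key-1.
printf '%s\n' "$rows" | filter_parsed key-1 git-upload-pack
```

Against the aggregated file this would be `filter_parsed key-1 git-upload-pack < parsed_gitlab_ssh_logs.txt`; the "Processing archive" progress lines have no key column, so they drop out whenever a key filter is given.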
The question comes from Stack Exchange; asked by Prs.




