How do I split file contents into double values?

Ahua AIGC Lab (阿华AIGC实验室)

2026-5-11

Got it, let's break this down. You need to pull double values out of that raw string for plotting. Here's how to do it in Python and C++, two languages commonly used for data visualization.

Step-by-Step Solution

First, let's clarify the raw data structure: it looks like two separate records stuck together with a space (between 3.95 and 4000). We'll split those first, then extract numeric values that can be converted to doubles.
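
One detail is easy to miss: the sample string also has a space after the comma in front of each date, so splitting on every space yields four pieces rather than two records. A quick Python check on the sample string illustrates this (the variable names are only illustrative):

```python
raw_data = "3000,1273010256, 2010/5/4,344,78.32,3.95 4000,1273010257, 2010/5/4,326,78.32,3.97"

# Naive split: every space is treated as a record separator
naive_pieces = raw_data.split(" ")
print(len(naive_pieces))   # 4
print(naive_pieces[0])     # 3000,1273010256,

# Remove the space that follows a comma first, then split on the remaining space
records = raw_data.replace(", ", ",").split(" ")
print(len(records))        # 2
print(records[0])          # 3000,1273010256,2010/5/4,344,78.32,3.95
```

This assumes the only spaces in the data are the one after a comma and the one between records, as in the sample.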

Python Implementation

Python's float is a double-precision value (in CPython it is implemented with a C double), so we can use it to parse the values. Here's a flexible script that converts every numeric field and skips non-numeric entries like the date string:

raw_data = "3000,1273010256, 2010/5/4,344,78.32,3.95 4000,1273010257, 2010/5/4,326,78.32,3.97"

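# The data also has a space after the comma before each date, so remove
# that space first; the only space left is the one between the two records
records = raw_data.replace(", ", ",").split(" ")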

# Store all converted double values here
double_values = []

for record in records:
    # Split each record into individual fields, stripping extra whitespace
    fields = [field.strip() for field in record.split(",")]
    for field in fields:
        # Try converting to float (Python's double equivalent)
        try:
            numeric_val = float(field)
            double_values.append(numeric_val)
        except ValueError:
            # Skip non-numeric fields like the date "2010/5/4"
            continue

# Output the result
print(double_values)
# Expected output: [3000.0, 1273010256.0, 344.0, 78.32, 3.95, 4000.0, 1273010257.0, 326.0, 78.32, 3.97]

If you only need specific fields (like the last two decimal values per record for plotting), you can target those directly instead of checking every field:

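target_values = []
# Remove the space after commas first so that splitting on the remaining
# space yields exactly two records of six fields each
for record in raw_data.replace(", ", ",").split(" "):
    fields = [field.strip() for field in record.split(",")]
    # Grab the 5th and 6th fields (0-indexed: indices 4 and 5)
    for idx in [4, 5]:
        try:
            target_values.append(float(fields[idx]))
        except (ValueError, IndexError):
            # Skip records that are too short or hold non-numeric text
            pass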

print(target_values)
# Expected output: [78.32, 3.95, 78.32, 3.97]
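
For plotting, it is often more convenient to keep the values grouped per record and then turn them into one sequence per column. The following is a minimal sketch, assuming every record has the six-field layout of the sample; `parse_rows` is a hypothetical helper name, not something from the question.

```python
raw_data = "3000,1273010256, 2010/5/4,344,78.32,3.95 4000,1273010257, 2010/5/4,326,78.32,3.97"


def parse_rows(text):
    """Return one list of floats per record, skipping non-numeric fields."""
    rows = []
    # Assumption: besides the record separator, the only space follows a comma
    for record in text.replace(", ", ",").split(" "):
        row = []
        for field in record.split(","):
            try:
                row.append(float(field))
            except ValueError:
                # Non-numeric field such as the date "2010/5/4"
                continue
        rows.append(row)
    return rows


rows = parse_rows(raw_data)
# Transpose the rows into columns: one tuple per numeric field position
columns = list(zip(*rows))

# The date is skipped, so each row has five numeric values (indices 0 to 4)
x_values = columns[0]  # first field of each record
y_values = columns[4]  # last field of each record
print(x_values, y_values)
```

`x_values` and `y_values` can then be handed to whatever plotting call you use, for instance matplotlib's `plt.plot(x_values, y_values)`.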

C++ Implementation

If you're working in C++ (for example with Qt or matplotlib-cpp for plotting), here's a way to extract doubles that validates each field before converting it:

#include <iostream>
#include <vector>
#include <sstream>
#include <string>
#include <cctype>

using namespace std;

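// Helper function: true if the string holds only digits and at most one
// decimal point, with at least one digit (so "." alone is rejected).
// Signs and exponents are not accepted.
bool isNumeric(const string& s) {
    bool hasDecimal = false;
    bool hasDigit = false;
    for (char c : s) {
        // Cast to unsigned char: isdigit is undefined for negative char values
        if (isdigit(static_cast<unsigned char>(c))) {
            hasDigit = true;
            continue;
        }
        if (c == '.' && !hasDecimal) {
            hasDecimal = true;
            continue;
        }
        return false;
    }
    return hasDigit;
}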

int main() {
    string rawData = "3000,1273010256, 2010/5/4,344,78.32,3.95 4000,1273010257, 2010/5/4,326,78.32,3.97";
    vector<double> doubleValues;

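    // Remove the space that follows a comma, so the only space left is
    // the one separating the two records
    size_t pos;
    while ((pos = rawData.find(", ")) != string::npos) {
        rawData.erase(pos + 1, 1);
    }

    // Split into individual records using space as the delimiter
    stringstream recordStream(rawData);
    string record;
    while (getline(recordStream, record, ' ')) {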
        // Split each record into fields using comma as the delimiter
        stringstream fieldStream(record);
        string field;
        while (getline(fieldStream, field, ',')) {
            // Trim leading/trailing whitespace from the field
            size_t start = field.find_first_not_of(" \t");
            size_t end = field.find_last_not_of(" \t");
            if (start == string::npos || end == string::npos) continue;
            string trimmedField = field.substr(start, end - start + 1);

            // Convert to double if the field is numeric
            if (isNumeric(trimmedField)) {
                double val = stod(trimmedField);
                doubleValues.push_back(val);
            }
        }
    }

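    // Print all extracted double values. The default precision of 6 digits
    // would show 1273010256 as 1.27301e+09, so raise it first.
    cout.precision(15);
    for (double val : doubleValues) {
        cout << val << " ";
    }
    cout << "\n";
    // Expected output: 3000 1273010256 344 78.32 3.95 4000 1273010257 326 78.32 3.97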
    return 0;
}

Key Notes

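Fixed Format Optimization: If your data always follows the same structure (e.g., 6 fields per record, with the decimals you need at 0-based indices 4 and 5), you can skip the numeric check and read those positions directly. That avoids a conversion attempt on every field, which adds up on large datasets.
Edge Cases: Trim extra whitespace from each field before conversion. The space after the comma in your raw data needs particular care, because it is the same character that separates the two records: remove it (or otherwise account for it) before splitting on spaces, or each record gets cut in two.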
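
Both notes can be combined in one small routine. The sketch below assumes the fixed six-field layout described above and uses a regular expression to drop whitespace around commas, so that any remaining whitespace (spaces, tabs, newlines) separates records. `extract_decimals`, `FIELDS_PER_RECORD` and `DECIMAL_INDICES` are hypothetical names chosen for illustration.

```python
import re

raw_data = "3000,1273010256, 2010/5/4,344,78.32,3.95 4000,1273010257, 2010/5/4,326,78.32,3.97"

# Assumed layout: 6 fields per record, decimals at 0-based indices 4 and 5
FIELDS_PER_RECORD = 6
DECIMAL_INDICES = (4, 5)


def extract_decimals(text):
    """Return (field 4, field 5) as floats for each fixed-format record."""
    # Drop whitespace around commas so whitespace only separates records
    normalized = re.sub(r"\s*,\s*", ",", text.strip())
    pairs = []
    # split() without arguments splits on runs of spaces, tabs and newlines
    for record in normalized.split():
        fields = record.split(",")
        if len(fields) != FIELDS_PER_RECORD:
            raise ValueError(f"unexpected field count in record: {record!r}")
        pairs.append(tuple(float(fields[i]) for i in DECIMAL_INDICES))
    return pairs


print(extract_decimals(raw_data))
# [(78.32, 3.95), (78.32, 3.97)]
```

Raising on a wrong field count is a design choice for this sketch: with a fixed format, a malformed record is more likely a data problem worth surfacing than something to skip silently.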

The question comes from Stack Exchange; it was asked by DLxe.