如何读取含混合数据类型的Excel并获取正确数值格式？

阿华AIGC实验室

2026-5-15

解决OleDb读取Excel混合格式列时的科学计数法问题

我之前处理Excel数据的时候也遇到过一模一样的情况——明明是整数长数字，OleDb却返回科学计数法格式的字符串，还搞不清真实的数据类型。下面分享几个亲测有效的解决思路：

一、优化连接字符串，强化文本读取规则

你已经加了IMEX=1和TypeGuessRows=0，但可以再补充参数来强化文本读取的优先级：

OleDbConnection cn = new OleDbConnection(
    $"Provider=Microsoft.ACE.OLEDB.12.0;Data Source='{physicalPath}';Extended Properties='Excel 12.0 Xml;HDR=NO;IMEX=1;TypeGuessRows=0;ImportMixedTypes=Text;ReadOnly=True;';"
);

重点是新增的ImportMixedTypes=Text，它会强制OleDb把混合格式的列统一按文本类型读取，避免长数字被识别为数值型后转换成科学计数法。配合TypeGuessRows=0一起生效——这个参数会让OleDb扫描整个列的所有行推断类型，而非默认的前8行。

二、对已读取的科学计数法字符串进行转换

如果连接字符串调整后还是偶尔出现科学计数法，可以在读取数据时主动转换：

while (reader.Read())
{
    string rawValue = reader.GetString(0); // 假设目标列是第0列
    string formattedValue;
    // 判断是否为科学计数法格式
    if (rawValue.Contains("e+") || rawValue.Contains("E+"))
    {
        // 优先用Decimal转换，确保精度
        if (decimal.TryParse(rawValue, out decimal num))
        {
            formattedValue = num.ToString("F0"); // 输出不带小数的纯数字字符串
        }
        else
        {
            // 处理超大数字，自定义解析科学计数法
            formattedValue = ParseScientificNotation(rawValue);
        }
    }
    else
    {
        formattedValue = rawValue;
    }
    
    // 获取列的推断类型
    Type fieldType = reader.GetFieldType(0);
    Console.WriteLine($"值：{formattedValue}，推断类型：{fieldType.Name}");
}

针对超大数字的自定义解析方法示例：

private static string ParseScientificNotation(string sciStr)
{
    string[] parts = sciStr.Split(new[] { 'e', 'E' }, StringSplitOptions.RemoveEmptyEntries);
    if (parts.Length != 2) return sciStr;
    
    string numberPart = parts[0].Replace(".", "");
    int exponent = int.Parse(parts[1]);
    int decimalShift = exponent - (numberPart.Length - IndexOfNonZeroDigit(numberPart) - 1);
    
    if (decimalShift >= 0)
    {
        return numberPart + new string('0', decimalShift);
    }
    else
    {
        return numberPart.Insert(numberPart.Length + decimalShift, ".");
    }
}

// 辅助方法：找到第一个非零数字的索引
private static int IndexOfNonZeroDigit(string str)
{
    for (int i = 0; i < str.Length; i++)
    {
        if (str[i] != '0' && str[i] != '.')
        {
            return i;
        }
    }
    return 0;
}

三、绕过OleDb，使用第三方库读取

如果需要精准获取每个单元格的原始类型和值，推荐用ExcelDataReader这类轻量级库，它能直接读取Excel原始单元格信息，无需依赖OleDb驱动：

using (var stream = File.Open(physicalPath, FileMode.Open, FileAccess.Read))
{
    using (var reader = ExcelReaderFactory.CreateReader(stream))
    {
        do
        {
            while (reader.Read())
            {
                // 获取单元格原始值和类型
                object cellValue = reader.GetValue(0);
                Type cellType = cellValue.GetType();
                
                // 处理数值类型，避免科学计数法
                string displayValue;
                if (cellType == typeof(double))
                {
                    displayValue = ((double)cellValue).ToString("F0");
                }
                else
                {
                    displayValue = cellValue.ToString();
                }
                
                Console.WriteLine($"值：{displayValue}，原始类型：{cellType.Name}");
            }
        } while (reader.NextResult());
    }
}

这类库完全绕过OleDb的类型推断逻辑，能直接拿到Excel单元格的原始数据类型（如数字、文本、日期等），处理混合格式列更精准。