How do you ignore specific punctuation when reading a text file in C++?

阿华AIGC实验室

2026-5-8

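How to ignore specified punctuation when reading a text file

Hey, let me help you sort out ignoring punctuation while reading a file! To ignore the apostrophe ('), comma (,), colon (:), and period (.), there are two directions to take: filter the punctuation out after reading, or have the stream skip it while reading. Working from your code, I've put together a workable version of each, and fixed a few small problems in the original along the way (for example, argv[0] should not be modified, because it is the name of the program itself).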

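Approach 1: filter the punctuation out of each word after reading

This method is the most straightforward: read a whole word first, then check it character by character and drop the unwanted punctuation. The modified code is as follows: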

#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char *argv[]) {
    int count = 0;
    std::string fileName;
    std::string storeFile;

    std::cout << "Please enter the name of the file: " << std::endl;
    // Do not modify argv[0]; store the file name in its own variable
    std::cin >> fileName;

    std::ifstream readFile(fileName);
    if (!readFile) {
        std::cerr << "ERROR: failed to open file " << fileName << std::endl;
        return 1;  // a non-zero exit status signals failure
    }
    std::cerr << "File successfully opened" << std::endl;

    // Read the file one whitespace-delimited word at a time
    while (readFile >> storeFile) {
        // Filter the target punctuation out of the current word
        std::string cleanedWord;
        for (char c : storeFile) {
            // Keep only characters that are not in the punctuation set
            if (c != '\'' && c != ',' && c != ':' && c != '.') {
                cleanedWord += c;
            }
        }
        // Process the cleaned word (here: count it and print it)
        if (!cleanedWord.empty()) {
            count++;
            std::cout << "Cleaned word: " << cleanedWord << std::endl;
        }
    }

    // bad() can only be set once an extraction has failed, so check it after the loop
    if (readFile.bad()) {
        std::cerr << "File failed to read " << std::endl;
    }

    std::cout << "Total valid words: " << count << std::endl;
    return 0;
}

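Notes on the code:

A range-based for loop walks over the characters of each word and keeps only the ones that should not be removed, producing the cleaned word.
The original code wrongly modified argv[0]. argv[0] is the program's path/name, so the file name entered by the user belongs in a separate variable.
The myWord array in the original code was prone to out-of-bounds access; handling the text as std::string is safer and more flexible.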
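
The per-character filter above can also be written with the standard erase/remove_if idiom. Here is a minimal sketch, assuming a hypothetical helper named stripChars (it is not part of the original code) whose second parameter is the set of characters to drop:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper for illustration: returns a copy of `word` with every
// character that appears in `drop` removed. The default set matches the four
// punctuation marks discussed in this answer.
std::string stripChars(std::string word, const std::string& drop = "',:.") {
    word.erase(std::remove_if(word.begin(), word.end(),
                              [&drop](char c) {
                                  // true means "remove this character"
                                  return drop.find(c) != std::string::npos;
                              }),
               word.end());
    return word;
}
```

Inside the read loop this would be used as cleanedWord = stripChars(storeFile);. Keeping the punctuation set in a string means that adding another mark, such as ;, only requires changing one argument rather than extending the if condition.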

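Approach 2: change the input stream's delimiters (skip punctuation while reading)

If you want the punctuation skipped automatically during reading, you can customize the stream's character classification so that the >> operator treats those punctuation marks as whitespace. Keep in mind that a mark in the middle of a word then splits it, so don't is read as the two words don and t. The code is as follows: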

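#include <fstream>
#include <iostream>
#include <locale>
#include <string>
#include <vector>

// Custom character classification that marks the chosen punctuation as whitespace.
// std::ctype<char> has no virtual do_table(); the table is passed to its constructor.
class CustomPunct : public std::ctype<char> {
public:
    CustomPunct() : std::ctype<char>(makeTable()) {}

private:
    static const mask* makeTable() {
        // Start from the classic "C" locale table, then adjust a copy of it
        static std::vector<mask> table(classic_table(), classic_table() + table_size);
        // Mark the target punctuation as whitespace so >> skips it while reading
        table['\''] |= space;
        table[','] |= space;
        table[':'] |= space;
        table['.'] |= space;
        return table.data();
    }
};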

int main(int argc, char *argv[]) {
    int count = 0;
    std::string fileName;
    std::string storeFile;

    std::cout << "Please enter the name of the file: " << std::endl;
    std::cin >> fileName;

    std::ifstream readFile(fileName);
    if (!readFile) {
        std::cerr << "ERROR: failed to open file " << fileName << std::endl;
        return 1;  // a non-zero exit status signals failure
    }
    std::cerr << "File successfully opened" << std::endl;

    // Install the custom classification on the input stream;
    // the locale takes ownership of the facet and deletes it
    readFile.imbue(std::locale(readFile.getloc(), new CustomPunct));

    // Each extraction now yields a word with the punctuation already skipped
    while (readFile >> storeFile) {
        count++;
        std::cout << "Word: " << storeFile << std::endl;
    }

    // bad() can only be set once an extraction has failed, so check it after the loop
    if (readFile.bad()) {
        std::cerr << "File failed to read " << std::endl;
    }

    std::cout << "Total valid words: " << count << std::endl;
    return 0;
}
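
The two approaches do not always produce the same words. Approach 1 deletes punctuation, so don't becomes dont, while approach 2 treats punctuation as a separator, so don't becomes don and t. The sketch below makes the second behavior easy to try on an in-memory string instead of a file. The names PunctAsSpace and splitWords are hypothetical, introduced only for this illustration:

```cpp
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet for illustration: classifies ' , : . as whitespace.
class PunctAsSpace : public std::ctype<char> {
public:
    PunctAsSpace() : std::ctype<char>(makeTable()) {}

private:
    static const mask* makeTable() {
        // Build the table once: copy the classic table, then flag the marks as space
        static const std::vector<mask> tbl = [] {
            std::vector<mask> t(classic_table(), classic_table() + table_size);
            for (unsigned char c : std::string("',:.")) {
                t[c] |= space;
            }
            return t;
        }();
        return tbl.data();
    }
};

// Hypothetical helper: splits `text` into words using the facet above.
std::vector<std::string> splitWords(const std::string& text) {
    std::istringstream in(text);
    // The locale owns the facet and deletes it when no longer referenced
    in.imbue(std::locale(in.getloc(), new PunctAsSpace));

    std::vector<std::string> words;
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}
```

Calling splitWords("don't stop, now.") yields the four words don, t, stop, and now. If contractions or numbers such as 3.14 must stay in one piece, the filtering approach is likely the better fit.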