Help needed: GeoPandas to_file raises a utf8 decoding error (Python 2.7)
Hey there, let's work through this encoding error you're facing. There's no need to hunt for that ogrext.pyx file: it's a Cython source file that gets compiled into a binary extension, so you usually won't find it as plain text in your installed packages anyway. Here's what's going on and how to fix it:
What's Causing the Error?
The 'utf8' codec can't decode byte 0xb9... error means your GeoDataFrame contains byte strings that aren't valid UTF-8. Most likely they're in an encoding common on Windows or in ArcGIS exports, such as GBK or CP1252. Fiona (the library GeoPandas relies on for file I/O) assumes UTF-8 by default, so it fails when it tries to decode those non-UTF-8 bytes.
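You can reproduce the same failure without GeoPandas or Fiona. This is a minimal sketch that assumes the offending value is GBK-encoded Chinese text; the sample bytes are an illustration, not taken from your data:

```python
# Two bytes that form one Chinese character in GBK (assumed sample value).
raw = b'\xb9\xfa'

try:
    raw.decode('utf-8')
    utf8_ok = True
    message = ''
except UnicodeDecodeError as exc:
    # 0xb9 cannot start a UTF-8 sequence, so decoding fails on that byte.
    utf8_ok = False
    message = str(exc)

# The same bytes decode cleanly once the right encoding is named.
text = raw.decode('gbk')
```

If your own values behave like `raw` here, the data is fine and only the assumed encoding is wrong.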
Step-by-Step Fixes
1. Specify the Correct Encoding Directly in to_file()
The simplest fix is to tell Fiona which encoding your data uses when saving. Pass an encoding parameter to to_file(), using the encoding that matches your data (gbk is common in Chinese Windows environments, cp1252 in Western European ones):
df.to_file('psuedo.shp', encoding='gbk')
This tells Fiona to use your specified encoding instead of its UTF-8 default when handling string values.
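Whichever encoding you pass, it has to be able to represent every attribute value; cp1252, for instance, has no Chinese characters. The sketch below uses a hypothetical helper, `unencodable`, and assumes the values are already text (unicode in Python 2.7) rather than byte strings:

```python
def unencodable(values, encoding):
    """Return the text values that cannot be represented in `encoding`."""
    bad = []
    for value in values:
        try:
            value.encode(encoding)
        except UnicodeEncodeError:
            bad.append(value)
    return bad

# Assumed sample attribute values: one ASCII name, one Chinese name.
names = [u'Beijing', u'\u5317\u4eac']

gbk_problems = unencodable(names, 'gbk')        # every value fits in GBK
cp1252_problems = unencodable(names, 'cp1252')  # the Chinese name does not fit
```

An empty result suggests the encoding is a safe choice for that column.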
2. Convert Your Data's Encoding to UTF-8 Explicitly
If specifying the encoding doesn't resolve the issue, you can convert all string columns in your GeoDataFrame to UTF-8 directly. In Python 2.7, strings are byte-based, so we'll decode from the original encoding and re-encode to UTF-8:
# Iterate over all columns with string/object data types
for col in df.columns:
    if df[col].dtype == object:
        # Replace 'gbk' with your actual data encoding if needed
        df[col] = df[col].apply(lambda x: x.decode('gbk').encode('utf-8') if isinstance(x, str) else x)

# Save the cleaned data
df.to_file('psuedo.shp')
If you're unsure of the original encoding, try cp1252 or gb18030 as alternatives to gbk. Note that gb2312 is a subset of gbk, so it won't decode anything that gbk can't.
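One way to narrow down the encoding is to try each candidate against a few raw values and keep the first one that decodes all of them. This is a rough sketch; `first_matching_encoding` and the sample bytes are hypothetical, and the candidate order is an assumption you may want to change:

```python
def first_matching_encoding(samples, candidates=('utf-8', 'gbk', 'cp1252')):
    """Return the first candidate that decodes every byte string, or None."""
    for encoding in candidates:
        try:
            for raw in samples:
                raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # this candidate failed on some value, try the next
        return encoding
    return None

# Assumed sample values pulled from a string column.
samples = [b'\xb9\xfa', b'abc']
guess = first_matching_encoding(samples)
```

A successful decode is not proof that the guess is right. cp1252 accepts almost any byte sequence, which is why it sits last in the list, so print a few decoded values and check that they read correctly.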
3. Convert Strings to Unicode (Python 2.7-Specific)
Python 2.7 treats Unicode and byte strings separately. Converting all string values to Unicode can eliminate encoding mismatches:
for col in df.columns:
    if df[col].dtype == object:
        # Replace 'gbk' with your data's actual encoding
        df[col] = df[col].apply(lambda x: unicode(x, 'gbk') if isinstance(x, str) else x)

# Save the data; Fiona will handle Unicode to UTF-8 conversion automatically
df.to_file('psuedo.shp')
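If the script may later run on Python 3 as well, the per-value conversion can be written without the Python 2-only `unicode` builtin. A small sketch, with `to_text` as a hypothetical helper name and 'gbk' as an assumed default:

```python
def to_text(value, encoding='gbk'):
    """Decode byte strings to text; leave every other value untouched."""
    if isinstance(value, bytes):  # on Python 2.7, bytes is an alias for str
        return value.decode(encoding)
    return value

# Assumed sample row mixing a byte string, a number, a missing value, and text.
row = [b'\xb9\xfa', 42, None, u'ok']
converted = [to_text(v) for v in row]
```

With a DataFrame, the same function can be applied per column, for example `df[col] = df[col].apply(to_text)`.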
Why You Don't Need ogrext.pyx
That file is part of Fiona's internal compiled source code, so you can't modify it directly. The fixes above target your data instead, which is the correct approach for encoding issues like this.
This question comes from Stack Exchange; asked by Samantha Leo.




