如何用Python读取Microsoft BI文件并自动导出工作表为CSV?
Hey there! No need to apologize at all—Power BI + Python automation docs are definitely scattered, so this is a super common pain point. Let's break down how to automate exporting those 175 sheets/tables from your Power BI file (I’m assuming it’s a .pbix file here) to CSV, since that’s your preferred format.
Option 1: Control Power BI Desktop with pywin32 (Most Like Manual Workflow)
This method mimics exactly what you’re doing manually—opening Power BI, navigating to each sheet/table, and exporting it. It’s reliable and doesn’t require digging into PBIX file internals.
First, install the required packages:
pip install pywin32 pandas
Here’s a working script example:
import win32com.client import pandas as pd import os # Set up your output folder (creates it if it doesn't exist) output_dir = "C:/your/preferred/output/path" os.makedirs(output_dir, exist_ok=True) # Launch Power BI Desktop pbi_app = win32com.client.Dispatch("MicrosoftPowerBI.Application") pbi_app.Visible = True # Keep this True for debugging; set to False once it's stable # Open your PBIX file pbix_path = "C:/path/to/your/file.pbix" report = pbi_app.OpenReport(pbix_path) # Export data model tables (adjust this if you need report pages instead) for table in report.Model.Tables: # Clean up the table name to avoid invalid filename characters clean_table_name = table.Name.replace("/", "_").replace("\\", "_").replace(":", "_") csv_file_path = os.path.join(output_dir, f"{clean_table_name}.csv") # Load the table data into a pandas DataFrame df = pd.DataFrame(table.Rows) # Export to CSV with UTF-8 encoding to avoid Chinese character issues df.to_csv(csv_file_path, index=False, encoding="utf-8-sig") print(f"Exported {clean_table_name} to {csv_file_path}") # Close Power BI when done pbi_app.Quit()
Notes:
- This requires Power BI Desktop to be installed locally.
- If you need to export report pages (visuals) instead of data model tables, swap the
for table in report.Model.Tablesloop withfor page in report.Pagesand use the page's export method (you may need to tweak this based on your visuals).
Option 2: Parse the PBIX File Directly (No Power BI Desktop Required)
A PBIX file is actually a ZIP archive under the hood. If your PBIX uses import mode (meaning data is embedded in the file, not live-connected to a source), you can extract and read its contents directly.
Install dependencies first:
pip install pandas zipfile36 pyarrow
Example script:
import zipfile import pandas as pd import os import json from pyarrow import feather # Extract the PBIX file to a temporary folder pbix_path = "C:/path/to/your/file.pbix" temp_extract_dir = "C:/temp/pbix_extraction" os.makedirs(temp_extract_dir, exist_ok=True) with zipfile.ZipFile(pbix_path, 'r') as zip_ref: zip_ref.extractall(temp_extract_dir) # Load the model schema to get table names schema_path = os.path.join(temp_extract_dir, "Model", "ModelSchema.json") with open(schema_path, 'r', encoding='utf-8') as f: model_schema = json.load(f) table_names = [table["name"] for table in model_schema["tables"]] # Load the embedded data (stored as a Feather file) feather_data_path = os.path.join(temp_extract_dir, "Model", "Model.feather") model_data = feather.read_feather(feather_data_path) # Export each table to CSV output_dir = "C:/your/preferred/output/path" os.makedirs(output_dir, exist_ok=True) for table_name in table_names: clean_name = table_name.replace("/", "_").replace("\\", "_") csv_path = os.path.join(output_dir, f"{clean_name}.csv") # Filter the model data to get only rows for this table (adjust logic if needed) # Note: PBIX data structures can vary by version—you may need to debug this part table_data = model_data[model_data["TableName"] == table_name] table_data.to_csv(csv_path, index=False, encoding="utf-8-sig") print(f"Exported {clean_name} successfully")
Notes:
- This only works for import-mode PBIX files. If your file uses live connections, there’s no embedded data to extract.
- PBIX file structures can change between Power BI versions, so you may need to tweak the data filtering logic.
Option 3: Use the Power BI REST API (For Cloud-Hosted Reports)
If your PBIX is published to the Power BI Service (cloud), you can use the REST API to pull data and export it to CSV. This requires setting up an Azure AD app for authentication.
First, install packages:
pip install requests pandas
Simplified example script:
import requests import pandas as pd import os # Your Azure AD & Power BI service details tenant_id = "your-azure-tenant-id" client_id = "your-app-client-id" client_secret = "your-app-client-secret" workspace_id = "your-power-bi-workspace-id" dataset_id = "your-dataset-id" # Get an access token for the API token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token" token_payload = { "grant_type": "client_credentials", "client_id": client_id, "client_secret": client_secret, "scope": "https://analysis.windows.net/powerbi/api/.default" } token_response = requests.post(token_url, data=token_payload) access_token = token_response.json()["access_token"] headers = {"Authorization": f"Bearer {access_token}"} # Get all tables in the dataset tables_url = f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}/datasets/{dataset_id}/tables" tables_response = requests.get(tables_url, headers=headers) tables = tables_response.json()["value"] # Export each table to CSV output_dir = "C:/your/preferred/output/path" os.makedirs(output_dir, exist_ok=True) for table in tables: clean_name = table["name"].replace("/", "_").replace("\\", "_") csv_path = os.path.join(output_dir, f"{clean_name}.csv") # Pull table rows from the API rows_url = f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}/datasets/{dataset_id}/tables/{table['name']}/rows" rows_response = requests.get(rows_url, headers=headers) table_rows = rows_response.json()["value"] # Convert to DataFrame and export df = pd.DataFrame(table_rows) df.to_csv(csv_path, index=False, encoding="utf-8-sig") print(f"Exported {clean_name} from cloud dataset")
Notes:
- You’ll need to set up an Azure AD application with the correct permissions to access your Power BI workspace.
- This is ideal if you’re already using the Power BI Service and want to avoid local dependencies.
Final Recommendations
- For local PBIX files, Option 1 is the most reliable and easiest to set up—it mirrors your manual workflow perfectly.
- If you can’t install Power BI Desktop locally and have an import-mode PBIX, give Option 2 a try (but be prepared to debug the data structure).
- For cloud-hosted reports, Option 3 is the way to go for scalable automation.
All scripts export to CSV with UTF-8 encoding to prevent common issues with non-English characters.
内容的提问来源于stack exchange,提问作者Zonk




