Question: does the R officer package support PowerPoint 97-2003 format PPT files, and why does reading one raise an error?

Does the R officer package support legacy PowerPoint 97-2003 (.ppt) files?

Short Answer

The R officer package does not support the older PowerPoint 97-2003 (.ppt) binary format. It works only with modern Office Open XML files (.pptx, .docx, etc.), which are ZIP-compressed archives of XML parts.

Why You’re Seeing the Zip Error

The error message you encountered:

simpleError in zip::unzip(zipfile = newfile, exdir = folder): zip error: Cannot open zip file C:\Users\user1\AppData\Local\Temp\RtmpeYD5pQ\file41fc3c39b6.ppt for reading in file zip.c:238

This is expected: officer begins by unzipping the input file, because a .pptx file is a ZIP archive. A legacy .ppt file is an OLE2 compound binary file, not a ZIP archive, so zip::unzip() fails before any slide content is parsed.
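To confirm which container format a file really uses (the extension can be misleading), one option is to look at its first bytes. The sketch below assumes the usual signatures: a non-empty ZIP archive starts with the bytes 50 4B 03 04 ("PK"), and an OLE2 compound file starts with D0 CF 11 E0 A1 B1 1A E1. The helper name detect_office_container() is hypothetical and is not part of officer.

```r
# Classify a file by its leading "magic" bytes.
# detect_office_container() is a hypothetical helper, not an officer function.
detect_office_container <- function(path) {
  header <- readBin(path, what = "raw", n = 8L)
  zip_magic <- as.raw(c(0x50, 0x4B, 0x03, 0x04))
  ole_magic <- as.raw(c(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1))
  if (length(header) >= 4L && identical(header[1:4], zip_magic)) {
    "zip"      # Office Open XML (.pptx, .docx, ...): officer can try to read it
  } else if (length(header) >= 8L && identical(header[1:8], ole_magic)) {
    "ole2"     # legacy binary Office file (.ppt, .doc, .xls): convert first
  } else {
    "unknown"
  }
}

# Demo with a temporary file that carries only the OLE2 signature
demo_file <- tempfile(fileext = ".ppt")
writeBin(as.raw(c(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1)), demo_file)
detect_office_container(demo_file)
```

Running a check like this before read_pptx() lets a script report "legacy file, needs conversion" instead of surfacing the raw zip error.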

Solutions to Read Legacy .ppt Files in R

If you need to work with these older PowerPoint files, here are two reliable approaches:

1. Convert .ppt to .pptx First

The simplest fix is to convert your legacy .ppt files to the modern .pptx format first. You can do this:

  • Manually: Open the file in Microsoft PowerPoint or LibreOffice Impress, then save it as .pptx.
  • Automatically (for bulk processing): Use a command-line tool like LibreOffice’s headless mode. For example:
    soffice --headless --convert-to pptx your_file.ppt
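For bulk processing from within R, the same command can be driven with system2(). This is a minimal sketch that assumes the soffice executable is on the PATH (otherwise pass its full path) and uses LibreOffice's --outdir option to choose the output folder. The helper names soffice_args() and convert_ppt_dir() are hypothetical.

```r
# Build the soffice arguments for converting one file to .pptx.
# soffice_args() and convert_ppt_dir() are hypothetical helper names.
soffice_args <- function(input, outdir) {
  c("--headless", "--convert-to", "pptx",
    "--outdir", shQuote(outdir), shQuote(input))
}

# Convert every .ppt file in `dir`; returns the expected .pptx paths.
convert_ppt_dir <- function(dir, outdir = dir, soffice = "soffice") {
  inputs <- list.files(dir, pattern = "\\.ppt$", full.names = TRUE,
                       ignore.case = TRUE)
  for (input in inputs) {
    status <- system2(soffice, soffice_args(input, outdir))
    if (status != 0) {
      warning("Conversion failed for ", input)
    }
  }
  file.path(outdir, sub("\\.ppt$", ".pptx", basename(inputs),
                        ignore.case = TRUE))
}
```

The returned paths are only the names LibreOffice is expected to write, so checking file.exists() on them before calling read_pptx() is a sensible precaution.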
    

Once converted, your original officer code will work as expected:

library(officer)

content <- read_pptx(fileName) # fileName must now point to the converted .pptx file
data <- pptx_summary(content)  # data frame describing the content of each slide

2. Use the RDCOMClient Package (Windows Only)

If you need to read .ppt files directly without conversion, you can use RDCOMClient, which interacts with Windows’ COM interface to control Microsoft PowerPoint programmatically. Here’s a quick example:

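library(RDCOMClient)

# Initialize PowerPoint COM object
ppt_app <- COMCreate("PowerPoint.Application")
# Open the legacy .ppt file. PowerPoint resolves relative paths against its
# own working directory, so pass an absolute path.
ppt_path <- normalizePath("path/to/your/file.ppt", winslash = "\\", mustWork = TRUE)
presentation <- ppt_app$Presentations()$Open(ppt_path)

# Example: Extract text from each slide
slide_count <- presentation$Slides()$Count()
for (i in seq_len(slide_count)) { # seq_len() is safe when there are no slides
  slide <- presentation$Slides()$Item(i)
  # Get text from the first shape (adjust as needed for your slides)
  if (slide$Shapes()$Count() > 0) {
    shape <- slide$Shapes()$Item(1)
    # Skip shapes without a text frame, such as pictures (msoFalse is 0)
    if (shape$HasTextFrame() != 0) {
      text_content <- shape$TextFrame()$TextRange()$Text()
      cat(paste("Slide", i, ":\n", text_content, "\n\n"))
    }
  }
}

# Clean up: Close presentation and quit PowerPoint
presentation$Close()
ppt_app$Quit()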

Note: This requires Microsoft PowerPoint to be installed on your Windows machine, and RDCOMClient only works on Windows.


The question in this post comes from Stack Exchange; it was asked by aschinch.
