Can Google Apps Script convert non-searchable PDFs in Google Drive into searchable PDFs with a text layer?
Let's break this down into two parts:
1. Can GAS replicate Adobe Acrobat's OCR text overlay effect?
Unfortunately, Google Apps Script doesn’t have native support for directly adding an OCR text layer to an existing PDF like Acrobat does. The built-in services (such as DriveApp or the Advanced Drive Service) don’t expose low-level PDF editing tools to insert a hidden text layer over scanned images. Most GAS workflows for PDFs focus on converting them to text, Google Docs, or other formats rather than modifying the original PDF’s structure directly.
2. Can GAS convert non-searchable PDFs to searchable PDFs?
Yes! While you can’t overlay text directly onto the original PDF, you can use Google’s built-in OCR via Google Docs to generate a new searchable PDF. Here’s the workflow:
- Upload the non-searchable PDF to Google Drive.
- Use GAS to convert the PDF to a Google Doc (Google automatically runs OCR on scanned PDFs during this conversion).
- Export the Google Doc back to a PDF. The new file contains the OCR-extracted text as real, searchable text rather than as a hidden layer over the original scan.
Example GAS Code
Here’s a script to automate this process:
    function convertNonSearchablePDFToSearchable() {
      // Replace with your target folder ID
      const folderId = "YOUR_FOLDER_ID";
      const folder = DriveApp.getFolderById(folderId);
      const pdfFiles = folder.getFilesByType(MimeType.PDF);

      while (pdfFiles.hasNext()) {
        const file = pdfFiles.next();
        const originalName = file.getName();

        // Skip already processed files (optional check)
        if (file.getDescription() === "Searchable PDF") continue;

        try {
          // Convert PDF to Google Doc with OCR (specify your language code).
          // Drive.Files.insert is the Drive API v2 method of the advanced Drive service.
          const docFile = Drive.Files.insert(
            { title: `${originalName} (OCR Temp)`, mimeType: MimeType.GOOGLE_DOCS },
            file.getBlob(),
            {
              ocr: true,
              ocrLanguage: "en" // Use "zh-CN" for Chinese, "es" for Spanish, etc.
            }
          );

          // Wait for OCR processing (helpful for large files)
          Utilities.sleep(6000);

          // Export Doc back to PDF
          const tempDoc = DriveApp.getFileById(docFile.id);
          const searchablePdfBlob = tempDoc.getAs(MimeType.PDF);
          // Strip the ".pdf" extension before adding the suffix so the new name ends in ".pdf"
          const baseName = originalName.replace(/\.pdf$/i, "");
          const searchablePdf = folder.createFile(searchablePdfBlob)
            .setName(`${baseName} (Searchable).pdf`);
          searchablePdf.setDescription("Searchable PDF");

          // Clean up temporary Google Doc
          tempDoc.setTrashed(true);
          console.log(`Successfully converted: ${originalName}`);
        } catch (error) {
          console.log(`Failed to convert ${originalName}: ${error.message}`);
        }
      }
    }
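The script above calls Drive.Files.insert, which belongs to Drive API v2. If the advanced Drive service in your project is set to v3 instead, the conversion step would look roughly like the sketch below. The helper name ocrPdfToDocV3 is made up for illustration, and the sketch assumes that importing a PDF with a Google Docs target type triggers OCR in v3 the same way it does in v2; treat it as a starting point rather than a verified drop-in replacement.

```javascript
/**
 * Sketch: OCR a PDF into a temporary Google Doc using Drive API v3.
 * Assumes the advanced Drive service is enabled as v3 (global `Drive`).
 * `ocrPdfToDocV3` is a hypothetical helper name, not part of any Google API.
 *
 * @param {Object} file     A DriveApp File (needs getName() and getBlob()).
 * @param {string} language OCR language hint, e.g. "en".
 * @return {string} ID of the temporary Google Doc.
 */
function ocrPdfToDocV3(file, language) {
  const metadata = {
    // v3 uses `name` where v2 used `title`
    name: file.getName() + " (OCR Temp)",
    // Requesting a Google Docs target type asks Drive to convert the upload
    mimeType: "application/vnd.google-apps.document"
  };
  // v3 has no `ocr` flag; `ocrLanguage` is only a language hint
  const doc = Drive.Files.create(metadata, file.getBlob(), { ocrLanguage: language });
  return doc.id;
}
```

The returned ID can then be passed to DriveApp.getFileById(...) and exported with getAs(MimeType.PDF), exactly as in the v2 script.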
Key Notes
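- Enable Drive API: Before running the script, open the script editor (Extensions > Apps Script) and enable the Drive API advanced service. In the legacy editor this is under Resources > Advanced Google Services; in the current editor, click the + next to Services in the left sidebar. The script calls Drive.Files.insert, which is a Drive API v2 method, so select v2 when adding the service.
- Language Accuracy: Adjust the ocrLanguage parameter to match your PDF's language for better OCR results.
- Formatting Caveats: The converted PDF may not perfectly match the original's layout (e.g., image positions, fonts, or complex formatting could shift). This is due to how Google Docs renders content compared to the original PDF.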
- File Restrictions: Password-protected PDFs or files larger than ~10MB may fail to convert. Ensure you have edit access to the target files.
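The file restrictions above can be partly checked before attempting a conversion. The following is a minimal sketch: getSkipReason and MAX_OCR_BYTES are hypothetical names, and the 10 MB threshold is only the approximate figure mentioned above, not a documented limit. Password protection is not detected here; such files would still surface through the script's try/catch.

```javascript
// Assumption: ~10 MB is the approximate limit from the notes, not an official constant.
const MAX_OCR_BYTES = 10 * 1024 * 1024;

/**
 * Returns a short reason string if the file should be skipped,
 * or null if it looks safe to send through OCR.
 * `file` is expected to behave like a DriveApp File
 * (getDescription() and getSize() are used).
 */
function getSkipReason(file) {
  if (file.getDescription() === "Searchable PDF") {
    return "already processed";
  }
  if (file.getSize() > MAX_OCR_BYTES) {
    return "larger than ~10 MB";
  }
  return null;
}
```

Inside the while loop, the existing description check could be replaced by a call to getSkipReason(file), logging the reason and continuing to the next file when it returns a non-null value.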
This question originates from Stack Exchange; asked by gg-edu.




