Extracting attachments from a PDF with Python


Download PDFtk Server: https://www.pdflabs.com/tools/pdftk-server/

If the PDF is password-protected, first convert it to a copy without a password:

pdftk protected.pdf input_pw PASSWORD output unprotected.pdf

If the PDF has no password, skip the decryption step.
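The decryption step can also be driven from Python. Below is a minimal sketch, assuming pdftk is already on the PATH. The helper names build_decrypt_command and decrypt_pdf, and the file names, are hypothetical. Passing the arguments as a list avoids quoting problems when a path or password contains spaces.

```python
import subprocess


def build_decrypt_command(src_pdf, password, dst_pdf, pdftk="pdftk"):
    """Return the argument list for: pdftk SRC input_pw PASSWORD output DST."""
    # Hypothetical helper: it only assembles the command, it does not run it.
    return [pdftk, src_pdf, "input_pw", password, "output", dst_pdf]


def decrypt_pdf(src_pdf, password, dst_pdf, pdftk="pdftk"):
    """Run pdftk; raises CalledProcessError if pdftk exits with a non-zero status."""
    subprocess.run(build_decrypt_command(src_pdf, password, dst_pdf, pdftk), check=True)


if __name__ == "__main__":
    # Placeholder file names and password, shown only to illustrate the command shape.
    print(build_decrypt_command("protected.pdf", "secret", "unprotected.pdf"))
```

Calling decrypt_pdf("protected.pdf", "secret", "unprotected.pdf") would then write the password-free copy, provided the password is correct.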

Extract the attachments (the input PDF must not be password-protected):

pdftk unprotected.pdf unpack_files output output_dir
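The unpack step can be wrapped in the same way. This sketch assumes pdftk is on the PATH and uses hypothetical helper names. It creates the output directory first, then reports which file names are new in that directory after pdftk has run. An attachment that overwrites an existing file will not show up in the result.

```python
import os
import subprocess


def build_unpack_command(pdf, out_dir, pdftk="pdftk"):
    """Return the argument list for: pdftk PDF unpack_files output OUT_DIR."""
    return [pdftk, pdf, "unpack_files", "output", out_dir]


def unpack_attachments(pdf, out_dir, pdftk="pdftk"):
    """Unpack attachments into out_dir and return the newly created file names."""
    os.makedirs(out_dir, exist_ok=True)
    before = set(os.listdir(out_dir))
    # check=True makes a failing pdftk call raise CalledProcessError.
    subprocess.run(build_unpack_command(pdf, out_dir, pdftk), check=True)
    return sorted(set(os.listdir(out_dir)) - before)
```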

If Python reports that the command does not exist when it runs pdftk, add os.chdir(<path to PDFtk's bin directory>) before running the command.
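As an alternative to changing the working directory, the executable can be looked up explicitly before anything is run. The sketch below is only one possible approach. The helper name find_pdftk is hypothetical, and it relies on the standard-library shutil.which to search an optional bin folder first and then the regular PATH.

```python
import os
import shutil


def find_pdftk(bin_folder=None):
    """Return the full path to the pdftk executable, or None if it is not found."""
    if bin_folder:
        # Search the given folder first, then fall back to the normal PATH.
        search_path = bin_folder + os.pathsep + os.environ.get("PATH", "")
    else:
        search_path = None  # shutil.which then uses the PATH environment variable
    return shutil.which("pdftk", path=search_path)
```

If find_pdftk returns None, the script can stop with a clear message instead of failing later with a "command not found" error.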

Complete code:

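import os
import subprocess

def get_attachment(pdf_path, psd, pdftk_bin_folder):
    # Resolve the paths before changing directory so they stay valid afterwards
    pdf_path = os.path.abspath(pdf_path)
    pdf_folder_path = os.path.dirname(pdf_path)
    source_path = pdf_path
    # Switch to the PDFtk bin folder so the pdftk command can be found
    os.chdir(pdftk_bin_folder)
    if psd:
        # Password given: write a password-free copy next to the original first
        tem_pdf_path = os.path.join(pdf_folder_path, "temp.pdf")
        subprocess.run(["pdftk", pdf_path, "input_pw", psd, "output", tem_pdf_path], check=True)
        source_path = tem_pdf_path
    # Unpack the attachments into the folder that contains the PDF
    subprocess.run(["pdftk", source_path, "unpack_files", "output", pdf_folder_path], check=True)

if __name__ == "__main__":
    # pdf_path = r"C:\Users\86173\Desktop\test\word2-protected.pdf"
    # psd = "dfcver"
    pdf_path = r"C:\Users\86173\Desktop\test\word无密码1.pdf"
    psd = ""  # empty string: the PDF has no password
    pdftk_bin_folder = r"C:\Program Files (x86)\PDFtk Server\bin"
    try:
        get_attachment(pdf_path, psd, pdftk_bin_folder)
        print("Extraction succeeded")
    except Exception as e:
        print("Extraction failed")
        print(e)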
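The script above leaves temp.pdf next to the source file. One possible variation, sketched here under the assumption that pdftk is on the PATH, writes the decrypted copy into a temporary directory that is removed automatically. The function name extract_attachments and its run parameter are hypothetical. The run parameter defaults to subprocess.run and exists only so the command sequence can be checked without PDFtk installed.

```python
import os
import subprocess
import tempfile


def extract_attachments(pdf_path, out_dir, password="", run=subprocess.run):
    """Unpack attachments from pdf_path into out_dir; return the commands that were run."""
    os.makedirs(out_dir, exist_ok=True)
    commands = []
    # The temporary directory (and the decrypted copy in it) is deleted on exit.
    with tempfile.TemporaryDirectory() as tmp:
        source = pdf_path
        if password:
            source = os.path.join(tmp, "decrypted.pdf")
            commands.append(["pdftk", pdf_path, "input_pw", password, "output", source])
        commands.append(["pdftk", source, "unpack_files", "output", out_dir])
        for command in commands:
            # check=True turns a failing pdftk call into an exception.
            run(command, check=True)
    return commands
```

With no password, only the unpack command is issued. With a password, the decrypt command runs first and its output file becomes the input of the unpack command.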

 
