How to extract the color of a rectangle in a PDF using iText

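I am trying to extract the color of a rectangle in a PDF using iText. The following is everything the PDF page contains:

This is the page content extracted using iText: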

    q
    BT
    36 806 Td
    0 -18 Td
    /F1 12 Tf
    (Option 1:)Tj
    0 0 Td
    0 -94.31 Td
    ET
    Q
    q
    Q
    q
    2 J
    0 G
    0.5 w
    88.3 693.69 139.47 94.31 re
    S
    0.5 w
    227.77 693.69 139.47 94.31 re
    S
    0.5 w
    367.23 693.69 139.47 94.31 re
    S
    Q
    BT
    1 0 0 1 90.3 774 Tm
    /F1 12 Tf
    (A rectangle:)Tj
    ET
    q 1.13 0 0 1.13 229.77 695.69 cm /Xf1 Do Q
    BT
    1 0 0 1 369.23 774 Tm
    /F1 12 Tf
    (The rectangle is scaled)Tj
    1 0 0 1 369.23 762 Tm
    (to fit inside the cell, you)Tj
    1 0 0 1 369.23 750 Tm
    (see a padding.)Tj
    ET
    228 810 m
    338 810 l
    S

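However, there is something I cannot extract from that code, namely the color red: if I generate the same PDF with another color instead of red, the page content (shown above) does not change at all.

So my question is: how can I extract that color using some method or property of the iText Java library?

I am using iText 5.5.9.

Thanks for any help you can provide!

This is the code I use to generate the PDF: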

    String dest = "C:\\TestCreation.pdf";
    Document document = new Document();
    PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(dest));
    document.open();
    document.add(new Paragraph("Option 1:"));
    PdfPTable table = new PdfPTable(3);
    table.addCell("A rectangle:");
    PdfTemplate template = writer.getDirectContent().createTemplate(120, 80);
    template.setColorFill(BaseColor.RED);
    template.rectangle(0, 0, 120, 80);
    template.fill();
    writer.releaseTemplate(template);
    table.addCell(Image.getInstance(template));
    table.addCell("The rectangle is scaled to fit inside the cell, you see a padding.");
    document.add(table);
    PdfContentByte cb = writer.getDirectContent();
    cb.moveTo(228, 810);
    cb.lineTo(338, 810);
    cb.stroke();
    document.close();

You can see the PDF file here: PDF example

This is the code I use to get the page content:

    String pageContent = new String(reader.getPageContent(1));

Answer:

Your code shows it. This is how you create the rectangle and add it:

    PdfTemplate template = writer.getDirectContent().createTemplate(120, 80);
    template.setColorFill(BaseColor.RED);
    template.rectangle(0, 0, 120, 80);
    template.fill();
    writer.releaseTemplate(template);
    table.addCell(Image.getInstance(template));

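An iText PdfTemplate generates a PDF form XObject. A form XObject in turn is a PDF content stream that is a self-contained description of any sequence of graphics objects (including path objects, text objects, and sampled images) (section 8.10.1 of ISO 32000-1), i.e. a separate stream of drawing instructions whose content can be referenced from any other content stream.

In your page content stream, this is the line that references the form XObject:

    q 1.13 0 0 1.13 229.77 695.69 cm /Xf1 Do Q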

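(Save the graphics state, scale the transformation matrix by 1.13 and move it a bit, then draw the XObject Xf1, then restore the previous transformation matrix.)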
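To see what that matrix does to the 120x80 template rectangle, the arithmetic can be done by hand. The following is only a sketch in plain Java, without iText; the class name CtmSketch and the method apply are made up for illustration. It maps each corner through the six operands of the cm operator using the affine mapping defined by the PDF specification.

```java
import java.util.Locale;

/** Sketch: apply a PDF "a b c d e f cm" matrix to points by hand (hypothetical helper, not iText). */
final class CtmSketch {

    /** Maps (x, y) through the matrix operands {a, b, c, d, e, f}. */
    static double[] apply(double[] m, double x, double y) {
        return new double[] {
            m[0] * x + m[2] * y + m[4],   // x' = a*x + c*y + e
            m[1] * x + m[3] * y + m[5]    // y' = b*x + d*y + f
        };
    }

    public static void main(String[] args) {
        // Operands of "cm" taken from the page content line above
        double[] ctm = {1.13, 0, 0, 1.13, 229.77, 695.69};
        // Corners of the 120x80 rectangle drawn inside the template
        double[][] corners = {{0, 0}, {120, 0}, {120, 80}, {0, 80}};
        for (double[] c : corners) {
            double[] p = apply(ctm, c[0], c[1]);
            System.out.printf(Locale.ROOT, "(%.2f, %.2f)%n", p[0], p[1]);
        }
        // prints (229.77, 695.69), (365.37, 695.69), (365.37, 786.09), (229.77, 786.09)
    }
}
```

These are the page coordinates at which the red rectangle ends up, which is why a parser has to track the current transformation matrix and not just the numbers inside the XObject.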

The content stream of that XObject Xf1 looks like this:

    1 0 0 rg
    0 0 120 80 re
    f

I.e. it sets the non-stroking color to RGB red, defines a 120x80 rectangle at the origin, and fills it.
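To make these three operators concrete, here is a toy reader in plain Java. It is not how iText parses content streams; the class FillColorSketch and its method are invented for this illustration, and it understands only rg, re and f, with operands written as plain numbers. It shows the idea that the fill color is state set by an earlier operator and picked up when the path is painted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: track the fill color across a tiny subset of PDF operators (rg, re, f only). */
final class FillColorSketch {

    /** Returns one description per filled rectangle found in the given content. */
    static List<String> filledRectangles(String content) {
        List<String> result = new ArrayList<>();
        List<Double> operands = new ArrayList<>();
        double[] fill = {0, 0, 0};   // the initial non-stroking color in PDF is black
        double[] rect = null;        // rectangle waiting to be painted, if any
        for (String token : content.trim().split("\\s+")) {
            switch (token) {
                case "rg":           // set the non-stroking RGB color
                    fill = new double[] {operands.get(0), operands.get(1), operands.get(2)};
                    operands.clear();
                    break;
                case "re":           // append a rectangle: x y width height
                    rect = new double[] {operands.get(0), operands.get(1), operands.get(2), operands.get(3)};
                    operands.clear();
                    break;
                case "f":            // fill the current path with the current fill color
                    if (rect != null) {
                        result.add("fill " + Arrays.toString(fill) + " rect " + Arrays.toString(rect));
                    }
                    rect = null;
                    operands.clear();
                    break;
                default:             // anything else is assumed to be a numeric operand
                    operands.add(Double.parseDouble(token));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(filledRectangles("1 0 0 rg\n0 0 120 80 re\nf"));
        // prints [fill [1.0, 0.0, 0.0] rect [0.0, 0.0, 120.0, 80.0]]
    }
}
```

A real content stream also contains names, strings, nested graphics states and many more operators, so this is no substitute for a proper parser.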


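"This is the line of code I use to get the page content:"

    String pageContent = new String(reader.getPageContent(1));

That line is not enough to get all the content details:

It only returns the immediate page content, not the detailed instructions of the form XObjects and patterns used in that immediate content. One often finds PDFs whose immediate page content merely references one or more form XObjects.

In spite of appearances, page content is binary in nature, not textual. As soon as fonts with a non-standard encoding are used, the contents of PDF strings are meaningless in a Java string or (depending on your default encoding) even broken.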
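The second point can be demonstrated without any PDF library. The sketch below uses a made-up byte sequence that stands in for a PDF string operand with glyph codes outside ASCII; the class BinaryContentSketch and the sample bytes are assumptions for illustration only. It checks whether the bytes survive a trip through a Java String for two charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Sketch: content bytes do not necessarily survive conversion to a Java String. */
final class BinaryContentSketch {

    /** True if decoding with the charset and encoding again yields the original bytes. */
    static boolean roundTrips(byte[] data, Charset charset) {
        return Arrays.equals(data, new String(data, charset).getBytes(charset));
    }

    public static void main(String[] args) {
        // Hypothetical "(...)Tj" operand containing byte values above 0x7F
        byte[] content = {'(', (byte) 0x80, (byte) 0xE9, (byte) 0xFF, ')', 'T', 'j'};
        System.out.println("ISO-8859-1 round trip: " + roundTrips(content, StandardCharsets.ISO_8859_1)); // true
        System.out.println("UTF-8 round trip: " + roundTrips(content, StandardCharsets.UTF_8));           // false
    }
}
```

With new String(bytes) and no explicit charset, which of the two outcomes you get depends on the platform default, so the same code can behave differently on different machines.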

Instead, you should use the iText parser framework, e.g.:

    // Imports needed by this snippet (iText 5.5.x)
    import java.io.InputStream;
    import java.lang.reflect.Field;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;
    import com.itextpdf.text.BaseColor;
    import com.itextpdf.text.pdf.PdfReader;
    import com.itextpdf.text.pdf.parser.*;

    ExtRenderListener extRenderListener = new ExtRenderListener()
    {
        @Override
        public void beginTextBlock() { }

        @Override
        public void renderText(TextRenderInfo renderInfo) { }

        @Override
        public void endTextBlock() { }

        @Override
        public void renderImage(ImageRenderInfo renderInfo) { }

        // Collect path construction operations until the path is painted
        @Override
        public void modifyPath(PathConstructionRenderInfo renderInfo)
        {
            pathInfos.add(renderInfo);
        }

        // Called when the collected path is filled and/or stroked
        @Override
        public Path renderPath(PathPaintingRenderInfo renderInfo)
        {
            GraphicsState graphicsState;
            try
            {
                graphicsState = getGraphicsState(renderInfo);
            }
            catch (NoSuchFieldException | SecurityException | IllegalArgumentException | IllegalAccessException e)
            {
                e.printStackTrace();
                return null;
            }

            Matrix ctm = graphicsState.getCtm();

            if ((renderInfo.getOperation() & PathPaintingRenderInfo.FILL) != 0)
            {
                System.out.printf("FILL (%s) ", toString(graphicsState.getFillColor()));
                if ((renderInfo.getOperation() & PathPaintingRenderInfo.STROKE) != 0)
                    System.out.print("and ");
            }
            if ((renderInfo.getOperation() & PathPaintingRenderInfo.STROKE) != 0)
            {
                System.out.printf("STROKE (%s) ", toString(graphicsState.getStrokeColor()));
            }

            System.out.print("the path ");

            for (PathConstructionRenderInfo pathConstructionRenderInfo : pathInfos)
            {
                switch (pathConstructionRenderInfo.getOperation())
                {
                case PathConstructionRenderInfo.MOVETO:
                    System.out.printf("move to %s ", transform(ctm, pathConstructionRenderInfo.getSegmentData()));
                    break;
                case PathConstructionRenderInfo.CLOSE:
                    System.out.printf("close %s ", transform(ctm, pathConstructionRenderInfo.getSegmentData()));
                    break;
                case PathConstructionRenderInfo.CURVE_123:
                    System.out.printf("curve123 %s ", transform(ctm, pathConstructionRenderInfo.getSegmentData()));
                    break;
                case PathConstructionRenderInfo.CURVE_13:
                    System.out.printf("curve13 %s ", transform(ctm, pathConstructionRenderInfo.getSegmentData()));
                    break;
                case PathConstructionRenderInfo.CURVE_23:
                    System.out.printf("curve23 %s ", transform(ctm, pathConstructionRenderInfo.getSegmentData()));
                    break;
                case PathConstructionRenderInfo.LINETO:
                    System.out.printf("line to %s ", transform(ctm, pathConstructionRenderInfo.getSegmentData()));
                    break;
                case PathConstructionRenderInfo.RECT:
                    System.out.printf("rectangle %s ", transform(ctm, expandRectangleCoordinates(pathConstructionRenderInfo.getSegmentData())));
                    break;
                }
            }
            System.out.println();

            pathInfos.clear();
            return null;
        }

        @Override
        public void clipPath(int rule)
        {
        }

        // Applies the current transformation matrix to a flat list of x, y pairs
        List<Float> transform(Matrix ctm, List<Float> coordinates)
        {
            List<Float> result = new ArrayList<>();
            if (coordinates == null)   // segment data may be absent, e.g. for CLOSE
                return result;
            for (int i = 0; i + 1 < coordinates.size(); i += 2)
            {
                Vector vector = new Vector(coordinates.get(i), coordinates.get(i + 1), 1);
                vector = vector.cross(ctm);
                result.add(vector.get(Vector.I1));
                result.add(vector.get(Vector.I2));
            }
            return result;
        }

        // Turns x, y, width, height into the four corner points
        List<Float> expandRectangleCoordinates(List<Float> rectangle)
        {
            if (rectangle.size() < 4)
                return Collections.emptyList();
            return Arrays.asList(
                    rectangle.get(0), rectangle.get(1),
                    rectangle.get(0) + rectangle.get(2), rectangle.get(1),
                    rectangle.get(0) + rectangle.get(2), rectangle.get(1) + rectangle.get(3),
                    rectangle.get(0), rectangle.get(1) + rectangle.get(3)
                    );
        }

        String toString(BaseColor baseColor)
        {
            if (baseColor == null)
                return "DEFAULT";
            return String.format("%s,%s,%s", baseColor.getRed(), baseColor.getGreen(), baseColor.getBlue());
        }

        // Reads the graphics state from the non-public field "gs" via reflection
        GraphicsState getGraphicsState(PathPaintingRenderInfo renderInfo) throws NoSuchFieldException, SecurityException, IllegalArgumentException, IllegalAccessException
        {
            Field gsField = PathPaintingRenderInfo.class.getDeclaredField("gs");
            gsField.setAccessible(true);
            return (GraphicsState) gsField.get(renderInfo);
        }

        final List<PathConstructionRenderInfo> pathInfos = new ArrayList<>();
    };

    try ( InputStream resource = [RETRIEVE FILE TO PARSE AS INPUT STREAM])
    {
        PdfReader pdfReader = new PdfReader(resource);

        for (int page = 1; page <= pdfReader.getNumberOfPages(); page++)
        {
            System.out.printf("\nPage %s\n====\n", page);

            PdfReaderContentParser parser = new PdfReaderContentParser(pdfReader);
            parser.processContent(page, extRenderListener);
        }
    }

(ExtractPaths test method testExtractFromTestCreation)

For your sample file this results in the output:

    Page 1
    ====
    STROKE (0,0,0) the path rectangle [88.3, 693.69, 227.77, 693.69, 227.77, 788.0, 88.3, 788.0]
    STROKE (0,0,0) the path rectangle [227.77, 693.69, 367.24, 693.69, 367.24, 788.0, 227.77, 788.0]
    STROKE (0,0,0) the path rectangle [367.23, 693.69, 506.7, 693.69, 506.7, 788.0, 367.23, 788.0]
    FILL (255,0,0) the path rectangle [229.77, 695.69, 365.37, 695.69, 365.37, 786.09, 229.77, 786.09]
    STROKE (DEFAULT) the path move to [228.0, 810.0] line to [338.0, 810.0]

iText represents color values as bytes (0-255) rather than in the unit range (0.0-1.0) that PDF uses. That is why you see "(255,0,0)" where the PDF says "1 0 0 rg".
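The relation between the two ranges is a simple scaling. As a minimal sketch in plain Java, assuming rounding to the nearest integer (the exact rounding iText applies internally is not shown in this answer and may differ), with the class ColorScaleSketch and its methods invented for illustration:

```java
/** Sketch: convert PDF color components (0.0-1.0) to byte-style values (0-255). */
final class ColorScaleSketch {

    /** Scales one component, clamping to the valid range and rounding to nearest (an assumption). */
    static int toByte(double component) {
        double clamped = Math.max(0.0, Math.min(1.0, component));
        return (int) Math.round(clamped * 255);
    }

    /** Formats an RGB triple the same way as the output above, e.g. "255,0,0". */
    static String describe(double r, double g, double b) {
        return toByte(r) + "," + toByte(g) + "," + toByte(b);
    }

    public static void main(String[] args) {
        System.out.println(describe(1, 0, 0)); // operands of "1 0 0 rg"; prints 255,0,0
    }
}
```

Going the other way, dividing each reported value by 255 gives back the operands you would expect to find in front of rg in the content stream, up to rounding.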
