-
I am developing a tool to edit PDF bookmarks. The first thing I want to do is load the existing bookmarks from the PDF file:

    from PyPDF2 import PdfReader

    reader = PdfReader("test.pdf")
    # Print only the first bookmark, then stop.
    for idx, outline in enumerate(reader.outline):
        print(outline)
        if idx == 0:
            break

And then the output is:
Now I want to know which page this bookmark refers to. Searching for information about IndirectObject, I found some answers mentioning a method called
So ... has this method been removed? From reading the source code of IndirectObject, I know that if I can get the PageObject this IndirectObject refers to, I can get the page number using the
It contains the following:
I didn't see anything like page numbers. Is this DictionaryObject what the IndirectObject directly refers to? So is there no way to find the PageObject I am looking for? ( ... whenever you feel like it should be easy ... )
-
The Page Object within the structure does not contain a page index.
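Since the page dictionary carries no index, the page number has to be derived from the page's position in the document's page list. The sketch below illustrates that idea only; `page_index_for_ref` and the `(id, generation)` tuples are hypothetical names for illustration, and it assumes you have already collected one such pair per page, in document order.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical alias: an indirect reference reduced to (object id, generation).
Ref = Tuple[int, int]


def page_index_for_ref(page_refs: Iterable[Ref], target: Ref) -> Optional[int]:
    """Return the 0-based position of `target` among the page references.

    Returns None when no page in the list has that reference.
    """
    for index, ref in enumerate(page_refs):
        if ref == target:
            return index
    return None


# Three pages stored as objects 3, 7 and 12; the bookmark points at object 7.
pages_in_order = [(3, 0), (7, 0), (12, 0)]
print(page_index_for_ref(pages_in_order, (7, 0)))
```

As far as I know, recent PyPDF2 releases do this lookup for you through `reader.get_destination_page_number(outline)` and `reader.get_page_number(page)`; check the documentation of your installed version before relying on either name.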
-
(Sorry for the previous short answer; I only had my phone with me.) To complete: get_object() is implemented on all objects, and objects that are not an IndirectObject simply return themselves, so you don't need to check whether something is an IndirectObject before calling get_object().
Objects (page properties, contents, ...) are stored within the PDF as objects identified by an id and a generation number. Most of the time, the xref table provides the lookup from (id, gen) to the object's position within the PDF. Within the PDF, each object starts with the header "(id) (gen) obj" and ends with the trailer "endobj".
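To make the "(id) (gen) obj ... endobj" layout concrete, here is a toy sketch that scans a hand-written byte string for object headers and records where each one starts. It is a simplification: a real reader takes these offsets from the xref table rather than scanning, and `build_offset_table` and `SAMPLE` are made-up names for illustration.

```python
import re
from typing import Dict, Tuple

# Matches an object header such as b"12 0 obj".
OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")

# Hand-written fragment with two objects; not a complete, valid PDF.
SAMPLE = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Count 0 >>\nendobj\n"
)


def build_offset_table(data: bytes) -> Dict[Tuple[int, int], int]:
    """Map (id, generation) to the byte offset of the object's header."""
    return {
        (int(m.group(1)), int(m.group(2))): m.start()
        for m in OBJ_HEADER.finditer(data)
    }


offsets = build_offset_table(SAMPLE)
for (obj_id, gen), position in sorted(offsets.items()):
    print(obj_id, gen, position)
```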
An IndirectObject is a pointer, encoded as "(id) (gen) R", that provides a way to point to objects. get_object() returns the data it points to.
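The pointer encoding can be sketched in a few lines: parse the "(id) (gen) R" token into a pair, then use that pair as a lookup key. This is a toy model, not PyPDF2's implementation; `parse_ref`, `resolve` and the `objects` dictionary are hypothetical names for illustration.

```python
import re
from typing import Any, Dict, Tuple

# Matches a reference token such as "12 0 R".
REF_TOKEN = re.compile(r"^(\d+)\s+(\d+)\s+R$")


def parse_ref(token: str) -> Tuple[int, int]:
    """Turn an "(id) (gen) R" token into an (id, generation) pair."""
    match = REF_TOKEN.match(token.strip())
    if match is None:
        raise ValueError(f"not an indirect reference: {token!r}")
    return int(match.group(1)), int(match.group(2))


def resolve(token: str, objects: Dict[Tuple[int, int], Any]) -> Any:
    """Return the stored object that the reference token points to."""
    return objects[parse_ref(token)]


# Toy object store keyed by (id, generation).
objects = {(12, 0): {"/Type": "/Page"}}
print(resolve("12 0 R", objects))
```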
As a simplification, if you use the standard key access (such as ["/Page"]), get_object() is called implicitly.
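The implicit call can be pictured with a small stand-in model. The classes below are hypothetical toys, not PyPDF2's own classes; they only show why key access hands back the resolved object, and why calling get_object() on something that is already direct is harmless.

```python
class ToyObject:
    """Stand-in for a direct object: resolving it yields the object itself."""

    def __init__(self, value):
        self.value = value

    def get_object(self):
        return self


class ToyIndirect:
    """Stand-in for an indirect reference into a shared object store."""

    def __init__(self, key, store):
        self.key = key
        self.store = store

    def get_object(self):
        # Follow the pointer, then resolve whatever it points to.
        return self.store[self.key].get_object()


class ToyDictionary(dict):
    """Dictionary whose key access resolves references implicitly."""

    def __getitem__(self, key):
        return dict.__getitem__(self, key).get_object()


store = {(5, 0): ToyObject("page tree")}
catalog = ToyDictionary(
    {"/Pages": ToyIndirect((5, 0), store), "/Version": ToyObject("1.7")}
)
print(catalog["/Pages"].value)    # resolved through the store
print(catalog["/Version"].value)  # already direct, returned as-is
```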
note th…