-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Invalid Link #2450
base: main
Are you sure you want to change the base?
BUG: Invalid Link #2450
Conversation
passes an IndirectObject for the target page instead of an integer. passing an integer creates an invalid link. resolves py-pdf#2443 Issue
Are you able to add a corresponding test case as well which shows the previous issue and demonstrates that your fix does indeed solve this? |
testpages.csv link test.txt is the code. I could not upload a *.py file. The code needs to be changed to find the supporting files as the locations are hard coded. An invalid link is created, it works, but if you delete a page the links are broken. Acrobat will re move then as invalid links when optimized. |
Could you add something of this as some automated unit/integration test? |
I don't know what I'm doing. I never read the book on GitHub. I don't known how to automated unit/integration test. |
@rsinger417 The test |
I can't see what the code is, but by the error message it is
using AnnotationBuilder.link which has been deprecated. The class Link in
_markup_annotations.py should be used.
I think AnnotationBuilder is completely gone in version 4.0.0
*Raymond Singer*
*Engineering Technician V*
*T: 262.653.4154*
*625 52nd Street, Room 302*
*Kenosha, WI 53140*
…On Tue, Feb 13, 2024 at 3:57 PM Martin Thoma ***@***.***> wrote:
<https://github.com/rsinger417>
This message originated from outside your organization
------------------------------
<https://github.com/rsinger417>
@rsinger417
<https://github.com/rsinger417>
The test tests/test_generic.py::test_annotation_builder_link fails. Do
you see why? (It could also be a test issue; I haven't looked into it so
far)
—
Reply to this email directly, view it on GitHub
<#2450 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/BF5XNWG6OMGU6BLFDUODEA3YTPOWFAVCNFSM6AAAAABDABLCT2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNBSG4YDIOJTG4>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I do not see the warning in the CI log, but an actual error which maps to a changed code line:
It seems like |
what is the value in "target_page_index"? It should be a page number in the pdf. |
See the CI output which displays this value:
You should be able to locally debug this as well, as this is a permanent error. The background is that Line 911 in cc306ad
target_page_index=1 now points to an invalid page:
With this analysis, your proposed change is a breaking one and thus most likely requires a deprecation process - although it might be debatable whether the current implementation would constitute as a bug or desired behavior. |
The test is wrong 945 # Part 4: Internal Link |
The test has been there before your change and worked, thus I would assume that this is/was some intended functionality. Apparently it was considered a valid use case to generate annotations for invalid pages which might be added later on. Yes, we could argue about how useful this is, but this is just how it has been in the past. The original change where this has been introduced is #1189. Let's wait for the opinion of the other maintainers regarding this. |
Unless you add_page first there is no page and thus no IndirectLink to it. If there is no IndirecLink then the link will be invalid. How can you link to an internal page that does not exist. PR #1189 does not have this test in it. The date of this PR #1189 commit was when the links became invalid PyPDF2 2.9.0 7/31/2022. The links were good in PyPDF2 2.8.1. The invalid links will work but they are not valid and will be broken if pages are removed and if the file is optimized in Acrobat they will be removed. |
@pubpub-zz @MartinThoma Any input on #2450 (comment) and how to continue with the (now failing) test regarding possibly breaking behavior? |
@pubpub-zz It seems like your misread my comment. The remaining issue is that until now, pypdf would allow links to pages not (yet) added to the file; while with the solution proposed in this PR, we would restrict this to pages already present in the file. I consider this a possibly breaking change, but wanted to get a second opinion on this before continuing. |
Having to add the pages within the PdfWriter before referencing in links is mandatory : you can not guess what will be the IndirectObject before adding it into the PdfWriter. |
In the current release, this would work indeed, id est referencing an invalid page. After merging this PR, this would change, thus it might be a breaking one - although I am not sure whether we consider the old behavior (allowing invalid references) a bug or not, as for a bug we would not have to really consider this a breaking change. |
The legacy code was compatible with both arguments |
@rsinger417 Could you please check/verify whether you can keep support for external links which do not point to the same document? |
external link works, the annotation uses "url" instead of "target_page_index" for an internal link such as |
in PDF there is different links: It is there two links where page index are used with. |
I think this PR may also fix issue #2346. Probably it may make sense to check if indeed that issue will also be solved. |
I'd be fine with removing it. The main question is if we need to go through the deprecation process (which would take quite long) or if we can simply say that it was a bug. I don't see the use-case here, hence I'd say it is a bug - especially when there was this change of behavior in pypdf==2.9.0 (written by rsinger417 before; I didn't check). Is there anything stopping people from adding the pages first? |
passes an IndirectObject for the target page instead of an integer. passing an integer creates an invalid link.
Fixes #2443