-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
squatter: do not hold mailbox locks while calling the attachment extract service #4480
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know a ton about the extractor, but most of this appears to be sane.
I still don't understand with the port_waitevent is used for.
@ksmurchison the comment in this commit indicates that the prot_waitevent timeout is used to close network connections to the backend if it hasn't been used recently. See here: https://github.com/cyrusimap/cyrus-imapd/blame/f0f8a130b889381da1beb89962727f855370a301/imap/index.c#L5590 |
No, this is fine. After thinking about it, this is he same thing that we do with the backend IMAP connections in a Murder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bunch of comments and suggestions, some just nits, some more important.
72ae24b
to
f731de2
Compare
@elliefm thanks, I have updated the code accordingly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird stuff still going on with that one chmod test... I've attached a patch with some extra debug logging that identifies exactly when things get weird, but I have no idea why they get weird.
Also got a failure from SearchFuzzy.squatter_attachextract_nolock_slow but I haven't been able to reproduce it, it started passing as soon as I started looking at it.
f731de2
to
a20be29
Compare
Tagging this Do-Not-Merge for now: I first want to get #4495 into the codebase. |
a20be29
to
5dde1ee
Compare
5dde1ee
to
6e84d70
Compare
6e84d70
to
39ce3dc
Compare
39ce3dc
to
8cd385b
Compare
f8d81fb
to
5110853
Compare
121516b
to
db2b787
Compare
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
Signed-off-by: Robert Stepanek <[email protected]>
This fixes a regression where the HTTP 422 response code of the attachment extractor service was inadvertently handled as an error. As a result, the message was tagged as partially indexed. Signed-off-by: Robert Stepanek <[email protected]>
db2b787
to
406fa26
Compare
This updates squatter to not lock a mailbox while extracting text from attachments. To do so, the attachment extraction module now supports an optional file-system based cache. By default, squatter uses a temporary cache. The newly added experimental squatter argument options v
--attachextract-cache-dir
and--attachextract-cache-only
allow to use user-defined cache directory, as well as instruct squatter to only lookup extracted attachment text in the cache.In addition, handling of timeouts when calling the extractor service has changed: rather than attempting to connect up to 4 times, an error now is returned after the first timeout.
This pull request best is read commit by commit, where the first two commits are refactoring only.