Many users of the warc library need parsed HTTP headers, so it would be nice to have at least a convenience function for that. In addition, it might be useful to have a function that streams through the payload and calculates the SHA-1 digest when the WARC-Payload-Digest header is not present.
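To make the request concrete, here is a rough sketch of what such helpers could look like. It uses only the Python standard library and is not the warc library's API: the names `parse_http_headers` and `payload_sha1` are hypothetical, and the `sha1:` plus Base32 output format is an assumption based on how many crawlers write WARC-Payload-Digest. It also assumes the record body is available as a binary file-like object and does not handle folded (multi-line) headers.

```python
import base64
import hashlib
import io


def parse_http_headers(stream):
    """Read the start line and headers of an HTTP message from a binary stream.

    Leaves the stream positioned at the first byte of the payload.
    Hypothetical helper, not part of the warc library.
    """
    start_line = stream.readline().rstrip(b"\r\n").decode("iso-8859-1")
    headers = []
    while True:
        line = stream.readline()
        # A blank line (or EOF) ends the header block.
        if line in (b"\r\n", b"\n", b""):
            break
        name, _, value = line.decode("iso-8859-1").partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers


def payload_sha1(stream, chunk_size=64 * 1024):
    """Hash the rest of the stream in chunks so the payload is never fully in memory.

    Returns the digest as "sha1:<Base32>" (assumed format, see note above).
    """
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return "sha1:" + base64.b32encode(digest.digest()).decode("ascii")


# Example with an in-memory record body standing in for a real WARC record.
body = io.BytesIO(b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello")
start_line, headers = parse_http_headers(body)
print(start_line)
print(headers)
print(payload_sha1(body))
```

Reading the headers first and then hashing the remainder of the same stream means the record is traversed only once, which matters for large payloads.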
I have some changes that implement parsing of HTTP records and calculating the SHA-1 digest while streaming the payload. However, this happens internally in the library, and these changes are not suitable for upstream. https://bitbucket.org/rajbot/warc-tools
Are there any improvements we can make so that large and gargantuan WARC files can be read and processed quickly?