How to parse copyright year #31
Comments
This code came from unending's fork: Unending/Audiobooks.bundle@85694cb. I didn't personally test it. I can try to help a bit later. The regex says something along the lines of "match 4 digits in a row from the given string". regex101 is a great tool for learning more about regexes. Since copyright lines only contain years as numbers, all it needs to match is those 4 digits.
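To make the "4 digits in a row" idea concrete, here is a minimal sketch. The sample string and the helper name `find_years` are made up for illustration and are not taken from the plugin.

```python
import re

def find_years(text):
    """Return every run of exactly four digits found in text, in order."""
    # (?<!\d) and (?!\d) keep the pattern from matching inside longer numbers.
    return re.findall(r"(?<!\d)\d{4}(?!\d)", text)

# Hypothetical copyright line, only for demonstration.
sample = "(C)1952 Some Author (P)2011 Some Publisher"
years = find_years(sample)
```

With a line like the sample, `years` holds both the (C) and the (P) year, so the caller still has to decide which one to use.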
My code starting in line 658 currently looks like this:
It matches the first 4 digits it finds after the (c). I'm just guessing here, so everybody is invited to enlighten me.
Audible isn't very consistent, but the most common usage I've noticed is that (C) is the original copyright year of the work and (P) is the copyright year of the specific publication. See here for reference: https://www.audible.com/pd/East-of-Eden-Audiobook/B00546SXO0
I personally prioritize the (C) year, as I think that sorting by year or filtering by decade works better when the original copyright year is used, but the (P) year is also important and should be equivalent to the release date. Both dates need to be used, but I don't know of any player that takes advantage of them. Ideally the id3 tags should be:
Regarding the example you posted: Would you prefer to have the year set to 1952, or 1980? As we only have one year to set in Plex, I'd suggest simplifying the copyright parsing and doing it in the following order:
For (C) it should be the original year, so 1952. The actual Plex tag is "Release Date", so I think (P) 2011 should be the year/date actually imported into Plex.
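Since the discussion treats the (C) and (P) years as two separate values, a rough sketch of pulling them apart could look like the following. The function name `split_copyright_years` is hypothetical, and the sketch assumes the publication marker is always written as a literal "(P)"; real Audible strings may vary.

```python
import re

YEAR = re.compile(r"(?<!\d)\d{4}(?!\d)")
P_MARKER = re.compile(r"\(P\)", re.IGNORECASE)

def _first_year(text):
    """Return the first standalone 4-digit run in text, or None."""
    match = YEAR.search(text)
    return match.group(0) if match else None

def split_copyright_years(text):
    """Return (c_year, p_year); either value may be None."""
    marker = P_MARKER.search(text)
    if marker is None:
        # No (P) marker: treat the whole line as the (C) part.
        return _first_year(text), None
    c_part = text[:marker.start()]
    p_part = text[marker.end():]
    return _first_year(c_part), _first_year(p_part)
```

The caller can then pick whichever of the two values fits the preference setting, or fall back to the other one when the preferred year is missing.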
The part of the code I'm talking about is only called if the preferences are set to "use copyright year instead of date published". |
I'm getting an error when parsing the copyright line of this book:
©Knaus Verlag (P)2002 Mango Studios Köln
The error says
AttributeError: 'NoneType' object has no attribute 'group'
in line 674 executing helper.date = re.match(".?(\d{4}).*", cstring).group(1)
I had a look at the code and wanted to write a fix but don't understand the cases you're trying to catch.
Maybe we could collect different examples and expected output?
As far as I understand, you're stripping the string down to the part before (P) and extracting the date from that part only. What speaks against matching the first 4-digit sequence in the whole copyright string?
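One possible way to avoid the AttributeError, sketched under the assumption that falling back to the whole string is acceptable: check the match for None before calling .group(), and only look at the full copyright line when the part before (P) has no year. The function name `copyright_year` is made up for this example and is not the plugin's actual code.

```python
import re

YEAR = re.compile(r"\d{4}")

def copyright_year(copyright_text):
    """Return the first 4-digit year as a string, or None if there is none."""
    # Prefer the part before "(P)", as the current code appears to do.
    before_p = copyright_text.split("(P)", 1)[0]
    # re.search scans the string; fall back to the full line if needed.
    match = YEAR.search(before_p) or YEAR.search(copyright_text)
    return match.group(0) if match else None

year = copyright_year("©Knaus Verlag (P)2002 Mango Studios Köln")
```

For the book above this gives the (P) year instead of raising, and a line without any digits yields None, which the caller would need to handle (for example by keeping the published date).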