Folder organization options #9

Open
nomandera opened this issue Jun 30, 2018 · 3 comments

@nomandera

nomandera commented Jun 30, 2018

Enhancement request. Allow for identified ebooks to be organized into a predictable folder structure without renaming the ebooks themselves.

My goal is to organize all books by Author but I can imagine others may want to organise by different metadata.

Note: I may also need to add sharding, as filesystems and network shares such as Samba typically don't cope well with folder listings that are many thousands of entries long. Sharding may be off topic for this request but is included for completeness.

Example for demonstration:

From:

Frank Herbert Dune.epub
Douglas Adams The Hitchhikers Guide to the Galaxy.epub
George Orwell Nineteen Eighty-Four.epub
Isaac Asimov Foundation.epub

To:

\organized-books\D\Douglas Adams The Hitchhikers Guide to the Galaxy.epub
\organized-books\F\Frank Herbert Dune.epub
\organized-books\G\George Orwell Nineteen Eighty-Four.epub
\organized-books\I\Isaac Asimov Foundation.epub

The drivers for this are:

  • I am less interested in having files named correctly than in having them filed in the correct location
  • I have found that the match rate for the author is typically much higher than for the title
  • I prefer to keep files unrenamed so as not to lose any extra useful data the filename may contain

Excellent work. It took me far too long to stumble upon this excellent project idea.

@na--
Owner

na-- commented Jul 1, 2018

There are several things here that I'll try to unpack 😄, sorry if I don't address something.

  • If you just want to move files into subfolders, you can easily do that with bash one-liners. In your example, if you run this in the folder that contains the files you listed under From, you'd get the results you want:
    for f in *; do dir="${f:0:1}"; mkdir -p "$dir"; mv "$f" "$dir/$f"; done
    You can get pretty far with those one-liners (or simple purpose-built scripts) and bash string operations. I can't think of how I could improve much on that, and a general user-friendly file renamer is a bit out of scope for this project, sorry.
  • You can also move books to subfolders when you're organizing them with the scripts in this repository. For example, if you specify a different value for --output-filename-template/OUTPUT_FILENAME_TEMPLATE that includes forward slashes (i.e. /), the scripts will automatically create the subfolders you want. If I run this in your example's From folder:
    organize-ebooks.sh --output-folder=. --output-filename-template='"${d[AUTHORS]:0:1}/${d[AUTHORS]} - ${d[TITLE]}${d[PUBLISHED]:+ (${d[PUBLISHED]%%-*})}.${d[EXT]}"' .
    You'd get something like this (assuming that the books are correctly detected):
    OK:	./Douglas Adams The Hitchhikers Guide to the Galaxy.epub
    TO:	./D/Douglas Adams - The Hitchhiker's Guide to the Galaxy (1979).epub
    
    OK:	./Frank Herbert Dune.epub
    TO:	./F/Frank Herbert - Dune (1965).epub
    
    OK:	./George Orwell Nineteen Eighty-Four.epub
    TO:	./G/George Orwell - 1984 (1949).epub
    
    OK:	./Isaac Asimov Foundation.epub
    TO:	./I/Isaac Asimov - Foundation (1951).epub
    
    You can go as many levels of subfolders deep as you want. You may even use the original file path (${d[OLD_FILE_PATH]}) as part of the new name, though that would probably be buggy because of the sanitization these variables go through.
  • A better way to preserve that information would be to enable the --keep-metadata option. This way, when organize-ebooks.sh moves and renames a file, it also creates a simple text file next to it with a .meta extension. That file keeps the original file path (and a lot of other information). So you can have pretty new filenames without losing any information, just in case. This is very useful for organizing messy ebook collections: first you make a couple of automatic passes with organize-ebooks.sh and then use interactive-organizer.sh (which knows about those .meta files) to semi-automatically check for errors and, if necessary, manually correct them 😄
  • You can use the split-into-folders.sh script if you want to limit how many files are kept in a single folder.

Hopefully that covers everything 😄. I'm happy that you like the scripts and I'll try to answer if you have any other questions or suggestions.
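
For anyone copying the one-liner above: as written it moves everything in the folder, including subfolders that already exist. A slightly more defensive variant could look like the sketch below. The function name shard_by_first_char is made up for illustration and is not part of this repository, and the parameter expansion is written portably instead of using ${f:0:1}.

```shell
# Hypothetical helper, not part of this repository's scripts.
# Moves every regular file in DIR into DIR/<first character of its name>/.
shard_by_first_char() {
    local dir="$1" f name shard
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip subfolders and an unmatched glob
        name="${f##*/}"                  # filename without the leading path
        shard="${name%"${name#?}"}"      # first character of the filename
        mkdir -p "$dir/$shard" && mv -- "$f" "$dir/$shard/$name"
    done
}

# Usage (run against the folder that holds the books):
#   shard_by_first_char /unorganized-books
```

The filenames themselves are never changed, only their parent folder.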

@nomandera
Author

Slowly but surely I have been learning how to get the best out of these tools. I currently have a very low successful match rate but I will raise my findings on another ticket as there is a reasonable chance this is PEBCAK.

In the interim your suggestions above work great and a big thanks for taking the time to help us all with this guide.

I have some relevant follow-on questions, if that is OK.

For context I am using this variant with the official docker:

organize-ebooks.sh \
	--keep-metadata \
	--organize-without-isbn \
	--output-folder=/organized-books/isbn \
	--output-folder-uncertain=/organized-books/uncertain \
	--output-folder-corrupt=/organized-books/corrupt \
	--output-filename-template='"${d[AUTHORS]:0:1}/${d[AUTHORS]}/${d[AUTHORS]} - ${d[TITLE]}${d[PUBLISHED]:+ (${d[PUBLISHED]%%-*})}.${d[EXT]}"' \
	/unorganized-books

Is there a way to use what Calibre refers to as Author sort?

e.g. rather than Authors: C.L. Scholey we would use Author sort: Scholey, C.L.
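
To make the request concrete, here is a rough sketch of the transformation in plain shell. It is only an illustration: author_sort is a made-up name, not an option of organize-ebooks.sh, and it naively assumes a single author whose last whitespace-separated word is the surname, which is much cruder than what Calibre does.

```shell
# Hypothetical helper, not part of organize-ebooks.sh.
# Naive "Author sort": treats the last word as the surname.
author_sort() {
    local name="$1"
    case "$name" in
        *" "*)
            # "${name##* }" is the text after the last space (surname),
            # "${name% *}" is everything before it (given names).
            printf '%s, %s\n' "${name##* }" "${name% *}"
            ;;
        *)
            # Single-word names have nothing to reorder.
            printf '%s\n' "$name"
            ;;
    esac
}

author_sort "C.L. Scholey"    # prints: Scholey, C.L.
```

Multi-word surnames and multiple authors would come out wrong with this approach, so it only shows the shape of the result.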

I have still not found a way to keep the original filename. I do not want to rename ebook files at all, only organise them into folders based on their primary author. Is there a variable I can use that holds the original, unaltered filename, excluding the path?
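
Expressed outside the tool, the behaviour I am after looks roughly like the sketch below. file_under_author is a made-up helper, the author value is assumed to come from some earlier lookup, and the author string is assumed to contain no "/" characters.

```shell
# Hypothetical helper, not part of this repository's scripts.
# Moves SRC to DEST_ROOT/<first letter of AUTHOR>/<AUTHOR>/<original filename>.
file_under_author() {
    local src="$1" author="$2" dest_root="$3"
    local name="${src##*/}"                  # original filename, path stripped
    local shard="${author%"${author#?}"}"    # first character of the author
    local target="$dest_root/$shard/$author"
    mkdir -p "$target" && mv -- "$src" "$target/$name"
}

# Usage:
#   file_under_author "/unorganized-books/Frank Herbert Dune.epub" \
#       "Frank Herbert" /organized-books
```

The ${src##*/} expansion is the part that keeps the filename untouched while dropping the old path.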

Finally, to answer your question about metadata: as you can see, I am currently using the --keep-metadata option, but once I have ironed out all the bugs I will be dropping it. Why? A principal problem I have is filesystem performance and caching as the inode count increases. Doubling the number of files that need to be stored isn't viable for me. Perhaps off topic, but one large central metadata database would not be an issue; it is not a capacity issue, just a file-count one.
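
As a stopgap, the .meta sidecar files could be folded into a single text file after the fact. The sketch below is only an idea: consolidate_meta and the "### <path>" record header are invented here, and since interactive-organizer.sh relies on the individual .meta files, this would only make sense once manual checking is finished.

```shell
# Hypothetical helper, not part of this repository's scripts.
# Appends every *.meta file under ROOT to the single file DB, then deletes it.
# DB should live outside ROOT (or at least not end in .meta).
consolidate_meta() {
    local root="$1" db="$2"
    find "$root" -type f -name '*.meta' -exec sh -c '
        db="$1"; shift
        for f in "$@"; do
            # Write a header with the sidecar path, then its contents;
            # remove the sidecar only if the whole record was appended.
            { printf "### %s\n" "$f" && cat "$f" && printf "\n"; } >> "$db" \
                && rm -- "$f"
        done
    ' sh "$db" {} +
}

# Usage:
#   consolidate_meta /organized-books /organized-books-metadata.txt
```

This trades one inode per book for a single growing file, at the cost of having to search that file to find a given record.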

Thanks again

@KeithPetro

As na-- mentioned, if your files are already named correctly as you want them (Author first), then you have no need to use these tools to sort them. A simple bash one-liner (as he provided in his post) works just fine:

for f in *; do dir="${f:0:1}"; mkdir -p "$dir"; mv "$f" "$dir/$f"; done

I would suggest that you use these tools as part of a workflow, rather than trying to view them as the be-all and end-all of your toolkit.
