The JAV-Sort-Scrape-javlibrary
repository is officially put on maintenance mode and will not receive any future updates. Please check out the successor to this project, Javinizer.
The JAV-Sort-Scrape-javlibrary repository is a series of scripts used to manage your local JAV (Japanese Adult Video) library. It automatically scrapes content from JavLibrary, R18, and 7mmtv to create an easily usable content library within Emby or Jellyfin. My goal in maintining this project is for it to function as a simple and lightweight alternative to [JAVMovieScraper](https://github.com/DoctorD1501 JAVMovieScraper). If you have any questions, criticisms, or requests, or want to be kept up-to-date of any new features and releases, feel free to hop into my discord channel.
Big thanks to the original author of the sort_jav.py script /u/Oppaira.
Older changes have been moved to the wiki.
- Changes
- Add actor thumbnail URL directly in the .nfo metadata within
Set-JAVNfo.ps1
to work with Plex - Change some default video naming settings in
settings_sort_jav.ini
- Change directory structure for
Get-R18ThumbUrls.ps1
- Add updated actor .csv files
- Add actor thumbnail URL directly in the .nfo metadata within
- Fixes
- Fix
Sort-7mmtv.ps1
not sorting some videos - Fix
$SettingsPath
onGet-R18ThumbUrls.ps1
- Fix
- Additions
- Add setting
scrape-r18-other-metadata
to scrape and set R18.com metadata for series and video director - Experimental - Add script
Sort-7mmtv.ps1
to scrape and set 7mmtv.tv metadata for uncensored and amateur JAV
- Add setting
- Fixes
- Fix setting
do-not-rename-file
causing script to break when setting is true
- Fix setting
- Additions
- Add setting
prefer-r18-title
to prefer R18.com title for video metadata Add settingRemoved Sep 10, 2019 due to breaking bugdo-not-rename-file
to run sort_jav.py without renaming video files
- Add setting
- Fixes
- Fix html causing inconsistencies on video titles in Emby
- Fix Set-JAVNfo.ps1 not writing metadata for videos with naming starting with "R18"
- Fix R18 titles not scraping with PowerShell 6.0+
- Fixes
- Fix error causing nfo metadata file not being read by Emby
- Fix Set-JAVNfo.ps1 file compatibility with PowerShell Core 6.0+
- Python 3.5+
- Pillow
- cfscrape - requires Node.js
- cloudscraper
- PowerShell 5.0 or higher (6.0+ recommended)
Clone the repository or download the latest release.
# Required to crop cover images
> pip install Pillow
# Required to scrape JavLibrary
# If you are using the latest release, install cfscrape
# If you cloned the repository, install cloudscraper
# Used in latest commit in master
> pip install cloudscraper
# Used in latest release
> pip install cfscrape
Download and install from here.
You will need PowerShell v5.0 or higher installed to run any of the .ps1 scripts (PowerShell 5.0 is installed on Windows 10 by default). If you get a Remote-ExecutionPolicy error when running, open an administrator PowerShell prompt, and run the following to unrestrict the scripts:
PS> Set-ExecutionPolicy Unrestricted
Before running any of the scripts, configure your settings in settings_sort_jav.ini
.
Documentation for each option is listed in the settings file, with defaults set to my best practice
guideline. Most notably, you will need to change each of the path settings to match your local
directory structure. Completely read through this section, the Notes, and FAQ
section before using.
If you are running this for the first time and you have many directories of videos, you can run the
sort_jav.py
or Set-JAVNfo.ps1
scripts concurrently with many different paths. Simply edit the
path
in your settings file, save, and run the script. Edit path
and run again, etc.
To run PowerShell (.ps1) scripts, right click the file and select "Run with PowerShell". To run Python (.py) scripts, double click to run. You can also invoke the scripts from a non-administrator shell like shown in the demo.
The scripts are numbered in the order that they should be run. They were written with ease-of-use in mind, so they are a one-click solution once your settings are configured properly.
-
Run
sort_jav.py
to sort your JAV files. - Stop here if you don't use any media servers that use .nfo metadata -
Run
Get-R18ThumbUrls.ps1
to scrape R18 and get actor thumbnail urls and write to .csv spreadsheet specified inr18-export-csv-path
(you only need to do this once every so often when new actors are added to R18).- I have provided two recently scraped spreadsheets in the repository as
emby_actor_thumbs/R18-Aug-30-2019.csv
Use last-first if yourname-order
in your settings is set to last, and vice-versa.
- I have provided two recently scraped spreadsheets in the repository as
-
Run
Set-JAVNfo.ps1
to create .nfo metadata files for each video. - Stop here if you are using Plex or you don't want actor images in Emby/Jellyfin- You can re-run this script on existing directories to replace old metadata files if you make changes to your settings
-
Go to your Emby server and make sure all your videos are imported. This is important, as the next step will call Emby's API to get your current actor list. Take a look at Emby's documentation on how to generate an API key to insert into your settings file.
-
Run
Get-EmbyActorThumbs.ps1
to get Emby actor name/id to compare with R18 spreadsheet and create a .csv spreadsheet specified inactor-csv-export-path
. Make sure the spreadsheet you created inGet-R18ThumbUrls.ps1
is specified inr18-export-csv-path
as it will be referenced in this script.- After creating this spreadsheet, you can modify specific actor images manually
- You can re-run this script after importing new videos/actors into Emby to update your spreadsheet
-
Run
Set-EmbyActorThumbs.ps1
to import actor images into Emby. It will import based on the spreadsheet specified inactor-csv-export-path
. A .csv database will be created to record all changes made to your actor images specified inactor-csv-database-path
.- After your first write to Emby, your .csv database will be created
- Any changes made to your
actor-csv-export-path
file will be compared to the database inactor-csv-database-path
and be written accordingly
-
Optional Run
Sort-7mmtv.ps1
download cover and create a metadata file for uncensored and amateur JAV videos as listed on 7mmtv.tv. There are currently no configurable settings for this script besides the path7mm-files-path
. By default, the video will be moved to a new folder, have a cover image downloaded, and .nfo metadata file created. Review additional notes and how to run in the notes, and in the settings file comments above7mm-files-path
If you are having trouble with any of these steps, review my script-run demos, or send me a message in my discord channel.
Set up a Movies
media library, and select show advanced settings
. Uncheck all settings that try
to get movie metadata from the internet.
Install the XBMCnfoMoviesImporter
plugin on your Plex Media Server. Set up a Movies
media library, and under the Advanced
settings, set XBMCnfoMoviesImporter as the scanner agent. Leave all other settings default.
- Matches your jav file and scrapes javlibrary for content
- Renames and sorts your jav file to your specified settings
sort_jav.py will run a non-recursive search of video files located in the path
specified in your
settings file. Files located in folders will not be detected.
If you are trying to sort a video with multiple parts, follow any of the naming schemas below:
-
MIRD-151A, MIRD-151B
-
MIRD151A, MIRD-151B
-
MIRD-151[delimiter-between-multiple-videos]A, MIRD-151[delimiter-between-multiple-videos]B
-
MIRD-151[delimiter-between-multiple-videos]1[delimiter-between-multiple-videos], MIRD-151[delimiter-between-multiple-videos]2[delimiter-between-multiple-videos]
- Matches all html .txt files created by sort_jav.py
- Creates a .nfo metadata file that is readable by Media servers like Emby/Jellyfin
Set-JAVNfo.ps1
will run a recursive search of .txt files located in the path
specified in your
settings file. A .nfo metadata file will be generated with information such as title, release date,
studio, genres, actors, actor thumb URL. Set prefer-r18-titles
true to do an additional scrape of
R18.com for better translated titles in your metadata.
- Scrapes R18 for all actor thumbnails and creates a csv database for actor name and their thumbnail URL.
Get-R18ThumbUrls.ps1
will take a while to run, as it needs to parse over 300 pages of R18 actors.
I have provided recent (Aug-30-2019) scrape files created with this script for you to use if you do
not want to create your own file. Use R18-Aug-30-2019-last-first.csv
if you have name-order
set
to last, and vice versa. Specify this path in r18-export-csv-path
.
- Calls Emby/Jellyfin API to get a list of actors and their IDs
- Matches names of both Emby actor list and R18 csv database created by Get-R18ThumbUrls
- Creates a modifiable .csv spreadsheet to import into Emby
Get-EmbyActorThumbs.ps1
will parse the R18ThumbUrl .csv and match with your Emby/Jellyfin actor
list generated by an API call. A new spreadsheet will be created for direct import. You can modify
this spreadsheet by hand if you want to add/delete/update external actor images (such as from WAPdB)
in Emby/Jellyfin. You can also run this script again after adding new videos/actors into Emby to
update the spreadsheet with new actors.
- Reads csv spreadsheet created by Get-EmbyActorThumbs.ps1 and imports matching thumbnails into Emby/Jellyfin using API
- Creates an up-to-date csv database of all API calls made to Emby/Jellyfin
Set-EmbyActorThumbs.ps1
will read the csv created by Get-EmbyActorThumbs.ps1
and import new
actor images to Emby while writing a separate csv database of all changes made. If the csv database
already exists, it will be compared to your actor csv and only import new changes.
- Scrapes and sorts all videos in path set in
7mm-files-path
in the settings file
Sort-7mmtv.ps1
will scrape metadata, download the video cover, write a .nfo metadata file, and
move videos to a new directory. It will scrape metadata information from the Japanese version of the
site. Unfortunately the English version of the site has too many errors and discrepencies. I
recommend you separate your uncensored/amateur JAV from your censored ones in Emby/Jellyfin as to
better filter through them. This is a very basic script, with no settings options besides the path.
The script will sort one video file every 10 seconds, as the script scrapes from Google, and you
will be blocked if you scrape too much.
- Finds all original thumbnail-size covers and creates an extra poster-size cover
edit_covers.py
is deprecated past v1.4.5. Only use this script if you are using a very old version
that did not have poster covers and you want to add them to your already scraped files.
Please note that where it gives you options to include delimiters, certain characters are disallowed by the OS. If you include them, they will be forcibly removed from your delimiter. For windows, that would be: / \ : * ? < > |
Try renaming your file exactly how it appears on javlibrary. If it still doesn't work for some reason, follow the instructions below to manually sort the file.
This problem likely stems from the cfscrape/cloudscraper module failing. First ensure that both cfscrape AND
Node.js are installed properly (restart computer after installing Node.js). If that still does not
work, run pip install -U cfscrape
to upgrade your cfscrape module.
Occasionally the results will be incorrect due to there being multiple videos with the same title.
To manually sort the file, go to the correct javlibrary page, and save the page as an html. Also
save the cover image. Rename the .html to a .txt, and rename both the cover and .txt the same as the
video. Run Set-JAVNfo.ps1 and it will create a metadata file for you. To create a poster cover, you
can either crop it yourself, or run the edit_covers.py
script with the FULL directory path of your
manually sorted files set in the settings path.
For these special cases, rename the files to exactly how they appear on javlibrary. Anything extra in the video name will cause it not to sort. If you have a lot of them, check out my JAV renamer script.
Try calling the scripts through a shell window rather than double-clicking to run. You will be able to diagnose what the issue is through the error messages.
Unfortunately R18 and javlibrary use different English naming conventions for their actresses, so
until I have a solution in place to either scrape Japanese, or do some manual switcharoos, we'll
have to deal with it. You can also manually put in image urls into the .csv specified in
actor-csv-export-path
.
As of now, I dont have a solution in place to handle hard caps on filename. If you have files that won't sort due to this error, turn off actor names for that file.
- Add option to input tags/genres in metadata file - v.1.4.0
- Add functionality to crop cover to poster size - v1.4.4
- Scrape actor images from R18.com and push to Emby - v1.5.0
- Add option to run sort_jav.py without renaming local files - v1.5.2
- Scrape scene title from R18.com - v1.5.2
- Scrape series title and director name from R18.com - v1.5.3
- Scrape amateur/uncensored video metadata from 7mmtv - v1.5.3
- Add functionality to add actor thumb link in .nfo metadata - v1.5.4
- Add option to rename *-thumb.jpg files to fanart.jpg for better Plex compatibility
- Add option to do recursive search on sort_jav.py
- Add option to manually scrape a javlibrary url if it can't match automatically
- Add more video title renaming options
- Scrape video plot/description