[instagram] downloading posts with co-authors #6208

Open
docholllidae opened this issue Sep 18, 2024 · 3 comments

Comments

@docholllidae

docholllidae commented Sep 18, 2024

sometimes when downloading a user's profile, a post or two ends up in another folder because that post is co-authored with another profile

when scraping a profile, all posts are downloaded to zzInsta\downloads\{owner_id}.{username}
however, when scraping letrileylive's profile and the extractor reaches this post https://www.instagram.com/letrileylive/reel/C_3dPwEPmzU/ (warning: semi-nsfw), it is downloaded to the zzInsta\downloads\6200677336.officialplayboyplus folder instead of the folder for letrileylive
(note that the link redirects in the browser to https://www.instagram.com/officialplayboyplus/reel/C_3dPwEPmzU/)

how can i make sure these collab/coauthor posts are saved to the directory of the profile being scraped?

for reference here is my config:

{
    "extractor": {
        "base-directory": "X:/My Drive/",
        "archive": "%appdata%/gallery-dl/archive.sqlite3",
        "path-restrict": "^A-Za-z0-9_.~!-",
        "#skip": "abort:3",
        "keywords-default": "",

        "instagram": {
            "archive": "X:/My Drive/zzInsta/archive.instagram.sqlite3",
            "cookies": "X:/My Drive/zzInsta/cookies.instagram.1.txt",
            "include": ["avatar","posts","reels","highlights","stories"],

            "#avatar": {
                "#directory": ["zzInsta","downloads","{owner_id}.{username}","media","avatar"],
                "#archive": "",
                "#filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{owner_id}.{username}~_~{filename}.{extension}"
            },
            
            "directory": ["zzInsta","downloads","{owner_id}.{username}","{subcategory}"],
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{post_id}-{post_shortcode}-{num}.{username}~_~{description[0:50]}.{extension}",
            
            "sleep": [11.7,17.4],
            "sleep-request": [11,17],
            
            "posts": {
                "#skip": "abort:5"
            },
            "reels": {
                "#skip": "abort:5"
            }
        }
    }
}
@mikf
Owner

mikf commented Sep 19, 2024

Use the {user[...]} values instead of {owner_id} etc. These always reference the user account from your input URLs instead of a potential co-author.
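
The difference between the two sets of fields can be sketched as follows. This is only an illustration: Python's `str.format` stands in for gallery-dl's own formatter, the key names `id` and `username` inside `user` are assumptions (check the `-j` output for a profile URL to see the actual keys), and the ID given for letrileylive is a made-up placeholder.

```python
# Illustration only: Python's str.format stands in for gallery-dl's formatter.
DIRECTORY_BY_OWNER = "{owner_id}.{username}"
DIRECTORY_BY_USER = "{user[id]}.{user[username]}"  # key names are assumptions

# Hypothetical metadata for the co-authored post when it is reached
# through letrileylive's profile URL.
post = {
    "owner_id": "6200677336",             # co-author that owns the post
    "username": "officialplayboyplus",
    "user": {
        "id": "0000000000",               # placeholder, not the real ID
        "username": "letrileylive",       # account from the input URL
    },
}


def resolve(template, metadata):
    """Fill a directory template with the post's metadata."""
    return template.format(**metadata)


print(resolve(DIRECTORY_BY_OWNER, post))  # follows the post's owner
print(resolve(DIRECTORY_BY_USER, post))   # follows the scraped profile
```

If the real metadata is shaped like this, swapping `{owner_id}.{username}` for the `user`-based fields in the "directory" option would keep co-authored posts in the folder of the profile being scraped.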

@Hrxn
Contributor

Hrxn commented Sep 19, 2024

Shouldn't {username} be the same here?

Also

            "directory": ["zzInsta","downloads","{owner_id}.{username}","{subcategory}"],

doesn't result in instagram\<profilename> like you suggested? What are you actually doing?

@docholllidae
Author

docholllidae commented Sep 20, 2024

> Shouldn't {username} be the same here?
>
> Also
>
>             "directory": ["zzInsta","downloads","{owner_id}.{username}","{subcategory}"],
>
> doesn't result in instagram\<profilename> like you suggested? What are you actually doing?

you're right, i had a brain fart when writing up my post.
it results in zzInsta\downloads\id.username, which is what i want (zz is prepended just because i want the scraped sites at the bottom of my directory listing, there are other IG-specific files in there so downloads go in a subdirectory, and then I add the owner_id to the start of the user's folder because some people tend to be quite liberal with their name changes)

I edited the OP to make those corrections

> Use the {user[...]} values instead of {owner_id} etc. These always reference the user account from your input URLs instead of a potential co-author.

I'm not sure which {user[...]} values you are referring to. running with the -j option on a post, the only value i find with "user" in the name is username
(example: https://www.instagram.com/p/C_3dPwEPmzU/)

after some digging I did find a sort of workaround: I added this to the extractor's options
"parent-directory": "true"

the only downside being the filename is still named with the co-author's username, but that's a very minor detail to me in this case
