Cycle IP addresses #6

Open
carlvaneijk opened this issue Dec 10, 2021 · 13 comments

Comments

@carlvaneijk

This isn't an issue per se, but if there are multiple submissions from the same IP address, they can likely just filter their database for non-unique IPs and drop those submissions.

Good work so far!
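
To make the concern concrete, here is a rough sketch of the kind of filter the receiving side could run. The record layout (an `ip` and a `name` field) and the sample addresses are assumptions for illustration, not anything known about their actual database.

```python
from collections import Counter


def drop_repeated_ips(submissions):
    """Keep only submissions whose IP address appears exactly once."""
    counts = Counter(s["ip"] for s in submissions)
    return [s for s in submissions if counts[s["ip"]] == 1]


# Hypothetical records; the addresses come from documentation-only ranges.
submissions = [
    {"ip": "203.0.113.5", "name": "A"},
    {"ip": "203.0.113.5", "name": "B"},
    {"ip": "198.51.100.7", "name": "C"},
]
kept = drop_repeated_ips(submissions)
```

With this input, both submissions from the repeated address are dropped and only the one from the unique address is kept.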

@Kevinsky86

Dev can't really fix this.
Maybe you can use Tor for this, or a VPN service.

@millysecurity

> Dev can't really fix this.
> Maybe you can use Tor for this, or a VPN service.

The dev could provide a way to configure proxies if he wanted. That'd solve the problem, though I doubt most of y'all know how to use those.

@mdrews93

I wrote a very similar Jupyter notebook, and I use Mullvad VPN to change IPs before each new application. It comes with a command-line tool, so within the notebook I just need to use ! to execute shell commands. Here's the simple function:

def change_vpn_location():
    !mullvad disconnect
    !mullvad connect

Ideally it would use IPs local to the postings, but Mullvad doesn't provide IPs for the four states, so I just opted for a random US IP.
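
The `!` syntax only works inside IPython/Jupyter. Outside a notebook, a minimal sketch of the same two commands using `subprocess` might look like the following. It assumes the `mullvad` CLI is on the PATH; the `run` parameter is a hypothetical hook added here only so the function can be exercised without the CLI installed.

```python
import subprocess


def change_vpn_location(run=subprocess.run):
    """Drop the current Mullvad connection, then open a new one."""
    for command in (["mullvad", "disconnect"], ["mullvad", "connect"]):
        # check=True makes subprocess.run raise CalledProcessError
        # if the CLI exits with a non-zero status.
        run(command, check=True)
```

Whether reconnecting actually yields a different exit IP depends on the relay settings, so it is worth verifying the address after each call.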

@bolshoytoster
Contributor

We could use an array of proxies and pick a random one each time.

@kpcyrd

kpcyrd commented Dec 12, 2021

If there's a way to configure a SOCKS5 proxy, you could use it with https://github.com/kpcyrd/laundry5

@bolshoytoster
Contributor

bolshoytoster commented Dec 12, 2021

If we used proxies, would we only use ones in the states the jobs are advertised in, or would it not matter?

Also, in my experience, public proxy servers are typically very slow, so it would dramatically reduce the rate at which we can send forms.

@bolshoytoster
Contributor

@kpcyrd if we did, we'd have to either find a Python version of that or write it ourselves, since that's in Rust.

@kpcyrd

kpcyrd commented Dec 12, 2021

The program acts as a proxy server and binds to a local port (e.g. 127.0.0.1:1337). You'd then need to configure headless Chrome to use 127.0.0.1:1337 as a SOCKS5 proxy.
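
As a rough sketch of that configuration step: Chrome accepts a `--proxy-server=socks5://host:port` argument, so the Python side only needs to build that string. The helper below is hypothetical (not part of laundry5 or KelloggBot) and first checks that something is actually listening on the local port.

```python
import socket


def socks5_proxy_argument(host="127.0.0.1", port=1337, timeout=1.0):
    """Return a Chrome --proxy-server argument for a local SOCKS5 proxy.

    Raises RuntimeError if nothing accepts connections on host:port.
    """
    try:
        # Open and immediately close a TCP connection as a liveness check.
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError as exc:
        raise RuntimeError(f"no proxy listening on {host}:{port}") from exc
    return f"--proxy-server=socks5://{host}:{port}"
```

The returned string could then be passed to `options.add_argument(...)` before the driver is created.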

@bolshoytoster
Contributor

@kpcyrd oh, sorry. It would still be a pain to also have to install rustc for this, though.

@pws1453
Contributor

pws1453 commented Dec 13, 2021

Python's Selenium implementation supports proxies natively. The challenge will be finding good proxies that we can use.
Tutorial: https://www.tutorialspoint.com/running-selenium-webdriver-with-a-proxy-in-python

@bolshoytoster
Contributor

@pws1453 if we were only using proxy servers in the states the jobs were advertised in, it would be pretty much impossible to get a list of good proxies.

@millysecurity

> @pws1453 if we were only using proxy servers in the states the jobs were advertised in, it would be pretty much impossible to get a list of good proxies.

The issue isn't the proxy location, or even whether it's a good proxy. The only issue is using the same IP repeatedly. If there are multiple submissions from the same IP, they can figure that out. The goal is just to use a different IP every time, so a proxy from anywhere would work fine. If they actually took the time to check where the IP address is from then yeah, like you said, we would have a problem lol.

@bolshoytoster
Contributor

My suggestion would be to add a new file, constants/proxies.py, containing a list of proxy servers to rotate through:

PROXY_SERVERS = ['127.0.0.1:8080',
                 '1.2.3.4:6969',
                 …
                 '4.3.2.1:1234']

Then in main.py in start_driver:

KelloggBot/main.py

Lines 136 to 143 in dd481cf

def start_driver(random_city):
    options = Options()
    if (args.debug == DEBUG_DISABLED):
        options.add_argument(f"user-agent={USER_AGENT}")
        options.add_argument('disable-blink-features=AutomationControlled')
        options.headless = True
    driver = webdriver.Chrome(options=options)
    driver.set_window_size(1440, 900)

We could add
options.add_argument(f'--proxy-server={random.choice(PROXY_SERVERS)}').
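
For completeness, a minimal standalone sketch of the selection step. The function name `random_proxy_argument` and the two placeholder addresses are made up for illustration; in the bot itself, `PROXY_SERVERS` would presumably be imported from constants/proxies.py and `random` would need to be imported in main.py.

```python
import random

# Placeholder entries only; replace with real proxy addresses.
PROXY_SERVERS = ["127.0.0.1:8080", "127.0.0.1:8081"]


def random_proxy_argument(proxies, rng=random):
    """Build a Chrome --proxy-server argument from a randomly chosen proxy."""
    if not proxies:
        # random.choice would raise a less descriptive IndexError here.
        raise ValueError("proxy list is empty")
    return f"--proxy-server={rng.choice(proxies)}"
```

Calling `options.add_argument(random_proxy_argument(PROXY_SERVERS))` once per `start_driver` call would give each browser session its own randomly chosen proxy. The explicit empty-list check is a design choice so that a misconfigured list fails with a clear message.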

7 participants