This repository has been archived by the owner on Apr 17, 2024. It is now read-only.

Is already working with the new version of Linkedin? #104

Open
fritzZz opened this issue May 14, 2017 · 4 comments

Comments

@fritzZz

fritzZz commented May 14, 2017

When I run the command ./linkedin-scraper https://www.linkedin.com/in/blablabla/
I get this error:

/usr/lib/ruby/gems/1.9.1/gems/mechanize-2.7.4/lib/mechanize/http/agent.rb:942:in `response_read': 999 => -- https://www.linkedin.com/in/blablabla/ (Mechanize::ResponseCodeError)
from /usr/lib/ruby/gems/1.9.1/gems/mechanize-2.7.4/lib/mechanize/http/agent.rb:270:in `block in fetch'
from /usr/lib/ruby/1.9.1/net/http.rb:1323:in `block (2 levels) in transport_request'
from /usr/lib/ruby/1.9.1/net/http.rb:2672:in `reading_body'
from /usr/lib/ruby/1.9.1/net/http.rb:1322:in `block in transport_request'
from /usr/lib/ruby/1.9.1/net/http.rb:1317:in `catch'
from /usr/lib/ruby/1.9.1/net/http.rb:1317:in `transport_request'
from /usr/lib/ruby/1.9.1/net/http.rb:1294:in `request'
from /usr/lib/ruby/gems/1.9.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:999:in `request'
from /usr/lib/ruby/gems/1.9.1/gems/mechanize-2.7.4/lib/mechanize/http/agent.rb:267:in `fetch'
from /usr/lib/ruby/gems/1.9.1/gems/mechanize-2.7.4/lib/mechanize.rb:464:in `get'
from /home/fritzzz/Downloads/linkedin-scraper-master/lib/linkedin-scraper/profile.rb:34:in `initialize'
from ./linkedin-scraper:11:in `new'
from ./linkedin-scraper:11:in `<main>'

Does anyone else have the same problem?
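
For anyone reading the trace: 999 is not a standard HTTP status code, which would be consistent with the empty slot after `=>` in the error message (how the message is formatted is an assumption here, not something confirmed in this thread). A minimal sketch using only Ruby's standard `net/http` to check whether a code has a registered response class; `standard_status?` and `describe_status` are hypothetical helper names, not part of linkedin-scraper or Mechanize:

```ruby
require 'net/http'

# True when Ruby's net/http has a response class registered for this code.
# CODE_TO_OBJ is keyed by strings such as '404'.
def standard_status?(code)
  Net::HTTPResponse::CODE_TO_OBJ.key?(code.to_s)
end

# Hypothetical helper: render a code next to its response class, if any.
def describe_status(code)
  klass = Net::HTTPResponse::CODE_TO_OBJ[code.to_s]
  klass ? "#{code} => #{klass}" : "#{code} => (no standard response class)"
end

puts describe_status(404)
puts describe_status(999)
```

A caller could use a check like this to treat non-standard codes as "probably blocked" rather than as an ordinary missing page.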

@rubenbaden

It worked for me only twice out of a few hundred attempts. I'm assuming we may need new user agents?

I'm not sure, but I need help!

Will post if I find anything out.
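
To make the user-agent idea concrete, here is a rough sketch of picking a `User-Agent` header per request with Ruby's standard `net/http`. The agent strings and the `build_request` helper are made up for illustration, the request is only built and never sent, and nothing in this thread confirms that changing the header is enough to avoid the 999 response:

```ruby
require 'net/http'
require 'uri'

# Hypothetical pool of user-agent strings; replace with real browser values.
USER_AGENTS = [
  'Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12) ExampleBrowser/2.0'
].freeze

# Build (but do not send) a GET request with a randomly chosen User-Agent.
def build_request(url, user_agents = USER_AGENTS)
  request = Net::HTTP::Get.new(URI.parse(url))
  request['User-Agent'] = user_agents.sample
  request
end

request = build_request('https://www.linkedin.com/in/blablabla/')
puts request['User-Agent']
```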

@Startouf

I don't understand why you had to open another issue when there are already two discussing this =_=

@yatish27
Owner

LinkedIn is strict. It identifies bot requests and sends a 404 response.

@cyberfab007

I found that using curl to authenticate with LinkedIn worked well, and I have been able to pull down profile pages too. The issue I am running into is processing the JavaScript so that the page is readable in DOMDocument and I can use XPath to scrape the information. Right now I have a bunch of preg_match trickery sorting through the JSON output that comes down. I wrote my script in PHP as a class; would anyone care to help with it? I tried using php-phantomjs, and it works well unless you hit a redirect or need to use cookies. I am sure it will work with some time and effort.
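
As an alternative to regex-matching individual fields, the embedded JSON can be pulled out as a whole and handed to a JSON parser. The sketch below is in Ruby (this project's language, not the PHP described above) and uses only the standard library. The page fragment, the `profile-data` id, the field names, and the `embedded_json` helper are all hypothetical; the real markup LinkedIn serves is not shown in this thread:

```ruby
require 'json'

# Hypothetical page fragment with a JSON payload embedded in a script tag.
SAMPLE_HTML = <<~HTML
  <html><body>
  <script type="application/json" id="profile-data">{"firstName": "Ada", "headline": "Engineer"}</script>
  </body></html>
HTML

# Find the script element with the given id and parse its body as JSON.
# Returns nil when the element is missing or its body is not valid JSON.
def embedded_json(html, element_id)
  pattern = %r{<script[^>]*\bid="#{Regexp.escape(element_id)}"[^>]*>(.*?)</script>}m
  match = html.match(pattern)
  return nil unless match

  JSON.parse(match[1])
rescue JSON::ParserError
  nil
end

profile = embedded_json(SAMPLE_HTML, 'profile-data')
puts profile['firstName']
```

One regex locates the payload and the parser does the rest, so a change in field order or whitespace inside the JSON does not break the extraction.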
