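Scraping a batch of accounts: the program fails partway through, before the batch has finished, and all data scraped after the error is lost. Consistently reproducible. How can this be fixed? #596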

Open
tonymao51 opened this issue Jul 19, 2024 · 3 comments
Labels
failed (program runtime error)

Comments

@tonymao51

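To help us resolve the problem, please answer the questions below carefully. Once the problem is solved, please close this issue promptly.

  • Q: Which version fails (GitHub version / PyPI version / both)?

A:

  • Q: Are you using the latest version of the program (yes/no)?

A: Yes

  • Q: Does the error occur when scraping any user (yes/no)?

A: Yes

  • Q: If the error only occurs for specific weibo posts, can you provide the weibo_id or URL of the failing post (optional)?

A:

  • Q: If you have already provided the weibo_id or URL of the failing post, ignore this question. Otherwise, can you provide the user_id of the failing account and the since_date you configured, so that we can locate the failing post (optional)?

A:

  • Q: If convenient, please describe the error in detail, ideally with the error message attached.

A: ****************************************************************************************************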
'NoneType' object has no attribute 'xpath'
Traceback (most recent call last):
File "D:\WBspider_code\weiboSpider-master\weibo_spider\parser\index_parser.py", line 49, in get_page_num
if self.selector.xpath("//input[@name='mp']") == []:
AttributeError: 'NoneType' object has no attribute 'xpath'
unsupported operand type(s) for +: 'int' and 'NoneType'
Traceback (most recent call last):
File "D:\WBspider_code\weiboSpider-master\weibo_spider\spider.py", line 167, in get_weibo_info
if self.page_count > 2 and (self.page_count +
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
共爬取0条微博 (0 weibo posts scraped in total)
信息抓取完毕 (scraping finished)


@tonymao51 tonymao51 added the failed (program runtime error) label Jul 19, 2024
@dataabc
Owner

dataabc commented Jul 19, 2024

You are scraping too fast and have been temporarily rate-limited. The restriction is usually lifted automatically after a while.
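
The `AttributeError` in the traceback suggests that the page selector was `None`, which is consistent with the rate limiting described above. The following is a minimal sketch of one way to cope with it, not weiboSpider's own code: it assumes a hypothetical `fetch` callable that returns `None` when a request is throttled, and retries it with exponentially growing waits instead of passing `None` on to the parser. The names `fetch_with_backoff`, `max_attempts`, and `base_delay` are illustrative only.

```python
import time


def fetch_with_backoff(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-None value.

    `fetch` is a hypothetical zero-argument callable that returns None
    when the site has throttled the request. The wait doubles after each
    failed attempt: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        # Do not sleep after the final attempt; just give up.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return None
```

A caller would still need to check for a `None` return and skip or postpone that account, rather than calling `.xpath` on the result. For a real rate limit, `base_delay` would likely need to be minutes rather than seconds.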

@tonymao51
Author

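> You are scraping too fast and have been temporarily rate-limited. The restriction is usually lifted automatically after a while.

Thanks for the explanation. I found another problem while scraping, though. I am scraping yesterday's and today's posts for a batch of accounts, and for some accounts, posts published in that period are sometimes not retrieved. This is not consistently reproducible, but it tends to happen when the account list contains 20 or more accounts.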

@dataabc
Owner

dataabc commented Jul 20, 2024

The API may just be unstable.
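
If the API is indeed flaky, one possible way to limit the damage in a batch run is to isolate each account, so that a failure on one account neither aborts the rest nor discards what was already collected, and to record the failed accounts for a second pass. This is a sketch under assumptions, not part of weiboSpider: `scrape_one` is a hypothetical callable that takes a user_id and returns that account's posts, and `scrape_batch` is an illustrative name.

```python
def scrape_batch(user_ids, scrape_one):
    """Scrape each account independently and report which ones failed.

    `scrape_one` is a hypothetical callable: user_id -> list of posts.
    Returns (results, failed), where `results` maps user_id to posts and
    `failed` lists the user_ids whose scrape raised an exception.
    """
    results = {}
    failed = []
    for user_id in user_ids:
        try:
            results[user_id] = scrape_one(user_id)
        except Exception:
            # Keep going: one throttled or broken account should not
            # cost the data already gathered for the others.
            failed.append(user_id)
    return results, failed
```

The `failed` list can be fed back into `scrape_batch` later, once any temporary restriction has lifted. Note that this catches only accounts that raise an error; an account that returns an incomplete set of posts without raising would need a separate check.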
