cort-predict-raw runs on python2 but not python3.5 #17

bennytieu · 2017-04-27T09:20:50Z

I was trying to run cort-predict-raw with the following command:

python3.5 /usr/local/bin/cort-predict-raw -in ~/data/pilot_44_docs/*.txt
-model models/model-pair-train.obj
-extractor cort.coreference.approaches.mention_ranking.extract_substructures
-perceptron cort.coreference.approaches.mention_ranking.RankingPerceptron
-clusterer cort.coreference.clusterer.all_ante
-corenlp ~/systems/stanford/stanford-corenlp-full-2016-10-31

and got the following error message:

Traceback (most recent call last):
  File "/usr/local/bin/cort-predict-raw", line 136, in <module>
    doc.system_mentions = mention_extractor.extract_system_mentions(doc)
  File "/usr/local/lib/python3.5/dist-packages/cort/core/mention_extractor.py", line 36, in extract_system_mentions
    for span in __extract_system_mention_spans(document)]
  File "/usr/local/lib/python3.5/dist-packages/cort/core/mention_extractor.py", line 36, in <listcomp>
    for span in __extract_system_mention_spans(document)]
  File "/usr/local/lib/python3.5/dist-packages/cort/core/mentions.py", line 126, in from_document
    i, sentence_span = document.get_sentence_id_and_span(span)
TypeError: 'NoneType' object is not iterable
2017-04-27 09:17:06,058 WARNING Killing subprocess 14154
2017-04-27 09:17:06,395 INFO Subprocess seems to be stopped, exit code -9

It works without a problem with Python 2, though. I'm running this on Ubuntu 16.04.

smartschat · 2017-04-27T09:42:48Z

Can you isolate (and post) the document which causes the error message?

bennytieu · 2017-04-27T10:21:51Z

I have isolated it to this string:

Contact for company: Sven Svensson 212 584 5242
[email protected].

I'm guessing the sequence of numbers is at fault. Single numbers are OK; for example, other documents contain years like 2017 and are processed fine.

This example works:

Contact for company: Sven Svensson 584 5242
[email protected].

smartschat · 2017-04-27T11:41:32Z

I did some debugging: the first example is tokenized as ['Contact', 'for', 'company', ':', 'Sven', 'Svensson', '212Â\xa0584Â\xa05242', '[email protected]', '.'], so the whole phone number ends up in a single token. I suspect that the TypeError happens because some representation I rely on handles the numbers as individual tokens. I will not be able to fix this right now. Is using Python 2 an option for you?
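To illustrate the suspected mismatch, here is a minimal sketch, not cort's actual code. It assumes that a second representation re-splits the text on whitespace; `find_span` and `unpack_span` are hypothetical helpers, and the e-mail token is left out for brevity. In Python 3, `str.split()` treats U+00A0 (non-breaking space) as whitespace, so the single phone-number token becomes three pieces, the span lookup finds no match and returns None, and unpacking None raises a TypeError like the one in the traceback.

```python
# Tokens as reported in the thread (e-mail token omitted); the phone number
# is one token whose digit groups are joined by U+00A0.
tokens = ['Contact', 'for', 'company', ':', 'Sven', 'Svensson',
          '212Â\xa0584Â\xa05242', '.']

# Assumption: another representation re-splits the text on whitespace.
# Python 3's str.split() counts U+00A0 as whitespace, so the phone number
# is broken into three pieces here.
resplit = ' '.join(tokens).split()


def find_span(haystack, needle):
    """Hypothetical lookup: (start, end) of needle in haystack, else None."""
    n = len(needle)
    for start in range(len(haystack) - n + 1):
        if haystack[start:start + n] == needle:
            return start, start + n - 1
    return None


def unpack_span(result):
    """Mimics `i, sentence_span = ...`; raises TypeError if result is None."""
    start, end = result
    return start, end


# The three re-split pieces do not occur as separate tokens in `tokens`.
phone_pieces = resplit[6:9]
span = find_span(tokens, phone_pieces)
```

Under these assumptions the two views disagree on the number of tokens, and `unpack_span(span)` fails because `span` is None. Whether this is what actually happens inside `get_sentence_id_and_span` would need to be confirmed in the cort source.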

bennytieu · 2017-04-27T11:56:47Z

I will try to run on Python 2 in the meantime, or just skip this special case. I'm doing a study on efficiency, so running it with Python 3 would be preferable. Thank you for your quick reply!
