Project Description

Short DEscription of what were gonna do, in steps:

receive RTE6
parse structure and create an object/list
this might be done effortlessly with json.loads
enumerate on hypothesis and given sentences
for each input hypothesis and candidate sentence.
for each word in the sentence (or hypo, or both?) create a vector with the entailing words (hyponyms)
using wordnet and another resource
try to match each word in the hypo to a word in the sentence
this should be done by POS, ie. a noun in the hypo will only be matched to a noun in the sentence
the previous step creates a feature vector, which will be used to decide whether the sentence entails the hypo
this step can be done by guessing (with manual limits)
or, using a machine learning approche. this depends on Ido Dagan's answer...
mark V or X for each sentence
at the end grade ourselves against the given answers to determine recall and precision