SemCor
FrancisBond edited this page Aug 31, 2012 · 14 revisions
SC corpus sense annotation alignment
The SC corpus has now been automatically aligned with the SemCor sense annotations. The alignment process found realpred or gpred matches for 96.3% of SemCor word forms. The remaining word forms either map to elements that the ERG treats as semantically empty (e.g., copulas) or are treated as multiword expressions (MWEs) by the ERG but not by WordNet (‘such+as’, ‘right+then’, ‘not+even’).
The alignment program generated modified DMRS files, with an optional <sense> element:
<node nodeid='10002' cfrom='0' cto='6'>
<realpred lemma='first' pos='a' sense='1'/>
<sortinfo cvarsort='e' sf='prop' tense='untensed' mood='indicative' prog='minus' perf='minus'/>
<sense wn='2' lexsn='5:00:00:ordinal:00' wn_lemma='first'/>
</node>
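As a rough illustration of how the optional <sense> element might be consumed, here is a minimal Python sketch that parses the sample node above with the standard library. The function name `read_sense` and the shape of the returned dictionary are assumptions made for this example and are not part of the alignment program. Attribute values are returned as strings without interpretation.

```python
import xml.etree.ElementTree as ET

# The sample node from this page, copied verbatim.
SAMPLE_NODE = """\
<node nodeid='10002' cfrom='0' cto='6'>
<realpred lemma='first' pos='a' sense='1'/>
<sortinfo cvarsort='e' sf='prop' tense='untensed' mood='indicative' prog='minus' perf='minus'/>
<sense wn='2' lexsn='5:00:00:ordinal:00' wn_lemma='first'/>
</node>
"""


def read_sense(node):
    """Return the <sense> attributes of one DMRS <node>, or None.

    Hypothetical helper: <sense> is optional, so nodes without a
    SemCor alignment simply yield None.
    """
    sense = node.find("sense")
    if sense is None:
        return None
    return {
        "nodeid": node.get("nodeid"),
        # Character span of the node in the sentence (cfrom, cto).
        "span": (int(node.get("cfrom")), int(node.get("cto"))),
        "wn_lemma": sense.get("wn_lemma"),
        "wn": sense.get("wn"),
        "lexsn": sense.get("lexsn"),
    }


node = ET.fromstring(SAMPLE_NODE)
info = read_sense(node)
print(info)
```

For a full DMRS file, the same helper could be applied to every node found with `root.iter("node")`, skipping those for which it returns None.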
The sense-annotated DMRS output is available here
There is also an updated dmrs.dtd, along with SemCoreMapping.csv, which maps each SC corpus item to the annotated SemCor 3.0 concordance, context, and sentence number.