
MatrixMrsTestSuiteTatoeba

LilingTan edited this page Jun 19, 2013 · 20 revisions

This is a small experiment in adding MRS test suite sentences to the Tatoeba corpus. Its primary purposes are to (a) verify the naturalness of the sentences and (b) bootstrap a crosslingual MRS corpus through crowdsourcing.

| no. | English | Tatoeba url |
| --- | --- | --- |
| 11 | It rained. | http://tatoeba.org/eng/sentences/show/2481815 |
| 21 | Abram barked. | http://tatoeba.org/eng/sentences/show/2481838 |
| 31 | The window opened. | http://tatoeba.org/eng/sentences/show/2481909 |
| 351 | The dog is barking. | http://tatoeba.org/eng/sentences/show/2482132 |
| 421 | The dog couldn't bark. | http://tatoeba.org/eng/sentences/show/2489590 |

Issues that came up when putting MRS test suite sentences into Tatoeba:

  1. @possible copyright violation: How are the MRS test suites licensed?
  2. This is not a good translation: Some supposedly grammatical sentences were deemed unnatural by Tatoeba's users. Should these sentences be corrected to ensure that our grammars are parsing the right sentences?
  3. Sentences are machine-translated.: Were the sentences in the MRS test suite translated by humans, or were they generated using LOGON?
  4. I cannot think of any situation where this would be used.: Sentences from the test suite are meant to validate the grammars' correctness, so it's understandable that some of them do not look useful to Tatoeba's users. Is there a page that says which phenomenon or phenomena each sentence captures? Are these phenomena language-dependent?