This repository has been archived by the owner on Feb 15, 2024. It is now read-only.

Token/PostaggedToken/ChunkedToken has poor serialization support #25

Open
schmmd opened this issue Oct 11, 2013 · 3 comments

Comments

@schmmd
Member

schmmd commented Oct 11, 2013

There's half-completed code to serialize these as a tab-separated list of whitespace-separated token aspects, with one tab-delimited field per aspect. We also want a serialization that keeps all the aspects of each token together.

I@0/PRP/B-NP rode@5/VB/B-VP

vs.

I@0 rode@5  \t  PRP VB  \t  B-NP B-VP

In the tab format, should the offsets be separated from the tokens?

I rode  \t  0 5  \t  PRP VB  \t  B-NP B-VP
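
To make the comparison concrete, here is a minimal Python sketch of the three layouts above. It is only an illustration, not the project's code: the `ChunkedToken` fields and all function names are hypothetical, and it assumes a bare tab as the field separator (the examples above pad the `\t` with spaces for readability) and that token strings contain no whitespace.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChunkedToken:
    # Hypothetical field names; the real class may differ.
    string: str
    offset: int
    postag: str
    chunk: str


def write_inline(tokens: List[ChunkedToken]) -> str:
    """All aspects of a token together: I@0/PRP/B-NP rode@5/VB/B-VP"""
    return " ".join(f"{t.string}@{t.offset}/{t.postag}/{t.chunk}" for t in tokens)


def read_inline(line: str) -> List[ChunkedToken]:
    tokens = []
    for item in line.split(" "):
        # Split from the right so a '/' or '@' inside the token text survives.
        head, postag, chunk = item.rsplit("/", 2)
        string, offset = head.rsplit("@", 1)
        tokens.append(ChunkedToken(string, int(offset), postag, chunk))
    return tokens


def write_tabbed(tokens: List[ChunkedToken], separate_offsets: bool = False) -> str:
    """One tab-separated field per aspect, optionally with offsets in their own field."""
    if separate_offsets:
        fields = [" ".join(t.string for t in tokens),
                  " ".join(str(t.offset) for t in tokens)]
    else:
        fields = [" ".join(f"{t.string}@{t.offset}" for t in tokens)]
    fields.append(" ".join(t.postag for t in tokens))
    fields.append(" ".join(t.chunk for t in tokens))
    return "\t".join(fields)


def read_tabbed(line: str) -> List[ChunkedToken]:
    fields = [f.split(" ") for f in line.split("\t")]
    if len(fields) == 4:    # strings, offsets, postags, chunks
        strings, offsets, postags, chunks = fields
    elif len(fields) == 3:  # string@offset, postags, chunks
        heads, postags, chunks = fields
        pairs = [h.rsplit("@", 1) for h in heads]
        strings = [s for s, _ in pairs]
        offsets = [o for _, o in pairs]
    else:
        raise ValueError(f"expected 3 or 4 tab-separated fields, got {len(fields)}")
    if not (len(strings) == len(offsets) == len(postags) == len(chunks)):
        raise ValueError("aspect fields have different lengths")
    return [ChunkedToken(s, int(o), p, c)
            for s, o, p, c in zip(strings, offsets, postags, chunks)]
```

One thing the sketch makes visible: the tab format needs a length check across fields when reading, while the inline format cannot get out of alignment because each token carries its own aspects.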

@ghost assigned jgilme1 on Oct 11, 2013
@schmmd
Member Author

schmmd commented Oct 11, 2013

@jgilme1, @rbart I wrote down some notes from our meeting.

@jgilme1
Contributor

jgilme1 commented Oct 16, 2013

Use the Format objects from this example (483dfdc) to implement serialization on PostaggedToken/PosTagger and ChunkedToken/Chunker.
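
For readers without the commit at hand, the rough idea of a format object can be sketched as below. This is a guess at the general shape, not a reproduction of 483dfdc: it assumes a Format is simply a paired `write`/`read` that round-trips a value through a string, and `Format`, `PostaggedToken`, `postagged_format`, and `sentence_format` are hypothetical names written in Python for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

A = TypeVar("A")


@dataclass(frozen=True)
class Format(Generic[A]):
    # Assumed shape: a writer and a reader that are inverses of each other.
    write: Callable[[A], str]
    read: Callable[[str], A]


@dataclass(frozen=True)
class PostaggedToken:
    # Hypothetical field names; the real class may differ.
    string: str
    offset: int
    postag: str


def _read_postagged(s: str) -> PostaggedToken:
    # Split from the right so a '/' or '@' inside the token text survives.
    head, postag = s.rsplit("/", 1)
    string, offset = head.rsplit("@", 1)
    return PostaggedToken(string, int(offset), postag)


# Format for a single token, e.g. rode@5/VB
postagged_format = Format(
    write=lambda t: f"{t.string}@{t.offset}/{t.postag}",
    read=_read_postagged,
)


def sentence_format(token_format: Format[A]) -> Format[List[A]]:
    """Lift a per-token format to a whitespace-separated sequence of tokens."""
    return Format(
        write=lambda tokens: " ".join(token_format.write(t) for t in tokens),
        read=lambda line: [token_format.read(s) for s in line.split(" ")],
    )
```

With this shape, a tagger's output format can be built from the token format, e.g. `sentence_format(postagged_format)`, and a chunked-token format would compose the same way.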

@schmmd
Member Author

schmmd commented Oct 16, 2013

Btw, I pushed those changes to master: 5534ea4

@jgilme1 removed their assignment on May 26, 2015