LingSync database glossing conventions

Since we’re getting a lot of work done on the LingSync application* from the programming side of things, we decided it was high time we started using it the way it was intended to be used: as a database that makes it easier for us to share data and collaborate.

As we’ve started putting in data, we’ve been laying down some conventions for us to follow in the rest of the database. This blog seemed like a good place to discuss the conventions we’ve established, and an even better place to debate new conventions for areas we haven’t fully fleshed out yet (verbs…).

All of this information and more is also stored on the wiki page for LingSync glossing: http://wiki.migmaq.org/index.php?title=LingSync_Glosses

Our general guiding principles are as follows:

  • Gloss everything–no defaults!
    • This is mainly to make search easier and more intuitive. If we had, for instance, “animate” as the understood default and only glossed inanimate morphology as such, it would be very difficult to get a datalist of all animate words. Having no default glossing means that all our glosses will be very explicit and therefore easy to search.
    • This will be painful at the start, but once we have enough data in there, LingSync will autogloss and make our lives much easier! Hang in there.
  • When in doubt, don’t parse it out!
    • Only separate morphemes if you and a collaborator are completely, 100% sure that they are separable. Make sure you pass your theory by someone else’s eyes first, too!
    • Feel free to use dots frequently in your glosses. It is safer, generally speaking, to group morphemes (and later split them up) than it is to be over-enthusiastic about splitting them up (and later having to go back and re-group).
  • In general, be faithful to the surface/pronounced form when drawing morpheme boundaries (i.e., match the morpheme line to the utterance line as closely as possible).
    • Please use the Notes section to leave comments about phonology if you think there is a predictable process going on!
    • (One exception is the palatalization of ‘t’ at morpheme boundaries. Throughout LingSync we will assume that t -> j / _-i, so it is safe to have the utterance and morpheme lines different here.)
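
The last principle lends itself to a quick automated check. Below is a minimal sketch in Python, not anything LingSync itself provides: it assumes morphemes are separated by hyphens, applies the one agreed exception (t -> j / _-i), and compares the result to the utterance line. The function names and the forms "xat-i" / "xaji" are made up purely for illustration and are not real data.

```python
import re


def expected_surface(morpheme_line: str) -> str:
    """Derive the expected utterance line from a hyphen-segmented morpheme line."""
    # Palatalization: a morpheme-final t becomes j when the next morpheme starts with i.
    palatalized = re.sub(r"t-(?=i)", "j-", morpheme_line)
    # Drop the morpheme boundaries to get the surface string.
    return palatalized.replace("-", "")


def lines_match(utterance: str, morpheme_line: str) -> bool:
    """True if the morpheme line is faithful to the utterance line."""
    return utterance == expected_surface(morpheme_line)


# Hypothetical placeholder forms, not real data.
print(lines_match("xaji", "xat-i"))  # palatalization is allowed to differ
print(lines_match("xati", "xat-i"))  # any other mismatch gets flagged
```

Any entry that fails a check like this would be a good candidate for a comment in the Notes section.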

As far as specifics go, there is more information on the wiki page itself (WordPress hyperlinks seem broken, so here’s the address again: http://wiki.migmaq.org/index.php?title=LingSync_Glosses ).

Please use the comments to discuss…

  • Verbs! Since we are glossing with the maximal amount of information, including tense and mood (e.g. present indicative), where should we put this information? So far we’ve been sticking it onto the end of the root using dots (e.g. tli’ma-tis = tell.TA.PRES.IND-1SG>2SG). Any other suggestions for the placement of tense/aspect/mood?
  • Verbs part 2! How should we identify the difference between various evidentialities? And what about tense, aspect, mood? Right now we’re marking present indicative, imperative, future, evidential past, inferential… Open discussion in comments below!
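
To make the current verb convention concrete for the discussion, here is a small Python sketch of how a gloss like the one above breaks down: hyphens separate morphemes, and dots group several labels onto one morpheme. This is only an illustration under those assumptions, not how LingSync stores or searches data, and the helper names `align` and `has_label` are hypothetical.

```python
def align(morphemes: str, gloss: str):
    """Pair each hyphen-separated morpheme with its dot-separated gloss labels."""
    m_parts = morphemes.split("-")
    g_parts = gloss.split("-")
    if len(m_parts) != len(g_parts):
        raise ValueError("morpheme line and gloss line have different lengths")
    return [(m, g.split(".")) for m, g in zip(m_parts, g_parts)]


def has_label(gloss: str, label: str) -> bool:
    """True if the exact label appears anywhere in the gloss."""
    return any(label in part.split(".") for part in gloss.split("-"))


gloss = "tell.TA.PRES.IND-1SG>2SG"
for morpheme, labels in align("tli’ma-tis", gloss):
    print(morpheme, labels)

print(has_label(gloss, "PRES"))  # explicit label, so it can be searched for
print(has_label(gloss, "IN"))    # exact matching does not confuse IN with IND
```

Because every label is written out explicitly, a search for a tense or mood label only needs an exact match on one of the dot-separated pieces, wherever we end up deciding to place it.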

*(The LingSync extension can be found here: https://chrome.google.com/webstore/detail/lingsync/ocmdknddgpmjngkhcbcofoogkommjfoj )