LingSync database glossing conventions

Since we’re getting a lot of work done with the LingSync application* from the programming side of things, we decided it was high time we started using it the way it was intended to be used: as a database that makes it easier for us to share data and collaborate.

As we’ve started putting in data, we’ve been laying down some conventions for us to follow in the rest of the database. This blog seemed like a good place to discuss the conventions we’ve established, and an even better place to debate new conventions for areas we haven’t fully fleshed out yet (verbs…).

All of this information and more is also stored on the wiki page for LingSync glossing: http://wiki.migmaq.org/index.php?title=LingSync_Glosses

Our general guiding principles are as follows:

  • Gloss everything–no defaults!
    • This is mainly to make search easier and more intuitive. If we had, for instance, “animate” as the understood default animacy and only glossed inanimate morphology as such, it would be very difficult to get a datalist of all animate words. Having no default glossing means that all our glosses will be very explicit and therefore easy to search.
    • This will be painful at the start, but once we have enough data in there, LingSync will autogloss and make our lives much easier! Hang in there.
  • When in doubt, don’t parse it out!
    • Only separate morphemes if you and a collaborator are completely, 100% sure that they are separable. Make sure you pass your theory by someone else’s eyes first, too!
    • Feel free to use dots frequently in your glosses. It is safer, generally speaking, to group morphemes (and later split them up) than it is to be over-enthusiastic about splitting them up (and later having to go back and re-group).
  • In general, be faithful to the surface/pronounced form when drawing morpheme boundaries (i.e. match the morpheme line to the utterance line as closely as possible).
    • Please use the Notes section to leave comments about phonology if you think there is a predictable process going on!
    • (One exception is the palatalization of ‘t’ at morpheme boundaries. Throughout LingSync we will assume that t -> j / _-i, so it is safe to have the utterance and morpheme lines different here.)
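To make the palatalization exception concrete, here is a minimal sketch of how one might check that a morpheme line still matches its utterance line once t -> j / _-i is applied. This is not part of LingSync: the function names are hypothetical, the hyphen-as-boundary assumption comes from the examples in this post, and the test strings are invented placeholders rather than real Mi'gmaq words.

```python
def surface_from_morphemes(morpheme_line: str) -> str:
    """Join hyphen-separated morphemes into a surface form.

    Applies the one rule assumed in this post: a morpheme-final 't'
    becomes 'j' when the next morpheme starts with 'i' (t -> j / _-i).
    """
    words = []
    for word in morpheme_line.split():
        morphemes = word.split("-")
        pieces = []
        for idx, morpheme in enumerate(morphemes):
            following = morphemes[idx + 1] if idx + 1 < len(morphemes) else ""
            if morpheme.endswith("t") and following.startswith("i"):
                morpheme = morpheme[:-1] + "j"
            pieces.append(morpheme)
        words.append("".join(pieces))
    return " ".join(words)


def lines_match(utterance: str, morpheme_line: str) -> bool:
    """True if the morpheme line reproduces the utterance line."""
    return " ".join(utterance.split()) == surface_from_morphemes(morpheme_line)


# Invented placeholder strings, NOT real Mi'gmaq data.
print(lines_match("xaji", "xat-i"))  # palatalization accounts for the difference
print(lines_match("xata", "xat-a"))  # no 'i' follows, so nothing changes
```

Any mismatch that this rule does not explain would be a good candidate for a comment in the Notes section.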

As far as specifics go, there is more information on the wiki page itself (WordPress hyperlinks seem broken, so here’s the address again: http://wiki.migmaq.org/index.php?title=LingSync_Glosses ).

Please use the comments to discuss…

  • Verbs! Since we are glossing with the maximal amount of information, including tense and mood (e.g. present indicative), where should we put this information? So far we’ve been sticking it onto the end of the root using dots (e.g. tli’ma-tis = tell.TA.PRES.IND-1SG>2SG). Any other suggestions for the placement of tense/aspect/mood?
  • Verbs part 2! How should we identify the difference between various evidentialities? And what about tense, aspect, mood? Right now we’re marking present indicative, imperative, future, evidential past, inferentiali… Open discussion in comments below!
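Whatever we settle on for verbs, the dot-and-hyphen convention can be checked mechanically. Below is a rough sketch, assuming hyphens separate morphemes and dots separate the pieces of information glossed on a single morpheme, as in the example above. The function names are made up for illustration and are not LingSync's own API; the data are the forms quoted in this post and in the comment thread.

```python
def align(morpheme_line: str, gloss_line: str) -> list[tuple[str, list[str]]]:
    """Pair each morpheme with the list of tags in its gloss.

    Raises ValueError if the two lines have different numbers of
    hyphen-separated pieces, which usually signals a glossing slip.
    """
    morphemes = morpheme_line.split("-")
    glosses = gloss_line.split("-")
    if len(morphemes) != len(glosses):
        raise ValueError(
            f"{len(morphemes)} morphemes but {len(glosses)} glosses: "
            f"{morpheme_line!r} vs {gloss_line!r}"
        )
    return [(m, g.split(".")) for m, g in zip(morphemes, glosses)]


def has_tag(morpheme_line: str, gloss_line: str, tag: str) -> bool:
    """True if any morpheme in the entry carries the given gloss tag."""
    return any(tag in tags for _, tags in align(morpheme_line, gloss_line))


entries = [
    ("tli’ma-tis", "tell.TA.PRES.IND-1SG>2SG"),
    ("newte-’jit", "one-AN.SG"),
    ("asugoum-te’s-ijig", "six-CL-AN.PL"),
]

# Because animacy is always glossed explicitly, finding animate forms is a plain tag search.
animate = [m for m, g in entries if has_tag(m, g, "AN")]
print(animate)
```

This is also why the no-defaults principle pays off: a search for AN only works if animate morphology is never left unglossed.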

*(The LingSync extension can be found here: https://chrome.google.com/webstore/detail/lingsync/ocmdknddgpmjngkhcbcofoogkommjfoj )

This entry was posted in Linguistics, Mi'gmaq grammar, Mi'gmaq Online by Elise.

About Elise

Elise recently got her BA in Linguistics from McGill, having written her undergraduate thesis about Mi'gmaq possession. She spent the summer of 2012 working closely with the teachers at the Listuguj Education Centre, and learned a lot of the language! Now she's devoting her time to the various programs run through this blog, and working at the McGill Prosodylab.

4 thoughts on “LingSync database glossing conventions”

  1. Hi Elise! This is great! Could I also suggest that we discuss how to gloss classifiers (for counting) in the comments section? I feel this is an area we haven’t touched yet.

    • Yeah, we should definitely talk about that!

      I think that Alan was using something along the lines of:

      newte-’jit … asugoum-te’s-ijig
      one-AN.SG … six-CL-AN.PL
      one (animate being) … six (animate beings)

      Does this look good?

  2. I have a question about the suffix -a’pn for the 3>4.PST ending: is it just -a’pn or is it -a’pnn? If it is -a’pnn, would you gloss this ending as -3>4-PST-OBV? Thanks!
