I just got back from the Chicago Colloquium on Digital Humanities and Computer Science (November 21-22), where I presented the paper “Language Preservation: A case study in collecting and digitizing machine-tractable language data.” The paper describes work I have done with Jim Cowie and Steve Helmreich of New Mexico State University on collecting resources for lesser-studied languages. It reports on our work with the Paraguayan indigenous language Guarani, and Uighur, an Altaic Turkic language spoken in the Xinjiang province of China.
Author Archives: raz
THATCamp Chicago
I was an invited participant at THATCamp Chicago (The Humanities and Technology Camp), “a user-generated unconference where humanists and technologists work together for the common good,” which was held on November 20th. I participated in a number of great sessions. Of particular interest to me was the GeoTools/GIS session. Jo Guldi, a historian at Harvard, was interested in what she calls ‘geo-parsing’: identifying place names in text. She is interested in detecting subaltern agency in Britain by analyzing books published between 1848 and 1919. It sounds like a fun named entity extraction task, and I volunteered to help her. I also attended sessions on Git and XML/TEI.
topiCS paper – revised
Jeanette Gundel, Nancy Hedberg, and I just finished a revision of our paper, Underspecification of Cognitive Status in Reference Production: Some Empirical Predictions, and resubmitted it to the journal Topics in Cognitive Science.
Abstract submitted: Language Preservation
My colleagues (Jim Cowie and Steve Helmreich of New Mexico State University) and I just submitted a paper titled “Language Preservation: A case study in collecting and digitizing machine-tractable language data” to the Chicago Colloquium. The abstract is:
In this paper we describe a process for collecting and digitizing machine-tractable resources for lesser-studied languages. We illustrate this process by using examples from the Paraguayan indigenous language Guarani, and Uighur, an Altaic Turkic language spoken in the Xinjiang province of China. By ‘machine-tractable’ we mean that in addition to being readable by people, the resource can also be processed by a computational tool. Our goal in acquiring these resources is to use them for quick ramp-up machine translation. These resources are also useful to scholars who are studying these particular languages. Continue reading
Paper accepted to the journal topiCS
Jeanette Gundel (University of Minnesota), Nancy Hedberg (Simon Fraser University) and I just had our paper, Underspecification of Cognitive Status in Reference Production: Some Empirical Predictions, accepted for publication in the Cognitive Science Society journal, Topics in Cognitive Science. To quote Nancy: “Hallelujia!!! … I am ecstatic!!!” That mirrors my feelings. I am grateful to the reviewers for their wonderful comments. Now there is a moderate amount of work to do to address the reviewers’ comments. Here is the abstract. Continue reading
My research mentioned in novel
For over 20 years I have been collaborating with Jeanette Gundel and Nancy Hedberg on research focusing on referring expressions. As part of this research we propose something we term the Givenness Hierarchy, a set of cognitive statuses that are on an implicational scale. Oddly enough, this research is mentioned in the just-published novel Starting from Scratch by Susan Gilbert-Collins. Continue reading