Tuesday, April 04, 2006
narrow transcriptions of CMD
i've seen two systems that attempt to do this (i'm sure there are more, and i'd love citations if anyone has them). both rely on video recordings of interactions which are later transcribed with a unique set of conventions: garcia and jacobs 1999, which captures very narrow detail about chatroom discourse but has a transcription style that is very awkward to read; and markman 2006, also of chat discourse, which is much simpler on the eyes but captures a broader level of detail than garcia and jacobs. adding to the bunch, here's my take on the whole transcription movement, a work in progress in transcribing instant message discourse:
and the video that it's based off of:
i know this is of interest to only a piece of this readership, so comment as necessary.
I haven’t heard of anyone doing this specific type of transcription before, either, but there have to be more people out there investigating these types of details of CMD interaction, or? Have you looked at multimodal discourse analysis and the transcription models used there?
In December, I did some work for the Open University in England, transcribing videotaped interactions in an audiographic platform used for a student project at the Department of Languages. Here I used an Excel file and numbered each turn, with the different modes numbered separately, after an idea that I got from Anna Vetter, a PhD student in France who is working with these types of transcripts. I chose to time stamp the audio (and pauses in the audio) and then I marked the other modes relative to this by inserting codes in the text showing when a turn in a new mode was initiated and completed. I found this provided a relatively good overview, but then, in order for this to work, one of the modes needs to be observable in real-time during production. Also, it doesn't provide the level of detail concerning gaps that you're after for any of the modes but the audio.
If you don't get any other comments on this post, at least you got a really long one from me :-) .
we talked earlier about the relationship between typing speed and judgments about intelligence. you've given us a methodology here with which to explore this relationship.
i don't like the way the timing of gaps is displayed in this transcript, but the only other way i've seen to capture them in detail involved a system where an exact timestamp of each action is provided - this was garcia and jacobs. so it told you exactly when each action or utterance was accomplished, but then you had to sit down and do the math to figure out the length of silences or the time it took to erase an utterance and then retype it. also, because i was transcribing meta messages like 'your buddy is typing' as well as actual talk, there needs to be a nice way to set both up visually without causing confusion. i'm looking for that middle ground there, and i'll find it, damn it. and i haven't looked at multi-modal DA for inspiration, actually. that's officially on my to-do list.
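to make the "do the math" step concrete, here's a minimal python sketch that turns a list of exact timestamps into gap lengths. the event list, its values, and the function names are all made up for illustration; this is not garcia and jacobs' system or format, just one way the arithmetic could be automated.

```python
# hypothetical event log: (timestamp, participant, action); all values invented
events = [
    ("00:00:01.2", "A", "starts typing"),
    ("00:00:04.7", "A", "sends: hey"),
    ("00:00:09.0", "B", "starts typing"),
    ("00:00:11.5", "B", "erases message"),
    ("00:00:15.5", "B", "sends: hi"),
]

def to_seconds(stamp):
    """Convert an 'HH:MM:SS.s' timestamp to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

def gaps(events):
    """Return (seconds since previous event, participant, action) per event."""
    result = []
    for (t0, _, _), (t1, who, action) in zip(events, events[1:]):
        elapsed = round(to_seconds(t1) - to_seconds(t0), 1)
        result.append((elapsed, who, action))
    return result

for elapsed, who, action in gaps(events):
    print(f"({elapsed}) {who}: {action}")
```

the elapsed time is printed in parentheses in front of each action, loosely in the spirit of timed pauses in CA transcripts, so the reader never has to subtract timestamps by hand.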
those two minutes of conversation took roughly seven hours to transcribe, though some of that time was spent figuring out transcription conventions and the like. so it was time-consuming and tedious, but not much more so than doing a narrow transcription of spoken discourse (which, admittedly, can take a good five hours to do two minutes of conversation). it makes the use of a logfile when it's suitable a whole lot more appealing, but i think some of us are going to have to toughen up and figure this kind of transcription out at some point (which, admittedly, puts the fear of god in me). this is a nice way to discuss typing speed, actually - i wonder if anyone else is considering that?
what i'd *really* like to see is a similar kind of narrow transcription used in discourse held through online games and the like, where interactions occur through the actions of avatars as well as written speech.
My transcripts were again modeled (in spirit anyway) after CA transcription, but I was not happy with the Garcia & Jacobs method either, as I am just naturally a more visual person, so I wanted something that would let me *see* the phenomena under study. The discourse being transcribed is small group chat, not IM, and I also captured and transcribed other actions on the participants' computer screens (though this could easily be left out). I chose a columnar view because I felt it best represented the phenomenon of chat, which (I think in contrast to IM) is not organized on a strict turn-exchange system, and of course, there are no "your buddy is typing" messages for participants. So my transcript gives the reader a view of the main chat window, where the interaction is taking place, and then side-by-side, each participant's screen activity/typing (this excerpt shows 4 participants). For mostly logistical reasons I chose to transcribe in one-second intervals, so it's not quite as precise for timing pauses/gaps, but you can see, both from the time stamp and from the physical layout, gaps and pauses in the conversation and in each person's activity.
Main drawback is that this is unwieldy, as you see I use 8.5x14" paper just to get four people and the chat window; the max number of people in my project at any given time was 6, and that's a pretty large transcript. So this would be hard to work with for large group chat (and that would be difficult data to capture in the first place). I tried aligning the participants vertically, but then it really didn't convey the same sense of time, at least not to me.
The transcription is rather labor intensive, especially as I have to sync up each person's movie to the chat timestamp, but I have gotten the process down to about 1 hour for 1 minute of 1 person's activity. So one minute of a 4-person conversation would be about 4 hours, depending on the level of activity.
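A rough sketch of this kind of columnar, one-second layout in Python may help readers picture it. Everything here is an assumption made for illustration (the event tuples, the column names `chat`, `P1`, `P2`, and the functions `build_grid` and `render`); it is not the procedure actually used to produce the transcripts described above.

```python
def build_grid(events, participants, duration):
    """Bin (seconds, column, text) events into one row per second.

    Columns are the shared chat window plus one per participant.
    """
    columns = ["chat"] + participants
    grid = [{c: "" for c in columns} for _ in range(duration)]
    for t, column, text in events:
        row = int(t)  # floor to the one-second interval
        if not 0 <= row < duration:
            continue  # ignore events outside the transcribed span
        cell = grid[row][column]
        grid[row][column] = f"{cell} / {text}" if cell else text
    return columns, grid

def render(columns, grid, width=16):
    """Lay the grid out as plain text, one line per second."""
    lines = ["time  " + "".join(c.ljust(width) for c in columns)]
    for sec, row in enumerate(grid):
        stamp = f"{sec // 60:02d}:{sec % 60:02d}"
        lines.append(stamp + " " + "".join(row[c].ljust(width) for c in columns))
    return "\n".join(lines)

# Invented sample data: two participants over seven seconds.
sample = [
    (0.4, "P1", "typing"),
    (2.1, "P1", "sends"),
    (2.1, "chat", "P1: hello"),
    (5.8, "P2", "typing"),
]
columns, grid = build_grid(sample, ["P1", "P2"], 7)
print(render(columns, grid))
```

Rows with no activity stay blank, so gaps show up as empty space down the page. The cost is width: each added participant adds a column, which is the same trade-off described for the paper version.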
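As a back-of-the-envelope check, the estimate above scales linearly with both the length of the recording and the number of participants. A minimal sketch, assuming the quoted rate of about 1 hour per person-minute (the function name and default rate are illustrative, and real effort will vary with the level of activity):

```python
def estimate_hours(minutes, participants, hours_per_person_minute=1.0):
    """Rough transcription effort: recording length x people x rate."""
    return minutes * participants * hours_per_person_minute

# One minute of a 4-person conversation at the quoted rate.
print(estimate_hours(1, 4))
# One minute with 6 participants, the largest group mentioned.
print(estimate_hours(1, 6))
```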
Anyway, I'm interested in comments and suggestions, etc!