Tuesday, April 04, 2006

narrow transcriptions of CMD

as someone who's real into conversation analysis and similar kinds of micro-analyses of discourse, i've always been a little, well, discontent with the kinds of data used to carry out this type of research. a lot of it is done using logfiles, which fail to capture a whole mess of things like detailed measurements of silences, error correction done in the message composition process, and meta details of the interaction that don't show in the chat or message box (this can range from actions and gestures from avatars to temporary messages such as 'your buddy is typing'). the use of transcription methods similar to those used to transcribe spoken discourse can potentially fix some of these problems, though the conventions of those systems need to be almost completely scrapped if we want to transcribe synchronous online interactions.

i've seen two systems that attempt to do this (i'm sure there are more, and i'd love citations if anyone has them). both rely on video recordings of interactions which are later transcribed with a unique set of conventions: garcia and jacobs 1999, which captures very narrow detail about chatroom discourse but has a transcription style that is very awkward to read; and markman 2006, also of chat discourse, which is much simpler on the eyes but captures a broader level of detail than garcia and jacobs. adding to the bunch, here's my take on the whole transcription movement, a work in progress in transcribing instant message discourse:


and the video that it's based off of:


i know this is of interest to only a piece of this readership, so comment as necessary.

Well, I for one am interested in your attempts to capture this complex data in a single transcript :-) . To begin with, I agree that log files are not enough if you want to get the complete picture of what is influencing the patterns that you find, not least since information about whether the other person is typing is likely to have an important impact here (as you point out). I think your transcript succeeds in capturing the two levels of interaction that you’re interrelating, but I have to admit I had to look at it a few times to understand how the different gaps were timed (relative to what).

I haven’t heard of anyone doing this specific type of transcription before, either, but there have to be more people out there investigating these types of details of CMD interaction, right? Have you looked at multimodal discourse analysis and the transcription models used there?

In December, I did some work for the Open University in England, transcribing videotaped interactions in an audiographic platform used for a student project at the Department of Languages. Here I used an Excel file and numbered each turn, with the different modes numbered separately, after an idea that I got from Anna Vetter, a PhD student in France who is working with these types of transcripts. I chose to time stamp the audio (and pauses in the audio) and then I marked the other modes relative to this by inserting codes in the text showing when a turn in a new mode was initiated and completed. I found this provided a relatively good overview, but then, in order for this to work, one of the modes needs to be observable in real-time during production. Also, it doesn't provide the level of detail concerning gaps that you're after for any of the modes but the audio.

If you don't get any other comments on this post, at least you got a really long one from me :-) .
I'm just curious how long it took you to transcribe that 2 minute selection. I like your system. It would be sweet if you could sync the videos of both people communicating and have them side by side for analysis.

we talked earlier about the relationship between typing speed and judgments about intelligence. you've given us a methodology here with which to explore this relationship.
therese, you the bomb.

i don't like the way the timing of gaps is displayed in this transcript, but the only other way i've seen to capture them in detail involved a system where an exact timestamp of each action is provided - this was garcia and jacobs. so it told you exactly when each action or utterance was accomplished, but then you had to sit down and do the math to figure out the length of silences or the time it took to erase an utterance and then retype it. also, because i was transcribing meta messages like 'your buddy is typing' as well as actual talk, there needs to be a nice way to set both up visually without causing confusion. i'm looking for that middle ground there, and i'll find it, damn it. and i haven't looked at multi-modal DA for inspiration, actually. that's officially on my to-do list.
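for what it's worth, the "sit down and do the math" step can be sketched in a few lines of code. this is only a rough sketch, assuming each action has already been logged by hand with an exact timestamp; the `Event` fields, the `gaps` function, and the one-second threshold are all made up for illustration and are not taken from garcia and jacobs or from my own transcript.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One timestamped action (hypothetical format, for illustration only)."""
    t: float        # seconds from the start of the recording
    actor: str      # who did it
    kind: str       # e.g. "type", "erase", "send", "meta"
    text: str = ""  # what was typed, erased, sent, or displayed


def gaps(events, threshold=1.0):
    """Return (start, end, length) for each silence of at least `threshold` seconds.

    A silence here is simply the time between two consecutive events,
    whoever produced them.
    """
    ordered = sorted(events, key=lambda e: e.t)
    found = []
    for prev, nxt in zip(ordered, ordered[1:]):
        length = round(nxt.t - prev.t, 2)
        if length >= threshold:
            found.append((prev.t, nxt.t, length))
    return found


# invented example data, not from any real recording
events = [
    Event(0.0, "A", "type", "hi"),
    Event(0.8, "A", "send", "hi"),
    Event(4.3, "B", "meta", "B is typing"),
    Event(6.0, "B", "send", "hey"),
]

for start, end, length in gaps(events):
    print(f"({length}) silence from {start}s to {end}s")
```

keeping the meta messages in the same list as the actual talk means a silence gets cut short by 'your buddy is typing', which may or may not be what you want; filtering on `kind` before calling `gaps` would give you the other reading.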

those two minutes of conversation took roughly seven hours to transcribe, though some of that time was spent figuring out transcription conventions and the like. so it was time-consuming and tedious, but not much more so than doing a narrow transcription of spoken discourse (which, admittedly, can take a good five hours to do two minutes of conversation). it makes the use of a logfile when it's suitable a whole lot more appealing, but i think some of us are going to have to toughen up and figure this kind of transcription out at some point (which, admittedly, puts the fear of god in me). this is a nice way to discuss typing speed, actually - i wonder if anyone else is considering that?

what i'd *really* like to see is a similar kind of narrow transcription used in discourse held through online games and the like, where interactions occur through the actions of avatars as well as written speech.
Well, I feel so cool to have been cited in a blog! Woo hoo. But seriously, for anyone interested in microanalysis of CMD, this is a major issue. For anyone who doesn't want to slog through my entire dissertation, I put a little transcript excerpt here

My transcripts were again modeled (in spirit anyway) after CA transcription, but I was not happy with the Garcia & Jacobs method either, as I am just naturally a more visual person, so I wanted something that would let me *see* the phenomena under study. The discourse being transcribed is small group chat, not IM, and I also captured and transcribed other actions on the participants' computer screens (though this could easily be left out). I chose a columnar view because I felt it best represented the phenomenon of chat, which (I think in contrast to IM) is not organized on a strict turn-exchange system, and of course, there are no "your buddy is typing" messages for participants. So my transcript gives the reader a view of the main chat window, where the interaction is taking place, and then side-by-side, each participant's screen activity/typing (this excerpt shows 4 participants). For mostly logistical reasons I chose to transcribe in one-second intervals, so it's not quite as precise for timing pauses/gaps, but you can see gaps and pauses in the conversation and in each person's activity, both from the time stamp and from the physical layout.
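As a rough illustration of the columnar, one-second layout described above, here is a minimal sketch. It assumes the coding has already been done by hand and stored as (second, column, text) entries; the function name, the column width, and the sample entries are all hypothetical and are not taken from the actual transcripts.

```python
def columnar(entries, participants, seconds, width=14):
    """Lay out hand-coded entries as one text row per second.

    entries: list of (second, column, text), where column is "chat"
             or one of the participant names.
    Returns a list of strings: a header row, then one row per second.
    """
    columns = ["chat"] + list(participants)
    grid = {(s, c): "" for s in range(seconds) for c in columns}
    for sec, col, text in entries:
        # Several actions in the same second share one cell.
        grid[(sec, col)] = (grid[(sec, col)] + " " + text).strip()

    rows = [("time  " + "".join(c.ljust(width) for c in columns)).rstrip()]
    for s in range(seconds):
        # Cells are truncated so the columns stay aligned.
        cells = "".join(grid[(s, c)][: width - 1].ljust(width) for c in columns)
        rows.append((f"{s:02d}s   " + cells).rstrip())
    return rows


# Invented sample data for illustration only.
sample = [
    (0, "P1", "types: hi"),
    (1, "chat", "P1: hi"),
    (3, "P2", "types: hey"),
]

for row in columnar(sample, ["P1", "P2"], seconds=4):
    print(row)
```

A second in which nothing happens still gets its own (empty) row, which is what makes gaps visible in the physical layout as well as in the time stamps.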

Main drawback is that this is unwieldy: as you see, I use 8.5x14" paper just to get four people and the chat window. The max number of people in my project at any given time was 6, and that's a pretty large transcript. So this would be hard to work with for large group chat (and that would be difficult data to capture in the first place). I tried aligning the participants vertically, but then it really didn't convey the same sense of time, at least not to me.

The transcription is rather labor intensive, especially as I have to sync up each person's movie to the chat timestamp, but I have gotten the process down to about 1 hour for 1 minute of 1 person's activity. So one minute of a 4-person conversation would be about 4 hours, depending on the level of activity.

Anyway, I'm interested in comments and suggestions, etc!