Quick auto-indent script for aligning overlaps in transcripts that use Du Bois's (2015) discourse transcription conventions
The goal is to turn text like this:
SPEAKER1; Oh [are] we about to [2overlap]?
SPEAKER2; [whoa, it] is [2overlap]
into text like this:
SPEAKER1; Oh [are] we about to [2overlap]?
SPEAKER2;    [whoa, it] is     [2overlap]
The Python version (elan-overlapper.py) requires Python 3 to be installed on your system. It can be run like this:
$ python elan-overlapper.py inputfile1.txt ...
Elan Overlapper assumes that export files are in plaintext of the sort generated by Elan's Traditional Transcript output setting (others may work, but no guarantees). Overlap is indicated as in Du Bois (2015), with brackets, like so:
SPEAKER1; Oh [are] we about to [2overlap]?
SPEAKER2; [whoa, it] is [2overlap]
Here, the number (or lack of number) is an index; [are] overlaps with [whoa, it], and [2overlap] with [2overlap].
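To make the bracket notation concrete, here is a minimal sketch of how the overlaps and their indexes could be read off a line. It is not taken from elan-overlapper.py; the names OVERLAP and find_overlaps are hypothetical, and the sketch assumes that brackets are never nested and that overlapped text never begins with a digit.

```python
import re

# One bracketed overlap: an optional numeric index, then the overlapped text.
# Assumption: no nested brackets, and the overlapped text does not start with a digit.
OVERLAP = re.compile(r"\[(\d*)([^\[\]]*)\]")


def find_overlaps(line):
    """Return a list of (index, start_column, text) for each overlap in a line.

    An unmarked bracket is reported with index None.
    """
    found = []
    for match in OVERLAP.finditer(line):
        index = int(match.group(1)) if match.group(1) else None
        found.append((index, match.start(), match.group(2)))
    return found


if __name__ == "__main__":
    for item in find_overlaps("SPEAKER1; Oh [are] we about to [2overlap]?"):
        print(item)
```

The start column is the piece of information an aligner needs: two brackets with the same index on neighbouring lines should end up at the same column.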
The script assumes that overlap indexes increase by 1 until there is a line where no overlap of any kind occurs. At this point, indexing resumes at 0 (unmarked).
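The reset rule matters because the same index is reused across different stretches of talk. The sketch below shows one way the alignment could work under that rule; it is an illustration, not the actual code of elan-overlapper.py. The function name align_overlaps is hypothetical. It only pads later lines with spaces, so it assumes the first bracket of each index is never to the right of where a later matching bracket already sits.

```python
import re

# An overlap bracket with an optional numeric index (assumes no nested brackets).
OVERLAP = re.compile(r"\[(\d*)[^\[\]]*\]")


def align_overlaps(lines):
    """Pad lines with spaces so brackets sharing an index start at the same column."""
    columns = {}  # index ("" if unmarked) -> column of its first "[" in the current run
    aligned = []
    for line in lines:
        if not OVERLAP.search(line):
            # A line with no overlap ends the run, so indexes may be reused afterwards.
            columns.clear()
            aligned.append(line)
            continue
        pos = 0
        while True:
            match = OVERLAP.search(line, pos)
            if match is None:
                break
            index, start = match.group(1), match.start()
            shift = 0
            if index in columns:
                # Later occurrence: push the bracket right to the recorded column.
                shift = max(columns[index] - start, 0)
                line = line[:start] + " " * shift + line[start:]
            else:
                # First occurrence in this run: remember where it starts.
                columns[index] = start
            pos = match.end() + shift
        aligned.append(line)
    return aligned


if __name__ == "__main__":
    example = [
        "SPEAKER1; Oh [are] we about to [2overlap]?",
        "SPEAKER2; [whoa, it] is [2overlap]",
    ]
    print("\n".join(align_overlaps(example)))
```

On the two example lines, this leaves the first line unchanged and pads the second so that its brackets start at the same columns as the matching brackets above.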