I am trying to write software that has a video player and transcripts and keeps the two in sync.

I am having a problem with the transcripts at the moment. This is the code I am using to parse the XML file:

NodeList nodeParagraphs = root.getElementsByTagName("u");
NodeList nodeParagraphs2 = root.getElementsByTagName("internal-media");
for (int i = 0; i < nodeParagraphs.getLength(); i++) {
    Element nodeParagraph = (Element) nodeParagraphs.item(i);
    Element nodeParagraph2 = (Element) nodeParagraphs2.item(i);
    String id = nodeParagraph.getAttribute("uID");
    String who = nodeParagraph.getAttribute("who");
    String Time = nodeParagraph2.getAttribute("start");
    Paragraph p = new Paragraph(who, id, Time);

    NodeList wNodeList = nodeParagraph.getElementsByTagName("w");
    for (int j = 0; j < wNodeList.getLength(); j++) {
        Element wElem = (Element) wNodeList.item(j);
        String word = wElem.getTextContent();
        p.addWord(word);
    }
    chat.addParagraph(p);
}

The problem is that the transcripts are displayed at the wrong time, because each u section contains multiple internal-media tags. My code picks up all of them, when I only need the first one for each paragraph. An example is shown below:

<?xml version="1.0" encoding="UTF-8"?>

<CHAT xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns="http://www.talkbank.org/ns/talkbank"
      xsi:schemaLocation="http://www.talkbank.org/ns/talkbank http://talkbank.org/software/talkbank.xsd"
      Media="future" Mediatypes="video"
      PID="11312/t-00017262-1"
      Font="CAfont:13:0"
      Version="2.2.1"
      Lang="eng"
      Options="CA"
      Corpus="DaCapo"
      Date="1984-01-01">
  <Participants>
    <participant
      id="DAC"
    name="Dacapo_Leader"
      role="Adult"
      language="eng"

    />
    <participant
      id="MIC"
    name="Michael"
      role="Adult"
      language="eng"

    />
    <participant
      id="LUI"
    name="Luis"
      role="Adult"
      language="eng"

    />
    <participant
      id="NIN"
    name="Nina"
      role="Adult"
      language="eng"

    />
    <participant
      id="KEN"
      role="Adult"
      language="eng"

    />
    <participant
      id="JAK"
    name="Jakob"
      role="Adult"
      language="eng"

    />
    <participant
      id="XXX"
      role="Unidentified"
      language="eng"

    />
    <participant
      id="WOM"
    name="Dacapo_Woman"
      role="Adult"
      language="eng"

    />
  </Participants>
  <u who="KEN" uID="u0">
    <w>as</w>
    <w>it</w>
    <w>currently</w>
    <w>stands</w>
    <w>one</w>
    <w>of</w>
    <w>the</w>
    <w>things</w>
    <w>that</w>
    <w>people</w>
    <w>do</w>
    <internal-media
      start="0.000"
      end="2.520"
      unit="s"
    />
    <w>is</w>
    <w>create</w>
    <internal-media
      start="2.520"
      end="3.240"
      unit="s"
    />
    <w>one</w>
    <w>of</w>
    <w>the</w>
    <w>things</w>
    <w>that</w>
    <w>anthropologists</w>
    <w>design</w>
    <w>researchers</w>
    <w>do</w>
    <internal-media
      start="3.240"
      end="6.720"
      unit="s"
    />
    <w>is</w>
    <w>they</w>
    <w>create</w>
    <w>distance</w>
    <w>between</w>
    <w>business</w>
    <w>people</w>
    <internal-media
      start="6.720"
      end="9.160"
      unit="s"
    />
    <w>and</w>
    <w>uh</w>
    <t type="missing CA terminator"></t>
    <media
      start="9.160"
      end="11.200"
      unit="s"
    />
  </u>
  <u who="DAC" uID="u1">
    <w>participants</w>
    <t type="missing CA terminator"></t>
    <media
      start="11.200"
      end="11.800"
      unit="s"
    />
  </u>
Programmerr

1 Answer

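Your logic is wrong. First, you gather all <internal-media> elements from the entire document into one list. Second, you index into that list with the same variable you use for the <u> list, even though the two lists have different sizes, so the entries do not line up.

You need to build the "internal-media" list separately for each <u> element, from that element's descendants only. After that, just take the first item (index 0) from the list. Note that item(0) returns null when a <u> element has no <internal-media> at all, as in u1 of your sample, so check for null before calling getAttribute.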

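NodeList nodeParagraphs = root.getElementsByTagName("u");
for (int i = 0; i < nodeParagraphs.getLength(); i++) {
    Element nodeParagraph = (Element) nodeParagraphs.item(i);
    // Search only the descendants of the current <u> element.
    NodeList internalMediaList = nodeParagraph.getElementsByTagName("internal-media");
    // item(0) is null when this <u> has no <internal-media> (e.g. u1 in the sample).
    Element firstInternalMedia = (Element) internalMediaList.item(0);
    String time = (firstInternalMedia != null) ? firstInternalMedia.getAttribute("start") : "";
    // ... build the Paragraph and add its words as before ...
}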

And for the love of (whoever you worship), use meaningful, properly cased variable names.
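
For reference, here is a minimal self-contained sketch of the same idea that can be run on its own. The class name FirstStartTimes, the helper firstDescendant, and the fallback to the closing <media> element for a <u> without any <internal-media> are assumptions for illustration, not something taken from the question. The inline XML is a cut-down version of the sample in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class FirstStartTimes {

    // Returns one start time per <u>: the first <internal-media> if there is one,
    // otherwise the <media> element (assumed fallback), otherwise an empty string.
    static List<String> firstStartTimes(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse transcript XML", e);
        }

        List<String> times = new ArrayList<>();
        NodeList utterances = doc.getDocumentElement().getElementsByTagName("u");
        for (int i = 0; i < utterances.getLength(); i++) {
            Element utterance = (Element) utterances.item(i);
            // Look only inside the current <u>, never in the whole document.
            Element timing = firstDescendant(utterance, "internal-media");
            if (timing == null) {
                timing = firstDescendant(utterance, "media");
            }
            times.add(timing == null ? "" : timing.getAttribute("start"));
        }
        return times;
    }

    // First descendant with the given tag name, or null if there is none.
    static Element firstDescendant(Element parent, String tagName) {
        return (Element) parent.getElementsByTagName(tagName).item(0);
    }

    public static void main(String[] args) {
        String xml = "<CHAT>"
                + "<u who=\"KEN\" uID=\"u0\"><w>as</w>"
                + "<internal-media start=\"0.000\" end=\"2.520\" unit=\"s\"/>"
                + "<w>is</w>"
                + "<internal-media start=\"2.520\" end=\"3.240\" unit=\"s\"/>"
                + "<media start=\"9.160\" end=\"11.200\" unit=\"s\"/></u>"
                + "<u who=\"DAC\" uID=\"u1\"><w>participants</w>"
                + "<media start=\"11.200\" end=\"11.800\" unit=\"s\"/></u>"
                + "</CHAT>";
        System.out.println(firstStartTimes(xml)); // prints [0.000, 11.200]
    }
}
```

Whether falling back to <media> is right depends on how you want paragraphs without an <internal-media> to be timed; an empty string or skipping the paragraph would work just as well with the same null check.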

Sharon Ben Asher