R - XML of natural language corpus into dataframe
I am handling an XML file in R using the XML package. The final goal is to create a dataframe containing the following information:

| luwpos | luwdictionaryform | luwlemma | orthographictranscription | phonetictranscription | plainorthographictranscription | devoiced | moraid | toneclass | moraid |
|---|---|---|---|---|---|---|---|---|---|
| 動詞 | ダイスル | 題する | 題し | ダイシ | 題し | 1 | 3 | accent | 1 |

luwpos, luwdictionaryform and luwlemma are attributes of the luw node. orthographictranscription, phonetictranscription and plainorthographictranscription are attributes of suw, a daughter of luw. devoiced is an attribute of the phone node, a descendant of suw. The first moraid is an attribute of the mora node, the grandmother of phone. toneclass is an attribute of the node xjtobilabeltone, a descendant of phone. The second moraid comes from the closest ancestor (a mora node) of the xjtobilabeltone that contains toneclass="accent".

Not all phone nodes contain the attribute devoiced. In that case, I don't need the first moraid. When xjtobilabeltone does not contain toneclass="accent", I don't need the second moraid either.

So far, I have the following:

    library(XML)
    doc <- xmlInternalTreeParse(file = "a01f0122.xml")  # opens the file
    luw <- xpathSApply(doc, "//luw...
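For illustration, here is a minimal sketch of one way the dataframe described above could be assembled with relative XPath queries from each luw node. It rests on several assumptions that are not in the question: the tiny inline document is made up to mirror the described nesting, the `talk` root and the intermediate `phoneme` element are hypothetical, element and attribute names are lowercase as written in the post (the real corpus file may use different casing, and XPath is case-sensitive), the helper `first_attr` and the column names `devoiced_moraid` and `accent_moraid` are invented here, and only the first match per luw is kept.

```r
library(XML)

# Hypothetical miniature document mirroring the nesting described in the question.
# "talk" and "phoneme" are assumed names; the real file may differ.
xml_text <- '<talk>
  <luw luwpos="動詞" luwdictionaryform="ダイスル" luwlemma="題する">
    <suw orthographictranscription="題し" phonetictranscription="ダイシ" plainorthographictranscription="題し">
      <mora moraid="1"><phoneme><phone><xjtobilabeltone toneclass="accent"/></phone></phoneme></mora>
      <mora moraid="2"><phoneme><phone/></phoneme></mora>
      <mora moraid="3"><phoneme><phone devoiced="1"/></phoneme></mora>
    </suw>
  </luw>
</talk>'

doc <- xmlParse(xml_text, asText = TRUE, encoding = "UTF-8")

# Hypothetical helper: attribute of the first node matching a relative XPath,
# or NA when nothing matches (e.g. no devoiced phone, no accent tone).
first_attr <- function(node, path, attr) {
  hits <- getNodeSet(node, path)
  if (length(hits) == 0) NA_character_ else xmlGetAttr(hits[[1]], attr, NA_character_)
}

accent_tone <- ".//xjtobilabeltone[@toneclass='accent']"

# One row per luw; only the first suw / first match is used (an assumption).
rows <- lapply(getNodeSet(doc, "//luw"), function(luw) {
  data.frame(
    luwpos            = xmlGetAttr(luw, "luwpos", NA_character_),
    luwdictionaryform = xmlGetAttr(luw, "luwdictionaryform", NA_character_),
    luwlemma          = xmlGetAttr(luw, "luwlemma", NA_character_),
    orthographictranscription      = first_attr(luw, "./suw", "orthographictranscription"),
    phonetictranscription          = first_attr(luw, "./suw", "phonetictranscription"),
    plainorthographictranscription = first_attr(luw, "./suw", "plainorthographictranscription"),
    devoiced        = first_attr(luw, ".//phone[@devoiced]", "devoiced"),
    # mora that contains a devoiced phone
    devoiced_moraid = first_attr(luw, ".//mora[.//phone[@devoiced]]", "moraid"),
    toneclass       = first_attr(luw, accent_tone, "toneclass"),
    # nearest mora ancestor of the accent tone label
    accent_moraid   = first_attr(luw, paste0(accent_tone, "/ancestor::mora[1]"), "moraid"),
    stringsAsFactors = FALSE
  )
})

result <- do.call(rbind, rows)
print(result)
```

Because missing matches come back as NA, a luw without a devoiced phone or without an accent label would simply get NA in the corresponding columns rather than dropping the row. All values are returned as character strings; converting the moraid columns with `as.integer` would be a separate step if numbers are wanted.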