A Linear-Time Algorithm for Optimal Tree Sibling Partitioning and its Application to XML Data Stores

Kanne, Carl-Christian ; Moerkotte, Guido

TR_2006_006.pdf - Veröffentlichte Version

Download (248kB)

URL: http://ub-madoc.bib.uni-mannheim.de/1168
URN: urn:nbn:de:bsz:180-madoc-11689
Dokumenttyp: Arbeitspapier
Erscheinungsjahr: 2006
Sprache der Veröffentlichung: Englisch
Einrichtung: Fakultät für Wirtschaftsinformatik und Wirtschaftsmathematik > Sonstige - Fakultät für Mathematik und Informatik
MADOC-Schriftenreihe: Veröffentlichungen der Fakultät für Mathematik und Informatik > Institut für Informatik > Technical Reports
Fachgebiet: 004 Informatik
Normierte Schlagwörter (SWD): Datenbankmanagementsysteme , Baum <Mathematik> , Partition <Informatik> , Algorithmus
Freie Schlagwörter (Deutsch): tree partitioning
Freie Schlagwörter (Englisch): tree partitioning
Abstract: Document insertion into a native XML Data Store (XDS) requires to partition the document tree into a number of storage units with limited capacity, such as records on disk pages. As intra partition navigation is much faster than navigation between partitions, minimizing the number of partitions has a beneficial effect on query performance. We present a linear time algorithm to optimally partition an ordered, labeled, weighted tree such that each partition does not exceed a fixed weight limit. Whereas traditionally tree partitioning algorithms only allow child nodes to share a partition with their parent node (i.e. a partition corresponds to a subtree), our algorithm also considers partitions containing several subtrees as long as their roots are adjacent siblings. We call this sibling partitioning. Based on our study of the optimal algorithm, we further introduce two novel, near-optimal heuristics. They are easier to implement, do not need to hold the whole document instance in memory, and require much less runtime than the optimal algorithm. Finally, we provide an experimental study comparing our novel and existing algorithms. One important finding is that compared to partitioning that exclusively considers parent-child partitions, including sibling partitioning as well can decrease the total number of partitions by more than 90%, and improve query performance by more than a factor of two.
Zusätzliche Informationen:

Das Dokument wird vom Publikationsserver der Universitätsbibliothek Mannheim bereitgestellt.

+ Zitationsbeispiel und Export

Kanne, Carl-Christian und Moerkotte, Guido (2006) A Linear-Time Algorithm for Optimal Tree Sibling Partitioning and its Application to XML Data Stores. [Arbeitspapier]

+ Suche Autoren in

+ Download-Statistik

Downloads im letzten Jahr

Detailierte Angaben

Sie haben einen Fehler gefunden? Teilen Sie uns Ihren Korrekturwunsch bitte hier mit: E-Mail

Actions (login required)

Eintrag anzeigen Eintrag anzeigen