iv
ABSTRACT
knowledge distillation (KD) model compression in visually-rich document layout
analysis (DLA) and classification.
Through empirical studies and methodological contributions, this dissertation
offers the following contributions and findings:
First, in a benchmarking study of established methods on real-world text
classification, we find that our novel hybrid method ‘Concrete Dropout
Ensemble’ performs best, enhancing in-domain calibration and novel class
detection, even at a smaller ensemble size. Detailed ablation experiments
reveal the impact of prior, neural architecture, and hyperparameter choices on
estimation quality.
Second, on a prototypical DU task, we identify challenges in DU progress
and propose a formalization of multipage document classification scenarios,
construct novel datasets, and conduct an experimental analysis showing
the promise of multipage representation learning and inference.
Third, we introduce DUDE, incorporating multifaceted challenges and principles
for a comprehensive evaluation of generic DU. Next to our own benchmarking,
we organize a competition, revealing that while newer document foundation
models show promise, they struggle with questions involving visual evidence or
complex reasoning. Moreover, we find severe problems in the ability of Large
Language Models (LLMs) to reason about documents in their entirety, highlighting
issues with hallucination, long-context reasoning and control.
Fourth, we propose the first methodology for enriching documents with semantic
layout structure using distilled DLA models. We apply KD to visual document
tasks, unraveling the influence of various task and architecture components.
Finally, the dissertation concludes with a discussion of the findings and
implications for future research, emphasizing the need for advancements in
multipage document representation learning and the importance of realistic
datasets and experimental methodologies to move measurably toward reliable
and robust IA-DU technology.