Zdeněk Kasner

PhD Student


My research revolves around improving data-to-text generation systems.

The projects I have worked on include:

  • domain-independent and low-resource data-to-text generation [1], [2], [3],
  • evaluating semantic accuracy of generated text [1], [4], [5],
  • investigating the role of data labels [6],
  • software toolkit for data-to-text generation [7].

I focus on developing efficient representations of structured data, so that the data can be used as an input to pretrained language models for generating automated reports.

During my internship at Mila, I worked on applying LLMs to autonomous web navigation.

In the future, I would also like to delve deeper into model interpretability: how information is represented inside language models, how language models reason, and how all this relates to human cognition.

Selected publications

Beyond Reference-Based Metrics: Analyzing Behaviors of Open LLMs on Data-to-Text Generation

Zdeněk Kasner, Ondřej Dušek
2024, arXiv
Benchmarks for data-to-text generation are, well... not great. Most of them are small-scale, subject to data contamination, and slowly becoming saturated. But there is plenty of novel structured data online! In this paper, we make use of structured data scraped from open APIs and evaluate open LLMs (Llama2, Mistral, and Zephyr) on five data-to-text generation tasks, such as generating weather forecasts or product descriptions. Using both GPT-4 and human annotators, we find that the models' outputs contain plenty of semantic errors at the token level: on average, more than 80% of outputs contain at least one error.

WebLINX: Real-World Website Navigation with Multi-Turn Dialogue

Xing Han Lù, Zdeněk Kasner, Siva Reddy
2024, arXiv
Automating web navigation – that was the project I worked on during my internship at Mila. We started with the idea that fully autonomous agents controlling a web browser are unsafe and impractical. So instead, we built a dataset for training and evaluating conversational web navigation: giving the model instructions via chat. For our dataset, we recorded demonstrations with expert annotators and evaluated quite a lot of models of various kinds. Since multi-modal models are not pretrained on structured inputs (...such as websites), text-only LLMs turned out to be surprisingly good at the task.

TabGenie: A Toolkit for Table-to-Text Generation

Zdeněk Kasner, Ekaterina Garanina, Ondřej Plátek, Ondřej Dušek
ACL 2023 – Demo Track
We began developing this tool to play with generative language models in real time, but it soon evolved into a Swiss Army knife for table-to-text generation. TabGenie provides interactive data visualization, a unified data representation, and a unified programming interface for more than 15 data-to-text datasets. You can use the web interface as a dataset viewer and a model playground, while the programming interface lets you quickly prototype new models.

Mind the Labels: Describing Relations in Knowledge Graphs With Pretrained Models

Zdeněk Kasner, Ioannis Konstas, Ondřej Dušek
EACL 2023
What is a better way to learn the data semantics – memorizing an arbitrary mapping or taking the human-readable data labels into account? On the task of describing a triple (entity_1, relation, entity_2), we show that the models are able to describe previously unseen relations as long as the relation label is meaningful and unambiguous. To put it another way: if you want to train robust data-to-text systems, don't use abbreviations!

Neural Pipeline for Zero-Shot Data-to-Text Generation

Zdeněk Kasner, Ondřej Dušek
ACL 2022
To combine the power of pretrained language models with the controllability of pipeline approaches, we formulate data-to-text generation as a sequence of trainable text-to-text operations: ordering, aggregation, and paragraph compression. As a welcome side effect, we get rid of semantically incorrect outputs arising from noisy human-written references. Is an NL-only approach to data-to-text generation the way to go?

Text-in-Context: Token-Level Error Detection for Table-to-Text Generation

Best submission at Shared Task in Evaluating Accuracy
Zdeněk Kasner, Simon Mille, Ondřej Dušek
INLG 2021
How to automatically detect which parts of the generated text do not correspond to the data? In our submission for the Shared Task in Evaluating Accuracy 2021, we devise a 3-step approach combining a rule-based system with pretrained language models. Our approach was the best out of four submitted metrics!

Train Hard, Finetune Easy: Multilingual Denoising for RDF-to-Text Generation

2nd place in Russian RDF-to-text generation
Zdeněk Kasner, Ondřej Dušek
INLG 2020
Data == noisy text. That's definitely an overgeneralization, oversimplification... but um, it works! We successfully generate text from DBpedia data in English and Russian just by finetuning mBART – a pretrained multilingual denoising autoencoder. This is our submission for the WebNLG Challenge 2020, presented at the 3rd Workshop on Natural Language Generation from the Semantic Web.

Evaluating Semantic Accuracy of Data-to-Text Generation with Natural Language Inference

INLG 2020 Best Short Paper Award
Ondřej Dušek, Zdeněk Kasner
INLG 2020
Does a text contain all the information from the data? Hard to check manually, even harder to code. But what if we reformulate the question a little bit: can we infer all the data from the text and nothing else? That sounds like natural language inference. And guess what – somebody already pretrained a neural model for that!

Data-to-Text Generation with Iterative Text Editing

Zdeněk Kasner, Ondřej Dušek
INLG 2020
Imagine your task is to generate text from data. Which sounds easier: generating text from scratch or joining existing sentences? We propose an approach in which we iteratively join the sentences with a text-editing neural model. Since the model has a limited vocabulary, it also has limited opportunities to introduce incorrect facts. Moreover, it turns out that sentence fusion is quite a general task that works across multiple domains.

Improving Fluency of Non-Autoregressive Machine Translation

Zdeněk Kasner, Jindřich Libovický, Jindřich Helcl
2020, arXiv
In a follow-up to my master's thesis, we improve the translation quality of a CTC-based machine translation model. The model is non-autoregressive, i.e. faster but lagging behind autoregressive models in translation quality. To improve the translation quality, we re-score the hypotheses during beam search decoding with an n-gram language model and several other features.