Please consider participating in the
CoNLL 2019 Shared Task on Cross-Framework Meaning Representation Parsing
(includes SDP, EDS, AMR and UCCA parsing)
Universal Conceptual Cognitive Annotation (UCCA) is a novel semantic approach to grammatical representation. It was developed in the Computational Linguistics Lab of the Hebrew University by Omri Abend and Ari Rappoport.
The central idea of the project is to analyze and annotate natural languages using purely semantic categories and structure (a graph). Syntactic categories and structure are not part of the manual annotation, and are ideally learned implicitly by the parsers. The basic set of semantic categories (the foundational layer) is inspired by work in linguistic typology, cognitive grammar, and neuroscience. The development of additional layers, such as semantic roles and super-senses (adapted from the CARMLS project), is underway.
The annotation has so far focused on argument-structure and linkage phenomena. We build primarily on Basic Linguistic Theory (R.M.W. Dixon, 2010a; 2010b; 2012), a widely used approach for language description. We acknowledge that there are many applicable analyses for a given sentence, but for practical reasons we select a small set of highly useful distinctions and apply them to provide one plausible annotation.
We have annotated 160K tokens from English Wikipedia with the UCCA scheme, as well as a 30K English-French parallel corpus based on Jules Verne's "20K Leagues Under The Sea", and a 120K-token corpus of the entire book in German. Pilot studies were conducted on several other languages as well.
This page contains links to all of UCCA's resources: corpora, annotation guidelines, parser and code. If you use these resources in your research, please cite the following or other relevant publications:
The UCCAApp web application for phrase-based annotation in general, and UCCA parsing in particular, can be found here.
Formally, it supports DAG structures, discontiguous units and multiple categories.
The app supports configurable multi-layer annotation and task management, and is written in Django and AngularJS. [Paper] [Demo] [Code]
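To make the three formal properties concrete (DAG structure, discontiguous units, multiple categories), here is a minimal Python sketch. It is only an illustration and not UCCAApp's actual data model: the `Unit` class, the `parent_counts` helper, the example sentence, and the analysis given to it are all assumptions, and the "Agent" label stands in for a hypothetical second-layer category.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """A node in the annotation graph (hypothetical, not UCCAApp's model)."""
    name: str
    # Outgoing edges: (set of category labels, child Unit or token index).
    edges: list = field(default_factory=list)

    def add(self, child, *categories):
        # A single edge may carry more than one category label.
        self.edges.append((frozenset(categories), child))
        return child

    def tokens(self):
        """Return the sorted token indices covered by this unit."""
        covered = set()
        for _, child in self.edges:
            if isinstance(child, int):
                covered.add(child)
            else:
                covered.update(child.tokens())
        return sorted(covered)

    def is_discontiguous(self):
        span = self.tokens()
        return bool(span) and span != list(range(span[0], span[-1] + 1))


def parent_counts(root):
    """Count incoming edges per unit; a count above 1 means a DAG, not a tree."""
    counts, seen, stack = {}, set(), [root]
    while stack:
        unit = stack.pop()
        if id(unit) in seen:
            continue
        seen.add(id(unit))
        for _, child in unit.edges:
            if isinstance(child, Unit):
                counts[child.name] = counts.get(child.name, 0) + 1
                stack.append(child)
    return counts


# Illustrative sentence: "John gave everything up and left"
#                          0    1      2       3   4    5
root = Unit("root")
scene1 = root.add(Unit("scene1"), "Parallel Scene")
root.add(4, "Linker")
scene2 = root.add(Unit("scene2"), "Parallel Scene")

john = Unit("John")
john.add(0, "Terminal")
# Multiple categories on one edge: a foundational-layer label plus a
# hypothetical second-layer label.
scene1.add(john, "Participant", "Agent")
# The same unit is also a child of scene2, so the graph is a DAG.
scene2.add(john, "Participant")

gave_up = scene1.add(Unit("gave_up"), "Process")
gave_up.add(1, "Terminal")
gave_up.add(3, "Terminal")  # tokens 1 and 3: a discontiguous unit

everything = scene1.add(Unit("everything"), "Participant")
everything.add(2, "Terminal")

left = scene2.add(Unit("left"), "Process")
left.add(5, "Terminal")

print(gave_up.tokens(), gave_up.is_discontiguous())
print(parent_counts(root)["John"])
```

In this sketch, a unit's span is derived from its terminals rather than stored as a start and end offset, which is what lets a unit such as "gave ... up" skip over intervening tokens. Reentrancy is just the same `Unit` object appearing as the child of two parents.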
Each UCCA-annotated corpus includes, in its repository, the version of the guidelines with which it was compiled. The most up-to-date guidelines are available here (the most recent one is generally in draft mode, but see releases): [pdf].
All corpora are publicly available under a Creative Commons Attribution-ShareAlike 3.0 Unported license. The guidelines with which each of them was annotated can be found in the repository.
Semantics-aware Attention Improves Neural Machine Translation.
Aviv Slobodkin, Leshem Choshen and Omri Abend. *SEM 2022 (long paper).
A Transition-Based Directed Acyclic Graph Parser for UCCA.
Daniel Hershcovich, Omri Abend and Ari Rappoport. ACL 2017 (long paper). Outstanding Paper Award.
[Paper: pdf] [Supp. Material: pdf] [Code & Data: github] [Demo]
Conceptual Annotations Preserve Structure Across Translations: A French-English Case Study
Elior Sulem, Omri Abend and Ari Rappoport,
ACL 2015 Workshop on Semantics-Driven Statistical Machine Translation (S2MT).
Measuring Semantic Preservation in Machine Translation with HCOMET: Human Cognitive Metric for Evaluating Translation
Pedro Marinotti, MSc Thesis,
The University of Edinburgh, 2014
Semi-supervised identification of scene-evoking nouns in UCCA
Amit Beka, MSc Thesis,
The Hebrew University of Jerusalem, 2013
Distinguishing Human Translations and Machine Outputs with UCCA
Michal Kessler, Lab Report,
The Hebrew University of Jerusalem, 2019