Create README.md
README.md
ADDED
@@ -0,0 +1,87 @@
---
language:
- en
tags:
- pytorch
- BERT
- token-classification
- nominalizations
datasets:
- kleinay/qanom
---

# Nominalization Detector

This model identifies "predicative nominalizations", that is, nominalizations that carry an eventive (or "verbal") meaning in context. It is a `bert-base-cased` pretrained model, fine-tuned for token classification on the "nominalization detection" task, as defined and annotated by the QANom project [(Klein et al., COLING 2020)](https://www.aclweb.org/anthology/2020.coling-main.274/).

## Task Description

The model is trained as a binary classifier over candidate nominalizations.
Candidates are extracted using a POS tagger (to select common nouns) together with lexical resources (e.g. WordNet and CatVar), keeping only nouns that have at least one derivationally related verb. In the QANom annotation project, these candidates are given to annotators, who decide whether each one carries a "verbal" meaning in the context of the sentence. The current model reproduces this binary classification.

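
The candidate filter described above can be illustrated with a minimal sketch. This is not the `qanom` implementation: the `TOY_DERIVATIONS` dictionary is a hypothetical stand-in for WordNet/CatVar lookups, the `extract_candidates` function and the `verb_forms` key are names invented for this illustration, and the input is assumed to be already POS-tagged with Penn Treebank tags. Lowercasing is used in place of real lemmatization.

```python
from typing import Any, Dict, List, Set, Tuple

# Hypothetical stand-in for WordNet/CatVar: noun -> derivationally related verbs.
TOY_DERIVATIONS: Dict[str, Set[str]] = {
    "construction": {"construct"},
    "building": {"build"},
    "beginning": {"begin"},
}


def extract_candidates(tagged_tokens: List[Tuple[str, str]]) -> List[Dict[str, Any]]:
    """Return common nouns that have at least one derivationally related verb."""
    candidates = []
    for idx, (token, pos) in enumerate(tagged_tokens):
        # Step 1: keep common nouns only (NN / NNS); proper nouns are skipped.
        if pos not in {"NN", "NNS"}:
            continue
        # Step 2: keep the noun only if the toy lexicon links it to a verb.
        verbs = TOY_DERIVATIONS.get(token.lower(), set())
        if verbs:
            candidates.append(
                {"predicate_idx": idx, "predicate": token, "verb_forms": sorted(verbs)}
            )
    return candidates


tagged = [("The", "DT"), ("construction", "NN"), ("of", "IN"),
          ("the", "DT"), ("table", "NN"), ("finished", "VBD"), (".", ".")]
print(extract_candidates(tagged))
```

In this sketch, "table" is a common noun but is dropped because the toy lexicon lists no related verb for it; only "construction" would be passed on to the classifier.
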
## Usage

The candidate extraction algorithm is implemented in the `qanom` package; see the README in the [QANom GitHub repo](https://github.com/kleinay/QANom) for full documentation. The `qanom` package is also available via `pip install qanom`.

For ease of use, we encapsulated the full nominalization detection pipeline (i.e. candidate extraction + predicate classification) in the `qanom.nominalization_detector.NominalizationDetector` class, which internally uses this `nominalization-candidate-classifier` model:

```python
from qanom.nominalization_detector import NominalizationDetector
detector = NominalizationDetector()

# The example sentence is pre-tokenized (tokens separated by spaces);
# `predicate_idx` in the output is an index into these tokens.
raw_sentences = ["The construction of the officer 's building finished right after the beginning of the destruction of the previous construction ."]

# Return every candidate, with its prediction and probability.
print(detector(raw_sentences, return_all_candidates=True))
# Return only candidates accepted at a 0.75 threshold, without the probability fields.
print(detector(raw_sentences, threshold=0.75, return_probability=False))
```

Outputs:
```json
[[{'predicate_idx': 1,
   'predicate': 'construction',
   'predicate_detector_prediction': True,
   'predicate_detector_probability': 0.7626778483390808,
   'verb_form': 'construct'},
  {'predicate_idx': 4,
   'predicate': 'officer',
   'predicate_detector_prediction': False,
   'predicate_detector_probability': 0.19832570850849152,
   'verb_form': 'officer'},
  {'predicate_idx': 6,
   'predicate': 'building',
   'predicate_detector_prediction': True,
   'predicate_detector_probability': 0.5794129371643066,
   'verb_form': 'build'},
  {'predicate_idx': 11,
   'predicate': 'beginning',
   'predicate_detector_prediction': True,
   'predicate_detector_probability': 0.8937646150588989,
   'verb_form': 'begin'},
  {'predicate_idx': 14,
   'predicate': 'destruction',
   'predicate_detector_prediction': True,
   'predicate_detector_probability': 0.8501205444335938,
   'verb_form': 'destruct'},
  {'predicate_idx': 18,
   'predicate': 'construction',
   'predicate_detector_prediction': True,
   'predicate_detector_probability': 0.7022264003753662,
   'verb_form': 'construct'}]]
```
```json
[[{'predicate_idx': 1, 'predicate': 'construction', 'verb_form': 'construct'},
  {'predicate_idx': 11, 'predicate': 'beginning', 'verb_form': 'begin'},
  {'predicate_idx': 14, 'predicate': 'destruction', 'verb_form': 'destruct'}]]
```

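
If all candidates have already been retrieved with their probabilities, a different cut-off can be applied afterwards without re-running the model. The sketch below works on plain dictionaries shaped like the example output above (probabilities rounded to four digits). The `filter_by_threshold` helper is a hypothetical name, and treating the threshold as an inclusive lower bound is an assumption; the package's own comparison may differ at the boundary.

```python
from typing import Any, Dict, List

# Candidates copied from the example output above (probabilities rounded).
candidates: List[Dict[str, Any]] = [
    {"predicate_idx": 1, "predicate": "construction", "predicate_detector_probability": 0.7627},
    {"predicate_idx": 4, "predicate": "officer", "predicate_detector_probability": 0.1983},
    {"predicate_idx": 6, "predicate": "building", "predicate_detector_probability": 0.5794},
    {"predicate_idx": 11, "predicate": "beginning", "predicate_detector_probability": 0.8938},
    {"predicate_idx": 14, "predicate": "destruction", "predicate_detector_probability": 0.8501},
    {"predicate_idx": 18, "predicate": "construction", "predicate_detector_probability": 0.7022},
]


def filter_by_threshold(cands: List[Dict[str, Any]], threshold: float) -> List[Dict[str, Any]]:
    """Keep candidates whose probability is at least `threshold` (assumed inclusive)."""
    return [c for c in cands if c["predicate_detector_probability"] >= threshold]


# Stricter cut-off: fewer, more confident predicates.
print([c["predicate"] for c in filter_by_threshold(candidates, 0.75)])
# Looser cut-off: more candidates are kept.
print([c["predicate"] for c in filter_by_threshold(candidates, 0.5)])
```

At 0.75 this keeps the same three predicates as the second example output (the tokens at indices 1, 11 and 14), which suggests the sketch mirrors what the `threshold` argument does for this sentence.
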
## Cite

```latex
@inproceedings{klein2020qanom,
  title={QANom: Question-Answer driven SRL for Nominalizations},
  author={Klein, Ayal and Mamou, Jonathan and Pyatkin, Valentina and Stepanov, Daniela and He, Hangfeng and Roth, Dan and Zettlemoyer, Luke and Dagan, Ido},
  booktitle={Proceedings of the 28th International Conference on Computational Linguistics},
  pages={3069--3083},
  year={2020}
}
```