kleinay committed on
Commit
c05f63a
·
1 Parent(s): b50e149

Create README.md

  1. README.md +87 -0
README.md ADDED
@@ -0,0 +1,87 @@
+ ---
+ language:
+ - en
+ tags:
+ - pytorch
+ - BERT
+ - token-classification
+ - nominalizations
+ datasets:
+ - kleinay/qanom
+ ---
+
+ # Nominalization Detector
+
+ This model identifies "predicative nominalizations", that is, nominalizations that carry an eventive (or "verbal") meaning in context. It is a `bert-base-cased` pretrained model, fine-tuned for token classification on the "nominalization detection" task as defined and annotated by the QANom project [(Klein et al., COLING 2020)](https://www.aclweb.org/anthology/2020.coling-main.274/).
+
+ ## Task Description
+
+ The model is trained as a binary classifier over candidate nominalizations.
+ Candidates are extracted in two steps: a POS tagger selects common nouns, and lexical resources (e.g. WordNet and CatVar) then keep only the nouns that have at least one derivationally related verb. In the QANom annotation project, annotators were given these candidates and decided whether each one carries a "verbal" meaning in the context of the sentence. The current model reproduces this binary classification.
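
The two-step candidate filter described above can be sketched in a few lines. This is a toy illustration only, not the `qanom` implementation: the POS tags are written by hand instead of coming from a tagger, and `TOY_LEXICON` is a hypothetical stand-in for the WordNet/CatVar lookup.

```python
# Toy illustration only: the real pipeline uses a POS tagger plus WordNet/CatVar.
# The tagged sentence and the tiny lexicon below are hand-written assumptions.

# Hypothetical noun -> derivationally related verbs mapping.
TOY_LEXICON = {
    "construction": ["construct"],
    "beginning": ["begin"],
    "destruction": ["destruct"],
}

COMMON_NOUN_TAGS = {"NN", "NNS"}  # Penn Treebank tags for common nouns


def extract_candidates(tagged_tokens, lexicon):
    """Return (index, token, verbs) for common nouns with at least one related verb."""
    candidates = []
    for idx, (token, pos) in enumerate(tagged_tokens):
        if pos not in COMMON_NOUN_TAGS:
            continue  # step 1: keep common nouns only
        verbs = lexicon.get(token.lower(), [])
        if verbs:  # step 2: keep nouns that have a related verb
            candidates.append((idx, token, verbs))
    return candidates


tagged = [
    ("The", "DT"), ("construction", "NN"), ("of", "IN"), ("the", "DT"),
    ("bridge", "NN"), ("began", "VBD"), (".", "."),
]
candidates = extract_candidates(tagged, TOY_LEXICON)
print(candidates)  # [(1, 'construction', ['construct'])]
```

Here "bridge" is a common noun but has no entry in the toy lexicon, so it is not a candidate. The classifier described in this card would then decide, for each remaining candidate, whether it is used with a verbal meaning in context.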
+
+ ## Usage
+
+ The candidate extraction algorithm is implemented in the `qanom` package; see the README in the [QANom GitHub repo](https://github.com/kleinay/QANom) for full documentation. The `qanom` package can be installed with `pip install qanom`.
+
+ For ease of use, we encapsulated the full nominalization detection pipeline (i.e. candidate extraction + predicate classification) in the `qanom.nominalization_detector.NominalizationDetector` class, which internally uses this `nominalization-candidate-classifier`:
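+
+ ```python
+ from qanom.nominalization_detector import NominalizationDetector
+
+ detector = NominalizationDetector()
+
+ # The sentence is pre-tokenized with spaces; in the outputs, `predicate_idx` matches these token positions.
+ raw_sentences = ["The construction of the officer 's building finished right after the beginning of the destruction of the previous construction ."]
+
+ # First call: return every candidate with its prediction and probability.
+ print(detector(raw_sentences, return_all_candidates=True))
+ # Second call: return only the predicted nominalizations at threshold 0.75, without probabilities.
+ print(detector(raw_sentences, threshold=0.75, return_probability=False))
+ ```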
+
+ Outputs:
+ ```python
+ [[{'predicate_idx': 1,
+ 'predicate': 'construction',
+ 'predicate_detector_prediction': True,
+ 'predicate_detector_probability': 0.7626778483390808,
+ 'verb_form': 'construct'},
+ {'predicate_idx': 4,
+ 'predicate': 'officer',
+ 'predicate_detector_prediction': False,
+ 'predicate_detector_probability': 0.19832570850849152,
+ 'verb_form': 'officer'},
+ {'predicate_idx': 6,
+ 'predicate': 'building',
+ 'predicate_detector_prediction': True,
+ 'predicate_detector_probability': 0.5794129371643066,
+ 'verb_form': 'build'},
+ {'predicate_idx': 11,
+ 'predicate': 'beginning',
+ 'predicate_detector_prediction': True,
+ 'predicate_detector_probability': 0.8937646150588989,
+ 'verb_form': 'begin'},
+ {'predicate_idx': 14,
+ 'predicate': 'destruction',
+ 'predicate_detector_prediction': True,
+ 'predicate_detector_probability': 0.8501205444335938,
+ 'verb_form': 'destruct'},
+ {'predicate_idx': 18,
+ 'predicate': 'construction',
+ 'predicate_detector_prediction': True,
+ 'predicate_detector_probability': 0.7022264003753662,
+ 'verb_form': 'construct'}]]
+ ```
+ ```python
+ [[{'predicate_idx': 1, 'predicate': 'construction', 'verb_form': 'construct'},
+ {'predicate_idx': 11, 'predicate': 'beginning', 'verb_form': 'begin'},
+ {'predicate_idx': 14, 'predicate': 'destruction', 'verb_form': 'destruct'}]]
+ ```
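
To see how the two outputs relate, the sketch below applies a probability cutoff to the candidates from the first output and keeps the same three fields as the second output. It is a plain-Python illustration, not code from the `qanom` package: `filter_by_threshold` is a hypothetical helper, and the `>=` comparison is an assumption (the example values are equally consistent with a strict `>`).

```python
# Candidates and probabilities copied from the first example output.
candidates = [
    {"predicate_idx": 1, "predicate": "construction",
     "predicate_detector_probability": 0.7626778483390808, "verb_form": "construct"},
    {"predicate_idx": 4, "predicate": "officer",
     "predicate_detector_probability": 0.19832570850849152, "verb_form": "officer"},
    {"predicate_idx": 6, "predicate": "building",
     "predicate_detector_probability": 0.5794129371643066, "verb_form": "build"},
    {"predicate_idx": 11, "predicate": "beginning",
     "predicate_detector_probability": 0.8937646150588989, "verb_form": "begin"},
    {"predicate_idx": 14, "predicate": "destruction",
     "predicate_detector_probability": 0.8501205444335938, "verb_form": "destruct"},
    {"predicate_idx": 18, "predicate": "construction",
     "predicate_detector_probability": 0.7022264003753662, "verb_form": "construct"},
]


def filter_by_threshold(candidates, threshold):
    """Hypothetical helper: keep candidates at or above the cutoff, dropping the probability."""
    kept = []
    for cand in candidates:
        # Assumption: a candidate is accepted when its probability is >= threshold.
        if cand["predicate_detector_probability"] >= threshold:
            kept.append({k: cand[k] for k in ("predicate_idx", "predicate", "verb_form")})
    return kept


kept = filter_by_threshold(candidates, threshold=0.75)
print([c["predicate_idx"] for c in kept])  # [1, 11, 14]
```

With a cutoff of 0.75, the second "construction" (0.70) and "building" (0.58) are dropped even though the first output marks them as predicted nominalizations, which matches the three entries in the second output.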
+
+ ## Cite
+
+ ```bibtex
+ @inproceedings{klein2020qanom,
+ title={QANom: Question-Answer driven SRL for Nominalizations},
+ author={Klein, Ayal and Mamou, Jonathan and Pyatkin, Valentina and Stepanov, Daniela and He, Hangfeng and Roth, Dan and Zettlemoyer, Luke and Dagan, Ido},
+ booktitle={Proceedings of the 28th International Conference on Computational Linguistics},
+ pages={3069--3083},
+ year={2020}
+ }
+ ```