diff --git "a/index/docstore.json" "b/index/docstore.json" --- "a/index/docstore.json" +++ "b/index/docstore.json" @@ -1 +1 @@ -{"docstore/data": {"3e9bf844-0a4e-4de1-8be3-8a00f47f9be1": {"__data__": {"id_": "3e9bf844-0a4e-4de1-8be3-8a00f47f9be1", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "13560595-482a-49aa-a22e-445536e10517", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "e17811e7213cf8210798fac0f1cdd533bbdb0c93d1d79ee941cc306fb0e97a7f", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "d950eb15-82e3-4c1c-b8bb-d5a7249aadae", "node_type": "1", "metadata": {}, "hash": "9361f3141db96314f35b9d7266cfa9e85487d4f1f7800286de2823bf579c157f", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "arXiv:2105.07464v6 [cs.CL] 1 Sep 2021\n\n FEW-NERD: A Few-shot Named Entity Recognition Dataset\n Ning Ding1, Pengjun Xie2, Hai-Tao Zheng33\u2217, Xiaobin Wang2,\n Xu Han1,3\u2217 , Guangwei Xu2\u2217, Yulin Chen\u2020, Zhiyuan Liu1\u2020\n 1Department of Computer Science and Technology, Tsinghua University\n 2Alibaba Group, 3Shenzhen International Graduate School, Tsinghua University\n {dingn18, yl-chen17, hanxu17}@mails.tsinghua.edu.cn\n {kunka.xgw, xuanjie.wxb, chengchen.xpj}@alibaba-inc.com\n {zheng.haitao}@sz.tsinghua.edu.cn, {liuzy}@tsinghua.edu.cn\n https://ningding97.github.io/fewnerd/\n\n Abstract\n Recently, considerable literature has grown up\n\n\n around the theme of few-shot named entity\n recognition (NER), but little published bench-\n\n mark data specifically focused on the practical\n and challenging task. Current approaches col-\n lect existing supervised NER datasets and re-\n organize them into the few-shot setting for em-\n pirical study. These strategies conventionally\n aim to recognize coarse-grained entity types\n\n\n with few examples, while in practice, most\n unseen entity types are fine-grained. In this\n\n paper, we present FEW-NERD, a large-scale\n\n\n human-annotated few-shot NER dataset with\n\n a hierarchy of 8 coarse-grained and 66 fine-\n grained entity types. FEW-NERD consists of\n 188,238 sentences from Wikipedia, 4,601,160\n words are included and each is annotated as\n context or a part of a two-level entity type.\n To the best of our knowledge, this is the first\n few-shot NER dataset and the largest human-\n crafted NER dataset. We construct bench-\n mark tasks with different emphases to com-\n prehensively assess the generalization capabil-\n ity of models. 
Extensive empirical results and\n analysis show that FEW-NERD is challenging and the problem requires further research.\n ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 2789, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "d950eb15-82e3-4c1c-b8bb-d5a7249aadae": {"__data__": {"id_": "d950eb15-82e3-4c1c-b8bb-d5a7249aadae", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "13560595-482a-49aa-a22e-445536e10517", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "e17811e7213cf8210798fac0f1cdd533bbdb0c93d1d79ee941cc306fb0e97a7f", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "3e9bf844-0a4e-4de1-8be3-8a00f47f9be1", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "576796b57e689b22de15d34b82d5053b9d5bdf4a99467e23748720950f46b830", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "We make FEW-NERD public at https://ningding97.github.io/fewnerd/.1\n\n1 Introduction\nNamed entity recognition (NER), as a fundamental task in information extraction, aims to locate and classify named entities from unstructured natural language. A considerable number of approaches equipped with deep neural networks have shown promising performance (Chiu and Nichols, 2016) on fully supervised NER. Notably, pre-trained language models (e.g., BERT (Devlin et al., 2019a))\n \u2217 equal contributions\n \u2020 corresponding authors\n 1The baselines are available at https://github.com/thunlp/Few-NERD\n [Figure 1: An overview of FEW-NERD. The inner circle represents the coarse-grained entity types and the outer circle represents the fine-grained entity types; some types are denoted by abbreviations.]\nwith an additional classifier achieve significant success on this task and gradually become the base paradigm. 
Such studies demonstrate that deep models can yield remarkable results when accompanied by large amounts of annotated corpora.\n With the emergence of knowledge from various domains, named entities, especially ones that need professional knowledge to understand, are difficult to annotate manually on a large scale. Under this circumstance, studying NER systems that could learn unseen entity types with few examples, i.e., few-shot NER, plays a critical role in this area. There is a growing body of literature that recognizes the importance of few-shot NER and contributes to the task (Hofer et al., 2018; Fritzler et al., 2019; Yang and Katiyar, 2020; Li et al., 2020a; Huang et al., 2020). Unfortunately, there is still no dataset specifically designed for", "mimetype": "text/plain", "start_char_idx": 2789, "end_char_idx": 6880, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "89d6be11-da41-4dd7-899f-1340a92c4cd2": {"__data__": {"id_": "89d6be11-da41-4dd7-899f-1340a92c4cd2", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "19de4927-f29a-4b3f-8694-79ef720fc706", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "c9801241be9f2302d87263774a128abb94c2e9fa41f952966c8a9af0d2a7e86e", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "3a410009-58ad-4f35-9627-bfaa50dd56d8", "node_type": "1", "metadata": {}, "hash": "4c97d26a250d3d4ce5e994115b6ac715f8f24719ea9e0fae9bb36e126a07a358", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "few-shot NER. Hence, these methods collect previously proposed supervised NER datasets and reorganize them into a few-shot setting. Common options of datasets include OntoNotes (Weischedel et al., 2013), CoNLL\u201903 (Tjong Kim Sang, 2002), WNUT\u201917 (Derczynski et al., 2017), etc. These research efforts of few-shot learning for named entities mainly face two challenges: First, most datasets used for few-shot learning have only 4-18 coarse-grained entity types, making it hard to construct an adequate variety of \u201cN-way\u201d meta-tasks and learn correlation features. And in reality, we observe that most unseen entities are fine-grained. Second, because of the lack of benchmark datasets, the settings of different works are inconsistent (Huang et al., 2020; Yang and Katiyar, 2020), leading to unclear comparisons. 
To sum up, these methods make promising contributions to few-shot NER; nevertheless, a dedicated dataset is urgently needed to provide a unified benchmark for rigorous comparisons.\n\n To alleviate the above challenges, we present FEW-NERD, a large-scale human-annotated few-shot NER dataset, which consists of 188.2k sentences extracted from Wikipedia articles, with 491.7k entities manually annotated by well-trained annotators (Section 4.3). To the best of our knowledge, FEW-NERD is the first dataset specially constructed for few-shot NER and also one of the largest human-annotated NER datasets (statistics in Section 5.1). We carefully design an annotation schema of 8 coarse-grained entity types and 66 fine-grained entity types by conducting several pre-annotation rounds (Section 4.1). In contrast, among the most widely-used NER datasets, CoNLL has 4 entity types, WNUT\u201917 has 6 entity types and OntoNotes has 18 entity types (7 of them are value types). The variety of entity types makes FEW-NERD contain rich contextual features with a finer granularity for better evaluation of few-shot NER. The distribution of the entity types in FEW-NERD is shown in Figure 1; more details are reported in Section 5.1. We conduct an analysis of the mutual similarities among all the entity types of FEW-NERD to study knowledge transfer (Section 5.2). The results show that our dataset can provide sufficient correlation information between different entity types for few-shot learning.\n For benchmark settings, we design three tasks on the basis of FEW-NERD, including a standard supervised task (FEW-NERD (SUP)) and two few-shot tasks (FEW-NERD (INTRA) and FEW-NERD (INTER)); for more details see Section 6. FEW-NERD (SUP), FEW-NERD (INTRA), and FEW-NERD (INTER) assess instance-level generalization, type-level generalization and knowledge transfer of NER methods, respectively. We implement models based on the recent state-of-the-art approaches and evaluate them on FEW-NERD (Section 7). 
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 2862, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "3a410009-58ad-4f35-9627-bfaa50dd56d8": {"__data__": {"id_": "3a410009-58ad-4f35-9627-bfaa50dd56d8", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "19de4927-f29a-4b3f-8694-79ef720fc706", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "c9801241be9f2302d87263774a128abb94c2e9fa41f952966c8a9af0d2a7e86e", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "89d6be11-da41-4dd7-899f-1340a92c4cd2", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "cb4d43621c35c5b8f1ff16218a1e21d6f654171eea545a23bef5b185dcae6c5d", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "08a4de1b-58e0-4975-a68c-99b215ddca75", "node_type": "1", "metadata": {}, "hash": "14e4f82f60f388fa669cf79d4ab333062b043b8a018d325a887feee503968e1d", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "And empirical results show that\n FEW-NERD is challenging on all these three set-\n tings. We also conduct sets of subsidiary experi-\n ments to analyze promising directions of few-shot\n NER. 
", "mimetype": "text/plain", "start_char_idx": 2862, "end_char_idx": 3051, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "08a4de1b-58e0-4975-a68c-99b215ddca75": {"__data__": {"id_": "08a4de1b-58e0-4975-a68c-99b215ddca75", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "19de4927-f29a-4b3f-8694-79ef720fc706", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "c9801241be9f2302d87263774a128abb94c2e9fa41f952966c8a9af0d2a7e86e", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "3a410009-58ad-4f35-9627-bfaa50dd56d8", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "973f10c7404cc1f6eb4b7419ecb9f9b65f7f4a1960933a0d9e77ae416d60afb4", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Hopefully, the research of few-shot NER\n could be further facilitated by FEW-NERD.\n\n 2 Related Work\n\nAs a pivotal task of information extraction, NER\n is essential for a wide range of technologies (Cui\n et al., 2017; Li et al., 2019b; Ding et al., 2019; Shen\n et al., 2020). And a considerable number of NER\n datasets have been proposed over the years. For\n example, CoNLL\u201903 (Tjong Kim Sang, 2002) is re-\n garded as one of the most popular datasets, which is\n curated from Reuters News and includes 4 coarse-\n grained entity types. Subsequently, a series of NER\n datasets from various domains are proposed (Bala-\n suriya et al., 2009; Ritter et al., 2011; Weischedel\n et al., 2013; Stubbs and Uzuner, 2015; Derczynski\n et al., 2017). These datasets formulate a sequence\n labeling task and most of them contain 4-18 entity\n types. Among them, due to the high quality and\n size, OntoNotes 5.0 (Weischedel et al., 2013) is\n considered as one of the most widely used NER\n datasets recently.\n As approaches equipped with deep neural net-\nworks have shown satisfactory performance on\n NER with sufficient supervision (Lample et al.,\n 2016; Ma and Hovy, 2016), few-shot NER has\n received increasing attention (Hofer et al., 2018;\n Fritzler et al., 2019; Yang and Katiyar, 2020; Li\n et al., 2020a). Few-shot NER is a considerably\n challenging and practical problem that could facil-\n itate the understanding of textual knowledge for\n neural model (Huang et al., 2020). Due to the lack\n of specific benchmarks of few-shot NER, current\n methods collect existing NER datasets and use dif-\n ferent few-shot settings. To provide a benchmark\n that could comprehensively assess the generaliza-\n tion of models under few examples, we annotate\n FEW-NERD. 
To make the dataset practical and close to reality, we adopt a fine-grained schema of", "mimetype": "text/plain", "start_char_idx": 3051, "end_char_idx": 4882, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "703bb83a-4aea-4eb3-85a6-086d25555ccb": {"__data__": {"id_": "703bb83a-4aea-4eb3-85a6-086d25555ccb", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "7a282a7e-fbe9-4d98-a3ca-17e66c5b5818", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "9ff787cc5d317b04ad7ea97478454a58f4d9d955a18ef033743f48266cabc9ed", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "f6c2c3db-ba3c-489e-9459-e6b4f579286b", "node_type": "1", "metadata": {}, "hash": "18f27a3e388a6377a52748c1e10ebabcd503f8e045e24ab7b9b845f67d9291ee", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "entity annotation, which is inspired and modified from previous fine-grained entity recognition studies (Ling and Weld, 2012; Gillick et al., 2014; Choi et al., 2018; Ringland et al., 2019).\n\n3 Problem Formulation\n\n3.1 Named Entity Recognition\nNER is normally formulated as a sequence labeling problem. Specifically, for an input sequence of tokens x = {x1, x2, ..., xt}, NER aims to assign each token xi a label yi \u2208 Y to indicate either that the token is a part of a named entity (such as Person, Organization, Location) or that it does not belong to any entity (denoted as the O class), Y being a set of pre-defined entity types.\n\n3.2 Few-shot Named Entity Recognition\nN-way K-shot learning is conducted by iteratively constructing episodes. For each episode in training, N classes (N-way) and K examples (K-shot) for each class are sampled to build a support set Strain = {x(i), y(i)}i=1..N\u2217K, and K\u2032 examples for each of the N classes are sampled to construct a query set Qtrain = {x(j), y(j)}j=1..N\u2217K\u2032, with Strain \u22c2 Qtrain = \u2205. Few-shot learning systems are trained by predicting labels of the query set Qtrain with the information of the support set Strain. The supervision of Strain and Qtrain is available in training. In the testing procedure, all the classes are unseen in the training phase, and by using few labeled examples of the support set Stest, few-shot learning systems need to make predictions on the unlabeled query set Qtest (Stest \u22c2 Qtest = \u2205). However, in a sequence labeling problem like NER, a sentence may contain multiple entities from different classes. And it is imperative to sample examples at sentence level since contextual information is crucial for sequence labeling problems, especially for NER. Thus the sampling is more difficult than in conventional classification tasks like relation extraction (Han et al., 2018).\n Some previous works (Yang and Katiyar, 2020; Li et al., 2020a) use greedy-based sampling strategies to iteratively judge if a sentence could be added into the support set, but the limitation becomes gradually strict during the sampling. For example, when it comes to a 5-way 5-shot setting, if the support set already had 4 classes with 5 examples and 1 class with 4 examples, the next sampled sentence must contain only the one specific entity to strictly meet the requirement of 5-way 5-shot. This is not suitable for FEW-NERD since it is annotated with dense entities. Thus, as shown in Algorithm 1, we adopt an N-way K\u223c2K-shot setting in our paper, the primary principle of which is to ensure that each class in S contains K\u223c2K examples, effectively alleviating the limitations of sampling.\n\nAlgorithm 1: Greedy N-way K\u223c2K-shot sampling algorithm\nInput: Dataset X, Label set Y, N, K\nOutput: Support set S\n1 S \u2190 \u2205; // Init the support set\n2 for i = 1 to N do\n3     Count[i] = 0; // Init the count of entity types\n4 repeat\n5     Randomly sample (x, y) \u2208 X;\n6     Compute |Count| and Count[i] after update;\n7     if |Count| > N or \u2203 Count[i] > 2K then\n8         Continue;\n9     else\n10        S = S \u22c3 (x, y);\n11        Update Count[i];\n12 until Count[i] \u2265 K for all i = 1 to N;
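A minimal Python sketch of Algorithm 1, under stated assumptions: sentences are (tokens, labels) pairs, labels follow the IO schema with "O" for non-entity tokens, and entity-labeled tokens are counted as a proxy for mentions. This is a reading of the pseudocode above, not the authors' released code.

import random
from collections import Counter

def greedy_sample(dataset, N, K, seed=0):
    """Greedy N-way K~2K-shot sampling (Algorithm 1).

    dataset: list of (tokens, labels) sentences.
    Returns a support set whose sentences jointly cover N entity
    types, each with between K and 2K labeled tokens.
    """
    support, count = [], Counter()
    pool = random.Random(seed).sample(dataset, len(dataset))
    for tokens, labels in pool:
        update = Counter(l for l in labels if l != "O")
        if not update:
            continue
        candidate = count + update
        # Line 7: reject if the episode would exceed N distinct types,
        # or any single type would exceed 2K examples.
        if len(candidate) > N or any(v > 2 * K for v in candidate.values()):
            continue
        support.append((tokens, labels))  # line 10: S = S U {(x, y)}
        count = candidate                 # line 11: update Count
        # Line 12: stop once all N types have at least K examples.
        if len(count) == N and all(v >= K for v in count.values()):
            break
    return support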
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 6523, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "f6c2c3db-ba3c-489e-9459-e6b4f579286b": {"__data__": {"id_": "f6c2c3db-ba3c-489e-9459-e6b4f579286b", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "7a282a7e-fbe9-4d98-a3ca-17e66c5b5818", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "9ff787cc5d317b04ad7ea97478454a58f4d9d955a18ef033743f48266cabc9ed", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "703bb83a-4aea-4eb3-85a6-086d25555ccb", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "c1fea1f263e861918b8164dd5899ca02120368c82ba7b941e44eca96b94f63f8", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "It Event, Miscellaneous }. Then we statisti-\nis not suitable for FEW-NERD since it is annotated cally count the frequency of entity types in the", "mimetype": "text/plain", "start_char_idx": 6523, "end_char_idx": 6759, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "547f541a-ed82-4d22-af00-51a95dc3f0e1": {"__data__": {"id_": "547f541a-ed82-4d22-af00-51a95dc3f0e1", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "48384cbb-f501-4add-bb32-198c5bc033a8", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "4c2347194cf32acf5b01836de9186cd908d8886adcabbe90c5ec0e8fdfdcafdd", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "5bd4a82d-022c-47e6-9bbb-8bdeef20f515", "node_type": "1", "metadata": {}, "hash": "4f7a69ed7a63d5b5d88d8e0c864f4c8853a1ff280af54e1480ad198bda04f518", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " automatically annotated FIGER. 
By removing entity types with low frequency, there are 80 fine-grained types remaining. Finally, to ensure the practicality of the annotation process, we conduct rounds of pre-annotation and make further modifications to the schema. For example, we combine the types of Country, Province/State, City, District into a class GPE, since it is difficult to distinguish these types only based on context (especially GPEs at different times). For another example, we create a Person-Scholar type, because in the pre-annotation step we found that there are numerous person entities that express the semantics of research, such as mathematician, physicist, chemist, biologist, paleontologist, but the FIGER schema does not define this kind of entity type. We also conduct rounds of manual denoising to select types with truly high frequency.\n Consequently, the finalized schema of FEW-NERD includes 8 coarse-grained types and 66 fine-grained types, which is shown in detail, accompanied by selected examples, in the Appendix.\n\n 4.2 Paragraph Selection\n\nThe raw corpus we use is the entire Wikipedia dump in English, which has been widely used in constructions of NLP datasets (Han et al., 2018; Yang et al., 2018; Wang et al., 2020). Wikipedia contains a large variety of entities and rich contextual information for each entity.\n FEW-NERD is annotated at paragraph level, and it is crucial to effectively select paragraphs with sufficient entity information. Moreover, the category distribution of the data is expected to be balanced since the data is applied in a few-shot scenario. This is also a key difference between FEW-NERD and previous NER datasets, whose entity distributions are usually considerably uneven. To this end, we construct a dictionary for each fine-grained type by automatically collecting entity mentions annotated in FIGER; the dictionaries are then manually denoised. We develop a search engine to retrieve paragraphs including entity mentions of the distant dictionary. For each entity, we choose 10 paragraphs and construct a candidate set. Then, for each fine-grained class, we randomly select 1000 paragraphs for manual annotation. 
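As a rough illustration of this selection pipeline: the per-type dictionaries and the numbers (10 paragraphs per entity, 1000 per class) come from the paper, while the naive substring matching standing in for the search engine and the data structures are assumptions.

import random

def select_candidates(paragraphs, type_dicts, per_entity=10, per_type=1000, seed=0):
    """paragraphs: list of raw Wikipedia paragraph strings.
    type_dicts: {fine_grained_type: iterable of denoised entity mentions}.
    Returns {fine_grained_type: paragraphs picked for manual annotation}."""
    rng = random.Random(seed)
    selected = {}
    for etype, mentions in type_dicts.items():
        candidates = []
        for mention in mentions:
            # Naive substring lookup standing in for the search engine:
            # keep up to `per_entity` paragraphs per entity mention.
            candidates.extend([p for p in paragraphs if mention in p][:per_entity])
        rng.shuffle(candidates)
        selected[etype] = candidates[:per_type]
    return selected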
Eventu-\n ally, 66,000 paragraphs are selected, consisting of\n 66 fine-grained entity types, and each paragraph\n contains an average of 61.3 tokens.\n ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 2398, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "5bd4a82d-022c-47e6-9bbb-8bdeef20f515": {"__data__": {"id_": "5bd4a82d-022c-47e6-9bbb-8bdeef20f515", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "48384cbb-f501-4add-bb32-198c5bc033a8", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "4c2347194cf32acf5b01836de9186cd908d8886adcabbe90c5ec0e8fdfdcafdd", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "547f541a-ed82-4d22-af00-51a95dc3f0e1", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "4f67430b5ae0a067bfe8d8399de04e224234e324f5e4e2ca18e032a9ab9275f4", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "de0a20d6-b6dc-4ff3-8b6e-f6ad19472b08", "node_type": "1", "metadata": {}, "hash": "e923b7225a47572e82d13980d6e364dd38b01d009812324d8c368525add1c594", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Paragraph\n London[Art-Music] is the fifth al-\n bum by the British[Loc-GPE] rock band\n Jesus Jones[Org-ShowOrg] in 2001 through\n Koch Records[Org-Company]. Following the com-\n mercial failure of 1997\u2019s \u201dAlready[Art-Music]\u201d\n which led to the band and EMI[Org-Company] part-\n ing ways, the band took a hiatus before regathering\n for the recording of \u201dLondon[Art-Music]\u201d for\n Koch/Mi5 Recordings, with a more alternative\n rock approach as opposed to the techno sounds\n on their previous albums. The album had low-key\n promotion, initially only being released in the\n United States[Loc-GPE]. 
", "mimetype": "text/plain", "start_char_idx": 2398, "end_char_idx": 3085, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "de0a20d6-b6dc-4ff3-8b6e-f6ad19472b08": {"__data__": {"id_": "de0a20d6-b6dc-4ff3-8b6e-f6ad19472b08", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "48384cbb-f501-4add-bb32-198c5bc033a8", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "4c2347194cf32acf5b01836de9186cd908d8886adcabbe90c5ec0e8fdfdcafdd", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "5bd4a82d-022c-47e6-9bbb-8bdeef20f515", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "7431dd7f03f5d180517e133d870a0219c77f32d2552a9b6218cce714a9867162", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Two EP\u2019s were released\n from the album, \u201dNowhere Slow[Art-Music]\u201d and\n \u201dIn the Face Of All This[Art-Music]\u201d.\n\n Table 1: An annotated case of FEW-NERD\n\n 4.3 Human Annotation\n As named entities are expected to be context-\n dependent, annotation of named entities is com-\n plicated, especially with such a large number of\n entity types. For example, shown in Table 1,\n\u201cLondon is the fifth album by the British rock\n band Jesus Jones..\u201d, where London should be an-\n notated as an entity of Art-Music rather than\n Location-GPE. Such a situation requires that\n the annotator has basic linguistic training and can\n make reasonable judgments based on the context.\n Annotators of FEW-NERD include 70 annota-\n tors and 10 experienced experts. All the annotators\n have linguistic knowledge and are instructed with\n detailed and formal annotation principles. Each\n paragraph is independently annotated by two well-\n trained annotators. Then, an experienced expert\n goes over the paragraph for possible wrong or omis-\n sive annotations, and make the final decision. With\n 70 annotators participated, each annotator spends\n an average of 32 hours during the annotation pro-\n cess. We ensure that all the annotators are fairly\n compensated by market price according to their\n workload (the number of examples per hour). The\n data is annotated and submitted in batches, and\n each batch contains 1000\u223c3000 sentences. To en-\n sure the quality of FEW-NERD , for each batch\n of data, we randomly select 10% sentences and\n conduct double-checking. If the accuracy of the an-\n notation is lower than 95 % (measured in sentence-\n level), the batch will be re-annotated. 
Furthermore, we calculate Cohen\u2019s Kappa (Cohen, 1960) to measure the agreements between two annotators;", "mimetype": "text/plain", "start_char_idx": 3085, "end_char_idx": 4866, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "39abd0c8-e1f5-4ee3-8da1-537353646ec6": {"__data__": {"id_": "39abd0c8-e1f5-4ee3-8da1-537353646ec6", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "297587ad-1bff-4027-9c1e-7b5732f0d283", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "63f1d00453bd4c3bd83a691d994f62f1c05e4d9e110966dee0998fbd8a27b3c0", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "ec59971c-cf54-40e2-9a55-c5de0cdbea76", "node_type": "1", "metadata": {}, "hash": "6c9ea7767d7a070032e2a32ebc07b5b3fac0e8ae19e07604840c27e16a867207", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "the result is 76.44%, which indicates a high degree of consistency.\n\n 5 Data Analysis\n\n 5.1 Size and Distribution of FEW-NERD\n FEW-NERD is not only the first few-shot dataset for NER, but it also is one of the biggest human-annotated NER datasets. We report the statistics of the number of sentences, tokens, entity types and entities of FEW-NERD and several widely-used NER datasets in Table 2, including CoNLL\u201903, WikiGold, OntoNotes 5.0, WNUT\u201917 and I2B2. We observe that although OntoNotes and I2B2 are considered as large-scale datasets, FEW-NERD is significantly larger than all these datasets. Moreover, FEW-NERD contains more entity types and annotated entities. As introduced in Section 4.2, FEW-NERD is designed for few-shot learning and the distribution could not be severely uneven. Hence, we balance the dataset by selecting paragraphs through a distant dictionary.\n [Figure 2: A heat map to illustrate knowledge correlations among types in FEW-NERD; each small colored square represents the similarity of two entity types.]
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 24634, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "ec59971c-cf54-40e2-9a55-c5de0cdbea76": {"__data__": {"id_": "ec59971c-cf54-40e2-9a55-c5de0cdbea76", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "297587ad-1bff-4027-9c1e-7b5732f0d283", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "63f1d00453bd4c3bd83a691d994f62f1c05e4d9e110966dee0998fbd8a27b3c0", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "39abd0c8-e1f5-4ee3-8da1-537353646ec6", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "946e0fda6aa5de3de03554f4cd8f69a4c4c4cc39c9c6c3c0d18ab2f3d2397579", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "ce1695d1-7872-48ae-8589-5b5ed5355234", "node_type": "1", "metadata": {}, "hash": "7eea7485e9ce6bab3c3997dfa4a651cba7e8435dca7180317dc7ada601918b86", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "The data distri-111213This result is consistent with intuition. Moreover,1415\n\n bution is illustrated in Figure 1, where Location161718it inspires our benchmark-setting from the perspec-1920(especially GPE) and Person are entity types with212223tive of knowledge transfer (see Section 6.2).\n24\n25\n\n the most examples. Although utilizing a distant2627282930dictionary to balance the entity types could not313233\n34\n35\n produce a fully balanced data distribution, it still 6 Benchmark Settings3637383940ensures that each fine-grained type has a sufficient414243\n44\n45\n number of examples for few-shot learning. We collect and manually annotate 188,238 sen-4647484950tences with 66 fine-grained entity types in to-515253\n\n 5.2 Knowledge Correlations among Types5455tal, which makes FEW-NERD one of the largest5657585960\n Knowledge transfer is crucial for few-shot learn-\n relations among all the entity types of FEW-NERD ,\nwe conduct an empirical study about entity type\n similarities in this section. We train a BERT-Tagger\n(details in Section 7.1) of 70% arbitrarily selected\n data on FEW-NERD and use 10% data to select the\n model with best performance (it is actually the set-\n ting of FEW-NERD (SUP) in Section 6.1). After\n obtaining a contextualized encoder, we produce en-\n tity mention representations of the remaining 20%\n data of FEW-NERD. Then, for each fine-grained\n types, we randomly select 100 instances of entity\n embeddings. 
", "mimetype": "text/plain", "start_char_idx": 24634, "end_char_idx": 26795, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "ce1695d1-7872-48ae-8589-5b5ed5355234": {"__data__": {"id_": "ce1695d1-7872-48ae-8589-5b5ed5355234", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "297587ad-1bff-4027-9c1e-7b5732f0d283", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "63f1d00453bd4c3bd83a691d994f62f1c05e4d9e110966dee0998fbd8a27b3c0", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "ec59971c-cf54-40e2-9a55-c5de0cdbea76", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "7164356aa9e8f2df54101a489109a375b5e85b1cb43ad0fb15ccd21f4ecc4ffe", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "We mutually compute the dot product\n among entity embeddings for each type two by two\n and average them to obtain the similarities among\n types, which is illustrated in Figure 2. We observe\n that entity types shared identical coarse-grained\n types typically have larger similarities, resulting in\n an easier knowledge transfer. In contrast, although\n some of the fine-grained types have large similari-\n\n human-annotated NER datasets. To comprehen-6162636465ing (Li et al., 2019a). To explore the knowledge cor-sively exploit such rich information of entities and\n contexts, as well as evaluate the generalization of\n models from different perspectives, we construct\n three tasks based on FEW-NERD (Statistics are\n reported in Table 3).\n\n 6.1 Standard Supervised NER\n\n FEW-NERD (SUP) We first adopt a standard su-\n pervised setting for NER by randomly splitting\n 70% data as the training data, 10% as the validation\n data and 20% as the testing data. In this setting,\n the training set, dev set, and test set contain the\nwhole 66 entity types. Although the supervised\n setting is not the ultimate goal of the construction\n of FEW-NERD, it is still meaningful to assess the\n instance-level generalization for NER models. 
As\n shown in Section 6.2, due to the large number of\n entity types, FEW-NERD is very challenging even\n in a standard supervised setting.", "mimetype": "text/plain", "start_char_idx": 26795, "end_char_idx": 28280, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "5a2138f4-d397-4d63-9cac-d45d9fe4de7e": {"__data__": {"id_": "5a2138f4-d397-4d63-9cac-d45d9fe4de7e", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "12b98149-050c-4c93-81db-ea720272647e", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "b2c3b4fde5e3823dcb32f15b37d9702a9a73c0b6c0aa3c2e38e38431ce7159d5", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "a2435907-a143-49c8-b483-ee3e8a02ba74", "node_type": "1", "metadata": {}, "hash": "5612a97dd51458cf4337501a25a225772f991690c117abe21e7cf830660c28a7", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " Datasets # Sentences # Tokens # Entities # Entity Types Domain\n CoNLL\u201903 (Tjong Kim Sang, 2002) 22.1k 301.4k 35.1k 4 Newswire\n WikiGold (Balasuriya et al., 2009) 1.7k 39k 3.6k 4 General\n OntoNotes (Weischedel et al., 2013) 103.8k 2067k 161.8k 18 General\n WNUT\u201917 (Derczynski et al., 2017) 4.7k 86.1k 3.1k 6 SocialMedia\n I2B2 (Stubbs and Uzuner, 2015) 107.9k 805.1k 28.9k 23 Medical\n FEW-NERD 188.2k 4601.2k 491.7k 66 General\n\n Table 2: Statistics of FEW-NERD and multiple widely used NER datasets. For CoNLL\u201903, WikiGold, and I2B2,\n we report the statistics in the original paper. For OntoNotes 5.0 (LDC2013T19), we download and count all the data\n (English) annotated by the NER labels, some works use different split of OntoNotes 5.0 and may report different\n statistics. 
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 1479, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "a2435907-a143-49c8-b483-ee3e8a02ba74": {"__data__": {"id_": "a2435907-a143-49c8-b483-ee3e8a02ba74", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "12b98149-050c-4c93-81db-ea720272647e", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "b2c3b4fde5e3823dcb32f15b37d9702a9a73c0b6c0aa3c2e38e38431ce7159d5", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "5a2138f4-d397-4d63-9cac-d45d9fe4de7e", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "1a6d32e20e86e7ff943edededb2390c3598451572dfb8a49fa89230f4d3c598c", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "b3793ecc-96fc-4f50-bc61-21be9868e23b", "node_type": "1", "metadata": {}, "hash": "418385c5f9279f5de08b8b2d3dd2ffdec12c13f3c4ff1b2fa39f93b5eddeaa12", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "For WNUT\u201917, we download and count all the data.\n\n6.2 Few-shot NER\nThe core intuition of few-shot learning is to learn\nnew classes from few examples. Hence, we first\nsplit the overall entity set (denoted as E) into three\nmutually disjoint subsets, respectively denoted\nas Etrain , Edev, Etest, and Etrain\u22c3 Edev\u22c3 Etest = E,\nEtrain\u22c2Edev\u22c2 Etest = \u2205. 
", "mimetype": "text/plain", "start_char_idx": 1479, "end_char_idx": 1829, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "b3793ecc-96fc-4f50-bc61-21be9868e23b": {"__data__": {"id_": "b3793ecc-96fc-4f50-bc61-21be9868e23b", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "12b98149-050c-4c93-81db-ea720272647e", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "b2c3b4fde5e3823dcb32f15b37d9702a9a73c0b6c0aa3c2e38e38431ce7159d5", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "a2435907-a143-49c8-b483-ee3e8a02ba74", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "8dae53115baed88ef4682b02b0e2d08e94e24a4984ce6b08396678c1a7c97ce7", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Note that all the entity\n\ntypes are fine-grained types. Under this circum-\nstance, instances in train, dev and test datasets only\nconsist of instances with entities in Etrain, Edev, Etest\nrespectively. However, NER is a sequence labeling\nproblem, and it is possible that a sentence contains\nseveral different entities. To avoid the observation\nof new entity types in the training phase, we replace\nthe labels of entities that belong to Etest with O in\nthe training set. Similarly, in the test set, entities\nthat belongs to Etrain and Edev are also replaced by\nO. Based on this setting, we develop two few-shot\nNER tasks adopting different splitting strategies.\nFEW-NERD (INTRA) Firstly, we construct\nEtrain, Edev and Etest according to the coarse-grained\ntypes. In other words, all the entities in differ-\nent sets belong to different coarse-grained types.\nIn the basis of the principle that we should re-\nplace as few as possible entities with O, we\nassign all the fine-grained entity types belong-\ning to People, MISC, Art, Product to\nEtrain, all the fine-grained entity types belonging\nto Event, Building to Edev, and all the fine-\ngrained entity types belonging to ORG, LOC to\nEtest, respectively. Based on Figure 2, in this set-\nting, the training set, dev set and test set share little\nknowledge, making it a difficult benchmark.\nFEW-NERD (INTER) In this task, although all\nthe fine-grained entity types are mutually disjoint\nin Etrain, Edev, the coarse-grained types are shared.\nSpecifically, we roughly assign 60% fine-grained\ntypes of all the 8 coarse-grained types to Etrain, 20%\nto Edev and 20% Etest, respectively. 
The intuition of this setting is to explore if the coarse information will affect the prediction of new entities.\n\n Split #Train #Dev #Test\n FEW-NERD (SUP) 131,767 18,824 37,648\n FEW-NERD (INTRA) 99,519 19,358 44,059\n FEW-NERD (INTER) 130,112 18,817 14,007\n\nTable 3: Statistics of train, dev and test sets for the three tasks of FEW-NERD. We remove the sentences with no entities for the few-shot benchmarks.\n\n7 Experiments\n\n7.1 Models\nRecent studies show that pre-trained language models with deep transformers (e.g., BERT (Devlin et al., 2019a)) have become a strong encoder for NER (Li et al., 2020b). We thus follow the empirical settings and use BERT as the backbone encoder in our experiments. We denote the parameters as \u03b8 and the encoder as f\u03b8. Given a sequence x = {x1, ..., xn}, for each token xi, the encoder produces contextualized representations as:\n\n h = [h1, ..., hn] = f\u03b8([x1, ..., xn]). (1)\n\nSpecifically, we implement four BERT-based models for supervised and few-shot NER, which are BERT-Tagger (Devlin et al., 2019b), ProtoBERT (Snell et al., 2017), NNShot (Yang and Katiyar, 2020) and StructShot (Yang and Katiyar, 2020).\nBERT-Tagger As stated in Section 6.1, we construct a standard supervised task based on FEW-NERD, thus we implement a simple but strong baseline, BERT-Tagger, for supervised NER. BERT-Tagger is built by adding a linear classifier on top of BERT and trained with a cross-entropy objective under a full supervision setting.", "mimetype": "text/plain", "start_char_idx": 1829, "end_char_idx": 5108, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "c33b63d5-7341-40f1-9016-43201810afd5": {"__data__": {"id_": "c33b63d5-7341-40f1-9016-43201810afd5", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5c930767-ef2a-434e-91d1-780d7a9deb81", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "951fa020d243279fd40ee2632cbff6d12455ac75bd998f140cc10a0c50148355", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "ecacd21e-1829-48fa-95ab-5c90846e8dd3", "node_type": "1", "metadata": {}, "hash": "0dfd5f3bd973fdb3811f8e1e9d7862cd89fb6c3011ccff576ce2dd533f7d0fab", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " ProtoBERT Inspired by achievements of meta-learning approaches (Finn et al., 2017; Snell et al., 2017; Ding et al., 2021) on few-shot learning, the first baseline model we implement is ProtoBERT, which is a method based on prototypical network (Snell et al., 2017) with a backbone of BERT (Devlin et al., 2019a) encoder. This approach derives a prototype zi for each entity type by computing the average of the embeddings of the tokens that share the same entity type. The computation is conducted in the support set S.\n\n Datasets P R F1\n CoNLL\u201903 90.62 92.07 91.34\n OntoNotes 5.0 90.00 88.24 89.11\n FEW-NERD (SUP) 65.56 (\u2193) 68.78 (\u2193) 67.13 (\u2193)\n\n Table 4: Results of BERT-Tagger on previous NER datasets and the supervised setting of FEW-NERD.", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 2341, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "ecacd21e-1829-48fa-95ab-5c90846e8dd3": {"__data__": {"id_": "ecacd21e-1829-48fa-95ab-5c90846e8dd3", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5c930767-ef2a-434e-91d1-780d7a9deb81", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "951fa020d243279fd40ee2632cbff6d12455ac75bd998f140cc10a0c50148355", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "c33b63d5-7341-40f1-9016-43201810afd5", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "3d5e80599a4375c7886c2967fcf8a54badb3513ff488d9a03b295b8074e1fa7a", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "95509f41-b5f0-4bc4-ba2c-886ad18a6046", "node_type": "1", "metadata": {}, "hash": "2aba92c7724e7aa235b41f83c8177397425cb235110b489a8de524232eb60d41", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "For the i-th type, the prototype is denoted as zi and the support set of that type as Si,\n\n zi = (1/|Si|) \u2211x\u2208Si f\u03b8(x). (2)\n\nWhile in the query set Q, for each token x \u2208 Q, we firstly compute the distance between x and all the prototypes. We use the l-2 distance as the metric function d(f\u03b8(x), z) = ||f\u03b8(x) \u2212 z||_2^2. Then, through the distances between x and all the prototypes, we compute the prediction probability of x over all types.
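A minimal PyTorch sketch of the prototype computation in Eq. (2) and the nearest-prototype prediction given as Eq. (3) below, with precomputed token embeddings standing in for the BERT encoder f_theta (an illustrative reading, not the authors' implementation).

import torch

def build_prototypes(support_emb, support_labels, types):
    """support_emb: (n_tokens, hidden) token embeddings from f_theta.
    support_labels: n_tokens type names (episode construction guarantees
    every type in `types` occurs at least once). Eq. (2): a prototype is
    the mean embedding of the tokens sharing that type."""
    protos = []
    for t in types:
        mask = torch.tensor([l == t for l in support_labels])
        protos.append(support_emb[mask].mean(dim=0))
    return torch.stack(protos)  # (n_types, hidden)

def nearest_prototype(query_emb, protos):
    """Eq. (3): assign each query token the type index of its nearest
    prototype under squared l-2 distance."""
    dist = torch.cdist(query_emb, protos) ** 2  # (n_query, n_types)
    return dist.argmin(dim=1)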
NNShot & StructShot NNShot and StructShot (Yang and Katiyar, 2020) are state-of-the-art methods based on token-level nearest neighbor classification. In our experiments, we use BERT as the backbone encoder to produce contextualized representations for a fair comparison. Different from the prototype-based method, NNShot determines the tag of a query token based on the token-level distance $d(f_\theta(x), f_\theta(x')) = \lVert f_\theta(x) - f_\theta(x') \rVert_2^2$. Hence, for a support set $S_Y$ with types $Y$ and a query $x$,

$y^* = \arg\min_{y \in Y} d_y(x), \quad d_y(x) = \min_{x' \in S_y} d(f_\theta(x), f_\theta(x')).$   (4)

With the same basic structure as NNShot, StructShot adopts an additional Viterbi decoder during the inference phase (Hou et al., 2020), not in the training phase: we estimate a transition distribution $p(y'|y)$ and an emission distribution $p(y|x)$ and solve

$y^* = \arg\max_{y} \prod_{t=1}^{T} p(y_t \mid x) \times p(y_t \mid y_{t-1}).$   (5)

To sum up, BERT-Tagger is a well-acknowledged baseline that produces pronounced results on supervised NER, while ProtoBERT and NNShot & StructShot use prototype-level and token-level similarity scores, respectively, to tackle the few-shot NER problem. These baselines are strong and representative models for the NER task. For implementation details, please refer to the Appendix.
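Both components can be sketched compactly: the nearest-neighbor distances implement Eq. (4), and the Viterbi routine solves Eq. (5) in log space. How StructShot estimates the transition distribution is not restated here, so transition_logp is taken as given; all names are our own.

```python
import torch

def nnshot_distances(query_emb, support_emb, support_labels, num_types):
    # d_y(x) = min_{x' in S_y} ||f_theta(x) - f_theta(x')||_2^2   (Eq. 4)
    d = torch.cdist(query_emb, support_emb, p=2) ** 2       # (Q, S)
    dists = torch.full((query_emb.size(0), num_types), float("inf"))
    for y in range(num_types):
        mask = support_labels == y
        if mask.any():
            dists[:, y] = d[:, mask].min(dim=1).values
    return dists  # NNShot tag = dists.argmin(dim=-1); also usable as emissions

def viterbi_decode(emission_logp, transition_logp):
    # StructShot inference: y* = argmax prod_t p(y_t|x) * p(y_t|y_{t-1})  (Eq. 5)
    T, K = emission_logp.shape
    score = emission_logp[0].clone()                        # (K,)
    backpointers = []
    for t in range(1, T):
        # total[i, j]: best score ending in tag j at step t coming from tag i.
        total = score.unsqueeze(1) + transition_logp + emission_logp[t]
        score, idx = total.max(dim=0)
        backpointers.append(idx)
    path = [int(score.argmax())]
    for idx in reversed(backpointers):
        path.append(int(idx[path[-1]]))
    return path[::-1]
```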
We evaluate models on the query sets $Q_{test}$ of the test episodes, computing precision (P), recall (R) and micro F1-score over all test episodes. Instead of the popular BIO schema, we utilize the IO schema in our experiments, using I-type to denote all tokens of a named entity and O to denote other tokens.

7.2 The Overall Results

We evaluate all baseline models on the three benchmark settings introduced in Section 6: FEW-NERD (SUP), FEW-NERD (INTRA) and FEW-NERD (INTER).

Datasets          P           R           F1
CoNLL'03          90.62       92.07       91.34
OntoNotes 5.0     90.00       88.24       89.11
FEW-NERD (SUP)    65.56 (↓)   68.78 (↓)   67.13 (↓)

Table 4: Results of BERT-Tagger on previous NER datasets and the supervised setting of FEW-NERD.

Supervised NER As mentioned in Section 6.1, we first split FEW-NERD as a standard supervised NER dataset. As shown in Table 4, BERT-Tagger yields promising results on the two widely used supervised datasets, with F1-scores of 91.34% and 89.11%, respectively. However, the model suffers a grave drop in performance on FEW-NERD (SUP) because the number of entity types in FEW-NERD (SUP) is larger than in the other datasets. The results indicate that FEW-NERD is challenging in the supervised setting and worth studying.

We further analyze the performance on different entity types (see Figure 3). We find that the model achieves the best performance on the Person type and the worst on the Product type. Moreover, for almost all coarse-grained types, the corresponding Coarse-Other type has the lowest F1-score.

[Figure 3: F1-scores of different entity types on FEW-NERD (SUP). The legend reports the average performance of each coarse-grained type: Art 77%, Building 67%, Event 67%, Location 79%, Organization 73%, Person 85%, Product 60%, Miscellaneous 63%.]
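For concreteness, a minimal sketch of this episode-level micro averaging, assuming entity mentions are represented as (start, end, type) tuples per episode; that representation is our assumption, not a detail specified in the paper.

```python
def micro_f1(episodes):
    # episodes: iterable of (predicted_spans, gold_spans) pairs, one per test
    # episode, where a span is a (start, end, type) tuple under the IO schema.
    tp = fp = fn = 0
    for pred, gold in episodes:
        pred, gold = set(pred), set(gold)
        tp += len(pred & gold)   # exact span and type match
        fp += len(pred - gold)
        fn += len(gold - pred)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```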
Proto    5 way 1∼2 shot:  15.97±0.61 / 29.66±1.39 / 20.76±0.84    5 way 5∼10 shot:  36.34±1.33 / 51.32±0.45 / 42.54±0.94
         10 way 1∼2 shot: 11.33±0.57 / 22.47±0.49 / 15.05±0.44    10 way 5∼10 shot: 29.39±0.27 / 44.51±1.00 / 35.40±0.13
NNShot   5 way 1∼2 shot:  24.15±0.35 / 27.65±1.63 / 25.78±0.91    5 way 5∼10 shot:  32.91±0.62 / 40.19±1.22 / 36.18±0.79
         10 way 1∼2 shot: 16.25±0.22 / 20.90±1.38 / 18.27±0.41    10 way 5∼10 shot: 24.86±0.30 / 30.49±0.96 / 27.38±0.53
Struct   5 way 1∼2 shot:  32.99±0.76 / 27.85±0.98 / 30.21±0.90    5 way 5∼10 shot:  46.78±1.00 / 32.06±2.17 / 38.00±1.29
         10 way 1∼2 shot: 26.05±0.53 / 17.65±1.34 / 21.03±1.13    10 way 5∼10 shot: 40.88±0.83 / 19.52±0.49 / 26.42±0.60

Table 5: Performance of state-of-the-art models on FEW-NERD (INTRA). Each cell reports P / R / F1.

Proto    5 way 1∼2 shot:  32.04±1.75 / 49.30±0.68 / 38.83±1.49    5 way 5∼10 shot:  52.54±1.32 / 66.76±1.01 / 58.79±0.44
         10 way 1∼2 shot: 26.02±1.32 / 43.17±0.92 / 32.45±0.79    10 way 5∼10 shot: 46.38±0.42 / 61.60±0.36 / 52.92±0.37
NNShot   5 way 1∼2 shot:  42.57±1.27 / 53.09±0.54 / 47.24±1.00    5 way 5∼10 shot:  51.03±0.63 / 61.15±0.63 / 55.64±0.63
         10 way 1∼2 shot: 34.36±0.24 / 44.76±0.33 / 38.87±0.21    10 way 5∼10 shot: 44.96±2.69 / 55.25±2.77 / 49.57±2.73
Struct   5 way 1∼2 shot:  53.89±0.78 / 50.02±0.62 / 51.88±0.69    5 way 5∼10 shot:  62.12±0.41 / 53.21±0.91 / 57.32±0.63
         10 way 1∼2 shot: 47.07±0.15 / 40.16±0.12 / 43.34±0.10    10 way 5∼10 shot: 57.61±1.87 / 43.54±3.70 / 49.57±3.08

Table 6: Performance of state-of-the-art models on FEW-NERD (INTER). Each cell reports P / R / F1.

This is because the semantics of such fine-grained types are relatively sparse and difficult to recognize. A natural intuition is that the performance on each entity type is related to the proportion of that type in the data, but surprisingly, we find that they are not linearly correlated. For example, the model performs very well on the Art type, although this type represents only a small fraction of FEW-NERD.

Few-shot NER For the few-shot benchmarks, we adopt 4 sampling settings: 5 way 1∼2 shot, 5 way 5∼10 shot, 10 way 1∼2 shot, and 10 way 5∼10 shot. Intuitively, 10 way 1∼2 shot is the hardest setting because it has the largest number of entity types and the fewest examples; similarly, 5 way 5∼10 shot is the easiest setting.
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 11590, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "810ba2d6-65c6-4378-91c4-4ba38f087746": {"__data__": {"id_": "810ba2d6-65c6-4378-91c4-4ba38f087746", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "fc3045bc-4ea0-4c56-aa91-9689fa1cee0d", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "595935693e299e435c4fa70872784da0511641c8bb4e8489e41fc9743f946167", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "b778bdc3-b7ac-4222-b5f9-8e068507f3a6", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "a593b49223ff84672127a243d4951709bb632f75b3dec73238a7928fce3fe317", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "All results of FEW-NERD (INTRA) precision in FEW-NERD (INTRA) . It shows that\n and FEW-NERD (INTER) are reported in Table 5 Viterbi decoder at the inference stage can help re-\n and Table 6 respectively. Overall, we observe move false positive predictions when knowledge\n that the previous state-of-the-art methods equipped transfer is hard. It is also observed that NNShot and\n by BERT encoder could not yield promising re- StructShot may suffer from the instability of the\n sults on FEW-NERD . 
From a high-level perspective, models generally perform better on FEW-NERD (INTER) than on FEW-NERD (INTRA); the latter is regarded as the more difficult task, as we analyze in Section 5.2 and Section 6, because it splits the data according to the coarse-grained entity types, which means entity types in the training set and test set share less knowledge. In a horizontal comparison, consistent with intuition, almost all the methods produce the worst results on 10 way 1∼2 shot and achieve the best performance on 5 way 5∼10 shot. In the comparison across models, ProtoBERT generally achieves better performance than NNShot and StructShot, especially in the 5∼10 shot settings, where calculation by prototype may differ more from calculation by entity. StructShot sees a large improvement in precision on FEW-NERD (INTRA), which shows that a Viterbi decoder at the inference stage can help remove false-positive predictions when knowledge transfer is hard. It is also observed that NNShot and StructShot may suffer from the instability of the nearest neighbor mechanism in the training phase, while prototypical models are more stable because the calculation of prototypes essentially serves as regularization.

Models       Span Error (FP)   Span Error (FN)   Type Error (Within)   Type Error (Outer)
ProtoNet     4.29%             2.17%             3.87%                 5.35%
NNShot       3.87%             3.67%             3.86%                 6.90%
StructShot   2.84%             4.45%             3.94%                 5.56%

Table 7: Error analysis of 5 way 5∼10 shot on FEW-NERD (INTER). "Within" indicates errors within the same coarse-grained type and "Outer" indicates errors outside the coarse-grained type.

7.3 Error Analysis

We conduct an error analysis to explore the challenges of FEW-NERD; the results are reported in Table 7. We choose the FEW-NERD (INTER) setting because its test set contains all the coarse-grained types. We analyze the errors of the models from two perspectives. Span Error denotes misclassification at the token level: if an O token is misclassified as part of an entity, i.e., as I-type, it is an FP case, and if a token with type I-type is misclassified as O, it is an FN case. Type Error indicates the misclassification of entity types when the spans are correctly classified. A "Within" error means the entity is misclassified to another type within the same coarse-grained type, while an "Outer" error means the entity is misclassified to a type in a different coarse-grained type. As the statistics of type errors may be impacted by the episodes sampled at test time, we conduct 5 rounds of experiments and report the average results. The results demonstrate that token-level accuracy is not that low, since most O tokens can be detected. But an entity mention is considered wrong if even one of its tokens is wrong, which becomes the main reason for the difficulty of FEW-NERD.
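The Within/Outer bookkeeping can be stated directly; in this sketch, coarse_of is an assumed mapping from each fine-grained type to its coarse-grained parent, and the function names are our own.

```python
def classify_type_error(pred_type, gold_type, coarse_of):
    # Type errors are only counted when the span itself is correct.
    if pred_type == gold_type:
        return None  # not an error
    if coarse_of[pred_type] == coarse_of[gold_type]:
        return "Within"  # wrong fine-grained type, same coarse-grained type
    return "Outer"       # wrong type under a different coarse-grained type
```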
If an entity span can be accurately detected, the models yield relatively good performance on entity typing, indicating the effectiveness of metric learning.

8 Conclusion and Future Work

We propose FEW-NERD, a large-scale few-shot NER dataset with fine-grained entity types. It is the first few-shot NER dataset and also one of the largest human-annotated NER datasets. FEW-NERD provides three unified benchmarks to assess approaches to few-shot NER and could facilitate future research in this area. By implementing state-of-the-art methods, we carry out a series of experiments on FEW-NERD, demonstrating that few-shot NER remains a challenging problem worth exploring. In the future, we will extend FEW-NERD by adding cross-domain annotations, distant annotations, and finer-grained entity types. FEW-NERD also has the potential to advance the construction of continual knowledge graphs.

Acknowledgements

This research is supported by the National Natural Science Foundation of China (Grants No. 61773229 and 6201101015), the National Key Research and Development Program of China (No. 2020AAA0106501), the Alibaba Innovation Research (AIR) programme, the General Research Project (Grants No. JCYJ20190813165003837 and No. JCYJ20190808182805919), and the Overseas Cooperation Research Fund of the Graduate School at Tsinghua University (Grant No. HW2018002). Finally, we thank Ronny, Xiaozhi and Ziyu for their valuable help, and the anonymous reviewers for their comments.

Ethical Considerations

In this paper, we present a human-annotated dataset,
FEW-NERD, for few-shot learning in NER. We describe the details of the collection process and conditions, the compensation of annotators, and the measures taken to ensure annotation quality in the main text. The corpus of the dataset is publicly obtained from Wikipedia, and we have not modified or interfered with its content. FEW-NERD is likely to directly facilitate research on few-shot NER and to further the construction of large-scale knowledge graphs (KGs). Models and systems built on FEW-NERD may contribute to constructing KGs in various domains, including the biomedical, financial, and legal fields, and further promote the development of NLP applications in specific domains. FEW-NERD is annotated in English; thus, the dataset may mainly facilitate NLP research in English. For the sake of energy saving, we will not only open-source the dataset and the code, but also release the checkpoints of our models from the experiments to reduce unnecessary carbon emissions.

References

Dominic Balasuriya, Nicky Ringland, Joel Nothman, Tara Murphy, and James R. Curran. 2009. Named entity recognition in Wikipedia.
In Proceedings of the 2009 Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources (People's Web), pages 10–18, Suntec, Singapore. Association for Computational Linguistics.
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 114, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "e72dac24-34a6-4159-818b-d6f023d89f0c": {"__data__": {"id_": "e72dac24-34a6-4159-818b-d6f023d89f0c", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "7d05593a-cace-44c8-ad7c-15b475ca0267", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "bd293015dce9ca2a346361ea4af1ca08e03356003311fea5e5c30f67f50034e0", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "f1116f47-ab33-4225-bb26-ddc62fe95589", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "1bbeb825aa73acd4f2bda14f92aebcaf9ef109db74efb2e92cd7058f95df94f1", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "00f9b9f2-a717-4ccb-a263-c9c92e3a0604", "node_type": "1", "metadata": {}, "hash": "28d30c7f4af4a30790aa08955d1e377b1d7c49bdbff3510ebba78f4cae663a01", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Chiu and Eric Nichols. 
", "mimetype": "text/plain", "start_char_idx": 114, "end_char_idx": 137, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "00f9b9f2-a717-4ccb-a263-c9c92e3a0604": {"__data__": {"id_": "00f9b9f2-a717-4ccb-a263-c9c92e3a0604", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "7d05593a-cace-44c8-ad7c-15b475ca0267", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "bd293015dce9ca2a346361ea4af1ca08e03356003311fea5e5c30f67f50034e0", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "e72dac24-34a6-4159-818b-d6f023d89f0c", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "e91fed0f32692abd2070a7bf73e855efebac7c6516b221b1f5ef89283bcad78b", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "0b352382-f3d6-4693-8571-1762bd92e288", "node_type": "1", "metadata": {}, "hash": "fa068ddcf54a1b84dcb277c7d02d80f5ddf7aac4d6517165af8f07df295c728f", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2016. Named entity\n recognition with bidirectional LSTM-CNNs. Trans-\n actions of the Association for Computational Lin-\n guistics, 4:357\u2013370.\n\nEunsol Choi, Omer Levy, Yejin Choi, and Luke Zettle-\n moyer. 2018. Ultra-fine entity typing. In Proceed-\n ings of the 56th Annual Meeting of the Association\n for Computational Linguistics (Volume 1: Long Pa-\n pers), pages 87\u201396, Melbourne, Australia. Associa-\n tion for Computational Linguistics.\nJacob Cohen. 1960. A coefficient of agreement for\n\n nominal scales. Educational and psychological mea-\n surement, 20(1):37\u201346.\n Wanyun Cui, Yanghua Xiao, Haixun Wang, Yangqiu\n Song, Seung-won Hwang, and Wei Wang. 
", "mimetype": "text/plain", "start_char_idx": 137, "end_char_idx": 823, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "0b352382-f3d6-4693-8571-1762bd92e288": {"__data__": {"id_": "0b352382-f3d6-4693-8571-1762bd92e288", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "7d05593a-cace-44c8-ad7c-15b475ca0267", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "bd293015dce9ca2a346361ea4af1ca08e03356003311fea5e5c30f67f50034e0", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "00f9b9f2-a717-4ccb-a263-c9c92e3a0604", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "1dd73aff08b1e4d79f5ebb9551934a92650bfe7bffcbc212780a9f2d886731eb", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "812846d5-bd57-4218-8039-072d4826c457", "node_type": "1", "metadata": {}, "hash": "9fb0aae4b9f16a6c7e601e74f9730da32b3dee7f43a75a8be3640461813686e7", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2017.\n Kbqa: learning question answering over qa cor-\n pora and knowledge bases. In Proceedings of 43rd\n Very Large Data Base Conference Endowment, vol-\n ume 10.\n Leon Derczynski, Eric Nichols, Marieke van Erp, and\n Nut Limsopatham. 2017. Results of the WNUT2017\n shared task on novel and emerging entity recogni-\n tion. In Proceedings of the 3rd Workshop on Noisy\n User-generated Text, pages 140\u2013147, Copenhagen,\n Denmark. Association for Computational Linguis-\n tics.\n Jacob Devlin, Ming-Wei Chang, Kenton Lee, and\n Kristina Toutanova. 2019a. BERT: Pre-training of\n deep bidirectional transformers for language under-\n standing. In Proceedings of the 2019 Conference\n of the North American Chapter of the Association\n for Computational Linguistics: Human Language\n Technologies, Volume 1 (Long and Short Papers),\n pages 4171\u20134186, Minneapolis, Minnesota. Associ-\n ation for Computational Linguistics.\n Jacob Devlin, Ming-Wei Chang, Kenton Lee, and\n Kristina Toutanova. 2019b. BERT: Pre-training of\n deep bidirectional transformers for language under-\n standing. In Proceedings of the 2019 Conference\n of the North American Chapter of the Association\n for Computational Linguistics: Human Language\n Technologies, Volume 1 (Long and Short Papers),\n pages 4171\u20134186, Minneapolis, Minnesota. Associ-\n ation for Computational Linguistics.\n Ning Ding, Ziran Li, Zhiyuan Liu, Haitao Zheng,\n and Zibo Lin. 2019. Event detection with trigger-\n aware lattice neural network. 
In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 347–356, Hong Kong, China. Association for Computational Linguistics.

Ning Ding, Xiaobin Wang, Yao Fu, Guangwei Xu, Rui Wang, Pengjun Xie, Ying Shen, Fei Huang, Hai-Tao Zheng, and Rui Zhang. 2021. Prototypical representation learning for relation extraction. In International Conference on Learning Representations.

Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017, volume 70 of Proceedings of Machine Learning Research, pages 1126–1135. PMLR.

Alexander Fritzler, Varvara Logacheva, and Maksim Kretov. 2019. Few-shot classification in named entity recognition task. In Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, pages 993–1000.

Dan Gillick, Nevena Lazic, Kuzman Ganchev, Jesse Kirchner, and David Huynh.
", "mimetype": "text/plain", "start_char_idx": 3555, "end_char_idx": 3635, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "c52e3f4a-332f-4829-9c57-c42ad62c4c61": {"__data__": {"id_": "c52e3f4a-332f-4829-9c57-c42ad62c4c61", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "7d05593a-cace-44c8-ad7c-15b475ca0267", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "bd293015dce9ca2a346361ea4af1ca08e03356003311fea5e5c30f67f50034e0", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "812846d5-bd57-4218-8039-072d4826c457", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "8afe57d020b291f7d2c074a8cfbea21e5d91fa2a6ec737f129bf199d39f7c070", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2014. Context-\n dependent fine-grained entity type tagging. arXiv\n preprint arXiv:1412.1820.\nXu Han, Hao Zhu, Pengfei Yu, Ziyun Wang, Yuan\n Yao, Zhiyuan Liu, and Maosong Sun. 2018. FewRel:\n A large-scale supervised few-shot relation classifica-\n tion dataset with state-of-the-art evaluation. In Pro-\n ceedings of the 2018 Conference on Empirical Meth-\n ods in Natural Language Processing, pages 4803\u2013\n 4809, Brussels, Belgium. Association for Computa-\n tional Linguistics.\nMaximilian Hofer, Andrey Kormilitzin, Paul Goldberg,\n and Alejo Nevado-Holgado. 2018. Few-shot learn-\n ing for named entity recognition in medical text.\n arXiv preprint arXiv:1811.05468.\nYutai Hou, Wanxiang Che, Yongkui Lai, Zhihan Zhou,\n Yijia Liu, Han Liu, and Ting Liu. 2020. Few-shot\n slot tagging with collapsed dependency transfer and\n label-enhanced task-adaptive projection network. In\n Proceedings of the 58th Annual Meeting of the Asso-\n ciation for Computational Linguistics, pages 1381\u2013\n 1393, Online. Association for Computational Lin-\n guistics.\n\nJiaxin Huang, Chunyuan Li, Krishan Subudhi, Damien\n Jose, Shobana Balakrishnan, Weizhu Chen, Baolin\n Peng, Jianfeng Gao, and Jiawei Han. 2020. Few-\n shot named entity recognition: A comprehensive\n study. arXiv preprint arXiv:2012.14978.\nGuillaume Lample, Miguel Ballesteros, Sandeep Sub-\n ramanian, Kazuya Kawakami, and Chris Dyer. 2016.\n Neural architectures for named entity recognition.\n In Proceedings of the 2016 Conference of the North\n American Chapter of the Association for Computa-\n tional Linguistics: Human Language Technologies,\n pages 260\u2013270, San Diego, California. Association\n for Computational Linguistics.\nAoxue Li, Tiange Luo, Zhiwu Lu, Tao Xiang, and Li-\n wei Wang. 2019a. 
Large-scale few-shot learning: Knowledge transfer with class hierarchy. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019, pages 7212–7220. Computer Vision Foundation / IEEE.

Jing Li, Billy Chiu, Shanshan Feng, and Hao Wang.
{"node_id": "5b74caa6-0e1a-4998-8fce-bc485614f693", "node_type": "1", "metadata": {}, "hash": "23766ac4f90b54b4e76856b4cc8ea2fb739af601e40e81609bdaacbf6efdc6aa", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2020a. Few-shot named entity recognition via meta-\n learning. IEEE Transactions on Knowledge and\n Data Engineering.\n\n Xiaoya Li, Jingrong Feng, Yuxian Meng, Qinghong\n Han, Fei Wu, and Jiwei Li. 2020b. A unified MRC\n framework for named entity recognition. In Pro-\n ceedings of the 58th Annual Meeting of the Asso-\n ciation for Computational Linguistics, pages 5849\u2013\n 5859, Online. Association for Computational Lin-\n guistics.\nZiran Li, Ning Ding, Zhiyuan Liu, Haitao Zheng,\n and Ying Shen. ", "mimetype": "text/plain", "start_char_idx": 234, "end_char_idx": 756, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "5b74caa6-0e1a-4998-8fce-bc485614f693": {"__data__": {"id_": "5b74caa6-0e1a-4998-8fce-bc485614f693", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "44bfd932-fee6-4074-a125-50f4c9a9ec00", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "b8b680e22151226fad595439f435384144884a056595224b163178d8e549e603", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "fbb1da9d-8adb-456b-a269-3544ffe0f8c3", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "9c1acae81d6cf06908c3d8764c2a1ce3939120549a1ea21483b3414c13017bed", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "ae5d7634-5d34-44d1-a4e7-8d200469f0db", "node_type": "1", "metadata": {}, "hash": "8796dabca3ae60633d853ab117ed2e2b842cf6e5db9116b0d8c53f4f6c41e879", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2019b. Chinese relation extraction\n with multi-grained information and external linguis-\n tic knowledge. In Proceedings of the 57th Annual\n Meeting of the Association for Computational Lin-\n guistics, pages 4377\u20134386, Florence, Italy. Associa-\n tion for Computational Linguistics.\n\n Xiao Ling and Daniel S. Weld. 2012. Fine-grained en-\n tity recognition. In Proceedings of the Twenty-Sixth\n AAAI Conference on Artificial Intelligence, July 22-\n 26, 2012, Toronto, Ontario, Canada. AAAI Press.\n Ilya Loshchilov and Frank Hutter. 
", "mimetype": "text/plain", "start_char_idx": 756, "end_char_idx": 1308, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "ae5d7634-5d34-44d1-a4e7-8d200469f0db": {"__data__": {"id_": "ae5d7634-5d34-44d1-a4e7-8d200469f0db", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "44bfd932-fee6-4074-a125-50f4c9a9ec00", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "b8b680e22151226fad595439f435384144884a056595224b163178d8e549e603", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "5b74caa6-0e1a-4998-8fce-bc485614f693", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "fdfb36f9efb828c887c3928c11563f0abed35d6b5a617e8ea570abd17c2d6464", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "51714cff-a266-4cf3-96f1-bbb555068ce9", "node_type": "1", "metadata": {}, "hash": "191eb860d6e58621b5eeec9b6aa6eb96c8d0dfd21ca58783d63626170959bb07", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2019. Decoupled\n weight decay regularization. In International Con-\n ference on Learning Representations.\n\nXuezhe Ma and Eduard Hovy. 2016. End-to-end\n sequence labeling via bi-directional LSTM-CNNs-\n CRF. In Proceedings of the 54th Annual Meeting of\n the Association for Computational Linguistics (Vol-\n ume 1: Long Papers), pages 1064\u20131074, Berlin, Ger-\n many. Association for Computational Linguistics.\nAdam Paszke, Sam Gross, Francisco Massa, Adam\n Lerer, James Bradbury, Gregory Chanan, Trevor\n Killeen, Zeming Lin, Natalia Gimelshein, Luca\n Antiga, Alban Desmaison, Andreas K\u00a8opf, Edward\n Yang, Zachary DeVito, Martin Raison, Alykhan Te-\n jani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang,\n Junjie Bai, and Soumith Chintala. 2019. Py-\n torch: An imperative style, high-performance deep\n learning library. In Advances in Neural Informa-\n tion Processing Systems 32: Annual Conference\n on Neural Information Processing Systems 2019,\n NeurIPS 2019, December 8-14, 2019, Vancouver,\n BC, Canada, pages 8024\u20138035.\n\n Nicky Ringland, Xiang Dai, Ben Hachey, Sarvnaz\n Karimi, Cecile Paris, and James R. Curran. 2019.\n NNE: A dataset for nested named entity recognition\n in English newswire. In Proceedings of the 57th An-\n nual Meeting of the Association for Computational\n Linguistics, pages 5176\u20135181, Florence, Italy. Asso-\n ciation for Computational Linguistics.\nAlan Ritter, Sam Clark, Mausam, and Oren Etzioni.\n 2011. Named entity recognition in tweets: An ex-\n perimental study. 
In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 1524–1534, Edinburgh, Scotland, UK. Association for Computational Linguistics.

Ying Shen, Ning Ding, Hai-Tao Zheng, Yaliang Li, and Min Yang. 2020. Modeling relation paths for knowledge graph completion. IEEE Transactions on Knowledge and Data Engineering.

Jake Snell, Kevin Swersky, and Richard S. Zemel. 2017. Prototypical networks for few-shot learning. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, pages 4077–4087.

Amber Stubbs and Özlem Uzuner. 2015. Annotating longitudinal clinical narratives for de-identification: The 2014 i2b2/UTHealth corpus. Journal of Biomedical Informatics, 58:S20–S29.

Erik F. Tjong Kim Sang. 2002. Introduction to the CoNLL-2002 shared task: Language-independent named entity recognition. In COLING-02: The 6th Conference on Natural Language Learning 2002 (CoNLL-2002).

Xiaozhi Wang, Ziqi Wang, Xu Han, Wangyi Jiang, Rong Han, Zhiyuan Liu, Juanzi Li, Peng Li, Yankai Lin, and Jie Zhou. 2020. MAVEN: A massive general domain event detection dataset. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1652–1671, Online. Association for Computational Linguistics.

Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, et al. 2013. OntoNotes Release 5.0 LDC2013T19.
Linguistic Data Consortium, Philadelphia, PA, 23.

Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Remi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander Rush. 2020. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online. Association for Computational Linguistics.

Yi Yang and Arzoo Katiyar. 2020. Simple and effective few-shot named entity recognition with structured nearest neighbor learning. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6365–6375, Online. Association for Computational Linguistics.

Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William Cohen, Ruslan Salakhutdinov, and Christopher D. Manning.
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 126, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "04883a01-7aeb-46c7-ab74-6fa9337c61ee": {"__data__": {"id_": "04883a01-7aeb-46c7-ab74-6fa9337c61ee", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "003a1c4d-0a7c-41a7-a905-8606e7d9e8d7", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "ad0e18b971bb4ab9c8187f0bb04337219fce5dc6ea6e8318b639d54ec4120184", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "22d3e563-46d5-4e6a-a7d5-84b175421878", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "ca7ccbc67094b071ae54681cfe5a52dd20523d7e9f4dc33f993608e3202a94b6", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2018. HotpotQA: A dataset\n for diverse, explainable multi-hop question answer-\n ing. In Proceedings of the 2018 Conference on Em-\n pirical Methods in Natural Language Processing,\n pages 2369\u20132380, Brussels, Belgium. 
Association for Computational Linguistics.", "mimetype": "text/plain", "start_char_idx": 126, "end_char_idx": 408, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "b7178a9a-baa5-4df6-bf34-fe7e2076eb3f": {"__data__": {"id_": "b7178a9a-baa5-4df6-bf34-fe7e2076eb3f", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "a4ada096-e175-47ff-9fc4-b609c217c6ba", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "c9cc67b545e7102c022d100a5ebde2ae77b4cd3823b2fdfa28afd9e7eca78373", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "5a13abdf-cef2-4d15-a4c6-2678fd859672", "node_type": "1", "metadata": {}, "hash": "f317051bb1e81164671436f9177d43d02200e199ff23099fb2826ab5291bfafc", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "A Data Details\n\nA.1 Processing\nWe use the dump2 of English Wikipedia and extract the raw text with WikiExtractor3. The NLTK language tool4 is used for word and sentence tokenization in the preprocessing stage.\n\nB Implementation Details\nAll four models use BERTbase (Devlin et al., 2019a) as the backbone encoder, initialized with the corresponding pre-trained uncased weights6. The hidden size is 768, and the numbers of layers and heads are both 12. 
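The stated encoder configuration can be illustrated with the Huggingface transformers library cited in the footnotes below; this is a minimal illustrative sketch, not part of the authors' released code:

import torch
from transformers import BertModel

# Load the pre-trained uncased BERT-base weights used as the backbone encoder.
encoder = BertModel.from_pretrained("bert-base-uncased")
print(encoder.config.hidden_size)          # 768
print(encoder.config.num_hidden_layers)    # 12
print(encoder.config.num_attention_heads)  # 12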
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 1658, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "5a13abdf-cef2-4d15-a4c6-2678fd859672": {"__data__": {"id_": "5a13abdf-cef2-4d15-a4c6-2678fd859672", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "a4ada096-e175-47ff-9fc4-b609c217c6ba", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "c9cc67b545e7102c022d100a5ebde2ae77b4cd3823b2fdfa28afd9e7eca78373", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "b7178a9a-baa5-4df6-bf34-fe7e2076eb3f", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "d2e2e7a04514266dfa807951883842eb702bf4c7ef1babd02dd7c1a89c77e42f", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "310337fe-3f15-42a8-a1fd-8a9bfc87f6a4", "node_type": "1", "metadata": {}, "hash": "e37bef9d9618bebaa14b5891b0c7db2793b16a917ffcfb43e1e7355fe7844552", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Models\n enization in the preprocessing stage. As stated are implemented by Pytorch framework7 (Paszke\n in Section 4.2, we develope a search engine to et al., 2019) and Huggingface transformers8 (Wolf\n index and select paragraphs with key words in dis- et al., 2020). BERT models are optimized by\n tant dictionaries. If the search is performed with AdamW9 (Loshchilov and Hutter, 2019) with the\n linear operations, the calculation process will be learning rate of 1e-4. We evaluate our implemen-\n extremely slow, instead, we adopt a search engine tations of NNShot and StructShot on the datasets\nwith Lucene5 to conduct effective indexing and used in the original paper, producing similar results.\n searching. For supervised NER, the batch size is 8, and we\n train BERT-Tagger for 70000 steps and evaluate\n A.2 More Details of the Schema it on the test set. For 5 way 1\u223c2 and 5\u223c10 shot\nAs stated in Section 4.1, we use FIGER (Ling and settings, the batch sizes are 16 and 4, and for 10\nWeld, 2012) as the start point and conduct rounds of way 1\u223c2 and 5\u223c10 shot settings, the batch sizes\n make a series of modifications. Despite the modifi- are 8 and 1. We train 12000 episodes and use 500\n cations mentioned in Section 4.1, we also conduct episodes of the dev set to select the best model,\n manual denoising of the automatically annotated and test it on 5000 episodes of the test set. Most\n data of FIER. For each entity type and the cor- hyper-parameters are from original settings. 
We manually tune the hyper-parameter \u03c4 in the Viterbi decoder for StructShot: the value is 0.320 for the 1\u223c2-shot settings and 0.434 for the 5\u223c10-shot settings. All the experiments are conducted with CUDA on NVIDIA Tesla V100 GPUs. With 2 GPUs used, the average time to train 10000 episodes is 135 minutes. The", "mimetype": "text/plain", "start_char_idx": 1658, "end_char_idx": 7692, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "310337fe-3f15-42a8-a1fd-8a9bfc87f6a4": {"__data__": {"id_": "310337fe-3f15-42a8-a1fd-8a9bfc87f6a4", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "a4ada096-e175-47ff-9fc4-b609c217c6ba", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "c9cc67b545e7102c022d100a5ebde2ae77b4cd3823b2fdfa28afd9e7eca78373", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "5a13abdf-cef2-4d15-a4c6-2678fd859672", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "4a97f8e28500fa2c16bf6a826663c16c49a54e60f0f8628619b859fb20781d54", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "number of parameters of the models is 120M.\n\nA.3 Interface\nThe interface is shown in Figure 4, where annotators can conveniently select entity spans, annotate the corresponding coarse and fine types, and check the current annotation information on the interface.\n\nC Entity Types\nAs introduced in Section 4.1 of the main text, FEW-NERD is manually annotated with 8 coarse-grained and 66 fine-grained entity types, and we list all the types in Table 8. The schema is designed with practical situations in mind; we hope it helps readers better understand FEW-NERD. 
Note that ORG is the abbreviation of Organization, and MISC is the abbreviation of Miscellaneous.\n\n[Figure 4: Screenshot of the interface used to annotate FEW-NERD. The screenshot shows an example sentence with spans annotated as person-athlete and organization-sportsteam, together with Save All / Confirm All / Deliver All controls, a task summary, the guideline, and a query list.]\n\n2https://dumps.wikimedia.org/enwiki/\n3https://github.com/attardi/wikiextractor\n4https://www.nltk.org\n5https://lucene.apache.org/\n6https://github.com/google-research/bert\n7https://pytorch.org\n8https://github.com/huggingface/transformers\n9https://www.fast.ai/2018/07/02/adam-weight-decay/#adamw", "mimetype": "text/plain", "start_char_idx": 7692, "end_char_idx": 13204, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "91ed48ed-da65-4f77-98c0-99f800d0db39": {"__data__": {"id_": "91ed48ed-da65-4f77-98c0-99f800d0db39", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "df5c2ab9-c043-48d1-b030-1e78c26fe080", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "d658722730c057e212c837b778ba32953eae565ad65c58e2936b7cf1b7c851d0", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "5998e668-1c0b-4446-ba84-6386fe51b607", "node_type": "1", "metadata": {}, "hash": "c4b192940978fdd36510941059d370d29caa71987793f34201bb0472461dca9d", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Coarse Type: Fine Types\nLocation: GPE, Body of Water, Island, Mountain, Park, Road/Transit, Other\nPerson: Actor, Artist/Author, Athlete, Director, Politician, Scholar, Soldier, Other\nORG: Company, Education, Government, Media, Political Party, Religion, Sports League, Sports Team, Show ORG, Other\nBuilding: Airport, Hospital, Hotel, Library, Restaurant, Sports Facility, Theater\nExample\nThe company moved to a new office in Las Vegas, Nevada.\nThe Finke River normally drains into the Simpson Desert to the north west of the Macumba.\nAn invading army of Teutonic Knights conquered Gotland in 1398.\nC.G.E. 
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 1215, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "5998e668-1c0b-4446-ba84-6386fe51b607": {"__data__": {"id_": "5998e668-1c0b-4446-ba84-6386fe51b607", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "df5c2ab9-c043-48d1-b030-1e78c26fe080", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "d658722730c057e212c837b778ba32953eae565ad65c58e2936b7cf1b7c851d0", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "91ed48ed-da65-4f77-98c0-99f800d0db39", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "8642d18aff05ba41c66b6502118f6cd10945423f7b5edb3c11e6a9783eae2a27", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "06456051-5542-40dd-9ddd-87258d76aa23", "node_type": "1", "metadata": {}, "hash": "a2e0ef7ef3d82054483dbf0249c544a5ee8dfc469082cdc0ac8c46fcd12d7908", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Mannerheim met Thubten Gyatso in Wutai Shan during the course of\nhis expedition from Turkestan to Peking.\nVictoria Park contains examples of work by several architects including\nAlfred Waterhouse (Xaverian College).\nThe thirty-first race of the 1951 season was held on October 7 at the one-mile\ndirt Occoneechee Speedway.\nHerodotus (7.59) reports that Doriscus was the first place Xerxes the Great\nstopped to review his troops.\nThe first performance of any work of Gustav Holst given in that capital.\nA film adaption was made by Arne Bornebusch in 1936.\nSmith was named co-Player of the Week in the Big Ten on offense.\nMargin for Error is a 1943 American drama film directed by Otto Preminger.\nThen-President Gloria Macapagal Arroyo led the inauguration rites of the\nfacility on August 19, 2002.\nJeffery Westbrook and Robert Tarjan (1992) developed an efficient data\nstructure for this problem based on disjoint-set data structures.\nSadowski was promoted to general, and took command of the freshly created\nFortified Area of Silesia.\nIn Albany, Doane planned a cathedral like those in England.\nA Vocaloid voicebank developed and distributed by Yamaha Corporation for\nVocaloid 4.\nLong volunteer coached the offensive line for Briarcrest Christian School\nfor 9 seasons.\nIt was constructed using the savings of the Quezon provincial government.\nHe was the Editor in Chief of Grenada\u2019s national newspaper \u201dThe Free West\nIndian\u201d.\nStanley Norman Evans was a British industrialist and Labour Party politician.\nD\u2019Souza was born on 10 November 
1985 into a Goan Catholic family in\nGoa, India.\nHis strong performances convinced him that he was ready for the NBA.\n", "mimetype": "text/plain", "start_char_idx": 1215, "end_char_idx": 2867, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "06456051-5542-40dd-9ddd-87258d76aa23": {"__data__": {"id_": "06456051-5542-40dd-9ddd-87258d76aa23", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "df5c2ab9-c043-48d1-b030-1e78c26fe080", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "d658722730c057e212c837b778ba32953eae565ad65c58e2936b7cf1b7c851d0", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "5998e668-1c0b-4446-ba84-6386fe51b607", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "ce00a88f68c0ad1319ad7e09222473578e04f19a9e6109de5ded3c9c1d9d14b5", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "The Pirates won the game and the World Series with Oldham on the mound.\nStanding in the Way of Control is the third studio album by American indie\nrock band Gossip.\nHe is the Creative Director of the Oliver Sacks Foundation.\nThe city is served by the Sir Seretse Khama International Airport.\nThen he did residency in ophthalmology at Farabi Eye Hospital from 1979\nto 1982.\nNick also played at the regular Sunday evening sessions that were held at\nthe Ramada Inn in Schenectady.\nRMIT University Library consists of six academic branch libraries in Aus-\ntralia and Vietnam.\nThe first Panda Express restaurant opened in Galleria II in the same year, on\nlevel 3 near Bloomingdale\u2019s.\nThis was the last year that the Razorbacks would play in Barnhill Arena.\nFrom 1954, she became a guest singer at the Vienna State Opera.", "mimetype": "text/plain", "start_char_idx": 2867, "end_char_idx": 3682, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "492b4f97-d056-4cde-bbf6-d2fa2a5b21b0": {"__data__": {"id_": "492b4f97-d056-4cde-bbf6-d2fa2a5b21b0", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], 
"relationships": {"1": {"node_id": "b927cf40-69a0-4d7a-9d6f-768c72cbf5d2", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "c152e3c4ee6611a83b375fd6b7975ca60011353e30fd0e38986be3a95948c53c", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "fa1b2e06-8569-4c40-b557-50ab94a0728d", "node_type": "1", "metadata": {}, "hash": "8ccc7359253962eb9eb679631bc797c0fa109dcf61efd4760e05a82bc4b78894", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " Other Eissler designated Masson to succeed him as Director of the Sigmund Freud\n Archives after his and Anna Freud\u2019s death.\n Music \u201dGet Right\u201d is a song recorded by American singer Jennifer Lopez for her\n fourth studio album.\n Art Film Margin for Error is a 1943 American drama film directed by Otto Preminger.\n Written Art The Count is a text adventure written by Scott Adams and published by\n Adventure International in 1979.\n Broadcast In the fall of 1957, Mitchell starred in ABC\u2019s \u201dThe Guy Mitchell Show\u201d.\n Painting His painting \u2019Rooftops\u2019 has been in the collection of the City of London\n Corporation since 1989.\n Other Kirwan appeared on stage at the Chichester Festival Theatre in a Jeremy\n Herrin production of Uncle Vanya.\n Airplane The Royal Norwegian Air Force\u2019s 330 Squadron operates a Westland Sea\n King search and rescue helicopter out of Flor\u00f8.\n Car The BYD Tang plug-in hybrid SUV was the top selling plug-in car with\nProduct 31,405 units delivered.\n Food The words \u201dTime to make the donuts\u201d are printed on the side of Dunkin\u2019\n Donuts boxes in memory of Michael Vale/Fred the Baker.\n Game Team Andromeda wanted to create a fully 3D arcade game, having worked\n on similar games such as \u201dOut Run\u201d which were not truly 3D.\n Ship As night fell, Marine Corps General Holland Smith studied reports aboard\n the command ship \u201dEldorado\u201d.\n Software It allows communication between the Wolfram Mathematica kernel and\n front-end.\n Train On 9 June 1929, railcar No. 
220 \u201dWaterwitch\u201d overran signals at Marshgate Junction.\n Weapon Mannerheim gave Tibet\u2019s spiritual pontiff a Browning revolver and showed him how to reload the weapon.\n Other Rhinestone is as artificial and synthetic a concoction as has ever made its way to the screen.\n Event Attack It was on this route that Tecumseh was killed at the Battle of the Thames on October 5, 1813.\n Election At the 1935 United Kingdom general election, McGleenan stood in Armagh as an Independent Republican.\n Natural Disaster He was originally from Chicago, but moved to Japan after the Second Great Kanto earthquake that all but decimated Japan\u2019s infrastructure.\n ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 3013, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "fa1b2e06-8569-4c40-b557-50ab94a0728d": {"__data__": {"id_": "fa1b2e06-8569-4c40-b557-50ab94a0728d", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "b927cf40-69a0-4d7a-9d6f-768c72cbf5d2", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "c152e3c4ee6611a83b375fd6b7975ca60011353e30fd0e38986be3a95948c53c", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "492b4f97-d056-4cde-bbf6-d2fa2a5b21b0", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "9275d481485002e742d430b0789ca6e90589eaa44b188a7790fecf04ffb61047", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "f70535c7-2605-4c2f-b0fc-4e390501a1e4", "node_type": "1", "metadata": {}, "hash": "2b89d7b9dc93a020bc5669695aec18eee0be614c2dae21179ba633dfc5f96f09", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Protest In 1832, following the failed Polish November Uprising, the Dominican monastery was sequestrated.\n Sports Event Carle received a new defense partner when the Flyers traded for Chris Pronger at the 2009 NHL Entry Draft.\n Other One of TMG\u2019s first performances was in September 1972 at the Waitara Festival.\n ", "mimetype": "text/plain", "start_char_idx": 3013, "end_char_idx": 3476, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "f70535c7-2605-4c2f-b0fc-4e390501a1e4": {"__data__": {"id_": "f70535c7-2605-4c2f-b0fc-4e390501a1e4", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, 
"excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "b927cf40-69a0-4d7a-9d6f-768c72cbf5d2", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "c152e3c4ee6611a83b375fd6b7975ca60011353e30fd0e38986be3a95948c53c", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "fa1b2e06-8569-4c40-b557-50ab94a0728d", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "9ee325e155102138302e74b1241fdaae467b551764014d87255c4702e4eff852", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Astronomy He discovered a number of double stars and took many photographs of Mars.\n Award He was awarded the Bialik Prize eight years later for these efforts.\n Biology Estradiol valerate is rapidly hydrolyzed into estradiol in the intestines.\n Chemistry It was the first gas manufacturer in Kuwait to provide industrial gases such\n MISC as oxygen and nitrogen to the local petroleum industry.\n Currency Total investment has been 19 billion Norwegian krone.\n Disease The 2020 competition was cancelled as part of the effort to minimize the\n COVID-19 pandemic.\n Educational Degree Sigurlaug enrolled into the medical department of the University of Iceland\n and graduated as a Medical Doctor in 2010.\n God Originally a farmer, Viking Ragnar Lothbrok claims to be descended from\n the god Odin.", "mimetype": "text/plain", "start_char_idx": 3476, "end_char_idx": 4528, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "76c294c4-2bf8-4452-9ad4-beb68c0848c3": {"__data__": {"id_": "76c294c4-2bf8-4452-9ad4-beb68c0848c3", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "c5f79cf4-c754-4363-b83f-4f33f84bb398", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "cae5d206ee6b291b2b022af2d4ceddd1925842a410b5abb435403ea4d737176c", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "c82a6593-cdd2-458f-915a-b0cbba22ba2a", "node_type": "1", "metadata": {}, "hash": "557f6f478ab11d50cc1b3ebd07fe2701a98d8cdd00c02544412bab67167f4460", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", 
"text": " Language The play was translated into English by Michael Hofmann and published in\n 1987 by Hamish Hamilton.\n Law Four of his five policy recommendations were incorporated into the U.S.\n ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 408, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "c82a6593-cdd2-458f-915a-b0cbba22ba2a": {"__data__": {"id_": "c82a6593-cdd2-458f-915a-b0cbba22ba2a", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "c5f79cf4-c754-4363-b83f-4f33f84bb398", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "cae5d206ee6b291b2b022af2d4ceddd1925842a410b5abb435403ea4d737176c", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "76c294c4-2bf8-4452-9ad4-beb68c0848c3", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}, "hash": "eba346f03d76d266510635e125d822a6ac398628de7719f7ee3791db2145d514", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Federal Financial Law of 1966.\n Living Thing Schistura horai is a species of ray-finned fish in the stone loach genus\n \u201dSchistura\u201d.\n Medical Precious Blood Hospital offers specialist outpatient and inpatient services in\n General medicine.\n\nTable 8: All the coarse-grained and fine-grained entity types in FEW-NERD, we only highlight the entities with\nthe corresponding entity types in \u201cExample\u201d.", "mimetype": "text/plain", "start_char_idx": 408, "end_char_idx": 1017, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}}, "docstore/metadata": {"3e9bf844-0a4e-4de1-8be3-8a00f47f9be1": {"doc_hash": "576796b57e689b22de15d34b82d5053b9d5bdf4a99467e23748720950f46b830", "ref_doc_id": "13560595-482a-49aa-a22e-445536e10517"}, "d950eb15-82e3-4c1c-b8bb-d5a7249aadae": {"doc_hash": "c1f96e27b46409fd7c6f3cac110b6b12f040b98823a535ffba4579f5196168ac", "ref_doc_id": "13560595-482a-49aa-a22e-445536e10517"}, "89d6be11-da41-4dd7-899f-1340a92c4cd2": {"doc_hash": "cb4d43621c35c5b8f1ff16218a1e21d6f654171eea545a23bef5b185dcae6c5d", "ref_doc_id": "19de4927-f29a-4b3f-8694-79ef720fc706"}, "3a410009-58ad-4f35-9627-bfaa50dd56d8": {"doc_hash": "973f10c7404cc1f6eb4b7419ecb9f9b65f7f4a1960933a0d9e77ae416d60afb4", "ref_doc_id": "19de4927-f29a-4b3f-8694-79ef720fc706"}, "08a4de1b-58e0-4975-a68c-99b215ddca75": {"doc_hash": "886f24d810ac240a26ce195301ff78bf68c2e691894a7442daf75bbd3b93bb2c", "ref_doc_id": "19de4927-f29a-4b3f-8694-79ef720fc706"}, "703bb83a-4aea-4eb3-85a6-086d25555ccb": {"doc_hash": 
"c1fea1f263e861918b8164dd5899ca02120368c82ba7b941e44eca96b94f63f8", "ref_doc_id": "7a282a7e-fbe9-4d98-a3ca-17e66c5b5818"}, "f6c2c3db-ba3c-489e-9459-e6b4f579286b": {"doc_hash": "b019f6b69c45a12b54f0253a7ef02aa219ff58969e4136031fb9d8284e5b2151", "ref_doc_id": "7a282a7e-fbe9-4d98-a3ca-17e66c5b5818"}, "547f541a-ed82-4d22-af00-51a95dc3f0e1": {"doc_hash": "4f67430b5ae0a067bfe8d8399de04e224234e324f5e4e2ca18e032a9ab9275f4", "ref_doc_id": "48384cbb-f501-4add-bb32-198c5bc033a8"}, "5bd4a82d-022c-47e6-9bbb-8bdeef20f515": {"doc_hash": "7431dd7f03f5d180517e133d870a0219c77f32d2552a9b6218cce714a9867162", "ref_doc_id": "48384cbb-f501-4add-bb32-198c5bc033a8"}, "de0a20d6-b6dc-4ff3-8b6e-f6ad19472b08": {"doc_hash": "3255d025003acd326f1bc88fdaa9176eb886b6027ee323e73b9f687405d0ce07", "ref_doc_id": "48384cbb-f501-4add-bb32-198c5bc033a8"}, "39abd0c8-e1f5-4ee3-8da1-537353646ec6": {"doc_hash": "946e0fda6aa5de3de03554f4cd8f69a4c4c4cc39c9c6c3c0d18ab2f3d2397579", "ref_doc_id": "297587ad-1bff-4027-9c1e-7b5732f0d283"}, "ec59971c-cf54-40e2-9a55-c5de0cdbea76": {"doc_hash": "7164356aa9e8f2df54101a489109a375b5e85b1cb43ad0fb15ccd21f4ecc4ffe", "ref_doc_id": "297587ad-1bff-4027-9c1e-7b5732f0d283"}, "ce1695d1-7872-48ae-8589-5b5ed5355234": {"doc_hash": "ba3a7cfe46d943d4be44eed67ff6bd9acc215493ddfeac55bec780ddd5d933de", "ref_doc_id": "297587ad-1bff-4027-9c1e-7b5732f0d283"}, "5a2138f4-d397-4d63-9cac-d45d9fe4de7e": {"doc_hash": "1a6d32e20e86e7ff943edededb2390c3598451572dfb8a49fa89230f4d3c598c", "ref_doc_id": "12b98149-050c-4c93-81db-ea720272647e"}, "a2435907-a143-49c8-b483-ee3e8a02ba74": {"doc_hash": "8dae53115baed88ef4682b02b0e2d08e94e24a4984ce6b08396678c1a7c97ce7", "ref_doc_id": "12b98149-050c-4c93-81db-ea720272647e"}, "b3793ecc-96fc-4f50-bc61-21be9868e23b": {"doc_hash": "8a7a3f8bb5b2063ab717850df1411b14083f16386a2a6360d7d756256bd366c4", "ref_doc_id": "12b98149-050c-4c93-81db-ea720272647e"}, "c33b63d5-7341-40f1-9016-43201810afd5": {"doc_hash": "3d5e80599a4375c7886c2967fcf8a54badb3513ff488d9a03b295b8074e1fa7a", "ref_doc_id": "5c930767-ef2a-434e-91d1-780d7a9deb81"}, "ecacd21e-1829-48fa-95ab-5c90846e8dd3": {"doc_hash": "959bf9577e6f311a7c7757d259611276d0f23bc8487222cd6678c94fb44efeba", "ref_doc_id": "5c930767-ef2a-434e-91d1-780d7a9deb81"}, "95509f41-b5f0-4bc4-ba2c-886ad18a6046": {"doc_hash": "063045c6ffb3241c763f647f10411ec81558a4e1c93cdabb503301e008d0febe", "ref_doc_id": "5c930767-ef2a-434e-91d1-780d7a9deb81"}, "b778bdc3-b7ac-4222-b5f9-8e068507f3a6": {"doc_hash": "a593b49223ff84672127a243d4951709bb632f75b3dec73238a7928fce3fe317", "ref_doc_id": "fc3045bc-4ea0-4c56-aa91-9689fa1cee0d"}, "810ba2d6-65c6-4378-91c4-4ba38f087746": {"doc_hash": "5c86f26215998964aa4f80733a31ff48d71834cefbc0e7c508652bddf358767a", "ref_doc_id": "fc3045bc-4ea0-4c56-aa91-9689fa1cee0d"}, "27c32a2f-d0a1-4540-90a2-aed3847dc7e4": {"doc_hash": "bbabec9c4778e4f90d334b2020e61df2ae1bdc85d03e854ff28e0da66e1a8dcf", "ref_doc_id": "69aa87b9-95d8-46bd-b46f-f3948d0a5708"}, "c2ae573a-cfd8-4747-a7c2-ce1d55a0484b": {"doc_hash": "15380f84e88453210db0a8c7ab5cc42009ff66f68d4e66e029d3d1fcf38efd86", "ref_doc_id": "69aa87b9-95d8-46bd-b46f-f3948d0a5708"}, "9cc52dba-eaee-481f-b340-5c0a400c28e7": {"doc_hash": "e57cd03791979380484f8ad4b23c0ac1689681fe0a3a8a838cb94995b2e71224", "ref_doc_id": "69aa87b9-95d8-46bd-b46f-f3948d0a5708"}, "f1116f47-ab33-4225-bb26-ddc62fe95589": {"doc_hash": "1bbeb825aa73acd4f2bda14f92aebcaf9ef109db74efb2e92cd7058f95df94f1", "ref_doc_id": "7d05593a-cace-44c8-ad7c-15b475ca0267"}, "e72dac24-34a6-4159-818b-d6f023d89f0c": {"doc_hash": 
"e91fed0f32692abd2070a7bf73e855efebac7c6516b221b1f5ef89283bcad78b", "ref_doc_id": "7d05593a-cace-44c8-ad7c-15b475ca0267"}, "00f9b9f2-a717-4ccb-a263-c9c92e3a0604": {"doc_hash": "1dd73aff08b1e4d79f5ebb9551934a92650bfe7bffcbc212780a9f2d886731eb", "ref_doc_id": "7d05593a-cace-44c8-ad7c-15b475ca0267"}, "0b352382-f3d6-4693-8571-1762bd92e288": {"doc_hash": "f36c6ca5eebedf6abb5b59826751b9a025960535bfd565d7fd44080971aefc81", "ref_doc_id": "7d05593a-cace-44c8-ad7c-15b475ca0267"}, "812846d5-bd57-4218-8039-072d4826c457": {"doc_hash": "8afe57d020b291f7d2c074a8cfbea21e5d91fa2a6ec737f129bf199d39f7c070", "ref_doc_id": "7d05593a-cace-44c8-ad7c-15b475ca0267"}, "c52e3f4a-332f-4829-9c57-c42ad62c4c61": {"doc_hash": "999e3a4b02d38c963706e1d05333b4277f2012e56ef909d182fa73e9e77e030e", "ref_doc_id": "7d05593a-cace-44c8-ad7c-15b475ca0267"}, "e32886ff-2b1a-422c-b95b-e421bd43419f": {"doc_hash": "e0846d4696ffe6313c11e374b1bba89f9cac0be7bb6b7a5a01858b2b136efddd", "ref_doc_id": "44bfd932-fee6-4074-a125-50f4c9a9ec00"}, "fbb1da9d-8adb-456b-a269-3544ffe0f8c3": {"doc_hash": "9c1acae81d6cf06908c3d8764c2a1ce3939120549a1ea21483b3414c13017bed", "ref_doc_id": "44bfd932-fee6-4074-a125-50f4c9a9ec00"}, "5b74caa6-0e1a-4998-8fce-bc485614f693": {"doc_hash": "fdfb36f9efb828c887c3928c11563f0abed35d6b5a617e8ea570abd17c2d6464", "ref_doc_id": "44bfd932-fee6-4074-a125-50f4c9a9ec00"}, "ae5d7634-5d34-44d1-a4e7-8d200469f0db": {"doc_hash": "f8eb07f417ebf2d68de8a444127f40066c8f8311545bb7cf8fa2a85315239053", "ref_doc_id": "44bfd932-fee6-4074-a125-50f4c9a9ec00"}, "51714cff-a266-4cf3-96f1-bbb555068ce9": {"doc_hash": "51f2db76684508d27aa171fd8830fe03652e44af930add124e60d9816a15b89c", "ref_doc_id": "44bfd932-fee6-4074-a125-50f4c9a9ec00"}, "22d3e563-46d5-4e6a-a7d5-84b175421878": {"doc_hash": "ca7ccbc67094b071ae54681cfe5a52dd20523d7e9f4dc33f993608e3202a94b6", "ref_doc_id": "003a1c4d-0a7c-41a7-a905-8606e7d9e8d7"}, "04883a01-7aeb-46c7-ab74-6fa9337c61ee": {"doc_hash": "03eb769ba7629a8ef408fb471a560359be549810df1dbc93e6d6b84ee2947526", "ref_doc_id": "003a1c4d-0a7c-41a7-a905-8606e7d9e8d7"}, "b7178a9a-baa5-4df6-bf34-fe7e2076eb3f": {"doc_hash": "d2e2e7a04514266dfa807951883842eb702bf4c7ef1babd02dd7c1a89c77e42f", "ref_doc_id": "a4ada096-e175-47ff-9fc4-b609c217c6ba"}, "5a13abdf-cef2-4d15-a4c6-2678fd859672": {"doc_hash": "4a97f8e28500fa2c16bf6a826663c16c49a54e60f0f8628619b859fb20781d54", "ref_doc_id": "a4ada096-e175-47ff-9fc4-b609c217c6ba"}, "310337fe-3f15-42a8-a1fd-8a9bfc87f6a4": {"doc_hash": "8c24e3e71eb1d3868096bed9e6509ab02650f568637014c55491d9bf890eb80e", "ref_doc_id": "a4ada096-e175-47ff-9fc4-b609c217c6ba"}, "91ed48ed-da65-4f77-98c0-99f800d0db39": {"doc_hash": "8642d18aff05ba41c66b6502118f6cd10945423f7b5edb3c11e6a9783eae2a27", "ref_doc_id": "df5c2ab9-c043-48d1-b030-1e78c26fe080"}, "5998e668-1c0b-4446-ba84-6386fe51b607": {"doc_hash": "ce00a88f68c0ad1319ad7e09222473578e04f19a9e6109de5ded3c9c1d9d14b5", "ref_doc_id": "df5c2ab9-c043-48d1-b030-1e78c26fe080"}, "06456051-5542-40dd-9ddd-87258d76aa23": {"doc_hash": "ae01a3026568bf3de3ae7aa19abc61054c0fbc8416c250b57616a83b36ee7794", "ref_doc_id": "df5c2ab9-c043-48d1-b030-1e78c26fe080"}, "492b4f97-d056-4cde-bbf6-d2fa2a5b21b0": {"doc_hash": "9275d481485002e742d430b0789ca6e90589eaa44b188a7790fecf04ffb61047", "ref_doc_id": "b927cf40-69a0-4d7a-9d6f-768c72cbf5d2"}, "fa1b2e06-8569-4c40-b557-50ab94a0728d": {"doc_hash": "9ee325e155102138302e74b1241fdaae467b551764014d87255c4702e4eff852", "ref_doc_id": "b927cf40-69a0-4d7a-9d6f-768c72cbf5d2"}, "f70535c7-2605-4c2f-b0fc-4e390501a1e4": {"doc_hash": 
"f119230e0539df52ca6e9b72b018498680cc81cac108492673c48b0c459b7a0d", "ref_doc_id": "b927cf40-69a0-4d7a-9d6f-768c72cbf5d2"}, "76c294c4-2bf8-4452-9ad4-beb68c0848c3": {"doc_hash": "eba346f03d76d266510635e125d822a6ac398628de7719f7ee3791db2145d514", "ref_doc_id": "c5f79cf4-c754-4363-b83f-4f33f84bb398"}, "c82a6593-cdd2-458f-915a-b0cbba22ba2a": {"doc_hash": "a66614eb078d9801404b651cb8d4987412accf5d8b356598b1060bf1396a1fa1", "ref_doc_id": "c5f79cf4-c754-4363-b83f-4f33f84bb398"}}, "docstore/ref_doc_info": {"13560595-482a-49aa-a22e-445536e10517": {"node_ids": ["3e9bf844-0a4e-4de1-8be3-8a00f47f9be1", "d950eb15-82e3-4c1c-b8bb-d5a7249aadae"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "19de4927-f29a-4b3f-8694-79ef720fc706": {"node_ids": ["89d6be11-da41-4dd7-899f-1340a92c4cd2", "3a410009-58ad-4f35-9627-bfaa50dd56d8", "08a4de1b-58e0-4975-a68c-99b215ddca75"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "7a282a7e-fbe9-4d98-a3ca-17e66c5b5818": {"node_ids": ["703bb83a-4aea-4eb3-85a6-086d25555ccb", "f6c2c3db-ba3c-489e-9459-e6b4f579286b"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "48384cbb-f501-4add-bb32-198c5bc033a8": {"node_ids": ["547f541a-ed82-4d22-af00-51a95dc3f0e1", "5bd4a82d-022c-47e6-9bbb-8bdeef20f515", "de0a20d6-b6dc-4ff3-8b6e-f6ad19472b08"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "297587ad-1bff-4027-9c1e-7b5732f0d283": {"node_ids": ["39abd0c8-e1f5-4ee3-8da1-537353646ec6", "ec59971c-cf54-40e2-9a55-c5de0cdbea76", "ce1695d1-7872-48ae-8589-5b5ed5355234"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "12b98149-050c-4c93-81db-ea720272647e": {"node_ids": ["5a2138f4-d397-4d63-9cac-d45d9fe4de7e", "a2435907-a143-49c8-b483-ee3e8a02ba74", "b3793ecc-96fc-4f50-bc61-21be9868e23b"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "5c930767-ef2a-434e-91d1-780d7a9deb81": {"node_ids": ["c33b63d5-7341-40f1-9016-43201810afd5", "ecacd21e-1829-48fa-95ab-5c90846e8dd3", "95509f41-b5f0-4bc4-ba2c-886ad18a6046"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "fc3045bc-4ea0-4c56-aa91-9689fa1cee0d": {"node_ids": ["b778bdc3-b7ac-4222-b5f9-8e068507f3a6", "810ba2d6-65c6-4378-91c4-4ba38f087746"], "metadata": {"file_path": 
"C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "69aa87b9-95d8-46bd-b46f-f3948d0a5708": {"node_ids": ["27c32a2f-d0a1-4540-90a2-aed3847dc7e4", "c2ae573a-cfd8-4747-a7c2-ce1d55a0484b", "9cc52dba-eaee-481f-b340-5c0a400c28e7"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "7d05593a-cace-44c8-ad7c-15b475ca0267": {"node_ids": ["f1116f47-ab33-4225-bb26-ddc62fe95589", "e72dac24-34a6-4159-818b-d6f023d89f0c", "00f9b9f2-a717-4ccb-a263-c9c92e3a0604", "0b352382-f3d6-4693-8571-1762bd92e288", "812846d5-bd57-4218-8039-072d4826c457", "c52e3f4a-332f-4829-9c57-c42ad62c4c61"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "44bfd932-fee6-4074-a125-50f4c9a9ec00": {"node_ids": ["e32886ff-2b1a-422c-b95b-e421bd43419f", "fbb1da9d-8adb-456b-a269-3544ffe0f8c3", "5b74caa6-0e1a-4998-8fce-bc485614f693", "ae5d7634-5d34-44d1-a4e7-8d200469f0db", "51714cff-a266-4cf3-96f1-bbb555068ce9"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "003a1c4d-0a7c-41a7-a905-8606e7d9e8d7": {"node_ids": ["22d3e563-46d5-4e6a-a7d5-84b175421878", "04883a01-7aeb-46c7-ab74-6fa9337c61ee"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "a4ada096-e175-47ff-9fc4-b609c217c6ba": {"node_ids": ["b7178a9a-baa5-4df6-bf34-fe7e2076eb3f", "5a13abdf-cef2-4d15-a4c6-2678fd859672", "310337fe-3f15-42a8-a1fd-8a9bfc87f6a4"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "df5c2ab9-c043-48d1-b030-1e78c26fe080": {"node_ids": ["91ed48ed-da65-4f77-98c0-99f800d0db39", "5998e668-1c0b-4446-ba84-6386fe51b607", "06456051-5542-40dd-9ddd-87258d76aa23"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "b927cf40-69a0-4d7a-9d6f-768c72cbf5d2": {"node_ids": ["492b4f97-d056-4cde-bbf6-d2fa2a5b21b0", "fa1b2e06-8569-4c40-b557-50ab94a0728d", "f70535c7-2605-4c2f-b0fc-4e390501a1e4"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": "2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}, "c5f79cf4-c754-4363-b83f-4f33f84bb398": {"node_ids": ["76c294c4-2bf8-4452-9ad4-beb68c0848c3", "c82a6593-cdd2-458f-915a-b0cbba22ba2a"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\2105.07464v6.pdf", "file_name": 
"2105.07464v6.pdf", "file_type": "application/pdf", "file_size": 843730, "creation_date": "2024-12-05", "last_modified_date": "2024-11-14"}}}} \ No newline at end of file +{"docstore/data": {"8d55e99e-029a-47c6-8fae-5bff1ee7672a": {"__data__": {"id_": "8d55e99e-029a-47c6-8fae-5bff1ee7672a", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "f94e48ce-e161-46d1-800f-430e38b4962c", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b48eb0eef4d0df2eae6ff59be5cf4fc6a0f132aec4a695ee580d17064cf41786", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "1d81b4fd-e928-4180-8ef6-a41371ac1fed", "node_type": "1", "metadata": {}, "hash": "44018fcff7312a574af64fd046844327e2023ddeda661b59c96732fb610d3359", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "arXiv:1810.04805v2 [cs.CL] 24 May 2019\n\n BERT: Pre-training of Deep Bidirectional Transformers for\n Language Understanding\n\n Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova\n Google AI Language\n {jacobdevlin,mingweichang,kentonl,kristout}@google.com\n\n Abstract\n We introduce a new language representa-\n tion model called BERT, which stands for\n Bidirectional Encoder Representations from\n Transformers. Unlike recent language repre-\n sentation models (Peters et al., 2018a; Rad-\n ford et al., 2018), BERT is designed to pre-\n train deep bidirectional representations from\n unlabeled text by jointly conditioning on both\n left and right context in all layers. As a re-\n sult, the pre-trained BERT model can be fine-\n tuned with just one additional output layer\n to create state-of-the-art models for a wide\n range of tasks, such as question answering and\n language inference, without substantial task-\n specific architecture modifications.\n BERT is conceptually simple and empirically\n powerful. It obtains new state-of-the-art re-\n sults on eleven natural language processing\n tasks, including pushing the GLUE score to\n 80.5% (7.7% point absolute improvement),\n MultiNLI accuracy to 86.7% (4.6% absolute\n improvement), SQuAD v1.1 question answer-\n ing Test F1 to 93.2 (1.5 point absolute im-\n provement) and SQuAD v2.0 Test F1 to 83.1\n (5.1 point absolute improvement).\n1 Introduction\n\nLanguage model pre-training has been shown to\nbe effective for improving many natural language\nprocessing tasks (Dai and Le, 2015; Peters et al.,\n2018a; Radford et al., 2018; Howard and Ruder,\n2018). 
These include sentence-level tasks such as\nnatural language inference (Bowman et al., 2015;\nWilliams et al., 2018) and paraphrasing (Dolan\nand Brockett, 2005), which aim to predict the re-\nlationships between sentences by analyzing them\nholistically, as well as token-level tasks such as\nnamed entity recognition and question answering,\nwhere models are required to produce fine-grained\noutput at the token level (Tjong Kim Sang and\nDe Meulder, 2003; Rajpurkar et al., 2016).\n There are two existing strategies for apply-\ning pre-trained language representations to down-\nstream tasks: feature-based and fine-tuning. The\nfeature-based approach, such as ELMo (Peters\net al., 2018a), uses task-specific architectures that\ninclude the pre-trained representations as addi-\ntional features. The fine-tuning approach, such as\nthe Generative Pre-trained Transformer (OpenAI\nGPT) (Radford et al., 2018), introduces minimal\ntask-specific parameters, and is trained on the\n\ndownstream tasks by simply fine-tuning all pre-\ntrained parameters. The two approaches share the\nsame objective function during pre-training, where\nthey use unidirectional language models to learn\ngeneral language representations.\n ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 3179, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "1d81b4fd-e928-4180-8ef6-a41371ac1fed": {"__data__": {"id_": "1d81b4fd-e928-4180-8ef6-a41371ac1fed", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "f94e48ce-e161-46d1-800f-430e38b4962c", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b48eb0eef4d0df2eae6ff59be5cf4fc6a0f132aec4a695ee580d17064cf41786", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "8d55e99e-029a-47c6-8fae-5bff1ee7672a", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "2aca2bfa89e3ba6ce60d62fd9a99c06e1bf5d9cd3ff7a62bdec2bf59ae34d9f9", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "We argue that current techniques restrict the\npower of the pre-trained representations, espe-\ncially for the fine-tuning approaches. The ma-\njor limitation is that standard language models are\nunidirectional, and this limits the choice of archi-\n\ntectures that can be used during pre-training. For\nexample, in OpenAI GPT, the authors use a left-to-\nright architecture, where every token can only at-\ntend to previous tokens in the self-attention layers\nof the Transformer (Vaswani et al., 2017). 
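To make the left-to-right restriction concrete: in such an architecture each position can attend only to earlier positions, which corresponds to a lower-triangular attention mask. A minimal illustrative sketch (an assumption-free demonstration of the mask shape, not code from either paper indexed here):

import torch

seq_len = 5
# Causal (left-to-right) mask: position i may attend only to positions j <= i.
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
# A bidirectional encoder instead uses the all-True mask, letting every
# position attend to both left and right context.
bidirectional_mask = torch.ones(seq_len, seq_len, dtype=torch.bool)
print(causal_mask)

It is this triangular restriction that the discussion below argues is sub-optimal for tasks needing context from both directions.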
Such re-\nstrictions are sub-optimal for sentence-level tasks,\nand could be very harmful when applying fine-\ntuning based approaches to token-level tasks such\nas question answering, where it is crucial to incor-\nporate context from both directions.\n In this paper, we improve the fine-tuning based\napproaches by proposing BERT: Bidirectional\nEncoder Representations from Transformers.\nBERT alleviates the previously mentioned unidi-\nrectionality constraint by using a \u201cmasked lan-\nguage model\u201d (MLM) pre-training objective, in-\nspired by the Cloze task (Taylor, 1953). The\nmasked language model randomly masks some of\nthe tokens from the input, and the objective is to\npredict the original vocabulary id of the masked", "mimetype": "text/plain", "start_char_idx": 3179, "end_char_idx": 4412, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "a58908ab-8178-40cd-b2a3-efcaf65228f2": {"__data__": {"id_": "a58908ab-8178-40cd-b2a3-efcaf65228f2", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "dcba1f47-98fb-4b83-8b82-bb0a4e4c9db8", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "0b8e1a0c339f566ebd752c36a4dc2922651392267af70af03aa115f95907c2a7", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "ab890d58-e9ed-49ae-9c78-282a91366161", "node_type": "1", "metadata": {}, "hash": "2fd2d87d3c78a3ac270b0311b7a48b60606c775990a2fa99270b055b6b13889e", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "word based only on its context. Unlike left-to-\nright language model pre-training, the MLM ob-\njective enables the representation to fuse the left\nand the right context, which allows us to pre-\ntrain a deep bidirectional Transformer. In addi-\ntion to the masked language model, we also use\na \u201cnext sentence prediction\u201d task that jointly pre-\ntrains text-pair representations. The contributions\nof our paper are as follows:\n \u2022 We demonstrate the importance of bidirectional\n pre-training for language representations. Un-\n like Radford et al. (2018), which uses unidirec-\n tional language models for pre-training, BERT\n uses masked language models to enable pre-\n trained deep bidirectional representations. This\n is also in contrast to Peters et al. 
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 762, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "ab890d58-e9ed-49ae-9c78-282a91366161": {"__data__": {"id_": "ab890d58-e9ed-49ae-9c78-282a91366161", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "dcba1f47-98fb-4b83-8b82-bb0a4e4c9db8", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "0b8e1a0c339f566ebd752c36a4dc2922651392267af70af03aa115f95907c2a7", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "a58908ab-8178-40cd-b2a3-efcaf65228f2", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "edd6518c76da299b4e2642067d6fafe03a3f884aa22306e6a84a97607f1a363c", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "9bbcaaca-e282-4f65-a5c2-9a5708a228eb", "node_type": "1", "metadata": {}, "hash": "db50d03f28b4aa8423179f882d2286b8c1ec0e6320acd93a52d396276d66c2ea", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "(2018a), which\n uses a shallow concatenation of independently\n trained left-to-right and right-to-left LMs.\n\n \u2022 We show that pre-trained representations reduce\n the need for many heavily-engineered task-\n specific architectures. BERT is the first fine-\n tuning based representation model that achieves\n state-of-the-art performance on a large suite\n of sentence-level and token-level tasks, outper-\n forming many task-specific architectures.\n \u2022 BERT advances the state of the art for eleven\n NLP tasks. The code and pre-trained mod-\n els are available at https://github.com/\n google-research/bert.\n\n2 Related Work\n\nThere is a long history of pre-training general lan-\nguage representations, and we briefly review the\nmost widely-used approaches in this section.\n2.1 Unsupervised Feature-based Approaches\nLearning widely applicable representations of\nwords has been an active area of research for\ndecades, including non-neural (Brown et al., 1992;\nAndo and Zhang, 2005; Blitzer et al., 2006) and\nneural (Mikolov et al., 2013; Pennington et al.,\n2014) methods. Pre-trained word embeddings\nare an integral part of modern NLP systems, of-\nfering significant improvements over embeddings\nlearned from scratch (Turian et al., 2010). 
To pre-\ntrain word embedding vectors, left-to-right lan-\nguage modeling objectives have been used (Mnih\nand Hinton, 2009), as well as objectives to dis-\ncriminate correct from incorrect words in left and\nright context (Mikolov et al., 2013).\n These approaches have been generalized to\ncoarser granularities, such as sentence embed-\ndings (Kiros et al., 2015; Logeswaran and Lee,\n2018) or paragraph embeddings (Le and Mikolov,\n2014). To train sentence representations, prior\nwork has used objectives to rank candidate next\nsentences (Jernite et al., 2017; Logeswaran and\nLee, 2018), left-to-right generation of next sen-\ntence words given a representation of the previous\nsentence (Kiros et al., 2015), or denoising auto-\nencoder derived objectives (Hill et al., 2016).\n ELMo and its predecessor (Peters et al., 2017,\n2018a) generalize traditional word embedding re-\nsearch along a different dimension. They extract\ncontext-sensitive features from a left-to-right and a\nright-to-left language model. The contextual rep-\nresentation of each token is the concatenation of\nthe left-to-right and right-to-left representations.\nWhen integrating contextual word embeddings\nwith existing task-specific architectures, ELMo\nadvances the state of the art for several major NLP\nbenchmarks (Peters et al., 2018a) including ques-\ntion answering (Rajpurkar et al., 2016), sentiment\nanalysis (Socher et al., 2013), and named entity\nrecognition (Tjong Kim Sang and De Meulder,\n2003). Melamud et al. (2016) proposed learning\ncontextual representations through a task to pre-\ndict a single word from both left and right context\nusing LSTMs. Similar to ELMo, their model is\nfeature-based and not deeply bidirectional. ", "mimetype": "text/plain", "start_char_idx": 762, "end_char_idx": 3740, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "9bbcaaca-e282-4f65-a5c2-9a5708a228eb": {"__data__": {"id_": "9bbcaaca-e282-4f65-a5c2-9a5708a228eb", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "dcba1f47-98fb-4b83-8b82-bb0a4e4c9db8", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "0b8e1a0c339f566ebd752c36a4dc2922651392267af70af03aa115f95907c2a7", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "ab890d58-e9ed-49ae-9c78-282a91366161", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "c48e35db928469b51656e9cd7516a4f31009d3c2ebcf1de606c315125da340d6", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Fedus\net al. 
(2018) shows that the cloze task can be used\nto improve the robustness of text generation mod-\nels.\n\n2.2 Unsupervised Fine-tuning Approaches\nAs with the feature-based approaches, the first\nworks in this direction only pre-trained word em-\nbedding parameters from unlabeled text (Col-\nlobert and Weston, 2008).\n More recently, sentence or document encoders\nwhich produce contextual token representations\nhave been pre-trained from unlabeled text and\nfine-tuned for a supervised downstream task (Dai\nand Le, 2015; Howard and Ruder, 2018; Radford\net al., 2018). The advantage of these approaches\nis that few parameters need to be learned from\nscratch. At least partly due to this advantage,\nOpenAI GPT (Radford et al., 2018) achieved pre-\nviously state-of-the-art results on many sentence-\nlevel tasks from the GLUE benchmark (Wang\net al., 2018a). Left-to-right language model-", "mimetype": "text/plain", "start_char_idx": 3740, "end_char_idx": 4661, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "926f43b0-e58c-4e8f-b766-c90f2447361e": {"__data__": {"id_": "926f43b0-e58c-4e8f-b766-c90f2447361e", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "f9edbae8-ed8d-4147-803e-936500d2d60f", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "9df513b63b8b262211a3d8eb8604c1aaa8b1b593862148aa0963a289adb8e421", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "1a82bbed-7218-4c23-8cb5-402971703a30", "node_type": "1", "metadata": {}, "hash": "6ac57136a445edf51d9d665746d3d0bd3c173d180cb8bc84e4a0fc41654c755b", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " [Figure 1 diagram: the \u201cPre-training\u201d panel shows NSP and Mask LM heads over a Masked Sentence A / Masked Sentence B pair; the \u201cFine-Tuning\u201d panel shows MNLI, NER, and SQuAD Start/End Span heads over a Question/Paragraph pair; the same BERT encoder appears in both.]\n\n Figure 1: Overall pre-training and fine-tuning procedures for BERT. Apart from output layers, the same architec-\n tures are used in both pre-training and fine-tuning. 
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 1412, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "1a82bbed-7218-4c23-8cb5-402971703a30": {"__data__": {"id_": "1a82bbed-7218-4c23-8cb5-402971703a30", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "f9edbae8-ed8d-4147-803e-936500d2d60f", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "9df513b63b8b262211a3d8eb8604c1aaa8b1b593862148aa0963a289adb8e421", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "926f43b0-e58c-4e8f-b766-c90f2447361e", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "42369be9595563bb5dd96a4dde08a6e205223c2c6f3edb12c542a50a943a6e42", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "a93a9152-5892-4b84-87c4-d7d7e57b5d64", "node_type": "1", "metadata": {}, "hash": "d2427e562848251c49292d250a0d4e843d7b44186198c1c8f968486e8bfd40a3", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "The same pre-trained model parameters are used to initialize\n models for different down-stream tasks. During fine-tuning, all parameters are fine-tuned. [CLS] is a special\n symbol added in front of every input example, and [SEP] is a special separator token (e.g. 
separating ques-\ntions/answers).\n\n", "mimetype": "text/plain", "start_char_idx": 1412, "end_char_idx": 1774, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "a93a9152-5892-4b84-87c4-d7d7e57b5d64": {"__data__": {"id_": "a93a9152-5892-4b84-87c4-d7d7e57b5d64", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "f9edbae8-ed8d-4147-803e-936500d2d60f", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "9df513b63b8b262211a3d8eb8604c1aaa8b1b593862148aa0963a289adb8e421", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "1a82bbed-7218-4c23-8cb5-402971703a30", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "fc2e237c06d17ca2f6b8044e119b7a0ef6543d27c24ae314a1c3616d6ea6467e", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "ing and auto-encoder objectives have been used\nfor pre-training such models (Howard and Ruder,\n2018; Radford et al., 2018; Dai and Le, 2015).\n2.3 Transfer Learning from Supervised Data\nThere has also been work showing effective trans-\nfer from supervised tasks with large datasets, such\nas natural language inference (Conneau et al.,\n2017) and machine translation (McCann et al.,\n2017). Computer vision research has also demon-\nstrated the importance of transfer learning from\nlarge pre-trained models, where an effective recipe\nis to fine-tune models pre-trained with Ima-\ngeNet (Deng et al., 2009; Yosinski et al., 2014).\n3 BERT\n\nWe introduce BERT and its detailed implementa-\ntion in this section. There are two steps in our\nframework: pre-training and fine-tuning. Dur-\ning pre-training, the model is trained on unlabeled\ndata over different pre-training tasks. For fine-\ntuning, the BERT model is first initialized with\nthe pre-trained parameters, and all of the param-\neters are fine-tuned using labeled data from the\ndownstream tasks. Each downstream task has sep-\narate fine-tuned models, even though they are ini-\ntialized with the same pre-trained parameters. The\nquestion-answering example in Figure 1 will serve\nas a running example for this section.\n A distinctive feature of BERT is its unified ar-\nchitecture across different tasks. There is mini-\n mal difference between the pre-trained architec-\n ture and the final downstream architecture.\n Model Architecture BERT\u2019s model architec-\n ture is a multi-layer bidirectional Transformer en-\n coder based on the original implementation de-\n scribed in Vaswani et al. 
(2017) and released in\n the tensor2tensor library.1 Because the use\n of Transformers has become common and our im-\n plementation is almost identical to the original,\n we will omit an exhaustive background descrip-\n tion of the model architecture and refer readers to\n Vaswani et al. (2017) as well as excellent guides\n such as \u201cThe Annotated Transformer.\u201d2\nIn this work, we denote the number of layers\n (i.e., Transformer blocks) as L, the hidden size as\n H, and the number of self-attention heads as A.3\n We primarily report results on two model sizes:\n BERTBASE (L=12, H=768, A=12, Total Param-\neters=110M) and BERTLARGE (L=24, H=1024,\n A=16, Total Parameters=340M).\n BERTBASE was chosen to have the same model\n size as OpenAI GPT for comparison purposes.\n Critically, however, the BERT Transformer uses\n bidirectional self-attention, while the GPT Trans-\n former uses constrained self-attention where every\n token can only attend to context to its left.4\n 1https://github.com/tensorflow/tensor2tensor\n 2http://nlp.seas.harvard.edu/2018/04/03/attention.html\n 3In all cases we set the feed-forward/filter size to be 4H,\n i.e., 3072 for the H = 768 and 4096 for the H = 1024.\n 4We note that in the literature the bidirectional Trans-\nformer is often referred to as a \u201cTransformer encoder\u201d while\nthe left-context-only version is referred to as a \u201cTransformer\ndecoder\u201d since it can be used for text generation.", "mimetype": "text/plain", "start_char_idx": 1774, "end_char_idx": 5002, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "abbf6d32-14ba-4c60-850d-cf4429b349e4": {"__data__": {"id_": "abbf6d32-14ba-4c60-850d-cf4429b349e4", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "cc193b63-6a56-496d-92cc-812a0a7cf204", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "c504ea5b9924961a3c9ca9f79fe234f1e39fed524dcae1c6501d8e27d6c1815d", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "f1f06b3a-489c-4861-b5be-72cd1c1d8e80", "node_type": "1", "metadata": {}, "hash": "e5f87bf6b7e315ec301f7fa560aefe8486181803b4dd4bbfbbba0af61ff807a6", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Input/Output Representations To make BERT\nhandle a variety of down-stream tasks, our input\nrepresentation is able to unambiguously represent\nboth a single sentence and a pair of sentences\n(e.g., \u3008 Question, Answer \u3009) in one token sequence.\nThroughout this work, a \u201csentence\u201d can be an arbi-\ntrary span of contiguous text, rather than an actual\nlinguistic sentence. A \u201csequence\u201d refers to the in-\nput token sequence to BERT, which may be a sin-\ngle sentence or two sentences packed together.\n We use WordPiece embeddings (Wu et al.,\n2016) with a 30,000 token vocabulary. The first\ntoken of every sequence is always a special clas-\nsification token ([CLS]). 
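The input construction spelled out in the next passage (a token, a segment, and a position embedding summed per position) can be sketched directly. Dimensions below follow the BERTBASE configuration quoted above (H=768, a 30,000-token WordPiece vocabulary); the 512-position limit is an assumption of this sketch, not stated in this excerpt.

```python
import torch
import torch.nn as nn

class BertInputEmbeddings(nn.Module):
    """Illustrative input layer: token + segment + position embeddings."""
    def __init__(self, vocab_size=30000, hidden=768, max_positions=512):
        super().__init__()
        self.token = nn.Embedding(vocab_size, hidden)
        self.segment = nn.Embedding(2, hidden)      # sentence A vs. sentence B
        self.position = nn.Embedding(max_positions, hidden)

    def forward(self, token_ids, segment_ids):
        # token_ids, segment_ids: (batch, seq_len)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        # The input representation is the sum of the three embeddings.
        return (self.token(token_ids)
                + self.segment(segment_ids)
                + self.position(positions))          # broadcast over batch
```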
The final hidden state\ncorresponding to this token is used as the ag-\ngregate sequence representation for classification\ntasks. Sentence pairs are packed together into a\nsingle sequence. We differentiate the sentences in\ntwo ways. First, we separate them with a special\ntoken ([SEP]). Second, we add a learned embed-\nding to every token indicating whether it belongs\nto sentence A or sentence B. As shown in Figure 1,\nwe denote input embedding as E, the final hidden\nvector of the special [CLS] token as C \u2208 RH ,\nand the final hidden vector for the ith input token\nas Ti \u2208 RH .\n For a given token, its input representation is\nconstructed by summing the corresponding token,\nsegment, and position embeddings. A visualiza-\ntion of this construction can be seen in Figure 2.\n\n3.1 Pre-training BERT\nUnlike Peters et al. ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 1495, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "f1f06b3a-489c-4861-b5be-72cd1c1d8e80": {"__data__": {"id_": "f1f06b3a-489c-4861-b5be-72cd1c1d8e80", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "cc193b63-6a56-496d-92cc-812a0a7cf204", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "c504ea5b9924961a3c9ca9f79fe234f1e39fed524dcae1c6501d8e27d6c1815d", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "abbf6d32-14ba-4c60-850d-cf4429b349e4", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "41b6d9a30bcbf21c8b5930fbeae0cc97e5eaf1a435c890cf50c611130ed5513a", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "544c5e06-611b-4fe1-93ae-aa3441eb385e", "node_type": "1", "metadata": {}, "hash": "94a6a070021b29b916813b97eea6cc8ae87ff0c254afc72d7cdff7b398df6015", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "(2018a) and Radford et al.\n(2018), we do not use traditional left-to-right or\nright-to-left language models to pre-train BERT.\nInstead, we pre-train BERT using two unsuper-\nvised tasks, described in this section. This step\nis presented in the left part of Figure 1.\n\nTask #1: Masked LM Intuitively, it is reason-\nable to believe that a deep bidirectional model is\nstrictly more powerful than either a left-to-right\nmodel or the shallow concatenation of a left-to-\nright and a right-to-left model. 
Unfortunately,\nstandard conditional language models can only be\ntrained left-to-right or right-to-left, since bidirec-\ntional conditioning would allow each word to in-\ndirectly \u201csee itself\u201d, and the model could trivially\npredict the target word in a multi-layered context.\n In order to train a deep bidirectional representa-\ntion, we simply mask some percentage of the input\ntokens at random, and then predict those masked\ntokens. We refer to this procedure as a \u201cmasked\nLM\u201d (MLM), although it is often referred to as a\nCloze task in the literature (Taylor, 1953). In this\ncase, the final hidden vectors corresponding to the\nmask tokens are fed into an output softmax over\nthe vocabulary, as in a standard LM. In all of our\nexperiments, we mask 15% of all WordPiece to-\nkens in each sequence at random. In contrast to\ndenoising auto-encoders (Vincent et al., 2008), we\nonly predict the masked words rather than recon-\nstructing the entire input.\n Although this allows us to obtain a bidirec-\ntional pre-trained model, a downside is that we\nare creating a mismatch between pre-training and\nfine-tuning, since the [MASK] token does not ap-\npear during fine-tuning. To mitigate this, we do\nnot always replace \u201cmasked\u201d words with the ac-\ntual [MASK] token. The training data generator\nchooses 15% of the token positions at random for\nprediction. If the i-th token is chosen, we replace\nthe i-th token with (1) the [MASK] token 80% of\nthe time (2) a random token 10% of the time (3)\nthe unchanged i-th token 10% of the time. Then,\nTi will be used to predict the original token with\ncross entropy loss. 
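The 80/10/10 replacement rule is small enough to state as code. A sketch assuming a list-like WordPiece vocabulary; this is illustrative, not the paper's actual data generator, and the prediction label is always the original token regardless of which branch fires.

```python
import random

def replace_for_prediction(token, wordpiece_vocab, mask_token="[MASK]"):
    """The 80/10/10 rule above, applied to each of the 15% of positions
    chosen for prediction."""
    r = random.random()
    if r < 0.8:
        return mask_token                       # 80%: the [MASK] token
    if r < 0.9:
        return random.choice(wordpiece_vocab)   # 10%: a random token
    return token                                # 10%: left unchanged
```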
", "mimetype": "text/plain", "start_char_idx": 1495, "end_char_idx": 3796, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "544c5e06-611b-4fe1-93ae-aa3441eb385e": {"__data__": {"id_": "544c5e06-611b-4fe1-93ae-aa3441eb385e", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "cc193b63-6a56-496d-92cc-812a0a7cf204", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "c504ea5b9924961a3c9ca9f79fe234f1e39fed524dcae1c6501d8e27d6c1815d", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "f1f06b3a-489c-4861-b5be-72cd1c1d8e80", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "7171358e6db4a7d03cbed4a1dde37cc4c3afdd7ed30c429f837a996a188fa018", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "We compare variations of this\nprocedure in Appendix C.2.\n\nTask #2: Next Sentence Prediction (NSP)\nMany important downstream tasks such as Ques-\ntion Answering (QA) and Natural Language Infer-\nence (NLI) are based on understanding the rela-\ntionship between two sentences, which is not di-\nrectly captured by language modeling. In order\nto train a model that understands sentence rela-\ntionships, we pre-train for a binarized next sen-\ntence prediction task that can be trivially gener-\nated from any monolingual corpus. Specifically,\nwhen choosing the sentences A and B for each pre-\ntraining example, 50% of the time B is the actual\nnext sentence that follows A (labeled as IsNext),\nand 50% of the time it is a random sentence from\nthe corpus (labeled as NotNext). As we show\nin Figure 1, C is used for next sentence predic-\ntion (NSP).5 Despite its simplicity, we demon-\nstrate in Section 5.1 that pre-training towards this\ntask is very beneficial to both QA and NLI. 
6\n 5The final model achieves 97%-98% accuracy on NSP.\n 6The vector C is not a meaningful sentence representation\nwithout fine-tuning, since it was trained with NSP.", "mimetype": "text/plain", "start_char_idx": 3796, "end_char_idx": 4959, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "b635c2da-b277-4b49-936d-8ac5939da468": {"__data__": {"id_": "b635c2da-b277-4b49-936d-8ac5939da468", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5a8026ec-62da-4652-ab01-3c38b64fda7c", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "2528f15cb7e33e1a6812665686be884401bdcef4d6c11d2f8bdc665e233bd6d4", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "2f892821-777d-4fb6-ac8f-611ef0566d7d", "node_type": "1", "metadata": {}, "hash": "977dccbf7c4edab6ebb97aa97345b2293f6e71f1dec073e7975da20a780e2356", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " [Figure 2 diagram: for the input \u201c[CLS] my dog is cute [SEP] he likes play ##ing [SEP]\u201d, each position receives a token embedding (E[CLS], Emy, ..., E[SEP]), a segment embedding (EA for sentence A, EB for sentence B), and a position embedding (E0 ... E10).]\n\n Figure 2: BERT input representation. The input embeddings are the sum of the token embeddings, the segmenta-\n tion embeddings and the position embeddings.\n\nThe NSP task is closely related to representation-\nlearning objectives used in Jernite et al. 
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 989, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "2f892821-777d-4fb6-ac8f-611ef0566d7d": {"__data__": {"id_": "2f892821-777d-4fb6-ac8f-611ef0566d7d", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5a8026ec-62da-4652-ab01-3c38b64fda7c", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "2528f15cb7e33e1a6812665686be884401bdcef4d6c11d2f8bdc665e233bd6d4", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "b635c2da-b277-4b49-936d-8ac5939da468", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "984d97b14cd67774554e50dda8a526787755c1ed32bce97782170293021e249b", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "6c2b1028-b0b0-4f92-af51-9660eb0f49f5", "node_type": "1", "metadata": {}, "hash": "a393d71d25b2c080f8870137cad5e1782ac540d1a08dcad32606faaa9d4f469a", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "(2017) and\nLogeswaran and Lee (2018). However, in prior\nwork, only sentence embeddings are transferred to\ndown-stream tasks, where BERT transfers all pa-\nrameters to initialize end-task model parameters.\n\nPre-training data The pre-training procedure\nlargely follows the existing literature on language\nmodel pre-training. For the pre-training corpus we\nuse the BooksCorpus (800M words) (Zhu et al.,\n2015) and English Wikipedia (2,500M words).\nFor Wikipedia we extract only the text passages\nand ignore lists, tables, and headers. It is criti-\ncal to use a document-level corpus rather than a\nshuffled sentence-level corpus such as the Billion\nWord Benchmark (Chelba et al., 2013) in order to\nextract long contiguous sequences.\n3.2 Fine-tuning BERT\nFine-tuning is straightforward since the self-\nattention mechanism in the Transformer al-\nlows BERT to model many downstream tasks\u2014\nwhether they involve single text or text pairs\u2014by\nswapping out the appropriate inputs and outputs.\nFor applications involving text pairs, a common\npattern is to independently encode text pairs be-\nfore applying bidirectional cross attention, such\nas Parikh et al. (2016); Seo et al. (2017). BERT\ninstead uses the self-attention mechanism to unify\nthese two stages, as encoding a concatenated text\npair with self-attention effectively includes bidi-\nrectional cross attention between two sentences.\n For each task, we simply plug in the task-\nspecific inputs and outputs into BERT and fine-\ntune all the parameters end-to-end. 
At the in-\nput, sentence A and sentence B from pre-training\nare analogous to (1) sentence pairs in paraphras-\ning, (2) hypothesis-premise pairs in entailment, (3)\nquestion-passage pairs in question answering, and\n(4) a degenerate text-\u2205 pair in text classification\nor sequence tagging. At the output, the token rep-\nresentations are fed into an output layer for token-\nlevel tasks, such as sequence tagging or question\nanswering, and the [CLS] representation is fed\ninto an output layer for classification, such as en-\ntailment or sentiment analysis.\n Compared to pre-training, fine-tuning is rela-\ntively inexpensive. All of the results in the pa-\nper can be replicated in at most 1 hour on a sin-\ngle Cloud TPU, or a few hours on a GPU, starting\nfrom the exact same pre-trained model.7 We de-\nscribe the task-specific details in the correspond-\ning subsections of Section 4. More details can be\nfound in Appendix A.5.\n\n4 Experiments\nIn this section, we present BERT fine-tuning re-\nsults on 11 NLP tasks.\n4.1 GLUE\nThe General Language Understanding Evaluation\n(GLUE) benchmark (Wang et al., 2018a) is a col-\nlection of diverse natural language understanding\ntasks. Detailed descriptions of GLUE datasets are\nincluded in Appendix B.1.\n To fine-tune on GLUE, we represent the input\nsequence (for single sentence or sentence pairs)\nas described in Section 3, and use the final hid-\nden vector C \u2208 RH corresponding to the first\ninput token ([CLS]) as the aggregate representa-\ntion. The only new parameters introduced during\nfine-tuning are classification layer weights W \u2208\nRK\u00d7H, where K is the number of labels. ", "mimetype": "text/plain", "start_char_idx": 989, "end_char_idx": 4141, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "6c2b1028-b0b0-4f92-af51-9660eb0f49f5": {"__data__": {"id_": "6c2b1028-b0b0-4f92-af51-9660eb0f49f5", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5a8026ec-62da-4652-ab01-3c38b64fda7c", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "2528f15cb7e33e1a6812665686be884401bdcef4d6c11d2f8bdc665e233bd6d4", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "2f892821-777d-4fb6-ac8f-611ef0566d7d", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b1b75cf9b57f383786b17095af33f3371cec5a2680d9146d9a5a6747db75b6c9", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "We com-\npute a standard classification loss with C and W,\ni.e., log(softmax(CW^T)).\n 7For example, the BERT SQuAD model can be trained in\naround 30 
minutes on a single Cloud TPU to achieve a Dev\nF1 score of 91.0%.\n 8See (10) in https://gluebenchmark.com/faq.", "mimetype": "text/plain", "start_char_idx": 4141, "end_char_idx": 4407, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "349ffcb2-b1dd-49ef-b7f7-37406a635c71": {"__data__": {"id_": "349ffcb2-b1dd-49ef-b7f7-37406a635c71", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "bc534cc6-e535-4b99-b43b-605d5b37174d", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "94088816dc336558cdd166bf902f577f63147d180695eeb1688c3f56347193c3", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "43c7eae6-7d68-47bc-8fd4-8f0584405231", "node_type": "1", "metadata": {}, "hash": "c22399c625eba9e8f4f7b55bb2b397fc0836581dc8867be47199a10af00c43ac", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " System MNLI-(m/mm) QQP QNLI SST-2 CoLA STS-B MRPC RTE Average\n 392k 363k 108k 67k 8.5k 5.7k 3.5k 2.5k -\n Pre-OpenAI SOTA 80.6/80.1 66.1 82.3 93.2 35.0 81.0 86.0 61.7 74.0\n BiLSTM+ELMo+Attn 76.4/76.1 64.8 79.8 90.4 36.0 73.3 84.9 56.8 71.0\n OpenAI GPT 82.1/81.4 70.3 87.4 91.3 45.4 80.0 82.3 56.0 75.1\n BERTBASE 84.6/83.4 71.2 90.5 93.5 52.1 85.8 88.9 66.4 79.6\n BERTLARGE 86.7/85.9 72.1 92.7 94.9 60.5 86.5 89.3 70.1 82.1\n\n Table 1: GLUE Test results, scored by the evaluation server (https://gluebenchmark.com/leaderboard).\n The number below each task denotes the number of training examples. The \u201cAverage\u201d column is slightly different\n than the official GLUE score, since we exclude the problematic WNLI set.8 BERT and OpenAI GPT are single-\n model, single task. F1 scores are reported for QQP and MRPC, Spearman correlations are reported for STS-B, and\n accuracy scores are reported for the other tasks. We exclude entries that use BERT as one of their components.\n\n We use a batch size of 32 and fine-tune for 3\nepochs over the data for all GLUE tasks. For each\ntask, we selected the best fine-tuning learning rate\n(among 5e-5, 4e-5, 3e-5, and 2e-5) on the Dev set.\nAdditionally, for BERTLARGE we found that fine-\ntuning was sometimes unstable on small datasets,\nso we ran several random restarts and selected the\nbest model on the Dev set. With random restarts,\nwe use the same pre-trained checkpoint but per-\nform different fine-tuning data shuffling and clas-\nsifier layer initialization.9\n Results are presented in Table 1. Both\n\nBERTBASE and BERTLARGE outperform all sys-\ntems on all tasks by a substantial margin, obtaining\n4.5% and 7.0% respective average accuracy im-\nprovement over the prior state of the art. Note that\nBERTBASE and OpenAI GPT are nearly identical\nin terms of model architecture apart from the at-\ntention masking. 
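The Dev-set learning-rate selection just described amounts to a small sweep. A sketch under stated assumptions: fine_tune and evaluate_dev are hypothetical callables standing in for a training run and a Dev evaluation, not part of the BERT release.

```python
def select_best_glue_model(fine_tune, evaluate_dev, checkpoint, task_data):
    """Pick the best of the four candidate learning rates on the Dev set,
    using the quoted GLUE recipe (batch size 32, 3 epochs)."""
    best_score, best_model = float("-inf"), None
    for lr in (5e-5, 4e-5, 3e-5, 2e-5):          # the four candidate rates
        model = fine_tune(checkpoint, task_data,
                          learning_rate=lr, batch_size=32, epochs=3)
        score = evaluate_dev(model, task_data)
        if score > best_score:
            best_score, best_model = score, model
    return best_model
```

The random restarts mentioned above fit the same loop: repeat each setting with a different data shuffle and classifier initialization from the same pre-trained checkpoint, keeping the Dev-best model.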
For the largest and most widely\nreported GLUE task, MNLI, BERT obtains a 4.6%\nabsolute accuracy improvement. On the official\nGLUE leaderboard10, BERTLARGE obtains a score\nof 80.5, compared to OpenAI GPT, which obtains\n72.8 as of the date of writing.\n We find that BERTLARGE significantly outper-\nforms BERTBASE across all tasks, especially those\nwith very little training data. The effect of model\nsize is explored more thoroughly in Section 5.2.\n\n4.2 SQuAD v1.1\n\nThe Stanford Question Answering Dataset\n(SQuAD v1.1) is a collection of 100k crowd-\nsourced question/answer pairs (Rajpurkar et al.,\n2016). Given a question and a passage from\n 9The GLUE data set distribution does not include the Test\nlabels, and we only made a single GLUE evaluation server\nsubmission for each of BERTBASE and BERTLARGE.\n 10https://gluebenchmark.com/leaderboard\nWikipedia containing the answer, the task is to\npredict the answer text span in the passage.\n As shown in Figure 1, in the question answer-\ning task, we represent the input question and pas-\nsage as a single packed sequence, with the ques-\ntion using the A embedding and the passage using\nthe B embedding. ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 3770, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "43c7eae6-7d68-47bc-8fd4-8f0584405231": {"__data__": {"id_": "43c7eae6-7d68-47bc-8fd4-8f0584405231", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "bc534cc6-e535-4b99-b43b-605d5b37174d", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "94088816dc336558cdd166bf902f577f63147d180695eeb1688c3f56347193c3", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "349ffcb2-b1dd-49ef-b7f7-37406a635c71", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "17269988ca15a411246f71360ae77c4492f64e2d353d65412d6591e307bda7ee", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "15b81cf1-d657-41b4-a9ff-703f8e5e6fac", "node_type": "1", "metadata": {}, "hash": "6ca3a08ecedab58920c940ad1a8daec2a5281f99c992e4c1d6f1b5fd2fcbc2f7", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "We only introduce a start vec-\ntor S \u2208 RH and an end vector E \u2208 RH during\nfine-tuning. The probability of word i being the\nstart of the answer span is computed as a dot prod-\nuct between Ti and S followed by a softmax over\nall of the words in the paragraph: Pi = e^{S\u00b7Ti} / \u2211j e^{S\u00b7Tj}.\nThe analogous formula is used for the end of the\nanswer span. 
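The reconstructed start-probability formula maps directly to a few lines of PyTorch; T and S are as defined above, and the end vector E is handled identically. A minimal sketch, not the paper's implementation:

```python
import torch

def start_probabilities(T, S):
    """P_i = exp(S . T_i) / sum_j exp(S . T_j).

    T: (seq_len, hidden) final hidden vectors of the paragraph tokens
    S: (hidden,) the start vector introduced during fine-tuning
    """
    return torch.softmax(T @ S, dim=0)   # one probability per paragraph word
```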
The score of a candidate span from\nposition i to position j is defined as S\u00b7Ti + E\u00b7Tj,\nand the maximum scoring span where j \u2265 i is\nused as a prediction. The training objective is the\nsum of the log-likelihoods of the correct start and\nend positions. We fine-tune for 3 epochs with a\nlearning rate of 5e-5 and a batch size of 32.\n Table 2 shows top leaderboard entries as well\nas results from top published systems (Seo et al.,\n2017; Clark and Gardner, 2018; Peters et al.,\n2018a; Hu et al., 2018). The top results from the\nSQuAD leaderboard do not have up-to-date public\nsystem descriptions available,11 and are allowed to\nuse any public data when training their systems.\nWe therefore use modest data augmentation in\nour system by first fine-tuning on TriviaQA (Joshi\net al., 2017) before fine-tuning on SQuAD.\n Our best performing system outperforms the top\nleaderboard system by +1.5 F1 in ensembling and\n+1.3 F1 as a single system. In fact, our single\nBERT model outperforms the top ensemble sys-\ntem in terms of F1 score. Without TriviaQA fine-\n 11QANet is described in Yu et al. ", "mimetype": "text/plain", "start_char_idx": 3770, "end_char_idx": 5273, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "15b81cf1-d657-41b4-a9ff-703f8e5e6fac": {"__data__": {"id_": "15b81cf1-d657-41b4-a9ff-703f8e5e6fac", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "bc534cc6-e535-4b99-b43b-605d5b37174d", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "94088816dc336558cdd166bf902f577f63147d180695eeb1688c3f56347193c3", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "43c7eae6-7d68-47bc-8fd4-8f0584405231", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "9e3b7dd2238d9acd968b8a769d0581f893d0699ebdb7bc690a8cf8e45cd3d9c4", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "(2018), but the system\nhas improved substantially after publication.", "mimetype": "text/plain", "start_char_idx": 5273, "end_char_idx": 5342, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "0af4d994-64d1-4bf4-8624-55ed2e4f7dbd": {"__data__": {"id_": "0af4d994-64d1-4bf4-8624-55ed2e4f7dbd", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", 
"creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "7ab3456d-9fa7-49a8-856f-b2a713991fd8", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "0464c89c7c2bd6b62dc88231646ec3440477502e2d385be0ea2d4a00f6c68047", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "b6154034-ddde-4fd8-a018-126e613aa014", "node_type": "1", "metadata": {}, "hash": "204313461a2324cf4bb2994cf29641409922b6a49790a66acf6c5a6bf04b2950", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " System Dev Test System Dev Test\n EM F1 EM F1 ESIM+GloVe 51.9 52.7\n Top Leaderboard Systems (Dec 10th, 2018) ESIM+ELMo 59.1 59.2\n Human - - 82.3 91.2 OpenAI GPT - 78.0\n #1 Ensemble - nlnet - - 86.0 91.7 BERTBASE 81.6 -\n #2 Ensemble - QANet - - 84.5 90.5 BERTLARGE 86.6 86.3\n Published Human (expert)\u2020 - 85.0\n BiDAF+ELMo (Single) - 85.6 - 85.8 Human (5 annotations)\u2020 - 88.0\n R.M. Reader (Ensemble) 81.2 87.9 82.3 88.5\n Ours Table 4: SWAG Dev and Test accuracies. \u2020Human per-\n BERTBASE (Single) 80.8 88.5 - -\n BERTLARGE (Single) 84.1 90.9 - - formance is measured with 100 samples, as reported in\n BERTLARGE (Ensemble) 85.8 91.8 - - the SWAG paper.\n BERTLARGE (Sgl.+TriviaQA) 84.2 91.1 85.1 91.8\n BERTLARGE (Ens.+TriviaQA) 86.2 92.2 87.4 93.2\n\nTable 2: SQuAD 1.1 results. The BERT ensemble\nis 7x systems which use different pre-training check-\npoints and fine-tuning seeds.\n\n System Dev Test\n EM F1 EM F1\n Top Leaderboard Systems (Dec 10th, 2018)\n Human 86.3 89.0 86.9 89.5\n #1 Single - MIR-MRC (F-Net) - - 74.8 78.0\n #2 Single - nlnet - - 74.2 77.1\n Published\n unet (Ensemble) - - 71.4 74.9\n SLQA+ (Single) - 71.4 74.4\n Ours\n BERTLARGE (Single) 78.7 81.9 80.0 83.1\n\nTable 3: SQuAD 2.0 results. We exclude entries that\nuse BERT as one of their components.\n\ntuning data, we only lose 0.1-0.4 F1, still outper-\nforming all existing systems by a wide margin.12\n4.3 SQuAD v2.0\nThe SQuAD 2.0 task extends the SQuAD 1.1\nproblem definition by allowing for the possibility\nthat no short answer exists in the provided para-\ngraph, making the problem more realistic.\n We use a simple approach to extend the SQuAD\nv1.1 BERT model for this task. We treat ques-\ntions that do not have an answer as having an an-\nswer span with start and end at the [CLS] to-\nken. The probability space for the start and end\nanswer span positions is extended to include the\nposition of the [CLS] token. For prediction, we\ncompare the score of the no-answer span: snull =\nS\u00b7C + E\u00b7C to the score of the best non-null span\n 12The TriviaQA data we used consists of paragraphs from\nTriviaQA-Wiki formed of the first 400 tokens in documents,\nthat contain at least one of the provided possible answers.\ns\u02c6i,j = maxj\u2265iS\u00b7Ti + E\u00b7Tj . We predict a non-null\nanswer when \u02c6si,j > snull + \u03c4 , where the thresh-\nold \u03c4 is selected on the dev set to maximize F1.\nWe did not use TriviaQA data for this model. 
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 3711, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "b6154034-ddde-4fd8-a018-126e613aa014": {"__data__": {"id_": "b6154034-ddde-4fd8-a018-126e613aa014", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "7ab3456d-9fa7-49a8-856f-b2a713991fd8", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "0464c89c7c2bd6b62dc88231646ec3440477502e2d385be0ea2d4a00f6c68047", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "0af4d994-64d1-4bf4-8624-55ed2e4f7dbd", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "5b203c46fac99619997fbe420c7403431c3da2285853ad72f69b2509fc694e47", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "61d2f504-bc66-41b5-b6b3-662d999e9f60", "node_type": "1", "metadata": {}, "hash": "791d9f609c53e58b17f287c8c24afec8ac14c1c6b0fe0ac2d12b5bf274ff46d1", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "We\nfine-tuned for 2 epochs with a learning rate of 5e-5\nand a batch size of 48.\n The results compared to prior leaderboard en-\ntries and top published work (Sun et al., 2018;\nWang et al., 2018b) are shown in Table 3, exclud-\n\ning systems that use BERT as one of their com-\nponents. We observe a +5.1 F1 improvement over\nthe previous best system.\n\n4.4 SWAG\nThe Situations With Adversarial Generations\n(SWAG) dataset contains 113k sentence-pair com-\npletion examples that evaluate grounded common-\nsense inference (Zellers et al., 2018). Given a sen-\ntence, the task is to choose the most plausible con-\ntinuation among four choices.When fine-tuning on the SWAG dataset, we\nconstruct four input sequences, each containing\nthe concatenation of the given sentence (sentence\nA) and a possible continuation (sentence B). The\nonly task-specific parameters introduced is a vec-\ntor whose dot product with the [CLS] token rep-\nresentation C denotes a score for each choice\nwhich is normalized with a softmax layer.\n We fine-tune the model for 3 epochs with a\nlearning rate of 2e-5 and a batch size of 16. 
", "mimetype": "text/plain", "start_char_idx": 3711, "end_char_idx": 4814, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "61d2f504-bc66-41b5-b6b3-662d999e9f60": {"__data__": {"id_": "61d2f504-bc66-41b5-b6b3-662d999e9f60", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "7ab3456d-9fa7-49a8-856f-b2a713991fd8", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "0464c89c7c2bd6b62dc88231646ec3440477502e2d385be0ea2d4a00f6c68047", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "b6154034-ddde-4fd8-a018-126e613aa014", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "605d062d81f57157a49ee24b715fbe954d1ac4da24bf69e5746ce728e1edcf3b", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Re-\nsults are presented in Table 4. BERTLARGE out-\nperforms the authors\u2019 baseline ESIM+ELMo sys-\ntem by +27.1% and OpenAI GPT by 8.3%.\n5 Ablation Studies\n\nIn this section, we perform ablation experiments\nover a number of facets of BERT in order to better\nunderstand their relative importance. 
Additional", "mimetype": "text/plain", "start_char_idx": 4814, "end_char_idx": 5120, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "049233ae-7f17-4b97-b212-478744e93165": {"__data__": {"id_": "049233ae-7f17-4b97-b212-478744e93165", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "13d1b84a-4210-4825-933b-a6d498a30c00", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "131b956706902a0251fc74cfea9f60f7dea5bf0ed9d618204398e5e06339cc19", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "21c9c67d-01b5-49c7-b0c5-7c5856abacbe", "node_type": "1", "metadata": {}, "hash": "d8ef9c2e31fa3f500db572aff04fb54e46f16b62194b4a23b17b6ab8a41a5c44", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " Dev Set\n Tasks MNLI-m QNLI MRPC SST-2 SQuAD\n (Acc) (Acc) (Acc) (Acc) (F1)\n BERTBASE 84.4 88.4 86.7 92.7 88.5\n No NSP 83.9 84.9 86.5 92.6 87.9\n LTR & No NSP 82.1 84.3 77.5 92.1 77.8\n + BiLSTM 82.1 84.1 75.7 91.6 84.9\n\n Table 5: Ablation over the pre-training tasks using the\n BERTBASE architecture. \u201cNo NSP\u201d is trained without\n the next sentence prediction task. \u201cLTR & No NSP\u201d is\n trained as a left-to-right LM without the next sentence\n prediction, like OpenAI GPT. \u201c+ BiLSTM\u201d adds a ran-\n domly initialized BiLSTM on top of the \u201cLTR + No\n NSP\u201d model during fine-tuning.\n\nresults are still far worse than those of the pre-\ntrained bidirectional models. The BiLSTM hurts\nperformance on the GLUE tasks.\n We recognize that it would also be possible to\ntrain separate LTR and RTL models and represent\neach token as the concatenation of the two mod-\nels, as ELMo does. However: (a) this is twice as\nexpensive as a single bidirectional model; (b) this\nis non-intuitive for tasks like QA, since the RTL\nmodel would not be able to condition the answer\non the question; (c) this is strictly less powerful\nthan a deep bidirectional model, since it can use
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 1754, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "21c9c67d-01b5-49c7-b0c5-7c5856abacbe": {"__data__": {"id_": "21c9c67d-01b5-49c7-b0c5-7c5856abacbe", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "13d1b84a-4210-4825-933b-a6d498a30c00", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "131b956706902a0251fc74cfea9f60f7dea5bf0ed9d618204398e5e06339cc19", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "049233ae-7f17-4b97-b212-478744e93165", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "1816ded9e583043856ebde339cf39eff11a9ecbdc0f6b580ae47bd0efb3766af", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "2b41dc46-a706-4035-8707-8bdac62c2cfc", "node_type": "1", "metadata": {}, "hash": "65b9795fc0e74b775ed3a317666ae0a0a0f3a5cdeee861e08552d8821d5d91d7", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "both left and right context at every layer.\n\nablation studies can be found in Appendix C.\n\n5.1 Effect of Pre-training Tasks\nWe demonstrate the importance of the deep bidi-\nrectionality of BERT by evaluating two pre-\ntraining objectives using exactly the same pre-\ntraining data, fine-tuning scheme, and hyperpa-\nrameters as BERTBASE:\n\nNo NSP: A bidirectional model which is trained\nusing the \u201cmasked LM\u201d (MLM) but without the\n\u201cnext sentence prediction\u201d (NSP) task.\nLTR & No NSP: A left-context-only model which\nis trained using a standard Left-to-Right (LTR)\nLM, rather than an MLM. The left-only constraint\nwas also applied at fine-tuning, because removing\nit introduced a pre-train/fine-tune mismatch that\ndegraded downstream performance. Additionally,\nthis model was pre-trained without the NSP task.\nThis is directly comparable to OpenAI GPT, but\nusing our larger training dataset, our input repre-\nsentation, and our fine-tuning scheme.\n We first examine the impact brought by the NSP\ntask. In Table 5, we show that removing NSP\nhurts performance significantly on QNLI, MNLI,\nand SQuAD 1.1. Next, we evaluate the impact\nof training bidirectional representations by com-\nparing \u201cNo NSP\u201d to \u201cLTR & No NSP\u201d. The LTR\nmodel performs worse than the MLM model on all\ntasks, with large drops on MRPC and SQuAD.\n For SQuAD it is intuitively clear that a LTR\nmodel will perform poorly at token predictions,\nsince the token-level hidden states have no right-\nside context. 
In order to make a good faith attempt at strengthening the LTR system, we added a randomly initialized BiLSTM on top. This does significantly improve results on SQuAD, but the results are still far worse than those of the pre-trained bidirectional models.\n\n5.2 Effect of Model Size\nIn this section, we explore the effect of model size on fine-tuning task accuracy. We trained a number of BERT models with a differing number of layers, hidden units, and attention heads, while otherwise using the same hyperparameters and training procedure as described previously.\n Results on selected GLUE tasks are shown in Table 6. In this table, we report the average Dev Set accuracy from 5 random restarts of fine-tuning. We can see that larger models lead to a strict accuracy improvement across all four datasets, even for MRPC which only has 3,600 labeled training examples, and is substantially different from the pre-training tasks. It is also perhaps surprising that we are able to achieve such significant improvements on top of models which are already quite large relative to the existing literature. For example, the largest Transformer explored in Vaswani et al. (2017) is (L=6, H=1024, A=16) with 100M parameters for the encoder, and the largest Transformer we have found in the literature is (L=64, H=512, A=2) with 235M parameters (Al-Rfou et al., 2018). By contrast, BERTBASE contains 110M parameters and BERTLARGE contains 340M parameters.\n It has long been known that increasing the model size will lead to continual improvements on large-scale tasks such as machine translation and language modeling, which is demonstrated by the LM perplexity of held-out training data shown in Table 6. However, we believe that this is the first work to demonstrate convincingly that scaling to extreme model sizes also leads to large improvements on very small scale tasks, provided that the model has been sufficiently pre-trained. Peters et al. 
", "mimetype": "text/plain", "start_char_idx": 1754, "end_char_idx": 5122, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "2b41dc46-a706-4035-8707-8bdac62c2cfc": {"__data__": {"id_": "2b41dc46-a706-4035-8707-8bdac62c2cfc", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "13d1b84a-4210-4825-933b-a6d498a30c00", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "131b956706902a0251fc74cfea9f60f7dea5bf0ed9d618204398e5e06339cc19", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "21c9c67d-01b5-49c7-b0c5-7c5856abacbe", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "877b4d729adb665b0fdce9ed776eaa6b3ebd1bb13eeaba93e912c0f85fc9c1c5", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "(2018b) presented", "mimetype": "text/plain", "start_char_idx": 5122, "end_char_idx": 5139, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "641ed924-8e3f-4b7c-ae65-66e3dc4da5d5": {"__data__": {"id_": "641ed924-8e3f-4b7c-ae65-66e3dc4da5d5", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "6ea55280-c096-4e63-a744-d6fbd76d6e91", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b8d1715d2e0a931ca1d10b2617e5ac4bbad69540fa2ce884da0efa838dd17f72", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "60786148-3cfd-4e95-ab5d-256991f19a68", "node_type": "1", "metadata": {}, "hash": "98dacc97c1efac521fed40ac6eb5e115093ce4bbf9b2113f8d8e97493687b729", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "mixed results on the downstream task impact of\nincreasing the pre-trained bi-LM size from two\nto four layers and Melamud et al. 
(2016) mentioned in passing that increasing hidden dimension size from 200 to 600 helped, but increasing further to 1,000 did not bring further improvements. Both of these prior works used a feature-based approach \u2014 we hypothesize that when the model is fine-tuned directly on the downstream tasks and uses only a very small number of randomly initialized additional parameters, the task-specific models can benefit from the larger, more expressive pre-trained representations even when downstream task data is very small.\n\n5.3 Feature-based Approach with BERT\n\nAll of the BERT results presented so far have used the fine-tuning approach, where a simple classification layer is added to the pre-trained model, and all parameters are jointly fine-tuned on a downstream task. However, the feature-based approach, where fixed features are extracted from the pre-trained model, has certain advantages. First, not all tasks can be easily represented by a Transformer encoder architecture, and therefore require a task-specific model architecture to be added.\n", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 1203, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "60786148-3cfd-4e95-ab5d-256991f19a68": {"__data__": {"id_": "60786148-3cfd-4e95-ab5d-256991f19a68", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "6ea55280-c096-4e63-a744-d6fbd76d6e91", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b8d1715d2e0a931ca1d10b2617e5ac4bbad69540fa2ce884da0efa838dd17f72", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "641ed924-8e3f-4b7c-ae65-66e3dc4da5d5", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "edee340ad5b5b677edb913fcea52cb0f2a57f73bae804a9a5b35312f9e7408f5", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "9f8093af-21a6-443b-a6ac-864fb66387f8", "node_type": "1", "metadata": {}, "hash": "ab5363432f20eadc223a052d06475f118c0d9467a583aa29b07563f8fea6bdfe", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Second, there are major computational benefits to pre-computing an expensive representation of the training data once and then running many experiments with cheaper models on top of this representation.\n In this section, we compare the two approaches by applying BERT to the CoNLL-2003 Named Entity Recognition (NER) task (Tjong Kim Sang and De Meulder, 2003). In the input to BERT, we use a case-preserving WordPiece model, and we include the maximal document context provided by the data. 
Following standard practice, we for-\nmulate this as a tagging task but do not use a CRF\n\n Hyperparams Dev Set Accuracy\n #L #H #A LM (ppl) MNLI-m MRPC SST-2\n 3 768 12 5.84 77.9 79.8 88.4\n 6 768 3 5.24 80.6 82.2 90.7\n 6 768 12 4.68 81.9 84.8 91.3\n 12 768 12 3.99 84.4 86.7 92.9\n 12 1024 16 3.54 85.7 86.9 93.3\n 24 1024 16 3.23 86.6 87.8 93.7\n\nTable 6: Ablation over BERT model size. #L = the\nnumber of layers; #H = hidden size; #A = number of at-\ntention heads. \u201cLM (ppl)\u201d is the masked LM perplexity\nof held-out training data.\n System Dev F1 Test F1\n ELMo (Peters et al., 2018a) 95.7 92.2\n CVT (Clark et al., 2018) - 92.6\n CSE (Akbik et al., 2018) - 93.1\n Fine-tuning approach\n BERTLARGE 96.6 92.8\n BERTBASE 96.4 92.4\n Feature-based approach (BERTBASE )\n Embeddings 91.0 -\n Second-to-Last Hidden 95.6 -\n Last Hidden 94.9 -\n Weighted Sum Last Four Hidden 95.9 -\n Concat Last Four Hidden 96.1 -\n Weighted Sum All 12 Layers 95.5 -\nTable 7: CoNLL-2003 Named Entity Recognition re-\nsults. Hyperparameters were selected using the Dev\nset. ", "mimetype": "text/plain", "start_char_idx": 1203, "end_char_idx": 3320, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "9f8093af-21a6-443b-a6ac-864fb66387f8": {"__data__": {"id_": "9f8093af-21a6-443b-a6ac-864fb66387f8", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "6ea55280-c096-4e63-a744-d6fbd76d6e91", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b8d1715d2e0a931ca1d10b2617e5ac4bbad69540fa2ce884da0efa838dd17f72", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "60786148-3cfd-4e95-ab5d-256991f19a68", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "2b552c060d99c83f74c0324521e3574349a02cadcdf8c75a2d697b610b9a929f", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "The reported Dev and Test scores are averaged over\n5 random restarts using those hyperparameters.\n\n\nlayer in the output. We use the representation of\nthe first sub-token as the input to the token-level\nclassifier over the NER label set.\n To ablate the fine-tuning approach, we apply the\nfeature-based approach by extracting the activa-\ntions from one or more layers without fine-tuning\nany parameters of BERT. These contextual em-\nbeddings are used as input to a randomly initial-\nized two-layer 768-dimensional BiLSTM before\nthe classification layer.\n\n Results are presented in Table 7. BERTLARGE\nperforms competitively with state-of-the-art meth-\nods. 
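A hedged sketch of the feature-based pipeline just described, using the Hugging Face transformers library rather than the paper's original code: a case-preserving tokenizer, frozen BERT activations, concatenation of the top four hidden layers (the best-performing variant discussed just below), and selection of each word's first sub-token as input to a token-level classifier. The model name and the toy input are illustrative assumptions.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")  # case-preserving
bert = AutoModel.from_pretrained("bert-base-cased").eval()    # frozen: no fine-tuning

words = ["Erik", "Tjong", "Kim", "Sang", "wrote", "it"]
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")

with torch.no_grad():
    out = bert(**enc, output_hidden_states=True)

# Concatenate the top four hidden layers as each token's contextual embedding.
feats = torch.cat(out.hidden_states[-4:], dim=-1)  # (1, seq_len, 4 * 768)

# Keep only the first WordPiece of each word; these vectors would feed the
# BiLSTM and classification layer described in the text (omitted here).
wids = enc.word_ids()
first = [i for i, w in enumerate(wids)
         if w is not None and (i == 0 or wids[i - 1] != w)]
token_feats = feats[0, first]                      # (num_words, 4 * 768)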
The best performing method concatenates the\ntoken representations from the top four hidden lay-\ners of the pre-trained Transformer, which is only\n0.3 F1 behind fine-tuning the entire model. This\ndemonstrates that BERT is effective for both fine-\ntuning and feature-based approaches.\n\n6 Conclusion\n\nRecent empirical improvements due to transfer\nlearning with language models have demonstrated\nthat rich, unsupervised pre-training is an integral\npart of many language understanding systems. In\nparticular, these results enable even low-resource\n\ntasks to benefit from deep unidirectional architec-\ntures. Our major contribution is further general-\nizing these findings to deep bidirectional architec-\ntures, allowing the same pre-trained model to suc-\ncessfully tackle a broad set of NLP tasks.", "mimetype": "text/plain", "start_char_idx": 3320, "end_char_idx": 4773, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "e1c258d9-0291-4310-9fbd-a17f908a5826": {"__data__": {"id_": "e1c258d9-0291-4310-9fbd-a17f908a5826", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5d240373-8164-4fbe-9ff4-02a17074549b", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b9d83a9300be27e862369e9733ab621bde8059435a32ce45655d29b06eaad2ea", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "6808912a-ceb2-47ba-9281-2f1c06afe3d9", "node_type": "1", "metadata": {}, "hash": "c319b508831e30f2cdb555245d048d4409dd8e4ff6b09f4b73eb487c5b73c886", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "References\nAlan Akbik, Duncan Blythe, and Roland Vollgraf.\n ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 65, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "6808912a-ceb2-47ba-9281-2f1c06afe3d9": {"__data__": {"id_": "6808912a-ceb2-47ba-9281-2f1c06afe3d9", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5d240373-8164-4fbe-9ff4-02a17074549b", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": 
"b9d83a9300be27e862369e9733ab621bde8059435a32ce45655d29b06eaad2ea", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "e1c258d9-0291-4310-9fbd-a17f908a5826", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "439d13a98b6c08467cf9614fccd626de0db3605cb4fd6688c180248ab85390e7", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "d1afa468-be5c-4597-8f48-c90574dff711", "node_type": "1", "metadata": {}, "hash": "b635a18b19a68ad15d0895fc1a91db3a4767cb1c2c44fbbf02f3b13c4ea583f7", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2018. Contextual string embeddings for sequence\n labeling. In Proceedings of the 27th International\n Conference on Computational Linguistics, pages\n 1638\u20131649.\n\nRami Al-Rfou, Dokook Choe, Noah Constant, Mandy\n Guo, and Llion Jones. 2018. Character-level lan-\n guage modeling with deeper self-attention. arXiv\n preprint arXiv:1808.04444.\nKevin Clark, Minh-Thang Luong, Christopher D Man-ning, and Quoc Le. 2018. Semi-supervised se-\n quence modeling with cross-view training. In Pro-\n ceedings of the 2018 Conference on Empirical Meth-\n ods in Natural Language Processing, pages 1914\u2013\n 1925.\n\nRonan Collobert and Jason Weston. 2008. A unified\n architecture for natural language processing: Deep\n neural networks with multitask learning. In Pro-\n ceedings of the 25th international conference on\n Machine learning, pages 160\u2013167. ACM.\n\n Rie Kubota Ando and Tong Zhang. 2005. A framework Alexis Conneau, Douwe Kiela, Holger Schwenk, Lo\u00a8\u0131c\n for learning predictive structures from multiple tasks Barrault, and Antoine Bordes. 2017. Supervised\n and unlabeled data. Journal of Machine Learning learning of universal sentence representations from\n Research, 6(Nov):1817\u20131853. natural language inference data. In Proceedings of\n the 2017 Conference on Empirical Methods in Nat-\n Luisa Bentivogli, Bernardo Magnini, Ido Dagan, ural Language Processing, pages 670\u2013680, Copen-\n Hoa Trang Dang, and Danilo Giampiccolo. 2009. hagen, Denmark. Association for Computational\n The fifth PASCAL recognizing textual entailment Linguistics.\n challenge. In TAC. NIST. Andrew M Dai and Quoc V Le. 
", "mimetype": "text/plain", "start_char_idx": 65, "end_char_idx": 2529, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "d1afa468-be5c-4597-8f48-c90574dff711": {"__data__": {"id_": "d1afa468-be5c-4597-8f48-c90574dff711", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5d240373-8164-4fbe-9ff4-02a17074549b", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b9d83a9300be27e862369e9733ab621bde8059435a32ce45655d29b06eaad2ea", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "6808912a-ceb2-47ba-9281-2f1c06afe3d9", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "226fd2b43894cd7d2595c0ca96cab1d872db2c703dd2ad5db2cc65ac79e93989", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "d854ece0-8e05-4e06-ba7d-442eb5a771eb", "node_type": "1", "metadata": {}, "hash": "bff0b7447221c74fc85f9a8b958569fe88d5566febc455d092f91aea495fe7fa", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2015. Semi-supervised\n sequence learning. In Advances in neural informa-\n John Blitzer, Ryan McDonald, and Fernando Pereira. tion processing systems, pages 3079\u20133087.\n 2006. Domain adaptation with structural correspon-\n dence learning. In Proceedings of the 2006 confer- J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-\n ence on empirical methods in natural language pro- Fei. 2009. ImageNet: A Large-Scale Hierarchical\n cessing, pages 120\u2013128. Association for Computa- Image Database. In CVPR09.\n tional Linguistics.\n William B Dolan and Chris Brockett. 2005. Automati-\n Samuel R. Bowman, Gabor Angeli, Christopher Potts, cally constructing a corpus of sentential paraphrases.\n and Christopher D. Manning. 2015. A large anno- In Proceedings of the Third International Workshop\n tated corpus for learning natural language inference. on Paraphrasing (IWP2005).\n In EMNLP. Association for Computational Linguis-\n tics. William Fedus, Ian Goodfellow, and Andrew M Dai.\n 2018. Maskgan: Better text generation via filling in\n Peter F Brown, Peter V Desouza, Robert L Mercer, the . arXiv preprint arXiv:1801.07736.\n Vincent J Della Pietra, and Jenifer C Lai. 1992.\n Class-based n-gram models of natural language. Dan Hendrycks and Kevin Gimpel. 2016. Bridging\n Computational linguistics, 18(4):467\u2013479. nonlinearities and stochastic regularizers with gaus-\n sian error linear units. CoRR, abs/1606.08415.\n Daniel Cer, Mona Diab, Eneko Agirre, Inigo Lopez- Felix Hill, Kyunghyun Cho, and Anna Korhonen. 2016.\n Gazpio, and Lucia Specia. 2017. 
Semeval-2017 task 1: Semantic textual similarity multilingual and crosslingual focused evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 1\u201314, Vancouver, Canada. Association for Computational Linguistics.\n\nFelix Hill, Kyunghyun Cho, and Anna Korhonen. 2016. Learning distributed representations of sentences from unlabelled data. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics.\n\nCiprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, and Tony Robinson. 2013. One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint arXiv:1312.3005.\n\nJeremy Howard and Sebastian Ruder. 2018. Universal language model fine-tuning for text classification. In ACL. Association for Computational Linguistics.\n\nMinghao Hu, Yuxing Peng, Zhen Huang, Xipeng Qiu, Furu Wei, and Ming Zhou. ", "mimetype": "text/plain", "start_char_idx": 2529, "end_char_idx": 7696, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "d854ece0-8e05-4e06-ba7d-442eb5a771eb": {"__data__": {"id_": "d854ece0-8e05-4e06-ba7d-442eb5a771eb", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5d240373-8164-4fbe-9ff4-02a17074549b", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b9d83a9300be27e862369e9733ab621bde8059435a32ce45655d29b06eaad2ea", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "d1afa468-be5c-4597-8f48-c90574dff711", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "dac55530bb0e1e04298ea3174a6d8afeaf75774267b062d665bf2cd5d72b6fd1", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "f48e1778-26db-49b0-89e7-e04c961609cc", "node_type": "1", "metadata": {}, "hash": "c3b0b9664baec8cb5cb9882ee4c32a97bbb75d76c9b9c3345e8ba4e7a18baf07", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2018. Reinforced mnemonic reader for machine reading comprehension. In IJCAI.\n\nZ. Chen, H. Zhang, X. Zhang, and L. Zhao. 2018. Quora question pairs.\n\nChristopher Clark and Matt Gardner. 2018. Simple and effective multi-paragraph reading comprehension. In ACL.\n\nYacine Jernite, Samuel R. Bowman, and David Sontag. 
", "mimetype": "text/plain", "start_char_idx": 7696, "end_char_idx": 8436, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "f48e1778-26db-49b0-89e7-e04c961609cc": {"__data__": {"id_": "f48e1778-26db-49b0-89e7-e04c961609cc", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5d240373-8164-4fbe-9ff4-02a17074549b", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b9d83a9300be27e862369e9733ab621bde8059435a32ce45655d29b06eaad2ea", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "d854ece0-8e05-4e06-ba7d-442eb5a771eb", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b3aa4da2f8a1cbc7f15222336bb2c50764469bdc5c7b461c11c4da23ea5bd7b2", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "f1a6746d-1e02-49e9-be62-360454d78ce3", "node_type": "1", "metadata": {}, "hash": "37a7841cfb432669c0b15be4123df867a05922b3a4ee0b5c9b96d0b79351ef6f", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2017. Discourse-based objectives for fast un-\n and effective multi-paragraph reading comprehen- supervised sentence representation learning. CoRR,\n sion. 
", "mimetype": "text/plain", "start_char_idx": 8436, "end_char_idx": 8717, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "f1a6746d-1e02-49e9-be62-360454d78ce3": {"__data__": {"id_": "f1a6746d-1e02-49e9-be62-360454d78ce3", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "5d240373-8164-4fbe-9ff4-02a17074549b", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b9d83a9300be27e862369e9733ab621bde8059435a32ce45655d29b06eaad2ea", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "f48e1778-26db-49b0-89e7-e04c961609cc", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "39cb5ba8f8c815af3d966b4807c50db18346ef77ac003de3e592e7a97b6f3648", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "In ACL. abs/1705.00557.", "mimetype": "text/plain", "start_char_idx": 8717, "end_char_idx": 8806, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "388d7ccb-8037-4295-9e61-5e7bc66581e2": {"__data__": {"id_": "388d7ccb-8037-4295-9e61-5e7bc66581e2", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "74359064-bf24-40e0-9818-ab1d000bff3e", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "63cb94e1411b5ccf6b63ba01e87da25eb67b1a39b41578fc4e232688d086a98b", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "8e29386d-e646-4e6b-8096-3545b568bd44", "node_type": "1", "metadata": {}, "hash": "7a1e1c12c727340235d4c97b34e78540c1cbc5f0d42532c29b4d78a71dc257af", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " Mandar Joshi, Eunsol Choi, Daniel S Weld, and Luke\n Zettlemoyer. 2017. Triviaqa: A large scale distantly\n supervised challenge dataset for reading comprehen-\n sion. 
In ACL.\n\n ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 185, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "8e29386d-e646-4e6b-8096-3545b568bd44": {"__data__": {"id_": "8e29386d-e646-4e6b-8096-3545b568bd44", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "74359064-bf24-40e0-9818-ab1d000bff3e", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "63cb94e1411b5ccf6b63ba01e87da25eb67b1a39b41578fc4e232688d086a98b", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "388d7ccb-8037-4295-9e61-5e7bc66581e2", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "4a2ff896ea82da46620dba0e1daf8725c78ea8ecb65c1904e66295ef2bd8b681", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "48a545c7-3964-4115-88cc-e2df29b360a0", "node_type": "1", "metadata": {}, "hash": "9de51d651b0539ae2768a76c255f5d2abaa2ca6bb83ccc5617edf623f7c82df5", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Ryan Kiros, Yukun Zhu, Ruslan R Salakhutdinov,\n Richard Zemel, Raquel Urtasun, Antonio Torralba,\n and Sanja Fidler. 
", "mimetype": "text/plain", "start_char_idx": 185, "end_char_idx": 307, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "48a545c7-3964-4115-88cc-e2df29b360a0": {"__data__": {"id_": "48a545c7-3964-4115-88cc-e2df29b360a0", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "74359064-bf24-40e0-9818-ab1d000bff3e", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "63cb94e1411b5ccf6b63ba01e87da25eb67b1a39b41578fc4e232688d086a98b", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "8e29386d-e646-4e6b-8096-3545b568bd44", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "2c1af3ed8808aa9d2ef1739e0ec9a336ed0872382c0b1734d3f39e5a792bdf3d", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "4ae1f21b-eae6-41ac-a45b-c08dcc7346e5", "node_type": "1", "metadata": {}, "hash": "2ea676273b06bbd9b193563a4ebf99b2abf4c42a926139869658b1e4101b0e91", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2015. Skip-thought vectors. In\n Advances in neural information processing systems,\n pages 3294\u20133302.\nQuoc Le and Tomas Mikolov. 2014. Distributed rep-\n resentations of sentences and documents. In Inter-\n national Conference on Machine Learning, pages\n 1188\u20131196.\nHector J Levesque, Ernest Davis, and Leora Morgen-\n stern. 2011. The winograd schema challenge. In\n Aaai spring symposium: Logical formalizations of\n commonsense reasoning, volume 46, page 47.\nLajanugen Logeswaran and Honglak Lee. 2018. An\n efficient framework for learning sentence represen-\n tations. In International Conference on Learning\n Representations.\nBryan McCann, James Bradbury, Caiming Xiong, and\n Richard Socher. 2017. Learned in translation: Con-\n textualized word vectors. In NIPS.\n\nOren Melamud, Jacob Goldberger, and Ido Dagan.\n 2016. context2vec: Learning generic context em-\n bedding with bidirectional LSTM. In CoNLL.\n Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Cor-\n rado, and Jeff Dean. 2013. Distributed representa-\n tions of words and phrases and their compositional-\n ity. In Advances in Neural Information Processing\n Systems 26, pages 3111\u20133119. Curran Associates,\n Inc.\n Andriy Mnih and Geoffrey E Hinton. 
", "mimetype": "text/plain", "start_char_idx": 307, "end_char_idx": 1570, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "4ae1f21b-eae6-41ac-a45b-c08dcc7346e5": {"__data__": {"id_": "4ae1f21b-eae6-41ac-a45b-c08dcc7346e5", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "74359064-bf24-40e0-9818-ab1d000bff3e", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "63cb94e1411b5ccf6b63ba01e87da25eb67b1a39b41578fc4e232688d086a98b", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "48a545c7-3964-4115-88cc-e2df29b360a0", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "32a352e883afde83740202d56a5eee63aa32aa05471451b5aecc0494b951ae4a", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "f7ae2b3e-8e5f-4317-9483-6029a02f4a66", "node_type": "1", "metadata": {}, "hash": "e759a0b2061099a51b8a4833c5e182068c3f6a6af5e2c637fed9c28fd53d3166", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2009. A scal-\n able hierarchical distributed language model.\n D. Koller, D. Schuurmans, Y. Bengio, and L. Bot-\n tou, editors, Advances in Neural Information Pro-\n cessing Systems 21, pages 1081\u20131088. Curran As-\n sociates, Inc.\n Ankur P Parikh, Oscar T\u00a8ackstr\u00a8om, Dipanjan Das, and\n Jakob Uszkoreit. 
", "mimetype": "text/plain", "start_char_idx": 1570, "end_char_idx": 1887, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "f7ae2b3e-8e5f-4317-9483-6029a02f4a66": {"__data__": {"id_": "f7ae2b3e-8e5f-4317-9483-6029a02f4a66", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "74359064-bf24-40e0-9818-ab1d000bff3e", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "63cb94e1411b5ccf6b63ba01e87da25eb67b1a39b41578fc4e232688d086a98b", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "4ae1f21b-eae6-41ac-a45b-c08dcc7346e5", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "9d1fddbe4bf045683de124d183fd5d71823e0de08303edccda49b94f7a06bd6a", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "cd5b0464-f7f4-465c-894e-93dd7a8f1e77", "node_type": "1", "metadata": {}, "hash": "7134eafd2306060597d120c1c1d81ca67b6a38347527017f8162a7c010a88a70", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2016. A decomposable attention\n model for natural language inference. In EMNLP.\n Jeffrey Pennington, Richard Socher, and Christo-\n pher D. Manning. 2014. Glove: Global vectors for\n word representation. In Empirical Methods in Nat-\n ural Language Processing (EMNLP), pages 1532\u2013\n 1543.\n Matthew Peters, Waleed Ammar, Chandra Bhagavat-\n ula, and Russell Power. 2017. 
Semi-supervised se-\n quence tagging with bidirectional language models.\n In ACL.\n\n ", "mimetype": "text/plain", "start_char_idx": 1887, "end_char_idx": 2359, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "cd5b0464-f7f4-465c-894e-93dd7a8f1e77": {"__data__": {"id_": "cd5b0464-f7f4-465c-894e-93dd7a8f1e77", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "74359064-bf24-40e0-9818-ab1d000bff3e", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "63cb94e1411b5ccf6b63ba01e87da25eb67b1a39b41578fc4e232688d086a98b", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "f7ae2b3e-8e5f-4317-9483-6029a02f4a66", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b278f032dbb5c904b0a6d0e7d0e86eae73b5c83c0bd47274ac3e8fa844d2439a", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Matthew Peters, Mark Neumann, Mohit Iyyer, Matt\n Gardner, Christopher Clark, Kenton Lee, and Luke\n Zettlemoyer. 2018a. Deep contextualized word rep-\n resentations. In NAACL.\n Matthew Peters, Mark Neumann, Luke Zettlemoyer,\n and Wen-tau Yih. 2018b. Dissecting contextual\n word embeddings: Architecture and representation.\n In Proceedings of the 2018 Conference on Empiri-\n cal Methods in Natural Language Processing, pages\n 1499\u20131509.\n\n Alec Radford, Karthik Narasimhan, Tim Salimans, and\n Ilya Sutskever. 2018. Improving language under-\n standing with unsupervised learning. Technical re-\n port, OpenAI.\n Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and\n Percy Liang. 2016. Squad: 100,000+ questions for\n machine comprehension of text. In Proceedings of\n the 2016 Conference on Empirical Methods in Nat-\n ural Language Processing, pages 2383\u20132392.\n Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, and\n Hannaneh Hajishirzi. 2017. Bidirectional attention\n flow for machine comprehension. In ICLR.\n Richard Socher, Alex Perelygin, Jean Wu, Jason\n Chuang, Christopher D Manning, Andrew Ng, and\n Christopher Potts. 2013. Recursive deep models\n for semantic compositionality over a sentiment tree-\n bank. In Proceedings of the 2013 conference on\n empirical methods in natural language processing,\n pages 1631\u20131642.\n Fu Sun, Linyang Li, Xipeng Qiu, and Yang Liu.\n 2018. U-net: Machine reading comprehension\n with unanswerable questions. arXiv preprint\n arXiv:1810.06638.\n Wilson L Taylor. 1953. Cloze procedure: A new\n tool for measuring readability. Journalism Bulletin,\n 30(4):415\u2013433.\n\n Erik F Tjong Kim Sang and Fien De Meulder.\n 2003. 
Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In CoNLL.\n\nJoseph Turian, Lev Ratinov, and Yoshua Bengio. 2010. Word representations: A simple and general method for semi-supervised learning. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, ACL \u201910, pages 384\u2013394.\n\nAshish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, pages 6000\u20136010.\n\nPascal Vincent, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Manzagol. 2008. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th international conference on Machine learning, pages 1096\u20131103. ACM.\n\nAlex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel Bowman. 2018a. Glue: A multi-task benchmark and analysis platform", "mimetype": "text/plain", "start_char_idx": 2359, "end_char_idx": 5627, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "5042ecfe-b092-4370-a34d-f747863465a0": {"__data__": {"id_": "5042ecfe-b092-4370-a34d-f747863465a0", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "166109f8-fb93-4127-abe5-964ed07035c2", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "65af298d8c0241a3ccbb049b875129d8cf8d84b8ce9413dff19e760c11ba7913", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "3efbee33-f0fd-4f16-bf5c-3c7a897c1562", "node_type": "1", "metadata": {}, "hash": "3e04125e8542e61d7f68cfa7c1b35db6fe7e7db713c3bfdd2a1376a20dc89822", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " for natural language understanding. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 353\u2013355.\n\nWei Wang, Ming Yan, and Chen Wu. 2018b. Multi-granularity hierarchical attention fusion networks for reading comprehension and question answering. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics.\n\nAlex Warstadt, Amanpreet Singh, and Samuel R Bowman. 2018. Neural network acceptability judgments. arXiv preprint arXiv:1805.12471.\n\nAdina Williams, Nikita Nangia, and Samuel R Bowman. 
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 707, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "3efbee33-f0fd-4f16-bf5c-3c7a897c1562": {"__data__": {"id_": "3efbee33-f0fd-4f16-bf5c-3c7a897c1562", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "166109f8-fb93-4127-abe5-964ed07035c2", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "65af298d8c0241a3ccbb049b875129d8cf8d84b8ce9413dff19e760c11ba7913", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "5042ecfe-b092-4370-a34d-f747863465a0", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "ac66a64796c41de33dcab9aa3e041ccd7824d97d64672665be8e42e4253b6f8e", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "87a4fe6d-ce60-4893-a3ab-faea7aa65407", "node_type": "1", "metadata": {}, "hash": "3e95f0caa5abb302a198469d6228b17e43a0e215ff938e3755e9541d660364a4", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2018. A broad-coverage challenge corpus\n for sentence understanding through inference. 
In\n NAACL.\n\n", "mimetype": "text/plain", "start_char_idx": 707, "end_char_idx": 816, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "87a4fe6d-ce60-4893-a3ab-faea7aa65407": {"__data__": {"id_": "87a4fe6d-ce60-4893-a3ab-faea7aa65407", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "166109f8-fb93-4127-abe5-964ed07035c2", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "65af298d8c0241a3ccbb049b875129d8cf8d84b8ce9413dff19e760c11ba7913", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "3efbee33-f0fd-4f16-bf5c-3c7a897c1562", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "7289a56bcc3a329f219dbe8f3c17f35d43e9ecab375b0d711f807efe993d5ee9", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "4ee6eb46-8fd4-43f4-b321-1711067a516f", "node_type": "1", "metadata": {}, "hash": "2123f35ca78e606423d1e8bf69e910aa33a9a9e5abdfaba7613e8243792233e7", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V\n Le, Mohammad Norouzi, Wolfgang Macherey,\n Maxim Krikun, Yuan Cao, Qin Gao, Klaus\n Macherey, et al. 
", "mimetype": "text/plain", "start_char_idx": 816, "end_char_idx": 970, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "4ee6eb46-8fd4-43f4-b321-1711067a516f": {"__data__": {"id_": "4ee6eb46-8fd4-43f4-b321-1711067a516f", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "166109f8-fb93-4127-abe5-964ed07035c2", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "65af298d8c0241a3ccbb049b875129d8cf8d84b8ce9413dff19e760c11ba7913", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "87a4fe6d-ce60-4893-a3ab-faea7aa65407", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "d50d92f080832f7adf7adfca276409259aaeb76000588441dc7dfa1a5e5a4957", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "2016. Google\u2019s neural ma-\n chine translation system: Bridging the gap between\n human and machine translation. arXiv preprint\n arXiv:1609.08144.\nJason Yosinski, Jeff Clune, Yoshua Bengio, and Hod\n Lipson. 2014. How transferable are features in deep\n neural networks? In Advances in neural information\n processing systems, pages 3320\u20133328.\n\nAdams Wei Yu, David Dohan, Minh-Thang Luong, Rui\n Zhao, Kai Chen, Mohammad Norouzi, and Quoc V\n Le. 2018. QANet: Combining local convolution\n with global self-attention for reading comprehen-\n sion. In ICLR.\nRowan Zellers, Yonatan Bisk, Roy Schwartz, and Yejin\n Choi. 2018. Swag: A large-scale adversarial dataset\n for grounded commonsense inference. In Proceed-\n ings of the 2018 Conference on Empirical Methods\n in Natural Language Processing (EMNLP).\nYukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhut-\n dinov, Raquel Urtasun, Antonio Torralba, and Sanja\n Fidler. 2015. Aligning books and movies: Towards\n story-like visual explanations by watching movies\n and reading books. 
In Proceedings of the IEEE international conference on computer vision, pages 19\u201327.\n\n Appendix for \u201cBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding\u201d\n We organize the appendix into three sections:\n\n \u2022 Additional implementation details for BERT are presented in Appendix A;\n\n \u2022 Additional details for our experiments are presented in Appendix B; and\n\n \u2022 Additional ablation studies are presented in Appendix C.\n\n We present additional ablation studies for BERT including:\n\n \u2013 Effect of Number of Training Steps; and\n\n \u2013 Ablation for Different Masking Procedures.\n\nA Additional Details for BERT\n\nA.1 Illustration of the Pre-training Tasks\nWe provide examples of the pre-training tasks in the following.\n\nMasked LM and the Masking Procedure Assuming the unlabeled sentence is my dog is hairy, and during the random masking procedure we chose the 4-th token (which corresponds to hairy), our masking procedure can be further illustrated by\n\n \u2022 80% of the time: Replace the word with the [MASK] token, e.g., my dog is hairy \u2192 my dog is [MASK]\n\n \u2022 10% of the time: Replace the word with a random word, e.g., my dog is hairy \u2192 my dog is apple\n\n \u2022 10% of the time: Keep the word unchanged, e.g., my dog is hairy \u2192 my dog is hairy. The purpose of this is to bias the representation towards the actual observed word.\n\n The advantage of this procedure is that the Transformer encoder does not know which words it will be asked to predict or which have been replaced by random words, so it is forced to keep a distributional contextual representation of every input token. Additionally, because random replacement only occurs for 1.5% of all tokens (i.e., 10% of 15%), this does not seem to harm the model\u2019s language understanding capability. 
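The three branches above lend themselves to a direct implementation. Below is a hedged Python sketch of the 80%/10%/10% procedure; the function name, the toy vocabulary, and operating on whole word strings rather than WordPiece ids are illustrative assumptions, not the original code.

import random

def mask_tokens(tokens, vocab, mask_token="[MASK]", mlm_prob=0.15):
    # Select 15% of tokens for prediction, then apply the 80/10/10 split.
    inputs, targets = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if random.random() >= mlm_prob:
            continue
        targets[i] = tok                      # predict the original token here
        r = random.random()
        if r < 0.8:
            inputs[i] = mask_token            # 80%: replace with [MASK]
        elif r < 0.9:
            inputs[i] = random.choice(vocab)  # 10%: replace with a random word
        # remaining 10%: keep the observed word unchanged
    return inputs, targets

print(mask_tokens("my dog is hairy".split(), vocab=["apple", "book", "run"]))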
In Section C.2, we evaluate the impact of this procedure.\n Compared to standard language model training, the masked LM only makes predictions on 15% of tokens in each batch, which suggests that more pre-training steps may be required for the model", "mimetype": "text/plain", "start_char_idx": 970, "end_char_idx": 4266, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "305ade7f-2710-4529-8972-2d49133a67ed": {"__data__": {"id_": "305ade7f-2710-4529-8972-2d49133a67ed", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "77ff5d8d-0dda-479c-9295-ad7f30935d02", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "0c80f68887fa99b2003a00b5141ac73dfc42f506fee14a4cc318865998b26184", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "8b857906-4aa7-4a72-9b9b-fe4e7472a16b", "node_type": "1", "metadata": {}, "hash": "eb9de8cb3ead53b34236e9a40d0e1851fd295484069b2b411cc0229cc5387b45", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "[Figure 3 diagram: stacked Transformer (Trm) layers over input embeddings E_1..E_N producing T_1..T_N for BERT and OpenAI GPT; stacked left-to-right and right-to-left LSTM layers for ELMo]\n\n Figure 3: Differences in pre-training model architectures. BERT uses a bidirectional Transformer. OpenAI GPT uses a left-to-right Transformer. ELMo uses the concatenation of independently trained left-to-right and right-to-left LSTMs to generate features for downstream tasks. Among the three, only BERT representations are jointly conditioned on both left and right context in all layers. In addition to the architecture differences, BERT and OpenAI GPT are fine-tuning approaches, while ELMo is a feature-based approach.\n\nto converge. In Section C.1 we demonstrate that MLM does converge marginally slower than a left-to-right model (which predicts every token), but the empirical improvements of the MLM model far outweigh the increased training cost.\n\nNext Sentence Prediction The next sentence prediction task can be illustrated in the following examples.\n\nInput = [CLS] the man went to [MASK] store [SEP] he bought a gallon [MASK] milk [SEP]\nLabel = IsNext\n\nInput = [CLS] the man [MASK] to the store [SEP] penguin [MASK] are flight ##less birds [SEP]\nLabel = NotNext\n\nA.2 Pre-training Procedure\nTo generate each training input sequence, we sample two spans of text from the corpus, which we refer to as \u201csentences\u201d even though they are typically much longer than single sentences (but can be shorter also). The first sentence receives the A embedding and the second receives the B embedding. 
50% of the time B is the actual next sentence\nthat follows A and 50% of the time it is a random\nsentence, which is done for the \u201cnext sentence pre-\ndiction\u201d task. They are sampled such that the com-\nbined length is \u2264 512 tokens. The LM masking is\napplied after WordPiece tokenization with a uni-\nform masking rate of 15%, and no special consid-\neration given to partial word pieces.\n We train with a batch size of 256 sequences (256\nsequences * 512 tokens = 131,072 tokens/batch)\nfor 1,000,000 steps, which is approximately 40\nepochs over the 3.3 billion word corpus. We\nuse Adam with a learning rate of 1e-4, \u03b21 = 0.9,\n\u03b22 = 0.999, L2 weight decay of 0.01, learning\nrate warmup over the first 10,000 steps, and linear\ndecay of the learning rate. We use a dropout prob-\nability of 0.1 on all layers. ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 3338, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "8b857906-4aa7-4a72-9b9b-fe4e7472a16b": {"__data__": {"id_": "8b857906-4aa7-4a72-9b9b-fe4e7472a16b", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "77ff5d8d-0dda-479c-9295-ad7f30935d02", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "0c80f68887fa99b2003a00b5141ac73dfc42f506fee14a4cc318865998b26184", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "305ade7f-2710-4529-8972-2d49133a67ed", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "b07e8aeacea96fb9c4d2b4780c8fbfab3315c10acdbc5ebf1b197df3b56f1907", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "e4240080-f6c3-485f-b3b8-bc17acedd026", "node_type": "1", "metadata": {}, "hash": "020cf96cf4dc57fe2ffc75bbcf50b34c86db6c41f3e9cb83059389f2869d7b8f", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "We use a gelu acti-\nvation (Hendrycks and Gimpel, 2016) rather than\nthe standard relu, following OpenAI GPT. The\ntraining loss is the sum of the mean masked LM\nlikelihood and the mean next sentence prediction\nlikelihood.\n Training of BERTBASE was performed on 4\nCloud TPUs in Pod configuration (16 TPU chips\ntotal).13 Training of BERTLARGE was performed\non 16 Cloud TPUs (64 TPU chips total). Each pre-\ntraining took 4 days to complete. Longer sequences\nare disproportionately expensive because attention\nis quadratic in the sequence length. 
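As an aside, the optimizer schedule quoted above (Adam at a peak rate of 1e-4, warmup over the first 10,000 steps, then linear decay to zero across the 1,000,000 steps) is straightforward to reproduce. A minimal PyTorch sketch, in which the stand-in model and all names are illustrative assumptions rather than the released training code:

    import torch

    WARMUP_STEPS, TOTAL_STEPS = 10_000, 1_000_000

    def lr_factor(step):
        # Linear warmup to the peak learning rate, then linear decay to zero.
        if step < WARMUP_STEPS:
            return step / WARMUP_STEPS
        return max(0.0, (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS))

    model = torch.nn.Linear(768, 768)  # stand-in for the BERT encoder
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                                 betas=(0.9, 0.999), weight_decay=0.01)
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_factor)
    # scheduler.step() is then called once per optimizer step.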
To speed up pre-training in our experiments, we\npre-train the model with a sequence length of 128\nfor 90% of the steps. Then, for the remaining 10%\nof the steps, we train on sequences of length 512\nto learn the positional embeddings.\n\nA.3 Fine-tuning Procedure\nFor fine-tuning, most model hyperparameters are\nthe same as in pre-training, with the exception of\nthe batch size, learning rate, and number of train-\ning epochs. ", "mimetype": "text/plain", "start_char_idx": 3338, "end_char_idx": 4290, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "e4240080-f6c3-485f-b3b8-bc17acedd026": {"__data__": {"id_": "e4240080-f6c3-485f-b3b8-bc17acedd026", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "77ff5d8d-0dda-479c-9295-ad7f30935d02", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "0c80f68887fa99b2003a00b5141ac73dfc42f506fee14a4cc318865998b26184", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "8b857906-4aa7-4a72-9b9b-fe4e7472a16b", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "ffddeddb1979247b208f81899c271a3d56e3b04902804318e7d344dcc44b1280", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "The dropout probability was always\nkept at 0.1. 
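To make the setup concrete, here is a minimal sketch of a fine-tuning model: a pre-trained encoder plus a single task-specific output layer, as described above. The encoder stub, hidden size, and names are illustrative assumptions rather than the released BERT code:

    import torch
    import torch.nn as nn

    class SequenceClassifier(nn.Module):
        # Pre-trained encoder + one new output layer; only the output
        # layer's parameters are learned from scratch.
        def __init__(self, encoder, hidden_size=768, num_labels=2):
            super().__init__()
            self.encoder = encoder            # stand-in for pre-trained BERT
            self.dropout = nn.Dropout(0.1)    # kept at 0.1, as noted above
            self.classifier = nn.Linear(hidden_size, num_labels)

        def forward(self, input_ids):
            hidden = self.encoder(input_ids)  # assumed [batch, seq, hidden]
            cls = hidden[:, 0]                # representation of [CLS]
            return self.classifier(self.dropout(cls))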
The optimal hyperparameter values\nare task-specific, but we found the following range\nof possible values to work well across all tasks:\n\n \u2022 Batch size: 16, 32\n 13https://cloudplatform.googleblog.com/2018/06/Cloud-TPU-now-offers-preemptible-pricing-and-global-availability.html", "mimetype": "text/plain", "start_char_idx": 4290, "end_char_idx": 4621, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "7432a915-d9a3-49ac-bda3-f72850d06063": {"__data__": {"id_": "7432a915-d9a3-49ac-bda3-f72850d06063", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "aaadb951-b38a-4e3a-accf-f6bfdb2f05a6", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "1e00f7be7373d801694101a53b6f6c259fe3d81300983be5be248aaf28f5803c", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "f6bb6a46-a49b-41ef-b78b-84c7502bebb9", "node_type": "1", "metadata": {}, "hash": "35ac692a69e27b8d07e5c2696f898d3e00d1939ad4998df5fd3f1d2c8df44247", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " \u2022 Learning rate (Adam): 5e-5, 3e-5, 2e-5\n \u2022 Number of epochs: 2, 3, 4\n\n We also observed that large data sets (e.g.,\n100k+ labeled training examples) were far less\nsensitive to hyperparameter choice than small data\nsets. 
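A full sweep over these ranges is only 2 x 3 x 3 = 18 runs. A minimal sketch of such a sweep follows; train_and_eval is a hypothetical placeholder for an actual fine-tuning run that returns Dev-set accuracy:

    import itertools

    def train_and_eval(batch_size, lr, epochs):
        # Hypothetical placeholder: fine-tune with these settings
        # and return accuracy on the development set.
        raise NotImplementedError

    best_acc, best_cfg = -1.0, None
    for bs, lr, ep in itertools.product((16, 32),
                                        (5e-5, 3e-5, 2e-5),
                                        (2, 3, 4)):
        acc = train_and_eval(batch_size=bs, lr=lr, epochs=ep)
        if acc > best_acc:
            best_acc, best_cfg = acc, (bs, lr, ep)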
", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 230, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "f6bb6a46-a49b-41ef-b78b-84c7502bebb9": {"__data__": {"id_": "f6bb6a46-a49b-41ef-b78b-84c7502bebb9", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "aaadb951-b38a-4e3a-accf-f6bfdb2f05a6", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "1e00f7be7373d801694101a53b6f6c259fe3d81300983be5be248aaf28f5803c", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "7432a915-d9a3-49ac-bda3-f72850d06063", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "054e75f13f4902325c847e83695ac16adca2a5333266a2405b7c56020376292b", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "7200837e-7b73-4376-b06f-d9b4c910cd3e", "node_type": "1", "metadata": {}, "hash": "e8ce4d0a1f79b183a9a86c72d13537be3d5b328bdf892e19901c89bc0de7d5c2", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "Fine-tuning is typically very fast, so it is rea-\nsonable to simply run an exhaustive search over\nthe above parameters and choose the model that\nperforms best on the development set.\n\nA.4 Comparison of BERT, ELMo ,and\n OpenAI GPT\nHere we studies the differences in recent popular\nrepresentation learning models including ELMo,\nOpenAI GPT and BERT. The comparisons be-\ntween the model architectures are shown visually\nin Figure 3. Note that in addition to the architec-\nture differences, BERT and OpenAI GPT are fine-\ntuning approaches, while ELMo is a feature-based\napproach.\n The most comparable existing pre-training\nmethod to BERT is OpenAI GPT, which trains a\nleft-to-right Transformer LM on a large text cor-\npus. In fact, many of the design decisions in BERT\nwere intentionally made to make it as close to\nGPT as possible so that the two methods could be\nminimally compared. 
The core argument of this\nwork is that the bi-directionality and the two pre-\ntraining tasks presented in Section 3.1 account for\nthe majority of the empirical improvements, but\nwe do note that there are several other differences\nbetween how BERT and GPT were trained:\n\n \u2022 GPT is trained on the BooksCorpus (800M\n words); BERT is trained on the BooksCor-\n pus (800M words) and Wikipedia (2,500M\n words).\n \u2022 GPT uses a sentence separator ([SEP]) and\n classifier token ([CLS]) which are only in-\n troduced at fine-tuning time; BERT learns\n [SEP], [CLS] and sentence A/B embed-\n dings during pre-training.\n \u2022 GPT was trained for 1M steps with a batch\n size of 32,000 words; BERT was trained for\n 1M steps with a batch size of 128,000 words.\n\n \u2022 GPT used the same learning rate of 5e-5 for\n all fine-tuning experiments; BERT chooses a\n task-specific fine-tuning learning rate which\n performs the best on the development set.\n To isolate the effect of these differences, we per-\nform ablation experiments in Section 5.1 which\ndemonstrate that the majority of the improvements\nare in fact coming from the two pre-training tasks\nand the bidirectionality they enable.\n\nA.5 Illustrations of Fine-tuning on Different\n Tasks\nThe illustration of fine-tuning BERT on different\ntasks can be seen in Figure 4. Our task-specific\nmodels are formed by incorporating BERT with\none additional output layer, so a minimal num-\nber of parameters need to be learned from scratch.\nAmong the tasks, (a) and (b) are sequence-level\ntasks while (c) and (d) are token-level tasks. In\nthe figure, E represents the input embedding, Ti\nrepresents the contextual representation of token i,\n[CLS] is the special symbol for classification out-\nput, and [SEP] is the special symbol to separate\nnon-consecutive token sequences.\n\nB Detailed Experimental Setup\nB.1 Detailed Descriptions for the GLUE\n Benchmark Experiments.\nOur GLUE results in Table 1 are obtained\nfrom https://gluebenchmark.com/leaderboard and\nhttps://blog.openai.com/language-unsupervised.\n", "mimetype": "text/plain", "start_char_idx": 230, "end_char_idx": 3269, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "7200837e-7b73-4376-b06f-d9b4c910cd3e": {"__data__": {"id_": "7200837e-7b73-4376-b06f-d9b4c910cd3e", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "aaadb951-b38a-4e3a-accf-f6bfdb2f05a6", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "1e00f7be7373d801694101a53b6f6c259fe3d81300983be5be248aaf28f5803c", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "f6bb6a46-a49b-41ef-b78b-84c7502bebb9", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", 
"file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "e3e97367c6c1af85c11cddda6d32ea6720ce5d064c10807902143af990bd19f1", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "The GLUE benchmark includes the following\ndatasets, the descriptions of which were originally\nsummarized in Wang et al. (2018a):\n\nMNLI Multi-Genre Natural Language Inference\nis a large-scale, crowdsourced entailment classifi-\ncation task (Williams et al., 2018). Given a pair of\nsentences, the goal is to predict whether the sec-\nond sentence is an entailment, contradiction, or\nneutral with respect to the first one.\nQQP Quora Question Pairs is a binary classifi-\ncation task where the goal is to determine if two\nquestions asked on Quora are semantically equiv-\nalent (Chen et al., 2018).\nQNLI Question Natural Language Inference is\na version of the Stanford Question Answering\nDataset (Rajpurkar et al., 2016) which has been\nconverted to a binary classification task (Wang\net al., 2018a). The positive examples are (ques-\ntion, sentence) pairs which do contain the correct\nanswer, and the negative examples are (question,\nsentence) from the same paragraph which do not\ncontain the answer.", "mimetype": "text/plain", "start_char_idx": 3269, "end_char_idx": 4277, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "bc03e397-6cf3-4852-a898-40b4c4063325": {"__data__": {"id_": "bc03e397-6cf3-4852-a898-40b4c4063325", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "de34d689-b92f-4d51-9747-504f3d6499a7", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "45e5affb0a2bcfbbd8b92364275d130db3665278f02b87ddc29514a66d4e5b6b", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "69f5ade7-59a5-4766-95c0-52706c102651", "node_type": "1", "metadata": {}, "hash": "b88fcb5a4ba36ce3506d6c3d76d2d277aa79e92199e8db9a72b70e22fac999ca", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": " Class Class\n Label Label\n\n C T 1 ... T N T [SEP] T \u20191... T \u2019M C T 1 T 2 ... T N\n\n BERT BERT\n\nE [CLS]E 1 ... E N E [SEP] E \u20191... E \u2019M\n[CLS] Tok Tok Tok Tok\n\n\n 1 ... N [SEP] 1 ... M\nE [CLS] E 1 E 2 ... E N\n[CLS]\n[CLS] Tok 1\n Tok 1 Tok 2 Tok N\n ...\n\n Sentence 1 Sentence 2 Single Sentence\n\n Start/End Span O B-PER ... O\n\n C T 1 ... T N T [SEP] T \u20191... T \u2019M C T 1 T 2 ... T N\n\n BERT BERT\n\n E [CLS]E 1 ... E N E [SEP] E \u20191 ... E \u2019M E [CLS] E 1 E 2 ... E N\n\n [CLS] Tok ... Tok [SEP] Tok ... Tok [CLS] Tok 1 Tok 2 ... 
Tok N\n 1 N 1 M\n\n Question Paragraph Single Sentence\n\n Figure 4: Illustrations of Fine-tuning BERT on Different Tasks.\n\nSST-2 The Stanford Sentiment Treebank is a\nbinary single-sentence classification task consist-\ning of sentences extracted from movie reviews\nwith human annotations of their sentiment (Socher\net al., 2013).\n\nCoLA The Corpus of Linguistic Acceptability is\na binary single-sentence classification task, where\nthe goal is to predict whether an English sentence\nis linguistically \u201cacceptable\u201d or not (Warstadt\net al., 2018).\n\nSTS-B The Semantic Textual Similarity Bench-\nmark is a collection of sentence pairs drawn from\nnews headlines and other sources (Cer et al.,\n2017). They were annotated with a score from 1\nto 5 denoting how similar the two sentences are in\nterms of semantic meaning.\nMRPC Microsoft Research Paraphrase Corpus\nconsists of sentence pairs automatically extracted\n\nfrom online news sources, with human annotations\nfor whether the sentences in the pair are semanti-\ncally equivalent (Dolan and Brockett, 2005).\n\nRTE Recognizing Textual Entailment is a bi-\nnary entailment task similar to MNLI, but with\nmuch less training data (Bentivogli et al., 2009).14\n\nWNLI Winograd NLI is a small natural lan-\nguage inference dataset (Levesque et al., 2011).\nThe GLUE webpage notes that there are issues\nwith the construction of this dataset, 15 and every\ntrained system that\u2019s been submitted to GLUE has\nperformed worse than the 65.1 baseline accuracy\nof predicting the majority class. ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 3411, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "69f5ade7-59a5-4766-95c0-52706c102651": {"__data__": {"id_": "69f5ade7-59a5-4766-95c0-52706c102651", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "de34d689-b92f-4d51-9747-504f3d6499a7", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "45e5affb0a2bcfbbd8b92364275d130db3665278f02b87ddc29514a66d4e5b6b", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "bc03e397-6cf3-4852-a898-40b4c4063325", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "e1ade5db5b47d10cff19421ad494efc0d9bda59731a1f1a6ec892c986aaf732b", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "We therefore ex-\nclude this set to be fair to OpenAI GPT. For our\nGLUE submission, we always predicted the ma-\n 14Note that we only report single-task fine-tuning results\nin this paper. 
A multitask fine-tuning approach could poten-\ntially push the performance even further. For example, we\ndid observe substantial improvements on RTE from multi-\ntask training with MNLI.\n 15https://gluebenchmark.com/faq", "mimetype": "text/plain", "start_char_idx": 3411, "end_char_idx": 3814, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "3ab56ff6-1afe-4901-b46a-0160b56c7a48": {"__data__": {"id_": "3ab56ff6-1afe-4901-b46a-0160b56c7a48", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "1d130fec-17ad-4a6a-a2dd-1409c65e4a7d", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "67ea5751ce9c00dc7ddd9fe026d2254e6bf5e49f6d210ca65dd9e2f7604edfa2", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "c6a86942-07f0-4b10-a27c-f02d201c542f", "node_type": "1", "metadata": {}, "hash": "428f9c0596ab18d35b02b20d26927b1c81e0ed04eccf0fb69154c366273dbbaf", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "C Additional Ablation Studies\nC.1 Effect of Number of Training Steps\n\nFigure 5 presents MNLI Dev accuracy after fine-\ntuning from a checkpoint that has been pre-trained\nfor k steps. This allows us to answer the following\nquestions:\n\n 1. Question: Does BERT really need such\n a large amount of pre-training (128,000\n words/batch * 1,000,000 steps) to achieve\n high fine-tuning accuracy?\n Answer: Yes, BERTBASE achieves almost\n 1.0% additional accuracy on MNLI when\n trained on 1M steps compared to 500k steps.\n\n 2. Question: Does MLM pre-training converge\n slower than LTR pre-training, since only 15%\n of words are predicted in each batch rather\n than every word?\n Answer: The MLM model does converge\n slightly slower than the LTR model. How-\n ever, in terms of absolute accuracy the MLM\n model begins to outperform the LTR model\n almost immediately.\n\nC.2 Ablation for Different Masking\n Procedures\nIn Section 3.1, we mention that BERT uses a\nmixed strategy for masking the target tokens when\npre-training with the masked language model\n(MLM) objective. The following is an ablation\nstudy to evaluate the effect of different masking\nstrategies.\n\n [Figure 5 plot omitted: MNLI Dev Accuracy against pre-training steps, comparing BERTBASE (Masked LM) and BERTBASE (Left-to-Right); the caption appears below.]\n Note that the purpose of the masking strategies\nis to reduce the mismatch between pre-training\nand fine-tuning, as the [MASK] symbol never ap-\npears during the fine-tuning stage. We report the\nDev results for both MNLI and NER. 
For NER,\nwe report both fine-tuning and feature-based ap-\nproaches, as we expect the mismatch will be am-\nplified for the feature-based approach, since the model\nwill not have the chance to adjust the representa-\ntions.\n\n Masking Rates              Dev Set Results\n MASK  SAME  RND     MNLI         NER          NER\n                     (Fine-tune)  (Fine-tune)  (Feature-based)\n 80%   10%   10%     84.2         95.4         94.9\n 100%  0%    0%      84.3         94.9         94.0\n 80%   0%    20%     84.1         95.2         94.6\n 80%   20%   0%      84.4         95.2         94.7\n 0%    20%   80%     83.7         94.8         94.6\n 0%    0%    100%    83.6         94.9         94.6\n\n Table 8: Ablation over different masking strategies.\n\n The results are presented in Table 8. ", "mimetype": "text/plain", "start_char_idx": 0, "end_char_idx": 2481, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "c6a86942-07f0-4b10-a27c-f02d201c542f": {"__data__": {"id_": "c6a86942-07f0-4b10-a27c-f02d201c542f", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "1d130fec-17ad-4a6a-a2dd-1409c65e4a7d", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "67ea5751ce9c00dc7ddd9fe026d2254e6bf5e49f6d210ca65dd9e2f7604edfa2", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "3ab56ff6-1afe-4901-b46a-0160b56c7a48", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "e9e54ed853ee5bbdbf469f59fa078b65885530ae445ff05a7b5b16287628e0f6", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "d4f64e83-6625-43c1-8da5-3551fee253a5", "node_type": "1", "metadata": {}, "hash": "3d0387a38319c0c4a6a07cfa63d48bb879e376c7fd84b7549569a6e4d28f4631", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "In the table,\nMASK means that we replace the target token with\nthe [MASK] symbol for MLM; SAME means that\nwe keep the target token as is; RND means that\nwe replace the target token with another random\ntoken.\n The numbers in the left part of the table repre-\nsent the probabilities of the specific strategies used\nduring MLM pre-training (BERT uses 80%, 10%,\n10%). The right part of the table represents the\nDev set results. For the feature-based approach,\nwe concatenate the last 4 layers of BERT as the\nfeatures, which was shown to be the best approach\nin Section 5.3.\n From the table it can be seen that fine-tuning is\nsurprisingly robust to different masking strategies.\nHowever, as expected, using only the MASK strat-\negy was problematic when applying the feature-\nbased approach to NER. 
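As a side note, the feature extraction used here (concatenating the last four hidden layers) can be sketched with the Hugging Face transformers API; this is an editorial illustration under that assumption, not the paper's original implementation:

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased",
                                      output_hidden_states=True)

    enc = tokenizer("John lives in Berlin .", return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    # out.hidden_states: embedding layer + one tensor per Transformer
    # layer, each of shape [1, seq_len, 768]
    features = torch.cat(out.hidden_states[-4:], dim=-1)  # [1, seq_len, 3072]
    # These frozen features would feed a separately trained NER tagger.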
Interestingly, using only\nthe RND strategy performs much worse than our\nstrategy as well.\n", "mimetype": "text/plain", "start_char_idx": 2481, "end_char_idx": 3404, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}, "d4f64e83-6625-43c1-8da5-3551fee253a5": {"__data__": {"id_": "d4f64e83-6625-43c1-8da5-3551fee253a5", "embedding": null, "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "1d130fec-17ad-4a6a-a2dd-1409c65e4a7d", "node_type": "4", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "67ea5751ce9c00dc7ddd9fe026d2254e6bf5e49f6d210ca65dd9e2f7604edfa2", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "c6a86942-07f0-4b10-a27c-f02d201c542f", "node_type": "1", "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}, "hash": "3304a0ebf9902880e1b4a1890134196408a77c6ef7a614889fba777106713ac9", "class_name": "RelatedNodeInfo"}}, "metadata_template": "{key}: {value}", "metadata_separator": "\n", "text": "[Figure 5 plot; x-axis: Pre-training Steps (Thousands).]\n\n Figure 5: Ablation over number of training steps. This\n shows the MNLI accuracy after fine-tuning, starting\n from model parameters that have been pre-trained for\n k steps. 
The x-axis is the value of k.", "mimetype": "text/plain", "start_char_idx": 3404, "end_char_idx": 3801, "metadata_seperator": "\n", "text_template": "{metadata_str}\n\n{content}", "class_name": "TextNode"}, "__type__": "1"}}, "docstore/metadata": {"8d55e99e-029a-47c6-8fae-5bff1ee7672a": {"doc_hash": "2aca2bfa89e3ba6ce60d62fd9a99c06e1bf5d9cd3ff7a62bdec2bf59ae34d9f9", "ref_doc_id": "f94e48ce-e161-46d1-800f-430e38b4962c"}, "1d81b4fd-e928-4180-8ef6-a41371ac1fed": {"doc_hash": "d186919dd54075a6eda65382523733547789e6d01b2b18af581dc53ef19974ae", "ref_doc_id": "f94e48ce-e161-46d1-800f-430e38b4962c"}, "a58908ab-8178-40cd-b2a3-efcaf65228f2": {"doc_hash": "edd6518c76da299b4e2642067d6fafe03a3f884aa22306e6a84a97607f1a363c", "ref_doc_id": "dcba1f47-98fb-4b83-8b82-bb0a4e4c9db8"}, "ab890d58-e9ed-49ae-9c78-282a91366161": {"doc_hash": "c48e35db928469b51656e9cd7516a4f31009d3c2ebcf1de606c315125da340d6", "ref_doc_id": "dcba1f47-98fb-4b83-8b82-bb0a4e4c9db8"}, "9bbcaaca-e282-4f65-a5c2-9a5708a228eb": {"doc_hash": "01e61f2445fd436f8507df047920f5265c41b2975f6ff40cdae587854553c4bb", "ref_doc_id": "dcba1f47-98fb-4b83-8b82-bb0a4e4c9db8"}, "926f43b0-e58c-4e8f-b766-c90f2447361e": {"doc_hash": "42369be9595563bb5dd96a4dde08a6e205223c2c6f3edb12c542a50a943a6e42", "ref_doc_id": "f9edbae8-ed8d-4147-803e-936500d2d60f"}, "1a82bbed-7218-4c23-8cb5-402971703a30": {"doc_hash": "fc2e237c06d17ca2f6b8044e119b7a0ef6543d27c24ae314a1c3616d6ea6467e", "ref_doc_id": "f9edbae8-ed8d-4147-803e-936500d2d60f"}, "a93a9152-5892-4b84-87c4-d7d7e57b5d64": {"doc_hash": "f10e994c1d91ae4c8bc4ea64aa8b6ba381a3fe5278f5a865b368cd6658d1b01a", "ref_doc_id": "f9edbae8-ed8d-4147-803e-936500d2d60f"}, "abbf6d32-14ba-4c60-850d-cf4429b349e4": {"doc_hash": "41b6d9a30bcbf21c8b5930fbeae0cc97e5eaf1a435c890cf50c611130ed5513a", "ref_doc_id": "cc193b63-6a56-496d-92cc-812a0a7cf204"}, "f1f06b3a-489c-4861-b5be-72cd1c1d8e80": {"doc_hash": "7171358e6db4a7d03cbed4a1dde37cc4c3afdd7ed30c429f837a996a188fa018", "ref_doc_id": "cc193b63-6a56-496d-92cc-812a0a7cf204"}, "544c5e06-611b-4fe1-93ae-aa3441eb385e": {"doc_hash": "37cda6455ab0b252d0c7797c47cfa0c5fc72c8abbf7747e8f89c2d40fe6d69ab", "ref_doc_id": "cc193b63-6a56-496d-92cc-812a0a7cf204"}, "b635c2da-b277-4b49-936d-8ac5939da468": {"doc_hash": "984d97b14cd67774554e50dda8a526787755c1ed32bce97782170293021e249b", "ref_doc_id": "5a8026ec-62da-4652-ab01-3c38b64fda7c"}, "2f892821-777d-4fb6-ac8f-611ef0566d7d": {"doc_hash": "b1b75cf9b57f383786b17095af33f3371cec5a2680d9146d9a5a6747db75b6c9", "ref_doc_id": "5a8026ec-62da-4652-ab01-3c38b64fda7c"}, "6c2b1028-b0b0-4f92-af51-9660eb0f49f5": {"doc_hash": "e2dc9f57eb6a14717530ff0b2f4fe5ab98181c686c49d22a8de64969fd38e0b3", "ref_doc_id": "5a8026ec-62da-4652-ab01-3c38b64fda7c"}, "349ffcb2-b1dd-49ef-b7f7-37406a635c71": {"doc_hash": "17269988ca15a411246f71360ae77c4492f64e2d353d65412d6591e307bda7ee", "ref_doc_id": "bc534cc6-e535-4b99-b43b-605d5b37174d"}, "43c7eae6-7d68-47bc-8fd4-8f0584405231": {"doc_hash": "9e3b7dd2238d9acd968b8a769d0581f893d0699ebdb7bc690a8cf8e45cd3d9c4", "ref_doc_id": "bc534cc6-e535-4b99-b43b-605d5b37174d"}, "15b81cf1-d657-41b4-a9ff-703f8e5e6fac": {"doc_hash": "aaac0953e78e1481ec674f4706a04223fe34ac76e4aff2cd9a9f51f92ad56b09", "ref_doc_id": "bc534cc6-e535-4b99-b43b-605d5b37174d"}, "0af4d994-64d1-4bf4-8624-55ed2e4f7dbd": {"doc_hash": "5b203c46fac99619997fbe420c7403431c3da2285853ad72f69b2509fc694e47", "ref_doc_id": "7ab3456d-9fa7-49a8-856f-b2a713991fd8"}, "b6154034-ddde-4fd8-a018-126e613aa014": {"doc_hash": "605d062d81f57157a49ee24b715fbe954d1ac4da24bf69e5746ce728e1edcf3b", "ref_doc_id": 
"7ab3456d-9fa7-49a8-856f-b2a713991fd8"}, "61d2f504-bc66-41b5-b6b3-662d999e9f60": {"doc_hash": "1c716450597b1e4b1afe4cc8b954902e8f0e54f4f1b4b9ed4783671708cba627", "ref_doc_id": "7ab3456d-9fa7-49a8-856f-b2a713991fd8"}, "049233ae-7f17-4b97-b212-478744e93165": {"doc_hash": "1816ded9e583043856ebde339cf39eff11a9ecbdc0f6b580ae47bd0efb3766af", "ref_doc_id": "13d1b84a-4210-4825-933b-a6d498a30c00"}, "21c9c67d-01b5-49c7-b0c5-7c5856abacbe": {"doc_hash": "877b4d729adb665b0fdce9ed776eaa6b3ebd1bb13eeaba93e912c0f85fc9c1c5", "ref_doc_id": "13d1b84a-4210-4825-933b-a6d498a30c00"}, "2b41dc46-a706-4035-8707-8bdac62c2cfc": {"doc_hash": "4854d9c2319df1e9cd22d54da936e65c2a02eee859483e4bf9e628a3d1b6db23", "ref_doc_id": "13d1b84a-4210-4825-933b-a6d498a30c00"}, "641ed924-8e3f-4b7c-ae65-66e3dc4da5d5": {"doc_hash": "edee340ad5b5b677edb913fcea52cb0f2a57f73bae804a9a5b35312f9e7408f5", "ref_doc_id": "6ea55280-c096-4e63-a744-d6fbd76d6e91"}, "60786148-3cfd-4e95-ab5d-256991f19a68": {"doc_hash": "2b552c060d99c83f74c0324521e3574349a02cadcdf8c75a2d697b610b9a929f", "ref_doc_id": "6ea55280-c096-4e63-a744-d6fbd76d6e91"}, "9f8093af-21a6-443b-a6ac-864fb66387f8": {"doc_hash": "aaa375d627f38ef98fc757becb63f3dffc72afaad2793af7d20d9d196df6d314", "ref_doc_id": "6ea55280-c096-4e63-a744-d6fbd76d6e91"}, "e1c258d9-0291-4310-9fbd-a17f908a5826": {"doc_hash": "439d13a98b6c08467cf9614fccd626de0db3605cb4fd6688c180248ab85390e7", "ref_doc_id": "5d240373-8164-4fbe-9ff4-02a17074549b"}, "6808912a-ceb2-47ba-9281-2f1c06afe3d9": {"doc_hash": "226fd2b43894cd7d2595c0ca96cab1d872db2c703dd2ad5db2cc65ac79e93989", "ref_doc_id": "5d240373-8164-4fbe-9ff4-02a17074549b"}, "d1afa468-be5c-4597-8f48-c90574dff711": {"doc_hash": "dac55530bb0e1e04298ea3174a6d8afeaf75774267b062d665bf2cd5d72b6fd1", "ref_doc_id": "5d240373-8164-4fbe-9ff4-02a17074549b"}, "d854ece0-8e05-4e06-ba7d-442eb5a771eb": {"doc_hash": "b3aa4da2f8a1cbc7f15222336bb2c50764469bdc5c7b461c11c4da23ea5bd7b2", "ref_doc_id": "5d240373-8164-4fbe-9ff4-02a17074549b"}, "f48e1778-26db-49b0-89e7-e04c961609cc": {"doc_hash": "39cb5ba8f8c815af3d966b4807c50db18346ef77ac003de3e592e7a97b6f3648", "ref_doc_id": "5d240373-8164-4fbe-9ff4-02a17074549b"}, "f1a6746d-1e02-49e9-be62-360454d78ce3": {"doc_hash": "5ffb5da693a4515d8a99aa3c5378fe9fcceb8af750407a6fe2f027b6bb10d3cf", "ref_doc_id": "5d240373-8164-4fbe-9ff4-02a17074549b"}, "388d7ccb-8037-4295-9e61-5e7bc66581e2": {"doc_hash": "4a2ff896ea82da46620dba0e1daf8725c78ea8ecb65c1904e66295ef2bd8b681", "ref_doc_id": "74359064-bf24-40e0-9818-ab1d000bff3e"}, "8e29386d-e646-4e6b-8096-3545b568bd44": {"doc_hash": "2c1af3ed8808aa9d2ef1739e0ec9a336ed0872382c0b1734d3f39e5a792bdf3d", "ref_doc_id": "74359064-bf24-40e0-9818-ab1d000bff3e"}, "48a545c7-3964-4115-88cc-e2df29b360a0": {"doc_hash": "32a352e883afde83740202d56a5eee63aa32aa05471451b5aecc0494b951ae4a", "ref_doc_id": "74359064-bf24-40e0-9818-ab1d000bff3e"}, "4ae1f21b-eae6-41ac-a45b-c08dcc7346e5": {"doc_hash": "9d1fddbe4bf045683de124d183fd5d71823e0de08303edccda49b94f7a06bd6a", "ref_doc_id": "74359064-bf24-40e0-9818-ab1d000bff3e"}, "f7ae2b3e-8e5f-4317-9483-6029a02f4a66": {"doc_hash": "b278f032dbb5c904b0a6d0e7d0e86eae73b5c83c0bd47274ac3e8fa844d2439a", "ref_doc_id": "74359064-bf24-40e0-9818-ab1d000bff3e"}, "cd5b0464-f7f4-465c-894e-93dd7a8f1e77": {"doc_hash": "abfa47e008a706030ded92c08bb68bf666b50e88ed3bf341d502468d16b5e686", "ref_doc_id": "74359064-bf24-40e0-9818-ab1d000bff3e"}, "5042ecfe-b092-4370-a34d-f747863465a0": {"doc_hash": "ac66a64796c41de33dcab9aa3e041ccd7824d97d64672665be8e42e4253b6f8e", "ref_doc_id": 
"166109f8-fb93-4127-abe5-964ed07035c2"}, "3efbee33-f0fd-4f16-bf5c-3c7a897c1562": {"doc_hash": "7289a56bcc3a329f219dbe8f3c17f35d43e9ecab375b0d711f807efe993d5ee9", "ref_doc_id": "166109f8-fb93-4127-abe5-964ed07035c2"}, "87a4fe6d-ce60-4893-a3ab-faea7aa65407": {"doc_hash": "d50d92f080832f7adf7adfca276409259aaeb76000588441dc7dfa1a5e5a4957", "ref_doc_id": "166109f8-fb93-4127-abe5-964ed07035c2"}, "4ee6eb46-8fd4-43f4-b321-1711067a516f": {"doc_hash": "a2fa6a035ac9bdb9b7e8a7c6985270597942f33e99dd48074b349c6739f74a63", "ref_doc_id": "166109f8-fb93-4127-abe5-964ed07035c2"}, "305ade7f-2710-4529-8972-2d49133a67ed": {"doc_hash": "b07e8aeacea96fb9c4d2b4780c8fbfab3315c10acdbc5ebf1b197df3b56f1907", "ref_doc_id": "77ff5d8d-0dda-479c-9295-ad7f30935d02"}, "8b857906-4aa7-4a72-9b9b-fe4e7472a16b": {"doc_hash": "ffddeddb1979247b208f81899c271a3d56e3b04902804318e7d344dcc44b1280", "ref_doc_id": "77ff5d8d-0dda-479c-9295-ad7f30935d02"}, "e4240080-f6c3-485f-b3b8-bc17acedd026": {"doc_hash": "e032f9d9a50a15d1bd37544bf800a56bc1512e093defa96cdd8b10202dcca54e", "ref_doc_id": "77ff5d8d-0dda-479c-9295-ad7f30935d02"}, "7432a915-d9a3-49ac-bda3-f72850d06063": {"doc_hash": "054e75f13f4902325c847e83695ac16adca2a5333266a2405b7c56020376292b", "ref_doc_id": "aaadb951-b38a-4e3a-accf-f6bfdb2f05a6"}, "f6bb6a46-a49b-41ef-b78b-84c7502bebb9": {"doc_hash": "e3e97367c6c1af85c11cddda6d32ea6720ce5d064c10807902143af990bd19f1", "ref_doc_id": "aaadb951-b38a-4e3a-accf-f6bfdb2f05a6"}, "7200837e-7b73-4376-b06f-d9b4c910cd3e": {"doc_hash": "8073a590f45482b1f25286526e36f00c8dda79028b691774a20a673ccf997ad6", "ref_doc_id": "aaadb951-b38a-4e3a-accf-f6bfdb2f05a6"}, "bc03e397-6cf3-4852-a898-40b4c4063325": {"doc_hash": "e1ade5db5b47d10cff19421ad494efc0d9bda59731a1f1a6ec892c986aaf732b", "ref_doc_id": "de34d689-b92f-4d51-9747-504f3d6499a7"}, "69f5ade7-59a5-4766-95c0-52706c102651": {"doc_hash": "674935bb7c1ee9b4bd0e5e0bc5b5457267cafbbab5f75be6d12804d47edcc74c", "ref_doc_id": "de34d689-b92f-4d51-9747-504f3d6499a7"}, "3ab56ff6-1afe-4901-b46a-0160b56c7a48": {"doc_hash": "e9e54ed853ee5bbdbf469f59fa078b65885530ae445ff05a7b5b16287628e0f6", "ref_doc_id": "1d130fec-17ad-4a6a-a2dd-1409c65e4a7d"}, "c6a86942-07f0-4b10-a27c-f02d201c542f": {"doc_hash": "3304a0ebf9902880e1b4a1890134196408a77c6ef7a614889fba777106713ac9", "ref_doc_id": "1d130fec-17ad-4a6a-a2dd-1409c65e4a7d"}, "d4f64e83-6625-43c1-8da5-3551fee253a5": {"doc_hash": "257f9cea62315ec330ff4cbe534f75feaff0bd7ba072c73342b45ee9c61a5eea", "ref_doc_id": "1d130fec-17ad-4a6a-a2dd-1409c65e4a7d"}}, "docstore/ref_doc_info": {"f94e48ce-e161-46d1-800f-430e38b4962c": {"node_ids": ["8d55e99e-029a-47c6-8fae-5bff1ee7672a", "1d81b4fd-e928-4180-8ef6-a41371ac1fed"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "dcba1f47-98fb-4b83-8b82-bb0a4e4c9db8": {"node_ids": ["a58908ab-8178-40cd-b2a3-efcaf65228f2", "ab890d58-e9ed-49ae-9c78-282a91366161", "9bbcaaca-e282-4f65-a5c2-9a5708a228eb"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "f9edbae8-ed8d-4147-803e-936500d2d60f": {"node_ids": ["926f43b0-e58c-4e8f-b766-c90f2447361e", "1a82bbed-7218-4c23-8cb5-402971703a30", "a93a9152-5892-4b84-87c4-d7d7e57b5d64"], "metadata": {"file_path": 
"C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "cc193b63-6a56-496d-92cc-812a0a7cf204": {"node_ids": ["abbf6d32-14ba-4c60-850d-cf4429b349e4", "f1f06b3a-489c-4861-b5be-72cd1c1d8e80", "544c5e06-611b-4fe1-93ae-aa3441eb385e"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "5a8026ec-62da-4652-ab01-3c38b64fda7c": {"node_ids": ["b635c2da-b277-4b49-936d-8ac5939da468", "2f892821-777d-4fb6-ac8f-611ef0566d7d", "6c2b1028-b0b0-4f92-af51-9660eb0f49f5"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "bc534cc6-e535-4b99-b43b-605d5b37174d": {"node_ids": ["349ffcb2-b1dd-49ef-b7f7-37406a635c71", "43c7eae6-7d68-47bc-8fd4-8f0584405231", "15b81cf1-d657-41b4-a9ff-703f8e5e6fac"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "7ab3456d-9fa7-49a8-856f-b2a713991fd8": {"node_ids": ["0af4d994-64d1-4bf4-8624-55ed2e4f7dbd", "b6154034-ddde-4fd8-a018-126e613aa014", "61d2f504-bc66-41b5-b6b3-662d999e9f60"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "13d1b84a-4210-4825-933b-a6d498a30c00": {"node_ids": ["049233ae-7f17-4b97-b212-478744e93165", "21c9c67d-01b5-49c7-b0c5-7c5856abacbe", "2b41dc46-a706-4035-8707-8bdac62c2cfc"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "6ea55280-c096-4e63-a744-d6fbd76d6e91": {"node_ids": ["641ed924-8e3f-4b7c-ae65-66e3dc4da5d5", "60786148-3cfd-4e95-ab5d-256991f19a68", "9f8093af-21a6-443b-a6ac-864fb66387f8"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "5d240373-8164-4fbe-9ff4-02a17074549b": {"node_ids": ["e1c258d9-0291-4310-9fbd-a17f908a5826", "6808912a-ceb2-47ba-9281-2f1c06afe3d9", "d1afa468-be5c-4597-8f48-c90574dff711", "d854ece0-8e05-4e06-ba7d-442eb5a771eb", "f48e1778-26db-49b0-89e7-e04c961609cc", "f1a6746d-1e02-49e9-be62-360454d78ce3"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "74359064-bf24-40e0-9818-ab1d000bff3e": {"node_ids": ["388d7ccb-8037-4295-9e61-5e7bc66581e2", "8e29386d-e646-4e6b-8096-3545b568bd44", "48a545c7-3964-4115-88cc-e2df29b360a0", "4ae1f21b-eae6-41ac-a45b-c08dcc7346e5", "f7ae2b3e-8e5f-4317-9483-6029a02f4a66", "cd5b0464-f7f4-465c-894e-93dd7a8f1e77"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", 
"file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "166109f8-fb93-4127-abe5-964ed07035c2": {"node_ids": ["5042ecfe-b092-4370-a34d-f747863465a0", "3efbee33-f0fd-4f16-bf5c-3c7a897c1562", "87a4fe6d-ce60-4893-a3ab-faea7aa65407", "4ee6eb46-8fd4-43f4-b321-1711067a516f"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "77ff5d8d-0dda-479c-9295-ad7f30935d02": {"node_ids": ["305ade7f-2710-4529-8972-2d49133a67ed", "8b857906-4aa7-4a72-9b9b-fe4e7472a16b", "e4240080-f6c3-485f-b3b8-bc17acedd026"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "aaadb951-b38a-4e3a-accf-f6bfdb2f05a6": {"node_ids": ["7432a915-d9a3-49ac-bda3-f72850d06063", "f6bb6a46-a49b-41ef-b78b-84c7502bebb9", "7200837e-7b73-4376-b06f-d9b4c910cd3e"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "de34d689-b92f-4d51-9747-504f3d6499a7": {"node_ids": ["bc03e397-6cf3-4852-a898-40b4c4063325", "69f5ade7-59a5-4766-95c0-52706c102651"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}, "1d130fec-17ad-4a6a-a2dd-1409c65e4a7d": {"node_ids": ["3ab56ff6-1afe-4901-b46a-0160b56c7a48", "c6a86942-07f0-4b10-a27c-f02d201c542f", "d4f64e83-6625-43c1-8da5-3551fee253a5"], "metadata": {"file_path": "C:\\MAIN\\it\\projects\\vs\\ds_rag\\data\\paper.pdf", "file_name": "paper.pdf", "file_type": "application/pdf", "file_size": 775166, "creation_date": "2024-12-05", "last_modified_date": "2024-12-08"}}}} \ No newline at end of file