SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1
This is a sentence-transformers model finetuned from sentence-transformers/multi-qa-mpnet-base-dot-v1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/multi-qa-mpnet-base-dot-v1
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Dot Product
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
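The pooling configuration above takes the CLS token as the sentence embedding. As a rough illustration only (the recommended loading path is shown in the Usage section below), this sketch reproduces that pooling with plain transformers; the checkpoint name is the one from the usage example.

# Sketch: CLS-token pooling with plain transformers, mirroring the
# Transformer + Pooling modules above. Prefer SentenceTransformer for real use.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "BenElliot27/multi-qa-mpnet-base-dot-v1-ATLAS-TALK"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

batch = tokenizer(
    ["How do I download a single file with rucio?"],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    out = model(**batch)

# pooling_mode_cls_token=True: the embedding is the first token's hidden state
embedding = out.last_hidden_state[:, 0]
print(embedding.shape)  # torch.Size([1, 768])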
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("BenElliot27/multi-qa-mpnet-base-dot-v1-ATLAS-TALK")
# Run inference
sentences = [
'Failure to read evgen file in Rivet grid job? What I see is that you are not producing ttz_analysis.yoda, and the code is producing a user.narayan.6831283.EXT0._000026.ttz_analysis.yoda with zero file size.\nCheers,\nAlden',
'Hi all,\nI\'m having trouble running Rivet in Athena with private evgen as input, see:\nhttp://bigpanda.cern.ch/task/4532210/\nThe error given is "pilot: Encountered zero file size for file user.mcfayden.4532210.EXT0._000003.187522.Zjets.yoda", but actually the problem seems to be due to the evgen file not being opened correctly which means Rivet had no events to run over:\nEventSelector INFO EventSelection with query\nDbSession Info Open DbSession\nDomain[ROOT_All] Info > Access DbDomain READ [ROOT_All]\nDomain[ROOT_All] Info -> Access DbDatabase READ [ROOT_All] 9651733E-360B-424D-B1BC-51B25F68B05D\nDomain[ROOT_All] Info user.mcfayden.4532110.EXT2._000001.mc12_7TeV.187522.EVNT.root\nRootDBase.open Success user.mcfayden.4532110.EXT2._000001.mc12_7TeV.187522.EVNT.root File version:53005\nImplicitCollection Info Opened the implicit collection with connection string "PFN:user.mcfayden.4532110.EXT2._000001.mc12_7TeV.187522.EVNT.root"\nImplicitCollection Info and a name "POOLContainer(DataHeader)"\nAthenaSummarySvc INFO -> file incident: FID:9651733E-360B-424D-B1BC-51B25F68B05D [GUID: FID:9651733E-360B-424D-B1BC-51B25F68B05D]\nPoolSvc INFO Failed to find container MetaDataHdrDataHeader to get Token.\nEventPersistenc... INFO Added successfully Conversion service:AthenaPoolCnvSvc\nAthenaPoolConve... ERROR Failed to convert persistent object to transient: FID "74385E9E-38B2-7F4F-A610-B42059934C68" is not existing in the catalog ( POOL : "PersistencySvc::UserDatabase::connectForRead" from "PersistencySvc" )\nAthenaPoolConve... ERROR createObj PoolToDataObject() failed, Token = [DB=9651733E-360B-424D-B1BC-51B25F68B05D][CNT=MetaDataHdr(DataHeader)][CLID=D82968A1-CF91-4320-B2DD-E0F739CBC7E6][TECH=00000202][OID=000000000000000C-0000000000000000]\nDataProxy WARNING accessData: conversion failed for data object 222376821/;00;MetaDataSvc\n Returning NULL DataObject pointer\nMetaDataSvc ERROR Could not get DataHeader, will not read Metadata\nFull log here:\nhttp://aipanda057.cern.ch/media/filebrowser/be0426bd-2ec0-49bc-9462-eb822ca3c9f3/tarball_PandaJob_2330211990_ANALY_NIKHEF-ELPROD_SHORT/athena_stdout.txt\nRunning the same job but using officially produced evgen (from a much older release) as input works just fine, see:\nhttp://bigpanda.cern.ch/task/4532297/\nIt even works with privately produced evgen from a few months ago, see:\nhttp://bigpanda.cern.ch/task/4511752/\nAlso, if I download the input file and run on it locally it runs with no problems.\nAny ideas what the problem might be here?\nCheers,\nJosh.\n\nHi\nThank you for looking into it. But I figured out the problem. I was sending an so file which was compiled with a different athena release than the grid version \nNow that I have figured out the problem, it works fine\nCheers\nRohin\n\nHi Josh.\nYour read on the situation matches mine. The input file is in place and of the right size throughout the operation.\nItems to troubleshoot from here out include: possible Athena version compatibility or ROOT version mismatch, or a subtle site error. It’s failed on a retry, so that’s not good.\nCould you download the exact file, if you haven’t already, and run it locally. Send me the output and the results of the ls and env commands?\nThanks,\nAlden\n\nDear experts,\nplease excuse me referring back to this old thread. I’m struggling with the same problem, running my custom rivet code on evgen files on the grid (http://bigpanda.cern.ch/task/8787362/). 
Locally the code runs just fine on these files, not on the grid though.\nThe error occurs while accessing the evgen files and the job finishes with:\nPilot error 1191: Encountered zero file size for file user.tkupfer.8787362.EXT0._000003.WYWb900LH05.yoda\nHere is a part of the athena stdout:\nRootCollection Info Opening Collection File dcap://dcache-atlas-dcap.desy.de:22125//pnfs/desy.de/atlas/dq2/atlaslocalgroupdisk/rucio/user/fschenck/06/5e/mc15_13TeV.WYWb900LH05.10000.1.evgen.root in mode: READ\nRootCollection Info File dcap://dcache-atlas-dcap.desy.de:22125//pnfs/desy.de/atlas/dq2/atlaslocalgroupdisk/rucio/user/fschenck/06/5e/mc15_13TeV.WYWb900LH05.10000.1.evgen.root opened\nDbSession Info Open DbSession \nDomain[ROOT_All] Info > Access DbDomain READ [ROOT_All] \nDomain[ROOT_All] Info -> Access DbDatabase READ [ROOT_All] 4B75BCC9-FAA4-4E2F-AC15-A2B26FF20048\nDomain[ROOT_All] Info dcap://dcache-atlas-dcap.desy.de:22125//pnfs/desy.de/atlas/dq2/atlaslocalgroupdisk/rucio/user/fschenck/06/5e/mc15_13TeV.WYWb900LH05.10000.1.evgen.root\nRootDatabase.open Success dcap://dcache-atlas-dcap.desy.de:22125//pnfs/desy.de/atlas/dq2/atlaslocalgroupdisk/rucio/user/fschenck/06/5e/mc15_13TeV.WYWb900LH05.10000.1.evgen.root File version:53413\nImplicitCollection Info Opened the implicit collection with connection string "PFN:dcap://dcache-atlas-dcap.desy.de:22125//pnfs/desy.de/atlas/dq2/atlaslocalgroupdisk/rucio/user/fschenck/06/5e/mc15_13TeV.WYWb900LH05.10000.1.evgen.root"\nImplicitCollection Info and a name "POOLContainer(DataHeader)"\nAthenaSummarySvc INFO -> file incident: FID:4B75BCC9-FAA4-4E2F-AC15-A2B26FF20048 [GUID: FID:4B75BCC9-FAA4-4E2F-AC15-A2B26FF20048]\nPoolSvc INFO Failed to find container MetaDataHdrDataHeader to get Token.\nEventPersistenc... INFO Added successfully Conversion service:AthenaPoolCnvSvc\nAthenaPoolConve... ERROR Failed to convert persistent object to transient: FID "613EA41B-C384-2247-96A7-82EEABEA23B1" is not existing in the catalog ( POOL : "PersistencySvc::UserDatabase::connectForRead" from "PersistencySvc" )\nAthenaPoolConve... 
ERROR createObj PoolToDataObject() failed, Token = [DB=4B75BCC9-FAA4-4E2F-AC15-A2B26FF20048][CNT=MetaDataHdr(DataHeader)][CLID=D82968A1-CF91-4320-B2DD-E0F739CBC7E6][TECH=00000202][OID=000000000000000B-0000000000000000]\nDataProxy WARNING accessData: conversion failed for data object 222376821/;00;MetaDataSvc\n Returning NULL DataObject pointer \nMetaDataSvc ERROR Could not get DataHeader, will not read Metadata\nMetaDataSvc WARNING Unable to load MetaData Proxies\n\nI\'ve tried to figure out whether different athena releases are used to compile the .so files and to run the code on the grid, since\nthis seems to have solved the problem before.\nI\'ve already tried many combinations of commands to specify the AthenaTag and to set up the local athena version on lxplus, but without any success..\n\nBeing very precise on the version in the end:\nasetup 20.1.8.3,AtlasProduction,64,here (locally)\n--athenaTag=20.1.8.3,AtlasProduction,64 (grid)\n\nwasn\'t successful neither and I\'ve still suspicious about the proper athena setup because it says:\ntransUses : Atlas-20.1.8 \ntranshome : AnalysisTransforms-AtlasProduction_20.1.8.3\n\nI\'m not very used to the grid and most likely I\'m doing something stupid.\nSo, please let me know if there is any trick to set up athena on the grid properly, or if this problem has been solved any other way.\n\nThanks in advance!\n\nBest,\nTobias\n\nHi Alden,\nRunning on the same file locally works without any problem\n(File: user.mcfayden.evnt.test.2014-12-08_124829.187522.test_EXT2/user.mcfayden.4532110.EXT2._000001.mc12_7TeV.187522.EVNT.root)\nThe full log and output of ls and env are attached.\nCheers,\nJosh.\nlog.txt (77 KB)\nls.txt (708 Bytes)\nenv.txt (40.1 KB)\n\nRight. Looks like it runs well – so I am at a loss.\nI’ll put some more time into this tomorrow. Sorry.\nCheers,\nAlden\n\nHi Alden,\nI think I might have found the issue.\nI’m just waiting for some jobs to finish to confirm this, so maybe wait before putting too much time into this. \nCheers,\nJosh.\n\nHi again,\nYep, it looks like the problem is due to the fact that I had this in my pathena command:\n–extOutFile=“*mc12_7TeV.187522.EVNT.root”\nI think that this was required in the pre-JEDI days when running two transforms in one job to retrieve the intermediate files.\nAnd it essentially meant that I had the same output file in two output containers, *_EXT1 and *_EXT2.\nMore details:\nFailed task: http://bigpanda.cern.ch/task/4548422/ (with input from: http://bigpanda.cern.ch/task/4547050/)\nSucceeded task: http://bigpanda.cern.ch/task/4548421/ (with input from: http://bigpanda.cern.ch/task/4546848/)\nI have no idea why this causes the file not to be read properly as input for other tasks… but at least I have a fix!\nCheers,\nJosh.\n\nThanks, Josh – that looks like a good fix.\nCheers,\nAlden',
'Hi UK loud support,\nwould you please check what is the issue in accessing these files\nin (*).\nI have checked this one and the error is here:\nTrying SURL srm://srm-atlas.gridpp.rl.ac.uk:8443/srm/managerv2?SFN=/castor/ads.rl.ac.uk/prod/atlas/StripDeg/atlasgroupdisk/phys-beauty/rucio/data11_7TeV/a8/23/DAOD_ONIAMUMU.594591._000001.pool.root.1 ...\n[SE][Ls][SRM_INVALID_PATH] No such file or directory\n Thanks.\n Cheers,\n Farida\n(*)\ndata11_7TeV:DAOD_ONIAMUMU.594591._000001.pool.root.1\ndata12_8TeV:DAOD_JPSIMUMU.01237672._000076.pool.root.1\ndata12_8TeV:DAOD_JPSIMUMU.01237615._000026.pool.root.1\n\nDear Farida, dear UK cloud support,\nsorry for disturbing you again,\nis there some progress for recovering those three DAOD files?\ndata11_7TeV:DAOD_ONIAMUMU.594591._000001.pool.root.1\ndata12_8TeV:DAOD_JPSIMUMU.01237672._000076.pool.root.1\ndata12_8TeV:DAOD_JPSIMUMU.01237615._000026.pool.root.1\nBest regards,\nVladimir.\n\nHi UK cloud support,\nUser is still waiting for your feedback to fix the issue related to the below files. I have just tried and seems the issue persist (*)\nThanks for looking it this.!\n Cheers,\n Farida\n(*)\nrucio download --protocol srm data11_7TeV:DAOD_ONIAMUMU.594591._000001.pool.root.1\n2016-04-14 22:57:48,933 INFO [Starting download for data11_7TeV:DAOD_ONIAMUMU.594591._000001.pool.root.1 with 1 files]\n2016-04-14 22:57:49,014 INFO [Starting the download of data11_7TeV:DAOD_ONIAMUMU.594591._000001.pool.root.1]\n2016-04-14 22:57:50,884 WARNING [Source file not found.\nDetails: Source file not found.\nDetails: Could not open source: error on the turl request : [SE][PrepareToGet][SRM_INVALID_PATH] No such file or directory]\n2016-04-14 22:57:51,082 WARNING [Source file not found.\nDetails: Source file not found.\nDetails: Could not open source: error on the turl request : [SE][PrepareToGet][SRM_INVALID_PATH] No such file or directory]\n2016-04-14 22:57:51,345 WARNING [Source file not found.\nDetails: Source file not found.\nDetails: Could not open source: error on the turl request : [SE][PrepareToGet][SRM_INVALID_PATH] No such file or directory]\n2016-04-14 22:57:51,579 WARNING [Source file not found.\nDetails: Source file not found.\n\nHi Vladimir\nThe first file is meant to be at RAL (data11_7TeV:DAOD_ONIAMUMU.594591._000001.pool.root.1). I have checked and it does not exist. As this was the only replica of that data it is unfortunately lost.\nThe other two files are meant to be at Lancaster, I will ask the site admin to check but I suspect they are likely to be lost too.\nSorry about this. I’ll start a separate thread in the B-Physics mail list about what we can do to recover them.\nAlastair',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Training Details
Training Dataset
Unnamed Dataset
- Size: 11,044 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|---|---|---|
| type | string | string |
| details | min: 14 tokens, mean: 297.34 tokens, max: 512 tokens | min: 34 tokens, mean: 458.93 tokens, max: 512 tokens |
- Samples:
anchor | positive
We need a PATHelp thread here @jburr mentioned that we could create a PATHelp thread here. I don’t think I have permissions, @lheinric what do you think? I guess it’s hard without Attila and friends on board.
I don’t think they would be opposed to the idea, though whether they have the time to provide significant support is another question.
Either way - this would need to be well-advertised as an alternative (or even replacement!?) for PAT help.
Significant support just means watching a category and replying to it though, right? If we could get a few people behind it it might build some momentum.
I’m also curious as to whether it might get some help from above: I believe that Microsoft is increasing our licensing fees for things like sharepoint so in some kind of ideal world we’d move away from it entirely. Of course getting rid of it entirely will take many years and I don’t know if there’s any financial advantage to a partial migration…
By an astounding coincidence take a look at the last slide in today’s ASG intro…
https://indico.cern.ch/event/801152/contributions/3329455/attachments/1800673/2936936/ASGIntro22022019.pdf
Awesome, well as a starting point I guess we need @akras to complete to something.
A question raised from the ASG meeting: is it possible to tag a mailing list or something similar?
I’m sure you can make a bot that does it, but that goes back to the same issue where we have to implement that ourselves.
So… February 22 was quite a while ago. Just saying Any ideas to improve usage?
I didn’t attend the ASG meeting when people discussed this, but I think I might just start posting my stupid questions here.
But as a starting point encourage people to set categories they are interested in to “watching first post” which will notify them when there are new threads.
I’m watching ASG and Machine Learning now. There’s nothing to watch, but I’m watching it like a boss.
Did this ever go anywhere? I’m from DAST, and it would also make sense to move DAST here.
But yeah, there’s a lot of momentum behind the mailing list, so being able to forward mails from the mailing list here, and send responses to the user’s mail would be a good way to get the migration going.
It seems like it could be done: https://meta.discourse.org/t/configuring-reply-via-email-e-mail/42026
Personally I think funneling DAST through this thread would be great. The only downsides I see are that:
From that link it looks like we can only use one email account. Their suggested workaround was to forward everything to one account and then filter into categories on the discourse side.
Someone would have to set it up. @fschenck are you volunteering?
The overall result will probably be better if we can get everyone using the forum directly rather than just using it to log emails. That said, I think anything which gets people onto a forum is still better than what we do now.
@lheinric what do you think?
Ah, I didn’t know it’s one email account for the entire discourse, but a filter would be simple.
Well, I could give it a go.
I also think it makes sense to move DAST here as we end up answering the same questions a lot, as searching the mailing list archives is a PITA even if you know what you’re doing.
But it would be important to keep both running for a while as people know the mailing list.
@fschenck this is me replying to see if your mailing list integration worked.Alternatives to twiki This doesn’t really seem like a replacement for twiki, but that being said it would be really awesome if we did have a replacement for twiki (i.e. I’d use it for whatever groups I lead). I’ve been looking around and found fosswiki and XWiki both of which look like better maintained alternatives. @lheinric, do you know if CERN IT supports anything like this (or who I should ask about what they support)?
While I’ll never be the first person to defend twikis, unless there is some way to wholesale copy across the entire existing twiki I don’t think any replacement will be viable. There is just far too much documented on twikis and we run this risk…
Right, I knew which one that was before I clicked on it.
But to be honest I don’t see the risk: there’s a well established interface between the two (the URL), and if there’s something out there that performs better that outweighs the inconvenience in my mind. I’m admittedly not an expert on twiki, but I’m not convinced that there are any features there that lock us in.
I’m nothing if not predictable
My worry would be that it leaves there being yet one more place to look for documentation, so documentation can get lost in more places. It also leaves people learning yet another system (which, let’s face it, physicists are rather loath to do).
From that wikipedia page it does sound like Foswiki is broadly compatible so it might be possible to transfer things across. Still, transferring how your whole documentation is structured is hardly going to be a small project. Links from slides, code, etc will become dead (unless we can set up some sort of automatic forwarding).
Again, not saying that I think it’s doomed to fail but I doubt it would be a simple switch.
I think the gitbook one is nice for some uses-cases of documentation that can be authored collaboratively. I somewhat like the idea of an say analysis-specific gitbook, but it’s definitely not a “wiki” (also turnaround time to publishing is a bit higher)
We’re toying with the idea of replacing the ML Forum twiki with CodiMD. There are still some quarks to work out (the indexing between pages is a bit sloppy) but so far it seems like a nice alternative.How do I
rucio get
one file? I have a dataset:
mc16_13TeV:mc16_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.merge.AOD.e6337_e5984_s3126_r10201_r10210_tid14774488_00
and my job is failing on the file
AOD.14774488._000007.pool.root.1
in that dataset. How do I download this file alone?
I tried
rucio get AOD.14774488._000007.pool.root.1
and
rucio get mc16_13TeV:mc16_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.merge.AOD.e6337_e5984_s3126_r10201_r10210_tid14774488_00/AOD.14774488._000007.pool.root.1
but both print something like
2019-07-17 08:53:22,099 INFO Processing 1 item(s) for input
2019-07-17 08:53:22,099 INFO Getting sources of DIDs
2019-07-17 08:53:22,244 INFO Using main thread to download 0 file(s)
2019-07-17 08:53:22,244 ERROR None of the requested files have been downloaded.Hi @dguest
I first checked that the file you are requesting exists in the dataset by doing,
rucio list-files mc16_13TeV:mc16_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.merge.AOD.e6337_e5984_s3126_r10201_r10210_tid14774488_00
and get the following output (truncated and ending at the sought-for file):
+---------------------------------------------+--------------------------------------+-------------+------------+----------+
- Loss: CachedMultipleNegativesRankingLoss with these parameters:
  { "scale": 1.0, "similarity_fct": "dot_score" }
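As a rough sketch of how an anchor/positive dataset like this pairs with the loss above, using the scale and similarity function listed; the example rows are illustrative, not the actual training data:

# Sketch: anchor/positive pairs with CachedMultipleNegativesRankingLoss,
# scale=1.0 and dot-product similarity as in the parameters above.
# The example rows below are made up for illustration.
from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CachedMultipleNegativesRankingLoss
from sentence_transformers.util import dot_score

model = SentenceTransformer("sentence-transformers/multi-qa-mpnet-base-dot-v1")

train_dataset = Dataset.from_dict({
    "anchor": ["How do I rucio get one file?"],
    "positive": ["You can download a single file from a dataset by giving its scope and name."],
})

loss = CachedMultipleNegativesRankingLoss(model, scale=1.0, similarity_fct=dot_score)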
Evaluation Dataset
Unnamed Dataset
- Size: 2,762 evaluation samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|---|---|---|
| type | string | string |
| details | min: 21 tokens, mean: 278.68 tokens, max: 512 tokens | min: 19 tokens, mean: 440.02 tokens, max: 512 tokens |
- Samples:
anchor | positive
Job submission failure Hello,
Same here. I submitted 2 jobs over 3 and the failed one has the error:
Failed to connect to host.
ERROR : Failed to get allowed site list
Thanks,
ClementHi DAST experts,
While submitting jobs, some of my jobs (not all ) wouldn’t get submitted and I am getting the following prun error messages:
Error1:
ERROR: Failed to get allowed site list
Failed to connect to host. / >> SSL connect error. The SSL handshaking failed.
Error2:
ERROR: failed to upload source files with 255
ERROR: Could not check Sandbox duplication with 35
My Setup:
localSetupDQ2Client --skipConfirm
localSetupPandaClient --noAthenaCheck
voms-proxy-init -voms atlas
Can you please point me the source of this problem? Thank you for your time in advance.
Cheers,
Hasib
Hi,
I have the same error.
Best, Haifeng
Hi Hasid,
It seems to me an issue with the first authentication which is failing now for some users too.
Some issue on the ATLAS VO (.. of unavailable CRL) can provide such error,
I will cc voms experts to see whether is a problem with the host server.
Users get these errors () after doing the panda_client setup.
Thanks!
Cheers,
Farida
()
Error1:
ERROR: Failed to get allowed site list
>> Failed to connect to host. / >> SSL connect error. The SSL handshaking failed.
Error2:
ERROR: failed to upload source files with 255
Hi,
I do not think this is a VO issue but a pandaserver timeout issue …
note that my rf. compliant pilots are failing …as well as interactive users who want to submit jobs,
eg:
31 Oct 15:43:06dq2-get: globus_xio Input/output error Dear all,
I’m trying since few days now to download some datasets which fail due to Input/output error.
For example, trying dq2-get user.cinca.169889_E01-00_S01-00_tag0_JES_EFFECTIVE_STATISTICAL1_DW.SelHadTop_mySimpleTree.root/:
stderr:Using grid catalog type: UNKNOWN
Using grid catalog : (null)
VO name: atlas
Checksum type: None
Trying SURL srm://svr018.gla.scotgrid.ac.uk:8446/srm/managerv2?SFN=/dpm/gla.scotgrid.ac.uk/home/atlas/atlasscratchdisk/rucio/user/cinca/e0/c9/user.cinca.4322300._000003.mySimpleTree.root …
Source SE type: SRMv2
Source SRM Request Token: 82e64a92-eef9-4994-bf9d-4779aa505e57
Source URL: srm://svr018.gla.scotgrid.ac.uk:8446/srm/managerv2?SFN=/dpm/gla.scotgrid.ac.uk/home/atlas/atlasscratchdisk/rucio/user/cinca/e0/c9/user.cinca.4322300._000003.mySimpleTree.root
File size: 125606257
Source URL for copy: gsiftp://disk046.gla.scotgrid.ac.uk/disk046.gla.scotgrid.ac.uk:/gridstore3/atlas/2014-10-28/user.cinca.4322300._000003.mySimpleTree.root.232695315.0
Destination URL: file:/afs/cern.ch/work/c/cinca/eos/atlas/user/c/cinca/E01-00_S01-00/lo_JES/user.cinca.169889_E01-00_S01-00_tag0_JES_EFFECTIVE_STATISTICAL1_DW.SelHadTop_mySimpleTree.root.8510447/user.cinca.4322300._000003.mySimpleTree.root
streams: 1
globus_xio: Unable to open file /afs/cern.ch/work/c/cinca/eos/atlas/user/c/cinca/E01-00_S01-00/lo_JES/user.cinca.169889_E01-00_S01-00_tag0_JES_EFFECTIVE_STATISTICAL1_DW.SelHadTop_mySimpleTree.root.8510447/user.cinca.4322300._000003.mySimpleTree.root
globus_xio: System error in open: Input/output error
globus_xio: A system call failed: Input/output error
Could you please tell me how I could solve this problem as at some points the samples will be erased from the grid ?
Thanks for your help,
DianeHi Diane,
Maybe your home directory partition is full. Try if you can create new file. Also try to download the file in /tmp directory?
Better way, if you need to keep the dataset for longer time, is to request a DaTRI transfer to your LOCALGROUPDISK.
Cheers,
Yun-Ha
Hi Yun-Ha,
thanks, I asked for transfer to our local group disk.
The download succeeds in tmp/ repository, it may be that my eos quota is exceeded, which I find strange.
But I’ll check.
Thanks for your help in fixing this !
DianeERROR Missing DCS field information: solenoid 0 toroid 0 Hi,
I have a big set of jobs running on the grid, and ~98% have finished
successfully. However, for the remaining jobs I keep getting the
following error:
MagFieldAthenaSvc ERROR Missing DCS
field information: solenoid 0 toroid 0
IOVSvcTool ERROR Problems
calling MagFieldAthenaSvc[0xd3c6b64]+31
Skipping all subsequent callbacks.
IncidentSvc ERROR Standard
std::exception is caught handling incident0xff9aad54
etc
at multiple sites and with multiple retries. Is this a known issue? If
so, what is the work-around?
Cheers,
CameronPlease provide a Panda Monitor link to one or a few of the jobs i question.
Mattias Ellert
ATLAS DAST
Essentially any of the failed jobs here:
http://panda.cern.ch/server/pandamon/query?job=*&ui=user&name=Cameron%20Cuthbert
E.g.
http://panda.cern.ch/server/pandamon/query?job=2303236765
http://panda.cern.ch/server/pandamon/query?job=2303230620
I have downloaded one of the files failing and can confirm it fails locally, too. I think the issue is with the infile DCS metadata.
The log.log file ends with:
Shortened traceback (most recent user call last):
File "./BPhysAnalysisMasterAuto.py", line 55, in <module>
print "Setting evtMax to ",EvtMax
NameError: name 'EvtMax' is not defined
Py:Athena INFO leaving with code 8: "an unknown exception occurred"
Athena tries the print the variable “EvtMax” that is not defined.
Mattias Ellert
ATLAS DAST
Yes, but log.log is an old log file I created with an earlier version of the code.
"BPhysAnalysisMasterAuto.py" no longer contains these lines. The actual error in this case is in athena_stdout.txt. E.g. :
http://aipanda048.cern.ch:25880/monitor/logs/517f5c02-c1be-4c1a-9500-37899a64cdb0/tarball_PandaJob_2303236765_ANALY_RAL_SL6/athena_stdout.txt
Cameron
[cuthbert@sydui1 totalSumOfWeights]$
Hi Cameron.
I forward your question to the database experts.
Mattias Ellert
ATLAS DAST
Hi Mattias,
Thanks. I am also getting a second class of error which may be DB related:
ToolSvc.CaloNoiseToolDefault.sysInitialize() FATAL Standard std::exception is caught
ToolSvc.CaloNoiseToolDefault.sysInitialize() ERROR CaloCondBlobBase::getAddress: Index out of range: 100608 >= 95616
StatusCodeSvc FATAL Unchecked StatusCode in AlgTool::sysInitialize() from lib /cvmfs/atlas.cern.ch/repo/sw/software/i686-slc5-gcc43-opt/17.2.1/GAUDI/v22r1p7-lcg61d/InstallArea/i686-slc5-gcc43-opt/lib/libGaudiKernel.so
See:
http://panda.cern.ch/server/pandamon/query?overview=viewlogfile&nocachemark=yes&guid=9f934b7c-238c-4db2-95a0-9b6bd070ef29&lfn=group.phys-beauty.data12_8TeV.periodI.physics_Bphysics.PhysCont.DAOD_UPSIMUMU.grp14_v03_p1425.v1.log.4288166.001234.log.tgz&site=RAL-LCG2_SCRATCHDISK&scope=group.phys-beauty
Cheers,
Cameron
Any news on this?
I do not follow very well what you are doing Cameron, but looking at your log file it seems to me that the error
is related to time used to access the DCS folder.
Nevertheless I cannot check this, because the information is not printed.
May be you could try to use a DEBUG level of logging ? I do not think that to force the system to access Oracle-Frontier
you need to put the override line you mention. There should be something else at athena level, but there again
I cannot really help.
A.
Hi,
Ok I will try to run some test jobs on DEBUG and send you through the logs.
In the mean time, who can I ask about accessing the 'HEAD' of the conditions database?
Cheers,
Cameron
Hi
Regarding
This has probably nothing to do with DCS folders
This can be related to the fact that you are using a “new” condition tag and an old software version and the two are not compatible
(we have only backward compatibility not forward compatibility)
Guillaume
Hi,
Is there anything that can be done to fix this issue then, aside from using a different software version (not an option)?
Cheers,
Cameron
Hi
Maybe you can say which software release you are using and which condition tag you are using ?
Guillaume
Hi,
Py:Athena INFO using release [AtlasOffline-17.2.1] [i686-slc5-gcc43-opt] [17.2.X-VAL/rel_3] -- built on [2012 03/27 21:48]
TrfJobReport metaData_conditionsTag = COMCOND-BLKPA-006-07
Cheers,
Cameron
Here is the log file from a run with atlas -l DEBUG. Not sure it gives you any extra info in this case...
outDEBUG.log (1.6 MB)
- Loss: CachedMultipleNegativesRankingLoss with these parameters:
  { "scale": 1.0, "similarity_fct": "dot_score" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates
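The non-default values above map roughly onto the trainer arguments as in the sketch below; the output directory is a placeholder, and the epoch count is taken from the full hyperparameter list that follows.

# Sketch: the non-default hyperparameters expressed as
# SentenceTransformerTrainingArguments. "output/..." is a placeholder path.
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/multi-qa-mpnet-base-dot-v1-ATLAS-TALK",
    num_train_epochs=3,                      # from the full list below
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)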
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.2890 | 100 | 0.7838 | 1.1991 |
0.5780 | 200 | 0.4176 | 0.6541 |
0.8671 | 300 | 0.2991 | 0.6290 |
1.1561 | 400 | 0.4573 | 0.6447 |
1.4451 | 500 | 0.1258 | 0.6278 |
1.7341 | 600 | 0.0781 | 0.6762 |
2.0231 | 700 | 0.1254 | 0.6074 |
2.3121 | 800 | 0.0727 | 0.7019 |
2.6012 | 900 | 0.0199 | 0.6263 |
2.8902 | 1000 | 0.025 | 0.6574 |
Framework Versions
- Python: 3.12.8
- Sentence Transformers: 3.2.1
- Transformers: 4.44.0
- PyTorch: 2.4.1
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
CachedMultipleNegativesRankingLoss
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}