metadata
language:
  - en
license: apache-2.0
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:6433
  - loss:Infonce
base_model: microsoft/mpnet-base
widget:
  - source_sentence: >-
      What surprised the author about the appearance of sloths when looking for
      animals to draw for the letter S
    sentences:
      - >-
        Third National Bank may refer to:


        in the United States

        (by state)

        Third National Bank (Atlanta, Georgia), now The Metropolitan (Atlanta
        condominium building)
         Third National Bank (Glasgow, Kentucky), listed on the NRHP in Kentucky
         Third National Bank (Ohio), a predecessor of Fifth Third Bank
         Third National Bank (Syracuse, New York), listed on the NRHP in New York
         Third National Bank (Sandusky, Ohio), listed on the NRHP in Ohio
         Third National Bank in Nashville, now incorporated within SunTrust Bank
      - >-
        S is for Sloth

        I never really paid much attention to sloths until I began searching for
        animals to draw for the letter S. Looking at some photos, I was shocked
        to see just how much they look like Muppets in real life! They're
        hilarious!
      - >-
        History

        The Annual Review of Animal Biosciences was first published in 2013,
        with Harris A. Lewin and R. Michael Roberts as the founding co-editors.
        Though it was initially published in print, as of 2021 it is only
        published electronically. Scope and indexing

        The Annual Review of Animal Biosciences defines its scope as covering
        significant developments relevant to biotechnology, genomics, genetics,
        veterinary medicine, animal breeding, and conservation biology. The
        intended audience for the journal is scientists and veterinarians
        involved with wild and domestic animals. It is abstracted and indexed in
        Scopus, Science Citation Index Expanded, MEDLINE, and Embase, among
        others.
  - source_sentence: >-
      What distinguished the roles of different prisoners, such as the
      functionaries and the Sonderkommando, in the Auschwitz camps
    sentences:
      - >-
        Michael Matthews is a South African writer, producer and director. In
        2017, he directed Five Fingers for Marseilles, a film that won best film
        category at 14th Africa Movie Academy Awards. Early life

        Matthews was born in Durban. He studied filmmaking at the CityVarsity
        Cape Town campus. Career 

        Matthews is the director of Five Fingers for Marseilles, a film dubbed
        as South African first western film.
      - >-
        Warsame and Ahmed M. Hassan, who was elected to the Clarkston, Georgia
        City Council on the same day, are the first Somali Americans to be
        elected to municipal offices in the United States and were the highest
        elected Somali Americans in the country at the time. Warsame's election
        set civic precedence in the Somali American community of Minneapolis, in
        which his campaign energized and mobilized this sub-community's powerful
        voting bloc.
      - "Designated as Aussenlager (external camp), Nebenlager (extension camp), Arbeitslager (labor camp), or Aussenkommando (external work detail), camps were built at Blechhammer, Jawiszowice, Jaworzno, Lagisze, Mysłowice, Trzebinia, and as far afield as the Protectorate of Bohemia and Moravia in Czechoslovakia. Industries with satellite camps included coal mines, foundries and other metal works, and chemical plants. Prisoners were also made to work in forestry and farming. For example, Wirtschaftshof Budy, in the Polish village of Budy near Brzeszcze, was a farming subcamp where prisoners worked 12-hour days in the fields, tending animals, and making compost by mixing human ashes from the crematoria with sod and manure. Incidents of sabotage to decrease production took place in several subcamps, including Charlottengrube, Gleiwitz II, and Rajsko. Living conditions in some of the camps were so poor that they were regarded as punishment subcamps. Life in the camps\n\nSS garrison\n\nRudolf Höss, born in Baden-Baden in 1900, was named the first commandant of Auschwitz when Heinrich Himmler ordered on 27 April 1940 that the camp be established. Living with his wife and children in a two-story stucco house near the commandant's and administration building, he served as commandant until 11 November 1943, with Josef Kramer as his deputy. Succeeded as commandant by Arthur Liebehenschel, Höss joined the SS Business and Administration Head Office in Oranienburg as director of Amt DI, a post that made him deputy of the camps inspectorate. Richard Baer became commandant of Auschwitz I on 11 May 1944 and Fritz Hartjenstein of Auschwitz II from 22 November 1943, followed by Josef Kramer from 15 May 1944 until the camp's liquidation in January 1945. Heinrich Schwarz was commandant of Auschwitz III from the point at which it became an autonomous camp in November 1943 until its liquidation. 
Höss returned to Auschwitz between 8 May and 29 July 1944 as the local SS garrison commander (Standortältester) to oversee the arrival of Hungary's Jews, which made him the superior officer of all the commandants of the Auschwitz camps. According to Aleksander Lasik, about 6,335 people (6,161 of them men) worked for the SS at Auschwitz over the course of the camp's existence; 4.2 percent were officers, 26.1 percent non-commissioned officers, and 69.7 percent rank and file. In March 1941, there were 700 SS guards; in June 1942, 2,000; and in August 1944, 3,342. At its peak in January 1945, 4,480 SS men and 71 SS women worked in Auschwitz; the higher number is probably attributable to the logistics of evacuating the camp. Female guards were known as SS supervisors (SS-Aufseherinnen). Most of the staff were from Germany or Austria, but as the war progressed, increasing numbers of Volksdeutsche from other countries, including Czechoslovakia, Poland, Yugoslavia, and the Baltic states, joined the SS at Auschwitz. Not all were ethnically German. Guards were also recruited from Hungary, Romania, and Slovakia. Camp guards, around three quarters of the SS personnel, were members of the SS-Totenkopfverbände (death's head units). Other SS staff worked in the medical or political departments, or in the economic administration, which was responsible for clothing and other supplies, including the property of dead prisoners. The SS viewed Auschwitz as a comfortable posting; being there meant they had avoided the front and had access to the victims' property. Functionaries and Sonderkommando\n\nCertain prisoners, at first non-Jewish Germans but later Jews and non-Jewish Poles, were assigned positions of authority as Funktionshäftlinge (functionaries), which gave them access to better housing and food. The Lagerprominenz (camp elite) included Blockschreiber (barracks clerk), Kapo (overseer), Stubendienst (barracks orderly), and Kommandierte (trusties). 
Wielding tremendous power over other prisoners, the functionaries developed a reputation as sadists. Very few were prosecuted after the war, because of the difficulty of determining which atrocities had been performed by order of the SS. Although the SS oversaw the murders at each gas chamber, the forced labor portion of the work was done by prisoners known from 1942 as the Sonderkommando (special squad). These were mostly Jews but they included groups such as Soviet POWs. In 1940–1941 when there was one gas chamber, there were 20 such prisoners, in late 1943 there were 400, and by 1944 during the Holocaust in Hungary the number had risen to 874. The Sonderkommando removed goods and corpses from the incoming trains, guided victims to the dressing rooms and gas chambers, removed their bodies afterwards, and took their jewelry, hair, dental work, and any precious metals from their teeth, all of which was sent to Germany. Once the bodies were stripped of anything valuable, the Sonderkommando burned them in the crematoria. Because they were witnesses to the mass murder, the Sonderkommando lived separately from the other prisoners, although this rule was not applied to the non-Jews among them. Their quality of life was further improved by their access to the property of new arrivals, which they traded within the camp, including with the SS. Nevertheless, their life expectancy was short; they were regularly murdered and replaced. About 100 survived to the camp's liquidation. They were forced on a death march and by train to the camp at Mauthausen, where three days later they were asked to step forward during roll call. No one did, and because the SS did not have their records, several of them survived. Tattoos and triangles\n\nUniquely at Auschwitz, prisoners were tattooed with a serial number, on their left breast for Soviet prisoners of war and on the left arm for civilians. 
Categories of prisoner were distinguishable by triangular pieces of cloth (German: Winkel) sewn onto on their jackets below their prisoner number. Political prisoners (Schutzhäftlinge or Sch), mostly Poles, had a red triangle, while criminals (Berufsverbrecher or BV) were mostly German and wore green. Asocial prisoners (Asoziale or Aso), which included vagrants, prostitutes and the Roma, wore black. Purple was for Jehovah's Witnesses (Internationale Bibelforscher-Vereinigung or IBV)'s and pink for gay men, who were mostly German. An estimated 5,000–15,000 gay men prosecuted under German Penal Code Section 175 (proscribing sexual acts between men) were detained in concentration camps, of whom an unknown number were sent to Auschwitz. Jews wore a yellow badge, the shape of the Star of David, overlaid by a second triangle if they also belonged to a second category. The nationality of the inmate was indicated by a letter stitched onto the cloth. A racial hierarchy existed, with German prisoners at the top. Next were non-Jewish prisoners from other countries. Jewish prisoners were at the bottom. Transports\n\nDeportees were brought to Auschwitz crammed in wretched conditions into goods or cattle wagons, arriving near a railway station or at one of several dedicated trackside ramps, including one next to Auschwitz I. The Altejudenrampe (old Jewish ramp), part of the Oświęcim freight railway station, was used from 1942 to 1944 for Jewish transports. Located between Auschwitz I and Auschwitz II, arriving at this ramp meant a 2.5\_km journey to Auschwitz II and the gas chambers. Most deportees were forced to walk, accompanied by SS men and a car with a Red Cross symbol that carried the Zyklon B, as well as an SS doctor in case officers were poisoned by mistake. Inmates arriving at night, or who were too weak to walk, were taken by truck. 
Work on a new railway line and ramp (right) between sectors BI and BII in Auschwitz II, was completed in May 1944 for the arrival of Hungarian Jews between May and early July 1944. The rails led directly to the area around the gas chambers. Life for the inmates\nThe day began at 4:30\_am for the men (an hour later in winter), and earlier for the women, when the block supervis"
  - source_sentence: >-
      Do restaurants like Chick-fil-A have signs indicating restrictions against
      LGBTQ+ individuals
    sentences:
      - >-
        Restaurants have signs like "no smoking," "no guns," "no shoes, no
        service," but never have I seen a restaurant, especially Chick-fil-A,
        say "no gays or lesbians." Get on the ball

        In the newspaper this past week, "St. Charles seeks input on mall," how
        many more studies are they going to do
      - >-
        His excavations lead to him being convinced that this site was more than
        likely a pre-ceramic age and decided to discover it further. Later
        Voorhies worked to understand and evaluate the Chantuto sites and the
        people who inhabited this area.
      - >-
        Gross domestic product (GDP) is the market value of all final goods and
        services from a nation in a given year.
  - source_sentence: How can Vaseline be used to help with chapped lips
    sentences:
      - >-
        4. Soothe Chapped Lips

        So, you probably already knew Vaseline made a great lip balm, but it can
        also be used as a base in many lip scrubs, which will really come in
        handy during the winter months. 5.
      - >-
        Retrieved March 30, 2005 from . Healthcare Information and Management
        Systems Society (HIMSS) (2005, February).
      - >-
        The Government of Zimbabwe strongly believes in the independence of the
        judiciary and respects the principles of the separation of powers as set
        out in the Constitution of Zimbabwe. The Government of Zimbabwe,
        therefore, recognises the importance of the judiciary as a dependable
        interpreter of the law where various opinions may arise.
  - source_sentence: >-
      What challenges do university researchers face when trying to turn their
      discoveries into commercial products
    sentences:
      - >-
        A major shakeup has taken place at the top of the Boston Celtics. Danny
        Ainge has stepped down as president of basketball operations, and head
        coach Brad Stevens has stepped into the role. Stevens will now lead the
        search for a new coach. The team made the announcement early Wednesday,
        one day after the Celtics were eliminated by the Brooklyn Nets in the
        first round of the Eastern Conference playoffs. “Helping guide this
        organization has been the thrill of a lifetime, and having worked
        side-by-side with him since he’s been here, I know we couldn’t be in
        better hands than with Brad guiding the team going forward,” Ainge said
        in a statement. “I’m grateful to ownership, all of my Celtics
        colleagues, and the best fans in basketball for being part of the
        journey.”

        Ainge, 62, is a franchise legend.
      - >-
        Alfred William Lawson (March 24, 1869 – November 29, 1954) was an
        English born professional baseball player, aviator and utopian
        philosopher. He was a baseball player, manager, and league promoter from
        1887 through 1916 and went on to play a pioneering role in the U.S.
        aircraft industry. He published two early aviation trade journals. He is
        frequently cited as the inventor of the airliner and was awarded several
        of the first air mail contracts, which he ultimately could not fulfill.
        He founded the Lawson Aircraft Company in Green Bay, Wisconsin, to build
        military training aircraft and later the Lawson Airplane Company in
        South Milwaukee, Wisconsin, to build airliners. The crash of his
        ambitious Lawson L-4 "Midnight Liner"  during its trial flight takeoff
        on May 8, 1921, ended his best chance for commercial aviation success.
        In 1904, he wrote a utopian novel, Born Again, in which he developed the
        philosophy which later became Lawsonomy. Baseball career (1888–1907)


        He made one start for the Boston Beaneaters and two for the Pittsburgh
        Alleghenys during the 1890 season. His minor league playing career
        lasted through 1895. He later managed in the minors from 1905 to 1907.
        Union Professional League

        In 1908, he started a new professional baseball league known as the
        Union Professional League. The league took the field in April but folded
        one month later owing to financial difficulties. Aviation career
        (1908–1928)

        An early advocate or rather evangelist of aviation, in October 1908
        Lawson started the magazine Fly to stimulate public interest and educate
        readers in the fundamentals of the new science of aviation. It sold for
        10 cents a copy from newsstands across the country. In 1910, moving to
        New York City, he renamed the magazine Aircraft and published it until
        1914. The magazine chronicled the technical developments of the early
        aviation pioneers. Lawson was the first advocate for commercial air
        travel, coining the term "airline." He also advocated for a strong
        American flying force, lobbying Congress in 1913 to expand its
        appropriations for Army aircraft. In early 1913, he learned to fly the
        Sloan-Deperdussin and the Moisant-Bleriot monoplanes, becoming an
        accomplished pilot. Later that year he bought a Thomas flying boat and
        became the first air commuter regularly flying from his country house in
        Seidler's Beach, New Jersey, to the foot of 75th Street in New York City
        (about 35 miles). In 1917, utilizing the knowledge gained from ten years
        of advocating aviation, he built his first airplane, the Lawson Military
        Tractor 1 (MT-1) trainer, and founded the Lawson Aircraft Corporation.
        The company's plant was sited at Green Bay, Wisconsin. There he secured
        a contract and built the Lawson MT-2. He also designed the steel
        fuselage Lawson Armored Battler, which never got beyond the drafting
        board, given doubts within the Army aviation community and the signing
        of the armistice. After the war, in 1919 Lawson started a project to
        build America's first airline. He secured financial backing, and in five
        months he had built and demonstrated in flight his biplane airliner, the
        18-passenger Lawson L-2. He demonstrated its capabilities in a 2000-mile
        multi-city tour from Milwaukee to
        Chicago-Toledo-Cleveland-Buffalo-Syracuse-New york City-Washington,
        D.C.-Collinsville-Dayton-Chicago and back to Milwaukee, creating a buzz
        of positive press. The publicity allowed him to secure an additional $1
        million to build the 26-passenger Midnight Liner. The aircraft crashed
        on takeoff on its maiden flight. In late 1920, he secured government
        contracts for three airmail routes and to deliver ten war planes, but
        owing to the fall 1920 recession, he could not secure the necessary
        $100,000 in cash reserves called for in the contracts and had to decline
        them.
      - >-
        Universities are vital to the process of innovation and advancement:
        they educate students who bring new ways of thinking to old problems,
        and they make new discoveries that no one else would make because no one
        else has the opportunity to delve so deeply. In creating this type of
        refuge, we also create a comfort zone. Because governmental support for
        science and technology is designed to support long-term, high-risk work
        regardless of immediate return, ROI is not a factor in getting
        government funding. University researchers become successful at pitching
        research ideas without serious reference to commercial outcome. Peer
        review – which is critical for the success of science – further
        reinforces this tendency. University researchers are rewarded for
        thinking in this very specific way, and this creates the comfort zone.
        As it dawns on a researcher that they may need to work with a company or
        an entrepreneur to see their discoveries become products or services
        that can benefit society, they may find themselves a victim of their own
        past success. Many researchers reflexively approach companies as if they
        are yet another type of funding agency, but since companies are not in
        the grant-making business, a partnership fails to materialize. This
        basic failure to communicate means valuable commercial opportunities are
        often not recognized, or when they are, the resulting partnership does
        not go well.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

MPNet base trained on AllNLI triplets

This is a sentence-transformers model fine-tuned from microsoft/mpnet-base. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: microsoft/mpnet-base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0
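
As a rough illustration of the similarity function listed above, the following minimal sketch computes cosine similarity in plain Python. The 3-dimensional vectors are toy stand-ins for the model's 768-dimensional embeddings, and the function name `cosine_similarity` is an illustrative choice, not part of the library.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors (assumption: real embeddings would have 768 entries).
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]   # same direction as a, different length
c = [3.0, 0.0, -1.0]  # orthogonal to a

sim_ab = cosine_similarity(a, b)  # close to 1: direction matters, length does not
sim_ac = cosine_similarity(a, c)  # close to 0: unrelated directions
```

Because the score depends only on direction, two embeddings of different magnitude can still count as a perfect match.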

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
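
The pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence vector is the average of the token vectors. A minimal sketch of that idea, assuming padding positions are marked with 0 in an attention mask, could look like the following. The helper `mean_pool` and the toy inputs are hypothetical and only illustrate the arithmetic; this is not the library's implementation.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, counting only positions where the mask is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                summed[i] += x
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]

# Two real tokens followed by one padding token (toy 2-dimensional vectors).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)  # the padding vector is ignored
```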

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Jrinky/mpnet-base-all-nli-triplet")
# Run inference
sentences = [
    'What challenges do university researchers face when trying to turn their discoveries into commercial products',
    'Universities are vital to the process of innovation and advancement: they educate students who bring new ways of thinking to old problems, and they make new discoveries that no one else would make because no one else has the opportunity to delve so deeply. In creating this type of refuge, we also create a comfort zone. Because governmental support for science and technology is designed to support long-term, high-risk work regardless of immediate return, ROI is not a factor in getting government funding. University researchers become successful at pitching research ideas without serious reference to commercial outcome. Peer review – which is critical for the success of science – further reinforces this tendency. University researchers are rewarded for thinking in this very specific way, and this creates the comfort zone. As it dawns on a researcher that they may need to work with a company or an entrepreneur to see their discoveries become products or services that can benefit society, they may find themselves a victim of their own past success. Many researchers reflexively approach companies as if they are yet another type of funding agency, but since companies are not in the grant-making business, a partnership fails to materialize. This basic failure to communicate means valuable commercial opportunities are often not recognized, or when they are, the resulting partnership does not go well.',
    'A major shakeup has taken place at the top of the Boston Celtics. Danny Ainge has stepped down as president of basketball operations, and head coach Brad Stevens has stepped into the role. Stevens will now lead the search for a new coach. The team made the announcement early Wednesday, one day after the Celtics were eliminated by the Brooklyn Nets in the first round of the Eastern Conference playoffs. “Helping guide this organization has been the thrill of a lifetime, and having worked side-by-side with him since he’s been here, I know we couldn’t be in better hands than with Brad guiding the team going forward,” Ainge said in a statement. “I’m grateful to ownership, all of my Celtics colleagues, and the best fans in basketball for being part of the journey.”\nAinge, 62, is a franchise legend.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
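
One way to use the similarity matrix from the snippet above is to treat the first sentence as a query and rank the others against it. The sketch below assumes the matrix has been converted to nested Python lists (for example with `similarities.tolist()`); the helper `rank_against_query` is hypothetical, and the numbers are illustrative values, not scores produced by this model.

```python
def rank_against_query(similarities, query_index=0):
    """Return (index, score) pairs for all other rows, best match first."""
    row = similarities[query_index]
    candidates = [(j, float(s)) for j, s in enumerate(row) if j != query_index]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# Illustrative 3x3 matrix: row 0 is the query, rows 1 and 2 are passages.
toy_similarities = [
    [1.00, 0.80, 0.10],
    [0.80, 1.00, 0.15],
    [0.10, 0.15, 1.00],
]
ranking = rank_against_query(toy_similarities)
best_index, best_score = ranking[0]
```

The query's own entry is skipped because a sentence always has similarity 1 with itself.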

Training Details

Training Dataset

Unnamed Dataset

  • Size: 6,433 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    |         | anchor | positive |
    |---------|--------|----------|
    | type    | string | string   |
    | details | min: 6 tokens, mean: 16.21 tokens, max: 42 tokens | min: 5 tokens, mean: 140.69 tokens, max: 512 tokens |
  • Samples:
    | anchor | positive |
    |--------|----------|
    | What type of event is being described by Pierre LeBrun in relation to the NHL | ESPN’s Pierre LeBrun said, “It's not just about one NHL game anymore. It's a week-long event. |
    | Who designed the property's landscape and when was the building listed on the National Register of Historic Places | The property's landscape continues a circular theme, with flower beds, fencing, and parking arranged in concentric patterns around the structure. It was designed by the Washington, DC firm of Deigert & Yerkes. The building was listed on the National Register of Historic Places in 2017. |
    | Is 'ladens' a valid word to use in Scrabble and other word games | Scrabble?! LADENSIs ladens valid for Scrabble? Words With Friends? Lexulous? WordFeud? Other games |
  • Loss: selfloss.Infonce with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
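
The card does not show how `selfloss.Infonce` is implemented. As a hedged sketch, the code below assumes it follows the standard in-batch-negatives formulation from the cited Henderson et al. paper: each anchor is scored against every positive in the batch with scaled cosine similarity, and cross-entropy pushes the matching pair to the top. The function names and toy vectors are illustrative; only `scale` 20.0 and cosine similarity come from the parameters above.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def info_nce_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where the correct 'class' for anchor i is positive i."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.1], [0.1, 1.0]]  # positive i points the same way as anchor i
swapped = [[0.1, 1.0], [1.0, 0.1]]  # positives paired with the wrong anchors

loss_aligned = info_nce_loss(anchors, aligned)  # small: correct pairs already win
loss_swapped = info_nce_loss(anchors, swapped)  # large: in-batch negatives win
```

Under this formulation every other positive in the batch acts as a negative, which is one reason a `no_duplicates` batch sampler is a natural fit.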
    

Evaluation Dataset

Unnamed Dataset

  • Size: 804 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 804 samples:
    |         | anchor | positive |
    |---------|--------|----------|
    | type    | string | string   |
    | details | min: 7 tokens, mean: 16.44 tokens, max: 38 tokens | min: 8 tokens, mean: 149.21 tokens, max: 512 tokens |
  • Samples:
    | anchor | positive |
    |--------|----------|
    | What types of special events can the salon services be booked for | Our fabulous salon services are available at your special event! Whether it's a wedding, photo shoot, prom, or just a fun girls' night in- we do it all. |
    | What material is the Hudson Baby plush hooded robe made of | Dimensions (Overall): 10 inches (L), 10 inches (H) x 1 inches (W)<br>Weight: 1 pounds<br>Textile Material: 100% Polyester<br>• Animal face plush hooded bath robe. • Made with 100% plush coral fleece fabric<br>• Soft and gentle on baby's skin<br>• Optimal for everyday use<br>• Affordable, high quality bath robe<br>Hudson Baby plush hooded robe is made of super soft, cozy plush material to dry and warm baby after bath or pool time. |
    | Where is this uncommon species thought to occur | It is also thought to occur in New Zealand. It is an uncommon species, growing in "heathy woodland [in] semi shade". |
  • Loss: selfloss.Infonce with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 2e-05
  • num_train_epochs: 6
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
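
To make the effect of `learning_rate` and `warmup_ratio` concrete, here is a minimal sketch of a linear warmup followed by linear decay, which is the usual behavior for a `linear` scheduler type. The total of 606 optimizer steps is an assumption inferred from the training logs (about 101 steps per epoch over 6 epochs); the card does not state it. The helper `linear_schedule_lr` is illustrative, not the trainer's own code.

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

TOTAL_STEPS = 606  # assumption: inferred from the logs, not stated in the card
WARMUP_STEPS = math.ceil(TOTAL_STEPS * 0.1)

lr_start = linear_schedule_lr(0, TOTAL_STEPS)            # starts at zero
lr_peak = linear_schedule_lr(WARMUP_STEPS, TOTAL_STEPS)  # reaches base_lr after warmup
lr_end = linear_schedule_lr(TOTAL_STEPS, TOTAL_STEPS)    # decays back to zero
```

Under these assumptions the rate climbs for roughly the first tenth of training and then falls steadily for the rest.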

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 6
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

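| Epoch  | Step | Training Loss | Validation Loss |
|--------|------|---------------|-----------------|
| 0.9901 | 100  | 1.4311        | 0.2171          |
| 1.9802 | 200  | 0.237         | 0.1718          |
| 2.9703 | 300  | 0.1466        | 0.1561          |
| 3.9604 | 400  | 0.1084        | 0.1541          |
| 4.9505 | 500  | 0.0879        | 0.1528          |
| 5.9406 | 600  | 0.0794        | 0.1514          |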
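
The epoch and step columns can be cross-checked with a little arithmetic. The sketch below divides step by epoch to estimate steps per epoch, then looks for batch sizes that would yield that count for the 6,433 training samples. It assumes the last partial batch is kept (`dataloader_drop_last: False`) and ignores any effect the `no_duplicates` sampler may have on batch counts, so the result is an inference rather than something the card states.

```python
import math

# (epoch, step) pairs copied from the training log table.
log_rows = [
    (0.9901, 100), (1.9802, 200), (2.9703, 300),
    (3.9604, 400), (4.9505, 500), (5.9406, 600),
]
TRAIN_SAMPLES = 6433  # from the training dataset size

# Every row should imply the same number of optimizer steps per epoch.
steps_per_epoch = {round(step / epoch) for epoch, step in log_rows}

# Effective batch sizes that would produce that many steps per epoch.
target = next(iter(steps_per_epoch))
candidate_batch_sizes = [
    b for b in range(1, 257) if math.ceil(TRAIN_SAMPLES / b) == target
]
```

With these numbers the only candidate is an effective batch of 64, which would be consistent with `per_device_train_batch_size: 32` on two devices, though the card does not report the device count.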

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.4.0
  • Transformers: 4.48.1
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

Infonce

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}