--- base_model: dunzhang/stella_en_1.5B_v5 datasets: [] language: [] library_name: sentence-transformers pipeline_tag: sentence-similarity tags: - sentence-transformers - sentence-similarity - feature-extraction - generated_from_trainer - dataset_size:99000 - loss:MultipleNegativesSymmetricRankingLoss widget: - source_sentence: 'Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Glay' sentences: - The Theory of Good and Evil is a 1907 book about ethics by the English philosopher Hastings Rashdall, in which the author expounds a theory he calls "ideal utilitarianism". It has been seen as Rashdall's most important philosophical work. - GLAY is a Japanese rock band , formed in Hakodate in 1988 . Glay primarily composes songs in the rock and pop genres , but they have also arranged songs using elements from a wide variety of genres , including punk , electronic , R&B , progressive rock , folk , reggae , gospel , and ska . Originally a visual kei band , the group slowly shifted to less dramatic attire through the years . As of 2008 , Glay had sold an estimated 51 million records ; 28 million singles and 23 million albums , making them one of the top ten best-selling artists of all time in Japan . - Aashirwad is a 1968 Bollywood film , directed by Hrishikesh Mukherjee . The film stars Ashok Kumar and Sanjeev Kumar . The film is notable for its inclusion of a rap-like song performed by Ashok Kumar , `` Rail Gaadi '' . - source_sentence: 'Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Indexing does not work with index package' sentences: - 'I am trying to do indexing with the following code: \documentclass[a4paper]{article} \usepackage{index} \makeindex \newindex{aut}{adx}{and}{Name Index} \begin{document} Hellow \index[aut]{FiRST} \printindex[aut] \end{document} Acccording to documention of the `index` package it should work. But makeindex creates empty `.idx` and `.ind`. 
If I run code like this: \documentclass[a4paper]{article} \usepackage{index} \makeindex \begin{document} Hellow \index{FiRST} \printindex \end{document} It runs. But I need to have user-defined index. Please help me with it. I''ve searched for several hours on internet, but without success.' - 'Body materials may include, but are not limited to, any of these materials:' - Berberis aemulans is a shrub endemic to the region of Sichuan in southern China. It grows there in thickets and on slopes at elevations of 2900-3200 m.Berberis aemulans is a deciduous shrub up to 2 m tall, with spines along the branches. Leaves are simple, elliptical to ovate, up to 4 cm long, lighter in color on the underside because of a waxy layer. Flowers are in simple racemes of only a few flowers. Berries egg-shaped, orange, up to 16 mm long. - source_sentence: 'Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Parodi''s hemispingus' sentences: - Another event dubbed a "Battle of the Sexes" took place during the 1998 Australian Open[51] between Karsten Braasch and the Williams sisters. Venus and Serena Williams had claimed that they could beat any male player ranked outside the world's top 200, so Braasch, then ranked 203rd, challenged them both. Braasch was described by one journalist as "a man whose training regime centered around a pack of cigarettes and more than a couple bottles of ice cold lager".[52][51] The matches took place on court number 12 in Melbourne Park,[53] after Braasch had finished a round of golf and two shandies. He first took on Serena and after leading 5–0, beat her 6–1. Venus then walked on court and again Braasch was victorious, this time winning 6–2.[54] Braasch said afterwards, "500 and above, no chance". 
He added that he had played like someone ranked 600th in order to keep the game "fun".[55] Braasch said the big difference was that men can chase down shots much easier, and that men put spin on the ball that the women can't handle. The Williams sisters adjusted their claim to beating men outside the top 350.[51] - The Parodi 's hemispingus ( Hemispingus parodii ) is a species of bird in the family Thraupidae that is endemic to Peru . Its natural habitat is subtropical or tropical moist montane forests . - 'I need help because my Minecraft launcher doesn''t work... It''s been a long time I haven''t played Minecraft and until now it worked nicely. But now that I want to play on it again and I run the launcher, this appears (click images to enlarge): ![enter image description here](http://i.stack.imgur.com/hvD9R.png) At the bottom left of the screen the profile names keep loading (normally my username appears in the box) and as you can see I am unable to click on the "Play" button. I tried creating another profile but it doesn''t work because soon after they ask to enter my Minecraft username and password. The password I entered disappears and it keeps loading (I''ve tried waiting like, 30 minutes and it still doesn''t work) so this is definitely not normal. ![enter image description here](http://i.stack.imgur.com/yDYjX.png) ![enter image description here](http://i.stack.imgur.com/4Nf1L.png) ![enter image description here](http://i.stack.imgur.com/T6cJu.png) So basically I can''t play on Minecraft anymore (version 1.7.9)... P.S. I use Windows 7.' - source_sentence: 'Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Mahabharata' sentences: - The epic employs the story within a story structure, otherwise known as frametales, popular in many Indian religious and non-religious works. 
It is first recited at Takshashila by the sage Vaiśampāyana,[12][13] a disciple of Vyāsa, to the King Janamejaya who is the great-grandson of the Pāṇḍava prince Arjuna. The story is then recited again by a professional storyteller named Ugraśrava Sauti, many years later, to an assemblage of sages performing the 12-year sacrifice for the king Saunaka Kulapati in the Naimiśa Forest. - 'Guncati (Serbian Cyrillic: Гунцати) is a suburban settlement of Belgrade, the capital of Serbia. It is located in the municipality of Barajevo.Guncati is located west of the municipal seat of Barajevo, halfway between the Belgrade-Bar railway and Ibarska magistrala (Highway of Ibar).It is a rural settlement with a steady population growth: from 1,718 (Census 1991) to 2,102 (Census 2002).' - Beck 's Brewery , also known as Brauerei Beck & Co. , is a brewery in the northern German city of Bremen . In 2001 , Interbrew agreed to buy Brauerei Beck for 1.8 billion euro ; at that time it was the fourth largest brewer in Germany . US manufacture of Beck 's Brew has been based in St. Louis , Missouri , since early 2012 but some customers have rebelled against the US market version . Since 2008 , it has been owned by the Interbrew subsidiary of Anheuser-Busch InBev SA/NV . The Beck 's Art Label Campaign has offered artists the opportunity to provide designs to replace the brand 's label . It started in London in 1987 with Gilbert and George . The artists created an art label , because Beck 's sponsored their retrospective at the Hayward Gallery . The labels of the 2000 limited edition Beck 's bottles were matching their exhibition poster . Other participants of the Art Label Campaign are members of the loose group `` Young British Artists '' and nominees or winners of the Turner Prize . Damien Hirst for example , designed a label for Beck 's in 1995 , showing his famous spots . In 2000 , Tracey Emin created a label , which shows herself , posing in a bathtub . 
Furthermore , Rachel Whiteread designed a label in 1993 , presenting her artwork `` house '' , which was also financed by Beck 's . The Art Label Campaign has also been parodied by Matthew Higgs , who is a member of the British art collective `` Bank '' . In the Bank exhibition `` The Charge of the Light Brigade '' in 1995 , he brewed a beer , called `` Kunstlerbrau '' . In 2012 , Beck 's started giving young and independent musicians the opportunity to design a label for the Beck 's bottle . Beck 's summer 2009 limited-edition labels were designed by the musical groups Hard-Fi and Ladyhawke . - source_sentence: 'Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Ahu A Umi Heiau' sentences: - The 1967 All-Ireland Intermediate Hurling Championship was the seventh staging of the All-Ireland hurling championship. The championship ended on 17 September 1967.Tipperary were the defending champions, however, they were defeated in the provincial championship. London won the title after defeating Cork by 1-9 to 1-5 in the final. - 'The digit ratio is the ratio of the lengths of different digits or fingers typically measured from the midpoint of bottom crease ( where the finger joins the hand ) to the tip of the finger . It has been suggested by some scientists that the ratio of two digits in particular , the 2nd ( index finger ) and 4th ( ring finger ) , is affected by exposure to androgens , e.g. , testosterone while in the uterus and that this 2D :4 D ratio can be considered a crude measure for prenatal androgen exposure , with lower 2D :4 D ratios pointing to higher prenatal androgen exposure . The 2D :4 D ratio is calculated by dividing the length of the index finger of a given hand by the length of the ring finger of the same hand . A longer index finger will result in a ratio higher than 1 , while a longer ring finger will result in a ratio lower than 1 . 
The 2D :4 D digit ratio is sexually dimorphic : although the second digit is typically shorter in both females and males , the difference between the lengths of the two digits is greater in males than in females . A number of studies have shown a correlation between the 2D :4 D digit ratio and various physical and behavioral traits .' - Ahu A ʻ Umi Heiau means "shrine at the temple of ʻ Umi" in the Hawaiian Language.
---

# SentenceTransformer based on dunzhang/stella_en_1.5B_v5

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [dunzhang/stella_en_1.5B_v5](https://huggingface.co/dunzhang/stella_en_1.5B_v5). It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [dunzhang/stella_en_1.5B_v5](https://huggingface.co/dunzhang/stella_en_1.5B_v5)
- **Maximum Sequence Length:** 8096 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8096, 'do_lower_case': False}) with Transformer model: Qwen2Model
  (1): Pooling({'word_embedding_dimension': 1536, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 1536, 'out_features': 1024, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: Ahu A Umi Heiau',
    'Ahu A ʻ Umi Heiau means "shrine at the temple of ʻ Umi" in the Hawaiian Language.',
    'The digit ratio is the ratio of the lengths of different digits or fingers typically measured from the midpoint of bottom crease ( where the finger joins the hand ) to the tip of the finger . It has been suggested by some scientists that the ratio of two digits in particular , the 2nd ( index finger ) and 4th ( ring finger ) , is affected by exposure to androgens , e.g. , testosterone while in the uterus and that this 2D :4 D ratio can be considered a crude measure for prenatal androgen exposure , with lower 2D :4 D ratios pointing to higher prenatal androgen exposure . The 2D :4 D ratio is calculated by dividing the length of the index finger of a given hand by the length of the ring finger of the same hand . A longer index finger will result in a ratio higher than 1 , while a longer ring finger will result in a ratio lower than 1 . The 2D :4 D digit ratio is sexually dimorphic : although the second digit is typically shorter in both females and males , the difference between the lengths of the two digits is greater in males than in females . A number of studies have shown a correlation between the 2D :4 D digit ratio and various physical and behavioral traits .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

### Training Logs

| Epoch  | Step | Training Loss | Retrieval Loss |
|:------:|:----:|:-------------:|:--------------:|
| 0.6466 | 500  | 0.0424        | 0.0060         |
| 1.2932 | 1000 | 0.0073        | 0.0040         |
| 1.9399 | 1500 | 0.0029        | 0.0039         |
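
Returning to usage: in the examples in this card, each query carries an `Instruct: ...` prefix followed by `Query: ...`, passages are passed as plain text, and the similarity function is cosine similarity. The sketch below illustrates that prompt template and a cosine-based ranking in plain Python. The helper names (`format_query`, `cosine_similarity`, `rank_passages`) are hypothetical and not part of the Sentence Transformers API, and the 3-dimensional vectors are toy stand-ins for the model's 1024-dimensional embeddings.

```python
import math

# Instruction text copied from this card's widget examples.
TASK = "Given a web search query, retrieve relevant passages that answer the query."


def format_query(query, task=TASK):
    # Hypothetical helper: wraps a raw query in the "Instruct / Query" template.
    return f"Instruct: {task}\nQuery: {query}"


def cosine_similarity(a, b):
    # Cosine of the angle between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    # Returns passage indices ordered from most to least similar to the query.
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy vectors (assumption): real embeddings would come from model.encode(...).
query_vec = [1.0, 0.0, 0.0]
passage_vecs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]

print(format_query("Ahu A Umi Heiau"))
print(rank_passages(query_vec, passage_vecs))
```

In practice, the formatted query string and the raw passages would be passed to `model.encode`, and `model.similarity` would replace the hand-written cosine function.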