---
license: apache-2.0
language:
- en
metrics:
- accuracy
pipeline_tag: text-generation
model-index:
- name: Phi3.1-Simple-Arguments
  results:
  - task:
      type: text-generation
    dataset:
      name: Argument-parsing
      type: Argument-parsing
    metrics:
    - name: Accuracy
      type: Accuracy
      value: 100
---

# Phi3.1 Simple Arguments

![image](assets/phi_simple_arguments.png)

[![image](assets/hire_me.png)](https://www.freelancer.com/u/cdesivo92)

This model parses simple English arguments: arguments made up of two premises and a conclusion, built from two propositions.

## Model Details

### Model Description

- **Developed by:** Cristian Desivo
- **Model type:** LLM
- **Language(s) (NLP):** English
- **License:** Apache-2.0
- **Finetuned from model:** Phi3.1-mini

### Model Sources

- **Repository:** TBD
- **Demo:** TBD

### Quantization

## Usage

Below are some code snippets to help you get started quickly with running the model.

### llama.cpp server [Recommended]

The recommended way of running the model is with a llama.cpp server running the quantized model.

Once the server is running, you can use the following script to send inference requests to it:

```python
import json
import requests

def llmCall(messages, **args):
    # OpenAI-compatible chat endpoint exposed by the llama.cpp server.
    url = "http://localhost:8080/v1/chat/completions"
    headers = {
        "Content-Type": "application/json"
    }
    data = {
        'messages': messages
    }
    # Forward any extra sampling options (max_tokens, temperature, ...).
    data.update(args)
    response = requests.post(url, headers=headers, json=data)
    return response.json()

def analyze_argument(argument):
    instruction = "Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity."
    inputText = "### Input:\n" + argument
    prompt = f"{instruction}\n{inputText}\n"
    messages = [{"role": "user", "content": prompt}]
    # JSON schema used to constrain the server's output to the expected fields.
    properties = {
        "Premise 1": {"type": "string"},
        "Premise 2": {"type": "string"},
        "Conclusion": {"type": "string"},
        "Type of argument": {"type": "string"},
        "Proposition 1": {"type": "string"},
        "Proposition 2": {"type": "string"},
        "Negation of Proposition 1": {"type": "string"},
        "Negation of Proposition 2": {"type": "string"},
        "Validity": {"type": "string"},
    }
    analysis = llmCall(
        messages=messages,
        max_tokens=1000,
        temperature=0,
        stop=["<|end|>"],
        response_format={
            "type": "json_object",
            "schema": {
                "type": "object",
                "properties": properties,
                "required": list(properties.keys()),
            },
        }
    )['choices'][0]['message']['content']
    # Strip the end-of-turn token if the server returned it.
    if analysis.endswith("<|end|>"):
        analysis = analysis[:-len("<|end|>")]
    return analysis

argument = "If it's wednesday it's cold, and it's cold, therefore it's wednesday."
output = analyze_argument(argument)
print(output)
```

Output:

```
{"Premise 1": "If it's wednesday it's cold", "Premise 2": "It's cold", "Conclusion": "It is Wednesday", "Proposition 1": "It is Wednesday", "Proposition 2": "It is cold", "Type of argument": "affirming the consequent", "Negation of Proposition 1": "It is not Wednesday", "Negation of Proposition 2": "It is not cold", "Validity": true}
```

### transformers 🤗

First make sure to run `pip install -U transformers`, then use the code below, replacing the `argument` variable with the argument you want to parse:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("cris177/Phi3.1-Simple-Arguments", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("cris177/Phi3.1-Simple-Arguments")

argument = "If it's wednesday it's cold, and it's cold, therefore it's wednesday."
instruction = 'Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity.'

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:"""

prompt = alpaca_prompt.format(instruction, argument)
# Move the inputs to whichever device device_map="auto" placed the model on.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=1000, num_return_sequences=1)
print(tokenizer.decode(outputs[0]))
```

Output:

```
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity.

### Input:
If it's wednesday it's cold, and it's cold, therefore it's wednesday.

### Response:
{"Premise 1": "If it's wednesday it's cold", "Premise 2": "It's cold", "Conclusion": "It is Wednesday", "Proposition 1": "It is Wednesday", "Proposition 2": "It is cold", "Type of argument": "affirming the consequent", "Negation of Proposition 1": "It is not Wednesday", "Negation of Proposition 2": "It is not cold", "Validity": "false"}<|endoftext|>
```

## Training Details

### Training Data

The model was trained on synthetic data, based on the following types of arguments:

- Modus Ponens
- Modus Tollens
- Affirming the Consequent
- Disjunctive Syllogism
- Denying the Antecedent
- Invalid Conditional Syllogism

Each argument was constructed by selecting two random propositions (from a list of 400 propositions generated beforehand), choosing a type of argument, and combining them with randomly selected connectors (therefore, since, hence, thus, etc.).
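
The construction described above could be sketched roughly as follows. This is not the actual generation script: the proposition list, the connector subset, the sentence template, and the names `PROPOSITIONS`, `CONNECTORS`, `build_form`, and `generate_argument` are all hypothetical. Only five of the six argument types are covered, because the exact form used for "Invalid Conditional Syllogism" is not specified here.

```python
import random

# Hypothetical stand-in for the list of 400 propositions,
# stored as (statement, negation) pairs.
PROPOSITIONS = [
    ("it is raining", "it is not raining"),
    ("the street is wet", "the street is not wet"),
    ("it is cold", "it is not cold"),
    ("it is Wednesday", "it is not Wednesday"),
]

# Assumed subset of connectors that introduce a conclusion.
CONNECTORS = ["therefore", "hence", "thus"]

FORM_NAMES = [
    "modus ponens",
    "modus tollens",
    "affirming the consequent",
    "disjunctive syllogism",
    "denying the antecedent",
]


def build_form(name, p, not_p, q, not_q):
    """Return (premise 1, premise 2, conclusion, is_valid) for a named form."""
    forms = {
        "modus ponens": (f"if {p} then {q}", p, q, True),
        "modus tollens": (f"if {p} then {q}", not_q, not_p, True),
        "affirming the consequent": (f"if {p} then {q}", q, p, False),
        "disjunctive syllogism": (f"{p} or {q}", not_p, q, True),
        "denying the antecedent": (f"if {p} then {q}", not_p, not_q, False),
    }
    return forms[name]


def generate_argument(rng):
    """Build one labelled synthetic argument from two distinct propositions."""
    (p, not_p), (q, not_q) = rng.sample(PROPOSITIONS, 2)
    name = rng.choice(FORM_NAMES)
    premise1, premise2, conclusion, is_valid = build_form(name, p, not_p, q, not_q)
    connector = rng.choice(CONNECTORS)
    # Capitalize only the first character so proper nouns keep their case.
    text = f"{premise1[0].upper()}{premise1[1:]}, and {premise2}, {connector} {conclusion}."
    return {
        "argument": text,
        "Premise 1": premise1,
        "Premise 2": premise2,
        "Conclusion": conclusion,
        "Proposition 1": p,
        "Proposition 2": q,
        "Type of argument": name,
        "Negation of Proposition 1": not_p,
        "Negation of Proposition 2": not_q,
        "Validity": is_valid,
    }


if __name__ == "__main__":
    print(generate_argument(random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the sketch reproducible; the label keys mirror the fields the model outputs, but the real training labels may have been formatted differently.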
We created 50k arguments to train the model and 100 to test it.

### Training Procedure

#### Training

We used Unsloth to reduce memory usage and speed up training.

We trained for one epoch.

Training used less than 3.5 GB of VRAM and took 3 hours.

## Evaluation

The model obtains 100% train and test accuracy on our synthetic dataset.
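
How accuracy was computed is not described here. As one possible reading, the sketch below scores a prediction as correct only when its parsed JSON matches the reference on every field. The function name `exact_match_accuracy` and the all-fields matching rule are assumptions, not the evaluation actually used.

```python
import json


def exact_match_accuracy(predictions, references):
    """Fraction of raw JSON predictions that equal their reference dict.

    Assumed scoring rule: a prediction counts only if it parses as JSON
    and every field matches the reference exactly.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = 0
    for raw, expected in zip(predictions, references):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unparseable output counts as wrong
        if parsed == expected:
            correct += 1
    return correct / len(references)


if __name__ == "__main__":
    preds = ['{"Validity": "false"}', "not json"]
    refs = [{"Validity": "false"}, {"Validity": "true"}]
    print(exact_match_accuracy(preds, refs))
```

A stricter or looser rule (for example, per-field accuracy or case-insensitive comparison) would give different numbers, so the scoring rule should be stated alongside any reported figure.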