---
tags:
- autotrain
- text-generation-inference
- text-generation
- peft
library_name: transformers
base_model: allenai/Llama-3.1-Tulu-3-8B
widget:
  - messages:
      - role: user
        content: What are the requirements for cross-examination according to Indian law?
license: other
---

# InLawMate-peft: Indian Legal Domain PEFT Model

## Model Description

InLawMate-peft is a Parameter-Efficient Fine-Tuned (PEFT) language model optimized for understanding and reasoning about Indian legal documentation. It was trained on a curated dataset of nearly 7,000 question-answer pairs derived from Indian criminal law documentation, which targets it at legal comprehension and explanation tasks.

## Training Data

The training data consists of nearly 7,000 legal Q&A pairs generated in a two-stage process:

1. **Question Generation**: Questions were extracted to cover key legal concepts, definitions, procedures, and roles, ensuring coverage of:
   - Legal terminology and definitions
   - Procedural rules and steps
   - Rights and penalties
   - Jurisdictional aspects
   - Roles of legal entities (judges, lawyers, law enforcement)

2. **Answer Generation**: Answers were crafted following a structured legal reasoning approach, ensuring:
   - Legal precision and accuracy
   - Comprehensive coverage of relevant points
   - Clear explanation of legal concepts
   - Professional legal discourse style

## Training Details

- **Base Model**: allenai/Llama-3.1-Tulu-3-8B
- **Architecture**: PEFT (Parameter-Efficient Fine-Tuning)
- **Training Epochs**: 3
- **Batch Size**: 2 (with gradient accumulation steps of 4)
- **Learning Rate**: 3e-05 with cosine scheduler
- **Sequence Length**: 1024 tokens
- **Mixed Precision**: BF16
- **Optimization**: AdamW with β1=0.9, β2=0.999

## Use Cases

This model is particularly suited for:

- Legal document analysis and comprehension
- Answering questions about Indian criminal law
- Understanding legal procedures and requirements
- Explaining legal concepts and terminology
- Assisting in legal research and education

## Limitations

- The model is specifically trained on Indian legal documentation
- Responses should be verified by legal professionals for critical applications
- The model should not be used as a substitute for professional legal advice

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "aryaman/legalpara-lm",
    device_map="auto",
    torch_dtype="auto",
).eval()
tokenizer = AutoTokenizer.from_pretrained("Aryaman02/InLawMate-peft")

# Example legal query
messages = [
    {"role": "user", "content": "What are the requirements for cross-examination according to Indian law?"}
]

# Build the prompt with the tokenizer's chat template
input_ids = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
)

# Send the inputs to the device chosen by device_map rather than assuming CUDA.
# max_new_tokens=512 is an arbitrary cap; adjust it to the answer length you need.
output_ids = model.generate(input_ids.to(model.device), max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{legalpara-lm,
  title={InLawMate: A PEFT Model for Indian Legal Domain Understanding},
  year={2024},
  publisher={Aryaman},
  note={Model trained on Indian legal documentation}
}
```

Our training data and the procedure for synthetic data creation are outlined at https://github.com/DarryCrucian/law-llm
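
The hyperparameters listed under Training Details imply an effective batch size and a rough number of optimizer steps. The sketch below works these out and evaluates a cosine learning-rate schedule at a few points. It assumes a single device, exactly 7,000 training examples (the card says "nearly 7,000"), no warmup, and decay to zero; none of these are stated in the card, and the function names are hypothetical.

```python
import math

# Values taken from the Training Details section
PER_DEVICE_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
EPOCHS = 3
PEAK_LEARNING_RATE = 3e-5

# Assumptions, not stated in the card
NUM_DEVICES = 1
NUM_EXAMPLES = 7000  # the card says "nearly 7,000"


def effective_batch_size() -> int:
    """Examples consumed per optimizer update."""
    return PER_DEVICE_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def total_optimizer_steps() -> int:
    """Optimizer updates over all epochs, rounding each epoch up."""
    steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch_size())
    return steps_per_epoch * EPOCHS


def cosine_lr(step: int, total_steps: int) -> float:
    """Cosine decay from the peak rate to zero, assuming no warmup."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return PEAK_LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = total_optimizer_steps()
    print(f"effective batch size: {effective_batch_size()}")
    print(f"optimizer steps: {total}")
    for step in (0, total // 2, total):
        print(f"step {step}: lr = {cosine_lr(step, total):.2e}")
```

Under these assumptions each update sees 8 examples and training runs for 2,625 updates. A different device count, dataset size, or warmup setting would change both figures.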