{"cells":[{"cell_type":"markdown","metadata":{},"source":["
This notebook analyzes the outputs of OpenHermes-2.5-Mistral-7B to identify the model's weaknesses.
\n","Next, we prepare a dataset of contexts paired with reference responses. To build it, we extracted 59 pages from our PDF and passed them to Gemini, asking it to return each response in JSON format. We then validated the responses Gemini returned.
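
The validation step itself is not shown in this notebook. A minimal sketch of what such a check could look like, assuming the key names seen in the example response printed further down; `check_response` is a hypothetical helper, not the procedure that was actually used:

```python
import json

# Key names taken from the example response printed in this notebook.
TOP_KEYS = {'GeographicContext', 'SubGeographicContext', 'Channel',
            'RateType', 'Notes', 'Rates'}
DETAIL_KEYS = {'FeeTier', 'IRD', 'Rate'}


def check_response(line):
    # Hypothetical helper: return a list of problems found in one JSONL line.
    try:
        response = json.loads(line)
    except json.JSONDecodeError as err:
        return ['invalid JSON: ' + str(err)]
    if not isinstance(response, dict):
        return ['top-level value is not an object']
    if set(response) == {'message'}:
        return []  # page for which no rate data was extracted
    problems = ['missing key: ' + key for key in sorted(TOP_KEYS - set(response))]
    rates = response.get('Rates', [])
    if not isinstance(rates, list):
        return problems + ['Rates is not a list']
    for rate in rates:
        if not isinstance(rate, dict):
            problems.append('rate entry is not an object')
            continue
        if 'PaymentProduct' not in rate:
            problems.append('rate entry without PaymentProduct')
        for detail in rate.get('Details', []):
            missing = DETAIL_KEYS - set(detail)
            if missing:
                problems.append('detail missing: ' + ', '.join(sorted(missing)))
    return problems
```

Running this over every line of the JSONL file and printing the non-empty results would flag the pages that need a manual look. It only checks structure; whether the extracted rates match the page still has to be verified by reading the page.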
\n"]},{"cell_type":"code","execution_count":19,"metadata":{"execution":{"iopub.execute_input":"2024-03-27T09:59:30.878562Z","iopub.status.busy":"2024-03-27T09:59:30.878182Z","iopub.status.idle":"2024-03-27T09:59:31.037100Z","shell.execute_reply":"2024-03-27T09:59:31.036038Z","shell.execute_reply.started":"2024-03-27T09:59:30.878528Z"},"trusted":true},"outputs":[],"source":["import json\n","import pandas as pd\n","\n","with open(\"/kaggle/input/mc-eurpen/mc_EuropeInterchangeManual_Customer (2).txt\", 'r', encoding='utf-8') as file:\n","    content = file.read()\n","\n","# This line appears once per page, so splitting on it yields one chunk per page.\n","token = 'Interchange and Service Fees Manual: Europe Region • 12 September 2023'\n","paginated_doc = content.split(token)"]},{"cell_type":"code","execution_count":20,"metadata":{"execution":{"iopub.execute_input":"2024-03-27T09:59:31.038635Z","iopub.status.busy":"2024-03-27T09:59:31.038313Z","iopub.status.idle":"2024-03-27T09:59:31.053395Z","shell.execute_reply":"2024-03-27T09:59:31.052632Z","shell.execute_reply.started":"2024-03-27T09:59:31.038609Z"},"trusted":true},"outputs":[],"source":["data = []\n","i = 40  # chunk index of the first labelled page (printed page 41)\n","with open(\"/kaggle/input/clean-data-for-fine-t/clean_data.jsonl\", encoding='utf-8') as file:\n","    for line in file:\n","        response = json.loads(line)\n","\n","        # Replace the short no-data message with the longer wording.\n","        if response == {\"message\": \"Context lacks a Payment product\"}:\n","            response = {'message': 'Context lacks a Payment product,FeeTier and Rate'}\n","        context = paginated_doc[i]\n","        data.append({\"context\": context, \"response\": response})\n","        i += 1"]},{"cell_type":"code","execution_count":21,"metadata":{"execution":{"iopub.execute_input":"2024-03-27T09:59:31.054719Z","iopub.status.busy":"2024-03-27T09:59:31.054395Z","iopub.status.idle":"2024-03-27T09:59:31.061535Z","shell.execute_reply":"2024-03-27T09:59:31.060536Z","shell.execute_reply.started":"2024-03-27T09:59:31.054694Z"},"trusted":true},"outputs":[{"data":{"text/plain":["(59, 2)"]},"execution_count":21,"metadata":{},"output_type":"execute_result"}],"source":["df = 
pd.DataFrame(data)\n","df.shape"]},{"cell_type":"code","execution_count":22,"metadata":{"execution":{"iopub.execute_input":"2024-03-27T09:59:31.063421Z","iopub.status.busy":"2024-03-27T09:59:31.063101Z","iopub.status.idle":"2024-03-27T09:59:31.071585Z","shell.execute_reply":"2024-03-27T09:59:31.070484Z","shell.execute_reply.started":"2024-03-27T09:59:31.063376Z"},"trusted":true},"outputs":[{"name":"stdout","output_type":"stream","text":[" 41 \n"," \n"," \n"," \n"," \n"," \n"," Global program rates \n"," \n"," \n"," \n"," \n"," \n"," \n"," IRD and program name Product code Rate (USD) \n"," \n"," BB MBS: Mastercard B2B Product 1 2.00% + USD 0.00 \n"," \n"," Commercial Business-to-Business MBA: Mastercard B2B Product 2 1.80% + USD 0.00 \n"," \n"," MBG: Mastercard B2B Product 3 1.60% + USD 0.00 \n"," MBH: Mastercard B2B Product 4 1.40% + USD 0.00 \n"," \n"," MBI: Mastercard B2B Product 5 1.20% + USD 0.00 \n"," \n"," MBJ: Mastercard B2B Product 6 1.00% + USD 0.00 \n"," \n"," MTA: Mastercard B2B Product 7 2.00% + USD 0.00 \n"," \n"," MTB: Mastercard B2B Product 8 1.90% + USD 0.00 \n"," \n"," MTC: Mastercard B2B Product 9 1.80% + USD 0.00 \n"," \n"," MTD: Mastercard B2B Product 10 1.70% + USD 0.00 \n"," \n"," MTE: Mastercard B2B Product 11 1.60% + USD 0.00 \n"," \n"," MTF: Mastercard B2B Product 12 1.50% + USD 0.00 \n"," \n"," MTG: Mastercard B2B Product 13 1.40% + USD 0.00 \n"," MTH: Mastercard B2B Product 14 1.30% + USD 0.00 \n"," \n"," MTI: Mastercard B2B Product 15 1.20% + USD 0.00 \n"," \n"," MTJ: Mastercard B2B Product 16 1.10% + USD 0.00 \n"," \n"," MTK: Mastercard B2B Product 17 1.00% + USD 0.00 \n"," \n"," MTL: Mastercard B2B Product 18 Rate to be announced \n"," \n"," MTM: Mastercard B2B Product 19 Rate to be announced \n"," \n"," MTN: Mastercard B2B Product 20 Rate to be announced \n"," \n"," MTO: Mastercard B2B Product 21 Rate to be announced \n"," \n"," MTQ: Mastercard B2B Product 22 Rate to be announced \n"," \n"," MTR: Mastercard B2B Product 23 Rate to be 
announced \n"," MTS: Mastercard B2B Product 24 Rate to be announced \n"," \n"," MTT: Mastercard B2B Product 25 Rate to be announced \n"," \n"," MTU: Mastercard B2B Product 26 Rate to be announced \n"," \n"," MTV: Mastercard B2B Product 27 Rate to be announced \n"," \n"," \n"," \n"," NOTE: Product codes MTA, MTB, MTC, MTD, MTE, MTF, MTG, MTH, MTI, MTJ, MTK, MTL, MTM, MTN, MTO, \n"," MTG, MTR, MTS, MTT, MTU, and MTV are effective globally except for the Canada region and Brazil. These \n"," product codes will be effective in the Canada region and Brazil in Release 23.Q4. \n"," \n"," \n"," \n"," \n"," ©1999–2023 Mastercard. Proprietary. All rights reserved. \n"," \n"," \n"]}],"source":["print(df.loc[0,'context'])"]},{"cell_type":"code","execution_count":23,"metadata":{"execution":{"iopub.execute_input":"2024-03-27T09:59:31.073113Z","iopub.status.busy":"2024-03-27T09:59:31.072834Z","iopub.status.idle":"2024-03-27T09:59:31.081256Z","shell.execute_reply":"2024-03-27T09:59:31.080221Z","shell.execute_reply.started":"2024-03-27T09:59:31.073089Z"},"trusted":true},"outputs":[{"name":"stdout","output_type":"stream","text":["{'message': 'Context lacks a Payment product,FeeTier and Rate'}\n"]}],"source":["print(df.loc[0,'response'])"]},{"cell_type":"code","execution_count":24,"metadata":{"execution":{"iopub.execute_input":"2024-03-27T09:59:31.082817Z","iopub.status.busy":"2024-03-27T09:59:31.082492Z","iopub.status.idle":"2024-03-27T09:59:31.091168Z","shell.execute_reply":"2024-03-27T09:59:31.090159Z","shell.execute_reply.started":"2024-03-27T09:59:31.082790Z"},"trusted":true},"outputs":[{"name":"stdout","output_type":"stream","text":[" 56 \n"," \n"," \n"," \n"," \n"," \n"," Intercountry fallback fee rates \n"," Intra-EEA Mastercard MoneySend funding transaction fallback service fee rates \n"," \n"," \n"," \n"," Payment product Fee tier IRD Fee rate \n"," \n"," Mastercard N/A Q6, Q7 1.65% \n"," \n"," BusinessCard/Mastercard \n"," Professional Card/ \n"," Mastercard Executive 
\n"," BusinessCard/Mastercard \n"," Corporate Executive Card \n"," \n"," Mastercard Electronic \n"," BusinessCard \n"," \n"," Debit Mastercard for \n"," Business \n"," \n"," \n"," Mastercard Purchasing N/A Q6, Q7 1.65% \n"," \n"," Mastercard Fleetcard N/A Q6, Q7 1.65% \n"," \n"," Mastercard Prepaid N/A Q6, Q7 1.65% \n"," Commercial \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," ©1999–2023 Mastercard. Proprietary. All rights reserved. \n"," \n"," \n"]}],"source":["print(df.loc[15,'context'])"]},{"cell_type":"code","execution_count":25,"metadata":{"execution":{"iopub.execute_input":"2024-03-27T09:59:31.092712Z","iopub.status.busy":"2024-03-27T09:59:31.092369Z","iopub.status.idle":"2024-03-27T09:59:31.100505Z","shell.execute_reply":"2024-03-27T09:59:31.099624Z","shell.execute_reply.started":"2024-03-27T09:59:31.092686Z"},"trusted":true},"outputs":[{"name":"stdout","output_type":"stream","text":["{'GeographicContext': 'Intercountry', 'SubGeographicContext': 'Intra-EEA', 'Channel': 'Mastercard MoneySend funding transaction', 'RateType': 'fallback service fee rates', 'Notes': [], 'Rates': [{'PaymentProduct': 'Mastercard BusinessCard/Mastercard Professional Card/Mastercard Executive BusinessCard/Mastercard Corporate Executive Card\\nMastercard Electronic BusinessCard\\nDebit Mastercard for Business', 'Details': [{'FeeTier': 'N/A', 'IRD': ['Q6', 'Q7'], 'Rate': '1.65%'}]}, {'PaymentProduct': 'Mastercard Purchasing', 'Details': [{'FeeTier': 'N/A', 'IRD': ['Q6', 'Q7'], 'Rate': '1.65%'}]}, {'PaymentProduct': 'Mastercard Fleetcard', 'Details': [{'FeeTier': 'N/A', 'IRD': ['Q6', 'Q7'], 'Rate': '1.65%'}]}, {'PaymentProduct': 'Mastercard Prepaid\\nCommercial', 'Details': [{'FeeTier': 'N/A', 'IRD': ['Q6', 'Q7'], 'Rate': 
'1.65%'}]}]}\n"]}],"source":["print(df.loc[15,'response'])"]},{"cell_type":"markdown","metadata":{},"source":["\n","#