{ "cells": [ { "cell_type": "markdown", "id": "30fbad16", "metadata": {}, "source": [ "# Derin Öğrenme Classification ile Tweet Duygu Analizi - Twitter Sentiment Analysis with Deep Learning Classification" ] }, { "cell_type": "markdown", "id": "cfd788f1", "metadata": {}, "source": [ "TR = Her yorum satırı kendisinin üstündeki koda aittir. Yorumlar önce Türkçe, sonra İngilizce yazıldı.\n", "\n", "EN = Each comment line refers to the code directly above it. Comments are written first in Turkish, then in English.\n", "\n", "TR = Bu proje, XXXIV SEMAC kapsamında düzenlenen hackathon için tweet verilerini kullanarak yapay zeka ve derin öğrenme ile duygu analizi yapmayı amaçlamaktadır. \n", "\n", "EN = This project performs sentiment analysis on tweet data with artificial intelligence and deep learning, for the hackathon organized as part of XXXIV SEMAC.\n", "\n", "Kaynak/Source = https://www.kaggle.com/competitions/hackathon-semac-xxxiv" ] }, { "cell_type": "code", "execution_count": 1, "id": "40662c51", "metadata": {}, "outputs": [], "source": [ "#pip install autocorrect" ] }, { "cell_type": "code", "execution_count": 2, "id": "a76b7e1a", "metadata": {}, "outputs": [], "source": [ "#pip install langdetect" ] }, { "cell_type": "code", "execution_count": 3, "id": "07ab59ab", "metadata": {}, "outputs": [], "source": [ "#pip install googletrans==4.0.0-rc1" ] }, { "cell_type": "code", "execution_count": 4, "id": "dcd60c12", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import nltk\n", "import time\n", "import math\n", "import warnings\n", "warnings.filterwarnings('ignore') \n", "import re\n", "import pickle\n", "\n", "from sklearn.feature_extraction.text import CountVectorizer\n", "from autocorrect import spell\n", "from textblob import TextBlob\n", "from langdetect import detect\n", "from googletrans import Translator\n", "from PIL import Image\n", "from 
wordcloud import WordCloud, STOPWORDS\n", "\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.preprocessing import StandardScaler,MinMaxScaler\n", "from tensorflow.keras.models import Sequential\n", "from tensorflow.keras.layers import Dense, Flatten, Dropout, BatchNormalization\n", "from tensorflow.keras.callbacks import EarlyStopping\n", "from sklearn.metrics import accuracy_score, confusion_matrix, classification_report\n", "from sklearn.feature_extraction.text import TfidfVectorizer" ] }, { "cell_type": "code", "execution_count": 5, "id": "60ffc4ad", "metadata": {}, "outputs": [], "source": [ "pd.set_option(\"display.max_columns\",None) \n", "# TR = Sütun sayısı sınırını kaldırır; bir DataFrame yazdırılırken tüm sütunlar gösterilir. \n", "# EN = Removes the column display limit so that all columns are shown when a DataFrame is printed." ] }, { "cell_type": "code", "execution_count": 6, "id": "9682d01c", "metadata": {}, "outputs": [], "source": [ "df=pd.read_csv('train.csv')" ] }, { "cell_type": "markdown", "id": "ff71735e", "metadata": {}, "source": [ "## Keşif Amaçlı Veri Analizi (EDA) - Exploratory Data Analysis (EDA)" ] }, { "cell_type": "code", "execution_count": 7, "id": "9bcc95ef", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ID text feeling\n", "0 1.0 J@lisastarlyRnEI haven't heard anytVing. I'll ... 0.0\n", "1 2.0 Just grabbed some bagelLt from Pdnera for ever... 0.0\n", "2 3.0 @poepjaqndzegiant oopY just saw you said hello... 1.0\n", "3 4.0 @kirsty_gilfo yuj! That's the one MmL....yuom... 1.0\n", "4 5.0 pMum's off to bed... NipUtuck time noow 1.0" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "code", "execution_count": 8, "id": "f16c65ff", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ID text feeling\n", "3767 3768.0 AwwwVw Hnd of LScrubs! What a shOow 1.0\n", "2765 2766.0 I noudAfollow Robert 1.0\n", "2605 2606.0 @ellencg YAY Congrats! TNis seeGms to have ... 1.0\n", "15152 15153.0 @philrox checEk www.myspace.cNm/Bjulienk dfor ... 1.0\n", "1724 1725.0 torway didn't dOeserve alll the votes Ithey go... 0.0" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.sample(5)" ] }, { "cell_type": "code", "execution_count": 9, "id": "e1bb19b0", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ID text feeling\n", "19069 19996.0 @BrennaCeDrUia That sucks. I'm Uso sorry Qto h... 0.0\n", "19070 19997.0 dNeed to shll it for financial reasons bux it'... 0.0\n", "19071 19998.0 p@curtsmith LLok chilly - I Ithink I woupld st... 1.0\n", "19072 19999.0 I'm grarduating today!L Four yearsSsweet too ... 1.0\n", "19073 20000.0 @mickeymab Gmine's inpmy prPofile - 'e77cb550 ... 1.0" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.tail()" ] }, { "cell_type": "code", "execution_count": 10, "id": "4074c28b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(19074, 3)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.shape" ] }, { "cell_type": "code", "execution_count": 11, "id": "68968c29", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 19074 entries, 0 to 19073\n", "Data columns (total 3 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 ID 18429 non-null float64\n", " 1 text 18429 non-null object \n", " 2 feeling 18429 non-null float64\n", "dtypes: float64(2), object(1)\n", "memory usage: 447.2+ KB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "code", "execution_count": 12, "id": "668426e1", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "ID 645\n", "text 645\n", "feeling 645\n", "dtype: int64" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.isnull().sum().sort_values(ascending=False)" ] }, { "cell_type": "raw", "id": "9f4ad630", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "8bef292f", "metadata": {}, "source": [ "## Gereksiz Verileri Silme İşlemi Yapıyoruz - We Delete Unnecessary Data" ] }, { "cell_type": "code", "execution_count": 13, "id": "717ea31e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " text feeling\n", "0 J@lisastarlyRnEI haven't heard anytVing. I'll ... 0.0" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df=df.drop('ID',axis=1)\n", "df.head(1)" ] }, { "cell_type": "raw", "id": "218e962e", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "b5e5be78", "metadata": {}, "source": [ "## Sütun İsimlerini Orijinal, Türkçe ve İngilizce Versiyonda Göster - Show the Column Names in the Original, Turkish, and English Versions" ] }, { "cell_type": "code", "execution_count": null, "id": "32d8f819", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "raw", "id": "1a468b7f", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "805ba6e6", "metadata": {}, "source": [ "## Boşluk Varsa Doldurma, Düzeltilecek Kısım Varsa Düzeltme - Filling Missing Values and Correcting Entries Where Needed" ] }, { "cell_type": "code", "execution_count": 14, "id": "c605af79", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Sütun: text\n", "Nunique Değerler: 18429\n", "Unique Değerler: [\"J@lisastarlyRnEI haven't heard anytVing. I'll tweet you as aoon as I hear. I'mvQreallyxworrxied actually\"\n", " \"Just grabbed some bagelLt from Pdnera for everyone at wk. It's Brittany's last day\"\n", " '@poepjaqndzegiant oopY just saw you said helloM! Hi there' ...\n", " 'p@curtsmith LLok chilly - I Ithink I woupld stay in and read a book'\n", " \"I'm grarduating today!L Four yearsSsweet too fast.\"\n", " \"@mickeymab Gmine's inpmy prPofile - 'e77cb550 and hectoH's is a 'J2 bmw q75/5 there's Fmore photWos oh Kmy fb. 
checck out alabama roadtrip!\"]\n", "\n", "Sütun: feeling\n", "Nunique Değerler: 2\n", "Unique Değerler: [0.0, 1.0]\n" ] } ], "source": [ "for column in df.columns:\n", "    # TR = Verideki her bir sütun için döngü başlatılıyor \n", "    # EN = Loops through each column in the dataframe\n", "\n", "    print(f\"\\nSütun: {column}\") \n", "    # TR = Sütun ismi ekrana yazdırılıyor \n", "    # EN = Prints the column name\n", "\n", "    unique = df[column].dropna().unique() \n", "    # TR = NaN değerleri düşürerek benzersiz değerler elde ediliyor \n", "    # EN = Gets the unique values after dropping NaN values\n", "\n", "    if pd.api.types.is_numeric_dtype(df[column]): \n", "        # TR = Eğer sütundaki değerler sayısal ise, değerler sıralanıyor \n", "        # EN = If the column is of numeric type, the unique values are sorted\n", "        unique = sorted(unique)\n", "\n", "    nunique = len(unique)\n", "    # TR = Benzersiz değerlerin sayısını hesaplar \n", "    # EN = Calculates the number of unique values\n", "\n", "    print(f\"Nunique Değerler: {nunique}\") \n", "    # TR = Benzersiz değerlerin sayısını ekrana yazdırır \n", "    # EN = Prints the number of unique values\n", "\n", "    print(f\"Unique Değerler: {unique}\") \n", "    # TR = Benzersiz değerler ekrana yazdırılıyor \n", "    # EN = Prints the unique values" ] }, { "cell_type": "code", "execution_count": 15, "id": "2f536dd1", "metadata": {}, "outputs": [], "source": [ "df=df.dropna()" ] }, { "cell_type": "code", "execution_count": 16, "id": "c0a6f1b6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"J@lisastarlyRnEI haven't heard anytVing. I'll tweet you as aoon as I hear. 
I'mvQreallyxworrxied actually\"" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['text'][0]" ] }, { "cell_type": "code", "execution_count": 17, "id": "85cc0e32", "metadata": {}, "outputs": [], "source": [ "def algo_text(df):\n", "\n", "    for col in df.columns:\n", "        if df[col].dtype=='object':\n", "            df[col] = df[col].str.lower()\n", "            df[col] = df[col].str.replace(r'[^\\w\\s]', '', regex=True)\n", "            df[col] = df[col].str.replace('\\n', '', regex=False)\n", "            df[col] = df[col].str.replace(r'\\d', '', regex=True)\n", "            df[col] = df[col].str.replace('\\r', '', regex=False)\n", "            df[col] = df[col].str.replace('.', '', regex=False)\n", "            df[col] = df[col].str.replace(',', '', regex=False)\n", "    return df\n", "    # TR = Bu fonksiyon veri tipi object olan sütunları bulur, küçük harfe çevirir ve noktalama işaretlerini, rakamları ve satır sonlarını kaldırır.\n", "    # EN = This function finds the columns of dtype object, lowercases them, and removes punctuation, digits, and line breaks.\n", "    # TR = regex parametresi açıkça verildi, çünkü pandas 2.0 ile varsayılan değer regex=False oldu ve '\\d' deseni aksi halde düz metin olarak aranır.\n", "    # EN = The regex argument is passed explicitly because the default became regex=False in pandas 2.0, where the '\\d' pattern would otherwise be matched literally.\n", "\n", "df=algo_text(df)" ] }, { "cell_type": "code", "execution_count": 18, "id": "4640e596", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " text feeling\n", "0 jlisastarlyrnei havent heard anytving ill twee... 0.0\n", "1 just grabbed some bagellt from pdnera for ever... 0.0\n", "2 poepjaqndzegiant oopy just saw you said hellom... 1.0\n", "3 kirsty_gilfo yuj thats the one mmlyuom cha uh... 1.0\n", "4 pmums off to bed niputuck time noow 1.0" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "code", "execution_count": 19, "id": "473bbd17", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'jlisastarlyrnei havent heard anytving ill tweet you as aoon as i hear imvqreallyxworrxied actually'" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['text'][0]" ] }, { "cell_type": "code", "execution_count": 20, "id": "58bf111b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{92.99062705039978} saniye\n" ] } ], "source": [ "def detect_veri(reviewText):\n", "    try:\n", "        return detect(reviewText)\n", "    except Exception:\n", "        return 'unknown'\n", "    # TR = Text sütunundaki verilerin hangi dilde yazıldığını tespit eder; tespit başarısız olursa 'unknown' döndürür.\n", "    # EN = Detects the language in which each entry of the text column is written; returns 'unknown' if detection fails.\n", "\n", "start_time = time.time()\n", "# TR = start_time adında bir değişken tanımla, time.time() ile şimdiki zamanı al.\n", "# EN = Define a variable called start_time and store the current time from time.time().\n", "\n", "df['language']=df['text'].apply(detect_veri)\n", "# TR = language adında yeni bir sütun oluşturur. Bu sütun, text sütunundaki her değere .apply(detect_veri) uygulanarak bulunan dil kodlarını içerir.\n", "# EN = Creating a new column called language. 
Its values are the language codes that detect_veri returns for each entry of the text column, applied with .apply(detect_veri).\n", "\n", "end_time = time.time()\n", "# TR = end_time adında bir değişken tanımla, time.time() ile şimdiki zamanı al.\n", "# EN = Define a variable called end_time and store the current time from time.time().\n", "\n", "\n", "elapsed_time = end_time - start_time\n", "# TR = Dil tespitinin ne kadar sürdüğünü bulmak için end_time'dan start_time'ı çıkar.\n", "# EN = Subtract start_time from end_time to find how long language detection took.\n", "print(f\"{elapsed_time} saniye\")" ] }, { "cell_type": "code", "execution_count": 21, "id": "a36003f1", "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sns.countplot(x=df['language']);" ] }, { "cell_type": "code", "execution_count": 22, "id": "24758341", "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df['language'].value_counts().plot.pie(autopct='%1.1f%%');" ] }, { "cell_type": "code", "execution_count": 23, "id": "d93cc636", "metadata": {}, "outputs": [], "source": [ "df = df[df['language'] == 'en'].copy()\n", "# TR = language sütunundaki verilerin 'en' değerine eşit olduğu satırları seçer. .copy() ile bağımsız bir kopya alınır.\n", "# EN = Selects the rows whose 'language' value equals 'en'. .copy() returns an independent copy rather than a view of the original frame.\n", "\n", "df.drop(index=df[df['language'] != 'en'].index, inplace=True)\n", "# TR = language sütununda 'en' değerine eşit olmayan satırları siler. Yukarıdaki filtreden sonra böyle bir satır kalmadığı için bu satır bir şey değiştirmez.\n", "# EN = Deletes the rows whose 'language' value is not 'en'. After the filter above no such rows remain, so this line changes nothing.\n" ] }, { "cell_type": "raw", "id": "35aba6be", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "ac4c456e", "metadata": {}, "source": [ "## Duygu Analizi - Sentiment Analysis" ] }, { "cell_type": "markdown", "id": "82e78eaf", "metadata": {}, "source": [ "### Yorumların Olumlu mu ya da Olumsuz mu Olduğunu Tespit Etme - Determining Whether Comments Are Positive or Negative" ] }, { "cell_type": "code", "execution_count": 24, "id": "b848fb42", "metadata": {}, "outputs": [], "source": [ "df['sentiment']=df['feeling']\n", "df['sentiment']=df['sentiment'].replace([1],'olumlu')\n", "df['sentiment']=df['sentiment'].replace([0],'olumsuz')\n", "# TR = sentiment adında yeni bir sütun oluşturup feeling sütunundaki verilere eşitledik. 1 yerine 'olumlu', 0 yerine 'olumsuz' yazdık.\n", "# EN = We created a new column called sentiment and set it equal to the feeling column, then replaced 1 with 'olumlu' (positive) and 0 with 'olumsuz' (negative)." ] }, { "cell_type": "code", "execution_count": 25, "id": "9a890752", "metadata": {}, "outputs": [], "source": [ "# TR = Şablon: df=df[['a','b','sentiment']]; a yerine puan ya da yıldız gibi değeri belirten sayı sütunu, b yerine metnin bulunduğu sütunun adı yazılır.\n", "# EN = Template: df=df[['a','b','sentiment']], where a is the numeric rating column (such as a score or stars) and b is the name of the text column.\n", "\n", "df=df[['feeling','text','sentiment']]" ] }, { "cell_type": "code", "execution_count": 26, "id": "980e88ee", "metadata": {}, "outputs": [], "source": [ "df=df[(df['sentiment']=='olumlu')|(df['sentiment']=='olumsuz')]\n", "# TR = Yalnızca sentiment değeri 'olumlu' ya da 'olumsuz' olan satırları tuttuk.\n", "# EN = We keep only the rows whose sentiment is 'olumlu' (positive) or 'olumsuz' (negative)." ] }, { "cell_type": "code", "execution_count": 27, "id": "a4bb1b05", "metadata": {}, "outputs": [], "source": [ "df.reset_index(drop=True,inplace=True)\n", "# TR = Yukarıdaki filtreleme sonucunda satır indeksleri artık ardışık değil. Bu yüzden indeksleri sıfırlayıp yeniden verdik.\n", "# EN = After the filtering above, the row indexes are no longer consecutive, so we reset them and assign new ones." 
] }, { "cell_type": "code", "execution_count": 28, "id": "85e39add", "metadata": {}, "outputs": [], "source": [ "x=df['text']\n", "y=df['sentiment']" ] }, { "cell_type": "code", "execution_count": 29, "id": "562e7b7a", "metadata": {}, "outputs": [], "source": [ "yelpbw = df[(df.feeling == 0) | (df.feeling == 1)].copy()" ] }, { "cell_type": "code", "execution_count": 30, "id": "06d718a2", "metadata": {}, "outputs": [], "source": [ "yelpbw.reset_index(drop=True,inplace=True)" ] }, { "cell_type": "code", "execution_count": 31, "id": "50069d35", "metadata": {}, "outputs": [], "source": [ "vect=CountVectorizer(stop_words='english',ngram_range=(1,2))" ] }, { "cell_type": "code", "execution_count": 32, "id": "e27fd6f9", "metadata": {}, "outputs": [], "source": [ "x=yelpbw[\"text\"]\n", "y=yelpbw[\"feeling\"]" ] }, { "cell_type": "code", "execution_count": 33, "id": "ef9abe2d", "metadata": {}, "outputs": [], "source": [ "vect=CountVectorizer()\n", "x=vect.fit_transform(x)" ] }, { "cell_type": "code", "execution_count": 34, "id": "7c6aa2d7", "metadata": {}, "outputs": [], "source": [ "def lemmafn(Review):\n", "    words=TextBlob(Review).words\n", "    return [word.stem() for word in words]" ] }, { "cell_type": "raw", "id": "b67bb4b2", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "3b1f6160", "metadata": {}, "source": [ "## En Sık Geçen Kelimelerin Kelime Bulutu - Word Cloud of the Most Frequent Words" ] }, { "cell_type": "code", "execution_count": 35, "id": "792dd213", "metadata": {}, "outputs": [], "source": [ "#wc=wordcloud\n", "def wc(data,bgcolor):\n", "    plt.figure(figsize=(10,10))\n", "    # TR = Şeklin boyutunu belirttik.\n", "    # EN = We set the figure size.\n", "\n", "    mask=np.array(Image.open('cloud.png'))\n", "    # TR = Image.open ile resmimizi açtık. np.array ile resmi diziye çevirdik ve mask değişkenine atadık.\n", "    # EN = We opened the image with Image.open, converted it to an array with np.array, and assigned it to the mask variable.\n", "\n", "    wc=WordCloud(background_color=bgcolor,stopwords=STOPWORDS,mask=mask)\n", "    # TR = Bir WordCloud tanımladık. Arka plan rengini bgcolor'a eşitledik. stopwords=STOPWORDS ile gereksiz kelimeleri atıp anahtar kelimeleri sakladık.\n", "    # EN = We defined a WordCloud and set its background color to bgcolor. stopwords=STOPWORDS drops common filler words and keeps the keywords.\n", "\n", "    # TR = mask=mask yukarıda tanımladığımız mask değişkenini kullanır.\n", "    # EN = mask=mask uses the mask variable defined above.\n", "    wc.generate(' '.join(data))\n", "    # TR = ' '.join(data) sütundaki bütün metinleri aralarına boşluk koyarak birleştirir; boşluk olmazsa komşu tweetlerin son ve ilk kelimeleri birbirine yapışır.\n", "    # EN = ' '.join(data) concatenates all texts in the column with a space between them; without the space, the last and first words of neighbouring tweets would be glued together.\n", "\n", "    # TR = generate, metindeki kelimeleri sayar ve kelime bulutunu bu frekanslara göre oluşturur.\n", "    # EN = generate counts the words in the text and builds the word cloud from those frequencies.\n", "\n", "    plt.imshow(wc)\n", "    plt.axis('off')\n", "    # TR = Bu kod ile x ve y eksenleri gizlenir.\n", "    # EN = This hides the x and y axes.\n", "    " ] }, { "cell_type": "code", "execution_count": 36, "id": "8693a811", "metadata": {}, "outputs": [], "source": [ "olumlu=df[df['feeling']==1]['text']\n", "olumsuz=df[df['feeling']==0]['text']" ] }, { "cell_type": "code", "execution_count": 37, "id": "814a0338", "metadata": {}, "outputs": [ { "ename": "FileNotFoundError", "evalue": "[Errno 2] No such file or directory: 'C:\\\\Users\\\\ErenK\\\\OneDrive\\\\Belgeler\\\\Yapay Zeka\\\\Proje\\\\Natural Language Processing (NLP) 1\\\\tweet\\\\cloud.png'", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mFileNotFoundError\u001b[0m Traceback (most recent call last)", "Cell \u001b[1;32mIn[37], line 
1\u001b[0m\n\u001b[1;32m----> 1\u001b[0m wc(olumlu,\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mwhite\u001b[39m\u001b[38;5;124m'\u001b[39m)\n", "Cell \u001b[1;32mIn[35], line 7\u001b[0m, in \u001b[0;36mwc\u001b[1;34m(data, bgcolor)\u001b[0m\n\u001b[0;32m 3\u001b[0m plt\u001b[38;5;241m.\u001b[39mfigure(figsize\u001b[38;5;241m=\u001b[39m(\u001b[38;5;241m10\u001b[39m,\u001b[38;5;241m10\u001b[39m))\n\u001b[0;32m 4\u001b[0m \u001b[38;5;66;03m# TR = Kabımızın boyutunu belirttik.\u001b[39;00m\n\u001b[0;32m 5\u001b[0m \u001b[38;5;66;03m# EN = We specified the size of our container.\u001b[39;00m\n\u001b[1;32m----> 7\u001b[0m mask\u001b[38;5;241m=\u001b[39mnp\u001b[38;5;241m.\u001b[39marray(Image\u001b[38;5;241m.\u001b[39mopen(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mcloud.png\u001b[39m\u001b[38;5;124m'\u001b[39m))\n\u001b[0;32m 8\u001b[0m \u001b[38;5;66;03m# TR = Image.open ile resmimizi açtık. np.array resmi diziye çevirdik ve mask değişkenine atadık.\u001b[39;00m\n\u001b[0;32m 9\u001b[0m \u001b[38;5;66;03m# EN = We opened our image with Image.open. 
We converted the np.array image to an array and assigned it to the mask variable.\u001b[39;00m\n\u001b[0;32m 11\u001b[0m wc\u001b[38;5;241m=\u001b[39mWordCloud(background_color\u001b[38;5;241m=\u001b[39mbgcolor,stopwords\u001b[38;5;241m=\u001b[39mSTOPWORDS,mask\u001b[38;5;241m=\u001b[39mmask)\n", "File \u001b[1;32m~\\anaconda3\\Lib\\site-packages\\PIL\\Image.py:3277\u001b[0m, in \u001b[0;36mopen\u001b[1;34m(fp, mode, formats)\u001b[0m\n\u001b[0;32m 3274\u001b[0m filename \u001b[38;5;241m=\u001b[39m os\u001b[38;5;241m.\u001b[39mpath\u001b[38;5;241m.\u001b[39mrealpath(os\u001b[38;5;241m.\u001b[39mfspath(fp))\n\u001b[0;32m 3276\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m filename:\n\u001b[1;32m-> 3277\u001b[0m fp \u001b[38;5;241m=\u001b[39m builtins\u001b[38;5;241m.\u001b[39mopen(filename, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrb\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m 3278\u001b[0m exclusive_fp \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mTrue\u001b[39;00m\n\u001b[0;32m 3280\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n", "\u001b[1;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: 'C:\\\\Users\\\\ErenK\\\\OneDrive\\\\Belgeler\\\\Yapay Zeka\\\\Proje\\\\Natural Language Processing (NLP) 1\\\\tweet\\\\cloud.png'" ] }, { "data": { "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "wc(olumlu,'white')" ] }, { "cell_type": "code", "execution_count": null, "id": "8460aa30", "metadata": {}, "outputs": [], "source": [ "wc(olumsuz,'white')" ] }, { "cell_type": "code", "execution_count": null, "id": "1078157e", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "5bde2bcd", "metadata": {}, "source": [ "## Öznitelik Mühendisliği - Feature Engineering" ] }, { "cell_type": "markdown", "id": "412a27ea", "metadata": {}, "source": [ "### Model - Modelling" ] }, { "cell_type": "code", "execution_count": 38, "id": "621cd80d", "metadata": {}, "outputs": [], "source": [ "x=yelpbw[\"text\"]\n", "y=yelpbw[\"feeling\"]" ] }, { "cell_type": "code", "execution_count": 39, "id": "b0177e4f", "metadata": {}, "outputs": [], "source": [ "vectorizer = TfidfVectorizer(max_features=5000)\n", "x = vectorizer.fit_transform(x).toarray()" ] }, { "cell_type": "code", "execution_count": 40, "id": "cc95c2b0", "metadata": {}, "outputs": [ { "ename": "MemoryError", "evalue": "Unable to allocate 495. 
MiB for an array with shape (12983, 5000) and data type float64", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mMemoryError\u001b[0m Traceback (most recent call last)", "Cell \u001b[1;32mIn[40], line 1\u001b[0m\n\u001b[1;32m----> 1\u001b[0m x_train,x_test,y_train,y_test\u001b[38;5;241m=\u001b[39mtrain_test_split(x,y,test_size\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m.20\u001b[39m,random_state\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m42\u001b[39m)\n", "File \u001b[1;32m~\\anaconda3\\Lib\\site-packages\\sklearn\\utils\\_param_validation.py:213\u001b[0m, in \u001b[0;36mvalidate_params..decorator..wrapper\u001b[1;34m(*args, **kwargs)\u001b[0m\n\u001b[0;32m 207\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m 208\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m config_context(\n\u001b[0;32m 209\u001b[0m skip_parameter_validation\u001b[38;5;241m=\u001b[39m(\n\u001b[0;32m 210\u001b[0m prefer_skip_nested_validation \u001b[38;5;129;01mor\u001b[39;00m global_skip_validation\n\u001b[0;32m 211\u001b[0m )\n\u001b[0;32m 212\u001b[0m ):\n\u001b[1;32m--> 213\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m func(\u001b[38;5;241m*\u001b[39margs, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n\u001b[0;32m 214\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m InvalidParameterError \u001b[38;5;28;01mas\u001b[39;00m e:\n\u001b[0;32m 215\u001b[0m \u001b[38;5;66;03m# When the function is just a wrapper around an estimator, we allow\u001b[39;00m\n\u001b[0;32m 216\u001b[0m \u001b[38;5;66;03m# the function to delegate validation to the estimator, but we replace\u001b[39;00m\n\u001b[0;32m 217\u001b[0m \u001b[38;5;66;03m# the name of the estimator by the name of the function in the error\u001b[39;00m\n\u001b[0;32m 218\u001b[0m \u001b[38;5;66;03m# message to avoid confusion.\u001b[39;00m\n\u001b[0;32m 219\u001b[0m msg \u001b[38;5;241m=\u001b[39m 
re\u001b[38;5;241m.\u001b[39msub(\n\u001b[0;32m 220\u001b[0m \u001b[38;5;124mr\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mparameter of \u001b[39m\u001b[38;5;124m\\\u001b[39m\u001b[38;5;124mw+ must be\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[0;32m 221\u001b[0m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mparameter of \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mfunc\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__qualname__\u001b[39m\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m must be\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[0;32m 222\u001b[0m \u001b[38;5;28mstr\u001b[39m(e),\n\u001b[0;32m 223\u001b[0m )\n", "File \u001b[1;32m~\\anaconda3\\Lib\\site-packages\\sklearn\\model_selection\\_split.py:2810\u001b[0m, in \u001b[0;36mtrain_test_split\u001b[1;34m(test_size, train_size, random_state, shuffle, stratify, *arrays)\u001b[0m\n\u001b[0;32m 2806\u001b[0m train, test \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mnext\u001b[39m(cv\u001b[38;5;241m.\u001b[39msplit(X\u001b[38;5;241m=\u001b[39marrays[\u001b[38;5;241m0\u001b[39m], y\u001b[38;5;241m=\u001b[39mstratify))\n\u001b[0;32m 2808\u001b[0m train, test \u001b[38;5;241m=\u001b[39m ensure_common_namespace_device(arrays[\u001b[38;5;241m0\u001b[39m], train, test)\n\u001b[1;32m-> 2810\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mlist\u001b[39m(\n\u001b[0;32m 2811\u001b[0m chain\u001b[38;5;241m.\u001b[39mfrom_iterable(\n\u001b[0;32m 2812\u001b[0m (_safe_indexing(a, train), _safe_indexing(a, test)) \u001b[38;5;28;01mfor\u001b[39;00m a \u001b[38;5;129;01min\u001b[39;00m arrays\n\u001b[0;32m 2813\u001b[0m )\n\u001b[0;32m 2814\u001b[0m )\n", "File \u001b[1;32m~\\anaconda3\\Lib\\site-packages\\sklearn\\model_selection\\_split.py:2812\u001b[0m, in \u001b[0;36m\u001b[1;34m(.0)\u001b[0m\n\u001b[0;32m 2806\u001b[0m train, test \u001b[38;5;241m=\u001b[39m 
\u001b[38;5;28mnext\u001b[39m(cv\u001b[38;5;241m.\u001b[39msplit(X\u001b[38;5;241m=\u001b[39marrays[\u001b[38;5;241m0\u001b[39m], y\u001b[38;5;241m=\u001b[39mstratify))\n\u001b[0;32m 2808\u001b[0m train, test \u001b[38;5;241m=\u001b[39m ensure_common_namespace_device(arrays[\u001b[38;5;241m0\u001b[39m], train, test)\n\u001b[0;32m 2810\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mlist\u001b[39m(\n\u001b[0;32m 2811\u001b[0m chain\u001b[38;5;241m.\u001b[39mfrom_iterable(\n\u001b[1;32m-> 2812\u001b[0m (_safe_indexing(a, train), _safe_indexing(a, test)) \u001b[38;5;28;01mfor\u001b[39;00m a \u001b[38;5;129;01min\u001b[39;00m arrays\n\u001b[0;32m 2813\u001b[0m )\n\u001b[0;32m 2814\u001b[0m )\n", "File \u001b[1;32m~\\anaconda3\\Lib\\site-packages\\sklearn\\utils\\_indexing.py:267\u001b[0m, in \u001b[0;36m_safe_indexing\u001b[1;34m(X, indices, axis)\u001b[0m\n\u001b[0;32m 265\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m _polars_indexing(X, indices, indices_dtype, axis\u001b[38;5;241m=\u001b[39maxis)\n\u001b[0;32m 266\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28mhasattr\u001b[39m(X, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mshape\u001b[39m\u001b[38;5;124m\"\u001b[39m):\n\u001b[1;32m--> 267\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m _array_indexing(X, indices, indices_dtype, axis\u001b[38;5;241m=\u001b[39maxis)\n\u001b[0;32m 268\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m 269\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m _list_indexing(X, indices, indices_dtype)\n", "File \u001b[1;32m~\\anaconda3\\Lib\\site-packages\\sklearn\\utils\\_indexing.py:33\u001b[0m, in \u001b[0;36m_array_indexing\u001b[1;34m(array, key, key_dtype, axis)\u001b[0m\n\u001b[0;32m 31\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(key, \u001b[38;5;28mtuple\u001b[39m):\n\u001b[0;32m 32\u001b[0m key \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlist\u001b[39m(key)\n\u001b[1;32m---> 33\u001b[0m 
\u001b[38;5;28;01mreturn\u001b[39;00m array[key, \u001b[38;5;241m.\u001b[39m\u001b[38;5;241m.\u001b[39m\u001b[38;5;241m.\u001b[39m] \u001b[38;5;28;01mif\u001b[39;00m axis \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m0\u001b[39m \u001b[38;5;28;01melse\u001b[39;00m array[:, key]\n", "\u001b[1;31mMemoryError\u001b[0m: Unable to allocate 495. MiB for an array with shape (12983, 5000) and data type float64" ] } ], "source": [ "x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=.20,random_state=42)\n", "# TR = modelimizi eğittik. \n", "# EN = We trained our model." ] }, { "cell_type": "code", "execution_count": null, "id": "58a351e1", "metadata": {}, "outputs": [], "source": [ "model=Sequential()\n", "model.add(Dense(1024,activation='relu',input_dim=x_train.shape[1]))\n", "# TR = Bu katman, tüm giriş nöronlarına bağlantı kurar ve her nöronun ağırlıklarını öğrenir. 256 nöron var.\n", "# Aktivasyon fonksiyonunu ReLU (Rectified Linear Unit) olarak ayarlar. ReLU fonksiyonu, negatif değerleri sıfıra dönüştürür ve pozitif değerleri olduğu gibi bırakır.\n", "# EN = This layer connects all input neurons and learns the weights of each neuron. There are 256 neurons.\n", "# Sets the activation function to ReLU (Rectified Linear Unit). The ReLU function converts negative values ​​to zero and leaves positive values ​​as is.\n", "\n", "model.add(BatchNormalization())\n", "# TR = Bu katman, modelin eğitim sürecini daha stabil hale getirmek için kullanılır.\n", "# EN = This layer is used to make the training process of the model more stable.\n", "\n", "model.add(Dropout(0.3))\n", "# TR = Derin öğrenme modelinde aşırı uyumu (overfitting) azaltmak için kullanılır. Genelde 0.2 ile 0.5 arasında olur.\n", "# EN = It is used to reduce overfitting in the deep learning model. 
The dropout rate is generally between 0.2 and 0.5.\n", "\n", "model.add(Dense(128,activation='relu'))\n", "model.add(BatchNormalization())\n", "model.add(Dropout(0.3))\n", "\n", "model.add(Dense(64,activation='relu'))\n", "model.add(BatchNormalization())\n", "model.add(Dropout(0.3))\n", "\n", "model.add(Dense(32,activation='relu'))\n", "model.add(BatchNormalization())\n", "model.add(Dropout(0.3))\n", "\n", "model.add(Dense(16,activation='relu'))\n", "model.add(BatchNormalization())\n", "model.add(Dropout(0.3))\n", "\n", "model.add(Dense(1, activation='sigmoid'))\n", "# TR = Sigmoid fonksiyonu, çıktı değerini 0 ile 1 arasında sınırlayarak iki sınıflı (binary) sınıflandırma problemleri için kullanılır.\n", "# EN = The sigmoid function is used for binary classification problems, limiting the output value between 0 and 1.\n", "\n", "early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)\n", "# TR = EarlyStopping: Doğrulama kaybı (val_loss) iyileşmediğinde eğitimi erken durdurmak için kullanıyoruz.\n", "# EN = EarlyStopping: We use it to stop training early when the validation loss (val_loss) stops improving.\n", "\n", "# TR = val_loss 5 epoch boyunca iyileşmezse eğitimi durduruyor ve en iyi ağırlıkları geri yüklüyor.\n", "# EN = If val_loss does not improve for 5 epochs, it stops training and restores the best weights.\n", "\n", "model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])\n", "\n", "# TR = loss='binary_crossentropy': İkili sınıflandırma için kullanılır (çıktı katmanı sigmoid aktivasyon kullanır).\n", "# TR = Tahmin edilen olasılık ile gerçek ikili etiketler arasındaki farkı ölçer.\n", "# TR = optimizer='adam': Adam (Adaptive Moment Estimation) optimizasyon algoritması, modelin ağırlıklarını güncellerken kullanılan bir yöntemdir.\n", "# TR = metrics=['accuracy']: Modelin performansını değerlendirmek için kullanılan bir ölçümdür. 
Doğru sınıflandırılan örneklerin toplam örneklere oranını hesaplar.\n", "\n", "# EN = loss='binary_crossentropy': Used for binary classification (the output layer uses a sigmoid activation).\n", "# EN = Measures the difference between the predicted probability and the actual binary labels.\n", "# EN = optimizer='adam': Adam (Adaptive Moment Estimation) optimization algorithm is a method used when updating the weights of the model.\n", "# EN = metrics=['accuracy']: A metric used to evaluate the performance of the model. It calculates the ratio of correctly classified examples to total examples.\n", "\n", "history=model.fit(x_train, y_train, validation_split=0.2, batch_size=64, epochs=100, callbacks=[early_stopping])\n", "# TR = Modeli en fazla 100 epoch boyunca eğitiyoruz, fakat EarlyStopping eğitimi daha erken durdurabilir. Batch boyutu 64 olarak belirlenmiş ve eğitim verisinin %20'si doğrulama için ayrılmış.\n", "# EN = We train the model for up to 100 epochs, but EarlyStopping can stop it sooner. The batch size is set to 64, and 20% of the training data is held out for validation." ] }, { "cell_type": "code", "execution_count": null, "id": "fa191abf", "metadata": {}, "outputs": [], "source": [ "model.summary()" ] }, { "cell_type": "code", "execution_count": null, "id": "d87d3ede", "metadata": {}, "outputs": [], "source": [ "test_loss, test_acc = model.evaluate(x_test, y_test)\n", "# TR = test_loss değişkeni, test verileri üzerinde hesaplanan kayıp değerini içerir. test_acc değişkeni, test verileri üzerinde hesaplanan doğruluk değerini içerir.\n", "# EN = The test_loss variable contains the loss value calculated on the test data. The test_acc variable contains the accuracy value calculated on the test data.\n", "\n", "print(f\"Test accuracy: {test_acc:.4f}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "01ecca14", "metadata": {}, "outputs": [], "source": [ "pred=(model.predict(x_test) > 0.5).astype(int)\n", "# TR = x_test üzerinde model.predict ile tahmin yaptık ve olasılıkları 0.5 eşiğiyle 0 veya 1 sınıf etiketine çevirdik. Sonucu pred değişkenine atadık. \n", "# EN = We ran model.predict on x_test and thresholded the predicted probabilities at 0.5 to get class labels (0 or 1). 
We stored the result in pred." ] }, { "cell_type": "code", "execution_count": null, "id": "39883360", "metadata": {}, "outputs": [], "source": [ "accuracy_score(y_test, pred)\n", "# TR = accuracy_score fonksiyonu ile y_test ve pred kullanarak doğruluk değerini hesapladık.\n", "# EN = We computed the accuracy from y_test and pred with the accuracy_score function." ] }, { "cell_type": "code", "execution_count": null, "id": "4bafc93e", "metadata": {}, "outputs": [], "source": [ "confusion_matrix(y_test, pred)\n", "# TR = confusion_matrix fonksiyonu ile y_test ve pred kullanarak kaç örneği yanlış tahmin ettiğimizi buluyoruz.\n", "# EN = Using the confusion_matrix function with y_test and pred, we find out how many samples we misclassified.\n", "\n", "# TR = 2x2'lik bir matris döndürür. Sol üst ve sağ alt doğru tahminler, sağ üst ve sol alt yanlış tahminlerdir. \n", "# EN = It returns a 2x2 matrix. Upper left and lower right are correct predictions, upper right and lower left are incorrect predictions." ] }, { "cell_type": "code", "execution_count": null, "id": "1661b38b", "metadata": {}, "outputs": [], "source": [ "sns.heatmap(confusion_matrix(y_test, pred),annot=True);\n", "# TR = Yukarıdakinin görsel versiyonu.\n", "# EN = Visual version of the above." ] }, { "cell_type": "code", "execution_count": null, "id": "429b36e9", "metadata": {}, "outputs": [], "source": [ "print(classification_report(y_test, pred))\n", "# TR = print içinde yazdık, yoksa raporun hizalaması bozuluyor.\n", "# EN = We wrap it in print; otherwise the alignment of the report is broken.\n", "\n", "# TR = classification_report ile y_test ve pred kullanarak precision, recall, f1-score ve support değerlerini elde ettik.\n", "# EN = We obtained precision, recall, f1-score and support using classification_report with y_test and pred.\n", "\n", "# TR = Precision (Kesinlik): Doğru olarak pozitif tahmin edilen örneklerin, toplam pozitif tahmin edilen örnekler içindeki oranını gösterir. 
Yani, modelin pozitif tahminlerinin ne kadarının doğru olduğunu ölçer.\n", "# EN = Precision: The ratio of correctly predicted positive samples to all samples predicted as positive, TP / (TP + FP). That is, it measures how many of the model's positive predictions are correct.\n", "\n", "# TR = Recall (Duyarlılık): Doğru tahmin edilen pozitif örneklerin, gerçekte pozitif olan tüm örneklere oranını gösterir. Modelin pozitif sınıfı ne kadar iyi bulduğunu ölçer.\n", "# EN = Recall (Sensitivity): The ratio of correctly predicted positive samples to all samples that are actually positive, TP / (TP + FN). It measures how well the model finds the positive class.\n", "\n", "# TR = F1-score: Precision ve recall'un harmonik ortalamasıdır. Hem precision hem de recall'u dikkate alarak modelin genel performansını özetler.\n", "# EN = F1-score: It is the harmonic mean of precision and recall. It summarizes the overall performance of the model, taking into account both precision and recall.\n", "\n", "# TR = Support: Her sınıftan kaç örneğin bulunduğunu gösterir. Yani, gerçek etiketlerde her bir sınıfa ait kaç örnek olduğunu ifade eder.\n", "# EN = Support: Shows how many samples of each class there are in the true labels." 
] }, { "cell_type": "code", "execution_count": null, "id": "787a3e99", "metadata": {}, "outputs": [], "source": [ "plt.plot(history.history['accuracy'],label='Accuracy')\n", "plt.plot(history.history['val_accuracy'],label='Val_Accuracy')\n", "plt.legend();" ] }, { "cell_type": "code", "execution_count": null, "id": "61e71f7f-2d99-45bb-85cd-e99d1d676ed9", "metadata": {}, "outputs": [], "source": [ "pickle.dump(model,open('Tweet.pkl','wb'))" ] } ], "metadata": { "kaggle": { "accelerator": "nvidiaTeslaT4", "dataSources": [ { "databundleVersionId": 9511539, "sourceId": 84717, "sourceType": "competition" } ], "dockerImageVersionId": 30762, "isGpuEnabled": true, "isInternetEnabled": true, "language": "python", "sourceType": "notebook" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.4" } }, "nbformat": 4, "nbformat_minor": 5 }