shwetashweta05 committed on
Commit
a63d7c1
·
verified ·
1 Parent(s): 5118bc0

Update pages/Life Cycle Of Machine Learning.py

pages/Life Cycle Of Machine Learning.py CHANGED
@@ -97,7 +97,7 @@ if st.button("Data Pre-processing"):
  - Example: A house with a price 10x higher than similar houses might be an outlier.
  - **4.Convert Categorical Data to Numbers**
  Machine learning models work with numbers, so categorical data must be converted.
- - **echniques:**
  - Label Encoding: Assign a number to each category (e.g., Male = 0, Female = 1).
  - One-Hot Encoding: Create new columns for each category with binary values (0 or 1).
  - Example: Convert Location (e.g., "City A", "City B") into numerical values.
@@ -112,4 +112,37 @@ if st.button("Data Pre-processing"):
  - Training set: Used to train the model.
  - Testing set: Used to evaluate the model’s performance.
  - Example: Split 80% of the data for training and 20% for testing.
- """)
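
The 80/20 split mentioned in the hunk above can be sketched with the standard library alone. This is a minimal illustration, not code from the commit: the helper name `split_train_test`, the fixed seed, and the ten-record dataset are all assumptions made for the example.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and return (train, test) lists.

    Hypothetical helper for illustration; the seed only makes the
    shuffle repeatable between runs.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    test_size = round(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


data = list(range(10))  # assumed toy dataset of 10 records
train, test = split_train_test(data)
print(len(train), len(test))
```

Shuffling before slicing keeps the test set from simply being the first or last rows of the file, which matters when the data is stored in a meaningful order.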
  - Example: A house with a price 10x higher than similar houses might be an outlier.
  - **4.Convert Categorical Data to Numbers**
  Machine learning models work with numbers, so categorical data must be converted.
+ - **Techniques:**
  - Label Encoding: Assign a number to each category (e.g., Male = 0, Female = 1).
  - One-Hot Encoding: Create new columns for each category with binary values (0 or 1).
  - Example: Convert Location (e.g., "City A", "City B") into numerical values.
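
The two encoding techniques named in this hunk can be sketched without any dependencies. This is a rough illustration rather than part of the commit: the helper names `label_encode` and `one_hot_encode` and the sample values are assumptions, and real projects would more likely use a library such as pandas or scikit-learn for this step.

```python
# Assumed sample values, chosen to match the examples in the text above.
genders = ["Male", "Female", "Female", "Male"]
locations = ["City A", "City B", "City A", "City C"]


def label_encode(values, mapping):
    """Replace each category with its integer code from `mapping`."""
    return [mapping[value] for value in values]


def one_hot_encode(values):
    """Return (categories, rows) with one 0/1 column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


# Label encoding: Male = 0, Female = 1.
gender_codes = label_encode(genders, {"Male": 0, "Female": 1})

# One-hot encoding: each Location becomes a row of binary flags.
location_columns, location_rows = one_hot_encode(locations)

print(gender_codes)
print(location_columns)
for row in location_rows:
    print(row)
```

Label encoding suits categories with only two values or a natural order; one-hot encoding avoids implying an order between unrelated categories such as city names.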
 
  - Training set: Used to train the model.
  - Testing set: Used to evaluate the model’s performance.
  - Example: Split 80% of the data for training and 20% for testing.
+ """)
+
+ if st.button("Exploratory Data Analysis (EDA)"):
+     st.write("**EDA in Machine Learning (Easy Language):** EDA (Exploratory Data Analysis) is like getting to know your dataset before using it in a machine learning model. It helps you understand the data's structure, patterns, and relationships so that you can decide how to process and use it effectively.")
+     st.write("""
+ Why is EDA Important?
+ - Identifies errors, missing values, and outliers.
+ - Helps understand data distribution and trends.
+ - Guides feature selection and engineering.
+ - Gives insights for choosing the right ML model.
+ """)
+     st.write("""**Steps in EDA:**
+ - **Understand the Dataset**
+   - Look at the structure of your data (rows, columns, and types of values).
+   - Example: In a student dataset, check if columns include Name, Math Marks, and Grade.
+ - **Summarize the Data**
+   - Generate statistics like mean, median, minimum, maximum, and standard deviation.
+   - Example: For math scores, check the average, highest, and lowest scores.
+ - **Handle Missing Values**
+   - Identify any missing data and decide how to fix it (e.g., fill with average values or remove the affected rows).
+   - Example: If a student is missing Science Marks, fill it with the average science score.
+ - **Visualize the Data**
+   - Create plots to understand data distributions and relationships:
+     - Histograms: Show how data is spread across a range (e.g., how many students scored between 70-80).
+     - Boxplots: Highlight outliers and data spread.
+     - Scatter Plots: Show relationships between two variables (e.g., Attendance vs. Marks).
+ - **Check Relationships**
+   - Use a correlation matrix to see how features relate to each other.
+   - Example: See if Attendance has a strong positive correlation with Math Marks.
+ - **Identify Outliers**
+   - Look for extreme values that might distort the analysis.
+   - Example: A student with Marks = 0 when others scored 70-100 could be an error.
+ """)
+
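
The non-plotting EDA steps described in the added text (summarize, handle missing values, check relationships, identify outliers) can be sketched with Python's `statistics` module. This is an illustrative sketch under stated assumptions, not code from the commit: the student records are made up, `pearson` and `iqr_outliers` are hypothetical helpers, and the 1.5 × IQR rule is one common outlier convention that the original text does not specify.

```python
import statistics

# Assumed student records; None marks a missing Math Marks value.
attendance = [95, 80, 70, 90, 20, 85, 78, 88, 93, 75]
math_marks = [92, 78, None, 88, 0, 81, 75, 85, 90, 72]

# Handle missing values: fill gaps with the mean of the known marks.
known = [mark for mark in math_marks if mark is not None]
mean_mark = statistics.mean(known)
filled = [mean_mark if mark is None else mark for mark in math_marks]

# Summarize the data.
summary = {
    "mean": statistics.mean(filled),
    "median": statistics.median(filled),
    "min": min(filled),
    "max": max(filled),
    "stdev": statistics.stdev(filled),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def iqr_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]


# Check relationships: Attendance vs. Math Marks.
correlation = pearson(attendance, filled)

# Identify outliers among the marks.
outliers = iqr_outliers(filled)

print(summary)
print(round(correlation, 3))
print(outliers)
```

With this toy data the single mark of 0 is flagged as an outlier, which mirrors the "Marks = 0" example in the text. Whether such a value is an error or a genuine score is a judgment call that the code cannot make.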