Spaces: shwetashweta05/Zero_to_Hero_Machine_Learning
shwetashweta05 committed on Dec 11, 2024

Commit 597e65b (verified) · 1 Parent(s): ffe927c

Update pages/6.Data Collection.py


Files changed (1)

pages/6.Data Collection.py +0 -85


@@ -73,88 +73,3 @@ if data_type == "Structured":
                     mime="application/octet-stream",
                 )
-    # CSV Format Section
-    elif format_selected == "CSV":
-        st.write("#### CSV Format")
-        # Part (a) What it is
-        st.subheader("What is CSV?")
-        st.write("""
-        CSV (Comma-Separated Values) is a lightweight text file format for structured data,
-        where values are separated by commas. It is widely used for data exchange between systems.
-        """)
-        # Part (b) How to read these files
-        st.subheader("How to Read CSV Files?")
-        st.code("""
-        import pandas as pd
-        # Read a CSV file
-        df = pd.read_csv("file.csv")
-        print(df.head())
-        """)
-        # Part (c) Issues encountered
-        st.subheader("Common Issues Encountered When Handling CSV Files")
-        st.write("""
-        - **Misaligned Rows**: Extra or missing delimiters can lead to misaligned rows.
-        - **Encoding Problems**: Non-standard characters may cause encoding errors.
-        - **Large Files**: Processing large CSV files can be resource-intensive.
-        """)
-        # Part (d) How to overcome these errors/issues
-        st.subheader("How to Overcome These Issues?")
-        st.write("""
-        - **Misaligned Rows**: Use a consistent delimiter and validate the file before processing.
-        - **Encoding Problems**: Explicitly specify the encoding format, e.g., `encoding='utf-8'`.
-        - **Large Files**: Process the file in chunks using `pandas` (`chunksize` parameter).
-        """)
-        # Downloadable Guide Button
-        st.markdown("### Download Coding Guide:")
-        if st.button("Download CSV Guide"):
-            # Provide a downloadable file
-            file_path = "CSV_guide.ipynb"  # Ensure this file exists in the app directory
-            with open(file_path, "rb") as file:
-                st.download_button(
-                    label="Download CSV Guide",
-                    data=file,
-                    file_name="CSV_guide.ipynb",
-                    mime="application/octet-stream",
-                )
-# Add similar sections for "Unstructured" and "Semi-Structured" data types as needed.
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": "## Excel Data Format\n\n### What is Excel?\nExcel is a tabular data format commonly used in business and analytics, with extensions `.xls` and `.xlsx`."
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": "### How to Read Excel Files\nUse the `pandas` library to read Excel files:"
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": "import pandas as pd\n\ndf = pd.read_excel(\"example.xlsx\")\nprint(df.head())"
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": "### Common Issues\n1. Missing Data\n2. Encoding Problems\n3. File Corruption\n4. Large Files"
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": "### How to Overcome Issues\n1. Use data imputation methods for missing data.\n2. Specify encoding when reading files (`encoding='utf-8'`).\n3. Repair or convert corrupted files.\n4. Process large files in chunks with `pandas`."
-  }
- ],
- "metadata": {},
- "nbformat": 4,
- "nbformat_minor": 2
-}
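
Note on the removed CSV section: its advice to pass an explicit `encoding` and to use the `chunksize` parameter of `pandas.read_csv` can be illustrated with a minimal sketch. The helper name `count_rows_in_chunks`, the file `demo.csv`, and its contents are hypothetical and are not part of this commit; the sketch only assumes that `pandas` is installed.

```python
import os
import tempfile

import pandas as pd


def count_rows_in_chunks(path, chunksize=1000, encoding="utf-8"):
    """Count data rows in a CSV without loading the whole file at once."""
    total = 0
    # With chunksize set, read_csv returns an iterator of DataFrames,
    # each holding at most `chunksize` rows.
    for chunk in pd.read_csv(path, encoding=encoding, chunksize=chunksize):
        total += len(chunk)
    return total


# Demo on a small temporary file (hypothetical data, for illustration only).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", encoding="utf-8", newline="") as f:
        f.write("name,score\n")
        for i in range(25):
            f.write(f"user{i},{i}\n")
    # 25 data rows are read in chunks of 10, 10, and 5.
    row_count = count_rows_in_chunks(demo_path, chunksize=10)

print(row_count)
```

Any per-chunk work (filtering, aggregation, writing to a database) would go inside the loop in place of the row count, so memory use stays bounded by `chunksize` rather than by the file size.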

