{ "cells": [ { "cell_type": "markdown", "id": "5901c138", "metadata": {}, "source": [ "\n", "# 📊 Introduction to Reading Data & Exploratory Data Analysis (EDA)\n", "\n", "This lab is your **first structured contact with real datasets**.\n", "\n", "Today you will:\n", "\n", "- Load data from an external file (.rda format)\n", "- Inspect the structure of a dataset\n", "- Identify variable types\n", "- Compute basic descriptive statistics\n", "- Use automated EDA tools to explore data visually\n", "- Reflect on why visualization matters\n", "\n", "⚠️ Today is about *understanding data*, not heavy coding.\n" ] }, { "cell_type": "markdown", "id": "60e41689", "metadata": {}, "source": [ "\n", "## 1️⃣ Install Required Libraries (Run Once)\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "620c56a6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: pyreadr in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (0.5.4)\n", "Requirement already satisfied: sweetviz in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (2.3.1)\n", "Requirement already satisfied: dtale in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (3.20.0)\n", "Requirement already satisfied: pandas>=1.2.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from pyreadr) (3.0.1)\n", "Requirement already satisfied: numpy>=1.16.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from sweetviz) (1.26.4)\n", "Requirement already satisfied: matplotlib>=3.1.3 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from sweetviz) (3.10.8)\n", "Requirement already satisfied: tqdm>=4.43.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from sweetviz) (4.67.3)\n", "Requirement already satisfied: scipy>=1.3.2 in 
D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from sweetviz) (1.15.3)\n", "Requirement already satisfied: jinja2>=2.11.1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from sweetviz) (3.1.6)\n", "Requirement already satisfied: importlib-resources>=1.2.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from sweetviz) (6.5.2)\n", "Requirement already satisfied: lz4 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (4.4.5)\n", "Requirement already satisfied: beautifulsoup4!=4.13.0b2 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (4.14.3)\n", "Requirement already satisfied: certifi in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (2026.1.4)\n", "Requirement already satisfied: cycler in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (0.12.1)\n", "Requirement already satisfied: dash<=2.18.2 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (2.18.2)\n", "Requirement already satisfied: dash-bootstrap-components<=1.7.1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (1.7.1)\n", "Requirement already satisfied: dash_daq<=0.5.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (0.5.0)\n", "Requirement already satisfied: et_xmlfile in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (2.0.0)\n", "Requirement already satisfied: Flask in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (3.0.3)\n", "Requirement already satisfied: Flask-Compress in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (1.23)\n", "Requirement already satisfied: future>=0.14.0 in 
D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (1.0.0)\n", "Requirement already satisfied: itsdangerous in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (2.2.0)\n", "Requirement already satisfied: kaleido in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (1.2.0)\n", "Requirement already satisfied: missingno in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (0.5.2)\n", "Requirement already satisfied: networkx in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (3.6.1)\n", "Requirement already satisfied: openpyxl!=3.2.0b1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (3.1.5)\n", "Requirement already satisfied: packaging in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (26.0)\n", "Requirement already satisfied: pkginfo in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (1.12.1.2)\n", "Requirement already satisfied: plotly in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (6.5.2)\n", "Requirement already satisfied: requests in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (2.32.5)\n", "Requirement already satisfied: scikit-learn in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (1.8.0)\n", "Requirement already satisfied: seaborn in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (0.13.2)\n", "Requirement already satisfied: squarify in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (0.4.4)\n", "Requirement already satisfied: statsmodels in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages 
(from dtale) (0.14.6)\n", "Requirement already satisfied: strsimpy in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (0.2.1)\n", "Requirement already satisfied: six in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (1.17.0)\n", "Requirement already satisfied: werkzeug in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (3.0.6)\n", "Requirement already satisfied: xarray in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (2026.2.0)\n", "Requirement already satisfied: xlrd in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dtale) (2.0.2)\n", "Requirement already satisfied: dash-html-components==2.0.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dash<=2.18.2->dtale) (2.0.0)\n", "Requirement already satisfied: dash-core-components==2.0.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dash<=2.18.2->dtale) (2.0.0)\n", "Requirement already satisfied: dash-table==5.0.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dash<=2.18.2->dtale) (5.0.0)\n", "Requirement already satisfied: importlib-metadata in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dash<=2.18.2->dtale) (8.7.1)\n", "Requirement already satisfied: typing-extensions>=4.1.1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dash<=2.18.2->dtale) (4.15.0)\n", "Requirement already satisfied: retrying in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dash<=2.18.2->dtale) (1.4.2)\n", "Requirement already satisfied: nest-asyncio in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dash<=2.18.2->dtale) (1.6.0)\n", "Requirement already satisfied: setuptools in 
D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from dash<=2.18.2->dtale) (65.5.0)\n", "Requirement already satisfied: click>=8.1.3 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from Flask->dtale) (8.3.1)\n", "Requirement already satisfied: blinker>=1.6.2 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from Flask->dtale) (1.9.0)\n", "Requirement already satisfied: MarkupSafe>=2.1.1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from werkzeug->dtale) (3.0.3)\n", "Requirement already satisfied: soupsieve>=1.6.1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from beautifulsoup4!=4.13.0b2->dtale) (2.8.3)\n", "Requirement already satisfied: colorama in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from click>=8.1.3->Flask->dtale) (0.4.6)\n", "Requirement already satisfied: contourpy>=1.0.1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from matplotlib>=3.1.3->sweetviz) (1.3.3)\n", "Requirement already satisfied: fonttools>=4.22.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from matplotlib>=3.1.3->sweetviz) (4.61.1)\n", "Requirement already satisfied: kiwisolver>=1.3.1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from matplotlib>=3.1.3->sweetviz) (1.4.9)\n", "Requirement already satisfied: pillow>=8 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from matplotlib>=3.1.3->sweetviz) (12.1.1)\n", "Requirement already satisfied: pyparsing>=3 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from matplotlib>=3.1.3->sweetviz) (3.3.2)\n", "Requirement already satisfied: python-dateutil>=2.7 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from matplotlib>=3.1.3->sweetviz) 
(2.9.0.post0)\n", "Requirement already satisfied: tzdata in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from pandas>=1.2.0->pyreadr) (2025.3)\n", "Requirement already satisfied: narwhals>=1.15.1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from plotly->dtale) (2.16.0)\n", "Requirement already satisfied: brotli in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from Flask-Compress->dtale) (1.2.0)\n", "Requirement already satisfied: backports.zstd in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from Flask-Compress->dtale) (1.3.0)\n", "Requirement already satisfied: zipp>=3.20 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from importlib-metadata->dash<=2.18.2->dtale) (3.23.0)\n", "Requirement already satisfied: choreographer>=1.1.1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from kaleido->dtale) (1.2.1)\n", "Requirement already satisfied: logistro>=1.0.8 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from kaleido->dtale) (2.0.1)\n", "Requirement already satisfied: orjson>=3.10.15 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from kaleido->dtale) (3.11.7)\n", "Requirement already satisfied: pytest-timeout>=2.4.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from kaleido->dtale) (2.4.0)\n", "Requirement already satisfied: simplejson>=3.19.3 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from choreographer>=1.1.1->kaleido->dtale) (3.20.2)\n", "Requirement already satisfied: pytest>=7.0.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from pytest-timeout>=2.4.0->kaleido->dtale) (9.0.2)\n", "Requirement already satisfied: iniconfig>=1.0.1 in 
D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from pytest>=7.0.0->pytest-timeout>=2.4.0->kaleido->dtale) (2.3.0)\n", "Requirement already satisfied: pluggy<2,>=1.5 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from pytest>=7.0.0->pytest-timeout>=2.4.0->kaleido->dtale) (1.6.0)\n", "Requirement already satisfied: pygments>=2.7.2 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from pytest>=7.0.0->pytest-timeout>=2.4.0->kaleido->dtale) (2.19.2)\n", "Requirement already satisfied: charset_normalizer<4,>=2 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from requests->dtale) (3.4.4)\n", "Requirement already satisfied: idna<4,>=2.5 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from requests->dtale) (3.11)\n", "Requirement already satisfied: urllib3<3,>=1.21.1 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from requests->dtale) (2.6.3)\n", "Requirement already satisfied: joblib>=1.3.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from scikit-learn->dtale) (1.5.3)\n", "Requirement already satisfied: threadpoolctl>=3.2.0 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from scikit-learn->dtale) (3.6.0)\n", "Requirement already satisfied: patsy>=0.5.6 in D:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages (from statsmodels->dtale) (1.0.2)\n" ] } ], "source": [ "\n", "# Run once per environment; comment this line out after the first install.\n", "%pip install pyreadr sweetviz dtale\n", "\n", "import pyreadr\n", "import pandas as pd\n", "import sweetviz as sv\n", "import dtale\n" ] }, { "cell_type": "markdown", "id": "7aa8d747", "metadata": {}, "source": [ "\n", "## 2️⃣ Load the .rda File\n", "\n", "`pyreadr.read_r` reads R data files into pandas objects. Adjust the file path if your copy of the data is stored elsewhere.\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "ba921fe7", "metadata": {}, "outputs": [ { 
"data": { "text/plain": [ "odict_keys(['datasaurus_dozen'])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "# read_r returns a dict-like object that maps R object names to pandas DataFrames\n", "result = pyreadr.read_r(\"../data/datasaurus_dozen.rda\")\n", "result.keys()\n" ] }, { "cell_type": "markdown", "id": "7bb3fc80", "metadata": {}, "source": [ "\n", "An .rda file can hold several R objects, so `read_r` returns one entry per object. This file contains a single object, `datasaurus_dozen`. Extract it as a DataFrame:\n" ] }, { "cell_type": "code", "execution_count": null, "id": "10cb8cda", "metadata": {}, "outputs": [], "source": [ "\n", "# Take the first (and here only) DataFrame; keep the [0], because a bare list has no .head()\n", "df = list(result.values())[0]\n", "df.head()\n" ] }, { "cell_type": "markdown", "id": "6089afaa", "metadata": {}, "source": [ "\n", "## 3️⃣ First Contact With the Dataset\n", "\n", "Use the cells below to answer:\n", "\n", "- How many rows?\n", "- How many columns?\n", "- What are the variable names?\n", "- What types of variables exist?\n" ] }, { "cell_type": "code", "execution_count": 5, "id": "fa82ffe1", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1846, 3)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "df.shape\n" ] }, { "cell_type": "code", "execution_count": 7, "id": "64688f15", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 1846 entries, 0 to 1845\n", "Data columns (total 3 columns):\n", " #   Column   Non-Null Count  Dtype  \n", "---  ------   --------------  -----  \n", " 0   dataset  1846 non-null   str    \n", " 1   x        1846 non-null   float64\n", " 2   y        1846 non-null   float64\n", "dtypes: float64(2), str(1)\n", "memory usage: 43.4 KB\n" ] } ], "source": [ "\n", "df.info()\n" ] }, { "cell_type": "markdown", "id": "9141177c", "metadata": {}, "source": [ "\n", "## 4️⃣ Descriptive Statistics\n", "\n", "What do the summary statistics tell you? Note that `describe()` pools all 1846 rows, regardless of the value in the `dataset` column.\n" ] }, { "cell_type": "code", "execution_count": 6, "id": "1f84ca73", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\"><th></th><th>x</th><th>y</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>count</th><td>1846.000000</td><td>1846.000000</td></tr>\n", "    <tr><th>mean</th><td>54.265695</td><td>47.835099</td></tr>\n", "    <tr><th>std</th><td>16.713001</td><td>26.847766</td></tr>\n", "    <tr><th>min</th><td>15.560750</td><td>0.015119</td></tr>\n", "    <tr><th>25%</th><td>41.073403</td><td>22.561073</td></tr>\n", "    <tr><th>50%</th><td>52.591269</td><td>47.594450</td></tr>\n", "    <tr><th>75%</th><td>67.277845</td><td>71.810778</td></tr>\n", "    <tr><th>max</th><td>98.288123</td><td>99.694680</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ "                 x            y\n", "count  1846.000000  1846.000000\n", "mean     54.265695    47.835099\n", "std      16.713001    26.847766\n", "min      15.560750     0.015119\n", "25%      41.073403    22.561073\n", "50%      52.591269    47.594450\n", "75%      67.277845    71.810778\n", "max      98.288123    99.694680" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "df.describe()\n" ] }, { "cell_type": "markdown", "id": "731fc39d", "metadata": {}, "source": [ "\n", "## 5️⃣ Automated Exploratory Data Analysis (Sweetviz)\n", "\n", "`sv.analyze` profiles every column of the DataFrame, and `show_notebook` renders the resulting report inline. Generate a report:\n" ] }, { "cell_type": "code", "execution_count": 10, "id": "0d86fa10", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b3541d57f0994913b40d2e5a2f96c0e7", "version_major": 2, "version_minor": 0 }, "text/plain": [ " | | [ 0%] 00:00 -> (? left)" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "2026-02-22 11:37:04,556 - INFO - Executing shutdown due to inactivity...\n", "2026-02-22 11:37:08,648 - INFO - Executing shutdown...\n", "2026-02-22 11:37:08,650 - INFO - Not running with the Werkzeug Server, exiting by searching gc for BaseWSGIServer\n" ] } ], "source": [ "\n", "report = sv.analyze(df)\n", "# Alternative: write the report to a standalone HTML file instead\n", "# report.show_html(\"sweetviz_report.html\")\n", "report.show_notebook()" ] }, { "cell_type": "markdown", "id": "678e3713", "metadata": {}, "source": [ "\n", "In the report, explore:\n", "\n", "- Distributions\n", "- Correlations\n", "- Missing values\n", "- Variable relationships\n", "\n", "What patterns do you notice?\n" ] }, { "cell_type": "markdown", "id": "ee012206", "metadata": {}, "source": [ "\n", "## 6️⃣ Interactive Exploration (D-Tale)\n" ] }, { "cell_type": "code", "execution_count": 9, "id": "90110819", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2026-02-22 09:59:06,283 - ERROR - 
Exception on /dtale/charts/_dash-update-component [POST]\n", "Traceback (most recent call last):\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 1473, in wsgi_app\n", " response = self.full_dispatch_request()\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 882, in full_dispatch_request\n", " rv = self.handle_user_exception(e)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 880, in full_dispatch_request\n", " rv = self.dispatch_request()\n", " ^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 865, in dispatch_request\n", " return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\dash.py\", line 1376, in dispatch\n", " ctx.run(\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\_callback.py\", line 507, in add_context\n", " raise err\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\_callback.py\", line 496, in add_context\n", " output_value = _invoke_callback(func, *func_args, **func_kwargs)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\_callback.py\", line 43, in _invoke_callback\n", " return func(*args, **kwargs) # %% callback invoked %%\n", " ^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\dash_application\\saved_charts.py\", line 215, in 
load_saved_chart\n", " return dict(display=\"block\"), charts, config, build_saved_header(config)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\dash_application\\saved_charts.py\", line 106, in build_saved_header\n", " if chart_type == \"scatter\" and config[\"trendline\"]:\n", " ~~~~~~^^^^^^^^^^^^^\n", "KeyError: 'trendline'\n", "d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\dash_application\\charts.py:4170: GuessedAtParserWarning:\n", "\n", "No parser was explicitly specified, so I'm using the best available HTML parser for this system (\"html.parser\"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.\n", "\n", "The code that caused this warning is on line 4170 of the file d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\dash_application\\charts.py. 
To get rid of this warning, pass the additional argument 'features=\"html.parser\"' to the BeautifulSoup constructor.\n", "\n", "\n", "2026-02-22 10:02:45,101 - ERROR - Exception on /dtale/charts/_dash-update-component [POST]\n", "Traceback (most recent call last):\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 1473, in wsgi_app\n", " response = self.full_dispatch_request()\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 882, in full_dispatch_request\n", " rv = self.handle_user_exception(e)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 880, in full_dispatch_request\n", " rv = self.dispatch_request()\n", " ^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 865, in dispatch_request\n", " return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\dash.py\", line 1376, in dispatch\n", " ctx.run(\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\_callback.py\", line 507, in add_context\n", " raise err\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\_callback.py\", line 496, in add_context\n", " output_value = _invoke_callback(func, *func_args, **func_kwargs)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\_callback.py\", line 43, in _invoke_callback\n", " return func(*args, **kwargs) # %% callback invoked %%\n", " 
^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\dash_application\\views.py\", line 1377, in group_values\n", " group_vals = build_group_val_options(group_vals, group_cols)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\dash_application\\layout\\layout.py\", line 1341, in build_group_val_options\n", " group_vals = find_group_vals(df, group_cols)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\charts\\utils.py\", line 1064, in find_group_vals\n", " group_vals, _ = retrieve_chart_data(df, group_cols)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\charts\\utils.py\", line 368, in retrieve_chart_data\n", " all_data = pd.concat(all_data, axis=1)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\reshape\\concat.py\", line 407, in concat\n", " objs, keys, ndims = _clean_keys_and_objs(objs, keys)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\reshape\\concat.py\", line 808, in _clean_keys_and_objs\n", " raise ValueError(\"No objects to concatenate\")\n", "ValueError: No objects to concatenate\n", "2026-02-22 10:04:23,850 - ERROR - Exception on /dtale/charts/_dash-update-component [POST]\n", "Traceback (most recent call last):\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 1473, in wsgi_app\n", " response = self.full_dispatch_request()\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", 
line 882, in full_dispatch_request\n", " rv = self.handle_user_exception(e)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 880, in full_dispatch_request\n", " rv = self.dispatch_request()\n", " ^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\flask\\app.py\", line 865, in dispatch_request\n", " return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\dash.py\", line 1376, in dispatch\n", " ctx.run(\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\_callback.py\", line 507, in add_context\n", " raise err\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\_callback.py\", line 496, in add_context\n", " output_value = _invoke_callback(func, *func_args, **func_kwargs)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dash\\_callback.py\", line 43, in _invoke_callback\n", " return func(*args, **kwargs) # %% callback invoked %%\n", " ^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\dash_application\\saved_charts.py\", line 215, in load_saved_chart\n", " return dict(display=\"block\"), charts, config, build_saved_header(config)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\dash_application\\saved_charts.py\", line 106, in build_saved_header\n", " if chart_type == \"scatter\" and config[\"trendline\"]:\n", " ~~~~~~^^^^^^^^^^^^^\n", "KeyError: 'trendline'\n", 
"2026-02-22 10:23:45,299 - ERROR - Exception occurred while processing request: 'SeriesGroupBy' object has no attribute 'mad'\n", "Traceback (most recent call last):\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\views.py\", line 121, in _handle_exceptions\n", " return func(*args, **kwargs)\n", " ^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\views.py\", line 3192, in get_column_analysis\n", " return jsonify(**analysis.build())\n", " ^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\column_analysis.py\", line 140, in build\n", " return_data, code = self.analysis.build(self)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\dtale\\column_analysis.py\", line 327, in build\n", " ].agg(self.aggs)\n", " ^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\groupby\\generic.py\", line 2291, in aggregate\n", " result = op.agg()\n", " ^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\apply.py\", line 297, in agg\n", " return self.agg_list_like()\n", " ^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\apply.py\", line 414, in agg_list_like\n", " return self.agg_or_apply_list_like(op_name=\"agg\")\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\apply.py\", line 1640, in agg_or_apply_list_like\n", " keys, results = self.compute_list_like(op_name, selected_obj, kwargs)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File 
\"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\apply.py\", line 473, in compute_list_like\n", " new_res = getattr(colg, op_name)(func, *args, **kwargs)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\groupby\\generic.py\", line 464, in aggregate\n", " ret = self._aggregate_multiple_funcs(func, *args, **kwargs)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\groupby\\generic.py\", line 522, in _aggregate_multiple_funcs\n", " results[key] = self.aggregate(func, *args, **kwargs)\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\groupby\\generic.py\", line 456, in aggregate\n", " return getattr(self, func)(*args, **kwargs)\n", " ^^^^^^^^^^^^^^^^^^^\n", " File \"d:\\Projects\\43679_InteractiveVis\\VI_Lab_01_EDA\\.venv\\Lib\\site-packages\\pandas\\core\\groupby\\groupby.py\", line 1115, in __getattr__\n", " raise AttributeError(\n", "AttributeError: 'SeriesGroupBy' object has no attribute 'mad'\n" ] } ], "source": [ "\n", "d = dtale.show(df, host='localhost')\n", "d.open_browser()" ] }, { "cell_type": "code", "execution_count": null, "id": "7c6e0b13", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "6c82bcfd", "metadata": {}, "source": [ "\n", "Use D‑Tale to:\n", "\n", "- Sort values\n", "- Filter rows\n", "- Inspect unique values\n", "- Look at correlations\n", "\n", "---\n", "\n", "## 7️⃣ Reflection Questions\n", "\n", "1. Why is it dangerous to rely only on summary statistics?\n", "2. What new information did visualization reveal?\n", "3. What would you check before doing any modeling?\n", "4. 
What kinds of problems might real datasets contain?\n", "\n", "---\n", "\n", "In the next classes, you will receive *messy datasets* that require cleaning and preparation.\n" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 5 }