{"id":801,"date":"2026-04-06T05:25:38","date_gmt":"2026-04-06T05:25:38","guid":{"rendered":"https:\/\/techpaathshala.com\/blog\/?p=801"},"modified":"2026-04-21T06:59:23","modified_gmt":"2026-04-21T06:59:23","slug":"python-for-data-science-why-its-the-only-language-you-need-in-2026","status":"publish","type":"post","link":"https:\/\/techpaathshala.com\/blog\/python-for-data-science-why-its-the-only-language-you-need-in-2026\/","title":{"rendered":"Python for Data Science \u2014 Why It&#8217;s the Only Language You Need in 2026"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">If you have spent any time researching a career in data science, you have probably encountered the debate: Python or R? Python or Julia? Python or SAS?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Here is the honest answer, stated plainly so you can stop losing sleep over it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Learn Python. The debate is settled.<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Not because R is bad \u2014 it is not. Not because other tools have no value \u2014 some do, in specific contexts. But because Python has become the universal language of data science, machine learning, AI engineering, and data engineering to such a decisive degree that choosing anything else as your primary language in 2026 means actively swimming against the current of the entire industry.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In India specifically \u2014 and in Mumbai&#8217;s data science job market in particular \u2014 Python appears in over 85% of data analyst and data scientist job descriptions. The ML frameworks that power production systems at every major tech company (TensorFlow, PyTorch, Scikit-learn, XGBoost, LangChain) are Python-first. The GenAI APIs that are reshaping every industry are documented primarily in Python. 
The notebooks that data scientists share, the Stack Overflow answers that solve debugging problems, the GitHub repositories that contain usable code examples \u2014 overwhelmingly Python.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This guide explains <em>why<\/em> Python won \u2014 and more usefully, <em>what<\/em> Python for data science actually means: the specific libraries, the specific workflows, and the specific level of proficiency that makes you job-ready in India&#8217;s 2026 data market.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Why Python Won \u2014 The Five Reasons That Actually Matter<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Understanding why Python became dominant helps you understand what you are learning and why each part of the ecosystem exists. It is not an accident of history. Python won for reasons that are structural and durable.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. Readability That Lowers the Barrier to Entry<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Python was designed with readability as a first principle. Its syntax is closer to plain English than any other general-purpose language. 
This matters for data science specifically because the people who need to write data code are not always professional software engineers \u2014 they are statisticians, researchers, analysts, and domain experts who need to express analytical logic in code without fighting the language.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Compare the same operation in Python and Java:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Python: filter a list of numbers above the average\nnumbers = &#091;14, 8, 32, 21, 7, 45, 19, 28]\naverage = sum(numbers) \/ len(numbers)   # 21.75, computed once\nabove_average = &#091;n for n in numbers if n &gt; average]\nprint(above_average)\n# Output: &#091;32, 45, 28]\n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ Java: the same operation\nimport java.util.ArrayList;\nimport java.util.Arrays;\nimport java.util.List;\n\npublic class Main {\n    public static void main(String&#091;] args) {\n        List&lt;Integer&gt; numbers = Arrays.asList(14, 8, 32, 21, 7, 45, 19, 28);\n        double average = numbers.stream()\n            .mapToInt(Integer::intValue)\n            .average()\n            .orElse(0);\n        List&lt;Integer&gt; aboveAverage = new ArrayList&lt;&gt;();\n        for (int n : numbers) {\n            if (n &gt; average) aboveAverage.add(n);\n        }\n        System.out.println(aboveAverage);\n    }\n}\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The Python version reads almost like an English sentence. The Java version requires understanding of generics, streams, imports, and class structure before you can even run it. For a data scientist who wants to express &#8220;give me the numbers above average,&#8221; Python gets out of the way and lets you think about the problem rather than the language.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2. The Ecosystem Is Unmatched<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Python&#8217;s dominance in data science is self-reinforcing through its library ecosystem. 
The most important data science tools in existence are Python libraries \u2014 and they were built in Python because Python was already where the data science community was.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">NumPy (numerical computing), Pandas (data manipulation), Matplotlib and Seaborn (visualisation), Scikit-learn (machine learning), TensorFlow and PyTorch (deep learning), XGBoost and LightGBM (gradient boosting), Statsmodels (statistical analysis), SciPy (scientific computing), LangChain (LLM orchestration), Hugging Face Transformers (pre-trained models) \u2014 every foundational library in the modern data science stack is Python.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">R has excellent statistical packages. SAS is powerful for specific enterprise contexts. Julia is fast for numerical computing. None of them have an ecosystem that is even close to Python&#8217;s breadth, depth, and rate of development.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When a new ML paper is published, the reference implementation is almost always Python. When a new AI API launches, the primary SDK is Python. When a data science team at a Mumbai startup needs to build something they have never built before, they search for a Python library first \u2014 and almost always find one.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3. Versatility Across the Entire Data Pipeline<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Python is not just a data analysis language. It is a complete engineering language that happens to be excellent at data science. 
This means a Python-proficient data scientist can:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Write the SQL query that extracts training data (via <code>psycopg2<\/code> or <code>SQLAlchemy<\/code>)<\/li>\n\n\n\n<li>Clean and transform the data (Pandas)<\/li>\n\n\n\n<li>Build and evaluate the model (Scikit-learn, XGBoost)<\/li>\n\n\n\n<li>Deploy the model as an API (FastAPI)<\/li>\n\n\n\n<li>Schedule automated retraining (Airflow, Prefect)<\/li>\n\n\n\n<li>Integrate LLM capabilities into the pipeline (Anthropic SDK, OpenAI SDK)<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">A data scientist who only knows R can do the analytical middle part of this pipeline. A data scientist who knows Python can own the entire thing. In Mumbai&#8217;s startup and mid-size company context \u2014 where data teams are often small and individual data scientists are expected to do more than just model building \u2014 this versatility is a career-defining advantage.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">4. The AI and GenAI Revolution Is Python-Native<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">This is the 2026-specific reason that makes Python more important than it has ever been.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Every major LLM API (Anthropic Claude, OpenAI GPT, Google Gemini) has a Python SDK as the primary interface. Every major GenAI framework (LangChain, LangGraph, LlamaIndex, CrewAI) is Python. Every major model fine-tuning library (Hugging Face PEFT, TRL, Unsloth) is Python. The Jupyter notebook \u2014 the standard interface for interactive data and AI work \u2014 is Python.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you want to build RAG systems, fine-tune open-source models, build AI agents, or integrate LLMs into data pipelines \u2014 Python is not one option among several. It is the only practical choice.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">5. 
The Indian Market Has Standardised on Python<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In the Indian context specifically, Python&#8217;s dominance in data science hiring is not just a reflection of global trends \u2014 it is a deliberate, explicit hiring standard.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Across Naukri, LinkedIn, and Instahyre job postings for data analyst and data scientist roles in Mumbai, Bengaluru, Hyderabad, and Pune, Python appears as a required skill in over 85% of postings. In contrast, R appears in approximately 15\u201320% (primarily in research and pharma contexts), and SAS appears in under 10% (legacy BFSI contexts that are declining).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The Indian data science community \u2014 including the IITs, IIMs, and top engineering colleges that feed the industry \u2014 has standardised on Python for data science education. The professionals entering the market with data science skills are Python-proficient. The teams they join expect Python. The gap between knowing Python and knowing something else, in the Indian market, is the gap between being a strong candidate and being a weak one.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">The Python Data Science Stack: What You Actually Need to Learn<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Python is a large language with many libraries and applications. Python <em>for data science<\/em> is a specific, bounded subset of that landscape. Here is what that subset looks like \u2014 the libraries that constitute the working toolkit of a practising data scientist in India.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">NumPy \u2014 The Numerical Foundation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">NumPy (Numerical Python) is the foundation on which Pandas, Scikit-learn, and virtually every other data science library is built. 
It provides the <code>ndarray<\/code> \u2014 an n-dimensional array \u2014 as the core data structure for numerical computing, along with a comprehensive library of mathematical operations that run significantly faster than native Python because they are implemented in C.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For most data scientists, direct NumPy usage is less frequent than Pandas usage \u2014 but understanding NumPy arrays and operations is the foundation that makes everything else make sense.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import numpy as np\n\n# Creating arrays\nscores = np.array(&#091;72, 85, 91, 68, 77, 94, 82, 79, 88, 65])\n\n# Descriptive statistics \u2014 instant, no loops needed\nprint(f\"Mean score:        {np.mean(scores):.1f}\")\nprint(f\"Median score:      {np.median(scores):.1f}\")\nprint(f\"Std deviation:     {np.std(scores):.1f}\")\nprint(f\"Min \/ Max:         {np.min(scores)} \/ {np.max(scores)}\")\nprint(f\"25th percentile:   {np.percentile(scores, 25):.1f}\")\nprint(f\"75th percentile:   {np.percentile(scores, 75):.1f}\")\n\n# Boolean masking \u2014 filter without a loop\nhigh_scorers = scores&#091;scores &gt;= 85]\nprint(f\"\\nScores &gt;= 85:      {high_scorers}\")\nprint(f\"Count &gt;= 85:       {len(high_scorers)}\")\nprint(f\"% &gt;= 85:           {len(high_scorers)\/len(scores)*100:.0f}%\")\n\n# Vectorised operations \u2014 apply transformations to every element at once\n# Normalise scores to 0-1 range (min-max scaling)\nnormalised = (scores - np.min(scores)) \/ (np.max(scores) - np.min(scores))\nprint(f\"\\nNormalised scores: {np.round(normalised, 2)}\")\n\n# Reshaping \u2014 rearrange array dimensions\nmatrix = scores.reshape(2, 5)      # 2 rows, 5 columns\nprint(f\"\\nReshaped to 2x5:\\n{matrix}\")\nprint(f\"Transpose:\\n{matrix.T}\")   # flip rows and columns\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What a beginner learns from NumPy:<\/strong> How computers represent numerical data efficiently, why 
vectorised operations are faster than loops (critical for working with large datasets), and what &#8220;array broadcasting&#8221; means (how NumPy handles operations between arrays of different shapes).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What a working data scientist uses NumPy for:<\/strong> Normalising and scaling data before model training, computing statistical summaries, reshaping data for ML model inputs, and random number generation for reproducibility (<code>np.random.seed<\/code>).<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Pandas \u2014 The Heart of Data Analysis<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">If NumPy is the foundation, Pandas is where most data science work actually happens. The Pandas <code>DataFrame<\/code> \u2014 a two-dimensional table with labelled rows and columns \u2014 is the primary data structure for loading, exploring, cleaning, transforming, and analysing structured data in Python.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Understanding Pandas well is the single most impactful skill investment for a data scientist working with tabular data.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\nimport numpy as np\n\n# --- LOADING DATA ---\n# From a CSV file\ndf = pd.read_csv('mumbai_ecommerce_orders.csv')\n\n# First look at the data\nprint(\"Shape:\", df.shape)                    # (rows, columns)\nprint(\"\\nColumn names:\", df.columns.tolist())\nprint(\"\\nData types:\\n\", df.dtypes)\nprint(\"\\nFirst 5 rows:\\n\", df.head())\nprint(\"\\nBasic statistics:\\n\", df.describe())\n\n# --- UNDERSTANDING DATA QUALITY ---\nprint(\"\\nMissing values per column:\")\nprint(df.isnull().sum())\nprint(f\"\\nTotal missing: {df.isnull().sum().sum()}\")\nprint(f\"Missing %: {df.isnull().mean().mul(100).round(1)}\")\n\n# --- SELECTING DATA ---\n# Select a single column (returns a Series)\norder_amounts = df&#091;'order_amount']\n\n# Select 
multiple columns\nsubset = df&#091;&#091;'customer_id', 'city', 'order_amount', 'order_date']]\n\n# Filter rows by condition\nmumbai_orders = df&#091;df&#091;'city'] == 'Mumbai']\nhigh_value     = df&#091;df&#091;'order_amount'] &gt; 5000]\nrecent_mumbai  = df&#091;(df&#091;'city'] == 'Mumbai') &amp; (df&#091;'order_date'] &gt;= '2025-01-01')]\n\nprint(f\"\\nTotal orders:         {len(df)}\")\nprint(f\"Mumbai orders:        {len(mumbai_orders)}\")\nprint(f\"High-value (&gt;5000):   {len(high_value)}\")\n\n# --- CLEANING DATA ---\n# Handle missing values (assign the result back: chained inplace=True on a\n# selected column is deprecated in pandas 2.x and has no effect under Copy-on-Write)\ndf&#091;'customer_age']     = df&#091;'customer_age'].fillna(df&#091;'customer_age'].median())\ndf&#091;'product_category'] = df&#091;'product_category'].fillna('Unknown')\ndf.dropna(subset=&#091;'customer_id', 'order_amount'], inplace=True)\n\n# Fix data types\ndf&#091;'order_date']   = pd.to_datetime(df&#091;'order_date'])\ndf&#091;'order_amount'] = pd.to_numeric(df&#091;'order_amount'], errors='coerce')\n\n# Remove duplicates\ndf.drop_duplicates(subset=&#091;'order_id'], keep='first', inplace=True)\n\n# --- CREATING NEW FEATURES ---\ndf&#091;'order_month']    = df&#091;'order_date'].dt.month\ndf&#091;'order_year']     = df&#091;'order_date'].dt.year\ndf&#091;'order_quarter']  = df&#091;'order_date'].dt.quarter\ndf&#091;'is_weekend']     = df&#091;'order_date'].dt.dayofweek.isin(&#091;5, 6]).astype(int)\ndf&#091;'log_amount']     = np.log1p(df&#091;'order_amount'])   # log transform for skewed data\ndf&#091;'is_high_value']  = (df&#091;'order_amount'] &gt; df&#091;'order_amount'].quantile(0.75)).astype(int)\n\n# --- GROUPING AND AGGREGATING ---\n# Revenue by city\ncity_summary = df.groupby('city')&#091;'order_amount'].agg(\n    total_revenue='sum',\n    avg_order_value='mean',\n    num_orders='count',\n    median_order='median'\n).round(2).sort_values('total_revenue', ascending=False)\n\nprint(\"\\nTop 5 cities by revenue:\")\nprint(city_summary.head())\n\n# Monthly revenue trend\nmonthly_revenue = df.groupby(&#091;'order_year', 
'order_month'])&#091;'order_amount'].sum().reset_index()\nmonthly_revenue.columns = &#091;'year', 'month', 'revenue']\nprint(\"\\nMonthly revenue (last 6 months):\")\nprint(monthly_revenue.tail(6))\n\n# --- MERGING DATASETS ---\n# Combine orders with customer profile data\ncustomers = pd.read_csv('customer_profiles.csv')\n\nmerged = df.merge(\n    customers&#091;&#091;'customer_id', 'segment', 'acquisition_channel']],\n    on='customer_id',\n    how='left'          # keep all orders, even if no customer profile match\n)\n\n# Revenue by acquisition channel and segment\nchannel_segment = merged.groupby(\n    &#091;'acquisition_channel', 'segment']\n)&#091;'order_amount'].agg(&#091;'sum', 'mean', 'count']).round(2)\nprint(\"\\nRevenue by channel and segment:\")\nprint(channel_segment)\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What a beginner learns from this code:<\/strong> How to load, inspect, clean, and summarise a dataset \u2014 the EDA workflow that starts every data science project.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What this looks like in a real interview:<\/strong> You receive a raw CSV, spend 20 minutes exploring and cleaning it with Pandas, and answer five business questions with <code>groupby<\/code> aggregations and filtered subsets. This is the most common data science technical screen format in Mumbai.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Matplotlib and Seaborn \u2014 Making Data Visible<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Data visualisation in Python serves two purposes: exploration (finding patterns in data before modelling) and communication (showing findings to stakeholders after analysis). 
Matplotlib is the foundational charting library; Seaborn is built on top of it with higher-level, statistical-first chart types that are more directly useful for data science.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import matplotlib.pyplot as plt\nimport seaborn as sns\nimport pandas as pd\nimport numpy as np\n\n# Set a clean style\nplt.style.use('seaborn-v0_8-whitegrid')\nsns.set_palette(\"husl\")\n\nfig, axes = plt.subplots(2, 3, figsize=(16, 10))\nfig.suptitle('Mumbai E-commerce Order Analysis', fontsize=16, fontweight='bold', y=1.02)\n\n# --- PLOT 1: Distribution of order amounts (histogram + KDE) ---\nsns.histplot(\n    data=df,\n    x='order_amount',\n    bins=40,\n    kde=True,\n    ax=axes&#091;0, 0]\n)\naxes&#091;0, 0].set_title('Distribution of Order Amounts')\naxes&#091;0, 0].set_xlabel('Order Amount (\u20b9)')\naxes&#091;0, 0].axvline(df&#091;'order_amount'].mean(), color='red', linestyle='--',\n                   label=f\"Mean: \u20b9{df&#091;'order_amount'].mean():.0f}\")\naxes&#091;0, 0].axvline(df&#091;'order_amount'].median(), color='orange', linestyle='--',\n                   label=f\"Median: \u20b9{df&#091;'order_amount'].median():.0f}\")\naxes&#091;0, 0].legend()\n\n# --- PLOT 2: Top 10 cities by revenue (horizontal bar) ---\ntop_cities = (df.groupby('city')&#091;'order_amount']\n              .sum()\n              .sort_values(ascending=True)\n              .tail(10))\ntop_cities.plot(kind='barh', ax=axes&#091;0, 1], color='steelblue')\naxes&#091;0, 1].set_title('Top 10 Cities by Total Revenue')\naxes&#091;0, 1].set_xlabel('Total Revenue (\u20b9)')\n\n# --- PLOT 3: Monthly revenue trend (line chart) ---\nmonthly = (df.groupby(df&#091;'order_date'].dt.to_period('M'))&#091;'order_amount']\n           .sum()\n           .reset_index())\nmonthly&#091;'order_date'] = monthly&#091;'order_date'].astype(str)\naxes&#091;0, 2].plot(monthly&#091;'order_date'], monthly&#091;'order_amount'],\n                marker='o', linewidth=2, 
markersize=4)\naxes&#091;0, 2].set_title('Monthly Revenue Trend')\naxes&#091;0, 2].set_xlabel('Month')\naxes&#091;0, 2].set_ylabel('Revenue (\u20b9)')\naxes&#091;0, 2].tick_params(axis='x', rotation=45)\n\n# --- PLOT 4: Order amount by product category (box plot) ---\ntop_categories = df&#091;'product_category'].value_counts().head(6).index\ncat_data = df&#091;df&#091;'product_category'].isin(top_categories)]\nsns.boxplot(\n    data=cat_data,\n    x='product_category',\n    y='order_amount',\n    ax=axes&#091;1, 0]\n)\naxes&#091;1, 0].set_title('Order Amount by Category')\naxes&#091;1, 0].tick_params(axis='x', rotation=30)\naxes&#091;1, 0].set_xlabel('')\n\n# --- PLOT 5: Correlation heatmap ---\nnumeric_cols = df.select_dtypes(include=&#091;np.number]).columns.tolist()\ncorr_matrix = df&#091;numeric_cols].corr()\nsns.heatmap(\n    corr_matrix,\n    annot=True,\n    fmt='.2f',\n    cmap='coolwarm',\n    center=0,\n    ax=axes&#091;1, 1],\n    square=True\n)\naxes&#091;1, 1].set_title('Feature Correlation Matrix')\n\n# --- PLOT 6: Orders by day of week (count plot) ---\nday_names = &#091;'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']\ndf&#091;'day_of_week'] = df&#091;'order_date'].dt.dayofweek\nday_counts = df&#091;'day_of_week'].value_counts().sort_index()\naxes&#091;1, 2].bar(day_names, day_counts.values, color='coral')\naxes&#091;1, 2].set_title('Orders by Day of Week')\naxes&#091;1, 2].set_ylabel('Number of Orders')\n\nplt.tight_layout()\nplt.savefig('ecommerce_analysis.png', dpi=150, bbox_inches='tight')\nplt.show()\nprint(\"Dashboard saved as 'ecommerce_analysis.png'\")\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What a beginner takes away from visualisation code:<\/strong> How to choose the right chart for the right question (distribution \u2192 histogram, comparison \u2192 bar, trend \u2192 line, relationship \u2192 scatter, spread \u2192 box), how to customise titles, labels, and colours for readability, and how to produce multi-panel 
dashboards that tell a complete analytical story.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Why this matters for job seekers:<\/strong> A well-produced EDA visualisation in a portfolio project is more persuasive to a hiring manager than a certificate. It shows that you can translate data into insight \u2014 which is the entire job.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Scikit-learn \u2014 From Data to Predictions<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Scikit-learn is the standard Python library for machine learning on structured (tabular) data. It provides consistent, well-documented implementations of virtually every classical ML algorithm, along with the tools for model evaluation, hyperparameter tuning, and pipeline construction that are required for production-quality work.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from sklearn.model_selection import train_test_split, cross_val_score\nfrom sklearn.preprocessing import StandardScaler, LabelEncoder\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.metrics import (classification_report, confusion_matrix,\n                              roc_auc_score)\nfrom sklearn.pipeline import Pipeline\nimport pandas as pd\nimport numpy as np\n\n# --- PREPARE FEATURES ---\n# Encode categorical variable\nle = LabelEncoder()\ndf&#091;'city_encoded'] = le.fit_transform(df&#091;'city'])\n\n# Select features and target\nfeatures = &#091;\n    'customer_age', 'days_since_last_order', 'total_orders_12m',\n    'avg_order_value', 'log_amount', 'city_encoded', 'is_weekend'\n]\nX = df&#091;features]\ny = df&#091;'churned']   # 1 = churned, 0 = retained\n\nprint(f\"Dataset: {X.shape&#091;0]:,} rows, {X.shape&#091;1]} features\")\nprint(f\"Churn rate: {y.mean():.1%}\")\n\n# --- SPLIT INTO TRAIN AND TEST ---\n# stratify=y ensures equal churn proportion in both sets\nX_train, 
X_test, y_train, y_test = train_test_split(\n    X, y,\n    test_size=0.2,\n    random_state=42,\n    stratify=y\n)\nprint(f\"\\nTraining set: {len(X_train):,} rows\")\nprint(f\"Test set:     {len(X_test):,} rows\")\n\n# --- BUILD A PIPELINE ---\n# Pipeline applies scaler then trains model \u2014 prevents data leakage\npipeline = Pipeline(&#091;\n    ('scaler', StandardScaler()),\n    ('model', RandomForestClassifier(\n        n_estimators=100,\n        max_depth=8,\n        class_weight='balanced',   # handles class imbalance\n        random_state=42\n    ))\n])\n\n# --- TRAIN ---\npipeline.fit(X_train, y_train)\nprint(\"\\nModel trained successfully.\")\n\n# --- EVALUATE ---\ny_pred = pipeline.predict(X_test)\ny_prob = pipeline.predict_proba(X_test)&#091;:, 1]\n\nprint(\"\\n--- Model Performance ---\")\nprint(classification_report(y_test, y_pred, target_names=&#091;'Retained', 'Churned']))\nprint(f\"AUC-ROC: {roc_auc_score(y_test, y_prob):.4f}\")\n\n# Confusion matrix \u2014 understand error types\ncm = confusion_matrix(y_test, y_pred)\nprint(f\"\\nConfusion Matrix:\")\nprint(f\"True Negatives  (correctly predicted retained): {cm&#091;0,0]}\")\nprint(f\"False Positives (retained flagged as churned):  {cm&#091;0,1]}\")\nprint(f\"False Negatives (churned missed by model):      {cm&#091;1,0]}\")\nprint(f\"True Positives  (correctly predicted churned):  {cm&#091;1,1]}\")\n\n# Cross-validation \u2014 more reliable than single train-test split\ncv_scores = cross_val_score(pipeline, X, y, cv=5, scoring='roc_auc')\nprint(f\"\\n5-Fold CV AUC: {cv_scores.mean():.4f} \u00b1 {cv_scores.std():.4f}\")\n\n# --- FEATURE IMPORTANCE ---\nrf = pipeline.named_steps&#091;'model']\nimportance = pd.DataFrame({\n    'feature': features,\n    'importance': rf.feature_importances_\n}).sort_values('importance', ascending=False)\n\nprint(\"\\nFeature Importance:\")\nprint(importance.to_string(index=False))\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What a beginner 
learns from this workflow:<\/strong> The complete ML pipeline \u2014 data preparation, train-test split, model training, evaluation with appropriate metrics \u2014 structured in a way that avoids the most common beginner mistake (data leakage from the scaler seeing the test data).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What this prepares you for:<\/strong> The technical screen at every mid-to-senior data science role in Mumbai will involve some version of this workflow. Being able to write it from scratch, explain each step, and discuss the choices made (why <code>stratify=y<\/code>? why <code>class_weight='balanced'<\/code>? why AUC-ROC over accuracy?) is what passes the screen.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Python vs. R: The Honest Comparison for Indian Job Seekers<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">This question deserves a direct answer rather than a diplomatic &#8220;both have their place.&#8221;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>R&#8217;s genuine strengths:<\/strong> Superior statistical computing for specific advanced methods (some econometric models, survival analysis variants, Bayesian computing via Stan). An excellent visualisation library (<code>ggplot2<\/code>) that many consider more elegant than Matplotlib for static publication-quality charts. A strong academic and research community, particularly in biostatistics, epidemiology, and social sciences.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Why Python wins for Indian job seekers in 2026:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The Indian job market for data roles is industry-facing, not academia-facing. The companies hiring the most data professionals in Mumbai \u2014 FinTech, e-commerce, IT services, consulting \u2014 use Python. The production ML systems at these companies are built in Python. 
The engineers, data scientists, and ML engineers who already work there use Python. Joining a Python-first organisation with only R knowledge means learning the team&#8217;s tools anyway \u2014 or being the person who cannot collaborate on shared codebases.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">R&#8217;s statistical strengths are real but largely irrelevant for 90% of applied industry data science work. The advanced statistical methods where R genuinely outperforms Python are niche enough that most data scientists in industry encounter them rarely, if at all.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>The pragmatic answer:<\/strong> If you are a biostatistician or academic researcher, or you work in a context where R is the team standard \u2014 learn R. If you are building a career in India&#8217;s technology, finance, or e-commerce industry \u2014 learn Python. The market is unambiguous.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">The Python Data Science Learning Roadmap<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Here is the sequence that takes you from &#8220;I have never written Python&#8221; to &#8220;I can build and present a complete data science project&#8221; \u2014 the minimum bar for job applications at entry level.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 1: Python Fundamentals (Weeks 1\u20133)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Before touching data science libraries, build a working foundation in Python itself:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Variables, data types, and basic operations<\/li>\n\n\n\n<li>Lists, dictionaries, tuples, sets \u2014 and their methods<\/li>\n\n\n\n<li>Conditional statements (<code>if<\/code>\/<code>elif<\/code>\/<code>else<\/code>)<\/li>\n\n\n\n<li>Loops (<code>for<\/code>, <code>while<\/code>) and list comprehensions<\/li>\n\n\n\n<li>Functions \u2014 defining, calling, arguments, return 
values<\/li>\n\n\n\n<li>File I\/O \u2014 reading and writing CSV and text files<\/li>\n\n\n\n<li>Error handling with <code>try<\/code>\/<code>except<\/code><\/li>\n\n\n\n<li>Installing packages with <code>pip<\/code>, importing with <code>import<\/code><\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Milestone:<\/strong> Write a Python script that reads a CSV file, calculates summary statistics, and writes the results to a new file \u2014 without using Pandas or NumPy.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 2: NumPy and Pandas (Weeks 4\u20137)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>NumPy arrays, shapes, reshaping, and vectorised operations<\/li>\n\n\n\n<li>Pandas Series and DataFrame \u2014 creation, indexing, slicing<\/li>\n\n\n\n<li>Data loading from CSV, Excel, and JSON<\/li>\n\n\n\n<li>Data exploration \u2014 <code>head<\/code>, <code>info<\/code>, <code>describe<\/code>, <code>value_counts<\/code><\/li>\n\n\n\n<li>Handling missing data \u2014 <code>isnull<\/code>, <code>fillna<\/code>, <code>dropna<\/code><\/li>\n\n\n\n<li>Data type conversion and string operations<\/li>\n\n\n\n<li>Filtering, sorting, and selecting rows\/columns<\/li>\n\n\n\n<li><code>groupby<\/code> with aggregation functions<\/li>\n\n\n\n<li>Merging and joining DataFrames<\/li>\n\n\n\n<li>Time series operations with datetime columns<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Milestone:<\/strong> Take a real, messy dataset (Kaggle has excellent free options \u2014 try the &#8220;IPL Dataset&#8221; or an &#8220;India E-commerce Dataset&#8221;) and produce a clean, documented Jupyter notebook answering five specific business questions using only Pandas.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 3: Visualisation (Weeks 8\u20139)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Matplotlib: figure and axes structure, line charts, bar charts, scatter plots<\/li>\n\n\n\n<li>Seaborn: histograms with KDE, box plots, heatmaps, count plots, pair 
plots<\/li>\n\n\n\n<li>Chart design principles \u2014 titles, labels, legends, colour choices<\/li>\n\n\n\n<li>Multi-panel figures with <code>plt.subplots<\/code><\/li>\n\n\n\n<li>Saving figures to a file<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Milestone:<\/strong> Produce a four-panel EDA dashboard from your Pandas dataset that a non-technical person could understand and find useful.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 4: Scikit-learn and ML Fundamentals (Weeks 10\u201314)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Train-test split and cross-validation<\/li>\n\n\n\n<li>Classification: Logistic Regression, Random Forest, XGBoost (a separate library with a Scikit-learn-compatible API)<\/li>\n\n\n\n<li>Regression: Linear Regression, Ridge, Random Forest Regressor<\/li>\n\n\n\n<li>Evaluation metrics \u2014 classification (accuracy, precision, recall, F1, AUC-ROC), regression (RMSE, MAE, R\u00b2)<\/li>\n\n\n\n<li>Data preprocessing: <code>StandardScaler<\/code> (numeric features), <code>LabelEncoder<\/code> (target labels), <code>OneHotEncoder<\/code> (categorical features)<\/li>\n\n\n\n<li>Pipeline construction to prevent data leakage<\/li>\n\n\n\n<li>Feature importance and basic model interpretability<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Milestone:<\/strong> Build an end-to-end ML project \u2014 from raw data through EDA, feature engineering, model training, evaluation, and interpretation. Publish it to GitHub with a clear README. This is your first portfolio artifact.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">The Portfolio That Opens Doors<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Three projects done well are more valuable than ten done carelessly. 
Here is what each portfolio project should contain:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Project structure that impresses hiring managers:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A clear problem statement \u2014 what business question are you answering?<\/li>\n\n\n\n<li>A data source \u2014 publicly available (Kaggle, government open data, World Bank)<\/li>\n\n\n\n<li>An EDA section with 4\u20136 visualisations that surface genuine insights<\/li>\n\n\n\n<li>A modelling section with at least two models compared on appropriate metrics<\/li>\n\n\n\n<li>A conclusions section that answers the original question in business language<\/li>\n\n\n\n<li>A README that a non-technical person can read and understand<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Dataset sources for India-relevant projects:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Kaggle&#8217;s &#8220;E-Commerce Shipping Dataset&#8221; \u2014 logistics and customer data<\/li>\n\n\n\n<li>RBI&#8217;s open data portal \u2014 financial and banking statistics<\/li>\n\n\n\n<li>data.gov.in \u2014 government datasets across sectors<\/li>\n\n\n\n<li>SEBI&#8217;s open data \u2014 capital market data<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Where to publish:<\/strong> GitHub (primary), Kaggle notebooks (for community visibility), and a PDF version of key visualisations and findings for sharing in interviews.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Library<\/th><th>Category<\/th><th>What It Does<\/th><th>When to Use It<\/th><th>Example Use Case<\/th><\/tr><\/thead><tbody><tr><td>NumPy<\/td><td>Numerical Computing<\/td><td>Handles arrays, matrices, fast math operations<\/td><td>Working with numerical data, linear algebra<\/td><td>Matrix operations, scientific computing<\/td><\/tr><tr><td>Pandas<\/td><td>Data Analysis<\/td><td>Data manipulation, cleaning, tabular operations<\/td><td>Handling 
datasets (CSV, Excel, SQL)<\/td><td>Data cleaning, EDA<\/td><\/tr><tr><td>Matplotlib<\/td><td>Data Visualization<\/td><td>Basic plotting (line, bar, scatter)<\/td><td>Simple static charts<\/td><td>Sales trends, basic analytics<\/td><\/tr><tr><td>Seaborn<\/td><td>Data Visualization<\/td><td>Advanced statistical visualizations<\/td><td>Better-looking charts with less code<\/td><td>Correlation heatmaps<\/td><\/tr><tr><td>Scikit-learn<\/td><td>Machine Learning<\/td><td>ML algorithms (regression, classification, clustering)<\/td><td>Building traditional ML models<\/td><td>Predicting house prices<\/td><\/tr><tr><td>TensorFlow<\/td><td>Deep Learning<\/td><td>Neural networks, large-scale ML models<\/td><td>Deep learning &amp; production ML<\/td><td>Image recognition<\/td><\/tr><tr><td>PyTorch<\/td><td>Deep Learning<\/td><td>Flexible deep learning framework<\/td><td>Research &amp; experimentation<\/td><td>NLP models, CV tasks<\/td><\/tr><tr><td>XGBoost<\/td><td>ML (Boosting)<\/td><td>High-performance gradient boosting<\/td><td>Structured\/tabular data problems<\/td><td>Kaggle competitions, prediction systems<\/td><\/tr><tr><td>LightGBM<\/td><td>ML (Boosting)<\/td><td>Faster gradient boosting (large datasets)<\/td><td>Large-scale ML tasks<\/td><td>Fraud detection<\/td><\/tr><tr><td>Statsmodels<\/td><td>Statistics<\/td><td>Statistical tests, regression analysis<\/td><td>In-depth statistical modeling<\/td><td>Hypothesis testing<\/td><\/tr><tr><td>OpenCV<\/td><td>Computer Vision<\/td><td>Image processing &amp; video analysis<\/td><td>Vision-based applications<\/td><td>Face detection<\/td><\/tr><tr><td>NLTK<\/td><td>NLP<\/td><td>Basic text processing tools<\/td><td>Beginner NLP tasks<\/td><td>Tokenization, stemming<\/td><\/tr><tr><td>spaCy<\/td><td>NLP<\/td><td>Fast, production-ready NLP<\/td><td>Real-world NLP applications<\/td><td>Named entity recognition<\/td><\/tr><tr><td>Transformers<\/td><td>GenAI \/ NLP<\/td><td>Pre-trained LLMs (BERT, GPT, etc.)<\/td><td>Working with 
modern AI models<\/td><td>Chatbots, summarization<\/td><\/tr><tr><td>LangChain<\/td><td>LLM Apps<\/td><td>Build apps using LLMs (chains, agents, RAG)<\/td><td>Creating AI-powered systems<\/td><td>ChatGPT-like apps<\/td><\/tr><tr><td>FastAPI<\/td><td>Backend API<\/td><td>Build high-performance APIs<\/td><td>Deploy ML\/AI models<\/td><td>Serving predictions via API<\/td><\/tr><tr><td>Streamlit<\/td><td>App Development<\/td><td>Build data apps quickly<\/td><td>Creating dashboards &amp; ML demos<\/td><td>Interactive ML apps<\/td><\/tr><tr><td>Plotly<\/td><td>Visualization<\/td><td>Interactive charts &amp; dashboards<\/td><td>Advanced UI charts<\/td><td>Business dashboards<\/td><\/tr><tr><td>Dask<\/td><td>Big Data<\/td><td>Parallel computing for large datasets<\/td><td>Handling big data beyond memory<\/td><td>Scaling Pandas workflows<\/td><\/tr><tr><td>PySpark<\/td><td>Big Data<\/td><td>Distributed data processing<\/td><td>Enterprise-scale data pipelines<\/td><td>Processing TBs of data<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">One Language. One Decision. No Looking Back.<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The data science landscape in 2026 is vast, fast-moving, and occasionally overwhelming. The one decision that simplifies everything else is also the easiest one to make: choose Python.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Once you have made that choice, the path is clear. Foundations \u2192 NumPy and Pandas \u2192 visualisation \u2192 Scikit-learn \u2192 real projects \u2192 portfolio \u2192 job. Each step is learnable. The community is enormous \u2014 Stack Overflow answers exist for virtually every error you will encounter. The documentation is excellent. The feedback loop from writing code to seeing results is fast.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You do not need to learn everything before you start. 
You need to start to learn everything.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Learn Python for Data Science the Right Way \u2014 With Real Projects.<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Join TechPaathshala&#8217;s Data Science Program<\/strong> \u2014 where Python is taught as the working language of data science from Day 1, applied to real datasets drawn from Mumbai&#8217;s industry, and built into a portfolio that hiring managers can evaluate directly.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Our curriculum takes you through Python fundamentals, the full data science library stack (NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn), SQL integration, and ML model building \u2014 with structured projects at every stage and mentorship from practitioners who use these tools in production.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">No prior programming experience required. No prerequisites except the commitment to build something real.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\ud83d\udccd TechPaathshala | Vikhroli West, Mumbai | Hybrid Available<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><a href=\"https:\/\/techpaathshala.com\/\">Explore the Data Science Program \u2192<\/a><\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>Next cohort forming now. All backgrounds welcome.<\/em><\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p class=\"wp-block-paragraph\"><em>TechPaathshala (Stalwarts Techpaathshala Pvt. Ltd.) 
| Vikhroli West, Mumbai | techpaathshala.com<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you have spent any time researching a career in data science, you have probably encountered the debate: Python or R? Python or Julia? Python or SAS? Here is the honest answer, stated plainly so you can stop losing sleep over it. Learn Python. The debate is settled. Not because R is bad \u2014 it [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":815,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"ocean_post_layout":"","ocean_both_sidebars_style":"","ocean_both_sidebars_content_width":0,"ocean_both_sidebars_sidebars_width":0,"ocean_sidebar":"","ocean_second_sidebar":"","ocean_disable_margins":"enable","ocean_add_body_class":"","ocean_shortcode_before_top_bar":"","ocean_shortcode_after_top_bar":"","ocean_shortcode_before_header":"","ocean_shortcode_after_header":"","ocean_has_shortcode":"","ocean_shortcode_after_title":"","ocean_shortcode_before_footer_widgets":"","ocean_shortcode_after_footer_widgets":"","ocean_shortcode_before_footer_bottom":"","ocean_shortcode_after_footer_bottom":"","ocean_display_top_bar":"default","ocean_display_header":"default","ocean_header_style":"","ocean_center_header_left_menu":"","ocean_custom_header_template":"","ocean_custom_logo":0,"ocean_custom_retina_logo":0,"ocean_custom_logo_max_width":0,"ocean_custom_logo_tablet_max_width":0,"ocean_custom_logo_mobile_max_width":0,"ocean_custom_logo_max_height":0,"ocean_custom_logo_tablet_max_height":0,"ocean_custom_logo_mobile_max_height":0,"ocean_header_custom_menu":"","ocean_menu_typo_font_family":"","ocean_menu_typo_font_subset":"","ocean_menu_typo_font_size":0,"ocean_menu_typo_font_size_tablet":0,"ocean_menu_typo_font_size_mobile":0,"ocean_menu_typo_font_size_unit":"px","ocean_menu_typo_font_weight":"","ocean_menu_typo_font_weight_tablet":"","ocean_menu_typo_font_weight_mobile":"","ocean_menu_typo_
transform":"","ocean_menu_typo_transform_tablet":"","ocean_menu_typo_transform_mobile":"","ocean_menu_typo_line_height":0,"ocean_menu_typo_line_height_tablet":0,"ocean_menu_typo_line_height_mobile":0,"ocean_menu_typo_line_height_unit":"","ocean_menu_typo_spacing":0,"ocean_menu_typo_spacing_tablet":0,"ocean_menu_typo_spacing_mobile":0,"ocean_menu_typo_spacing_unit":"","ocean_menu_link_color":"","ocean_menu_link_color_hover":"","ocean_menu_link_color_active":"","ocean_menu_link_background":"","ocean_menu_link_hover_background":"","ocean_menu_link_active_background":"","ocean_menu_social_links_bg":"","ocean_menu_social_hover_links_bg":"","ocean_menu_social_links_color":"","ocean_menu_social_hover_links_color":"","ocean_disable_title":"default","ocean_disable_heading":"default","ocean_post_title":"","ocean_post_subheading":"","ocean_post_title_style":"","ocean_post_title_background_color":"","ocean_post_title_background":0,"ocean_post_title_bg_image_position":"","ocean_post_title_bg_image_attachment":"","ocean_post_title_bg_image_repeat":"","ocean_post_title_bg_image_size":"","ocean_post_title_height":0,"ocean_post_title_bg_overlay":0.5,"ocean_post_title_bg_overlay_color":"","ocean_disable_breadcrumbs":"default","ocean_breadcrumbs_color":"","ocean_breadcrumbs_separator_color":"","ocean_breadcrumbs_links_color":"","ocean_breadcrumbs_links_hover_color":"","ocean_display_footer_widgets":"default","ocean_display_footer_bottom":"default","ocean_custom_footer_template":"","ocean_post_oembed":"","ocean_post_self_hosted_media":"","ocean_post_video_embed":"","ocean_link_format":"","ocean_link_format_target":"self","ocean_quote_format":"","ocean_quote_format_link":"post","ocean_gallery_link_images":"on","ocean_gallery_id":[],"footnotes":""},"categories":[71],"tags":[],"class_list":["post-801","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science","entry","has-media"],"acf":[],"_links":{"self":[{"href":"https:\/\/techpaathshala.
com\/blog\/wp-json\/wp\/v2\/posts\/801","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techpaathshala.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techpaathshala.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techpaathshala.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techpaathshala.com\/blog\/wp-json\/wp\/v2\/comments?post=801"}],"version-history":[{"count":2,"href":"https:\/\/techpaathshala.com\/blog\/wp-json\/wp\/v2\/posts\/801\/revisions"}],"predecessor-version":[{"id":912,"href":"https:\/\/techpaathshala.com\/blog\/wp-json\/wp\/v2\/posts\/801\/revisions\/912"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techpaathshala.com\/blog\/wp-json\/wp\/v2\/media\/815"}],"wp:attachment":[{"href":"https:\/\/techpaathshala.com\/blog\/wp-json\/wp\/v2\/media?parent=801"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techpaathshala.com\/blog\/wp-json\/wp\/v2\/categories?post=801"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techpaathshala.com\/blog\/wp-json\/wp\/v2\/tags?post=801"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}