Python Data Analysis: Master Python Analytics with Machine Learning, Deep Learning, GenAI, LLMs, and Data Engineering - Softcover

Avinash Navlani; Cornellius Yudha Wijaya

 
9781806022878: Python Data Analysis: Master Python Analytics with Machine Learning, Deep Learning, GenAI, LLMs, and Data Engineering

Synopsis

Understand data analysis pipelines built with Python, pandas, scikit-learn, machine learning, and data visualization techniques. Build scalable workflows for time series, NLP, image analytics, and big data processing.

Key Features

  • Prepare, clean, and transform data with Python, pandas, and exploratory data analysis techniques
  • Apply machine learning with Python using regression, classification, clustering, PCA, and Bayesian methods
  • Scale analytics workflows using Dask, Ray, Modin, and PySpark

Book Description

Modern data analysis goes beyond cleaning and visualizing data. Today's practitioners need to build scalable data pipelines, apply machine learning, work with text and image data, and understand emerging AI techniques such as Generative AI and Large Language Models (LLMs). This guide shows you how to tackle these challenges using Python's modern data ecosystem.

Unlike books focused on a single library or technique, this book provides an end-to-end approach to Python data analysis. You'll learn how to move from data preparation and exploratory analysis to machine learning, NLP, image analytics, scalable processing, and AI-powered workflows.

Starting with statistical foundations, you'll learn how to clean, transform, wrangle, and visualize data. You'll then explore time series analysis, signal processing, forecasting, and predictive analytics before applying machine learning techniques such as regression, classification, clustering, PCA, probabilistic methods, and Bayesian approaches.

The book also covers graph analytics, sentiment analysis, NLP, image analytics, Generative AI, and LLMs. Finally, you'll learn to scale analytics workflows using Dask, Modin, Ray, and PySpark.

By the end of the book, you'll be able to build end-to-end data analysis pipelines and apply modern data science and AI techniques to solve real-world challenges.

What you will learn

  • Prepare, clean, and transform data for exploratory data analysis and data wrangling
  • Analyze and visualize data using Python and pandas
  • Perform time series analysis, forecasting, and signal processing
  • Apply machine learning with Python using scikit-learn techniques
  • Use regression, classification, clustering, PCA, and Bayesian methods
  • Perform sentiment analysis, NLP, graph analytics, and image analytics
  • Accelerate workflows using Dask, Modin, and Ray
  • Build scalable big data analytics pipelines with PySpark

Who this book is for

This book is for data analysts, data scientists, business analysts, statisticians, students, and academic professionals who want to strengthen their Python data analysis skills. It is ideal for readers looking to apply data science with Python to real-world problems involving data preparation, visualization, machine learning, NLP, image analytics, and big data processing. A basic understanding of mathematics and working knowledge of Python will help you get the most from this book.

Table of Contents

  1. Getting Started with Python Libraries
  2. NumPy and Pandas
  3. Statistics for Data Insights
  4. Linear Algebra
  5. Data Visualization
  6. Retrieving, Processing, and Storing Data
  7. Cleaning Messy Data
  8. Time-Series Analysis
  9. Supervised Learning: Regression and Classification
  10. Unsupervised Learning: Dimensionality Reduction, Clustering, Anomaly Detection
  11. Ensemble Methods: Bagging and Boosting Methods
  12. Artificial Neural Networks and Deep Learning
  13. Analyzing Text Data
  14. Analyzing Image Data
  15. LLMs and Gen AI
  16. Parallel Computing Using Dask, Modin, and Ray
  17. Big Data Analytics using PySpark

The information provided in the "Synopsis" section may refer to another edition of this title.

About the author

Avinash Navlani, PhD in Data Science, is a senior data scientist, researcher, and educator with 14 years of experience in data science, including 9 years in industry, 4 years in academia, and 1 year in research. He has developed machine learning models, optimization solutions, NLP systems, scalable data pipelines, and cloud-based MLOps platforms across healthcare, retail, finance, oil & gas, and manufacturing. His expertise includes Python, PySpark, Airflow, Databricks, Azure ML, MLflow, and Data Engineering. A former lecturer and speaker, he is passionate about applying analytics to solve real-world problems.

Cornellius Yudha Wijaya has over eight years of experience in data science, machine learning, and artificial intelligence. He currently works as a data scientist manager, where he leads AI initiatives, manages team members, and helps drive the development of practical data and AI solutions. Over the course of his career, he has worked across data science, AI product development, and technical education, with experience in building machine learning systems, supporting business decision-making, and making advanced analytics more usable in real-world settings. He has also written extensively on data science, Python, machine learning, and generative AI, with a strong focus on practical learning and applied problem-solving.

The information provided in the "About this book" section may refer to another edition of this title.