
LEARN APACHE SPARK: Build Scalable Pipelines with PySpark and Optimization - Softcover

9798289704603: LEARN APACHE SPARK: Build Scalable Pipelines with PySpark and Optimization

Synopsis

LEARN APACHE SPARK Build Scalable Pipelines with PySpark and Optimization

This book is designed for students, developers, data engineers, data scientists, and technology professionals who want to master Apache Spark in practice, across corporate environments, public clouds, and modern integrations.

You will learn to build scalable pipelines for large-scale data processing, orchestrating distributed workloads with AWS EMR, Databricks, Azure Synapse, and Google Cloud Dataproc. The content covers integration with Hadoop, Hive, Kafka, SQL, Delta Lake, MongoDB, and Python, as well as advanced techniques in tuning, job optimization, real-time analysis, machine learning with MLlib, and workflow automation.

Includes:

• Implementation of ETL and ELT pipelines with Spark SQL and DataFrames

• Streaming data processing and integration with Kafka and AWS Kinesis

• Optimization of distributed jobs, performance tuning, and use of Spark UI

• Integration of Spark with S3, data lakes, NoSQL, and relational databases

• Deployment on managed clusters in AWS, Azure, and Google Cloud

• Applied Machine Learning with MLlib, Delta Lake, and Databricks

• Automation of routines, monitoring, and scalability for Big Data

By the end, you will master Apache Spark as a professional solution for data analysis, process automation, and machine learning in complex, high-performance environments.

Content reviewed by A.I. with technical supervision.

Keywords: apache spark, big data, pipelines, distributed processing, aws emr, databricks, streaming, etl, machine learning, cloud integration. Related roles: Google Data Engineer, AWS Data Analytics, Azure Data Engineer, Big Data Engineer, MLOps, DataOps Professional.

The information provided in the "Synopsis" section may refer to another edition of this title.

Search results for LEARN APACHE SPARK: Build Scalable Pipelines with PySpark...


Rodrigues, Diego; Smart Tech Content, StudioD21
Published by Independently published, 2025
ISBN 13: 9798289704603
New, Softcover
Print on demand

Seller: California Books, Miami, FL, United States

Seller rating: 5 out of 5 stars

Condition: New. Print on Demand. Seller reference: I-9798289704603

Price: EUR 17.57
Shipping: Free (to United States)

Quantity available: more than 20


Rodrigues, Diego; Smart Tech Content, StudioD21
Published by Independently published, 2025
ISBN 13: 9798289704603
New, Softcover

Seller: Best Price, Torrance, CA, United States

Seller rating: 5 out of 5 stars

Condition: New. SUPER FAST SHIPPING. Seller reference: 9798289704603

Price: EUR 11.53
Shipping: EUR 7.66 (to United States)

Quantity available: 2


Studiod21 Smart Tech Content
Published by Independently Published, 2025
ISBN 13: 9798289704603
New, Paperback

Seller: Grand Eagle Retail, Mason, OH, United States

Seller rating: 5 out of 5 stars

Paperback. Condition: New. Shipping may be from multiple locations in the US or from the UK, depending on stock availability. Seller reference: 9798289704603

Price: EUR 19.71
Shipping: Free (to United States)

Quantity available: 1


Studiod21 Smart Tech Content
Published by Independently Published, 2025
ISBN 13: 9798289704603
New, Paperback

Seller: CitiRetail, Stevenage, United Kingdom

Seller rating: 5 out of 5 stars

Paperback. Condition: New. Shipping may be from our UK warehouse or from our Australian or US warehouses, depending on stock availability. Seller reference: 9798289704603

Price: EUR 20.07
Shipping: EUR 42.43 (from United Kingdom to United States)

Quantity available: 1
