PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes - Softcover

 
9781484243367: PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes

Unfortunately, this ISBN edition is no longer available.

Synopsis

Chapter 1: Introduction to PySparkSQL

Chapter Goal: The reader will learn about PySpark, PySparkSQL, the Catalyst optimizer, Project Tungsten, and Hive.

No. of pages: 20-30

Sub-Topics:

1. PySpark

2. PySparkSQL

3. Hive

4. Catalyst

5. Project Tungsten

 

Chapter 2: Some Time with Installation

Chapter Goal: The reader will learn how to install Spark, Hive, PostgreSQL, MySQL, MongoDB, Cassandra, etc.

No. of pages: 30-40

Sub-Topics:

1. Installing Spark

2. Installing Hive

3. Installing MySQL

4. Installing MongoDB

Chapter 3: IO in PySparkSQL

Chapter Goal: This chapter provides recipes that enable the reader to create PySparkSQL DataFrames from different sources.

No. of pages: 40-50

Sub-Topics:

1. Creating a DataFrame from data

2. Reading a CSV file to create a DataFrame

3. Reading a JSON file to create a DataFrame

4. Saving DataFrames to different formats

 

Chapter 4: Operations on PySparkSQL DataFrames

Chapter Goal: The reader will learn about data filtering, data manipulation, descriptive data analysis, dealing with missing values, etc.

No. of pages: 40-50

1. Data filtering

2. Data manipulation

3. Row and column manipulation

 

Chapter 5: Data Merging and Data Aggregation Using PySparkSQL

Chapter Goal: The reader will learn about data merging and aggregation using PySparkSQL.

1. Data merging

2. Data aggregation

 

Chapter 6: SQL, NoSQL and PySparkSQL

Chapter Goal: The reader will learn to run SQL and HiveQL queries on DataFrames.

No. of pages: 30-40

Sub-Topics:

1. Running SQL on a DataFrame

2. Running HiveQL

 

Chapter 7: Structured Streaming

Chapter Goal: The reader will learn about Structured Streaming.

No. of pages: 30-40

1. Different types of modes

2. Data aggregation in Structured Streaming

3. Different types of sources

Chapter 8: Optimizing PySparkSQL

Chapter Goal: The reader will learn about optimizing PySparkSQL.

No. of pages: 20-30

Optimizing PySparkSQL

Chapter 9: GraphFrames

Chapter Goal: The reader will learn about graph data analysis with GraphFrames.

No. of pages: 30-40

1. GraphFrame Creat

The information provided in the "Synopsis" section may refer to another edition of this title.

(No copies available)

Other popular editions of the same title

9781484243343: PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes

Featured edition

ISBN 10: 148424334X ISBN 13: 9781484243343
Publisher: Apress, 2019
Softcover