Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools - Softcover

Mertz, David

9781801071291: Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools

Softcover

ISBN-10: 1801071292; ISBN-13: 9781801071291

Publisher: Packt Publishing, 2021


7 used, from EUR 17.02

17 new, from EUR 42.59

A comprehensive guide for data scientists to master effective data cleaning tools and techniques

Key Features

Think about your data intelligently and ask the right questions
Master data cleaning techniques using hands-on examples from diverse domains
Work with detailed, commented, well-tested code samples in Python and R

Book Description

In data science, data analysis, or machine learning, most of the effort needed to achieve your actual purpose lies in cleaning your data. Using Python, R, and command-line tools, you will learn the essential cleaning steps performed in every production data science or data analysis pipeline. This book not only teaches you data preparation but also what questions you should ask of your data.

The book dives into the practical application of tools and techniques needed for data ingestion, anomaly detection, value imputation, and feature engineering. It also offers long-form exercises at the end of each chapter to practice the skills acquired.

You will begin by looking at data ingestion of a range of data formats. Moving on, you will impute missing values, detect unreliable data and statistical anomalies, and generate synthetic features that are necessary for successful data analysis and visualization goals.

By the end of this book, you will have acquired a firm understanding of the data cleaning process necessary to perform real-world data science and machine learning tasks.

What you will learn

Ingest and work with common tabular, hierarchical, and other data formats
Apply useful rules and heuristics for assessing data quality and detecting bias
Identify and handle unreliable data and outliers in their many forms
Impute sensible values into missing data and use sampling to fix imbalances
Generate synthetic features that help to draw out patterns in your data
Prepare data competently and correctly for analytic and machine learning tasks

Who this book is for

This book is designed to benefit software developers, data scientists, aspiring data scientists, and students who are interested in data analysis or scientific computing.

Basic familiarity with statistics, general concepts in machine learning, knowledge of a programming language (Python or R), and some exposure to data science are helpful.

The text will also be helpful to intermediate and advanced data scientists who want to improve their rigor in data hygiene and wish for a refresher on data preparation issues.

Table of Contents

Data Ingestion - Tabular Formats
Data Ingestion - Hierarchical Formats
Data Ingestion - Repurposing Data Sources
The Vicissitudes of Error - Anomaly Detection
The Vicissitudes of Error - Data Quality
Rectification and Creation - Value Imputation
Rectification and Creation - Feature Engineering
Ancillary Matters - Closure/Glossary

The information provided in the "Synopsis" section may refer to another edition of this title.

About the author

David Mertz, Ph.D. is the founder of KDM Training, a partnership dedicated to educating developers and data scientists in machine learning and scientific computing. He created a data science training program for Anaconda Inc. and was a senior trainer for them. With the advent of deep neural networks, he has turned to training our robot overlords as well.

He previously worked for 8 years with D. E. Shaw Research and was also a Director of the Python Software Foundation for 6 years. David remains co-chair of its Trademarks Committee and Scientific Python Working Group. His columns, Charming Python and XML Matters, were once the most widely read articles in the Python world.

The information provided in the "About the book" section may refer to another edition of this title.

Publisher: Packt Publishing
Publication date: 2021
Language: English
ISBN-10: 1801071292
ISBN-13: 9781801071291
Binding: Paperback
Number of pages: 498
Manufacturer contact details: not available
Responsible person: gpsr@libri.de

Friedensallee 273
Hamburg
22763
Germany


Search results for Cleaning Data for Effective Data Science: Doing the...

Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools

Mertz, David

Published by Packt Publishing, 2021

ISBN-10: 1801071292; ISBN-13: 9781801071291

Used, softcover

Seller: -OnTimeBooks-, Phoenix, AZ, United States

Seller rating: 5 out of 5 stars

Condition: very_good. Gently read. May have name of previous ownership, or ex-library edition. Binding tight; spine straight and smooth, with no creasing; covers clean and crisp. Minimal signs of handling or shelving. 100% GUARANTEE! Shipped with delivery confirmation, if you're not satisfied with purchase please return item! Ships USPS Media Mail. Seller reference no. OTV.1801071292.VG

Buy used

EUR 17.02

Free shipping
Domestic shipping: United States

Quantity available: 1

Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools

Mertz, David

Published by Packt Publishing, 2021

ISBN-10: 1801071292; ISBN-13: 9781801071291

Used, softcover

Seller: Goodwill Books, Hillsboro, OR, United States

Seller rating: 5 out of 5 stars

Condition: acceptable. Fairly worn, but readable and intact. If applicable: Dust jacket, disc or access code may not be included. Seller reference no. GICWV.1801071292.A

Buy used

EUR 16.73

Shipping: EUR 3.50
Domestic shipping: United States

Quantity available: 1

Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools

Mertz

Published by Packt Publishing, 2021

ISBN-10: 1801071292; ISBN-13: 9781801071291

Used, paperback

Seller: Textbooks_Source, Columbia, MO, United States

Seller rating: 5 out of 5 stars

Paperback. Condition: Good. Ships in a BOX from Central Missouri! May not include working access code. Will not include dust jacket. Has used sticker(s) and some writing or highlighting. UPS shipping for most packages (Priority Mail for AK/HI/APO/PO Boxes). Seller reference no. 009781373U

Buy used

EUR 22.17

Shipping: EUR 3.50
Domestic shipping: United States

Quantity available: 1

Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools

Mertz, David

Published by Packt Publishing, 2021

ISBN-10: 1801071292; ISBN-13: 9781801071291

Used, softcover

Seller: GreatBookPrices, Columbia, MD, United States

Seller rating: 5 out of 5 stars

Condition: As New. Unread book in perfect condition. Seller reference no. 42642714

Buy used

EUR 28.57

Shipping: EUR 2.31
Domestic shipping: United States

Quantity available: 1

Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools

Mertz, David

Published by Packt Publishing, 2021

ISBN-10: 1801071292; ISBN-13: 9781801071291

Used, softcover

Seller: GreatBookPrices, Columbia, MD, United States

Seller rating: 5 out of 5 stars

Condition: good. May show signs of wear, highlighting, writing, and previous use. This item may be a former library book with typical markings. No guarantee on products that contain supplements. Your satisfaction is 100% guaranteed. Twenty-five year bookseller with shipments to over fifty million happy customers. Seller reference no. 42642714-5

Buy used

EUR 30.95

Shipping: EUR 2.31
Domestic shipping: United States

Quantity available: 1

Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools

Mertz, David

Published by Packt Publishing, 2021

ISBN-10: 1801071292; ISBN-13: 9781801071291

New, softcover

Seller: GreatBookPrices, Columbia, MD, United States

Seller rating: 5 out of 5 stars

Condition: New. Seller reference no. 42642714-n

Buy new

EUR 42.59

Shipping: EUR 2.31
Domestic shipping: United States

Quantity available: 1

Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools (Paperback or Softback)

Mertz, David

Published by Packt Publishing 3/31/2021, 2021

ISBN-10: 1801071292; ISBN-13: 9781801071291

New, paperback or softback

Seller: BargainBookStores, Grand Rapids, MI, United States

Seller rating: 5 out of 5 stars

Paperback or Softback. Condition: New. Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools. Book. Seller reference no. BBS-9781801071291

Buy new

EUR 44.98

Free shipping
Domestic shipping: United States

Quantity available: 5

Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools

Mertz, David

Published by Packt Publishing, 2021

ISBN-10: 1801071292; ISBN-13: 9781801071291

New, softcover

Seller: California Books, Miami, FL, United States

Seller rating: 4 out of 5 stars

Condition: New. Seller reference no. I-9781801071291

Buy new

EUR 46.03

Free shipping
Domestic shipping: United States

Quantity available: more than 20

Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools

Mertz, David

Published by Packt Publishing, 2021

ISBN-10: 1801071292; ISBN-13: 9781801071291

New, softcover

Seller: GreatBookPricesUK, Woodford Green, United Kingdom

Seller rating: 5 out of 5 stars

Condition: New. Seller reference no. 42642714-n

Buy new

EUR 31.69

Shipping: EUR 17.41
Shipping from United Kingdom to United States

Quantity available: more than 20

Cleaning Data for Effective Data Science

David Mertz

Published by Packt Publishing Limited, 2021

ISBN-10: 1801071292; ISBN-13: 9781801071291

New, PAP

Print on demand

Seller: PBShop.store UK, Fairford, GLOS, United Kingdom

Seller rating: 5 out of 5 stars

PAP. Condition: New. New Book. Delivered from our UK warehouse in 4 to 14 business days. THIS BOOK IS PRINTED ON DEMAND. Established seller since 2000. Seller reference no. L0-9781801071291

Buy new

EUR 48.95

Shipping: EUR 3.82
Shipping from United Kingdom to United States

Quantity available: more than 20

14 other copies of this book are available
