Cleaning Data for Effective Data Science

David Mertz

65,67 €
VAT included
Available
Publisher:
Packt Publishing
Year of publication:
2021
Subject:
Database design and theory
ISBN:
9781801071291

A comprehensive guide for data scientists to master effective data cleaning tools and techniques

Key Features:

  • Think about your data intelligently and ask the right questions
  • Master data cleaning techniques using hands-on examples belonging to diverse domains
  • Work with detailed, commented, well-tested code samples in Python and R

Book Description:

In data science, data analysis, or machine learning, most of the effort needed to achieve your actual purpose lies in cleaning your data. Using Python, R, and command-line tools, you will learn the essential cleaning steps performed in every production data science or data analysis pipeline. This book not only teaches you data preparation but also what questions you should ask of your data.

The book dives into the practical application of tools and techniques needed for data ingestion, anomaly detection, value imputation, and feature engineering. It also offers long-form exercises at the end of each chapter to practice the skills acquired.

You will begin by looking at data ingestion of a range of data formats. Moving on, you will impute missing values, detect unreliable data and statistical anomalies, and generate synthetic features that are necessary for successful data analysis and visualization goals.

By the end of this book, you will have acquired a firm understanding of the data cleaning process necessary to perform real-world data science and machine learning tasks.

What You Will Learn:

  • Ingest and work with common tabular, hierarchical, and other data formats
  • Apply useful rules and heuristics for assessing data quality and detecting bias
  • Identify and handle unreliable data and outliers in their many forms
  • Impute sensible values into missing data and use sampling to fix imbalances
  • Generate synthetic features that help to draw out patterns in your data
  • Prepare data competently and correctly for analytic and machine learning tasks

Who this book is for:

This book is designed to benefit software developers, data scientists, aspiring data scientists, and students who are interested in data analysis or scientific computing. Basic familiarity with statistics, general concepts in machine learning, knowledge of a programming language (Python or R), and some exposure to data science are helpful. The text will also be helpful to intermediate and advanced data scientists who want to improve their rigor in data hygiene and wish for a refresher on data preparation issues.
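
To give a flavor of two of the steps named above (value imputation and anomaly detection), here is a minimal Python sketch. It is not taken from the book. The function names `impute_median` and `iqr_outliers`, the sample readings, and the choice of median imputation with a 1.5 × IQR rule are all assumptions made for illustration.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical sensor readings with two gaps and one implausible spike.
readings = [21.7, 22.0, None, 21.8, 95.0, 22.3, None, 21.9]

cleaned = impute_median(readings)   # gaps filled with the median
suspicious = iqr_outliers(cleaned)  # flags the 95.0 spike

print(cleaned)
print(suspicious)
```

The median is used here rather than the mean because the spike at 95.0 would otherwise pull the imputed value upward. Whether a flagged value is an error or a genuine observation remains a judgment call about the data.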

Related items

  • Hands-On Machine Learning on Google Cloud Platform
    Alexis Perrier / Giuseppe Ciaburro / Kishore Ayyadevara
    Enhance your understanding of Computer Vision and image processing by developing real-world projects in OpenCV 3. Key Features: Get to grips with the basics of Computer Vision and image processing. This is a step-by-step guide to developing several real-world Computer Vision projects using OpenCV 3. This book takes a special focus on working with Tesseract OCR, a free, open-source libr...
    Available

    72,22 €

  • MLOps with Red Hat OpenShift
    Faisal Masood / Ross Brigoli
    Build and manage MLOps pipelines with this practical guide to using Red Hat OpenShift Data Science, unleashing the power of machine learning workflows. Key Features: Grasp MLOps and machine learning project lifecycle through concept introductions. Get hands on with provisioning and configuring Red Hat OpenShift Data Science. Explore model training, deployment, and MLOps pipeline buildi...
    Available

    64,10 €

  • Data Labeling in Machine Learning with Python
    Vijaya Kumar Suda
    Take your data preparation, machine learning, and GenAI skills to the next level by learning a range of Python algorithms and tools for data labeling. Key Features: Generate labels for regression in scenarios with limited training data. Apply generative AI and large language models (LLMs) to explore and label text data. Leverage Python libraries for image, video, and audio data analysi...
    Available

    86,16 €

  • Data Engineering with Scala and Spark
    David Radford / Eric Tome / Rupam Bhattacharjee
    Take your data engineering skills to the next level by learning how to utilize Scala and functional programming to create continuous and scheduled pipelines that ingest, transform, and aggregate data. Key Features: Transform data into a clean and trusted source of information for your organization using Scala. Build streaming and batch-processing pipelines with step-by-step expla...
    Available

    54,33 €

  • Mastering Snowflake Platform
    Pooja Kelgaonkar
    Embark on the data journey with the ultimate guide to Snowflake mastery. Description: Handling ever-evolving data for business needs can get complex. Traditional methods create bulky and costly-to-maintain data systems. Here, Snowflake emerges as a cost-effective solution, catering to both traditional and modern data needs with zero or minimal maintenance costs. This book helps you...
    Available

    50,54 €

  • BASI DI DATI - PROGETTAZIONE, REALIZZAZIONE E PROGRAMMAZIONE
    Roberto Bandiera
    The reader is guided through the various phases of designing and building a relational database. The many practical examples use MySQL as the database management software. The SQL language for querying and updating the database is then covered. Finally, the book presents the techniques and tools for building a management...
    Available

    34,65 €

Other books by the author

  • The Puzzling Quirks of Regular Expressions
    David Mertz
    This entertaining puzzle book, for software developers and programming hobbyists, enlightens readers about many of the amazing and surprising behaviors of regular expressions. The author presents a series of questions, each inviting readers to think at length (and indeed to try out code on their own) before turning the page for the author’s discussion and solution. The code sho...
    Available

    15,18 €