Using OpenRefine

Max De Wilde / Ruben Verborgh

54,87 €
VAT included
Available
Publisher:
Packt Publishing
Year of publication:
2013
Subject
Computer programming/software development
ISBN:
9781783289080

With this book on OpenRefine, managing and cleaning your large datasets suddenly got a lot easier! With a cookbook approach and free datasheets included, you’ll quickly and painlessly improve your data managing capabilities.

Key Features

  • Create links between your dataset and others in an instant
  • Effectively transform data with regular expressions and the General Refine Expression Language
  • Spot issues in your dataset and take effective action with just a few clicks

Book Description

Data today is like gold - but how can you manage your most valuable assets? Managing large datasets used to be a task for specialists, but the game has changed - data analysis is an open playing field. Messy data is now in your hands! With OpenRefine the task is a little easier, as it provides you with the necessary tools for cleaning and presenting even the most complex data. Once it’s clean, that’s when you can start finding value.

Using OpenRefine takes you on a practical, actionable tour of this popular data transformation tool. Packed with cookbook-style recipes that will help you properly get to grips with data, this book is an accessible tutorial for anyone who wants to maximize the value of their data.

This book will teach you all the necessary skills to handle any large dataset and to turn it into high-quality data for the Web. After you learn how to analyze data and spot issues, we’ll see how we can solve them to obtain a clean dataset. Messy and inconsistent data is recovered through advanced techniques such as automated clustering. We’ll then show how to extract links from keyword and full-text fields using reconciliation and named-entity extraction.

What you will learn

  • Import data in various formats
  • Explore datasets in a matter of seconds
  • Apply basic and advanced cell transformations
  • Deal with cells that contain multiple values
  • Create instantaneous links between datasets
  • Filter and partition your data easily with regular expressions
  • Use named-entity extraction on full-text fields to automatically identify topics
  • Perform advanced data operations with the General Refine Expression Language

Who this book is for

This book is targeted at anyone who works on or handles a large amount of data. No prior knowledge is required, as we start from the very beginning and gradually reveal more advanced features. You don’t even need your own dataset, as we provide example data to try out the book’s recipes.

Related items

  • SPARK 2014 Reference Manual
    AdaCore / Altran UK Ltd
    SPARK 2014 is a programming language and a set of verification tools designed to meet the needs of high-assurance software development. SPARK 2014 is based on Ada 2012, both subsetting the language to remove features that defy verification and extending the system of contracts and aspects to support modular, formal verification. This manual is available online for free at ...
    Available

    19,91 €

  • Software and Intelligent Sciences
    Yingxu Wang
    The junction of software development and engineering combined with the study of intelligence has created a bustling intersection of theory, design, engineering, and conceptual thought. Software and Intelligent Sciences: New Transdisciplinary Findings sits at a crossroads and informs advanced researchers, students, and practitioners on the developments in computer science, theor...
  • Power System Planning Technologies and Applications
    Fawwaz Elkarmi / Nazih Abu Shikhah / Nazih Abu-Shikhah
    Planning is an important function of the management of any business, providing knowledge of future prospects and enabling prudent and appropriate decision-making. Planning is especially critical for power systems, since electricity is a fundamental part of modern societies and many conventional electrical energy resources currently in use are limited. Power System Planning Tech...
  • Concept Parsing Algorithms (CPA) for Textual Analysis and Discovery
    Masha Etkind / Uri Shafrir
    Text analysis tools aid in extracting meaning from digital content. As digital text becomes more and more complex, new techniques are needed to understand conceptual structure. Concept Parsing Algorithms (CPA) for Textual Analysis and Discovery: Emerging Research and Opportunities provides an innovative perspective on the application of algorithmic tools to study unstructured d...
  • Model-Based Design for Effective Control System Development
    Wei Wu
    Control systems are an integral aspect of modern society and exist across numerous domains and applications. As technology advances more and more, the complexity of such systems continues to increase exponentially. Model-Based Design for Effective Control System Development is a critical source of scholarly information on model-centric approaches and implementations for control...
  • Verification, Validation and Testing in Software Engineering
    ...