Similarity Joins in Relational Database Systems

Michael Bohlen / Nikolaus Augsten

48,19 €
VAT included
Available
Publisher: Springer Nature B.V.
Year of publication: 2013
Subject: Internet and online services guides
ISBN: 9783031007231

State-of-the-art database systems manage and process a variety of complex objects, including strings and trees. For such objects equality comparisons are often not meaningful and must be replaced by similarity comparisons. This book describes the concepts and techniques to incorporate similarity into database systems.

We start out by discussing the properties of strings and trees, and identify the edit distance as the de facto standard for comparing complex objects. Since the edit distance is computationally expensive, token-based distances have been introduced to speed up edit distance computations. The basic idea is to decompose complex objects into sets of tokens that can be compared efficiently. Token-based distances are used to compute an approximation of the edit distance and prune expensive edit distance calculations.

A key observation when computing similarity joins is that many of the object pairs, for which the similarity is computed, are very different from each other. Filters exploit this property to improve the performance of similarity joins. A filter preprocesses the input data sets and produces a set of candidate pairs. The distance function is evaluated on the candidate pairs only. We describe the essential query processing techniques for filters based on lower and upper bounds. For token equality joins we describe prefix, size, positional and partitioning filters, which can be used to avoid the computation of small intersections that are not needed since the similarity would be too low.
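
The filter-then-verify idea in the description can be illustrated with a minimal Python sketch. It is not taken from the book: the function names (`edit_distance`, `qgrams`, `passes_filters`, `similarity_join`), the choice of q-grams as tokens, and the sample strings are all assumptions made for illustration. The sketch uses two well-known lower-bound filters for the string edit distance, a length filter and a q-gram count filter, and evaluates the expensive distance function on the surviving candidate pairs only.

```python
from collections import Counter
from itertools import product


def edit_distance(s, t):
    """Levenshtein distance via the classic row-by-row dynamic program."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,                # delete cs
                            curr[j - 1] + 1,            # insert ct
                            prev[j - 1] + (cs != ct)))  # substitute or match
        prev = curr
    return prev[-1]


def qgrams(s, q=2):
    """Bag (multiset) of the overlapping q-grams of s, without padding."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def passes_filters(s, t, tau, q=2):
    """Cheap necessary conditions for edit_distance(s, t) <= tau."""
    # Length filter: every edit operation changes the length by at most 1.
    if abs(len(s) - len(t)) > tau:
        return False
    # Count filter: one edit operation destroys at most q q-grams, so the
    # bag overlap must be at least max(|s|, |t|) - q + 1 - q * tau.
    overlap = sum((qgrams(s, q) & qgrams(t, q)).values())
    return overlap >= max(len(s), len(t)) - q + 1 - q * tau


def similarity_join(left, right, tau, q=2):
    """Return (result pairs, number of candidate pairs that were verified)."""
    candidates = [(s, t) for s, t in product(left, right)
                  if passes_filters(s, t, tau, q)]
    result = [(s, t) for s, t in candidates if edit_distance(s, t) <= tau]
    return result, len(candidates)


# Hypothetical sample data, chosen only to exercise the filters.
left = ["similarity", "database", "join"]
right = ["similarly", "databases", "joint", "tree"]
pairs, verified = similarity_join(left, right, tau=2)
print(pairs, verified)
```

In this sketch the filters never discard a true result, because both are lower-bound arguments on the edit distance, but they may let through false candidates (for very short strings the count threshold drops to zero or below). That is why the verification step is still required. A nested loop over all pairs is used for brevity; the index-based techniques named in the description, such as prefix filtering, aim to avoid enumerating all pairs in the first place.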

Related items

  • Transformation of Knowledge, Information and Data
    Patrick Van Bommel
    ...
  • Advanced Geospatial Practices in Natural Environment Resource Management
    Today, the relentless depletion of natural resources has reached a critical juncture, demanding innovative solutions. Advanced Geospatial Practices in Natural Environment Resource Management dives into the intricate tapestry of issues jeopardizing ecosystems. This book systematically dissects the fundamental drivers, traces the historical evolution, and elucidates the underlyin...
    Available

    274,88 €

  • Accelerate Model Training with PyTorch 2.X
    Maicon Melo Alves
    Dramatically accelerate the building process of complex models using PyTorch to extract the best performance from any computing environment. Key Features: Reduce the model-building time by applying optimization techniques and approaches; harness the computing power of multiple devices and machines to boost the training process; focus on model quality by quickly evaluating differe...
    Available

    64,00 €

  • Information Theory for Data Science
    Changho Suh
    Information theory deals with mathematical laws that govern the flow, representation and transmission of information, just as the field of physics concerns laws that govern the behavior of the physical universe. The foundation was made in the context of communication while characterizing the fundamental limits of communication and offering codes (sometimes called algorithms) to...
  • Theory of Decision Under Uncertainty
    Itzhak Gilboa
    ...
    Available

    49,52 €