Small summaries for big data / Graham Cormode, University of Warwick, Ke Yi, Hong Kong University of Science and Technology.

Material type: Text
Language: English
Publisher: Cambridge : Cambridge University Press, 2021
Description: 1 online resource (viii, 270 pages) : digital, PDF file(s)
Content type:
  • text
Carrier type:
  • online resource
ISBN:
  • 9781108769938 (ebook)
DDC classification:
  • 5.7
LOC classification:
  • QA76.9.B45 C67 2021
Summary: The massive volume of data generated in modern applications can overwhelm our ability to conveniently transmit, store, and index it. For many scenarios, building a compact summary of a dataset that is vastly smaller enables flexibility and efficiency in a range of queries over the data, in exchange for some approximation. This comprehensive introduction to data summarization, aimed at practitioners and students, showcases the algorithms, their behavior, and the mathematical underpinnings of their operation. The coverage starts with simple sums and approximate counts, building to more advanced probabilistic structures such as the Bloom Filter, distinct value summaries, sketches, and quantile summaries. Summaries are described for specific types of data, such as geometric data, graphs, and vectors and matrices. The authors offer detailed descriptions of and pseudocode for key algorithms that have been incorporated in systems from companies such as Google, Apple, Microsoft, Netflix and Twitter.

Title from publisher's bibliographic system (viewed on 29 Oct 2020).
