Bad Data Handbook

Welcome to data science's dirty secret: real-world data is messy. Data scientists must spend a good deal of time playing software developer, writing code to clean up data before they can actually do anything constructive with it. It's a necessary evil, but you can still make the most of it. This practical book walks you through several real-world examples to demonstrate the theory and practice behind working with and cleaning up dirty data. No one tool solves all of the problems well. Wise data scientists learn many tools and learn where each one shines. To that end, this book takes a polyglot approach: most examples will involve R and Python, but expect the occasional smattering of Groovy and sed/awk fun.

Author | Q. Ethan McCallum
Language | English
Type | Paperback
Category | Computers & Information Technology


