18 must-read books on R and Python for data scientists

Data Science | Published September 10, 2018

Some people learn better by doing, while others learn more easily by reading. When the subject is coding, it pays to learn any way you can. Starting with two of the most popular languages in data science helps, but newcomers will still have to find the learning method that works best for them. Thankfully, there are plenty of books on the subject.
First off, R and Python are both widely used for statistics and data analysis. R was built for statistical computing, is used mostly by statisticians and can get really detailed. Python is a general-purpose language that is easy to learn and a great one to start with. In this list of recommended books, we’re going to start with R and then move on to Python for a more well-rounded approach.

1. R Cookbook

Written by Paul Teetor, our first book helps people overcome daily struggles in data preprocessing by presenting its tips as recipes. The topics covered range from probability to statistics to time series analysis. Teetor’s R Cookbook starts at $30.

2. Practical Data Science With R

Written by Nina Zumel and John Mount, this book teaches how to use data science methods in the real world. The book covers real-world challenges in model building, deployment and other areas. Zumel and Mount’s Practical Data Science With R starts at $12.

3. Hands-on Programming With R

Written by Garrett Grolemund, this selection is great for beginners. This book teaches how to write functions and loops in R and how to assemble and disassemble data objects. Explanations are clear, and examples can be reproduced. Grolemund’s Hands-on Programming With R starts at $23.

4. Advanced R

Written by Hadley Wickham, one of the most well-known authors in the field, this book is for intermediate and advanced R users. The selection teaches you the fundamentals of data types, as well as how functional programming can be used to solve bigger problems. Wickham’s Advanced R starts at $30.

5. R Graphics Cookbook

Written by Winston Chang, this selection covers graphs and data manipulation. Chang goes into detail and covers a broad range of topics. Chang’s R Graphics Cookbook starts at $34.

6. Text Mining With R: A Tidy Approach

Written by Julia Silge and David Robinson, this book is another great one for beginners to learn how to mine text data. The book also lets you practice tidyverse principles in text datasets. There are many examples of using R to explore literature, news and other data sources. Silge and Robinson’s Text Mining With R starts at $20.

7. Data Analysis for the Life Sciences With R

Written by Rafael Irizarry and Michael Love, this book helps those interested in data and statistical analysis with R in life sciences. The methods described in this selection are suited for modern data science. Irizarry and Love’s Data Analysis for the Life Sciences With R is available for free.

8. Fundamentals of Data Visualization

Written by Claus Wilke, a professor from UT Austin, the book is a work in progress. This selection focuses primarily on data visualization. While the book will be published by O’Reilly in the future, an early version of Fundamentals of Data Visualization is available for free.

9. The Art of R Programming

Written by Norman Matloff, this book focuses on software development. It moves from basic data types to advanced topics, and readers don’t need statistical knowledge to begin. Matloff’s The Art of R Programming starts at $10.

10. Advanced Machine Learning With Python

Written by John Hearty, this first book about Python is for machine learning enthusiasts. You’ll learn unsupervised methods, auto-encoders, feature engineering techniques and much more. Hearty’s Advanced Machine Learning With Python starts at $40.

11. Building Machine Learning Systems With Python

Written by Willi Richert and Luis Pedro Coelho, this book covers the basics along with more advanced techniques. Great for newcomers, this selection also covers image processing, sentiment analysis and more. Richert and Coelho’s Building Machine Learning Systems With Python starts at $33.

12. Programming Collective Intelligence

Written by Toby Segaran, this selection introduces the reader to several machine learning (ML) algorithms, including support vector machines (SVMs), decision trees and clustering. This is another excellent book for beginners wanting to learn more about ML. Segaran’s Programming Collective Intelligence starts at $26.

13. Python Machine Learning

Written by Sebastian Raschka, this is one of the most comprehensive books on ML in Python. There are detailed explanations about machine learning, neural networks, clustering and other topics. Raschka’s Python Machine Learning starts at $10.

14. Introduction to Machine Learning With Python

Written by Sarah Guido and Andreas Müller, this book is more of an introduction to machine learning. It helps teach supervised and unsupervised learning algorithms. Guido and Müller’s Introduction to Machine Learning With Python starts at $24.

15. Python for Data Analysis

Written by Wes McKinney, this book is a comprehensive guide on data analysis. Topics include manipulation, processing, cleaning and other ways to work with data. McKinney’s Python for Data Analysis starts at $36.

16. Mastering Python for Data Science

Written by Samir Madhavan, this selection is another introduction to data structures in NumPy and pandas. There are useful descriptions for importing data. The book also teaches how to perform linear algebra in Python. Madhavan’s Mastering Python for Data Science starts at $18.
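
To give a feel for what linear algebra in Python can look like, here is a minimal sketch that solves a small system of equations with NumPy. The system and its numbers are made up for illustration and are not taken from the book.

```python
import numpy as np

# Hypothetical 2x2 system, chosen only for illustration:
#   2x +  y = 5
#    x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

# np.linalg.solve finds x such that A @ x == b (A must be square and non-singular).
x = np.linalg.solve(A, b)

# Verify the solution by multiplying back.
residual = A @ x - b

print(x)
print(residual)
```

Using a solver like this is generally preferable to computing the inverse of the matrix by hand, since it is both faster and more numerically stable.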

17. Python Cookbook

Written by David Beazley and Brian Jones, this book with an awkward name is comprehensive and detailed. The book provides examples of idiomatic Python 3 code and explains why and how coding works. Beazley and Jones’ Python Cookbook starts at $20.
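
As a rough idea of what “idiomatic Python 3” can mean in practice, the sketch below uses star unpacking and collections.Counter from the standard library. The sample data is hypothetical, and the snippet is our own illustration, not an excerpt from the book.

```python
from collections import Counter

# Hypothetical contact record, made up for illustration.
record = ("Ada", "ada@example.com", "555-0100", "555-0199")

# Star unpacking collects the variable-length tail into a list.
name, email, *phones = record

# Hypothetical list of languages mentioned in some survey.
words = ["r", "python", "r", "sql", "python", "r"]

# Counter tallies hashable items; most_common(2) returns the two
# most frequent (item, count) pairs, highest count first.
top_two = Counter(words).most_common(2)

print(name, phones)
print(top_two)
```

Both idioms replace several lines of manual indexing and counting loops with a single readable statement.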

18. How to Think Like a Computer Scientist: Learning With Python

Written by Allen Downey, Jeffrey Elkner and Chris Meyers, this book is great for beginners curious about the Python language. Topics include general computer science, and the book takes a more formal approach. Downey, Elkner and Meyers’ Learning With Python is available for free.

These books come highly recommended across multiple sources, so you’ve likely heard of them before if you’ve been working with R or Python for a while. You may need to do more research to find resources tailored to your situation and to the specific areas you’re struggling with. This list covers broad topics, so don’t be afraid to dig a little deeper.