R for Data Science

Import, Tidy, Transform, Visualize, and Model Data

by Garrett Grolemund, Hadley Wickham

Book Description

Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible.

Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You'll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you've learned along the way.

You'll learn how to:
- Wrangle: transform your datasets into a form convenient for analysis;
- Program: learn powerful R tools for solving data problems with greater clarity and ease;
- Explore: examine your data, generate hypotheses, and quickly test them;
- Model: provide a low-dimensional summary that captures true "signals" in your dataset;
- Communicate: learn R Markdown for integrating prose, code, and results.

This open book is licensed under a Creative Commons License (CC BY-NC-ND). A free PDF download is not available, but you can read R for Data Science online for free.

Table of Contents

Chapter 1
Chapter 2
Chapter 3: Data visualisation
Chapter 4: Workflow: basics
Chapter 5: Data transformation
Chapter 6: Workflow: scripts
Chapter 7: Exploratory Data Analysis
Chapter 8: Workflow: projects
Chapter 9
Chapter 10
Chapter 11: Data import
Chapter 12: Tidy data
Chapter 13: Relational data
Chapter 14
Chapter 15
Chapter 16: Dates and times
Chapter 17
Chapter 18
Chapter 19
Chapter 20
Chapter 21
Chapter 22
Chapter 23: Model basics
Chapter 24: Model building
Chapter 25: Many models
Chapter 26
Chapter 27: R Markdown
Chapter 28: Graphics for communication
Chapter 29: R Markdown formats
Chapter 30: R Markdown workflow

Book Details

R for Data Science
Computer Science
O'Reilly Media
ISBN13 Digital
ISBN10 Digital

Related Books

Regression Models for Data Science in R
The ideal reader for this book is quantitatively literate and has a basic understanding of statistical concepts and R programming. The student should have a basic understanding of statistical inference, such as that contained in "Statistical inference for data science". The book gives a rigorous treatment of the elementary concepts of regr...
The Data Science Design Manual
This engaging and clearly written textbook/reference provides a must-have introduction to the rapidly emerging interdisciplinary field of data science. It focuses on the principles fundamental to becoming a good data scientist and the key skills needed to build systems for collecting, analyzing, and interpreting data. The Data Science Design Manual...
Data Science with Microsoft SQL Server 2016
R is one of the most popular, powerful data analytics languages and environments in use by data scientists. Actionable business data is often stored in Relational Database Management Systems (RDBMS), and one of the most widely used RDBMS is Microsoft SQL Server. Much more than a database server, it's a rich ecostructure with advanced analytic ...
What Is Data Science?
We've all heard it: according to Hal Varian, statistics is the next sexy job. Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement mean? Why do we suddenly care about statistics and about data? This report examines the many sides of data science - the technologi...
Data Science at the Command Line
This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packe...
IPython Interactive Computing and Visualization Cookbook
Python is one of the leading open source platforms for data science and numerical computing. IPython and the associated Jupyter Notebook offer efficient interfaces to Python for data analysis and interactive visualization, and they constitute an ideal gateway to the platform. IPython Interactive Computing and Visualization Cookbook, 2nd Edition ...