This repository contains Programs in the R programming language.

madhurimarawat/R-for-Datascience

R-for-Datascience

This repository contains programs written in the R programming language.

About R Programming

--> R is an open-source programming language that is widely used for statistical computing and data analysis.

--> R generally comes with a command-line interface.

--> R is available across widely used platforms like Windows, Linux, and macOS.

--> R also provides rich library support.


Modes of Execution

The R programming language can be executed in the following two modes:

1. Interactive mode

a) RStudio

R can be run in RStudio, an IDE ("Integrated Development Environment") for R.

b) Google Colab

Colaboratory, or “Colab” for short, is a product from Google Research which allows anybody to write and execute R code in a Jupyter notebook through the browser.

2. Script mode

R programs are written in an editor and saved as a file with the .r extension, which can then be executed as a whole, for example with source() from the console or with Rscript from the command line.
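As a minimal sketch of script mode, the snippet below writes a tiny script to a temporary file and runs it with source(). The file contents and the variable names (script_path, script_env, total) are illustrative assumptions, not files from this repository.

```r
# Write a two-line script to a temporary .R file (hypothetical contents).
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "x <- c(1, 2, 3)",
  "total <- sum(x)"
), script_path)

# Run the script; its variables are created in a separate environment
# so they do not clutter the global workspace.
script_env <- new.env()
source(script_path, local = script_env)

print(script_env$total)  # 6

# From a terminal, the same file could be run with: Rscript path/to/file.R
```

In interactive mode the same two lines would simply be typed at the console one at a time.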

Basic Datatypes

[Image: Data types of R programming]

A data type determines what kind of values an R object can hold and which operations can be applied to it.

Some of the most common data types in R are:

  1. Numeric: Real numbers, stored as double-precision values, like 10.5, 55, and 787.

  2. Integer: Whole numbers like 1L, 55L, and 100L (the suffix “L” declares the value as an integer).

  3. Character: Strings of text like “hello”, “R”, and “data”.

  4. Logical: Boolean values like TRUE or FALSE.

  5. Factor: Categorical variables whose values come from a fixed set of levels, like “red”, “green”, and “blue”.

  6. Vector: A collection of elements of the same data type like c(1,2,3) or c("a","b","c").

  7. Vectors are of two types:

    1. Atomic vectors: A sequence of elements that all share the same data type.
    2. Lists: A "recursive" type of vector, i.e. a list can hold elements of different data types, including other lists.

  8. Matrix: A two-dimensional array of elements of the same data type like matrix(1:9,nrow=3).

  9. Data frame: A table-like structure with rows and columns, where each column can have a different data type, like data.frame(name=c("Alice","Bob"),age=c(25,30)).

  10. List: A collection of elements that can have different data types like list(name="Alice",age=25,scores=c(90,80,70)).

  11. Array: A vector with a dimension attribute, usually with two or more dimensions. An array is like a stack of matrices; a matrix is a two-dimensional array.
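The types listed above can be inspected at the console with class(), typeof(), and dim(). The following is a small sketch; the variable names are illustrative, and the values reuse the examples from the list.

```r
num <- 10.5                                    # numeric (double)
int <- 55L                                     # integer
chr <- "hello"                                 # character
lgl <- TRUE                                    # logical
fct <- factor(c("red", "green", "blue", "red"))  # factor with 3 levels
vec <- c(1, 2, 3)                              # atomic vector
lst <- list(name = "Alice", age = 25, scores = c(90, 80, 70))  # list
mat <- matrix(1:9, nrow = 3)                   # 3 x 3 matrix
df  <- data.frame(name = c("Alice", "Bob"), age = c(25, 30),
                  stringsAsFactors = FALSE)    # data frame
arr <- array(1:24, dim = c(2, 3, 4))           # 3-dimensional array

typeof(num)      # "double"
typeof(int)      # "integer"
class(chr)       # "character"
class(lgl)       # "logical"
levels(fct)      # "blue" "green" "red" (sorted alphabetically)
is.atomic(vec)   # TRUE
is.list(lst)     # TRUE
dim(mat)         # 3 3
sapply(df, class)  # name: "character", age: "numeric"
dim(arr)         # 2 3 4
```

Note that a data frame is itself a list of equal-length columns, which is why each column can carry its own type.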

Features of R

[Image: Features of R programming]

Mode of Execution Used: RStudio

R

--> Visit the official website: R

--> Download the installer for the platform that will be used (Linux, macOS, or Windows).

--> Follow the setup wizard.

RStudio

--> Visit the official website: RStudio

--> Download the installer for the platform that will be used (Linux, macOS, or Windows).

--> Follow the setup wizard.

--> Create a new file with the .r extension; this file can then be executed in the console.

Dataset Used

Iris Dataset

--> Iris has 4 numerical features and a three-class target variable.

--> This dataset can be used for classification as well as clustering.

--> The 4 features are sepal length, sepal width, petal length, and petal width, and the target variable has 3 classes, namely ‘setosa’, ‘versicolor’, and ‘virginica’.

--> The dataset is already cleaned, so no preprocessing is required.

--> This dataset is simply used for understanding CSV features and data visualization.
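The structure described above can be checked directly in R. This sketch assumes the built-in iris data frame that ships with R's datasets package; the copy used in this repository may be loaded from a CSV file instead, with different column names.

```r
# Load the built-in iris data frame (an assumption; see the note above).
data(iris)

dim(iris)              # 150 5: four features plus the Species column
names(iris)            # column names
levels(iris$Species)   # "setosa" "versicolor" "virginica"
table(iris$Species)    # 50 observations per class

# No missing values, consistent with "no preprocessing required".
sum(is.na(iris))       # 0

# Mean of each feature per species, a typical first summary.
feature_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(feature_means)

# A simple visualization: petal length against petal width, colored by class.
plot(iris$Petal.Length, iris$Petal.Width,
     col = iris$Species,
     xlab = "Petal length", ylab = "Petal width")
```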

Automobile Dataset

--> Dataset is taken from: 🔗Automobile Dataset

--> This contains data about various automobiles in Comma Separated Value (CSV) format.

--> The CSV file contains details of each automobile, such as mileage, length, and body style, among other attributes.

--> Its dimensions are [60 rows X 6 columns].

--> The CSV file is already preprocessed, thus there is no need for data cleaning.

NBA Players Dataset

--> Dataset is taken from: 🔗NBA Dataset

--> This contains data about various NBA players in Comma Separated Value (CSV) format.

--> The CSV file contains details of each player, such as height, weight, team, and position, among other attributes.

--> Its dimensions are [457 rows X 9 columns].

--> The CSV file is already preprocessed, thus there is no need for data cleaning.
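A CSV dataset like the two above is typically loaded with read.csv() and then checked for its dimensions and missing values. The sketch below is self-contained: it writes a tiny placeholder file first, and its column names and rows are made up for illustration, not taken from the real Automobile or NBA files.

```r
# Placeholder data standing in for a real CSV (hypothetical columns and values).
toy <- data.frame(
  name   = c("Player A", "Player B", "Player C"),
  height = c(198, 206, NA),
  team   = c("X", "Y", "X"),
  stringsAsFactors = FALSE
)
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# Read the CSV back into a data frame.
players <- read.csv(csv_path, stringsAsFactors = FALSE)

dim(players)               # 3 3 (rows, columns)
str(players)               # column types at a glance
colSums(is.na(players))    # missing values per column: height has 1
head(players, 2)           # first two rows
```

For the real files, csv_path would be replaced with the path to the downloaded CSV, and dim() should then report the dimensions listed above.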

Libraries of R

To install an R library (package), this command is used, with the library name given as a quoted string:

install.packages("library_name")
[Image: Libraries of R programming]
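Installing a library and loading it are separate steps: install.packages() downloads it once, and library() attaches it in each session. A common pattern, sketched below, installs only when the library is missing. The helper name ensure_package is hypothetical, and "stats" is used in the example only because it ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is not already available,
# then attach it to the current session.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs an internet connection and a CRAN mirror
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" is part of every R installation, so this call installs nothing.
ensure_package("stats")
```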

Thanks for Visiting 😄

Drop a 🌟 if you find this repository useful.

If you have any doubts or suggestions, feel free to reach out to me.

📫 How to reach me: LinkedIn, Mail