## Overview

I’ve used MATLAB almost exclusively, both at work and at home, but we’ve migrated most of our analysis pipelines to Python at work, so I’ve done much the same at home. The transition was relatively painless, thanks largely to the NumPy and pandas libraries.

## rpy2

I stumbled upon rpy2, a Python library that lets you execute R code and call R functions directly from Python. It does this by running an embedded R interpreter inside the Python process and providing a set of classes for passing data back and forth between the two.

With rpy2, R is accessible through two interfaces: a high-level interface, with convenient classes for mapping data into the R space, and a low-level interface, with more generalized classes that are less convenient but measurably faster.

The high-level interface is the easiest way to use rpy2. You get access to it by importing the `rpy2.robjects` module:

```
import rpy2.robjects as ro
```

The `robjects` module provides wrappers for objects in the R space. By objects, I mean variables (like R lists and data.frames), functions (like `t.test`) and other R objects.

## Usage

Using rpy2 with Python is usually a three-step process:

- Pass data from Python into R
- Manipulate data in R
- Pass data from R back into Python

This is best illustrated by examples.

### Basic Example

Here’s an example that calls `t.test` on a pair of random vectors.

```
import rpy2.robjects as ro
import numpy as np
x = np.random.normal(size=10)
y = np.random.normal(size=10)
# 1. Pass data from Python into R
xr = ro.vectors.FloatVector(x)
yr = ro.vectors.FloatVector(y)
# 2. Call t.test on data in R
ttest = ro.r['t.test']
res = ttest(xr, yr, paired=True)
print(res)
# 3. Pass data from R back into Python
pval = res.rx2('p.value')[0]
```

Here I use the `FloatVector` constructor to pass the data into R. rpy2 has wrappers for all types of R data, and their names are self-explanatory (e.g. `StrVector`, `FloatVector`, `ListVector`, `DataFrame`, etc.).

The `ro.r` object lets you execute raw R code. Here I index it by name to create a Python variable linked to the `t.test` function, but you can also call it with a string containing as many lines of R code as you want (though I wouldn’t recommend doing so).

When the above code block is run, the console should print the value of the variable `res`:

```
R object with classes: ('htest',) mapped to:
<ListVector - Python:0x7f1c01d94688 / R:0x55c5630e8f18>
[FloatVector, FloatVector, FloatVector, FloatVector, ..., FloatVector, StrVector, StrVector, StrVector]
statistic: <class 'rpy2.robjects.vectors.FloatVector'>
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f1ba4863c48 / R:0x55c5640f7590>
[0.308857]
parameter: <class 'rpy2.robjects.vectors.FloatVector'>
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f1ba4863e88 / R:0x55c5640f7050>
[9.000000]
p.value: <class 'rpy2.robjects.vectors.FloatVector'>
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f1ba48634c8 / R:0x55c5640f7280>
[0.764460]
conf.int: <class 'rpy2.robjects.vectors.FloatVector'>
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f1ba4863888 / R:0x55c561e7ac28>
[-0.996924, 1.312192]
estimate: <class 'rpy2.robjects.vectors.FloatVector'>
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f1ba4863048 / R:0x55c5640f74e8>
[0.157634]
null.value: <class 'rpy2.robjects.vectors.FloatVector'>
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f1ba4804f88 / R:0x55c5640f7018>
[0.000000]
alternative: <class 'rpy2.robjects.vectors.StrVector'>
R object with classes: ('character',) mapped to:
<StrVector - Python:0x7f1ba49c9508 / R:0x55c5640ccd08>
['two.sided']
method: <class 'rpy2.robjects.vectors.StrVector'>
R object with classes: ('character',) mapped to:
<StrVector - Python:0x7f1ba49c9ac8 / R:0x55c5640cd8d8>
['Paired t-test']
data.name: <class 'rpy2.robjects.vectors.StrVector'>
R object with classes: ('character',) mapped to:
<StrVector - Python:0x7f1ba4863848 / R:0x55c5640e1eb8>
['c(-1.86332790..., '-0.1250831111..., '-0.3137351152...]
```

It's not as clean as the output you’d see in R, but it contains the same information. From the first two lines, we can see that the variable is an `htest` object mapped to a `ListVector` (i.e. an R `list`). The remaining lines describe each element of the list along with its name. For instance, we see that the `statistic` element is numeric and, as such, is mapped to a `FloatVector`.

We can access elements of these lists in Python using the `rx2` method, which works like the `[[` operator in R.

```
print(res.rx2('p.value')[0])
# Output: 0.764460
```

The `[0]` at the end grabs the value itself. In R, *everything* is a vector (including single numbers or strings), so `res.rx2('p.value')` just returns another `FloatVector` (since `res$p.value` in R returns a numeric vector with a single element).

If you want to convert everything back into pure Python, you could convert the named list into a `dict` like so.

```
# Pair each element name with its value; keys keep their R names (e.g. 'p.value')
res_dict = dict(zip(res.names, res))
print(res_dict['p.value'][0])
# Output: 0.764460
```

Note that the values of the `dict` are still rpy2 vectors, so we include the `[0]` again to access the number itself.
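If you would rather end up with plain Python values, one option is a small helper that unwraps each element as it builds the `dict`. The function below is a hypothetical sketch of mine, not part of rpy2. It only assumes the input has a `names` attribute and can be iterated, and that every element is a flat vector (no nested lists).

```python
def to_plain_dict(named_list):
    """Convert a named, list-like object of vectors into a plain dict.

    Length-1 vectors are unwrapped to a single value; longer vectors
    become Python lists. Nested lists are not handled.
    """
    plain = {}
    for name, vec in zip(named_list.names, named_list):
        values = list(vec)
        plain[name] = values[0] if len(values) == 1 else values
    return plain
```

With the t-test result from earlier, `to_plain_dict(res)['p.value']` should then give the p-value as an ordinary float, with no trailing `[0]` needed.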