PypeR

PypeR (PYthon-piPE-R, originally named Rinpy - R in Python) allows people to use R in Python through a pipe.

The package can be downloaded at:

PyPI - the Python Package Index.
SourceForge

PypeR is released first on the two sites above. It can also be downloaded from some other locations, which may lag a few days behind a new release:

For installation and usage, please see README

For the change log, please see CHANGES

For examples, please see test.py

Known problems or bugs in the current distribution:

Some errors in command strings sent to R can be fatal to PypeR!
For example, an incomplete command string, typically one with a missing quote or parenthesis, leads to a deadlock: R waits for more input from Python while Python waits for output from R.

	>>> from pyper import *
	>>> r = R()
	>>> r('a <- 10 / (3 + 2')

In such cases, the user has to press "Ctrl-C" to break the pipe and then restart R. So far I have no solution for this problem.
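
A partial precaution, which does not solve the underlying problem, is to check a command string for obviously unbalanced quotes or brackets before sending it to R. The following is a minimal heuristic sketch and is not part of PypeR: the function name looks_complete is hypothetical, and it does not cover every R construct (for example, a command ending in a binary operator is still incomplete).

```python
def looks_complete(cmd):
    """Heuristically check that quotes and brackets in an R command balance.

    Hypothetical helper, not part of PypeR. Text inside string literals and
    after '#' comments is ignored. Returns False when a quote or bracket is
    left open, or when brackets are mismatched.
    """
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []     # brackets that are currently open
    quote = None   # quote character of the string we are inside, if any
    i = 0
    while i < len(cmd):
        ch = cmd[i]
        if quote:
            if ch == '\\':
                i += 1            # skip the escaped character
            elif ch == quote:
                quote = None      # the string literal is closed
        elif ch in '"\'`':
            quote = ch            # a string (or backquoted name) starts
        elif ch == '#':
            nl = cmd.find('\n', i)
            if nl == -1:
                break             # the comment runs to the end of the input
            i = nl                # jump to the end of the comment line
        elif ch in '([{':
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False      # closing bracket without a matching opener
        i += 1
    return quote is None and not stack


print(looks_complete('a <- 10 / (3 + 2'))    # False: "(" is never closed
print(looks_complete('a <- 10 / (3 + 2)'))   # True
```

A caller could refuse to send any string for which this check returns False, at the cost of occasionally rejecting unusual but valid R code.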

For suggestions or support, please email "Xiao-Qin Xia"<x q x i a 7 0 @ g m a i l . c o m>

README:

PypeR (PYthon-piPE-R)

PypeR is free software subject to the GPL license 3.0 and comes with ABSOLUTELY NO WARRANTY.

This package provides a lightweight interface for using R in Python through a pipe. It can be used on multiple platforms since it is written in pure Python.

Please refer to http://bioinfo.ihb.ac.cn/softwares/PypeR/ for bug fixes.

******************************************************************************
Requirements:

Python 2.3 or later. PypeR can run with different Python implementations:

	Python 2.X.X
	Python 3.X.X
	Jython
	IronPython

******************************************************************************
For installation:

	# python setup.py install
or
	# easy_install PypeR
or
	# pip install PypeR

To upgrade to the newest version:

	# python setup.py install
or
	# easy_install --upgrade PypeR
or
	# pip install --upgrade PypeR

******************************************************************************
Known issues:

1. Problem: cannot run with IronPython on mono.
   Platform: tested on Ubuntu-10.04, mono 2.4.4, IronPython 2.6 Beta 2 DEBUG (2.6.0.20) on .NET 2.0.50727.1433
   Behavior: reports "TypeError: Cannot cast from source type to destination type."
   Reason: This happens when calling a function found in the dict "str_func", e.g. "str_func[type(obj)](obj)". Most likely it is caused by a bug in IronPython (or mono?).
   Solution: It is possible to replace the dict "str_func" with an if ... elif ... else chain. However, that would make the function "Str4R" much longer than it is now.
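
The two approaches named under "Reason" and "Solution" can be illustrated with a small sketch. It is only an illustration under stated assumptions: the converters and the names str_func_sketch, to_r_dispatch, and to_r_chain are hypothetical and far simpler than PypeR's real "str_func" and "Str4R".

```python
def _bool_str(obj):
    # R spells logical constants in upper case
    return 'TRUE' if obj else 'FALSE'


def _num_str(obj):
    return repr(obj)


def _text_str(obj):
    # Quote the text and escape backslashes and double quotes
    return '"%s"' % obj.replace('\\', '\\\\').replace('"', '\\"')


# Dictionary dispatch: the converter is looked up by the exact type of obj.
str_func_sketch = {
    bool: _bool_str,
    int: _num_str,
    float: _num_str,
    str: _text_str,
}


def to_r_dispatch(obj):
    """Convert obj through the type-to-function dictionary."""
    return str_func_sketch[type(obj)](obj)


def to_r_chain(obj):
    """Same conversion written as an if ... elif ... else chain."""
    # bool is tested first because bool is a subclass of int
    if isinstance(obj, bool):
        return _bool_str(obj)
    elif isinstance(obj, (int, float)):
        return _num_str(obj)
    elif isinstance(obj, str):
        return _text_str(obj)
    else:
        raise TypeError('unsupported type: %r' % type(obj))


print(to_r_dispatch(True), to_r_chain(2.5), to_r_dispatch('abc'))
```

The chain avoids the dictionary lookup that triggers the reported error, but every additional type adds another branch, which is why the function grows.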
******************************************************************************
For Help:

Please see the documentation of the module, class, and methods.

For example:

The script "test.py" covers all the typical uses of PypeR.

Citation:

	@article{Xia:McClelland:Wang:2010:JSSOBK:v35c02,
	  author = "Xiao-Qin Xia and Michael McClelland and Yipeng Wang",
	  title = "PypeR, A Python Package for Using R in Python",
	  journal = "Journal of Statistical Software, Code Snippets",
	  volume = "35",
	  number = "2",
	  pages = "1--8",
	  day = "30",
	  month = "7",
	  year = "2010",
	  CODEN = "JSSOBK",
	  ISSN = "1548-7660",
	  bibdate = "2010-03-23",
	  URL = "http://www.jstatsoft.org/v35/c02",
	  accepted = "2010-03-23",
	  acknowledgement = "",
	  keywords = "",
	  submitted = "2009-10-23",
	}

******************************************************************************
Usage:

Using this package is very simple. Examples are presented in the file "test.py" in the distribution package.

PypeR provides a class "R" to wrap the R language. An instance of the R class is used to manage an R process. Different instances can use different R installations. On POSIX systems (including the Cygwin environment on Windows), it is even possible to use an R installed on a remote computer.

Basically, there are four ways to use an instance of the R class.

1. Use the methods of the instance

   The methods include:
   run: Passes an R command string to the R process. The return value is a string: the standard output from R. Note that the return value usually includes the R expression (a series of R code) itself as well as its output. If the actual result value is wanted, use the method "get" instead.
   assign: Assigns a value to an R variable. No return value.
   get: Gets the result of an R expression.
   remove: Removes an R variable.

2. Call the instance as a function

   The instance is callable. If called as a function, it behaves just the same as its "run" method.

3. Use the instance as a Python dictionary

   The instance can mimic some operations on a Python dictionary, typically assigning values to R variables, retrieving values for any R expression, or deleting an R variable. These operations do the same jobs as the methods "assign", "get", and "remove".

4. Access R variables as if they were attributes of the instance

   If the variable name cannot be found in the instance or its class, the instance will try to get/set/remove it in R. This way is similar to 3, but with more limitations, e.g., the R variable name cannot contain any dot (.).

Considering that any code block in R is an expression, the "get" method (or the dictionary-style retrieval) can be used to run a number of R commands with the final result returned.

Note that PypeR does NOT validate or convert a variable name when passing it to R. If a variable name with a leading underscore ("_"), although legal in Python, is passed to R, an RError will be raised.

Conversions:

	Python -> R
	None -> NULL, NaN -> NaN, Inf -> Inf

	R -> Python (numpy)
	NULL -> None, NA -> None, NaN -> None (NaN), Inf -> None (Inf)

DEBUG mode:

Since the child process (R) can easily be killed by an occasional error in the code passed to it, PypeR is set to "DEBUG" mode by default. This means that any code block sent to R is wrapped in the function "try()", which prevents R from crashing. To disable "DEBUG" mode, the user can simply set the variable "_DEBUG_MODE" in the R class or in its instance to False.

To mimic the behavior of the "get" method of a Python dictionary, the method "get" accepts a fallback ("wild") value for variables that do not exist in R. For this reason, whenever the method "get" is called, the R expression is always wrapped in "try()" to avoid crashing R.

******************************************************************************
FAQs:

1. Q: I got an error message when trying to use PypeR:

	>>> from pyper import *
	>>> r = R()
	Traceback (most recent call last):
	......
	"WindowsError: [Error 2] The system cannot find the file specified"
	>>>

   A: Usually this means PypeR cannot find the R program on Windows. There are two ways to tell PypeR where R is. Suppose, for example, that R is installed at "C:\Program Files\R\R-2.11.1".

   method 1 - initialize R with the full path:

	>>> r = R(RCMD="C:\\Program Files\\R\\R-2.11.1\\bin\\R")

   method 2 - add "C:\\Program Files\\R\\R-2.11.1\\bin" to the PATH environment variable:
   (1) Right-click "My Computer" on Windows XP (or "Computer" on Windows 7), either on your desktop or in your start menu.
   (2) Click "Properties".
   (3) In Windows 7, click "Advanced System Settings" on the left.
   (4) In the "Advanced" tab, click the "Environment Variables" button.
   (5) Double-click the PATH variable, and add your R path to the list. Entries are separated by semicolons. For example:

	%WinDir%\System32;C:\Program Files\R\R-2.11.1\bin

2. Q: What are the differences between r("myvar"), r["myvar"], r.myvar, r.get("myvar", "a wild value"), and r.assign("myvar", "a new value")?

   A: These forms serve different purposes:

   (1) r("myvar")
       Here "myvar" is an R variable name or a complex R expression. This is equivalent to typing myvar at the R terminal. The information displayed on the terminal is returned as a Python string.

   (2) r["myvar"]
       This form can be used to get values from R, or to set the value of an R variable. When it is used to get a value from R, "myvar" can be an R variable name or a complex R expression, and the form returns the value of "myvar" instead of the output on STDOUT. To set the value of an R variable, the form is:

	r["myvar"] = "something"

       Here "myvar" should be a valid R variable name.

   (3) r.myvar
       This form is similar to r["myvar"], but differs in two respects:
       a) myvar has to follow Python's naming conventions too.
       b) IMPORTANT: if myvar is an attribute of the Python object r, it will override (shield) the variable myvar in R!

   (4) r.get("myvar", "a wild value")
       This form is similar to r["myvar"], but it can only be used to get values from R.
       If there is no variable "myvar" in R, the value "a wild value" is returned.

   (5) r.assign("myvar", "a new value")
       This form is used to assign a value ("a new value") to an R variable ("myvar"). Here "myvar" should be a valid R variable name.

3. Q: How can I get an R named list returned as a dict, as is done in RPy?

   A: Because an R named list allows duplicate names while a Python dictionary does not, converting an R named list to a Python dictionary may lose data. That is why PypeR returns a list of (name, value) tuples for an R named list. However, since PypeR 1.0.2 it is possible to get a dictionary returned:

   method 1:

	>>> r.get("myvar", use_dict=True)

   method 2:

	>>> r.use_dict = True
	>>> r.myvar

   If this fails, please update PypeR to the newest version.
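
The data loss mentioned in FAQ 3 can be reproduced in plain Python, without R. The sketch below assumes that an R named list arrives as a list of (name, value) tuples. The helper pairs_to_python is hypothetical: it only imitates the three "use_dict" settings (None, True, False) described in CHANGES and is not PypeR's actual implementation.

```python
def pairs_to_python(pairs, use_dict=None):
    """Return (name, value) pairs as a dict or as a list of tuples.

    Hypothetical helper imitating the documented use_dict behaviour:
      True  -> always a dict (duplicate names lose all but the last value)
      False -> always a list of tuples
      None  -> a dict only when no name is duplicated
    """
    names = [name for name, _ in pairs]
    has_duplicates = len(set(names)) != len(names)
    if use_dict is True or (use_dict is None and not has_duplicates):
        return dict(pairs)   # later values overwrite earlier ones
    return list(pairs)


# The same data as list(aa=3, bb=4, cc=5, aa=6) in R: "aa" appears twice.
a = [('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]

print(pairs_to_python(a))                 # list of tuples, nothing lost
print(pairs_to_python(a, use_dict=True))  # dict, the value 3 for "aa" is lost
```

With use_dict=True the first "aa" entry is silently dropped, which is exactly the risk that makes a list of tuples the safer default for lists with duplicate names.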

CHANGES:

LOG of Changes of PypeR

Changes from PypeR-1.1.1 to PypeR-1.1.2: on Wed Mar 26 10:32:36 HKT 2014
1. Fixed the problem that PypeR could not convert NaN in Pandas to R NaN. (This bug was found by Brent Pedersen)
2. Fixed the problem in dealing with an empty data.frame. (This bug was found and solved by Brent Pedersen)
3. All R data frames are now converted to numpy.array and then to pandas DataFrame (no longer to Series).
4. Added dump_stdout to output the raw string made by PypeR, which is useful for debugging. (This option was introduced by Uwe Schmitt)
5. Most importantly: fixed the bug in passing quotes from R to Python.

Changes from PypeR-1.1.0 to PypeR-1.1.1: on Wed Oct 10 11:07:27 CST 2012
1. Added support for Pandas Series and DataFrame. (Many thanks to Joost Delsman for providing code for this purpose!)
2. Addressed some compatibility problems with Python 2.7 on Windows. (Many users provided help!)
3. Addressed a potential problem related to one-element tuples in the conversion of R data frames to Python objects. (Thanks to Matt Knox for pointing this out and providing a solution.)
4. Addressed the row-column confusion in the conversion of R arrays to Python objects.
5. Added support for some special values, e.g. NaN and Inf.
6. Clearer test code (in test.py).

Changes from PypeR-1.0.4 to PypeR-1.1.0: on Mon Sep 13 14:22:47 PDT 2010
1. Removed the module "pipeio.py" (the modification to the subprocess package by Josiah Carlson). Therefore, the package "pywin32" is no longer needed on Windows.
2. Compared to PypeR-1.0.X, which can only run with Python 2.X.X (>= 2.4), PypeR now runs with Python 2.X.X (>= 2.3), Python 3.X.X, Jython, and IronPython.
3. The parameters "wait0" and "wait" have been removed from the "R" class and the "runR" function. R can now be initialized a little faster.

Changes from PypeR-1.0.3 to PypeR-1.0.4: on Thu Sep 9 09:53:05 PDT 2010
1. Fixed the bug in parsing an R command with a backslash or space in the PATH. The following code now works:

	>>> r = R(RCMD="C:\\Program Files\\R\\R-2.X.X\\bin\\R")

Changes from PypeR-1.0.2 to PypeR-1.0.3: on Fri Sep 3 16:30:12 PDT 2010:
1. Fixed the bug in passing R values to Python which happens on Windows after some libraries (e.g. "fastICA") change the newline from '\r\n' to '\n'.
2. Fixed the bug in passing R values containing the backslash character ('\') to Python.
3. Fixed the bug on Windows in parsing the user library for R.
4. Changed the default value of the parameter "use_dict". Now "use_dict" can be one of three values: None, True, False. None is the default value: it returns a Python dictionary for an R named list without duplicate names, or a Python list if duplicate names exist. Here are some examples:

	>>> from pyper import *
	>>> r = R() # the default value for use_dict is None
	>>> r('a <- list(aa=3, bb=4, cc=5, aa=6)') # with replicated name "aa"
	'try({a <- list(aa=3, bb=4, cc=5, aa=6)})\r\n'
	>>> r.a
	[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
	>>> r.get('a', use_dict=False)
	[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
	>>> r.get('a', use_dict=True)
	{'aa': 6, 'cc': 5, 'bb': 4}
	>>> r.use_dict = True # change the default value from None to True
	>>> r.a
	{'aa': 6, 'cc': 5, 'bb': 4}
	>>> r.get('a')
	{'aa': 6, 'cc': 5, 'bb': 4}
	>>> r.get('a', use_dict=False)
	[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
	>>> r.get('a', use_dict=None) # "None" here means that r.use_dict should be used!
	{'aa': 6, 'cc': 5, 'bb': 4}
	>>> r.use_dict = None # recover the default value
	>>> r.a
	[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]

Changes from PypeR-1.0.1 to PypeR-1.0.2: on Fri Sep 3 2010:
1. Fixed bugs in converting logical data in R to Python.
2. R objects other than factor, NULL, vector, matrix, data.frame, or list are converted to character using the function as.character.
3. Added the optional parameter "use_dict" to the R class and to the "get" method of the R class.

Changes from PypeR-1.0 to PypeR-1.0.1:
1. Redirected stderr to stdout for R. This can be disabled by setting "return_err=False" when initializing an R instance, e.g., "r = R(return_err=False)".
2. So that user settings can be used by R, R is now launched with the arguments "--quiet --no-save --no-restore" instead of "--vanilla" (which is equivalent to "--no-save --no-restore --no-site-file --no-init-file --no-environ").
3. A version variable is added for the module: __version__ = 1.01
4. Suppressed the popup terminal window on Windows.
5. Added the user-specific library PATH in IDLE on Windows.
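
Item 1 of the 1.0.1 changes (redirecting stderr to stdout) corresponds to a standard feature of Python's subprocess module. The following is a minimal sketch under stated assumptions: a child Python process stands in for R so that the example runs without an R installation, and the function run_child with its merge_stderr parameter is hypothetical, only mirroring the idea behind "return_err". It requires Python 3.3 or later for subprocess.DEVNULL and is not PypeR's actual code.

```python
import subprocess
import sys

# Code run by the child process: one line on stdout, one line on stderr.
CHILD_CODE = (
    "import sys\n"
    "sys.stdout.write('result\\n')\n"
    "sys.stdout.flush()\n"
    "sys.stderr.write('warning\\n')\n"
)


def run_child(merge_stderr=True):
    """Run the child and return whatever arrives on its stdout pipe."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        # STDOUT sends the child's stderr into the same pipe as its stdout;
        # DEVNULL discards it instead.
        stderr=subprocess.STDOUT if merge_stderr else subprocess.DEVNULL,
        universal_newlines=True,
    )
    out, _ = proc.communicate()
    return out


merged = run_child(merge_stderr=True)
separate = run_child(merge_stderr=False)
print('warning' in merged)    # True: the error text is part of the output
print('warning' in separate)  # False: the error text was discarded
```

Merging the two streams means that R error messages appear in the string returned to Python instead of being lost or printed elsewhere.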