PypeR

PypeR (PYthon-piPE-R, originally named Rinpy, for "R in Python") allows people to use R in Python through a pipe.

The package can be downloaded at:

While PypeR is first released at the two locations above, it can also be downloaded from some other locations, which may lag a few days behind a newly released version:

For installation and usage, please see README

For the change log, please see CHANGES

For examples, please see test.py

Known problems or bugs in the current distribution:

  1. Some errors in command strings sent to R can be fatal to PypeR!
    For example, an incomplete command string, typically one with a missing quote or parenthesis, leads to a deadlock: R waits for more input from Python while Python waits for output from R.
    	>>> from pyper import *
    	>>> r = R()
    	>>> r('a <- 10 / (3 + 2')
    					
    In such cases, the user has to press Ctrl-C to break the pipe and then restart R. So far I have no solution for this problem.
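
One way to lower the risk is to check a command string before sending it to R. The sketch below is only an illustration: `looks_complete` is a hypothetical helper, not part of PypeR, and it ignores R subtleties such as raw strings and backtick-quoted names. It only checks that quotes and brackets are balanced.

```python
def looks_complete(cmd):
    """Return True if quotes and brackets in an R command string are balanced.

    Hypothetical helper, not part of PypeR. A True result does not prove
    that the command is valid R; it only rules out the most common cause
    of the deadlock described above.
    """
    closers = {")": "(", "]": "[", "}": "{"}
    stack = []           # currently open brackets
    quote = None         # the quote character we are inside, if any
    escaped = False      # True right after a backslash inside a string
    in_comment = False   # True from '#' to the end of the line
    for ch in cmd:
        if in_comment:
            if ch == "\n":
                in_comment = False
            continue
        if quote:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == quote:
                quote = None
            continue
        if ch in "\"'":
            quote = ch
        elif ch == "#":
            in_comment = True
        elif ch in "([{":
            stack.append(ch)
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                return False
    return quote is None and not stack


print(looks_complete("a <- 10 / (3 + 2"))    # the incomplete command shown above
print(looks_complete("a <- 10 / (3 + 2)"))   # the same command with the bracket closed
```

A caller could refuse to pass a string to R when the check fails, instead of waiting on the pipe.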

    For suggestions or support, please email "Xiao-Qin Xia"<x q x i a 7 0 @ g m a i l . c o m>

     
    README:
    					
    		              PypeR (PYthon-piPE-R)
    
    PypeR is free software released under the GPL license, version 3.0, and comes
    with ABSOLUTELY NO WARRANTY. This package provides a lightweight interface for
    using R in Python through a pipe. It can be used on multiple platforms since
    it is written in pure Python.
    
    Please refer to http://bioinfo.ihb.ac.cn/softwares/PypeR/ for bug fixes.
    
    ******************************************************************************
    
    Requirements:
    	Python 2.3 or later. 
    	PypeR can run with different Python implementations: 
    		Python 2.X.X
    		Python 3.X.X
    		Jython
    		IronPython
    
    ******************************************************************************
    
    For installation:
    	# python setup.py install
    	  or
    	# easy_install PypeR
    	  or
    	# pip install PypeR
    
    To upgrade to the newest version:
    	# python setup.py install
    	  or
    	# easy_install --upgrade PypeR
    	  or
    	# pip install --upgrade PypeR
    
    ******************************************************************************
    
    Known issues:
    	1. Problem: cannot run with IronPython on mono.
    	   Platform: tested on Ubuntu-10.04, mono 2.4.4, IronPython 2.6 Beta 2
    		   DEBUG (2.6.0.20) on .NET 2.0.50727.1433
    	   Behavior: reports "TypeError: Cannot cast from source type to destination type."
    	   Reason: This happens when calling a function found in the dict "str_func",
    		   e.g. "str_func[type(obj)](obj)".  Most likely it is caused by a bug
    		   in IronPython (or mono?).
    	   Solution: It is possible to use IF...ELIF...ELSE to replace the dict
    		   "str_func". However, that would make the function "Str4R" much
    		   longer than it is now.
    
    ******************************************************************************
    
    For Help:
    	Please see the documentation of the module, class, and methods
    
    For example:
    	The script "test.py" covers all the typical uses of PypeR
    
    Citation:
    	@article{Xia:McClelland:Wang:2010:JSSOBK:v35c02,
    	  author =	"Xiao-Qin Xia and Michael McClelland and Yipeng Wang",
    	  title =	"PypeR, A Python Package for Using R in Python",
    	  journal =	"Journal of Statistical Software, Code Snippets",
    	  volume =	"35",
    	  number =	"2",
    	  pages =	"1--8",
    	  day =  	"30",
    	  month =	"7",
    	  year = 	"2010",
    	  CODEN =	"JSSOBK",
    	  ISSN = 	"1548-7660",
    	  bibdate =	"2010-03-23",
    	  URL =  	"http://www.jstatsoft.org/v35/c02",
    	  accepted =	"2010-03-23",
    	  acknowledgement = "",
    	  keywords =	"",
    	  submitted =	"2009-10-23",
    	}
    
    
    ******************************************************************************
    
    Usage:
        Using this package is very simple. Examples are presented in the
        file "test.py" in the distribution package.
    
        PypeR provides a class "R" to wrap the R language. An instance of the R
        class is used to manage an R process. Different instances can use different
        R installations. On POSIX systems (including the Cygwin environment on
        Windows), it is even possible to use an R installed on a remote computer.
    
        Basically, there are four ways to use an instance of the R class.
    
        1. Use the methods of the instance
            Methods include:
                run: Pass an R command string to the R process. The return
                    value is a string: the standard output from R. Note that
                    the return value usually includes the R expression (a
                    series of R statements) itself as well as the output of
                    the expression. If the actual result value is wanted,
                    use the method "get" instead.
                assign: Assign a value to an R variable. No return value.
                get: Get the result of an R expression.
                remove: Remove an R variable.
    
        2. Call the instance as a function
            The instance is callable. If called as a function, it behaves just
            the same as its "run" method.
    
        3. Use the instance as a Python dictionary
            The instance can mimic some operations on a Python dictionary,
            typically, to assign values to R variables, to retrieve values for any
            R expression, or to delete an R variable. These three operations do
            the same jobs as the methods "assign", "get", and "remove".
    
        4. Access R variables as if they were attributes of the instance
            If the variable name cannot be found in the instance or its class, the
            instance will try to get/set/remove it in R. This way is similar to 3,
            but with more limitations, e.g., the R variable name cannot contain any
            dot (.).
    
        Considering that any code block in R is an expression, the "get" method (or
        the form of retrieving values from a dictionary) can be used to run a
        number of R commands with the final result returned.
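
The four styles described above can be sketched in one function. It takes an existing R instance as its argument, so it only does real work when PypeR and R are installed. The function name `four_ways` and the variable names va, vb and vc are placeholders chosen for this illustration.

```python
def four_ways(r):
    """Exercise the four access styles on an existing pyper.R instance `r`."""
    # 1. Methods of the instance.
    r.assign("va", 3)
    console_text = r.run("print(va)")   # a string: what R printed
    va = r.get("va")                    # the value itself

    # 2. Calling the instance behaves like its run() method.
    console_text = r("print(va)")

    # 3. Dictionary style: the same jobs as assign, get and remove.
    r["vb"] = 4
    vb = r["vb"]
    del r["vb"]

    # 4. Attribute style: the R variable name must not contain a dot.
    r.vc = 5
    vc = r.vc
    del r.vc

    r.remove("va")
    return va, vb, vc


# Typical use, assuming PypeR and R are installed:
#     from pyper import R
#     print(four_ways(R()))
```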
    
        Note that PypeR does NOT validate or convert a variable name when
        passing it to R. If a variable name with a leading underscore ("_"),
        which is legal in Python, is passed to R, an RError will be raised.
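
Because PypeR does not validate names, a caller may want to check them first. The sketch below is an assumption-laden illustration, not part of PypeR: `is_valid_r_name` is a hypothetical helper that follows the usual rule for syntactic R names, restricted to ASCII letters, and it does not cover every reserved form (for example `..1`).

```python
import re

# Reserved words of R (see ?Reserved in R); they cannot be used as names.
R_RESERVED = frozenset([
    "if", "else", "repeat", "while", "function", "for", "in", "next",
    "break", "TRUE", "FALSE", "NULL", "Inf", "NaN", "NA", "NA_integer_",
    "NA_real_", "NA_complex_", "NA_character_", "...",
])

# A syntactic R name starts with a letter, or with a dot that is not
# followed by a digit, and continues with letters, digits, dots or
# underscores. Only ASCII letters are considered in this sketch.
_R_NAME = re.compile(r"(?:[A-Za-z]|\.(?![0-9]))[A-Za-z0-9._]*\Z")


def is_valid_r_name(name):
    """Return True if `name` can be used unquoted as an R variable name."""
    return bool(_R_NAME.match(name)) and name not in R_RESERVED


for candidate in ("myvar", "my.var", "_private", ".2x", "if"):
    print(candidate, is_valid_r_name(candidate))
```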
    
    Conversions:
        Python -> R
            None -> NULL, NaN -> NaN, Inf -> Inf
        R -> Python (numpy)
            NULL -> None, NA -> None, NaN -> None (NaN), Inf -> None (Inf)
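
The Python -> R row of the table can be illustrated with a small function that renders a Python scalar as R source text. This is not PypeR's own conversion code: `to_r_literal` is a hypothetical helper that handles only None and plain numbers, and the mapping of negative infinity to -Inf is an addition based on R's syntax rather than on the table.

```python
import math


def to_r_literal(value):
    """Render None or a plain number as R source text (illustrative only)."""
    if value is None:
        return "NULL"
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("only None and plain numbers are handled in this sketch")
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Inf" if value > 0 else "-Inf"
    return repr(value)


for item in (None, float("nan"), float("inf"), 1.5, 3):
    print(repr(item), "->", to_r_literal(item))
```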
    
    DEBUG mode:
        Since the child process (R) can easily be killed by an occasional error
        in the code passed to it, PypeR is set to "DEBUG" mode by default. This
        means that any code block sent to R will be wrapped in the function
        "try()", which prevents R from crashing. To disable "DEBUG" mode, the
        user can simply set the variable "_DEBUG_MODE" in the R class or in its
        instance to False.
    
        To mimic the behavior of the "get" method of a Python dictionary, the
        method "get" accepts a default ("wild") value for variables that do not
        exist in R. For this reason, the R expression is always wrapped in
        "try()" whenever the method "get" is called, to avoid crashing R.
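
To make the wrapping concrete, the sketch below builds the kind of string that the examples in CHANGES show being echoed back, such as 'try({a <- list(aa=3, bb=4, cc=5, aa=6)})'. The helper `wrap_in_try` is hypothetical and only mirrors that echoed form; PypeR's internal code may differ.

```python
def wrap_in_try(code):
    """Return `code` wrapped in R's try(), as a string (hypothetical helper)."""
    return "try({" + code + "})"


print(wrap_in_try("a <- list(aa=3, bb=4)"))

# Disabling the wrapping on a live instance, assuming PypeR and R are
# installed, would look like this:
#     from pyper import R
#     r = R()
#     r._DEBUG_MODE = False
```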
    
    ******************************************************************************
    
    FAQs:
    
    1.	Q: I get an error message when trying to use PypeR:
    		>>> from pyper import *
    		>>> r = R()
    
    		Traceback (most recent call last):
    		......
    		"WindowsError: [Error 2] The system cannot find the file specified"
    		>>>
    
    	A: Usually this means PypeR cannot find the R program on Windows. There are
    		two ways to tell PypeR where R is. Suppose R is installed at
    		"C:\Program Files\R\R-2.11.1".
    
    		method 1 - initialize R with the full path:
    			>>> r = R(RCMD="C:\\Program Files\\R\\R-2.11.1\\bin\\R")
    
    		method 2 - add "C:\Program Files\R\R-2.11.1\bin" to the PATH
    			environment variable:
    
    			(1) Right click "My Computer" on Windows XP (or "Computer" on
    				Windows 7), either on your desktop or in your start menu.
    			(2) Click "Properties"
    			(3) In Windows 7, click "Advanced System Settings" on the left.
    			(4) In the "Advanced" tab, click the "Environment Variables" button.
    			(5) Double-click the PATH variable, and add your R path to the list.
    				Entries are separated by semicolons.  For example:
    				%WinDir%\System32;C:\Program Files\R\R-2.11.1\bin
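
The two methods can also be combined in a script: look for R on PATH first and fall back to a known installation path. The sketch below assumes Python 3.3 or later (for `shutil.which`); `find_r_executable` is a hypothetical helper, not part of PypeR, and the installation path in the usage comment is the example path from this FAQ.

```python
import os
import shutil


def find_r_executable(candidates=()):
    """Return a path to the R program, or None if it cannot be found.

    PATH is searched first; after that, each explicit candidate path is
    tried, both as given and with ".exe" appended (for Windows).
    """
    found = shutil.which("R")
    if found:
        return found
    for path in candidates:
        for candidate in (path, path + ".exe"):
            if os.path.isfile(candidate):
                return candidate
    return None


# Typical use, assuming PypeR is installed:
#     from pyper import R
#     rcmd = find_r_executable([r"C:\Program Files\R\R-2.11.1\bin\R"])
#     if rcmd is None:
#         raise RuntimeError("R was not found")
#     r = R(RCMD=rcmd)
```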
    
    2.	Q: What are the differences between r("myvar"), r["myvar"], r.myvar,
    		r.get("myvar", "a wild value"), and r.assign("myvar", "a new value")?
    
    	A: These forms serve different purposes:
    		(1) r("myvar")
    			Here "myvar" is an R variable name or a complex R expression. This
    			is equivalent to typing myvar in the R terminal. The information
    			displayed on the terminal will be returned as a Python string.
    
    		(2) r["myvar"] 
    			This form can be used to get values from R, or to set the value of
    			an R variable.
    
    			If it is used to get a value from R, "myvar" can be an R variable
    			name or a complex R expression, and this will return the value of
    			"myvar" instead of the output on STDOUT.
    
    			To set the value of an R variable, the form is:
    				r["myvar"] = "something"
    			Here "myvar" should be a valid R variable name.
    
    		(3) r.myvar
    			This form is similar to r["myvar"], but differs in two respects:
    			a) myvar has to follow Python's naming conventions too.
    			b) IMPORTANT: if myvar is an attribute of the Python object r, it
    				will override (shield) the variable myvar in R!
    
    		(4) r.get("myvar", "a wild value")
    			This form is similar to r["myvar"], but it can only be used to get
    			values from R. If there is no variable "myvar" in R, the value "a
    			wild value" will be returned.
    
    		(5) r.assign("myvar", "a new value")
    			This form is used to assign a value ("a new value") to an R
    			variable ("myvar"). Here "myvar" should be a valid R variable name.
    
    3.	Q: How can I get a named list in R returned as a dict, just as is done
    		in RPy?
    
    	A: Because a named R list allows duplicate names while a Python dictionary
    		does not, converting an R named list to a Python dictionary may lose
    		data. That is why PypeR returns a list of tuples (name, value)
    		for an R named list. However, it is possible to get a dictionary
    		returned since PypeR 1.0.2:
    			method 1:
    				>>> r.get("myvar", use_dict=True)
    			method 2:
    				>>> r.use_dict = True
    				>>> r.myvar
    		If this fails, please update PypeR to the newest version.
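
The data loss can be reproduced in plain Python, without R. The sketch below is an illustration only: `named_list_to_python` is a hypothetical function that mimics the three "use_dict" settings described in the CHANGES entry for PypeR-1.0.3, where None means "return a dictionary only when no name is repeated". Unlike the real "get" method, it does not consult an instance-level default.

```python
def named_list_to_python(pairs, use_dict=None):
    """Mimic the use_dict settings on a list of (name, value) tuples."""
    names = [name for name, _ in pairs]
    if use_dict is None:
        # Default policy: a dictionary is safe only if no name repeats.
        use_dict = len(set(names)) == len(names)
    return dict(pairs) if use_dict else list(pairs)


pairs = [("aa", 3), ("bb", 4), ("cc", 5), ("aa", 6)]
print(named_list_to_python(pairs))                  # "aa" repeats, so a list comes back
print(named_list_to_python(pairs, use_dict=True))   # the first "aa" entry is lost
```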
    
    					

    CHANGES:

    					
    			LOG of Changes of PypeR
    
    
    Changes from PypeR-1.1.1 to PypeR-1.1.2:
        on Wed Mar 26 10:32:36 HKT 2014
    
        1. Fixed the problem that PypeR could not convert NaN in Pandas to R NaN. (This bug was found by Brent Pedersen)
    
        2. Fixed the problem in dealing with an empty data.frame. (This bug was found and solved by Brent Pedersen)
    
        3. All R data frames are now converted to numpy.array, then to pandas DataFrame (and no longer to Series).
        
        4. Added dump_stdout to output the raw string made by PypeR, which is useful for debugging. (This option was introduced by Uwe Schmitt)
    
        5. Most importantly: fixed the bug in passing quotes from R to Python.
    
    
    Changes from PypeR-1.1.0 to PypeR-1.1.1:
    
    	on Wed Oct 10 11:07:27 CST 2012
    
    	1. Added support for Pandas Series and DataFrame. (Many thanks to Joost
    		Delsman for providing code for this purpose!)
    	
    	2. Addressed some compatibility problems for Python 2.7 on Windows. (Many
    		users provided help!)
    
    	3. Addressed a potential problem related to one-element tuples in conversion
    		of R data frames to Python objects. (Thanks to Matt Knox for pointing
    		this out and providing a solution.)
    
    	4. Addressed the row-column confusion in conversion of R arrays to Python
    		objects.
    
    	5. Added support for some special values, e.g. NaN and Inf.
    
    	6. Clearer test code (in test.py).
    
    
    Changes from PypeR-1.0.4 to PypeR-1.1.0:
    
    	on Mon Sep 13 14:22:47 PDT 2010
    
    	1. Removed the module "pipeio.py" (the modification to the subprocess
    		package by Josiah Carlson). Therefore, the package "pywin32" is no
    		longer needed on Windows.
    
    	2. Compared to PypeR-1.0.X, which can only run with Python-2.X.X (>= 2.4),
    		PypeR now runs with Python 2.X.X (>= 2.3), Python 3.X.X, Jython, and
    		IronPython.
    
    	3. The parameters "wait0" and "wait" have been removed from the "R" class
    		and the "runR" function. Now R can be initialized a little faster.
    
    
    Changes from PypeR-1.0.3 to PypeR-1.0.4:
    
    	on Thu Sep  9 09:53:05 PDT 2010
    
    	1. Fixed the bug in parsing an R command whose path contains a backslash
    		or a space. Now the following code works:
    
    		>>> r = R(RCMD="C:\\Program Files\\R\\R-2.X.X\\bin\\R")
    
    
    Changes from PypeR-1.0.2 to PypeR-1.0.3:
    
    	on Fri Sep  3 16:30:12 PDT 2010:
    
    	1. Fixed the bug in passing R values to Python which happens on Windows
    		after some libraries (e.g. "fastICA") change the newline from '\r\n'
    		to '\n'.
    
    	2. Fixed the bug in passing R values with the backslash character ('\') to
    		Python.
    
    	3. Fixed the bug on Windows in parsing the user library for R.
    
    	4. Changed the default value of the parameter "use_dict". Now "use_dict"
    		can be one of three values: None, True, False. None is the default
    		value, which returns a Python dictionary for an R named list without
    		duplicate names, or a Python list if duplicate names exist. Here are
    		some examples:
    
    		>>> from pyper import *
    		>>> r = R() # the default value for use_dict is None
    		>>> r('a <- list(aa=3, bb=4, cc=5, aa=6)') # with replicated name "aa"
    		'try({a <- list(aa=3, bb=4, cc=5, aa=6)})\r\n'
    		>>> r.a
    		[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
    		>>> r.get('a', use_dict=False)
    		[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
    		>>> r.get('a', use_dict=True)
    		{'aa': 6, 'cc': 5, 'bb': 4}
    		>>> r.use_dict = True # change the default value from None to True
    		>>> r.a
    		{'aa': 6, 'cc': 5, 'bb': 4}
    		>>> r.get('a')
    		{'aa': 6, 'cc': 5, 'bb': 4}
    		>>> r.get('a', use_dict=False)
    		[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
    		>>> r.get('a', use_dict=None) # "None" here means that r.use_dict should be used!
    		{'aa': 6, 'cc': 5, 'bb': 4}
    		>>> r.use_dict = None # recover the default value
    		>>> r.a
    		[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
    
    
    Changes from PypeR-1.0.1 to PypeR-1.0.2:
    
    	on Fri Sep 3 2010:
    
    	1. Fixed bugs in converting logical data from R to Python.
    
    	2. R objects other than factor, NULL, vector, matrix, data.frame, or list
    		will be converted to character using the function as.character.
    	
    	3. Added the optional parameter "use_dict" for the R class and for the
    		"get" method of the R class.
    
    
    Changes from PypeR-1.0 to PypeR-1.0.1:
    
    	1. Redirected stderr to stdout for R. This can be disabled by setting
    		"return_err=False" when initializing an R instance, e.g.,
    		"r = R(return_err=False)"
    
    	2. So that user settings can be used by R, R is launched with the
    		arguments "--quiet --no-save --no-restore" instead of "--vanilla",
    		which is equivalent to "--no-save --no-restore --no-site-file
    		--no-init-file --no-environ".
    
    	3. A version variable is added for the module: __version__ = 1.01
    
    	4. Suppressed the popup terminal window on Windows.
    
    	5. Added the user-specific library PATH in IDLE on Windows.