README:
PypeR (PYthon-piPE-R)
PypeR is free software released under the GPL license, version 3.0, and comes
with ABSOLUTELY NO WARRANTY. This package provides a lightweight interface for
using R from Python through a pipe. It can be used on multiple platforms
because it is written in pure Python.
Please refer to http://bioinfo.ihb.ac.cn/softwares/PypeR/ for bug fixes.
******************************************************************************
Requirements:
Python 2.3 or later.
PypeR can run with different Python implementations:
Python 2.X.X
Python 3.X.X
Jython
IronPython
******************************************************************************
For installation:
# python setup.py install
or
# easy_install PypeR
or
# pip install PypeR
To upgrade to the newest version:
# python setup.py install
or
# easy_install --upgrade PypeR
or
# pip install --upgrade PypeR
******************************************************************************
Known issues:
1. Problem: cannot run with IronPython on mono.
Platform: tested on Ubuntu-10.04, mono 2.4.4, IronPython 2.6 Beta 2
DEBUG (2.6.0.20) on .NET 2.0.50727.1433
Behavior: reports "TypeError: Cannot cast from source type to destination type."
Reason: This happens when calling a function found in the dict "str_func",
e.g. "str_func[type(obj)](obj)". Most likely it is caused by a bug
in IronPython (or mono?).
Solution: It is possible to use IF...ELIF...ELSE to replace the dict
"str_func". However, that would make the function "Str4R" much longer
than it is now.
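The two alternatives mentioned in the solution above can be sketched as
follows. This is a simplified illustration, not PypeR's actual code: the
helper names "bool_to_r", "str_to_r", "to_r_by_dict", and "to_r_by_chain" are
invented here, and the real "str_func" handles many more types.

```python
# Hypothetical, simplified converters (assumptions for illustration only).
def bool_to_r(obj):
    return "TRUE" if obj else "FALSE"


def str_to_r(obj):
    # Escape backslashes and double quotes, then wrap in double quotes.
    return '"%s"' % obj.replace("\\", "\\\\").replace('"', '\\"')


# Dictionary dispatch: map each Python type to its converter function.
str_func = {bool: bool_to_r, int: repr, float: repr, str: str_to_r}


def to_r_by_dict(obj):
    # The form reported to fail on IronPython/mono.
    return str_func[type(obj)](obj)


def to_r_by_chain(obj):
    # The same behavior written as an IF...ELIF...ELSE chain.
    t = type(obj)
    if t is bool:
        return bool_to_r(obj)
    elif t is int or t is float:
        return repr(obj)
    elif t is str:
        return str_to_r(obj)
    else:
        raise KeyError(t)
```

Both functions give the same result for supported types; the chain simply
grows by one branch for every type the dictionary would list.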
******************************************************************************
For Help:
Please see the documentation of the module, class, and methods.
For examples:
The script "test.py" covers all the typical uses of PypeR.
Citation:
@article{Xia:McClelland:Wang:2010:JSSOBK:v35c02,
author = "Xiao-Qin Xia and Michael McClelland and Yipeng Wang",
title = "PypeR, A Python Package for Using R in Python",
journal = "Journal of Statistical Software, Code Snippets",
volume = "35",
number = "2",
pages = "1--8",
day = "30",
month = "7",
year = "2010",
CODEN = "JSSOBK",
ISSN = "1548-7660",
bibdate = "2010-03-23",
URL = "http://www.jstatsoft.org/v35/c02",
accepted = "2010-03-23",
acknowledgement = "",
keywords = "",
submitted = "2009-10-23",
}
******************************************************************************
Usage:
Using this package is simple. Examples are presented in the
file "test.py" in the distribution package.
PypeR provides a class "R" to wrap the R language. An instance of the R
class is used to manage an R process. Different instances can use different
R installations. On POSIX systems (including the Cygwin environment on
Windows), it is even possible to use an R installed on a remote computer.
Basically, there are four ways to use an instance of the R class.
1. Use the methods of the instance
Methods include:
run: Pass an R command string to the R process. The return value is a
string: the standard output from R. Note that the return value usually
includes both the R expression (a series of R statements) itself and
the output of that expression. If the actual result value is wanted,
use the method "get" instead.
assign: Assign a value to an R variable. No return value.
get: Get the result of an R expression.
remove: Remove an R variable.
2. Call the instance as a function
The instance is callable. If called as a function, it behaves just
the same as its "run" method.
3. Use the instance as a Python dictionary
The instance can mimic some operations on a Python dictionary,
typically to assign values to R variables, to retrieve the value of any
R expression, or to delete an R variable. These three operations do the
same jobs as the methods "assign", "get", and "remove".
4. Access R variables as if they were attributes of the instance
If the variable name cannot be found in the instance or its class, the
instance will try to get/set/remove it in R. This way is similar to 3,
but with more limitations, e.g., the R variable name cannot contain a
dot (".").
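The four access styles listed above map onto ordinary Python special methods.
The sketch below is a toy stand-in that stores values in a plain dictionary and
never starts R; the class name "ToyR" is invented for illustration, and this is
not PypeR's implementation.

```python
class ToyR(object):
    """Toy stand-in: keeps values in a dict instead of talking to R."""

    def __init__(self):
        # Write through __dict__ so that __setattr__ below is not triggered.
        self.__dict__["_vars"] = {}

    # 1. Methods of the instance
    def run(self, code):
        return "ran: " + code  # real PypeR returns R's standard output

    def assign(self, name, value):
        self._vars[name] = value

    def get(self, name, default=None):
        return self._vars.get(name, default)

    def remove(self, name):
        del self._vars[name]

    # 2. Calling the instance behaves like run()
    def __call__(self, code):
        return self.run(code)

    # 3. Dictionary-style access
    def __getitem__(self, name):
        return self.get(name)

    def __setitem__(self, name, value):
        self.assign(name, value)

    def __delitem__(self, name):
        self.remove(name)

    # 4. Attribute-style access, used only when normal lookup fails
    def __getattr__(self, name):
        if name.startswith("__"):
            raise AttributeError(name)
        return self.get(name)

    def __setattr__(self, name, value):
        if name in self.__dict__ or hasattr(type(self), name):
            object.__setattr__(self, name, value)
        else:
            self.assign(name, value)

    def __delattr__(self, name):
        if name in self.__dict__ or hasattr(type(self), name):
            object.__delattr__(self, name)
        else:
            self.remove(name)


r = ToyR()
r.assign("x", 1)   # method
r["y"] = 2         # dictionary style
r.z = 3            # attribute style
```

Because attribute lookup checks the instance and its class first, a name such
as "get" or "run" always resolves to the Python method, which is why the
attribute style is the most limited of the four.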
Because any code block in R is an expression, the "get" method (or
the dictionary-style retrieval form) can be used to run a
number of R commands and return the final result.
Note that PypeR does NOT validate or convert a variable name when passing it
to R. If a variable name with a leading underscore ("_") is passed to R, an
RError will be raised, even though such a name is legal in Python.
Conversions:
Python -> R
None -> NULL, NaN -> NaN, Inf -> Inf
R -> Python (numpy)
NULL -> None, NA -> None, NaN -> None (NaN), Inf -> None (Inf)
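The Python -> R direction of the table above can be illustrated with a small
helper that produces R source text for the special values. The function name
"special_to_r" is hypothetical, and the handling of negative infinity ("-Inf",
a valid R literal) is an addition not listed in the table; PypeR's own
conversion code may differ.

```python
import math


def special_to_r(value):
    """Return R source text for None, NaN, and infinities (illustrative)."""
    if value is None:
        return "NULL"
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            # Assumption: negative infinity is written as -Inf.
            return "Inf" if value > 0 else "-Inf"
    # Anything else falls back to Python's repr() in this sketch.
    return repr(value)
```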
DEBUG mode:
Since the child process (R) can easily be killed by an occasional error in
the code passed to it, PypeR is set to "DEBUG" mode by default. This
means that any code block sent to R is wrapped in the function
"try()", which prevents R from crashing. To disable "DEBUG" mode,
simply set the variable "_DEBUG_MODE" in the R class or in its
instance to False.
To mimic the behavior of the "get" method of a Python dictionary, the
method "get" accepts a fallback ("wild") value for variables that do not
exist in R. For this reason, the R expression is always wrapped in "try()"
when the method "get" is called, so that R does not crash.
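The wrapping described above amounts to a simple string transformation. The
sketch below assumes the "try({...})" form that appears in the echoed output
in the CHANGES examples; the function name "wrap_for_r" and its "debug_mode"
parameter are hypothetical and only mirror the role of "_DEBUG_MODE".

```python
def wrap_for_r(code, debug_mode=True):
    """Wrap an R code block in try({...}) when debug mode is on."""
    if debug_mode:
        # An error inside try() is caught by R instead of ending the session.
        return "try({%s})" % code
    return code


wrapped = wrap_for_r("a <- list(aa=3, bb=4)")
unwrapped = wrap_for_r("a <- list(aa=3, bb=4)", debug_mode=False)
```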
******************************************************************************
FAQs:
1. Q: I got an error message when trying to use PypeR:
>>> from pyper import *
>>> r = R()
Traceback (most recent call last):
......
"WindowsError: [Error 2] The system cannot find the file specified"
>>>
A: Usually this means PypeR cannot find the R program on Windows. There are
two ways to tell PypeR where R is. Suppose R is installed at
"C:\Program Files\R\R-2.11.1".
method 1 - initialize R with the full path:
>>> r = R(RCMD="C:\\Program Files\\R\\R-2.11.1\\bin\\R")
method 2 - add "C:\Program Files\R\R-2.11.1\bin" to the PATH
environment variable:
(1) Right click "My Computer" on Windows XP (or "Computer" on
Windows 7), either on your desktop or in your start menu.
(2) Click "Properties"
(3) In Windows 7, click "Advanced System Settings" on the left.
(4) In the "Advanced" tab, click the "Environment Variables" button.
(5) Double-click the PATH variable, and add your R path to the list.
Entries are separated by semicolons. For example:
%WinDir%\System32;C:\Program Files\R\R-2.11.1\bin
2. Q: What are the differences between r("myvar"), r["myvar"], r.myvar,
r.get("myvar", "a wild value"), and r.assign("myvar", "a new value")?
A: These forms serve different purposes:
(1) r("myvar")
Here "myvar" is an R variable name or a complex R expression. This
is equivalent to typing myvar at the R terminal. The information
displayed on the terminal is returned as a Python string.
(2) r["myvar"]
This form can be used to get a value from R, or to set the value of an R
variable.
When it is used to get a value from R, "myvar" can be an R variable name
or a complex R expression, and the form returns the value of
"myvar" instead of the output on STDOUT.
To set the value of an R variable, the form is:
r["myvar"] = "something"
Here "myvar" should be a valid R variable name.
(3) r.myvar
This form is similar to r["myvar"], but differs in two respects:
a) myvar must also follow Python's naming conventions.
b) IMPORTANT: if myvar is an attribute of the Python object r, it
will override (shield) the variable myvar in R!
(4) r.get("myvar", "a wild value")
This form is similar to r["myvar"], but it can only be used to get
values from R. If there is no variable "myvar" in R, the value "a
wild value" is returned.
(5) r.assign("myvar", "a new value")
This form is used to assign a value ("a new value") to an R variable
("myvar"). Here "myvar" should be a valid R variable name.
3. Q: How can I get a named list in R returned as a dict, just as is done
in RPy?
A: Because a named R list allows duplicate names while a Python dictionary
does not, converting an R named list to a Python dictionary may lose
data. That is why PypeR returns a list of (name, value) tuples
for an R named list. However, it has been possible to get a dictionary
returned since PypeR 1.0.2:
method 1:
>>> r.get("myvar", use_dict=True)
method 2:
>>> r.use_dict = True
>>> r.myvar
If this fails, please update PypeR to the newest version.
CHANGES:
LOG of Changes of PypeR
Changes from PypeR-1.1.1 to PypeR-1.1.2:
on Wed Mar 26 10:32:36 HKT 2014
1. Fixed the problem that PypeR could not convert NaN in pandas to R NaN. (This bug was found by Brent Pedersen.)
2. Fixed the problem in dealing with an empty data.frame. (This bug was found and solved by Brent Pedersen.)
3. All R data frames are now converted to numpy.array, then to pandas DataFrame (no longer to Series).
4. Added dump_stdout to output the raw string produced by PypeR; this is useful for debugging. (This option was introduced by Uwe Schmitt.)
5. Most importantly: fixed the bug in passing quotes from R to Python.
Changes from PypeR-1.1.0 to PypeR-1.1.1:
on Wed Oct 10 11:07:27 CST 2012
1. Added support for pandas Series and DataFrame. (Many thanks to Joost
Delsman for providing code for this purpose!)
2. Addressed some compatibility problems for Python 2.7 on Windows. (Many
users helped!)
3. Addressed a potential problem related to one-element tuples in the
conversion of R data frames to Python objects. (Thanks to Matt Knox for
pointing this out and providing a solution.)
4. Addressed the row-column confusion in the conversion of R arrays to Python
objects.
5. Added support for some special values, e.g. NaN and Inf.
6. Clearer test code (in test.py).
Changes from PypeR-1.0.4 to PypeR-1.1.0:
on Mon Sep 13 14:22:47 PDT 2010
1. Removed the module "pipeio.py" (the modification to the subprocess
package by Josiah Carlson). Therefore, the package "pywin32" is no
longer needed on Windows.
2. Compared to PypeR-1.0.X, which could only run with Python 2.X.X (>= 2.4),
PypeR now runs with Python 2.X.X (>= 2.3), Python 3.X.X, Jython, and
IronPython.
3. The parameters "wait0" and "wait" have been removed from the "R" class
and the "runR" function. R can now be initialized a little faster.
Changes from PypeR-1.0.3 to PypeR-1.0.4:
on Thu Sep 9 09:53:05 PDT 2010
1. Fixed the bug in parsing an R command whose path contains a backslash or
a space. The following code now works:
>>> r = R(RCMD="C:\\Program Files\\R\\R-2.X.X\\bin\\R")
Changes from PypeR-1.0.2 to PypeR-1.0.3:
on Fri Sep 3 16:30:12 PDT 2010
1. Fixed the bug in passing R values to Python which happens on Windows
after some library (e.g. "fastICA") changes the newline from '\r\n' to '\n'.
2. Fixed the bug in passing R values containing the backslash character ('\')
to Python.
3. Fixed the bug on Windows in parsing the user library for R.
4. Changed the default value of the parameter "use_dict". Now "use_dict" can
be one of three values: None, True, False. None is the default value,
which returns a Python dictionary for an R named list without
duplicate names, or a Python list if duplicate names exist. Here are
some examples:
>>> from pyper import *
>>> r = R() # the default value for use_dict is None
>>> r('a <- list(aa=3, bb=4, cc=5, aa=6)') # with replicated name "aa"
'try({a <- list(aa=3, bb=4, cc=5, aa=6)})\r\n'
>>> r.a
[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
>>> r.get('a', use_dict=False)
[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
>>> r.get('a', use_dict=True)
{'aa': 6, 'cc': 5, 'bb': 4}
>>> r.use_dict = True # change the default value from None to True
>>> r.a
{'aa': 6, 'cc': 5, 'bb': 4}
>>> r.get('a')
{'aa': 6, 'cc': 5, 'bb': 4}
>>> r.get('a', use_dict=False)
[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
>>> r.get('a', use_dict=None) # "None" here means that r.use_dict should be used!
{'aa': 6, 'cc': 5, 'bb': 4}
>>> r.use_dict = None # recover the default value
>>> r.a
[('aa', 3), ('bb', 4), ('cc', 5), ('aa', 6)]
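The rule behind the session above can be written as a few lines of plain
Python. This is a sketch of the decision only, under the assumption that the
named list has already been parsed into (name, value) pairs; the function name
"convert_named_list" is hypothetical, and the argument here stands for the
effective setting after "r.use_dict" has been taken into account.

```python
def convert_named_list(pairs, use_dict=None):
    """Return a dict or a list of (name, value) tuples.

    use_dict=True  -> always a dict (later duplicates overwrite earlier ones)
    use_dict=False -> always a list of tuples
    use_dict=None  -> a dict only when all names are unique
    """
    names = [name for name, _ in pairs]
    has_duplicates = len(set(names)) != len(names)
    if use_dict is True or (use_dict is None and not has_duplicates):
        return dict(pairs)
    return list(pairs)


a = [("aa", 3), ("bb", 4), ("cc", 5), ("aa", 6)]  # duplicate name "aa"
```

With the duplicated name "aa", forcing a dictionary keeps only the last value
(6), which is the data loss that the None default is designed to avoid.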
Changes from PypeR-1.0.1 to PypeR-1.0.2:
on Fri Sep 3 2010
1. Fixed bugs in converting logical data from R to Python.
2. R objects other than factor, NULL, vector, matrix, data.frame, or list
are converted to character using the function as.character.
3. Added the optional parameter "use_dict" for the R class and for the "get"
method of the R class.
Changes from PypeR-1.0 to PypeR-1.0.1:
1. Redirected stderr to stdout for R. This can be disabled by setting
"return_err=False" when initializing an R instance, e.g.,
"r = R(return_err=False)"
2. So that user settings can be used by R, R is launched with the
arguments "--quiet --no-save --no-restore" instead of "--vanilla",
which is equivalent to "--no-save --no-restore --no-site-file
--no-init-file --no-environ".
3. A version variable is added for the module: __version__ = 1.01
4. Suppressed the popup terminal window on Windows.
5. Added the user-specific library PATH in IDLE on Windows.
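Items 1 and 2 above describe how the R process is started. A minimal sketch of
building such a launch configuration with the standard "subprocess" module is
shown below. The function name "r_popen_args" is hypothetical, and the choice
to leave stderr inherited when "return_err" is False is an assumption; PypeR's
real start-up code may differ.

```python
import subprocess


def r_popen_args(rcmd="R", return_err=True):
    """Build the argument list and pipe settings for launching R."""
    # Flags from item 2: keep user settings, but never save or restore.
    argv = [rcmd, "--quiet", "--no-save", "--no-restore"]
    kwargs = {
        "stdin": subprocess.PIPE,
        "stdout": subprocess.PIPE,
        # Item 1: merge stderr into stdout unless return_err is False.
        # Assumption: None (inherit the parent's stderr) otherwise.
        "stderr": subprocess.STDOUT if return_err else None,
    }
    return argv, kwargs


argv, kwargs = r_popen_args()
```

Passing the result as "subprocess.Popen(argv, **kwargs)" would start R if it
is installed and on the PATH; the sketch itself does not launch anything.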