mirror of https://git.savannah.gnu.org/git/emacs/org-mode.git synced 2024-11-26 07:33:39 +00:00

some thoughts on referencing data from R

Dan Davison 2009-02-12 11:07:25 -05:00
parent 1dd3e1c330
commit a81fe7e15b
2 changed files with 79 additions and 144 deletions


@ -1,113 +0,0 @@
#+TITLE: rorg --- R and org-mode
* Objectives
** Send data to R from org
Org-mode includes orgtbl-mode, an extremely convenient way of using
tabular data in a plain text file. Currently, spreadsheet
functionality is available in org tables using the emacs package
calc. It would be a boon both to org users and R users to allow
org tables to be manipulated with the R programming language. Org
tables give R users an easy way to enter and display data; R gives
org users a powerful way to perform vector operations, statistical
tests, and visualization on their tables.
*** Implementations
**** naive
A naive implementation would be to use =(org-export-table "tmp.csv")=
and =(ess-execute "read.csv('tmp.csv')")=.
**** org-R
org-R passes data to R from two sources: org tables, or csv
files. Org tables are first exported to a temporary csv file
using [[file:existing_tools/org-R.el::defun%20org%20R%20export%20to%20csv%20csv%20file%20options][org-R-export-to-csv]].
**** org-exp-blocks
**** RweaveOrg
NA
** Evaluate R code in org and deal with output appropriately
*** vector output
When R code evaluation generates vectors and 2-dimensional arrays,
these should be formatted appropriately in org buffers
(orgtbl-mode) as well as in export targets (html, latex). Values
assigned in the global environment should be available to
blocks of R code elsewhere in the org buffer.
**** Implementations
***** org-R
org-R converts R output (vectors, or matrices / 2d-arrays) to an
org table and stores it in the org buffer, or in a separate org
file (csv output would also be perfectly possible).
***** org-exp-blocks
***** RweaveOrg
*** graphical output
R can generate graphical output on a screen graphics device
(e.g. X11, quartz), and in various standard image file formats
(png, jpg, ps, pdf, etc). When graphical output is generated by
evaluation of R code in Org, at least the following two things are desirable:
1. Output to screen for immediate viewing is possible.
2. Graphical output to file is linked to appropriately from the
org file. This should have the automatic consequence that it is
included appropriately in subsequent export targets (html,
latex).
**** Implementations
***** org-R
org-R does (1) if no output file is specified, and (2) otherwise.
***** org-exp-blocks
***** RweaveOrg
* Notes
** Special editing and evaluation of source code in R blocks
Unfortunately org-mode has two different block types, both useful.
In developing RweaveOrg, a third was introduced.
Eric is leaning towards using the =#+begin_src= blocks, as that is
really what these blocks contain: source code. Austin believes
that specifying export options at the beginning of a block is
useful functionality, to be preserved if possible.
Note that upper and lower case are not relevant in block headings.
*** Source code blocks
Org has an extremely useful method of editing source code and
examples in their native modes. In the case of R code, we want to
be able to use the full functionality of ESS mode, including
interactive evaluation of code.
Source code blocks look like the following and allow for the
special editing of code inside of the block through
`org-edit-special'.
#+BEGIN_SRC r
,## hit C-c ' within this block to enter a temporary buffer in r-mode.
,## while in the temporary buffer, hit C-c C-c on this comment to
,## evaluate this block
a <- 3
a
,## hit C-c ' to exit the temporary buffer
#+END_SRC
*** dblocks
dblocks are useful because org-mode will automatically call
`org-dblock-write:dblock-type' where dblock-type is the string
following the =#+BEGIN:= portion of the line.
dblocks look like the following and allow for evaluation of the
code inside of the block by calling =\C-c\C-c= on the header of
the block.
#+BEGIN: dblock-type
#+END:
*** R blocks
In developing RweaveOrg, Austin created [[file:existing_tools/RweaveOrg/org-sweave.el][org-sweave.el]]. This
allows for the kind of blocks shown in [[file:existing_tools/RweaveOrg/testing.Rorg][testing.Rorg]]. These blocks
have the advantage of accepting options to the Sweave preprocessor
following the #+BEGIN_R declaration.
* tasks
* buffer dictionary
LocalWords: DBlocks dblocks

rorg.org

@ -190,20 +190,20 @@ Are there side-effects which need to be considered aside from those
internal to the source-code evaluation process?
** reference to data and evaluation results
I think this will be very important. I would suggest that since we
are using lisp we use lists as our medium of exchange. Then all we
need are functions converting all of our target formats to and
from lists. These functions are already provided for org tables.
It would be a boon both to org users and R users to allow org tables
to be manipulated with the R programming language. Org tables give R
users an easy way to enter and display data; R gives org users a
powerful way to perform vector operations, statistical tests, and
visualization on their tables.
This means that we will need to consider unique IDs for source
blocks, as well as for org tables, and for any other data source or
target.
*** Implementations
**** naive
@ -214,25 +214,71 @@ target.
files. Org tables are first exported to a temporary csv file
using [[file:existing_tools/org-R.el::defun%20org%20R%20export%20to%20csv%20csv%20file%20options][org-R-export-to-csv]].
**** org-exp-blocks
org-exp-blocks uses [[org-interblock-R-command-to-string]] to send
commands to an R process running in a comint buffer through ESS.
org-exp-blocks has no support for dumping table data to an R process, or
vice versa.
**** RweaveOrg
NA
*** reference format
This will be tricky. Dan has already come up with a solution for R; I
need to look more closely at that, and we should try to come up with a
format for referencing data from source code in such a way that it
will be as source-code-language independent as possible.
**** Dan: thinking aloud re: referencing data from R
Suppose in some R code, we want to reference data in an org
table. I think that requires the use of 'header arguments', since
otherwise, under pure evaluation of a code block without header
args, R has no way to locate the data in the org buffer. So that
suggests a mechanism like that used by org-R whereby table names
or unique entry IDs are used to reference org tables (and indeed
potentially row/column ranges within org tables, although that
subsetting could also be done in R).
Specifically what org-R does is write the table to a temp csv
file, and tell R the name of that file. However:
1. We are not limited to a single source of input; the same sort
of thing could be done for several sources of input
2. I don't think we even have to use temp files. An alternative
would be to have org pass the table contents as a csv-format
string to textConnection() in R, thus creating an arbitrary
number of input objects in the appropriate R environment
(scope) from which the R code can read data when necessary.
That suggests a header option syntax something like
#+begin_src emacs-lisp
'(:R-obj-name-1 tbl-name-or-id-1 :R-obj-name-2 tbl-name-or-id-2)
#+end_src
As a result of passing that option, the code would be able to access
the data referenced by tbl-name-or-id-1 via read.table(R-obj-name-1).
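The R side of this idea can be sketched roughly as follows. This is
only an illustration under assumptions: the =tables= list stands in
for whatever csv-format strings org would actually send, and the
object names and the =inputs= environment are hypothetical, not part
of org-R or any existing tool.

#+begin_src R
## Hypothetical stand-in for what org would send: one csv-format string
## per referenced table, keyed by the R object name from the header option.
tables <- list(
  R_obj_name_1 = "x,y\n1,2\n3,4",
  R_obj_name_2 = "name,score\nann,10\nbob,7"
)

## Create one text connection per table in a dedicated environment.
## Splitting on newlines gives textConnection() one element per line.
inputs <- new.env()
for (nm in names(tables)) {
  lines <- strsplit(tables[[nm]], "\n", fixed = TRUE)[[1]]
  assign(nm, textConnection(lines), envir = inputs)
}

## The user's code block can then read whichever inputs it needs.
d1 <- read.table(inputs$R_obj_name_1, header = TRUE, sep = ",")
d2 <- read.table(inputs$R_obj_name_2, header = TRUE, sep = ",",
                 stringsAsFactors = FALSE)

## read.table() leaves an already-open connection open, so close them.
close(inputs$R_obj_name_1)
close(inputs$R_obj_name_2)
#+end_src

No temp file is involved, and any number of inputs can be set up in
the same way before the user's code runs.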
An extension of that idea would be to allow remote files to be used as
data sources. In this case one might need just the remote file (if
it's a csv file), or if it's an org file then the name of the file
plus a table reference within that org file. Thus maybe something like
#+begin_src emacs-lisp
'((R-obj-name-1 . (:tblref tbl-name-or-id-1 :file file-1))
(R-obj-name-2 . (:tblref tbl-name-or-id-2 :file file-2)))
#+end_src
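To make the file-plus-table-reference idea concrete, here is a rough
sketch of how such a specification might be resolved, written in R
for illustration only. Everything in it is an assumption: the helper
names =read_org_table= and =load_sources= are hypothetical, tables
are assumed to be labelled with a =#+TBLNAME:= line, and the parsing
ignores empty cells and cells containing commas.

#+begin_src R
## Hypothetical helper: pull the table labelled "#+TBLNAME: <tblref>" out
## of an org file and return it as a data frame (first row is the header).
read_org_table <- function(file, tblref) {
  lines <- readLines(file)
  start <- which(trimws(lines) == paste("#+TBLNAME:", tblref))
  if (length(start) != 1) stop("table not found: ", tblref)
  rest <- lines[start + seq_len(length(lines) - start)]
  is_row <- grepl("^\\s*\\|", rest)
  ## Keep the contiguous run of table lines directly after the label.
  n <- if (all(is_row)) length(rest) else which(!is_row)[1] - 1
  rows <- rest[seq_len(n)]
  rows <- rows[!grepl("^\\s*\\|-", rows)]        # drop horizontal rules
  rows <- gsub("^\\s*\\||\\|\\s*$", "", rows)    # strip the outer bars
  csv <- gsub("\\s*\\|\\s*", ",", trimws(rows))  # inner bars become commas
  read.csv(text = csv, strip.white = TRUE)
}

## Hypothetical resolver: a source without a tblref is read as a csv file,
## otherwise the named table is extracted from the org file.
load_sources <- function(sources, envir = new.env()) {
  for (nm in names(sources)) {
    src <- sources[[nm]]
    data <- if (is.null(src$tblref)) {
      read.csv(src$file)
    } else {
      read_org_table(src$file, src$tblref)
    }
    assign(nm, data, envir = envir)
  }
  envir
}

## Small self-contained demonstration using temporary files.
org_file <- tempfile(fileext = ".org")
writeLines(c("#+TBLNAME: heights",
             "| name | cm  |",
             "|------+-----|",
             "| ann  | 170 |",
             "| bob  | 182 |"), org_file)
csv_file <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2"), csv_file)

env <- load_sources(list(
  h = list(tblref = "heights", file = org_file),
  d = list(file = csv_file)
))
#+end_src

In practice the table extraction would more naturally happen on the
org (lisp) side; the point here is only the shape of the mapping from
R object names to data sources.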
*** source-target pairs
The following can be used for special considerations based on
source-target pairs.
Dan: I don't quite understand this subtree; Eric -- could you give
a little more explanation of this and of your comment above
regarding using [[lists as our medium of exchange]]?
**** source block output from org tables
**** source block output from other source block
**** source block output from org list
@ -240,15 +286,16 @@ source-target pairs
**** org table from org table
**** org properties from source block
**** org properties from org table
** export
Once the previous objectives are met, export should be fairly simple.
Basically it will consist of triggering the evaluation of source code
blocks with the org-export-preprocess-hook.
This block export evaluation will be aware of the target format
through the htmlp and latexp variables, and can then create quoted
=#+begin_html= and =#+begin_latex= blocks appropriately.
* Notes
@ -395,7 +442,7 @@ a
following the #+BEGIN_R declaration.
*** block headers/parameters
Regardless of the syntax/format chosen for the source blocks, we will
need to be able to pass a list of parameters to these blocks. These
should include (but should certainly not be limited to)
- label or id :: Label of the block, should we provide facilities for
@ -488,6 +535,7 @@ through the process of fleshing out objectives, and cashing those
objectives out into tasks. That said, please feel free to make any
changes that you see fit.
** Dan <2009-02-12 Thu 10:23>
Good job, Eric, with the major work on this file.
* Buffer Dictionary
LocalWords: DBlocks dblocks