mirror of
https://git.savannah.gnu.org/git/emacs/org-mode.git
synced 2024-11-26 07:33:39 +00:00
some thoughts on referencing data from R
parent
1dd3e1c330
commit
a81fe7e15b
@ -1,113 +0,0 @@
#+TITLE: rorg --- R and org-mode

* Objectives
** Send data to R from org
Org-mode includes orgtbl-mode, an extremely convenient way of using
tabular data in a plain text file. Currently, spreadsheet
functionality is available in org tables using the emacs package
calc. It would be a boon both to org users and R users to allow
org tables to be manipulated with the R programming language. Org
tables give R users an easy way to enter and display data; R gives
org users a powerful way to perform vector operations, statistical
tests, and visualization on their tables.

*** Implementations
**** naive
A naive implementation would be to use
=(org-table-export "tmp.csv" "orgtbl-to-csv")= and
=(ess-execute "read.csv('tmp.csv')")=.
**** org-R
org-R passes data to R from two sources: org tables, or csv
files. Org tables are first exported to a temporary csv file
using [[file:existing_tools/org-R.el::defun%20org%20R%20export%20to%20csv%20csv%20file%20options][org-R-export-to-csv]].
**** org-exp-blocks
**** RweaveOrg
NA

** Evaluate R code in org and deal with output appropriately
*** vector output
When R code evaluation generates vectors and 2-dimensional arrays,
these should be formatted appropriately in org buffers
(orgtbl-mode) as well as in export targets (html, latex). Values
assigned in the global environment should be available to
blocks of R code elsewhere in the org buffer.
**** Implementations
***** org-R
org-R converts R output (vectors, or matrices / 2d-arrays) to an
org table and stores it in the org buffer, or in a separate org
file (csv output would also be perfectly possible).
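
As a rough illustration of the vector-output objective (this is not
org-R's actual code): assuming the R results have already been read
back into Emacs as a list of rows, a hypothetical helper such as
=rorg-rows-to-org-table= could render them as org table text.

#+begin_src emacs-lisp
;; Sketch only: `rorg-rows-to-org-table' is a hypothetical name, and ROWS
;; is assumed to be a list of rows, each a list of strings.
(defun rorg-rows-to-org-table (rows)
  "Return ROWS formatted as the text of an org table."
  (mapconcat (lambda (row)
               (concat "| " (mapconcat #'identity row " | ") " |"))
             rows
             "\n"))

(rorg-rows-to-org-table '(("x" "y") ("1" "2")))
;; => "| x | y |\n| 1 | 2 |"
#+end_src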
***** org-exp-blocks
***** RweaveOrg
*** graphical output
R can generate graphical output on a screen graphics device
(e.g. X11, quartz), and in various standard image file formats
(png, jpg, ps, pdf, etc). When graphical output is generated by
evaluation of R code in Org, at least the following two things are desirable:
1. output to screen for immediate viewing is possible
2. graphical output to file is linked to appropriately from the
   org file. This should have the automatic consequence that it is
   included appropriately in subsequent export targets (html,
   latex).
**** Implementations
***** org-R
org-R does (1) if no output file is specified and (2) otherwise.
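
A minimal sketch of desideratum (2), not taken from org-R: a
hypothetical helper =rorg-plot-to-file= that wraps some R code so its
plot goes to a png file, and also returns the org link to insert in
the buffer. The png device is an assumption; any file device would do.

#+begin_src emacs-lisp
;; Sketch only: `rorg-plot-to-file' is a hypothetical name.  png() and
;; dev.off() are standard R graphics-device functions.
(defun rorg-plot-to-file (r-code file)
  "Return (R-COMMANDS . ORG-LINK) for running R-CODE with output in FILE.
R-COMMANDS wraps R-CODE so that plots go to the png file FILE;
ORG-LINK is an org link to FILE for insertion in the buffer."
  (cons (format "png(%S)\n%s\ndev.off()" file r-code)
        (format "[[file:%s]]" file)))

(rorg-plot-to-file "plot(1:10)" "fig.png")
;; => ("png(\"fig.png\")\nplot(1:10)\ndev.off()" . "[[file:fig.png]]")
#+end_src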
***** org-exp-blocks
***** RweaveOrg


* Notes
** Special editing and evaluation of source code in R blocks
Unfortunately org-mode has two different block types, both useful.
In developing RweaveOrg, a third was introduced.

Eric is leaning towards using the =#+begin_src= blocks, as that is
really what these blocks contain: source code. Austin believes
that specifying export options at the beginning of a block is
useful functionality, to be preserved if possible.

Note that upper and lower case are not relevant in block headings.

*** Source code blocks
Org has an extremely useful method of editing source code and
examples in their native modes. In the case of R code, we want to
be able to use the full functionality of ESS mode, including
interactive evaluation of code.

Source code blocks look like the following and allow for the
special editing of code inside of the block through
`org-edit-special'.

#+BEGIN_SRC r

,## hit C-c ' within this block to enter a temporary buffer in r-mode.

,## while in the temporary buffer, hit C-c C-c on this comment to
,## evaluate this block
a <- 3
a

,## hit C-c ' to exit the temporary buffer
#+END_SRC

*** dblocks
dblocks are useful because org-mode will automatically call
`org-dblock-write:dblock-type' where dblock-type is the string
following the =#+BEGIN:= portion of the line.

dblocks look like the following and allow for evaluation of the
code inside of the block by pressing =C-c C-c= on the header of
the block.

#+BEGIN: dblock-type
#+END:
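
To make the naming convention concrete, here is a minimal sketch of a
writer function for a made-up block type. The block type =rorg-demo=
and its =:table= parameter are hypothetical; the only thing taken from
org itself is the =org-dblock-write:= prefix and the fact that the
function receives the block parameters as a plist.

#+begin_src emacs-lisp
;; Sketch only: the block type `rorg-demo' and its :table parameter are
;; hypothetical.  Org calls this function with the block's parameters as
;; a plist, with point inside the (emptied) block.
(defun org-dblock-write:rorg-demo (params)
  "Insert a line naming the table given by the :table entry of PARAMS."
  (insert (format "table: %s" (plist-get params :table))))
#+end_src

With this defined, pressing =C-c C-c= on a header line such as
=#+BEGIN: rorg-demo :table "t1"= would be expected to fill the block
with =table: t1=.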

*** R blocks
In developing RweaveOrg, Austin created [[file:existing_tools/RweaveOrg/org-sweave.el][org-sweave.el]]. This
allows for the kind of blocks shown in [[file:existing_tools/RweaveOrg/testing.Rorg][testing.Rorg]]. These blocks
have the advantage of accepting options to the Sweave preprocessor
following the #+BEGIN_R declaration.


* tasks

* buffer dictionary
LocalWords: DBlocks dblocks
110
rorg.org
@ -190,20 +190,20 @@ Are there side-effects which need to be considered aside from those
internal to the source-code evaluation process?

** reference to data and evaluation results
I think this will be very important. I would suggest that since we
are using lisp we use lists as our medium of exchange. Then all we
need are functions converting all of our target formats to and
from lists. These functions are already provided for org tables.
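
For org tables, the existing functions are presumably
=org-table-to-lisp= and the =orgtbl-to-*= converters such as
=orgtbl-to-csv=. For the other formats, a sketch of what "to and from
lists" could look like for csv is given below. The =rorg-= names are
hypothetical, and the csv handling is deliberately naive.

#+begin_src emacs-lisp
;; Sketch only: the `rorg-' names are hypothetical.  Fields containing
;; commas, quotes or newlines are not handled.
(defun rorg-list-to-csv (rows)
  "Return ROWS, a list of lists of strings, as a csv-format string."
  (mapconcat (lambda (row) (mapconcat #'identity row ","))
             rows
             "\n"))

(defun rorg-csv-to-list (csv)
  "Return the csv-format string CSV as a list of lists of strings."
  (mapcar (lambda (line) (split-string line ","))
          (split-string csv "\n" t)))

(rorg-list-to-csv '(("x" "y") ("1" "2")))   ; => "x,y\n1,2"
(rorg-csv-to-list "x,y\n1,2")               ; => (("x" "y") ("1" "2"))
#+end_src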

It would be a boon both to org users and R users to allow org tables
to be manipulated with the R programming language. Org tables give R
users an easy way to enter and display data; R gives org users a
powerful way to perform vector operations, statistical tests, and
visualization on their tables.

This means that we will need to consider unique IDs for source
blocks, as well as for org tables, and for any other data source or
target.

*** Implementations
**** naive
@ -214,25 +214,71 @@ target.
files. Org tables are first exported to a temporary csv file
using [[file:existing_tools/org-R.el::defun%20org%20R%20export%20to%20csv%20csv%20file%20options][org-R-export-to-csv]].
**** org-exp-blocks
org-exp-blocks uses [[org-interblock-R-command-to-string]] to send
commands to an R process running in a comint buffer through ESS.
org-exp-blocks has no support for dumping table data to the R process, or
vice versa.

**** RweaveOrg
NA

*** reference format
This will be tricky. Dan has already come up with a solution for R; I
need to look more closely at that, and we should try to come up with a
format for referencing data from source code in such a way that it
will be as source-code-language independent as possible.

**** Dan: thinking aloud re: referencing data from R
Suppose in some R code, we want to reference data in an org
table. I think that requires the use of 'header arguments', since
otherwise, under pure evaluation of a code block without header
args, R has no way to locate the data in the org buffer. So that
suggests a mechanism like that used by org-R whereby table names
or unique entry IDs are used to reference org tables (and indeed
potentially row/column ranges within org tables, although that
subsetting could also be done in R).

Specifically what org-R does is write the table to a temp csv
file, and tell R the name of that file. However:

1. We are not limited to a single source of input; the same sort
   of thing could be done for several sources of input

2. I don't think we even have to use temp files. An alternative
   would be to have org pass the table contents as a csv-format
   string to textConnection() in R, thus creating an arbitrary
   number of input objects in the appropriate R environment
   (scope) from which the R code can read data when necessary.
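
Point 2 could be sketched roughly as follows: build, on the Emacs
side, an R command that binds a textConnection to a name. The function
name =rorg-text-connection-command= is hypothetical, and the sketch
assumes that the escapes Emacs prints for a string (quotes,
backslashes, newlines) are read the same way by R, which should hold
for ordinary table text but is untested for unusual characters.

#+begin_src emacs-lisp
;; Sketch only: `rorg-text-connection-command' is a hypothetical name.
;; textConnection() is a standard R function.
(defun rorg-text-connection-command (r-name csv)
  "Return an R command binding R-NAME to a connection on the string CSV.
CSV is embedded as an R string literal, so no temp file is needed."
  (let ((print-escape-newlines t))      ; print newlines as \n
    (format "%s <- textConnection(%s)"
            r-name
            (prin1-to-string csv))))

(rorg-text-connection-command "d" "x,y\n1,2")
;; => "d <- textConnection(\"x,y\\n1,2\")"
#+end_src

After that command has been sent to the R process, the R code in the
block could read the data with, for example, =read.csv(d)=.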

That suggests a header option syntax something like

#+begin_src emacs-lisp
'(:R-obj-name-1 tbl-name-or-id-1 :R-obj-name-2 tbl-name-or-id-2)
#+end_src

As a result of passing that option, the code would be able to access
the data referenced by tbl-name-or-id-1 via read.table(R-obj-name-1).
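
A sketch of how such an option list might be taken apart, assuming it
arrives as a plist of keyword and table-reference pairs as above. The
function name =rorg-parse-data-refs= is hypothetical.

#+begin_src emacs-lisp
(require 'cl-lib)

;; Sketch only: `rorg-parse-data-refs' is a hypothetical name.
(defun rorg-parse-data-refs (plist)
  "Return an alist of (R-OBJECT-NAME . TABLE-REF) built from PLIST.
Each keyword in PLIST names an R object (the leading colon is
dropped); the value following it references an org table."
  (cl-loop for (key ref) on plist by #'cddr
           collect (cons (substring (symbol-name key) 1) ref)))

(rorg-parse-data-refs '(:d1 "tbl-a" :d2 "tbl-b"))
;; => (("d1" . "tbl-a") ("d2" . "tbl-b"))
#+end_src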

An extension of that idea would be to allow remote files to be used as
data sources. In this case one might need just the remote file (if
it's a csv file), or if it's an org file then the name of the file
plus a table reference within that org file. Thus maybe something like

#+begin_src emacs-lisp
'((R-obj-name-1 . (:tblref tbl-name-or-id-1 :file file-1))
  (R-obj-name-2 . (:tblref tbl-name-or-id-2 :file file-2)))
#+end_src
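
One way the cases described above might be told apart, given a single
entry of that alist. This is only a sketch: the function name
=rorg-source-kind= and the symbols it returns are hypothetical, and
treating "file but no table reference" as a csv file is an assumption
read off the paragraph above.

#+begin_src emacs-lisp
;; Sketch only: `rorg-source-kind' and the symbols it returns are
;; hypothetical.
(defun rorg-source-kind (entry)
  "Classify ENTRY, a cons of the form (R-OBJ-NAME . (:tblref REF :file FILE))."
  (let* ((spec (cdr entry))
         (file (plist-get spec :file))
         (tblref (plist-get spec :tblref)))
    (cond ((and file tblref) 'table-in-org-file)
          (file 'csv-file)
          (tblref 'table-in-current-buffer)
          (t (error "No data source given for %s" (car entry))))))

(rorg-source-kind '(d1 . (:tblref "tbl-a" :file "data.org")))  ; => table-in-org-file
(rorg-source-kind '(d2 . (:file "data.csv")))                  ; => csv-file
#+end_src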


*** source-target pairs

The following can be used for special considerations based on
source-target pairs

Dan: I don't quite understand this subtree; Eric -- could you give
a little more explanation of this and of your comment above
regarding using [[lists as our medium of exchange]]?

**** source block output from org tables
**** source block output from other source block
**** source block output from org list
@ -240,15 +286,16 @@ source-target pairs
**** org table from org table
**** org properties from source block
**** org properties from org table



** export
Once the previous objectives are met export should be fairly simple.
Basically it will consist of triggering the evaluation of source code
blocks with the org-export-preprocess-hook.

This block export evaluation will be aware of the target format
through the htmlp and latexp variables, and can then create quoted
=#+begin_html= and =#+begin_latex= blocks appropriately.
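
A sketch of the wrapping step only, leaving out the hook and the
evaluation itself. The function name =rorg-wrap-result= is
hypothetical, and its TARGET argument is an assumed stand-in for
whatever htmlp and latexp report at export time.

#+begin_src emacs-lisp
;; Sketch only: `rorg-wrap-result' is a hypothetical name, and TARGET
;; stands in for whatever the htmlp / latexp variables report.
(defun rorg-wrap-result (result target)
  "Return the string RESULT wrapped in a quoted block for export TARGET.
TARGET is the symbol `html' or `latex'; otherwise RESULT is returned as is."
  (cond ((eq target 'html)
         (format "#+begin_html\n%s\n#+end_html" result))
        ((eq target 'latex)
         (format "#+begin_latex\n%s\n#+end_latex" result))
        (t result)))

(rorg-wrap-result "<b>3</b>" 'html)
;; => "#+begin_html\n<b>3</b>\n#+end_html"
#+end_src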


* Notes
@ -395,7 +442,7 @@ a
following the #+BEGIN_R declaration.

*** block headers/parameters
Regardless of the syntax/format chosen for the source blocks, we will
need to be able to pass a list of parameters to these blocks. These
should include (but should certainly not be limited to)
- label or id :: Label of the block, should we provide facilities for
@ -488,6 +535,7 @@ through the process of fleshing out objectives, and cashing those
objectives out into tasks. That said, please feel free to make any
changes that you see fit.


** Dan <2009-02-12 Thu 10:23>
Good job Eric with major works on this file.
* Buffer Dictionary
LocalWords: DBlocks dblocks