From a81fe7e15be9f08f9e5251ccb90e4bfc557be85c Mon Sep 17 00:00:00 2001 From: Dan Davison Date: Thu, 12 Feb 2009 11:07:25 -0500 Subject: [PATCH] some thoughts on referencing data from R --- nogit-rorg-dan.org | 113 --------------------------------------------- rorg.org | 110 ++++++++++++++++++++++++++++++------------- 2 files changed, 79 insertions(+), 144 deletions(-) delete mode 100644 nogit-rorg-dan.org diff --git a/nogit-rorg-dan.org b/nogit-rorg-dan.org deleted file mode 100644 index e72f2fa9c..000000000 --- a/nogit-rorg-dan.org +++ /dev/null @@ -1,113 +0,0 @@ -#+TITLE: rorg --- R and org-mode - -* Objectives -** Send data to R from org - Org-mode includes orgtbl-mode, an extremely convenient way of using - tabular data in a plain text file. Currently, spreadsheet - functionality is available in org tables using the emacs package - calc. It would be a boon both to org users and R users to allow - org tables to be manipulated with the R programming language. Org - tables give R users an easy way to enter and display data; R gives - org users a powerful way to perform vector operations, statistical - tests, and visualization on their tables. - -*** Implementations -**** naive - Naive implementation would be to use =(org-export-table "tmp.csv")= - and =(ess-execute "read.csv('tmp.csv')")=. -**** org-R - org-R passes data to R from two sources: org tables, or csv - files. Org tables are first exported to a temporary csv file - using [[file:existing_tools/org-R.el::defun%20org%20R%20export%20to%20csv%20csv%20file%20options][org-R-export-to-csv]]. -**** org-exp-blocks -**** RweaveOrg - NA - -** Evaluate R code in org and deal with output appropriately -*** vector output - When R code evaluation generates vectors and 2-dimensional arrays, - this should be formatted appropriately in org buffers - (orgtbl-mode) as well as in export targets (html, latex). Values - assigned to in the global environment should be available to - blocks of R code elsewhere in the org buffer. 
-**** Implementations -***** org-R - org-R converts R output (vectors, or matrices / 2d-arrays) to an - org table and stores it in the org buffer, or in a separate org - file (csv output would also be perfectly possible). -***** org-exp-blocks -***** RweaveOrg -*** graphical output - R can generate graphical output on a screen graphics device - (e.g. X11, quartz), and in various standard image file formats - (png, jpg, ps, pdf, etc). When graphical output is generated by - evaluation of R code in Org, at least the following two things are desirable: - 1. output to screen for immediate viewing is possible - 2. graphical output to file is linked to appropriately from the - org file This should have the automatic consequence that it is - included appropriately in subsequent export targets (html, - latex). -**** Implementations -***** org-R - org-R does (1) if no output file is specified and (2) otherwise -***** org-exp-blocks -***** RweaveOrg - - -* Notes -** Special editing and evaluation of source code in R blocks - Unfortunately org-mode how two different block types, both useful. - In developing RweaveOrg, a third was introduced. - - Eric is leaning towards using the =#+begin_src= blocks, as that is - really what these blocks contain: source code. Austin believes - that specifying export options at the beginning of a block is - useful functionality, to be preserved if possible. - - Note that upper and lower case are not relevant in block headings. - -*** Source code blocks - Org has an extremely useful method of editing source code and - examples in their native modes. In the case of R code, we want to - be able to use the full functionality of ESS mode, including - interactive evaluation of code. - - Source code blocks look like the following and allow for the - special editing of code inside of the block through - `org-edit-special'. - -#+BEGIN_SRC r - -,## hit C-c ' within this block to enter a temporary buffer in r-mode. 
- -,## while in the temporary buffer, hit C-c C-c on this comment to -,## evaluate this block -a <- 3 -a - -,## hit C-c ' to exit the temporary buffer -#+END_SRC - -*** dblocks - dblocks are useful because org-mode will automatically call - `org-dblock-write:dblock-type' where dblock-type is the string - following the =#+BEGIN:= portion of the line. - - dblocks look like the following and allow for evaluation of the - code inside of the block by calling =\C-c\C-c= on the header of - the block. - -#+BEGIN: dblock-type -#+END: - -*** R blocks - In developing RweaveOrg, Austin created [[file:existing_tools/RweaveOrg/org-sweave.el][org-sweave.el]]. This - allows for the kind of blocks shown in [[file:existing_tools/RweaveOrg/testing.Rorg][testing.Rorg]]. These blocks - have the advantage of accepting options to the Sweave preprocessor - following the #+BEGIN_R declaration. - - -* tasks - -* buffer dictionary - LocalWords: DBlocks dblocks diff --git a/rorg.org b/rorg.org index b812e211f..47393e91b 100644 --- a/rorg.org +++ b/rorg.org @@ -190,20 +190,20 @@ Are there side-effects which need to be considered aside from those internal to the source-code evaluation process? ** reference to data and evaluation results -I think this will be very important. I would suggest that since we -are using lisp we use lists as our medium of exchange. Then all we -need are functions going converting all of our target formats to and -from lists. These functions are already provided by for org tables. + I think this will be very important. I would suggest that since we + are using lisp we use lists as our medium of exchange. Then all we + need are functions going converting all of our target formats to and + from lists. These functions are already provided by for org tables. -It would be a boon both to org users and R users to allow org tables -to be manipulated with the R programming language. 
Org tables give R -users an easy way to enter and display data; R gives org users a -powerful way to perform vector operations, statistical tests, and -visualization on their tables. + It would be a boon both to org users and R users to allow org tables + to be manipulated with the R programming language. Org tables give R + users an easy way to enter and display data; R gives org users a + powerful way to perform vector operations, statistical tests, and + visualization on their tables. -This means that we will need to consider unique id's for source -blocks, as well as for org tables, and for any other data source or -target. + This means that we will need to consider unique id's for source + blocks, as well as for org tables, and for any other data source or + target. *** Implementations **** naive @@ -214,25 +214,71 @@ target. files. Org tables are first exported to a temporary csv file using [[file:existing_tools/org-R.el::defun%20org%20R%20export%20to%20csv%20csv%20file%20options][org-R-export-to-csv]]. **** org-exp-blocks -org-exp-blocks uses [[org-interblock-R-command-to-string]] to send -commands to an R process running in a comint buffer through ESS. -org-exp-blocks has no support for dumping table data to R process, or -vice versa. + org-exp-blocks uses [[org-interblock-R-command-to-string]] to send + commands to an R process running in a comint buffer through ESS. + org-exp-blocks has no support for dumping table data to R process, or + vice versa. **** RweaveOrg NA *** reference format -This will be tricky, Dan has already come up with a solution for R, I -need to look more closely at that and we should try to come up with a -formats for referencing data from source-code in such a way that it -will be as source-code-language independent as possible. 
+ This will be tricky, Dan has already come up with a solution for R, I + need to look more closely at that and we should try to come up with a + formats for referencing data from source-code in such a way that it + will be as source-code-language independent as possible. + +**** Dan: thinking aloud re: referencing data from R + Suppose in some R code, we want to reference data in an org + table. I think that requires the use of 'header arguments', since + otherwise, under pure evaluation of a code block without header + args, R has no way to locate the data in the org buffer. So that + suggests a mechanism like that used by org-R whereby table names + or unique entry IDs are used to reference org tables (and indeed + potentially row/column ranges within org tables, although that + subsetting could also be done in R). + + Specifically, what org-R does is write the table to a temp csv + file, and tell R the name of that file. However: + + 1. We are not limited to a single source of input; the same sort + of thing could be done for several sources of input. + + 2. I don't think we even have to use temp files. An alternative + would be to have org pass the table contents as a csv-format + string to textConnection() in R, thus creating an arbitrary + number of input objects in the appropriate R environment + (scope) from which the R code can read data when necessary. + + That suggests a header option syntax something like + +#+begin_src emacs-lisp +'(:R-obj-name-1 tbl-name-or-id-1 :R-obj-name-2 tbl-name-or-id-2) +#+end_src + +As a result of passing that option, the code would be able to access +the data referenced by tbl-name-or-id-1 via read.table(R-obj-name-1). + +An extension of that idea would be to allow remote files to be used as +data sources. In this case one might need just the remote file (if +it's a csv file), or if it's an org file then the name of the file +plus a table reference within that org file.
Thus maybe something like + +#+begin_src emacs-lisp +'((R-obj-name-1 . (:tblref tbl-name-or-id-1 :file file-1)) + (R-obj-name-2 . (:tblref tbl-name-or-id-2 :file file-2))) +#+end_src + *** source-target pairs -The following can be used for special considerations based on -source-target pairs + The following can be used for special considerations based on + source-target pairs + Dan: I don't quite understand this subtree; Eric -- could you give + a little more explanation of this and of your comment above + regarding using [[lists as our medium of exchange]]? + **** source block output from org tables **** source block outpt from other source block **** source block output from org list @@ -240,15 +286,16 @@ source-target pairs **** org table from org table **** org properties from source block **** org properties from org table - + + ** export -once the previous objectives are met export should be fairly simple. -Basically it will consist of triggering the evaluation of source code -blocks with the org-export-preprocess-hook. + once the previous objectives are met export should be fairly simple. + Basically it will consist of triggering the evaluation of source code + blocks with the org-export-preprocess-hook. -This block export evaluation will be aware of the target format -through the htmlp and latexp variables, and can then create quoted -=#+begin_html= and =#+begin_latex= blocks appropriately. + This block export evaluation will be aware of the target format + through the htmlp and latexp variables, and can then create quoted + =#+begin_html= and =#+begin_latex= blocks appropriately. * Notes @@ -395,7 +442,7 @@ a following the #+BEGIN_R declaration. *** block headers/parameters -regardless of the syntax/format chosen for the source blocks, we will +Regardless of the syntax/format chosen for the source blocks, we will need to be able to pass a list of parameters to these blocks.
These should include (but should certainly not be limited to) - label or id :: Label of the block, should we provide facilities for @@ -488,6 +535,7 @@ through the process of fleshing out objectives, and cashing those objectives out into tasks. That said, please feel free to make any changes that you see fit. - +** Dan <2009-02-12 Thu 10:23> + Good job Eric with major works on this file. * Buffer Dictionary LocalWords: DBlocks dblocks
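
Note on the textConnection() idea in the "Dan: thinking aloud" subtree of this patch: the following is a minimal R sketch of the R side only, assuming org has already turned two tables into csv-format strings. The variable names and table contents are invented for illustration; the patch does not specify how org would generate or inject these strings.

#+begin_src R
# Hypothetical csv-format strings, standing in for what org would
# produce from two org tables (names and values are made up).
heights_csv <- "name,height\nalice,170\nbob,182"
weights_csv <- "name,weight\nalice,60\nbob,81"

# One text connection per data source; no temp files are involved.
heights_con <- textConnection(heights_csv)
weights_con <- textConnection(weights_csv)

# The user's R code reads from the connections as if they were files.
heights <- read.csv(heights_con)
weights <- read.csv(weights_con)

# textConnection() returns an already-open connection, so close it explicitly.
close(heights_con)
close(weights_con)

# Combine the two inputs on their shared column.
merged <- merge(heights, weights, by = "name")
print(merged)
#+end_src

Each connection here plays the role of one R-obj-name entry in the proposed header option, so several tables can be supplied to the same block without writing temp files.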