exportRmd::convertDocToRmd

Louis edited this page Mar 13, 2015 · 1 revision

convertDocToRmd is called by either convertSingleDoc or convertFolder, with a document and a destination folder as arguments (in both cases the folder is simply the document's parent).

It creates an empty string, text; names the Rmarkdown file after the input document, appending the .Rmd extension to the document name; and likewise names the image output folder inputname_images.
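As a quick illustration of the naming rule, here is a minimal sketch. deriveOutputNames is a hypothetical helper, not part of the script; it only mirrors the two string concatenations that convertDocToRmd performs on document.getName().

```javascript
// Hypothetical helper (not in the original script): derives the output
// names from a document name the same way convertDocToRmd does.
function deriveOutputNames(docName) {
  return {
    rmdFilename: docName + ".Rmd",          // name of the Rmarkdown file
    imageFoldername: docName + "_images"    // name of the image output folder
  };
}

var names = deriveOutputNames("my-analysis");
// names.rmdFilename is "my-analysis.Rmd"
// names.imageFoldername is "my-analysis_images"
```

Note that the extension is appended rather than swapped, so a document that already has a dot in its name keeps it.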

NB I don't aim to use this form of image storage... For now I'd prefer URLs, which will be tricky since Apps Script is very restrictive on the attributes that can be applied to any object. LINK_URL could be used for the image's source URL, but that would preclude using this space for an actual link URL (it's sensible for dynamic documents to respect these semantics). Instead I'd look to use a bastardisation of the comments system... see to-do list for gory details.

  • inSrc marks whether the line being read is currently within a source code range of lines, opened with "--- source code" or "--- src", closed with "---", and converted to <pre></pre>
  • inClass is a similar marker of whether the line being read is currently within a div range demarcated with --- class whateverclassnameyouwant.
    • This would be perfect for R chunks 😀 First word can be language (e.g. r), if there's { wrapping the statement it becomes a chunk, if not it just becomes a language block.
    • NB if using single-cell code blocks as tables there'd be no need to have it explicit. Is plain text preferable and not clicking a button to insert code blocks...? Or is the readability of these code blocks preferable...? Would be a much nicer separation even amenable to sending the code out via an API to actually be executed......
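The R chunk idea above could be sketched roughly as follows. This is purely hypothetical and not something the script does today: openingFenceFor and its return shape are made up for illustration, and it assumes the existing "--- src", "--- source code" and "--- class" delimiters have already been handled before it is called.

```javascript
// Hypothetical sketch: turn a "--- ..." delimiter line into an Rmarkdown
// opening fence. "--- {r setup}" would open a chunk, "--- r" a plain
// language block. Not part of the current script.
function openingFenceFor(delimiterLine) {
  var fence = "`".repeat(3);
  // Drop the leading "---" and any surrounding whitespace.
  var spec = delimiterLine.replace(/^---\s*/, "").trim();
  if (spec === "") {
    return null; // a bare "---" closes a block rather than opening one
  }
  // Braces around the whole statement mark an executable chunk.
  var isChunk = spec.charAt(0) === "{" && spec.charAt(spec.length - 1) === "}";
  return {
    kind: isChunk ? "chunk" : "language",
    text: fence + spec
  };
}
```

The closing "---" would then simply emit a bare fence, in the same way the current code emits </pre> or </div>.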

These relate to the working of processParagraph (and its results).

The script steps through all child elements (similar to an HTML document's DOM, the Docs document has a countable list of children that can be iterated by index).
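For reference, the walk over child elements looks something like the sketch below. listChildTypes is a hypothetical helper; it assumes only that the section object exposes getNumChildren() and getChild(i), as the object returned by document.getActiveSection() does in the function further down, and that each child has getType().

```javascript
// Sketch: collect the type of each top-level child of a document section.
// `section` is assumed to expose getNumChildren() and getChild(i).
function listChildTypes(section) {
  var types = [];
  var numChildren = section.getNumChildren();
  for (var i = 0; i < numChildren; i++) {
    var child = section.getChild(i);
    // getType() returns an element type; String() makes it printable.
    types.push(String(child.getType()));
  }
  return types;
}
```

Running something like this over a document is a handy way to see which element types processParagraph will be handed.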

After encoding the document (appending the processed version of each element to a single string variable, text), it pushes the text to an array of files as an entry with three components: fileName (Rmd_filename), mimeType (plain text) and content (text). Image entries in the same array carry only a blob.

Any previously converted Rmarkdown file with the same name is moved to the trash, as is any existing image folder of the same name, rather than being updated in place. There is no check on whether we're just converting a single document or a whole folder...(!)

  • May want to add a setting on this behaviour or only do it when converting an entire folder. Interesting that files aren't just edited... Means there's no modification history 😕 Editing would be far preferable, and plain text shouldn't be a problem..?
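Editing in place could look something like this. It is a minimal sketch, not the script's current behaviour: writeOrUpdateFile is a hypothetical helper, and folder is assumed to behave like an Apps Script Drive folder (getFilesByName, createFile) whose files support setContent.

```javascript
// Sketch: overwrite an existing file's content instead of trashing it and
// creating a new one. `folder` is assumed to act like a Drive Folder.
function writeOrUpdateFile(folder, fileName, content, mimeType) {
  var existing = folder.getFilesByName(fileName);
  if (existing.hasNext()) {
    var file = existing.next();
    file.setContent(content); // same file object is reused, nothing is trashed
    return file;
  }
  // No file of that name yet: create it directly inside the folder.
  return folder.createFile(fileName, content, mimeType);
}
```

If several files share the name, only the first one returned by the iterator is updated here; whether the rest should be trashed is a separate decision.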

If the post has images they're saved in a sub-folder corresponding to the input file name under the /assets/images/ sub-directory.

  • This doesn't seem to work... 😞

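Finally each entry in files is written to Drive. Entries that carry a blob (the images) are created from that blob and added to the image folder; the entry without a blob (the Rmarkdown text) is created from its fileName, content and mimeType and added to the destination folder. Each new file is then removed from the Drive root, where createFile initially places it.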

function convertDocToRmd(document, destination_folder) {
  var scriptProperties = PropertiesService.getScriptProperties(); 
  var image_prefix=scriptProperties.getProperty("image_folder_prefix");
  var numChildren = document.getActiveSection().getNumChildren();
  var text = "";
  var Rmd_filename = document.getName()+".Rmd";
  var image_foldername = document.getName()+"_images";
  var inSrc = false;
  var inClass = false;
  var globalImageCounter = 0;
  var globalListCounters = {};
  // edbacher: added a variable for indent in src <pre> block. Let style sheet do margin.
  var srcIndent = "";
  
  var postHasImages = false; 
  
  var files = [];
  
  // Walk through all the child elements of the doc.
  for (var i = 0; i < numChildren; i++) {
    var child = document.getActiveSection().getChild(i);
    var result = processParagraph(i, child, inSrc, globalImageCounter, globalListCounters, image_prefix + image_foldername);
    globalImageCounter += (result && result.images) ? result.images.length : 0;
    if (result!==null) {
      if (result.sourcePretty==="start" && !inSrc) {
        inSrc=true;
        text+="<pre class=\"prettyprint\">\n";
      } else if (result.sourcePretty==="end" && inSrc) {
        inSrc=false;
        text+="</pre>\n\n";
      } else if (result.source==="start" && !inSrc) {
        inSrc=true;
        text+="<pre>\n";
      } else if (result.source==="end" && inSrc) {
        inSrc=false;
        text+="</pre>\n\n";
      } else if (result.inClass==="start" && !inClass) {
        inClass=true;
        text+="<div class=\""+result.className+"\">\n";
      } else if (result.inClass==="end" && inClass) {
        inClass=false;
        text+="</div>\n\n";
      } else if (inClass) {
        text+=result.text+"\n\n";
      } else if (inSrc) {
        text+=(srcIndent+escapeHTML(result.text)+"\n");
      } else if (result.text && result.text.length>0) {
        text+=result.text+"\n\n";
      }
      
      if (result.images && result.images.length>0) {
        for (var j=0; j<result.images.length; j++) {
          files.push( { "blob": result.images[j].blob } );
          postHasImages = true; 
        }
      }
    } else if (inSrc) { // support empty lines inside source code
      text+='\n';
    }
      
  }
  files.push({"fileName": Rmd_filename, "mimeType": "text/plain", "content": text});
    
  
  // Clean up any old folders and files in our destination directory with an identical name
  var old_folders = destination_folder.getFoldersByName(image_foldername);
  while (old_folders.hasNext()) {
    var old_folder = old_folders.next();
    old_folder.setTrashed(true);
  }
  
  // Remove any previously converted Rmarkdown files.
  var old_files = destination_folder.getFilesByName(Rmd_filename);
  while (old_files.hasNext()) {
    var old_file = old_files.next();
    old_file.setTrashed(true);
  }
  
  // Create a subfolder for images if they exist
  var image_folder;
  if (postHasImages) {
    image_folder = DriveApp.createFolder(image_foldername);
    DriveApp.removeFolder(image_folder); // Confusing convention; this just removes the folder from the google drive root.
    destination_folder.addFolder(image_folder);
  }
  
  for (var k = 0; k < files.length; k++) {
    var saved_file;
    if (files[k].blob) {
      saved_file = DriveApp.createFile(files[k].blob);
      // The images go into a subfolder matching the post title
      image_folder.addFile(saved_file);
    } else {
      // The Rmarkdown files all go in the "Rmarkdown" directory
      saved_file = DriveApp.createFile(files[k]["fileName"], files[k]["content"], files[k]["mimeType"]);
      destination_folder.addFile(saved_file);
    }
    DriveApp.removeFile(saved_file); // Removes from google drive root.
  }
  
}