more examples for data analysis and code generation
akochari committed Jun 18, 2024
1 parent 7059dd4 commit 0107323
Showing 4 changed files with 83 additions and 5 deletions.
32 changes: 31 additions & 1 deletion FF2024/data_analysis.html
@@ -84,7 +84,7 @@
<h2 class="anchored" data-anchor-id="llm-as-a-data-analysis-assistant">LLM as a data analysis assistant</h2>
<p>LLMs are able to analyse tabular data and answer questions about it. Some tools let you upload a file (for example, in .txt, .csv, or Excel format); in other cases you can paste the dataset as is into the chat window, and this will work reasonably well.</p>
<p>Once the data is uploaded, you can ask general questions about the dataset, ask for numbers that can be found directly in a cell, ask questions that require manipulating and combining data, or ask for a data visualisation. If you want something to work with further, you can ask for a reformatted dataset to copy and save into a file, or for code that analyses or visualises the dataset.</p>
<p>Below are suggestions for some of the interactions that you can have with an LLM using an example dataset. We will use the dataset on population of Sweden between 1749 and 2023 provided by Statistics Sweden (Statistiska centralbyrån<em>)</em>.</p>
<p>Below are suggestions for some of the interactions that you can have with an LLM using an example dataset. We will use the dataset on population of Sweden between 1749 and 2023 provided by Statistics Sweden (Statistiska centralbyrån).</p>
<section id="loading-the-dataset" class="level3">
<h3 class="anchored" data-anchor-id="loading-the-dataset">Loading the dataset</h3>
<p>As mentioned, some tools allow file upload whereas others do not. For example, the paid versions of Microsoft Copilot and ChatGPT allow file upload, whereas free versions (for example, free ChatGPT) often do not. If file upload is not available, you can simply select the text of the dataset on the webpage, copy it, and paste it into the chat window. In most cases, LLMs interpret the spaces or other delimiters correctly.</p>
@@ -158,6 +158,36 @@ <h3 class="anchored" data-anchor-id="advanced-generating-code-for-plots-and-dash
<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a>When I select columns containing a space <span class="cf">in</span> the column names I see an error message. Correct the code to avoid this error.</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
</section>
<section id="other-examples-of-using-llms-for-data-analysis" class="level3">
<h3 class="anchored" data-anchor-id="other-examples-of-using-llms-for-data-analysis">Other examples of using LLMs for data analysis</h3>
<section id="tables-and-plots" class="level4">
<h4 class="anchored" data-anchor-id="tables-and-plots"><strong>Tables and plots</strong></h4>
<div class="cell">
<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a>Generate a table of <span class="dv">5</span> columns with values <span class="cf">in</span> each column drawn from <span class="dv">5</span> different distributions.</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell">
<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a>Make a violin plot of the dataset.</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell">
<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a>Make a canvas with three subplots: a heatmap, a histogram and a scatterplot.</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
</section>
<section id="multivariate-data" class="level4">
<h4 class="anchored" data-anchor-id="multivariate-data"><strong>Multivariate data</strong></h4>
<div class="cell">
<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a>Create an Excel spreadsheet with <span class="dv">100</span> rows and <span class="dv">5</span> columns, where each column has values drawn from a normal distribution with a random mean.</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell">
<div class="sourceCode cell-code" id="cb15"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a>Include an extra <span class="fu">column</span> (Y) with either <span class="dv">1</span> or <span class="dv">2</span> as values.</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell">
<div class="sourceCode cell-code" id="cb16"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a>Scale column <span class="dv">3</span> to unit variance. Make a violin plot of the data with a black and white color scheme. Make a PCA<span class="sc">-</span>plot, color by the column Y.</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell">
<div class="sourceCode cell-code" id="cb17"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a>Redo PCA with a red and blue color scheme.</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
</section>
</section>
</section>

</main>
36 changes: 35 additions & 1 deletion FF2024/data_analysis.qmd
@@ -13,7 +13,7 @@ LLMs are able to analyse and provide answers based on tabular data. Some tools a

Once the data is uploaded, you can ask general questions about the dataset, ask for numbers that can be found directly in a cell, ask questions that require manipulating and combining data, or ask for a data visualisation. If you want something to work with further, you can ask for a reformatted dataset to copy and save into a file, or for code that analyses or visualises the dataset.

Below are suggestions for some of the interactions that you can have with an LLM using an example dataset. We will use the dataset on population of Sweden between 1749 and 2023 provided by Statistics Sweden (Statistiska centralbyrån*)*.
Below are suggestions for some of the interactions that you can have with an LLM using an example dataset. We will use the dataset on population of Sweden between 1749 and 2023 provided by Statistics Sweden (Statistiska centralbyrån).

### Loading the dataset

@@ -97,3 +97,37 @@ In this case the LLM is probably going to generate code for an R Shiny dashboard
```{R}
When I select columns containing a space in the column names I see an error message. Correct the code to avoid this error.
```

### Other examples of using LLMs for data analysis

#### **Tables and plots**

```{R}
Generate a table of 5 columns with values in each column drawn from 5 different distributions.
```

```{R}
Make a violin plot of the dataset.
```

```{R}
Make a canvas with three subplots: a heatmap, a histogram and a scatterplot.
```
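
As a point of comparison for what an LLM might return for the first prompt in this group, here is a minimal Python sketch using only the standard library. The choice of five distributions, their parameters, the column names, and the helper name `make_table` are all assumptions made for illustration; the prompt does not specify them, and an LLM may well answer in R or pick different distributions.

```python
import random


def make_table(n_rows=10, seed=0):
    """Return a dict mapping column name -> list of n_rows random values.

    Hypothetical helper: the five distributions and their parameters
    are illustrative choices, not something the prompt specifies.
    """
    rng = random.Random(seed)  # seeded so the table is reproducible
    generators = {
        "normal": lambda: rng.gauss(0.0, 1.0),              # mean 0, sd 1
        "uniform": lambda: rng.uniform(0.0, 1.0),           # between 0 and 1
        "exponential": lambda: rng.expovariate(1.0),        # rate 1
        "lognormal": lambda: rng.lognormvariate(0.0, 1.0),  # log of value is normal(0, 1)
        "beta": lambda: rng.betavariate(2.0, 5.0),          # values in [0, 1]
    }
    return {
        name: [draw() for _ in range(n_rows)]
        for name, draw in generators.items()
    }


table = make_table()
print("\t".join(table))  # header row with the column names
for row in zip(*table.values()):
    print("\t".join(f"{value:.3f}" for value in row))
```

Seeding the generator makes the table reproducible, which helps when you want to compare follow-up plots against the same data.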

#### **Multivariate data**

```{R}
Create an Excel spreadsheet with 100 rows and 5 columns, where each column has values drawn from a normal distribution with a random mean.
```

```{R}
Include an extra column (Y) with either 1 or 2 as values.
```

```{R}
Scale column 3 to unit variance. Make a violin plot of the data with a black and white color scheme. Make a PCA-plot, color by the column Y.
```

```{R}
Redo PCA with a red and blue color scheme.
```
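
To check what an LLM does with the "scale column 3 to unit variance" request, it can help to know what the operation amounts to: dividing the column by its standard deviation. The sketch below is a minimal standard-library Python illustration, not the code an LLM will necessarily produce. The function name `scale_to_unit_variance` and the example values are assumptions, and the sketch scales without centering, whereas some tools (for example R's `scale()` with its default arguments) also subtract the mean.

```python
import statistics


def scale_to_unit_variance(values):
    """Divide each value by the sample standard deviation of the list."""
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("cannot scale a constant column")
    return [v / sd for v in values]


column_3 = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up example values
scaled = scale_to_unit_variance(column_3)
print(round(statistics.variance(scaled), 6))  # sample variance after scaling
```

The printed variance should be 1 up to floating-point rounding, which is a quick sanity check you can also ask the LLM to perform on its own output.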
10 changes: 8 additions & 2 deletions FF2024/website.html
@@ -160,8 +160,14 @@ <h3 class="anchored" data-anchor-id="other-ideas">Other ideas</h3>
<h3 class="anchored" data-anchor-id="advanced-hosting-the-website">🧙‍♂️🧙‍♀️<em>ADVANCED</em> Hosting the website</h3>
<p>Once you generate your website code, the next step is to host it. This is something the LLM cannot do for you, but it is possible to ask an LLM for guidance in this process as well. This is likely to be especially useful to those who are new to website hosting, since it can answer beginner questions and provide detailed instructions.</p>
</section>
<section id="section" class="level3">
<h3 class="anchored" data-anchor-id="section"></h3>
<section id="other-examples-of-prompts-for-website-generation" class="level3">
<h3 class="anchored" data-anchor-id="other-examples-of-prompts-for-website-generation">Other examples of prompts for website generation</h3>
<div class="cell">
<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>Create a web page that asks <span class="cf">for</span> <span class="fu">weight</span> (kg) and <span class="fu">height</span> (cm) of a person and calculates BMI</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell">
<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>Create a web page that asks <span class="cf">for</span> the radius of a sphere and calculates the volume</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
</section>
</section>

10 changes: 9 additions & 1 deletion FF2024/website.qmd
@@ -96,4 +96,12 @@ Here are some other interesting types of websites you might want to try to creat

Once you generate your website code, the next step is to host it. This is something the LLM cannot do for you, but it is possible to ask an LLM for guidance in this process as well. This is likely to be especially useful to those who are new to website hosting, since it can answer beginner questions and provide detailed instructions.

###
### Other examples of prompts for website generation

```{R}
Create a web page that asks for weight (kg) and height (cm) of a person and calculates BMI
```

```{R}
Create a web page that asks for the radius of a sphere and calculates the volume
```
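
When testing the pages an LLM generates for these two prompts, it is useful to have reference values to compare against. The Python sketch below computes both quantities with the standard formulas (BMI as weight in kg divided by height in metres squared, and sphere volume as 4/3 · π · r³). The function names `bmi` and `sphere_volume` are illustrative assumptions; a generated web page would most likely implement the same arithmetic in JavaScript.

```python
import math


def bmi(weight_kg, height_cm):
    """Body mass index: weight in kg divided by the square of height in metres."""
    height_m = height_cm / 100  # the prompt asks for height in cm
    return weight_kg / height_m ** 2


def sphere_volume(radius):
    """Volume of a sphere: (4/3) * pi * r^3."""
    return 4 / 3 * math.pi * radius ** 3


print(f"BMI for 70 kg, 175 cm: {bmi(70, 175):.1f}")        # 22.9
print(f"Volume for radius 1: {sphere_volume(1):.3f}")      # 4.189
```

A common mistake to watch for in generated pages is using the height in centimetres directly in the BMI formula, which gives a value 10,000 times too small.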
