Merge pull request #197 from hneth/master
Revise vignettes and clean up code
hneth committed Jun 3, 2023
2 parents 010f4ab + a3113de commit fed34f2
Showing 6 changed files with 50 additions and 60 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Package: FFTrees
Type: Package
Title: Generate, Visualise, and Evaluate Fast-and-Frugal Decision Trees
Version: 1.9.0.9031
Date: 2023-05-31
Version: 1.9.0.9032
Date: 2023-06-01
Authors@R: c(person("Nathaniel", "Phillips", role = c("aut"), email = "Nathaniel.D.Phillips.is@gmail.com", comment = c(ORCID = "0000-0002-8969-7013")),
person("Hansjoerg", "Neth", role = c("aut", "cre"), email = "h.neth@uni.kn", comment = c(ORCID = "0000-0001-5427-3141")),
person("Jan", "Woike", role = "aut", comment = c(ORCID = "0000-0002-6816-121X")),
2 changes: 1 addition & 1 deletion R/plotFFTrees_function.R
@@ -28,7 +28,7 @@
#' }
#' By default, \code{data = 'train'} (as \code{x} may not contain test data).
#'
#' @param what What should be plotted (as a string)? Valid options are:
#' @param what What should be plotted (as a character string)? Valid options are:
#' \describe{
#' \item{'all'}{Plot the tree diagram with all corresponding guides and performance statistics, but excluding cue accuracies.}
#' \item{'cues'}{Plot only the marginal accuracy of cues in ROC space.
13 changes: 4 additions & 9 deletions README.Rmd
@@ -40,13 +40,11 @@ url_JDM_doi <- "https://doi.org/10.1017/S1930297500006239"
[![R-CMD-check](https://github.com/ndphillips/FFTrees/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ndphillips/FFTrees/actions/workflows/R-CMD-check.yaml)
<!-- Devel badges end. -->


<!-- Release badges start: -->
<!-- [![CRAN status](https://www.r-pkg.org/badges/version/FFTrees)](https://CRAN.R-project.org/package=FFTrees) -->
<!-- [![Total downloads](https://cranlogs.r-pkg.org/badges/grand-total/FFTrees?color='00a9e0')](https://www.r-pkg.org/pkg/FFTrees) -->
<!-- Release badges end. -->


<!-- ALL badges start: -->
<!-- [![CRAN status](https://www.r-pkg.org/badges/version/FFTrees)](https://CRAN.R-project.org/package=FFTrees) -->
<!-- [![Build Status](https://travis-ci.org/ndphillips/FFTrees.svg?branch=master)](https://travis-ci.org/ndphillips/FFTrees) -->
@@ -197,13 +195,11 @@ plot(heart_fft,
heart_fft$competition$test
```


<!-- FFTs by verbal description: -->

### Building FFTs from verbal descriptions

FFTs are so simple that we even can create them 'from words' and then apply them to data!

FFTs are so simple that we even can create them 'from words' and then apply them to data.
For example, let's create a tree with the following three nodes and evaluate its performance on the `heart.test` data:

1. If `sex = 1`, predict _Disease_.
@@ -235,11 +231,10 @@ plot(my_fft,
![An FFT created from a verbal description.](man/figures/README-example-heart-verbal-1.png)

**Figure\ 2**: An FFT predicting heart disease created from a verbal description.

As we can see, this particular tree is somewhat biased:
The performance measures (in the bottom panel of **Figure\ 2**) show that this particular tree is somewhat biased:
It has nearly perfect _sensitivity_ (i.e., is good at identifying cases of _Disease_) but suffers from low _specificity_ (i.e., performs poorly in identifying _Healthy_ cases).
Expressed in terms of its errors, `my_fft` incurs few misses at the expense of many false alarms.
Although the _accuracy_ of our custom tree still exceeds the data's baseline by a fair amount, the FFTs in `heart_fft` (from above) strike a better balance.
Although the _accuracy_ of our custom tree still exceeds the data's baseline by a fair amount, the FFTs in `heart_fft` (created above) strike a better balance.

<!-- A range of options, rather than 1 optimum: -->

@@ -249,7 +244,7 @@ To explore this range of options, the **FFTrees** package enables us to design a

## References

We had a lot of fun creating **FFTrees** and hope you like it too!
We had a lot of fun creating the **FFTrees** package and hope you like it too!
As a comprehensive, yet accessible introduction to FFTs, we recommend reading our article in the journal _Judgment and Decision Making_ ([2017](`r url_JDM_doi`)), entitled _FFTrees: A toolbox to create, visualize, and evaluate fast-and-frugal decision trees_ (available in [html](`r url_JDM_html`) | [PDF](`r url_JDM_pdf`)\ ).


31 changes: 15 additions & 16 deletions README.md
@@ -2,7 +2,7 @@
<!-- README.md is generated from README.Rmd. Please only edit the .Rmd file! -->
<!-- Title, version and logo: -->

# FFTrees 1.9.0.9031 <img src = "./inst/FFTrees_Logo.jpg" align = "right" alt = "FFTrees" width = "225" />
# FFTrees 1.9.0.9032 <img src = "./inst/FFTrees_Logo.jpg" align = "right" alt = "FFTrees" width = "225" />

<!-- Devel badges start: -->

@@ -211,8 +211,7 @@ heart_fft$competition$test
### Building FFTs from verbal descriptions

FFTs are so simple that we even can create them ‘from words’ and then
apply them to data!

apply them to data.
For example, let’s create a tree with the following three nodes and
evaluate its performance on the `heart.test` data:

@@ -247,15 +246,15 @@ plot(my_fft,
description.](man/figures/README-example-heart-verbal-1.png)

**Figure 2**: An FFT predicting heart disease created from a verbal
description.

As we can see, this particular tree is somewhat biased: It has nearly
perfect *sensitivity* (i.e., is good at identifying cases of *Disease*)
but suffers from low *specificity* (i.e., performs poorly in identifying
description.
The performance measures (in the bottom panel of **Figure 2**) show that
this particular tree is somewhat biased: It has nearly perfect
*sensitivity* (i.e., is good at identifying cases of *Disease*) but
suffers from low *specificity* (i.e., performs poorly in identifying
*Healthy* cases). Expressed in terms of its errors, `my_fft` incurs few
misses at the expense of many false alarms. Although the *accuracy* of
our custom tree still exceeds the data’s baseline by a fair amount, the
FFTs in `heart_fft` (from above) strike a better balance.
FFTs in `heart_fft` (created above) strike a better balance.

<!-- A range of options, rather than 1 optimum: -->

@@ -267,12 +266,12 @@ package enables us to design and evaluate a range of FFTs.

## References

We had a lot of fun creating **FFTrees** and hope you like it too! As a
comprehensive, yet accessible introduction to FFTs, we recommend reading
our article in the journal *Judgment and Decision Making*
([2017](https://doi.org/10.1017/S1930297500006239)), entitled *FFTrees:
A toolbox to create, visualize, and evaluate fast-and-frugal decision
trees* (available in
We had a lot of fun creating the **FFTrees** package and hope you like
it too! As a comprehensive, yet accessible introduction to FFTs, we
recommend reading our article in the journal *Judgment and Decision
Making* ([2017](https://doi.org/10.1017/S1930297500006239)), entitled
*FFTrees: A toolbox to create, visualize, and evaluate fast-and-frugal
decision trees* (available in
[html](https://journal.sjdm.org/17/17217/jdm17217.html) \|
[PDF](https://journal.sjdm.org/17/17217/jdm17217.pdf) ).

@@ -333,6 +332,6 @@ Examples include:

------------------------------------------------------------------------

\[File `README.Rmd` last updated on 2023-05-31.\]
\[File `README.Rmd` last updated on 2023-06-01.\]

<!-- eof. -->
2 changes: 1 addition & 1 deletion man/plot.FFTrees.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

58 changes: 27 additions & 31 deletions vignettes/FFTrees_heart.Rmd
@@ -142,7 +142,8 @@ For definitions of all accuracy statistics, see the [accuracy statistics](FFTree

### Step\ 4: Visualise the final FFT

Use `plot()` to visualize an FFT (an `FFTrees` object):
We use `plot(x)` to visualize an FFT (from an\ `FFTrees` object\ `x`).
Using `data = "train"` evaluates an\ FFT for training data (fitting), whereas `data = "test"` predicts the performance of an\ FFT for a different dataset:

```{r fft-plot, fig.width = 6.5, fig.height = 6}
# Plot predictions of the best FFT when applied to test data:
@@ -152,9 +153,12 @@ plot(heart.fft, # An FFTrees object

#### Other arguments

The `plot()` function for `FFTrees` objects accepts the following arguments:

- `tree`: Which tree in the object should be plotted? To plot a tree other than the best fitting tree (FFT \#1), just specify another tree as an integer (e.g., `plot(heart.fft, tree = 2)`).

- `data`: For which dataset should statistics be shown? Either `data = "train"` (showing fitting or "Training" performance by default), or `data = "test"` (showing prediction or "Testing" performance).
- `data`: For which dataset should statistics be shown?
Either `data = "train"` (showing fitting or "Training" performance by default), or `data = "test"` (showing prediction or "Testing" performance).

- `stats`: Should accuracy statistics be shown with the tree? To show only the tree, without any performance statistics, include the argument `stats = FALSE`.

@@ -166,7 +170,9 @@ plot(heart.fft, what = "tree")

- `comp`: Should statistics from competitive algorithms be shown in the ROC curve? To remove the performance statistics of competitive algorithms (e.g., regression, random forests), include the argument `comp = FALSE`.

- `what`: To show individual cue accuracies (in ROC space), include the argument `what = "cues"`:
- `what`: Which parts of an `FFTrees` object should be visualized (e.g., `all`, `icontree`, or `tree`)?
Using `what = "roc"` plots tree performance as an ROC\ curve.
To show individual cue accuracies (in ROC space), specify `what = "cues"`:

```{r fft-cues, fig.width = 6, fig.height = 6, out.width = "500px"}
# Plot cue accuracies (for training data) in ROC space:
@@ -176,20 +182,28 @@ plot(heart.fft, what = "cues")
See the [Plotting FFTrees](FFTrees_plot.html) vignette for details on plotting FFTs.
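
The plotting arguments listed above can be combined in a single call. The following is a minimal sketch, assuming that the **FFTrees** package is installed and that `heart.fft` matches the object built in the earlier steps of this vignette (the `FFTrees()` call is repeated here only to keep the example self-contained):

```r
library(FFTrees)

# Re-create the FFTrees object (assumed to match the one built earlier):
heart.fft <- FFTrees(formula = diagnosis ~ .,
                     data = heart.train,
                     data.test = heart.test)

# Plot tree #2 (rather than the best tree #1) with its performance on the test data:
plot(heart.fft, tree = 2, data = "test")

# Plot only the tree diagram, without guides or performance statistics:
plot(heart.fft, what = "tree")

# Plot tree performance as an ROC curve, omitting the competing algorithms:
plot(heart.fft, what = "roc", comp = FALSE)
```

Each call draws a separate figure, so the calls can be run one at a time to compare the views.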


### Additional steps
### Advanced functions

Creating sets of FFTs and evaluating them on data by printing and plotting individual FFTs provides the core functionality of **FFTrees**.
However, the package also provides more advanced functions for accessing, defining, using and evaluating FFTs.

#### Accessing outputs
#### Accessing outputs

An `FFTrees` object contains many different outputs, to see them all, run `names()`
An `FFTrees` object contains many different outputs.
Basic performance information on the current data and set of FFTs is available via the `summary()` function.
To see and access parts of an `FFTrees` object, use `str()` or `names()`:

```{r fft-names}
# Show the names of all of the outputs in heart.fft:
# Show the names of all outputs in heart.fft:
names(heart.fft)
```

Key elements of an `FFTrees` object are explained in the vignette on [Creating FFTs with FFTrees()](FFTrees_function.html).
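
As a hedged illustration of inspecting an `FFTrees` object, the sketch below prints a summary, shows the top-level structure, and then extracts a few elements. The element names `trees$definitions` and `trees$stats$train` are assumptions about the object's layout in this package version; `competition$test` appears in the README. Check them against the output of `names()` and `str()` before relying on them:

```r
library(FFTrees)

# Re-create the FFTrees object (assumed to match the one built earlier):
heart.fft <- FFTrees(formula = diagnosis ~ .,
                     data = heart.train,
                     data.test = heart.test)

# Basic performance information on the data and the set of FFTs:
summary(heart.fft)

# Top-level structure only (max.level = 1 keeps the output short):
str(heart.fft, max.level = 1)

# Definitions of all FFTs in the object (one row per tree):
heart.fft$trees$definitions

# Training statistics of all FFTs:
heart.fft$trees$stats$train

# Test performance of the competing algorithms:
heart.fft$competition$test
```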


#### Predicting for new data

To predict classifications for a new dataset, use the standard `predict()` function.
To predict classification outcomes for new data, use the standard `predict()` function.
For example, here's how to predict the classifications for data in the `heartdisease` object (which actually is just a combination of `heart.train` and `heart.test`):

```{r fft-predict, eval = FALSE}
@@ -198,32 +212,14 @@ predict(heart.fft,
newdata = heartdisease)
```
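
To check such predictions against known outcomes, they can be cross-tabulated with the criterion variable. This is a minimal sketch, assuming that `predict()` returns one classification per row of `newdata` (its default behavior, as far as we know) and that `heartdisease$diagnosis` holds the true outcomes:

```r
library(FFTrees)

# Re-create the FFTrees object (assumed to match the one built earlier):
heart.fft <- FFTrees(formula = diagnosis ~ .,
                     data = heart.train,
                     data.test = heart.test)

# Predict classifications for all cases in heartdisease:
pred <- predict(heart.fft, newdata = heartdisease)

# Cross-tabulate predicted vs. actual outcomes (a confusion table):
table(predicted = pred, actual = heartdisease$diagnosis)

# Proportion of correct classifications:
mean(pred == heartdisease$diagnosis)
```

As `heartdisease` includes the training cases, the resulting accuracy mixes fitting and prediction performance and should not be read as a pure out-of-sample estimate.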

#### Defining FFTs in words

To define a specific FFT and apply it to data, we can define a tree by providing its verbal description to the `my.tree` argument:
#### Directly defining FFTs

```{r fft-my-tree, results = 'hide'}
# Create an FFT manually (from description):
my.heart.fft <- FFTrees(formula = diagnosis ~.,
data = heart.train,
data.test = heart.test,
main = "My Heart FFT",
my.tree = "If chol > 350, predict True.
If cp != {a}, predict False.
If age <= 35, predict False, otherwise, predict True.")
```

Running this code evaluates `my.tree` for the specified sets of data.
A visualization of the resulting tree shows its performance summary
(for the training data):

```{r plot-my-fft, fig.width = 6.5, fig.height = 6}
plot(my.heart.fft, data = "train")
```
To define a specific FFT and apply it to data, we can define a tree by providing its verbal description to the `my.tree` argument.
Similarly, we can define sets of FFT definitions (as a data frame) and evaluate them on data by using the `tree.definitions` argument of `FFTrees()`.
As we often start from an existing set of FFTs, **FFTrees** provides a set of functions for extracting, converting, and modifying tree definitions.

The resulting tree is actually not too bad, although its first node is pretty useless (as it only classifies 3\ cases, all as false alarms).
Thus, omitting the first node will result in an even simpler FFT that cannot be worse.
Feel free to verify this ---\ and see the [Manually specifying FFTs](FFTrees_mytree.html) vignette for additional details on defining FFTs from verbal or abstract descriptions.
See the vignette on [Manually specifying FFTs](FFTrees_mytree.html) for defining FFTs from descriptions and modifying tree definitions.
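
As a rough sketch of the `tree.definitions` route, the example below takes the definitions stored in an existing `FFTrees` object, keeps the first two trees, and evaluates only those on the data. It assumes that `trees$definitions` is a data frame with a `tree` column and that it can be passed back to `FFTrees()` unchanged; the dedicated helper functions for extracting and modifying definitions are deliberately not used here:

```r
library(FFTrees)

# Re-create the FFTrees object (assumed to match the one built earlier):
heart.fft <- FFTrees(formula = diagnosis ~ .,
                     data = heart.train,
                     data.test = heart.test)

# Extract the definitions of all FFTs (assumed layout: one row per tree):
tree_dfs <- heart.fft$trees$definitions

# Keep only the definitions of trees #1 and #2:
top_two <- tree_dfs[tree_dfs$tree %in% 1:2, ]

# Evaluate this reduced set of FFTs on the same data (no new trees are grown):
my.fft <- FFTrees(formula = diagnosis ~ .,
                  data = heart.train,
                  data.test = heart.test,
                  tree.definitions = top_two)

# Inspect the definitions of the resulting object:
my.fft$trees$definitions
```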


## Vignettes
