Use repl language tag for sample #1107

abhro · 2024-04-22T02:58:35Z

No description provided.

docs/src/common_mlj_workflows.md

ablaom · 2024-04-23T00:53:33Z

docs/src/learning_networks.md

-W = @node selectrows(Xs, @node shuffle(rs))
+julia> Xs = source(X)
+julia> rs = @node rows(Xs)
+julia> W = @node selectrows(Xs, @node shuffle(rs))



Generally, the pattern has been to only include julia> before input when corresponding output is to follow. Are you finding this confusing?

Yes, quite a bit. There are two reasons. First, prompted and prompt-less code are mixed together, so it's not always immediately clear what's code, what's code preceding output, and what's output. This is especially noticeable when objects' reprs are shown: they are made to look like Julia constructor calls, so it gets even harder to tell output from code. Second, syntax highlighting can be a huge boon when skimming online sources for reference.
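
To illustrate the point about reprs, here is a minimal sketch. `DemoScaler` is a made-up type used only for this example (it is not an MLJ model); the idea is just that a struct's default display mirrors constructor syntax.

```julia
# `DemoScaler` is a hypothetical type for illustration; it is not an MLJ model.
struct DemoScaler
    features::Vector{Symbol}
    ignore::Bool
end

# This line is input: a constructor call.
model = DemoScaler([:height, :weight], false)

# The default display of a struct mirrors constructor syntax, so in an
# unprompted code sample the printed text could be mistaken for more code.
println(repr(model))
```

In a sample without prompts, the printed line and the constructor call above it would look nearly identical, which is the ambiguity described here.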

Good to have this feedback, thank you.

Unfortunately, this convention is pretty widespread, at least in the MLJ ecosystem. I wonder if others are also finding this confusing. @OkonSamuel @EssamWisam What's your view of this? Do we need to include julia> prompts on every single line when there is sometimes REPL output to be included in a docstring? Often we only add julia> before lines with output immediately following.

My preference is to minimize the use of julia>, as it clutters the code (and has to be dropped before copy-pasting into .jl files), which, as you said, is consistent with the existing MLJ convention. That said, I do find @abhro's point valid: it would not be immediately obvious to a beginner that the printed object is not an actual function being called. For that, I think we could tell readers about the convention that whatever immediately follows a line with julia> is the output of the code in that line.
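
On the copy-paste cost: if every input line carries a prompt, stripping a sample back to plain code is mechanical. The sketch below assumes single-line inputs only; `strip_prompts` is a hypothetical helper written for this comment, not something provided by Julia, Documenter, or MLJ.

```julia
# Hypothetical helper for illustration only.
# Assumption: every input is a single line starting with "julia> ";
# all other lines (output, blank lines) are discarded.
function strip_prompts(transcript::AbstractString)
    prompt = "julia> "
    code = String[]
    for line in split(transcript, '\n')
        if startswith(line, prompt)
            # Drop only the leading prompt, keep the rest of the line as-is.
            push!(code, replace(line, prompt => ""; count=1))
        end
    end
    return join(code, '\n')
end

transcript = """
julia> x = [1, 2, 3];

julia> sum(x)
6
"""

println(strip_prompts(transcript))
```

Multi-line inputs would need extra handling, since the REPL indents continuation lines and those can be hard to tell apart from indented output.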

In any case, in the future we can explore having the outputs printed in a differently styled cell, as in the Imbalance tutorials and DataScienceTutorials. That would allow dropping all julia> prompts and would make code reuse as easy as possible.

Yes, I think the way the @example macro does it also provides a good separation between output and code.

docs/src/mlj_cheatsheet.md

docs/src/simple_user_defined_models.md

ablaom · 2024-04-23T01:02:46Z

docs/src/transformers.md

-# transforming:
-Xsmall = transform(mach);
-selectrows(Xsmall, 1:4) |> pretty
+julia> # transforming:


This is really weird: a comment at the julia> prompt, here and later. Let's just skip all the prompts that don't precede a block of output (see previous comment). Here and below in this file.

Yes, I'm not particularly happy with this change either, I just did it for consistency. Maybe there's a better way to write this?

To be clear, I meant the prompted lines that only hold comments. I still hold that for any code sample that mixes code and output, the code should be prompted.

Maybe a better way to do this would be to split the code sample into 3, one for setup, one for transforming, and one for predicting?

Yes, you can re-organize so that the code comments become ordinary markdown text between separate fenced blocks.

docs/src/transformers.md

docs/src/working_with_categorical_data.md

codecov-commenter · 2024-04-23T01:16:00Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.89%. Comparing base (6f46257) to head (8e45385).
Report is 5 commits behind head on dev.

❗ Current head 8e45385 differs from pull request most recent head ae28151. Consider uploading reports for the commit ae28151 to get more accurate results

Additional details and impacted files

@@           Coverage Diff           @@
##              dev    #1107   +/-   ##
=======================================
  Coverage   57.89%   57.89%           
=======================================
  Files           2        2           
  Lines          38       38           
=======================================
  Hits           22       22           
  Misses         16       16


Co-authored-by: Anthony Blaom, PhD <anthony.blaom@gmail.com>

@example

Remove julia> prompts, replace with @example macro

docs/Project.toml

ablaom · 2024-05-06T22:57:54Z

@abhro My question regarding ParallelKMeans notwithstanding, are you ready for a final review?

abhro · 2024-05-07T01:40:31Z

Sure!

codecov · 2024-05-12T23:43:44Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.89%. Comparing base (e341344) to head (650ebbd).
Report is 13 commits behind head on dev.

Additional details and impacted files

@@           Coverage Diff           @@
##              dev    #1107   +/-   ##
=======================================
  Coverage   57.89%   57.89%           
=======================================
  Files           2        2           
  Lines          38       38           
=======================================
  Hits           22       22           
  Misses         16       16


ablaom

Thanks for this work @abhro 🙏🏾

ablaom · 2024-05-12T23:48:03Z

@EssamWisam Can you please go over the proposed changes to the cheatsheet?

@abhro Can you please address the ParallelKMeans comment?

abhro · 2024-05-13T00:36:16Z

Hello, I think I addressed it in the original comment thread? But to reiterate, this is to support the change in docs/src/transformers.md, where I changed the predicting transformers code samples to an @example block (lines 196–209), and it loads the KMeans model from the ParallelKMeans package.

EssamWisam

Reviewed the cheatsheet. All is okay. However, I wonder why only some of the backticks were replaced with ```julia ``` and not others. For the purposes of viewing the cheatsheet on the website, it's better to use ```julia ``` whenever the line is long enough.

Otherwise, the cheatsheet, viewed from the website, would look like this:

where many lines are not syntax highlighted (e.g., look below Learning Curves).

Other great changes to facilitate viewing the cheatsheet from the browser:

- No overly long sections (chunk tuning into three ## headlines: ## Tuning model wrapper, ## Ranges for tuning (range=...), ## Tuning strategies)
- No overly long lines of code. For instance, the following is too long to view on the website. E.g., comments in:

@abhro let me know if you would be willing to revisit the cheat sheet accordingly. If not, I can consider that and add a commit.

Here is how the sheet renders on the website after making the change in the first bullet above:

MLJ-Sheet.pdf

Thank you for the effort.

ablaom · 2024-05-13T20:34:46Z

... this is to support the change in docs/src/transformers.md, where I changed the predicting transformers code samples to an @example block (lines 196–209), and it loads the KMeans model from the ParallelKMeans package

Thanks for the explanation.

ParallelKMeans is not a regularly maintained package. Can we please change pkg=ParallelKMeans to pkg=NearestNeighborModels? That pkg is already in the Project and both provide KMeans.

abhro · 2024-05-15T00:36:46Z

Hello, @EssamWisam, thank you for the feedback! I'm working on updating the cheatsheet to make it a little more consistent. Can I ask how you generated the pdf? It would help a lot in checking the changes I'm making. Thanks again!

abhro · 2024-05-15T01:17:22Z

@ablaom I can't find the KMeans model in NearestNeighborModels. I could use the one in Clustering/MLJClusteringInterface. Does that work?

ablaom · 2024-05-15T02:49:58Z

@ablaom I can't find the KMeans model in NearestNeighborModels. I could use the one in Clustering/MLJClusteringInterface. Does that work?

Oops 😳 Yes, use the MLJClusteringInterface version.

EssamWisam · 2024-05-15T20:02:13Z

Hello, @EssamWisam, thank you for the feedback! I'm working on updating the cheatsheet to make it a little more consistent. Can I ask how you generated the pdf? It would help a lot in checking the changes I'm making. Thanks again!

Well, the MLJ website is currently under development and, as of now, it's not straightforward to generate unless you have some expertise with web development frameworks.

That said, avoiding overly long lines, using ```julia ``` more often, and avoiding overly long sections are pretty much all the conditions needed for the cheatsheet to render nicely.

abhro · 2024-05-15T20:04:36Z

Alrighty! I think I've made those changes. If there's something I missed, or in some ways it could be made better, please don't hesitate to let me know! Thanks!

docs/src/mlj_cheatsheet.md

EssamWisam · 2024-05-15T20:22:19Z

Alrighty! I think I've made those changes. If there's something I missed, or in some ways it could be made better, please don't hesitate to let me know! Thanks!

Thank you for the valuable contribution, @abhro. I made some minor comments.

Co-authored-by: Essam <essamwisam@outlook.com>

abhro added 2 commits April 21, 2024 22:57

Use repl language tag for sample

a89faf7

Update language tags for code samples

8e45385